Charting the Course to Intelligence through Experience
A recent thought-provoking paper, Welcome to the Era of Experience, charts a course where AI agents learn predominantly from their own… Apr 28
Two-Minute Refresher on Mixture of Experts
With the introduction of Mixture of Experts (MoE) layers in Llama 4, let’s recap what a MoE layer is. Up until this point, the Llama 3… Apr 7
Perplexity (the Metric) Illuminated
Perplexity, the metric, measures a language model’s ability to predict the next word in a sequence. It’s a measure of how perplexed (or… Feb 11
Experiments in Text-to-Verilog with ChatGPT-4o
This post logs my experiments with ChatGPT-4o on taking logic expressed in natural language and translating it to Verilog… Aug 17, 2024
“Where’s the Beef”, Codestral’s Fill-In-the-Middle Magic
Fill-in-the-Middle (FIM) is the ability of an LLM to generate the middle tokens sandwiched between (supplied) prefix and suffix tokens. To… Jun 4, 2024
Rule-based Reasoning in LLMs via Stepwise Refinement
Remember learning to program? My teacher, an ardent disciple of Niklaus Wirth, taught us the principle of stepwise refinement as we worked… May 5, 2024
Distilling Distillation
Question: Is 2024 the year of slimming down LLMs into lithe and sprightly SLMs? SLM, of course, stands for Small Language Model, and aptly… Feb 4, 2024
Distilling with LLM-Generated Rationales Yields Outperformance in Task-Specific Fine-tuning!
Large Language Models are challenging to serve in practice, making implementers gravitate towards distilled models. Distillation yields… May 28, 2023