
Technology
Course · 17 lessons · 2h 29m
Inside the Black Box: How LLMs and Transformers Work
After this course you can explain how transformers represent meaning, how LLM internals and circuits operate, and what interpretability tools reveal about model behavior.
Curriculum
17 lessons
- 01 Peering Inside the Black Box
  Mechanistic interpretability and artificial psycholinguistics are transforming our understanding of large language models. In this episode, Arshavir Blackwell explores how probing neural circuits, behavioral tests, and new tools are unraveling the mysteries of AI reasoning.
- 02 Using Symbolic AI to Explain LLMs
  Delve into the mysterious world of neural circuits within large language models. We’ll dismantle the jargon, connect these abstract ideas to real examples, and discuss how circuits help bridge the gap between machine learning and human cognition.
- 03 The Weird Geometry That Makes AI Think
  Explore how large language models use high-dimensional geometry to produce intelligent behavior. We peer into the mathematical wilderness inside transformers, revealing how intuition fails and meaning emerges.
- 04 Can Smaller Language Models Be Smarter? · 6 min
  Today we explore whether mechanistic interpretability could hold the key to building leaner, more transparent, and perhaps even smarter large language models. From knowledge distillation and pruning to low-rank adaptation, we examine cutting-edge strategies to make AI models both smaller and more explainable. Join Arshavir as he breaks down the surprising challenges of making models efficient without sacrificing understanding.
- 05 How Transformers Turn Words Into Meaning · 7 min
  Embark on a step-by-step journey through the inner workings of transformer models like those powering ChatGPT. Arshavir Blackwell breaks down how context, attention, and high-dimensional geometry turn isolated tokens into fluent, meaningful language, revealing the mathematics of understanding inside the black box.
- 06 Bridging Circuits and Concepts in Large Language Models · 15 min
  How do millions of computations inside large language models add up to something like understanding? This episode explores the latest breakthroughs in mechanistic interpretability, showing how tools like representational geometry, circuit decomposition, and compression theory illuminate the missing middle between circuits and meaning. Join Arshavir Blackwell as he opens the black box and challenges what we really mean by 'understanding' in machines.
- 07 The Mandela Effect in AI: Why Language Models Misremember · 11 min
  Dive into how and why large language models like ChatGPT mirror the human Mandela Effect, reproducing our collective false memories and misquotations. Arshavir Blackwell examines the science behind errors in models and minds, and explores how new techniques can counteract these uncanny AI confabulations.
- 08 How Transformers Stack Meaning Like Finnish Words · 13 min
  Explore how large language models build up meaning in ways strikingly similar to the layered grammar of Finnish. Arshavir Blackwell reveals why understanding Finnish morphology offers a powerful analogy for interpreting the compositional logic inside modern AI systems.
- 09 Hallucinations, Interpretability, and the Seahorse Mirage · 10 min
  This episode dives into why advanced language models still generate hallucinations, how interpretability tools help us uncover their hidden workings, and what the seahorse emoji teaches us about model and human reasoning. Arshavir connects groundbreaking research, practical business importance, and the statistical quirks that shape AI's version of 'truth.'
- 10 Inside Circuits: How Large Language Models Understand · 8 min
  Dive into the world of neural circuits within large language models. In this episode, Arshavir Blackwell unpacks how transformer circuits, attention mechanisms, and high-dimensional geometry combine to create the magic, and the limits, of modern AI language systems.
- 11 When Knowledge Battles Noise in GPT Models · 7 min
  Explore how GPT-2 balances fleeting factual recall with generic responses through internal competition among candidate answers. Discover parallels with human cognition and how larger models navigate indirect recall to reveal hidden knowledge beneath suppression.
- 12 Decoding Attention and Emergence in AI · 6 min
  Explore how attention heads uncover patterns through learned queries and keys, revealing emergent behaviors shaped by optimization. Dive into parallels with natural selection and psycholinguistics to understand how meaning arises not by design but through experience in both machines and brains.
- 13 Decoding GPT's Hidden Circuits · 11 min
  Explore how sparse autoencoders and transcoders unveil the inner workings of GPT-2 by revealing functional features and computational circuits. Discover breakthrough methods that shift from observing raw network activations to mapping the model's actual computation, making AI behavior more interpretable than ever.
- 14 Cracking the Code of AI Interpretation · 10 min
  Dive into how we naturally explain neural networks with folk interpretability and why these simple stories fall short. Discover the journey toward mechanistic understandability in AI and what that means for how we talk about and trust large language models.
- 15 Unlocking BERT's Hidden Grammar · 9 min
  Explore how BERT’s attention heads reveal an emergent understanding of language structure without explicit supervision. Discover the role of attention as a form of memory and what it means for the future of AI language models.
- 16 Inside a Fine-Tuned Language Model · 19 min
  A concise, single-segment episode of Inside the Black Box: Cracking AI and Deep Learning in which Arshavir Blackwell explains, in one continuous narrative, what neural networks are, how their simple units combine into powerful systems, and how learning by backpropagation sculpts their behavior. This short episode is designed as an elegant, one-paragraph-style monologue that introduces listeners to neural nets without equations or jargon.
- 17 Fine-Tuning LoRA: It's Not What You Think · 15 min
  When you fine-tune an AI model, what changes inside doesn't predict what changes outside. This week on Inside the Black Box, I break down why, and what it means for anyone auditing or regulating these systems.
Your instructor
Inside the Black Box: Cracking AI and Deep Learning
How do Large Language Models like ChatGPT work, anyway?
17 lessons · 2h 29m. Free, no signup.