THE ultimate LLM course by Andrej Karpathy, OpenAI founder

#ai, #business, #chatGPT, #development

22. 12. 2025
Blaž Pregelj

On February 2025 Andrej Karpathy, one of the most influential educators and engineers in modern AI, and a founding member of OpenAI, published a 3 and a half hours long video “Deep Dive into LLMs like ChatGPT” where he explains why it is essential viewing for anyone working in AI and introduces the man behind it.

nox nox nox nox nox

Here are some of the comments below the video:

nox nox nox nox nox

Karpathy’s three and a half hour lecture is a guided tour through the entire lifecycle of large language models (LLMs) — from data collection and tokenization to post‑training reinforcement.

nox nox nox nox nox

The video is, despite very technical background, very easy to understand and is providing mental models that help reason about what LLMs are and how they actually work.

nox nox nox nox nox

Because LLMs now underpin many AI products and research directions, understanding their training pipeline and limitations has become as essential as knowing how to code.

nox nox nox nox nox

Watching the full video helps AI professionals and enthusiasts with this general, but very fundamental understanding and highlights both the magic and the sharp edges of today’s language models.

nox nox nox nox nox

Before we jump into the video, lets first answer the question

nox nox nox nox nox

Who Is Andrej Karpathy?

nox nox nox nox

Andrej Karpathy is one of the most influential educators and engineers in modern AI.

nox nox nox nox nox

He was a founding member of OpenAI, later served as Director of AI at Tesla, and led the development of the Autopilot computer‑vision system.

nox nox nox nox nox

During his PhD at Stanford, he co‑designed and taught the university’s first deep‑learning course, CS231n, which has since become one of the most popular machine‑learning courses in the world. Time magazine describes him as a key figure whose online lectures have reached millions and notes that he recently launched Eureka Labs, an AI‑native education platform, to bring AI‑assisted teaching tools to students.

nox nox nox nox nox

Karpathy’s teaching style is strongly influenced by the physicist Richard Feynman; he prioritizes clarity and builds intuition from the ground up. He often publishes free lectures, code examples (e.g., micrograd, nanogpt), and open‑source tools that demystify complex AI concepts.

nox nox nox nox nox

This free and high quality contributions has made him an anchor of the AI community and earned him a reputation for making cutting‑edge research accessible.

nox nox nox nox nox

What the Video Covers?

nox nox nox nox

Karpathy structures the lecture around the full LLM pipeline, with careful explanations of both the underlying mathematics and the practical engineering trade‑offs.

nox nox nox nox nox

Below is an outline of key topics he covers;

nox nox nox nox nox

Introduction and Mental Models

nox nox nox nox

LLMs are, at their core, next‑token predictors trained to maximize likelihood over huge corpora. He emphasizes that while LLMs can feel magical, they are not sentient; they operate by predicting the most probable continuation of a sequence.

nox nox nox nox nox

He introduces the idea of prompting as programming: the user writes a specification in natural language, and the model generates a completion. Understanding prompts as instructions helps demystify why specific phrasing can dramatically change outputs.

nox nox nox nox nox

Pre‑training: Data, Tokens and Transformers

nox nox nox nox

The first stage involves ingesting vast amounts of internet text. Karpathy breaks down how text is broken into tokens and fed into a transformer architecture. He explains attention mechanisms intuitively and demonstrates how context length (e.g., 4k, 8k, 128k tokens) affects capability. This stage teaches the model general knowledge but also inherits biases and noise from the source data.

nox nox nox nox nox

Fine‑tuning and Safety

nox nox nox nox

Next, he discusses supervised fine‑tuning, where the model is trained on curated question‑answer pairs, code examples and chat logs. This step aligns the model with a desired persona (e.g., helpful assistant), reduces toxicity and improves factual accuracy. It also introduces safety guardrails and specialized capabilities, such as code generation.

nox nox nox nox nox

Reinforcement Learning from Human Feedback (RLHF)

nox nox nox nox

Karpathy provides an accessible overview of RLHF/RLAIF, where a reward model is trained to prefer better responses. The LLM then learns to maximize this reward via reinforcement learning, leading to more coherent reasoning and safer outputs. He notes that the process is iterative and expensive but necessary to push models beyond mere mimicry.

nox nox nox nox nox

Evaluation and Limitations

nox nox nox nox

Throughout the lecture, Karpathy is candid about limitations: hallucinations (confidently wrong statements), brittle reasoning on complex tasks, difficulties with long‑horizon planning and the inherent unpredictability of large autoregressive models. He discusses strategies like prompt engineering, self‑reflection and retrieval‑augmented generation (RAG) to mitigate these issues. He also touches on ethical concerns around misuse, bias and the environmental cost of training.

nox nox nox nox nox

Tools, Demos and Code

nox nox nox nox

True to his Feynman‑inspired approach, Karpathy intersperses the talk with live coding demos using his nanoGPT implementation, showing how to build a tiny GPT from scratch. He demonstrates tokenization, training loops, sampling and how scaling up data and parameters leads to emergent abilities. These demos provide practitioners with a concrete starting point for experimentation.

nox nox nox nox nox

Why Should Watch It, specially if you work in AI?

nox nox nox nox

The lecture translates dense research papers into intuitive analogies and code. Even seasoned researchers will find new ways to explain LLM concepts to colleagues and stakeholders.

nox nox nox nox nox

Many of people who work with AI, use models via APIs without fully understanding how they were created.

nox nox nox nox nox

Karpathy connects the dots between data, architecture, training and deployment, fostering a systems‑level view.

nox nox nox nox nox

The talk highlights sharp edges — hallucinations, prompt injection, misalignment — and offers practical debugging techniques. In the video Andrej explains recent advances, such as reinforcement‑learning improvements and scaling trends.

nox nox nox nox nox

It is rare for a well‑known AI researcher to share such a comprehensive overview for free.

nox nox nox nox nox

Watching the video also supports the ethos of open knowledge sharing.

nox nox nox nox nox

nox nox nox nox nox nox nox nox nox nox nox nox nox nox nox nox

Share this post if you liked it.

Subscribe & dont miss next 📩

Continue reading

Arcads.ai

Arcads.ai Review: How good AI actors really are in 2026?

Arcads.ai is an ai tool for creating video UGC ads with AI actors. But how good it really is in 2026?nox

Continue Reading →

Tiktokenizer

#ai

Tiktokenizer Tool: Understand your tokens before they cost you

Tiktokenizer helps developers and AI engineers see exactly how LLMs break text into tokens, track usage, and optimize prompts to save time and API costs.nox

Continue Reading →

Meta acquired Manus

#ai

Meta acquired Manus

Manus joining Meta signals a shift from AI demos to AI agents that actually execute work at scale.nox

Continue Reading →

Excalidraw

#business

Excalidraw simple whiteboard tool

Excalidraw is a simple browser-based whiteboard for sketching ideas fast. A clean space for thinking with shapes, arrows, and rough diagrams.nox

Continue Reading →

Riley Walz

#development

Creative Inspiration #3: Riley Walz

I came across Riley Walz walzr.com almost by accident. noxnoxnoxnoxnox

Continue Reading →