Module 13 — What an LLM Actually Is (No Magic)

🎯 Goal: Understand what a Large Language Model (LLM) really is, and just as importantly what it can't do, so you have good instincts for what's possible.

⏱️ Time: ~25 min of reading. No hands-on.

This is the "look under the hood" module. You don't need it to use Claude Code, but understanding roughly what's happening makes you far better at it — and far less likely to be surprised or fooled.

The big secret: it's fancy autocomplete

You know how your phone suggests the next word as you type? "See you" → "soon"? An LLM is that idea, scaled up almost unimaginably.

At its core, an LLM does one thing: given some text, it predicts the next little chunk of text (called a token — roughly a word or part of a word). Then it adds that chunk to the text and predicts the next one. And again. And again — one piece at a time — until it decides the answer is complete.

That's it. There's no tiny person inside thinking. It's a gigantic mathematical function, tuned on an enormous amount of human-written text, that has become extraordinarily good at guessing what plausibly comes next.

Analogy: Imagine the world's most well-read improv actor. They've read nearly every book, article, and conversation ever written, and they're brilliant at continuing any piece of text in a believable way. Ask a question and they don't "look up" the answer — they continue the text in the way a knowledgeable person would. Usually that's right. Sometimes it just sounds right.

An LLM predicts the next token — often just part of a word — then adds it and predicts again. The whole answer is built one chunk at a time. That's autocomplete at massive scale.
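To make the predict-append-repeat loop concrete, here is a minimal sketch in Python. It is not how a real LLM works inside: it is a toy that only counts which word follows which in a tiny made-up text, and it treats whole words as tokens. The names (`TRAINING_TEXT`, `train`, `predict_next`, `complete`) are invented for this illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up training text; a real model is trained on vastly more.
TRAINING_TEXT = "see you soon . see you later . see you soon . talk to you soon ."

def train(text):
    """Count which word follows which (a toy stand-in for training)."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

def complete(follows, prompt, max_tokens=10):
    """Predict a word, append it, and repeat until a stopping point."""
    words = prompt.split()
    for _ in range(max_tokens):
        nxt = predict_next(follows, words[-1])
        if nxt is None or nxt == ".":  # "." plays the role of "answer complete"
            break
        words.append(nxt)
    return " ".join(words)

model = train(TRAINING_TEXT)
print(complete(model, "see you"))  # prints: see you soon
```

The toy picks "soon" after "you" simply because that pairing appears most often in its training text. A real model looks at the whole preceding text rather than just the last word, and it weighs far subtler patterns, but the outer loop has the same shape.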

Why "just autocomplete" can still feel so smart

If it's only predicting the next word, why does it write essays, debug code, and explain things clearly? Because to predict the next word really well across billions of examples, it had to absorb deep patterns: grammar, facts, reasoning styles, the structure of a good explanation, how code works. Genuine capability emerges from doing autocomplete at massive scale.

So it's not "just" autocomplete in a dismissive sense — but the autocomplete nature explains its quirks, which is what you actually need to know.

The most important limitation: it can be confidently wrong

Because the model produces plausible-sounding text, it can generate something that sounds completely authoritative but is simply made up. This is often called a hallucination. The improv actor never breaks character to say "I don't actually know" — they'll smoothly continue with something that sounds right.

This is the single most important thing to internalize. It's why this whole course drills "verify the output":

It might invent a fact, a date, or a name.
It might write code that looks perfect but has a subtle bug.
It might confidently misread one of your documents.

None of this means it's bad; it's incredibly useful. It means you stay the checker. Trust it to do the work, but verify the results that matter.
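One way to see why "plausible" and "true" can come apart is a toy sketch like the one below. It assumes a made-up set of snippets in which a common misconception appears more often than the correct fact, and a hypothetical helper, `most_likely_ending`, that simply returns the most frequent continuation. Real hallucinations have more causes than this, so treat it as an intuition aid only.

```python
from collections import Counter

# Made-up snippets standing in for training text. The wrong answer
# appears more often than the right one, as popular misconceptions can.
SNIPPETS = [
    "the capital of australia is sydney",
    "the capital of australia is sydney",
    "the capital of australia is canberra",
]

def most_likely_ending(snippets, prompt):
    """Pick the most frequent word seen right after `prompt`."""
    endings = Counter()
    for snippet in snippets:
        if snippet.startswith(prompt + " "):
            endings[snippet[len(prompt) + 1:].split()[0]] += 1
    return endings.most_common(1)[0][0] if endings else None

answer = most_likely_ending(SNIPPETS, "the capital of australia is")
print(answer)                # prints: sydney (most frequent, not verified)
print(answer == "canberra")  # prints: False (the real capital is Canberra)
```

Nothing in the function checks the answer against reality. It reports what was most common in the text it saw, with the same confidence either way, which is why the checking falls to you.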

It's frozen in time (and doesn't know your world)

An LLM is trained once — fed enormous amounts of text — and then frozen. After that, it doesn't learn, remember you between conversations, or know anything that happened after its training ended (its "knowledge cutoff"). On its own, it also can't see your files, check today's date, or look anything up.

Analogy: Picture a brilliant expert who's been sealed in a library room since the day their training ended. They know a vast amount up to that date — but nothing since, nothing about your specific documents, and they can't leave the room to check. To help them, you have to hand things through the door.

On its own the model is sealed in: no memory of you, no live facts, no reach into your files. Tools are the door it hands things through — the subject of the next module.

That "handing things through the door" is what tools do. They are the reason Claude Code can actually read your PDFs and run programs, even though the model itself is a frozen brain in a room.
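The sealed-room idea can be sketched in a few lines of Python. This is a hypothetical stand-in, not Claude or any real API: `frozen_model` and `FROZEN_KNOWLEDGE` are invented names, and the "key: value" lines are just one simple way to represent information handed in with a question.

```python
# Hypothetical stand-in for a frozen model. Its "knowledge" was fixed at
# training time, and each call sees only the text passed in.
FROZEN_KNOWLEDGE = {"capital of france": "Paris"}
NO_ANSWER = "(no way to know; a real model might guess plausibly instead)"

def frozen_model(prompt: str) -> str:
    """Answer from the prompt text or fixed knowledge, and nothing else."""
    lines = prompt.strip().splitlines()
    if not lines:
        return NO_ANSWER
    question = lines[-1].lower().rstrip("?")
    # 1. Anything handed "through the door": earlier "key: value" lines.
    for line in lines[:-1]:
        key, sep, value = line.partition(":")
        if sep and key.strip() and key.strip().lower() in question:
            return value.strip()
    # 2. Fixed training-time knowledge.
    for topic, fact in FROZEN_KNOWLEDGE.items():
        if topic in question:
            return fact
    return NO_ANSWER

print(frozen_model("What is the capital of France?"))    # prints: Paris
print(frozen_model("What is my invoice total?"))         # prints the no-answer text
print(frozen_model("invoice total: 1,250 USD\nWhat is my invoice total?"))  # prints: 1,250 USD
print(frozen_model("What is my invoice total?"))         # no-answer again: nothing was remembered
```

The last call is the point: the function keeps no state between calls, so the invoice total has to be handed in again every time it is needed.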

Not one model — many models, many makers

"LLM" is a category, like "car." Several companies build them, each with their own family of models:

Maker	Their models (you'll hear these names)
Anthropic (makes Claude Code)	Claude
OpenAI	GPT / ChatGPT
Google	Gemini
DeepSeek, Meta (Llama), and others	various, some freely available

They differ in skill, speed, cost, and personality, and they all improve constantly. Claude Code, naturally, runs on Anthropic's Claude models.

Some models can see, not just read (multimodal)

Early models only handled text. Many modern ones are multimodal: they can also take in images. That's why you can paste a screenshot of an error or a photo of a document into Claude and ask "what's going on here?" It still replies in text, but it can take pictures as input. (Some models can handle audio or other formats too.)

Fast-and-cheap vs. slow-and-thinking

Within one family there are usually different sizes, and you trade off speed/cost against raw capability. Claude's line-up is a good example:

Tier	Like…	Good for
Small / fast (e.g. Haiku)	a quick, sharp colleague firing off an answer	simple, high-volume, speed-sensitive tasks
Balanced (e.g. Sonnet)	a strong all-rounder	most everyday work
Most capable (e.g. Opus, and Anthropic's flagship models)	an expert who'll go away and really think it through	the hardest reasoning and long, complex jobs

Newer models can also do extended thinking — instead of answering instantly, they spend extra effort reasoning step-by-step before responding. Better answers on hard problems, but slower and more expensive. Cheap-and-fast for the easy stuff; powerful-and-thinking for the hard stuff. Claude Code generally picks a strong model for you, so you don't have to manage this — but now you know what people mean by it.

✅ Takeaways

An LLM predicts text one chunk at a time — fancy autocomplete at massive scale.
That scale produces real capability, but also the risk of confident, plausible-sounding errors, so you always verify what matters.
The model is frozen and isolated: no memory between chats, no live knowledge, no direct access to your files — until you give it tools.
Many makers (Anthropic/Claude, OpenAI/GPT, Google/Gemini, …); models vary by speed, cost, and power; many are now multimodal (can see images).

Next: Module 14 — Giving the Brain Hands, where the frozen brain gets tools, and you learn how to direct it well.