What Is a Large Language Model?

You keep hearing the term “large language model,” often shortened to LLM. ChatGPT runs on one. So do Claude from Anthropic, Gemini from Google, and many other AI chatbots you encounter today. But what actually is an LLM?

The easiest way to separate the terms is this: ChatGPT, Claude, and Gemini are products you can talk to. An LLM is the model doing the language work underneath.

Think of it like a very experienced reader

Imagine someone who has read hundreds of millions of documents: novels, textbooks, articles, forum posts, code, recipes, legal contracts, transcripts, you name it.

After all that reading, they have developed an intuition for how language works. They can finish a sentence in a way that sounds natural. They can explain a concept, translate between styles, summarise an argument, or write in a specific tone.

They did not memorise everything word-for-word. They absorbed patterns. That is roughly what an LLM does.

What “large” means

The word “large” refers to the scale of two things:

The training data. LLMs learn from enormous amounts of text. GPT-4 is estimated to have trained on trillions of words.
The model itself. LLMs have billions of parameters, which are the internal numbers the model adjusts during learning. Having more parameters generally lets the model capture more complex patterns.

Large does not mean better in every way. Bigger models cost more to run and are slower. But at the scales used by modern LLMs, larger models tend to be more capable.
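To make “parameters” concrete, here is a small Python sketch that counts them for a made-up network of fully connected layers. The function name count_parameters and the layer sizes are assumptions chosen for illustration. Real LLMs use a more elaborate architecture, but the idea of adding up adjustable numbers is the same.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a stack of fully connected layers.

    layer_sizes is a hypothetical list such as [10, 20, 5]:
    10 inputs, a hidden layer of 20 units, and 5 outputs.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = n_in * n_out  # one number per connection between layers
        biases = n_out          # one extra number per unit in the next layer
        total += weights + biases
    return total


tiny_model = count_parameters([10, 20, 5])
print(tiny_model)  # 325
```

Even this toy network has 325 adjustable numbers. An LLM has the same kind of numbers, just billions of them.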

What “language model” means

A language model is a system trained to understand and generate text. Specifically, it learns to predict what comes next given what has come before.

During training, the model sees a sentence with the last word hidden, tries to predict it, checks how close it was, and updates itself to do better next time. Repeat that billions of times across trillions of words and you get a system that is surprisingly good at language.

How an LLM learns

1. See text with a word hidden: “The cat sat on the ___”
2. Predict the missing word
3. Check against the real word
4. Adjust and repeat, billions of times
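The four steps above can be sketched in a few lines of Python. This is a toy under heavy assumptions, not how a real LLM is built: the “model” is just a table of scores, it looks only at the single previous word, and the names scores, predict, and train are invented for this example.

```python
from collections import defaultdict

# Toy "model": for each previous word, a score for every candidate next word.
# These scores stand in for parameters (a real LLM has billions of them).
scores = defaultdict(lambda: defaultdict(float))


def predict(prev_word):
    """Step 2: guess the most likely next word given the previous one."""
    candidates = scores[prev_word]
    if not candidates:
        return None  # nothing learned yet for this context
    return max(candidates, key=candidates.get)


def train(sentences, passes=3):
    """Steps 1 to 4: hide a word, predict it, check it, adjust, repeat."""
    for _ in range(passes):
        for sentence in sentences:
            words = sentence.lower().split()
            # Step 1: each pair is (visible word, hidden next word).
            for prev_word, real_word in zip(words, words[1:]):
                guess = predict(prev_word)                # step 2
                if guess != real_word:                    # step 3
                    scores[prev_word][real_word] += 1.0   # step 4: favour the real word
                    if guess is not None:
                        scores[prev_word][guess] -= 0.5   # and discourage the wrong guess


corpus = [
    "the cat sat on the mat",
    "the dog sat on the mat",
    "the cat slept on the mat",
]
train(corpus)
print(predict("on"))   # the
print(predict("sat"))  # on
```

After a few passes over three short sentences, the table has picked up that “on” tends to be followed by “the”. An LLM does something similar in spirit, but with far more context, far more text, and a far richer way of adjusting itself.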

LLMs versus traditional software

Traditional software follows rules. A calculator does exactly what you program it to do. If you did not write a rule for something, the program cannot handle it.

LLMs work differently. They do not have explicit rules for every situation. Instead, they have patterns absorbed from training. That is why they can handle questions no one specifically programmed an answer for. It is also why they sometimes get things wrong in unexpected ways.
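Here is a rough Python sketch of that contrast. Everything in it is made up for illustration: the RULES table, the four labelled EXAMPLES, and the word-counting approach, which is far simpler than anything inside a real LLM. It only aims to show the difference between following rules and applying learned patterns.

```python
from collections import Counter

# Traditional software: explicit rules. Anything not listed is unhandled.
RULES = {"2+2": "4", "3+5": "8"}


def rule_based(question):
    return RULES.get(question, "error: no rule for this input")


# Pattern-based: learn which words go with which label from examples.
EXAMPLES = [
    ("i love this film", "positive"),
    ("what a great day", "positive"),
    ("i hate this weather", "negative"),
    ("what a terrible film", "negative"),
]


def learn(examples):
    counts = {"positive": Counter(), "negative": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts


def pattern_based(counts, text):
    # Score each label by how often it has seen the words in this text.
    words = text.split()
    totals = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(totals, key=totals.get)


counts = learn(EXAMPLES)

print(rule_based("7+1"))                                    # error: no rule for this input
print(pattern_based(counts, "i love this great weather"))   # positive
print(pattern_based(counts, "not a great film"))            # positive (wrong)
```

The rule-based function gives up on anything it was not told about. The pattern-based one handles a sentence it has never seen, but it also calls “not a great film” positive, because it learned word associations and nothing about negation.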

What LLMs are good at

Following instructions in natural language
Explaining, summarising, and rewriting text
Writing code and spotting bugs
Translating between languages
Answering questions on topics covered in their training data

What LLMs struggle with

Questions that depend on up-to-date or real-time information
Precise arithmetic and counting
Reliable reasoning through complex chains of logic
Knowing when they do not know something