
What Is a Context Window?

Have you ever had a long conversation with an AI and noticed it seemed to forget something you mentioned much earlier? Or seen an error like “conversation is too long”? That is the context window at work.

Think of it like desk space

Imagine you are working at a desk. You can only fit so many papers on it at once. As new documents arrive, you have to slide older ones off the edge to make room. Once something slides off, it is gone for now. You cannot refer to it anymore.

The context window is the AI’s desk. It is the total amount of text the model can hold in working memory at one time: your messages, the AI’s replies, any instructions from the app, and any documents you paste in.

Anything that does not fit on the desk does not get used.

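Figure: Your context window, from oldest to newest: system instructions, your first message, AI reply, your second message, AI reply, pasted document. Older messages slide off first.

When the window fills up, the oldest content is typically dropped, although some apps stop and show an error instead.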
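If you like to see ideas in code, the desk analogy can be sketched in a few lines of Python. This is only an illustration, not how any particular app works. The function names, the message format, and the word-based token estimate (roughly 4 tokens for every 3 words) are all assumptions made for this example. It also assumes the app always keeps the system instructions and drops the oldest messages first.

```python
import math

def estimate_tokens(text):
    """Very rough estimate (assumption): about 4 tokens for every 3 words."""
    return math.ceil(len(text.split()) * 4 / 3)

def trim_to_window(system, messages, max_tokens):
    """Keep the system instructions, then as many of the newest messages as fit."""
    budget = max_tokens - estimate_tokens(system)
    kept = []
    # Walk from newest to oldest so the oldest messages are the ones dropped.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()  # restore oldest-to-newest order
    return kept

messages = [
    "My name is Sam and I live in Leeds.",
    "Nice to meet you, Sam.",
    "Can you suggest a weekend trip?",
]
window = trim_to_window("You are a helpful assistant.", messages, max_tokens=25)
print(window)
# ['Nice to meet you, Sam.', 'Can you suggest a weekend trip?']
```

With this tiny 25-token desk, the first message (the one containing Sam's name and city) no longer fits, so the model would not see it when answering the question about a weekend trip.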

How it is measured

Context windows are measured in tokens, not words or characters. You learned about tokens in the previous lesson. As a rough guide for English text, 1,000 tokens is about 750 words.
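The rule of thumb can be turned into a quick calculation. The sketch below assumes the ratio of 750 words per 1,000 tokens holds exactly, which it does not: real counts come from each model's own tokenizer and vary with language and content. The function names are made up for this example.

```python
import math

WORDS_PER_1000_TOKENS = 750  # rule of thumb, not an exact figure

def words_to_tokens(word_count):
    """Estimate how many tokens a text of this many words uses (rounded up)."""
    return math.ceil(word_count * 1000 / WORDS_PER_1000_TOKENS)

def tokens_to_words(token_count):
    """Estimate how many words fit in this many tokens (rounded down)."""
    return token_count * WORDS_PER_1000_TOKENS // 1000

print(words_to_tokens(3000))   # 4000
print(tokens_to_words(16000))  # 12000
```

So a 3,000-word essay would use roughly 4,000 tokens under this estimate.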

Models vary a lot in their context window size:

Model       Approximate context window
GPT-3.5     16,000 tokens (about 12,000 words)
GPT-4o      128,000 tokens (about 96,000 words)
Claude 3    200,000 tokens (about 150,000 words)

Larger context windows let you paste in longer documents, have longer conversations, and give the model more background to work with.
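To make the comparison concrete, here is a small sketch that checks which of the models listed above could hold a text of a given length. It uses the approximate window sizes from the table and the rough rule of 1,000 tokens per 750 words. The dictionary and function names are assumptions for illustration, and the check ignores the space needed for system instructions and the model's reply.

```python
import math

# Approximate window sizes from the table above, in tokens.
CONTEXT_WINDOWS = {
    "GPT-3.5": 16_000,
    "GPT-4o": 128_000,
    "Claude 3": 200_000,
}

def models_that_fit(word_count):
    """Return the models whose window could hold a text of this many words."""
    estimated_tokens = math.ceil(word_count * 1000 / 750)  # rough estimate
    return [
        model
        for model, window in CONTEXT_WINDOWS.items()
        if estimated_tokens <= window
    ]

print(models_that_fit(50_000))   # ['GPT-4o', 'Claude 3']
print(models_that_fit(120_000))  # ['Claude 3']
```

A 50,000-word report (about 66,700 tokens) is too large for the smallest window but fits comfortably in the other two.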

Why it matters for you

Pasting long documents. If you paste a whole book into a chat, you may hit the limit. The app will then either reject the input or cut it down, and the model never sees the parts that do not fit.

Long conversations. In a very long chat session, the model may seem to “forget” something you said early on. It has not forgotten. That message simply slid off the desk.

System instructions. Apps built on top of AI often include hidden instructions at the start of the conversation. These take up context space before you even type your first word.
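The points above can be combined into a simple budget: the window size, minus the hidden instructions, minus some room for the reply, is what remains for your own text. The sketch below works that out and then splits a long document into pieces that fit. The numbers used (500 tokens of system instructions, 1,000 tokens reserved for the reply) are hypothetical, as are the function names, and the tokens-to-words conversion is the same rough rule of 750 words per 1,000 tokens.

```python
def words_available(window_tokens, system_tokens, reply_tokens):
    """Estimate how many words of your own text still fit in the window."""
    remaining = window_tokens - system_tokens - reply_tokens
    return max(remaining, 0) * 750 // 1000  # rough tokens-to-words rule

def split_into_chunks(text, max_words):
    """Split text into consecutive pieces of at most max_words words each."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]

# Hypothetical numbers: a 16,000-token window, 500 tokens of hidden
# instructions, and 1,000 tokens kept free for the model's reply.
room = words_available(16_000, system_tokens=500, reply_tokens=1_000)
print(room)  # 10875

document = "word " * 25_000  # stand-in for a 25,000-word document
chunks = split_into_chunks(document, room)
print(len(chunks))  # 3
```

Splitting on word counts alone can cut a sentence in half, so in practice you would usually break at paragraph or section boundaries instead.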

What you can do about it

  • Keep prompts concise. Extra words eat into your window.
  • Start a new conversation when switching topics. You get a fresh desk.
  • If the model seems confused about earlier context, summarise the key points in your next message.
  • When pasting documents, paste only the relevant section rather than the whole thing.

You now understand this

The context window is the total amount of text an AI model can work with at once, measured in tokens. Think of it as desk space. When it fills up, older content slides off and the model can no longer reference it. Knowing this helps you understand why AI sometimes “forgets” things and how to work around that limit.