Ever wonder why ChatGPT sometimes cuts off mid-sentence, or why using AI costs more when you send a long message? The answer comes down to one small but important idea: tokens.
Think of it like puzzle pieces
Imagine you have a sentence printed on a strip of paper. Now you take scissors and chop it into small pieces, not always at clean word boundaries, just wherever it is convenient to cut.
That is roughly what an AI model does before it reads your text. It chops everything into small chunks called tokens.
A token is a small piece of text, often a word, part of a word, or a punctuation mark. It is the unit an AI model actually works with.
Here are a few short examples and how they might get split:
| Text | Tokens |
|---|---|
| Hello, world! | Hello · , · world · ! |
| ChatGPT is helpful | Chat · G · PT · is · helpful |
| unbelievable | un · believ · able |
Notice that ChatGPT splits into three pieces. Long or unusual words often break apart. Short, common words usually stay whole.
Different AI models may split the same text in slightly different ways. The exact pieces are not the main point. The important idea is that the model reads chunks, not full sentences the way humans do.
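To make the chopping concrete, here is a minimal sketch of a toy tokenizer in Python. It is not how any real model does it: real tokenizers learn a vocabulary of tens of thousands of pieces from data, while `TOY_VOCAB` and `toy_tokenize` below are made up for this illustration and simply pick the longest known piece at each position.

```python
# Toy vocabulary, invented for this illustration only.
TOY_VOCAB = {"Hello", "world", "Chat", "G", "PT", "is", "helpful",
             "un", "believ", "able", ",", "!"}


def toy_tokenize(text: str, vocab: set[str] = TOY_VOCAB) -> list[str]:
    """Split text by repeatedly taking the longest vocabulary entry that matches."""
    tokens = []
    longest = max(len(entry) for entry in vocab)
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1  # this toy version simply drops whitespace
            continue
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(longest, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            # Nothing matched: emit the single character as its own token.
            tokens.append(text[i])
            i += 1
    return tokens


for sample in ["Hello, world!", "ChatGPT is helpful", "unbelievable"]:
    print(sample, "->", " · ".join(toy_tokenize(sample)))
```

With this made-up vocabulary the three samples split the same way as in the table. Change the vocabulary and the splits change too, which is why different models chop the same text differently.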
The rough rule of thumb
In English, one token is about four characters, or roughly three-quarters of a word. So:
- 100 words ≈ 130 tokens
- 1,000 words ≈ 1,300 tokens
Other languages often use more tokens per word, because tokenizers are typically built mostly from English text and so tend to break words in other languages into more, smaller pieces.
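The rule of thumb is easy to turn into a quick estimator. The sketch below uses only the two ratios from the text; the function names are made up for this example, and the results are rough guesses for English, not real token counts.

```python
CHARS_PER_TOKEN = 4      # rule of thumb: about four characters per token
WORDS_PER_TOKEN = 0.75   # rule of thumb: about three-quarters of a word per token


def estimate_tokens_from_chars(text: str) -> int:
    """Rough token estimate based on the character count."""
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(word_count: int) -> int:
    """Rough token estimate based on a word count."""
    return round(word_count / WORDS_PER_TOKEN)


print(estimate_tokens_from_words(100))    # 133
print(estimate_tokens_from_words(1000))   # 1333
```

Dividing by 0.75 gives 133 and 1,333, which the list above rounds to 130 and 1,300. For an exact count you would need the tokenizer of the specific model you are using.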
Why does this matter?
Tokens are not just a technical detail hidden in the background. They directly affect three things you care about:
Cost
AI APIs charge by the token. Longer messages and longer responses cost more. If you are building something with AI, trimming unnecessary words saves real money.
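Here is a minimal sketch of how that adds up. The prices are hypothetical placeholders, not any provider's real rates, and the split into separate input and output prices is an assumption for illustration. Check your provider's pricing page for actual numbers.

```python
# Hypothetical prices in US dollars per one million tokens (not real rates).
PRICE_PER_MILLION_INPUT = 1.00
PRICE_PER_MILLION_OUTPUT = 2.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = (input_tokens * PRICE_PER_MILLION_INPUT
             + output_tokens * PRICE_PER_MILLION_OUTPUT)
    return total / 1_000_000


# A 1,000-word prompt (about 1,300 tokens) with a 500-word answer (about 650 tokens).
single_request = estimate_cost(1_300, 650)
print(f"One request: ${single_request:.4f}")
print(f"100,000 requests: ${single_request * 100_000:.2f}")
```

A single request costs a fraction of a cent at these made-up rates, but the total grows quickly once a tool handles many requests, which is where trimming prompts pays off.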
Speed
The model has to work through the tokens you send, then generate its answer token by token. Fewer tokens usually means a faster response.
Context window
Every AI model has a context window, the maximum number of tokens it can hold in working memory at once. Think of it like a desk with limited space. If your conversation gets too long, older parts slide off the edge and the model forgets them.
That is why ChatGPT can “forget” something you said earlier in a very long chat. It is not a bug. The desk just ran out of room.
When new text takes up the desk, the oldest pieces can slide out of reach.
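The desk idea can be sketched in a few lines of Python. This is one simple strategy, assumed here for illustration: keep the newest messages that fit and drop everything older. Real chat apps may handle overflow differently, and `rough_token_count` and `fit_to_window` are made-up names that rely on the four-characters-per-token rule of thumb rather than a real tokenizer.

```python
def rough_token_count(text: str) -> int:
    """Estimate tokens with the four-characters-per-token rule of thumb."""
    return max(1, len(text) // 4)


def fit_to_window(messages: list[str], window_tokens: int) -> list[str]:
    """Keep the most recent messages whose estimated total fits the window."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = rough_token_count(message)
        if used + cost > window_tokens:
            break  # this message and everything older slides off the desk
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


chat = ["a" * 40, "b" * 40, "c" * 40]   # three messages, about 10 tokens each
print(len(fit_to_window(chat, 25)))     # 2
```

With room for 25 tokens, only the two newest messages stay on the desk. The oldest one is gone, so the model never sees it.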
A quick example
Take this message:
“Can you explain what machine learning is in simple terms?”
That is 10 words, which works out to roughly 13 tokens by the rule of thumb. Short and cheap.
Now imagine you paste in a ten-page document and ask a question about it. That could be several thousand tokens, and it eats into the context window fast.
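As a rough worked version of that ten-page case, the sketch below assumes 500 words per page and an 8,000-token context window. Both numbers are made up for illustration, since page lengths and window sizes vary widely. The tokens-per-word ratio comes from the rule of thumb above.

```python
WORDS_PER_PAGE = 500            # assumption for illustration
TOKENS_PER_WORD = 1.3           # rule of thumb: 1,000 words is about 1,300 tokens
CONTEXT_WINDOW_TOKENS = 8_000   # hypothetical window size


def document_tokens(pages: int) -> int:
    """Rough token estimate for a document of the given page count."""
    return round(pages * WORDS_PER_PAGE * TOKENS_PER_WORD)


def window_share(tokens: int) -> float:
    """Fraction of the hypothetical context window that the tokens occupy."""
    return tokens / CONTEXT_WINDOW_TOKENS


tokens = document_tokens(10)
print(tokens)                          # 6500
print(f"{window_share(tokens):.0%}")   # 81%
```

Under these assumptions the document alone fills about four-fifths of the window before you have even asked your question.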
What this means for you
You do not need to count tokens manually. But knowing they exist helps you:
- Write cleaner prompts. Get to the point. Extra fluff adds tokens without adding value.
- Understand context limits. If the AI seems to forget earlier parts of your chat, the context window is probably full.
- Estimate costs. If you are building an AI-powered tool, token counts help you budget.
AI models read text in chunks called tokens, not whole words or individual letters. One token is roughly four characters. Tokens affect how much a model costs to run, how fast it responds, and how much text it can remember at once.