What Are AI Tokens?

Ever wonder why ChatGPT sometimes cuts off mid-sentence, or why using AI costs more when you send a long message? The answer comes down to one small but important idea: tokens.

Think of it like puzzle pieces

Imagine you have a sentence printed on a strip of paper. Now you take scissors and chop it into small pieces, not always at clean word boundaries, just wherever it is convenient to cut.

That is roughly what an AI model does before it reads your text. It chops everything into small chunks called tokens.

A token is a small piece of text, often a word, part of a word, or a punctuation mark. It is the unit an AI model actually works with.

[Figure: a sentence strip reading "ChatGPT is helpful"]

Here is a simple sentence and how it might get split:

Text                  Tokens
Hello, world!         Hello · , · world · !
ChatGPT is helpful    Chat · G · PT · is · helpful
unbelievable          un · believ · able

Notice that ChatGPT splits into three pieces. Long or unusual words often break apart. Short, common words usually stay whole.

Different AI models may split the same text in slightly different ways. The exact pieces are not the main point. The important idea is that the model reads chunks, not full sentences the way humans do.
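To make the chopping idea concrete, here is a minimal sketch of a toy tokenizer. It is not how any particular model's tokenizer works: production tokenizers typically use learned schemes such as byte-pair encoding and vocabularies with tens of thousands of entries. The tiny vocabulary and the `toy_tokenize` function are invented for illustration.

```python
# Toy vocabulary, invented for this example. Real vocabularies are learned
# from data and are far larger.
TOY_VOCAB = {"un", "believ", "able", "Hello", ",", " world", "!"}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text by always taking the longest vocabulary piece that fits.

    Characters that match nothing in the vocabulary become one-character
    tokens, so the function never gets stuck.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in vocab:
                tokens.append(piece)
                i = end
                break
        else:
            # No vocabulary entry matched: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens


print(toy_tokenize("unbelievable"))   # ['un', 'believ', 'able']
print(toy_tokenize("Hello, world!"))  # ['Hello', ',', ' world', '!']
```

Notice that the space travels with "world" in this sketch. Many real tokenizers also attach a leading space to the word that follows it, but that detail varies by model.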

The rough rule of thumb

In English, one token is about four characters, or roughly three-quarters of a word. So:

  • 100 words ≈ 130 tokens
  • 1,000 words ≈ 1,300 tokens

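Other languages can use more tokens per word. Most tokenizers are trained largely on English text, so words in other languages, especially those written in non-Latin scripts, tend to break into more pieces.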
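If you want to apply the rule of thumb yourself, it fits in a few lines. This is a rough sketch for English text only; the function names are made up for this example, and a model's real tokenizer will give different counts.

```python
# Rule-of-thumb constants from the article. They are approximations for
# English text, not properties of any specific model.
CHARS_PER_TOKEN = 4
WORDS_PER_TOKEN = 0.75


def estimate_tokens_from_words(word_count):
    """Estimate tokens from a word count (about 0.75 words per token)."""
    return round(word_count / WORDS_PER_TOKEN)


def estimate_tokens_from_text(text):
    """Estimate tokens from raw text (about 4 characters per token)."""
    return round(len(text) / CHARS_PER_TOKEN)


print(estimate_tokens_from_words(100))    # 133, or about 130
print(estimate_tokens_from_words(1000))   # 1333, or about 1,300
```

The two estimates will not always agree with each other, which is a good reminder that this is a budgeting shortcut and not an exact count.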

Why does this matter?

Tokens are not just a technical detail hidden in the background. They directly affect three things you care about:

Cost

Most AI APIs charge by the token. Longer messages and longer responses cost more. If you are building something with AI, trimming unnecessary words saves real money.
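Here is a small sketch of how that adds up. The prices below are hypothetical placeholders, and the split into separate input and output prices is an assumption about how a provider bills; check the actual pricing page before budgeting.

```python
# Hypothetical prices in US dollars per million tokens. These are NOT real
# prices for any provider; replace them with your provider's numbers.
INPUT_PRICE_PER_MILLION = 1.00
OUTPUT_PRICE_PER_MILLION = 4.00


def estimate_cost(input_tokens, output_tokens):
    """Return the estimated dollar cost of one request."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# One request: a 1,300-token prompt (about 1,000 words) and a 500-token answer.
per_request = estimate_cost(1_300, 500)
print(f"${per_request:.4f} per request")                   # $0.0033 per request
print(f"${per_request * 10_000:.2f} per 10,000 requests")  # $33.00 per 10,000 requests
```

A single request looks almost free, but the same request repeated thousands of times is where trimming a prompt starts to matter.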

Speed

The model has to work through the tokens you send, then generate its answer token by token. Fewer tokens usually means a faster response.

Context window

Every AI model has a context window, the maximum number of tokens it can hold in working memory at once. Think of it like a desk with limited space. If your conversation gets too long, older parts slide off the edge and the model forgets them.

That is why ChatGPT can “forget” something you said earlier in a very long chat. It is not a bug. The desk just ran out of room.

[Figure: Context desk, holding system instructions, your question, a document, and old chat. When new text takes up the desk, the oldest pieces can slide out of reach.]
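The desk idea can be sketched in code. The example below shows one simple strategy, which is to keep the system instructions and drop the oldest messages first. It is an assumption for illustration, not a description of how ChatGPT or any other product manages its context. The `fit_to_window` function is hypothetical, and token counts use the four-characters-per-token estimate instead of a real tokenizer.

```python
import math


def estimate_tokens(text):
    """Rule-of-thumb estimate: about four characters per token."""
    return math.ceil(len(text) / 4)


def fit_to_window(system, messages, max_tokens):
    """Keep the system instructions plus the newest messages that fit.

    `messages` is ordered oldest to newest. The result is also ordered
    oldest to newest, with the system instructions first.
    """
    budget = max_tokens - estimate_tokens(system)
    kept = []
    # Walk from newest to oldest so the oldest messages are dropped first.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if cost > budget:
            break  # This message and everything older slides off the desk.
        kept.append(message)
        budget -= cost
    return [system] + kept[::-1]


chat = ["old chat " * 10, "a pasted document " * 20, "your question?"]
window = fit_to_window("Be concise.", chat, max_tokens=100)
print(len(window))  # 3: system instructions, the document, and the question
```

With a 100-token desk, the pasted document takes most of the room, so the old chat no longer fits and is left out.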

A quick example

Take this message:

“Can you explain what machine learning is in simple terms?”

That is 10 words, which works out to roughly 13 tokens by the rule of thumb. Short and cheap.

Now imagine you paste in a ten-page document and ask a question about it. That could be several thousand tokens, and it eats into the context window fast.

What this means for you

You do not need to count tokens manually. But knowing they exist helps you:

  • Write cleaner prompts. Get to the point. Extra fluff adds tokens without adding value.
  • Understand context limits. If the AI seems to forget earlier parts of your chat, the context window is probably full.
  • Estimate costs. If you are building an AI-powered tool, token counts help you budget.

You now understand this

AI models read text in chunks called tokens, not whole words or individual letters. In English, one token is roughly four characters. Tokens affect how much a model costs to run, how fast it responds, and how much text it can remember at once.