Intermediate · 5 min · 120 XP

Tokens & Context

Learn the units an AI actually reads and why it can only hold so much in mind at once.

Tokens, not words

An AI model does not read whole words or single letters. It reads tokens, which are small chunks of text. A token is often a short word or a piece of a longer word, so "cat" might be one token while "unbelievable" could be three or four. As a rough rule of thumb, 100 words of English come to about 130 tokens, though the exact count depends on the model's tokenizer and on the text itself. Because tokens are the unit a model actually processes, model limits and prices are quoted in tokens rather than words.
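To make the rule of thumb concrete, here is a minimal sketch that estimates a token count from a word count. It is only a heuristic: the function name estimate_tokens and the constant are made up for this lesson, and a real tokenizer will give different numbers for the same text.

```python
# Assumption: about 130 tokens per 100 English words (the rule of thumb above).
# This is NOT a real tokenizer; it only counts whitespace-separated words.
TOKENS_PER_100_WORDS = 130


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for English text, rounded up."""
    word_count = len(text.split())
    # Integer math avoids floating-point rounding surprises.
    return (word_count * TOKENS_PER_100_WORDS + 99) // 100


if __name__ == "__main__":
    sample = "word " * 100  # a stand-in for a 100-word passage
    print(estimate_tokens(sample))
```

An estimate like this is good enough for planning. When the exact count matters, use the tokenizer that belongs to the model you are working with.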

The context window

The context window is the amount of text a model can pay attention to at one time, measured in tokens. Think of it as the model's short-term memory or its desk space. Everything counts toward it: your question, the documents you paste, the conversation so far, and the model's own reply all share the same space. If the total goes past the window, something has to be left out.
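The shared-space idea can be sketched as simple arithmetic: add up everything you plan to send and compare the total with the window. The names below (fits_in_window, reserved_for_answer) and the word-based estimate are assumptions for illustration, not part of any real model's API.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 130 tokens per 100 words (a heuristic, not a tokenizer)."""
    return (len(text.split()) * 130 + 99) // 100


def fits_in_window(
    question: str,
    documents: list[str],
    history: list[str],
    window_tokens: int,
    reserved_for_answer: int = 0,
) -> tuple[bool, int]:
    """Return (fits, tokens_used) for everything that shares the window."""
    parts = [question, *documents, *history]
    used = sum(estimate_tokens(part) for part in parts)
    # Hypothetical knob: hold back some room for the model's reply.
    return used + reserved_for_answer <= window_tokens, used


if __name__ == "__main__":
    report = "word " * 100  # stand-in for a pasted document
    print(fits_in_window("Summarize this report", [report], [], window_tokens=200))
    print(fits_in_window("Summarize this report", [report], [], window_tokens=100))
```

With the 200-token window the request fits; with the 100-token window it does not, so something would have to be left out.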

What happens when it fills up

When a chat runs long and the context window fills, the oldest parts of the conversation can fall out of view. That is often why an assistant seems to forget something you said at the very start of a long session. The model is not being careless; the early text simply slid off the edge of its desk. Restating the key facts you still need is a quick fix.
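One simple way this "oldest falls out" behavior could work is sketched below: keep the newest messages that fit and drop the rest. This is an assumption for illustration; real chat products may handle a full window differently, for example by summarizing older turns. The name trim_history is hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 130 tokens per 100 words (a heuristic, not a tokenizer)."""
    return (len(text.split()) * 130 + 99) // 100


def trim_history(messages: list[str], window_tokens: int) -> list[str]:
    """Keep the most recent messages whose estimated total fits in the window."""
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so recent messages get space first.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > window_tokens:
            break  # this message and everything older falls out of view
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    chat = [
        "my name is Ada",
        "tell me a story " * 10,
        "what is my name",
    ]
    print(trim_history(chat, window_tokens=60))
```

In this run the first message, the one containing the name, is the one that gets dropped. That is why restating a key fact later in the chat brings it back into view.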

Why it matters in practice

Understanding tokens helps you plan what to send a model. If you try to feed a model a giant report that exceeds its window, you will need to trim it or split it into chunks that each fit. Keeping prompts focused leaves more room for the model's answer and can lower cost. So tokens are not just trivia; they shape what you can fit in and how much you pay.
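Splitting a long report into chunks can be sketched in a few lines. This version cuts on word boundaries using the same rough estimate of 130 tokens per 100 words. The name chunk_text is hypothetical, and a real workflow would more likely split on paragraphs or sections so each chunk still makes sense on its own.

```python
def chunk_text(text: str, max_tokens: int) -> list[str]:
    """Split text into pieces whose estimated token count stays within max_tokens."""
    # Convert the token budget into a word budget (assumes ~130 tokens per 100 words).
    max_words = max(1, max_tokens * 100 // 130)
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    report = "word " * 250  # stand-in for a long report
    chunks = chunk_text(report, max_tokens=130)
    print(len(chunks), [len(chunk.split()) for chunk in chunks])
```

Each chunk can then be sent to the model on its own, leaving room in the window for your instructions and the model's answer.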

Key takeaways

Models read tokens (small word-pieces), not whole words or letters.
The context window is a token-measured limit on what the model can hold at once.
Big inputs must be trimmed or chunked, and focused prompts save room and cost.
