When you chat with ChatGPT or Claude, the model works within a context window: a fixed-size buffer that holds your conversation plus the system instructions.
## What Are Tokens?
Context is measured in tokens, not words. In English, one token is roughly 4 characters, or about ¾ of a word. Paste text into OpenAI's Tokenizer to see exactly how it splits.
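The 4-characters-per-token rule of thumb can be sketched as a quick estimator. This is only a heuristic; exact counts depend on the model's tokenizer (e.g. OpenAI's tiktoken library), and the function name here is illustrative:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters-per-token rule of thumb."""
    return max(1, round(len(text) / 4))

# A 400-character string is roughly 100 tokens.
print(estimate_tokens("a" * 400))  # → 100
```

Use an estimate like this for budgeting; switch to the real tokenizer before trusting counts near the limit.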
## Context Limits by Model
| Model | Context window | Approx. pages |
|---|---|---|
| GPT-3.5 | 16K tokens | ~25 |
| GPT-4 Turbo | 128K tokens | ~200 |
| Claude 3.5 | 200K tokens | ~300 |
| Gemini 1.5 | 1M tokens | ~1,500 |
## Why It Matters
**Forgetting.** When a conversation exceeds the window, the oldest messages are dropped, so the model genuinely cannot see them. Research also suggests models attend less to content in the middle of long contexts, so even material inside the window can be effectively overlooked.
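Dropping the oldest messages is typically done with a sliding window over the chat history. A minimal sketch, assuming messages are role/content dicts and you supply your own token counter (all names here are illustrative):

```python
def trim_to_window(messages, max_tokens, count_tokens):
    """Evict the oldest non-system messages until the history fits the budget.

    messages: list of {"role": ..., "content": ...} dicts, oldest first.
    count_tokens: callable returning a token count for one message's content.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and sum(count_tokens(m["content"]) for m in system + rest) > max_tokens:
        rest.pop(0)  # drop the oldest turn first; keep system instructions
    return system + rest
```

Keeping system messages pinned mirrors what chat products do: instructions must survive even when early turns are forgotten.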
## Strategies
- **Summarize periodically.** Ask the model to compress a long discussion into a short summary, then continue from that summary.
- **Front-load key info.** Put the most important context at the start of the prompt.
- **Use RAG.** With retrieval-augmented generation, retrieve only the relevant chunks instead of pasting everything into the window.
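The RAG strategy above hinges on a retrieval step: score each stored chunk against the query and keep only the top few. A minimal sketch using word overlap as a stand-in for real embedding similarity (production systems use vector embeddings; these function names are illustrative):

```python
from collections import Counter

def score(query: str, chunk: str) -> int:
    """Word-overlap score: a toy stand-in for embedding similarity."""
    q, c = Counter(query.lower().split()), Counter(chunk.lower().split())
    return sum((q & c).values())

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most relevant to the query."""
    return sorted(chunks, key=lambda ch: score(query, ch), reverse=True)[:k]

docs = [
    "Tokens are the units models read; one token is about four characters.",
    "The context window is a fixed budget of tokens per request.",
    "Bananas are rich in potassium.",
]
print(retrieve("how big is the context window in tokens", docs, k=2))
```

Only the retrieved chunks are pasted into the prompt, so the context budget is spent on relevant material instead of the whole corpus.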