What is Context Window in AI? Understanding Token Limits

A context window is the maximum amount of text a large language model can process at once—the AI's working memory.

📊 Context Window Evolution

- 2K → 200K: token growth across recent model generations
- ~4 characters per token
- ~750 words per 1,000 tokens
- 1M+ tokens: Gemini 1.5

When chatting with ChatGPT or Claude, the AI works within a context window—a fixed buffer holding your conversation plus system instructions.

What Are Tokens?

Context is measured in tokens, not words. One token ≈ 4 characters or ¾ word in English. Use OpenAI's Tokenizer to see how text splits.
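For back-of-the-envelope planning, the ~4-characters-per-token heuristic is enough. A minimal sketch (the function name is our own; for exact counts you would use a model-specific tokenizer such as OpenAI's tiktoken):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters/token heuristic.

    This is an approximation for English text; exact counts require
    the model's actual tokenizer.
    """
    return max(1, len(text) // 4)

# 400 characters ≈ 100 tokens under this heuristic
print(estimate_tokens("abcd" * 100))  # → 100
```

This also gives the word conversion in reverse: 1,000 tokens × 4 chars ≈ 4,000 characters ≈ 750 English words.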

Context Limits by Model

| Model | Context | ~Pages |
|---|---|---|
| GPT-3.5 | 16K tokens | ~25 |
| GPT-4 Turbo | 128K tokens | ~200 |
| Claude 3.5 | 200K tokens | ~300 |
| Gemini 1.5 | 1M tokens | ~1,500 |

Why It Matters

Forgetting. When conversations exceed the window, older messages are dropped to make room. Research on long contexts (the "lost in the middle" effect) also shows models attend less to content buried in the middle of the window than to content at the start or end.

Strategies

Summarize periodically. Ask AI to compress long discussions.

Front-load key info. Put important context at the start.

Use RAG. Retrieve relevant chunks instead of pasting everything.
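The RAG idea can be shown with a toy retriever. Real systems rank chunks by embedding similarity; this sketch substitutes simple word overlap purely to illustrate selecting relevant chunks instead of pasting everything into the prompt:

```python
def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Toy retrieval: rank chunks by word overlap with the query.

    Stand-in for embedding-based similarity search; only the top-k
    chunks would be placed into the context window.
    """
    q = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]

docs = ["the cat sat", "dogs bark loudly", "a cat and a dog"]
print(retrieve("cat dog", docs))  # → ['a cat and a dog', 'the cat sat']
```

The payoff: instead of spending 100K tokens on a whole document, you spend a few hundred on the passages that actually answer the question.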

Frequently Asked Questions

Why does AI forget earlier messages?
When conversations exceed the context limit, older messages are dropped.
How many words is 100K tokens?
About 75,000 words or ~150 pages in English.
Does output count against context?
Yes. Input and output share the same context window.
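Because input and output share one budget, a long prompt directly shrinks the room left for the reply. A quick arithmetic sketch using the GPT-4 Turbo figure from the table above:

```python
# Input and output draw from the same context window.
context_window = 128_000   # e.g., GPT-4 Turbo (see table above)
prompt_tokens = 100_000    # a very long prompt...
max_output = context_window - prompt_tokens
print(max_output)          # → 28000 tokens left for the response
```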
Why not unlimited context?
Compute cost grows quadratically with length.
Is bigger always better?
Not always. Attention quality can degrade with extreme length.