Large Language Models

Context Windows Are Getting Absurd — And That's a Good Thing

Not long ago, feeding a large language model more than a few paragraphs was a significant challenge. Today, models like Gemini 1.5 Pro can process up to 2 million tokens (roughly 1.5 million words, or more than a dozen full-length novels) in a single context window.

What Is a Context Window?

A context window is the total amount of text an LLM can process at once, input and output combined. Everything the model knows about your conversation, the document you uploaded, the instructions you gave — all of it must fit within the context window.
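Because input and output share one budget, a prompt that nearly fills the window leaves little room for the response. A minimal sketch of this bookkeeping, using the common (and rough) rule of thumb of about four characters per English token — the function names and heuristic here are illustrative, not any particular model's API:

```python
# Rough context-window fit check. The ~4 characters-per-token ratio is a
# popular rule of thumb for English text, not an exact tokenizer; real
# counts vary by model and tokenizer.
def estimate_tokens(text: str) -> int:
    """Approximate token count via the ~4 chars/token heuristic."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, max_output_tokens: int,
                    context_window: int) -> bool:
    """Prompt tokens plus reserved output tokens must fit together."""
    return estimate_tokens(prompt) + max_output_tokens <= context_window

# ~1,500 estimated prompt tokens + 500 reserved output tokens <= 2,048
print(fits_in_context("hello " * 1000, 500, 2048))
```

A production system would replace the heuristic with the model's actual tokenizer, since the input/output accounting is identical either way.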

Why Size Matters

GPT-2 had a context window of just 1,024 tokens; GPT-3 doubled that to 2,048, and GPT-3.5 reached 4,096. Today, Claude 3 offers 200K tokens, Gemini 1.5 offers 1M, and some models support 2M or more. This isn't just a quantitative change; it's a qualitative one.

What Becomes Possible

  • Analyzing entire codebases in a single prompt
  • Summarizing full books without chunking
  • Long-running conversations with complete memory
  • Multi-document synthesis and comparison
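"Without chunking" is the key phrase: with small windows, long documents had to be split into window-sized pieces, processed separately, and stitched back together. A minimal sketch of that traditional workaround, approximating token counts with whitespace-separated words for simplicity:

```python
# Naive chunking: split a long document into pieces that each fit a small
# context window. Token counts are approximated by whitespace-separated
# words here; a real pipeline would use the model's tokenizer.
def chunk_text(text: str, max_words: int) -> list[str]:
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

doc = "w0 w1 w2 w3 w4 w5 w6 w7 w8 w9"
for piece in chunk_text(doc, 4):
    print(piece)
# → w0 w1 w2 w3
#   w4 w5 w6 w7
#   w8 w9
```

Each chunk loses the context of its neighbors, which is exactly the failure mode that a window large enough for the whole document avoids.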

The Challenges

Larger context windows come with real costs. Processing 1M tokens requires enormous compute, and latency grows with input length. There's also the "lost in the middle" problem: LLMs retrieve information from the middle of very long contexts less reliably than from the beginning and end.
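One common mitigation is to reorder retrieved documents so the most relevant ones sit at the edges of the prompt, where retrieval tends to be strongest, leaving the middle for the least relevant material. A hedged sketch of that ordering (the function name and document labels are invented for illustration):

```python
# Place the most relevant documents at the beginning and end of a long
# prompt, pushing the least relevant toward the middle, since models
# retrieve more reliably from the edges of the context.
def edge_order(docs_by_relevance: list[str]) -> list[str]:
    """Input is most-relevant-first; output is an edges-first layout."""
    front, back = [], []
    for i, doc in enumerate(docs_by_relevance):
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]

print(edge_order(["d1", "d2", "d3", "d4", "d5"]))
# → ['d1', 'd3', 'd5', 'd4', 'd2']
```

Here d1 and d2 (the two most relevant documents) land at the start and end of the prompt, while d5 (the least relevant) ends up in the middle, where weak retrieval costs the least.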

Conclusion

Context windows will continue to expand. The more interesting question is whether models will learn to use large contexts effectively — not just process them. This combination may be one of the most important near-term developments in AI.