Context window
The maximum number of tokens an LLM can attend to in a single inference call — both the prompt and the generated output count against it. As of 2026, frontier models range from 200k tokens (GPT-5) to 1M+ tokens (Gemini 2.5, Claude Sonnet 4.6 with 1M extension).
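As a rough illustration of that shared budget, the sketch below computes how many tokens are left for generation once the prompt has been counted. The function name and the numbers are hypothetical and not tied to any particular model or API.

```python
def remaining_output_tokens(context_window: int, prompt_tokens: int) -> int:
    """Tokens left for generation after the prompt is counted against the window."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Prompt and output share one budget, so the output gets whatever is left.
    return max(context_window - prompt_tokens, 0)


# Hypothetical numbers: a 200,000-token window with a 150,000-token prompt.
print(remaining_output_tokens(200_000, 150_000))  # 50000
```

A prompt that already fills the window leaves nothing for the output, which is why agents trim or summarise their input before calling the model.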
Background
The context window is a hard limit on how much text the model can "see" at once. Larger windows let agents read more of a codebase, include longer conversation histories, and process larger documents. But context is not free: API cost typically scales linearly with the number of input tokens, and retrieval quality often degrades for information buried in the middle of a very long prompt (the "lost in the middle" effect). To stay within budget, coding agents use techniques such as sliding-window summarisation, file-level chunking, and retrieval-augmented generation (RAG).
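One way to picture sliding-window trimming is the minimal sketch below. It keeps the newest messages that fit a token budget and stands in a placeholder for the rest. Everything here is an assumption made for illustration: the helper names are hypothetical, the 4-characters-per-token estimate is a crude heuristic rather than a real tokenizer, and a real agent would replace the placeholder with a model-written summary.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Crude estimate: assume about 4 characters per token (not a real tokenizer)."""
    return max(1, len(text) // 4) if text else 0


def fit_history(messages: List[str], budget: int) -> List[str]:
    """Keep the most recent messages whose estimated token total fits the budget.

    Older messages that do not fit are replaced by one placeholder line.
    The placeholder's own tokens are not counted against the budget here.
    """
    kept: List[str] = []
    used = 0
    # Walk backwards so the newest messages are considered first.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    dropped = len(messages) - len(kept)
    if dropped:
        kept.insert(0, f"[summary of {dropped} earlier messages]")
    return kept


# Three messages of about 10 estimated tokens each, with room for only two.
history = ["a" * 40, "b" * 40, "c" * 40]
print(fit_history(history, budget=25))
```

With a budget of 25 estimated tokens, the two newest messages are kept and the oldest is replaced by the placeholder. Walking from the newest message backwards reflects the assumption that recent turns matter most. File-level chunking and RAG choose what to keep by relevance instead of recency.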
Tools that use it
- Aider: Open-source CLI coding agent that pair-programs in your terminal and commits to git automatically.
- Claude Code: Anthropic's official CLI agent for Claude. It runs in the terminal, edits files, executes commands, and ships PRs.
- Cursor: AI code editor forked from VS Code with a built-in agent, multi-file edits, and tab-completion.