Why You Need To Clear Your Coding Agent's Context Window
Imagine you're delegating work to a developer with a terrible attention span: the longer they work on something, the worse their output gets. First, you brief them on the project structure and coding standards.
Then you spend 30 minutes lecturing them on the last change to the authentication module. You walk through the OAuth flow, explain the session handling, cover every edge case you discovered. They nod along, taking mental notes, building up the full context.
"By the way," you add, "that was all just background for the task we worked on yesterday. You won't need any of that for your work today."
Now you say: "Implement the checkout page CSS bug fix."
This is essentially what you're doing to coding agents when you don't clear context between tasks.
Why Context Size Matters
LLM attention is quadratic. Each token must attend to every other token. Double your context, quadruple the computation. But it's not just slower. It's worse.
More tokens mean more noise. The signal (your current task) gets diluted by everything that came before it.
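The scaling claim is easy to check on paper. Here's a toy Python calculation, counting query-key pairs as a rough proxy for attention compute:

```python
# Self-attention computes a score for every (query, key) pair,
# so the pair count, a rough proxy for compute, grows quadratically.
def attention_pairs(context_tokens: int) -> int:
    return context_tokens * context_tokens

# Double the context from 50k to 100k tokens:
print(attention_pairs(100_000) // attention_pairs(50_000))  # → 4
```

Twice the tokens, four times the pairs. And that's only the compute cost; the quality cost of diluted attention comes on top.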
The Quality Zones
Here's a mental model for how your agent performs at different levels of context usage:
Your goal should be to maximize the time your agent spends in the "High Quality" zone: under 40% of the context window.
The Simulation
Below we simulate a coding agent implementing a feature that requires updating multiple files. The simulation rules:
- Each file read uses 5% of the context window
- Each file write uses 5% of the context window
- The agent reads each file before editing it
We compare two approaches:
- Compact Approach (default): Keeps working in one session, compacting context when it gets too full.
- One Session Per Task: Clears context between files, starting fresh every time.
We track what quality zone the agent is in at the time of each file edit.
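The rules above are simple enough to sketch in a few lines of Python. The read/write costs come straight from the rules; the 90% compaction trigger and 30% post-compaction floor are assumptions for illustration:

```python
READ_COST = WRITE_COST = 5               # percent of the context window
HIGH_QUALITY_LIMIT = 40                  # "High Quality" zone: under 40%
COMPACT_TRIGGER, COMPACT_FLOOR = 90, 30  # assumed compaction behavior

def zone(usage: int) -> str:
    return "high quality" if usage < HIGH_QUALITY_LIMIT else "degraded"

def compact_approach(num_files: int) -> list[str]:
    """One long session, compacting whenever context gets too full."""
    usage, zones = 0, []
    for _ in range(num_files):
        usage += READ_COST + WRITE_COST  # read the file, then edit it
        zones.append(zone(usage))        # zone at the time of the edit
        if usage >= COMPACT_TRIGGER:
            usage = COMPACT_FLOOR        # compaction shrinks context, never empties it
    return zones

def one_session_per_task(num_files: int) -> list[str]:
    """Fresh session per file: context is cleared before each edit."""
    return [zone(READ_COST + WRITE_COST) for _ in range(num_files)]

print(compact_approach(20).count("high quality"))      # → 3
print(one_session_per_task(20).count("high quality"))  # → 20
```

Over 20 files, the compacting session makes only its first three edits in the high-quality zone; after that, compaction keeps it hovering in degraded territory. The fresh-session approach makes every edit at 10% usage.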
The compact approach keeps reusing the same session, compacting when it gets too full. The one-session-per-task approach clears context between tasks. Both complete the work, but look at where the edits land.
The Rule
One task, one session. Clear between tasks.
It's that simple. When you finish a task and start something new, clear your context. The agent reads your CLAUDE.md fresh. It has full attention capacity for the new problem. No accumulated baggage.
What if a single task is so large it pushes past 40% of the context window? Split it into sub-tasks and clear between them. I have a sub-agent that reads each of my plan files and splits it into tasks sized to fit within 40% of a context window.
But What About Compaction?
People argue for compaction: isn't some context better than none? But when that context isn't relevant to the current task, it's noise. Compressed noise is still noise.
After compaction, you still have compressed summaries of Tasks A, B, and C taking up space. The agent's valuable and limited working memory is full of information that will not help it complete its current task.
"But what if Task D needs context from Task B?"
If Task D genuinely needs information from a previous task, the agent can fetch it. Use plan mode at the start of a task and your agent will explore the codebase, read relevant files, and gather exactly the context it needs.
Compaction gives you lossy summaries of everything, but a smaller version of something irrelevant is still irrelevant. To maximize output quality, don't shrink the noise. Remove it.
But What About Context I Need?
If you're worried about losing important context, you're thinking about it wrong. Important context should be persistent, not accumulated.
AGENTS.md / CLAUDE.md is persistent. It loads every session automatically. Put your coding standards, architecture decisions, and project-specific instructions there.
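For example, a minimal CLAUDE.md might look like this (the project name, rules, and commands below are hypothetical; fill in your own):

```markdown
# Project: acme-checkout

## Coding standards
- TypeScript strict mode; no `any`
- Tests live next to source files (`*.test.ts`)

## Architecture decisions
- All payment calls go through the `PaymentGateway` interface, never a vendor SDK directly

## Project-specific instructions
- Run `npm test` before marking any task complete
```

Everything in this file survives every /clear, so it's the right home for anything you'd otherwise find yourself re-explaining.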
Plan files are persistent. Claude Code writes your implementation plans to ~/.claude/plans/ and you can reference them across multiple sessions instead of relying on conversation history.
Issue trackers are persistent. GitHub Issues, Beads, or even local markdown/json files store task context outside of conversation history.
"But what about all the files my agent read? It's losing that context." That's fine. Agents are great at finding and reading files. Let it rediscover what it needs for the new task instead of carrying stale reads from the old one.
Is there context you really need to save? Maybe requirements you've typed out or a solution you've brainstormed with your agent? Save it to a markdown file and reference it in your next session. Sometimes the only output of my conversation with a coding agent is a markdown file I end up referencing later in a new session.
How to Clear
- Claude Code: Type /clear. The conversation resets, your CLAUDE.md reloads, and you're ready for the next task.
- Cursor: Start a new chat with Cmd+N (Mac) or Ctrl+N (Windows/Linux). Your .cursorrules loads fresh.
- GitHub Copilot: Start a new chat session. Your copilot-instructions.md loads automatically.
- Other editors: Click "New Chat" or use Cmd+L (Mac) / Ctrl+L (Windows/Linux) to start fresh with your rules reloaded.
Some people resist this because it feels like losing progress. Instead, start each task by getting the right context in place (plan mode usually does this for you), then execute. Old context is always bloat.
The Emerging Paradigm
The industry is converging on a new workflow for coding agents. Tools like Beads (a git-backed issue tracker designed for AI agents) and techniques like Ralph Wiggum (autonomous loops that feed the same prompt until completion) point to a pattern:
- Research - The agent explores your codebase, often with sub-agents, to understand the problem
- Plan - The agent creates an implementation plan with a task list
- Implement - A fresh agent picks up a single task, completes it, then exits. Repeat until all tasks are done.
Your workflow becomes: spend time researching and planning a feature, split it into tasks, then run your agent repeatedly with the same prompt like: "pick up a task from <source>, work on it until completion, validate your changes work, then mark it as complete."
Each implementation agent starts with no knowledge of what has happened so far, reads the plan, selects a task, gathers the context it needs, executes, and exits.
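The outer loop can be sketched in a few lines. Everything here is a hypothetical stand-in: the file name, the task schema, and run_agent, which in the real workflow would launch a fresh agent session (for example via your agent's headless CLI) rather than run in-process:

```python
import json
from pathlib import Path

TASKS_FILE = Path("plan-tasks.json")  # hypothetical task list produced during planning

def next_open_task(tasks):
    """Return the first task that hasn't been completed yet."""
    return next((t for t in tasks if t["status"] == "open"), None)

def run_agent(task):
    # Stand-in for spawning a fresh agent session with a prompt like:
    # "pick up a task from plan-tasks.json, work on it until completion,
    # validate your changes work, then mark it as complete."
    print(f"fresh session completing: {task['title']}")
    task["status"] = "done"

# Seed a tiny plan, then run the loop: one task, one session, repeat.
TASKS_FILE.write_text(json.dumps([
    {"title": "add checkout CSS fix", "status": "open"},
    {"title": "update session handling", "status": "open"},
]))

tasks = json.loads(TASKS_FILE.read_text())
while (task := next_open_task(tasks)) is not None:
    run_agent(task)                                     # one task, one fresh session
    TASKS_FILE.write_text(json.dumps(tasks, indent=2))  # persist progress between sessions
```

The key property: state lives in the file, not in any agent's context window, so every session starts clean and still knows exactly where the work stands.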
The Takeaway
Context accumulation feels productive. You're building up knowledge, right? But for LLMs, it's the opposite. Every token you add pushes attention away from what matters now.
For a complete workflow that incorporates these principles, see My Claude Code Workflow for Building Features.
Former Amazon engineer, current startup founder and AI builder. I write about using AI effectively for work.