

Advanced Context Engineering for Coding Agents

  • Writer: Jayant Upadhyaya
  • 7 min read

AI coding tools can feel magical on small, new projects. But once you bring them into a real codebase, the cracks show fast: messy changes, repeated rework, churn, and code that looks “done” but falls apart under review.

This is the gap that context engineering is trying to close.


Context engineering is not about finding a “perfect prompt.” It’s about building a workflow that keeps the model focused, accurate, and useful, especially in brownfield codebases (large, older, complex systems).


The core idea is simple:

Better input tokens = better output tokens. And since AI models are effectively “stateless,” the conversation window becomes the whole world they can see. If your context window is full of noise, wrong assumptions, or endless arguing, the agent will drift.


This blog explains practical techniques that help teams get real leverage from today’s models, without relying on hype.


Why AI Coding Feels Great in New Projects but Fails in Real Codebases


[Image: greenfield “Clean Slate” vs. brownfield “Legacy Maze” coding illustration. AI image generated by Gemini]

A common pattern shows up when teams use AI for software engineering:

  • Output increases (more code shipped)

  • But rework increases even more

  • Code gets rewritten soon after it ships

  • Complex tasks in existing codebases degrade into churn


This happens because:

  1. Brownfield systems have deep hidden rules. Architecture constraints, naming conventions, internal patterns, and history matter. The agent doesn’t know any of this unless you give it the right context.


  2. The agent is easily steered off-track. In large systems, there are hundreds of plausible “next steps.” Without precise context, the agent chooses wrong steps that look reasonable.


  3. Bad context creates a bad trajectory. If your conversation is filled with corrections and frustration, the agent is effectively learning the pattern: “I do something wrong, the user yells, I do something wrong again…”


So the enemy isn’t just “bad code.” The enemy is bad context.


The Naive Loop: Ask, Correct, Repeat, Run Out of Context


The most common way people use coding agents is:

  1. Ask for a change

  2. It does something wrong

  3. Correct it

  4. It does something else wrong

  5. Correct it again

  6. Repeat until you run out of context or patience


This loop fails because each turn adds more tokens, more confusion, and more contradictory information. In large projects, that turns your context window into a junk drawer.


The solution is not “more yelling” or “more prompts.” The solution is controlling the context window deliberately.


The Real Goal: Stay Out of the “Dumb Zone”


Every model has a context window limit. As you fill it, quality often drops. A useful way to think about it is:

  • Early context: model stays sharp and accurate

  • Past a certain point: it starts missing details, making wrong assumptions, and losing coherence


Many teams notice quality dropping hard around the middle of the context window, especially when the task is complex and the window is full of logs, tool output, and repeated back-and-forth.


Call that decline area the dumb zone.

You don’t beat the dumb zone by “being smarter.” You beat it by keeping your context lean.
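
To make that concrete, here is a minimal sketch in Python of what “keeping context lean” can look like in practice: watch how full the window is and compact before you reach the decline zone. The limit, the threshold, and the compact() helper are illustrative assumptions, not measured numbers or a real library API.

# Minimal sketch: watch context usage and compact before the "dumb zone".
# The limit, the threshold, and compact() are illustrative assumptions.

CONTEXT_LIMIT = 200_000      # example context window size, in tokens
DUMB_ZONE_THRESHOLD = 0.5    # assume quality starts dropping around mid-window

def estimate_tokens(messages: list[str]) -> int:
    # Crude approximation: roughly 4 characters per token.
    return sum(len(m) for m in messages) // 4

def compact(messages: list[str]) -> str:
    # In practice this is an LLM call that writes the "handoff note"
    # described in Technique 1; stubbed here.
    return "HANDOFF NOTE: goal, relevant files, key decisions, next step"

def maybe_compact(messages: list[str]) -> list[str]:
    """Restart with a short handoff note once the window gets too full."""
    usage = estimate_tokens(messages) / CONTEXT_LIMIT
    if usage < DUMB_ZONE_THRESHOLD:
        return messages          # still in the smart zone
    return [compact(messages)]   # fresh, lean context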


Technique 1: Intentional Compaction (Reset the Conversation Without Losing Progress)


[Image: cluttered code and notes on the left compressed, via a “Compress & Clarify” step, into an organized context box on the right. AI image generated by Gemini]

A powerful move is to compress the current state into a clean document, then restart in a fresh context window.

This is called intentional compaction.


Why it works


A model’s performance depends heavily on what it sees in the current window. If you can:

  • remove noise

  • remove irrelevant output

  • keep only the key facts and decisions

…you improve the odds of a correct next step.


What takes up most context?


In real coding sessions, huge chunks of context are wasted on:

  • file searching and navigation

  • codebase exploration

  • build and test output

  • tool logs dumping large JSON blobs

  • MCP tool descriptions and IDs

  • repeated arguments and corrections


Compaction replaces all of that with a clean summary.


What should a good compaction include?


A strong compaction focuses on truth and precision:

  • the goal (what you’re trying to change)

  • the relevant files

  • key functions/classes involved

  • exact line ranges or snippets that matter

  • what was tried and what failed

  • current decision and next step


Think of it as a “handoff note” to a fresh agent.
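
If it helps to see a shape, here is one possible handoff-note structure, sketched in Python. The field names are illustrative, not a standard format.

from dataclasses import dataclass, field

# Illustrative shape for a compaction / handoff note (field names are assumptions).
@dataclass
class HandoffNote:
    goal: str                                                    # what we are trying to change
    relevant_files: list[str] = field(default_factory=list)
    key_symbols: list[str] = field(default_factory=list)         # functions/classes involved
    important_snippets: list[str] = field(default_factory=list)  # exact lines that matter
    attempts_and_failures: list[str] = field(default_factory=list)
    decision_and_next_step: str = ""

    def render(self) -> str:
        """Produce the plain-text note that seeds a fresh context window."""
        return "\n".join([
            f"GOAL: {self.goal}",
            "FILES: " + ", ".join(self.relevant_files),
            "KEY SYMBOLS: " + ", ".join(self.key_symbols),
            "SNIPPETS:\n" + "\n".join(self.important_snippets),
            "TRIED / FAILED:\n" + "\n".join(self.attempts_and_failures),
            f"DECISION / NEXT STEP: {self.decision_and_next_step}",
        ])

Paste the rendered note as the first message of a fresh session and drop everything else.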


Technique 2: Use Sub-Agents to Control Context (Not to Roleplay)


A lot of people use sub-agents the wrong way. They create:

  • “frontend agent”

  • “backend agent”

  • “QA agent”

  • “data scientist agent”


That’s not the best use.

Sub-agents are not for pretending the system has personalities. Sub-agents are for controlling context.


The right way to use sub-agents


Use sub-agents for heavy research tasks that would pollute your main context, like:

  • searching a large codebase

  • reading multiple files

  • tracing code flow

  • finding where a feature is implemented

  • identifying the best insertion point


Then the sub-agent returns only a tight result, like:

  • the file path

  • the relevant section

  • short explanation of the flow

  • what to change and why


This keeps the parent agent’s context clean.


Why this matters


If the parent agent reads 10 huge files, you burn your context fast. If a sub-agent reads them and returns 10 lines of truth, you stay sharp.
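
In code, the pattern might look something like this. The task, the constraints, and run_subagent() are hypothetical stand-ins for whatever your agent framework provides, not a specific API.

# Sketch of delegating research to a sub-agent so exploration tokens never
# enter the parent context. run_subagent() is a hypothetical stand-in for
# whatever your framework uses to start an isolated agent session.

RESEARCH_BRIEF = """\
Find where feature flags are evaluated in this repository.
Return ONLY:
- the file path(s)
- the relevant function or class
- a 3-5 line explanation of the flow
- the best insertion point for a new flag, and why
Do not paste whole files.
"""

def run_subagent(task: str) -> str:
    # Starts a separate agent with its own disposable context window,
    # lets it search and read files, and returns its final answer.
    ...

def research_step(parent_context: list[str]) -> list[str]:
    findings = run_subagent(RESEARCH_BRIEF)  # heavy reading happens elsewhere
    parent_context.append(findings)          # only the tight result comes back
    return parent_context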


Technique 3: Frequent Intentional Compaction (A Workflow, Not a Trick)


[Image: Research → Plan → Implement flowchart with an iterative feedback loop, labeled “Stay in the Smart Zone.” AI image generated by Gemini]

One-off compaction helps. But the real unlock comes when compaction becomes a regular part of how you work.


That leads to a workflow:

  1. Research

  2. Plan

  3. Implement


This is often called Research → Plan → Implement (RPI), but the name doesn’t matter. What matters is the shape:

  • Research compresses truth

  • Planning compresses intent

  • Implementation executes with minimal confusion


This is how you keep your work in the “smart zone.”


Phase 1: Research (Compression of Truth)


The purpose of research is not to write code. It’s to answer:

  • How does the system work today?

  • Where is the correct place to make changes?

  • What are the constraints?

  • What are the footguns?

  • What files and functions matter?


Research output should be objective and grounded in the codebase.


Common research mistakes

  • reading too much without summarizing

  • letting the agent guess architecture

  • relying on outdated docs instead of code

  • producing a vague “overview” without actionable paths


A good research artifact should leave you with:

  • exact file paths

  • exact entry points

  • exact flows

  • clearly stated assumptions


Phase 2: Planning (Compression of Intent)


Planning is leverage.

A plan turns “what we want” into:

  • explicit steps

  • ordered changes

  • exact files to edit

  • exact snippets to modify

  • tests to run after each step


The best plans include code snippets (not full code, but enough to remove ambiguity).


Why planning improves reliability

The more precise the plan is, the less the model has to “invent.”


If your plan says:

  • “update X in file A”

  • “add function Y in file B”

  • “run tests Z”

…then even a weaker model can follow it.
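
For example, a plan precise enough to execute mechanically might be structured like this. All paths, snippets, and test names here are invented for illustration.

# Illustrative plan structure: each step names the file, the exact change,
# the snippet that removes ambiguity, and the check to run before moving on.
# All paths, snippets, and test names are made up.

PLAN = [
    {
        "step": 1,
        "file": "billing/models.py",
        "change": "Add a tax_rate field (Decimal, default 0) to the Invoice model.",
        "snippet": "tax_rate: Decimal = Decimal('0')",
        "verify": "pytest tests/billing/test_models.py",
    },
    {
        "step": 2,
        "file": "billing/invoice.py",
        "change": "In compute_total(), apply tax_rate after discounts.",
        "snippet": "total = discounted * (1 + invoice.tax_rate)",
        "verify": "pytest tests/billing/test_invoice.py -k tax",
    },
]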


The tradeoff

Longer plans can increase reliability but reduce readability. There is a sweet spot for each team.


Phase 3: Implement (Execution With Low Context)


Implementation should be boring.

If research and planning are strong, implementation becomes:

  • follow steps

  • make controlled edits

  • validate at each stage

  • keep diffs small and reviewable


This is where “no slop” becomes possible.
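
Using the plan shape sketched above, the execution loop can be as boring as this. apply_edit() is a hypothetical hook where the agent (or a human) makes the edit; nothing here is a specific tool’s API.

import subprocess

# Boring on purpose: one controlled edit per plan step, verify, then move on.
# Stops at the first failure so diffs stay small and reviewable.

def apply_edit(step: dict) -> None:
    ...  # hand step["file"], step["change"], step["snippet"] to the coding agent

def implement(plan: list[dict]) -> None:
    for step in plan:
        apply_edit(step)
        result = subprocess.run(step["verify"].split(), capture_output=True)
        if result.returncode != 0:
            # Fix or re-plan before continuing; don't let errors compound.
            raise RuntimeError(f"Step {step['step']} failed: {step['verify']}")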


Why This Helps Teams: Mental Alignment


In normal development, code review serves two big jobs:

  1. correctness

  2. keeping the team aligned on why changes happened


When AI accelerates shipping volume, teams can lose alignment fast.

A plan-based workflow helps because:

  • a tech lead can review plans (high leverage)

  • reviewers can see why the change exists

  • you get a narrative, not just a wall of diff


Some teams even attach:

  • planning docs

  • prompt history

  • build/test results

…to PRs so reviewers understand the journey, not just the output.


This is important when code output becomes 2–3x larger than before.


Don’t Outsource the Thinking


This is the hard truth.

AI can generate code. AI can assist reasoning. But AI cannot replace thinking.

If you don’t read the plan, you are gambling. If your research is wrong, everything downstream fails.


A helpful way to think about leverage:

  • A bad line of code is one bad line

  • A bad plan step can create 100 bad lines

  • A bad research assumption can destroy the entire change


So you move your attention to the highest leverage parts:

  • verify research

  • verify plan

  • then let implementation be automated


When You Don’t Need Full Context Engineering


[Image: process ladder from Tiny Change to Small Feature, Medium Feature, and Large Change, with increasing structure and artifacts. AI image generated by Gemini]

This workflow is powerful, but it’s not always necessary.


Use the right level of process for the task:

  • Tiny change (button color, copy edit): just tell the agent directly

  • Small feature (single module): quick plan + execute

  • Medium feature (multiple files): research + plan + execute

  • Large change (multi-repo, deep refactor): heavy research, compaction cycles, deeper planning


The “ceiling” of what you can solve increases with how much compaction and context discipline you apply.


Why “Spec-Driven Development” Became Confusing


A lot of people say “spec-driven dev,” but they mean different things:

  • writing better prompts

  • writing PRDs

  • making lots of markdown files

  • running feedback loops

  • treating code like assembly

  • documentation for OSS projects


When a term starts meaning everything, it becomes useless.

So instead of chasing labels, focus on what works:

  • compaction

  • context control

  • research and planning artifacts

  • staying out of the dumb zone


The Next Big Challenge: Team Workflow, Not Tools


[Image: team collaborating over a shared screen, labeled “Contextual insights, clear intent.” AI image generated by Gemini]

As coding agents become common, a new split appears:

  • seniors may avoid AI because early workflows create slop

  • juniors and mid-levels adopt AI heavily

  • seniors end up cleaning AI-generated messes

  • tension grows


That isn’t solved by “better models” alone. It’s solved by workflow and culture:

  • clear planning standards

  • context discipline

  • shared expectations

  • PR review that includes intent, not just diff


This change needs leadership. Otherwise, teams will ship faster but degrade system quality.


Practical Checklist to Start Using This Today


If you want a simple starting point, do this:

  1. Stop doing long correction loops. If you’re arguing with the agent, restart.

  2. Compact intentionally. Summarize the task, key files, and decisions into a doc.

  3. Use sub-agents for research. Don’t pollute the main window with exploration.

  4. Adopt Research → Plan → Implement for real work, especially for brownfield tasks.

  5. Make plans include filenames + snippets + tests. Remove ambiguity.

  6. Read the plan before you execute. Don’t outsource thinking.

  7. Keep context lean. Avoid huge tool outputs where possible.


Final Thought: Context Engineering Is the Real Moat


The coding models will improve and become commoditized. What will separate strong teams from struggling teams is not which tool they picked.


It’s whether they built a workflow that:

  • keeps context clean

  • reduces churn

  • produces reviewable changes

  • maintains mental alignment

  • avoids the dumb zone

  • prevents slop


That’s what advanced context engineering is really about.




