Back to blog

The Agentic Loop: Stop Babysitting Your Coding Agent

January 5, 20264 min read
Share:

Most developers use coding agents wrong. They ask for code, run it themselves, hit an error, copy the error, paste it back, wait for a fix, run it again, hit another error, copy, paste, wait...

You're the bottleneck.

Your coding agent can work in a loop where it iterates by itself: write code, run it, see the error, fix it, run again, repeat until done. But you need to give it the right workflow. This post shows you how.

This applies to Claude Code, Cursor, Windsurf, Aider, and any coding agent that can run terminal commands.


The Anti-Pattern

Here's what most people do. The agent writes code, then waits while you manually run it and report back:

Broken Loop

Click Send to start the conversation

Add a user profile page with avatar upload

Four errors. Four times you had to context-switch, run the app, copy an error, and paste it back. Most of that 8 minutes was spent waiting on you.


The Fix

Same task. But this time, you tell the agent to run the code and verify it works:

Closed Loop

Click Send to start the conversation

Add a user profile page with avatar upload. Run the build and make sure it works before you're done.

The agent hit the same four errors. But it found them, fixed them, and moved on, all in 2 minutes with zero human intervention.

The difference? One sentence:

Run the build and make sure it works before you're done.


Why This Works

Your agent can already read error messages, interpret test failures, and understand what "broken" looks like in a browser. It just can't see them unless you show them, or tell it to look.

When you ask the agent to run and verify, you're giving it access to its own output. That's all it needs to iterate.

Yes, agentic loops use more tokens. The tradeoff is your time.


Four Ways to Close the Loop

1. Build, Lint & Test Errors

The gap: You run the build, see errors, paste them back. Same for tests.

Close it: Tell your agent to run builds and tests itself. It reads the error output, looks at the relevant files, and fixes issues. Test-driven development gets powerful when the agent can iterate on red-to-green without you.

Try these prompts:

Run the build and fix any errors until it passes.
Run the tests and fix any failures until they all pass.
Format, lint, build, and run tests. Fix any errors. Commit the changes with a descriptive message. If there are merge conflicts, resolve them. Push to the remote. Clean up any temporary test files or debug code.

2. API Integration

The gap: You ask for a Stripe integration. The agent writes 400 lines across 4 files. You run it: API key error. Fix touches all 4 files. Run again: invalid format. Another 4-file fix. Then a webhook signature error. Then a currency format issue. Each iteration touches the entire integration.

Close it: Have the agent write a 20-line test script first. One file. It runs the script, hits the API key error, fixes it, runs again, hits the format error, fixes it - same iterations, but contained to a single throwaway file. Once the script successfully creates a charge and handles a webhook, then the agent builds the real integration using the patterns it just validated.

Try this prompt:

Before building, write a test script to validate the integration works. Run it until it passes, then build the real feature.
Full Integration First
stripe.ts
180 lines
webhooks.ts
120 lines
types.ts
45 lines
errors.ts
60 lines
!API key error
fix 4 files
!Invalid format
fix 4 files
!Webhook sig
fix 4 files
!Currency fmt
fix 4 files
!Rate limiting
fix 4 files
!Idempotency
fix 4 files
405 lines touched per iteration
Script First
Phase 1: Validate
test-stripe.ts
20 lines
run → fix → run → fix → run
Works!
Phase 2: Build
stripe.ts
180 lines
webhooks.ts
120 lines
types.ts
45 lines
errors.ts
60 lines
Works first try
20 lines touched during iteration

The script-first approach front-loads the learning. Your agent figures out Stripe's quirks in a tight loop, then writes the production code once with confidence.

3. Data & Scripts

The gap: You need to investigate a bug in production data, so you write queries yourself, or log into your database client to check it out.

Close it: Have the agent write and run investigation scripts. It can query your database, print records, spot the issue, and either fix the data or fix the code causing bad data.

Try this prompt:

Write a script to investigate. Run it and tell me what you find.
Agent Investigation
Investigation
1
Query affected orders
SELECT * FROM orders WHERE discount IS NOT NULL
2
Spot the pattern
Total unchanged: $99 → $79.20
3
Trace to source
orders.ts:50 — total never updated
Code Fix
4
Fix the bug
order.total = original - discount
5
Write tests
Added unit tests for discount calculation edge cases
Data Correction
6
Dry run backfill
node backfill.js --dry-run
23 orders to fix, $1,847.20 total
7
Run backfill
node backfill.js
23 orders corrected
8
Verify fix
SELECT ... WHERE total != expected
0 rows returned ✓
9
Notify customers
node notify-customers.js
23 emails sent with corrected invoices
Done
Code fixed, data corrected, customers notified

Problems that used to require you to dig through data manually now get solved while you do something else. The agent queries, analyzes, and reports back with findings.

4. Visual Verification (Web Apps)

The gap: You build a feature, manually open the browser, click around, report what's wrong.

Close it: Give your agent browser access so it can see what it builds. If you use Claude Code, start with Claude in Chrome, Anthropic's official extension that controls your browser directly using your existing login sessions.

Other options:

  • Playwright MCP: Microsoft's MCP server with cross-browser support
  • Browser Use: Open-source automation for any AI agent
  • Browser MCP: Chrome extension for Cursor, VS Code, and Claude
  • Puppeteer: Google's Node.js library for headless automation

The key is giving your agent eyes on what it builds.

Try this prompt:

After implementing, open the browser and verify it looks correct. Take a screenshot to confirm.
Visual Verification Loop
Agent Terminal

Click Start to see the agent build and iterate

Browser Preview
localhost:3000/settings
MyApp
???

Same principle as terminal output. The agent sees what it built and can iterate.


Real Example: Shopify Sync

I needed to sync my local database and S3 to Shopify's product catalog. Instead of building the whole feature and hoping it worked, I had my agent create test scripts to figure out the process itself:

Real Example: Shopify Sync

Click Send to see the agent figure out Shopify's API

I need to sync products from our database and S3 to Shopify. Create test scripts to figure out the process, then build the sync feature.

I did nothing. The agent iterated through 5 failures, looked up documentation when stuck, and figured it out. I validated at the end. (I'd spent 2 hours debugging this manually before restarting with this approach. Finished in 15 minutes.)

That's the loop working.


Where Humans Belong

The loop doesn't mean zero human involvement. You're still the architect, and the guardrail against destructive mistakes. My checkpoints:

  • After reviews: plan review, code review, frontend review
  • Before data modifications: backfill scripts, migrations, anything destructive
  • Before commit: final sanity check

Let the agent iterate freely on read operations and safe changes. Pause before writes and destructive actions.


Homework

Pick one gap from above and close it this week:

  1. Easiest: Add this to your next request: "Run the build and fix any errors until it passes."

  2. Medium: Add this to your next request: "Run the tests and fix any failures until they all pass."

  3. Advanced: Install browser automation and try: "After implementing, open the browser and verify the feature looks correct. Take a screenshot to confirm."

The principle is simple: if your agent can see output, it can act on output. Stop being the messenger between your code and your agent.


Related

For my full Claude Code setup including CLAUDE.md configuration, hooks, and workflow patterns, see My Claude Code Workflow for Building Features.

Want more like this? Get my best AI tips in your inbox.