When Your AI Agent Breaks at 3 AM: A Production Debugging Playbook
Real failure modes from running autonomous AI agents in production — silent loops, stale locks, token spikes, hallucinated tool calls — and how to catch them before they cost you.
Running an autonomous agent in production is not like running a web service. A web service either responds or it doesn't. An agent can be fully online, burning tokens, hitting your APIs, and still doing something completely useless — or worse, something destructive — and you won't notice until the bill shows up.
I've had agents go sideways in maybe fifteen distinct ways over the past year. Some failures repeat. The patterns below are the ones I now check for first, in roughly the order they tend to bite.
Failure Mode 1: The Silent Loop
The agent decides it needs to "check status" before doing the next step. The status check fails ambiguously. The agent tries again. The check fails again. Two hours later you've spent forty dollars on tokens and nothing has happened.
Loops are the most common production failure I see and they almost never log an error, because from the agent's point of view nothing is wrong. It's making progress. It just can't quite get past step three.
How to catch it:
- Token rate alerts. Set a hard ceiling on tokens-per-hour per agent. If it spikes past baseline, page yourself. Mine fires at 3x average over a 30-min window.
- Action diversity check. Log every tool call. If the same tool is called more than N times in a window with similar arguments, kill the run and surface it. I use N=5 within 10 minutes.
- Forced progress markers. Require the agent to write to a progress_log table on every meaningful step. If no rows appear in 15 minutes, something is wrong.
The fix is almost always to make the failing step return a clearer error so the agent knows to ask a human instead of retrying.
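The action diversity check above can be sketched in a few lines. This is a minimal sketch, not a definitive implementation: the LoopDetector class and its method names are hypothetical, and "similar arguments" is simplified to an exact match on canonical JSON, which is an assumption you may want to loosen.

```python
import json
import time
from collections import deque


class LoopDetector:
    """Flags a run when the same tool call repeats too often in a time window."""

    def __init__(self, max_repeats=5, window_seconds=600):
        self.max_repeats = max_repeats        # N=5 from the text
        self.window_seconds = window_seconds  # 10 minutes
        self._calls = deque()                 # (timestamp, fingerprint), oldest first

    @staticmethod
    def _fingerprint(tool_name, arguments):
        # Assumption: "similar" means identical once keys are sorted.
        return tool_name + ":" + json.dumps(arguments, sort_keys=True, default=str)

    def record(self, tool_name, arguments, now=None):
        """Record one tool call; return True if the run looks stuck in a loop."""
        now = time.time() if now is None else now
        fingerprint = self._fingerprint(tool_name, arguments)
        self._calls.append((now, fingerprint))
        # Drop calls that have aged out of the window.
        while self._calls and self._calls[0][0] < now - self.window_seconds:
            self._calls.popleft()
        repeats = sum(1 for _, fp in self._calls if fp == fingerprint)
        return repeats > self.max_repeats
```

Call record() from whatever wrapper already logs tool calls, and kill the run when it returns True. Passing now explicitly makes the window easy to test without sleeping.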
Failure Mode 2: The Stale Lock
You use a lockfile or a database row to make sure only one tick of your agent runs at a time. Good instinct. Then a tick crashes, the lock isn't released, and every subsequent tick exits immediately because the lock looks held.
The agent appears to be running on schedule (cron is firing) but nothing is actually happening. You'll find this hours or days later when you wonder why a project hasn't moved.
How to catch it:
- Lock rows should always include a started_at timestamp.
- On checkout, ignore locks older than your max-tick-duration (mine is 15 minutes).
- Log every lock acquire and release to a separate table so you can see the pattern.
This is a five-line fix once you know about it. It's a four-hour outage if you don't.
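Here is roughly what that fix can look like. It is a sketch under stated assumptions: the agent_locks table and the try_acquire and release helpers are hypothetical names, and sqlite3 stands in for whatever database you actually use so the example runs on its own.

```python
import sqlite3
import time

MAX_TICK_SECONDS = 15 * 60  # locks older than this are treated as abandoned


def init_locks(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_locks ("
        "name TEXT PRIMARY KEY, started_at REAL NOT NULL)"
    )


def try_acquire(conn, name, now=None):
    """Return True if this tick now holds the lock, False if a live tick does."""
    now = time.time() if now is None else now
    with conn:  # one transaction: clear a stale lock, then try to insert ours
        conn.execute(
            "DELETE FROM agent_locks WHERE name = ? AND started_at < ?",
            (name, now - MAX_TICK_SECONDS),
        )
        try:
            conn.execute(
                "INSERT INTO agent_locks (name, started_at) VALUES (?, ?)",
                (name, now),
            )
        except sqlite3.IntegrityError:
            return False  # a lock younger than the cutoff is still held
        return True


def release(conn, name):
    with conn:
        conn.execute("DELETE FROM agent_locks WHERE name = ?", (name,))
```

The primary key on name is what makes the insert fail when another tick holds a fresh lock. A crashed tick simply leaves a row that ages out after MAX_TICK_SECONDS.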
Failure Mode 3: Hallucinated Tool Calls
The agent calls a tool that doesn't exist, or calls a real one with parameters that don't match its schema. The MCP server returns an error. The agent reads the error, tries again with slightly different parameters, fails, and either loops (see #1) or gives up and tells you the task is "complete" when it isn't.
This happens most when you have many MCP servers loaded and the agent picks a tool name from the wrong one, or when you've recently renamed a tool and old conversation context still references the old name.
How to catch it:
- Log every tool call result. Filter for non-zero exit codes or error responses.
- Run a periodic "did the agent actually do what it said it did" audit. For my content engine, I check that files claimed to be created actually exist and have non-zero size.
- Validate completion claims against side effects, not against the agent's own self-report.
The agent is unreliable at telling you whether it succeeded. The filesystem isn't.
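The file audit described above can be as small as this. It is a minimal sketch: verify_created_files is a hypothetical helper, and it assumes you can extract the list of claimed paths from the agent's own log.

```python
from pathlib import Path


def verify_created_files(claimed_paths):
    """Return {path: problem} for every claimed file that is missing or empty."""
    problems = {}
    for raw in claimed_paths:
        path = Path(raw)
        if not path.is_file():
            problems[str(raw)] = "missing"
        elif path.stat().st_size == 0:
            problems[str(raw)] = "empty"
    return problems
```

An empty dict means every claimed file exists with non-zero size. Anything else is a completion claim that the filesystem does not back up.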
Failure Mode 4: The Permission Mismatch
Cron runs as one user. The agent runs as another. The script runs as a third. One of them owns the file, another can't read it, and you get a permission denied that surfaces as a generic "I couldn't access that resource" in the agent's reply.
This one is easy to debug once you suspect it and almost impossible to figure out from inside the agent's reasoning, because LLMs are not great at distinguishing "file doesn't exist" from "file exists but I can't read it."
How to catch it:
- Run your agent under a single dedicated user, not root.
- All paths the agent touches should be owned by that user.
- When the agent says "I couldn't find X," your first move should be to ls -la X as the agent's user, not to assume the agent is wrong.
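You can also make the distinction explicit in whatever tool the agent uses to read files, so the error it sees names the real problem. The sketch below is one way to do that. diagnose_path is a hypothetical helper, and it needs to run as the agent's user to be meaningful.

```python
import os


def diagnose_path(path):
    """Classify a path as 'missing', 'unreadable', or 'readable' for this user."""
    try:
        os.stat(path)
    except (FileNotFoundError, NotADirectoryError):
        return "missing"
    except PermissionError:
        # A parent directory is not searchable by this user.
        return "unreadable"
    # os.access checks against the real uid/gid of the current process.
    return "readable" if os.access(path, os.R_OK) else "unreadable"
```

Returning "unreadable" instead of a generic failure gives both you and the agent something specific to act on.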
Failure Mode 5: The Context Bomb
The agent loads a 50,000-line log file into context to "investigate the issue." The next prompt costs $4. Then it loads another file. The next prompt costs $7.
Modern long-context models make this worse, not better, because the agent will happily fill 200K tokens of context if you let it.
How to catch it:
- Per-conversation token budget. Hard cap at, say, 500K tokens total before the agent has to start a fresh session.
- Monitor input token cost separately from output token cost. Input cost spiking is the canary.
- Teach the agent (in its system prompt) to use grep or head before reading entire files.
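A budget like the one above is easy to enforce in the loop that calls the model. This is a sketch, assuming your model client reports input and output token counts per call. ConversationBudget, TokenBudgetExceeded, and read_head are hypothetical names, and read_head is a head-style guard you could expose as a tool in place of a raw file read.

```python
from itertools import islice


class TokenBudgetExceeded(RuntimeError):
    """Raised when a conversation crosses its total token cap."""


class ConversationBudget:
    """Tracks input and output tokens separately and enforces a total cap."""

    def __init__(self, max_total_tokens=500_000):
        self.max_total_tokens = max_total_tokens
        self.input_tokens = 0
        self.output_tokens = 0

    @property
    def total(self):
        return self.input_tokens + self.output_tokens

    def record(self, input_tokens, output_tokens):
        """Add one model call's usage; raise once the cap is crossed."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens
        if self.total > self.max_total_tokens:
            raise TokenBudgetExceeded(
                f"{self.total} tokens used, cap is {self.max_total_tokens}"
            )


def read_head(path, max_lines=200):
    """Return at most max_lines lines from the start of a file."""
    with open(path, "r", errors="replace") as handle:
        return list(islice(handle, max_lines))
```

Catch TokenBudgetExceeded in the runner and start a fresh session there. Keeping input_tokens as its own counter gives you the input-cost signal without extra bookkeeping.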
Failure Mode 6: The Confidently Wrong Database Write
The agent decides to update a database row. It picks the wrong WHERE clause. Now you have a project marked status='done' that isn't, or a customer flagged as paid who isn't.
This is the failure that scares me most because it can be silent for weeks.
How to catch it:
- Use database triggers to write every UPDATE to an audit table with old_values and new_values.
- Periodically diff the agent's claimed actions (from its own log) against the audit table.
- For high-stakes tables, require a two-step process: agent stages a change, a separate validator agent or human approves.
Don't give a single agent unbounded write access to your billing tables. I shouldn't have to say this. I'm saying it because I've seen it.
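To make the trigger-plus-diff idea concrete, here is a small runnable sketch. It uses sqlite3 as a stand-in so it is self-contained, and trigger syntax in your own database will differ. The projects and audit_log tables, the flat text encoding of old_values and new_values, and the unclaimed_changes helper are all assumptions for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE projects (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    table_name TEXT NOT NULL,
    row_id INTEGER NOT NULL,
    old_values TEXT NOT NULL,
    new_values TEXT NOT NULL
);
-- Fires once per updated row and records the before and after state.
CREATE TRIGGER projects_audit AFTER UPDATE ON projects
BEGIN
    INSERT INTO audit_log (table_name, row_id, old_values, new_values)
    VALUES ('projects', OLD.id, 'status=' || OLD.status, 'status=' || NEW.status);
END;
"""


def unclaimed_changes(conn, claimed_row_ids):
    """Return audit rows for projects that changed without a matching claim."""
    rows = conn.execute(
        "SELECT row_id, old_values, new_values FROM audit_log "
        "WHERE table_name = 'projects' ORDER BY id"
    ).fetchall()
    claimed = set(claimed_row_ids)
    return [row for row in rows if row[0] not in claimed]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO projects (id, status) VALUES (?, ?)",
        [(1, "active"), (2, "active"), (3, "paused")],
    )
    # The agent meant to close project 1 but used too broad a WHERE clause.
    conn.execute("UPDATE projects SET status = 'done' WHERE status = 'active'")
    print(unclaimed_changes(conn, claimed_row_ids=[1]))
```

The point of the diff is that the audit table is written by the database, not by the agent, so an over-broad WHERE clause shows up as rows the agent never mentioned.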
What Actually Works for Monitoring
Out of all the dashboards and alerts I've tried, three signals carry 80% of the value:
- Tokens per hour, per agent. Anomaly detection on this catches loops, context bombs, and runaway retries.
- Time since last meaningful action. Defined as time since last write to your progress_log table. If this exceeds your tick interval, something is wrong.
- Side-effect verification. A separate cron job that checks the agent did what it said it did. Files exist, rows were inserted, messages were sent.
You can build all three with a few hundred lines of code and a Postgres table. You don't need a vendor product.
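The first two signals reduce to two comparisons once you have the numbers. The sketch below assumes you already collect per-window token counts and the timestamp of the latest progress_log row. The function names are hypothetical, and the simple 3x-average threshold mirrors the alert described earlier as a stand-in for real anomaly detection.

```python
from statistics import mean


def token_rate_alert(window_tokens, baseline_windows, multiplier=3.0):
    """True if the current window used more than multiplier times the baseline average."""
    if not baseline_windows:
        return False  # no history yet, so nothing to compare against
    return window_tokens > multiplier * mean(baseline_windows)


def is_stalled(last_progress_at, now, tick_interval_seconds):
    """True if the last progress_log write is older than one tick interval."""
    if last_progress_at is None:
        return True  # the agent has never recorded progress
    return now - last_progress_at > tick_interval_seconds
```

Run both from a cron job that is separate from the agent itself, so a wedged agent cannot also wedge its own monitoring.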
The Mental Shift
The biggest debugging upgrade I made wasn't tooling. It was internalizing that the agent is an unreliable narrator about its own behavior. It will tell you it succeeded when it didn't. It will tell you it tried something when it didn't. It will explain its reasoning in a way that sounds plausible but doesn't match what it actually did.
Trust the side effects. The database row, the file on disk, the message that arrived (or didn't). Everything else is the agent telling you a story about what it thinks happened, and that story is sometimes fiction.
Build your monitoring around what's verifiable from outside the agent's own reasoning. That's the discipline. Once you have it, debugging in production stops being scary and starts being a normal engineering problem.