Module 19 · Lesson 02
Error Handling in Agentic Workflows — Validation, Degradation, and Escalation
Reading time: 20 minutes Track: Yungsten Tech Employee Curriculum · Implementation Engineering
The failure modes of agentic systems
Agentic systems fail differently than traditional software. A database query either returns data or throws an exception. An agent can return output that is syntactically correct, appears plausible, and is completely wrong — without any error signal.
The three failure categories to design for:
Category 1: Tool failures An MCP server returns an error, times out, returns empty results, or returns malformed data. The agent may or may not handle this gracefully depending on how the system prompt is written.
Category 2: Output quality failures The agent produces output that doesn't meet the quality criteria — hallucinated facts, wrong format, off-scope response, missing [VERIFY] flags where they were expected. No error is thrown. The operator gets a bad result.
Category 3: Scope failures The agent handles a request that's outside its intended scope — either because the request slipped through the constraints or because the operator used the agent for something unintended.
Validating agent output in system prompts
The most underused reliability tool is the system prompt itself. You can build validation logic directly into the prompt:
Structured output with a validation marker:
At the end of every response, include a validation line:
VALIDATION: [scope: IN/OUT] [sources: CITED/UNCITED] [flags: count]
- scope IN/OUT: was this request within my defined job?
- sources CITED/UNCITED: did I use external claims that should be verified?
- flags count: how many [VERIFY] flags did I include?
When operators see VALIDATION: [scope: OUT], they know the agent reached outside its lane. This doesn't prevent the output — it surfaces it.
Self-check instructions:
Before providing your response, check:
1. Is this request within my defined scope? If not, say so before proceeding.
2. Have I made any factual claims I can't verify from the provided context?
If yes, mark each with [VERIFY: reason].
3. Does my output match the specified format?
These don't guarantee correct output, but they dramatically reduce silent failures.
Handling MCP tool failures
When an MCP server fails, Claude's default behavior is to tell the user the tool failed and ask how to proceed. This is reasonable behavior for development; it's jarring in production.
Better pattern — graceful degradation in the system prompt:
If you cannot access the filesystem tools or they return an error:
1. Tell the operator clearly: "I couldn't access [tool name]. Error: [what you received]."
2. Ask: "Do you want to paste the document directly, or should we try again?"
3. Do not attempt to answer from memory or training data what should come from the tool.
The four-tier tool failure response:
| Failure type | Response |
|---|---|
| Tool unavailable | Tell operator, offer manual input path |
| Tool returns empty | Tell operator, ask if the empty result is expected |
| Tool returns error | Tell operator, include error message, ask how to proceed |
| Tool returns unexpected format | Tell operator, show what was received, ask for guidance |
Document these responses in the agent's runbook so operators know what to expect and what to do.
Human-in-the-loop escalation triggers
Some decisions should never be made by the agent alone. Escalation triggers define the conditions under which the agent must stop and defer to a human.
Building escalation into system prompts:
Escalate to a human (stop and say "This needs human review: [reason]") when:
- The request involves [specific high-stakes scenarios for this client]
- The confidence in the output is low and the stakes are high
- The user asks the agent to take an irreversible action
- There is a conflict between the request and the data classification policy
For each client engagement, define the escalation triggers based on their risk profile. A law firm's contract review agent has different triggers than a startup's proposal draft agent.
Diagnosing production failures
When a production agent starts behaving unexpectedly, the diagnostic process:
Step 1: Reproduce the failure Get the exact input that caused the problem. Don't diagnose from a description — reproduce it.
Step 2: Isolate the layer
- Is it a tool failure? Check MCP server connectivity independently.
- Is it a prompt failure? Run the same input without the tool, see if the behavior changes.
- Is it a context failure? Has the agent's context (CLAUDE.md, wiki) changed recently?
- Is it a model behavior change? Has Claude's behavior on this type of input shifted?
Step 3: Check the change log What changed since the agent was last working correctly? Config, credentials, system prompt version, MCP server version, upstream data.
Step 4: Fix at the right layer Fix the layer where the failure actually lives — not the nearest layer that's easy to modify. A prompt patch on top of a broken tool connection doesn't fix the broken tool.
Step 5: Update the runbook Every production failure is a runbook gap. Add the failure mode and its fix to the agent's runbook after you resolve it.