Module 14 · Lesson 02
Reviewing and Improving AI-Generated Work
Reading time: 16 minutes · Track: Claude Fluency for Teams · Lead/Manager path
The review challenge
Reviewing AI-generated work is genuinely different from reviewing human work. Human work comes with implicit context: you know the person, you can ask follow-up questions, and you can calibrate your review to their typical blind spots.
AI-generated work comes with different failure modes: confident errors, generic framing that misses your specific situation, and subtle misalignments with your actual intent that only become visible when you read closely.
Getting review right means catching these failures without creating so much friction that you lose the productivity benefit.
The three-level review framework
Level 1: Quick pass (30 seconds to 2 minutes)
For lower-stakes work (internal emails, meeting agendas, exploratory analysis):
- Does it achieve the goal?
- Is anything obviously wrong or misleading?
- Does it need adjustment before use?
Level 2: Content review (2-10 minutes)
For work that will leave your hands (external communications, reports, analysis):
- Check every specific claim against what you know or a primary source
- Read for tone — does it sound like you/your team?
- Check format — does it work for the actual audience and context?
- Verify any numbers
Level 3: Expert review
For high-stakes outputs (legal documents, technical architecture decisions, compliance determinations, public statements):
- A qualified domain expert must review regardless of how good the output looks
- Claude output is draft material for expert review, not a final product
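The three levels above can be sketched as a small triage helper. This is a minimal illustration, not a prescribed tool: the work-type names, the `ReviewPlan` class, and the `plan_review` function are hypothetical, and defaulting unlisted work to Level 2 is an assumption made here rather than something the framework states.

```python
from dataclasses import dataclass

# Hypothetical work-type labels; replace them with your team's own categories.
LOW_STAKES = {"internal_email", "meeting_agenda", "exploratory_analysis"}
HIGH_STAKES = {
    "legal_document",
    "architecture_decision",
    "compliance_determination",
    "public_statement",
}

# Checklist items taken from the three levels described above.
CHECKLISTS = {
    1: (
        "Does it achieve the goal?",
        "Is anything obviously wrong or misleading?",
        "Does it need adjustment before use?",
    ),
    2: (
        "Check every specific claim against what you know or a primary source",
        "Read for tone",
        "Check format for the actual audience and context",
        "Verify any numbers",
    ),
    3: (
        "Route to a qualified domain expert, however good the output looks",
        "Treat the output as draft material, not a final product",
    ),
}


@dataclass(frozen=True)
class ReviewPlan:
    level: int
    checklist: tuple[str, ...]
    needs_expert: bool


def plan_review(work_type: str) -> ReviewPlan:
    """Map a work type to a review level and its checklist."""
    if work_type in HIGH_STAKES:
        level = 3
    elif work_type in LOW_STAKES:
        level = 1
    else:
        # Assumption: anything unlisted is treated as work that leaves your hands.
        level = 2
    return ReviewPlan(level, CHECKLISTS[level], needs_expert=(level == 3))
```

For example, `plan_review("meeting_agenda")` returns the Level 1 checklist, while `plan_review("legal_document")` sets `needs_expert` to true. Defaulting to Level 2 rather than Level 1 errs toward more review when a work type has not been classified.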
Common failure modes to watch for in review
The confident generalization: Claude states something true in general but not true in your specific situation. "Standard practice is X" — but your context has a good reason to do Y.
The plausible inaccuracy: A fact or number that sounds right but isn't. Especially common in statistics, dates, and citations.
The missed constraint: Claude completed the task but violated a constraint you had in mind and never stated. (You asked for "a professional email" and Claude wrote one, but you assumed it would be under 200 words for this particular person.)
The format mismatch: Content is right but format doesn't work for the actual use case.
The hedged conclusion: Claude presents analysis but softens every conclusion to the point where it doesn't actually recommend anything. You wanted a recommendation; you got "considerations."
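Two of these failure modes, the plausible inaccuracy and the hedged conclusion, lend themselves to a simple pre-review pass. The sketch below is one possible approach under stated assumptions: `flag_for_review` and the `HEDGE_PHRASES` list are hypothetical, and the helper only surfaces numbers and softening phrases for a human to check. It cannot tell whether a number is correct.

```python
import re

# Matches integers, years, comma-grouped numbers, decimals, and percentages.
NUMBER_PATTERN = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?%?")

# Hypothetical list of softening phrases; extend it for your own domain.
HEDGE_PHRASES = (
    "it depends",
    "you may want to consider",
    "considerations",
    "on the other hand",
)


def flag_for_review(text: str) -> dict:
    """Collect the numbers and hedging phrases a reviewer should look at."""
    numbers = NUMBER_PATTERN.findall(text)
    lowered = text.lower()
    hedges = [phrase for phrase in HEDGE_PHRASES if phrase in lowered]
    return {"numbers": numbers, "hedges": hedges}


draft = (
    "Revenue grew 12.5% in 2023, reaching 1,200 customers. "
    "There are several considerations."
)
flags = flag_for_review(draft)
# flags["numbers"] is ["12.5%", "2023", "1,200"]
# flags["hedges"] is ["considerations"]
```

Each flagged number still needs checking against a primary source, and a flagged hedge is only a prompt to ask whether the draft actually recommends anything.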
Building a team review culture
The goal is a team norm where AI review is routine and efficient, neither paralyzing nor absent.
What to encourage:
- Reading Claude outputs before using them (obvious but needs stating)
- Treating Claude code the same as code from any other contributor — reviewed, not rubber-stamped
- Using verification prompts such as "What's the strongest argument this is wrong?" on outputs that will inform important decisions
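One way to make the verification prompt a habit is to wrap it in a small helper that pairs the question with the draft and the decision it feeds. This is a sketch only: `build_verification_prompt` and its parameters are hypothetical, and the layout of the prompt is an assumption rather than a recommended template.

```python
# The verification question quoted in the list above.
VERIFICATION_QUESTION = "What's the strongest argument this is wrong?"


def build_verification_prompt(output_text: str, decision: str) -> str:
    """Combine a draft, the decision it informs, and the verification question."""
    return (
        f"The draft below will inform this decision: {decision}\n\n"
        f"Draft:\n{output_text}\n\n"
        f"{VERIFICATION_QUESTION}"
    )


prompt = build_verification_prompt(
    output_text="We should migrate to the new vendor in Q3.",
    decision="vendor selection",
)
```

The resulting string can be pasted into a follow-up message. Naming the decision gives the model the context it needs to argue against the draft in your situation rather than in general.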
What to discourage:
- Shipping Claude outputs without reading them
- The opposite: treating Claude outputs as so suspect that the review cost exceeds the benefit
- Hiding that something was AI-assisted — transparency enables better review
Manager practices:
- Model the review behavior you want to see: if you review AI outputs carefully, your team is more likely to do the same
- Ask "how did you verify this?" for outputs from Claude that will drive decisions
- Celebrate catches — when a teammate catches a Claude error in review, that's the system working