Most AI code reviews are noise. Here's how to fix that.
β‘ TL;DR
Ask Claude to review code without careful directions and you’ll get one of two extremes: a no-value LGTM or a storm of noise and miscellanea.
We’re all fighting cognitive exhaustion. Make the robot minimize your effort!
This custom skill flips the default: Claude must prove each issue with a real failing scenario before reporting it, and concisely summarize why it warrants your valuable attention.
No proof, no report. The result is a short list of actual bugs instead of a long list of maybes.
Want to skip the explanation? Jump to the setup prompt.
β οΈ The problem: noise disguised as thoroughness
Claude’s default code review has two failure modes, and they compound each other.
Quantity over quality. Claude treats review as a completeness exercise. It scans for anything that could be an issue and reports everything. Style nitpicks sit next to real bugs. Speculative “what if someone later removes this guard” concerns sit next to actual logic errors. The signal drowns in noise.
Insufficient research. Claude reads the diff (or the file) and forms opinions based on what it sees in isolation. It doesn’t trace the full call chain. It doesn’t search for other callers. It doesn’t check whether a “missing” null check is handled three levels up. The result: findings that look plausible but fall apart when you check the actual code paths.
The combination is toxic. You get 15 findings, 12 are noise, and now you have to research each one to figure out which three matter. That’s the opposite of what code review is for.
π What the review skill does differently
The skill inverts Claude’s default behavior with one rule: prove it or discard it.
For every potential issue, Claude must:
- Read the project’s coding standards and design principles to understand the baseline for “issues.”
- Read the actual code, not just the diff.
- Search for all callers and usages to understand context.
- Check for existing documentation that explains the design rationale (TPPs, design docs, comments).
- Construct a specific failing scenario. If Claude can’t describe exactly how the bug manifests, it’s not an issue.
- Discard it if research shows it’s intentional or already handled.
This is the opposite of how Claude naturally operates. Left to its defaults, Claude will report a potential null pointer without checking whether the caller guarantees non-null input, and then, gallingly, ask you to look into it. The skill forces it to check first.
What not to report
The skill explicitly excludes:
- Style, organization, or naming preferences
- Speculative future risks
- Feature requests disguised as issues
- Anything not backed by evidence from the codebase
This exclusion list matters. Without it, Claude will pad the review with “you might want to consider…” suggestions that waste your time.
π The skill definition
Create this file at .claude/skills/review/SKILL.md:
---
name: review
description: Review code for potential issues and improvements. Use when asked to review specific files, functions, or code sections.
allowed-tools: Bash, Read, Glob, Grep, Edit, Write, WebSearch
---
# Code Review
Review the mentioned code for potential issues and improvements.
## Before you start
Study CLAUDE.md and any project design principles docs.
**Only report verified bugs, things that are actually wrong.** Do NOT report:
- Style, organization, or naming preferences
- Speculative future risks ("if someone later removes this guard...")
- Feature requests or suggestions disguised as issues
- Things you haven't proven with concrete evidence from the codebase
For EVERY potential issue, you MUST complete these steps before reporting:
1. **Read the actual code** (not just the diff). Follow the full call chain
2. **Search for all callers/usages** to understand context
3. **Read any design docs or TPPs** that explain the rationale
4. **Construct a concrete failing scenario.** If you can't describe
exactly how the bug manifests, it's not an issue
5. **Discard it** if your research shows it's intentional or already handled
**Use subagents liberally:**
- **Exploration**: When more than three files need review, or the code is
complex, launch Explore subagents (one per file/area) to gather findings
- **Validation**: Before reporting ANY issue, launch a subagent to verify
it. Have it trace the full call chain, search for guards/handlers you
might have missed, and read relevant design docs. If the subagent can't
confirm the bug, discard the issue
- **Iteration**: After your initial analysis, launch a second round of
subagents to dig deeper into the most promising findings. Check edge
cases, race conditions, and interaction effects between changed files
If you find zero real issues after thorough research, say "No issues found."
Do not pad the list.
## What to look for
**Correctness**
- Logic or implementation errors
- If correct but surprising, suggest a clearer equivalent or a comment
- Don't trust docs or implementation as authoritative. If they disagree,
flag it, consider what you think is correct (it may be neither!), and
explain your reasoning
**Code quality**
- Violations of project design principles or coding standards
- Dead code (suggest deleting it)
- Comments that merely restate the function name (suggest removing)
**Testing gaps**
- Missing coverage for critical paths or edge cases
## Response format
1. Completely omit any issues that are irrelevant after research and analysis.
2. Sort remaining issues by severity (Critical > High > Medium > Low).
**Step 1 β write up every issue as text first.** For each issue use a
short ID (e.g. `#A`, `#B`) and include:
- **Priority**: Critical / High / Medium / Low
- **Problem**: What's wrong, why, and the concrete scenario where it fails
- **Proof**: The specific code path or test that demonstrates the bug
- **Solution**: A concrete fix
- **Location**: File and line reference
**Step 2 β only after all issue blocks are written**, use `AskUserQuestion`
with checkboxes so the user can accept, veto, or comment on each one.
The question text is just the ID (e.g. `Accept #A?`) β it is NOT a
substitute for the write-up above. Never jump straight to
`AskUserQuestion` without the text write-up; the user can't evaluate
`#A` if they've never seen what `#A` is.
Why Edit and Write are included
Unlike replan (which is read-only), the review skill includes Edit and Write. This is intentional: when Claude finds a real bug and you agree it should be fixed, you want it to fix it right there rather than requiring a separate session. The proof-before-reporting discipline prevents Claude from making drive-by “improvements” it hasn’t justified.
Why Bash is included
Claude needs to run tests to verify that a suspected bug actually fails, and to confirm that a fix doesn’t break anything else. Without Bash, review findings are theoretical. With it, Claude can demonstrate the problem and validate the fix.
Why checkbox triage matters
The skill presents findings via AskUserQuestion as checkboxes. You accept or veto each issue before Claude acts on any of them. Without this step, Claude fixes everything it found, including findings you disagree with. The checkbox step turns review into a conversation rather than a unilateral edit.
π§ Adapting for your project
Reference your project’s standards. Replace the generic “Study CLAUDE.md and any project design principles docs” with specific file paths:
## Before you start
Study these before reviewing:
- [CLAUDE.md](CLAUDE.md)
- [DESIGN-PRINCIPLES.md](docs/DESIGN-PRINCIPLES.md)
Add project-specific “what to look for” items. The generic list covers correctness, code quality, and test gaps. Your project likely has additional concerns:
- “Check that new public APIs have rate limiting”
- “Verify database queries use parameterized inputs”
- “Confirm error messages don’t leak internal paths”
Tune the exclusion list. If your team does want style feedback, remove that line. If you want Claude to suggest refactors, remove the “feature requests” line. The default is strict because noise is the bigger problem, but your team’s tolerance may differ.
Subagents are not optional. The skill instructs Claude to use subagents for exploration (one per file/area), validation (verify each finding before reporting), and iteration (dig deeper into promising findings). This is the mechanism that makes proof-before-reporting work at scale. Without subagents, Claude tries to hold everything in its main context and cuts corners on verification.
π Setting this up
Open Claude Code in your project directory, switch to “plan” mode (Shift+Tab to cycle), and paste this prompt:
Set up the review skill described at
https://photostructure.com/coding/claude-code-review/index.md
for this project.
Create .claude/skills/review/SKILL.md with the skill
definition from the article. Then customize it:
1. Update the "Before you start" section with paths to
this project's coding standards and design docs
2. Add project-specific items to the "What to look for"
section based on this codebase's priorities
3. Adjust the exclusion list if your team wants style
or refactoring feedback
4. Ask me if there are specific review concerns that
should always be checked
Refer to https://code.claude.com/docs/en/skills.md for
skill syntax.
Review what it produces and edit to taste.
π² Your mileage WILL vary
The exclusion list, the verification steps, the response format: these are starting points. Season to taste. Do you have testing requirements that it can help enforce? Linting tools it should integrate with?
The core principle holds regardless: require proof before reporting. A review with three proven bugs is worth more than a review with fifteen maybes.
