Code Review in the AI Era
Shift from reviewing handwritten code to verifying AI-generated code against intent and system context.
Key takeaways
- AI-era review shifts the core question from "why did the author write it this way?" to "does this code behave as intended in our system?"
- AI-generated code is often syntactically clean yet presents wrong logic with the same confidence as correct logic, so zero lint warnings never prove business correctness.
- Prioritize review by AI mistake rate times impact: P0 security and auth need a senior reviewer, while P3 style and formatting go to CI and the formatter.
- Watch for the recurring AI failures: silent failures that return success, N+1 query loops, and unauthenticated or unconstrained search endpoints.
- An AI-collaboration PR template captures intent, AI involvement, human verification, and reviewer focus so reviewers know where to look.
Code review remains central, but the question changes. Traditional review asks, "Why did the author write it this way?" AI-era review asks, "Does this code behave as intended in our system?"
Broken Assumptions
| Review area | Old assumption | AI-era reality |
|---|---|---|
| Style | A teammate knows team conventions | AI infers conventions from context |
| Intent | The author can explain decisions | Intent is in the prompt and surrounding context |
| Business logic | The author understood the domain | AI implements only what the prompt and context made explicit |
| Edge cases | Experience fills gaps | AI may cover common cases and miss local cases |
| Performance | The author knows system pressure | AI often optimizes locally |
Four Traits of AI-Generated Code
- It is often syntactically clean.
- It has limited understanding of hidden system contracts.
- It can be too generic or overfit to the prompt.
- It presents wrong code with the same confidence as right code.
Clean code can still be wrong
Zero lint warnings do not prove business correctness, security, or context fit.
Review Priority
| Priority | Review item | Owner |
|---|---|---|
| P0 | Security, auth, data exposure | Senior reviewer required |
| P1 | Business logic, concurrency, API contracts | Domain expert |
| P2 | Error handling, type safety | General reviewer |
| P3 | Style, formatting, naming | CI and formatter |
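The table can also be applied mechanically when a PR is opened. The following is a minimal sketch, assuming priority can be approximated from changed file paths. The path patterns, `classify`, and `reviewOwner` are hypothetical names for illustration, and real routing would need the team's own rules.

```typescript
type Priority = "P0" | "P1" | "P2" | "P3";

// Owners copied from the priority table.
const OWNER: Record<Priority, string> = {
  P0: "Senior reviewer required",
  P1: "Domain expert",
  P2: "General reviewer",
  P3: "CI and formatter",
};

// Hypothetical path rules, checked in order; the first match wins.
const RULES: Array<{ pattern: RegExp; priority: Priority }> = [
  { pattern: /(^|\/)(auth|security|permissions)\//, priority: "P0" },
  { pattern: /(^|\/)(billing|orders|api)\//, priority: "P1" },
  { pattern: /\.(css|md)$/, priority: "P3" },
];

// Anything unmatched falls back to P2 so it still gets a human reviewer.
function classify(path: string): Priority {
  const rule = RULES.find((r) => r.pattern.test(path));
  return rule ? rule.priority : "P2";
}

// A PR takes the most urgent priority among its changed files.
function reviewOwner(changedPaths: string[]): { priority: Priority; owner: string } {
  const priorities = changedPaths.map(classify).sort(); // "P0" sorts first
  const priority = priorities[0] ?? "P3";
  return { priority, owner: OWNER[priority] };
}
```

Path matching is only a proxy: a one-line change in a utility file can still affect auth, so a sketch like this can raise a PR's priority but should not be the only reason to lower it.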
Common AI Mistakes
Silent Failure
```typescript
if (!order) {
  console.log('Order not found')
  return { success: true }
}
```
The code is clean but operationally dangerous. A missing order should be logged, alerted, retried, or rejected depending on the webhook contract.
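One way to make the missing-order branch explicit is to return a result the caller cannot mistake for success. This is a minimal sketch under stated assumptions: `handleOrderWebhook`, `WebhookResult`, and the `retryable` flag are hypothetical, and whether a missing order should be retried or rejected depends on the actual webhook contract.

```typescript
// Hypothetical types for illustration; the real order shape and webhook
// contract are not specified in this guide.
type Order = { id: string; status: string };

type WebhookResult =
  | { success: true; orderId: string }
  | { success: false; reason: "order_not_found"; retryable: boolean };

function handleOrderWebhook(
  orderId: string,
  findOrder: (id: string) => Order | undefined,
): WebhookResult {
  const order = findOrder(orderId);
  if (!order) {
    // Surface the failure instead of reporting success. Marking it retryable
    // is an assumption: it suits senders that may deliver an event before
    // the order row is committed.
    console.error(`Order not found: ${orderId}`);
    return { success: false, reason: "order_not_found", retryable: true };
  }
  return { success: true, orderId: order.id };
}

// Usage with an in-memory lookup standing in for the database.
const orders = new Map<string, Order>([["o_1", { id: "o_1", status: "paid" }]]);
const lookup = (id: string) => orders.get(id);

const found = handleOrderWebhook("o_1", lookup);
const missing = handleOrderWebhook("o_404", lookup);
```

Because the failure case is a separate variant of the return type, the HTTP layer has to decide on a status code for it rather than defaulting to a success response.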
N+1 Queries
AI often writes correct per-record logic and misses loop-level performance:
```typescript
const membersWithProjects = await Promise.all(
  team.members.map(async (member) => {
    const projects = await db.project.findMany({ where: { assigneeId: member.id } })
    return { ...member, projects }
  })
)
```
Reviewers should look for query count, unbounded collections, and missing null handling.
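The per-member loop above can usually be replaced by one batched query followed by in-memory grouping. The sketch below assumes a Prisma-style `in` filter. `ProjectStore`, `groupByAssignee`, and `loadMembersWithProjects` are hypothetical names, and the interface stands in for the real client.

```typescript
// Hypothetical shapes; `ProjectStore` mirrors the Prisma-style call in the
// snippet above but is an assumption, not a real client.
type Member = { id: string; name: string };
type Project = { id: string; assigneeId: string | null };

interface ProjectStore {
  findMany(args: { where: { assigneeId: { in: string[] } } }): Promise<Project[]>;
}

// Pure helper: bucket projects by assignee, skipping unassigned ones.
function groupByAssignee(projects: Project[]): Map<string, Project[]> {
  const byAssignee = new Map<string, Project[]>();
  for (const project of projects) {
    if (project.assigneeId === null) continue;
    const bucket = byAssignee.get(project.assigneeId) ?? [];
    bucket.push(project);
    byAssignee.set(project.assigneeId, bucket);
  }
  return byAssignee;
}

// One query for the whole team instead of one query per member.
async function loadMembersWithProjects(
  members: Member[],
  store: ProjectStore,
): Promise<Array<Member & { projects: Project[] }>> {
  if (members.length === 0) return [];
  const projects = await store.findMany({
    where: { assigneeId: { in: members.map((m) => m.id) } },
  });
  const byAssignee = groupByAssignee(projects);
  return members.map((member) => ({
    ...member,
    // Members without projects get an empty list rather than undefined.
    projects: byAssignee.get(member.id) ?? [],
  }));
}
```

This keeps the query count at one regardless of team size, but the `in` list still grows with the collection, so a very large team would also need paging or a limit.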
Security Gaps
```typescript
const users = await db.user.findMany({
  where: { email: { contains: query } },
})
return NextResponse.json(users)
```
Ask: Is the route authenticated? Are fields selected explicitly? Is the search allowed by policy? Is input length constrained?
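Some of the questions above can be turned into checks that run before the query does. The following is a sketch only: the length limits, `Session` shape, `checkUserSearch`, and `toPublicUser` are assumptions for illustration, and the policy question (who may search for whom) still needs a human decision.

```typescript
// Hypothetical limits and shapes; tune them to the team's actual policy.
const MIN_QUERY_LENGTH = 3;
const MAX_QUERY_LENGTH = 64;

type Session = { userId: string } | null;
type PublicUser = { id: string; name: string };

type SearchCheck =
  | { ok: true; query: string }
  | { ok: false; status: 400 | 401; error: string };

// Encodes two reviewer questions: is the caller authenticated, and is the
// input length constrained?
function checkUserSearch(session: Session, rawQuery: string | null): SearchCheck {
  if (!session) {
    return { ok: false, status: 401, error: "Authentication required" };
  }
  const query = (rawQuery ?? "").trim();
  if (query.length < MIN_QUERY_LENGTH || query.length > MAX_QUERY_LENGTH) {
    return { ok: false, status: 400, error: "Query length out of range" };
  }
  return { ok: true, query };
}

// Explicit allow-list of fields, so newly added columns are not exposed by default.
function toPublicUser(user: { id: string; name: string; [key: string]: unknown }): PublicUser {
  return { id: user.id, name: user.name };
}
```

If `db` is a Prisma client, the same allow-list is better expressed in the query itself with `select: { id: true, name: true }` plus a `take` limit, so unneeded columns never leave the database.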
AI Collaboration PR Template
```markdown
### Intent
- What behavior changed?
### AI involvement
- Draft generation / tests / refactor / research
### Human verification
- Business rule checked
- Security and permissions checked
- Tests passed
### Reviewer focus
- Missing edge cases
- Over-generalized abstraction
- Hidden system contracts
```
Next
Read Testing Strategy Shift to turn review criteria into executable safeguards.