Article 3 of 6

Code Review as Culture: Turning Ritual into Development

Code review is your highest-leverage tool for improving team quality — if you use it for development, not just defect detection.

11 min · Intermediate


Key Takeaway

Code review is the most universally practiced and most universally under-optimized engineering activity. Most teams use it for defect detection — catching bugs before they ship. The higher-leverage purpose is knowledge transfer, and almost nobody designs their review process to optimize for it. This article gives you the practices to make code review the engine of your team's technical development.

Every team does code review. Almost no team does it well.

I don't mean that technically. I mean it culturally. Code review as a technical activity — reading code, checking for bugs, ensuring it compiles and follows conventions — is well-understood. Code review as a cultural activity — the practice through which a team's collective judgment gets transmitted, standards evolve, and junior engineers become senior engineers — is almost never designed intentionally.

The gap between these two versions of code review is enormous, and it's the gap that separates teams that plateau technically from teams that continuously elevate their collective capability.

In fifteen years of building engineering teams, I've watched code review habits vary wildly. Teams in Indian startups where every review is a battlefield of opinions and reviews take three days to arrive. Enterprise teams in Europe where every PR gets a two-minute LGTM from a senior engineer who is too busy to actually read it. Teams where review comments read like personal attacks. Teams where nobody says anything critical because they don't want to damage a relationship. None of these are functional. All of them are common.

Let me tell you what excellent code review actually looks like — and, more importantly, how to build a team culture where it happens naturally.

Two Purposes, Very Different ROI

Code review has two purposes, and most teams only optimize for one.

The first purpose is defect detection: catching bugs, security vulnerabilities, logical errors, and missing edge cases before they reach production. This is the purpose most teams consciously design around. It's why code review is mandatory, why you can't merge without an approval, and why senior engineers are expected to review critical changes.

The second purpose is knowledge transfer: spreading understanding of the codebase, the domain, and the craft of software development from more experienced engineers to less experienced ones — and also in the other direction, as newer engineers bring fresh eyes to established patterns. This is the purpose most teams never consciously design around, even though it delivers dramatically more long-term value.

Think about it quantitatively. If good defect detection in code review catches, say, two production bugs per month per engineer, the value is significant but bounded. If good knowledge transfer in code review accelerates the development of a junior engineer from "needs guidance on most things" to "works independently on complex features" three months faster, the value is a step-change in team capacity that compounds for the remaining years that engineer is on your team.

The math overwhelmingly favors knowledge transfer. But it's harder to measure, slower to manifest, and requires reviewing with a different mindset — which is why it gets optimized away.

What a Good Review Comment Looks Like

If you want to understand the quality of a team's code review culture, read the comments. Comments are the clearest signal of what reviewers are actually doing.

A poor comment flags a problem without explaining why it's a problem: "This should be a HashMap not an ArrayList." This is technically correct and practically useless. The author fixes the specific issue and learns nothing they can apply elsewhere.

A good comment explains the reasoning: "The way this is used — looking up entries by ID in a loop — means you're doing O(n) linear scans on every lookup. A HashMap gives you O(1) lookups, which matters here because this runs in the hot path during checkout. As a rule, any time you see a lookup-by-key pattern, HashMap should be your first instinct."

The difference is not effort — both comments take about 30 seconds to write. The difference is whether the reviewer is thinking about defect detection (fix this) or development (understand why).
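
To make the reasoning in that good comment concrete, here is a minimal sketch in Python, where a list and a dict stand in for the ArrayList and HashMap. The Order record and both function names are hypothetical, and the sketch assumes order IDs are unique.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Hypothetical record type, used only for illustration."""
    order_id: int
    total_cents: int


def totals_with_list(orders: list[Order], wanted_ids: list[int]) -> list[int]:
    """Lookup-by-ID in a loop over a list: every lookup is a linear scan, O(n)."""
    totals = []
    for wanted in wanted_ids:
        for order in orders:  # scan repeated for each wanted ID
            if order.order_id == wanted:
                totals.append(order.total_cents)
                break
    return totals


def totals_with_dict(orders: list[Order], wanted_ids: list[int]) -> list[int]:
    """Build a hash-based index once; each lookup is then O(1) on average."""
    by_id = {order.order_id: order for order in orders}
    return [by_id[w].total_cents for w in wanted_ids if w in by_id]
```

Both functions return the same totals and skip IDs that are not present. The first repeats a scan for every ID, while the second pays once to build the index. Note that the O(1) figure for hash lookups is an average-case bound.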

Good review comments have three components. They're specific: not "this function is too complex" but "this function is doing three different things — input validation, business logic, and persistence — which makes it hard to test and hard to reason about." They explain the why: the underlying principle or the consequence of the current approach. And they invite dialogue: "Does this change how you're thinking about the structure?" is more useful than an imperative, because it prompts the author to engage rather than just comply.

Questions are often better review feedback than statements. "What happens if userId is null here?" forces the author to think through the edge case. "Handle the null case" tells them what to do without developing their judgment about when null checks are necessary.

This is especially important for senior engineers reviewing junior ones. Your job isn't just to make the code correct right now — it's to improve the judgment of the person who wrote it so that their next PR needs less correction.

The Rubber Stamp Problem

The LGTM within two minutes. The approval with a single emoji. The review that approves the PR without asking a single question.

These happen for understandable reasons. Senior engineers are busy. They trust their teammates. They don't want to slow down delivery by picking at minor things. And they've already reviewed the design in a prior conversation, so the implementation feels like detail.

The rubber stamp is particularly prevalent in India's startup ecosystem, where the pressure to ship is extreme and code review is often seen as bureaucratic overhead rather than engineering investment. I've seen teams where a PR would sit for three days waiting for a review, and when the review finally came, it was "looks good" with no comments. All the waiting, none of the benefit.

There are three things rubber stamps fail to do.

They fail to catch the bugs that only appear when you read the code carefully. Not the obvious bugs — the subtle ones. The off-by-one in the pagination logic. The race condition in the concurrent update path. The missing error handling in the case where a third-party API returns a 200 with an error body.

They fail to transfer knowledge. If every PR gets LGTM'd, the junior engineer who wrote it has no signal about whether their approach was good, merely adequate, or subtly problematic. They'll make the same structural choices in the next PR, and the next one after that.

And they normalize low-engagement review as the standard. Once LGTM becomes acceptable, the engineer who actually reads carefully starts to feel like they're doing extra work, or holding up delivery, by asking questions. The social norm calibrates to the lowest engagement level.

The fix is not a policy requiring a certain number of comments per review. It's creating conditions where genuine engagement is the path of least resistance.
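
As an illustration of the kind of subtle bug a careful read catches, consider the off-by-one in pagination logic mentioned above. This is a hypothetical sketch, not code from any real system. Both functions look reasonable at a glance, and they agree whenever the item count divides evenly by the page size.

```python
def page_count_buggy(total_items: int, page_size: int) -> int:
    """Looks plausible, but floor division drops a partial last page."""
    return total_items // page_size


def page_count(total_items: int, page_size: int) -> int:
    """Ceiling division: a partial last page still counts as a page."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return (total_items + page_size - 1) // page_size
```

With 101 items and a page size of 20, the first version reports 5 pages and silently hides the last item. The second reports 6. A two-minute LGTM is unlikely to notice the difference, and so is a test suite that only uses round numbers.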

The Blocking Problem

The opposite failure takes several forms: the review that takes three days to arrive, the review that arrives with twelve blocking comments requiring significant rework, and the reviewer who has taken ownership of the PR and is essentially rewriting it through comments.

Context switching is expensive. An engineer who submits a PR and then has to wait two days for review has moved on to other work. When the review arrives with substantive feedback, switching back into the context of that PR costs significant cognitive overhead. If the feedback requires several hours of rework, the engineer has now spent days on a task they thought was done. This is deeply demoralizing, and it's a significant source of the anti-review sentiment you'll find in engineers who have experienced it repeatedly.

The research on this is fairly clear: reviews that arrive within four hours maintain the author's context and result in faster iteration. Reviews that arrive within a day are still functional. Reviews that take longer than 24 business hours are causing flow disruption that starts to outweigh their value.

The other side of the blocking problem is reviewers who write too many comments, or comments that are about personal preference rather than genuine quality concerns. A reviewer who consistently leaves 15 comments on every PR, many of which amount to "I would have done this slightly differently," is not improving quality. They're imposing a tax on every PR, paid out of the author's attention. Over time, authors start to either avoid that reviewer or push back defensively on every comment, which destroys the social contract that makes good review possible.

There's a useful distinction between blocking comments (this must change before merge), non-blocking suggestions (I'd consider this, but it's your call), and observations (I notice this approach — just thinking about whether we've considered X). Making this distinction explicit in your comments — some teams use prefixes like "nit:", "blocking:", "thought:" — reduces the friction of knowing which feedback needs action.
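
If your team adopts prefixes, the convention is simple enough to check mechanically. The sketch below shows one possible mapping, assuming "nit:" marks a non-blocking suggestion and "thought:" marks an observation. The function names and the handling of unlabeled comments are illustrative assumptions, not a standard.

```python
from __future__ import annotations

from enum import Enum


class Kind(Enum):
    BLOCKING = "blocking"        # must change before merge
    SUGGESTION = "suggestion"    # non-blocking, author's call
    OBSERVATION = "observation"  # a thought, no action expected
    UNLABELED = "unlabeled"      # no recognized prefix


# Prefixes come from the text; which category each maps to is an assumption.
PREFIX_TO_KIND = {
    "blocking:": Kind.BLOCKING,
    "nit:": Kind.SUGGESTION,
    "thought:": Kind.OBSERVATION,
}


def classify(comment: str) -> Kind:
    """Classify a review comment by its leading prefix, case-insensitively."""
    text = comment.strip().lower()
    for prefix, kind in PREFIX_TO_KIND.items():
        if text.startswith(prefix):
            return kind
    return Kind.UNLABELED


def must_fix_before_merge(comments: list[str]) -> list[str]:
    """Return only the comments the author has to act on before merging."""
    return [c for c in comments if classify(c) is Kind.BLOCKING]
```

The point is less the tooling than the habit: an author scanning a review should be able to tell in seconds which two comments gate the merge and which ten are optional.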

PR Size as a Review Quality Lever

The single most powerful intervention for improving code review quality is enforcing small PRs. This is a constraint on the author, not the reviewer, which makes it feel like the wrong lever — but it isn't.

Small PRs get reviewed. They get reviewed carefully, because a reviewer can hold the whole change in working memory while reading. They get reviewed quickly, because the investment is small. They produce better comments, because the context is narrow and clear.

Large PRs don't get reviewed. They get rubber-stamped, or they get reviewed superficially while the reviewer is overwhelmed by volume. They produce either too many comments (the reviewer found so many things to discuss that the conversation becomes unmanageable) or too few (the reviewer gave up and approved). Neither outcome is good.

In my experience, PRs under 300 lines of code (excluding generated code and test files) get reviewed well. PRs over 500 lines start to degrade in review quality. PRs over 1,000 lines are effectively unreviewed regardless of how conscientious the reviewer is trying to be.
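
These thresholds are easy to turn into a quick check. The sketch below is illustrative only. The exclusion patterns are hypothetical and depend on your repository layout, and since the numbers above say nothing specific about the 300 to 500 range, it is labeled "borderline" here as an assumption.

```python
from __future__ import annotations

from fnmatch import fnmatch

# Hypothetical patterns for test and generated files; adjust to your repo.
EXCLUDED_PATTERNS = ("tests/*", "*_test.py", "*.generated.*", "*.lock")


def reviewable_lines(changed: dict[str, int]) -> int:
    """Sum changed lines per file, skipping test and generated files."""
    return sum(
        lines
        for path, lines in changed.items()
        if not any(fnmatch(path, pattern) for pattern in EXCLUDED_PATTERNS)
    )


def size_band(lines: int) -> str:
    """Map a reviewable line count onto the rough bands described above."""
    if lines < 300:
        return "reviews well"
    if lines <= 500:
        return "borderline"  # assumption: the text gives no verdict here
    if lines <= 1000:
        return "review quality degrading"
    return "effectively unreviewed"
```

For example, a change touching 240 lines of application code, 310 lines of tests, and a 900-line generated client counts as 240 reviewable lines and lands in the first band.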

The challenge is that not every feature can be broken into sub-300-line chunks. The discipline of breaking large features into reviewable increments is a real skill, and it's worth developing deliberately. The typical approach is: split the feature branch into a series of smaller PRs — infrastructure/scaffolding first, then the core logic, then the wiring and integration. Each can be reviewed and merged independently. The feature branch accumulates the whole change, but the review happens in digestible increments.

Author Responsibility: The PR Description

The best code review I've ever received was on a PR I wrote with a thorough description. The second-best was on a PR I wrote with a thorough description. There's a pattern here.

Reviewers can't give good feedback without context. Why was this change made? What was the alternative approach, and why was it rejected? What are the risks or unknowns? What do you specifically want the reviewer to focus on?

A PR with a blank or single-sentence description forces the reviewer to reverse-engineer all of this from the code itself, which is slow, uncertain, and means they're spending their cognitive budget on understanding the context rather than evaluating the quality.

A PR with a thorough description tells the reviewer: here's what I'm doing, here's why, here's where I'm uncertain, here's what I want you to focus on. The reviewer can engage immediately at the right level.

The PR description also serves a future purpose: it becomes part of the change history. When an engineer is investigating an old commit with git blame six months later, a thorough PR description tells them the story behind the change. That pays off in every refactor and incident that touches that code in the future.

As a practical standard: every non-trivial PR should have a description that a new team member could read to understand what changed and why, without reading the code first.
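
One way to make this standard checkable is a lightweight template. The section names below are hypothetical, derived from the questions above (what changed, why, alternatives, risks, where to focus). The checker is a sketch that only looks for a "Heading:" line and does not judge whether the content under it is any good.

```python
from __future__ import annotations

# Hypothetical section names, based on the questions a reviewer needs answered.
REQUIRED_SECTIONS = (
    "What changed",
    "Why",
    "Alternatives considered",
    "Risks and unknowns",
    "Where to focus",
)


def missing_sections(description: str) -> list[str]:
    """Return the required headings that have no 'Heading:' line at all."""
    lines = [line.strip().lower() for line in description.splitlines()]
    return [
        section
        for section in REQUIRED_SECTIONS
        if not any(line.startswith(section.lower() + ":") for line in lines)
    ]
```

A check like this is best used as a nudge in the PR form rather than a merge gate. It can tell you a section is absent, but only a human can tell you whether the description would make sense to a new team member.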

Giving Difficult Feedback

Sometimes the code has a fundamental problem — a design choice that will cause pain for years, an approach that misunderstands a core concept, a performance issue that will cause an incident under load. Giving this feedback without damaging the professional relationship requires skill that most engineers are never explicitly taught.

The most important principle is to critique the code, not the engineer. "This approach has a problem" is categorically different from "you misunderstood the requirement." The code is an artifact; the engineer is a person. The former can be changed without anyone losing face.

Be direct and specific. "I'm worried this approach won't scale once we're above ~10k users, because every request is doing a full table scan. Could we add a DB index on user_id and filter at the query level?" is better than "this might have performance issues." Vagueness forces the author to guess what you mean, which either leads to the wrong fix or to an uncomfortable follow-up conversation.

Use genuine questions for genuine uncertainty: "I'm not sure this handles the case where the session expires mid-checkout — what happens here?" This invites collaboration rather than compliance, and it's honest — if you're not certain there's a bug, framing it as a question is more accurate than framing it as a finding.

For architectural concerns, the kind of feedback that might require significant rework, make sure the author knows early. A fundamental design problem that first surfaces in the 47th comment of a large PR review is demoralizing. If the problem is big enough, open a conversation before you write up the review: "I had a thought about the approach in this PR. Can we talk through it before I write everything up?"

Building a Review Culture

All of the above is technique. Technique without culture doesn't hold.

Building a review culture starts with senior engineers who visibly model what good review looks like. When the most experienced person on the team writes comments like "I'd approach this differently — here's how I'm thinking about the trade-offs," it signals that review is for development, not just defect detection. When the same person rubber-stamps every PR because they're too busy, it signals the opposite.

It also requires that feedback be welcomed, not just tolerated. Engineers who receive review comments with defensiveness — who argue every point, who take feedback as personal criticism — create a chilling effect. Reviewers stop pushing back because it's not worth the friction. Over time, the review process becomes ritualistic rather than substantive.

The receiving skill is just as important as the giving skill. Treating review comments as information — as a peer's attempt to help you produce better work — is a professional discipline that most engineers don't explicitly develop. It's worth naming and discussing in team retrospectives: how are we feeling about the review process? Is feedback being received well? Is anything holding back honest engagement?

One concrete practice that consistently improves review culture is occasionally pairing on reviews, particularly when a junior engineer is reviewing the work of a senior engineer, or when the team is working in an unfamiliar area of the codebase. Sitting together for 20 minutes to walk through a PR, discuss it, and write comments collaboratively is an extremely efficient knowledge transfer activity. The reviewer learns from explaining their thinking; the author gets richer feedback; and both engineers leave the session with shared context.

Code review is a conversation. The best teams design it that way.
