Your Code Review Queue Isn’t a Team Problem — It’s a Missing “Paved Road” Problem
Automate the boring, standardize the defaults, and keep human review for the stuff that actually matters. Here’s how to raise quality without turning PRs into a week-long waiting room.
If your staff engineers are leaving “run prettier” comments, you’re paying senior rates for work the CI system should do for free.
The failure mode I keep seeing: humans doing machine work
I’ve watched teams with genuinely strong engineers grind to a halt because code review becomes the choke point for everything: formatting debates, “did you run the linter?”, missing tests, dependency drift, security concerns, and the occasional architecture argument buried in a 900-line PR.
Then leadership asks the classic question: “Why is delivery slowing down when headcount went up?” And the answer is usually sitting in plain sight: reviewers are spending their limited attention on deterministic checks that a bot should have caught in 30 seconds.
If your best staff engineers are leaving comments like “please sort imports” or “run prettier,” you don’t have a people problem. You have a paved-road problem.
What actually works: guardrails, not gatekeeping
Good review automation does two things:
- Catches the boring, repeatable stuff before a human ever looks.
- Keeps the path to merge predictable so engineers don’t batch work “until it’s worth the review tax.”
Bad review automation does the opposite:
- Adds 25 checks that time out, flake, or create false positives.
- Forces bespoke rules per repo (“snowflake governance”), so nobody remembers what “good” looks like.
A simple design constraint that keeps you honest:
- Put a hard budget on your “merge-critical” pipeline: <10 minutes wall-clock.
- Anything slower goes to:
- scheduled builds,
- nightly security scans,
- or label-triggered runs (e.g., `risk:high`).
This is the core trade-off: you can have more coverage or more flow, but trying to get both on every PR usually backfires.
The paved-road baseline (the stuff that should be automatic everywhere)
If I’m building a baseline for a platform org, I want one reusable workflow (or CI template) that every service inherits. Same names for checks, same pass/fail behavior, same time budget.
Here’s a GitHub Actions example that covers the “PR hygiene” basics without getting fancy:
```yaml
# .github/workflows/pr-gate.yml
name: pr-gate

on:
  pull_request:

concurrency:
  group: pr-gate-${{ github.ref }}
  cancel-in-progress: true

jobs:
  fast-checks:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      - name: Format check
        run: npm run format:check
      - name: Lint
        run: npm run lint
      - name: Typecheck
        run: npm run typecheck
      - name: Unit tests
        run: npm test -- --ci
```

A few hard-earned notes:
- Caching is not optional. If `npm ci` takes 6–8 minutes, your whole idea collapses.
- Timeouts matter. A check that hangs turns into “just re-run CI until it works,” which is how teams normalize garbage.
- Don’t block on everything on day one. Start with deterministic checks that you trust.
On the repo side, make the defaults unmissable:
- `.editorconfig`
- `.gitattributes`
- `prettier`/`eslint`/`ruff`/`gofmt` defaults
- a PR template that asks for risk and test evidence
Example PR template that nudges behavior without turning into bureaucracy:
```markdown
<!-- .github/pull_request_template.md -->
## What changed?

## Why?

## How did you test?
- [ ] Unit tests
- [ ] Integration/contract tests
- [ ] Manual validation

## Risk
- [ ] Low (refactor/docs)
- [ ] Medium (behavior change)
- [ ] High (auth/payments/data migration)

## Rollout plan
- [ ] Behind a feature flag
- [ ] Canary
- [ ] Immediate
```

This isn’t for compliance theater. It’s so reviewers can stop playing detective.
Keep humans on the hook for the right things (CODEOWNERS + merge policy)
The fastest way to slow delivery is requiring “two approvals from anyone” on every PR. The second fastest is requiring “approval from everyone who ever touched this code.” I’ve seen both.
Use CODEOWNERS to route review to the people who actually carry the pager for the area, but keep it pragmatic:
```
# CODEOWNERS
# Default: one platform owner for infra-y changes
/.github/   @platform-team
/terraform/ @platform-team
/k8s/       @platform-team

# Domain ownership
/services/payments/ @payments-oncall
/services/auth/     @identity-oncall

# Fallback: if you must, keep it small
* @eng-leads
```

Then set branch protections (or GitLab equivalents) so the machine does the policing:
- Require status checks: `pr-gate / fast-checks`
- Require 1 approval (or codeowner approval only for certain paths)
- Require merge queue (GitHub) or “merge trains” (GitLab) if you’re at any real scale
- Allow auto-merge when checks pass
Why merge queue matters: without it, you get the classic “green PR, red main” because two PRs passed independently but conflict together. Merge queue turns that into a solved problem.
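One wiring detail that bites teams: a required workflow that only triggers on `pull_request` never runs inside GitHub’s merge queue, so queued PRs sit waiting on a check that will never report. The fix is one extra trigger (a minimal sketch, assuming GitHub’s merge queue):

```yaml
# pr-gate.yml: run the same required checks for queued merges
on:
  pull_request:
  merge_group:
```

With auto-merge on top, engineers can opt in per PR (`gh pr merge --auto --squash`) and move on; the queue and the checks do the rest.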
If you want a simple “fast lane” rule:
- Low-risk changes (docs, internal refactors) → 1 approval + fast checks
- High-risk paths (auth, payments, migrations) → codeowner approval + fast checks + extra checks
You can implement the extra checks with path filters or labels.
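For the path-filter flavor, the native `paths:` trigger usually suffices. A sketch, with path names that are illustrative rather than prescriptive:

```yaml
# .github/workflows/high-risk-checks.yml: runs only when risky paths change
name: high-risk-checks
on:
  pull_request:
    paths:
      - 'services/payments/**'
      - 'services/auth/**'
      - 'migrations/**'
jobs:
  extra-checks:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/run-integration-tests.sh
```

One caveat: if you mark a path-filtered workflow as required, PRs that never trigger it show a permanently pending check. Keep it non-required, or use the label-triggered job shown in the next section for the blocking version.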
Tiered checks: spend CI minutes where risk actually lives
Teams love the idea of running everything on every PR:
- SAST
- dependency scanning
- container scanning
- full integration suite
- performance checks
- chaos tests (yes, really)
I’ve seen that movie. It ends with engineers pushing commits at 5pm, then going home while CI grinds until midnight.
Instead, design tiers:
- Tier 1 (always, <10 min): format/lint/typecheck/unit tests, build, basic policy checks
- Tier 2 (risk-triggered, 10–30 min): integration/contract tests, DB migration validation, API diff checks
- Tier 3 (async, not blocking): CodeQL deep queries, container scans, nightly end-to-end
Here’s a practical label-triggered job:
```yaml
# inside pr-gate.yml
  integration:
    if: contains(github.event.pull_request.labels.*.name, 'risk:high')
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/run-integration-tests.sh
```

The trade-off is obvious: a high-risk PR now has a longer path to merge. That’s fine. That’s the point. You’re paying CI time where the blast radius is real.
For security, start with platform-native tools before buying/maintaining something exotic:
- GitHub Advanced Security: `CodeQL`, secret scanning, dependency review
- GitLab Ultimate equivalents
A minimal CodeQL setup (often acceptable as non-blocking initially):
```yaml
name: codeql
on:
  pull_request:
  schedule:
    - cron: '0 3 * * 1-5'
jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: javascript-typescript
      - uses: github/codeql-action/analyze@v3
```

Tune it, then decide what becomes blocking. The fastest way to make security irrelevant is to drown teams in false positives.
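One setup gotcha: uploading results needs the `security-events: write` permission, so if your org defaults the workflow token to read-only, grant it explicitly on the job:

```yaml
  analyze:
    permissions:
      contents: read
      security-events: write  # required to upload CodeQL results
```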
Before/after: what changes when you stop treating review as QA
A real pattern from a client I’ll anonymize (payments adjacent, Node + Terraform, GitHub):
Before
- PR checks were inconsistent by repo. Some had `eslint`, some didn’t.
- “Approval” meant “someone glanced at it.” Review comments were mostly nits.
- CI took 25–40 minutes on average. Lots of flaky integration tests.
- Outcome metrics:
- PR lead time: 2.4 days median
- Rework (PR reopened / follow-up fix within 48h): ~18%
- Escaped defects causing rollback/hotfix: 2–3 per sprint
After (6 weeks)
- Standard `pr-gate` workflow rolled out to all services via a repo template.
- Tier 1 checks <10 minutes with caching and test selection.
- Tier 2 integration tests triggered only on `risk:high` or changes under `/services/payments`.
- `CODEOWNERS` required only for high-blast-radius paths.
- Merge queue enabled for main.
Results weren’t magical, but they were real:
- PR lead time: 1.1 days median (reviews happened sooner because PRs were smaller and “pre-cleaned”)
- Rework: ~9% (bots caught the easy stuff, humans focused on design and edge cases)
- Rollback/hotfixes: down ~40% (mostly from catching dependency/API breaks earlier)
The biggest win wasn’t a tool. It was making the expected path obvious and cheap.
Cost/benefit: where automation pays off (and where it’s a trap)
Here’s the blunt version I give leaders: if you need a custom bot to enforce a basic rule, your developer experience is already too complicated.
High ROI automation (usually worth it):
- Formatting and linting (deterministic, zero debate)
- Typechecking (fast signal)
- Unit tests (fast feedback)
- Dependency review (catches “oops I upgraded `openssl` and broke prod”; sketch below)
- Secret scanning (stops the “rotating keys at 2am” incident)
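The dependency-review case is nearly free on GitHub: the first-party action fails a PR that introduces dependencies with known vulnerabilities. A minimal sketch (assumes the repo’s dependency graph is enabled):

```yaml
# .github/workflows/dependency-review.yml
name: dependency-review
on:
  pull_request:
jobs:
  dependency-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/dependency-review-action@v4
```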
Lower ROI (easy to overdo):
- Comment bots that produce 40 annotations per PR (engineers learn to ignore them)
- Heavy integration suites on every PR (death by CI)
- Custom “architecture compliance” linters (they rot, then everyone routes around them)
The hidden cost isn’t the SaaS bill — it’s the attention tax:
- Every flaky check increases review latency.
- Every false positive trains teams to treat red pipelines as noise.
- Every bespoke rule creates “local lore” that new hires have to rediscover.
When we do this kind of work at GitPlumbers, the goal is boring: one paved road, minimal exceptions, and metrics that prove it helped.
A rollout plan that doesn’t implode your org
If you try to boil the ocean, you’ll get a rebellion and a bunch of bypasses.
Week 1–2: standardize Tier 1
- One workflow/template (see the reusable-workflow sketch below)
- Format/lint/typecheck/unit tests
- Time budget <10 minutes
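The cheapest way to keep that workflow/template from drifting is a reusable workflow: each repo carries a tiny caller, and the real definition lives in one place. A sketch, assuming a central `your-org/ci-templates` repo (hypothetical name) whose `pr-gate.yml` declares `on: workflow_call`:

```yaml
# .github/workflows/pr-gate.yml in each service repo
name: pr-gate
on:
  pull_request:
jobs:
  fast-checks:
    uses: your-org/ci-templates/.github/workflows/pr-gate.yml@main
```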
Week 3–4: fix the top two sources of flake
- Quarantine flaky tests (sketch below)
- Add retries only where justified (and track them)
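One lightweight quarantine pattern, assuming Jest and an illustrative convention of parking flaky specs under `__quarantine__` directories: exclude them from the blocking run, and keep a non-blocking job so they still report instead of rotting silently.

```yaml
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      # blocking run: quarantined specs excluded
      - run: npm test -- --ci --testPathIgnorePatterns='__quarantine__'

  quarantined-tests:
    runs-on: ubuntu-latest
    continue-on-error: true  # shows up in the UI, never blocks merge
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test -- --ci --testPathPattern='__quarantine__'
```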
Week 5–6: add ownership + merge queue
- `CODEOWNERS` for high-risk paths
- Merge queue to stabilize main
- Enable auto-merge for compliant PRs
Ongoing: tune security and integration coverage
- Start non-blocking, then graduate checks once noise is low
- Move slow checks off the critical path
Instrument like you mean it (a starting point is sketched after this list):
- Review latency (time to first review)
- PR lead time (open → merge)
- CI duration and flake rate
- Escaped defects / rollbacks
- “Hotfix hours” on-call (a brutally honest KPI)
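You don’t need a metrics platform to start. A sketch of a nightly lead-time snapshot using the `gh` CLI (preinstalled on GitHub-hosted runners) and `jq`; where the number goes (dashboard, Slack, spreadsheet) is up to you:

```yaml
# .github/workflows/pr-metrics.yml: nightly PR lead-time snapshot
name: pr-metrics
on:
  schedule:
    - cron: '0 6 * * 1-5'
jobs:
  lead-time:
    runs-on: ubuntu-latest
    steps:
      - name: Median PR lead time in days (last 50 merged PRs)
        env:
          GH_TOKEN: ${{ github.token }}
          GH_REPO: ${{ github.repository }}
        run: |
          gh pr list --state merged --limit 50 --json createdAt,mergedAt \
            | jq '[.[] | ((.mergedAt | fromdate) - (.createdAt | fromdate)) / 86400] | sort | .[length/2 | floor]'
```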
If you’re seeing a wave of AI-generated PRs (or “vibe coding” experiments), this becomes even more important: machines should check machine-written code first, so humans can focus on correctness, safety, and operability.
If your process assumes every line was carefully crafted by a patient human, AI-assisted throughput will break it.
If you want a second set of eyes, GitPlumbers routinely helps teams consolidate CI/CD, tame review automation, and clean up AI-generated code so it doesn’t turn into next year’s incident report.
Key takeaways
- If humans are still checking formatting, imports, and obvious nits, your automation is underpowered — and your review throughput will always suffer.
- Set a hard latency budget for PR checks (e.g., **under 10 minutes** on the critical path) and push everything else to async or scheduled runs.
- Prefer **native platform features** (CODEOWNERS, required checks, merge queue, auto-merge) before introducing bespoke bots.
- Use **tiered validation**: fast checks for every PR, heavy checks only when risk signals are present (labels, paths, release branches).
- Measure outcomes that leaders actually care about: cycle time, rework rate, escaped defects, and on-call pain (MTTR).
Implementation checklist
- Define your PR “critical path” check suite and keep it <10 minutes
- Standardize formatting and linting with a single repo-level config
- Add CODEOWNERS with sane defaults (avoid 12-person approval chains)
- Require checks + 1 approval (or codeowner approval only where it matters)
- Enable merge queue (or equivalent) to stop “green-on-main, red-after-merge” surprises
- Turn on CodeQL/secret scanning and keep them non-blocking until tuned
- Adopt auto-merge for compliant PRs to reduce human batching and delays
- Instrument review and CI metrics (PR lead time, review latency, flaky test rate)
Questions we hear from teams
- Should we require two approvals on every PR?
- Usually no. It’s a blunt instrument that punishes low-risk work and doesn’t reliably improve correctness. A better default is **1 approval + required checks**, then use `CODEOWNERS` and tiered checks for high-risk paths (auth, payments, migrations).
- How do we keep automation from slowing delivery?
- Set a hard time budget for merge-critical checks (commonly **<10 minutes**). Everything else becomes risk-triggered or async. If you can’t meet the budget, fix caching, split tests, and quarantine flake before adding more checks.
- Do we need a custom review bot to enforce our rules?
- Rarely. Start with platform primitives: required checks, merge queue, `CODEOWNERS`, and native security scanning (e.g., `CodeQL`, secret scanning, dependency review). Custom bots become a maintenance product, and most teams underestimate that tax.
- How does this change with AI-generated code?
- AI increases throughput and variance. The paved road matters more: deterministic checks catch the obvious issues quickly, and humans focus on behavior, safety, operability, and risk. Tiered checks help you avoid turning every AI-assisted PR into a 40-minute pipeline.
Ready to modernize your codebase?
Let GitPlumbers help you transform AI-generated chaos into clean, scalable applications.
