Code Review Automation That Doesn’t Grind Delivery to a Halt

A paved-road approach to PR checks, ownership, and merge discipline that keeps quality high and lead time low—without a bespoke Rube Goldberg machine.

Block only what protects quality. Annotate everything else. That’s how you keep engineers shipping without playing CI roulette.

The PR carousel you’ve probably lived through

Two approvals, 17 “required” checks, flaky e2e, and a CI run that takes longer than a coffee roasting cycle. I watched a fintech on GKE stall at 42 hours median PR cycle time because every change—docs edits included—ran 40 minutes of CI, plus three reruns for flakes. The kicker: bug escape rate was unchanged. We weren’t protecting quality; we were taxing delivery.

I’ve seen this fail at unicorns and five-person startups. The pattern is the same: bespoke CI logic, every tool under the sun, and no paved road. Here’s the approach we use at GitPlumbers that actually works: minimal blocking gates, heavy use of annotations, clear ownership, and a merge queue. Paved road over Rube Goldberg.

Principles that keep quality high and flow fast

  • Block only what you must: compilation, unit tests, type checks, critical security scanning. Everything else annotates.
  • Fast feedback wins: aim for <5 minutes to first signal, <10 minutes total on typical PRs.
  • Prefer defaults: use GitHub Actions, built-in merge queue, CODEOWNERS, pre-commit. Avoid bespoke orchestrations unless you’ve outgrown them.
  • Right-size per repo: monorepo? Use paths filters and dependency graph to limit scope. Microservices? Keep each repo’s checks stupid-simple.
  • Own your hot paths: make risk areas explicit with CODEOWNERS and SLOs for review response time.
  • Measure, then tune: PR cycle time, review wait time, CI duration, flake rate, and change failure rate (ties to MTTR/SLOs).
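You don't need a dashboard product to start measuring. A minimal TypeScript sketch for the headline metric; the `PrRecord` shape is a hypothetical simplification of whatever your PR export (GitHub API, warehouse table) actually returns:

```typescript
// Hypothetical export shape: one record per merged PR.
interface PrRecord {
  openedAt: string;  // ISO timestamp when the PR was opened
  mergedAt: string;  // ISO timestamp when it merged
}

// Median PR cycle time in hours, the number to watch week over week.
function medianCycleTimeHours(prs: PrRecord[]): number {
  if (prs.length === 0) return 0;
  const hours = prs
    .map(p => (Date.parse(p.mergedAt) - Date.parse(p.openedAt)) / 3_600_000)
    .sort((a, b) => a - b);
  const mid = Math.floor(hours.length / 2);
  // Even count: average the two middle values; odd: take the middle one.
  return hours.length % 2 ? hours[mid] : (hours[mid - 1] + hours[mid]) / 2;
}
```

Track the median, not the mean: one abandoned PR that merged after three weeks will otherwise dominate the number.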

A paved-road baseline CI: ~20 lines that save days

This is the minimum viable, production-grade CI for a TypeScript service. It blocks on build, tests, types, and format. Everything else annotates.

# .github/workflows/ci.yml
name: ci
on:
  pull_request:
    paths:
      - '**/*.ts'
      - '**/*.tsx'
      - 'package.json'
      - 'pnpm-lock.yaml'
      - '.github/workflows/**'
concurrency:
  group: pr-${{ github.ref }}
  cancel-in-progress: true
jobs:
  fast-checks:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4
      - run: corepack enable # pnpm must exist before setup-node resolves the pnpm cache
      - uses: actions/setup-node@v4
        with: { node-version: '20', cache: 'pnpm' }
      - run: pnpm i --frozen-lockfile
      - run: pnpm format:check && pnpm lint && pnpm tsc --noEmit && pnpm test

Package.json scripts keep it boring:

{
  "scripts": {
    "format": "prettier -w .",
    "format:check": "prettier -c .",
    "lint": "eslint . --max-warnings=0",
    "test": "vitest run"
  }
}
  • Why this works: it’s fast, cached, and scoped by paths. It blocks only on signals that correlate with real failures. Add language-specific equivalents (ruff/black/pytest, golangci-lint/go test, etc.).
  • Gotcha: resist job fan‑out (10 parallel “required” checks). Aggregate into one or two checks to cut flake and reruns.

Annotate, don’t block: make robots helpful

Blockers are expensive. Annotations are cheap. Use reviewdog or danger to surface issues without stopping merges.

# Add to ci.yml for annotations (non-blocking)
  annotations:
    runs-on: ubuntu-latest
    if: always()
    steps:
      - uses: actions/checkout@v4
      - uses: reviewdog/action-eslint@v1
        with:
          reporter: github-pr-review
          eslint_flags: '.'
      - uses: reviewdog/action-actionlint@v1
        with:
          reporter: github-pr-check

Or a small Dangerfile.ts that comments on PR hygiene without blocking delivery:

// Dangerfile.ts
import { danger, warn, message } from 'danger'

const bigPR = (danger.github.pr.additions ?? 0) + (danger.github.pr.deletions ?? 0) > 800
if (bigPR) warn('Large PR detected (>800 LOC). Consider splitting to reduce review latency.')

if (!danger.github.pr.body || danger.github.pr.body.length < 20) {
  message('Add context in the PR description for faster reviews. Template linked above.')
}

const hasLockfile = danger.git.modified_files.concat(danger.git.created_files).some(f => /lock\.(json|yaml)$/.test(f))
if (hasLockfile) message('Lockfile changed. Ensure reproducible builds and test in canary before prod.')
  • Blockers: build, unit tests, type checks, formatter, critical SAST/secret scan.
  • Annotations: style, docs, non-critical SAST, risk reminders, PR size warnings, flaky test hints.

If everything is a P0, nothing is. Promote only what truly protects quality to “required.”

Ownership and merge discipline: stop the “race to green”

Set clear responsibility lines and serialize merges through a queue. It’s pedestrian, and it works.

# CODEOWNERS (examples)
/apps/payments/**/*       @payments-team
/libs/crypto/**/*         @security
/infrastructure/terraform @platform
  • Require at least one CODEOWNERS review for hot paths.
  • Set branch protection with a merge queue (GitHub’s native queue or Mergify). This validates the exact tip you’ll merge, cuts “green-then-red” incidents, and reduces MTTR.

If you’re not ready for native queues, Mergify works fine:

# .mergify.yml
queue_rules:
  - name: default
    conditions:
      - check-success=ci
pull_request_rules:
  - name: merge via queue
    conditions:
      - check-success=ci
      - '#approved-reviews-by>=1'
      - base=main
    actions:
      queue: { name: default }
  • Tie this to SRE-style SLOs for review response time (e.g., owners respond within 4 business hours). Your PR cycle time will drop just by making it explicit.
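Making the SLO explicit only works if you measure it. A sketch, assuming you've exported review-request events into a simple shape (the field names are made up, and it uses wall-clock hours instead of business hours for brevity):

```typescript
// Hypothetical export: one record per "review requested" event.
interface ReviewRequest {
  requestedAt: string;             // ISO timestamp of the review request
  firstResponseAt: string | null;  // first owner response; null = never answered
}

// Fraction of review requests answered within the SLO window (default 4h).
// Unanswered requests count as breaches.
function sloCompliance(requests: ReviewRequest[], sloHours = 4): number {
  if (requests.length === 0) return 1; // vacuously compliant
  const withinSlo = requests.filter(r =>
    r.firstResponseAt !== null &&
    Date.parse(r.firstResponseAt) - Date.parse(r.requestedAt) <= sloHours * 3_600_000
  ).length;
  return withinSlo / requests.length;
}
```

Publish this number per team weekly; in my experience the act of publishing it moves it more than any tooling change.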

Monorepo without bespoke yak shaving

I’ve seen teams build a homegrown build graph that rivals Bazel to avoid running tests for unrelated packages. Unless you’re Google, start smaller.

  • Use paths filters to avoid waking CI on irrelevant changes.
  • Use turbo or nx for incremental builds if you truly need it, but keep required checks aggregated.

Simple selective CI:

# matrix builds only when relevant paths change
on:
  pull_request:
    paths:
      - 'services/payments/**'
      - 'packages/shared/**'

jobs:
  svc:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        svc: [payments]
    steps:
      - uses: actions/checkout@v4
      - run: corepack enable && pnpm i --frozen-lockfile
      - run: pnpm --filter "./services/${{ matrix.svc }}" build && pnpm --filter "./services/${{ matrix.svc }}" test

If you adopt turbo:

# turbo config keeps the graph simple and caches by task, not PR
pnpm dlx turbo run build test --filter="...[HEAD^]"

Cost/benefit:

  • Before: bespoke graph, 30+ required checks, 23–40 min CI, 3% flake.
  • After: paths filters + one aggregated check per language, 7–10 min CI, <1% flake. No custom infra to maintain.

Guardrails for AI-generated code without blocking the world

Whether you like it or not, your repo has AI-generated code. Some of it’s great; some is… vibe coding. Catch bad patterns early with fast, mostly annotating checks.

  • Add lightweight SAST and secrets scanning:
# .github/workflows/security.yml
name: security
on: pull_request
jobs:
  fast-security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 } # full history so trufflehog can diff against the base branch
      - uses: trufflesecurity/trufflehog@main
      - uses: returntocorp/semgrep-action@v1
        with: { config: 'p/owasp-top-ten' }
  • For Python/Go/TS, prefer paved-road tools:
    • Python: ruff, black, pytest, bandit
    • Go: golangci-lint, go test, govulncheck
    • TS: eslint, tsc, jest/vitest

Pre-commit keeps diffs clean on dev machines:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/psf/black
    rev: 24.8.0
    hooks: [ { id: black } ]
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.8
    hooks: [ { id: ruff, args: [--fix] } ]
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks: [ { id: check-merge-conflict }, { id: end-of-file-fixer } ]

And let a bot maintain dependencies so humans don’t waste cycles:

// renovate.json
{
  "extends": ["config:recommended"],
  "schedule": ["after 8pm on sunday"],
  "rangeStrategy": "bump",
  "automerge": true,
  "automergeType": "branch",
  "packageRules": [
    { "matchManagers": ["npm"], "groupName": "npm minor/patch" },
    { "matchManagers": ["pip"], "groupName": "pip minor/patch" }
  ]
}
  • Use annotations for most SAST findings; promote only critical vulns to blocking. This avoids security theater that delays everything yet misses real risks.
  • Run heavier DAST or chaos tests post-merge in staging with canary deployment and a circuit breaker; measure with Prometheus and SLOs rather than arguing in PRs.
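The "promote only critical" split is usually a few lines of glue between the scanner's output and CI. A sketch, assuming findings have already been parsed into a simple shape (the `Finding` interface is an assumption, not a real scanner schema); it emits GitHub workflow commands for annotations and fails the job only on criticals:

```typescript
// Assumed, simplified finding shape; map your scanner's SARIF output into it.
interface Finding {
  ruleId: string;
  severity: 'low' | 'medium' | 'high' | 'critical';
  file: string;
}

// Split findings: only criticals block, everything else annotates.
function gate(findings: Finding[]): { blocking: Finding[]; annotations: Finding[] } {
  return {
    blocking: findings.filter(f => f.severity === 'critical'),
    annotations: findings.filter(f => f.severity !== 'critical'),
  };
}

// Print GitHub workflow commands; return the exit code for the CI step.
function report({ blocking, annotations }: ReturnType<typeof gate>): number {
  for (const f of annotations) console.log(`::warning file=${f.file}::${f.ruleId} (${f.severity})`);
  for (const f of blocking) console.log(`::error file=${f.file}::${f.ruleId} (critical)`);
  return blocking.length > 0 ? 1 : 0;
}
```

The `::warning`/`::error` lines surface inline in the PR diff without any extra app install, which is what makes annotating this cheap.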

Before/after results from real teams

  • Payments service (Node, GKE, Istio)

    • Before: 17 required checks, 28 min CI, 2.7% re-run rate, 42h PR cycle time, spike of red-after-merge incidents.
    • After (paved road + merge queue): 4 required checks, 8.5 min CI, 0.4% re-run, 12h PR cycle time, 38% drop in incident rate and faster MTTR due to smaller, serialized merges.
  • Monorepo (Go + TS, ArgoCD GitOps, Terraform infra)

    • Before: bespoke build graph, frequent flakes, engineers babysitting CI.
    • After (paths filters + aggregated checks): CI minutes cut by 63%, reviewers assigned via CODEOWNERS, deployment SLOs stabilized.
  • Legacy modernization (Python)

    • Before: untyped, mixed style, long reviews arguing about whitespace. AI-generated code sneaking in unsafe patterns.
    • After (pre-commit + ruff/black + bandit annotations): style debates vanished, PRs shrank, caught three AI hallucination bugs pre-merge without blocking unrelated work.

A 30-day rollout that won’t hijack your quarter

  1. Week 1: baseline
    • Add the minimal CI (build, unit tests, types, format).
    • Turn on CODEOWNERS for hot paths; require 1 owner approval.
    • Enable pre-commit for local hygiene.
  2. Week 2: signal tuning
    • Add reviewdog or danger annotations; demote noisy checks from “required.”
    • Introduce secrets/SAST scanning as annotations; promote only critical rules.
  3. Week 3: merge discipline
    • Enable merge queue (GitHub or Mergify). Aggregate required checks into 1–3 contexts.
    • Set and socialize review response SLOs.
  4. Week 4: optimization and guardrails
    • Add paths filters; enable caching; cap timeouts to keep runs honest.
    • Automate dependency PRs with Renovate.
    • Instrument: track PR cycle time, CI duration, re-run rate, and change failure rate. Trim anything that doesn’t move those numbers.
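For the re-run rate specifically, a tiny sketch over workflow-run records is enough (the `RunRecord` shape is an assumption; GitHub does expose per-run attempt counts you can map into it):

```typescript
// Assumed shape: one record per workflow run, with total attempts observed.
interface RunRecord { runId: number; attempts: number }

// Re-run rate: share of runs that needed more than one attempt to go green.
function rerunRate(runs: RunRecord[]): number {
  if (runs.length === 0) return 0;
  return runs.filter(r => r.attempts > 1).length / runs.length;
}

// Alert when flake exceeds the budget (the <1% target from above).
function exceedsFlakeBudget(runs: RunRecord[], budget = 0.01): boolean {
  return rerunRate(runs) > budget;
}
```

When the budget trips, quarantine or demote the offending check rather than teaching the team to click "re-run".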

If you plateau and still need more, then—and only then—consider heavier tools (e.g., turbo, build graphs). Don’t start there.

What I’d do differently (and what to avoid)

  • Don’t let every team bring their own linter and CI conventions. Set paved-road presets with escape hatches.
  • Avoid pull_request_target unless you fully understand the security model.
  • Don’t create 20 required checks; merge them.
  • Don’t block on flaky e2e; run them post-merge against a staging canary with Prometheus SLOs.
  • Avoid “drive-by approvals.” CODEOWNERS or it didn’t happen.
  • Resist custom CI orchestration until GitHub Actions defaults truly limit you. 90% of teams never hit that wall.

If you want a pair of seasoned eyes to set this up without yak shaving, that’s literally what GitPlumbers does.

Key takeaways

  • Favor paved-road defaults over bespoke CI logic; keep required checks to the minimum set that protects quality.
  • Annotate liberally, block sparingly: make lint/security findings visible without stalling unrelated changes.
  • Keep PR feedback under 5 minutes and flake under 1%; use caching, paths filters, and concurrency to cut wait time.
  • Use CODEOWNERS and a merge queue to avoid drive-by approvals and “race to green” merges.
  • Automate the boring: pre-commit hooks, auto-formatters, dependency PRs via Renovate—but don’t make them blocking.
  • Measure PR cycle time, review wait time, and re-run rates; tune gates based on data, not vibes.
  • Start with a 30-day rollout: baseline checks, ownership, annotations, and merge discipline—then iterate.

Implementation checklist

  • Define 3–5 required checks: build, unit tests, type check, formatting, critical security scan.
  • Enable review annotations with `reviewdog` or `danger` for non-blocking signals.
  • Add `CODEOWNERS` for hot paths and risky areas; require at least one owner review.
  • Adopt GitHub Merge Queue (or Mergify) to serialize and validate merges.
  • Enforce fast feedback: cache deps, use `paths` filters, and limit job fan-out.
  • Add pre-commit hooks for formatters and basic linting; keep PR diffs clean.
  • Automate dependency updates with Renovate; confine to off-peak windows.
  • Track PR cycle time, CI duration, and re-run rate weekly; remove noisy checks.

Questions we hear from teams

How many required checks is “too many”?
Start with 3–5: build, unit tests, type check, formatter, and a critical security/secret scan. If your CI list doesn’t fit on a laptop screen without scrolling, you’ve almost certainly over-optimized the wrong thing.
Should we block on end-to-end tests?
Usually no. Run e2e post-merge against a staging canary with a circuit breaker and Prometheus SLOs. Keep PR checks fast and reliable; let deployment safety nets handle integration risk.
Do we need Bazel/Turborepo/Nx for a monorepo?
Not at first. Start with `paths` filters and aggregated checks. Add Turborepo/Nx when CI time and cache hit rates prove the need. Avoid bespoke graph builders unless you truly hit scale.
How do we keep AI-generated code from slipping risky patterns?
Use fast SAST and secret scans as annotations, enforce format/lint/type checks, and promote only critical findings to blocking. Pair that with `CODEOWNERS` on sensitive areas (crypto, auth, PII) for human review.
What metrics should we track to know it’s working?
Median PR cycle time, time to first CI signal, CI duration, re-run rate (flake), review wait time, and post-merge change failure rate. Decisions should be based on these, not vibes.

Ready to modernize your codebase?

Let GitPlumbers help you transform AI-generated chaos into clean, scalable applications.

Get a paved-road PR pipeline without the yak shaving. We fix AI-generated vibe code before it hurts you.
