Platform-productivity · Oct 21, 2025 · 8 minute read

Code Review Automation That Doesn’t Kill Velocity: A Paved-Road You Can Actually Live With

Guardrails, not gates. The reference CI pipeline, policy where it matters, and the merge-queue patterns that cut PR cycle time without letting quality rot.

Alex Pierce

Partner, Platform & Reliability at GitPlumbers

20 years in the trenches from bare-metal Linux to multi-cluster GitOps. Led platform and SRE teams at two unicorns, survived three monorepo rewrites, and still has opinions about Makefiles.

The PR treadmill you’ve lived through

You ship a feature, open a PR, and watch the checks queue: three linters, two scanners, a bespoke bot, and a full E2E suite on every keystroke. Half are flaky. Someone added a “just-in-case” check last quarter and forgot to pin versions. Your 60-line change sits for hours, reviewers context-switch, and lead time balloons. I’ve seen this movie at startups and at $10B unicorns. Automation meant to help becomes a tax.

Here’s the version that actually works: a paved road with boring defaults, guardrails not gates, and fast feedback. No bespoke snowflakes. You can tighten the screws later when it’s stable.

Principles: guardrails, paved roads, boring defaults

Catch the common 80% with four checks: format/lint, fast unit tests, dependency/security scan, and manifest/infra validation.
Prefer annotations over red builds early on. Use reviewdog to comment inline instead of failing jobs.
Selective execution with path filters. Don’t run K8s validators for docs-only PRs.
Single source of truth via a reusable CI workflow template; pin tool versions.
Heavy policy where it matters: auth, migrations, infra. Everything else gets lighter rules.
Make main always releasable with a merge queue and canaries. Shrink blast radius with trunk-based dev.
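
As a rough sketch of the annotations-over-red-builds principle, a job along these lines posts ESLint findings as inline review comments instead of failing the build. It assumes the `reviewdog/action-eslint` action and an ESLint config already present in the repo; the job name is illustrative and not part of the reference pipeline.

```yaml
# Hypothetical job fragment: inline ESLint annotations that do not gate the PR.
jobs:
  eslint-annotations:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write   # needed to post review comments
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - uses: reviewdog/action-eslint@v1
        with:
          reporter: github-pr-review   # comment inline on the changed lines
          level: warning               # report findings as warnings
          eslint_flags: '.'
```

Once the findings are under control, the same check can be promoted to a failing one.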

The paved road: a reference PR pipeline you can copy

These four jobs catch most issues fast without melting developer time:

Format/Lint: run pre-commit, eslint, golangci-lint. Pin versions.
Unit tests: parallel, small, deterministic. No network.
Deps/Security: osv-scanner, npm audit --omit=dev or pip-audit. Soft-fail with baseline first.
Manifests/Infra: kubeconform, kustomize build, kubectl apply --dry-run=client (or --dry-run=server when CI can reach a cluster), terraform validate, tfsec.

A GitHub Actions template that stays fast:

# .github/workflows/pr.yml
name: pr

on:
  pull_request:
  merge_group:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      frontend: ${{ steps.filter.outputs.frontend }}
      backend: ${{ steps.filter.outputs.backend }}
      infra: ${{ steps.filter.outputs.infra }}
      docs: ${{ steps.filter.outputs.docs }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            frontend: ["web/**", "package.json", "eslint.*"]
            backend: ["go/**", "cmd/**", "internal/**", "go.*"]
            infra: ["infra/**", "k8s/**", "helm/**", "terraform/**"]
            docs: ["docs/**", "README.md"]

  lint-and-format:
    needs: changes
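    # Run when any code or infra path changed. Checking docs != 'true' would also skip mixed code+docs PRs.
    if: needs.changes.outputs.frontend == 'true' || needs.changes.outputs.backend == 'true' || needs.changes.outputs.infra == 'true'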
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - uses: actions/setup-go@v5
        with:
          go-version: '1.22'
      - name: Pre-commit
        uses: pre-commit/action@v3.0.1
      - name: ESLint (frontend)
        if: needs.changes.outputs.frontend == 'true'
        run: |
          npm ci
          npx eslint . --max-warnings=0
      - name: Golangci-lint (backend)
        if: needs.changes.outputs.backend == 'true'
        uses: golangci/golangci-lint-action@v6
        with:
          version: v1.60.0
          args: --timeout=5m

  unit-tests:
    needs: changes
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        include:
          - part: backend
          - part: frontend
    steps:
      - uses: actions/checkout@v4
      - name: Set up Go
        if: matrix.part == 'backend' && needs.changes.outputs.backend == 'true'
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'
      - name: Run go tests
        if: matrix.part == 'backend' && needs.changes.outputs.backend == 'true'
        run: go test ./... -race -count=1 -parallel=4
      - name: Set up Node
        if: matrix.part == 'frontend' && needs.changes.outputs.frontend == 'true'
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - name: Run jest
        if: matrix.part == 'frontend' && needs.changes.outputs.frontend == 'true'
        run: |
          npm ci
          npx jest --ci

  security-and-deps:
    needs: changes
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # required by upload-sarif
    steps:
      - uses: actions/checkout@v4
      - uses: aquasecurity/trivy-action@0.24.0
        with:
          scan-type: fs
          format: sarif
          output: trivy.sarif
          ignore-unfixed: true
      - uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: trivy.sarif
      - name: OSV Scanner
        uses: google/osv-scanner-action@v1
        with:
          scan-args: "--recursive ."

  manifests-and-infra:
    needs: changes
    if: needs.changes.outputs.infra == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate Kubernetes manifests
        run: |
          curl -sSL https://github.com/yannh/kubeconform/releases/download/v0.6.7/kubeconform-linux-amd64.tar.gz | sudo tar -xz -C /usr/local/bin kubeconform
          kubeconform -strict -summary -schema-location default -schema-location "https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/{{.NormalizedKubernetesVersion}}-standalone/{{.ResourceKind}}.json" k8s/
      - name: Kustomize build
        run: |
          curl -sS https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh | bash
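          # kubectl rejects --server-side combined with --dry-run=client, so use a plain client dry run.
          # If CI has cluster credentials, --dry-run=server also exercises admission checks.
          ./kustomize build k8s/overlays/dev | kubectl apply --dry-run=client -f -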
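      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.9.8  # assumed version; pin to the one you actually run
          terraform_wrapper: false
      - name: Terraform validate + tfsec
        run: |
          terraform -chdir=infra init -backend=false
          terraform -chdir=infra validate
          # /usr/local/bin is root-owned on hosted runners, so install with sudo
          sudo curl -sSL https://github.com/aquasecurity/tfsec/releases/download/v1.28.5/tfsec-linux-amd64 -o /usr/local/bin/tfsec
          sudo chmod +x /usr/local/bin/tfsec
          tfsec infra/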

  review-annotations:
    needs: [lint-and-format, unit-tests, security-and-deps, manifests-and-infra]
    runs-on: ubuntu-latest
    if: always()
    steps:
      - uses: actions/checkout@v4
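      # action-suggester turns uncommitted changes in the workspace into review suggestions,
      # so run a formatter or fixer in this job before this step.
      - name: Suggest fixes (reviewdog)
        uses: reviewdog/action-suggester@v1
        with:
          tool_name: pr-summary
          fail_on_error: false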

Lock in consistent local behavior with pre-commit so CI isn’t the first time engineers hear about formatting:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.1
    hooks:
      - id: ruff
  - repo: https://github.com/golangci/golangci-lint
    rev: v1.60.0
    hooks:
      - id: golangci-lint
  - repo: https://github.com/pre-commit/mirrors-eslint
    rev: v9.12.0
    hooks:
      - id: eslint

And a boring CODEOWNERS that routes reviews without ceremony:

# CODEOWNERS
*                       @eng/platform
web/**                  @eng/web-core
internal/auth/**        @eng/security @eng/backend
infra/**                @eng/platform @eng/sre
k8s/**                  @eng/platform @eng/sre

Policy where it matters: approvals by risk, enforced by code

I’ve watched teams try to encode every tribal rule into GitHub settings and bespoke bots. It turns into a maze. Keep GitHub’s native branch protections simple, then enforce nuanced rules with a single policy-as-code check.

Require 1 approval and the paved-road checks on every PR.
Add a single required status check, policy/risk-gate, that fails only when high-risk changes lack the right approvals or metadata.

Example Conftest policy that requires two approvals and a ticket ID for auth/, db/migrations/, and infra changes:

# policy/approval.rego
# Conftest evaluates package "main" by default; run this policy with --namespace pr.
package pr

import rego.v1

high_risk_paths := ["internal/auth/", "db/migrations/", "infra/", "k8s/"]

# True when any changed file sits under a high-risk prefix.
is_high_risk if {
  some file in input.changed_files
  some prefix in high_risk_paths
  startswith(file, prefix)
}

missing_ticket if not regex.match(`[A-Z]{2,}-[0-9]+`, input.title)

default required_approvals := 1

required_approvals := 2 if is_high_risk

violation contains msg if {
  is_high_risk
  input.approvals < required_approvals
  msg := sprintf("High-risk change requires >=%d approvals", [required_approvals])
}

violation contains msg if {
  is_high_risk
  missing_ticket
  msg := "High-risk change must reference a ticket (e.g., SEC-123)"
}
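
To make the policy's expected input concrete, here is a hypothetical PR context for local testing. The file name, paths, and title are invented for illustration; Conftest accepts YAML input as well as JSON.

```yaml
# input.yaml (hypothetical PR context for local policy testing)
changed_files:
  - internal/auth/session.go   # falls under a high-risk prefix
  - docs/README.md
approvals: 1                   # below the two required for high-risk changes
title: "Tighten session expiry" # no ticket ID such as SEC-123
```

Running `conftest test -p policy --namespace pr input.yaml` against this file is intended to report both messages: too few approvals and a missing ticket reference. Changing `approvals` to 2 and prefixing the title with a ticket ID should make it pass.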

Wire it into CI and make it the only “policy” gate:

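# .github/workflows/policy.yml
name: policy
on:
  pull_request:
    types: [opened, synchronize, reopened, edited]  # "edited" re-runs on title changes
  pull_request_review:
    types: [submitted, dismissed]  # re-evaluate when approvals change
# If risk-gate is a required check behind a merge queue, it must also report on merge_group.

jobs:
  risk-gate:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read
    steps:
      - uses: actions/checkout@v4
      - name: Gather PR context
        id: ctx
        uses: actions/github-script@v7
        with:
          script: |
            const pr = { ...context.repo, pull_number: context.payload.pull_request.number, per_page: 100 };
            const files = await github.paginate(github.rest.pulls.listFiles, pr);
            const reviews = await github.paginate(github.rest.pulls.listReviews, pr);
            // Count each reviewer once, using their most recent non-comment review.
            const latest = new Map();
            for (const r of reviews) {
              if (r.state !== 'COMMENTED') latest.set(r.user?.login, r.state);
            }
            const approvals = [...latest.values()].filter(s => s === 'APPROVED').length;
            // Key names must match what policy/approval.rego reads from `input`.
            return { changed_files: files.map(f => f.filename), approvals, title: context.payload.pull_request.title };
      - name: Install conftest
        run: |
          curl -L https://github.com/open-policy-agent/conftest/releases/download/v0.56.0/conftest_Linux_x86_64.tar.gz | tar xz
          sudo mv conftest /usr/local/bin/
      - name: Generate input.json
        env:
          # Passed through env so the PR title is never interpolated into the shell script.
          PR_CONTEXT: ${{ steps.ctx.outputs.result }}
        run: printf '%s' "$PR_CONTEXT" > input.json
      - name: Test policy
        run: conftest test -p policy --namespace pr input.json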

Configure branch protection in Terraform to require just the paved-road checks and the policy gate:

# terraform/github/branch_protection.tf
resource "github_branch_protection" "main" {
  repository_id  = github_repository.app_repo.node_id
  pattern        = "main"
  required_pull_request_reviews {
    required_approving_review_count = 1
  }
  required_status_checks {
    strict   = true
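    # Contexts are the job names GitHub reports (matrix jobs include the matrix value),
    # not "workflow (job)".
    contexts = [
      "lint-and-format",
      "unit-tests (backend)",
      "unit-tests (frontend)",
      "security-and-deps",
      "manifests-and-infra",
      "risk-gate",
    ]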
  }
}

Make it fast: selective runs, caching, and merge queues

Speed is a feature. Engineers respect automation that returns results in minutes, not hours.

Selective runs by path: shown above with dorny/paths-filter. Only run infra checks when infra changes.
Caching: use actions/setup-node and actions/setup-go built-in caches. For Python, pip cache dir + actions/cache@v4. For monorepos, use Nx or Turborepo remote cache.
Cancel superseded builds: concurrency in the workflow aborts stale runs.
Merge queue: keep main green and speed throughput. GitHub Merge Queue or bors runs a mini-batch with your PR plus main; if it passes, it merges.
Flake hunts: Prometheus alert on CI flake rate > 2%. Quarantine flaky tests with a label, not a global retry.
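
For the Python caching mentioned above, a minimal sketch might look like this. The job name, Python version, `requirements.txt` layout, and the use of pytest are assumptions for illustration, not part of the reference pipeline.

```yaml
# Hypothetical job fragment: cache pip downloads between runs.
jobs:
  python-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Locate pip cache
        id: pip-cache
        run: echo "dir=$(pip cache dir)" >> "$GITHUB_OUTPUT"
      - uses: actions/cache@v4
        with:
          path: ${{ steps.pip-cache.outputs.dir }}
          # Key on the requirements files so the cache refreshes when dependencies change.
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements*.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-
      - run: pip install -r requirements.txt
      - run: pytest -q   # assumes pytest is listed in requirements.txt
```

The `restore-keys` fallback lets a run reuse an older cache when the requirements change, so only the new packages are downloaded.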

Minimal merge-queue setup with GitHub:

# Enable merge queue via GitHub UI or API
# Then make sure your workflow listens to merge_group (already in pr.yml)

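If your GitHub plan doesn’t include merge queue, bors-NG is the long-standing alternative, though the project has been deprecated in favor of GitHub’s native merge queue, so treat it as a stopgap.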
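
Because pr.yml already listens to `merge_group`, a job can be limited to queue runs with an event check. The following is a sketch only; the job name and the `make e2e-smoke` target are hypothetical.

```yaml
# Hypothetical addition under `jobs:` in pr.yml: smoke E2E only in the merge queue.
  smoke-e2e:
    if: github.event_name == 'merge_group'   # skipped on ordinary pull_request runs
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run smoke E2E
        run: make e2e-smoke   # hypothetical target; substitute your own smoke suite
```

This keeps the slower suite off individual PRs while still exercising it before anything lands on main.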

Before/after: what changed at a B2B SaaS client

Environment: GitHub, monorepo (Go, Node), ArgoCD to EKS with Istio, Terraform for infra, Prometheus/Grafana, canary via Argo Rollouts, SRE on-call with SLOs for checkout API.

Before (bespoke automation, every PR ran everything):

Median PR cycle time: 2.6 days
Mean CI wall time per PR: 48 minutes; 37% required re-runs due to flakes
MTTR: 9.5 hours (main broke weekly, long restore due to batch merges)
Release frequency: 2/week

After (paved-road + policy gate + merge queue):

Median PR cycle time: 9 hours; p90 CI feedback under 20 minutes
CI wall time per PR: 8–12 minutes; flake rate 0.8%
MTTR: 55 minutes (main is almost always green; canary + circuit breakers in Istio reduce blast radius)
Release frequency: 20–25/week
Security: OSV baseline eliminated 120 historical warnings; new vulns block only when reachable code is affected

Business effect: fewer context switches, faster features to paid pilots, less anxiety for reviewers. No heroics, just fewer sharp edges.

Rollout in three sprints (and what to avoid)

Sprint 1: standardize and de-flake
- Introduce .pre-commit-config.yaml, CODEOWNERS, and the shared PR workflow.
- Pin tool versions; quarantine flaky tests instead of retries.
- Add paths-filter, concurrency, and caching. Target <15 min feedback.
- Track: PR cycle time, CI duration, rerun rate.
Sprint 2: policy where it matters
- Add policy/risk-gate with Conftest; start as soft-fail (continue-on-error: true).
- Require 1 approval across the board; use policy to drive 2+ for auth/, migrations/, infra/.
- Add manifest/infra validation and terraform validate/tfsec.
- Track: number of policy violations/week; tune wording and docs.
Sprint 3: keep main green, harden
- Turn on Merge Queue; move policy to required.
- Add canary deployments in non-prod via Argo CD preview environments (for example, the ApplicationSet Pull Request generator) or an ephemeral namespace.
- Add reviewdog annotations for lint/test failures with links to the paved-road doc.
- Track: mainline failure rate, MTTR, throughput (DORA metrics).
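
The Sprint 2 soft-fail can be as small as one extra key on the policy step. A sketch, assuming the step layout from policy.yml:

```yaml
# Fragment of the risk-gate job during Sprint 2: report violations without blocking.
      - name: Test policy (soft-fail during rollout)
        continue-on-error: true   # the step may fail, but the job still succeeds
        run: conftest test -p policy --namespace pr input.json
```

Removing `continue-on-error` in Sprint 3 is what turns the check into a real gate.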

What to avoid:

Building a bespoke bot army. One policy check beats five half-working webhooks.
All-or-nothing security. Baseline first; fail new issues only. Otherwise you’ll get bypasses.
Full E2E on every PR. Reserve E2E for merge-queue and nightly; PRs should be unit/integration only.
Unpinned versions. Today’s green is tomorrow’s red when a linter revs.

Boring wins. Pave the road, measure, then tighten. I’ve never regretted shipping faster with fewer flakes.

If you want a shortcut, GitPlumbers brings this in as a drop-in paved-road for GitHub or GitLab, with ArgoCD and Terraform hooks included. We’ll even clean up your flaky tests and get your merge queue humming.

Key takeaways

Automate the boring 80% with a paved-road pipeline: format/lint, fast unit tests, dependency/security scan, and manifest/infra validation.
Use selective execution and caching to keep PR feedback under 10 minutes; cancel superseded builds and use a merge queue to keep main green.
Enforce heavy policy only where risk is high (auth, migrations, infra) using policy-as-code and a single required status check.
Prefer inline annotations (reviewdog) over red builds; convert warnings to hard fails progressively with baselining.
Measure PR cycle time, review wait time, and flake rate; tune until median PR feedback < 10 minutes and mainline stays releasable.

Implementation checklist

Adopt a shared PR workflow template with concurrency, path filters, and caching.
Enable pre-commit and consistent linters across languages with pinned versions.
Add a single policy-as-code check (Conftest/OPA) for risky paths; keep it optional at first.
Turn on merge queues and trunk-based development with small PRs.
Track PR cycle time, re-run rate, and flake rate in a weekly dashboard and prune slow checks.
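
If CI metrics are already exported to Prometheus, the 2% flake-rate threshold could be expressed as an alerting rule roughly like the one below. The metric names `ci_job_flaky_retries_total` and `ci_job_runs_total` are hypothetical; substitute whatever your CI exporter actually emits.

```yaml
# prometheus/rules/ci-flakes.yml (hypothetical path and metric names)
groups:
  - name: ci-health
    rules:
      - alert: CIFlakeRateHigh
        # Share of CI job runs over the last day that needed a flaky retry.
        expr: |
          sum(increase(ci_job_flaky_retries_total[1d]))
            /
          sum(increase(ci_job_runs_total[1d])) > 0.02
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "CI flake rate above 2% over the last day"
```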

Questions we hear from teams

How do we stop AI review tools from hallucinating and blocking PRs?
Use AI for triage and summarization, not gating. Let AI draft PR summaries or suggest nit fixes, but never make it a required check. Keep humans in the approval loop, and rely on deterministic linters/tests for gates.

Our repo is polyglot and huge. Will path filters and caching still help?
Yes. Use `dorny/paths-filter` (or GitLab rules) to scope jobs and `Nx`/`Turborepo` for monorepo caching. Even basic `actions/cache` for language package managers will cut times by 30–60%.

Where do E2E tests fit?
Run smoke E2E in the merge queue and fuller E2E nightly or on demand. PRs should run unit and a thin slice of integration tests. Keep E2E out of the critical path for individual PRs.

We have regulated workloads. How do we prove compliance?
Keep your paved-road checks reproducible, version-pinned, and auditable. Treat policy-as-code (OPA/Conftest) and Terraform-managed branch protections as your evidence. Store CI logs and SARIF in an artifact bucket; that satisfies most audit trails without bespoke bots.

Can we do this with GitLab or Bitbucket?
Yes. The principles are the same: shared templates, selective jobs, one policy gate, and a merge queue equivalent. Swap in GitLab CI `rules:changes` and `approval rules`, or Bitbucket Pipelines with `conditions` and a single quality gate job.
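
For GitLab, the path-scoped infra job might translate roughly as follows. This is a sketch: the job name, stage, and image tag are assumptions, and the paths mirror the `infra` filter used in the GitHub example.

```yaml
# .gitlab-ci.yml (hypothetical fragment)
stages: [validate]

infra-validate:
  stage: validate
  image:
    name: hashicorp/terraform:1.9.8   # assumed tag; pin to the version you run
    entrypoint: [""]                  # the image's default entrypoint is the terraform binary
  rules:
    # Run only on merge requests that touch infra paths.
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        - infra/**/*
        - k8s/**/*
  script:
    - terraform -chdir=infra init -backend=false
    - terraform -chdir=infra validate
```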
