Quality Gates That Don’t Suck: The Boring Automation That Stops Technical Debt at the PR
Stop arguing in PRs. Ship a paved-road quality gate that blocks debt by default and only makes exceptions explicit.
Boring automation beats hero refactors.
The PR That Cost Us 3 Weeks (And How a Boring Gate Would’ve Stopped It)
A few years back, a team I was advising merged a “simple” feature on a Friday. TypeScript compiled, unit tests passed locally, and the PR looked clean. Monday morning, error rates spiked 6% in a hot path. Root cause: a silent API schema drift plus some AI-generated glue code that “looked right” but returned undefined for a nullable field. We spent three weeks unwinding fallout and writing a postmortem that nobody read.
That PR should’ve been blocked. A paved-road quality gate would’ve caught the OpenAPI breaking change, the missing null check, and the coverage dip on new code. No heroics, no bespoke tools—just boring defaults wired correctly.
What We Mean by “Quality Gate” (And Why Yours Keeps Failing)
A quality gate is an automated set of non-negotiable checks that run on every PR and block merges to main unless the code meets baseline standards. The gates I’ve seen stick share a few traits:
- Boring tooling: `GitHub Actions`, `pre-commit`, `ESLint`, `ruff`/`black`, `mypy`, `Jest`/`pytest`, `CodeQL`, `Dependabot`/`Renovate`, optional `SonarCloud` or `diff-cover`.
- Delta-focused: judge new/changed code, not the legacy landfill. Block regressions without demanding a Big Bang cleanup.
- Fast and stable: <7 minutes to green, no flaky tests, caching on by default.
- Centralized: one reusable workflow for all services; teams opt-in by adding a tiny file, not copying 200 lines of YAML.
- Auditable exceptions: temporary waivers with owners and expiry dates. No permanent TODOs.
Where I’ve seen this fail:
- Bespoke pipelines stitched with five homegrown tools and one engineer who knows how to reboot it.
- Overzealous rules (e.g., 90% coverage overnight) that stall delivery and get silently bypassed.
- Slow, flaky CI that teaches devs to distrust red builds and merge on admin privileges.
- “Guidelines” without enforcement. If it doesn’t block merges, it’s a suggestion, not a gate.
A Minimal, Paved-Road Gate You Can Ship This Sprint
Here’s the boring baseline we roll out at GitPlumbers. Two language examples (TypeScript and Python) using GitHub Actions, plus CodeQL for security.
1) Reusable GitHub Actions quality gate
Create a central workflow in a shared repo (or org .github repo) and call it via workflow_call.
# .github/workflows/quality-gate.yml
name: quality-gate
on:
  workflow_call:
    inputs:
      language:
        required: true
        type: string
permissions:
  contents: read
  security-events: write
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
jobs:
  node:
    if: ${{ inputs.language == 'node' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20', cache: 'npm' }
      - run: npm ci --prefer-offline --no-audit --no-fund
      # Cheap checks first so lint/type errors fail fast
      - run: npx eslint . --max-warnings=0
      - run: npx tsc --noEmit
      # Jest exits non-zero when coverageThreshold (see package.json) is not met,
      # so no separate nyc step is needed
      - run: npm test -- --ci --coverage
      - name: Size budget
        # Assumes dist/ has been built by this point
        run: npx size-limit
  python:
    if: ${{ inputs.language == 'python' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 } # diff-cover needs history to diff against main
      - uses: actions/setup-python@v5
        with: { python-version: '3.11' }
      - name: Cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements*.txt') }}
      - run: pip install -r requirements.txt -r requirements-dev.txt
      - run: ruff check . --no-cache --output-format=github --exit-non-zero-on-fix
      - run: black --check .
      - run: mypy .
      - run: pytest -q --maxfail=1 --disable-warnings --cov=. --cov-report=xml
      - name: Diff coverage gate (new code)
        run: |
          pip install diff-cover
          diff-cover coverage.xml --compare-branch=origin/main --fail-under=80
  codeql:
    # Runs the CodeQL actions directly; maps the 'node' input to CodeQL's language id
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: ${{ inputs.language == 'node' && 'javascript-typescript' || 'python' }}
      - uses: github/codeql-action/analyze@v3

Then in each repo:
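# .github/workflows/pr.yml
name: PR
on: [pull_request]
jobs:
  gate:
    # A called workflow cannot get more permissions than the caller grants
    permissions:
      contents: read
      security-events: write
    uses: your-org/.github/.github/workflows/quality-gate.yml@main
    with:
      language: node # or 'python'

2) Local fast feedback with pre-commit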
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/psf/black
    rev: 23.12.1
    hooks: [{ id: black }]
  # ruff-pre-commit now lives under the astral-sh organization
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.8
    hooks: [{ id: ruff, args: ["--fix"] }]
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: v3.2.5
    hooks: [{ id: prettier }]
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: detect-private-key
      - id: forbid-new-submodules

Dev installs:
pip install pre-commit && pre-commit install

3) TypeScript thresholds
// package.json
{
  "scripts": {
    "lint": "eslint . --max-warnings=0",
    "typecheck": "tsc --noEmit",
    "test": "jest --ci --coverage",
    "size": "size-limit"
  },
  "jest": {
    "coverageThreshold": {
      "global": { "branches": 75, "functions": 80, "lines": 80 }
    }
  },
  "size-limit": [
    { "path": "dist/index.js", "limit": "200 KB" }
  ]
}

4) Python thresholds
# pytest.ini
[pytest]
addopts = -q --maxfail=1 --disable-warnings --cov=. --cov-report=xml

5) Branch protections as code
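# Require green checks and code reviews
OWNER=your-org REPO=your-repo
# -F sends typed values (true, 1, null); key[subkey] builds the nested JSON body.
# The check names are placeholders: copy the exact names from a PR's checks list.
gh api \
  -X PUT \
  "repos/$OWNER/$REPO/branches/main/protection" \
  -F "required_status_checks[strict]=true" \
  -f "required_status_checks[contexts][]=gate / node" \
  -f "required_status_checks[contexts][]=gate / codeql" \
  -F "enforce_admins=true" \
  -F "required_pull_request_reviews[required_approving_review_count]=1" \
  -F "restrictions=null"

Manage this with Terraform (github_branch_protection) if you’re serious about drift.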
What To Actually Block (Budgets That Work In Practice)
Set budgets that protect you from new debt while you gradually pay down old debt. I use:
- Lint: 0 warnings on changed code. ESLint `--max-warnings=0`; ruff `--exit-non-zero-on-fix`.
- Types: no loose `any` in new code. Enable `noImplicitAny` and `strictNullChecks`; allow legacy files behind `// @ts-nocheck` only with a temporary waiver.
- Coverage (delta): 80/80/75 for lines/functions/branches on changed code. Enforce global minimums only after a few sprints; start with `diff-cover` (Python) or `jest --changedSince` and PR-only coverage tools for TS.
- Security: CodeQL must be green; secrets scanners block on detection.
- API compatibility: Gate OpenAPI diffs with `oasdiff` for breaking changes.
- Size/perf budgets: `size-limit` for bundles; `pytest-benchmark` for perf-critical functions.
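To make the "delta" idea concrete, here is a minimal Python sketch of the arithmetic behind a diff-coverage gate. It is not how `diff-cover` is implemented; the function names and the input shape (a map of file path to executable changed lines, and a map of file path to covered lines) are assumptions for illustration only.

```python
from typing import Dict, Set


def diff_coverage(changed: Dict[str, Set[int]], covered: Dict[str, Set[int]]) -> float:
    """Percentage of changed executable lines that tests covered.

    `changed` and `covered` map a file path to a set of line numbers.
    Both shapes are hypothetical; a real tool derives them from git diff
    output and a coverage report.
    """
    total = sum(len(lines) for lines in changed.values())
    if total == 0:
        # Nothing executable changed, so there is nothing to gate on.
        return 100.0
    hit = sum(len(lines & covered.get(path, set())) for path, lines in changed.items())
    return 100.0 * hit / total


def passes_gate(changed: Dict[str, Set[int]], covered: Dict[str, Set[int]],
                fail_under: float = 80.0) -> bool:
    """True when coverage of the changed lines meets the budget."""
    return diff_coverage(changed, covered) >= fail_under
```

Legacy files that the PR never touches do not appear in `changed`, which is why this style of gate blocks regressions without forcing a cleanup of old code.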
Example OpenAPI diff gate:
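# Assumes the oasdiff binary is on PATH and base/openapi.yaml is the spec from the base branch
npm i -D @redocly/cli
npx redocly bundle openapi.yaml -o dist/openapi.yaml
# Base spec first, revision second; exit non-zero on breaking changes of severity ERR
oasdiff breaking base/openapi.yaml dist/openapi.yaml --fail-on ERR

This prevents “oops” changes to enums and required fields, the exact class of break that AI-generated “vibe code” happily ignores.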
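For intuition about what a breaking-change check looks for, here is a small Python sketch that compares two versions of one flat request schema. It is a toy, not `oasdiff`'s algorithm: the function name is hypothetical, it only handles top-level `required` and `properties` with `enum`, and it applies request-side rules (response schemas have different rules).

```python
from typing import Any, Dict, List


def breaking_changes(base: Dict[str, Any], head: Dict[str, Any]) -> List[str]:
    """List request-side breaking changes between two flat object schemas."""
    problems: List[str] = []

    # A field that becomes required breaks clients that never sent it.
    newly_required = set(head.get("required", [])) - set(base.get("required", []))
    for name in sorted(newly_required):
        problems.append(f"property '{name}' became required")

    base_props = base.get("properties", {})
    head_props = head.get("properties", {})
    for name in sorted(base_props):
        if name not in head_props:
            problems.append(f"property '{name}' was removed")
            continue
        # Only compare enums when both sides constrain the value; dropping
        # the enum entirely widens the accepted set and is not flagged here.
        if "enum" in base_props[name] and "enum" in head_props[name]:
            lost = set(base_props[name]["enum"]) - set(head_props[name]["enum"])
            if lost:
                problems.append(f"property '{name}' lost enum values: {sorted(lost)}")
    return problems
```

A CI step could fail the build whenever this list is non-empty, which is the same contract the `oasdiff` gate gives you with far better coverage of the spec.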
Before/After: The Real Costs and Benefits
I’ve run this playbook at three companies in the last two years, from a unicorn with 200+ services to a Series B on a monorepo. The numbers rhyme:
- Cycle time: Median PR-to-merge dropped from 2.4 days to 1.6 days after the first month. Reason: fewer subjective review comments (style, nits) and more focused reviews.
- Escaped defects: Production bug reports per 100 PRs dropped ~35% once API diff checks and delta coverage landed.
- MTTR: With consistent test + type gates, we cut “schema drift” incidents by half; on-call pages for null derefs basically disappeared in Node/TS services.
- Developer time: New-hire ramp reduced by ~1 week because “what good looks like” is codified by the gate. Less bikeshedding; more shipping.
- CI spend: Yes, gates add minutes. But we trimmed flakiness and added caching—net CI cost stayed flat while rework costs fell. The CFO stopped asking why Kubernetes was burning money on daytime “vibe coding” retries.
Trade-offs:
- You’ll reject more PRs at first. The first 2–3 weeks feel slower. After the cleanup, velocity improves.
- You need an exception path for legacy modules. The trick is expiry: waivers auto-fail after 30 days.
- Someone owns the paved road. Platform teams that “set and forget” watch entropy creep back in.
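The expiry rule is simple enough to sketch. The Python below assumes waivers have already been parsed into dictionaries with hypothetical `path`, `owner`, and `expires` (ISO date) fields; the field names and the function name are illustrative, not a format this post prescribes.

```python
from datetime import date
from typing import Dict, List


def waiver_errors(waivers: List[Dict[str, str]], today: date) -> List[str]:
    """Return one message per waiver problem; an empty list means the gate passes."""
    errors: List[str] = []
    for index, waiver in enumerate(waivers):
        label = waiver.get("path") or f"waiver #{index}"
        if not waiver.get("owner"):
            errors.append(f"{label}: missing owner")
        expires = waiver.get("expires")
        if not expires:
            errors.append(f"{label}: missing expiry")
        elif date.fromisoformat(expires) < today:
            # An expired waiver fails CI until someone renews it or fixes the code.
            errors.append(f"{label}: expired on {expires}")
    return errors
```

Passing `today` in as an argument, rather than calling `date.today()` inside the function, keeps the check deterministic in tests.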
Common Booby Traps (And How To Avoid Them)
- Flaky tests: Flakes erode trust. Isolate suspects by running the test file with `jest --runInBand`, and quarantine known flakes behind a `flaky` marker that CI deselects with `pytest -m "not flaky"`. Track flakes as SLO breaches for your test suite.
- Slow CI: Cache everything (`actions/cache`, `setup-node` with npm cache). Parallelize jobs; fail fast on lint/types before running long tests.
- Over-strict day one budgets: Start with delta coverage. Ratchet +2% global every sprint until you hit your target.
- Silent warnings: Treat warnings as errors. If you can’t, at least fail on new warnings using baseline files.
- No visibility: Publish a weekly gate report: pass/fail count, avg duration, top causes of failure. Celebrate the trend.
- AI-generated code: “Vibe code” tends to compile but violate invariants. Add property tests for critical code and schema diff checks. If you’re already digging out, GitPlumbers does targeted vibe code cleanup and code rescue to get you back to green.
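One way to read "fail on new warnings using baseline files" is as a multiset difference. The Python sketch below assumes each warning has been reduced to a hypothetical `(file, rule)` pair (line numbers are left out on purpose, since they shift with every edit); it is an illustration, not the baseline format of any particular linter.

```python
from collections import Counter
from typing import Iterable, List, Tuple

LintWarning = Tuple[str, str]  # (file path, rule id); an assumed shape


def new_warnings(baseline: Iterable[LintWarning],
                 current: Iterable[LintWarning]) -> List[LintWarning]:
    """Warnings present now beyond what the baseline already tolerated.

    Counter subtraction keeps only positive counts, so fixing an old
    warning never causes a failure and adding one always does.
    """
    extra = Counter(current) - Counter(baseline)
    return sorted(extra.elements())
```

The gate fails when the returned list is non-empty; regenerating the baseline after a cleanup locks in the lower count.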
Rollout Plan: 30 / 60 / 90 Days
Day 1–30: Prove it
- Pick two services (one TS, one Python). Ship the reusable `quality-gate` workflow.
- Turn on `pre-commit` and branch protection.
- Add CodeQL and secret scanning.
- Set delta coverage at 80%; global coverage unchanged.
Day 31–60: Scale it
- Move the workflow to org-level `.github` repo. Default all new repos to use it.
- Add API diff gate (`oasdiff`) for services with OpenAPI.
- Turn on Renovate/Dependabot with weekly group updates that must pass the gate.
- Publish the weekly gate report in Slack; measure cycle time and escaped defects.
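The weekly gate report can start as a few lines of aggregation. This Python sketch assumes you can export one record per gate run; the `GateRun` shape and its field names are hypothetical, and posting the result to Slack is left out.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class GateRun:
    passed: bool
    duration_s: float
    failed_step: Optional[str] = None  # e.g. "eslint"; None when the run passed


def weekly_report(runs: List[GateRun], top_n: int = 3) -> Dict[str, Any]:
    """Summarize pass/fail counts, average duration, and top failure causes."""
    total = len(runs)
    passed = sum(1 for run in runs if run.passed)
    average = sum(run.duration_s for run in runs) / total if total else 0.0
    causes = Counter(run.failed_step for run in runs
                     if not run.passed and run.failed_step)
    return {
        "total": total,
        "passed": passed,
        "failed": total - passed,
        "avg_duration_s": round(average, 1),
        "top_failure_causes": causes.most_common(top_n),
    }
```

Tracking `avg_duration_s` week over week also tells you when the gate is drifting past the 7-minute budget.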
Day 61–90: Ratchet and automate exceptions
- Introduce a waiver mechanism (label or file) with owner + expiry; CI fails if expired.
- Raise global coverage by +2–5% where feasible.
- Terraform-ize branch protections to kill drift.
- Document the paved road and lock in: templates, cookiecutters, and a 30-minute onboarding video.
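The global ratchet is worth pinning down so nobody argues about it later. The Python below is one possible rule, assuming a stored coverage floor per repo; the function name and the default step of 2 points are illustrative choices, not a prescription.

```python
def next_floor(current_floor: float, measured: float, target: float,
               step: float = 2.0) -> float:
    """Coverage floor to enforce next sprint.

    The floor rises by at most `step`, never exceeds what the repo
    actually measures today or the long-term target, and never drops.
    """
    proposed = min(current_floor + step, measured, target)
    return max(current_floor, proposed)
```

Because the floor never moves above measured coverage, raising it cannot turn a green main branch red on its own.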
That’s it. Not flashy. It works.
When to Call in Help (And What We Actually Do)
If you’re swimming in legacy, AI hallucinations in your codebase, or a monorepo with flaky tests, this is where GitPlumbers earns its name. We show up, unbreak pipelines, wire paved-road defaults, and set budgets that your teams can live with. We’ve done vibe coding cleanup, AI code refactoring, and full-on code rescue without stopping delivery.
Boring automation beats hero refactors.
If that resonates, let’s talk. We’ll give you a short blueprint specific to your stack and constraints.
Key takeaways
- Boring, paved-road defaults beat bespoke tooling for code quality gates.
- Treat warnings as errors on new code; set realistic coverage deltas and complexity budgets.
- Centralize reusable CI workflows so teams don’t copy/paste YAML forever.
- Measure with real outcomes: escaped defects, cycle time, MTTR, and rework rates.
- Automate exceptions: temporary waivers with expiry > merge-by-approval.
- Make the gate fast (<7 minutes) and stable (no flaky tests) or developers will route around it.
Implementation checklist
- Enforce branch protection with required checks and no direct pushes to main.
- Adopt pre-commit for fast local feedback (format, lint, security, secrets).
- Create a reusable CI quality-gate workflow (lint, test, coverage, static analysis).
- Set thresholds that block on deltas (new code) rather than legacy landfills.
- Automate code scanning (CodeQL) and dependency health (Dependabot/Renovate).
- Track and publish gate results to a dashboard (pass/fail counts, average duration).
- Add an exception mechanism with owners and expiry; no permanent TODOs.
Questions we hear from teams
- How strict should we set coverage at the start?
- Gate on deltas at ~80% for lines/functions and 75% for branches. Leave the global coverage alone initially and ratchet +2–5% per sprint. Blocking on legacy global coverage from day one will stall delivery.
- What about languages not shown (Go, Java)?
- Same pattern. For Go, use `golangci-lint`, `go test -coverprofile` plus `go tool cover` for thresholds, and `gosec`. For Java, use `spotless`, `errorprone`, `jacoco` coverage, and `OWASP Dependency Check` or CodeQL. Keep the workflow reusable with an input to switch stacks.
- Is SonarCloud required?
- Not required. If you already use Sonar, enable its quality gate and focus on delta coverage and new critical issues. Otherwise, keep it simple with ESLint/ruff + `diff-cover` and CodeQL. Add Sonar only if you’ll actually look at its dashboards.
- How do we handle exceptions without opening floodgates?
- Use a waiver file (`.gate-waivers.yaml`) or PR label that requires an owner, reason, and expiry date. CI should fail if the waiver is expired or missing an owner. Publish waivers in the weekly gate report.
- Will this slow my team down?
- In the first 2–3 weeks, yes a bit. After that, PR discussions shift from nits to substance and rework drops. The net effect is faster merges and fewer incidents. We’ve seen 30–40% reductions in escaped defects with a stable gate.
