The Hidden Queue: Measuring Dev Friction and Killing Hand‑Off Wait Time on the Paved Road

If your PRs idle for a day and CI sits in “queued” for 20 minutes, you don’t have a talent problem—you have a queueing problem. Measure the friction that matters and pave the road so devs never wait on humans or bespoke tools again.

Developers don’t hate process—they hate waiting.

You don’t have a talent problem—you have a queueing problem

A pre-IPO fintech called us when their “10x hires” couldn’t move faster than the legacy team. On Wednesdays, PRs waited ~30 hours for review. CI sat “queued” for 15–25 minutes on peak branches. Staging environments were shared and frequently wedged. The VP asked for more headcount; we found a hidden queue.

I’ve seen this movie. When devs wait on humans, shared infra, or bespoke tools, throughput flatlines no matter how senior the team is. The fix isn’t heroics—it’s measurement and a paved road that removes hand-offs by default.

Measure friction you can actually act on

Forget vanity metrics. Measure the waits that gate flow. Keep it simple and aligned to DORA/SPACE without creeping on individuals.

Track these five first:

  • PR cycle time: median time from opened → merged. Target: < 24h for trunk-based teams, < 48h for others.
  • First review response time: opened → first review. Target: < 2h during business hours.
  • CI queue vs runtime: run_started_at – created_at vs duration. Queue should be < 10% of runtime.
  • Flaky test rate: % of tests that both fail and pass without code change over a week. Target: < 2%.
  • Env provision lead time: PR opened → preview env healthy. Target: < 15 min.

Nice-to-haves once the basics are green:

  • Batch size: median LOC per PR (keep < 300 LOC).
  • Revert rate within 7 days (aim < 2%).
  • Waiting WIP: branches > 24h stale.

Don’t measure devs—measure systems. Aggregate at repo/team level. No keystroke nonsense.

Instrument in days, not months (no data lake required)

You can get to a credible dashboard with gh, jq, and a sheet/db. Start there; upgrade later.

  • PR metrics to CSV:
# Last 14 days of merged PRs with created/merged/first-review timestamps
# (BSD/macOS date shown; on Linux use: date -d '-14 days' +%Y-%m-%d)
gh pr list --state merged --limit 200 \
  --search "merged:>=$(date -v-14d +%Y-%m-%d)" \
  --json number,createdAt,mergedAt,reviews,author | \
  jq -r '.[] | [.number, .createdAt, .mergedAt, ([.reviews[]?.submittedAt] | min)] | @csv' > pr_metrics.csv
  • CI queue time from GitHub Actions:
# Compare created_at (queued) to run_started_at; runtime is roughly updated_at - run_started_at
gh api "repos/{owner}/{repo}/actions/runs?per_page=100" | \
  jq -r '.workflow_runs[] | [.id, .created_at, .run_started_at, .updated_at] | @csv' > ci_runs.csv
  • Flaky test detection (JUnit):
# Count tests that failed at least once and later passed with no code changes
# Example sketch: parse recent JUnit XMLs and join on test identifiers
# Use a small script or a tool like `flaky` or `pytest-rerunfailures` to tag flakies
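A minimal version of that sketch, using only the standard library. The junit-reports/ layout (one subdirectory of XML reports per recent CI run) is an assumption, and for a true flaky signal you should only compare runs against the same commit:
#!/usr/bin/env python3
"""Flag tests that both failed and passed across recent JUnit XML reports."""
import glob
import xml.etree.ElementTree as ET
from collections import defaultdict

outcomes = defaultdict(set)  # "classname::name" -> {"pass", "fail"}

for path in glob.glob("junit-reports/**/*.xml", recursive=True):
    for case in ET.parse(path).getroot().iter("testcase"):
        if case.find("skipped") is not None:
            continue  # skipped tests are neither a pass nor a fail
        test_id = f'{case.get("classname")}::{case.get("name")}'
        failed = case.find("failure") is not None or case.find("error") is not None
        outcomes[test_id].add("fail" if failed else "pass")

flaky = sorted(t for t, seen in outcomes.items() if seen == {"pass", "fail"})
print(f"{len(flaky)} flaky of {len(outcomes)} tests "
      f"({100 * len(flaky) / max(len(outcomes), 1):.1f}%)")
for t in flaky:
    print(f"  {t}")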
  • Throw it into something the team sees daily (Datadog, Grafana, Looker, even a Google Sheet). Annotate changes so you can correlate fixes to outcomes.

If you’re on GitLab, same idea via the REST API. If you’re on Jenkins, you can pull queueTime from the build API. Keep it boring and visible.
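Once both CSVs exist, a short summarizer turns them into the numbers you put on the dashboard. A minimal sketch, assuming the column layout from the commands above and nothing beyond standard-library Python:

#!/usr/bin/env python3
"""Print headline friction numbers from pr_metrics.csv and ci_runs.csv."""
import csv
from datetime import datetime
from statistics import median

def ts(s):
    # GitHub timestamps look like 2024-05-01T12:34:56Z
    return datetime.strptime(s, "%Y-%m-%dT%H:%M:%SZ")

pr_cycle_h, first_review_h = [], []
with open("pr_metrics.csv") as f:
    for _number, created, merged, first_review in csv.reader(f):
        pr_cycle_h.append((ts(merged) - ts(created)).total_seconds() / 3600)
        if first_review:  # empty when a PR merged with no review
            first_review_h.append((ts(first_review) - ts(created)).total_seconds() / 3600)

ci_queue_min, ci_run_min = [], []
with open("ci_runs.csv") as f:
    for _run_id, created, started, updated in csv.reader(f):
        if not started:  # run was still queued when exported
            continue
        ci_queue_min.append((ts(started) - ts(created)).total_seconds() / 60)
        ci_run_min.append((ts(updated) - ts(started)).total_seconds() / 60)

print(f"PR cycle time (median):  {median(pr_cycle_h):.1f} h")
print(f"First review response:   {median(first_review_h):.1f} h")
print(f"CI queue / runtime:      {median(ci_queue_min):.1f} / {median(ci_run_min):.1f} min (median)")

Run it on a schedule and post the three lines to Slack; annotate the chart whenever you ship a fix so you can correlate cause and effect.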

Kill the top three waits: review, CI, and environments

Start where the queue is longest.

  1. Cut review wait with automation and smaller batches
  • Turn on CODEOWNERS and auto-assign reviewers. Keep it to 1–2 primary owners per area.
# .github/CODEOWNERS
/api/**         @payments-team
/web/**         @frontend-core
/infrastructure @platform-oncall
  • Use GitHub’s “code review assignment” or a lightweight bot for reviewer roulette. Set a 2h SLA during business hours with a Slack reminder, not shaming.
  • Enforce small PRs. Use danger or a simple Action to warn on > 400 LOC diffs (a minimal check is sketched right after the SLA workflow below).
# .github/workflows/review-sla.yml
name: review-sla
on:
  schedule: [{cron: '*/30 14-23 * * 1-5'}]  # every 30 min, Mon-Fri; hours are UTC
jobs:
  nudge:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - uses: actions/github-script@v7
        with:
          script: |
            // Nudge any open PR that has waited > 2h with no reviewer assigned
            const { data: prs } = await github.rest.pulls.list({ ...context.repo, state: 'open' });
            for (const pr of prs) {
              const openMs = Date.now() - new Date(pr.created_at).getTime();
              if (openMs > 2 * 60 * 60 * 1000 && pr.requested_reviewers.length === 0) {
                await github.rest.pulls.requestReviewers({ ...context.repo, pull_number: pr.number, reviewers: ['reviewer-roulette-bot'] });
              }
            }
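For the small-PR warning, you don't need danger; a few lines wrapping git and gh are enough. This is a sketch only: the 400-line threshold and the PR_NUMBER variable (exported by your workflow) are assumptions to adapt.
#!/usr/bin/env python3
"""Warn on oversized PR diffs from any CI job."""
import os
import subprocess

# --shortstat output looks like: " 12 files changed, 340 insertions(+), 97 deletions(-)"
tokens = subprocess.run(
    ["git", "diff", "--shortstat", "origin/main...HEAD"],
    capture_output=True, text=True, check=True,
).stdout.split()

changed = sum(int(tok) for tok, nxt in zip(tokens, tokens[1:])
              if nxt.startswith(("insertion", "deletion")))

if changed > 400:  # assumed threshold; tune per repo
    pr_number = os.environ["PR_NUMBER"]  # assumed to be set by the calling workflow
    subprocess.run(
        ["gh", "pr", "comment", pr_number, "--body",
         f"This PR touches {changed} changed lines; consider splitting it (target < 400)."],
        check=True,
    )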
  2. Slash CI queue and runtime
  • Enable concurrency to auto-cancel superseded runs on the same branch.
# .github/workflows/ci.yml
name: ci
on: [push, pull_request]
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
  • Cache dependencies aggressively and pin toolchains.
- uses: actions/cache@v4
  with:
    path: |
      ~/.npm
      ~/.cache/pip
      ~/.gradle/caches
    key: ${{ runner.os }}-deps-${{ hashFiles('**/package-lock.json', '**/requirements*.txt', '**/gradle.lockfile') }}
  • Shard tests and run only what changed.
- name: Compute changed
  run: |
    # Space-separate so the value fits on one GITHUB_ENV line; needs origin/main fetched
    echo "FILES=$(git diff --name-only origin/main...HEAD | tr '\n' ' ')" >> $GITHUB_ENV
- name: Jest
  run: npx jest --findRelatedTests $FILES --maxWorkers=50%
- name: Pytest
  run: pytest -k "$(python scripts/changed_tests.py)" -n auto --maxfail=1
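The scripts/changed_tests.py helper above is hypothetical; a minimal version might map changed source files to pytest keywords like this, assuming tests are named after the modules they cover:
#!/usr/bin/env python3
"""Hypothetical scripts/changed_tests.py: emit a pytest -k expression for changed modules."""
import subprocess
from pathlib import Path

# Same diff the workflow step computes; origin/main must be fetched in CI
changed = subprocess.run(
    ["git", "diff", "--name-only", "origin/main...HEAD"],
    capture_output=True, text=True, check=True,
).stdout.splitlines()

# Naive mapping: src/payments/ledger.py -> keyword "ledger" (matches tests/test_ledger.py)
modules = sorted({Path(f).stem for f in changed
                  if f.endswith(".py") and not Path(f).name.startswith("test_")})

# Empty output means no Python changes; the CI step should then skip -k and run everything
print(" or ".join(modules))
Tools like pytest-testmon or Jest's --findRelatedTests do the same mapping more precisely; the script is just the cheapest version that works today.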
  • Track flakies and quarantine them. Anything that flakes across more than 3 runs goes behind a feature flag or gets rewritten.
  3. Kill environment hand-offs with previews
  • Use ApplicationSet to create a namespace per PR for k8s apps.
# argo-applicationset-prs.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: preview-prs
spec:
  generators:
    - pullRequest:
        github:
          owner: your-org
          repo: your-repo
          tokenRef: {secretName: github-token, key: token}
          labels: ["preview"]
  template:
    metadata:
      name: app-pr-{{number}}
    spec:
      project: default
      destination: {server: https://kubernetes.default.svc, namespace: app-pr-{{number}}}
      source:
        repoURL: https://github.com/your-org/your-repo.git
        targetRevision: pull/{{number}}/head
        path: deploy/overlays/preview
  • For Terraform repos, run plan in PR and apply on merge using Atlantis instead of a human gatekeeper.

  • For non-k8s stacks, spin previews via flyctl, Vercel, or short-lived ECS services. The goal: a link on the PR in under 15 minutes.

Pave the road: defaults beat bespoke every day

You don’t need another internal tool with an animal mascot. You need hard defaults that make the happy path the fast path.

  • Template repo with batteries included:
    • Makefile with make setup | lint | test | build | deploy.
    • Standard Dockerfile, health checks, and OpenTelemetry by default.
    • Reusable CI via workflow_call so updates propagate.
# .github/workflows/reusable-ci.yml (lives in your-org/.github)
on:
  workflow_call:
    inputs:
      language: {required: true, type: string}
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make setup lint test build LANGUAGE=${{ inputs.language }}
# Each repo's ci.yml calls it with:
#   jobs:
#     ci:
#       uses: your-org/.github/.github/workflows/reusable-ci.yml@main
#       with: {language: node}
  • Dependency hygiene via Renovate and pre-commit hooks.
  • Feature flags wired from day one (LaunchDarkly or Unleash) to keep PRs small and enable canaries.
  • A thin developer portal (Backstage or even a docs site) that links the golden paths; don’t overbuild it.

Cost/benefit in practice:

  • Before: 30 repos with unique CI YAML, four flavors of Dockerfiles, and three test runners. CI config churn ate 10–15 engineer-days per quarter.
  • After: one reusable workflow + template. We cut pipeline maintenance LOC by ~80% and shipped language/runtime updates org-wide in a day—not a quarter.

Before/after: two sprints to reclaim flow

Here’s an 80-engineer SaaS org we helped last fall. Baseline (week 0):

  • PR cycle time median: 2.7 days
  • First review response: 7.4 hours
  • CI queue time: 28 minutes (runtime 31 minutes)
  • Flaky rate: 7.1%
  • Preview env lead time: N/A (shared staging, 2–3 day booking)

Interventions (weeks 1–2):

  • CODEOWNERS + auto-assign + 2h SLA bot
  • CI concurrency + cache + sharded tests + quarantine
  • ArgoCD ApplicationSet preview envs per PR
  • Template repo + reusable CI + Renovate

Outcome (week 3):

  • PR cycle time median: 11 hours (−59%)
  • First review response: 1.9 hours (−74%)
  • CI queue time: 4 minutes (−86%); runtime: 22 minutes (−29%)
  • Flaky rate: 0.9% (−87%)
  • Preview envs: 12–14 minutes to healthy link

Business impact: ~20–25% more throughput measured as merged PRs per week and a 3x drop in Friday hotfixes. No extra headcount. The only “AI” involved was using small scripts instead of big platforms.

Run this play next sprint

You can do this without stopping the world.

  1. Baseline (2–3 days)

    • Script the 5 metrics with gh + jq and publish a simple chart.
    • Pick the longest queue and set an explicit target.
  2. Pave the road (2–3 days in parallel)

    • Create/refresh a template repo: Makefile, Dockerfile, reusable CI, Renovate, pre-commit.
    • Enable CODEOWNERS and branch protections on top repos.
  3. Kill the waits (1 week)

    • Review: auto-assign + 2h SLA reminder + small PR nudges.
    • CI: concurrency: cancel-in-progress, cache, shard, quarantine flakies.
    • Envs: previews per PR with ArgoCD ApplicationSet or your PaaS equivalent.
  4. Report and iterate (ongoing)

    • Share before/after in Slack weekly. Celebrate cycle-time wins.
    • Keep chopping the longest queue. Don’t boil the ocean.

What I’d do differently, and what to avoid

  • Don’t build a bespoke platform before fixing the waits. Get previews and review SLAs working; Backstage can wait.
  • Don’t measure people. Team-level metrics only; the goal is flow, not surveillance.
  • Don’t overfit tests. Chasing 100% coverage at the expense of speed hurts. Invest in contract tests and canaries.
  • Don’t gate on humans for Terraform. Use Atlantis or an equivalent and audit in Git; approvals should be for risk, not for terraform plan output diff aesthetics.
  • Don’t let templates drift. Use reusable workflows and a monthly “pave day” to roll changes org-wide.

Here’s what consistently works: paved-road defaults, tiny PRs, preview everything, and relentless attention to the queue. GitPlumbers can pair with your platform team to stand this up fast—and leave you with something boring, reliable, and cheap to run.


Key takeaways

  • Measure friction you can act on: PR cycle time, first-review SLA, CI queue time, flaky rate, and env provision lead time.
  • Kill waits at the source: auto-assign reviewers, auto-cancel superseded CI, shard tests, and spin ephemeral envs per PR.
  • Pave the road: standard Makefile, reusable CI workflows, CODEOWNERS, Renovate, and pre-commit—no bespoke bots.
  • Start in days: use `gh` + `jq` for metrics and a single dashboard. Avoid individual-level surveillance; fix systems, not people.
  • Expect material wins: 40–70% cycle-time reduction and a 2–4x drop in hot-fix pain within two sprints.

Implementation checklist

  • Define 5 friction metrics with targets and owners.
  • Enable `CODEOWNERS` + auto-assign + review SLAs.
  • Turn on CI `concurrency: cancel-in-progress` and caching.
  • Shard tests and track flaky rate below 2%.
  • Provision ephemeral environments per PR.
  • Create a paved-road template repo with a reusable CI workflow.
  • Report wins weekly and keep chopping the longest queue.

Questions we hear from teams

What metrics should I start with if I can only do two?
Start with PR cycle time (opened → merged) and CI queue time vs runtime. Those two expose review and infra bottlenecks that usually dominate lead time.
How do I avoid surveilling engineers?
Aggregate at team/repo level, not per person. Use timestamps from Git and CI, not IDE plugins or activity trackers. Share the dashboard openly and focus on system fixes.
Do I need Backstage to do this?
No. A template repo + reusable CI + a README can deliver 80% of the value. Add Backstage later to advertise golden paths once they’re stable.
What about monorepos vs polyrepos?
The playbook works for both. In monorepos, be extra disciplined about changed-file test selection, concurrency cancellation, and preview env routing by path.
Will AI code assistants fix this?
AI speeds up local coding, not organizational queues. If PRs sit for a day and CI queues for 20 minutes, AI won’t help until you kill those waits.
