Innovation Time Without the Theater: The 85/10/5 Model That Survives Q4
You don’t need hackathons or vague “20% time.” You need guardrails, rituals, and metrics that let exploration breathe without blowing your delivery SLOs.
Innovation isn’t free time; it’s scheduled, gated, and measured time.
I’ve watched three flavors of “innovation time” fail in big shops: the mythical Google-style 20% that quietly withers under Q4 pressure, the once-a-year hackathon that creates fun prototypes nobody can run twice, and the “do it after hours if you really care” martyrdom program. If you’ve lived through end-of-quarter death marches, you know why this happens: no explicit capacity, no stage gates, and no metrics leadership will back when the board asks where the roadmap went.
Here’s what actually works: treat innovation like production work—with time allocation you can defend, guardrails you can automate, and rituals you won’t cancel when incidents spike. We’ve implemented this at banks, logistics orgs, and a unicorn that had more microservices than engineers. The model below keeps exploration breathing without choking delivery.
Why your “20% time” never sticks
- It’s not on the plan. If it isn’t reflected in Jira capacity and sprint commitments, it doesn’t exist. Managers will “borrow” it at the first slip.
- No stage gates. Spikes linger because there’s no promotion or kill decision. Everything becomes a zombie pilot.
- Environments are unsafe. “Quick spikes” happen in prod-like clusters with real secrets and no quotas. Finance finds out in the AWS bill.
- No measurable outcomes. You celebrate demos instead of adoption. Execs see “fun stuff” instead of reduced lead time or reliability gains.
The antidote: simple allocations, boring rituals, and guardrails in code. Not posters, not slogans.
A pragmatic allocation model: 85/10/5 with stage gates
Forget “innovation days.” Allocate capacity across three lanes—visible in planning tools and reinforced with stage gates.
- 85% Delivery: committed roadmap, SLO work, defects. Untouchable.
- 10% Pilots: production-adjacent experiments that could ship within a quarter. Think: a canary of an Istio egress policy, a feature flag rollout via LaunchDarkly, a new ETL in dbt.
- 5% Spikes: short, timeboxed exploration. New vector DB? Fine. Two weeks, then decide.
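At planning time the split has to become concrete numbers, not a slogan. A minimal helper, assuming you plan in story points (the function name and truncation policy are ours, not a standard tool):

```shell
#!/usr/bin/env bash
# capacity: split a sprint's total points into 85/10/5 lane budgets.
# Integer math truncates; the remainder stays in delivery, the safe default.
capacity() {
  local total="$1"
  local pilots=$(( total * 10 / 100 ))
  local spikes=$(( total * 5 / 100 ))
  local delivery=$(( total - pilots - spikes ))
  printf "delivery: %d\npilots: %d\nspikes: %d\n" "$delivery" "$pilots" "$spikes"
}

capacity 120  # a two-squad sprint of 120 points -> 102 / 12 / 6
```

Paste those numbers into the sprint commitment; if delivery needs more, the trade-off happens out loud.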
Stage gates (keep them short and ruthless):
- Spike -> Kill or Pilot in 2 weeks max. Require a one-page ADR and a demo. 60-minute hard stop.
- Pilot -> Ship or Park within 6 weeks. Must report SLO impact, cost-to-run, and security sign-off.
- If it ships, fold into Delivery. If it parks, archive with reasoning and a revisit date.
Put the gates in the repo:
- docs/adr/ for decisions, docs/rfcs/ for pilots that affect interfaces.
- CODEOWNERS so pilots changing traffic, auth, or data access ping platform/security by default.
- Labels in Jira or Linear: type=pilot, type=spike, innovation=true. Capacity is calculated off these labels.
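Because capacity is calculated off labels, the tally should be a script, not a spreadsheet ritual. A sketch that sums story points per lane from a label,points CSV export (the column layout is an assumption; adapt it to your tracker's export format):

```shell
#!/usr/bin/env bash
# lane_report: sum story points per lane from a "label,points" CSV export.
# Rows labeled type=pilot / type=spike go to their lanes; everything else is delivery.
lane_report() {
  awk -F, '
    { lane = ($1 == "type=pilot") ? "pilot" : ($1 == "type=spike") ? "spike" : "delivery"
      pts[lane] += $2 }
    END { printf "delivery: %d\npilot: %d\nspike: %d\n",
                 pts["delivery"], pts["pilot"], pts["spike"] }' "$1"
}

printf 'type=pilot,5\ntype=spike,2\nroadmap,40\n' > /tmp/issues.csv
lane_report /tmp/issues.csv  # delivery: 40, pilot: 5, spike: 2
```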
Example ADR bootstrap you can wire to a script:
#!/usr/bin/env bash
# create_adr.sh "Evaluate OpenTelemetry Collector for edge traces"
TITLE="$*"
DATE=$(date +%Y-%m-%d)
SLUG=$(echo "$TITLE" | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9]+/-/g')
FILE="docs/adr/${DATE}-${SLUG}.md"
cat > "$FILE" <<EOF
# ADR: $TITLE
- Date: $DATE
- Status: Proposed
- Stage: Spike | Pilot
- Owner: @your-handle
- Decision Due: $(date -v+14d +%Y-%m-%d 2>/dev/null || date -d "+14 days" +%Y-%m-%d)
## Context
## Options Considered
## Decision
## Impact on SLO/Cost/Security
## Next Gate
EOF
git add "$FILE" && git commit -m "ADR: $TITLE" && echo "Created $FILE"
Communication rituals that make it real
You don’t need more meetings. You need small, consistent rituals that survive incident weeks.
- Weekly 25-minute Triage (Mon/Tue). Review spikes/pilots against gates. One slide per item. Decisions only. Calendar-protect it. Leaders attend or delegate with authority.
- Bi-weekly 45-minute Demo. No theater. Show working code, metrics, and the ADR. Invite platform, security, and a PM. Record and post a 3-minute cut.
- Monthly 60-minute RFC Review. Anything touching contracts, traffic, or data shapes. Use docs/rfcs/000X.md and require comments in GitHub.
- Slack digest bot (auto). Friday summary of what moved gates, cost burned, and next decisions.
We’ve wired this with GitHub Actions and Slack so it’s not a manual report:
# .github/workflows/innovation-digest.yml
name: innovation-digest
on:
  schedule:
    - cron: "0 16 * * FRI" # 16:00 UTC Fridays
jobs:
  digest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Summarize ADRs and PRs with labels
        run: ./scripts/innovation_digest.sh > digest.md
      # Command substitution doesn't run inside a `payload:` string,
      # so build the JSON explicitly and hand the file to the action.
      - name: Build Slack payload
        run: jq -n --arg text "$(cat digest.md)" '{channel: "#innovation", text: $text}' > payload.json
      - name: Post to Slack
        uses: slackapi/slack-github-action@v1.26.0
        with:
          payload-file-path: payload.json
        env:
          SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}

If your culture tolerates canceling these for “real work,” stop reading. This won’t stick. Leaders must guard the time and make the trade-offs explicit in front of the room.
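The workflow assumes a scripts/innovation_digest.sh; it does not need to be clever. A minimal sketch that scrapes the front matter written by create_adr.sh (the field names match that template; the function name is ours):

```shell
#!/usr/bin/env bash
# innovation_digest: one line per ADR with status and gate date,
# scraped from the "# ADR:", "- Status:", and "- Decision Due:" lines
# that create_adr.sh writes.
innovation_digest() {
  local dir="${1:-docs/adr}" f
  echo "*Innovation digest, $(date +%Y-%m-%d)*"
  for f in "$dir"/*.md; do
    [ -e "$f" ] || continue  # empty dir: the glob stays literal
    printf -- "- %s [%s] gate due %s\n" \
      "$(sed -n 's/^# ADR: //p' "$f" | head -1)" \
      "$(sed -n 's/^- Status: //p' "$f" | head -1)" \
      "$(sed -n 's/^- Decision Due: //p' "$f" | head -1)"
  done
}
```

Pipe the output into digest.md and the Friday post writes itself.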
Guardrails in code: environments, budgets, and approvals
The fastest way to get innovation canceled is to let it break prod or blow the cloud bill. Put hard rails around it.
- Isolated lab environment with quotas and narrow egress. Deploy via ArgoCD so it’s GitOps from day one.
- Budgets-as-code for anything tagged innovation=true in Terraform; alarms route to the pilot owner and finance.
- Automatic cleanup: TTL labels on namespaces; nightly job reaps expired spikes.
- Approvals by path: CODEOWNERS requires platform/security review for anything under /labs or touching network/auth.
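The nightly reaper is a few lines of glue. A sketch, assuming namespaces carry a ttl-days label (a name we made up) and GNU date; the expiry check is kept pure so it can be tested without a cluster:

```shell
#!/usr/bin/env bash
# reap_expired: delete lab namespaces that have outlived their ttl-days label.

# Pure check: has (now - created) exceeded the TTL?
is_expired() {
  local created_epoch="$1" ttl_days="$2" now="${3:-$(date +%s)}"
  [ $(( now - created_epoch )) -gt $(( ttl_days * 86400 )) ]
}

# Cluster side: needs kubectl and jq; run from a nightly CronJob or scheduled CI.
reap() {
  kubectl get ns -l innovation=true -o json |
    jq -r '.items[] | "\(.metadata.name) \(.metadata.labels["ttl-days"] // 14) \(.metadata.creationTimestamp)"' |
    while read -r name ttl created; do
      if is_expired "$(date -d "$created" +%s)" "$ttl"; then
        echo "reaping expired lab namespace: $name"
        kubectl delete ns "$name"
      fi
    done
}
```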
Example ArgoCD app for a lab namespace with an egress policy and resource quota:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: lab-env
spec:
  destination:
    namespace: lab
    server: https://kubernetes.default.svc
  source:
    repoURL: https://github.com/your-org/lab-infra
    targetRevision: main
    path: k8s/lab
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: lab-quota
  namespace: lab
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-except-artifacts
  namespace: lab
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              name: artifacts

Budget guardrail with Terraform (AWS example):
resource "aws_budgets_budget" "innovation" {
  name         = "innovation-monthly"
  budget_type  = "COST"
  limit_amount = "5000"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"
  cost_filters = {
    # Tag filters use the "user:<key>$<value>" format
    "TagKeyValue" = ["user:innovation$true"]
  }
  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = ["finops@your-org.com", "pilot-owner@your-org.com"]
  }
}

CODEOWNERS to keep risky changes honest:
/labs/** @platform-team @security-team
/infra/network/** @platform-team @netsec

What leadership must do (and not do)
I’ve seen CFOs bless innovation and then quietly claw it back via “just one more quarter” requests. Don’t be that exec. Behaviors that make this work:
- Say “no” publicly. When a roadmap item threatens the 10% pilot capacity, decline it in the triage. People need to see the trade-off.
- Kill fast with gratitude. Celebrate the team that kills their own spike because the data said no. Put their names in the company update.
- Promote owners, not ideas. The owner who ships or kills on time gets rewarded. The “vision” is secondary.
- Never hide cost. Share the cost-to-learn monthly. It’s easier to defend $4.2k of learnings that killed a bad idea than a stealth $20k surprise.
- Tie to business KPIs. Frame pilots as a way to reduce MTTR, cut lead time, or open a new channel. Execs fund outcomes, not shiny tech.
Anti-patterns to squash:
- “Free Fridays” that get canceled every time a sev-2 happens.
- Hack weeks with no runway to production.
- Spikes lasting months “because only Alice knows it.” If Alice goes on PTO, it dies.
Metrics you can actually trust
Skip vanity counts like “number of ideas.” Track flow and impact.
- Adoption Rate (90-day): percentage of pilots that land in production behind a flag or as a dependency within 90 days.
- Time-to-Decision: days from ADR created to gate decision (kill or promote). Lower is better.
- Cost-to-Learn: cloud + license + people time per spike/pilot. Benchmark month over month.
- Delivery Impact: change in DORA metrics for the squads participating (lead time, change fail rate, MTTR).
- Reliability Impact: SLO deltas where pilots touched critical paths.
You can automate most of this with labels and simple queries.
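Adoption rate is the number leadership asks for first, and it can come from a small script over exported pilot records rather than a BI project. A sketch over a pilot,start_epoch,ship_epoch CSV (empty third column means not shipped; the layout is an assumption):

```shell
#!/usr/bin/env bash
# adoption_rate: share of pilots that shipped within 90 days of starting.
# Input rows: pilot_name,start_epoch,ship_epoch (ship_epoch empty if unshipped).
adoption_rate() {
  awk -F, '
    { total++
      if ($3 != "" && ($3 - $2) <= 90 * 86400) adopted++ }
    END { if (total) printf "adoption_rate_90d: %d/%d (%.0f%%)\n",
                            adopted, total, 100 * adopted / total }' "$1"
}
```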
Jira JQL to track decisions per month:
project = CORE AND labels in (innovation) AND status changed to Done DURING (startOfMonth(), endOfMonth())

SQL to compute time-to-decision from ADR metadata (if you log ADRs in a table):
SELECT
adr_id,
DATE_PART('day', decision_at - created_at) AS time_to_decision_days,
stage,
decision
FROM adrs
WHERE created_at >= date_trunc('month', now());

Prometheus to watch lab error rate so pilots don’t normalize failure:
100 * sum(rate(http_requests_total{env="lab",status=~"5.."}[5m]))
/
sum(rate(http_requests_total{env="lab"}[5m]))

If you can’t publish these monthly without caveats, your process is too squishy. Tighten the gates or reduce scope.
A 30-day rollout plan (that survives finance and security)
Week 1
- Pick two teams as pilot participants. Agree on 85/10/5 capacity. Put it in Jira.
- Create docs/adr/ and docs/rfcs/ templates. Add CODEOWNERS.
- Schedule the triage and demo sessions for the quarter. Exec sponsor confirms attendance.
Week 2
- Stand up a lab namespace via ArgoCD; apply quotas and egress restrictions.
- Tag all lab infra innovation=true. Set a monthly budget alert.
- Wire the Slack digest and a simple decision dashboard (Google Sheet is fine).
Week 3
- Run two spikes, timeboxed to 2 weeks. Owners create ADRs on day 1.
- First triage: set promotion/kill dates. Identify security touchpoints early.
- Prep demo: working code + ADR + first pass on cost-to-learn.
Week 4
- Make the first gate decisions. Kill at least one thing on purpose.
- Convert one spike to a pilot with a 6-week ship-or-park target.
- Publish the first scorecard: adoption rate (n/a yet), time-to-decision, cost-to-learn, delivery impact (neutral).
We’ve rolled this in regulated environments (SOX, HIPAA). The trick is involving platform/security as default reviewers and proving budget discipline upfront. After two months, even the skeptics admit it’s cheaper than surprise migrations and weekend rewrites.
Quick case snapshot: the platform team that ended innovation theater
A consumer fintech had “innovation days” that kept getting canceled. We implemented 85/10/5 across two squads, stood up a lab namespace, added budgets-as-code, and enforced CODEOWNERS on /labs.
- Within 60 days, they killed 3 out of 5 spikes before week 3—saving an estimated 6 engineer-weeks.
- One pilot (OpenTelemetry Collector at the edge) shipped behind a flag in week 7, cutting MTTR by 18% in the next quarter.
- Delivery didn’t slip: lead time remained within 5% variance, because capacity was explicit.
That’s the difference between exploration and theater.
If this resonates, we can help you make it stick, adjust the knobs for your org (compliance, cost, culture), and wire it into your tooling without boiling the ocean. No hype, just working plumbing.
Key takeaways
- Stop pretending you can do “20% time” without explicit capacity, stage gates, and a kill switch.
- Allocate 85/10/5 across Delivery, Pilots, and Spikes; promote or kill work every two weeks.
- Institutionalize short, boring rituals: weekly triage, bi-weekly demo, monthly RFC review.
- Codify guardrails: isolated lab envs, budgets-as-code, and CODEOWNERS for high-risk areas.
- Measure outcomes that matter: adoption rate, time-to-decision, cost-to-learn—not vanity hackathon stats.
Implementation checklist
- Define 85/10/5 capacity in planning tools and enforce it in sprint commitments.
- Stand up a lab environment with quotas, egress limits, and auto-cleanup.
- Add stage gates with promotion/kill criteria and an ADR template in the repo.
- Schedule a 25-minute weekly triage and a 45-minute demo every two weeks—never skip.
- Set budgets-as-code and tag all lab resources; report cost-to-learn monthly.
- Publish a simple scorecard: adoption rate, time-to-decision, cost burn, delivery impact.
Questions we hear from teams
- How do we prevent pilots from stealing production SRE cycles?
- Gate pilots in a lab namespace with quotas and egress limits, require on-call shadowing only after a promotion decision, and use CODEOWNERS so platform/security review changes before they hit shared infra.
- What if we miss roadmap commitments because of the 10% pilot allocation?
- Then you overcommitted. Make pilot capacity explicit in sprint planning. If leadership wants to reclaim it, they should publicly agree to defer a scope slice or accept risk to the innovation pipeline.
- How do we keep spikes from turning into stealth migrations?
- Timebox to two weeks, require an ADR on day one, and schedule a kill-or-promote decision in the calendar. No decision, no continuation. Enforce via labels and the weekly triage.
- Isn’t this just more process?
- It’s the minimum viable process to protect exploration. The rituals are short, the gates are binary, and the guardrails are code. Compared to failed hack weeks and ad-hoc spikes, it’s less overhead and more outcomes.
