Stop Chasing 100 Lighthouse: Design Performance Budgets That Keep UX Consistent
Speed isn't a trophy—it's a contract with your users. Here's how to set, enforce, and hit performance budgets that protect customer experience and revenue.
Speed isn’t a number—it’s a promise you keep under load.
The speed promise you make (and break) on big days
A few Black Fridays ago, a retail client’s homepage had a Lighthouse 99 in staging and a war room full of happy faces. In production, mobile p75 LCP drifted from 2.6s to 3.9s during peak and checkout throughput cratered 11%. Same code. Different reality.
What actually failed wasn’t the code—it was the lack of a performance budget tied to real-user experience. They optimized for a vanity score in a clean lab. Real users got throttled CPUs, ad tags, chat widgets, and congested cell towers. Consistency tanked, revenue followed.
I’ve seen this movie at SaaS, marketplaces, and media companies. The fix is boring and effective: define budgets around user-facing metrics, enforce them in CI and in prod, and run a ruthless, repeatable playbook to keep within them.
Budget the metrics users actually feel
Anchor your budgets to Core Web Vitals and a couple of supporting signals. Set targets at the p75 (75th percentile) per device class and region.
- LCP (Largest Contentful Paint): focus metric for perceived load. Target p75 < 2.5s on mobile, < 1.8s on desktop.
- INP (Interaction to Next Paint): real interactivity metric replacing FID. Target p75 < 200ms.
- CLS (Cumulative Layout Shift): visual stability. Target p75 < 0.1.
- Supporting signals:
- TTFB: keep < 500ms p75 on mobile networks.
- Total bytes: keep your initial route under ~200–300 KB of JS and < 1 MB total critical bytes.
Tie these to actual journeys:
- Home → PLP → PDP → Checkout should each have budgets. PDP might allow heavier images; checkout should be lean and predictable.
- Segment by device: low-end Android on a 3G/4G profile is the forcing function for most businesses.
- Regional CDNs, third-party scripts, and A/B frameworks get their own line items. If they move the needle, they get a budget.
Budgets aren’t goals—they’re guardrails. You’re either inside the line or you’re not.
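One lightweight way to make those line items concrete is a per-journey budget manifest that both CI and dashboards read. A sketch with illustrative numbers (the file shape and field names are assumptions, not a standard):

```json
{
  "journeys": {
    "home":     { "lcp_ms": 2500, "inp_ms": 200, "cls": 0.1,  "js_kb": 250 },
    "pdp":      { "lcp_ms": 2500, "inp_ms": 200, "cls": 0.1,  "js_kb": 300 },
    "checkout": { "lcp_ms": 2300, "inp_ms": 180, "cls": 0.05, "js_kb": 200 }
  }
}
```

Checkout gets the tightest numbers on purpose: it is the journey where variance costs the most.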
Set budgets from your data, not a blog post
Here’s a simple, repeatable flow I use:
- Inventory critical journeys and top entry pages.
- Pull baselines from CrUX, your RUM (Datadog RUM, New Relic Browser, SpeedCurve LUX, or open-source Boomerang), and WebPageTest for controlled checks.
- Segment by device class, network profile, and region.
- Choose p75 targets that are both ambitious and feasible in the next quarter.
- Translate them into budgets in CI and SLOs in production.
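Choosing p75 targets starts with actually computing p75 from raw RUM samples. A minimal sketch, using the nearest-rank method (the `percentile` helper is hypothetical, not a library API):

```javascript
// Nearest-rank percentile over raw RUM samples (hypothetical helper).
function percentile(samples, p) {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

// Example: mobile LCP samples in seconds from one route
const mobileLcp = [1.9, 2.1, 2.4, 2.6, 3.8];
const p75 = percentile(mobileLcp, 75); // 2.6: your budget should beat this
```

In practice your RUM vendor computes this for you; the point is that budgets come from the distribution you observe, not from a number someone blogged.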
Wire CI with `lighthouse-ci` and bundle-size limits. One caveat: INP is a field metric that Lighthouse can’t measure in a lab navigation, so assert on Total Blocking Time (TBT) as its lab proxy, and run with Lighthouse’s default mobile emulation to match the mobile targets above:

```json
{
  "ci": {
    "collect": {
      "url": [
        "https://staging.example.com/",
        "https://staging.example.com/product/123",
        "https://staging.example.com/checkout"
      ],
      "numberOfRuns": 3
    },
    "assert": {
      "assertions": {
        "largest-contentful-paint": ["error", { "maxNumericValue": 2500 }],
        "total-blocking-time": ["error", { "maxNumericValue": 200 }],
        "cumulative-layout-shift": ["error", { "maxNumericValue": 0.1 }],
        "total-byte-weight": ["warn", { "maxNumericValue": 300000 }],
        "unused-javascript": ["warn", { "maxLength": 0 }]
      }
    }
  }
}
```

Add hard bundle caps with `size-limit` in `package.json`:
```json
{
  "scripts": {
    "size": "size-limit"
  },
  "size-limit": [
    { "path": "dist/app-*.js", "limit": "170 KB" },
    { "path": "dist/vendor-*.js", "limit": "120 KB" }
  ]
}
```

And a GitHub Actions job that runs both:
```yaml
name: perf-budgets
on: [pull_request]
jobs:
  lhci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      - run: npm ci
      - run: npm run build
      - run: npx size-limit
      - run: npx @lhci/cli autorun
```

If a PR blows the budget, it’s a failing check. No arguments, no “just this once.”
Enforce in production with SLOs and error budgets
CI catches dumb mistakes. Production SLOs catch reality. Define SLOs on real-user p75 per route and device, and use error budgets to control release velocity.
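For the Prometheus route below, each RUM beacon has to be folded into cumulative histogram buckets on ingest. A sketch of that bucketing (the bucket bounds are an assumption, chosen around the 2.5s threshold):

```javascript
// Prometheus-style cumulative buckets for LCP in seconds (assumed bounds).
const LCP_BUCKETS = [1.0, 1.8, 2.5, 4.0, 8.0];

function bucketCounts(samplesSeconds) {
  // counts[le] = number of samples <= le; '+Inf' counts everything,
  // which is the cumulative-histogram invariant Prometheus expects.
  const counts = Object.fromEntries(LCP_BUCKETS.map((le) => [le, 0]));
  counts['+Inf'] = 0;
  for (const s of samplesSeconds) {
    for (const le of LCP_BUCKETS) if (s <= le) counts[le] += 1;
    counts['+Inf'] += 1;
  }
  return counts;
}
```

Put a bucket boundary exactly at your SLO threshold (2.5 here); `histogram_quantile` interpolates within buckets, so a boundary at the threshold keeps the p75 estimate honest right where it matters.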
Prometheus example for a custom RUM metric `web_vitals_lcp_seconds_bucket`:

```promql
# p75 LCP over 5m by route and device
histogram_quantile(
  0.75,
  sum by (le, route, device) (
    rate(web_vitals_lcp_seconds_bucket[5m])
  )
)
```

Grafana alert rule idea:
- Trigger when p75 LCP > 2.5s for 10 minutes on checkout mobile.
- Burn-rate alerting: fire if 2h error budget consumption predicts SLO miss within 24h.
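The burn-rate arithmetic is simple enough to sketch. Assuming the SLO allows 25% of views to miss the LCP target (that is what “p75 < 2.5s” means), burn rate is the observed bad fraction divided by that allowance:

```javascript
// SLO: p75 LCP < 2.5s, so up to 25% of views may miss the target.
const allowedBadFraction = 0.25;

function burnRate(badViews, totalViews) {
  if (totalViews === 0) return 0;
  return badViews / totalViews / allowedBadFraction;
}

// Multiwindow rule: alert only when a short AND a long window both burn hot.
// This ignores brief spikes but catches sustained regressions (classic SRE
// burn-rate alerting, thresholds are an assumption to tune).
function shouldAlert(shortWin, longWin, threshold = 2) {
  return (
    burnRate(shortWin.bad, shortWin.total) >= threshold &&
    burnRate(longWin.bad, longWin.total) >= threshold
  );
}
```

A burn rate of 1 means you will exactly exhaust the error budget by the end of the SLO window; a sustained burn rate of 2 means you will blow it in half the time.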
Datadog RUM equivalent (monitor query, assuming `rum.web_vitals.lcp` is a custom distribution metric in seconds with percentile aggregation enabled):

```
avg(last_10m):p75:rum.web_vitals.lcp{service:frontend,route:checkout,device:mobile} > 2.5
```

Release gating:
- Canary 5% with Argo Rollouts or Flagger.
- Promote only if canary p75 stays within budgets for 15–30 minutes.
- Auto-rollback on budget breach.
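The promotion check itself can be a tiny pure function the pipeline calls after each observation interval. A sketch with assumed budget limits and sample shape:

```javascript
// Assumed per-metric budget limits for the canary route.
const budgets = { lcp_ms: 2500, inp_ms: 200, cls: 0.1 };

// canarySamples: one p75 snapshot per interval (e.g. per minute).
// Promote only if every snapshot stays inside every limit.
function canPromote(canarySamples) {
  return (
    canarySamples.length > 0 &&
    canarySamples.every((m) =>
      Object.entries(budgets).every(([metric, limit]) => m[metric] <= limit)
    )
  );
}
```

Requiring a non-empty window matters: a canary with no traffic yet should hold, not promote.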
This mirrors SRE practice: your UX SLO protects revenue the way uptime SLOs protect availability.
The tactical playbook that actually moves p75
Most teams know the buzzwords; here’s what consistently works with measurable outcomes.
- Ship less JS (INP/TBT)
  - Split by route with dynamic imports:

```jsx
// React: load the heavy gallery only when its route renders
const ProductGallery = React.lazy(() => import('./ProductGallery'));
```

  - Tree-shake and target modern browsers (`esbuild`, `rollup`, or `webpack` with modern `browserslist` queries).
  - Replace moment.js/lodash kitchen sinks with `date-fns`/ES APIs.
  - Cap third parties; load them after first interaction or via `IntersectionObserver`.
- Make the main thread boring (INP)
  - Break long tasks (>50ms). Use `scheduler.postTask`, `requestIdleCallback`, and web workers for heavy parsing.
  - Virtualize lists; don’t hydrate what isn’t on screen (islands/partial hydration via Astro, Qwik; or React Server Components in Next.js 14).
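Breaking a long task is mostly bookkeeping: cut the work into chunks small enough to finish inside ~50ms, then schedule each chunk as its own task so input events can run in between. A sketch (the chunk size is an assumption you’d tune by profiling):

```javascript
// Split a big work list into task-sized chunks.
function splitIntoChunks(items, chunkSize) {
  const chunks = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    chunks.push(items.slice(i, i + chunkSize));
  }
  return chunks;
}

// Schedule each chunk as its own task: scheduler.postTask where the
// browser supports it, setTimeout(0) as the fallback.
function processWithoutBlocking(items, fn, chunkSize = 200) {
  const schedule = globalThis.scheduler?.postTask
    ? (task) => globalThis.scheduler.postTask(task)
    : (task) => setTimeout(task, 0);
  for (const chunk of splitIntoChunks(items, chunkSize)) {
    schedule(() => chunk.forEach(fn));
  }
}
```

If the work is CPU-heavy rather than DOM-bound (parsing, diffing, crypto), skip the chunking and move it to a web worker instead.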
- Render the important pixels first (LCP)
  - Critical CSS only for above-the-fold; defer the rest.
  - Preload the hero image and its font:

```html
<link rel="preconnect" href="https://cdn.example.com" crossorigin>
<link rel="preload" as="image" href="/img/hero.avif" imagesrcset="/img/hero.avif 1x, /img/hero@2x.avif 2x" imagesizes="100vw">
<link rel="preload" as="font" href="/fonts/Inter.woff2" type="font/woff2" crossorigin>
```

- Images pay the bills (LCP/bytes)
  - Serve AVIF/WebP (via `<picture>` or content negotiation) with responsive `srcset`, and lazy-load below the fold:

```html
<img
  src="/img/pdp-640.webp"
  srcset="/img/pdp-320.webp 320w, /img/pdp-640.webp 640w, /img/pdp-1280.webp 1280w"
  sizes="(max-width: 600px) 90vw, 600px"
  loading="lazy"
  width="600" height="400" alt="Sneaker">
```

  - Use CDN image resizing (Cloudflare Images, Fastly IO, Imgix) to avoid shipping originals.
- Kill layout shift (CLS)
  - Always reserve space (`width`/`height` or `aspect-ratio`).
  - Avoid late-injected banners/consent widgets; if you must, reserve the slot.
- Back end matters (TTFB → LCP)
  - Set fast-path caches: `Cache-Control: public, max-age=600, stale-while-revalidate=86400` on static and edge-rendered pages.
  - Use `Server-Timing` to prove where time went:
```js
// Express example: record total server time and expose it via Server-Timing.
// Headers must be set before they are sent, so hook writeHead rather than
// the 'finish' event (which fires after the response has already gone out).
app.use((req, res, next) => {
  const t0 = process.hrtime.bigint();
  const writeHead = res.writeHead.bind(res);
  res.writeHead = (...args) => {
    const ms = Number(process.hrtime.bigint() - t0) / 1e6;
    res.setHeader('Server-Timing', `total;dur=${ms.toFixed(1)}`);
    return writeHead(...args);
  };
  next();
});
```

- Service Workers and caching
  - Cache shell assets aggressively; background-refresh HTML with `stale-while-revalidate`.
Each of these moves p75, not just median. That’s the difference between happy demos and happy CFOs.
Case study: the checkout budget that saved a release
A marketplace client was shipping a React + Next.js 13 migration. Staging looked fine; RUM said otherwise: mobile checkout p75 LCP 3.8s, INP 380ms, and support tickets spiked during promos.
We set budgets in CI and production:
- CI: `lighthouse-ci` with LCP < 2.5s and TBT < 200ms (lab proxy for INP) on the checkout route; `size-limit` capping `app-*.js` at 180 KB.
- Prod SLO: p75 mobile LCP < 2.5s and INP < 200ms on `/checkout`, with burn-rate alerts and canary gates.
Tactical changes over three weeks:
- Code-split checkout wizard; moved tax calc to a web worker; deferred chat widget until after payment step.
- Preloaded hero image and web fonts; added critical CSS.
- Switched images to AVIF via CDN resizing; fixed layout shifts by reserving component sizes.
- Added `Server-Timing` and a Redis microcache for high-traffic API reads; set `stale-while-revalidate` at the edge.
Results (mobile, p75):
- LCP: 3.8s → 2.3s
- INP: 380ms → 160ms
- Initial JS: 620 KB → 390 KB
- Checkout conversion: +6.4% (week-over-week, same promo cohort)
- Support tickets about “site slow” during promo: −28%
No heroics. Just budgets, gates, and a playbook.
What I’d do again (and what I wouldn’t)
Do again
- Start with RUM and journeys, not tooling. Tooling follows decisions.
- Enforce budgets at two layers: CI and production SLOs with release gates.
- Give every third party a budget line item and a plan B (delay, async, or remove).
Avoid
- Treating Lighthouse 100 as a strategy. It’s a snapshot, not a guarantee.
- Allowing exception creep: “just one more KB” is how you wake up 6 months later with a 1.2 MB bundle.
- Big-bang rewrites for perf. Iterative improvements compound and de-risk.
If your team needs an outside driver who’s been through the fire, GitPlumbers can run a one-week Performance Budget Workshop, wire CI and SLOs, and leave you with a playbook that sticks.
Key takeaways
- Performance budgets must be tied to real-user metrics (p75 LCP/INP/CLS) per device/region, not synthetic vanity scores.
- Bake budgets into both CI (Lighthouse CI, bundle size limits) and production SLOs with alerting and release gates.
- Optimize the things users feel: bytes shipped, main-thread time, image weight, network round trips, and TTFB.
- Use canaries and feature flags to protect budgets and roll back regressions before they reach everyone.
- Measure impact like a business owner: conversion, bounce, and support volume—not just milliseconds.
- Keep budgets living: re-baseline deliberately after major architectural shifts, never by exception creep.
Implementation checklist
- Inventory critical journeys (home, PLP, PDP, checkout) and set per-journey budgets.
- Define p75 LCP/INP/CLS targets by device class and region using RUM + CrUX.
- Wire CI with `lighthouse-ci` assertions and `size-limit` or `bundlesize`.
- Create production SLOs for p75 Web Vitals; alert when error budgets burn too fast.
- Gating: block promotions when budgets are exceeded; allow canaries only.
- Apply a tactical playbook: code-split, image optimize, critical CSS, resource hints, server timing, CDN caching.
- Instrument a rollback runbook tied to feature flags and release toggles.
- Review budgets quarterly; re-baseline only after deliberate architectural changes.
Questions we hear from teams
- What’s the difference between a performance budget and a performance SLO?
- A budget is a hard limit you enforce in CI (bytes, LCP/INP thresholds in a lab). An SLO is a production commitment measured with RUM (p75 Web Vitals per route/device). Budgets prevent obvious regressions before deploy; SLOs validate real-world experience and govern release velocity via error budgets.
- Why p75 instead of p95?
- p75 balances experience and practicality. It reflects what most users feel without letting tail events dominate. Track p95 to understand tail pain, but budget and gate releases on p75 so you can actually ship. If you’re a bank or trading platform, you may choose p90/p95 for critical flows—just be ready for slower velocity.
- Can I trust Lighthouse in CI to represent real users?
- Use Lighthouse CI for guardrails, not truth. It’s deterministic and good at catching obvious regressions. Pair it with production RUM SLOs and canary gating. When they disagree, production wins—always.
- How do I handle third-party scripts within budgets?
- Give each third party a line item (bytes, CPU cost) and load strategy (defer, async, after interaction). Lazy-load personalization and chat. If a vendor can’t meet the budget, negotiate, sandbox (iframe/worker), or remove it during peak using feature flags.
