Stop Chasing P99s in the Dark: A Practical Framework to Balance Performance and Cloud Spend

When the p95 spikes and Finance pings you on Slack, you need a framework that connects user happiness to dollars. Here’s the playbook I use to tune systems without torching the budget.

Performance work only matters if users feel it and Finance can measure it.

The day the p95s spiked and the CFO called

We’d just shipped a “simple” personalization tweak. Thirty minutes later, p95 on the product page jumped from 320ms to 950ms, error_rate crept to 2.2%, and autoscaling doubled our c5.2xlarge pool. Real User Monitoring showed LCP in Chrome Mobile drifting past 3s, and conversions dropped 8%. Finance pinged me: “Why is EC2 up 40% this hour?” I’ve seen this movie too many times: teams chase p99 graphs while burning money, and customers are quietly exiting.

Here’s what actually works: anchor every optimization to a user-facing metric and a unit-economic target. Then use a small set of brutally practical techniques to move those numbers—without letting the cloud bill run feral.

Define the metrics that actually move the business

Forget vanity dashboards. Track what your CFO and your users feel.

  • User-facing: LCP, TTFB, CLS (web), Apdex, mobile cold start, checkout p95, error rate. Tools: Datadog RUM, New Relic Browser, Google Lighthouse, Web Vitals.

  • Service SLOs: p50/p95, error_rate, and availability per API/domain. Instrument with OpenTelemetry, scrape with Prometheus 2.x, visualize in Grafana 10 (use panels that overlay SLO targets).

  • Cost lenses: cost_per_1k_requests, cost_per_conversion, cost_per_active_user. Tag infra with owner, service, and env; pipe cost data from AWS CUR/CloudWatch or GCP Billing → BigQuery into Looker/Grafana (a recording-rule sketch follows this list).

  • Capacity signals: queue depth, saturation (CPU throttle, load_avg, pg_locks), cache hit ratio, Latency vs Concurrency curves. These tell you when to scale vs optimize.
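
To make the unit-cost lens concrete, here is a minimal Prometheus recording-rule sketch; cost_hourly_dollars, http_requests_total, and http_request_duration_seconds_bucket are assumed metric names from your billing export and instrumentation, so adapt them to whatever you actually emit.

# Recording rules (sketch): per-service p95 and dollars per 1k requests.
# Metric names below are assumptions, not a standard.
groups:
  - name: unit-economics
    rules:
      # Latency SLI: p95 over the last 5 minutes, per service.
      - record: service:request_latency_seconds:p95
        expr: histogram_quantile(0.95, sum by (service, le) (rate(http_request_duration_seconds_bucket[5m])))
      # Unit cost: hourly spend divided by thousands of requests per hour.
      - record: service:cost_per_1k_requests:dollars
        expr: |
          sum by (service) (cost_hourly_dollars)
          / (sum by (service) (rate(http_requests_total[1h])) * 3600 / 1000)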

If a metric doesn’t tie to a user journey or a unit cost, it’s noise. Kill it.

A pragmatic framework for resource optimization

I keep it to four steps. It’s boring. It works.

  1. Baseline: Capture a 7-day window of RUM, SLO, and unit cost. Freeze major features. Example: product API p95=380ms, error=0.3%, cost_per_1k_reqs=$0.36.

  2. Budget: Set targets that map to revenue. Example: “Reduce product page LCP to <2.5s and cost_per_1k_reqs to <$0.25 without pushing error_rate > 0.5%.”

  3. Boundaries: Enforce guardrails before tuning: Kubernetes requests/limits, ResourceQuota, LimitRange, rate limits, and circuit breakers (a ResourceQuota sketch follows this list). Block regressions at the gate, not during an incident.

  4. Iterate: Single-variable changes, measure for 24–72 hours, roll forward/back via ArgoCD. If a change doesn’t move the user metric or unit cost, revert. No heroics.
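
For step 3, the first boundary I usually put in place is a namespace ResourceQuota so a runaway rollout cannot silently double the node pool. A sketch follows; the namespace and numbers are illustrative, not recommendations.

# Namespace guardrail (sketch): caps aggregate requests/limits and pod count.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: checkout-quota
  namespace: checkout
spec:
  hard:
    requests.cpu: "40"
    requests.memory: 80Gi
    limits.cpu: "60"
    limits.memory: 120Gi
    pods: "200"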

Dashboards that matter (one screen, no scrolling):

  • Panel 1: p50/p95/p99 with SLO bands, error_rate overlay.

  • Panel 2: cost_per_1k_requests and total spend by service (tagged).

  • Panel 3: cache hit/byte-hit, DB avg query time, pg_stat_statements top offenders.

  • Panel 4: infra saturation: CPU throttle, memory RSS vs limit, GC pauses, queue depth.

Tactics that deliver measurable wins

These are the fixes I’ve seen pay back in days, not quarters.

  • Right-size Kubernetes before you scale out (K8s 1.27+):

    • Install Goldilocks and VPA v0.13.0 in recommendation-only mode (updateMode: Off). Open PRs to adjust requests/limits from the recommendations.

    • Detect CPU throttling via container_cpu_cfs_throttled_seconds_total (or the periods-based ratio in the alert sketch after the LimitRange YAML below). If throttling exceeds 2% at peak, bump CPU requests modestly (10–20%).

    • Example outcome: one client dropped node count 25% by right-sizing 14 services; p95 improved 18% from reduced throttling.

    • Example LimitRange YAML:

apiVersion: v1
kind: LimitRange
metadata:
  name: defaults
spec:
  limits:
  - type: Container
    default:
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:
      cpu: "250m"
      memory: "256Mi"
  • Scale on meaningful signals: CPU is a terrible proxy for latency.

    • Use KEDA to scale on RPS, Kafka lag, or queue depth, or use the HPA with custom metrics from the Prometheus Adapter (a KEDA sketch follows this list).

    • Keep maxSurge=1, maxUnavailable=0 in rollouts to avoid pileups.

    • Result: a chatty API went from oscillating pods (thrash) to stable p95 with 30% fewer replicas.
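
Here is what RPS-based scaling can look like with KEDA; the deployment name, Prometheus address, query, and the 150-RPS-per-replica threshold are all assumptions for illustration.

# KEDA ScaledObject (sketch): scale on request rate instead of CPU.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: product-api-rps
  namespace: shop
spec:
  scaleTargetRef:
    name: product-api        # Deployment to scale
  minReplicaCount: 3
  maxReplicaCount: 30
  cooldownPeriod: 120
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090
        query: sum(rate(http_requests_total{service="product-api"}[2m]))
        threshold: "150"     # target RPS per replica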

  • Cache like you mean it:

    • Edge: add Cache-Control: s-maxage=300, stale-while-revalidate=60 for product listings and let Fastly/CloudFront do the heavy lifting (one way to stamp the header at the ingress is sketched after this list).

    • App: Redis with TTLs and negative caching for 404s. Track hit ratio and byte hit ratio; target > 80%/70% respectively.

    • Result: cut TTFB by 120ms and origin egress by 55%, dropping cost_per_1k_reqs by ~$0.08.
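
If the application cannot easily set the header itself, one option is to stamp it at the ingress. The sketch below assumes ingress-nginx with snippet annotations enabled; host, path, and service names are placeholders.

# ingress-nginx (sketch): add edge-cacheable headers on a low-churn endpoint.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: product-listings
  annotations:
    nginx.ingress.kubernetes.io/configuration-snippet: |
      add_header Cache-Control "s-maxage=300, stale-while-revalidate=60" always;
spec:
  rules:
    - host: shop.example.com
      http:
        paths:
          - path: /api/listings
            pathType: Prefix
            backend:
              service:
                name: product-api
                port:
                  number: 80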

  • Backpressure and circuit breaking: protect the golden path.

    • At ingress (Envoy 1.29):
circuit_breakers:
  thresholds:
  - priority: DEFAULT
    max_connections: 20000
    max_pending_requests: 5000
    max_requests: 10000
    max_retries: 3
    • Budget queue time: reject requests once they’ve queued for 200ms to keep p95 under the SLO. Users prefer fast failure to spinners.

    • Result: error_rate held steady at 0.4% during traffic spikes instead of cascading into timeouts.

  • Kill the top 5 queries before buying bigger DBs (Postgres 14):

    • Enable pg_stat_statements. Run:
SELECT query, calls, total_exec_time/1000 AS total_s,
       mean_exec_time AS mean_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;
    • Add missing composite indexes and trim payloads. If you’re doing SELECT * in hot paths, you’re lighting money on fire.

    • Add PgBouncer transaction pooling and a read replica for search. Result: DB CPU -35%, API p95 -28%.

  • Runtime tuning beats instance upgrades:

    • Go: set GOMAXPROCS to match the container CPU limit (uber-go/automaxprocs handles this); tune GOGC (default 100) toward 150 to trade a larger heap for less GC CPU when pauses hurt and you have memory headroom. Measure with runtime.ReadMemStats exported to Prometheus.

    • Java 17: try -XX:+UseZGC for services with GC pauses > 200ms on G1. We cut tail latency 22% by flipping to ZGC on an IO-heavy service.

  • Serverless isn’t free (AWS Lambda):

    • Memory setting controls CPU. For CPU-bound functions, bump from 512MB → 1024MB; latency may drop 40% while cost per request improves due to shorter duration.

    • Use Provisioned Concurrency only on cold-start paths that actually hurt conversions; let everything else scale to zero. Track cost_per_1k_invocations (a SAM sketch follows).
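
In SAM terms, the memory bump and targeted Provisioned Concurrency look roughly like this; the function name, runtime, and counts are placeholders, and the alias is what the provisioned concurrency attaches to.

# AWS SAM (sketch): more memory buys more CPU; provisioned concurrency only
# on the conversion-critical function.
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Resources:
  CheckoutPricingFn:
    Type: AWS::Serverless::Function
    Properties:
      Handler: pricing.handler
      Runtime: nodejs18.x
      CodeUri: ./pricing
      MemorySize: 1024            # was 512; Lambda CPU scales with memory
      Timeout: 5
      AutoPublishAlias: live      # provisioned concurrency attaches to the alias
      ProvisionedConcurrencyConfig:
        ProvisionedConcurrentExecutions: 5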

Case study: Taming a chatty checkout without burning cash

A mid-market retailer (K8s on EKS, microservices, Java/Go mix) saw checkout p95 drift to 900ms during promos and EC2 spend spike 38% at peak. Conversion was off by 6–9% versus baseline. They’d been told to “add nodes and shard the DB.” We did this instead:

  1. Baselined: stitched Datadog RUM LCP + checkout p95 to cost_per_1k_orders in Grafana. Baseline: p95=910ms, error=1.1%, LCP web=3.2s, cost_per_1k_orders=$0.42.

  2. Budgeted: “p95 < 400ms, error < 0.5%, LCP < 2.5s, cost_per_1k_orders < $0.25.”

  3. Boundaries: applied org-wide LimitRange, enforced OPA policy to block pods without limits, and added Envoy circuit breakers to the payment aggregator.

  4. Iterated:

    • Right-sized 11 services via Goldilocks PRs; removed 2x CPU throttling. Node count -23%.

    • Cached catalog and tax rates at edge with stale-while-revalidate=60. Origin requests -47%.

    • Killed 3 queries (missing composite index, N+1 in promotions, fat payload). DB CPU -33%.

    • Tuned the Go runtime to GOGC=125, which shaved RSS 18% and stabilized pauses.

    • Switched the HPA from CPU to RPS-based scaling, which removed the thrash. Added a queue-time budget (200ms).

Results in 10 days:

  • Checkout p95 910ms → 280ms (−69%).

  • LCP 3.2s → 2.3s (−28%), mobile bounce rate −5.4%.

  • error_rate 1.1% → 0.4%.

  • cost_per_1k_orders $0.42 → $0.21 (−50%).

  • Promo-day revenue +7.2% vs prior promo at similar traffic. No new instances purchased.

Governance: bake cost–performance into the pipeline

If you don’t encode guardrails, entropy wins. Ship with seatbelts:

  • GitOps everything with ArgoCD: requests/limits, HPA/KEDA, Envoy configs, budgets. No ad-hoc changes at 2 a.m.

  • OPA/Gatekeeper policies: reject pods without limits; block :latest images; enforce maxReplicas and namespace ResourceQuota (a constraint sketch follows this list).

  • SLO-first rollouts: canary with Argo Rollouts or Flagger. Auto-abort if p95 or error_rate breach for N minutes, not just CPU%.

  • FinOps checks in CI: run an infracost job per PR; fail builds if the projected cost_per_1k_requests exceeds the budget. Tie resources to tags so Finance can see ownership.

  • Planned chaos: run Chaos Mesh or Gremlin monthly to test circuit breakers and queue budgets. If it fails under test, it will fail on Black Friday.
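
The “no limits, no ship” rule is the one I’d automate first. A minimal constraint sketch follows, assuming the gatekeeper-library K8sContainerLimits template is already installed; the ceilings are illustrative.

# Gatekeeper constraint (sketch): reject pods whose containers omit limits or
# exceed the ceilings below. Requires the K8sContainerLimits ConstraintTemplate.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sContainerLimits
metadata:
  name: containers-must-declare-limits
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    excludedNamespaces: ["kube-system"]
  parameters:
    cpu: "2"
    memory: "2Gi"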

Field notes: what I’d do differently next time

  • Don’t start with p99. It’s where dragons and tail noise live. Fix p95 first; if conversion still suffers, then open the p99 door.

  • Don’t use CPU as the only HPA signal. Scale on RPS/queue depth. CPU-based scaling rewards slow code.

  • Avoid runtimes that hide GC pain. Measure pauses explicitly; if your perf issue is GC, solve GC—not autoscaling.

  • Make caching a first-class KPI. Edge hit ratio and byte-hit ratio belong next to LCP on the dashboard.

  • Keep feature teams accountable for cost. Add cost_per_1k_requests to the service ownership checklist. Bad caches are everyone’s problem, but someone’s responsibility.

30/60/90-day playbook

30 days:

  • Instrument OpenTelemetry traces for hot paths; wire to Prometheus/Grafana.

  • Stand up a unified dashboard: SLO metrics + unit cost + saturation. Kill 20% of noisy panels.

  • Deploy Goldilocks and open right-sizing PRs for top 10 services.

  • Enable pg_stat_statements; fix top 3 queries.

60 days:

  • Switch HPA to RPS/queue-based scaling where applicable (KEDA for async).

  • Add Envoy circuit breakers and queue time budgets at ingress.

  • Implement CDN caching with s-maxage and stale-while-revalidate for static and low-churn APIs.

  • Start canary rollouts gated on SLOs (an analysis-template sketch follows this list). Add OPA policies to block pods without limits.
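
A canary gate for those SLO-first rollouts can look like the Argo Rollouts sketch below; the Prometheus address, queries, and thresholds are assumptions tied to the checkout targets above.

# Argo Rollouts AnalysisTemplate (sketch): abort the canary when p95 or
# error rate breaches the SLO for several consecutive checks.
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: checkout-slo-gate
spec:
  metrics:
    - name: p95-latency
      interval: 1m
      failureLimit: 3
      successCondition: result[0] < 0.4   # seconds
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: >
            histogram_quantile(0.95, sum by (le)
            (rate(http_request_duration_seconds_bucket{service="checkout",role="canary"}[5m])))
    - name: error-rate
      interval: 1m
      failureLimit: 3
      successCondition: result[0] < 0.005
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: >
            sum(rate(http_requests_total{service="checkout",role="canary",code=~"5.."}[5m]))
            / sum(rate(http_requests_total{service="checkout",role="canary"}[5m]))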

90 days:

  • Tie infracost into CI with budgets per service. Alert on cost_per_1k_requests regressions.

  • Tune runtimes (Go GOGC, Java ZGC) for latency-heavy services.

  • Run a chaos day to validate circuit breakers and throttling under failure. Document runbooks and rollback paths.
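
For the chaos day, a bounded Chaos Mesh experiment is enough to prove the circuit breakers hold; the namespaces, labels, and duration below are placeholders.

# Chaos Mesh (sketch): make one payment-aggregator pod unavailable for two
# minutes and watch the SLO dashboard, not just the pod count.
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: payment-aggregator-failure
  namespace: chaos-testing
spec:
  action: pod-failure
  mode: one
  duration: "2m"
  selector:
    namespaces:
      - checkout
    labelSelectors:
      app: payment-aggregator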


Key takeaways

  • Tie performance work to user-facing metrics and unit economics, not vanity p99s.
  • Instrument cost per request/user/transaction alongside latency and error rate.
  • Use a simple, repeatable loop: baseline → budget → boundaries → iterate.
  • Right-size resources with data (VPA/Goldilocks), not folklore or vendor defaults.
  • Throttle and shed load before you autoscale; scale on meaningful signals (RPS, queue depth).
  • Cache aggressively at the edge and app layer; measure hit ratio and byte hit ratio.
  • Govern via GitOps: enforce requests/limits, budgets, and SLOs in the pipeline.

Implementation checklist

  • Define SLOs mapped to revenue steps (e.g., checkout p95 < 400ms).
  • Track cost per 1k requests and per conversion in the same dashboard as latency.
  • Enable `pg_stat_statements` and kill the top 5 slow queries.
  • Deploy `Goldilocks` and open PRs for right-sized requests/limits.
  • Switch HPA to RPS/queue-depth metrics and set `maxUnavailable=0` for rollouts.
  • Add CDN `Cache-Control` with `s-maxage` and `stale-while-revalidate` for static+API.
  • Introduce `Envoy` circuit breakers and queue time budgets at ingress.
  • Tune runtimes: `GOGC=100-150`, Java 17 `-XX:+UseZGC` where GC dominates.
  • Automate guardrails: OPA policies blocking pods without limits; budget alerts in CI.

Questions we hear from teams

How do I pick the right SLOs for performance vs cost?
Anchor SLOs to critical user journeys (e.g., product detail, checkout) and pick `p95` thresholds where conversion inflects. Pair each SLO with a budgeted `cost_per_1k_requests` or `cost_per_conversion`. If latency is below the inflection point but cost is high, optimize spend; if above, optimize speed first.
Isn’t autoscaling cheaper than optimizing code?
Sometimes—until you hit saturation on shared components (DB, cache, network) and tail latency explodes. Autoscaling without backpressure hides problems and inflates spend. Fix the top 5 queries, cache aggressively, and scale on RPS/queue depth; it’s usually cheaper and faster than buying bigger nodes.
How do I measure cost per request reliably?
Tag resources by `service`, `env`, and `owner`. Export metered cost (via AWS CUR or GCP Billing) to a warehouse and join against request counts from `Prometheus` or your gateway logs. Plot `cost_per_1k_requests` in Grafana next to `p95` and `error_rate`. Automate with `infracost` in CI for projections per PR.
What’s the quickest win if I have one week?
Deploy `Goldilocks`, right-size top 10 services, add CDN caching for low-churn endpoints, and enable `pg_stat_statements` to fix the top queries. Add `Envoy` circuit breakers to cap tail latency. You’ll usually see a 20–40% latency improvement and 15–30% cost reduction in a week.
When should I optimize `p99`?
If your `p95` is healthy but you still see user pain (timeouts, retries, SLA penalties) or high-value workflows suffer from long tails (trading, payments), then target `p99`. Use queue time budgets, circuit breakers, and runtime GC tuning. Otherwise, `p99` is often noise that derails the team.

Ready to modernize your codebase?

Let GitPlumbers help you transform AI-generated chaos into clean, scalable applications.

Talk to GitPlumbers about your cost–performance hotspots, or see how we cut checkout latency 69% without new hardware.
