The Canary That Stopped Payday From Breaking: Progressive Delivery at a Fintech
A payroll fintech cut change failures by 80% and went from 3 deploys a week per service to 10 a day by wiring canaries to SLOs, not vibes.
> Tie canaries to SLOs, not vibes. The pipeline should decide when to promote or abort; humans just watch the graphs.
The Friday That Put Payday At Risk
Two quarters back, a Series D payroll fintech (150 engineers, multi-region EKS, SOC 2 + PCI scope) pinged us after a Friday release doubled 500s on their payroll calculation API. It was fixed in two hours, but every minute of downtime meant delayed pay for gig workers. Their CEO asked the question we all dread: "Why are we still rolling dice on production?"
- Stack: `EKS` (1.27), `ArgoCD` GitOps, `Terraform` infra, `Istio` 1.18, Node/Go services, a legacy Rails monolith for admin.
- Observability: `Prometheus` + `Grafana`, `Datadog APM`, `Sentry`, `Honeycomb` for traces on core paths.
- Release pattern: rolling updates, no traffic shaping, occasional blue/green for the monolith.
I've seen this movie at marketplaces and banks: great tooling, but deployments are still all-or-nothing vibes. The fix isn't more dashboards—it's progressive delivery tied to SLOs.
Why Releases Were So Risky
The symptoms were predictable:
- High change failure rate: 18% of deploys required hotfix or rollback.
- Long MTTR: 2h10m average when a deploy went bad.
- Lumpy deploy cadence: 3 releases/week per service, clustered before payroll windows.
- Pager fatigue: 14 on-call pages/week; release nights were dreaded.
The root causes were boring (which is exactly why they persist):
- No blast-radius control: Rolling updates meant a single bad pod could poison the pool under load.
- No objective guardrails: Promotion decisions were eyeballed in Slack, not driven by SLOs.
- Sticky sessions + caches: A/B behavior was inconsistent under `NGINX` and `Redis` caching.
- DB migrations: In-place changes broke read paths. No `expand-contract` discipline.
- Compliance constraints: SOC 2/PCI required change control and auditable rollbacks; ad-hoc scripts made auditors nervous.
When the stakes are payroll deadlines, "we think it's fine" doesn't cut it.
What We Changed In 6 Weeks
We implemented progressive delivery in layers. Nothing exotic—just the boring, proven playbook.
SLOs before canaries
- Defined per-service SLOs: `99.9%` success rate for payroll APIs, p95 latency `<250ms`, with error budget policies.
- Codified PromQL and Datadog monitors for golden signals (see the sketch below).
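Concretely, the codified rules looked something like this. A minimal sketch, assuming prometheus-operator CRDs and a conventional `http_requests_total` counter plus a duration histogram; metric and label names are illustrative, not their exact instrumentation.

```yaml
# Sketch only: SLI recording rules plus a fast-burn alert for the 99.9% SLO.
# Metric names, labels, and thresholds are illustrative.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: payroll-calc-slo
  namespace: monitoring
spec:
  groups:
  - name: payroll-calc.slo
    rules:
    # SLI: share of non-error responses over 5 minutes
    - record: service:payroll_calc:success_ratio_5m
      expr: |
        sum(rate(http_requests_total{service="payroll-calc",status=~"2..|3.."}[5m]))
        /
        sum(rate(http_requests_total{service="payroll-calc"}[5m]))
    # SLI: p95 latency over 5 minutes
    - record: service:payroll_calc:latency_p95_5m
      expr: |
        histogram_quantile(0.95,
          sum(rate(http_request_duration_seconds_bucket{service="payroll-calc"}[5m])) by (le))
    # Fast burn: error rate exceeding 2x the 0.1% budget for 5 minutes
    - alert: PayrollCalcErrorBudgetFastBurn
      expr: (1 - service:payroll_calc:success_ratio_5m) > (2 * 0.001)
      for: 5m
      labels:
        severity: page
```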
Traffic shaping with `Argo Rollouts` + `Istio`
- Replaced Kubernetes `Deployment` with `Rollout` resources for the top 6 services.
- Enabled weighted routing via `VirtualService` for 1%→5%→25%→50% steps (see the sketch below).
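The `VirtualService` the Rollout controller rewrites is roughly the following. A sketch with illustrative host and service names; Argo Rollouts adjusts the two weights as it walks the canary steps.

```yaml
# Sketch: the Istio VirtualService Argo Rollouts manipulates during a canary.
# The controller rewrites the stable/canary weights at each setWeight step.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payroll-calc-vs
spec:
  hosts:
  - payroll-calc
  http:
  - name: primary   # referenced by the Rollout's trafficRouting.istio.virtualService.routes
    route:
    - destination:
        host: payroll-calc-stable
      weight: 100
    - destination:
        host: payroll-calc-canary
      weight: 0
```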
Automated analysis and rollback
- Created `AnalysisTemplate`s hitting `Prometheus` and `Datadog` SLOs.
- Auto-abort on error budget burn > 2x for 5m, or p95 latency regression > 20%.
Feature flags for risky paths
- Introduced `LaunchDarkly` with the relay proxy; flags default OFF.
- Enabled cohort rollouts: internal staff → 1% tenants → 10% → 50% → 100%.
DB `expand-contract`
- Enforced two-phase migrations with `gh-ost` for MySQL and background backfills (see the sketch below).
- Prohibited incompatible writes until flags reached 100%.
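An expand-phase run looked something like the CI job below. A sketch only: hosts, credentials, and the schema change are placeholders, and it assumes `gh-ost` is installed on the runner.

```yaml
# Sketch: "expand" phase of an expand-contract migration, run as a CI job.
# Placeholder secrets and schema; tune gh-ost flags to your replication topology.
jobs:
  expand-migration:
    runs-on: ubuntu-latest
    steps:
      - name: Add nullable column online (expand phase)
        env:
          MYSQL_HOST: ${{ secrets.MYSQL_HOST }}
          MYSQL_USER: ${{ secrets.MYSQL_MIGRATION_USER }}
          MYSQL_PASSWORD: ${{ secrets.MYSQL_MIGRATION_PASSWORD }}
        run: |
          gh-ost \
            --host="$MYSQL_HOST" \
            --user="$MYSQL_USER" \
            --password="$MYSQL_PASSWORD" \
            --database=payroll \
            --table=pay_runs \
            --alter="ADD COLUMN net_pay_cents BIGINT NULL" \
            --allow-on-master \
            --max-load="Threads_running=25" \
            --chunk-size=1000 \
            --execute
```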
GitOps and auditability
- All rollouts managed via `ArgoCD`; change requests, promotions, and rollbacks are PRs.
- GitHub Actions posted rollout status and decisions to Slack with links to Grafana (see the sketch below).
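The notification glue was no fancier than a workflow step like this. A sketch assuming an incoming-webhook secret; the dashboard URL is illustrative.

```yaml
# Sketch: post the rollout decision to Slack from GitHub Actions.
# SLACK_WEBHOOK_URL is an incoming-webhook secret; the Grafana link is a placeholder.
- name: Notify Slack of rollout decision
  if: always()
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
  run: |
    curl -sS -X POST "$SLACK_WEBHOOK_URL" \
      -H 'Content-Type: application/json' \
      -d '{"text": "payroll-calc rollout: ${{ job.status }} at ${{ github.sha }} (Grafana: https://grafana.example.com/d/payroll-calc)"}'
```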
Nothing here is novel. The difference was discipline and wiring the pieces to business SLOs.
The Boring YAML That Saved Them
Here's a simplified `Argo Rollouts` example we used. Note the canary steps and the analysis hooked to SLO queries.
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: payroll-calc
spec:
  replicas: 12
  strategy:
    canary:
      canaryService: payroll-calc-canary
      stableService: payroll-calc-stable
      trafficRouting:
        istio:
          virtualService:
            name: payroll-calc-vs
            routes:
            - primary
      steps:
      - setWeight: 1
      - pause: {duration: 120}
      - analysis:
          templates:
          - templateName: slo-success-rate
          - templateName: latency-regression
      - setWeight: 5
      - pause: {duration: 180}
      - setWeight: 25
      - pause: {duration: 300}
      - setWeight: 50
      - pause: {duration: 300}
      - analysis:
          templates:
          - templateName: error-budget-burn
```
And an `AnalysisTemplate` wired to Prometheus:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: slo-success-rate
spec:
  metrics:
  - name: success-rate
    interval: 30s
    successCondition: result[0] >= 0.999
    failureLimit: 1
    provider:
      prometheus:
        address: http://prometheus.monitoring:9090
        query: |
          sum(rate(http_requests_total{service="payroll-calc",status=~"2..|3.."}[5m]))
          /
          sum(rate(http_requests_total{service="payroll-calc"}[5m]))
```
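The `error-budget-burn` template referenced in the final analysis step followed the same pattern. Roughly, as a sketch against the 2x-for-5m abort policy; metric names and thresholds are illustrative.

```yaml
# Sketch: fail the rollout when the 5-minute error rate burns the 99.9% budget
# at more than 2x. Query and thresholds are illustrative.
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-budget-burn
spec:
  metrics:
  - name: burn-rate
    interval: 60s
    count: 5
    failureLimit: 1
    failureCondition: result[0] > 2
    provider:
      prometheus:
        address: http://prometheus.monitoring:9090
        query: |
          (
            1 - (
              sum(rate(http_requests_total{service="payroll-calc",status=~"2..|3.."}[5m]))
              /
              sum(rate(http_requests_total{service="payroll-calc"}[5m]))
            )
          ) / 0.001
```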
The latency analysis used Datadog because their APM tagging was cleaner there. Use the system where your SLI labels are stable; don't fight your telemetry.
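For completeness, the Datadog-backed latency check had roughly this shape. A sketch that assumes the Argo Rollouts Datadog metric provider is configured with its API/app key secret; the query and threshold are illustrative, not their exact monitor.

```yaml
# Sketch: latency regression check against Datadog APM. The query, units, and
# threshold are illustrative; adapt to your APM metric and tagging.
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: latency-regression
spec:
  metrics:
  - name: p95-latency
    interval: 60s
    count: 3
    failureLimit: 1
    # Abort if canary p95 exceeds the 250ms SLO target (value in seconds)
    successCondition: default(result, 0) < 0.250
    provider:
      datadog:
        interval: 5m
        query: |
          p95:trace.http.request.duration{service:payroll-calc,env:production}
```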
Solving The Gotchas That Kill Canaries
I've watched teams abandon progressive delivery over avoidable issues. Here's what actually worked under SOC 2/PCI:
Sticky sessions and caches
- We moved session stickiness from `NGINX` to `Istio` with consistent hashing so weights applied correctly (see the sketch below).
- Disabled Redis cache for canary traffic by adding an `x-canary: true` header via `EnvoyFilter`.
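Consistent hashing lives in an Istio `DestinationRule`. A minimal sketch, with an illustrative cookie name:

```yaml
# Sketch: session affinity via Istio consistent hashing instead of NGINX sticky
# sessions, so canary weights still apply. Cookie name is illustrative.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payroll-calc
spec:
  host: payroll-calc
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpCookie:
          name: payroll-session
          ttl: 0s
```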
HPA and canary starvation
- Canaries at 1–5% get starved under burst load. We pinned min replicas for canary pods at 2 and set `maxUnavailable: 0` (see the sketch below).
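In Rollout terms that looks roughly like the fragment below. A sketch; the exact steps depend on your traffic router.

```yaml
# Sketch: keep the canary from being starved at low traffic weights.
# setCanaryScale pins canary replicas instead of sizing them off the 1% weight.
strategy:
  canary:
    maxUnavailable: 0        # never trade away stable capacity mid-rollout
    steps:
    - setCanaryScale:
        replicas: 2          # always run at least 2 canary pods
    - setWeight: 1
    - pause: {duration: 120}
```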
Migrations and reads
- All migrations use `expand-contract`; code reads both old and new fields during canary. Writes go only to the old fields until 50%.
Tracing drift
- Canary decisions depend on apples-to-apples telemetry. We froze tracing schema changes during the rollout window.
Auditors and change control
- Every promotion step was a PR that referenced the Rollout event history. Auditors loved the immutable trail.
People
- We trained on-call to trust auto-abort. The first time it rolled back at 5%, nobody paged. That bought credibility fast.
What Changed In 60 Days (Numbers, Not Vibes)
Shipping velocity improved and risk dropped materially. These are their actual aggregated metrics after 60 days across the top 6 services:
- Change failure rate: 18% → 3% (83% reduction)
- MTTR: 2h10m → 16m (−88%)
- Deploy frequency: 3/week → 10/day/service (with guardrails)
- On-call pages: 14/week → 5/week (−64%)
- Auto-aborted rollouts: 6 in first month (0 customer incidents)
- Error budget burn: 2.7x → 0.8x weekly target
Business impact:
- Zero missed payroll windows in the quarter. That saved an estimated $450k in make-goods and support costs.
- SOC 2 auditors signed off on change management without extra compensating controls.
- Product stopped batching features; marketing shipped two mid-cycle promotions without fear.
If you've ever defended an on-call budget to Finance, you know those numbers matter.
How To Roll This Out In Your Shop
If you’ve got Kubernetes, a mesh or L7 ingress, and any observability, you can get there without a platform rewrite.
Wire SLOs first
- Pick 2–3 SLIs per service (success rate, p95 latency, queue depth). Define budgets and alerts.
- Bake PromQL/Datadog queries now. Canaries without SLOs are theater.
Start with a single service
- Choose a high-traffic, stateless API with clear SLIs. Avoid the monolith first.
Introduce `Argo Rollouts` gradually
- Replace `Deployment` with `Rollout`. Keep steps tiny and holds short. Add one `AnalysisTemplate` at a time (see the sketch below).
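One low-friction way to do the swap is `workloadRef`, which lets a Rollout adopt an existing Deployment's pod template instead of duplicating the spec. A minimal sketch:

```yaml
# Sketch: introduce a Rollout that reuses an existing Deployment's pod template
# via workloadRef, so you don't copy/paste the spec while migrating.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: payroll-calc
spec:
  replicas: 12
  selector:
    matchLabels:
      app: payroll-calc
  workloadRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payroll-calc
  strategy:
    canary:
      steps:
      - setWeight: 5
      - pause: {duration: 120}
```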
Add feature flags for user-facing risk
- Use `LaunchDarkly` or `Unleash` with server-side SDKs. Separate deploy from release.
Treat the database like production-grade explosives
- `expand-contract`: backfill async, dual-read, switch writes late.
Automate rollback and notifications
- Hook Slack and PR status into the rollout controller. Humans should observe, not flip switches.
Rehearse failure
- Chaos test a canary that fails. Prove that aborts/rollbacks are fast and quiet.
You don't need service mesh religion to get value. Even `NGINX` Ingress with canary-by-header can take you far while you level up (see the sketch below).
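With the NGINX Ingress Controller, a canary is just a second Ingress with annotations. A sketch with illustrative hostnames and service names, assuming a primary Ingress already serves the same host and path:

```yaml
# Sketch: canary-by-header with the NGINX Ingress Controller. Requests carrying
# "X-Canary: always" hit the canary backend; add canary-weight later for splits.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: payroll-calc-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "X-Canary"
    # nginx.ingress.kubernetes.io/canary-weight: "5"
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /payroll
        pathType: Prefix
        backend:
          service:
            name: payroll-calc-canary
            port:
              number: 80
```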
What I’d Do Differently Next Time
Even with good outcomes, we had scars:
- We underestimated how often feature flags would be misused as config. We added a rule: flags must expire within 30 days or they get ripped out.
- We should’ve standardized SLI labels earlier. We lost a week reconciling tag drift between Prometheus and Datadog.
- We waited too long to canary the monolith’s read-heavy endpoints. With header-based canarying at the edge, it was perfectly doable.
Progressive delivery isn’t magic. It’s disciplined plumbing. That’s the GitPlumbers lane. When you wire it to SLOs and audit trails, you turn releases from cliff dives into curb steps.
Key takeaways
- Tie canaries to SLOs, not arbitrary thresholds.
- Use `Argo Rollouts` + mesh telemetry (`Istio`/`Linkerd`) for traffic shaping and analysis.
- Decouple code enablement from code deployment with feature flags (`LaunchDarkly`/`Unleash`).
- Automate rollback on error-budget burn, not pager fatigue.
- Treat database changes with `expand-contract`; never canary schema-incompatible writes.
- Bake progressive delivery into GitOps (`ArgoCD`) so every change is auditable for SOC 2/PCI.
Implementation checklist
- Define service-level SLOs and error budget burn alerts first.
- Instrument golden signals in `Prometheus` or `Datadog` with stable labels.
- Introduce `Argo Rollouts` canaries behind `Istio`/`NGINX` with traffic weights.
- Wire AnalysisTemplates to SLO queries; block promotion when budgets burn.
- Adopt feature flags for risky code paths; default OFF, ramp with targeted cohorts.
- Automate rollback and notifications via GitHub Actions/Slack; no manual heroics.
- Practice the database `expand-contract` pattern and rehearse rollbacks in staging.
- Make canaries boring: small steps, short holds, and clear abort criteria.
Questions we hear from teams
- Do we need a service mesh to do progressive delivery?
- No. A mesh like Istio or Linkerd makes weighted routing and telemetry easier, but you can start with NGINX Ingress canary-by-header or service-level splits. The key is objective SLO checks gating promotion.
- How do you handle database schema changes with canaries?
- Use expand-contract: add new columns/tables, backfill, dual-read, then switch writes late. Never ship a version that requires a schema the stable version can’t read. Tools like gh-ost or pt-online-schema-change help.
- What if our telemetry is split across Prometheus and Datadog?
- Pick the source where labels are stable per SLI and wire the AnalysisTemplates to that system. Consistency beats tool consolidation for this step. We often use Prometheus for success-rate and Datadog for latency.
- How do you satisfy SOC 2 or PCI change control with auto-rollbacks?
- GitOps. Every step (promotion, abort, rollback) is a PR or an event recorded by the controller. Auditors care about traceability and approval flows, which ArgoCD plus change requests can provide.
- What’s the typical time to first value?
- If your observability is decent, we usually see the first service running progressive delivery in 2–3 weeks and cross-cutting adoption in 6–8 weeks.
Ready to modernize your codebase?
Let GitPlumbers help you transform AI-generated chaos into clean, scalable applications.