The API Versioning Plan That Survives Real Clients (and Avoids a Breaking-Change Fire Drill)
A pragmatic, metrics-driven playbook for evolving APIs without detonating your biggest customers—or your on-call rotation.
The moment you realize “we can’t change that field” is your new product requirement
I’ve watched teams ship a “minor” response shape tweak on a Friday, only to spend the weekend learning which enterprise customer is still on a Java 8 SDK pinned to a 2019 model class. The postmortem always sounds the same: “But we bumped the version in the docs.”
Versioning isn’t documentation. Versioning is an operational contract. If you don’t treat it like one—with gates, telemetry, and a deprecation runway—you’ll end up with shadow clients, frozen schemas, and a permanent tax on delivery.
This guide assumes you already have production consumers and you want an approach that:
- Keeps backward compatibility as the default
- Makes breaking changes rare, explicit, and measurable
- Gives you proof points (dashboards + CI checks) that you’re not guessing
If you’re already in “v1 forever” hell (or dealing with AI-generated endpoints that don’t match the docs), GitPlumbers typically starts by building a compatibility test harness and an adoption dashboard before touching any production routing.
Pick a compatibility policy first (or you’ll bikeshed URI vs headers forever)
Before you argue about `/v2` paths vs `Accept:` headers, write down what your org considers breaking. If you skip this step, every PR becomes a debate.
A rubric that actually works in practice:
Breaking changes (require new major version or parallel endpoint):
- Removing an endpoint, field, or enum value
- Changing a field type (e.g., `string` → `object`)
- Changing semantics (e.g., `status=PAID` no longer means settled)
- Tightening validation (previously accepted payloads now rejected)
- Changing default sorting/pagination behavior
Non-breaking changes (allowed in-place):
- Adding optional fields
- Adding new endpoints
- Adding enum values if clients are required to ignore unknowns
- Adding new query params that default to old behavior
Checkpoint (policy):
- 1-pager: “What counts as breaking” approved by API owners + on-call lead
- Client guideline: “Ignore unknown fields; treat unknown enum values as `UNKNOWN`”
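That client guideline is cheap to implement. A minimal sketch in Python (field names and the `Order` shape are illustrative):

```python
from dataclasses import dataclass

KNOWN_STATUSES = {"PENDING", "PAID", "SHIPPED"}

@dataclass
class Order:
    id: str
    status: str

def parse_order(payload: dict) -> Order:
    """Tolerant reader: ignore unknown fields, map unknown enum values to UNKNOWN."""
    status = payload.get("status", "UNKNOWN")
    if status not in KNOWN_STATUSES:
        # Forward-compatible: a new server-side enum value won't crash this client.
        status = "UNKNOWN"
    return Order(id=payload["id"], status=status)

# A newer payload with extra fields and a new enum value still parses:
order = parse_order({"id": "o_1", "status": "REFUNDED", "riskScore": 0.2})
```

Clients written this way are the reason additive evolution works at all: the server can grow without coordinating every consumer.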
Metric to track:
- Breaking-change rate: `# of breaking changes / month` (goal: trending to near-zero)
Tooling suggestion:
- Use `Spectral` to enforce style + compatibility conventions at PR time.

```yaml
# .spectral.yaml
extends: ["spectral:oas"]
rules:
  no-unknown-enum-without-fallback:
    description: "Enums must include an UNKNOWN value for forward compatibility"
    given: "$.components.schemas..[?(@.enum)]"
    then:
      field: "enum"
      function: "schema"
      functionOptions:
        schema:
          type: array
          contains:
            const: "UNKNOWN"
```

Choose a versioning mechanism you can enforce at the edge
Here’s what I’ve seen “work in slides” vs “work at 2am”:
Option A: Version in the path (/v1/...)
Pros:
- Visible, cache-friendly, simple routing
- Easy to run dual-stack services
Cons:
- Versions leak into client code and bookmarks forever
Option B: Version in headers (Accept / custom header)
Common pattern:
```
Accept: application/vnd.acme.orders+json;version=2
```
Pros:
- Cleaner URLs, aligns with content negotiation
Cons:
- Harder to debug with curl/browser
- Some proxies and SDKs mangle headers
Option C: “No versions, only compatibility”
This is the “Stripe-style” dream. It can work if you have:
- A strict compatibility policy
- Strong client update control
- Heavy contract testing
Most orgs don’t have that maturity on day one.
What actually works for most teams:
- Major versions in the path (`/v1`, `/v2`) for routing and operations
- Minor evolution via additive changes within a major version
Checkpoint (decision):
- One standard across the org (don’t let each team invent their own)
- Gateway routing supports running `v1` and `v2` simultaneously
Example: NGINX routing with explicit upstreams:

```nginx
location /v1/ {
    proxy_pass http://orders-v1/;
}
location /v2/ {
    proxy_pass http://orders-v2/;
}
```

Metric to track:
- Traffic share by version (you can’t deprecate what you can’t measure)
Design for additive change (and stop breaking clients with “small” refactors)
Most “accidental breaking changes” come from the same handful of mistakes:
- Renaming JSON fields because “it reads better”
- Tightening validation without a migration path
- Reusing fields with new semantics
Concrete tactics that keep you shipping:
- Never rename a field in place. Add the new field, keep the old, deprecate later.
- Prefer new endpoints over overloading semantics. `/v1/orders/{id}/cancel` beats `status=CANCELLED` with hidden rules.
- Treat enums as open sets. Clients must tolerate unknown values.
- Make pagination stable. Cursor-based pagination is more version-resistant than offset.
- Use idempotency keys for write APIs. Backward compatibility includes retries.
OpenAPI example: additive field with clear deprecation markers:

```yaml
openapi: 3.1.0
info:
  title: Orders API
  version: "1.7.0"
paths:
  /v1/orders/{id}:
    get:
      responses:
        "200":
          description: OK
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Order"
components:
  schemas:
    Order:
      type: object
      required: [id, status]
      properties:
        id:
          type: string
        status:
          type: string
          enum: [PENDING, PAID, SHIPPED, UNKNOWN]
        totalCents:
          type: integer
          deprecated: true
          description: "Use totalMoney instead. Will be removed in v2."
        totalMoney:
          type: object
          properties:
            amount:
              type: string
            currency:
              type: string
```

Checkpoint (API review):
- Every change classified: additive vs breaking
- Every deprecation includes: replacement, target removal version, and date
Metrics to track:
- Client-impacting error rate: increase in `4xx` after deploy, segmented by version
- p95 latency by version (v2 shouldn’t be “new and slower”)
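On the consumer side, the `totalCents` → `totalMoney` migration above is a read-both pattern: prefer the new field, fall back to the deprecated one, so the same client build works against old and new servers. A sketch (the USD fallback is an assumption about the legacy field's semantics):

```python
from decimal import Decimal

def order_total(payload: dict) -> tuple[Decimal, str]:
    """Prefer the new totalMoney field; fall back to the deprecated totalCents."""
    money = payload.get("totalMoney")
    if money is not None:
        return Decimal(money["amount"]), money["currency"]
    # Assumption for illustration: legacy integer-cent totals were always USD.
    return Decimal(payload["totalCents"]) / 100, "USD"
```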
Put compatibility gates in CI: spec diff + contract tests + fuzz
If your compatibility strategy relies on humans noticing a subtle change in a 2,000-line OpenAPI file… I’ve seen that fail. Repeatedly.
A CI pipeline that catches the real problems:
1) OpenAPI breaking-change detection
Use oasdiff (or openapi-diff) to block breaking changes on main for a given major version.
```bash
# compare base branch spec to PR spec; exit non-zero on breaking changes
oasdiff breaking ./openapi-base.yaml ./openapi-pr.yaml --fail-on ERR
```

2) Lint conventions (Spectral)
```bash
npx @stoplight/spectral-cli lint openapi.yaml
```

3) Consumer-driven contract tests (Pact)
If you have top-tier consumers (mobile app, partner integrations, internal platform SDKs), treat their expectations as first-class.
- Producers publish pacts
- Consumers verify against producer builds
- Breaking changes get caught before merge
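If standing up a Pact broker is too much on day one, start with the core idea: a consumer-owned expectation that the producer's build must satisfy. A deliberately minimal sketch (contract shape and field names are illustrative, not Pact's API):

```python
def verify_contract(response: dict, consumer_needs: dict) -> list[str]:
    """Return violations: fields the consumer relies on that are missing
    or have the wrong JSON type in the producer's response."""
    violations = []
    for field, expected_type in consumer_needs.items():
        if field not in response:
            violations.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            violations.append(f"wrong type for {field}")
    return violations

# The mobile app's expectations of GET /v1/orders/{id} (illustrative):
MOBILE_CONTRACT = {"id": str, "status": str, "totalCents": int}
```

Run this in the producer's CI against a real response fixture and you already catch the two most common breakages (removed fields, type changes) before merge; Pact adds the broker, versioned pacts, and request matching on top.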
4) Runtime property testing (Schemathesis)
This catches the “docs say one thing, implementation does another” drift.
```bash
schemathesis run http://localhost:8080/openapi.yaml \
  --checks all \
  --hypothesis-max-examples 200
```

Checkpoint (CI):
- PR cannot merge if a breaking diff is detected for the current major version
- Pact verification required for top N consumers
- Schemathesis run in nightly (or per-release) against staging
Metrics to track:
- Escaped breaking changes: breaking diffs that reached production (goal: 0)
- Spec/impl drift incidents per quarter
Run versions in parallel, route at the gateway, and migrate with telemetry
The cleanest migrations I’ve seen look boring:
- `v2` exists alongside `v1`
- Traffic shifts gradually
- Old version is shut off on purpose, not by accident
Step-by-step migration plan
- Ship v2 behind the gateway (no public announcement yet)
- Canary internal clients (or a friendly partner)
- Measure SLOs per version (error rate, latency, saturation)
- Publish a migration guide + SDK updates
- Add deprecation headers on v1
- Throttle new signups to v1 (soft pressure)
- Sunset v1 when adoption hits your threshold
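Step 5 should be a one-line concern in middleware, not per-endpoint code. A framework-agnostic sketch over plain header dicts (the path check and constants are assumptions about your setup):

```python
SUNSET = "Wed, 31 Dec 2026 23:59:59 GMT"
MIGRATION_GUIDE = "https://api.acme.com/docs/migrate-to-v2"

def add_deprecation_headers(path: str, headers: dict) -> dict:
    """Stamp v1 responses with Deprecation/Sunset/Link so SDKs and
    log scrapers can detect the deprecated surface mechanically."""
    if path.startswith("/v1/"):
        headers["Deprecation"] = "true"
        headers["Sunset"] = SUNSET
        headers["Link"] = f'<{MIGRATION_GUIDE}>; rel="deprecation"'
    return headers
```

Doing it centrally means you can't forget an endpoint, and the headers become a machine-readable migration signal for clients that log them.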
Deprecation headers example:
```http
HTTP/1.1 200 OK
Deprecation: true
Sunset: Wed, 31 Dec 2026 23:59:59 GMT
Link: <https://api.acme.com/docs/migrate-to-v2>; rel="deprecation"
```

Gateway routing example (Envoy) for explicit version prefixes:

```yaml
virtual_hosts:
  - name: orders
    domains: ["api.acme.com"]
    routes:
      - match: { prefix: "/v1/orders" }
        route: { cluster: orders_v1 }
      - match: { prefix: "/v2/orders" }
        route: { cluster: orders_v2 }
```

Checkpoint (operational readiness):
- Separate dashboards and alerts per version
- Separate rollbacks per version (don’t tie v1/v2 to the same deploy)
Metrics to track (minimum viable):
- Traffic share by version (requests/min)
- 4xx and 5xx rate by version
- p95 latency by version
- Adoption slope: % migrated per week
- Deprecation burn-down: # clients remaining on v1 (tracked via API keys or auth claims)
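The adoption math is simple enough to keep honest in code. A sketch over per-version request counters (where the counts come from is an assumption — Prometheus, gateway logs, or usage plans all work):

```python
def traffic_share(counts: dict[str, int]) -> dict[str, float]:
    """Requests per version -> each version's share of total traffic."""
    total = sum(counts.values()) or 1
    return {version: n / total for version, n in counts.items()}

def ready_to_sunset(counts: dict[str, int], old: str = "v1", threshold: float = 0.01) -> bool:
    """Gate the shutdown decision on measured share, not gut feel."""
    return traffic_share(counts).get(old, 0.0) < threshold
```

Wire the same threshold into an alert so the sunset date is a dashboard fact, not a meeting debate.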
Tooling suggestions:
- `Prometheus` + `Grafana` dashboards with a `version` label
- `OpenTelemetry` traces with `http.route` including `/v1` vs `/v2`
- `Kong` / `Apigee` / `AWS API Gateway` usage plans to identify top laggards
Deprecation is a product: set budgets, dates, and escalation paths
If you don’t put constraints around old versions, you end up supporting them forever. I’ve seen “temporary” v1 endpoints still running during a cloud migration five years later because nobody owned the shutdown.
A deprecation process that doesn’t rely on vibes:
- Define support windows
- Example: “Major versions supported for 24 months; security fixes for 30 months.”
- Require explicit sunset dates for new majors
- Publish customer comms templates (email + docs + changelog)
- Create an escalation ladder
- 90 days: warn
- 30 days: targeted outreach to top consumers
- 7 days: require exception approval to stay on old version
- Budget migration engineering time
- If you don’t fund migration help, you’re choosing permanent legacy.
Checkpoint (governance):
- A single owner (role, not a person) for version lifecycle
- Quarterly review: which versions can be retired this quarter
Metrics to track:
- Time-to-deprecate: major release → old major retired
- Exception count: # clients granted extended support (this is hidden cost)
What GitPlumbers does when versioning is already messy
When we get called in, the common pattern is: multiple versions, inconsistent routing, AI-assisted code changes that didn’t update the OpenAPI, and no one can answer “who is still on v1?”
The fastest path to sanity usually looks like:
- Build an adoption dashboard (by API key, tenant, or auth claim)
- Add CI breaking-change gates (`oasdiff` + Spectral)
- Stand up dual routing at the gateway and canary safely
- Add contract tests for your top consumers
Internal links:
- API and legacy rescue: https://gitplumbers.com/services/api-rescue
- Code rescue for AI-generated code: https://gitplumbers.com/services/vibe-code-cleanup
- Case studies: https://gitplumbers.com/case-studies
If you want a second set of eyes, bring us one service’s OpenAPI spec, your gateway config, and a list of top consumers. We can usually tell in a week whether you’re versioning safely or just hoping.
Next step: run oasdiff against your last 20 merges and see how many “breaking” changes slipped through. That number is your starting line.
Key takeaways
- Versioning is a policy problem first: define what counts as “breaking” and enforce it with tooling.
- Prefer additive evolution; when you must break, route versions at the edge and run both in parallel long enough to measure adoption.
- Track version adoption, error rates by version, and breaking-change rate—treat them like SLOs, not documentation.
- Automate compatibility gates in CI using OpenAPI diffs + consumer contracts (Pact) + runtime probes (Schemathesis).
- Deprecation without telemetry is fantasy; build dashboards and budgets for migration time.
Implementation checklist
- Define a “breaking change” rubric and publish it internally (and ideally to customers).
- Choose a versioning surface area (path vs header vs media type) and standardize it across services.
- Add CI gates: `oasdiff`/`openapi-diff` + Spectral rules for compatibility.
- Implement consumer-driven contract tests (Pact) for top consumers.
- Instrument metrics: traffic share by version, 4xx/5xx by version, p95 latency by version, deprecation burn-down.
- Implement deprecation headers and a predictable sunset process (with dates).
- Run dual-stack (v1 + v2) behind the gateway and canary traffic until SLOs hold.
- Delete old versions intentionally—after your dashboards say it’s safe.
Questions we hear from teams
- Should we use semantic versioning for APIs?
- Use SemVer concepts, but be explicit: most HTTP APIs only meaningfully support **major** (breaking) and **minor** (additive) evolution. The important part is enforcing what “breaking” means with automated diffs and contracts, not arguing about whether a change is 1.6.3 vs 1.7.0.
- Is header-based versioning better than `/v1` paths?
- Header/media-type versioning is elegant, but path-based majors are easier to route, debug, and operate across gateways and caches. For most orgs with multiple clients and imperfect tooling, `/v1` and `/v2` is the least surprising option.
- How do we know when it’s safe to shut off v1?
- When you can prove (1) traffic share for v1 is below a threshold you set (often <1–5%), (2) top tenants have migrated, and (3) v2 meets SLOs. If you can’t identify remaining clients by key/tenant, you’re not ready to sunset—add that telemetry first.
- What’s the fastest win if we’re already breaking clients accidentally?
- Add an OpenAPI breaking-change gate (`oasdiff -fail-on=breaking`) to CI for the current major version, and start tracking 4xx/5xx by version. That alone prevents a lot of “oops” releases and gives you hard data for prioritizing cleanup.
Ready to modernize your codebase?
Let GitPlumbers help you transform AI-generated chaos into clean, scalable applications.
