Make WCAG 2.2 AA a Build Breaker: ARIA as Code, Evidence on Every Commit

Stop treating accessibility like an afterthought. Turn WCAG 2.2 AA and ARIA into acceptance criteria, guardrails, and automated proofs that ship fast without leaking regulated data.

Turn accessibility into a gate and a graph: it blocks broken builds and proves compliance on every commit.

The a11y audit that stopped our launch

Three days before a healthcare client’s GA, a third-party audit flagged 11 critical WCAG failures, including focused elements hidden behind sticky headers, form fields without programmatic labels, and a CAPTCHA-only login. Legal froze the rollout. Product was furious. Engineering felt blindsided. I’ve seen this movie at FinTechs, GovTech vendors, and FAANG teams with better PR than process. The fixes weren’t rocket science; the failure was process. Accessibility was a lint note, not a gate.

What finally worked: we made WCAG 2.2 AA and ARIA non-negotiable acceptance criteria. We translated policy into guardrails in the design system, checks in CI, and automated proofs attached to every release. We shipped the patch in two sprints, passed re-audit, and avoided the consent decree. Here’s the playbook we now run at GitPlumbers when teams want speed without waking up legal.

Turn policy into acceptance criteria you can test

Guidelines don’t ship software. Acceptance criteria do. Make WCAG 2.2 AA explicit in tickets and your Definition of Done.

  • Ticket template add-on: include programmatic names, roles, and keyboard behavior.
  • Design handoff: annotate focus order, visible focus state, hit target size, error text, and accessible name.
  • DoD (Definition of Done): Storybook a11y pass + component-level jest-axe pass + page-level axe/pa11y pass.

Example acceptance criteria (paste into Jira/Linear):

Feature: Checkout form
  Scenario: Accessible input fields
    Given a user relies on a screen reader
    When the page loads
    Then each input has a programmatic label via <label for> or aria-labelledby
    And required fields expose aria-required="true"
    And error messages are associated with aria-describedby
    And keyboard tab order follows visual order
    And focus is visible and not obscured by sticky headers (WCAG 2.2 2.4.11)

Definition of Done snippet:

  • eslint-plugin-jsx-a11y passes with no errors
  • Component Storybook shows focus styles and keyboard behavior
  • jest-axe has zero serious violations on component render
  • Page e2e axe scan has zero critical violations
  • Authentication offers an alternative to cognitive function tests such as CAPTCHA puzzles (WCAG 2.2 3.3.8 at AA; 3.3.9 is the stricter AAA version)

Guardrails in code: design system first, feature code second

You won’t lint your way out of inaccessible components. Put guardrails in the design system so product teams can’t foot-gun.

  • ESLint + JSX a11y: errors, not warnings; require accessible names.
  • Design tokens: focus ring color/width as tokens; never ship outline: none without a visible replacement.
  • Components: interactive elements render correct roles, keyboard handlers, and ARIA defaults.
  • Storybook a11y: every component has stories that demo keyboard navigation and focus.

ESLint config:

{
  "extends": ["next/core-web-vitals", "plugin:jsx-a11y/recommended"],
  "plugins": ["jsx-a11y"],
  "rules": {
    "jsx-a11y/anchor-is-valid": "error",
    "jsx-a11y/no-autofocus": "error",
    "jsx-a11y/aria-props": "error",
    "jsx-a11y/aria-roles": "error",
    "jsx-a11y/interactive-supports-focus": "error",
    "jsx-a11y/control-has-associated-label": ["error", { "depth": 3 }]
  }
}

Accessible button/link component (TypeScript + React):

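import React from 'react';

// Discriminated union: anchor props are never spread onto a <button>, or the reverse.
type ButtonProps = React.ButtonHTMLAttributes<HTMLButtonElement> & { as?: 'button' };
type LinkProps = React.AnchorHTMLAttributes<HTMLAnchorElement> & { as: 'a'; href: string };
type Props = (ButtonProps | LinkProps) & { loading?: boolean };

export function PrimaryAction({ loading, children, ...props }: Props) {
  const common = {
    className: 'btn btn-primary focus:outline-4 focus:outline-offset-2',
    // aria-busy marks the loading state; announce progress from a separate status region.
    'aria-busy': loading || undefined,
  } as const;

  if (props.as === 'a') {
    const { as: _as, ...rest } = props;
    // A real link keeps its native role; role="button" would promise Space-key activation.
    return (
      <a {...common} {...rest}>
        {children}
      </a>
    );
  }
  const { as: _as, ...rest } = props;
  return (
    <button type="button" {...common} {...rest}>
      {children}
    </button>
  );
}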

Storybook a11y addon:

// .storybook/main.ts
import type { StorybookConfig } from '@storybook/react-vite';
const config: StorybookConfig = {
  addons: ['@storybook/addon-a11y'],
};
export default config;

Jest + jest-axe for component proofs:

import { render } from '@testing-library/react';
import { axe, toHaveNoViolations } from 'jest-axe';
import { PrimaryAction } from './PrimaryAction';

// Register the matcher so a failure prints the violated rules and offending nodes.
expect.extend(toHaveNoViolations);

test('PrimaryAction has no a11y violations', async () => {
  const { container } = render(<PrimaryAction>Go</PrimaryAction>);
  expect(await axe(container)).toHaveNoViolations();
});

CI that proves compliance: scans, gates, and artifacts

Accessibility isn’t “done” until CI proves it. Run checks fast, fail fast, and keep the evidence.

  • Static: ESLint a11y rules on PRs.
  • Component: jest-axe in unit tests.
  • Page: axe-core or pa11y-ci in headless browser.
  • E2E: Playwright or Cypress with axe injection.
  • Artifacts: upload JSON/HTML reports and link them to releases.

GitHub Actions example:

name: a11y-checks
on: [pull_request]
jobs:
  a11y:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      - run: npm ci
      - name: Lint (a11y)
        run: npx eslint "**/*.{ts,tsx,js,jsx}" --max-warnings=0
      - name: Unit a11y (jest-axe)
        run: npx jest --runInBand
      - name: Build and start
        run: |
          npm run build
          npm run start &
          npx wait-on http://localhost:3000
      - name: pa11y-ci
        run: npx pa11y-ci --reporter json --sitemap http://localhost:3000/sitemap.xml > a11y-report.json
      - name: Playwright axe sweep
        run: |
          npx playwright install --with-deps
          node scripts/axe-playwright.js > axe-results.json
      - name: Upload artifacts
        if: always() # keep the evidence even when a scan step fails
        uses: actions/upload-artifact@v4
        with:
          name: a11y-${{ github.sha }}
          path: |
            a11y-report.json
            axe-results.json

Minimal Playwright + axe example:

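// scripts/axe-playwright.js (ts-node ok too)
const { chromium } = require('playwright');
const { AxeBuilder } = require('@axe-core/playwright');

(async () => {
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    await page.goto('http://localhost:3000');
    // Request the WCAG 2.2 AA tag explicitly so rules such as target-size are included.
    // Note: restricting by tag also skips axe's best-practice rules.
    const results = await new AxeBuilder({ page })
      .withTags(['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa', 'wcag22aa'])
      .analyze();
    console.log(JSON.stringify(results));
    if (results.violations.some(v => v.impact === 'critical' || v.impact === 'serious')) {
      console.error('Critical/serious a11y violations found');
      // Set the exit code instead of calling exit() so stdout flushes and the browser closes.
      process.exitCode = 1;
    }
  } finally {
    await browser.close();
  }
})();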

Store these artifacts forever (S3/GCS). When auditors ask, you have machine-readable proof tied to commits and releases.
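If you want the same pass/fail policy applied to every report (Playwright, pa11y, Storybook), it can help to pull the gate into one small function. What follows is a minimal sketch, not part of axe-core: `gate`, `AxeViolation`, and the `blockOn` parameter are hypothetical names, and the input is assumed to be the `violations` array from an axe results object.

```typescript
// Only the fields this sketch reads from an axe-core violation entry.
type Impact = 'minor' | 'moderate' | 'serious' | 'critical';
interface AxeViolation { id: string; impact?: Impact | null; nodes: unknown[] }

interface GateResult {
  counts: Record<Impact, number>; // affected nodes per impact level
  blocking: string[];             // rule ids that should fail the build
  pass: boolean;
}

// `blockOn` is our policy knob (an assumption of this sketch), not an axe option.
function gate(violations: AxeViolation[], blockOn: Impact[] = ['critical', 'serious']): GateResult {
  const counts: Record<Impact, number> = { minor: 0, moderate: 0, serious: 0, critical: 0 };
  const blocking: string[] = [];
  for (const v of violations) {
    if (v.impact == null) continue; // no impact reported: counted as non-blocking here
    counts[v.impact] += v.nodes.length;
    if (blockOn.includes(v.impact)) blocking.push(v.id);
  }
  return { counts, blocking, pass: blocking.length === 0 };
}
```

Writing the returned object next to the raw report gives reviewers a short summary without opening the full JSON.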

WCAG 2.2 AA deltas you’ll miss if you’re not looking

Everyone checks alt text. Fewer teams verify new 2.2 criteria. Bake these into tests:

  • Focus Not Obscured (2.4.11 at AA, 2.4.12 at AAA): sticky headers, fixed footers, modals. AA requires that the focused element is not entirely hidden; AAA requires that it is fully visible.
  • Dragging Movements (2.5.7): must have a non-drag alternative. Provide click/keyboard increments.
  • Target Size (2.5.8): interactive targets ≥ 24x24 CSS pixels or sufficient spacing.
  • Consistent Help (3.2.6): help is in the same relative location across pages.
  • Redundant Entry (3.3.7): don’t force retyping persistent data.
  • Accessible Authentication (3.3.8 at AA, 3.3.9 at AAA): don’t make a cognitive function test, such as a text puzzle, the only way to log in; offer an alternative.
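For Focus Not Obscured, it helps to be explicit about which level a test enforces: 2.4.11 (AA) fails only when the focused element is entirely hidden, while 2.4.12 (AAA) fails when any part is hidden. Below is a minimal sketch of that classification, assuming you already have bounding rectangles for the focused element and for fixed overlays such as a sticky header. `Area` and `classifyObscured` are illustrative names, and overlays are checked one at a time rather than as a union.

```typescript
interface Area { top: number; left: number; bottom: number; right: number }
type Obscured = 'none' | 'partial' | 'full';

// Area of the intersection of two rectangles, or 0 when they do not overlap.
function overlapArea(a: Area, b: Area): number {
  const w = Math.min(a.right, b.right) - Math.max(a.left, b.left);
  const h = Math.min(a.bottom, b.bottom) - Math.max(a.top, b.top);
  return w > 0 && h > 0 ? w * h : 0;
}

// 'full' would fail 2.4.11 (AA); 'partial' passes AA but would fail 2.4.12 (AAA).
// Simplification: each overlay is tested on its own, so two overlays that
// together cover the element are reported as 'partial'.
function classifyObscured(focused: Area, overlays: Area[]): Obscured {
  const total = (focused.right - focused.left) * (focused.bottom - focused.top);
  let worst: Obscured = 'none';
  for (const o of overlays) {
    const covered = overlapArea(focused, o);
    if (total > 0 && covered >= total) return 'full';
    if (covered > 0) worst = 'partial';
  }
  return worst;
}
```

In a browser test the rectangles would come from getBoundingClientRect() on the active element and on each fixed or sticky element.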

Playwright assertions for focus and target size:

import { test, expect } from '@playwright/test';

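test('focused element is not obscured', async ({ page }) => {
  await page.goto('/checkout'); // relative URL: needs baseURL in the Playwright config
  await page.keyboard.press('Tab');
  const box = await page.evaluate(() => {
    const el = document.activeElement as HTMLElement;
    const r = el.getBoundingClientRect();
    const inViewport = r.top >= 0 && r.left >= 0 && r.bottom <= window.innerHeight && r.right <= window.innerWidth;
    // Hit-test the center: a sticky header covering it is returned instead of el.
    const top = document.elementFromPoint(r.left + r.width / 2, r.top + r.height / 2);
    const unobscured = top !== null && (top === el || el.contains(top));
    return { inViewport, unobscured, tag: el.tagName };
  });
  expect(box.inViewport).toBeTruthy();
  expect(box.unobscured).toBeTruthy();
});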

test('interactive targets meet minimum size', async ({ page }) => {
  await page.goto('/account');
  const buttons = page.locator('button, [role="button"], a[role="button"]');
  const sizes = await buttons.evaluateAll((els) => els.map(el => {
    const r = (el as HTMLElement).getBoundingClientRect();
    return { w: r.width, h: r.height };
  }));
  for (const { w, h } of sizes) {
    // 2.5.8 needs both dimensions at 24 CSS px or more (the spacing exception is not checked here).
    expect(w >= 24 && h >= 24).toBeTruthy();
  }
});
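The size assertion above ignores the "sufficient spacing" half of 2.5.8: an undersized target still passes if a 24 CSS pixel circle centered on it does not intersect another target or another undersized target's circle. Here is a rough sketch of that exception as plain geometry, assuming you have collected bounding boxes for every target on the page. `Box` and `failingTargets` are hypothetical names, and the other 2.5.8 exceptions (inline links, user-agent controls, essential presentation) are not modeled.

```typescript
interface Box { x: number; y: number; width: number; height: number }

const MIN_TARGET = 24; // CSS pixels, from WCAG 2.2 SC 2.5.8

const isUndersized = (b: Box): boolean => b.width < MIN_TARGET || b.height < MIN_TARGET;
const centerOf = (b: Box) => ({ cx: b.x + b.width / 2, cy: b.y + b.height / 2 });

// Distance from a point to the nearest point of a box (0 when the point is inside).
function distanceToBox(cx: number, cy: number, b: Box): number {
  const dx = Math.max(b.x - cx, 0, cx - (b.x + b.width));
  const dy = Math.max(b.y - cy, 0, cy - (b.y + b.height));
  return Math.hypot(dx, dy);
}

// Indexes of targets that are undersized AND fail the spacing exception.
function failingTargets(targets: Box[]): number[] {
  const failures: number[] = [];
  targets.forEach((t, i) => {
    if (!isUndersized(t)) return;
    const { cx, cy } = centerOf(t);
    const crowded = targets.some((other, j) => {
      if (i === j) return false;
      // The 12px-radius circle around t must not reach the other target's box...
      if (distanceToBox(cx, cy, other) < MIN_TARGET / 2) return true;
      // ...nor overlap the circle around another undersized target.
      if (!isUndersized(other)) return false;
      const o = centerOf(other);
      return Math.hypot(cx - o.cx, cy - o.cy) < MIN_TARGET;
    });
    if (crowded) failures.push(i);
  });
  return failures;
}
```

Two adjacent 16x16 icon buttons fail this check; the same icons with a 16px gap between them pass.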

For dragging alternatives, add keyboard handlers:

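{/* value and setValue come from useState; the aria-label text is a placeholder */}
<div role="slider" aria-label="Quantity" aria-valuemin={0} aria-valuemax={100}
     aria-valuenow={value} tabIndex={0}
     onKeyDown={(e) => {
       if (e.key === 'ArrowRight') setValue(v => Math.min(100, v + 1));
       if (e.key === 'ArrowLeft') setValue(v => Math.max(0, v - 1));
     }} />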
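The ARIA Authoring Practices slider pattern also covers Up/Down arrows, Home, End, and Page Up/Page Down. One way to keep that logic testable is a pure function the onKeyDown handler calls. This is a sketch under assumptions: `nextSliderValue`, `SliderRange`, and `pageStep` are illustrative names, not part of React or any slider library.

```typescript
interface SliderRange { min: number; max: number; step: number; pageStep: number }

// Returns the slider value after a key press; unknown keys leave it unchanged.
function nextSliderValue(value: number, key: string, r: SliderRange): number {
  const clamp = (n: number): number => Math.min(r.max, Math.max(r.min, n));
  switch (key) {
    case 'ArrowRight':
    case 'ArrowUp':
      return clamp(value + r.step);
    case 'ArrowLeft':
    case 'ArrowDown':
      return clamp(value - r.step);
    case 'PageUp':
      return clamp(value + r.pageStep);
    case 'PageDown':
      return clamp(value - r.pageStep);
    case 'Home':
      return r.min;
    case 'End':
      return r.max;
    default:
      return value;
  }
}
```

In the component, the handler becomes `setValue(v => nextSliderValue(v, e.key, range))`, and the key map can be unit-tested without rendering anything.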

Keep regulated data out of your a11y pipeline (and stay fast)

You can ship fast and stay compliant if you design your test data and logs like an adult. We’ve cleaned up too many pipelines that screenshotted real PHI/PII.

  • Synthetic accounts: seed fixtures; never test a11y with production data.
  • Network isolation: block calls to prod; stub third parties.
  • Screenshot redaction: blackout PII before saving.
  • Log redaction: structured logging; drop sensitive fields.

Cypress config for isolation and redaction:

// cypress/plugins/index.js (Cypress 9 and earlier; in 10+ register these in setupNodeEvents in cypress.config.js)
module.exports = (on, config) => {
  on('before:browser:launch', (browser = {}, launchOptions) => {
    launchOptions.args.push('--disable-dev-shm-usage');
    return launchOptions;
  });
  on('after:screenshot', (details) => {
    // Hook to run an ImageMagick mask if needed
  });
};

// cypress/support/e2e.ts
beforeEach(() => {
  cy.intercept('**/api/**', (req) => {
    if (req.url.includes('prod.example.com')) {
      throw new Error('Prod API blocked in tests');
    }
  });
  cy.intercept('POST', '/login', { fixture: 'login-success.json' });
});

Cypress.Screenshot.defaults({
  blackout: ['[data-test="pii"]', '.ssn', '.dob']
});

Masking logs with pino:

import pino from 'pino';
const redact = {
  paths: ['req.headers.authorization', 'user.ssn', 'user.dob', 'payment.cardNumber'],
  censor: '[REDACTED]'
};
export const logger = pino({ level: 'info', redact });

Seed synthetic data on boot:

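#!/usr/bin/env bash
# seed.sh: load synthetic fixtures and stop on the first SQL error
set -euo pipefail
psql "$DATABASE_URL" -v ON_ERROR_STOP=1 -f seeds/synthetic.sql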
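If you would rather generate fixtures than hand-write SQL, a seeded generator keeps runs reproducible and makes the data obviously fake. The sketch below is illustrative only: `makeAccounts` and `SyntheticAccount` are hypothetical names, the field set is an assumption about your schema, and the values lean on conventions that cannot collide with real people (the reserved `.test` domain, and SSN area number 000, which is never issued).

```typescript
interface SyntheticAccount { id: string; email: string; ssn: string; dob: string }

// mulberry32: a small deterministic PRNG, so the same seed yields the same fixtures.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function makeAccounts(count: number, seed = 1): SyntheticAccount[] {
  const rand = mulberry32(seed);
  const digits = (n: number): string =>
    Array.from({ length: n }, () => Math.floor(rand() * 10)).join('');
  return Array.from({ length: count }, (_, i) => ({
    id: `synthetic-${i + 1}`,
    email: `user${i + 1}@example.test`,                // .test is a reserved TLD
    ssn: `000-${digits(2)}-${digits(4)}`,              // area 000 is never a real SSN
    dob: `${1950 + Math.floor(rand() * 50)}-01-01`,    // synthetic birth year, fixed day
  }));
}
```

Serializing the result to a JSON fixture (or to INSERT statements) is left to your seed script.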

Make evidence boring: artifacts, traceability, and VPATs

Auditors don’t want poetry; they want proofs tied to builds.

  1. Produce: machine-readable a11y reports (axe JSON, pa11y JSON, HTML diff).
  2. Persist: store in S3 a11y/${env}/${sha}/ and link to Git tag.
  3. Expose: link artifact URLs in release notes and change requests.
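The Persist step is easier to audit when every job derives storage keys the same way. A small sketch of that idea follows, assuming the `a11y/${env}/${sha}/` layout above. `buildManifest` and its types are hypothetical, and the manifest format is an illustration rather than any standard.

```typescript
interface EvidenceInput { env: string; sha: string; tag?: string; files: string[] }
interface EvidenceManifest {
  prefix: string;                          // object-store prefix for this build
  tag: string | null;                      // Git tag the evidence belongs to, if any
  objects: { file: string; key: string }[];
}

// Builds the storage keys for one build's reports; rejects values that are not a hex SHA.
function buildManifest({ env, sha, tag, files }: EvidenceInput): EvidenceManifest {
  if (!/^[0-9a-f]{7,40}$/.test(sha)) throw new Error(`not a git sha: ${sha}`);
  const prefix = `a11y/${env}/${sha}/`;
  return {
    prefix,
    tag: tag ?? null,
    objects: files.map(file => ({ file, key: prefix + file })),
  };
}
```

Uploading the manifest alongside the reports gives release notes a single URL to link to.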

GitHub Actions to push to S3:

- name: Upload to S3
  if: always()
  env:
    AWS_REGION: us-east-1
  run: |
    aws s3 cp a11y-report.json s3://com-acme-ci/a11y/$GITHUB_SHA/a11y-report.json
    aws s3 cp axe-results.json s3://com-acme-ci/a11y/$GITHUB_SHA/axe-results.json

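SARIF to Security tab (nice for dashboards). @axe-core/cli saves JSON, so convert it to SARIF before uploading; the converter shown is the axe-sarif-converter package, and inside a workflow the github/codeql-action/upload-sarif action is an alternative to the API call:

npx @axe-core/cli http://localhost:3000 --save axe.json
npx axe-sarif-converter --input-files axe.json --output-file axe.sarif
gh api --method POST "repos/{owner}/{repo}/code-scanning/sarifs" -f commit_sha="$GITHUB_SHA" -f ref="$GITHUB_REF" -f sarif="$(gzip -c axe.sarif | base64 -w0)"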

If you need a VPAT, generate it from your component inventory and test outputs; we’ve automated this mapping for clients so updates are incremental, not a scramble.

Rollout plan and metrics that actually move the needle

Don’t boil the ocean. Sequence matters.

  1. Week 1–2: Baseline
    • Turn on eslint-plugin-jsx-a11y (errors) and jest-axe for the design system.
    • Add Storybook a11y addon; document focus and keyboard in stories.
    • Run pa11y on top 10 user journeys; log violations by severity.
  2. Week 3–4: Make it a gate
    • Add Playwright axe sweep in CI; fail on critical/serious.
    • Start artifact storage; link from PR checks and release notes.
    • Add acceptance-criteria templates to tickets.
  3. Week 5–6: Close 2.2 gaps
    • Focus not obscured fixes (sticky headers, dialogs).
    • Add non-drag interactions; enforce target size via tokens.
  4. Steady state
    • Violations SLO: 0 critical/serious on main; MTTR < 3 days for new a11y defects.
    • Evidence: 100% releases with artifacts attached.

KPIs we track with clients:

  • % pages/components with zero critical/serious violations
  • Time to fix a11y regression (MTTR)
  • % of PRs with a11y artifacts
  • Unassisted keyboard coverage in e2e tests
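The first two KPIs fall straight out of data you already collect. Here is a minimal sketch, assuming one summary row per scanned page and one record per a11y defect with ISO 8601 timestamps. `PageScan`, `Defect`, `cleanPercent`, and `mttrDays` are illustrative names, not an existing API.

```typescript
interface PageScan { url: string; critical: number; serious: number }
interface Defect { openedAt: string; fixedAt?: string } // ISO 8601 timestamps

const MS_PER_DAY = 86400000;

// Percentage of scanned pages with zero critical and zero serious violations.
function cleanPercent(scans: PageScan[]): number | null {
  if (scans.length === 0) return null; // nothing scanned: no meaningful percentage
  const clean = scans.filter(s => s.critical === 0 && s.serious === 0).length;
  return (clean / scans.length) * 100;
}

// Mean time to repair in days, over defects that have been fixed.
function mttrDays(defects: Defect[]): number | null {
  let totalMs = 0;
  let fixed = 0;
  for (const d of defects) {
    if (!d.fixedAt) continue; // still open: excluded from MTTR
    totalMs += Date.parse(d.fixedAt) - Date.parse(d.openedAt);
    fixed += 1;
  }
  return fixed === 0 ? null : totalMs / fixed / MS_PER_DAY;
}
```

Open defects are excluded from MTTR here; track their age separately so a long-lived bug cannot hide behind a good average.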

When leadership asks “Are we compliant?”, you show a dashboard, not a novel.

What GitPlumbers does when you bring us the mess

We fix the “almost there” setups fast:

  • Wire axe-core, pa11y-ci, Storybook a11y, and jest-axe into your mono-repo.
  • Refactor bespoke widgets into an accessible design system (menu, dialog, combobox, slider) without wrecking velocity.
  • Stand up synthetic data pipelines and redaction.
  • Build CI gates and artifact storage with tagging that legal can live with.
  • Train teams on WCAG 2.2 deltas and leave playbooks behind.

If you’ve got AI-generated “vibe code” in the mix, we do the vibe code cleanup and write the guardrails so it can’t regress.


Key takeaways

  • Make WCAG 2.2 AA a blocking gate, not a guideline.
  • Bake ARIA and keyboard behavior into your design system, not feature code.
  • Automate proofs with axe/pa11y/jest-axe/Playwright and store artifacts per build.
  • Target WCAG 2.2 deltas (focus not obscured, dragging alternatives, target size, authentication).
  • Keep auditors happy and legal calm: attach machine-readable reports to releases.
  • Protect speed and compliance with synthetic data, masked logs/screenshots, and isolated test envs.

Implementation checklist

  • Definition of Done includes WCAG 2.2 AA checks and Storybook a11y pass.
  • `eslint-plugin-jsx-a11y` enabled and CI-blocking.
  • All interactive components have keyboard support and visible focus states.
  • Playwright/axe-core run per PR with zero critical violations required.
  • A11y reports uploaded as build artifacts and linked to releases.
  • Synthetic test accounts seeded; screenshots and logs redact PII.
  • WCAG 2.2 deltas covered: focus not obscured, target size, dragging alternatives, accessible auth.

Questions we hear from teams

What parts of WCAG 2.2 cause the most regressions?
Focus not obscured (2.4.11/12) because of sticky headers, target size (2.5.8) when designers compress controls, and accessible authentication (3.3.8/9) when people add CAPTCHA without alternatives.

Is axe/pa11y enough to claim WCAG 2.2 AA compliance?
They’re necessary but not sufficient. Automated tools catch ~30–40% of issues. Pair them with design-system guardrails, manual keyboard checks in Storybook, and targeted Playwright tests for 2.2 deltas.

How do we avoid storing PHI/PII in test artifacts?
Use synthetic data, block calls to prod, redact logs (pino/winston redaction), and blackout PII in screenshots. Treat your CI like a public space.

Won’t this slow us down?
For about two sprints you’ll feel friction. After that, gates catch issues at PR time, not during audit week. Teams we’ve helped saw fewer rollbacks and faster approvals because evidence ships with the code.

Can GitPlumbers retrofit this into a legacy monolith?
Yes. We start with ESLint + jest-axe on the design system (or we carve one out), add page-level scans, then gate deployments with artifacts. We’ve done this in Rails, Java/Spring MVC, and React/Next monoliths without a rewrite.
