The Release That Survived the Audit: OPA, Cosign, and Attestations in Your CI/CD

Turn policies into enforceable guardrails, generate automated proofs, and keep shipping without getting your sprint hijacked by compliance.

Compliance isn’t a meeting. It’s a build artifact.

The release that didn’t get blocked (for once)

Two quarters ago, a fintech client was hours from a release when their auditor asked for “evidence of encryption-at-rest and provenance for production images.” The old playbook: scramble, screenshots, and a release freeze. This time, the team had pushed with confidence. The pipeline auto-generated an SBOM, signed the image, attached provenance attestations, and the cluster only admitted signed workloads. Evidence was already in the registry. The auditor got links, not a calendar invite. We shipped on time.

If your compliance lives in a PDF, you’re gambling. If it lives in your pipeline, you’re covered. Here’s how to wire it up without turning CI/CD into molasses.

Translate policies into guardrails developers can see

Stop arguing interpretations during incident calls. Translate written policies into policy-as-code with names, owners, and tests.

  • Create a policy register: each item maps a policy to a control, tool, and location of evidence (sample entry below)
  • Keep policies close to the code they govern (monorepo subdir or a dedicated policy repo)
  • Provide quick feedback at PR time; block only when signals are high-confidence
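
A sample register entry as YAML kept beside the policies (the IDs and field names here are illustrative, not a standard):

policies:
  - id: POL-012
    statement: "No public storage for regulated data"
    control: terraform.aws.bucket             # the Rego package below
    tool: conftest
    owner: cloud-platform-team
    evidence: "conftest output archived with each CI run"
    maps_to: ["PCI DSS 3.4.1", "SOC 2 CC6.1"]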

Example: “No public storage for regulated data.”

  • Policy → “Buckets tagged data_classification=regulated must not be public; encryption-at-rest required.”
  • Terraform control (checked with conftest + Rego):
package terraform.aws.bucket

# deny public ACLs on regulated buckets (covers public-read and public-read-write)
violation[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_s3_bucket"
  tags := resource.change.after.tags
  tags.data_classification == "regulated"
  acl := resource.change.after.acl
  startswith(acl, "public-")
  msg := sprintf("Bucket %s is public but tagged regulated", [resource.name])
}

# require encryption at rest
violation[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_s3_bucket"
  tags := resource.change.after.tags
  tags.data_classification == "regulated"
  not resource.change.after.server_side_encryption_configuration
  msg := sprintf("Bucket %s missing SSE for regulated data", [resource.name])
}

Run it in CI with conftest against Terraform plan output (the --all-namespaces flag makes conftest evaluate packages beyond its default main namespace, like terraform.aws.bucket above):

terraform init && terraform plan -out tfplan
terraform show -json tfplan > tfplan.json
conftest test tfplan.json -p policy/terraform --all-namespaces

The point isn’t the tool—OPA/Rego, Sentinel, or even simple grep in a pinch—the point is codified, testable rules with owners.
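
Those rules deserve tests of their own. A minimal sketch using OPA's built-in test runner (the fixture is hypothetical; the SSE block is included so only the ACL rule fires):

# policy/terraform/bucket_test.rego
package terraform.aws.bucket

test_public_regulated_bucket_denied {
  violations := violation with input as {"resource_changes": [{
    "type": "aws_s3_bucket",
    "name": "pii-archive",
    "change": {"after": {
      "acl": "public-read",
      "tags": {"data_classification": "regulated"},
      "server_side_encryption_configuration": [{}]
    }}
  }]}
  count(violations) == 1
}

Run it with opa test policy/terraform -v, and wire that into the same PR checks so a broken rule fails fast too.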

Wire the pipeline: pre-commit to production

Put fast checks where devs feel them (pre-commit/PR) and heavy checks in CI. Then sign everything.

A concrete GitHub Actions slice:

name: ci-security
on: [pull_request, push]
jobs:
  security:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write  # push to GHCR
      id-token: write  # for keyless signing
    steps:
      - uses: actions/checkout@v4

      # Secret scanning (fast feedback)
      - name: gitleaks
        uses: gitleaks/gitleaks-action@v2

      # IaC scanning
      - name: tfsec
        uses: aquasecurity/tfsec-action@v1.0.3
      - name: checkov
        uses: bridgecrewio/checkov-action@master
        with:
          framework: terraform,kubernetes

      # Build image
      - name: Build
        run: |
          docker build -t ghcr.io/acme/payment-api:${{ github.sha }} .
          echo "IMAGE=ghcr.io/acme/payment-api:${{ github.sha }}" >> $GITHUB_ENV

      # SBOM generation
      - name: SBOM (Syft)
        uses: anchore/sbom-action@v0
        with:
          image: ${{ env.IMAGE }}
          format: cyclonedx-json
          output-file: sbom.json

      # Vulnerability scan
      - name: Trivy image scan
        uses: aquasecurity/trivy-action@0.21.0
        with:
          image-ref: ${{ env.IMAGE }}
          ignore-unfixed: true
          vuln-type: os,library
          severity: CRITICAL,HIGH
          exit-code: '1'  # fail the job when findings match these severities

      # Install Cosign and sign image (keyless via OIDC)
      - uses: sigstore/cosign-installer@v3.5.0
      - name: Push and sign
        run: |
          echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u $GITHUB_ACTOR --password-stdin
          docker push $IMAGE
          cosign sign --yes $IMAGE
          cosign attest --yes \
            --predicate sbom.json \
            --type cyclonedx \
            $IMAGE

Notes from the trenches:

  • Keep PR checks under 3–5 minutes; anything heavier, run on push to main
  • Fail the build on high-severity vulns in new code; warn on existing tech debt and open a ticket
  • Store SBOMs and attestations in your OCI registry (GHCR, ECR, Harbor) so evidence rides with the artifact
  • GitLab CI and Jenkins can do the same; just ensure your runner can mint OIDC tokens for keyless Cosign (GitLab sketch below)
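
For GitLab, a minimal job that makes keyless Cosign work via an ID token (stage and image names are illustrative; needs GitLab 15.7+):

sign-image:
  stage: sign
  image: alpine:3.19
  id_tokens:
    SIGSTORE_ID_TOKEN:
      aud: sigstore
  script:
    - apk add --no-cache cosign      # cosign ships in Alpine's community repo
    - cosign sign --yes "$IMAGE"     # cosign picks up SIGSTORE_ID_TOKEN automatically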

Enforce at the cluster and the repo

Your cluster is the last line of defense. Admission controllers should reject what your pipeline shouldn’t produce in the first place.

  • Kubernetes: use Kyverno or Gatekeeper for admission policies
  • Git: use branch protection, required status checks, and CODEOWNERS for policy ownership (API sketch after this list)
  • GitOps: Argo CD syncs what’s allowed, but enforcement lives in the cluster policies
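
Branch protection is itself an API call you can codify. A sketch against GitHub's branch-protection endpoint ("security" matches the job name in the workflow above; adjust the repo and context to yours):

# require the CI security job and CODEOWNERS review before merge
gh api -X PUT repos/acme/payment-api/branches/main/protection \
  --input - <<'EOF'
{
  "required_status_checks": { "strict": true, "contexts": ["security"] },
  "enforce_admins": true,
  "required_pull_request_reviews": { "require_code_owner_reviews": true },
  "restrictions": null
}
EOF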

Example Kyverno policy that enforces signed images and bans :latest:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-and-no-latest
spec:
  validationFailureAction: enforce
  background: true
  rules:
    - name: disallow-latest
      match:
        any:
          - resources:
              kinds: [Deployment, StatefulSet, DaemonSet, Job, CronJob]
      validate:
        message: "Images must be pinned; ':latest' is not allowed"
        pattern:
          spec:
            template:
              spec:
                containers:
                  - image: "!*:latest"
    - name: verify-cosign-signature
      match:
        any:
          - resources:
              kinds: [Pod, Deployment, StatefulSet, DaemonSet, Job, CronJob]
      verifyImages:
        - imageReferences:
            - "ghcr.io/acme/*"
          attestors:
            - entries:
                - keyless:
                    issuer: https://token.actions.githubusercontent.com
                    # the Fulcio cert subject for GitHub Actions is the workflow URI, not the token's sub claim
                    subject: "https://github.com/acme/payment-api/.github/workflows/*"
                    rekor:
                      url: https://rekor.sigstore.dev

Pair this with Argo CD AppProject constraints (restrict destinations and namespaces; sketch below) and required status checks in your repo. The combo is a belt-and-suspenders guardrail: bad configs get caught in PRs; stragglers get blocked at the gate.
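
A minimal AppProject sketch that pins where an app may deploy (project, repo, and namespace names are illustrative):

apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: payments
  namespace: argocd
spec:
  sourceRepos:
    - https://github.com/acme/payment-api-deploy.git
  destinations:
    - server: https://kubernetes.default.svc
      namespace: payments-prod        # nothing outside this namespace syncs
  clusterResourceWhitelist: []        # no cluster-scoped resources from this project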

Automated proofs that make audits boring

Auditors don’t want your feelings; they want evidence. Treat evidence as a build artifact.

What to generate automatically:

  • Signatures: cosign sign on every image and artifact
  • SBOMs: syft CycloneDX (or SPDX) attached as an OCI artifact
  • Vuln reports: trivy JSON, archived per build
  • Provenance: SLSA-compliant in-toto attestation (who built what, when, with which inputs); generation sketched below
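
Provenance is one more cosign call, assuming a predicate file from your builder (provenance.json is a stand-in; tools like slsa-github-generator can emit one):

cosign attest --yes \
  --predicate provenance.json \
  --type slsaprovenance \
  $IMAGE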

Example: verify attestation at deploy time (can be a pipeline job or an admission controller webhook). Note that Cosign v2 requires you to pin the expected certificate identity alongside the issuer:

IMAGE=ghcr.io/acme/payment-api:sha-abc
cosign verify \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  --certificate-identity-regexp '^https://github.com/acme/' \
  $IMAGE
cosign verify-attestation \
  --type cyclonedx \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  --certificate-identity-regexp '^https://github.com/acme/' \
  $IMAGE

Store pointers to these in your ticketing or GRC system, but keep the binary evidence in your registry or artifact store. Bonus points for streaming OPA/Kyverno decision logs to a SIEM (e.g., gatekeeper-audit logs to Splunk) so you have a searchable trail of “who tried to deploy what, and why it was blocked.”

Speed without drama: fast paths, waivers, and progressive delivery

Here’s how we preserve delivery speed while satisfying HIPAA/PCI/SOC2 asks.

  1. Tiered severities and gates
    • Fail PRs on policy correctness (no-public-bucket, no :latest)
    • Fail main builds on exploitable critical vulns in changed code; warn otherwise with backlog tickets
  2. Timeboxed waivers with context
    • Waivers live in a waivers.yaml with owner, reason, expiry, and linked ticket (see the sketch after this list)
    • Pipeline reads waivers; Kyverno can match on labels and tolerate until expiry
  3. Progressive delivery
    • Allow canary deployment if signed+attested; full rollout requires zero critical vulns and passing runtime checks
  4. Parallelize and cache
    • Cache dependency scans; run image scans in parallel to tests; run IaC scanning on changed modules only
  5. Dry-run before enforce
    • Roll out new rules as audit (Kyverno audit or Gatekeeper dryrun) for 1–2 sprints, fix false positives, then flip to enforce
  6. Shift-left developer experience
    • Pre-commit hooks for gitleaks and conftest
    • Good error messages, links to fixing docs, and sample patches
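
A waivers.yaml sketch, assuming a pipeline step that skips matching findings until expiry (all field names are illustrative):

waivers:
  - id: WVR-104
    rule: trivy/CVE-2024-XXXXX          # finding or policy being waived
    scope: payment-api                  # image, service, or module it applies to
    owner: team-payments
    reason: "Fix needs an upstream bump; patch scheduled"
    ticket: PAY-1234
    expires: 2025-09-30                 # builds fail again after this date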

The guardrail mantra: fast feedback, clear ownership, time-limited exceptions.

What good looks like after 90 days

Patterns we’ve seen after wiring this at fintech, healthtech, and insurance clients:

  • Release freezes due to “missing evidence” drop to near-zero
  • Lead time for changes remains flat while audit prep time shrinks from weeks to minutes
  • DORA metrics improve or hold: change failure rate down 10–20%, MTTR unaffected
  • Audit cycles shift from meetings to links: SBOM URLs, Cosign verify logs, policy decision logs
  • Security debt becomes visible: vuln and waiver burndown charts in your usual dashboards

Concrete outcomes from a payments client (PCI DSS + SOC 2 Type II):

  • 98% of prod workloads signed and verified at admission within four weeks
  • 100% of services with CycloneDX SBOMs attached to images
  • Evidence retrieval time: from 3 days to <10 minutes per control

Common mistakes and a minimal starter blueprint

What I’ve seen fail:

  • Tool sprawl with overlapping scans and no single source of truth for evidence
  • Blocking on noisy scanners without a dry-run period
  • Central security teams owning policy with no delivery team input (guaranteed revolt)
  • Over-focusing on container images while ignoring IaC and data paths
  • Signing artifacts but not enforcing verification at admission (half the job)

A minimal, useful blueprint:

  • One policy repo with OPA/Kyverno rules, tests, and CODEOWNERS
  • CI stages: secrets → IaC → build → SBOM → vuln scan → sign → attest
  • Registry as the evidence store; short-lived links in tickets
  • Kyverno/Gatekeeper enforcing signatures and basic hygiene
  • Monthly review to ratchet rules from warn → block

Starter CODEOWNERS excerpt:

/policy/terraform/        @cloud-platform-team
/policy/kubernetes/       @platform-sre
/.github/workflows/ci     @release-engineering

If you’ve got AI-generated code creeping into the repo, add a rule to block packages without known licenses and require SBOM diffs in PRs. We’ve done whole “vibe code cleanup” sprints that paid down the mess while preserving velocity.
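
A quick way to do the SBOM diff, assuming CycloneDX JSON from the Syft step above (file names are illustrative):

# list package URLs from base and PR SBOMs, then show what the PR adds
jq -r '.components[].purl' sbom-main.json | sort -u > main.txt
jq -r '.components[].purl' sbom-pr.json | sort -u > pr.txt
comm -13 main.txt pr.txt   # packages new in the PR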

Key takeaways

  • Translate written policies into policy-as-code with explicit mappings to controls and owners.
  • Shift-left with fast, developer-facing checks; enforce at merge and at cluster admission.
  • Generate and store automated proofs (signatures, SBOMs, attestations) as build artifacts.
  • Use severity tiers, timeboxed waivers, and progressive delivery to preserve velocity.
  • Measure what matters: change failure rate, lead time, and evidence availability time.

Implementation checklist

  • Inventory policies and map them to concrete controls and tools (e.g., PCI 3.4.1 → encryption-at-rest check).
  • Create a policy repo with OPA/Kyverno rules, test cases, and ownership.
  • Embed scans in CI (IaC, image, secrets, licenses) and make failures actionable.
  • Sign, attest, and store SBOMs and build provenance in your artifact registry.
  • Enforce signatures and risky config at cluster admission with Kyverno/Gatekeeper.
  • Implement waivers with expiry and audit trails; report on waivers weekly.
  • Track DORA metrics alongside compliance evidence SLAs.
  • Dry-run new rules, then ratchet from warn → block once false positives are resolved.

Questions we hear from teams

Do we need both OPA/Gatekeeper and Kyverno?
Pick one for admission control; both are solid. Kyverno has a friendlier YAML DSL (and verifyImages for Cosign). Gatekeeper is great if you’re already deep into Rego. Many teams start with Kyverno policies and keep OPA/Rego for non-K8s checks via conftest.
How do we handle legacy services that can’t be rebuilt for signing?
Wrap them with signed deployment manifests and a Kyverno/Gatekeeper policy that enforces signatures on the wrapper images or Helm charts. Add a timeboxed waiver and a remediation plan. Prioritize migrating these to signed images over time.
Won’t scanning and signing slow down our pipeline?
If you push everything into the PR path, yes. Keep PR checks under 5 minutes and run heavier scans on main. Cache dependencies, scan only changed modules, and parallelize image scans. Signing and attestation themselves are fast (seconds) with keyless Cosign.
Where should we store evidence?
Use your OCI registry (ECR/GHCR/Harbor) for SBOMs, signatures, and attestations—they travel with the artifact. Keep reports in your artifact store and link them in tickets/GRC. Export policy decision logs to your SIEM for searchability.
How do we prove to auditors that the cluster enforces policies?
Show the admission policies (Kyverno/Gatekeeper), logs of rejections, and successful verification commands (`cosign verify`). Provide links to the exact policies in Git with change history and approvals. That forms your automated proof.
What about AI-generated code sneaking in risky dependencies?
Require SBOMs for every build, block unknown or disallowed licenses, and diff SBOMs in PRs. It’s the fastest way to catch surprise transitive deps from AI-generated code. We’ve added these checks during vibe code cleanup engagements with minimal disruption.

Ready to modernize your codebase?

Let GitPlumbers help you transform AI-generated chaos into clean, scalable applications.
