No More Blind Deploys: Baking Security Scanning Into CI/CD Without Torching Velocity

A pragmatic, tool-specific playbook to wire SAST, SCA, IaC, container, secrets, and SBOM/signing into your pipeline with sane gates, metrics, and a rollout plan that won’t bring dev to a halt.

Security gates that don’t ship aren’t security gates—they’re wishful thinking.

The incident you’ve lived through

A fintech I worked with pushed a hotfix on a Friday. Base image was node:latest, no image scan, no SBOM, no signature. The upstream image quietly pulled in a vulnerable openssl with a known RCE. Within hours, edge nodes were mining Monero. We spent the weekend hunting containers instead of shipping features. That outage would not have happened if the pipeline had basic gates: dependency and image scanning, signed artifacts, and an admission policy that refuses unsigned images and High CVEs. Here’s the playbook we use at GitPlumbers to make that the default without nuking velocity.

Set guardrails first: policy, thresholds, and metrics

Don’t start by sprinkling tools. Start with a policy you can encode.

  • Risk thresholds
    • SAST: block on patterns mapped to CWE/OWASP (e.g., SSRF, SQLi, path traversal). Allowlist with expiry only.
    • SCA: block on CVSS >= 7.0 (High/Critical) when a fix exists; warn if no fix.
    • IaC: block on public exposure (0.0.0.0/0), open S3 buckets, privileged pods, etc.
    • Container: block on High/Critical in OS packages and known malicious indicators.
    • Secrets: block on any verified secret; require rotation.
  • SLOs that matter
    • Build time budget: PR security checks < 5 minutes; push-time full suite < 12 minutes.
    • MTTR to remediate Highs: < 14 days; Criticals: < 72 hours.
    • False positive rate: < 10% after first month.
    • Coverage: > 90% repos onboarded within 60 days.
  • Gating strategy
    • Phase 1: soft-fail + metrics.
    • Phase 2: hard-fail on new issues only.
    • Phase 3: hard-fail on all (with exceptions via expiring allowlists).
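In GitHub Actions, the phase flip can be a single `continue-on-error` expression driven by a repository variable, so ratcheting a repo from soft-fail to hard-fail is a settings change, not a workflow edit. A minimal sketch; `SECURITY_ENFORCE` is a variable name we made up, not a built-in:

```yaml
jobs:
  sast:
    runs-on: ubuntu-latest
    # Phase 1: SECURITY_ENFORCE unset/false -> failures are advisory only.
    # Phase 2/3: set SECURITY_ENFORCE=true -> the same failures block the merge.
    continue-on-error: ${{ vars.SECURITY_ENFORCE != 'true' }}
    steps:
      - uses: actions/checkout@v4
      - run: semgrep scan --config p/owasp-top-ten --error
```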

Map pipeline stages to the right tools

Pick tools you can actually run/maintain. I’ve seen fancy platforms die on the vine because nobody owned them.

  • PR-time (fast)
    • SAST: Semgrep or a CodeQL query suite (language-dependent).
    • Secrets: gitleaks or trufflehog (pre-commit and CI).
    • Differential scans only: limit checks to changed files so feedback stays tight.
  • Push-time (deeper)
    • SCA: snyk, npm audit+audit-resolver, pip-audit, bundler-audit.
    • IaC: checkov, tfsec, conftest (OPA) for Terraform/K8s/Helm.
    • Container: trivy or grype on the built image.
    • SBOM: syft to generate CycloneDX/SPDX; attach via cosign attest.
  • Deploy-time
    • Admission control: Kyverno or OPA Gatekeeper requiring signed images and policy conformance.
    • GitOps: ArgoCD/Flux verifying signatures before sync.
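On the Flux side, signature verification is declarative: an `OCIRepository` with `verify.provider: cosign` refuses to sync artifacts that fail keyless verification. A sketch, assuming you publish manifests as an OCI artifact at `ghcr.io/acme/app-manifests`:

```yaml
# Flux source that only syncs cosign-signed OCI artifacts.
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: app-manifests
  namespace: flux-system
spec:
  interval: 5m
  url: oci://ghcr.io/acme/app-manifests  # hypothetical artifact location
  verify:
    provider: cosign  # verify against the public Sigstore instance
```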

Every tool you pick should export SARIF/JSON so results flow into dashboards (GitHub code scanning, GitLab security dashboards, or your SIEM).

Concrete examples (GitHub Actions, GitLab CI, Jenkins)

Here’s a minimal but real setup. Tune for your stack.

GitHub Actions

name: ci-security
on:
  pull_request:
    paths-ignore: ['**/*.md']
  push:
    branches: [main]
permissions:
  contents: read
  security-events: write
  packages: write  # push to GHCR so the image can be signed by digest
  id-token: write  # OIDC token for keyless signing
jobs:
  sast:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: returntocorp/semgrep-action@v1
        with:
          config: p/owasp-top-ten
          generateSarif: true
      - uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: semgrep.sarif
  secrets:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history so gitleaks can scan every commit
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  build_and_scan_image:
    runs-on: ubuntu-latest
    needs: [sast, secrets]
    steps:
      - uses: actions/checkout@v4
      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push image
        run: |
          IMAGE=ghcr.io/${{ github.repository }}:${{ github.sha }}
          echo "IMAGE=$IMAGE" >> $GITHUB_ENV
          docker build -t $IMAGE .
          docker push $IMAGE  # cosign signs by registry digest, so push first
      - name: Trivy scan
        uses: aquasecurity/trivy-action@0.20.0
        with:
          image-ref: ${{ env.IMAGE }}
          vuln-type: 'os,library'
          severity: 'HIGH,CRITICAL'
          exit-code: '1'
          format: 'table'
      - name: Generate SBOM (CycloneDX)
        run: |
          curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin
          syft $IMAGE -o cyclonedx-json > sbom.json
      - name: Install cosign
        uses: sigstore/cosign-installer@v3
      - name: Sign image & attach SBOM attestation
        run: |
          cosign sign --yes $IMAGE
          cosign attest --yes --predicate sbom.json --type cyclonedx $IMAGE

GitLab CI

stages: [sast, build, scan]

sast:
  stage: sast
  image: returntocorp/semgrep:latest
  script:
    - semgrep --config=p/owasp-top-ten --json --output=semgrep.json || true
  artifacts:
    when: always
    paths: [semgrep.json]

build:
  stage: build
  image: docker:24
  services: [docker:24-dind]
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

scan_image:
  stage: scan
  image: aquasec/trivy:latest
  variables:
    TRIVY_USERNAME: $CI_REGISTRY_USER
    TRIVY_PASSWORD: $CI_REGISTRY_PASSWORD
  script:
    - trivy image --exit-code 1 --severity HIGH,CRITICAL $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

Jenkins (declarative)

pipeline {
  agent any
  environment { IMAGE = "registry.local/app:${env.GIT_COMMIT}" }
  stages {
    stage('SAST') {
      steps { sh 'semgrep --config=p/owasp-top-ten || true' }
    }
    stage('Build') {
      steps { sh 'docker build -t $IMAGE .' }
    }
    stage('Scan Image') {
      steps { sh 'trivy image --exit-code 1 --severity HIGH,CRITICAL $IMAGE' }
    }
  }
}

Policy-as-code: IaC checks, SBOM, and admission gates

Scanning without enforcement is theater. Bake the rules in code.

  • Terraform/K8s policy with OPA/Conftest
package terraform.security

# Flag any security group change in the plan that opens ingress to the world.
# Note: this walks the `resource_changes` structure of `terraform show -json` output.
deny[msg] {
  rc := input.resource_changes[_]
  rc.type == "aws_security_group"
  rc.change.after.ingress[_].cidr_blocks[_] == "0.0.0.0/0"
  msg := sprintf("%s allows 0.0.0.0/0 ingress", [rc.address])
}

Run it in CI:

terraform show -json tfplan > tfplan.json
conftest test --policy policy/ tfplan.json
  • Admission: only signed, low-risk images (Kyverno example)
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-images
spec:
  validationFailureAction: Enforce
  rules:
    - name: verify-signature
      match:
        any:
          - resources:
              kinds: ["Pod","Deployment","StatefulSet","DaemonSet","Job","CronJob"]
      verifyImages:
        - imageReferences: ["ghcr.io/acme/*"]
          attestors:
            - entries:
                - keyless:
                    issuer: "https://token.actions.githubusercontent.com"
                    subject: "https://github.com/acme/*"
                    rekor:
                      url: "https://rekor.sigstore.dev"
  • SBOM + provenance
    • Generate SBOM with syft and attach via cosign attest.
    • Store SBOMs alongside images in your registry (GHCR/ECR/GCR).
    • Optionally add SLSA provenance attestations; verify in admission.
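Attestations only pay off if something checks them. A promotion-stage sketch that verifies both the signature and the CycloneDX attestation using cosign 2.x keyless flags (the `acme` identity regexp is an assumption; scope it to your org and workflows):

```yaml
# Sketch: block promotion unless signature + SBOM attestation verify.
- name: Verify signature and SBOM attestation
  run: |
    cosign verify \
      --certificate-identity-regexp 'https://github.com/acme/.*' \
      --certificate-oidc-issuer https://token.actions.githubusercontent.com \
      "$IMAGE"
    cosign verify-attestation --type cyclonedx \
      --certificate-identity-regexp 'https://github.com/acme/.*' \
      --certificate-oidc-issuer https://token.actions.githubusercontent.com \
      "$IMAGE"
```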

Keep velocity: fast feedback, sane gates, and exceptions with expiry

What’s killed most programs I’ve seen is “security or shipping, pick one.” Don’t do that.

  1. Split PR vs. push: PRs run fast SAST + secrets only. Push runs deep SCA/IaC/container. Keep PR security budget < 5 minutes.
  2. Gate on new issues first: only fail for new High/Critical introduced by the change. Track legacy debt separately.
  3. Differential scanning: Semgrep with --baseline-commit so only new findings fail; scan the base image and the built image with Trivy and gate on the delta your layers add; Checkov on changed IaC paths.
  4. Ownership baked in: CODEOWNERS + reviewers for security hotspots. Pipe findings to team-specific Slack channels.
  5. Exceptions expire: allowlist entries must auto-expire (30/60/90 days) and require a ticket link.
  6. Breakglass: emergency label flips gates to warn-only with mandatory postmortem within 48 hours.
  7. AI-generated code checks: add Semgrep rules that catch common Copilot/GPT footguns (insecure HTTP, JWT without verify, SQL string concat). If you’re drowning in vibe code, schedule a targeted vibe code cleanup pass.
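As a concrete starting point for item 7, a custom Semgrep rule for string-built SQL in Python might look like this (rule id, message, and patterns are ours; extend per language and framework):

```yaml
rules:
  - id: python-sql-string-concat
    languages: [python]
    severity: ERROR
    message: Possible SQL injection - pass bound parameters, never build queries from strings
    pattern-either:
      - pattern: $CURSOR.execute("..." + $X, ...)
      - pattern: $CURSOR.execute("..." % $X, ...)
      - pattern: $CURSOR.execute(f"...", ...)
```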

If a gate fires and there’s no clear owner, you don’t have a security problem—you have an org problem.

Triage and remediation that actually closes the loop

Detection is cheap. Remediation is where programs die.

  • Automated PRs: Dependabot/Renovate for app deps and GitHub Actions; enable “grouped updates” per service.
  • Autofix for IaC: Checkov suggests fixes; require follow-up PRs.
  • Ticketing: each failed pipeline opens a ticket with labels team, service, severity, cvss. Auto-close when the next pipeline passes.
  • Dashboards: track MTTR by team, open High/Critical counts, and gating pass rate. Ship weekly trendlines to execs.
  • False positive review: weekly 30-min session; if FP rate > 10%, adjust rulesets (e.g., Semgrep .semgrepignore, Checkov skip with justification).
  • Pre-commit: add pre-commit hooks for gitleaks and lightweight Semgrep to catch issues before CI.
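The pre-commit piece is a few lines of `.pre-commit-config.yaml`; both gitleaks and Semgrep ship official hooks (pin `rev` to the release tags you actually use — the ones below are examples):

```yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4        # pin to a real gitleaks release tag
    hooks:
      - id: gitleaks
  - repo: https://github.com/semgrep/semgrep
    rev: v1.85.0        # pin to a real semgrep release tag
    hooks:
      - id: semgrep
        args: ['--config', 'p/owasp-top-ten', '--error']
```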

Example Renovate config:

{
  "extends": ["config:recommended"],
  "packageRules": [
    { "matchUpdateTypes": ["minor", "patch"], "groupName": "minor-patch-deps" },
    { "matchManagers": ["docker"], "matchPackagePatterns": ["^ubuntu"], "allowedVersions": "<24.04" }
  ]
}

Rollout plan and checkpoints (4–6 weeks)

Week 1–2: baseline and soft-fail

  • Onboard 3 pilot services (one per language). Add PR-time SAST + secrets. Push-time SCA/IaC/container with soft-fail.
  • Checkpoints: PR feedback < 5 min, false-positive rate < 20%, coverage 3/3 pilots.

Week 3–4: ratchet gates

  • Turn on hard-fail for new High/Critical on pilots. Generate SBOMs; sign images; deploy admission in staging.
  • Checkpoints: MTTR for Highs < 7 days; admission blocks unsigned images; exceptions carry expiry.

Week 5–6: expand and enforce

  • Onboard remaining services. Admission in prod (enforce). Start new-issue-only gates org-wide.
  • Checkpoints: > 70% repos onboarded; CI pass rate > 85%; build time within budgets.

Ongoing: debt burn-down

  • Set monthly target to reduce legacy High/Critical by 20% until within policy.
  • Add canary deployments so a missed CVE doesn’t annihilate all your nodes at once.

Prove it with numbers (what good looks like by Q2)

  • PR security checks: P95 < 4 minutes.
  • Hard-fail gate trigger rate: < 10% of pushes; P99 remediation for Criticals < 48 hours.
  • False positive rate: < 5% after tuning.
  • SBOM coverage: 100% images; signature verification on all clusters.
  • Vulnerability backlog: High/Critical down 60% from baseline.
  • No unsigned image admitted to prod for 90 days (verified by admission logs).

If you want help wiring this in without a six-month yak shave, GitPlumbers has done this for banks, marketplaces, and ML-heavy teams shipping lots of AI-generated code. We’ll make the gates stick and the metrics move.


Key takeaways

  • Gate on risk, not vibes: define thresholds (e.g., block on CVSS >=7 unless allowlisted with expiry).
  • Split fast PR-time scans from deeper push-time scans to keep developer feedback <5 minutes.
  • Generate SBOMs and sign artifacts; enforce signatures at admission, not just in CI logs.
  • Start soft-fail, measure MTTR and false-positive rates, then ratchet to hard-fail per service.
  • Automate remediation: bots for upgrades, codeowners for ownership, and expiry-based exceptions.

Implementation checklist

  • Define policy thresholds and SLOs for vulnerability remediation
  • Add PR-time SAST and secret scanning with <5 min budget
  • Add push-time SCA, IaC, and container scanning with gating
  • Generate SBOM (CycloneDX/SPDX) and sign images with Cosign
  • Enforce admission policies (Kyverno/OPA) for signed + low-risk images
  • Automate triage (Dependabot/Renovate) and track MTTR via tickets
  • Measure false-positive rate and scan coverage; ratchet gates per repo
  • Roll out gradually with breakglass and time-boxed exceptions

Questions we hear from teams

How do we keep build times from exploding?
Split fast (PR) and deep (push) scans. Run SAST and secrets only on diffs during PRs and keep that under 5 minutes. Do SCA, IaC, container, and SBOM on push. Cache scanner databases (e.g., Trivy cache), pin rulesets, and parallelize jobs.
Which tools should we choose?
Pick what fits your stack and team: Semgrep or CodeQL for SAST, Trivy or Grype for containers, Checkov/Tfsec or Conftest for IaC, Syft + Cosign for SBOM/signing, and Kyverno/Gatekeeper for admission. Prioritize JSON/SARIF output and low operational overhead.
How do we avoid drowning in false positives?
Start soft-fail, measure FP rate, and tune: add `.semgrepignore`, suppress noisy rules with justification, and maintain centralized allowlists with mandatory expiry. If FP > 10% after 2 weeks, prune rules aggressively.
What about AI-generated (vibe) code?
Add rules targeting common AI footguns (e.g., lax JWT verification, string-concat SQL). Run lightweight SAST pre-commit and in PRs. If you inherited a repo of vibe code, do a time-boxed vibe code cleanup with Semgrep autofix and targeted refactors.
Can we do this in air-gapped environments?
Yes. Mirror scanner DBs (Trivy, Grype), host your own Sigstore (Fulcio/Rekor) or use key-based Cosign, and run internal registries for SBOM storage. Schedule DB syncs through a controlled bridge.

Ready to modernize your codebase?

Let GitPlumbers help you transform AI-generated chaos into clean, scalable applications.

Get your pipeline gated without killing velocity

Download the SBOM + Signing Playbook
