Stop Chasing CVEs: Build Vulnerability Workflows That Rank by Business Risk

If your vuln backlog is just CVSS scores sorted descending, you’re losing. Tie findings to asset value, exposure, and data sensitivity, and automate guardrails and proofs so engineers can ship without security playing hall monitor.

“Security that ships beats security that shouts. Gate by risk and keep the receipts.”

The backlog that lies: when CVSS isn’t enough

If you’ve ever walked into a war room with 4,000 “critical” findings from three scanners, you know the vibe. I’ve seen teams at fintechs and unicorns alike spend quarters chasing CVEs that never had a path to exploit, while the actual breach vector was a stale Jenkins with an open security group. CVSS alone doesn’t tell you if your internet-exposed payments API is bleeding risk or if it’s just a dev pod behind three layers of auth.

Here’s what actually works: rank vulnerabilities by business risk, not scanner severity. Combine exploit likelihood (EPSS), known exploited status (CISA KEV), asset exposure, data sensitivity, and service criticality. Then wire that into CI/CD so we block or warn with receipts—automated proofs—without turning security into a deployment tax.

“If you can’t explain why a finding matters to revenue or regulated data, it won’t get fixed in time.”

Model risk like you mean it

You don’t need a PhD. You need a predictable formula that engineering can debug.

  • Inputs we use in the field:

    • CVSS base score (from scanner)
    • EPSS (Exploit Prediction Scoring System probability)
    • KEV flag (present in CISA Known Exploited Vulnerabilities)
    • Exposure (internet-facing? lateral movement risk?)
    • Asset criticality (tier-1 revenue path vs. batch job)
    • Data sensitivity (PCI/PHI/PII vs. public)
  • Example weights (start simple, tune with incidents):

    • CVSS 30%, EPSS 25%, KEV 20%, Exposure 15%, Criticality 5%, Data sensitivity 5%
// scripts/risk-score.ts
import fs from 'node:fs'

interface Finding {
  cvss: number;       // 0-10
  epss: number;       // 0-1
  kev: boolean;       // CISA KEV
  exposure: 'internet'|'internal'|'isolated';
  criticality: 1|2|3; // 1=low, 3=high
  data: 'public'|'internal'|'pii'|'pci'|'phi';
}

const exposureWeight: Record<Finding['exposure'], number> = {
  internet: 1.0, internal: 0.5, isolated: 0.2
}
const dataWeight: Record<Finding['data'], number> = {
  public: 0, internal: 0.2, pii: 0.6, pci: 1.0, phi: 1.0
}

export function riskScore(f: Finding) {
  const cvss = (f.cvss / 10) * 0.30;
  const epss = f.epss * 0.25;
  const kev = (f.kev ? 1 : 0) * 0.20;
  const exposure = exposureWeight[f.exposure] * 0.15;
  const criticality = ((f.criticality - 1) / 2) * 0.05;
  const data = dataWeight[f.data] * 0.05;
  return Number(((cvss + epss + kev + exposure + criticality + data) * 100).toFixed(1));
}

const finding = JSON.parse(fs.readFileSync(process.argv[2], 'utf8')) as Finding;
console.log(riskScore(finding));
  • Thresholds that teams actually follow:
    • Risk ≥ 80: Block build, page on-call, 24h SLA
    • 50 ≤ Risk < 80: Warn, create ticket, 7-day SLA
    • Risk < 50: Log, batch into monthly remediation window
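The thresholds above work best when they live next to the scoring function, so CI gates, paging, and ticketing all read one source of truth. A minimal sketch (the file name and tier names are illustrative, not from the pipeline above):

```typescript
// scripts/risk-tier.ts -- hypothetical companion to risk-score.ts
type Tier = 'block' | 'warn' | 'log';

interface Action {
  tier: Tier;
  slaHours: number; // time to remediate or mitigate
  page: boolean;    // page on-call?
}

// Map a 0-100 risk score to the gate behavior and SLA for that tier
export function actionFor(risk: number): Action {
  if (risk >= 80) return { tier: 'block', slaHours: 24, page: true };
  if (risk >= 50) return { tier: 'warn', slaHours: 7 * 24, page: false };
  return { tier: 'log', slaHours: 30 * 24, page: false };
}
```

Keeping this beside `riskScore` means changing an SLA is a one-line pull request instead of a policy-doc hunt.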

Turn policy into guardrails, checks, and automated proofs

Stop emailing PDFs. Put policy into code and evidence into artifacts.

  1. IaC guardrails with OPA/Rego
# policy/s3.rego
package terraform.aws

# Deny S3 buckets that are not private
deny[msg] {
  input.resource_type == "aws_s3_bucket"
  input.change.after.acl != "private"
  msg := sprintf("S3 bucket %s must use a private ACL", [input.address])
}

# Deny S3 buckets without KMS encryption
deny[msg] {
  input.resource_type == "aws_s3_bucket"
  sse := input.change.after.server_side_encryption_configuration.rule.apply_server_side_encryption_by_default.sse_algorithm
  sse != "aws:kms"
  msg := sprintf("S3 bucket %s must use KMS encryption", [input.address])
}

# Block public ACLs anywhere
deny[msg] {
  input.resource_type == "aws_s3_bucket_public_access_block"
  not input.change.after.block_public_acls
  msg := sprintf("Public ACLs not blocked for %s", [input.address])
}
  2. Terraform validations to catch bad patterns early:
# terraform/validations.tf
# validation blocks live inside variable blocks; cross-variable
# references like var.env require Terraform >= 1.9
variable "enable_encryption" {
  type = bool

  validation {
    condition     = var.env != "prod" || var.enable_encryption
    error_message = "Prod resources must enable encryption."
  }
}
  3. Kubernetes admission with Kyverno (block public LoadBalancers in prod):
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: deny-public-lb-in-prod
spec:
  validationFailureAction: Enforce
  rules:
    - name: deny-ext-lb
      match:
        any:
          - resources:
              kinds: [Service]
              selector:
                matchLabels:
                  env: prod
      validate:
        message: "External LoadBalancer forbidden in prod"
        pattern:
          spec:
            # "!LoadBalancer" = any Service type except LoadBalancer
            type: "!LoadBalancer"
  4. CI checks with risk-aware gates
# .github/workflows/scan-and-gate.yml
name: Scan and Gate by Business Risk
on: [push]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aquasecurity/trivy-action@0.22.0
        with:
          scan-type: fs
          format: json
          output: trivy.json
      - name: SBOM (CycloneDX)
        run: |
          curl -sSL https://github.com/anchore/syft/releases/latest/download/syft_Linux_x86_64.tar.gz | tar -xz
          ./syft packages dir:. -o cyclonedx-json=sbom.json
      - name: Compute risk
        id: risk
        run: |
          npm ci
          node scripts/collect-risk-inputs.js trivy.json > finding.json
          npx tsx scripts/risk-score.ts finding.json > risk.txt
          echo "risk=$(cat risk.txt)" >> "$GITHUB_OUTPUT"
      - name: Attest build (SLSA + in-toto)
        env:
          # cosign attests an OCI image reference, not a bare git SHA
          IMAGE: ghcr.io/${{ github.repository }}:${{ github.sha }}
        run: |
          cosign attest --predicate sbom.json --type cyclonedx "$IMAGE"
          cosign attest --predicate trivy.json --type vuln "$IMAGE"
      - name: Gate by threshold
        run: |
          RISK="${{ steps.risk.outputs.risk }}"
          echo "Risk score: $RISK"
          # risk.txt holds a decimal (e.g. 82.5); strip the fraction for the integer test
          if [ "${RISK%.*}" -ge 80 ]; then echo "High risk, failing"; exit 1; fi
      - name: Create Jira ticket for medium risk
        if: ${{ steps.risk.outputs.risk >= 50 && steps.risk.outputs.risk < 80 }}
        uses: atlassian/gajira-create@v3
        with:
          project: SEC
          summary: "Medium-risk finding ${{ github.sha }} (risk=${{ steps.risk.outputs.risk }})"
          description: "See artifacts: sbom.json, trivy.json, attestations in OCI"
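The workflow above calls a `scripts/collect-risk-inputs.js` that isn't shown. A minimal sketch of what it would do, assuming Trivy's report layout (the `Results[].Vulnerabilities[].CVSS.nvd.V3Score` path is Trivy's JSON; the exposure, criticality, and data placeholders would really come from EPSS/KEV feeds and your asset tags, not the scanner):

```typescript
// scripts/collect-risk-inputs.ts -- hypothetical; the real script would also
// merge EPSS/KEV and asset tags before emitting the Finding JSON.
import fs from 'node:fs';

// Minimal slice of Trivy's JSON report that this sketch reads
interface TrivyReport {
  Results?: { Vulnerabilities?: { VulnerabilityID: string; CVSS?: { nvd?: { V3Score?: number } } }[] }[];
}

// Take the worst CVSS v3 score across all results; 0 if the report is clean
export function worstCvss(report: TrivyReport): number {
  const vulns = (report.Results ?? []).flatMap(r => r.Vulnerabilities ?? []);
  return Math.max(0, ...vulns.map(v => v.CVSS?.nvd?.V3Score ?? 0));
}

// Emit an object matching the Finding interface from risk-score.ts;
// non-CVSS fields are placeholders pending enrichment.
export function toFinding(report: TrivyReport) {
  return {
    cvss: worstCvss(report),
    epss: 0,              // enrich from FIRST.org EPSS feed
    kev: false,           // enrich from CISA KEV feed
    exposure: 'internal', // from asset tags (CMDB/Terraform), not the scanner
    criticality: 2,
    data: 'internal',
  };
}

if (process.argv[2]) {
  const report = JSON.parse(fs.readFileSync(process.argv[2], 'utf8')) as TrivyReport;
  process.stdout.write(JSON.stringify(toFinding(report)));
}
```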
  5. Automated proofs
  • SBOMs with syft (CycloneDX/SPDX)
  • Signed artifacts and attestations with cosign/Sigstore
  • Provenance (SLSA v1) via your build system
  • Store in OCI registry or evidence bucket; link from Jira for auditors

Balance regulated-data constraints with delivery speed

This is where most teams over-rotate. You can be compliant and fast if you route friction by risk.

  • Tag assets and data at source of truth (CMDB or Terraform):
module "payments_api" {
  source = "./service"
  env    = "prod"
  data_classification = "pci"   # pci | pii | phi | internal | public
  internet_exposed    = true
  tier                = 3        # 1=low, 3=high
}
  • CI/CD behavior by classification:

    • pci/phi services: enforce stronger gates, mandatory canaries, secrets scanning on PR (gitleaks, GitHub Advanced Security)
    • internal/public: warn-only for medium risk, auto-promote with Argo Rollouts
  • PII and secret controls that don’t stall engineers:

    • PR checks: gitleaks, trufflehog, repo secret scanning
    • Image scans: flag hardcoded creds and outbound egress to unknown domains
    • Synthetic or masked data in non-prod: Gretel, Tonic, or in-house masking jobs
    • Logging guardrails: OpenTelemetry processors to redact before export
# otel-collector redaction example
processors:
  attributes/pii-redact:
    actions:
      - key: user.email
        action: hash
      - key: cardNumber
        action: delete
service:
  pipelines:
    logs:
      processors: [attributes/pii-redact]
  • Change velocity: use progressive delivery where risk is uncertain.
    • Canary 5% -> 25% -> 100%
    • Circuit breakers in the mesh (Istio/Linkerd) to fail fast
    • Rollback SLO: revert within 10 minutes if error budget burn > 2%/h
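The rollback SLO above reduces to one comparison. A sketch, assuming you express the budget as "fraction of the SLO window's error budget consumed" (the function name and signature are illustrative):

```typescript
// Rollback trigger: revert when error-budget burn exceeds 2% of the budget per hour.
// budgetConsumed and budgetTotal are in the same units (e.g. allowed bad requests);
// elapsedHours is how long the canary has been live.
export function shouldRollback(
  budgetConsumed: number,
  budgetTotal: number,
  elapsedHours: number,
): boolean {
  const burnPerHour = budgetConsumed / budgetTotal / elapsedHours; // fraction of budget/hour
  return burnPerHour > 0.02;
}
```

Wire this to your canary analysis step so a hot deploy reverts inside the 10-minute window without a human in the loop.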

Close the loop: SLAs, dashboards, and exceptions

You can’t improve what you don’t track. We wire these into every engagement.

  • SLAs by risk tier

    • High (≥80): 24h to remediate or mitigate; exec visibility
    • Medium (50–79): 7 days
    • Low (<50): 30 days or backlog batch
  • Dashboards (pipe from CI + scanners to your warehouse):

    • MTTR by tier, open count vs. age, exceptions by policy owner
    • Risk-weighted vulnerability density per service (findings/KLoC normalized by risk)
    • Correlate deploys with changes in risk trend
  • Exception workflow

    • OPA policy supports justification annotation and a time-bound waiver
# policy/waivers.rego
package policy

default allowed = false

# Low-risk findings pass without a waiver
allowed {
  input.finding.risk < 50
}

# Higher-risk findings need an approved, unexpired waiver
allowed {
  input.exception.approved_by != ""
  time.now_ns() < input.exception.expires_at_ns
}
  • Runtime signals
    • Falco/eBPF detections raise exposure score temporarily
    • A new CISA KEV entry matching your stack flips the KEV flag automatically; page on-call
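The runtime-signal bump can be a thin wrapper over the scoring inputs. A sketch, assuming a time-bound signal shape (the `RuntimeSignal` type is hypothetical; `Exposure` mirrors the `Finding` interface from `risk-score.ts`):

```typescript
// When a runtime detection fires for a workload, temporarily treat it as
// internet-exposed so its open findings re-rank upward until the signal decays.
type Exposure = 'internet' | 'internal' | 'isolated';

interface RuntimeSignal {
  kind: 'falco_alert' | 'kev_match';
  expiresAtMs: number; // exposure bump is time-bound, matching the waiver model
}

export function effectiveExposure(
  base: Exposure,
  signals: RuntimeSignal[],
  nowMs: number,
): Exposure {
  const active = signals.some(s => s.expiresAtMs > nowMs);
  return active ? 'internet' : base; // assume worst-case exposure while a signal is live
}
```

Feed `effectiveExposure` into `riskScore` instead of the static tag and the dashboard reflects runtime reality without anyone re-triaging tickets.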

Concrete example: tying it all together in GitOps

This is the pattern we’ve rolled out at a healthcare client that needed HIPAA and speed.

  1. Pull request opens against an ArgoCD-managed app.
  2. Pre-commit runs conftest with Rego and tflint on Terraform.
  3. GitHub Actions builds image, generates SBOM (syft), scans (trivy, grype), computes risk score, signs and attests (cosign).
  4. If risk ≥ 80, job fails, notification to Slack + Jira P1 with artifacts.
  5. If 50–79, ArgoCD promotion is paused; canary requires human approval.
  6. Kyverno admission policies prevent public LB in prod and enforce namespace labels for data classification.
  7. Evidence (SBOM, attestation, logs) pushed to an OCI registry and mirrored to the GRC tool (Drata/Vanta) via webhook.

Result after 60 days:

  • 58% reduction in high-risk MTTR (from 9d to 3.8d)
  • 0 emergency change windows triggered by compliance findings
  • Deploy frequency to non-sensitive services up 22%
  • Audit time for HIPAA controls down from 3 weeks to 4 days

What I’d do tomorrow if I inherited your backlog

  • Normalize all scanner output to a single schema; enrich with EPSS and KEV.
  • Compute risk scores per finding and per service; publish to a shared dashboard.
  • Enforce 3–5 critical guardrails in IaC and admission; don’t boil the ocean.
  • Add SBOM + attestation steps to the happy path; store artifacts in OCI.
  • Wire Jira automation with SLAs by tier.
  • Pilot in one product line, iterate weights with incident data.
  • Train teams on the “why” once; after that, let the gates and proofs do the talking.

If you need a partner who’s done the vibe code cleanup after an AI-fueled feature sprint and has the scars from fixing legacy Jenkins, GitPlumbers will help you get the risk math and the guardrails right without slowing you to a crawl.


Key takeaways

  • Stop sorting by CVSS. Combine CVSS, EPSS, exploit status (CISA KEV), internet exposure, asset criticality, and data sensitivity into a simple risk score.
  • Shift left with policy-as-code. Use OPA/Rego, Kyverno, and Terraform checks to block unsafe changes and generate automated proofs.
  • Automate evidence. Produce SBOMs, attestations, and provenance (SLSA) per build; store in OCI and surface to GRC via API.
  • Throttle friction. Use risk thresholds in CI/CD, canaries, and pre-approved patterns to keep delivery velocity for low-risk changes.
  • Close the loop. Route high-risk findings to on-call with SLAs and dashboards; defer low-risk to batched remediation windows.
  • Handle regulated data sanely. Gate by classification tags, scan for secrets/PII, and use synthetic or masked data in lower envs.

Implementation checklist

  • Define a risk model combining CVSS, EPSS, KEV, exposure, criticality, and data sensitivity.
  • Instrument CI to compute risk score and gate by threshold; log evidence as build artifacts.
  • Enforce IaC guardrails with OPA/Kyverno and Terraform validations; require exception tickets for overrides.
  • Produce SBOMs (CycloneDX/SPDX), sign artifacts, and attach in-toto attestations with cosign.
  • Route findings via Jira automation with SLAs that match risk tiers; report MTTR by tier.
  • Protect regulated data: classify assets, enforce network/policy isolation, and run DLP/secret scans on PRs and images.
  • Adopt GitOps with policy checks at admission (Gatekeeper/Kyverno) and progressive delivery (Argo Rollouts).

Questions we hear from teams

How do I get EPSS and KEV into my pipeline?
Pull EPSS via FIRST.org’s API and KEV from CISA’s JSON feed during CI or as a nightly job enriching your vulnerability database. Cache locally to avoid rate limits, and merge by CVE ID before computing risk.
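A sketch of that enrichment join. The FIRST.org endpoint and `data[].cve`/`data[].epss` fields match the public API at the time of writing, but verify before depending on them; the merge helper and its signature are illustrative:

```typescript
// Enrich findings with EPSS probability and KEV status, merged by CVE ID.
interface Enriched { cve: string; epss: number; kev: boolean }

export function mergeByCve(
  cves: string[],
  epssByCve: Map<string, number>, // cached nightly from api.first.org
  kevCves: Set<string>,           // cveID values from CISA's KEV JSON feed
): Enriched[] {
  return cves.map(cve => ({
    cve,
    epss: epssByCve.get(cve) ?? 0, // CVEs missing from the feed default to 0
    kev: kevCves.has(cve),
  }));
}

// Nightly fetch (Node 18+ global fetch); cache the result to respect rate limits.
export async function fetchEpss(cves: string[]): Promise<Map<string, number>> {
  const url = `https://api.first.org/data/v1/epss?cve=${cves.join(',')}`;
  const body = await (await fetch(url)).json() as { data: { cve: string; epss: string }[] };
  return new Map(body.data.map(d => [d.cve, Number(d.epss)]));
}
```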
What scanners should I start with if we have nothing?
Keep it simple: Trivy or Grype for images, OSV-Scanner for dependencies, tfsec or Checkov for IaC. Add SAST once you’re catching obvious issues. The differentiator isn’t the tool; it’s the risk model and automation.
How do we avoid blocking every deploy in a regulated environment?
Classify services and data. Only apply hard gates to high-sensitivity, internet-exposed, or tier-1 services. Everyone else gets warnings and tickets. Use canaries and short rollback SLOs to manage risk while shipping.
What counts as an automated proof for auditors?
Signed SBOMs, vulnerability scan results, SLSA provenance, and in-toto attestations tied to the artifact digest. Store in an immutable bucket or OCI registry and reference from your change record or ticket.
Our codebase has AI-generated vibe code—does this change anything?
Yes—enable SAST and secret scanning on PRs, bump static analysis for unsafe patterns, and enforce policy-as-code in IaC. AI code increases the rate of risky patterns; the risk gating and automated proofs keep shipping safe.

Ready to modernize your codebase?

Let GitPlumbers help you transform AI-generated chaos into clean, scalable applications.

Get a risk-weighted vulnerability workflow in 30 days. See our Security & Compliance approach.
