Stop Hand‑Waving Compliance: Codify Least‑Privilege, Secrets, and Dependency Risk or Eat the Pager
Policies don’t matter until they’re enforced in CI, proven with artifacts, and boring to audit. Here’s how to turn least‑privilege, rotation, and supply‑chain controls into code without grinding delivery to a halt.
The outage, the audit, and the pager
I’ve watched a fintech freeze releases for three weeks because an auditor asked for evidence of least‑privilege and rotation—and all they had were Confluence pages. Another client ate a P1 when a leaked CI token with “AdministratorAccess” spun up crypto miners. Both had the same root cause: policies were English, not code.
The fix isn’t another committee. It’s codifying controls where changes actually happen: Git, CI/CD, and your runtime. Translate policies into guardrails, checks, and automated proofs—and make the whole thing fast by default.
Policy as code in three layers
Stop thinking “security gates.” Think layers that move left and generate evidence by design.
- Guardrails (pre-merge): block obviously risky code before it lands. Example: OPA/Rego via `conftest` on Terraform plans; pre-commit `gitleaks`.
- Checks (CI/CD gates): enforce deeper scans and verifications. Example: `trivy`/`grype` for images, SBOM generation, Kyverno/Gatekeeper admission policies.
- Proofs (artifacts): store machine-verifiable evidence: SARIF reports, OPA decisions, SBOMs, cosign attestations, rotation logs.
- Write the control in code.
- Fail fast in PRs, block deploys in CI.
- Emit an artifact that proves it happened.
You don’t have a control if you can’t prove it ran on every change.
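In CI terms, every control in this post reduces to that same pull-request shape. Here's a minimal sketch using `conftest` and an artifact upload; the file and policy paths are placeholders:

```yaml
# guardrail + check: the step fails the PR on any policy violation
- name: Policy check
  run: conftest test tfplan.json --policy policy/ --output json > conftest.json
# proof: keep the decision even when the check fails
- name: Archive decision
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: policy-decision-${{ github.sha }}
    path: conftest.json
```

The rest of this post is that skeleton, filled in for IAM, secrets, and dependencies.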
Least‑privilege you can’t backslide from
I’ve seen teams “clean up IAM later” and never come back. The pattern that sticks:
- Template roles with permissions boundaries. Use Terraform modules so every `aws_iam_role` attaches a boundary that narrows scope.
- Deny wildcards at plan time. Use OPA/Rego to catch `"Action": "*"` and overly broad resources.
- Validate with native tools. Run AWS IAM Access Analyzer policy checks in CI for drift/regressions.
Example Terraform (AWS):
module "app_role" {
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role"
version = "5.39.0"
create_role = true
role_name = "app-api"
role_permissions_boundary_arn = aws_iam_policy.boundary.arn
trusted_role_services = ["ec2.amazonaws.com"]
max_session_duration = 3600
}
resource "aws_iam_policy" "boundary" {
name = "app-boundary"
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{ Effect = "Allow", Action = ["s3:GetObject"], Resource = ["arn:aws:s3:::app-prod/*"] },
{ Effect = "Deny", Action = "*", Resource = "*", Condition = { "Bool": { "aws:ViaAWSService": "false" } } }
]
})
}
Rego guardrail (run with `conftest` against `tfplan.json`):
package tfplan.iam

deny[msg] {
  some i
  input.resource_changes[i].type == "aws_iam_policy"
  # `policy` arrives as a JSON-encoded string in tfplan.json, so decode it before inspecting
  doc := json.unmarshal(input.resource_changes[i].change.after.policy)
  stmt := doc.Statement[_]
  stmt.Effect == "Allow"
  wildcard_action(stmt.Action)
  msg := sprintf("IAM policy %s has wildcard action", [input.resource_changes[i].name])
}

# Action can be a single string or a list of strings
wildcard_action(action) { action == "*" }
wildcard_action(action) { action[_] == "*" }

deny[msg] {
  some i
  input.resource_changes[i].type == "aws_iam_role"
  not input.resource_changes[i].change.after.permissions_boundary
  msg := sprintf("Role %s missing permissions boundary", [input.resource_changes[i].name])
}
Wire it up in PRs with Atlantis or GitHub Actions:
- name: Terraform plan
  run: terraform plan -out tf.plan && terraform show -json tf.plan > tfplan.json
- name: Policy check
  # assumes the conftest binary is installed on the runner
  run: conftest test tfplan.json --policy policy/ --output json > conftest.json
Result: engineers get clear, fast feedback; auditors get JSON decisions archived per commit.
Secrets that rotate themselves (and leave no breadcrumbs in Git)
The only secrets I trust are short‑lived or dynamic. Everything else leaks eventually. The pattern:
- Zero Git secrets. Enforce `gitleaks` in pre-commit and CI (a minimal hook config follows below).
- Workload identity for CI. Use OIDC from GitHub/GitLab to assume cloud roles without static keys.
- Dynamic secrets by default. Vault or AWS RDS IAM auth for DB creds; short TTLs; automatic rotation.
- Kubernetes pulls secrets at runtime. External Secrets Operator syncs from Vault/Secrets Manager.
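A minimal sketch of the pre-commit half, assuming you pin whatever `gitleaks` release you've standardized on:

```yaml
# .pre-commit-config.yaml: run gitleaks before every local commit
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4   # pin to your standardized release
    hooks:
      - id: gitleaks
```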
GitHub Actions to AWS without long‑lived keys:
permissions:
  id-token: write
  contents: read

- uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789012:role/ci-deploy
    aws-region: us-east-1
Vault dynamic DB creds + External Secrets Operator:
vault secrets enable database

# Root credentials for the connection come from the environment, not Git (values illustrative)
vault write database/config/pg \
  plugin_name=postgresql-database-plugin \
  connection_url="postgresql://{{username}}:{{password}}@pg:5432/postgres?sslmode=require" \
  allowed_roles="app-role" \
  username="vault-admin" \
  password="$VAULT_PG_ADMIN_PASSWORD"

vault write database/roles/app-role \
  db_name=pg \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  default_ttl=1h max_ttl=24h
# external-secrets.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-db-creds
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault
    kind: ClusterSecretStore
  data:
    - secretKey: DB_USERNAME
      remoteRef: { key: database/creds/app-role, property: username }
    - secretKey: DB_PASSWORD
      remoteRef: { key: database/creds/app-role, property: password }
Proofs that matter:
- Store `vault list sys/leases/lookup` outputs per deploy as evidence of rotation frequency (capture sketch below).
- Keep CloudTrail/CloudWatch logs for STS/OIDC sessions; prove max session duration ≤ 1 hour.
- Attach SARIF from `gitleaks` to PRs; block merges on findings.
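Capturing the rotation proof can be a one-liner in the deploy job; a sketch, assuming the Vault database role from earlier and the evidence layout used later in this post:

```bash
# list active leases for the app's dynamic DB role and file them with the rest of the evidence
vault list -format=json sys/leases/lookup/database/creds/app-role \
  > "evidence/${GITHUB_SHA}/vault-leases.json"
```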
Dependency risk you can actually block
We’ve all been burned by a transitive vuln shipped at 5 p.m. on Friday. Treat supply chain as code:
- Automate updates: `renovate` or `dependabot` to keep libs fresh.
- Generate SBOMs: `syft` to create SPDX/CycloneDX on every build.
- Scan and break: `trivy`/`grype` for images and filesystems; fail on HIGH/CRITICAL.
- Verify signatures/attestations: `cosign` to sign images and attach SBOM and vulnerability reports.
- Enforce at admission: Kyverno or Gatekeeper to block unsigned or vulnerable images.
CI fragments:
syft packages dir:. -o spdx-json > sbom.spdx.json
trivy fs --scanners vuln,secret --severity HIGH,CRITICAL --exit-code 1 .
trivy image --ignore-unfixed --severity HIGH,CRITICAL --exit-code 1 ghcr.io/acme/api:${GITHUB_SHA}
cosign sign --key cosign.key ghcr.io/acme/api:${GITHUB_SHA}
cosign attest --predicate sbom.spdx.json --type spdxjson --key cosign.key ghcr.io/acme/api:${GITHUB_SHA}
Kyverno policy to require signed images and block old bases:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-signed-and-fresh
spec:
  validationFailureAction: Enforce
  rules:
    - name: require-cosign
      match: { resources: { kinds: ["Pod"] } }
      verifyImages:
        - image: "ghcr.io/acme/*"
          key: |-
            -----BEGIN PUBLIC KEY-----
            ...
            -----END PUBLIC KEY-----
    - name: disallow-stale-images
      match: { resources: { kinds: ["Pod"] } }
      validate:
        message: "Images older than 30 days are not allowed."
        pattern:
          spec:
            containers:
              - image: "?*"
      preconditions:
        all:
          # deriving image age needs registry metadata; adapt this expression to your resolver
          - key: "{{ images.containers[0].resolvedDigest | age() }}"
            operator: LessThanOrEquals
            value: 30d
Result: engineers merge Renovate PRs in daylight, your admission controller keeps prod clean, and you’ve got SBOMs and attestations for auditors.
Proof or it didn’t happen: automate evidence
Auditors don’t want screenshots. They want artifacts tied to commits:
- Policy decisions: persist `conftest` results and Access Analyzer outputs as JSON per SHA.
- Scan outputs: store Trivy/Grype SARIF; upload to code scanning; keep a copy in immutable storage (S3 with Object Lock).
- Attestations: use `cosign` to attach in-toto attestations for policy pass, SBOM, and build provenance (see the sketch below).
- Change traceability: link PR, build, image digest, and deployment via GitOps (Argo CD) commit SHAs.
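For the policy-pass attestation, one option is to attach the `conftest` decision itself as a custom predicate; a sketch, assuming the key pair and file names from the earlier CI fragments:

```bash
# attach the policy decision to the image, then verify it before promotion
cosign attest --key cosign.key --type custom --predicate conftest.json ghcr.io/acme/api:${GITHUB_SHA}
cosign verify-attestation --key cosign.pub --type custom ghcr.io/acme/api:${GITHUB_SHA}
```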
Example CI step to archive evidence:
- name: Upload evidence
  run: |
    mkdir -p evidence/${{ github.sha }}
    cp trivy.json conftest.json sbom.spdx.json evidence/${{ github.sha }}/
    aws s3 cp evidence/${{ github.sha }}/ s3://compliance-artifacts/${{ github.repository }}/${{ github.sha }}/ --recursive --sse AES256
Bonus: wire these artifacts to your GRC tool (Drata/Vanta) or a simple internal dashboard. Now “show me rotation evidence for Q3” is a link, not a war room.
Move fast on regulated data without cutting corners
Speed dies when regulated workloads ride the same pipeline as everything else. Use lanes:
- Data-classification labels on repos and namespaces: `data_class=regulated|internal|public` (labeling sketch below).
- Policy routing: regulated lanes require extra checks (softer SLOs), fast lanes keep the lightweight path.
- Preview environments with synthetic or masked data. No PII in ephemeral PR envs.
- Progressive delivery: canary with Argo Rollouts and kill switches via LaunchDarkly.
- Time-bound exceptions: allow break-glass with auto-expiry + Jira ticket + attestation.
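Getting the labels in place doesn't need tooling on day one; a sketch of the two markers, assuming a `payments` namespace and the `.repo-tags` file the routing step below greps:

```bash
# mark the repo so CI can route it to the regulated lane
echo "data_class=regulated" >> .repo-tags
# mark the namespace so admission and network policies can key off the same label
kubectl label namespace payments data_class=regulated --overwrite
```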
Example routing in GitHub Actions:
jobs:
  lane:
    runs-on: ubuntu-latest
    outputs:
      lane: ${{ steps.lane.outputs.lane }}
    steps:
      - uses: actions/checkout@v4
      - name: Determine lane
        id: lane
        run: |
          if grep -q "data_class=regulated" .repo-tags; then echo "lane=regulated" >> "$GITHUB_OUTPUT"; else echo "lane=fast" >> "$GITHUB_OUTPUT"; fi
  regulated:
    needs: lane
    if: needs.lane.outputs.lane == 'regulated'
    uses: ./.github/workflows/pipeline-regulated.yml
  fast:
    needs: lane
    if: needs.lane.outputs.lane == 'fast'
    uses: ./.github/workflows/pipeline-fast.yml
Now compliance is a routing decision, not a team-wide slowdown.
Rollout plan and the metrics that matter
You don’t need a 9‑month program. Ship value in weeks and measure real outcomes.
- Week 1–2: Add OIDC for CI, block Git secrets, and enforce permissions boundaries and wildcard denies in Terraform PRs.
- Week 3–4: Deploy External Secrets Operator; migrate 1 service to dynamic DB creds; start rotation evidence capture.
- Week 5–6: Add SBOM generation, Trivy gates, and cosign signing + admission policy in a non-critical cluster.
- Week 7–8: Route regulated repos to the regulated pipeline; enable progressive delivery.
Track:
- Rotation coverage: % of services on dynamic/short‑lived secrets; median secret TTL.
- Least‑privilege drift: count of IAM policies with wildcards; Access Analyzer findings trend (one way to pull this is sketched below).
- Dependency freshness: median package age; % of images with CRITICAL vulns = 0.
- Lead time impact: delta vs. baseline (should be ±10% after the first month).
- Audit lag: time to produce evidence for a control (target < 5 minutes).
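One way to pull the least-privilege drift number on a schedule, assuming an Access Analyzer already exists in the account (the analyzer ARN is illustrative):

```bash
# count ACTIVE findings; ship the number to whatever dashboard you trend on
aws accessanalyzer list-findings \
  --analyzer-arn arn:aws:access-analyzer:us-east-1:123456789012:analyzer/prod \
  --filter '{"status": {"eq": ["ACTIVE"]}}' \
  --query 'length(findings)'
```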
What changes: fewer 2 a.m. incidents, faster audits, and a provably safer pipeline.
Where GitPlumbers fits
We’ve done this at banks, adtech, and healthcare SaaS. We’ll pair with your team to ship the first lane in two sprints, leave you with reusable modules and policy bundles, and stick around only as long as you need us.
- Terraform IAM modules with boundaries and OPA policies wired to Atlantis.
- Vault/Secrets Manager + External Secrets rollout with migration tooling.
- SBOM + signing + admission policies that don’t brick your cluster on day one.
- Evidence pipelines your auditors actually accept.
If you’re staring at an audit date or a pager full of “leaked token” alerts, we should talk.
Key takeaways
- Translate controls into three layers: guardrails (pre-merge), checks (CI/CD gates), and proofs (attestations/artifacts).
- Least-privilege only sticks when you template IAM roles with permissions boundaries and block wildcards in code review via OPA.
- Rotate secrets by design: use workload identity + dynamic secrets; prove rotation with logs and attestations.
- Treat dependency risk as code: SBOMs, signature verification, and policy-based image and library blocking.
- Balance speed and regulation with “fast lanes vs. regulated lanes,” feature flags, and progressive delivery.
Implementation checklist
- Tag repos and environments by data classification to route them through appropriate pipelines.
- Enforce permissions boundaries and deny wildcards via OPA/Rego on Terraform plans.
- Standardize secret retrieval via External Secrets Operator; forbid static secrets in Git.
- Adopt OIDC federation for CI to cloud; eliminate long-lived keys.
- Generate and verify SBOMs; block unsigned or vulnerable images in admission.
- Capture policy evaluations and rotation logs as immutable artifacts.
- Pilot in one service, then scale via shared modules and reusable policy bundles.
Questions we hear from teams
- What if engineers need broad access for break-glass incidents?
- Allow it—but make it time‑boxed and provable. Implement a break‑glass role with strong MFA, session TTL ≤ 15 minutes, and mandatory Jira reference. Log the assumption event and create an attestation artifact that expires the access automatically.
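What the time-box looks like in practice on AWS; a sketch, assuming a `break-glass-ops` role whose trust policy requires MFA (ARNs and the ticket ID are illustrative):

```bash
# 15-minute session, MFA enforced, session name carries the incident ticket for the audit trail
aws sts assume-role \
  --role-arn arn:aws:iam::123456789012:role/break-glass-ops \
  --role-session-name "INC-4242" \
  --duration-seconds 900 \
  --serial-number arn:aws:iam::123456789012:mfa/oncall \
  --token-code 123456
```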
- Do we need Vault if we’re all-in on AWS?
- Not necessarily. AWS Secrets Manager + IAM roles + External Secrets Operator covers most cases. Vault shines for dynamic database creds, complex rotation workflows, and multi-cloud. Pick one, standardize, and forbid ad-hoc secrets elsewhere.
- Won’t admission policies break production?
- Not if you stage them. Start with `audit`/`dry-run`, then `warn`, then `enforce` for scoped namespaces. Pair with progressive delivery so you can quickly roll back if a policy bites an edge case.
- How do we handle legacy apps that can’t rotate creds?
- Wrap them. Use a sidecar or init container that fetches/refreshes creds and updates the app via file/ENV. Set shorter TTLs and automate restarts on rotation. Plan a modernization path, but don’t wait for it to enforce rotation at the platform layer.
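One common wrapper is the Vault Agent injector, which renders creds into a shared volume and refreshes them without touching the app's code; a sketch of the pod annotations, assuming the `database/creds/app-role` path from earlier and a Vault role named `app`:

```yaml
metadata:
  annotations:
    vault.hashicorp.com/agent-inject: "true"
    vault.hashicorp.com/role: "app"
    # render the dynamic DB creds to /vault/secrets/db and keep them refreshed
    vault.hashicorp.com/agent-inject-secret-db: "database/creds/app-role"
```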
- What about GCP/Azure equivalents?
- Same patterns. Use Workload Identity Federation for CI, Cloud KMS/Secret Manager for secrets, Binary Authorization + Artifact Registry + cosign for images, and Policy Controller (OPA/Gatekeeper) or Azure Policy/Defender for admission and compliance.
Ready to modernize your codebase?
Let GitPlumbers help you transform AI-generated chaos into clean, scalable applications.