Stop Hand-Waving Compliance: Codify Least-Privilege, Secret Rotation, and Dependency Risk — and Keep Shipping
Turn security policy into Terraform, OPA, and pipeline guardrails that enforce least-privilege, rotate secrets, and prove dependency hygiene — without grinding delivery to a halt.
“Compliance isn’t a meeting; it’s code that fails your PR before an auditor does.”
The audit that almost froze our deploys
Two quarters ago, a client in fintech failed a spot check when an auditor asked for proof (not a promise) that production IAM policies were least-privilege, secrets were rotated, and third-party libraries were vetted. They had Jira tickets and Confluence pages, but a Terraform module with iam:* had slipped in during a crunch, a database password was 380 days old, and their SBOMs were “coming soon.”
I've seen this movie. The fixes were familiar: translate policy into code, move checks left, and emit tamper-evident proofs. We tightened posture and still shipped twice a week. Here’s the playbook.
Translate policy into guardrails, checks, and proofs
Regulated orgs love controls; engineers need feedback loops. Map each policy into three artifacts:
- Guardrails (preventive): Defaults and org-wide denies that make the easy path the safe path (Terraform modules, SCPs, Kyverno/Gatekeeper).
- Checks (detective): CI steps that evaluate changes before merge (Conftest/OPA, Checkov, Trivy, SCA). Fast feedback, clear diffs.
- Proofs (evidence): Signed attestations, decision logs, and SBOMs archived in WORM storage. Auditors want artifacts, not vibes.
Do this and you balance regulated-data constraints with delivery speed. The trick isn’t more process — it’s more automation.
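To make "proofs" concrete, here is a minimal sketch of a tamper-evident evidence manifest: hash every artifact, then hash the manifest itself so any later edit is detectable. The function names and artifact names are hypothetical, and a real pipeline would sign the resulting digest rather than just record it.

```python
import hashlib
import json


def evidence_manifest(artifacts):
    """Map artifact name -> SHA-256 hex digest of its bytes, in sorted name order."""
    return {
        name: hashlib.sha256(data).hexdigest()
        for name, data in sorted(artifacts.items())
    }


def manifest_digest(manifest):
    """Digest of the manifest itself, over canonical JSON so it is reproducible."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical artifacts; in CI these would be the real file contents.
    manifest = evidence_manifest({
        "sbom.json": b"{}",
        "policy-report.json": b"[]",
    })
    print(json.dumps(manifest, indent=2))
    print("manifest digest:", manifest_digest(manifest))
```

Storing the manifest digest alongside the release tag gives an auditor one value to compare against the archived files.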
Least-privilege as code: IAM, RBAC, and org guardrails
Stop trusting humans (or AI) to handcraft perfect policies. I've cleaned up too many AI-generated IAM snippets that quietly grant "*" because “the build was failing.” Codify least-privilege where drift can’t hide.
- Org-level guardrails (AWS example):
  - Use Control Tower and SCPs to block obviously-bad actions org-wide (e.g., stop disabling CloudTrail, enforce KMS, deny public S3 ACLs).
  - Require TLS and same-org access in S3 bucket policies.
# Terraform v1.9: S3 bucket policy denying insecure transport and non-org principals
resource "aws_s3_bucket_policy" "this" {
  bucket = aws_s3_bucket.this.id
  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Sid       = "DenyInsecureTransport",
        Effect    = "Deny",
        Principal = "*",
        Action    = "s3:*",
        Resource = [
          aws_s3_bucket.this.arn,
          "${aws_s3_bucket.this.arn}/*"
        ],
        Condition = { Bool = { "aws:SecureTransport" = "false" } }
      },
      {
        Sid       = "DenyNonOrg",
        Effect    = "Deny",
        Principal = "*",
        Action    = "s3:*",
        Resource = [
          aws_s3_bucket.this.arn,
          "${aws_s3_bucket.this.arn}/*"
        ],
        Condition = { StringNotEquals = { "aws:PrincipalOrgID" = "o-abc123" } }
      }
    ]
  })
}
- Reusable IAM modules: Publish Terraform modules that only expose approved actions and resource ARNs. Don’t expose raw JSON. Break-glass requires a ticket and expires automatically.
- Kubernetes RBAC guardrails: Prevent wildcard verbs in ClusterRole definitions with Kyverno or Gatekeeper.
# Kyverno: disallow wildcard verbs in ClusterRoles
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: no-wildcard-clusterroles
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: disallow-wildcard-verbs
      match:
        any:
          - resources:
              kinds: [ClusterRole]
      validate:
        message: "ClusterRole cannot use wildcard verbs"
        pattern:
          rules:
            - verbs:
                - "!*"
- Plan-time checks (OPA/Conftest): Fail PRs that add iam:* or s3:* without a documented exception.
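# OPA/Rego (conftest) snippet against Terraform plan JSON
package terraform.iam

blocked := {"iam:*", "s3:*"}

deny[msg] {
  rc := input.resource_changes[_]
  rc.type == "aws_iam_policy"
  # In plan JSON the policy document is a JSON-encoded string, so decode it first
  doc := json.unmarshal(rc.change.after.policy)
  stmt := doc.Statement[_]
  stmt.Effect == "Allow"
  a := actions(stmt)[_]
  blocked[a]
  msg := sprintf("IAM policy %s allows wildcard action %s", [rc.name, a])
}

# Action may be a single string or a list; normalize to a list
actions(stmt) = [stmt.Action] { is_string(stmt.Action) }
actions(stmt) = stmt.Action { is_array(stmt.Action) }
Point is: least-privilege isn’t a meeting. It’s a module, a policy, and a failing check when someone tries to YOLO a wildcard.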
Secret rotation you don’t have to remember
If your CI still uses long-lived cloud keys, you’re one leaked repo away from a very bad day. Use OIDC to mint short-lived creds at build time and rotate everything else on a schedule. For databases, prefer dynamic credentials.
- CI to cloud via OIDC (GitHub → AWS):
# .github/workflows/deploy.yml
name: deploy
# Deploy only from main, matching the role's trust policy
on:
  push:
    branches: [main]
permissions:
  id-token: write
  contents: read
jobs:
  tf-apply:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/gha-deploy
          aws-region: us-east-1
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init -input=false
      - run: terraform apply -auto-approve
// IAM trust policy for the role (restrict to main branch)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
          "token.actions.githubusercontent.com:sub": "repo:org/repo:ref:refs/heads/main"
        }
      }
    }
  ]
}
- Rotate app secrets automatically:
  - AWS Secrets Manager rotation:
aws secretsmanager rotate-secret \
  --secret-id prod/db \
  --rotation-lambda-arn arn:aws:lambda:us-east-1:123456789012:function:db-rotate \
  --rotation-rules AutomaticallyAfterDays=7
  - HashiCorp Vault dynamic DB creds (short-lived):
vault secrets enable database
vault write database/config/appdb \
  plugin_name=postgresql-database-plugin \
  allowed_roles=app-ro \
  connection_url="postgresql://{{username}}:{{password}}@db.internal:5432/postgres?sslmode=verify-full" \
  username="vault" password="s3cr3t"
vault write database/roles/app-ro \
  db_name=appdb \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  default_ttl=1h max_ttl=24h
Pair with a Vault Agent or sidecar to inject creds at runtime. If a pod or runner is compromised, the blast radius is a TTL, not your quarter.
- Regulated-data angle: Use self-hosted ephemeral runners, VPC/private networking, and prevent secret material from leaving your perimeter. GitHub OIDC plus a short-lived STS token means no static keys in source history — ever.
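Rotation schedules only help if something notices when they stop working, like that 380-day-old database password. Here is a minimal sketch of a staleness check, assuming you can export an inventory of secret names and last-rotated timestamps (Secrets Manager reports a LastRotatedDate, for example). `SecretRecord` and `stale_secrets` are hypothetical names, not part of any SDK.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class SecretRecord:
    """Hypothetical inventory row; not a real SDK type."""
    name: str
    last_rotated: Optional[datetime]  # None means the secret was never rotated


def stale_secrets(records, max_age_days, now=None):
    """Return sorted names of secrets older than max_age_days or never rotated."""
    now = now or datetime.now(timezone.utc)
    limit = timedelta(days=max_age_days)
    stale = [
        rec.name
        for rec in records
        if rec.last_rotated is None or now - rec.last_rotated > limit
    ]
    return sorted(stale)


if __name__ == "__main__":
    now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    inventory = [
        SecretRecord("prod/db", now - timedelta(days=380)),
        SecretRecord("prod/api", now - timedelta(days=3)),
    ]
    # Fail the job if anything is past the 7-day rotation window.
    offenders = stale_secrets(inventory, max_age_days=7, now=now)
    raise SystemExit(1 if offenders else 0)
```

Run it on a schedule and archive the output; a clean run is evidence, and a failing run is a ticket.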
Dependency risk as code: SBOMs, SCA, and attestations
You can’t ship to banks or healthcare without proving you know what’s in your software. “We run npm audit sometimes” won’t cut it. Bake it in.
- Automate updates: Renovate/Dependabot with rules.
// renovate.json
{
  "extends": ["config:recommended"],
  "semanticCommits": "enabled",
  "packageRules": [
    { "matchUpdateTypes": ["minor", "patch"], "automerge": true },
    { "matchDepTypes": ["dependencies"], "minimumReleaseAge": "3 days" }
  ]
}
- Gate on CVEs: Use Snyk/Trivy/OSV to fail builds above a threshold and within SLA windows.
trivy fs --scanners vuln,secret --exit-code 1 --severity CRITICAL,HIGH .
- Generate SBOMs and sign artifacts:
# SBOM (CycloneDX) and attestation with Sigstore Cosign
syft dir:. -o cyclonedx-json > sbom.json
# Cosign 2.x signs keyless by default; --yes skips the interactive prompt in CI
cosign sign --yes ghcr.io/org/app:1.2.3
cosign attest --yes --type cyclonedx --predicate sbom.json ghcr.io/org/app:1.2.3
- Provenance (SLSA/in-toto): Adopt the SLSA GitHub generator to emit provenance attestations tied to the workflow run. Store them with the image in your registry.
Now your PR shows: “This release has an SBOM, no CRITICALs, and signed provenance.” That’s evidence a risk committee will actually believe.
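The "within SLA windows" part of the CVE gate is the piece scanners don't give you for free. Here is a minimal sketch, assuming your pipeline records when each finding was first seen; the `first_seen` field, the `SLA_DAYS` values, and the finding IDs are all hypothetical and should come from your own policy and tooling.

```python
from datetime import date

# Hypothetical fix windows in days per severity; tune to your own policy.
SLA_DAYS = {"CRITICAL": 7, "HIGH": 30, "MEDIUM": 90}


def sla_breaches(findings, today, sla_days=SLA_DAYS):
    """Return IDs of findings that have been open longer than their SLA window.

    Each finding is a dict with 'id', 'severity', and 'first_seen' (a date).
    Severities without an SLA entry are ignored by this gate.
    """
    breaches = []
    for finding in findings:
        window = sla_days.get(finding["severity"].upper())
        if window is None:
            continue
        if (today - finding["first_seen"]).days > window:
            breaches.append(finding["id"])
    return breaches


if __name__ == "__main__":
    findings = [
        {"id": "EXAMPLE-1", "severity": "CRITICAL", "first_seen": date(2025, 1, 1)},
        {"id": "EXAMPLE-2", "severity": "HIGH", "first_seen": date(2025, 1, 1)},
    ]
    overdue = sla_breaches(findings, today=date(2025, 1, 10))
    print("SLA breaches:", overdue)
    raise SystemExit(1 if overdue else 0)
```

This lets a new HIGH finding warn for its fix window instead of blocking every deploy on day one, while anything overdue still fails the build.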
Automated proofs and time‑boxed exceptions
Auditors aren’t impressed by Slack threads. They want artifacts. Emit proof automatically and make exceptions explicit, approved, and expiring.
- Decision logs as artifacts:
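terraform plan -out=plan.out
terraform show -json plan.out > tfplan.json
mkdir -p artifacts
# Conftest only evaluates the "main" package by default; include terraform.iam and friends
conftest test tfplan.json -p policy/terraform --all-namespaces --output json > artifacts/policy-report.json
Archive to immutable storage: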
# S3 evidence bucket with Object Lock (WORM)
resource "aws_kms_key" "evidence" { description = "Evidence KMS" }

resource "aws_s3_bucket" "evidence" {
  bucket              = "gp-evidence-prod"
  object_lock_enabled = true
}

resource "aws_s3_bucket_versioning" "evidence" {
  bucket = aws_s3_bucket.evidence.id
  versioning_configuration { status = "Enabled" }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "evidence" {
  bucket = aws_s3_bucket.evidence.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.evidence.arn
    }
  }
}
- Waivers as code: Store exceptions next to code, require a ticket, owner, and expiry. Policies read this and allow a temporary pass with a warning.
# .policy-exceptions.yaml
exceptions:
  - id: DEP-2024-001
    rule: deps.cvss_max
    expires: 2025-02-01
    ticket: SEC-1234
    owner: platform@company.com
    justification: Vendor patch scheduled in Jan release
- Regulated-data reality: Evidence mustn’t include secrets or PII. Redact and sign logs; keep them in a segregated account/project with least-privileged access for audit roles only.
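The waiver logic described above (temporary pass with a warning, hard fail once expired) can be sketched in a few lines. This assumes the exceptions file has already been parsed into dicts with `expires` as a date, and that the expiry day itself still counts as covered; `evaluate_violation` and `REQUIRED_FIELDS` are hypothetical names for illustration.

```python
from datetime import date

# A waiver missing any of these is ignored: no ticket, no waiver.
REQUIRED_FIELDS = ("id", "rule", "expires", "ticket", "owner")


def evaluate_violation(rule, exceptions, today):
    """Return 'warn' if a complete, unexpired waiver covers the rule, else 'fail'."""
    for exc in exceptions:
        if exc.get("rule") != rule:
            continue
        if any(not exc.get(field) for field in REQUIRED_FIELDS):
            continue  # incomplete waivers never downgrade a failure
        if today <= exc["expires"]:
            return "warn"
    return "fail"


if __name__ == "__main__":
    waivers = [{
        "id": "DEP-2024-001",
        "rule": "deps.cvss_max",
        "expires": date(2025, 2, 1),
        "ticket": "SEC-1234",
        "owner": "platform@company.com",
    }]
    print(evaluate_violation("deps.cvss_max", waivers, date(2025, 1, 15)))
    print(evaluate_violation("deps.cvss_max", waivers, date(2025, 2, 2)))
```

Because expiry is checked at evaluation time, nobody has to remember to delete the waiver; the build simply starts failing again.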
Roll it out without stalling delivery
You can ship this in weeks, not quarters. The pattern we use at GitPlumbers:
- Pick your engine: OPA/Gatekeeper or Kyverno for K8s; Conftest/Checkov/Trivy for IaC; Sentinel if you’re deep in Terraform Cloud. Standardize.
- Start with the painful ten: Least-privilege wildcards, public buckets, long-lived creds, missing SBOM, CRITICAL CVEs, unsigned images, unencrypted storage, missing logs, insecure transport, and no provenance.
- Guardrails first: Ship Terraform modules and cluster policies. Make the paved road safer than the side road.
- Checks left of merge: CI jobs that run in <2 minutes with clear reasons to fix. Soft-fail for a sprint if you must, then enforce.
- Proofs by default: SBOM + provenance + policy report attached to every release artifact, archived to WORM.
- Exceptions with timers: Waivers in repo with auto-expiry and weekly reports to owners. No ticket, no waiver.
- Measure and iterate: Track policy hit rate, deploy time delta, exception volume/age, and MTTR for compliance fixes.
I've seen this fail when teams try to write 300 controls up front and wedge them into every repo. Focus on a small core, prove it adds minutes not hours, then expand.
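Two of the metrics above are cheap to compute from data you already have. Here is a rough sketch, assuming each CI run is summarized as a dict with a `failures` count and each waiver carries an `opened` date; both shapes are hypothetical and would need mapping from your own reports.

```python
from datetime import date


def policy_hit_rate(runs):
    """Fraction of CI runs where at least one policy check failed."""
    if not runs:
        return 0.0
    hits = sum(1 for run in runs if run["failures"] > 0)
    return hits / len(runs)


def open_exception_ages(exceptions, today):
    """Days each unexpired waiver has been open, keyed by waiver id."""
    return {
        exc["id"]: (today - exc["opened"]).days
        for exc in exceptions
        if exc["expires"] >= today
    }


if __name__ == "__main__":
    runs = [{"failures": 0}, {"failures": 2}, {"failures": 0}, {"failures": 1}]
    waivers = [
        {"id": "DEP-2024-001", "opened": date(2024, 12, 1), "expires": date(2025, 2, 1)},
    ]
    print("hit rate:", policy_hit_rate(runs))
    print("waiver ages:", open_exception_ages(waivers, date(2025, 1, 1)))
```

A rising hit rate on one rule usually means the guardrail is missing, not that engineers are careless; old waivers point at work that never got scheduled.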
What we’d do again (and what we wouldn’t)
- Would do: Push OIDC on day one; remove every static cloud key from CI. Lock down modules. Generate SBOMs and sign artifacts on the first pipeline pass.
- Wouldn’t: Let AI write IAM by itself. It hallucinates resources and grants wildcards. Treat LLM output as a junior engineer’s draft; review with policy checks.
- Would do: Separate duties in code: app teams own their Terraform stacks; platform owns org guardrails and policy repos.
- Wouldn’t: Bury evidence in someone’s laptop. Centralize with immutability and access logs.
Key takeaways
- Write controls as code: guardrails in infra, checks in CI, proofs as signed artifacts.
- Enforce least-privilege with Terraform modules, OPA/Kyverno policies, and org-level SCPs.
- Rotate secrets via OIDC and short-lived credentials; stop storing long-lived keys.
- Make dependency risk provable with SBOMs, SCA gates, and signed attestations.
- Balance speed with waivers: time-boxed exceptions, visible in PRs, auto-expiring.
Implementation checklist
- Pick a policy engine (OPA/Gatekeeper, Kyverno, or Sentinel) and standardize.
- Wrap IAM/RBAC in reusable Terraform/Kustomize modules with deny-by-default.
- Enable OIDC for CI to assume roles; remove static cloud creds from repos.
- Introduce Vault/Secrets Manager rotation and dynamic DB creds.
- Adopt Renovate/Dependabot; gate on CVSS and fix windows.
- Generate SBOMs (Syft) and sign/attest artifacts (Cosign/Sigstore).
- Archive decision logs and evidence to WORM storage (S3 Object Lock).
- Track hit rate, deploy time impact, and exception volume; iterate.
Questions we hear from teams
- Will this slow our team down?
- Done right, you add 60–120 seconds to CI for checks and proofs, while removing days of manual audit prep and security back-and-forth. We’ve reduced change lead time by 20–30% at clients by eliminating ping-pong with security reviewers.
- Do we need to re-platform to adopt policy-as-code?
- No. Start with CI checks (Conftest, Checkov, Trivy) and Terraform modules in your existing setup. Add Kyverno/Gatekeeper for Kubernetes clusters you already run. You can layer OIDC and Vault/Secrets Manager without touching app code beyond config.
- How do we handle legitimate exceptions?
- Treat exceptions as code with owner, ticket, and expiry. Policies read the waiver file and emit warnings, not hard fails. Weekly reports nag owners. No open-ended waivers, no undocumented Slack approvals.
- What about AI-generated infrastructure code?
- Use LLMs to draft, not decide. Policy checks catch hallucinated wildcards and insecure defaults. We’ve caught several `iam:*` and public S3 policies from AI output within hours thanks to OPA/Checkov gates.
