2026-04-19 · 12 min read · in-progress

Secure GitOps Pipeline

An ArgoCD + Flux pipeline with SLSA provenance, cosign image verification, Trivy scans, and OPA Gatekeeper policies — fail-closed on every promotion gate.

ArgoCD
Flux
Cosign
Trivy
OPA Gatekeeper
Kyverno
Sigstore

Supply chain: SLSA L3

Promotion gates: 7

Policy rules: 34

Mean repair time: < 5 min

What I wanted

A deploy pipeline where the only way code reaches prod is through a chain of fail-closed, cryptographically verifiable gates. No developer shell. No "just this once" exceptions. If the image isn't signed, if the policy fails, if the SBOM shows an unaddressed critical CVE — the deploy stops and the board lights up red.
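To make "fail-closed" concrete, here is a minimal Python sketch of a gate chain. It is an illustration only, not the pipeline's actual code: the `Gate` type, `run_gates`, and the keys of the `release` dict are all hypothetical names. The point it shows is that a gate which errors or returns anything other than an explicit `True` blocks the deploy instead of being skipped.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Gate:
    name: str
    check: Callable[[dict], bool]  # must return True for an explicit pass


def run_gates(gates: List[Gate], release: dict) -> Tuple[bool, str]:
    """Evaluate gates in order and stop at the first one that does not pass."""
    for gate in gates:
        try:
            passed = gate.check(release) is True
        except Exception as exc:
            # A gate that cannot be evaluated blocks the deploy; it is never skipped.
            return False, f"{gate.name}: error ({exc})"
        if not passed:
            return False, f"{gate.name}: failed"
    return True, "all gates passed"


# Hypothetical checks; the dict keys are placeholders for real gate results.
GATES = [
    Gate("signature", lambda r: r["signed"]),
    Gate("policy", lambda r: r["policy_violations"] == 0),
    Gate("cves", lambda r: r["critical_cves"] == 0),
]
```

With this shape, a release record that is missing the `critical_cves` key is rejected with an error reason, which is the behaviour the "no exceptions" rule asks for.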

The seven gates

  1. Gate 1

    Hermetic build

    Every image is built in a sandboxed runner with no network access. Build provenance is recorded as a SLSA provenance attestation, targeting SLSA L3.

  2. Gate 2

    SBOM + Trivy

    SBOM generated at build time (Syft) and attached as an OCI artifact. Trivy scans for CVEs and license issues; critical findings fail the gate.

  3. Gate 3

    Cosign signing

    Images signed with cosign keyless (OIDC) via Sigstore. The signature covers both the image and the SBOM.

  4. Gate 4

    OPA policy eval

    ArgoCD sync hook runs an OPA Gatekeeper bundle: allow-listed registries, no :latest, no privileged containers, required labels, CPU/memory requests set.

  5. Gate 5

    Kyverno admission

    Kyverno enforces the Pod Security Standards at admission (non-root UID, seccomp profile set) plus additional hardening such as a read-only root filesystem.

  6. Gate 6

    Progressive rollout

    Argo Rollouts does a canary with automated analysis against Prometheus SLOs. Any regression triggers rollback.

  7. Gate 7

    Runtime detection

    Falco rules catch anomalous syscalls post-deploy; critical alerts page on-call and trigger an automatic revert.
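As an illustration of Gate 2's rule that critical findings fail the gate, the sketch below reads a Trivy JSON report and passes only when the report parses and contains no `CRITICAL` entries. The field names (`Results`, `Vulnerabilities`, `Severity`, `VulnerabilityID`) follow Trivy's JSON output as I understand it. The function names are hypothetical, and treating a report with no `Results` key as a failure is a choice made for this sketch, not something the pipeline description states.

```python
import json
from typing import List


def critical_findings(report_json: str) -> List[str]:
    """Return the IDs of CRITICAL vulnerabilities in a Trivy JSON report."""
    report = json.loads(report_json)
    if "Results" not in report:
        # Assumption of this sketch: an empty or truncated report is not a pass.
        raise ValueError("report has no Results section")
    found = []
    for result in report["Results"] or []:
        # Vulnerabilities may be absent or null for a clean target.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") == "CRITICAL":
                found.append(vuln.get("VulnerabilityID", "unknown"))
    return found


def trivy_gate(report_json: str) -> bool:
    """Pass only if the report is readable and has no critical findings."""
    try:
        return not critical_findings(report_json)
    except (ValueError, AttributeError, TypeError):
        return False  # unreadable report: fail closed
```

The license checks mentioned in Gate 2 are left out here; they would be a second pass over the same report.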

How it feels for developers

  • Push to main → image builds → artifacts land in the registry
  • A Renovate-style bot opens a PR against the environment repo
  • The PR runs the full gate suite in preview
  • Reviewer approves → ArgoCD syncs → canary rolls out over 20 minutes
  • If anything goes wrong, auto-rollback fires within 90 seconds

Developers see a single PR status check. Security sees a full chain of custody.
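The canary step in this flow can be pictured as a loop over traffic weights with an SLO check at each one. The sketch below is a simplified model, not the Argo Rollouts configuration: the step weights, the error-rate threshold, and the `canary_decision` function are invented for illustration, and the real analysis would query Prometheus rather than take a list of samples.

```python
from typing import Optional, Sequence, Tuple


def canary_decision(
    steps: Sequence[int],
    error_rates: Sequence[Optional[float]],
    max_error_rate: float,
) -> Tuple[str, int]:
    """Return ("promote", final_weight) or ("rollback", weight_at_failure)."""
    if not steps:
        return "rollback", 0  # nothing was analysed, so nothing is promoted
    for i, weight in enumerate(steps):
        sample = error_rates[i] if i < len(error_rates) else None
        # A missing measurement counts as a regression (fail closed).
        if sample is None or sample > max_error_rate:
            return "rollback", weight
    return "promote", steps[-1]
```

For example, with steps of 10, 25, 50 and 100 percent and a 1% error budget, a 5% error rate at the second step rolls back at weight 25 and the remaining steps never run.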

Policy highlights (OPA)

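package k8s.image

# Reject any container image that uses the mutable :latest tag.
deny[msg] {
  image := input.review.object.spec.containers[_].image
  regex.match(".*:latest$", image)
  msg := "Image tags must be immutable (no :latest)"
}

# Reject any container image that is not pulled from the internal registry.
deny[msg] {
  image := input.review.object.spec.containers[_].image
  not startswith(image, "registry.aibos.internal/")
  msg := "Images must come from the internal registry"
}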
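For checking expectations about these two rules outside a cluster, the same logic can be mirrored in a few lines of Python. This is a hypothetical test helper, not part of the policy bundle: the `deny` function here only imitates the Rego rules, and the sample images are made up. It assumes the admission-review input shape used above (`review.object.spec.containers[].image`).

```python
from typing import List

ALLOWED_PREFIX = "registry.aibos.internal/"


def deny(review_input: dict) -> List[str]:
    """Return the sorted set of violation messages for a pod-shaped object."""
    messages = set()  # Rego's deny is a set, so duplicates collapse
    containers = review_input["review"]["object"]["spec"]["containers"]
    for container in containers:
        image = container["image"]
        if image.endswith(":latest"):
            messages.add("Image tags must be immutable (no :latest)")
        if not image.startswith(ALLOWED_PREFIX):
            messages.add("Images must come from the internal registry")
    return sorted(messages)


def pod(*images: str) -> dict:
    """Build a minimal admission-review input for the given images."""
    return {"review": {"object": {"spec": {
        "containers": [{"image": i} for i in images]}}}}
```

One thing this makes easy to see: an image with no tag at all passes the `:latest` rule as written, even though registries typically resolve it to `latest`, so a stricter rule may be worth adding.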

What's next

  • Finish the Kyverno policy library (34 rules shipped, aiming for 60)
  • Write up the Argo Rollouts + Prometheus SLO analysis template
  • Publish the provenance-viewer UI so non-SREs can audit deploys