DevOps

Kubernetes CI/CD Deployment Engine

2×/month20×/monthdeploy frequency

The team was doing 2 deployments per month because every deploy was a manual, anxiety-inducing event. Rollbacks required redeploying old images and took 30+ minutes. Main branch broke 8 times a month.

KubernetesGitHub ActionsGo

What Was Broken

  • 2 deployments per month — fear-driven infrequency
  • 30+ minute rollbacks requiring manual kubectl commands
  • Main branch breaking 8+ times per month from unreviewed pushes
  • No audit trail — impossible to know what was deployed when
// required fix
  • Branch protection and PR-gated CI checks on every commit
  • Optimized Docker builds with layer caching
  • Parallel test suite under 5 minutes total
  • ArgoCD-based GitOps with automated rollback on health check failure
  • Zero-downtime Blue/Green deployments on EKS

How It Was Built

Built end-to-end: branch protection → optimized Docker builds → parallel tests → ArgoCD GitOps → Argo Rollouts Blue/Green with Prometheus-gated promotion.

Branch Protection → Build → Test → Deploy
  • Full pipeline: protected main, multi-stage Docker build (920MB → 118MB), parallel test matrix (22min → 4.
  • 📄 .github/workflows/ci-cd.yml

Branch Protection → Build → Test → Deploy

Full pipeline: protected main, multi-stage Docker build (920MB → 118MB), parallel test matrix (22min → 4.5min), ArgoCD auto-sync, Blue/Green rollout with Prometheus analysis gates.

.github/workflows/ci-cd.yml
yaml
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: docker/build-push-action@v5
        with:
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
          cache-from: type=registry,ref=ghcr.io/${{ github.repository }}:cache
  update-manifest:
    needs: build
    run: |
      yq e '.spec.template.spec.containers[0].image = "$IMAGE"' \
        -i k8s-config/apps/production/deployment.yaml
      git push  # ArgoCD picks this up automatically

What Changed

Deploy frequency: 2×/month → 20×/month. Rollback time: 30 minutes → 30 seconds. Zero 503s during any deployment since implementation.

Deploy frequency
2×/month
0
10× increase
Rollback time
30 min (manual)
0
60× faster
Build time
14 min
0
8× faster
Test wall time
22 min
0
5× faster
"Deployment became a non-event. The team ships 10× more frequently without scheduling maintenance windows or fearing rollbacks."