Docker Compose CI/CD Patterns
Patterns for using Docker Compose in CI/CD pipelines: separating dev and prod configurations, ECR integration, and deployment strategies.
Our CI/CD pipeline ran docker-compose pull and then docker-compose up -d on the production server. The logs showed success, but the container was running an old image built locally — not the fresh one we’d just pushed to ECR. The culprit? Our docker-compose.yml used build: instead of image:, so pull silently did nothing.
This is one of those mistakes that wastes hours because everything looks correct. This post covers the pattern that prevents it: separating your Docker Compose files into development (build:) and production (image:) configurations, along with CI/CD pipeline strategies for Airflow deployments on EC2.
The Build vs Image Problem
The Issue
The root cause is a fundamental difference in what build: and image: mean to Docker Compose. When a service uses build:, Compose ignores pull entirely — there’s nothing to pull, because the configuration says “build this locally.” When a service uses image:, Compose knows to fetch the specified image from a registry.
# docker-compose.yml
services:
  webserver:
    build:            # ← "Build locally"
      context: ..
      dockerfile: master/Dockerfile

docker-compose pull    # ← Does nothing! No image to pull
docker-compose up -d   # ← Builds locally instead

Analogy: Like telling someone “follow this recipe” (build) when you already cooked the meal and put it in the fridge (ECR).
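One cheap guard against this trap is to check the Compose file itself before trusting pull. The sketch below (file path and service shape are assumptions mirroring the example above) writes a sample build-only Compose file and flags it:

```shell
# Recreate the problematic compose file shape (assumption: matches the example above)
cat > /tmp/docker-compose.yml <<'EOF'
services:
  webserver:
    build:
      context: ..
      dockerfile: master/Dockerfile
EOF

# A service with build: and no image: is exactly what `docker-compose pull` skips
if grep -q 'build:' /tmp/docker-compose.yml && ! grep -q 'image:' /tmp/docker-compose.yml; then
  echo "WARNING: build-only service found; 'docker-compose pull' will be a no-op"
fi
```

A one-line check like this in the deploy script turns the silent no-op into a loud failure.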
The Solution: Separate Files
The fix is straightforward: maintain two separate Compose files. One for local development that builds from source, one for production that pulls pre-built images from ECR.
project/
├── docker-compose.yml        # Local development (build:)
└── docker-compose.prod.yml   # Production (image:)

Local development uses build: so you can iterate on Dockerfile changes without pushing to a registry:
# docker-compose.yml
services:
  webserver:
    build:
      context: ..
      dockerfile: master/Dockerfile

Production uses image: with an ECR registry URL. The ${ECR_REGISTRY} variable is injected by CI/CD at deploy time:
# docker-compose.prod.yml
services:
  webserver:
    image: ${ECR_REGISTRY}/airflow-master:latest   # ← Pull from ECR

With the Compose files separated, the CI/CD pipeline can use the right file for each environment. Here’s the full flow for an Airflow deployment that supports both DAG-only changes (fast, no restart) and image changes (full rebuild and deploy).
CI/CD Pipeline Flow
GitHub Actions (deploy.yml)
├── 1. detect-changes
│   └── Detect whether dags/, master/, or worker/ changed
├── 2a. sync-dags (DAG-only changes)
│   ├── EC2: git pull
│   └── No restart; changes reflected in ~30s
├── 2b. build-images (image changes)
│   ├── GitHub Actions: Docker build
│   └── Push to ECR (airflow-master:latest, airflow-worker:latest)
└── 3. deploy-ec2 (image changes)
    ├── Secrets Manager → .env file
    ├── Add ECR_REGISTRY to .env
    ├── docker-compose.prod.yml pull   ← KEY CHANGE
    └── docker-compose.prod.yml up -d

ECR_REGISTRY Environment Variable
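ECR registry URLs follow a fixed shape, so the variable is just string assembly. A quick sketch with placeholder account ID and region (real values come from CI/CD secrets):

```shell
# ECR registry URLs follow the pattern <account-id>.dkr.ecr.<region>.amazonaws.com
AWS_ACCOUNT_ID="123456789012"   # placeholder
AWS_REGION="us-east-1"          # placeholder
ECR_REGISTRY="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com"
echo "${ECR_REGISTRY}/airflow-master:latest"
# → 123456789012.dkr.ecr.us-east-1.amazonaws.com/airflow-master:latest
```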
The ${ECR_REGISTRY} variable in the production Compose file needs to resolve to the actual ECR URL. CI/CD handles this by appending the registry URL to the .env file on the target server:
# In deploy.yml
echo "ECR_REGISTRY=${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com" >> master/.env

Then docker-compose.prod.yml uses it:
services:
  webserver:
    image: ${ECR_REGISTRY}/airflow-master:latest

Trigger Strategy
An important decision for production deployments: should they run automatically on every push, or require manual approval? We started with automatic triggers and learned the hard way why manual is safer.
Before: Auto + Manual
on:
  push:
    branches: [main]
  workflow_dispatch:

After: Manual Only (Recommended for Prod)
on:
  workflow_dispatch:
    inputs:
      deploy_type:
        description: "Deploy type"
        required: true
        default: "all"
        type: choice
        options:
          - dags
          - images
          - all

Why manual?
- Production deployment should be intentional
- Prevent accidental deployments from main push
- Allow choosing deployment type (DAG only, images only, all)
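With the workflow_dispatch input in place, each job can be gated on the chosen type. A minimal sketch (job names and steps are assumptions, mirroring the pipeline flow above):

```yaml
jobs:
  sync-dags:
    if: ${{ github.event.inputs.deploy_type == 'dags' || github.event.inputs.deploy_type == 'all' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "git pull on EC2 goes here"   # placeholder step

  build-images:
    if: ${{ github.event.inputs.deploy_type == 'images' || github.event.inputs.deploy_type == 'all' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "docker build and push to ECR goes here"   # placeholder step
```

Running the workflow with deploy_type: dags then skips the image jobs entirely, which is what makes the 30-second DAG-only path possible.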
Secrets Manager Integration
The production server needs environment variables (database credentials, API keys, etc.) that should never live in the repository. The CI/CD pipeline fetches them from AWS Secrets Manager and writes them to .env on the target server at deploy time:
# In deploy.yml
aws secretsmanager get-secret-value \
  --secret-id prod/airflow/master \
  --query SecretString --output text |
  jq -r 'to_entries | map("\(.key)=\(.value)") | .[]' > master/.env

Required Secrets
Master:
prod/airflow/master:
├── POSTGRES_HOST, POSTGRES_PORT, POSTGRES_DB
├── POSTGRES_USER, POSTGRES_PASSWORD
├── REDIS_HOST, REDIS_PORT
├── AIRFLOW_ADMIN_USER, AIRFLOW_ADMIN_PASSWORD, AIRFLOW_ADMIN_EMAIL
├── AIRFLOW_SECRET_KEY
├── AWS_DEFAULT_REGION
├── AWS_ACCOUNT_ID ← For DAG ECR image paths
└── GITHUB_PAT ← For git pull

Deployment Scenarios
Let’s walk through the two most common deployment scenarios and how they differ in speed and impact.
Scenario 1: DAG Only Changes
DAG-only changes are the fastest deployment path — a git pull on the EC2 instance, and Airflow picks up the changes within ~30 seconds. No container restart needed.
# 1. Push code
git add dags/my_dag.py
git commit -m "feat: add new DAG"
git push origin main
# 2. GitHub Actions (manual trigger)
# → deploy_type: dags
# 3. Result
# - EC2: git pull
# - No restart
# - Changes reflected in ~30s

Scenario 2: Dockerfile/Requirements Changes
Image changes require the full pipeline: build a new Docker image, push it to ECR, pull it on the server, and restart containers. This takes 1-2 minutes with a brief downtime window.
# 1. Push code
git add master/Dockerfile requirements.txt
git commit -m "feat: add new dependency"
git push origin main
# 2. GitHub Actions (manual trigger)
# → deploy_type: images
# 3. Result
# - GitHub Actions: build image
# - Push to ECR
# - EC2: docker-compose.prod.yml pull
# - Container restart (~1-2min downtime)

Rollback Methods
When a deployment goes wrong, you need to get back to a known-good state fast. The rollback approach depends on what changed.
ECR Image Rollback
For image-related issues, pin the Compose file to a specific image tag (git SHA) instead of :latest:
ssh airflow-master
cd /opt/airflow
# Edit docker-compose.prod.yml: :latest → :abc123 (specific commit SHA)
docker-compose -f master/docker-compose.prod.yml pull
docker-compose -f master/docker-compose.prod.yml up -d

DAG Rollback
ssh airflow-master
cd /opt/airflow
# Rollback specific files
git checkout <commit-sha> -- dags/
# Or full rollback
git reset --hard <commit-sha>

Summary
| File | Purpose | Uses |
|---|---|---|
| docker-compose.yml | Local development | build: directive |
| docker-compose.prod.yml | Production deployment | image: directive |
CI/CD Gotchas
One lesson learned the hard way: floating action tags break builds silently. We used cloudflare/wrangler-action@v3 in a GitHub Actions workflow, and one day builds started failing with “bun not found.” The action had changed its default packageManager from npm to bun — and since ubuntu-latest doesn’t ship with bun, the action failed immediately.
The fix was simple: pin packageManager: npm explicitly. The broader rule: always pin action versions or explicitly set all configurable defaults. A @v3 tag can shift under your feet without a single line of your code changing.
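A hedged illustration of the fix (pinning to a full version or commit SHA is stricter still; the SHA below is a placeholder):

```yaml
# Before: floating tag -- upstream defaults can change without any change in your repo
- uses: cloudflare/wrangler-action@v3

# After: set the configurable default explicitly; optionally pin @<commit-sha>
- uses: cloudflare/wrangler-action@v3
  with:
    packageManager: npm
```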
Practical Takeaways
The build: vs image: distinction is the single most important thing to get right in Docker Compose CI/CD. Everything else follows from this separation:
- Always use separate Compose files for dev and prod. docker-compose.yml with build: for local development, docker-compose.prod.yml with image: for production. Mixing them leads to the silent failure where pull does nothing because the file says “build locally.”
- Inject the ECR registry URL via environment variable. The ECR_REGISTRY pattern keeps your Compose file portable — the same file works for any AWS account or region. CI/CD writes it to .env, and Docker Compose interpolates it automatically.
- Use manual triggers for production deployments. workflow_dispatch with deployment type selection (dags, images, all) prevents accidental deployments from pushes to main. For a system like Airflow, this also lets you deploy DAG changes without rebuilding containers — a 30-second operation instead of a 2-minute one.
- Store secrets in AWS Secrets Manager, not in the repository. The CI/CD pipeline fetches secrets at deploy time and writes them to .env on the target server. This keeps credentials out of git history and makes rotation straightforward.
The pattern in this post scales from a single EC2 instance to multi-node deployments. The key insight remains the same: development builds locally, production pulls pre-built images.