brandonwie.dev

Airflow CI/CD Concepts

Understanding Airflow deployment and CI/CD concepts through a kitchen analogy.

Updated March 22, 2026 · 4 min read

When I first started working with Airflow, the relationship between DAGs, ETL code, Docker images, and deployments was confusing. What gets restarted when? Which changes are hot-reloaded? Why does a DAG change deploy instantly but a requirements.txt change needs a full rebuild? A kitchen analogy helped me understand the architecture, and knowing which changes need restarts versus which are picked up automatically saved me from unnecessary downtime.

DAGs vs ETL: Recipes vs Cooking

The most fundamental distinction in Airflow is between the DAG and the ETL code.

DAG = Recipe Card

A DAG (Directed Acyclic Graph) is like a recipe card. It defines:

  • WHAT to do (which tasks to run)
  • WHEN to do it (schedule: daily, hourly, etc.)
  • IN WHAT ORDER (task A before task B)
# This is a DAG - it's just instructions, not the actual work
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator

with DAG('amplitude_pipeline', schedule='@daily', start_date=datetime(2026, 1, 1)):
    task1 = EmptyOperator(task_id='fetch_amplitude_data')  # Step 1: fetch data from Amplitude
    task2 = EmptyOperator(task_id='transform_data')        # Step 2: transform the data
    task3 = EmptyOperator(task_id='save_to_database')      # Step 3: save to database
    task1 >> task2 >> task3   # Do in this order

ETL = The Actual Cooking

ETL (Extract, Transform, Load) is the actual code that does the work:

  • Extract: Get data (connect to API, download events)
  • Transform: Process data (clean, calculate, join)
  • Load: Save results (to S3, database)
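The three steps above can be sketched as a plain Python script. This is a minimal illustration, not the actual ETL code; the function names, fake events, and in-memory "database" are all stand-ins:

```python
# Minimal ETL sketch: extract -> transform -> load.
# All names and the in-memory "database" are illustrative stand-ins.

def extract():
    """Pretend to download raw events from an analytics API."""
    return [
        {"user": "a", "event": "click", "value": 2},
        {"user": "b", "event": "view", "value": 1},
        {"user": "a", "event": "click", "value": 3},
    ]

def transform(events):
    """Aggregate click values per user, ignoring other event types."""
    totals = {}
    for e in events:
        if e["event"] == "click":
            totals[e["user"]] = totals.get(e["user"], 0) + e["value"]
    return totals

def load(totals, db):
    """Write results to a destination (here, a dict standing in for a database)."""
    db.update(totals)

db = {}
load(transform(extract()), db)
print(db)  # {'a': 5}
```

This whole script is what runs inside the ETL container: it does its work and exits.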

The key insight: the DAG doesn’t process data. It tells the worker “run this container now.” The container runs the ETL code, does the actual work, and exits. The DAG is the orchestrator, not the executor.

What Needs a Restart?

This is the question that confused me most. The answer depends on what changed:

No Restart Needed (90% of cases)

| Change | What happens |
| --- | --- |
| `dags/my_dag.py` (new/modify) | Scheduler auto-detects in ~30 sec |
| ETL code (`arch-etl`) | Next DAG run uses the new container |

DAG file changes are hot-reloaded — the Airflow scheduler continuously scans the DAGs folder and picks up changes within ~30 seconds. ETL code changes are even simpler: since each task run pulls a fresh Docker image, pushing a new image to ECR means the next DAG run automatically uses the updated code.

Restart Required (10% of cases)

| Change | Why restart |
| --- | --- |
| Airflow version | New image = need to restart |
| `requirements.txt` | New Python packages must be baked into the image |
| `Dockerfile` | Image changed = rebuild + restart |
| `.env` file | Environment variables are loaded at container start |

These changes affect the Airflow infrastructure itself (scheduler, webserver, worker containers), not the DAGs or ETL code. Infrastructure changes require a container rebuild and restart.

Deployment Scenarios

Putting it all together:

| Scenario | Action | Restart? | Downtime |
| --- | --- | --- | --- |
| DAG changes | `git pull` on EC2 | No | None (picked up in ~30s) |
| ETL code changes | ECR push | No | None |
| Airflow upgrade | Image rebuild + restart | Yes | ~1-2 min |

The Three Repos Pattern

In our setup, different types of code live in different repositories, each with its own deployment flow:

| Repo | Contains | Deploy How | Restart? |
| --- | --- | --- | --- |
| `arch-airflow` | DAG files | `git pull` to EFS | No |
| `arch-airflow` | Airflow images | ECR + `docker restart` | Yes (rare) |
| `arch-etl` | ETL job code | ECR push | No (auto-pulls latest) |
| `backend-infra` | Infrastructure | Terraform (one-time) | N/A |

The separation matters because DAG files change frequently (daily), ETL code changes moderately (weekly), and infrastructure changes rarely (monthly). Each should deploy independently without affecting the others.

Why This Matters

Understanding the deployment model prevents two common mistakes:

  1. Unnecessary restarts — Restarting Airflow containers for a DAG change causes 1-2 minutes of downtime for no reason. DAG changes are hot-reloaded.
  2. Missing restarts — Updating requirements.txt without rebuilding the Docker image means the new package isn’t available. The DAG fails at runtime with an import error.

The rule is simple: if the change is to a Python file in the DAGs folder or to ETL code in a separate Docker image, no restart. If the change affects the Airflow infrastructure (Docker image, environment variables, Airflow version), restart.
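The rule is simple enough to write down as a tiny helper. This is purely illustrative; the change categories just mirror the tables above:

```python
# Toy decision helper mirroring the rule: infrastructure changes need a
# restart, DAG/ETL changes do not. Categories are illustrative, not exhaustive.
INFRA_CHANGES = {"airflow_version", "requirements.txt", "Dockerfile", ".env"}
HOT_RELOADED = {"dag_file", "etl_code"}

def needs_restart(change: str) -> bool:
    """Return True if the change requires rebuilding/restarting Airflow containers."""
    if change in INFRA_CHANGES:
        return True
    if change in HOT_RELOADED:
        return False
    raise ValueError(f"Unknown change type: {change}")

print(needs_restart("dag_file"))          # False - scheduler hot-reloads it
print(needs_restart("requirements.txt"))  # True - must be baked into the image
```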

Key Takeaways

  1. DAG = Recipe (what/when/order), ETL = Cooking (actual work)
  2. DAGs don’t touch data — they tell the worker “run this container now”
  3. Most deployments don’t need a restart — DAG files are hot-reloaded, and new ETL images are pulled on the next run
  4. Only restart for Airflow image changes (version upgrade, new packages)
  5. ETL containers are ephemeral — they run, do work, exit, and get deleted

Takeaway

Airflow’s deployment model has two independent paths: DAG files (hot-reloaded, no restart) and infrastructure (requires restart). Knowing the boundary between them — DAGs and ETL code versus Docker images and environment variables — eliminates both unnecessary downtime and missed restarts. When in doubt, check: does this change affect the Airflow containers themselves, or just the instructions they follow?
