Running a container doesn’t mean your app is running fine. It might look like everything’s green from the outside… but inside? Your app could be frozen, stuck, or completely dead 🧊💀
Welcome to the world of Docker HEALTHCHECK, a super underrated feature that can make or break your reliability game. Today we’ll dive into:
✅ Why HEALTHCHECK is essential
⚠️ Real risks of skipping it
⚙️ How Docker HEALTHCHECK Works
🛠️ How to add it to your Dockerfiles
👀 Two practical test cases (healthy vs unhealthy)
❗ Why You Should Care
Let’s be honest. We often celebrate when our container is “up and running” — but that just means the process inside hasn’t crashed. It doesn’t tell us if:
- The web server is responding 🕸️
- The database is reachable 📉
- Your app logic is frozen in a loop 🔁
Without a healthcheck, Docker only knows whether the container’s main process is still running. That’s dangerous in production, but also in dev: it gives you a false sense of security.
Healthchecks add real visibility: if your app doesn’t behave as expected, Docker will mark it as unhealthy, and Docker Swarm can act on that by replacing the task. (Kubernetes ignores the Dockerfile HEALTHCHECK and relies on its own liveness and readiness probes.)
⚙️ How Docker HEALTHCHECK Works: Under the Hood
When you add a HEALTHCHECK instruction in your Dockerfile, you’re telling the Docker engine to periodically run a command inside the container to determine its health status. Here’s how it works step by step:
🧱 1. The HEALTHCHECK Instruction
Example:
HEALTHCHECK --interval=10s --timeout=3s --retries=3 \
CMD curl -f http://localhost:5000/health || exit 1
You’re defining:
Option | Meaning |
---|---|
CMD | The actual command to run inside the container. It must exit with 0 for healthy or 1 for unhealthy (exit code 2 is reserved). |
--interval | How often to run the health check (default: 30s). |
--timeout | How long to wait before the command is considered failed (default: 30s). |
--retries | Number of consecutive failures before the container is marked unhealthy (default: 3). |
--start-period | Grace period after startup during which failed checks don’t count toward --retries (default: 0s). |
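Taken together, these options bound how long a broken app can go unnoticed. As a rough worked example, assume each check starts one interval after the previous one finishes and every failing check runs for the full timeout; the worst case is then retries × (interval + timeout). The function name below is made up for illustration:

```python
def worst_case_detection_seconds(interval: float, timeout: float, retries: int) -> float:
    """Upper bound on the time from 'app breaks' to 'container marked unhealthy'.

    Assumes the app breaks right after a successful check, each new check
    starts `interval` seconds after the previous one finishes, and every
    failing check runs for the full `timeout`. The start period is ignored.
    """
    return retries * (interval + timeout)


# Settings from the example above: --interval=10s --timeout=3s --retries=3
example = worst_case_detection_seconds(interval=10, timeout=3, retries=3)

# Docker's defaults: 30s interval, 30s timeout, 3 retries
defaults = worst_case_detection_seconds(interval=30, timeout=30, retries=3)
```

With the example settings that comes to 39 seconds; with the defaults it is 180 seconds, which is worth knowing before relying on the defaults in production.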
🧠 2. Docker Monitors Using a Background Healthcheck Manager
When you start a container that has a HEALTHCHECK, Docker spawns a lightweight internal timer per container. This timer schedules and executes the CMD at the interval you define.
It’s all handled by the Docker daemon, which adds a health state entry to the container’s metadata.
🧪 3. Exit Codes Determine Health
Docker executes the healthcheck command inside the container, and uses its exit code to decide the result:
Exit Code | Meaning |
---|---|
0 | Healthy ✅ |
1 | Unhealthy ❌ |
2 | Reserved: don’t use this exit code |
CMD not found or fails to run? Still counts as unhealthy. So does a check that runs longer than --timeout.
Docker tracks the consecutive failures, and once the retry limit is reached, the container is marked as unhealthy.
🔄 4. Status Stored in Container Metadata
You can view this with:
docker inspect --format='{{json .State.Health}}' [container_name] | jq
It shows:
- Status: starting, healthy, or unhealthy
- FailingStreak: how many times it failed consecutively
- Log: recent healthcheck attempts with timestamps and outputs
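If you want to consume that JSON from a script rather than read it through jq, the three fields are easy to condense. This is a minimal sketch that only relies on the Status, FailingStreak and Log keys described above; the function name and the shape of the returned dictionary are assumptions for illustration:

```python
import json


def summarize_health(raw_json):
    """Condense the JSON printed for .State.Health into a small dict."""
    # A container without a HEALTHCHECK prints "null", hence the fallback.
    health = json.loads(raw_json) or {}
    log = health.get("Log") or []
    last = log[-1] if log else {}
    # Keep only the final line of the last check's output (usually the error).
    lines = (last.get("Output") or "").strip().splitlines()
    return {
        "status": health.get("Status"),
        "failing_streak": health.get("FailingStreak", 0),
        "last_exit_code": last.get("ExitCode"),
        "last_output_line": lines[-1] if lines else "",
    }
```

You could feed it the output of the docker inspect command shown above, for example after capturing it with subprocess.run.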
Docker updates this metadata after every check, and you can consume it via:
- CLI (docker ps, docker inspect)
- Docker Engine API (/containers/{id}/json)
- Orchestration tools (Docker Swarm reads it directly; Kubernetes uses its own probes instead)
🪄 5. No Magic, Just Smart Logic
Docker doesn’t inject anything magical into your container. It simply:
- Executes the given command using the container’s existing binaries (like curl, wget, etc.)
- Waits for the result
- Updates internal health state
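Because the check can only use what is already in the image, slim images often need curl installed just for this purpose. For Python images, one alternative is a small standard-library probe. This is a minimal sketch; the function name probe and the idea of saving it as healthcheck.py are assumptions for illustration:

```python
import urllib.error
import urllib.request


def probe(url, timeout=3.0, opener=urllib.request.urlopen):
    """Return 0 when `url` answers with a 2xx status, otherwise 1."""
    try:
        with opener(url, timeout=timeout) as response:
            return 0 if 200 <= response.status < 300 else 1
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors such as 500, refused connections, timeouts and
        # malformed URLs all count as "unhealthy".
        return 1


# To use this as a health check, end the script with a line such as:
#     raise SystemExit(probe("http://127.0.0.1:5000/health"))
```

The function deliberately returns only 0 or 1, matching the exit codes Docker expects. The opener parameter exists only so the function can be exercised without a running server.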
But this tiny mechanism becomes powerful when combined with:
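- Restart policies (--restart=on-failure), once something actually stops the unhealthy container: the policy reacts to the container exiting, not to its health status
- Health-based load balancers (Swarm, Traefik)
- Alerting systems (via Docker events or logs)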
💡 A Note About “Starting”
After the container boots, the first check runs one interval later. The start period (--start-period, default 0s) is a grace window in which failed checks don’t count toward the retry limit. Until the first check succeeds, the container status shows as:
"Status": "starting"
Once the first successful check is done, status becomes healthy. If it fails N times in a row (N being the --retries value), it becomes unhealthy.
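These transitions can be modeled in a few lines. This is a simplified sketch, not Docker's implementation: it expresses the start period as a number of checks rather than a duration, and the function name is made up for illustration.

```python
def health_status(exit_codes, retries=3, start_period_checks=0):
    """Return the status reported after each check in `exit_codes`."""
    status, streak, history = "starting", 0, []
    for index, exit_code in enumerate(exit_codes):
        if exit_code == 0:
            # Any success resets the streak and marks the container healthy.
            status, streak = "healthy", 0
        elif status == "starting" and index < start_period_checks:
            # Failures inside the start period are ignored until the
            # container has passed its first check.
            pass
        else:
            streak += 1
            if streak >= retries:
                status = "unhealthy"
        history.append(status)
    return history


# Two failures, a success, then three failures in a row.
timeline = health_status([1, 1, 0, 1, 1, 1])
```

Here timeline ends in unhealthy only after the third consecutive failure; the two early failures are forgotten as soon as a check succeeds.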
🚫 What Healthchecks DON’T Do
- ❌ They do not stop or restart containers by themselves
- ❌ They don’t directly affect container networking or DNS
- ❌ They don’t send alerts unless you wire them to an external system
🔧 Adding a HEALTHCHECK to Your Dockerfile
It’s simple! Here’s the syntax:
HEALTHCHECK --interval=10s --timeout=3s --retries=3 CMD curl -f http://localhost:5000/health || exit 1
This checks every 10 seconds whether the /health endpoint returns a success. If it fails 3 times in a row, the container becomes unhealthy.
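What the /health endpoint verifies is up to the application. A check that only proves the web server answers will miss an unreachable database, so it can help to aggregate a few dependency checks into one status code. The sketch below is one possible shape; run_checks and tcp_reachable are hypothetical helpers, not part of Flask or Docker:

```python
import socket


def tcp_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(checks):
    """Run named zero-argument checks; return (results, http_status)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that crashes is treated the same as one that fails.
            results[name] = False
    status = 200 if all(results.values()) else 503
    return results, status
```

A /health view could then return the result of run_checks with real checks, for instance one that calls tcp_reachable against your database host. Since curl -f fails on any status of 400 or above, a 503 is enough for Docker to count the check as failed.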
🧪 Let’s Test It in Action
We’ll create two test containers:
✅ Healthy App
This one includes a proper /health endpoint that always returns 200 OK.
Dockerfile:
FROM python:3.11-slim
ENV DEBIAN_FRONTEND=noninteractive
WORKDIR /app
COPY app.py .
# Install curl
RUN apt-get update && \
apt-get install -y curl && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
RUN pip install flask
EXPOSE 5000
HEALTHCHECK --interval=10s CMD curl -f http://127.0.0.1:5000/health || exit 1
CMD ["python", "app.py"]
app.py:
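from flask import Flask

app = Flask(__name__)

@app.route('/')
def home():
    return "All good!"

@app.route('/health')
def health():
    # Always report success so Docker marks the container healthy.
    return "OK", 200

if __name__ == "__main__":
    # Listen on all interfaces so the port is reachable from outside the container.
    app.run(host="0.0.0.0", port=5000)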
👉 Build and run:
docker build -t healthy-app .
docker run -d --name healthtest healthy-app
docker inspect --format='{{.State.Health.Status}}' healthtest
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
55e9b6f148a9 healthcheck_test "python app.py" 18 seconds ago Up 18 seconds (healthy) 0.0.0.0:5000->5000/tcp, [::]:5000->5000/tcp healthtest
🎉 You’ll get: healthy
❌ Unhealthy App
Now let’s break the /health endpoint.
Modified app.py:
@app.route('/health')
def health():
    # Simulate a broken service: curl -f treats the 500 as a failure.
    return "Error", 500
Build and run again:
docker build -t unhealthy-app .
docker run -d --name broken unhealthy-app
docker inspect --format='{{.State.Health.Status}}' broken
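💥 Result: unhealthy (after three consecutive failed checks, so roughly 30 seconds after startup; until then the status is starting)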
You’ll also see the logs showing failed healthcheck attempts:
$ docker inspect broken | jq '.[].State.Health.Log'
[
{
"Start": "2025-06-05T19:26:07.323262742+02:00",
"End": "2025-06-05T19:26:07.367595028+02:00",
"ExitCode": 1,
"Output": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r 0 5 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\ncurl: (22) The requested URL returned error: 500\n"
},
{
"Start": "2025-06-05T19:26:17.369661511+02:00",
"End": "2025-06-05T19:26:17.408770486+02:00",
"ExitCode": 1,
"Output": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r 0 5 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\ncurl: (22) The requested URL returned error: 500\n"
},
{
"Start": "2025-06-05T19:26:27.409488914+02:00",
"End": "2025-06-05T19:26:27.450101106+02:00",
"ExitCode": 1,
"Output": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r 0 5 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\ncurl: (22) The requested URL returned error: 500\n"
},
{
"Start": "2025-06-05T19:26:37.450803223+02:00",
"End": "2025-06-05T19:26:37.492805511+02:00",
"ExitCode": 1,
"Output": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r 0 5 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\ncurl: (22) The requested URL returned error: 500\n"
}
]
👁️ What’s the Impact?
Scenario | Behavior |
---|---|
No HEALTHCHECK | Docker reports no health status at all; the container just shows as Up |
HEALTHCHECK passes | Container state = healthy ✅ |
HEALTHCHECK fails | Container state = unhealthy 🚨 |
Why it matters:
- Docker Swarm relies on this signal to replace failing tasks (Kubernetes needs its own probes)
- You can detect failing containers early during development
- It lets your CI/CD pipeline wait for a healthy container before running tests or shifting traffic
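For the CI/CD case, a pipeline step can poll the health status and only continue once the container is healthy. The following is a minimal sketch that shells out to the same docker inspect command used earlier; the function names, timeout values and the injectable get_status, sleep and clock parameters are assumptions made for illustration and testability:

```python
import subprocess
import time


def get_health_status(container):
    """Ask the Docker CLI for the container's health status string."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", container],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if the command fails
    )
    return result.stdout.strip()


def wait_until_healthy(container, timeout=60.0, poll=2.0,
                       get_status=get_health_status,
                       sleep=time.sleep, clock=time.monotonic):
    """Return True once the container is healthy, False on unhealthy or timeout."""
    deadline = clock() + timeout
    while True:
        status = get_status(container)
        if status == "healthy":
            return True
        if status == "unhealthy" or clock() >= deadline:
            return False
        sleep(poll)
```

A test stage could call wait_until_healthy("healthtest") right after docker run and fail the job when it returns False, instead of sleeping for a fixed number of seconds.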
🧠 Final Thoughts
A HEALTHCHECK is like a pulse check for your app ❤️‍🩹
Just because a container runs doesn’t mean your service is okay.
Whether you’re in local development or scaling in production, a tiny HEALTHCHECK line in your Dockerfile can save you hours of debugging and nights of firefighting.
So go ahead — make your containers honest.
Docker HEALTHCHECK is:
- A built-in mechanism that runs periodic commands inside containers
- Based entirely on the exit status of your script or command
- Tracked by the Docker daemon, with results exposed via CLI & API
- Powerful when combined with orchestration, restarts, and alerts
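📚 Bonus tip: Want to auto-restart unhealthy containers?
A restart policy alone won’t do it: --restart=on-failure only reacts when the container’s main process exits with a non-zero code, and an unhealthy container keeps running. You need something that acts on the health status, such as a Swarm service (which replaces unhealthy tasks) or a small watcher that restarts containers reported as unhealthy. The policy is still worth setting so that crashed containers come back:
docker run --restart=on-failure ...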
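Since a restart policy reacts to the container exiting rather than to its health status, restarting unhealthy containers on a standalone Docker host needs an extra step. Here is a minimal sketch of a watcher pass that lists unhealthy containers with docker ps and its health filter, then restarts each one. The function names are made up, and in practice you would run this on a schedule and add logging and back-off:

```python
import subprocess


def run_docker(args):
    """Run a docker CLI command and return its standard output."""
    result = subprocess.run(
        ["docker", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def restart_unhealthy(run=run_docker):
    """Restart every container currently reported as unhealthy."""
    listing = run(["ps", "--filter", "health=unhealthy", "--format", "{{.Names}}"])
    names = listing.split()
    for name in names:
        run(["restart", name])
    return names
```

The run parameter is there so the logic can be tried without a Docker daemon. Blindly restarting can hide the underlying problem, so treat this as a stopgap rather than a fix.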