Keeping an application healthy isn’t just about good code - it’s about giving your orchestration layer (Kubernetes, Nomad, Docker Swarm, whatever you use) the right signals. That’s where health checks come in. Two of the most important ones are readiness and liveness probes. They sound similar, but they serve completely different purposes.
Let’s break them down clearly.
What Are Health Checks?
In containerized environments, health checks let the platform understand whether your app is ready to serve traffic or needs a restart. Without proper checks, you risk downtime, failed deployments, and broken user experiences.
What Is a Liveness Probe?
A liveness probe answers a very simple question:
“Is the application alive, or is it stuck?”
If this probe fails, the orchestrator restarts the container.
When liveness should fail:
- The process is deadlocked.
- The app is stuck in an infinite loop.
- It cannot recover without a restart.
- A memory leak or resource exhaustion has locked it up.
Good liveness endpoints:
- A simple “heartbeat” route with minimal logic.
- No DB calls.
- No external service calls.
- Should return quickly, every time.
Example response:
200 OK means “alive”; anything else means “restart me”.
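As a rough sketch, a liveness endpoint in Go can be as small as this (the /healthz path, port, and plain-text body are illustrative choices, not requirements):

```go
package main

import (
	"log"
	"net/http"
)

// healthz is the liveness endpoint: no DB calls, no external
// dependencies. If this handler can respond at all, the process
// is alive and should not be restarted.
func healthz(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
	w.Write([]byte("ok"))
}

func main() {
	http.HandleFunc("/healthz", healthz)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Notice there is nothing to go wrong here except the process itself - exactly what a liveness probe should measure.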
What Is a Readiness Probe?
A readiness probe answers a different question:
“Can the application handle requests right now?”
This doesn’t restart the container - it just tells the load balancer to temporarily stop sending traffic.
When readiness should fail:
- The app is booting and not ready yet.
- Database connection isn’t established.
- Migrations are still running.
- Cache is warming up.
- External API dependency is down.
Good readiness checks:
- Check DB connection availability.
- Check essential dependencies.
- Check if the app finished initialization.
Example: If your DB is down, readiness fails → traffic stops → no user hits broken endpoints.
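Here’s a hedged sketch of such a readiness check in Go, assuming database/sql with a Postgres driver; the driver import, connection string, and /readyz path are placeholders for your own setup:

```go
package main

import (
	"context"
	"database/sql"
	"log"
	"net/http"
	"time"

	_ "github.com/lib/pq" // example Postgres driver; any driver works the same way
)

func main() {
	// Placeholder connection string; sql.Open does not dial yet.
	db, err := sql.Open("postgres", "postgres://localhost:5432/app?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}

	// readyz is the readiness endpoint: it returns 503 whenever the
	// DB is unreachable. The orchestrator then stops routing traffic
	// here, but does NOT restart the container.
	http.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
		defer cancel()
		if err := db.PingContext(ctx); err != nil {
			http.Error(w, "not ready", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

With the database stopped, this endpoint returns 503 and the platform drains traffic; once the database is back, the next probe succeeds and traffic resumes - no restart involved.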
Real-World Example: Why Both Matter
Imagine a service connecting to PostgreSQL.
- Your app is running fine, no deadlocks, so liveness is OK.
- But PostgreSQL is temporarily down, so readiness should fail.
This protects you from:
- Unnecessary restarts.
- Users seeing 500 errors.
- Load balancers sending requests to a half-initialized service.
What Goes Wrong Without Proper Health Checks?
A few common disasters:
- Putting DB checks in liveness → constant restarts when the DB is slow.
- Not using readiness at all → users get errors during startup or deployments.
- Over-engineering liveness probes → unnecessary cascading failures.
- Returning too much info → leaking sensitive details via diagnostics.
Keep it simple. Keep it predictable.
Best Practices for Production
Here’s a quick checklist:
Liveness
- Keep it dumb and fast.
- No DB.
- No heavy logic.
- Only fail when the app is truly stuck.
Readiness
- Check dependencies that must be up to serve traffic.
- Fail gracefully.
- Make sure startup logic uses readiness, not liveness.
General
- Use separate endpoints: /healthz (liveness), /readyz (readiness) - see the combined sketch after this checklist.
- Avoid expensive checks.
- Make responses minimal and cache-safe.
- Test probes during deployments.
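Putting the checklist together, here is one possible shape: separate endpoints, a dumb liveness check, and startup gated behind readiness (the five-second sleep is a stand-in for real startup work such as migrations or cache warm-up):

```go
package main

import (
	"log"
	"net/http"
	"sync/atomic"
	"time"
)

// ready flips to true once startup work (migrations, cache warm-up,
// connection pools) has finished. Until then /readyz returns 503
// while /healthz already returns 200: the pod is alive but unrouted.
var ready atomic.Bool

func main() {
	// Liveness: dumb and fast, no dependencies.
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	// Readiness: gated on initialization being complete.
	http.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		if !ready.Load() {
			http.Error(w, "starting", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})

	go func() {
		// Stand-in for real startup work.
		time.Sleep(5 * time.Second)
		ready.Store(true)
	}()

	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

During those first five seconds the load balancer never sends a request, and the orchestrator never restarts the pod - each probe doing exactly its own job.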
Final Thoughts
Good health checks aren’t optional - they’re part of application reliability. Liveness keeps your service running, readiness keeps your users happy. Use both correctly and your deployments become smoother, your outages shorter, and your infrastructure more predictable.
If you’re building anything that runs in containers, getting this right is a baseline requirement.