By Alwyn Botha, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud's incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.
This tutorial teaches you about two independent types of probes that help ensure your Pods run smoothly: liveness probes and readiness probes.
Kubernetes takes responsibility for keeping the containers in your Pods alive: it restarts any container that fails its liveness probe. Kubernetes does not take responsibility for making your Pods ready. Readiness may depend on a complicated set of interrelated networked components.
Restarting a container with a failing readiness probe will not fix it, so Kubernetes does not restart it; it only marks the Pod as not ready and stops sending Service traffic to it. A Pod may have several containers running inside it, and each container may have different liveness and readiness probes (since different software runs inside each).
This tutorial demonstrates Pods with just one simple container. This way we can focus only on liveness and readiness probes.
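This tutorial covers the following topics: httpGet liveness probes, restartPolicy: Never, failureThreshold and timeoutSeconds, tcpSocket liveness probes, and readiness probes.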
Create the following YAML file with your favorite Linux editor.
nano myLiveness-Pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
      periodSeconds: 10
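Explanation of the Pod spec above:
An httpGet livenessProbe sends an HTTP GET request to the given path and port to check whether the container is alive. initialDelaySeconds: 2 makes Kubernetes wait 2 seconds after the container starts before the first probe; periodSeconds: 10 repeats the probe every 10 seconds.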
Let's create the Pod to see how this works.
Create the Pod.
kubectl create -f myLiveness-Pod.yaml
pod/myliveness-pod created
Truncated list of describe output ... only relevant fields shown.
kubectl describe pod/myliveness-pod
Name: myliveness-pod
Status: Running
Containers:
myliveness-container:
Image: httpd:2.4
Port: 80/TCP
Host Port: 8080/TCP
State: Running
Started: Wed, 16 Jan 2019 07:37:02 +0200
Ready: True
Restart Count: 0
Liveness: http-get http://:80/ delay=2s timeout=1s period=10s #success=1 #failure=3
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 3s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 3s kubelet, minikube Created container
Normal Started 3s kubelet, minikube Started container
This Pod looks identical to any other successfully running Pod, with zero difference even in the events list. The liveness probe waits initialDelaySeconds before probing starts.
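The Liveness line in the describe output above also reports settings the spec never mentioned: timeout=1s, #success=1 and #failure=3 are defaults. As a sketch, the same probe with every timing field written out explicitly (values copied from that output) would look roughly like this:

```yaml
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 2   # delay=2s: wait before the first probe
  periodSeconds: 10        # period=10s: interval between probes
  timeoutSeconds: 1        # timeout=1s (default)
  successThreshold: 1      # #success=1 (default)
  failureThreshold: 3      # #failure=3 (default)
```

Writing the defaults out changes nothing in behavior; it only makes the values visible in the spec.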
It still looks like any other Pod for the first 30 seconds.
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 0 12s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 0 22s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 0 33s
Let's investigate what is happening in detail.
kubectl describe pod/myliveness-pod
Name: myliveness-pod
Start Time: Wed, 16 Jan 2019 07:37:01 +0200
Status: Running
Containers:
myliveness-container:
State: Running
Started: Wed, 16 Jan 2019 07:37:02 +0200
Ready: True
Restart Count: 0
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 37s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 36s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 36s kubelet, minikube Created container
Normal Started 36s kubelet, minikube Started container
Warning Unhealthy 9s (x3 over 29s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused
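We have had 3 liveness probe failures so far. The overall Pod status stays Ready and Running. (This can be confusing: the container fails its liveness probe, yet it is reported as ready. No readiness probe is defined, so Kubernetes treats the running container as ready by default.)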
Wait around 15 seconds and redo describe
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 62s default-scheduler Successfully assigned default/myliveness-pod to minikube
Warning Unhealthy 34s (x3 over 54s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused
Normal Pulled 4s (x2 over 61s) kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 4s (x2 over 61s) kubelet, minikube Created container
Normal Started 4s (x2 over 61s) kubelet, minikube Started container
Normal Killing 4s kubelet, minikube Killing container with id docker://myliveness-container:Container failed liveness probe.. Container will be killed and recreated.
Apache is not running in the container: the command in the spec replaces the image's default command, so only sh and sleep run. This causes the liveness probe to fail. Nothing listens on port 80: dial tcp 172.17.0.6:80: connect: connection refused
Let's fix that. Enter the Pod and start apache:
kubectl exec myliveness-pod -i -t -- /bin/sh
# httpd
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.6. Set the 'ServerName' directive globally to suppress this message
# httpd
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.6. Set the 'ServerName' directive globally to suppress this message
httpd (pid 15) already running
# exit
The AH00558 warning is easy to fix, but it is irrelevant to liveness probes, so feel free to ignore it.
I entered httpd twice: the second time it reports that it is already running (exactly what I wanted to see).
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 1 104s
Our Pod is running; it restarted once.
We fixed the problem.
The Unhealthy event, 32s (x5 over 102s), will not show any more failures.
8 seconds later
Warning Unhealthy 40s (x5 over 110s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused
20 seconds later
Warning Unhealthy 57s (x5 over 2m7s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused
Restart count does not increase anymore. Liveness probes succeed.
Unfortunately the events do not show a log entry for this success. You have to deduce that the probe now works from the fact that the failure and restart counts stop increasing. There is no field that displays the liveness status.
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 1 2m19s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 1 2m27s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 1 2m34s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 1 2m45s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 1 2m59s
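Starting httpd by hand only helps the current container instance; a restarted container runs the sleep command again with no web server. A minimal sketch of a more durable variant, assuming the only goal is a passing probe: drop the command override so the image's default command starts Apache. This is an illustrative alternative, not the spec used in the rest of this tutorial.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    # No command override here: the httpd image's default command
    # starts Apache, so port 80 accepts connections.
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
      periodSeconds: 10
```

With this variant the liveness probe should pass from the start, so it does not demonstrate failures the way the tutorial spec does.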
Delete Pod.
kubectl delete -f myLiveness-Pod.yaml --force --grace-period=0
pod "myliveness-pod" force deleted
This demo worked since the default restartPolicy: Always is in effect.
Let's see what happens with a restartPolicy: Never.
nano myLiveness-Pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
      periodSeconds: 2
  restartPolicy: Never
I made periodSeconds 2 seconds. Now we will quickly see what happens.
Create the Pod.
kubectl create -f myLiveness-Pod.yaml
pod/myliveness-pod created
Investigate just the tail ( events ) part of kubectl describe pod/myliveness-pod
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 8s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 8s kubelet, minikube Created container
Normal Started 8s kubelet, minikube Started container
Warning Unhealthy 2s (x3 over 6s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused
6 seconds later ..........
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 14s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 14s kubelet, minikube Created container
Normal Started 14s kubelet, minikube Started container
Warning Unhealthy 8s (x3 over 12s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused
10 seconds later .........
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 25s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 25s kubelet, minikube Created container
Normal Started 25s kubelet, minikube Started container
Warning Unhealthy 19s (x3 over 23s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused
another 10 seconds later .........
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 34s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 34s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 34s kubelet, minikube Created container
Normal Started 34s kubelet, minikube Started container
Warning Unhealthy 28s (x3 over 32s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused
restartPolicy: Never works. No restarts are done.
The default failureThreshold is 3. After 3 consecutive failures the container is classified as failed.
Only 3 failed probes are recorded.
Below we see the Pod status turn to Error.
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 0 37s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 0/1 Error 0 55s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 0/1 Error 0 67s
Investigate overall Pod status (below) :
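The Pod status is Failed: after 3 liveness probe failures the kubelet killed the container (exit code 137 means the process was terminated by SIGKILL), and restartPolicy: Never prevents Kubernetes from restarting it in an effort to fix it.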
kubectl describe pod/myliveness-pod
Name: myliveness-pod
Start Time: Wed, 16 Jan 2019 07:45:47 +0200
Status: Failed
Containers:
myliveness-container:
State: Terminated
Reason: Error
Exit Code: 137
Started: Wed, 16 Jan 2019 07:45:47 +0200
Finished: Wed, 16 Jan 2019 07:46:23 +0200
Ready: False
Restart Count: 0
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Delete Pod.
kubectl delete -f myLiveness-Pod.yaml --force --grace-period=0
pod "myliveness-pod" force deleted
By default failureThreshold equals 3: the probe must fail 3 times in a row before the container is declared failed.
Let's set failureThreshold to 1 and experiment. (Note the last line in the spec.)
nano myLiveness-Pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
      periodSeconds: 2
      failureThreshold: 1
Create the Pod.
kubectl create -f myLiveness-Pod.yaml
pod/myliveness-pod created
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 0 6s
After around a minute:
kubectl describe pod/myliveness-pod | tail
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 64s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 31s (x2 over 64s) kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 31s (x2 over 64s) kubelet, minikube Created container
Normal Started 31s (x2 over 64s) kubelet, minikube Started container
Normal Killing 31s kubelet, minikube Killing container with id docker://myliveness-container:Container failed liveness probe.. Container will be killed and recreated.
Warning Unhealthy 28s (x2 over 62s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused
Determine number of restarts:
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 2 68s
2 restarts after 2 liveness probe failures.
Another 30 seconds later.
kubectl describe pod/myliveness-pod | tail
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 86s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 20s (x3 over 86s) kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 20s (x3 over 86s) kubelet, minikube Created container
Normal Started 20s (x3 over 86s) kubelet, minikube Started container
Normal Killing 20s (x2 over 53s) kubelet, minikube Killing container with id docker://myliveness-container:Container failed liveness probe.. Container will be killed and recreated.
Warning Unhealthy 18s (x3 over 84s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused
Determine number of restarts:
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 3 108s
3 restarts after 3 liveness probe failures.
You have to determine the suitable failureThreshold for your production environment.
Different containers in the same Pod may have / need different suitable failureThreshold values.
The default timeoutSeconds is one second.
Similarly, you have to determine a suitable timeoutSeconds for your production environment, for each container with different software.
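As an illustration only, a probe for a slower service might raise both values. The numbers below are hypothetical placeholders, not recommendations; measure your own software before choosing them.

```yaml
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 2
  periodSeconds: 10
  timeoutSeconds: 5      # hypothetical: allow up to 5 seconds per probe
  failureThreshold: 5    # hypothetical: tolerate 5 consecutive failures
```

Larger values make restarts less trigger-happy, at the cost of reacting more slowly to a container that really is dead.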
Delete Pod.
kubectl delete -f myLiveness-Pod.yaml --force --grace-period=0
pod "myliveness-pod" force deleted
Until now we used httpGet liveness probes.
For software that does not support HTTP GET requests, you can use tcpSocket liveness probes.
Create using your editor:
nano myLiveness-Pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 3
      periodSeconds: 10
The main difference from before is tcpSocket instead of httpGet: the probe only checks that a TCP connection to port 80 can be opened.
Create the Pod.
kubectl create -f myLiveness-Pod.yaml
pod/myliveness-pod created
Based on what you learned so far you can do this exercise on your own.
Container liveness probes will fail.
The following will fix it, just as before.
kubectl exec myliveness-pod -i -t -- /bin/sh
# httpd
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.6. Set the 'ServerName' directive globally to suppress this message
# exit
Delete Pod.
kubectl delete -f myLiveness-Pod.yaml --force --grace-period=0
pod "myliveness-pod" force deleted
Based on the software running in each of your production containers, you have to determine which type of liveness probe to use, for example httpGet, tcpSocket, or exec (a command run inside the container).
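This tutorial demonstrates httpGet and tcpSocket probes only. As a minimal sketch of a command-based (exec) probe: Kubernetes runs the command inside the container and treats exit code 0 as success. The file path /tmp/healthy and the timing values are assumptions for illustration; your container would have to create that file itself.

```yaml
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy   # hypothetical file; the probe fails if it is missing
  initialDelaySeconds: 5
  periodSeconds: 5
```

An exec probe suits software that exposes neither an HTTP endpoint nor a TCP port but can report its health through a command.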
So far we have covered liveness probes.
Readiness probes are independent of liveness probes.
Readiness probes check that your containers are ready to do productive work.
You have to determine exactly what to test so that a readiness probe really tests readiness.
The readinessProbe and livenessProbe syntax is identical.
You can define both probes for the same container.
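As a sketch of a container that defines both probes at once (not one of the exercises in this tutorial): the command override is left out here on the assumption that Apache should actually run, and the timing values are simply reused from the other specs.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    ports:
    - name: liveness-port
      containerPort: 80
    livenessProbe:          # failure leads to a container restart
      tcpSocket:
        port: 80
      initialDelaySeconds: 3
      periodSeconds: 10
    readinessProbe:         # failure only marks the container not ready
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 3
      periodSeconds: 2
```

The two probes run independently, so they can use different probe types and different timings.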
Our Pod spec below demonstrates one readiness probe.
nano myLiveness-Pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    readinessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 3
      periodSeconds: 2
Note the short periodSeconds at the bottom of the spec: it lets us see quickly what happens.
Create the Pod.
kubectl create -f myLiveness-Pod.yaml
pod/myliveness-pod created
The Pod is running on the node, but it is not ready. Kubernetes sees that a readiness probe is defined and must succeed first; only then does it set the ready state to true.
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 0/1 Running 0 3s
Truncated list of describe output ... only relevant EVENT fields shown.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 9s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 8s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 8s kubelet, minikube Created container
Normal Started 8s kubelet, minikube Started container
Warning Unhealthy 1s (x3 over 5s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused
Last line: readiness probe failed 3 times.
6 seconds later ...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 21s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 20s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 20s kubelet, minikube Created container
Normal Started 20s kubelet, minikube Started container
Warning Unhealthy 1s (x9 over 17s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused
6 more failures.
Another 10 seconds later ...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 31s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 30s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 30s kubelet, minikube Created container
Normal Started 30s kubelet, minikube Started container
Warning Unhealthy 1s (x14 over 27s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused
Another 5 failures. Note there is no mention of restarts: Kubernetes does not restart containers that fail readiness probes.
This is the MAJOR difference between readiness and liveness probes.
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 0/1 Running 0 36s
Detailed Pod status:
kubectl describe pod/myliveness-pod
Name: myliveness-pod
Status: Running
Containers:
myliveness-container:
State: Running
Started: Wed, 16 Jan 2019 08:42:09 +0200
Ready: False
Restart Count: 0
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 73s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 72s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 72s kubelet, minikube Created container
Normal Started 72s kubelet, minikube Started container
Warning Unhealthy 27s (x22 over 69s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused
Fix the Pod, start Apache.
kubectl exec myliveness-pod -i -t -- /bin/sh
# httpd
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.6. Set the 'ServerName' directive globally to suppress this message
# exit
Check Pod status again ... now it is ready
kubectl describe pod/myliveness-pod
Name: myliveness-pod
Status: Running
Containers:
myliveness-container:
State: Running
Started: Wed, 16 Jan 2019 08:42:09 +0200
Ready: True
Restart Count: 0
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 99s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 98s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 98s kubelet, minikube Created container
Normal Started 98s kubelet, minikube Started container
Warning Unhealthy 53s (x22 over 95s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused
A minute later. x22 failed readiness probe count does not increase anymore.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m19s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 2m18s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 2m18s kubelet, minikube Created container
Normal Started 2m18s kubelet, minikube Started container
Warning Unhealthy 93s (x22 over 2m15s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused
Delete Pod.
kubectl delete -f myLiveness-Pod.yaml --force --grace-period=0
pod "myliveness-pod" force deleted
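This final exercise demonstrated THE readiness versus liveness difference: a container that fails its liveness probe is restarted, while a container that fails its readiness probe is only marked as not ready and keeps running.
Excellent official reference documentation about liveness and readiness probe settings: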
We only did tcp socket and http get probes in this tutorial. The last way to do probes is via commands.
Official Kubernetes demo using commands
As a final exercise I suggest you follow those instructions.
You will note that the kubectl describe output they show uses an older format.
This concludes this tutorial.
Carefully read the text of that last link and determine appropriate settings for each container in each Pod in your production environment.
5770277854158395 January 1, 2020 at 4:50 am
What is Kubernetes Liveness Probe
Alibaba Clouder January 6, 2020 at 9:01 am
Please read official kubernetes documentation about what liveness means:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
Liveness - is the server alive - can I ping it? Alive might still mean broken ... not READY yet.
Readiness - is the database also up and working - is the full functionality of the whole website READY to work 100% completely.
In the mornings you are alive when you wake up, BUT after your first coffee you are READY for the world.