Configure the health check feature for an EAS service - Platform For AI

Elastic Algorithm Service (EAS) provides the health check feature, which uses the health check mechanism of Kubernetes. The health check feature can automatically detect and recover failed containers to ensure that only healthy instances receive traffic and resources are not allocated to unhealthy instances. This topic describes how to configure the health check feature.

Limits

You can configure the health check feature only when you use a custom image that contains the health check logic to deploy a service.

How it works

The health check feature of EAS builds on the health check mechanism of Kubernetes. Probes detect the health status and availability of a service, and each probe uses one health check method to perform the check. The following tables describe the probe types and health check methods.

Probe types

Probe type	Description
Liveness probe	The kubelet uses liveness probes to check whether containers are alive, kills unhealthy containers, and then performs subsequent operations based on the restart policy. If no liveness probe is configured for a container, the kubelet treats the liveness probe result as Success, which means that the container is considered alive.
Readiness probe	Readiness probes are used to check whether a container is ready to receive requests. Only pods that are in the Ready state can receive requests. The relationship between services and endpoints depends on whether a pod is ready. If the value of the Ready field is False, Kubernetes removes the IP address of the pod from the list of endpoints that are associated with the services. After the value of the Ready field changes to True, Kubernetes adds the IP address of the pod to the list of endpoints that are associated with the services.
Startup probe	The kubelet uses startup probes to learn when a container has started. You can use startup probes to ensure that liveness probes and readiness probes are sent to a container only after the container has started. Startup probes are useful for containers that start slowly. This way, the kubelet does not kill the containers before they finish starting.
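The interplay of the three probe types can be sketched as a small decision function. This is a simplified illustration, not the kubelet implementation: the function name `next_action` and the returned strings are hypothetical, a probe that is not configured is passed as `None` and treated as Success, and the case in which a startup probe exhausts its failure threshold is ignored.

```python
from typing import Optional


def next_action(startup: Optional[bool],
                liveness: Optional[bool],
                readiness: Optional[bool]) -> str:
    """Return a simplified description of what happens to a container.

    Each argument is the latest probe result: True (Success), False (Failure),
    or None when that probe is not configured (treated as Success).
    """
    # Until the startup probe succeeds, liveness and readiness probes are
    # held back, so a slow-starting container is not killed prematurely.
    if startup is False:
        return "wait: container is still starting, other probes are held back"
    # A failed liveness probe leads to the container being killed and then
    # handled according to the restart policy.
    if liveness is False:
        return "restart: kill the container and apply the restart policy"
    # A failed readiness probe only removes the pod from the endpoints.
    if readiness is False:
        return "not ready: remove the pod IP address from the endpoints"
    return "ready: the pod IP address is listed in the endpoints"
```

For example, `next_action(True, True, False)` describes a container that is alive but temporarily unable to serve requests: it keeps running, but receives no traffic until the readiness probe succeeds again.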

Health check methods

Health check method	Description
`http_get`	Send HTTP GET requests to check the health status and liveness of services, and confirm whether the probes are successful based on the returned status codes.
`tcp_socket`	Attempt to create a TCP connection to check the health status and liveness of services.
`exec`	Run specific commands in containers and confirm whether the probes are successful based on the exit codes.
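To make the three health check methods concrete, the following sketch reproduces their success criteria with the Python standard library. It is an approximation for local experiments, not the code that EAS or the kubelet runs; the function names `http_get_probe`, `tcp_socket_probe`, and `exec_probe` are hypothetical.

```python
import socket
import subprocess
import urllib.error
import urllib.request
from typing import Sequence


def http_get_probe(url: str, timeout: float = 1.0) -> bool:
    """Succeed when a GET request returns a status code in [200, 400)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Covers 4xx/5xx responses, refused connections, and timeouts.
        return False


def tcp_socket_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Succeed when a TCP connection can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def exec_probe(command: Sequence[str], timeout: float = 1.0) -> bool:
    """Succeed when the command exits with code 0 before the timeout."""
    try:
        return subprocess.run(command, timeout=timeout).returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        return False
```

Running these functions against a locally started container, for example `http_get_probe("http://localhost:8000/")`, is one way to confirm that the health check logic in an image behaves as expected before deployment.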

Prepare a custom image

You can choose a web framework to encapsulate the prediction logic. In this example, the Flask framework is used. Sample app.py file:

import json
from flask import Flask, request, make_response

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def process_handle_func():
    # A health check that uses http_get sends a GET request without a body.
    # Return 200 directly so that the probe does not fail on JSON parsing.
    if request.method == 'GET':
        return make_response('ok', 200)

    # Parse the request body based on your business requirements.
    data = request.get_data().decode('utf-8')
    body = json.loads(data)
    res = process(body)

    # Configure the response based on your business requirements.
    response = make_response(res)
    response.status_code = 200
    return response

def process(data):
    """Your prediction logic."""
    return 'result'

if __name__ == '__main__':
    # You must set the host parameter to 0.0.0.0. Otherwise, the health check
    # may fail during service deployment.
    # The port number that you specify for the port parameter must be the same
    # as the port number specified in the JSON configuration file of the
    # service that you deploy.
    app.run(host='0.0.0.0', port=8000)

You can write a simple Dockerfile to copy the prediction code into the image and install the required packages. The following sample code provides an example of the content of the Dockerfile:

# In this example, Python is used.
FROM registry.cn-shanghai.aliyuncs.com/eas/bashbase-amd64:0.0.1
COPY ./process_code  /eas
# Replace <package_name> with the name of the package that you require.
RUN /xxx/pip install <package_name>
CMD ["/xxx/python", "/eas/xxx/app.py"]

For information about how to create a custom image, see Use a Container Registry Enterprise Edition instance to build an image. For more information about custom images, see Deploy a model service by using a custom image. You can also store the code in a File Storage NAS (NAS) file system or a Git repository and mount the storage to a service instance so that the code is written to the instance during service deployment. For more information, see Mount storage to services. The following section describes how to configure the health check feature during service deployment for an image into which the prediction code is copied by using a Dockerfile.

Configure the health check feature during service deployment

Configure the health check feature in the PAI console

Log on to the PAI console. Select a region on the top of the page. Then, select the desired workspace and click Enter Elastic Algorithm Service (EAS).
Click Deploy Service. In the Custom Model Deployment section, click Custom Deployment.

On the Custom Deployment page, configure the following key parameters. For information about other parameters, see Deploy a model service in the PAI console.

In the Environment Information section, configure the parameters. The following table describes the parameters.

Parameter	Description
Image Configuration	Select Image Address and enter the address of the prepared custom image. Example: registry-vpc.cn-shanghai.aliyuncs.com/xxx/yyy:zzz.
Command to Run	The entry command of the image. You can enter only a single command. Complex scripts are not supported. The command must be consistent with the command in the Dockerfile. Example: /data/eas/ENV/bin/python /data/eas/app.py. You must also enter the port number, which is the local HTTP port on which the image listens after the image is started. Example: 8000.

Important

We recommend that you do not specify port 8080 or port 9090 because the EAS engine listens on these ports.
The port number must be the same as the port number configured in the xxx.py file specified in the command.

In the Service Configuration section, enable Health Check and configure the following parameters.

Note

You can add up to three health check items. Only one probe type can be configured for each health check item, and the probe type configured for each health check item must be unique.

Parameter	Description
Probe Type	The following types of probes are supported: Liveness probe: checks whether containers are running as expected. Readiness probe: ensures that containers are initialized and ready to process requests. Startup probe: prevents applications from being incorrectly marked as failed due to slow launch of containers. This probe is designed for applications that require a long period of time to be initialized. For information about the working principles of each type of probe, see How it works.
Check Method	The following health check methods are supported: http_get: Call the HTTP GET method by using the IP address, port number, and path of a container. If the status code of the response is greater than or equal to 200 and less than 400, the container is healthy. tcp_socket: Perform a TCP check by using the IP address and port number of a container. If a TCP connection is established, the container is healthy. exec (Custom Health Check): Run specific commands in a container. If the exit code is 0 after the operation is successful, the health check is successful.
Call Path	This parameter is available only if you set the Check Method parameter to http_get. The endpoint of the HTTP server on which you want to perform the health check. The prefix of the endpoint is `http://localhost`. You must specify a custom suffix for the endpoint. The default suffix is `/`.
Port Number	This parameter is available only if you set the Check Method parameter to http_get or tcp_socket. The port number for the health check. Example: 8000.
Command	This parameter is available only if you set the Check Method parameter to exec (Custom Health Check). The command that you want to run. The console automatically converts the command into the required format and writes the command into the JSON service configuration file.
Latency for Check Initialization	The delay between the time when the container is launched and the time when the first health check is initiated. Default value: 0. Unit: seconds.
Check Interval	The interval between health checks. Default value: 10. Unit: seconds. A short interval generates additional overhead for pods. A long interval may cause container errors to go undetected for some time.
Check Timeout Period	The timeout period of the health check. Default value: 1. Unit: seconds. If a health check times out, the health check is considered failed.
Check Success Threshold	The minimum number of consecutive successful health checks after a health check fails before the service is considered healthy. Default value for the readiness probe: 3. Default value for the liveness and startup probes: 1.
Check Failure Threshold	The minimum number of consecutive failed health checks after a successful health check before the service is considered unhealthy. Default value: 1.

Click OK.

After you configure the parameters, click Deploy.
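The effect of the Check Success Threshold and Check Failure Threshold parameters can be illustrated with a small simulation. The sketch assumes Kubernetes-style semantics, in which the success threshold counts the consecutive successful checks needed to become healthy and the failure threshold counts the consecutive failed checks needed to become unhealthy. The function name `track_health` and the threshold values 2 and 4 are chosen only for illustration.

```python
from typing import Iterable, List


def track_health(results: Iterable[bool],
                 success_threshold: int,
                 failure_threshold: int,
                 initially_healthy: bool = False) -> List[bool]:
    """Return the health state observed after each probe result."""
    healthy = initially_healthy
    successes = 0
    failures = 0
    states = []
    for ok in results:
        if ok:
            successes += 1
            failures = 0  # a success interrupts any run of failures
            if not healthy and successes >= success_threshold:
                healthy = True
        else:
            failures += 1
            successes = 0  # a failure interrupts any run of successes
            if healthy and failures >= failure_threshold:
                healthy = False
        states.append(healthy)
    return states


# Two successes are needed to become healthy, four failures to become unhealthy.
probe_results = [True, True, False, False, False, False, True, True]
states = track_health(probe_results, success_threshold=2, failure_threshold=4)
print(states)
# [False, True, True, True, True, False, False, True]
```

Under these assumptions, the time needed to detect a failing container is roughly the check interval multiplied by the failure threshold. With a check interval of 10 seconds and a failure threshold of 4, that is about 40 seconds of consecutive failures, not counting timeouts.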

Configure the health check feature on an on-premises client

Download the EASCMD client and complete identity authentication. In this example, the client for 64-bit Windows is used.

Create a service configuration file named service.json in the directory in which the client is located. The following sample code provides an example of the content of the file:

{
    "metadata": {
        "name": "test",
        "instance": 1,
        "enable_webservice": true
    },
    "cloud": {
        "computing": {
            "instance_type": "ml.gu7i.c16m60.1-gu30"
        }
    },
    "containers": [
        {
            "image":"registry-vpc.cn-shanghai.aliyuncs.com/xxx/yyy:zzz",
            "env":[
                {
                    "name":"VAR_NAME",
                    "value":"var_value"
                }
            ],
            "liveness_check":{
                "http_get":{
                    "path":"/",
                    "port":8000
                },
                "initial_delay_seconds":3,
                "period_seconds":3,
                "timeout_seconds":1,
                "success_threshold":2,
                "failure_threshold":4
            },
            "command":"/data/eas/ENV/bin/python /data/eas/app1.py",
            "port":8000
        }
    ]
}

The following table describes the key parameters. For information about other parameters, see All Parameters of model services.

Parameter		Description
image		The address of the custom image used to deploy a model service. EAS does not support Internet access. You need to access the image by using the virtual private cloud (VPC) endpoint of the image repository to which the image is uploaded. Example: `registry-vpc.cn-shanghai.aliyuncs.com/xxx/yyy:zzz`.
env	name	The name of the environment variable that is used to launch a container based on the image.
env	value	The value of the environment variable that is used to launch a container based on the image.
command		The entry command of the image. You can enter only a single command. Complex scripts are not supported. Example: `/data/eas/ENV/bin/python /data/eas/app.py`.
port		The network port on which the process in the image listens. Example: 8000. Important: The port number must be consistent with the port number configured in the xxx.py file specified in the command.
liveness_check (Note: liveness_check indicates that a liveness probe is used in the health check. You can also specify readiness_check for a readiness probe or startup_check for a startup probe.)	http_get	The HTTP GET check method. In this example, requests are sent over port 8000. Take note of the following parameters: http_get.path: the endpoint of the HTTP server on which you perform the health check. The prefix of the endpoint is `http://localhost`. You must specify a custom suffix for the endpoint. The default suffix is `/`. http_get.port: the port of the HTTP server on which you perform the health check. You can also use the following health check methods: tcp_socket: Perform a TCP check by using the IP address and port number of a container. If a TCP connection is established, the container is healthy. Configuration method: `"tcp_socket":{ "port":8000 }`. exec: Run a specific command in the container. If the command exits with code 0, the health check is successful. Configuration method: `"exec":{ "command":[ "your_script", "with_args" ] }`.
	initial_delay_seconds	The delay between the time when the container is launched and the time when the first health check is initiated. Default value: 0. Unit: seconds.
	period_seconds	The interval between health checks. Default value: 10. Unit: seconds. A short interval generates additional overhead for pods. A long interval may cause container errors to go undetected for some time.
	timeout_seconds	The timeout period of the health check. Default value: 1. Unit: seconds. If a health check times out, the health check is considered failed.
	success_threshold	The minimum number of consecutive successful health checks after a health check fails before the service is considered healthy. Default value for the readiness probe: 3. Default value for the liveness and startup probes: 1.
	failure_threshold	The minimum number of consecutive failed health checks after a successful health check before the service is considered unhealthy. Default value: 1.

Run the following command in the directory in which the JSON file is located to create the service. For more information, see Run commands to use the EASCMD client.
```
eascmdwin64.exe create <service.json>
```
Replace <service.json> with the name of the JSON file that you created.
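If you maintain several services, the service.json file can also be generated by a script instead of being edited by hand. The following sketch builds the configuration from the parameters described above. The helper names `build_check` and `add_checks` are hypothetical, and combining a liveness check with a readiness check in the same container is an assumption based on the console, which allows up to three health check items.

```python
import json
from typing import Any, Dict

PROBE_KEYS = {"liveness_check", "readiness_check", "startup_check"}
METHOD_KEYS = {"http_get", "tcp_socket", "exec"}
TIMING_KEYS = {"initial_delay_seconds", "period_seconds", "timeout_seconds",
               "success_threshold", "failure_threshold"}


def build_check(method: Dict[str, Any], **timing: int) -> Dict[str, Any]:
    """Combine exactly one check method with optional timing fields."""
    if len(method) != 1 or not set(method) <= METHOD_KEYS:
        raise ValueError("specify exactly one of http_get, tcp_socket, exec")
    unknown = set(timing) - TIMING_KEYS
    if unknown:
        raise ValueError("unknown timing fields: %s" % sorted(unknown))
    return {**method, **timing}


def add_checks(container: Dict[str, Any],
               checks: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of the container with one entry per probe type."""
    unknown = set(checks) - PROBE_KEYS
    if unknown:
        raise ValueError("unknown probe types: %s" % sorted(unknown))
    return {**container, **checks}


container = {
    "image": "registry-vpc.cn-shanghai.aliyuncs.com/xxx/yyy:zzz",
    "command": "/data/eas/ENV/bin/python /data/eas/app1.py",
    "port": 8000,
}
checks = {
    "liveness_check": build_check(
        {"http_get": {"path": "/", "port": 8000}},
        initial_delay_seconds=3, period_seconds=3, timeout_seconds=1,
        success_threshold=2, failure_threshold=4),
    "readiness_check": build_check({"tcp_socket": {"port": 8000}},
                                   period_seconds=5),
}
config = {
    "metadata": {"name": "test", "instance": 1, "enable_webservice": True},
    "cloud": {"computing": {"instance_type": "ml.gu7i.c16m60.1-gu30"}},
    "containers": [add_checks(container, checks)],
}
# Save this text as service.json before you run the create command.
service_json = json.dumps(config, indent=4)
print(service_json)
```

Because a dictionary cannot hold the same key twice, each probe type appears at most once per container, which matches the rule that the probe type of each health check item must be unique.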