Graceful start of applications

For online applications, operations such as release, scale-out, and restart are necessary. Microservices Engine (MSE) provides the graceful start feature to protect applications during these operations. The feature provides three capabilities: delayed service registration, low-traffic service prefetching, and a service readiness probe. This topic describes the graceful start feature provided by MSE.

Feature overview

Delayed service registration

A microservice application that serves as a provider registers with an MSE instance during startup. After registration, consumer applications can subscribe to and call the provider. For Java applications built on the Spring framework, registration happens after the Spring context is refreshed. If a provider registers before all of its asynchronous initialization logic has finished, consumer calls may fail. For example, MaxCompute needs to pull hundreds of megabytes of data from Object Storage Service (OSS) before it can provide services. If such an application registers with the MSE instance immediately after it starts, errors are reported because its resources are not ready. Delayed service registration lets you specify a waiting period so that the application registers with the MSE instance only after it is completely initialized. This prevents call failures that occur when a consumer calls the provider before the provider is ready.
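The idea of delaying registration can be illustrated with a minimal sketch. It is not MSE's implementation: `register_after_delay`, the `register` callback, and the injectable `sleep` function are hypothetical names introduced here. The only behavior taken from the description above is that registration waits a configured number of seconds.

```python
import time
from typing import Callable


def register_after_delay(
    register: Callable[[], None],
    delay_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Wait for the configured delay, then register the provider.

    `register` stands in for whatever call publishes the instance to the
    registry. It is a placeholder, not an MSE API.
    """
    if delay_seconds < 0:
        raise ValueError("delay_seconds must not be negative")
    if delay_seconds > 0:
        # Give asynchronous initialization time to finish before consumers
        # can discover this instance.
        sleep(delay_seconds)
    register()
```

A delay of 0 registers immediately, which matches the default of the Delayed Registration Duration (Second) parameter.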

Low-traffic service prefetching

In most cases, a newly started instance is in the cold state. In this state, operations such as lazy loading of the connection pool, cache prefetching, and hotspot code generation are required. Therefore, the request processing capability of new instances in the cold state is significantly lower than the request processing capability of existing instances that have been running for a long period of time. If developers and O&M personnel do not intervene in the request processing operations, the overall average response time (RT) of the system may increase when a new instance is started. In some cases, services may stop responding, which causes a large number of request call timeouts and errors.

The following figure shows the different durations that are required for a request call of an instance before and after resource loading is complete. If a large number of requests are sent to the instance during resource loading, the requests may be blocked.

You can use the low-traffic service prefetching capability to limit the traffic that consumers send to a new service instance during the initial period after the instance starts. This gives a Java application in the cold state time to warm up, which helps keep the system RT low and protects the newly started instance from breaking down when traffic surges. The traffic routed to the instance increases over time based on specific rules. When the specified prefetching duration elapses, low-traffic service prefetching ends and the instance receives traffic as normal.

Note

The low-traffic service prefetching capability uses the traffic of online consumers. The consumers must also be connected to MSE Microservices Governance. For more information, see Principles of low-traffic service prefetching.
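The following sketch shows one way a consumer could ramp traffic to a new instance. It assumes a linear increase of the weight over the prefetching duration, which is an illustrative choice only; the curve that MSE actually applies is not specified here. The names `warmup_weight` and `pick_instance` are hypothetical.

```python
import random
from typing import Dict, Optional


def warmup_weight(uptime_seconds: float, warmup_seconds: float = 120.0,
                  full_weight: int = 100) -> int:
    """Return a load-balancing weight for an instance based on its uptime.

    Assumption: the weight grows linearly from 1 to `full_weight` during
    the prefetching window. The real MSE curve may differ.
    """
    if warmup_seconds <= 0 or uptime_seconds >= warmup_seconds:
        return full_weight
    if uptime_seconds <= 0:
        return 1
    return max(1, int(full_weight * uptime_seconds / warmup_seconds))


def pick_instance(uptimes: Dict[str, float], warmup_seconds: float = 120.0,
                  rng: Optional[random.Random] = None) -> str:
    """Weighted random choice: colder instances are picked less often."""
    rng = rng or random.Random()
    names = list(uptimes)
    weights = [warmup_weight(uptimes[name], warmup_seconds) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

For example, `pick_instance({"old": 3600, "new": 12})` returns `"new"` far less often than `"old"`, because under the assumed ramp the new instance carries a weight of 10 against 100.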

Service readiness probe

Kubernetes provides the readiness probe mechanism. When you release a service, the application instance of the old version is shut down after the application instance of the new version passes the readiness probe, subject to the release policy that you specify. However, Kubernetes cannot detect when a microservice application is actually ready. In most cases, Kubernetes considers an application ready as soon as a port is available for connecting to the application. The new instance may therefore be considered ready before it is registered with an MSE instance, and the old instance is shut down earlier than expected. As a result, the consumer may fail to call the provider of the new version, and the error message "service no provider/instance" is reported.

The service readiness probe capability of the graceful start feature provides an HTTP interface by using agents without code modifications. The interface is used to check whether an application instance is registered with an MSE instance. If MSE detects that the application instance is not registered, the HTTP status code 500 is returned. If MSE detects that the application instance is registered, the HTTP status code 200 is returned. After you configure the HTTP interface for the service readiness probe capability, Kubernetes can accurately determine whether application instances are ready. This helps ensure that consumers can call available providers during the service release in Kubernetes, and no error that indicates provider unavailability is reported.
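The status-code contract described above can be sketched with Python's standard library. This is an illustration only: in MSE the endpoint is served by the agent, and the names `RegistrationState` and `make_handler` are hypothetical. The sketch also answers on every path, whereas the agent serves a specific path and port.

```python
import threading
from http.server import BaseHTTPRequestHandler


class RegistrationState:
    """Thread-safe flag that records whether the instance has registered."""

    def __init__(self) -> None:
        self._registered = threading.Event()

    def mark_registered(self) -> None:
        self._registered.set()

    def status_code(self) -> int:
        # 200 once registered, 500 before that, as described above.
        return 200 if self._registered.is_set() else 500


def make_handler(state: RegistrationState):
    """Build a request handler class bound to the given state."""

    class ReadinessHandler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            self.send_response(state.status_code())
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, format, *args) -> None:
            pass  # keep the sketch quiet

    return ReadinessHandler
```

Passing `make_handler(state)` to `http.server.HTTPServer` and calling `state.mark_registered()` after registration switches the response from 500 to 200, which is the signal a Kubernetes HTTP readiness probe relies on.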

Use graceful start

Prerequisites

- Microservices Governance is activated. For more information, see Activate Microservices Governance.
- Microservices Governance is enabled for your microservice applications in a Container Service for Kubernetes (ACK) cluster. For more information, see Enable Microservices Governance for Java microservice applications in an ACK or ACS cluster.

Usage notes

- For Spring Cloud applications, the low-traffic service prefetching capability is supported only if the applications are registered with a Nacos, ZooKeeper, or Eureka registry.
- The low-traffic service prefetching capability of Spring Cloud is implemented based on the default load balancer ZoneAwareLoadBalancer, RoundRobinLoadBalancer, or RandomLoadBalancer of the Spring Cloud framework. If the configuration of the load balancer in an application is modified, the low-traffic service prefetching capability becomes invalid.
- The low-traffic service prefetching capability takes effect only after MSE Microservices Governance is enabled for both the provider and consumer. Gateway applications expose APIs to receive external traffic. Therefore, the low-traffic service prefetching capability of MSE does not apply to such applications.

How to use graceful start

Step 1: Enable graceful start

1. Log on to the MSE console, and select a region in the top navigation bar.
2. In the left-side navigation pane, choose Microservices Governance > Application Governance. On the page that appears, click the resource card of the application that you want to manage.
3. On the application details page, click Traffic management in the left-side navigation pane, and click the Graceful Start/Shutdown tab.
4. In the Settings section, click Revised. In the Graceful Start and Shutdown Settings panel, turn on the Graceful Start switch, and click OK.

Step 2: Configure service readiness probe for Kubernetes

1. Log on to the ACK console. In the left-side navigation pane, click Clusters.
2. On the Clusters page, click the cluster that you want to manage. In the left-side navigation pane of the page that appears, choose Workloads > Deployments. On the Deployments page, find the deployed application and click Edit in the Actions column. In the Health Check section, select Enable on the right side of the Readiness parameter, configure the following parameters, and then click Update on the right side of the page.
- Path: Enter /health in the field. (If the version of the agent used by your application is later than 4.1.10, enter /readiness in the field. To view the agent version, you can go to the MSE console, and choose Microservices Governance > Application Governance. On the Application list page, click the card of the application. On the Application overview page, click Node details in the left-side navigation pane, and view the agent version on the right side.)
- Port: Enter 55199 in the field.
- Initial Delay (s): We recommend a value greater than the sum of the application startup time and the value of the Delayed Registration Duration (Second) parameter configured in the Graceful Startup section of the Graceful Start/Shutdown tab. The default value of the Delayed Registration Duration (Second) parameter is 0. The service readiness probe still works if you do not follow this recommendation.
- For the settings of other parameters, see Create a stateless application by using a Deployment.

After the application is restarted, it passes the service readiness probe only after service registration is complete.

Important

After you configure the service readiness probe in Kubernetes, the application is restarted. Therefore, if your application is in the production environment, we recommend that you perform this operation in a release time window.
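The parameter choices in Step 2 can be summarized as a small helper that builds a mapping with the field names of a Kubernetes `readinessProbe` (`httpGet`, `initialDelaySeconds`). The helper names are hypothetical, the one-second margin is an arbitrary illustration of "greater than the sum", and the sketch assumes that the agent version is a plain dotted number such as 4.1.10.

```python
from typing import Dict, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '4.1.10' into (4, 1, 10)."""
    return tuple(int(part) for part in version.split("."))


def readiness_probe(agent_version: str, startup_seconds: int,
                    delayed_registration_seconds: int = 0) -> Dict:
    """Build a readinessProbe mapping that follows the settings in Step 2."""
    # Agents later than 4.1.10 use /readiness; others use /health.
    later = parse_version(agent_version) > (4, 1, 10)
    path = "/readiness" if later else "/health"
    return {
        "httpGet": {"path": path, "port": 55199},
        # Recommended: greater than startup time plus the registration delay.
        # The +1 margin is an assumption made for this sketch.
        "initialDelaySeconds": startup_seconds + delayed_registration_seconds + 1,
    }
```

For an agent at version 4.2.0, a 30-second startup, and a 10-second registration delay, the helper selects the /readiness path and an initial delay of 41 seconds.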

(Optional) Configure delayed service registration

Determine whether to configure the delayed service registration capability based on your business requirements. For more information, see Delayed service registration.

1. Go to the Graceful Start/Shutdown tab and enable the graceful start feature in the Graceful Start section as described in Step 1. Then, configure the service readiness probe capability for Kubernetes as described in Step 2.
2. Modify the configuration information of graceful start and shutdown. In the Graceful Start and Shutdown Settings panel, click the arrow next to Graceful Start, specify the Delayed Registration Duration (Second) parameter, and then click OK.

Note

After the preceding configurations are complete, the delayed service registration capability takes effect the next time the application starts.

(Optional) Change the duration of low-traffic service prefetching

After you enable the graceful start feature, the low-traffic service prefetching capability is automatically enabled. The default duration of low-traffic service prefetching is 120 seconds. You can change the duration based on your business requirements by performing the following steps:

1. Go to the Graceful Start/Shutdown tab and enable the graceful start feature in the Graceful Start section as described in Step 1. Then, configure the service readiness probe capability for Kubernetes as described in Step 2.
2. Modify the configuration information of graceful start and shutdown. In the Graceful Start and Shutdown Settings panel, click the arrow next to Graceful Start, click Advanced options, specify the Low-traffic Prefetching Duration (Seconds) parameter, and then click OK.

Note

- After the preceding configurations are complete, the new duration for low-traffic service prefetching takes effect the next time the application starts.
- With low-traffic service prefetching, the consumer calculates a weight for each provider based on the provider's startup time, and the load balancing algorithm uses these weights to increase traffic over time. This warms up applications after they start. MSE Microservices Governance must also be enabled for the service consumer.
- The first time you use the low-traffic service prefetching capability of the graceful start feature, we recommend that you use the default duration. If the prefetching effect is not obvious and traffic loss occurs with the default duration, you can change the value of the Low-traffic Prefetching Duration (Seconds) parameter for optimization.
- To ensure that services are completely prefetched, we recommend that you refer to What is the best practice for low-traffic service prefetching?.
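When you consider a different duration, it can help to estimate how much traffic a new instance would receive at a given moment. The following sketch assumes a linear ramp and equal weights for all warm instances. Both are simplifying assumptions rather than documented MSE behavior, and `expected_share` is a hypothetical helper.

```python
def expected_share(uptime_seconds: float, warmup_seconds: float = 120.0,
                   warm_instances: int = 3) -> float:
    """Estimate the fraction of requests routed to one new instance.

    Assumptions: the new instance's relative weight rises linearly from
    0 to 1 during the prefetching window, and every warm instance has a
    relative weight of 1.
    """
    if warm_instances < 0:
        raise ValueError("warm_instances must not be negative")
    if warmup_seconds <= 0:
        ramp = 1.0
    else:
        ramp = min(1.0, max(0.0, uptime_seconds / warmup_seconds))
    total = ramp + warm_instances
    return ramp / total if total > 0 else 0.0
```

Under these assumptions, a new instance next to three warm instances receives about 14% of requests halfway through a 120-second window and 25% once the window ends. A longer duration lowers the share at any given moment, at the cost of a longer period before the instance carries a full load.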

Observe graceful start

After the preceding configurations are complete and the application is restarted, the Graceful Start/Shutdown tab shows when the application started and shut down, together with the queries per second (QPS) curve of the application during the same period.

1. Log on to the MSE console, and select a region in the top navigation bar.
2. In the left-side navigation pane, choose Microservices Governance > Application Governance. On the page that appears, click the resource card of the application that you want to manage.
3. On the application details page, click Traffic management in the left-side navigation pane, and click the Graceful Start/Shutdown tab.
4. On the Start and Shutdown Overview subtab, click the application in the left-side pane, and view the QPS changes and events that occurred during the startup process of the application in the right-side pane.

The preceding figure shows that the service registration, service prefetching started, and service prefetching ended events occur in sequence. The Kubernetes readiness probe passed event also occurs after the service registration event. The QPS curve rises gradually to the maximum value within the service prefetching duration (120 seconds by default) instead of rising sharply. If the sequence of events or the shape of the QPS curve does not meet expectations when your application starts, see FAQ to resolve the issue.

Note

The preceding figure shows the QPS data of an application in which the Path parameter is set to /health and the Port parameter is set to 55199 for the Kubernetes service readiness probe capability. The Minimum Ready Time (minReadySeconds) parameter is set to 120 in the Upgrade Policy dialog box of the Deployments page, which is the same as the default service prefetching duration.

References

Configure graceful start and shutdown in YAML