Use slow starts to implement graceful deployment of services

Server groups of Application Load Balancer (ALB) support the slow start mode. After you enable the slow start mode for a server group, ALB gradually increases the number of requests forwarded to newly added backend servers. This prevents traffic spikes from hitting a server that is still warming up, for example while it prepares resources, fills caches, or prefetches data.

Sample scenario

In scenarios with high traffic loads, backend servers are manually or automatically scaled out. Requests are distributed to healthy backend servers based on their weights, so a newly added backend server immediately receives its full share of requests. If the backend server becomes overloaded, its CPU usage may reach 100% or its memory may be exhausted. As a result, the backend server may become inaccessible.

With the slow start mode enabled on the server group, ALB gradually increases the requests distributed to newly added backend servers instead of sending them their full share at once.

The following figure shows an example. A company deployed an Internet-facing ALB instance in the China (Hangzhou) region. The ALB instance is associated with an HTTP listener and a backend server named ECS01. The ALB instance forwards requests to services on ECS01. The company wants to add a backend server named ECS02 and enable the slow start mode to gradually forward requests to ECS02. The slow start mode can prevent traffic spikes on ECS02.

Usage notes

Only standard and WAF-enabled ALB instances support the slow start mode. Basic ALB instances do not support the slow start mode.
The slow start mode is unavailable for server groups of the Function Compute type.
The slow start mode is supported by server groups only if you set Scheduling Algorithm to Weighted Round-robin.
After you enable the slow start mode, healthy backend servers do not automatically enter the slow start mode.
When you enable the slow start mode for an empty server group:
- The first backend server added to the server group does not enter the slow start mode.
- New backend servers can enter the slow start mode only when at least one healthy backend server is in slow start mode.
Backend servers that are removed before the slow start duration ends automatically exit the slow start mode. If you re-add a backend server to a server group, the backend server can enter the slow start mode only when the backend server passes health checks.
If a backend server is declared unhealthy before the slow start duration ends, the backend server exits the slow start mode. The backend server re-enters the slow start mode when the backend server passes health checks.
After you enable the slow start mode and health checks, only healthy backend servers enter the slow start mode. If you disable health checks, all backend servers immediately enter the slow start mode.
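
The first three usage notes can be read as a pre-check before you enable the slow start mode. The following is a minimal sketch of that check. The function name `slow_start_supported` and the argument labels (`basic`, `standard`, `waf`, `fc`, `wrr`) are hypothetical placeholders chosen for illustration. They are not identifiers from the ALB API or console.

```shell
#!/bin/sh
# Hypothetical pre-check that mirrors the usage notes above.
# The argument values are illustrative labels, not ALB API identifiers.
slow_start_supported() {
    edition=$1      # basic | standard | waf (WAF-enabled)
    group_type=$2   # fc means a Function Compute server group
    scheduler=$3    # wrr means Weighted Round-robin

    # Only standard and WAF-enabled instances support the slow start mode.
    case "$edition" in
        standard|waf) ;;
        *) return 1 ;;
    esac
    # Function Compute server groups do not support the slow start mode.
    [ "$group_type" != "fc" ] || return 1
    # The scheduling algorithm must be Weighted Round-robin.
    [ "$scheduler" = "wrr" ] || return 1
    return 0
}

if slow_start_supported standard instance wrr; then
    echo "slow start can be enabled"
fi
```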

Prerequisites

A standard ALB instance is created, and a server group is created for the ALB instance. For more information, see Create an ALB instance and Create and manage server groups.
A listener is created for the ALB instance and associated with the server group of the ALB instance. For more information, see Add an HTTP listener.
An access log is created for the ALB instance. For more information, see Access logs.
In the following examples, two Elastic Compute Service (ECS) instances are used.
ECS01 and ECS02 function as backend servers. NGINX applications are deployed on ECS01 and ECS02. For information about how to create an instance, see Create an instance by using the wizard.
The following commands show how to deploy applications on ECS01 and ECS02:
Commands for deploying an application on ECS01
```
yum install -y nginx
systemctl start nginx.service
cd /usr/share/nginx/html/
echo "Hello World !  This is ECS01." > index.html
```
Commands for deploying an application on ECS02
```
yum install -y nginx
systemctl start nginx.service
cd /usr/share/nginx/html/
echo "Hello World !  This is ECS02." > index.html
```

ECS01 is added to the server group, and clients can access services on ECS01. For more information, see Use an ALB instance to provide IPv4 services and Use ALB to balance loads for IPv6 services.

Sample ALB instance configurations

Parameter	Configuration
Network type	Internet-facing. Domain name: alb-1o44v******.cn-hangzhou.alb.aliyuncs.com
Listener protocol	HTTP (port 80)
Backend server	ECS01: IP address 10.0.2.50, port 80, weight 100. ECS02: IP address 10.0.2.51, port 80, weight 100.

Step 1: Enable the slow start mode

In this example, the configurations of an existing server group are modified to enable the slow start mode. If you do not have a server group, create a server group and enable the slow start mode.

Log on to the ALB console.
In the top navigation bar, select the region where the server group is deployed. In this example, China (Hangzhou) is selected.
In the left-side navigation pane, choose ALB > Server Groups.
On the Server Groups page, click the ID of the server group that you want to manage.
On the Details tab, click Modify Basic Information in the Basic Information section.
In the Modify Basic Information dialog box, click Advanced Settings and turn on Slow Start.
Set Slow Start Duration to 30 seconds and click Save.
Note
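If you set the slow start duration to 30 seconds, ALB gradually increases the number of requests forwarded to a newly added backend server over those 30 seconds. After the duration ends, the backend server receives requests based on its configured weight.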
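
To get a feel for what a 30-second slow start means for ECS02, the following sketch models the ramp. It assumes that the effective weight grows linearly from 0 to the configured weight over the slow start duration. The linear curve and the function name `effective_weight` are assumptions made for illustration. The actual ramp that ALB applies is not specified in this topic and may differ.

```shell
#!/bin/sh
# Hypothetical model: the effective weight ramps linearly from 0 to the
# configured weight over the slow start duration. ALB's real curve may differ.
effective_weight() {
    elapsed=$1    # seconds since the server entered the slow start mode
    duration=$2   # slow start duration in seconds (30 in this example)
    weight=$3     # configured weight (100 in this example)
    if [ "$elapsed" -ge "$duration" ]; then
        echo "$weight"
    else
        echo $(( weight * elapsed / duration ))
    fi
}

# Estimated share of requests sent to ECS02 while ECS01 stays at weight 100.
# Integer arithmetic truncates the results.
for t in 0 10 20 30; do
    w=$(effective_weight "$t" 30 100)
    echo "t=${t}s ECS02 weight=${w} share=$(( 100 * w / (100 + w) ))%"
done
```

Under this model, ECS02 reaches an even split with ECS01 only when the duration ends, because both servers then have a weight of 100.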

Step 2: Use the wrk tool to simulate client requests

In this example, a client that runs the 64-bit Alibaba Cloud Linux 3.2104 operating system is used. The installation method varies based on the operating system. For more information, see the manual of your operating system.

Log on to the client and open the command-line interface (CLI). Run the following commands to install wrk:
```
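# Install the build dependencies without prompting for confirmation
yum -y install git make gcc unzip
# Download the wrk source code and build it
git clone https://github.com/wg/wrk.git
cd wrk
make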
```
Run the following command to run a stress test on the backend server of the ALB instance:
```
./wrk -c 1000 -d 6000s -t 3 -H "Connection:Close" http://<ALB domain name>
```
Take note of the following parameters:
- -c: stands for connections and specifies the total number of HTTP connections to keep open. The connections are divided evenly among the threads.
- -d: stands for duration and specifies the test duration.
- -t: stands for threads and specifies the number of threads that you want to use to simulate concurrent clients.
- -H: stands for header and specifies the HTTP header that you want to add to requests. For example, -H "Connection:Close" specifies that a non-persistent connection header is added to each request.
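
In wrk, the `-c` value is the total number of connections, and each thread handles the total divided by the `-t` value. The following sketch works through that arithmetic for the command above. The function name `conns_per_thread` is a hypothetical helper and is not part of wrk.

```shell
#!/bin/sh
# wrk divides the total connection count (-c) among its threads (-t),
# using integer division.
conns_per_thread() {
    connections=$1
    threads=$2
    echo $(( connections / threads ))
}

# Values from the stress test command: -c 1000 -t 3
echo "connections per thread: $(conns_per_thread 1000 3)"
```

With 1000 connections and 3 threads, each thread handles 333 connections.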

Step 3: Add ECS02 to the server group

Important

Add the backend server to the server group before the stress testing duration ends.

Log on to the ALB console.
In the top navigation bar, select the region where the ALB instance is deployed.
In the left-side navigation pane, choose ALB > Server Groups.
On the Server Groups page, find the server group that you want to manage and click Modify Backend Server in the Actions column.
On the Backend Servers tab, click Add Backend Server.
In the Add Backend Server panel, select ECS02 and click Next.
In the Ports/Weights step, set the port of ECS02 to 80, keep the default weight of 100, and then click OK.
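
While the stress test runs, you can also sample responses from the client to see which backend server answers. The following sketch counts responses per server based on the index.html contents written in the prerequisites. The function name `tally_backends` and the `ALB_DOMAIN` variable are hypothetical. The sample is small, so treat the counts only as a rough indication.

```shell
#!/bin/sh
# Count how many response bodies (one per line on stdin) came from each server.
# Relies on the index.html contents deployed on ECS01 and ECS02.
tally_backends() {
    awk '
        BEGIN { ecs01 = 0; ecs02 = 0 }
        /This is ECS01/ { ecs01++ }
        /This is ECS02/ { ecs02++ }
        END { printf "ECS01=%d ECS02=%d\n", ecs01, ecs02 }
    '
}

# Usage against a live ALB instance (set ALB_DOMAIN to your ALB domain name):
#   for i in $(seq 1 50); do curl -s "http://$ALB_DOMAIN/"; done | tally_backends
```

If you repeat the sampling several times after you add ECS02, the ECS02 count is expected to rise over the slow start duration.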

Step 4: Verify the result

Log on to the ALB console.
In the top navigation bar, select the region where the ALB instance is deployed.
On the Instances page, click the ID of the ALB instance that you want to manage.

Click the Access Logs tab, and click the link on the right side of Simple Log Service in the Basic Information section to go to the Simple Log Service console. In the console, view the network traffic status of ECS02, as shown in the following figure.

The following figure shows that requests are gradually forwarded to ECS02 within the slow start duration.

Figure: Test results

No.	Description
①	Enter the following query statement to retrieve the network traffic that the ALB instance forwards to ECS02: `alb-80ri6**** and upstream_addr : "10.0.2.51:80"`. In this statement, `alb-80ri6****` is the ID of the ALB instance, and `upstream_addr` is the IP address and port of the backend server. In this example, the private IP address and port of ECS02 are used.
②	Select the time period that you want to query. In this example, `1 Minute` is selected.
③	Click Search & Analyze to query network traffic on ECS02.
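
If you export the matching log entries, you can check the ramp outside the console as well. The following sketch counts requests per 5-second bucket for one backend address. It assumes a hypothetical export format with one request per line, `<unix_timestamp> <upstream_addr>`, sorted by time. This is not the native format of ALB access logs, and the function name `requests_per_bucket` is also an assumption.

```shell
#!/bin/sh
# Input (assumed format, sorted by time): "<unix_timestamp> <upstream_addr>"
# Output: number of requests per 5-second bucket for the given backend address,
# counted from the first request that reached that address.
requests_per_bucket() {
    addr=$1
    awk -v addr="$addr" -v width=5 '
        BEGIN { max = 0; seen = 0 }
        $2 == addr {
            if (!seen) { start = $1; seen = 1 }
            bucket = int(($1 - start) / width)
            count[bucket]++
            if (bucket > max) max = bucket
        }
        END {
            if (!seen) exit
            for (b = 0; b <= max; b++)
                printf "%ds-%ds %d\n", b * width, (b + 1) * width, count[b]
        }
    '
}

# Usage (export.txt is a placeholder file name):
#   requests_per_bucket "10.0.2.51:80" < export.txt
```

During a slow start, the per-bucket counts for ECS02 are expected to increase until the slow start duration ends.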

References

For more information about how to enable the slow start mode when you create a server group, see Create a server group.
To gracefully undeploy a service, enable connection draining. For more information, see Use connection draining to implement graceful undeployment.
For more information about how to query and analyze access logs, see Query and analyze logs.