In addition to setting a fixed number of provisioned instances, you can make flexible adjustments by configuring scheduled and threshold-based scaling policies. This helps improve instance utilization.
Scheduled scaling
Threshold-based scaling
Scenarios
Choose scheduled scaling when your function experiences distinct periodic patterns or predictable traffic peaks. When the number of concurrent invocations exceeds the capacity defined by the scheduled scaling policy, all excess requests will be directed to on-demand instances for processing.
Sample configuration
The following figure shows two scheduled actions for instance scaling: the first action scales out the provisioned instances before the traffic peak, while the second scales in the instances afterward.

The following code snippet shows how to call the PutProvisionConfig operation to configure scheduled scaling policies. In this example, a function named function_1 is configured to automatically scale in and out, with the time zone set to Asia/Shanghai (UTC+8). The configurations take effect from 10:00:00 on August 1, 2024, to 10:00:00 on August 30, 2024 (UTC+8). During this period, the number of provisioned instances is increased to 50 at 20:00 (UTC+8) and reduced to 10 at 22:00 (UTC+8) each day.
"scheduledActions": [
  {
    "name": "scale_up_action",
    "startTime": "2024-08-01T10:00:00",
    "endTime": "2024-08-30T10:00:00",
    "target": 50,
    "scheduleExpression": "cron(0 0 20 * * *)",
    "timeZone": "Asia/Shanghai"
  },
  {
    "name": "scale_down_action",
    "startTime": "2024-08-01T10:00:00",
    "endTime": "2024-08-30T10:00:00",
    "target": 10,
    "scheduleExpression": "cron(0 0 22 * * *)",
    "timeZone": "Asia/Shanghai"
  }
]
The following table describes the parameters in the code snippet.
Parameter | Description |
name | The name of the scheduled scaling task. |
startTime | The time when the scaling policy starts to take effect. The system defaults to using UTC if you do not specify a time zone. |
endTime | The time when the scaling policy expires. The system defaults to using UTC if you do not specify a time zone. |
target | The target number of provisioned instances. |
scheduleExpression | The schedule information. The system defaults to using UTC if you do not specify a time zone. The following formats are supported: At expressions - "at(yyyy-mm-ddThh:mm:ss)": runs the scheduled task only once. For example, to run the scheduled task at 20:00 on April 1, 2024 (UTC+8), set the time zone to Asia/Shanghai and set this parameter to at(2024-04-01T20:00:00). Cron expressions - "cron(0 0 4 * * *)": runs the scheduled task repeatedly. Set the value in the six-field cron format described in the Cron expressions section. For example, to run the scheduled task at 20:00 (UTC+8) every day, set the time zone to Asia/Shanghai and set this parameter to cron(0 0 20 * * *). |
timeZone | The specified time zone. |
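As a rough illustration of the parameters above, the following Python sketch assembles the same scheduledActions payload as a dictionary and prints it as JSON. The helper name scheduled_action is hypothetical and not part of any SDK. The sketch only builds the request fragment and does not call the PutProvisionConfig operation.

```python
import json


def scheduled_action(name, target, cron, start, end, tz="Asia/Shanghai"):
    """Build one scheduledActions entry (hypothetical helper, not an SDK call)."""
    return {
        "name": name,
        "startTime": start,
        "endTime": end,
        "target": target,
        # The cron body uses six fields: Seconds Minutes Hours Day-of-month Month Day-of-week
        "scheduleExpression": f"cron({cron})",
        "timeZone": tz,
    }


# Effective window taken from the sample configuration
window = ("2024-08-01T10:00:00", "2024-08-30T10:00:00")

payload = {
    "scheduledActions": [
        scheduled_action("scale_up_action", 50, "0 0 20 * * *", *window),
        scheduled_action("scale_down_action", 10, "0 0 22 * * *", *window),
    ]
}

print(json.dumps(payload, indent=2))
```

Generating both entries from one helper keeps the effective window and time zone consistent between the scale-out and scale-in actions.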
Cron expressions
The following table describes the fields of a cron expression in the format of Seconds Minutes Hours Day-of-month Month Day-of-week.
Field | Valid values | Allowed special characters |
Seconds | 0 to 59 | None |
Minutes | 0 to 59 | , - * / |
Hours | 0 to 23 | , - * / |
Day-of-month | 1 to 31 | , - * ? / |
Month | 1 to 12 or JAN to DEC | , - * / |
Day-of-week | 1 to 7 or MON to SUN | , - * ? |
The following table describes the special characters in a cron expression.
Character | Description | Example |
* | Indicates any or each. | In the Minutes field, * indicates that the task is run every minute. |
, | Specifies a list of values. | In the Day-of-week field, MON, WED, FRI indicates every Monday, Wednesday, and Friday. |
- | Specifies a range. | In the Hours field, 10-12 indicates a time range from 10:00 to 12:00 in your specified time zone. |
? | Indicates no specific value. | This character is used together with specified values in other fields. For example, when you specify a date without tying it to a particular day of the week, use this character in the Day-of-week field. |
/ | Specifies increments. n/m indicates an increment of m starting from the position of n. | In the Minutes field, 3/5 indicates that the task is run every 5 minutes starting from the third minute. |
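To make the field and special-character rules above concrete, the following Python sketch checks whether a six-field cron expression matches a given local time. It is a simplified illustration, not the service's own parser. It assumes that 1 maps to MON in the Day-of-week field, treats ? the same as *, and requires both day fields to match. The function names are hypothetical.

```python
from datetime import datetime

MONTHS = {m: i for i, m in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}
# Assumption: 1 = MON ... 7 = SUN, following the order given in the field table
DAYS = {d: i for i, d in enumerate("MON TUE WED THU FRI SAT SUN".split(), start=1)}


def field_matches(spec, value, lo, hi, names=None):
    """Check one cron field against a value, supporting * ? , - and /."""
    names = names or {}

    def num(token):
        token = token.strip().upper()
        return names[token] if token in names else int(token)

    for part in spec.split(","):
        base, _, step = part.strip().partition("/")
        step = int(step) if step else 1
        if base in ("*", "?"):
            start, end = lo, hi
        elif "-" in base:
            a, b = base.split("-")
            start, end = num(a), num(b)
        else:
            start = num(base)
            # n/m runs from n to the end of the field; a bare n is a single value
            end = hi if "/" in part else start
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expr, when):
    """Return True if the six-field expression matches the local time `when`."""
    sec, minute, hour, dom, month, dow = expr.split()
    return (field_matches(sec, when.second, 0, 59)
            and field_matches(minute, when.minute, 0, 59)
            and field_matches(hour, when.hour, 0, 23)
            and field_matches(dom, when.day, 1, 31)
            and field_matches(month, when.month, 1, 12, MONTHS)
            and field_matches(dow, when.isoweekday(), 1, 7, DAYS))


print(cron_matches("0 0 20 * * *", datetime(2024, 8, 1, 20, 0, 0)))
print(cron_matches("0 3/5 * * * ?", datetime(2024, 8, 1, 9, 8, 0)))
```

The datetime passed in is interpreted as local time in whatever time zone the policy specifies. The sketch performs no time zone conversion.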
Threshold-based scaling
Scenarios
After you configure a threshold-based scaling policy, Function Compute periodically collects the concurrency or resource utilization metrics for the provisioned instances. It uses these metrics, along with the minimum and maximum numbers of provisioned instances you specify, to control instance scaling, ensuring the number of instances aligns more closely with actual resource usage.
Sample configuration
The following figure shows an example of auto scaling based on the utilization of instance concurrency. When the traffic volume increases, the scale-out threshold is triggered and Function Compute starts to increase the number of provisioned instances. The scale-out stops when the number reaches the upper limit you set. Excess requests are sent to on-demand instances for processing.

Note
To configure a threshold-based scaling policy, you must first enable the collection of instance-level metrics. Otherwise, a 400 InstanceMetricsRequired error is reported. For more information, see Enable collection of instance-level metrics.
The concurrency utilization metric includes only the concurrency of provisioned instances, excluding that of on-demand instances.
The concurrency utilization metric evaluates the ratio of concurrent requests handled by provisioned instances to the maximum number of concurrent requests that all provisioned instances can handle. The value of the metric can range from 0 to 1.
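The ratio described above can be sketched in a few lines of Python. The function and parameter names are hypothetical. In particular, per_instance_concurrency stands for the number of concurrent requests a single provisioned instance can handle, and is not a parameter named in this document.

```python
def concurrency_utilization(requests_on_provisioned, provisioned_instances,
                            per_instance_concurrency):
    """Ratio of in-flight requests on provisioned instances to their total capacity."""
    capacity = provisioned_instances * per_instance_concurrency
    if capacity == 0:
        return 0.0  # no provisioned capacity, so nothing to utilize
    # Requests beyond capacity go to on-demand instances and are not counted,
    # so the ratio is capped at 1
    return min(requests_on_provisioned / capacity, 1.0)


# 10 provisioned instances, each handling up to 20 concurrent requests,
# with 120 requests in flight on them
print(concurrency_utilization(120, 10, 20))
```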
The following code snippet shows how to call the PutProvisionConfig operation to configure threshold-based scaling policies. In this example, a function named function_1 is configured to automatically scale in and out based on the ProvisionedConcurrencyUtilization metric, which tracks the concurrency utilization of provisioned instances. The time zone is set to Asia/Shanghai (UTC+8). The configurations take effect from 10:00:00 on August 1, 2024, to 10:00:00 on August 30, 2024 (UTC+8). During this period, when concurrency utilization exceeds 60%, the number of provisioned instances is increased, up to a maximum of 100. Conversely, when concurrency utilization falls below 60%, the number of provisioned instances is reduced, down to a minimum of 10.
"targetTrackingPolicies": [
  {
    "name": "action_1",
    "startTime": "2024-08-01T10:00:00",
    "endTime": "2024-08-30T10:00:00",
    "metricType": "ProvisionedConcurrencyUtilization",
    "metricTarget": 0.6,
    "minCapacity": 10,
    "maxCapacity": 100,
    "timeZone": "Asia/Shanghai"
  }
]
The following table describes the parameters in the code snippet.
Parameter | Description |
name | The name of the threshold-based scaling task. |
startTime | The time when the scaling policy starts to take effect. The system defaults to using UTC if you do not specify a time zone. |
endTime | The time when the scaling policy expires. The system defaults to using UTC if you do not specify a time zone. |
metricType | The metric that is tracked. In this example, the value is set to ProvisionedConcurrencyUtilization. |
metricTarget | The threshold that triggers auto scaling. |
minCapacity | The minimum number of provisioned instances allowed. |
maxCapacity | The maximum number of provisioned instances allowed. |
timeZone | The specified time zone. |
Scaling principles
When instance scale-in is triggered, Function Compute gradually reduces the number of provisioned instances based on a scale-in coefficient that is greater than 0 and less than or equal to 1. The scale-in coefficient is a system parameter that slows down scale-in and does not require manual configuration. The target value for each scaling task is the result of the corresponding formula below, rounded up to the nearest integer:
Scale-out target value = Current provisioned instances × (Current metric value/Specified utilization threshold)
Scale-in target value = Current provisioned instances × Scale-in coefficient × (1 - Current metric value/Specified utilization threshold)
The following example shows how to calculate the scale-out target. The scale-in target is calculated in the same way by using the scale-in formula.
If the current metric value is 80%, the specified utilization threshold is 40%, and the current number of provisioned instances is 100, then the target number of instances is calculated as follows: 100 × (80%/40%) = 200. The number of provisioned instances is increased to 200 (as long as this does not exceed the maximum allowed) to ensure that utilization stays around 40%.
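The two formulas can be written out as a short Python sketch. It follows the formulas exactly as stated above, rounds up, and clamps the result to the configured minimum and maximum. The function names are hypothetical, and the scale-in coefficient of 0.5 in the usage line is a placeholder because the actual system value is not given in this document.

```python
import math


def _round_up(value):
    # Round away floating-point noise before taking the ceiling
    return math.ceil(round(value, 9))


def scale_out_target(current, metric, threshold, max_capacity):
    """Current instances x (metric / threshold), rounded up, capped at max_capacity."""
    return min(_round_up(current * (metric / threshold)), max_capacity)


def scale_in_target(current, metric, threshold, coefficient, min_capacity):
    """Current instances x coefficient x (1 - metric / threshold), rounded up,
    never below min_capacity. The coefficient is a system parameter."""
    return max(_round_up(current * coefficient * (1 - metric / threshold)), min_capacity)


# Worked example from the text: metric 80%, threshold 40%, 100 instances
print(scale_out_target(100, 0.8, 0.4, max_capacity=300))

# Placeholder coefficient of 0.5, for illustration only
print(scale_in_target(100, 0.2, 0.4, coefficient=0.5, min_capacity=10))
```

If max_capacity were lower than the computed value, for example 150, the scale-out target would be capped at 150, which matches the upper-limit behavior described earlier.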
The following example clarifies how the target values specified by the defaultTarget parameter and scheduled scaling policies determine the number of provisioned instances at a specific time. In this example, the defaultTarget parameter is set to 5, and two scheduled scaling policies are configured, using the Asia/Shanghai time zone (UTC+8). The configurations take effect from 10:00:00 on January 9, 2025, to 00:00:00 on January 11, 2025 (UTC+8). During this period, the number of provisioned instances is increased to 20 at 10:00 (UTC+8) and reduced to 10 at 22:00 (UTC+8) each day. The following code snippet shows the content of the scaling policies:
{
  "defaultTarget": 5,
  "scheduledActions": [
    {
      "name": "scale_up_action",
      "startTime": "2025-01-09T10:00:00",
      "endTime": "2025-01-11T00:00:00",
      "target": 20,
      "scheduleExpression": "cron(0 0 10 * * *)",
      "timeZone": "Asia/Shanghai"
    },
    {
      "name": "scale_down_action",
      "startTime": "2025-01-09T10:00:00",
      "endTime": "2025-01-11T00:00:00",
      "target": 10,
      "scheduleExpression": "cron(0 0 22 * * *)",
      "timeZone": "Asia/Shanghai"
    }
  ]
}
The following figure illustrates the changes in the number of provisioned instances over time:
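The timeline implied by this configuration can also be approximated in Python. The sketch below lists when each scheduled action fires inside the effective window and looks up the resulting instance count at a given time. It rests on two assumptions that this document does not state explicitly: that an action firing exactly at startTime takes effect, and that the count equals defaultTarget outside the effective window. All times are local to Asia/Shanghai, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

start = datetime(2025, 1, 9, 10, 0, 0)
end = datetime(2025, 1, 11, 0, 0, 0)


def daily_firings(hour):
    """Yield each day's hour:00:00 that falls within [start, end]."""
    t = start.replace(hour=hour, minute=0, second=0, microsecond=0)
    while t <= end:
        if t >= start:
            yield t
        t += timedelta(days=1)


# (time, target) pairs for both scheduled actions, in chronological order
events = sorted(
    [(t, 20) for t in daily_firings(10)] + [(t, 10) for t in daily_firings(22)]
)


def instances_at(when, default_target=5):
    """Provisioned instance count at `when` under the stated assumptions."""
    if not (start <= when < end):
        return default_target  # assumption: fall back to defaultTarget
    current = default_target
    for t, target in events:
        if t <= when:
            current = target
    return current


for t, target in events:
    print(t.isoformat(), "->", target)
```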