The reserved mode allows you to reserve instances in response to function invocation requests. This reduces cold starts and improves the response speed for latency-sensitive online services. You can perform scheduled auto-scaling or metric tracking auto-scaling to make better use of reserved instances.
Scheduled auto-scaling
- Definition: Scheduled auto-scaling is used to flexibly configure reserved instances. You can configure the number of reserved instances to be automatically adjusted to a specified value at a specified time so that the number of instances can meet the concurrency of your business.
- Scenario: You can use scheduled auto-scaling to reserve instances in advance of periodic or predicted traffic peaks for functions. When the number of instances invoked concurrently by a function is higher than the reserved instance concurrency, the excess instances are charged on a pay-as-you-go basis.
- Example of configuration: Two scheduled operations are configured. The first scheduled operation scales out the reserved instances before the traffic peak, and the second scheduled operation scales in the reserved instances after the traffic peak.
Example of parameter settings:
- In this example, a function named function_1 in a service named service_1 is configured to automatically scale in and out. Set the scaling period for function_1 to the period from 10:00:00 on November 1, 2020 to 10:00:00 on November 30, 2020 (UTC+8). The number of reserved instances is scaled out to 50 at 20:00 every day and scaled in to 10 at 22:00 every day.
{
  "ServiceName": "service_1",
  "FunctionName": "function_1",
  "Qualifier": "alias_1",
  "SchedulerActions": [
    {
      "Name": "action_1",
      "StartTime": "2020-11-01T10:00:00Z",
      "EndTime": "2020-11-30T10:00:00Z",
      "TargetValue": 50,
      "ScheduleExpression": "cron(0 0 20 * * *)"
    },
    {
      "Name": "action_2",
      "StartTime": "2020-11-01T10:00:00Z",
      "EndTime": "2020-11-30T10:00:00Z",
      "TargetValue": 10,
      "ScheduleExpression": "cron(0 0 22 * * *)"
    }
  ]
}
- Parameters

| Parameter | Description | Schema |
| --- | --- | --- |
| Name | The name of the scheduled auto-scaling task. | string |
| StartTime | The time when the configuration starts to take effect. Specify the value in UTC. | string |
| EndTime | The time when the configuration expires. Specify the value in UTC. | string |
| TargetValue | The number of instances to be reached. | integer (int64) |
| ScheduleExpression | The schedule expression that specifies when to run the scheduled task. Two formats are supported. At expressions, "at(yyyy-mm-ddThh:mm:ss)", run the scheduled task only once; specify the value in UTC. Cron expressions, "cron(0 0 20 * * *)", run the scheduled task multiple times; specify the value in the standard CRON format. For example, cron(0 0 20 * * *) indicates that the scheduled task is run at 20:00 every day. | string |

The following tables describe the fields and special characters of the CRON expression (Seconds Minutes Hours Day-of-month Month Day-of-week).

Table 1. Fields

| Field name | Valid value | Allowed special characters |
| --- | --- | --- |
| Seconds | 0 to 59 | None |
| Minutes | 0 to 59 | , - * / |
| Hours | 0 to 23 | , - * / |
| Day-of-month | 1 to 31 | , - * ? / |
| Month | 1 to 12 or JAN to DEC | , - * / |
| Day-of-week | 1 to 7 or MON to SUN | , - * ? |

Table 2. Special characters

| Character | Description | Example |
| --- | --- | --- |
| * | Indicates any or each. | In the Minutes field, * indicates every minute. With the Seconds field set to 0, the task is run at the 0th second of every minute. |
| , | Indicates a list of values. | In the Day-of-week field, MON,WED,FRI indicates Monday, Wednesday, and Friday. |
| - | Indicates a range. | In the Hours field, 10-12 indicates that the time range is from 10:00 to 12:00 in UTC. |
| ? | Indicates an uncertain value. This character is used together with other specified values. | If you specify a specific date but do not care what day of the week it is, use this character in the Day-of-week field. |
| / | Indicates the increment of a value. n/m means to add an increment m to n each time. | In the Minutes field, 3/5 indicates that the operation is performed every 5 minutes starting from minute 3. |
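To make the field rules in Table 1 and Table 2 concrete, the following Python sketch checks whether a given UTC time matches a six-field expression such as cron(0 0 20 * * *). It is an illustration only, not the parser that Function Compute uses. The helper names (to_int, field_matches, cron_matches) are hypothetical, numeric Day-of-week values are assumed to run from 1 (Monday) to 7 (Sunday), and n/m is assumed to keep stepping until the end of the field's range.

```python
from datetime import datetime

MONTHS = {name: i for i, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}
# Assumption: 1 = Monday ... 7 = Sunday (the tables only say "1 to 7 or MON to SUN").
DAYS = {name: i for i, name in enumerate(
    "MON TUE WED THU FRI SAT SUN".split(), start=1)}

# (minimum value, name aliases) per field, in expression order:
# Seconds, Minutes, Hours, Day-of-month, Month, Day-of-week.
FIELDS = [(0, {}), (0, {}), (0, {}), (1, {}), (1, MONTHS), (1, DAYS)]


def to_int(token, names):
    """Convert a token such as '20' or 'MON' to its integer value."""
    token = token.upper()
    return names[token] if token in names else int(token)


def field_matches(spec, value, minimum, names):
    """Return True if `value` satisfies a single cron field."""
    if spec in ("*", "?"):                    # any value / uncertain value
        return True
    for part in spec.split(","):              # ',' separates list items
        if "/" in part:                       # 'n/m': start at n, step by m
            start, step = part.split("/")
            first = minimum if start == "*" else to_int(start, names)
            if value >= first and (value - first) % int(step) == 0:
                return True
        elif "-" in part:                     # 'a-b': inclusive range
            low, high = (to_int(p, names) for p in part.split("-"))
            if low <= value <= high:
                return True
        elif to_int(part, names) == value:    # single value
            return True
    return False


def cron_matches(expression, moment):
    """Check a 'cron(s m h dom mon dow)' expression against a UTC datetime."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError("expected an expression of the form cron(...)")
    specs = expression[len("cron("):-1].split()
    if len(specs) != len(FIELDS):
        raise ValueError("expected six fields")
    values = [moment.second, moment.minute, moment.hour,
              moment.day, moment.month, moment.isoweekday()]
    return all(
        field_matches(spec, value, minimum, names)
        for spec, value, (minimum, names) in zip(specs, values, FIELDS)
    )


peak = datetime(2020, 11, 1, 20, 0, 0)
print(cron_matches("cron(0 0 20 * * *)", peak))  # True
```

The sketch does not enforce the "Allowed special characters" column of Table 1; a stricter version would reject, for example, a list in the Seconds field.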
Metric tracking auto-scaling
- Definition: Metric tracking auto-scaling tracks the metrics to dynamically scale reserved instances.
- Scenario: Function Compute periodically collects the concurrency usage rate of reserved instances, and uses this metric together with the scale-out and scale-in trigger values you configured to control the scaling of reserved instances. In this way, the number of reserved instances can be scaled based on your business needs.
- Principle: Reserved instances are scaled in or out every minute based on the metric.
- When the metric exceeds the scale-out threshold, the system scales out the number of instances to the destination value as soon as possible.
- When the metric is lower than the scale-in threshold, the system gradually scales in the number of instances toward the destination value.
- Example of configuration:
- When the traffic increases and the number of required instances reaches 80% of the scale-out threshold, the number of reserved instances starts to scale out until it reaches the scale-out threshold. Requests that cannot be processed by reserved instances are sent to pay-as-you-go instances.
- When the traffic decreases and the number of required instances reaches 60% of the scale-in threshold, the number of reserved instances starts to scale in.
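The per-minute evaluation described in the principle above can be pictured with a small Python sketch. The function name scaling_decision and the threshold values 0.8 and 0.6 are chosen for illustration, and keeping the instance count unchanged when the metric lies between the two thresholds is an assumption rather than documented behavior.

```python
def scaling_decision(metric, scale_out_threshold, scale_in_threshold):
    """Classify one per-minute metric sample (all values in the range 0 to 1)."""
    if metric > scale_out_threshold:
        return "scale_out"   # grow toward the destination value quickly
    if metric < scale_in_threshold:
        return "scale_in"    # shrink toward the destination value gradually
    return "hold"            # assumption: no change between the thresholds


# One sample per minute; the values are made up for illustration.
samples = [0.55, 0.70, 0.90, 0.75, 0.40]
for minute, metric in enumerate(samples):
    print(minute, metric, scaling_decision(metric, 0.8, 0.6))
```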
Only statistics on reserved instances are collected to calculate the concurrency usage rate of reserved instances. Statistics on pay-as-you-go instances are not included. The metric is calculated based on the following formula: Concurrency usage rate = Number of concurrent requests that reserved instances are processing / Maximum number of concurrent requests that all reserved instances can process. The metric value ranges from 0 to 1.
The maximum number of concurrent requests that reserved instances can process depends on how the function processes requests:
- Single request processed by one instance: Maximum concurrency = Number of instances.
- Multiple requests processed by one instance: Maximum concurrency = Number of instances × Number of requests concurrently processed by one instance.
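As a rough illustration of the formula and the two concurrency cases above, the following Python sketch computes the concurrency usage rate. The function and parameter names are hypothetical, and rejecting inputs outside the reserved capacity is an assumption made to keep the result within the documented range of 0 to 1.

```python
def max_concurrency(instances, requests_per_instance=1):
    """Maximum concurrent requests that the reserved instances can process.

    requests_per_instance=1 models "single request processed by one instance".
    """
    return instances * requests_per_instance


def concurrency_usage_rate(active_requests, instances, requests_per_instance=1):
    """Share of the reserved capacity that is in use, in the range 0 to 1.

    Only requests served by reserved instances count; requests served by
    pay-as-you-go instances must not be included in active_requests.
    """
    capacity = max_concurrency(instances, requests_per_instance)
    if capacity <= 0:
        raise ValueError("at least one reserved instance is required")
    if not 0 <= active_requests <= capacity:
        raise ValueError("active_requests must lie between 0 and the capacity")
    return active_requests / capacity


# 10 reserved instances, one request each, 6 requests in flight.
print(concurrency_usage_rate(6, 10))        # 0.6
# 10 reserved instances, 5 requests each, 45 requests in flight.
print(concurrency_usage_rate(45, 10, 5))    # 0.9
```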
Scale-in and scale-out values:
- The values are determined by the current metric value, scaling threshold, number of reserved instances, and scaling factor.
- Calculation principle: The system scales in based on a scale-in factor. Value range: (0,1]. You do not need to set this factor. The scale-in and scale-out values are the following calculation results, rounded up to integers:
- Scale-out value = (Current metric value/Scale-out threshold) × Number of reserved instances.
- Scale-in ratio = (1 - Current metric value/Scale-out threshold) × Scale-in factor.
- Scale-in value = Current number of instances × (1 - Scale-in ratio).
- Example: The current metric value is 90%, the scale-out threshold is 80%, and the number of reserved instances is 100. The scale-out value = (90%/80%) × 100 = 112.5, which is rounded up to 113. The number of reserved instances is increased to 113.
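The formulas above can be worked through in a short Python sketch. This is a sketch under stated assumptions rather than the service's implementation: the function names are hypothetical, the threshold is used exactly as written in the formulas, and the scale-in factor of 0.5 is an arbitrary illustrative value because the real factor is managed by the system. Exact fractions are used so that rounding up is not affected by floating-point error.

```python
from fractions import Fraction
from math import ceil


def exact(number):
    """Convert an int, a float such as 0.9, or a string to an exact fraction."""
    return Fraction(str(number))


def scale_out_value(metric, threshold, reserved_instances):
    """(Current metric value / threshold) x number of reserved instances, rounded up."""
    return ceil(exact(metric) / exact(threshold) * reserved_instances)


def scale_in_value(metric, threshold, current_instances, scale_in_factor):
    """Current number of instances x (1 - scale-in ratio), rounded up."""
    ratio = (1 - exact(metric) / exact(threshold)) * exact(scale_in_factor)
    return ceil(current_instances * (1 - ratio))


# Documented example: metric 90%, threshold 80%, 100 reserved instances.
print(scale_out_value(0.9, 0.8, 100))        # 113
# Illustrative scale-in: metric 40%, threshold 80%, assumed factor 0.5.
print(scale_in_value(0.4, 0.8, 100, 0.5))    # 75
```

In the scale-in case, the ratio is (1 - 0.4/0.8) × 0.5 = 0.25, so the instance count drops from 100 to 75 in a single step instead of falling straight to the level that the metric alone would suggest.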
Example of parameter settings:
- In this example, a function named function_1 in a service named service_1 is configured to automatically scale in and out based on the ProvisionedConcurrencyUtilization metric. Set the scaling period from 10:00:00 on November 1, 2020 to 10:00:00 on November 30, 2020. When the concurrency usage rate exceeds 60%, the number of reserved instances can scale out to 100. When the concurrency usage rate is lower than 60%, the number of reserved instances can scale in to 10.
{
  "ServiceName": "service_1",
  "FunctionName": "function_1",
  "Qualifier": "alias_1",
  "TargetTrackingPolicies": [
    {
      "Name": "action_1",
      "StartTime": "2020-11-01T10:00:00Z",
      "EndTime": "2020-11-30T10:00:00Z",
      "MetricType": "ProvisionedConcurrencyUtilization",
      "MetricTarget": 0.6,
      "MinCapacity": 10,
      "MaxCapacity": 100
    }
  ]
}
- Parameters

| Parameter | Description | Schema |
| --- | --- | --- |
| Name | The name of the metric tracking auto-scaling task. | string |
| StartTime | The time when the configuration starts to take effect. Specify the value in UTC. | string |
| EndTime | The time when the configuration expires. Specify the value in UTC. | string |
| MetricType | The tracked metric: ProvisionedConcurrencyUtilization. | string |
| MetricTarget | The tracked value of the metric. | double |
| MinCapacity | The minimum scale-in value. | integer (int64) |
| MaxCapacity | The maximum scale-out value. | integer (int64) |
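Before submitting a policy like the one in this example, it can help to sanity-check it locally. The following Python sketch does so with the standard library only. The helper names (parse_utc, check_policy, clamp_to_capacity) are hypothetical, the checks are assumptions derived from the parameter descriptions rather than the service's actual validation rules, and limiting a computed scaling value to the range from MinCapacity to MaxCapacity is likewise an assumed reading of those two parameters.

```python
import json
from datetime import datetime, timezone

POLICY_JSON = """
{
  "Name": "action_1",
  "StartTime": "2020-11-01T10:00:00Z",
  "EndTime": "2020-11-30T10:00:00Z",
  "MetricType": "ProvisionedConcurrencyUtilization",
  "MetricTarget": 0.6,
  "MinCapacity": 10,
  "MaxCapacity": 100
}
"""


def parse_utc(text):
    """Parse a timestamp such as 2020-11-01T10:00:00Z as a UTC datetime."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def check_policy(policy):
    """Return a list of problems found in one policy (empty if none)."""
    problems = []
    if policy["MetricType"] != "ProvisionedConcurrencyUtilization":
        problems.append("unsupported MetricType")
    if not 0 < policy["MetricTarget"] <= 1:   # the metric ranges from 0 to 1
        problems.append("MetricTarget must be greater than 0 and at most 1")
    if policy["MinCapacity"] > policy["MaxCapacity"]:
        problems.append("MinCapacity exceeds MaxCapacity")
    if parse_utc(policy["StartTime"]) >= parse_utc(policy["EndTime"]):
        problems.append("StartTime must be earlier than EndTime")
    return problems


def clamp_to_capacity(value, policy):
    """Assumption: scaling values stay within MinCapacity..MaxCapacity."""
    return max(policy["MinCapacity"], min(value, policy["MaxCapacity"]))


policy = json.loads(POLICY_JSON)
print(check_policy(policy))            # []
print(clamp_to_capacity(113, policy))  # 100
print(clamp_to_capacity(3, policy))    # 10
```

Because json.loads rejects trailing commas, parsing the policy this way also catches a common copy-and-paste mistake before the configuration is sent.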