Platform For AI: CreateServiceAutoScaler

Last Updated: Nov 22, 2024

Enables the Autoscaler feature and creates an Autoscaler controller for a service.

Debugging

You can call this operation directly in OpenAPI Explorer, which spares you from calculating signatures. After a successful call, OpenAPI Explorer can automatically generate SDK code samples.

Authorization information

The following table shows the authorization information for this API operation. You can use this information in the Action element of a policy to grant a RAM user or RAM role the permissions to call the operation. The columns are described as follows:

  • Operation: the value that you can use in the Action element to specify the operation on a resource.
  • Access level: the access level of each operation. The levels are read, write, and list.
  • Resource type: the type of the resource on which you can authorize the RAM user or the RAM role to perform the operation. Take note of the following items:
    • The required resource types are displayed in bold characters.
    • If the permissions cannot be granted at the resource level, All Resources is used in the Resource type column of the operation.
  • Condition Key: the condition key that is defined by the cloud service.
  • Associated operation: other operations that the RAM user or the RAM role must have permissions to perform in order to complete the operation.
| Operation | Access level | Resource type | Condition key | Associated operation |
| --- | --- | --- | --- | --- |
| eas:CreateServiceAutoScaler | create | *Service acs:eas:{#regionId}:{#accountId}:service/{#ServiceName} | none | none |
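
As an illustration of how the authorization information above could be used, the following sketch builds a policy document that allows this operation on a single service. It assumes the usual RAM policy layout (Version, Statement, Effect, Action, Resource); the helper name `build_ram_policy` and the account ID are hypothetical placeholders, not values from this page.

```python
import json


def build_ram_policy(region_id, account_id, service_name):
    """Return a policy dict that allows eas:CreateServiceAutoScaler on one service.

    The resource string follows the pattern listed in the table:
    acs:eas:{#regionId}:{#accountId}:service/{#ServiceName}
    """
    resource = f"acs:eas:{region_id}:{account_id}:service/{service_name}"
    return {
        "Version": "1",  # assumed policy version string
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "eas:CreateServiceAutoScaler",
                "Resource": resource,
            }
        ],
    }


# The account ID below is a made-up placeholder.
policy = build_ram_policy("cn-shanghai", "1234567890", "foo")
print(json.dumps(policy, indent=2))
```

Scoping the Resource to one service keeps the grant narrow; whether a wildcard resource is acceptable depends on your own access requirements.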

Request syntax

POST /api/v2/services/{ClusterId}/{ServiceName}/autoscaler HTTP/1.1
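
The path template above can be filled in programmatically. The sketch below only builds the request path; the helper name `autoscaler_path` is hypothetical, and the endpoint host and request signing are deliberately left out because they are not specified here.

```python
from urllib.parse import quote


def autoscaler_path(cluster_id, service_name):
    """Fill in /api/v2/services/{ClusterId}/{ServiceName}/autoscaler.

    Each segment is percent-encoded so that unusual characters
    cannot change the structure of the path.
    """
    cluster = quote(cluster_id, safe="")
    service = quote(service_name, safe="")
    return f"/api/v2/services/{cluster}/{service}/autoscaler"


# Uses the example values from the request parameters.
print("POST", autoscaler_path("cn-shanghai", "foo"))
```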

Request parameters

Fields nested in the request body are shown as dotted paths.

| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| ClusterId | string | Yes | The ID of the region where the service is deployed. | cn-shanghai |
| ServiceName | string | Yes | The service name. For more information about how to query the service name, see ListServices. | foo |
| body | object | No | The request body. | |
| body.min | integer | Yes | The minimum number of instances in the service. | 2 |
| body.max | integer | Yes | The maximum number of instances in the service. The value of max must be greater than the value of min. | 8 |
| body.scaleStrategies | array<object> | Yes | The auto scaling strategies. Each element specifies a metric and the threshold that triggers scaling. | |
| body.scaleStrategies[] | object | No | | |
| body.scaleStrategies[].metricName | string | Yes | The name of the metric for triggering auto scaling. Valid values: qps (the queries per second (QPS) for an individual instance), cpu (the CPU utilization), gpu[util] (the GPU utilization). | qps |
| body.scaleStrategies[].threshold | float | Yes | The threshold of the metric that triggers auto scaling. If you set metricName to qps, scale-out is triggered when the average QPS for a single instance is greater than this threshold. If you set metricName to cpu, scale-out is triggered when the average CPU utilization for a single instance is greater than this threshold. If you set metricName to gpu, scale-out is triggered when the average GPU utilization for a single instance is greater than this threshold. | 10 |
| body.scaleStrategies[].service | string | No | The service for which the metric is specified. If you do not set this parameter, the current service is specified by default. | demo_svc |
| body.behavior | object | No | The Autoscaler operation. | |
| body.behavior.scaleUp | object | No | The scale-out operation. | |
| body.behavior.scaleUp.stabilizationWindowSeconds | integer | No | The time window that is required before the scale-out operation is performed. The scale-out operation can be performed only if the specified metric exceeds the specified threshold in the specified time window. Default value: 0. | 0 |
| body.behavior.scaleDown | object | No | The scale-in operation. | |
| body.behavior.scaleDown.stabilizationWindowSeconds | integer | No | The time window that is required before the scale-in operation is performed. The scale-in operation can be performed only if the specified metric drops below the specified threshold in the specified time window. Default value: 300. | 300 |
| body.behavior.onZero | object | No | The operation that reduces the number of instances to 0. | |
| body.behavior.onZero.scaleDownGracePeriodSeconds | integer | No | The time window that is required before the number of instances is reduced to 0. The number of instances can be reduced to 0 only if no request is available or no traffic exists in the specified time window. Default value: 600. | 600 |
| body.behavior.onZero.scaleUpActivationReplicas | integer | No | The number of instances that you want to create at a time if the number of instances is 0. Default value: 1. | 1 |
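
To make the shape of the request body concrete, here is a minimal sketch that assembles it from the parameters above and checks the one constraint stated explicitly (max must be greater than min). The function name `build_autoscaler_body` is hypothetical, and the nesting of `behavior` under the body is read from the parameter order; treat it as an assumption. The metric name is not validated because the accepted spellings are not fully pinned down here.

```python
import json


def build_autoscaler_body(min_replicas, max_replicas, strategies, behavior=None):
    """Assemble the JSON body for CreateServiceAutoScaler.

    strategies: list of dicts with metricName, threshold and optional service.
    behavior:   optional dict with scaleUp / scaleDown / onZero settings.
    """
    if max_replicas <= min_replicas:
        raise ValueError("max must be greater than min")
    for strategy in strategies:
        # metricName and threshold are marked as required for each strategy.
        missing = {"metricName", "threshold"} - set(strategy)
        if missing:
            raise ValueError(f"strategy is missing fields: {sorted(missing)}")
    body = {
        "min": min_replicas,
        "max": max_replicas,
        "scaleStrategies": list(strategies),
    }
    if behavior is not None:
        body["behavior"] = behavior
    return body


# Example values and defaults are the ones listed in the parameters.
body = build_autoscaler_body(
    2,
    8,
    [{"metricName": "qps", "threshold": 10}],
    behavior={
        "scaleUp": {"stabilizationWindowSeconds": 0},
        "scaleDown": {"stabilizationWindowSeconds": 300},
        "onZero": {
            "scaleDownGracePeriodSeconds": 600,
            "scaleUpActivationReplicas": 1,
        },
    },
)
print(json.dumps(body, indent=2))
```

Validating on the client side is optional, but it surfaces an invalid min/max pair before a request is sent.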

Response parameters

| Parameter | Type | Description | Example |
| --- | --- | --- | --- |
| | object | The response parameters. | |
| RequestId | string | The request ID. | 40325405-579C-4D82**** |
| Message | string | The returned message. | Succeed to auto scale service [foo] |

Examples

Sample success responses

JSON format

{
  "RequestId": "40325405-579C-4D82****",
  "Message": "Succeed to auto scale service [foo]"
}
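
A caller would typically read the two documented fields from this response. The sketch below parses the sample text with the standard library; the helper name `parse_autoscaler_response` is hypothetical, and error responses are not handled because their format is not described here.

```python
import json

# The sample success response, copied as a string.
raw = (
    '{"RequestId": "40325405-579C-4D82****", '
    '"Message": "Succeed to auto scale service [foo]"}'
)


def parse_autoscaler_response(text):
    """Return (RequestId, Message) from a success response body."""
    data = json.loads(text)
    return data["RequestId"], data["Message"]


request_id, message = parse_autoscaler_response(raw)
print(request_id)
print(message)
```

Logging the RequestId alongside the message is useful when a call needs to be traced later.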

Error codes

For a list of error codes, see Service error codes.

Change history

| Change time | Summary of changes | Operation |
| --- | --- | --- |
| 2023-05-17 | The internal configuration of the API is changed, but the call is not affected | View Change Details |
| 2022-09-16 | The internal configuration of the API is changed, but the call is not affected | View Change Details |
| 2022-09-16 | The internal configuration of the API is changed, but the call is not affected | View Change Details |