Elastic Algorithm Service (EAS) on Platform for AI (PAI) introduces the elastic resource pool capability, enabling seamless service scaling beyond dedicated resource group limitations. When your dedicated resources reach capacity during scale-out operations, EAS automatically provisions additional service instances in pay-as-you-go public resources while maintaining cost-efficient billing. During scale-in events, the system prioritizes releasing public resource instances first, optimizing your resource utilization and cost management.
Prerequisites
Ensure you have created a dedicated resource group. Refer to Work with EAS resource groups for detailed setup instructions.
Background information
You can provision subscription or pay-as-you-go instances within your EAS dedicated resource group, allowing you to acquire adequate computing resources through cost-effective purchasing strategies.
In practical scenarios, you'll likely need your services within dedicated resource groups to accommodate dynamic scaling requirements. For instance, during traffic peaks, you may need additional pay-as-you-go resources that can automatically scale down during quieter periods. While EAS offers automatic horizontal scaling to dynamically add and remove service instances, dedicated resource groups face inherent limitations—the maximum service instances are constrained by available node resources, and manual resource adjustment proves both inefficient and cumbersome. The elastic resource pool feature resolves this constraint by enabling service instance creation in public resources during horizontal scaling operations.
Benefits
By combining the elastic resource pool feature with horizontal auto scaling, you can achieve automated service scaling in dedicated resource groups based on key performance metrics like queries per second (QPS) and CPU utilization, transcending traditional node resource limitations. This hybrid approach leverages both subscription and pay-as-you-go billing models to optimize your operational costs while maintaining service performance.
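The placement policy described above can be sketched as a toy model. The following Python snippet is purely illustrative and is not an EAS API: it only mimics the documented behavior of filling the dedicated resource group first, spilling over to public resources during scale-out, and releasing public instances first during scale-in.

```python
# Illustrative only: a toy model of the placement policy described above.
# None of these names are EAS APIs.

def scale_out(dedicated, public, dedicated_capacity, count):
    """Place `count` new instances, preferring the dedicated group."""
    for _ in range(count):
        if dedicated < dedicated_capacity:
            dedicated += 1   # room left in the dedicated resource group
        else:
            public += 1      # spill over to pay-as-you-go public resources
    return dedicated, public

def scale_in(dedicated, public, count):
    """Remove `count` instances, releasing public instances first."""
    for _ in range(count):
        if public > 0:
            public -= 1      # public instances are released first
        else:
            dedicated -= 1
    return dedicated, public

# The dedicated group holds at most 3 instances; scale out by 3 from 2.
d, p = scale_out(dedicated=2, public=0, dedicated_capacity=3, count=3)
print(d, p)   # 3 2 -> one instance fits in the dedicated group, two spill over

d, p = scale_in(d, p, count=3)
print(d, p)   # 2 0 -> both public instances are released first
```

This is only a mental model; in practice the scale-out and scale-in decisions themselves are driven by the horizontal auto scaling metrics you configure.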
Procedure
Enable auto scaling during service deployment
Use the console
- Log on to the PAI console. Select a region at the top of the page, then select the desired workspace and click Elastic Algorithm Service (EAS).
- Click Deploy Service. In the Custom Model Deployment section, click Custom Deployment.
- Navigate to the Resource Deployment section of the Custom Deployment page and configure the key parameters described below. For details about other parameters, see Model service deployment by using the PAI console.
  - Resource Type: Select EAS Resource Group.
  - Resource Group: Select an existing dedicated resource group from the drop-down list.
  - Elastic Resource Pool: Turn on the Elastic Resource Pool switch and select a public resource group as the Resource Type to enable the elastic resource pool for services deployed in the dedicated resource group. When Elastic Resource Pool is turned on and the dedicated resource group reaches full capacity, the system automatically creates pay-as-you-go instances in the public resource group during scale-out. These instances are billed at public resource prices and are released first during scale-in.

- Click Deploy to complete the deployment.
Use the client
You can configure auto-scaling capabilities during service deployment using the EASCMD client. The following instructions demonstrate the process using a Windows 64-bit server as an example.
- Configure a JSON file.
Important: Resource configuration methods and Virtual Private Cloud (VPC) direct connection capabilities differ based on the resource group type of your service. In public resource groups, use the cloud.computing parameter to specify node types and allocate additional resources for the service, and the cloud.networking parameter to enable VPC direct connection. For services deployed in dedicated resource groups, VPC direct connection can be configured only at the resource group level. However, if you deploy a service in a dedicated resource group with the elastic resource pool enabled, you must configure the cloud.networking parameter so that VPC direct connection remains available during service scaling.
The following code provides sample content of the JSON file:
{
  "model_path": "http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/lr.pmml",
  "name": "test_burstable_service",
  "processor": "pmml",
  "metadata": {
    "instance": 1,
    "cpu": 1,
    "resource": "eas-r-xxx",
    "resource_burstable": true
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.r7.2xlarge"
    },
    "networking": {
      "security_group_id": "sg-uf68iou5an8j7sxd****",
      "vswitch_id": "vsw-uf6nji7pzztuoe9i7****"
    }
  }
}
In the preceding code:
- resource_burstable: Specifies whether to enable auto scaling for the service. Set this parameter to true to enable auto scaling.
- cloud.networking: This parameter has no effect on services deployed in dedicated resource groups. However, if you enable the elastic resource pool for your service, configuring this parameter ensures continuous VPC direct connection during scaling.
- cloud.computing: Optional. Specifies the node types to use in the public resource group during scale-out. For detailed specifications, see Use public resources.
For comprehensive parameter documentation, refer to JSON deployment.
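If you generate the deployment description from a script rather than editing JSON by hand, a short Python sketch can help keep the file valid. The field names below follow the sample above; the model path, resource group ID, security group ID, and vSwitch ID are placeholders that you must replace with your own values.

```python
import json

# Sketch: build the deployment description programmatically. All IDs and
# the model path below are placeholder values from the sample above, not
# real resources -- substitute your own.
service = {
    "model_path": "http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/lr.pmml",
    "name": "test_burstable_service",
    "processor": "pmml",
    "metadata": {
        "instance": 1,
        "cpu": 1,
        "resource": "eas-r-xxx",       # your dedicated resource group ID
        "resource_burstable": True,    # enable the elastic resource pool
    },
    "cloud": {
        "computing": {"instance_type": "ecs.r7.2xlarge"},
        "networking": {
            "security_group_id": "sg-uf68iou5an8j7sxd****",
            "vswitch_id": "vsw-uf6nji7pzztuoe9i7****",
        },
    },
}

# Write the file that you pass to the EASCMD client.
with open("service.json", "w") as f:
    json.dump(service, f, indent=2)
```

Generating the file this way makes it easy to toggle resource_burstable or swap the instance_type from a deployment script while guaranteeing the output is well-formed JSON.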
- Deploy the service by using the EASCMD client. For detailed instructions, see Deploy a model by using the EASCMD client.
When your dedicated resource group lacks sufficient capacity for a scale-out, newly added service instances are automatically created in the public resource group.
Manage auto-scaling for deployed services
Use the console
- Log on to the PAI console. Select a region at the top of the page, then select the desired workspace and click Elastic Algorithm Service (EAS).
- Click Update in the Actions column of the service that you want to manage.
- In the Resource Information section of the Update Service page, enable or disable resource auto scaling.
  - Enable resource auto scaling: In the Resource Information section, turn on the Elastic Resource Pool switch and specify the public resource group type.
  - Disable resource auto scaling: In the Resource Information section, turn off the Elastic Resource Pool switch.
- Click Update to confirm your changes.
Use the client
Execute the following commands to activate or deactivate the elastic resource pool functionality for your deployed service. The examples below demonstrate usage on a Windows 64-bit server.
If the cloud.networking parameter was not configured during initial service deployment in a dedicated resource group, enabling the elastic resource pool feature afterward will result in unavailable VPC direct connections for newly added service instances in the public resource group.
# Enable the elastic resource pool feature for a deployed service.
eascmdwin64.exe modify <service_name> -Dmetadata.resource_burstable=true
# Disable the elastic resource pool feature for a deployed service.
eascmdwin64.exe modify <service_name> -Dmetadata.resource_burstable=false
Replace <service_name> with the name of your service.
The elastic resource pool functionality applies exclusively to newly created service instances. For instance, if a service undergoes scale-out with two existing pending instances before activating the elastic resource pool feature, these instances will not automatically transition to the public resource group upon feature activation. To migrate existing instances, you must restart them through the PAI console, which will reschedule them to the public resource group. Similarly, service instances already assigned to the public resource group will remain there even after disabling the elastic resource pool feature—they won't automatically revert to the dedicated resource group.
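This "new instances only" behavior can be sketched with another toy model. The following Python snippet is not an EAS API; it only illustrates that toggling resource_burstable changes where instances created after the change are placed, while existing instances keep their current placement until you restart them.

```python
# Illustrative only (not an EAS API): the resource_burstable flag affects
# only instances placed after it changes; it never moves existing ones.

def place_instance(burstable, dedicated_free):
    """Return the group a newly created instance lands in."""
    if dedicated_free > 0:
        return "dedicated"
    # Without the elastic resource pool, the instance waits for capacity.
    return "public" if burstable else "pending"

# Dedicated group full, feature off: a new instance stays pending.
print(place_instance(burstable=False, dedicated_free=0))  # pending

# After enabling the feature, only *new* placements go to public resources.
print(place_instance(burstable=True, dedicated_free=0))   # public

# An instance that was already pending is untouched by the toggle; per the
# text above, you must restart it in the PAI console to reschedule it.
existing = {"id": "i-1", "group": "pending"}
assert existing["group"] == "pending"  # unchanged by enabling the feature
```

The same asymmetry holds in the other direction: disabling the feature leaves instances already running in the public resource group where they are.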
References
- To enable automatic instance scaling based on metrics that you specify, see Horizontal auto scaling.
- To configure automatic scaling that maintains a specific instance count on a schedule, see Scheduled auto scaling.