After you activate Elastic Algorithm Service (EAS) in Platform for AI (PAI), the system automatically creates a public resource group. You can use the public resource group to deploy model services. This topic provides an overview of the public resource group.
Scenarios
If you run a small number of tasks and do not require quick responses, we recommend that you use the public resource group.
Billing
How to start billing
The public resource group allows you to deploy model services by specifying the node resources or node type. The billing starts after a service is deployed and enters the Running state. For more information, see Billing of EAS.
We recommend that you stop model services that you no longer use to prevent unnecessary costs.
Each node in the public resource group provides 30 GB of free system disk capacity. If you use the EASCMD client to deploy a service, you can specify a larger system disk capacity. For more information, see Parameters of model services. The capacity that exceeds the free quota is billed on a pay-as-you-go basis, and billing starts immediately after the system disk is created. For more information, see Billing of EAS.
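Only the portion of the system disk above the free quota is billed. A minimal sketch of that rule, assuming capacity is specified in whole GB (the helper name is illustrative, not part of any EAS SDK):

```python
def billable_disk_gb(requested_gb: int, free_quota_gb: int = 30) -> int:
    """Capacity billed on a pay-as-you-go basis: only the part of the
    system disk that exceeds the 30 GB free quota per node is charged."""
    return max(0, requested_gb - free_quota_gb)

print(billable_disk_gb(100))  # 70 GB of a 100 GB disk are billed
```

For example, a 100 GB system disk incurs charges for 70 GB, while a disk at or below 30 GB incurs none.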
How to stop billing
On the Inference Service tab of the Elastic Algorithm Service (EAS) page, find the model service that you want to stop and click Stop in the Actions column. The model service is stopped, and the system stops billing the resources used by the service. For more information, see Model service deployment by using the PAI console and Machine Learning Designer.
If you allocate system disk capacity to a service, billing for that capacity stops only after the service is deleted; stopping the service is not sufficient.
Before you stop a model service, make sure that it is no longer needed, to prevent unexpected business interruptions.
Work with the public resource group
The public resource group is ready for use after you activate EAS.
You can enable Virtual Private Cloud (VPC) direct connection for the public resource group to reduce network latency for your clients and allow your EAS services to access other cloud services deployed in the same VPC. For more information, see Configure network connectivity.
You can configure log collection for the public resource group to send the logs generated by EAS services in the public resource group to Simple Log Service. This way, you can monitor EAS services in real time. For more information, see Configure log collection for a resource group.
Use one of the following methods to deploy services to the public resource group:
Use the console
Deploy a model service on the Elastic Algorithm Service (EAS) page and set Resource Group Type to Public Resource Group. For more information, see Model service deployment by using the PAI console and Machine Learning Designer.
Use EASCMD
Use the EASCMD client to deploy a model service. For more information, see Deploy model services by using EASCMD or DSW.
Deploy the model service by specifying the node resources or node type.
Deploy the model service by specifying the node resources:
{
  "metadata": {
    "instance": 2,
    "cpu": 1,
    "memory": 2000
  },
  "cloud": {
    "computing": {}
  },
  "name": "test",
  "model_path": "http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/model.tar.gz",
  "processor": "tensorflow_cpu_1.12"
}
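Instead of writing the service description by hand, you can generate it programmatically. A sketch in Python: the field names mirror the JSON example above, and the memory value follows that example (the sample value 2000 suggests MB, which is an assumption; the helper name is illustrative):

```python
import json

def node_resource_config(name: str, model_path: str, processor: str,
                         instances: int, cpu: int, memory: int) -> str:
    """Build an EAS service description that requests node resources
    directly, mirroring the JSON example above."""
    config = {
        "metadata": {"instance": instances, "cpu": cpu, "memory": memory},
        "cloud": {"computing": {}},  # empty: no specific ECS type pinned
        "name": name,
        "model_path": model_path,
        "processor": processor,
    }
    return json.dumps(config, indent=2)

# Write the result to a file such as service.json, then deploy it
# with the EASCMD client.
print(node_resource_config(
    "test",
    "http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/model.tar.gz",
    "tensorflow_cpu_1.12",
    instances=2, cpu=1, memory=2000))
```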
To deploy the model service by specifying the node type, you must add the cloud.computing.instance_type parameter to the service configuration file to specify an Elastic Compute Service (ECS) instance type:
{
  "name": "tf_serving_test",
  "model_path": "http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/model.tar.gz",
  "processor": "tensorflow_gpu_1.12",
  "cloud": {
    "computing": {
      "instance_type": "ecs.gn6i-c24g1.6xlarge"
    }
  },
  "metadata": {
    "instance": 1,
    "cuda": "9.0",
    "memory": 7000,
    "gpu": 1,
    "cpu": 4
  }
}
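This configuration differs from the node-resources variant only in the cloud.computing.instance_type field. A sketch that injects the field into an existing configuration dict without modifying the original (the helper name is illustrative, not part of EASCMD):

```python
import json

def pin_instance_type(config: dict, instance_type: str) -> dict:
    """Return a copy of an EAS service config with
    cloud.computing.instance_type set to a specific ECS type."""
    pinned = json.loads(json.dumps(config))  # cheap deep copy of JSON data
    pinned.setdefault("cloud", {}).setdefault("computing", {})[
        "instance_type"] = instance_type
    return pinned

base = {"name": "tf_serving_test", "metadata": {"instance": 1}}
print(json.dumps(pin_instance_type(base, "ecs.gn6i-c24g1.6xlarge"), indent=2))
```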
The following table describes the ECS instance types supported for the instance_type parameter.
| Instance type | Specification |
| --- | --- |
| ecs.c5.6xlarge | c5 (24vCPU+48GB) |
| ecs.c6.2xlarge | c6 (8vCPU+16GB) |
| ecs.c6.4xlarge | c6 (16vCPU+32GB) |
| ecs.c6.6xlarge | c6 (24vCPU+48GB) |
| ecs.c6.8xlarge | c6 (32vCPU+64GB) |
| ecs.g5.6xlarge | g5 (24vCPU+96GB) |
| ecs.g6.2xlarge | g6 (8vCPU+32GB) |
| ecs.g6.4xlarge | g6 (16vCPU+64GB) |
| ecs.g6.6xlarge | g6 (24vCPU+96GB) |
| ecs.g6.8xlarge | g6 (32vCPU+128GB) |
| ecs.gn5-c28g1.7xlarge | 28vCPU+112GB+1*P100 |
| ecs.gn5-c4g1.xlarge | 4vCPU+30GB+1*P100 |
| ecs.gn5-c8g1.2xlarge | 8vCPU+60GB+1*P100 |
| ecs.gn5-c8g1.4xlarge | 16vCPU+120GB+2*P100 |
| ecs.gn5i-c4g1.xlarge | 4vCPU+16GB+1*P4 |
| ecs.gn5i-c8g1.2xlarge | 8vCPU+32GB+1*P4 |
| ecs.gn6i-c16g1.4xlarge | 16vCPU+62GB+1*T4 |
| ecs.gn6i-c24g1.12xlarge | 48vCPU+186GB+2*T4 |
| ecs.gn6i-c24g1.6xlarge | 24vCPU+93GB+1*T4 |
| ecs.gn6i-c4g1.xlarge | 4vCPU+15GB+1*T4 |
| ecs.gn6i-c8g1.2xlarge | 8vCPU+31GB+1*T4 |
| ecs.gn6v-c8g1.2xlarge | 8vCPU+32GB+1*V100 |
| ecs.r6.2xlarge | r6 (8vCPU+64GB) |
| ecs.r6.4xlarge | r6 (16vCPU+128GB) |
| ecs.r6.6xlarge | r6 (24vCPU+192GB) |
| ecs.r6.8xlarge | r6 (32vCPU+256GB) |
| ecs.g7.2xlarge | g7 (8vCPU+32GB) |
| ecs.g7.4xlarge | g7 (16vCPU+64GB) |
| ecs.g7.6xlarge | g7 (24vCPU+96GB) |
| ecs.g7.8xlarge | g7 (32vCPU+128GB) |
| ecs.c7.2xlarge | c7 (8vCPU+16GB) |
| ecs.c7.4xlarge | c7 (16vCPU+32GB) |
| ecs.c7.6xlarge | c7 (24vCPU+48GB) |
| ecs.c7.8xlarge | c7 (32vCPU+64GB) |
| ecs.r7.2xlarge | r7 (8vCPU+64GB) |
| ecs.r7.4xlarge | r7 (16vCPU+128GB) |
| ecs.r7.6xlarge | r7 (24vCPU+192GB) |
| ecs.r7.8xlarge | r7 (32vCPU+256GB) |
| ecs.g7.16xlarge | g7 (64vCPU+256GB) |
| ecs.c7.16xlarge | c7 (64vCPU+128GB) |
| ecs.r7.16xlarge | r7 (64vCPU+512GB) |
| ecs.gn7i-c8g1.2xlarge | 8vCPU+30GB+1*A10 |
| ecs.gn7i-c16g1.4xlarge | 16vCPU+60GB+1*A10 |
| ecs.gn7i-c32g1.8xlarge | 32vCPU+188GB+1*A10 |
| ecs.gn6e-c12g1.3xlarge | 12vCPU+92GB+1*V100 |
| ecs.g6.xlarge | g6 (4vCPU+16GB) |
| ecs.c6.xlarge | c6 (4vCPU+8GB) |
| ecs.r6.xlarge | r6 (4vCPU+32GB) |
| ecs.g6.large | g6 (2vCPU+8GB) |
| ecs.c6.large | c6 (2vCPU+4GB) |
| ecs.r6.large | r6 (2vCPU+16GB) |
| ecs.c7a.large | AMD (2vCPU+4GB) |
| ecs.c7a.xlarge | AMD (4vCPU+8GB) |
| ecs.c7a.2xlarge | AMD (8vCPU+16GB) |
| ecs.c7a.4xlarge | AMD (16vCPU+32GB) |
| ecs.c7a.8xlarge | AMD (32vCPU+64GB) |
| ecs.c7a.16xlarge | AMD (64vCPU+128GB) |
| ecs.g7a.large | AMD (2vCPU+8GB) |
| ecs.g7a.xlarge | AMD (4vCPU+16GB) |
| ecs.g7a.2xlarge | AMD (8vCPU+32GB) |
| ecs.g7a.4xlarge | AMD (16vCPU+64GB) |
| ecs.g7a.8xlarge | AMD (32vCPU+128GB) |
| ecs.g7a.16xlarge | AMD (64vCPU+256GB) |
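Because a typo in instance_type surfaces only at deployment time, it can help to validate the value against the table above before writing the service description. A sketch with a small subset of the supported types (extend the set with the rows you need; the helper name is illustrative):

```python
# Subset of the supported ECS instance types from the table above.
SUPPORTED_INSTANCE_TYPES = {
    "ecs.c6.2xlarge", "ecs.g6.2xlarge", "ecs.r6.2xlarge",
    "ecs.gn6i-c24g1.6xlarge", "ecs.gn7i-c8g1.2xlarge",
}

def check_instance_type(instance_type: str) -> None:
    """Raise early if the ECS type is not in the supported set."""
    if instance_type not in SUPPORTED_INSTANCE_TYPES:
        raise ValueError(f"unsupported instance_type: {instance_type}")

check_instance_type("ecs.gn6i-c24g1.6xlarge")  # passes silently
```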
References
The resources in the public resource group are shared among multiple services, so stable resource allocation cannot be guaranteed during peak hours. To obtain dedicated resources, you can create a dedicated resource group and deploy services to it. For more information, see Work with dedicated resource groups.
You can enable VPC direct connection for services deployed in the public resource group. For more information, see Configure network connectivity.