After you activate Elastic Algorithm Service (EAS) in Platform for AI (PAI), the system automatically creates a public resource group. You can use the public resource group to deploy model services. This topic provides an overview of the public resource group.
Scenarios
If you run a small number of tasks and do not require quick responses, we recommend that you use the public resource group.
Billing
How to start billing
The public resource group allows you to deploy model services by specifying the node resources or node type. The billing starts after a service is deployed and enters the Running state. For more information, see Billing of EAS.
We recommend that you stop model services that you no longer use to prevent unnecessary costs.
Each node in the public resource group provides 30 GB of free system disk capacity. If you use the EASCMD client to deploy a service, you can specify a larger system disk capacity. For more information, see Parameters of model services. The capacity that exceeds the free quota is billed on a pay-as-you-go basis, and billing starts immediately after the system disk is created. For more information, see Billing of EAS.
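Only the portion of the system disk above the free quota is billed. A minimal sketch of that rule, assuming capacity is specified in whole GB (the helper name is illustrative, not part of any EAS SDK):

```python
def billable_disk_gb(requested_gb: int, free_quota_gb: int = 30) -> int:
    """Capacity billed on a pay-as-you-go basis: only the part of the
    system disk that exceeds the 30 GB free quota per node is charged."""
    return max(0, requested_gb - free_quota_gb)

print(billable_disk_gb(100))  # 70 GB of a 100 GB disk are billed
```

For example, a 100 GB system disk incurs charges for 70 GB, while a disk at or below 30 GB incurs none.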
How to stop billing
On the Inference Service tab of the Elastic Algorithm Service (EAS) page, find the model service that you want to stop and click Stop in the Actions column. The model service is stopped, and the system stops billing the resources used by the service. For more information, see Model service deployment by using the PAI console and Machine Learning Designer.
If you allocate system disk capacity to a service, billing for that capacity stops only after the service is deleted; stopping the service is not sufficient.
Before you stop a model service, make sure that it is no longer needed, to prevent unexpected business interruptions.
Work with the public resource group
The public resource group is ready for use after you activate EAS.
You can enable Virtual Private Cloud (VPC) direct connection for the public resource group to reduce network latency for your clients and allow your EAS services to access other cloud services deployed in the same VPC. For more information, see Configure network connectivity.
You can configure log collection for the public resource group to send the logs generated by EAS services in the public resource group to Simple Log Service. This way, you can monitor EAS services in real time. For more information, see Configure log collection for a resource group.
Use one of the following methods to deploy services to the public resource group:
Use the console
Deploy a model service on the Elastic Algorithm Service (EAS) page and set Resource Group Type to Public Resource Group. For more information, see Model service deployment by using the PAI console and Machine Learning Designer.
Use EASCMD
Use the EASCMD client to deploy a model service. For more information, see Deploy model services by using EASCMD or DSW.
Deploy the model service by specifying the node resources or node type.
Deploy the model service by specifying the node resources:
{
  "metadata": {
    "instance": 2,
    "cpu": 1,
    "memory": 2000
  },
  "cloud": {
    "computing": {}
  },
  "name": "test",
  "model_path": "http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/model.tar.gz",
  "processor": "tensorflow_cpu_1.12"
}
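Instead of writing the service description by hand, you can generate it programmatically. A sketch in Python: the field names mirror the JSON example above, and the memory value follows that example (the sample value 2000 suggests MB, which is an assumption; the helper name is illustrative):

```python
import json

def node_resource_config(name: str, model_path: str, processor: str,
                         instances: int, cpu: int, memory: int) -> str:
    """Build an EAS service description that requests node resources
    directly, mirroring the JSON example above."""
    config = {
        "metadata": {"instance": instances, "cpu": cpu, "memory": memory},
        "cloud": {"computing": {}},  # empty: no specific ECS type pinned
        "name": name,
        "model_path": model_path,
        "processor": processor,
    }
    return json.dumps(config, indent=2)

# Write the result to a file such as service.json, then deploy it
# with the EASCMD client.
print(node_resource_config(
    "test",
    "http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/model.tar.gz",
    "tensorflow_cpu_1.12",
    instances=2, cpu=1, memory=2000))
```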
To deploy the model service by specifying the node type, you must add the cloud.computing.instance_type parameter to the service configuration file to specify an Elastic Compute Service (ECS) instance type:
{
  "name": "tf_serving_test",
  "model_path": "http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/model.tar.gz",
  "processor": "tensorflow_gpu_1.12",
  "cloud": {
    "computing": {
      "instance_type": "ecs.gn6i-c24g1.6xlarge"
    }
  },
  "metadata": {
    "instance": 1,
    "cuda": "9.0",
    "memory": 7000,
    "gpu": 1,
    "cpu": 4
  }
}
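This configuration differs from the node-resources variant only in the cloud.computing.instance_type field. A sketch that injects the field into an existing configuration dict without modifying the original (the helper name is illustrative, not part of EASCMD):

```python
import json

def pin_instance_type(config: dict, instance_type: str) -> dict:
    """Return a copy of an EAS service config with
    cloud.computing.instance_type set to a specific ECS type."""
    pinned = json.loads(json.dumps(config))  # cheap deep copy of JSON data
    pinned.setdefault("cloud", {}).setdefault("computing", {})[
        "instance_type"] = instance_type
    return pinned

base = {"name": "tf_serving_test", "metadata": {"instance": 1}}
print(json.dumps(pin_instance_type(base, "ecs.gn6i-c24g1.6xlarge"), indent=2))
```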
The following table describes the ECS instance types supported for the instance_type parameter.
| Instance type | Specification |
| --- | --- |
| ecs.c5.6xlarge | c5 (24vCPU+48GB) |
| ecs.c6.2xlarge | c6 (8vCPU+16GB) |
| ecs.c6.4xlarge | c6 (16vCPU+32GB) |
| ecs.c6.6xlarge | c6 (24vCPU+48GB) |
| ecs.c6.8xlarge | c6 (32vCPU+64GB) |
| ecs.g5.6xlarge | g5 (24vCPU+96GB) |
| ecs.g6.2xlarge | g6 (8vCPU+32GB) |
| ecs.g6.4xlarge | g6 (16vCPU+64GB) |
| ecs.g6.6xlarge | g6 (24vCPU+96GB) |
| ecs.g6.8xlarge | g6 (32vCPU+128GB) |
| ecs.gn5-c28g1.7xlarge | 28vCPU+112GB+1*P100 |
| ecs.gn5-c4g1.xlarge | 4vCPU+30GB+1*P100 |
| ecs.gn5-c8g1.2xlarge | 8vCPU+60GB+1*P100 |
| ecs.gn5-c8g1.4xlarge | 16vCPU+120GB+2*P100 |
| ecs.gn5i-c4g1.xlarge | 4vCPU+16GB+1*P4 |
| ecs.gn5i-c8g1.2xlarge | 8vCPU+32GB+1*P4 |
| ecs.gn6i-c16g1.4xlarge | 16vCPU+62GB+1*T4 |
| ecs.gn6i-c24g1.12xlarge | 48vCPU+186GB+2*T4 |
| ecs.gn6i-c24g1.6xlarge | 24vCPU+93GB+1*T4 |
| ecs.gn6i-c4g1.xlarge | 4vCPU+15GB+1*T4 |
| ecs.gn6i-c8g1.2xlarge | 8vCPU+31GB+1*T4 |
| ecs.gn6v-c8g1.2xlarge | 8vCPU+32GB+1*V100 |
| ecs.r6.2xlarge | r6 (8vCPU+64GB) |
| ecs.r6.4xlarge | r6 (16vCPU+128GB) |
| ecs.r6.6xlarge | r6 (24vCPU+192GB) |
| ecs.r6.8xlarge | r6 (32vCPU+256GB) |
| ecs.g7.2xlarge | g7 (8vCPU+32GB) |
| ecs.g7.4xlarge | g7 (16vCPU+64GB) |
| ecs.g7.6xlarge | g7 (24vCPU+96GB) |
| ecs.g7.8xlarge | g7 (32vCPU+128GB) |
| ecs.c7.2xlarge | c7 (8vCPU+16GB) |
| ecs.c7.4xlarge | c7 (16vCPU+32GB) |
| ecs.c7.6xlarge | c7 (24vCPU+48GB) |
| ecs.c7.8xlarge | c7 (32vCPU+64GB) |
| ecs.r7.2xlarge | r7 (8vCPU+64GB) |
| ecs.r7.4xlarge | r7 (16vCPU+128GB) |
| ecs.r7.6xlarge | r7 (24vCPU+192GB) |
| ecs.r7.8xlarge | r7 (32vCPU+256GB) |
| ecs.g7.16xlarge | g7 (64vCPU+256GB) |
| ecs.c7.16xlarge | c7 (64vCPU+128GB) |
| ecs.r7.16xlarge | r7 (64vCPU+512GB) |
| ecs.gn7i-c8g1.2xlarge | 8vCPU+30GB+1*A10 |
| ecs.gn7i-c16g1.4xlarge | 16vCPU+60GB+1*A10 |
| ecs.gn7i-c32g1.8xlarge | 32vCPU+188GB+1*A10 |
| ecs.gn6e-c12g1.3xlarge | 12vCPU+92GB+1*V100 |
| ecs.g6.xlarge | g6 (4vCPU+16GB) |
| ecs.c6.xlarge | c6 (4vCPU+8GB) |
| ecs.r6.xlarge | r6 (4vCPU+32GB) |
| ecs.g6.large | g6 (2vCPU+8GB) |
| ecs.c6.large | c6 (2vCPU+4GB) |
| ecs.r6.large | r6 (2vCPU+16GB) |
| ecs.c7a.large | AMD (2vCPU+4GB) |
| ecs.c7a.xlarge | AMD (4vCPU+8GB) |
| ecs.c7a.2xlarge | AMD (8vCPU+16GB) |
| ecs.c7a.4xlarge | AMD (16vCPU+32GB) |
| ecs.c7a.8xlarge | AMD (32vCPU+64GB) |
| ecs.c7a.16xlarge | AMD (64vCPU+128GB) |
| ecs.g7a.large | AMD (2vCPU+8GB) |
| ecs.g7a.xlarge | AMD (4vCPU+16GB) |
| ecs.g7a.2xlarge | AMD (8vCPU+32GB) |
| ecs.g7a.4xlarge | AMD (16vCPU+64GB) |
| ecs.g7a.8xlarge | AMD (32vCPU+128GB) |
| ecs.g7a.16xlarge | AMD (64vCPU+256GB) |
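Because a typo in instance_type surfaces only at deployment time, it can help to validate the value against the table above before writing the service description. A sketch with a small subset of the supported types (extend the set with the rows you need; the helper name is illustrative):

```python
# Subset of the supported ECS instance types from the table above.
SUPPORTED_INSTANCE_TYPES = {
    "ecs.c6.2xlarge", "ecs.g6.2xlarge", "ecs.r6.2xlarge",
    "ecs.gn6i-c24g1.6xlarge", "ecs.gn7i-c8g1.2xlarge",
}

def check_instance_type(instance_type: str) -> None:
    """Raise early if the ECS type is not in the supported set."""
    if instance_type not in SUPPORTED_INSTANCE_TYPES:
        raise ValueError(f"unsupported instance_type: {instance_type}")

check_instance_type("ecs.gn6i-c24g1.6xlarge")  # passes silently
```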
References
The resources in the public resource group are shared among multiple services, so stable resource allocation cannot be guaranteed during peak hours. To obtain dedicated resources, you can create a dedicated resource group and deploy services to it. For more information, see Work with dedicated resource groups.
You can enable VPC direct connection for services deployed in the public resource group. For more information, see Configure network connectivity.