This topic describes what you need to know when you deploy services in the shared resource group.
Background information
You can deploy services in the shared resource group without maintaining the
underlying resource pool. The underlying resources are billed on a pay-as-you-go
basis. You can configure the specifications of service instances for model
deployment in one of two ways: specify the number of CPU cores for each service
instance, or specify an Elastic Compute Service (ECS) instance type for each
service instance. The former method is the same as the one used for regular
services. This topic covers the following considerations for deploying services
in the shared resource group:
- Supported ECS instance types and their specifications. For more information, see Supported ECS instance types.
- The scenarios in which you need to connect Elastic Algorithm Service (EAS) to the network of your client, and the required field settings. For more information, see Network connection.
- How to collect the logs of services deployed in the shared resource group and deliver them to your Log Service Logstore. For more information, see Log delivery.
- How to troubleshoot an online service that unexpectedly exits due to code errors by using the core file. For more information, see Core file configuration.
Supported ECS instance types
The following table describes the ECS instance types that are supported for service
deployment.
Instance type | Specifications (Instance family) |
---|---|
ecs.c5.6xlarge | 24 cores and 48 GB of memory (c5, compute-optimized instance family) |
ecs.c6.2xlarge | 8 cores and 16 GB of memory (c6, compute-optimized instance family) |
ecs.c6.4xlarge | 16 cores and 32 GB of memory (c6, compute-optimized instance family) |
ecs.c6.6xlarge | 24 cores and 48 GB of memory (c6, compute-optimized instance family) |
ecs.c6.8xlarge | 32 cores and 64 GB of memory (c6, compute-optimized instance family) |
ecs.g5.6xlarge | 24 cores and 96 GB of memory (g5, general-purpose instance family) |
ecs.g6.2xlarge | 8 cores and 32 GB of memory (g6, general-purpose instance family) |
ecs.g6.4xlarge | 16 cores and 64 GB of memory (g6, general-purpose instance family) |
ecs.g6.6xlarge | 24 cores and 96 GB of memory (g6, general-purpose instance family) |
ecs.g6.8xlarge | 32 cores and 128 GB of memory (g6, general-purpose instance family) |
ecs.gn5-c28g1.7xlarge | 28 cores, 112 GB of memory, and 1 NVIDIA Tesla P100 GPU (gn5, GPU-accelerated compute-optimized instance family) |
ecs.gn5-c4g1.xlarge | 4 cores, 30 GB of memory, and 1 NVIDIA Tesla P100 GPU (gn5, GPU-accelerated compute-optimized instance family) |
ecs.gn5-c8g1.2xlarge | 8 cores, 60 GB of memory, and 1 NVIDIA Tesla P100 GPU (gn5, GPU-accelerated compute-optimized instance family) |
ecs.gn5-c8g1.4xlarge | 16 cores, 120 GB of memory, and 2 NVIDIA Tesla P100 GPUs (gn5, GPU-accelerated compute-optimized instance family) |
ecs.gn5i-c4g1.xlarge | 4 cores, 16 GB of memory, and 1 NVIDIA Tesla P4 GPU (gn5i, GPU-accelerated compute-optimized instance family) |
ecs.gn5i-c8g1.2xlarge | 8 cores, 32 GB of memory, and 1 NVIDIA Tesla P4 GPU (gn5i, GPU-accelerated compute-optimized instance family) |
ecs.gn6i-c16g1.4xlarge | 16 cores, 62 GB of memory, and 1 NVIDIA Tesla T4 GPU (gn6i, GPU-accelerated compute-optimized instance family) |
ecs.gn6i-c24g1.12xlarge | 48 cores, 186 GB of memory, and 2 NVIDIA Tesla T4 GPUs (gn6i, GPU-accelerated compute-optimized instance family) |
ecs.gn6i-c24g1.6xlarge | 24 cores, 93 GB of memory, and 1 NVIDIA Tesla T4 GPU (gn6i, GPU-accelerated compute-optimized instance family) |
ecs.gn6i-c4g1.xlarge | 4 cores, 15 GB of memory, and 1 NVIDIA Tesla T4 GPU (gn6i, GPU-accelerated compute-optimized instance family) |
ecs.gn6i-c8g1.2xlarge | 8 cores, 31 GB of memory, and 1 NVIDIA Tesla T4 GPU (gn6i, GPU-accelerated compute-optimized instance family) |
ecs.gn6v-c8g1.2xlarge | 8 cores, 32 GB of memory, and 1 NVIDIA Tesla V100 GPU (gn6v, GPU-accelerated compute-optimized instance family) |
ecs.r6.2xlarge | 8 cores and 64 GB of memory (r6, memory-optimized instance family) |
ecs.r6.4xlarge | 16 cores and 128 GB of memory (r6, memory-optimized instance family) |
ecs.r6.6xlarge | 24 cores and 192 GB of memory (r6, memory-optimized instance family) |
ecs.r6.8xlarge | 32 cores and 256 GB of memory (r6, memory-optimized instance family) |
ecs.g7.2xlarge | 8 cores and 32 GB of memory (g7, general-purpose instance family) |
ecs.g7.4xlarge | 16 cores and 64 GB of memory (g7, general-purpose instance family) |
ecs.g7.6xlarge | 24 cores and 96 GB of memory (g7, general-purpose instance family) |
ecs.g7.8xlarge | 32 cores and 128 GB of memory (g7, general-purpose instance family) |
ecs.c7.2xlarge | 8 cores and 16 GB of memory (c7, compute-optimized instance family) |
ecs.c7.4xlarge | 16 cores and 32 GB of memory (c7, compute-optimized instance family) |
ecs.c7.6xlarge | 24 cores and 48 GB of memory (c7, compute-optimized instance family) |
ecs.c7.8xlarge | 32 cores and 64 GB of memory (c7, compute-optimized instance family) |
ecs.r7.2xlarge | 8 cores and 64 GB of memory (r7, memory-optimized instance family) |
ecs.r7.4xlarge | 16 cores and 128 GB of memory (r7, memory-optimized instance family) |
ecs.r7.6xlarge | 24 cores and 192 GB of memory (r7, memory-optimized instance family) |
ecs.r7.8xlarge | 32 cores and 256 GB of memory (r7, memory-optimized instance family) |
ecs.gn7-c12g1.3xlarge | 12 cores, 95 GB of memory, and 1 NVIDIA Tesla A100 GPU (gn7, GPU-accelerated compute-optimized instance family) |
ecs.g7.16xlarge | 64 cores and 256 GB of memory (g7, general-purpose instance family) |
ecs.c7.16xlarge | 64 cores and 128 GB of memory (c7, compute-optimized instance family) |
ecs.r7.16xlarge | 64 cores and 512 GB of memory (r7, memory-optimized instance family) |
ecs.gn7i-c8g1.2xlarge | 8 cores, 30 GB of memory, and 1 NVIDIA Tesla A10 GPU (gn7i, GPU-accelerated compute-optimized instance family) |
ecs.gn7i-c16g1.4xlarge | 16 cores, 60 GB of memory, and 1 NVIDIA Tesla A10 GPU (gn7i, GPU-accelerated compute-optimized instance family) |
ecs.gn7i-c32g1.8xlarge | 32 cores, 188 GB of memory, and 1 NVIDIA Tesla A10 GPU (gn7i, GPU-accelerated compute-optimized instance family) |
ecs.gn6e-c12g1.3xlarge | 12 cores, 92 GB of memory, and 1 NVIDIA Tesla V100 GPU (gn6e, GPU-accelerated compute-optimized instance family) |
ecs.g6.xlarge | 4 cores and 16 GB of memory (g6, general-purpose instance family) |
ecs.c6.xlarge | 4 cores and 8 GB of memory (c6, compute-optimized instance family) |
ecs.r6.xlarge | 4 cores and 32 GB of memory (r6, memory-optimized instance family) |
ecs.g6.large | 2 cores and 8 GB of memory (g6, general-purpose instance family) |
ecs.c6.large | 2 cores and 4 GB of memory (c6, compute-optimized instance family) |
ecs.r6.large | 2 cores and 16 GB of memory (r6, memory-optimized instance family) |
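As a rough illustration, the table above can be encoded as a lookup that picks the smallest CPU-only instance type satisfying a vCPU and memory requirement. This is a hypothetical helper, not part of EAS: the `pick_instance_type` function and its subset of the table data are assumptions for the sketch.

```python
# Hypothetical helper: choose the smallest instance type from the table above
# that satisfies a vCPU/memory requirement. Only a subset of the supported
# CPU-only types is listed here; extend the mapping as needed.
INSTANCE_TYPES = {
    "ecs.c6.large":   (2, 4),    # (cores, memory in GB)
    "ecs.g6.large":   (2, 8),
    "ecs.r6.large":   (2, 16),
    "ecs.c6.xlarge":  (4, 8),
    "ecs.g6.xlarge":  (4, 16),
    "ecs.r6.xlarge":  (4, 32),
    "ecs.g6.2xlarge": (8, 32),
}

def pick_instance_type(cores, memory_gb):
    """Return the smallest type that fits the requirement, or None."""
    candidates = [
        (c, m, name) for name, (c, m) in INSTANCE_TYPES.items()
        if c >= cores and m >= memory_gb
    ]
    if not candidates:
        return None
    return min(candidates)[2]

print(pick_instance_type(4, 12))  # → ecs.g6.xlarge
```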
To specify an ECS instance type for deploying a model as a service, add the
cloud.computing.instance_type field to the service configuration file. The following code provides an example:
{
  "name": "tf_serving_test",
  "model_path": "http://eas-data.oss-cn-shanghai.aliyuncs.com/models/model.tar.gz",
  "processor": "tensorflow_gpu_1.12",
  "cloud": {
    "computing": {
      "instance_type": "ecs.gn6i-c24g1.6xlarge"
    }
  },
  "metadata": {
    "instance": 1,
    "cuda": "9.0",
    "memory": 7000,
    "gpu": 1,
    "cpu": 4
  }
}
Note You can also run the eascmd create service.json command to deploy a model as a service. For more information about how to use the
EASCMD client, see Run commands to use the EASCMD client.
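If you generate the configuration file programmatically, a minimal sketch is to build the service description above as a dictionary and write it to service.json before running the eascmd create command. The field values are the placeholders from the example above.

```python
import json

# Sketch: build the service description from the example above and write it
# to service.json for use with `eascmd create service.json`.
service = {
    "name": "tf_serving_test",
    "model_path": "http://eas-data.oss-cn-shanghai.aliyuncs.com/models/model.tar.gz",
    "processor": "tensorflow_gpu_1.12",
    "cloud": {"computing": {"instance_type": "ecs.gn6i-c24g1.6xlarge"}},
    "metadata": {"instance": 1, "cuda": "9.0", "memory": 7000, "gpu": 1, "cpu": 4},
}

with open("service.json", "w") as f:
    json.dump(service, f, indent=2)
```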
Network connection
In the following two scenarios, you need to connect EAS to the network of your client:
- Access to EAS
You want to access EAS from your virtual private cloud (VPC) in direct connection mode.
- Access from EAS
You want to use EAS to access a service in your VPC, such as Redis.
You can configure the network connection by using an elastic network interface
(ENI) together with the vSwitch ID and security group ID of your client. When a
service instance in EAS starts, an ENI is created in the specified vSwitch and
associated with the service instance. The ENI then connects the service
instance to the network of your client.
Note After you configure the network connection by specifying the ID of a vSwitch,
access from or to the VPC to which the vSwitch belongs is allowed by default. If
the CIDR block of the VPC is 10.0.0.0/8, which overlaps the CIDR block of EAS,
service instances fail to be connected. To prevent this issue, you must
explicitly specify a subnet of the VPC CIDR block to or from which access is allowed.
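The overlap check described in this note can be reproduced with Python's standard ipaddress module. In this sketch, the EAS-side CIDR block is an assumption chosen only to demonstrate the 10.0.0.0/8 overlap case; substitute your actual address ranges.

```python
import ipaddress

def overlaps(cidr_a, cidr_b):
    """True if two CIDR blocks share any addresses."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))

# A VPC using the broad 10.0.0.0/8 block clashes with the EAS-side address
# space, so destination_cidrs must be narrowed to the vSwitch subnet, e.g.
# 10.1.1.0/24. The EAS CIDR shown here is a hypothetical value.
vpc_cidr = "10.0.0.0/8"
eas_cidr = "10.224.0.0/16"   # assumption for illustration
vswitch_subnet = "10.1.1.0/24"

print(overlaps(vpc_cidr, eas_cidr))        # True: narrow destination_cidrs
print(overlaps(vswitch_subnet, eas_cidr))  # False: the subnet is safe
```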
To configure the network connection, you can set the fields with the cloud.networking prefix.
- The following table describes the fields with the cloud.networking prefix.
Field | Description |
---|---|
cloud.networking.security_group_id | The ID of the security group to which the ECS instance of your client belongs. |
cloud.networking.vswitch_id | The ID of the vSwitch to which the ECS instance of your client belongs. An ENI is created in this vSwitch. Make sure that the vSwitch has enough available IP addresses. Otherwise, service instances cannot be created for service deployment in EAS. Note: By default, access from or to the VPC to which the specified vSwitch belongs is allowed. If the CIDR block of the VPC is 10.0.0.0/8, which overlaps the CIDR block of EAS, service instances fail to be connected. To prevent this issue, use the destination_cidrs field to explicitly specify a subnet of the VPC CIDR block to or from which access is allowed. For example, to allow access only from or to the CIDR block (10.1.1.0/24) of the specified vSwitch, set the destination_cidrs field to 10.1.1.0/24. |
cloud.networking.destination_cidrs | The subnet of the VPC CIDR block to or from which access is allowed. The value must be explicitly specified. |
cloud.networking.default_route | The default network egress. Valid values: eth0 and eth1. Default value: eth0. After you connect EAS to the VPC of your client, a service instance has two network interface controllers (NICs): eth0 is the primary NIC in the VPC of EAS, and eth1 is the secondary NIC in the VPC of your client. By default, traffic flows from eth0. However, the VPC of EAS does not allow access over the Internet. To enable Internet access for EAS, you can configure a NAT gateway in the VPC of your client and set the cloud.networking.default_route field to eth1. Traffic to the Internet then flows through eth1 to the VPC of your client, and from there to the Internet through the NAT gateway. |
- The following code provides an example on how to set the fields:
{
  "model_path": "http://eas-data.oss-cn-shanghai.aliyuncs.com/models/lr_xingke4.pmml",
  "name": "test_pmml",
  "processor": "pmml",
  "metadata": {
    "instance": 1,
    "cpu": 3,
    "memory": 2000
  },
  "cloud": {
    "networking.security_group_id": "sg-2vce4xxvy5hn1hmjx7yh",
    "networking.vswitch_id": "vsw-2vcbbihcy3cg8fjdpdvdl",
    "networking.destination_cidrs": "10.2.0.0/8"
  }
}
Log delivery
You can deliver the logs of the services deployed in the shared resource group to your Log Service Logstore.
Core file configuration
Your online service may unexpectedly exit due to code errors. To troubleshoot
the issue, you can analyze the core file that is generated when the service
exits. Services deployed in the shared resource group run in serverless mode.
To obtain the core file, you must mount external storage such as Apsara File
Storage NAS (NAS) and set the core_pattern field to a path in the mount
directory where you want the core file to be stored. Add the runtime.core_pattern field to the service configuration file when you deploy the service. The following
code provides an example:
{
  "model_path": "http://eas-data.oss-cn-shanghai.aliyuncs.com/models/lr_xingke4.pmml",
  "name": "test_pmml",
  "processor": "pmml",
  "runtime": {
    "core_pattern": "/corefiles/%h-%t-%e-%p.core"
  },
  "metadata": {
    "instance": 1,
    "cpu": 3
  }
}
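With the core_pattern shown above, core files are named hostname-timestamp-executable-pid.core, because %h, %t, %e, and %p are the standard Linux core_pattern specifiers for hostname, time, executable name, and process ID. A small sketch for parsing such a name when triaging a crash (the `parse_core_name` helper is hypothetical, and it assumes the hostname contains no purely numeric hyphenated segment, since %h and %e may themselves contain hyphens):

```python
import re

# Core files written with "/corefiles/%h-%t-%e-%p.core" are named
# hostname-timestamp-executable-pid.core. Non-greedy host/exe groups keep the
# numeric timestamp and PID segments intact during matching.
CORE_NAME = re.compile(
    r"^(?P<host>.+?)-(?P<ts>\d+)-(?P<exe>.+?)-(?P<pid>\d+)\.core$"
)

def parse_core_name(filename):
    """Split a core file name into host, timestamp, executable, and PID."""
    m = CORE_NAME.match(filename)
    if not m:
        return None
    info = m.groupdict()
    info["ts"] = int(info["ts"])
    info["pid"] = int(info["pid"])
    return info

print(parse_core_name("myhost-1620000000-server-1234.core"))
# → {'host': 'myhost', 'ts': 1620000000, 'exe': 'server', 'pid': 1234}
```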