If your business involves multiple services, you can create service groups. A service group has a centralized data ingress. The system forwards traffic to each service in the service group based on traffic distribution policies. This topic describes how to create a service group, view the data ingress, and modify traffic distribution policies.
Scenarios
You can use service groups in the following scenarios:
Canary releases
Add Service A and Service B to a service group. Service A is used in the production environment, and Service B is used for canary releases. Service B has fewer deployed instances than Service A, and the traffic distribution between the services varies based on the number of configured instances for each service.
To publish a new service version, update Service B and check its status. If Service B does not run as expected, you can roll back Service B. You can also stop Service B and switch traffic to Service A. If Service B runs as expected, you can update Service A. After you update Service A, reduce the number of instances for Service B to zero. You can also configure Service B to continue receiving a small amount of traffic.
Auto scaling of pay-as-you-go and subscription resource groups
In a service group, you can deploy one service to a subscription dedicated resource group and specify a fixed number of instances for the service to meet business requirements. You can deploy another service to a pay-as-you-go public resource group and configure auto scaling for that service to handle traffic spikes. Combining public resources with dedicated resources in this way reduces costs.
Use of heterogeneous hardware resources
In GPU acceleration scenarios, a service commonly uses only one GPU or CPU type. If the GPU or CPU type that is used by your service is discontinued or unavailable in the region where your service is deployed, the system cannot scale out the service. To work around this, you can create a service group and dynamically add services that use different CPU or GPU types to the service group. Different CPU and GPU types have specific requirements for the Compute Unified Device Architecture (CUDA) environment, so each service can be configured for the hardware type that it uses. This allows you to create multiple services that use heterogeneous hardware resources to meet business requirements. All services in the service group share the same data ingress, so callers of the ingress do not need to know how many services the service group contains.
Create a service group
When you create a service, you can specify the service group to which the service belongs.
If the specified service group does not exist, the system automatically creates the service group. If the specified service group exists, the system adds the new service to the service group. After all services in a service group are deleted, the service group is automatically deleted.
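The lifecycle rules above can be illustrated with a small in-memory model. This is only a sketch of the described behavior, not EAS code: the GroupRegistry class and its method names are hypothetical and exist only for illustration.

```python
class GroupRegistry:
    """Toy in-memory model of the service group lifecycle rules (hypothetical)."""

    def __init__(self):
        # Maps a group name to the set of service names in that group.
        self.groups = {}

    def create_service(self, service, group):
        # A missing group is created on first use; an existing group gains a member.
        self.groups.setdefault(group, set()).add(service)

    def delete_service(self, service, group):
        members = self.groups.get(group, set())
        members.discard(service)
        # A group with no remaining services is removed automatically.
        if group in self.groups and not members:
            del self.groups[group]


registry = GroupRegistry()
registry.create_service("pmml_prod", "pmml")  # creates the pmml group
registry.create_service("pmml_grey", "pmml")  # joins the existing group
registry.delete_service("pmml_prod", "pmml")
registry.delete_service("pmml_grey", "pmml")  # last member removed, group removed
print(registry.groups)
```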
The following example shows how to create a service group named pmml and add the pmml_prod and pmml_grey services to the service group.
Create a service group in the PAI console
1. Log on to the PAI console. Select a region at the top of the page. Then, select the desired workspace and click Enter Elastic Algorithm Service (EAS).
2. On the Elastic Algorithm Service (EAS) page, click the Canary Release tab. On the tab that appears, click Create Group and Service.
3. On the Custom Deployment page, configure the parameters and click Deploy.
   Parameters:
   Service Name: Follow the on-screen instructions to specify a valid service name. Example: pmml_prod.
   Group: the service group to which the service belongs. In this example, New Group is used, and the new group name is set to pmml.
   For information about other parameters, see Deploy a model service in the PAI console.
4. Repeat Steps 2 and 3 to create a service named pmml_grey that belongs to the pmml service group.
5. After you create the services, click pmml to go to the group details page and view the services that belong to the group.
Create a service group by using a client
Prepare a service configuration file named service.json.
The following sample code provides an example of the configuration file content of the pmml_prod service:
{
  "name": "pmml_prod",
  "model_path": "http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/lr_xxxx.pmml",
  "processor": "pmml",
  "metadata": {
    "cpu": 1,
    "instance": 4,
    "group": "pmml",
    "traffic_state": "grouping"
  }
}
The following sample code provides an example of the configuration file content of the pmml_grey service:
{
  "name": "pmml_grey",
  "model_path": "http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/lr_xxxx.pmml",
  "processor": "pmml",
  "metadata": {
    "cpu": 1,
    "instance": 1,
    "group": "pmml",
    "traffic_state": "grouping"
  }
}
Parameters:
group: the name of the service group that you created. This parameter specifies the service group to which the service belongs.
traffic_state: specifies whether to forward traffic to the service in the service group. Valid values:
grouping: forwards traffic to the service.
standalone: does not forward traffic to the service.
Note: If you do not configure the traffic_state parameter in the configuration file of a service, the system automatically forwards traffic to the service. If you do not want a service that is added to a service group to immediately receive traffic, set the traffic_state parameter to standalone.
If a service group contains only a single service and you set the traffic_state parameter to standalone for the service, the system automatically changes the value to grouping.
For information about other parameters in the configuration file, see Run commands to use the EASCMD client.
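Because the two configuration files differ only in the service name and the instance count, they can be generated from one template. The following sketch assumes Python is available. The service_config helper and the pmml_prod.json and pmml_grey.json file names are hypothetical, and the field values are copied from the samples above.

```python
import json


def service_config(name, instances, group="pmml", traffic_state="grouping"):
    """Build the configuration for one service in a service group (hypothetical helper)."""
    return {
        "name": name,
        "model_path": "http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/lr_xxxx.pmml",
        "processor": "pmml",
        "metadata": {
            "cpu": 1,
            "instance": instances,
            "group": group,
            "traffic_state": traffic_state,
        },
    }


# Hypothetical file names, one per service.
configs = {
    "pmml_prod.json": service_config("pmml_prod", instances=4),
    "pmml_grey.json": service_config("pmml_grey", instances=1),
}

for filename, config in configs.items():
    # Print the JSON text; save each document to its own file before deploying.
    print(f"# {filename}")
    print(json.dumps(config, indent=2))
```

To add a service without sending traffic to it right away, pass traffic_state="standalone" when you build its configuration.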
Create two services and a service group.
Log on to the EASCMD client and run the create command to create the two services and the service group. Run the command once for each service configuration file. For more information about how to log on to the EASCMD client, see Download the EASCMD client and complete identity authentication. Sample code:
$ eascmd create service.json
View information about the services and service group.
Run the following ls command to view information about the services and service group:
$ eascmd ls
The following information is returned:
[RequestId]: 716BEBFC-E8A4-51FD-A3F7-56376B167923
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| SERVICENAME               | INSTANCE | CPU | MEMORY | CREATETIME           | UPDATETIME           | STATUS  | WEIGHT | TRAFFICSTATE | SERVICEGROUP              |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| pmml_prod                 | 4        | 1   | 1000M  | 2022-06-05T14:30:49Z | 2022-06-05T14:30:49Z | Running | 80     | grouping     | pmml                      |
| pmml_grey                 | 1        | 1   | 1000M  | 2022-06-05T14:31:38Z | 2022-06-05T14:31:38Z | Running | 20     | grouping     | pmml                      |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
In the output:
pmml is displayed in the SERVICEGROUP column. This indicates that the two services belong to the pmml service group.
grouping is displayed in the TRAFFICSTATE column. This indicates that both services receive traffic. The traffic distribution between the services is 80% and 20%, which is calculated based on the number of service instances.
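The 80% and 20% split follows from the instance counts: 4 of the 5 instances belong to pmml_prod and 1 of 5 belongs to pmml_grey. The following sketch reproduces that arithmetic together with the traffic_state rules described earlier. It is not the EAS implementation: the traffic_weights function is hypothetical, and rounding to whole percentages is an assumption.

```python
def traffic_weights(services):
    """Return {service_name: weight_percent} for one service group.

    `services` maps a service name to (instance_count, traffic_state).
    A traffic_state of None stands for "not configured".
    """
    # An unset traffic_state behaves like "grouping".
    states = {name: (state or "grouping") for name, (_, state) in services.items()}
    # The only service in a group is always treated as "grouping".
    if len(services) == 1:
        states = {name: "grouping" for name in services}

    # Only services in the grouping state share the group traffic.
    receiving = {
        name: count
        for name, (count, _) in services.items()
        if states[name] == "grouping"
    }
    total = sum(receiving.values())
    # Assumption: weights are instance shares rounded to whole percentages.
    return {
        name: round(100 * receiving.get(name, 0) / total) if total else 0
        for name in services
    }


print(traffic_weights({"pmml_prod": (4, "grouping"), "pmml_grey": (1, "grouping")}))
print(traffic_weights({"pmml_prod": (4, "grouping"), "pmml_grey": (1, "standalone")}))
```

The first call corresponds to the output above. The second call shows the case in which pmml_grey is set to standalone and pmml_prod receives all traffic.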
What to do next:
After you create the services and service group, you can view the data ingress of the service group and the data ingresses of services. The data ingresses are used for external access. For more information, see View data ingresses.
After you create the services and service group, the system automatically performs group traffic switchover. The traffic distribution between the services is calculated based on the number of service instances. You can modify the traffic distribution policies. For more information, see Modify traffic distribution policies (group traffic switchover).
View data ingresses
A service group has a centralized data ingress. Each service in the service group has a separate data ingress. The data ingresses are in the following formats:
Data ingress of a service group:
<endpoint>/api/predict/<group_name>
Example:
http://182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com/api/predict/pmml
To view the data ingress of a service group, follow the instructions shown in the following figure in the Platform for AI (PAI) console.
Traffic that flows through the ingress is distributed to services in the service group based on the traffic distribution policies. You can create or delete services in the service group. The data ingress remains unchanged. You can use the data ingress to debug services online.
Data ingress of a service:
<endpoint>/api/predict/<group_name>.<service_name>
Example:
http://182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com/api/predict/pmml.pmml_prod
To view the data ingress of a service, follow the instructions shown in the following figure in the PAI console.
This data ingress is tied to the lifecycle of the service. Traffic that flows through the ingress is sent only to the specified service. After the service is deleted, the data ingress is also deleted. After the group traffic switchover is complete, you must use this data ingress to access and debug the individual service online.
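The two formats above differ only in the last path segment. The following sketch builds both URLs from the endpoint used in the examples. The group_ingress and service_ingress function names are hypothetical.

```python
def group_ingress(endpoint, group_name):
    """Data ingress shared by all services in the group."""
    return f"{endpoint.rstrip('/')}/api/predict/{group_name}"


def service_ingress(endpoint, group_name, service_name):
    """Data ingress of one specific service in the group."""
    return f"{endpoint.rstrip('/')}/api/predict/{group_name}.{service_name}"


# Endpoint copied from the examples above (the account ID is masked in the source).
ENDPOINT = "http://182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com"

print(group_ingress(ENDPOINT, "pmml"))
print(service_ingress(ENDPOINT, "pmml", "pmml_prod"))
```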
Modify traffic distribution policies (group traffic switchover)
Elastic Algorithm Service (EAS) supports group traffic switchover and allows you to modify traffic distribution policies.
Group traffic switchover
After you create services and specify the service group to which the services belong, group traffic switchover is automatically performed. The services in the service group start to receive traffic, and the traffic distribution between services is calculated based on the number of service instances.
Modify traffic distribution policies
To modify traffic distribution policies, follow the instructions shown in the following figure in the PAI console. If you turn on the switch indicated by ③, the service receives traffic. If you turn off the switch, the service does not receive traffic.
Run the following release command to modify traffic distribution policies. For information about how to log on to the EASCMD client, see Download the EASCMD client and complete identity authentication.
$ eascmd release <service_name> -s grouping|standalone
Parameters:
<service_name>: the name of the service. Change the value to the name of the service for which you want to modify the traffic distribution policy.
grouping|standalone: the status after modification. Valid values: grouping and standalone. grouping indicates that the service receives traffic, and standalone indicates that the service does not receive traffic.
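If you script traffic switchover, it can help to validate the target state before you call the client. The following sketch only builds the argument list for the release command shown above and does not run it. The release_command helper is hypothetical.

```python
import shlex

VALID_STATES = ("grouping", "standalone")


def release_command(service_name, state):
    """Build the argument list for `eascmd release` (hypothetical helper)."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {VALID_STATES}, got {state!r}")
    return ["eascmd", "release", service_name, "-s", state]


# Print the command as it would be typed in a shell.
print(shlex.join(release_command("pmml_grey", "standalone")))
```

The returned list can be passed to subprocess.run. The sample output in the examples suggests that the command asks for a [Y/n] confirmation, so a non-interactive script would likely need to supply that answer.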
Examples:
Run the following command to change the status of the pmml_grey service to standalone. This way, the pmml_grey service does not receive traffic.
$ eascmd release pmml_grey -s standalone
The following output is returned:
Confirmed to release service [pmml_grey] to group traffic [Y/n]yes
[RequestId]: 40C787DF-8900-5F7A-8A01-30F7D5A8BF3B
[OK] Service [pmml_grey] has entered the traffic state: standalone
Run the eascmd ls command to view the status of the service. The following output is returned:
[RequestId]: 83BE3FBB-8CE2-5008-B435-1938A20B13AA
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| SERVICENAME               | INSTANCE | CPU | MEMORY | CREATETIME           | UPDATETIME           | STATUS  | WEIGHT | TRAFFICSTATE | SERVICEGROUP              |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| pmml_prod                 | 4        | 1   | 1000M  | 2022-06-05T14:30:49Z | 2022-06-05T14:30:49Z | Running | 100    | grouping     | pmml                      |
| pmml_grey                 | 1        | 1   | 1000M  | 2022-06-05T14:42:41Z | 2022-06-05T14:42:41Z | Running | 0      | standalone   | pmml                      |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
The TRAFFICSTATE of the pmml_grey service changes to standalone. The value in the WEIGHT column for pmml_grey is 0, which indicates that all traffic is received by the pmml_prod service.
Run the following command to change the status of the pmml_grey service to grouping. This allows the pmml_grey service to receive traffic.
$ eascmd release pmml_grey -s grouping
The following output is returned:
Confirmed to release service [pmml_grey] to group traffic [Y/n]yes
[RequestId]: 40C787DF-8900-5F7A-8A01-30F7D5A8BF3B
[OK] Service [pmml_grey] has entered the traffic state: grouping
Run the eascmd ls command to view the status of the service. The following output is returned:
[RequestId]: 83BE3FBB-8CE2-5008-B435-1938A20B13AA
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| SERVICENAME               | INSTANCE | CPU | MEMORY | CREATETIME           | UPDATETIME           | STATUS  | WEIGHT | TRAFFICSTATE | SERVICEGROUP              |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| pmml_prod                 | 4        | 1   | 1000M  | 2022-06-05T14:30:49Z | 2022-06-05T14:30:49Z | Running | 80     | grouping     | pmml                      |
| pmml_grey                 | 1        | 1   | 1000M  | 2022-06-05T14:42:41Z | 2022-06-05T14:42:41Z | Running | 20     | grouping     | pmml                      |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
The TRAFFICSTATE of the pmml_grey service changes to grouping. The percentage of traffic that is received by the service is 20%.
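To check the traffic state in a script instead of reading the table by eye, you can parse the ls output. The following sketch assumes that the output keeps the bordered table layout shown in the examples. The parse_ls_table function is hypothetical, and the sample text is abridged to four columns to keep it short.

```python
def parse_ls_table(text):
    """Parse a bordered `eascmd ls` style table into a list of row dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and the [RequestId] line
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


# Abridged sample based on the output shown above.
SAMPLE = """\
[RequestId]: 83BE3FBB-8CE2-5008-B435-1938A20B13AA
+-------------+--------+--------------+--------------+
| SERVICENAME | WEIGHT | TRAFFICSTATE | SERVICEGROUP |
+-------------+--------+--------------+--------------+
| pmml_prod   | 100    | grouping     | pmml         |
| pmml_grey   | 0      | standalone   | pmml         |
+-------------+--------+--------------+--------------+
"""

rows = parse_ls_table(SAMPLE)
states = {row["SERVICENAME"]: row["TRAFFICSTATE"] for row in rows}
print(states)
```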