In large language model (LLM) services and other model-serving scenarios such as intelligent search, recommendation, and advertising, you often need to modify a model service based on online performance and business requirements, and then adjust the traffic distribution between model services. This lets you run low-cost, short-term tests of model services in a cost-effective way. PAI-ABTest provides ready-to-use, general-purpose testing for model services.
Limits
PAI-ABTest is available only in the following regions: China (Beijing), China (Shanghai), China (Hangzhou), and China (Shenzhen).
PAI-ABTest does not support access by using role-based single sign-on (SSO). You can access PAI-ABTest only by using a RAM user.
Billing method
PAI-ABTest is in invitational preview and is provided free of charge. If you use other Alibaba Cloud services together with PAI-ABTest, such as Elastic Algorithm Service (EAS) of Platform for AI (PAI) or MaxCompute, you are charged for the resources that those services consume. For more information, see Billing of EAS and Overview of MaxCompute Billing.
Terms
Experiment management
Experiment: a set of experiment versions that are compared by using A/B testing. You can configure multiple versions and combinations in an experiment. Experiment traffic comes from the experiment layer, which can contain multiple experiments. Traffic of experiments on the same layer is mutually exclusive. An experiment includes a combination of parameters that control the experiment process. Traffic is randomly allocated to each experiment version, so you can compare the parameters and their effects.
Experiment project: a grouping of business logic. Services in similar business scenarios can be grouped in the same project.
Experiment domain: a collection of traffic. You can allocate traffic randomly, by business, or by using filter conditions on business attributes. You can also allocate traffic based on a custom allocation policy.
Experiment layer: An experiment domain contains one or more experiment layers. Traffic is orthogonal across experiment layers, so each layer can use 100% of the traffic of the domain.
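The relationship between these terms can be sketched as a small data model. The following Python example is only an illustration and does not reflect the PAI-ABTest API: the class names, the 100-bucket convention, and the half-open bucket ranges are assumptions made for this sketch. It shows how experiments on the same layer can be kept mutually exclusive by rejecting overlapping traffic ranges.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    name: str
    # Half-open bucket range [start, end) out of 100 buckets in the layer (assumed convention).
    start: int
    end: int


@dataclass
class Layer:
    name: str
    experiments: list = field(default_factory=list)

    def add(self, exp: Experiment) -> None:
        # Reject ranges that fall outside the layer's 100 buckets.
        if not 0 <= exp.start < exp.end <= 100:
            raise ValueError(f"invalid bucket range for {exp.name}")
        # Experiments on one layer are mutually exclusive: ranges must not overlap.
        for other in self.experiments:
            if exp.start < other.end and other.start < exp.end:
                raise ValueError(f"{exp.name} overlaps {other.name}")
        self.experiments.append(exp)


@dataclass
class Domain:
    name: str
    layers: list = field(default_factory=list)


@dataclass
class Project:
    name: str
    domains: list = field(default_factory=list)


# A project with one domain and one layer that hosts two non-overlapping experiments.
ranking = Layer("ranking")
ranking.add(Experiment("new_model", 0, 30))
ranking.add(Experiment("new_features", 30, 50))
project = Project("search", [Domain("default", [ranking])])
```

In this sketch, adding a third experiment with the range 20 to 40 raises a ValueError because it would share traffic with the two existing experiments.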
Traffic management
User group: a collection of specific traffic IDs in the experiment.
Metric management
Metrics: the performance metrics and service metrics that are used to assess the experiment.
Data table: contains information such as the data source and related fields that are required for the experiment.
Global configuration
Publish management: deploys the optimal experiment configuration to all user groups.
Architecture
Architecture of ABTest
You can use the Alibaba Cloud ABTest service in the Platform for AI (PAI) console to configure experiments and experiment metrics.
ABTest provides SDKs for Go and Java. You can reference the SDKs on the ABTest server. The SDKs periodically poll for experiment-related metadata. When you use an SDK, the SDK allocates traffic based on the access context and returns the matching experiment configuration. The related business logic is then executed based on the returned configuration.
You can register a MaxCompute log table as a source data table. The system registers the table on the ABTest server. When an application generates behavior logs, the application sends data back to the MaxCompute log table based on event tracking. When you configure experiment metrics in ABTest, if the content of the log table is generated in near real time, hourly and daily experiment metrics are generated at the same time and saved in the Hologres storage of ABTest.
You can view the reports related to an experiment in ABTest in the PAI console.
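The metadata pull described above can be illustrated with a local cache that refreshes itself when its copy becomes stale. The following Python example is a minimal sketch and not the behavior of the actual Go or Java SDK: the MetadataCache class, the fetch callable, and the 60-second interval are all hypothetical, and the refresh happens lazily on read instead of in a background task.

```python
import time
from typing import Callable, Dict, Optional


class MetadataCache:
    """Keeps a local copy of experiment metadata and refreshes it when stale."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, dict]],
        interval_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch  # hypothetical call that returns all experiment configs
        self._interval_s = interval_s
        self._clock = clock  # injectable so that the refresh logic can be tested
        self._data: Dict[str, dict] = {}
        self._fetched_at: Optional[float] = None

    def get(self, experiment: str) -> Optional[dict]:
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._interval_s
        if stale:
            try:
                self._data = self._fetch()
                self._fetched_at = now
            except Exception:
                # Keep serving the last good copy and retry on the next call.
                pass
        return self._data.get(experiment)


# Usage with a stand-in fetch function instead of a real metadata service.
cache = MetadataCache(lambda: {"exp_a": {"model": "v2"}})
config = cache.get("exp_a")
```

Serving the last good copy when a refresh fails is a design choice of this sketch: it keeps traffic allocation available during a short metadata outage at the cost of possibly stale configuration.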
Implementation of an experiment
Single-layer experiment: After you create a project, the system automatically creates the default domain and layer. You can create an experiment on the default layer. The experiment can obtain all traffic of the layer or a specific portion of the traffic of the layer based on random allocation. An experiment includes multiple versions. You can configure the percentage of traffic that you want to allocate to each experiment version.
Multi-layer experiment: You can expand a single-layer experiment into multiple layers. The traffic of each layer is orthogonal. You can create experiments on each layer.
The combination of layers and domains: A layer can contain multiple domains, and you can create multiple layers in a domain, as shown in the following figure. In multi-layer experiment scenarios, you can conduct an experiment on a single layer or an integrated experiment across layers. The following figure provides a sample configuration.
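One common way to obtain orthogonal traffic across layers is to hash the traffic ID together with a per-layer salt and then split the resulting buckets by percentage. The following Python example is a sketch of that general technique under stated assumptions. It is not taken from the ABTest SDK: the function names, the use of MD5, the layer names, and the 100-bucket split are all illustrative.

```python
import hashlib


def bucket(layer: str, traffic_id: str, buckets: int = 100) -> int:
    """Map a traffic ID to a bucket; the layer name salts the hash."""
    digest = hashlib.md5(f"{layer}:{traffic_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def pick_version(layer: str, traffic_id: str, versions: dict) -> str:
    """versions maps a version name to its traffic percentage; values must sum to 100."""
    if sum(versions.values()) != 100:
        raise ValueError("version percentages must sum to 100")
    b = bucket(layer, traffic_id)
    upper = 0
    for name, percent in versions.items():
        # Each version owns the next `percent` buckets of the layer.
        upper += percent
        if b < upper:
            return name
    raise AssertionError("unreachable when percentages sum to 100")


# Two layers in the same domain; each layer splits 100% of the traffic on its own.
recall_versions = {"control": 50, "treatment": 50}
ranking_versions = {"control": 80, "treatment": 20}
user = "user-42"
print(pick_version("recall", user, recall_versions),
      pick_version("ranking", user, ranking_versions))
```

Because the layer name is part of the hashed value, the bucket that a user falls into on one layer is effectively independent of the bucket on another layer, while the same user always gets the same version within a layer.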
Permissions
Grant access permissions to a RAM user
Grant the ABTest full access permission to the RAM user.
Log on to the RAM console with your Alibaba Cloud account.
In the left-side navigation pane, go to the policy management page.
Click Create Policy. On the page that appears, select JSON and enter the following sample policy. Set the name of the policy to pai_abtest_full_access. For more information, see Create custom policies.
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "paiabtest:*",
      "Resource": "*"
    }
  ]
}
Click Grant Permission and grant the pai_abtest_full_access permission to the RAM user. For more information, see Grant permissions to a RAM user.
Authorize ABTest to access other services
ABTest requires service-linked role (SLR) authorization. The name of the SLR is AliyunServiceRoleForPAIABTest. Sample policy:
{
  "Version": "1",
  "Statement": [
    {
      "Action": "ram:DeleteServiceLinkedRole",
      "Resource": "*",
      "Effect": "Allow",
      "Condition": {
        "StringEquals": {
          "ram:ServiceName": "abtest.pai.aliyuncs.com"
        }
      }
    },
    {
      "Effect": "Allow",
      "Action": [
        "odps:ActOnBehalfOfAnotherUser",
        "odps:ListProjects",
        "odps:ListTables"
      ],
      "Resource": "acs:odps:*:*:users/*"
    }
  ]
}