The Elastic Algorithm Service (EAS) module of Platform for AI (PAI) provides multiple methods to help you deploy model services based on your business requirements.
Deployment methods
You can deploy a model by using an image or a processor in EAS.
Use an image (recommended)
If you deploy a model by using an image, EAS pulls the image that contains the runtime environment from Container Registry (ACR) and mounts model files and code from storage services such as Object Storage Service (OSS) and File Storage NAS (NAS).
The following figure shows the workflow of deploying a model by using an image in EAS.
Take note of the following items:
You can use one of the following methods when you deploy a model by using an image:
Deploy Service by Using Image: You can call the service by using API operations after deployment.
Deploy Web App by Using Image: You can access the web application by using a link after deployment.
For information about the differences between the two methods, see the "Step 2: Deploy a model" section of this topic.
PAI provides multiple prebuilt images to accelerate model deployment. You can also create a custom image and upload the image to ACR.
We recommend that you upload the model files and the code files that contain the preprocessing or postprocessing logic to storage services. This way, you can mount the files to the runtime environment. Compared with packaging the files into a custom image, this method allows you to update the model in a convenient manner.
When you deploy a model by using an image, we recommend that you build an HTTP server to receive the requests that EAS forwards. The HTTP server cannot listen on port 8080 or 9090 because the EAS engine listens on these ports.
If you use a custom image, you must upload the image to ACR before deployment. Otherwise, EAS may fail to pull the image. This requirement also applies to images that you build in Data Science Workshop (DSW) during model development.
If you want to reuse your custom images or warm-up data in other scenarios, you can manage the images or data in a centralized manner by using the AI Computing Asset Management module of PAI. EAS does not support mounting CPFS datasets from NAS.
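As noted above, an image-based service must expose its own HTTP server on a port other than 8080 or 9090. The following is a minimal sketch using Python's standard library; the port number 8000 and the /predict path are illustrative choices, not EAS requirements:

```python
# Minimal HTTP server sketch for an image-based EAS service.
# Port 8000 is an arbitrary choice; avoid 8080 and 9090, which the
# EAS engine reserves for itself.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        # Replace this echo with your real preprocessing / inference logic.
        body = json.dumps({"echo": payload}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        # Keep the sketch quiet; EAS collects logs from stdout/stderr.
        pass

def serve(port=8000):
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()

if __name__ == "__main__":
    serve()
```

The inference code itself goes inside do_POST; the server is only the thin layer that EAS forwards requests to.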
Use a processor
If you deploy a model by using a processor, you must prepare the model file and the processor file, upload the files to a storage service such as OSS or NAS, and then mount the files to EAS during deployment.
The following figure shows the workflow of deploying a model by using a processor in EAS.
Take note of the following items:
EAS provides multiple prebuilt processors to accelerate model deployment. You can also develop a custom processor based on your business requirements.
We recommend that you develop and store the model file and the processor file separately. You can call the get_model_path() method in the processor file to obtain the path of the model file. This allows you to update the model in a convenient manner.
When you deploy a model by using a processor, EAS automatically pulls an official image based on the inference framework of the model and deploys an HTTP server based on the processor file to receive service requests.
When you deploy a model by using a processor, make sure that the inference framework of the model matches the processor and that the processor file meets the requirements of the runtime environment. This method is less flexible and less efficient than deploying by using an image. We recommend that you deploy a model by using an image.
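The separation between the model file and the processor file described above can be sketched as follows. This is not the real EAS processor SDK; the class name, the method hooks, and the environment-variable lookup are illustrative stand-ins for the SDK's initialize/process entry points and its get_model_path() method:

```python
# Illustrative sketch of the processor pattern: the processor only knows
# how to load a model from a path that the platform supplies, so the
# model file can be swapped without touching the processor code.
# This mimics the shape of an EAS processor; it is NOT the real SDK.
import json
import os

class SketchProcessor:
    def get_model_path(self):
        # EAS resolves the mounted model location for you; an environment
        # variable is used here purely as a stand-in.
        return os.environ["SKETCH_MODEL_PATH"]

    def initialize(self):
        # Called once at startup: load the model from the mounted path.
        with open(self.get_model_path()) as f:
            self.model = json.load(f)  # toy "model": a linear function

    def process(self, data):
        # Called per request: run inference and return (response, status).
        x = json.loads(data)["x"]
        y = self.model["w"] * x + self.model["b"]
        return json.dumps({"y": y}), 200
```

With this split, updating the model amounts to replacing the file in OSS or NAS; the processor code stays unchanged.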
Deployment tools and methods
The following table describes the deployment tools.
Deploy services
GUI tools: Use the PAI console or Machine Learning Designer to deploy a service with a few clicks. For more information, see Deploy a model service in the PAI console or Deploy a model service by using Machine Learning Designer.
CLI tools: Use DSW or the EASCMD client to deploy a service. For more information, see Deploy model services by using EASCMD or DSW.
Manage services
GUI tools: Manage model services on the EAS-Online Model Services page. For more information, see Deploy a model service in the PAI console. The following operations are supported:
View invocation information.
View logs, monitoring information, and service deployment information.
Scale, start, stop, and delete model services.
CLI tools: Use the EASCMD client to manage model services. For more information, see Run commands to use the EASCMD client.
If you use a dedicated resource group to deploy a model service, you can mount the required data from storage services. For more information, see Mount storage to services.
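When you deploy with the EASCMD client, the service is described by a JSON file. The sketch below generates such a file; the field names (name, processor, model_path, metadata) follow the commonly documented EAS service description format, but treat every value as a placeholder and verify the schema against the EASCMD reference before deploying:

```python
# Sketch: generate a service description file for EASCMD.
# Field names follow the commonly documented EAS service JSON
# (name, processor, model_path, metadata); verify them against the
# EASCMD reference for your region before deploying.
import json

service = {
    "name": "demo_service",                      # service name (placeholder)
    "processor": "pmml",                         # a prebuilt processor type
    "model_path": "oss://examplebucket/model/",  # mounted model location (placeholder)
    "metadata": {
        "instance": 1,   # number of service instances
        "cpu": 2,        # vCPUs per instance
        "memory": 4000,  # memory per instance, in MB
    },
}

with open("service.json", "w") as f:
    json.dump(service, f, indent=2)

# You would then deploy with, for example: eascmd create service.json
```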
The following table describes the deployment methods.
Deploy Service by Using Image (recommended)
Scenario: Use an image to deploy a model service.
Benefits:
Images ensure consistency between the model development environment and the runtime environment.
Prebuilt images for common scenarios allow you to complete deployment with a few clicks.
Custom images can be used for deployment without the need for modification.
Deploy Web App by Using Image (recommended)
Scenario: Use an image to deploy a web application.
Benefits:
Prebuilt images for common scenarios, such as Stable-Diffusion-Webui and Chat-LLM-Webui, allow you to complete deployment with a few clicks.
You can build an HTTP server by using frameworks such as Gradio, Flask, and FastAPI.
Custom images can be used for deployment without the need for modification.
Deploy Service by Using Model and Processor
Scenario: Use a model and a processor to deploy a model service.
Benefits:
EAS provides prebuilt processors for common model frameworks, such as PMML and XGBoost, to accelerate deployment.
If the prebuilt processors cannot meet your business requirements, you can develop a custom processor for greater flexibility.
Advanced configurations
Service groups
EAS supports service groups, which you can use in scenarios that require traffic distribution across multiple services, such as canary releases. For more information, see Manage service groups.
Scheduled service deployment
You can use DataWorks to automatically deploy services on a regular basis. For more information, see Configure scheduled model deployment.
Instance utilization
EAS provides preemptible instances and allows you to select multiple instance types. This way, you can deploy services in a cost-effective manner. For more information, see Create and use preemptible instances and Specify multiple instance types.
Storage integration
EAS can mount data from multiple storage services, such as Object Storage Service (OSS), File Storage NAS (NAS), and Git repositories. For more information, see Mount storage to services.
Model warm-up
EAS provides the model warm-up feature to reduce the delay in processing the first request after deployment. This ensures that model services can work as expected immediately after they are published. For more information, see Warm up model services (advanced).
References
You can use multiple methods to call the service that you deployed. For more information, see Methods for calling services.
You can view the metrics related to service invocation and operational health on the Service Monitoring tab. For more information, see Service monitoring.
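As a minimal illustration of calling a deployed service, the helper below builds an HTTP POST request that carries the service token in the Authorization header, which is how EAS services are typically invoked. The endpoint URL and token are placeholders; obtain the real values from the service details page after deployment:

```python
# Sketch: build a request to a deployed EAS service. The endpoint and
# token are placeholders supplied by the caller.
import json
import urllib.request

def build_request(endpoint, token, payload):
    """Return a POST request with the service token in the Authorization header."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": token,
            "Content-Type": "application/json",
        },
    )

# Sending is then a one-liner (commented out here because the endpoint
# is a placeholder):
# response = urllib.request.urlopen(build_request(endpoint, token, {"x": 1}))
```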