Platform for AI: Deploy a Hugging Face model in EAS

Last Updated: Dec 11, 2024

Platform for AI (PAI) provides preset images for community model deployment and acceleration mechanisms for model distribution and image startup in Elastic Algorithm Service (EAS). You can quickly deploy a community model in EAS by configuring a few parameters. This topic describes how to deploy a Hugging Face model in EAS.

Background information

Open source model communities such as Hugging Face provide a wide range of machine learning models and code implementations. Their libraries expose the packaged models, frameworks, and related processing logic through a unified interface, so you can perform end-to-end operations such as model training and inference with only a few lines of code, without handling environment dependencies, preprocessing and post-processing logic, or framework types yourself. For more information, visit the official Hugging Face site. This ecosystem builds on deep learning frameworks such as TensorFlow and PyTorch.
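
As a rough illustration of the "few lines of code" mentioned above, the following sketch loads a model through the `pipeline` function of the Hugging Face `transformers` library. It assumes that `transformers` and a backend such as PyTorch are installed locally and that the model can be downloaded. The model ID, task, and revision are the ones used later in this topic, and the helper name `classify` is hypothetical.

```python
# Arguments for transformers.pipeline; the values match the example used in this topic.
PIPELINE_ARGS = {
    "task": "text-classification",
    "model": "distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "main",
}


def classify(text):
    """Run one text through the pipeline (hypothetical helper for illustration)."""
    # Imported lazily so that this module loads even where transformers is not installed.
    from transformers import pipeline

    classifier = pipeline(**PIPELINE_ARGS)  # downloads the model on first use
    return classifier(text)  # a list of dicts with "label" and "score" keys
```

Calling `classify("hello")` runs tokenization, inference, and post-processing in one step. This is the behavior that the pipelines library packages for you.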

EAS provides optimized capabilities to facilitate community model deployment.

Deploy a Hugging Face model

PAI allows you to deploy a model in the pipelines library of Hugging Face as an online model service in EAS. Perform the following steps:

  1. Select the model that you want to deploy from the pipelines library. In this example, the distilbert-base-uncased-finetuned-sst-2-english model is used. On the details page of the model, obtain the model ID, task, and revision information, and record them locally.

    The following table maps the TASK that is displayed on the Hugging Face page to the TASK parameter that you must specify when you deploy the service in EAS:

    | TASK displayed on the Hugging Face page | TASK to specify in EAS |
    | --- | --- |
    | Audio Classification | audio-classification |
    | Automatic Speech Recognition (ASR) | automatic-speech-recognition |
    | Feature Extraction | feature-extraction |
    | Fill Mask | fill-mask |
    | Image Classification | image-classification |
    | Question Answering | question-answering |
    | Summarization | summarization |
    | Text Classification | text-classification |
    | Sentiment Analysis | sentiment-analysis |
    | Text Generation | text-generation |
    | Translation | translation |
    | Translation (xx-to-yy) | translation_xx_to_yy |
    | Text-to-Text Generation | text2text-generation |
    | Zero-Shot Classification | zero-shot-classification |
    | Document Question Answering | document-question-answering |
    | Visual Question Answering | visual-question-answering |
    | Image-to-Text | image-to-text |
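
The mapping in the table can be kept in a small lookup helper so that the TASK value is not retyped by hand. This is a sketch: the dictionary is copied from the table, while the names `HF_TASK_TO_EAS_TASK` and `eas_task` are hypothetical and are not part of any PAI or Hugging Face API.

```python
# Mapping copied from the table: label on the Hugging Face page -> TASK value for EAS.
HF_TASK_TO_EAS_TASK = {
    "Audio Classification": "audio-classification",
    "Automatic Speech Recognition (ASR)": "automatic-speech-recognition",
    "Feature Extraction": "feature-extraction",
    "Fill Mask": "fill-mask",
    "Image Classification": "image-classification",
    "Question Answering": "question-answering",
    "Summarization": "summarization",
    "Text Classification": "text-classification",
    "Sentiment Analysis": "sentiment-analysis",
    "Text Generation": "text-generation",
    "Translation": "translation",
    "Translation (xx-to-yy)": "translation_xx_to_yy",
    "Text-to-Text Generation": "text2text-generation",
    "Zero-Shot Classification": "zero-shot-classification",
    "Document Question Answering": "document-question-answering",
    "Visual Question Answering": "visual-question-answering",
    "Image-to-Text": "image-to-text",
}


def _normalize(label):
    """Lowercase the label and drop all whitespace so spacing differences do not matter."""
    return "".join(label.split()).lower()


_LOOKUP = {_normalize(name): task for name, task in HF_TASK_TO_EAS_TASK.items()}


def eas_task(displayed_task):
    """Return the EAS TASK value for a task label shown on the Hugging Face page."""
    try:
        return _LOOKUP[_normalize(displayed_task)]
    except KeyError:
        raise ValueError(f"No EAS TASK value is listed for {displayed_task!r}") from None


print(eas_task("Text Classification"))  # text-classification
```

An explicit dictionary is used instead of a string transformation because some entries, such as translation_xx_to_yy and text2text-generation, do not follow the lowercase-with-hyphens pattern.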

  2. On the EAS page, deploy the Hugging Face model.

    1. Log on to the PAI console. Select a region at the top of the page. Then, select the desired workspace and click Enter Elastic Algorithm Service (EAS).

    2. Click Deploy Service. In the Custom Model Deployment section, click Custom Deployment.

    3. On the Custom Deployment page, configure the following parameters. For information about the other parameters, see Deploy a model service in the PAI console.

      | Section | Parameter | Description |
      | --- | --- | --- |
      | Basic Information | Service Name | Specify a name for the service. |
      | Environment Information | Deployment Method | Select Image-based Deployment and Enable Web App. |
      | Environment Information | Image Configuration | Select huggingface-inference from Alibaba Cloud Image. Then, select an image version. |
      | Environment Information | Environment Variable | Click Add and specify the following information that is obtained in Step 1: MODEL_ID: distilbert-base-uncased-finetuned-sst-2-english; TASK: text-classification; REVISION: main |
      | Environment Information | Command | The system automatically configures the command to run after you configure the Image Configuration parameter. You do not need to modify the setting. |
      | Resource Deployment | Additional System Disk | Set to 100 GB. |

    4. Click Deploy. If Service Status changes to Running, the service is deployed.

  3. Call the deployed model service.

    Use the console to call the service

    • On the EAS page, click View Web App in the Service Type column to test model inference on the web UI.

    • Click Online Debugging in the Actions column of the service. On the Body tab, enter the request data, such as {"data": ["hello"]}, and click Send Request.

      Note

      The format of the input data ({"data": ["XXX"]}) of a text classification model is defined by /api/predict of the Gradio framework. If you use other types of models, such as image classification or speech data processing, you can construct the request data by referring to /api/predict.
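
The following is a minimal sketch of building request bodies in the {"data": [...]} format described in the note. The text case follows the example in this topic. The image case is an assumption: Gradio image inputs are commonly passed as base64 data URIs, but the exact format depends on the Gradio version in the image and on the input components of the model, so verify it against your service. The helper names are hypothetical.

```python
import base64
import json
import mimetypes


def text_payload(text):
    """Request body for a text model, as in the {"data": ["hello"]} example."""
    return {"data": [text]}


def image_payload(image_bytes, filename="image.png"):
    """Request body for an image model, encoded as a base64 data URI (assumed format)."""
    # Guess the MIME type from the file name; fall back to a generic binary type.
    mime_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"data": [f"data:{mime_type};base64,{encoded}"]}


print(json.dumps(text_payload("hello")))  # {"data": ["hello"]}
```

The printed text body can be pasted directly into the Body tab of Online Debugging.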

    Use APIs to call the service

    1. Click the service name to go to the Service Details tab. On the tab that appears, click View Endpoint Information.

    2. On the Public Endpoint tab of the Invocation Method dialog box, obtain the values of URL and Token.

    3. Use the following code to call the service:

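      import requests

      # Send the input in the {"data": [...]} format that the service expects.
      resp = requests.post(url="<service_url>",
                           headers={"Authorization": "<token>"},
                           json={"data": ["hello"]})

      # Print the response body. Printing resp itself shows only the status, such as <Response [200]>.
      print(resp.text)

      # Example response body: {"data":[{"label":"POSITIVE","confidences":[{"label":"POSITIVE","confidence":0.9995185136795044}]}],"is_generating":false,"duration":0.280987024307251,"average_duration":0.280987024307251}

      Replace <service_url> and <token> with the URL and token obtained in the previous step.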
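
The response body shown in the comment above can be unpacked with a small helper. This is a sketch that assumes the response keeps the shape of that sample, that is, a "data" list whose items carry "label" and "confidences". Other tasks may return a different structure. The function name `top_prediction` is hypothetical.

```python
import json

# Sample response body copied from the example above.
SAMPLE = (
    '{"data":[{"label":"POSITIVE","confidences":'
    '[{"label":"POSITIVE","confidence":0.9995185136795044}]}],'
    '"is_generating":false,"duration":0.280987024307251,'
    '"average_duration":0.280987024307251}'
)


def top_prediction(body):
    """Return (label, confidence) for the first item in a text classification response."""
    result = json.loads(body)["data"][0]
    label = result["label"]
    # Find the confidence entry that matches the predicted label.
    confidence = next(
        c["confidence"] for c in result["confidences"] if c["label"] == label
    )
    return label, confidence


label, confidence = top_prediction(SAMPLE)
print(f"{label} {confidence:.4f}")  # POSITIVE 0.9995
```

With the requests example, pass `resp.text` to the helper after confirming that the request succeeded, for example by calling `resp.raise_for_status()`.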