Platform for AI: Use a TensorFlow Serving image to deploy a model service

Last Updated: Dec 04, 2024

TensorFlow Serving is an inference serving engine for deep learning models. It allows you to deploy TensorFlow models in the SavedModel format as online services and supports features such as rolling updates and model version management. This topic describes how to deploy a model service by using a TensorFlow Serving image.

Before you begin

Model files

To deploy a model service with a TensorFlow Serving image, ensure that your model files are stored in an Object Storage Service (OSS) bucket with the following structure:

  • Version sub-directories: Each model directory must contain at least one version sub-directory. The name of a version sub-directory must be a number that indicates the model version. A larger number indicates a later model version.

  • Model files: Model files are stored in the SavedModel format within a version sub-directory. A model service automatically loads the model files from the sub-directory that corresponds to the latest version.

Perform the following steps:

  1. Create a model storage directory in an OSS bucket (for example, oss://examplebucket/models/tf_serving/). For more information, see Manage directories.

  2. Upload the model files to the directory that you created in the previous step (you can use tf_serving.zip as a sample). The model storage directory has the following structure:

    tf_serving 
    ├── modelA
    │   └── 1
    │       ├── saved_model.pb
    │       └── variables
    │           ├── variables.data-00000-of-00001
    │           └── variables.index
    │
    ├── modelB
    │   ├── 1
    │   │   └── ...
    │   └── 2
    │       └── ...
    │
    └── modelC
        ├── 1
        │   └── ...
        ├── 2
        │   └── ...
        └── 3
            └── ...
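
For reference, model files in this layout can be produced with tf.saved_model.save by exporting directly into a numbered version sub-directory. The following is a minimal sketch; the model itself is a placeholder, and the local path mirrors the example directory structure above:

import tensorflow as tf

# A trivial example model; replace it with your own model definition.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Export the model in the SavedModel format into a numbered version sub-directory.
# TensorFlow Serving interprets the directory name ("1") as the model version.
tf.saved_model.save(model, "tf_serving/modelA/1")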

Model configuration file

A configuration file allows you to run multiple models within a single service. If you only need to deploy a single-model service, skip this step.

Create a configuration file following the instructions below and upload it to OSS (the example provided in the Model files section includes a model configuration file named model_config.pbtxt, which you can use or modify as needed). In this example, the model configuration file is uploaded to oss://examplebucket/models/tf_serving/.

The model configuration file, model_config.pbtxt, should contain the following:

model_config_list {
  config {
    name: 'modelA'
    base_path: '/models/modelA/'
    model_platform: 'tensorflow'
    model_version_policy {
      all: {}
    }
  }
  config {
    name: 'modelB'
    base_path: '/models/modelB/'
    model_platform: 'tensorflow'
    model_version_policy {
      specific {
        versions: 1
        versions: 2
      }
    }
    version_labels {
      key: 'stable'
      value: 1
    }
    version_labels {
      key: 'canary'
      value: 2
    }
  }
  config {
    name: 'modelC'
    base_path: '/models/modelC/'
    model_platform: 'tensorflow'
    model_version_policy {
      latest {
        num_versions: 2
      }
    }
  }
}
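
To catch syntax errors before you upload the file, you can optionally parse it with the protobuf definitions that ship with the tensorflow-serving-api Python package. A minimal sketch, assuming that package is installed and model_config.pbtxt is in the current directory:

from google.protobuf import text_format
from tensorflow_serving.config import model_server_config_pb2

# Parse the configuration file; text_format.Parse raises ParseError on syntax errors.
with open("model_config.pbtxt") as f:
    config = text_format.Parse(f.read(), model_server_config_pb2.ModelServerConfig())

# Print the parsed per-model configurations.
print(config.model_config_list)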

The following list describes the key parameters:

  • name (optional): The name of the model. We recommend that you specify this parameter. Otherwise, you cannot call the service later.

  • base_path (required): The path to the model directory within the service instance, which is used to read model files in subsequent steps. For example, if the mount directory is /models and the model directory to be loaded is /models/modelA, set this parameter to /models/modelA.

  • model_version_policy (optional): The policy for loading model versions.

    • If this parameter is omitted, the latest version is loaded by default.

    • all: {}: Loads all versions of the model. In the example, modelA loads all versions.

    • latest {}: Loads the latest N versions. In the example, modelC is set with num_versions: 2, which loads the two latest versions, versions 2 and 3.

    • specific {}: Loads specific versions. In the example, modelB loads versions 1 and 2.

  • version_labels (optional): Custom labels that identify model versions. Without version_labels, model versions can be distinguished only by their numbers, and the request path is /v1/models/<model name>/versions/<version number>:predict. If version_labels are set, you can use a version label to point to a specific version number: /v1/models/<model name>/labels/<version label>:predict.

    Note: By default, labels can be assigned only to model versions that have already been loaded and started as services. To assign labels to unloaded model versions, add --allow_version_labels_for_unavailable_models=true to the Command to Run. Scenario-based deployment does not support Command to Run; select custom deployment instead.
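
For example, with the sample configuration above, a client can target modelB either by version number or by the version labels that point to those versions. A minimal sketch in Python, where <service_url> stands for your service endpoint:

service_url = "<service_url>"

# Request version 2 of modelB directly by version number.
url_by_version = f"{service_url}/v1/models/modelB/versions/2:predict"

# Request the same version through the 'canary' label defined in version_labels.
url_by_label = f"{service_url}/v1/models/modelB/labels/canary:predict"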

Deploy the service

You can use one of the following methods to deploy a TensorFlow Serving model service.

  • Scenario-based deployment: suitable for basic deployment scenarios. You only need to configure a few parameters to deploy a TensorFlow Serving model service.

  • Custom deployment: suitable for model services that need to run in specific environments. You can configure settings based on your business requirements.

Important

TensorFlow Serving model services support ports 8501 and 8500:

  • 8501: The service launches an HTTP or REST server on port 8501 to receive HTTP requests.

  • 8500: The service launches a Google Remote Procedure Call (gRPC) server on port 8500 to receive gRPC requests.

By default, scenario-based deployment uses port 8501, which cannot be changed. To configure port 8500 for gRPC services, select custom deployment.

The following examples deploy modelA as a single-model service and deploy multiple models by using the model configuration file.

Scenario-based deployment

Perform the following steps:

  1. Log on to the PAI console. Select a region and a workspace. Then, click Enter Elastic Algorithm Service (EAS).

  2. On the Elastic Algorithm Service (EAS) page, click Deploy Service. In the Scenario-based Model Deployment section of the page that appears, click TensorFlow Serving Deployment.

  3. On the TFServing deployment page, configure the key parameters described below. For information about other parameters, see Deploy a model service in the PAI console.

    • Deployment Method: The following deployment methods are supported:

      • Standard Model Deployment: Deploys a service that uses a single model.

      • Configuration File Deployment: Deploys a service that uses multiple models.

    • Model Settings:

      • If you select Standard Model Deployment as the Deployment Method, specify the OSS path that contains the model files.

      • If you select Configuration File Deployment as the Deployment Method, configure the following parameters:

        • OSS: Select the OSS path in which the model files are stored.

        • Mount Path: The destination path within the service instance from which the model files are read.

        • Configuration File: Select the OSS path of the model configuration file.

    Example configurations:

    • Single-model example (deploy modelA):

      • Service Name: modela_scene

      • Deployment Method: Standard Model Deployment

      • Model Settings: OSS: oss://examplebucket/models/tf_serving/modelA/

    • Multi-model example:

      • Service Name: multi_scene

      • Deployment Method: Configuration File Deployment

      • Model Settings:

        • OSS: oss://examplebucket/models/tf_serving/

        • Mount Path: /models

        • Configuration File: oss://examplebucket/models/tf_serving/model_config.pbtxt

  4. Click Deploy.

Custom deployment

Perform the following steps:

  1. Log on to the PAI console. Select a region and a workspace. Then, click Enter Elastic Algorithm Service (EAS).

  2. On the Elastic Algorithm Service (EAS) page, click Deploy Service. In the Custom Model Deployment section, click Custom Deployment.

  3. On the Custom Deployment page, configure the key parameters described below. For information about other parameters, see Deploy a model service in the PAI console.

    • Image Configuration: Select a version of tensorflow-serving from Alibaba Cloud Image. We recommend that you use the latest version.

      Note: If the model service requires GPU resources, the image version must be in the x.xx.x-gpu format.

    • Model Settings: You can configure model files by using multiple methods. This example uses OSS.

      • OSS: Select the OSS path in which the model files are stored.

      • Mount Path: The path within the service instance from which the model files are read.

    • Run Command: The startup parameters for tensorflow-serving. When you select the tensorflow-serving image, the command /usr/bin/tf_serving_entrypoint.sh is preloaded. Configure the following parameters:

      Startup parameters for single-model deployment:

      • --model_name: The name of the model, which is used in the service request URL. Default value: model.

      • --model_base_path: The path to the model directory within the service instance. Default value: /models/model.

      Startup parameters for multi-model deployment:

      • --model_config_file: Required. The path to the model configuration file.

      • --model_config_file_poll_wait_seconds: Optional. The interval, in seconds, at which the service checks the model configuration file for updates. For example, --model_config_file_poll_wait_seconds=30 means that the service checks the file every 30 seconds.

        Note: When a new configuration file is detected, only the changes in the new file are applied. For example, if model A is removed from the new file and model B is added, the service unloads model A and loads model B.

      • --allow_version_labels_for_unavailable_models: Optional. Default value: false. Set this parameter to true to assign custom labels to unloaded model versions. For example, --allow_version_labels_for_unavailable_models=true.

    Example configurations (the single-model example deploys modelA; the multi-model example uses the model configuration file):

    • Deployment Method: Select Image-based Deployment.

    • Image Configuration: Choose Alibaba Cloud Image: tensorflow-serving:2.14.1.

    • Model Settings: Select OSS.

      • OSS: oss://examplebucket/models/tf_serving/

      • Mount Path: /models

    • Run Command:

      • Single-model example (deploy modelA): /usr/bin/tf_serving_entrypoint.sh --model_name=modelA --model_base_path=/models/modelA

      • Multi-model example: /usr/bin/tf_serving_entrypoint.sh --model_config_file=/models/model_config.pbtxt --model_config_file_poll_wait_seconds=30 --allow_version_labels_for_unavailable_models=true

    The default port number is 8501. The model service launches an HTTP or REST server on port 8501 to receive HTTP requests. If you want the service to support gRPC requests, perform the following operations:

    • In Environment Information, change the Port Number to 8500.

    • In Service Configuration, Enable gRPC.

    • In Edit Service Configuration, add the following configuration:

      "networking": {
          "path": "/"
      }
  4. Click Deploy.

Call a model service

You can send HTTP or gRPC requests to a model service based on the port number that you configure when you deploy the model service. The example below sends requests to modelA.

  1. Prepare Test Data

    ModelA is an image classification model trained on the Fashion-MNIST dataset, which consists of 28x28 grayscale images. The model predicts the likelihood that a sample belongs to each of ten categories. In this example, the test data for modelA service requests is the Python expression [[[[1.0]] * 28] * 28], as illustrated below.
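
    The following sketch (assuming NumPy is installed) shows the array shape that this expression produces:

    import numpy as np

    # One 28x28 grayscale test image with every pixel value set to 1.0.
    sample = [[[[1.0]] * 28] * 28]
    print(np.array(sample).shape)  # (1, 28, 28, 1)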

  2. Sample Request:

    HTTP request

    The service is configured to listen on port 8501 for HTTP requests. Below is a summary of the HTTP request paths for both single-model and multi-model deployments:

    • Single-model deployment:

      Path format: <service_url>/v1/models/<model_name>:predict

      Where:

      • For scenario-based deployment, <model_name> is predefined as model.

      • For custom deployment, <model_name> is the model name set in the Command to Run. Default value: model.

    • Multi-model deployment:

      The service supports requests with or without a specified model version. The respective path formats are:

      • For requests without a version (the latest version is loaded automatically):

        <service_url>/v1/models/<model_name>:predict

      • For requests that specify a model version:

        <service_url>/v1/models/<model_name>/versions/<version_num>:predict

      • For requests that use a version label, if version labels are configured:

        <service_url>/v1/models/<model_name>/labels/<version_label>:predict

      Here, <model_name> refers to the name set in the model configuration file.

    The <service_url> is the endpoint of your service. To view it, go to the Elastic Algorithm Service (EAS) page, click Invocation Method in the Service Type column of the desired service. View the endpoint on the Public Endpoint tab. This URL is pre-filled when using the console for online debugging.

    For instance, the HTTP request path for a scenario-based single model deployment of modelA is: <service_url>/v1/models/model:predict.
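
    In addition to the predict paths above, TensorFlow Serving's REST API also exposes a model status endpoint, GET /v1/models/<model_name>, which reports the versions of a model that are currently loaded. A minimal sketch that reuses the same <service_url> and token as the examples below:

    from urllib import request
    import json

    service_url = '<service_url>'
    token = '<test-token>'
    model_name = 'model'

    # Query the model status endpoint to list the loaded versions and their states.
    req = request.Request('{}/v1/models/{}'.format(service_url, model_name))
    req.add_header('authorization', token)
    print(json.loads(request.urlopen(req).read()))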

    Below are examples of how to send service requests using the console and Python code:

    Use the console

    Once the service is deployed, click Online Debugging in the Actions column. The <service_url> is pre-filled in the Request Parameter Online Tuning section. Append the path /v1/models/model:predict to the URL and enter the request data in the Body field:

    {"signature_name": "serving_default", "instances": [[[[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], 
[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0]]]]}

    After setting the parameters, click Send Request to view the response.


    Python code

    Sample Python code:

    from urllib import request
    import json
    
    # Replace the following values with your service endpoint and token.
    # You can view the endpoint and token on the Public Endpoint tab by clicking Invocation Method in the Service Type column of the service list.
    service_url = '<service_url>'
    token = '<test-token>'
    # For scenario-based single-model deployment, use model. For other cases, refer to the path description table above.
    model_name = "model"
    url = "{}/v1/models/{}:predict".format(service_url, model_name)
    
    # Create an HTTP request.
    req = request.Request(url, method="POST")
    req.add_header('authorization', token)
    data = {
        'signature_name': 'serving_default',
        'instances': [[[[1.0]] * 28] * 28]
    }
    
    # Send the request.
    response = request.urlopen(req, data=json.dumps(data).encode('utf-8')).read()
    
    # View the response.
    response = json.loads(response)
    print(response)

    gRPC request

    For gRPC requests, configure the service to use port 8500 and add the gRPC-related settings described in the custom deployment section. Below is sample Python code for sending gRPC requests:

    import grpc
    import tensorflow as tf
    from tensorflow_serving.apis import predict_pb2
    from tensorflow_serving.apis import prediction_service_pb2_grpc
    
    # The endpoint of the service. For more information, see the description of the host parameter below.
    host = "tf-serving-multi-grpc-test.166233998075****.cn-hangzhou.pai-eas.aliyuncs.com:80"
    # Replace <test-token> with the token of the service. You can view the token on the Public address call tab.
    token = "<test-token>"
    
    # The name of the model. For more information, see the description of the name parameter below.
    name = "<model_name>"
    signature_name = "serving_default"
    # Set the value to the model version that you want to use (an integer version number). You can specify only one model version in a request.
    version = "<version_num>"
    
    # Create a gRPC request.
    request = predict_pb2.PredictRequest()
    request.model_spec.name = name
    request.model_spec.signature_name = signature_name
    # model_spec.version is an integer field, so convert the placeholder string.
    request.model_spec.version.value = int(version)
    request.inputs["keras_tensor"].CopyFrom(tf.make_tensor_proto([[[[1.0]] * 28] * 28]))
    
    # Send the request.
    channel = grpc.insecure_channel(host)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    metadata = (("authorization", token),)
    response, _ = stub.Predict.with_call(request, metadata=metadata)
    
    print(response)
    

    The following describes the primary parameters:

    • host: The endpoint of the model service without the http:// prefix and with the :80 suffix. To obtain the endpoint, go to the Elastic Algorithm Service (EAS) page, find the model service, and then click Invocation Method in the Service Type column. On the Public Endpoint tab, view the endpoint of the model service.

    • name:

      • For single-model gRPC requests:

        • In scenario-based deployments, set name to model.

        • In custom deployments, set name to the model name specified in the Command to Run. If not set, it defaults to model.

      • For multi-model gRPC requests, set name to the model name defined in the model configuration file.

    • version: The model version that you want to use. You can specify only one model version in a request.

    • metadata: The token of the model service. You can view the token on the Public Endpoint tab.

References