This topic describes how to use Elastic Algorithm Service (EAS) of Platform for AI (PAI) for model inference when EasyRec is not used for model training. For more information, see EasyRec processor.
Background information
In some cases, you may have trained a model in another environment but still want to use the high-performance inference service that EAS provides.
Step 1: Prepare a model
Make sure that you have exported a TensorFlow model in SavedModel format and uploaded the model to an accessible storage location, such as a bucket of Alibaba Cloud Object Storage Service (OSS).
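Before you upload the model, you can verify that the export directory has the expected SavedModel layout. The following Python helper is a minimal sketch for a local pre-upload check (the path in the comment is hypothetical); it is not part of EAS or TensorFlow:

```python
import os

def looks_like_saved_model(export_dir: str) -> bool:
    """Return True if export_dir has the minimal SavedModel layout:
    a saved_model.pb graph file plus a variables/ subdirectory."""
    return (
        os.path.isfile(os.path.join(export_dir, "saved_model.pb"))
        and os.path.isdir(os.path.join(export_dir, "variables"))
    )

# Hypothetical local copy of the export directory, checked before uploading to OSS:
# looks_like_saved_model("./ali_rec_sln_acc_rnk/export/final")
```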
Step 2: Configure a service configuration file
Create a service configuration file, such as echo.json, and configure related parameters. The following code provides an example of the configuration file.
cat << EOF > echo.json
{
  "name": "ali_rec_rnk_no_fg",
  "metadata": {
    "instance": 2,
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 100
    }
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.g7.large",
      "instances": null
    }
  },
  "model_config": {
    "fg_mode": "bypass"
  },
  "processor": "easyrec-1.9",
  "processor_envs": [
    {
      "name": "INPUT_TILE",
      "value": "2"
    }
  ],
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final/"
      }
    }
  ],
  "warm_up_data_path": "oss://easyrec/ali_rec_sln_acc_rnk/rnk_warm_up.bin"
}
EOF
In the example, the value of the fg_mode field is set to bypass, which indicates that feature generation (FG) is not enabled and only the TensorFlow model is deployed.
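A typo in the heredoc, such as a stray quotation mark, produces invalid JSON that only fails at deployment time. A quick local check can catch such problems earlier. The following Python sketch assumes a locally generated echo.json and a minimal set of required top-level keys; the key list is an assumption for illustration, not the authoritative EAS schema:

```python
import json

# Assumed minimal set of top-level keys for this example; not the full EAS schema.
REQUIRED_KEYS = ("name", "metadata", "processor", "model_config")

def check_service_config(path: str) -> list:
    """Parse the config file and return the required keys that are missing.
    Raises json.JSONDecodeError if the file is not valid JSON."""
    with open(path) as f:
        cfg = json.load(f)
    return [key for key in REQUIRED_KEYS if key not in cfg]

# Usage: check_service_config("echo.json") returns [] when all keys are present.
```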
The following table describes the key parameters. For information about other parameters, see Parameters of model services.
| Parameter | Required | Description | Example |
| --- | --- | --- | --- |
| processor | Yes | The name of the EasyRec processor. | "processor": "easyrec" |
| fg_mode | Yes | The feature engineering mode. Set the value to bypass to deploy only the TensorFlow model without feature generation (FG), or to tf to enable FG. | "fg_mode": "tf" |
| outputs | Yes | The name of the output variable of the TensorFlow model, such as probs_ctr. Separate multiple names with commas (,). To obtain the names of the output variables, run the TensorFlow saved_model_cli command. | "outputs": "probs_ctr,probs_cvr" |
| save_req | No | Specifies whether to save request data files to the model directory. The files can be used for warmup and performance testing. Valid values: true and false. | "save_req": "false" |
| Parameters related to the item feature cache | | | |
| period | Yes | The interval at which item features are updated. Unit: minutes. Item features are also updated each time the service is updated. If updates occur only every few days, set this parameter to a value greater than one day, such as 2880. | "period": 2880 |
| remote_type | Yes | The data source of item features, such as Hologres. | "remote_type": "hologres" |
| tables | No | The item feature table. This parameter is required only when you set the remote_type parameter to hologres. For the fields of this parameter, see the example. To read item feature data from multiple tables, use the following format: "tables": [{"key":"table1", ...},{"key":"table2", ...}]. If the tables have duplicate columns, the column of a later table overwrites that of an earlier table. | "tables": { "key": "goods_id", "name": "public.ali_rec_item_feature" } |
| url | No | The endpoint for connecting to Hologres. | "url": "postgresql://LTAI****************:yourAccessKeySecret@hgprecn-cn-xxxxx-cn-hangzhou-vpc.hologres.aliyuncs.com:80/bigdata_rec" |
| Parameters related to FeatureStore | | | |
| fs_project | No | The name of the FeatureStore project. This parameter is required if you use FeatureStore. For more information, see Configure FeatureStore projects. | "fs_project": "fs_demo" |
| fs_model | No | The name of the model feature in FeatureStore. | "fs_model": "fs_rank_v1" |
| fs_entity | No | The name of the feature entity in FeatureStore. | "fs_entity": "item" |
| region | No | The region where the FeatureStore service is deployed. | "region": "cn-beijing" |
| access_key_id | No | The AccessKey ID used to access FeatureStore. | "access_key_id": "LTAI****************" |
| access_key_secret | No | The AccessKey secret used to access FeatureStore. | "access_key_secret": "yourAccessKeySecret" |
| load_feature_from_offlinestore | No | Specifies whether to obtain offline feature data from an offline data store in FeatureStore. Valid values: true and false. | "load_feature_from_offlinestore": true |
| Parameter related to automatic broadcasting | | | |
| INPUT_TILE | No | Enables automatic broadcasting for feature arrays. If all values of a feature (such as user_id) are the same in a request, you can specify the value once and it is duplicated to match the array length. | "processor_envs": [ { "name": "INPUT_TILE", "value": "2" } ] |
Step 3: Deploy the service
Use EASCMD to deploy the service configuration file created in the previous step.
# Run the deployment command.
eascmd create echo.json
# eascmd -i <AccessKeyID> -k <AccessKeySecret> -e <endpoint> create echo.json
# Run the update command.
eascmd modify ali_rec_rnk_no_fg -s echo.json
Check the output logs to make sure that the service is deployed. After the service is deployed, you can obtain the access address of the service.
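If you automate deployments, you can wrap EASCMD in a script. The following Python sketch only assembles the command line (dry run by default), so it assumes no EASCMD flags beyond the ones shown above:

```python
import subprocess

def eascmd(*args, dry_run=True):
    """Build an EASCMD invocation; execute it only when dry_run is False."""
    cmd = ["eascmd", *args]
    if dry_run:
        return cmd  # inspect or log the command without running it
    return subprocess.run(cmd, capture_output=True, text=True)

# The same commands as above, assembled programmatically:
# eascmd("create", "echo.json")
# eascmd("modify", "ali_rec_rnk_no_fg", "-s", "echo.json")
```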
Step 4: Call the service
Call the EasyRec model service
If you use the bypass mode, you can use the Java SDK or the Python SDK to call the model service based on the request format of the EasyRec processor.
Example of using the Java SDK
Before you use the Java SDK, you must configure the Maven environment. For information about how to configure the Maven environment, see SDK for Java. Sample code for calling the ali_rec_rnk_no_fg service:
import java.util.List;

import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.request.TFDataType;
import com.aliyun.openservices.eas.predict.request.TFRequest;
import com.aliyun.openservices.eas.predict.response.TFResponse;

public class TestEasyRec {
    public static TFRequest buildPredictRequest() {
        TFRequest request = new TFRequest();
        // The shape of each feed must match the number of values that you pass.
        request.addFeed("user_id", TFDataType.DT_STRING,
            new long[]{3}, new String[]{"u0001", "u0001", "u0001"});
        request.addFeed("age", TFDataType.DT_FLOAT,
            new long[]{3}, new float[]{18.0f, 18.0f, 18.0f});
        // Note: If you set the INPUT_TILE parameter to 2, you can simplify the code in the following manner:
        // request.addFeed("user_id", TFDataType.DT_STRING,
        //     new long[]{1}, new String[]{"u0001"});
        // request.addFeed("age", TFDataType.DT_FLOAT,
        //     new long[]{1}, new float[]{18.0f});
        request.addFeed("item_id", TFDataType.DT_STRING,
            new long[]{3}, new String[]{"i0001", "i0002", "i0003"});
        request.addFetch("probs");
        return request;
    }

    public static void main(String[] args) throws Exception {
        PredictClient client = new PredictClient(new HttpConfig());
        // Call setDirectEndpoint to access the service over a virtual private cloud (VPC) direct connection channel.
        // You must create the channel on the EAS page of the PAI console first.
        // Compared with using a gateway, a direct connection channel improves stability and performance.
        // client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
        client.setEndpoint("yourAccessKeySecretx.vpc.cn-hangzhou.pai-eas.aliyuncs.com");
        client.setModelName("ali_rec_rnk_no_fg");
        client.setToken("");
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < 100; i++) {
            try {
                TFResponse response = client.predict(buildPredictRequest());
                // probs: the name of the output field. You can run a cURL command to view the inputs and outputs of the model:
                // curl yourAccessKeySecretx.vpc.cn-hangzhou.pai-eas.aliyuncs.com -H "Authorization:{token}"
                List<Float> result = response.getFloatVals("probs");
                System.out.print("Predict Result: [");
                for (int j = 0; j < result.size(); j++) {
                    System.out.print(result.get(j).floatValue());
                    if (j != result.size() - 1) {
                        System.out.print(", ");
                    }
                }
                System.out.print("]\n");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");
        client.shutdown();
    }
}
Example of using the Python SDK
For more information about how to use the Python SDK, see SDK for Python. Due to its limited performance, we recommend that you use Python SDK only for debugging purposes. Sample code for calling the ali_rec_rnk_no_fg service:
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction.tf_request_pb2 import TFRequest

if __name__ == '__main__':
    client = PredictClient('http://yourAccessKeySecretx.vpc.cn-hangzhou.pai-eas.aliyuncs.com', 'ali_rec_rnk_no_fg')
    client.set_token('')
    client.init()
    req = TFRequest()
    # The shape of each feed must match the number of values that you pass.
    req.add_feed('user_id', [3], TFRequest.DT_STRING, ['u0001'] * 3)
    req.add_feed('age', [3], TFRequest.DT_FLOAT, [18.0] * 3)
    # Note: If you set the INPUT_TILE parameter to 2, you can simplify the code in the following way:
    # req.add_feed('user_id', [1], TFRequest.DT_STRING, ['u0001'])
    # req.add_feed('age', [1], TFRequest.DT_FLOAT, [18.0])
    req.add_feed('item_id', [3], TFRequest.DT_STRING,
                 ['i0001', 'i0002', 'i0003'])
    for x in range(0, 100):
        resp = client.predict(req)
        print(resp)
You can also create custom service requests. For more information, see Request syntax.
Step 5: Monitor and optimize service performance
After the service is deployed, we recommend that you run a performance test and tune the service for throughput, latency, and stability based on the results.
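As a starting point, you can measure client-side latency with a small concurrent driver. The sketch below is generic and hypothetical, not part of the EAS SDK: `predict` stands for any callable that sends one request, for example a closure around `client.predict(req)` from the Python SDK example above.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def benchmark(predict, requests=100, concurrency=4):
    """Call predict() `requests` times across `concurrency` threads and
    return simple latency statistics in milliseconds."""
    latencies = []

    def timed_call():
        start = time.perf_counter()
        predict()
        latencies.append(time.perf_counter() - start)  # list.append is thread-safe

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for _ in range(requests):
            pool.submit(timed_call)
    # Leaving the with-block waits for all submitted calls to finish.
    return {
        "avg_ms": statistics.mean(latencies) * 1000,
        "p50_ms": statistics.median(latencies) * 1000,
        "max_ms": max(latencies) * 1000,
    }

# Usage sketch: benchmark(lambda: client.predict(req), requests=100, concurrency=4)
```

Server-side metrics (QPS, queue size, instance load) remain the authoritative signal; this client-side view only complements them.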
Summary
Following the preceding steps, you can use EAS for model inference without using EasyRec for model training.