This topic describes how to use Elastic Algorithm Service (EAS) of Platform for AI (PAI) for model inference when EasyRec is not used for model training. For more information, see EasyRec processor.
Background information
In some cases, you may have trained a model in another environment but still want to use the high-performance inference service that EAS provides.
Step 1: Prepare a model
Make sure that you have exported a TensorFlow model in SavedModel format and uploaded the model to an accessible storage location, such as a bucket of Alibaba Cloud Object Storage Service (OSS).
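Before you upload the model, you can verify that the export directory has the expected SavedModel layout. The following Python helper is a minimal sketch for a local pre-upload check (the path in the comment is hypothetical); it is not part of EAS or TensorFlow:

```python
import os

def looks_like_saved_model(export_dir: str) -> bool:
    """Return True if export_dir has the minimal SavedModel layout:
    a saved_model.pb graph file plus a variables/ subdirectory."""
    return (
        os.path.isfile(os.path.join(export_dir, "saved_model.pb"))
        and os.path.isdir(os.path.join(export_dir, "variables"))
    )

# Hypothetical local copy of the export directory, checked before uploading to OSS:
# looks_like_saved_model("./ali_rec_sln_acc_rnk/export/final")
```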
Step 2: Configure a service configuration file
Create a service configuration file, such as echo.json, and configure related parameters. The following code provides an example of the configuration file.
cat << EOF > echo.json
{
  "name": "ali_rec_rnk_no_fg",
  "metadata": {
    "instance": 2,
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 100
    }
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.g7.large",
      "instances": null
    }
  },
  "model_config": {
    "fg_mode": "bypass"
  },
  "processor": "easyrec-1.9",
  "processor_envs": [
    {
      "name": "INPUT_TILE",
      "value": "2"
    }
  ],
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final/"
      }
    }
  ],
  "warm_up_data_path": "oss://easyrec/ali_rec_sln_acc_rnk/rnk_warm_up.bin"
}
EOF
In the example, the value of the fg_mode field is set to bypass, which indicates that feature generation (FG) is not enabled and only the TensorFlow model is deployed.
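A typo in the heredoc, such as a stray quotation mark, produces invalid JSON that only fails at deployment time. A quick local check can catch such problems earlier. The following Python sketch assumes a locally generated echo.json and a minimal set of required top-level keys; the key list is an assumption for illustration, not the authoritative EAS schema:

```python
import json

# Assumed minimal set of top-level keys for this example; not the full EAS schema.
REQUIRED_KEYS = ("name", "metadata", "processor", "model_config")

def check_service_config(path: str) -> list:
    """Parse the config file and return the required keys that are missing.
    Raises json.JSONDecodeError if the file is not valid JSON."""
    with open(path) as f:
        cfg = json.load(f)
    return [key for key in REQUIRED_KEYS if key not in cfg]

# Usage: check_service_config("echo.json") returns [] when all keys are present.
```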
The following table describes the key parameters. For information about other parameters, see Parameters of model services.
| Parameter | Required | Description | Example |
| --- | --- | --- | --- |
| processor | Yes | The name of the EasyRec processor. | "processor": "easyrec" |
| fg_mode | Yes | The feature engineering mode. Set the value to bypass to deploy only the TensorFlow model without feature generation (FG), or to tf to enable FG. | "fg_mode": "tf" |
| outputs | Yes | The name of the output variable of the TensorFlow model, such as probs_ctr. Separate multiple names with commas (,). To obtain the names of the output variables, run the TensorFlow saved_model_cli command. | "outputs": "probs_ctr,probs_cvr" |
| save_req | No | Specifies whether to save request data files to the model directory. The files can be used for warmup and performance testing. Valid values: true and false. | "save_req": "false" |
| Parameters related to the item feature cache | | | |
| period | Yes | The interval at which item features are updated. Unit: minutes. Item features are also updated each time the service is updated. If updates occur only every few days, set this parameter to a value greater than one day, such as 2880. | "period": 2880 |
| remote_type | Yes | The data source of item features, such as Hologres. | "remote_type": "hologres" |
| tables | No | The item feature table. This parameter is required only when you set the remote_type parameter to hologres. For the fields of this parameter, see the example. To read item feature data from multiple tables, use the following format: "tables": [{"key":"table1", ...},{"key":"table2", ...}]. If the tables have duplicate columns, the column of a later table overwrites that of an earlier table. | "tables": { "key": "goods_id", "name": "public.ali_rec_item_feature" } |
| url | No | The endpoint for connecting to Hologres. | "url": "postgresql://LTAI****************:yourAccessKeySecret@hgprecn-cn-xxxxx-cn-hangzhou-vpc.hologres.aliyuncs.com:80/bigdata_rec" |
| Parameters related to FeatureStore | | | |
| fs_project | No | The name of the FeatureStore project. This parameter is required if you use FeatureStore. For more information, see Configure FeatureStore projects. | "fs_project": "fs_demo" |
| fs_model | No | The name of the model feature in FeatureStore. | "fs_model": "fs_rank_v1" |
| fs_entity | No | The name of the feature entity in FeatureStore. | "fs_entity": "item" |
| region | No | The region where the FeatureStore service is deployed. | "region": "cn-beijing" |
| access_key_id | No | The AccessKey ID used to access FeatureStore. | "access_key_id": "LTAI****************" |
| access_key_secret | No | The AccessKey secret used to access FeatureStore. | "access_key_secret": "yourAccessKeySecret" |
| load_feature_from_offlinestore | No | Specifies whether to obtain offline feature data from an offline data store in FeatureStore. Valid values: true and false. | "load_feature_from_offlinestore": true |
| Parameter related to automatic broadcasting | | | |
| INPUT_TILE | No | Enables automatic broadcasting for feature arrays. If all values of a feature (such as user_id) are the same in a request, you can specify the value once and it is duplicated to match the array length. | "processor_envs": [ { "name": "INPUT_TILE", "value": "2" } ] |
Step 3: Deploy the service
Use EASCMD to deploy the service configuration file created in the previous step.
# Run the deployment command.
eascmd create echo.json
# eascmd -i <AccessKeyID> -k <AccessKeySecret> -e <endpoint> create echo.json
# Run the update command.
eascmd modify ali_rec_rnk_no_fg -s echo.json
Check the output logs to make sure that the service is deployed. After the service is deployed, you can obtain the access address of the service.
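If you automate deployments, you can wrap EASCMD in a script. The following Python sketch only assembles the command line (dry run by default), so it assumes no EASCMD flags beyond the ones shown above:

```python
import subprocess

def eascmd(*args, dry_run=True):
    """Build an EASCMD invocation; execute it only when dry_run is False."""
    cmd = ["eascmd", *args]
    if dry_run:
        return cmd  # inspect or log the command without running it
    return subprocess.run(cmd, capture_output=True, text=True)

# The same commands as above, assembled programmatically:
# eascmd("create", "echo.json")
# eascmd("modify", "ali_rec_rnk_no_fg", "-s", "echo.json")
```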
Step 4: Call the service
Call the EasyRec model service
If you use the bypass mode, you can use the Java SDK or the Python SDK to call the model service based on the request format of the EasyRec processor.
Example of using the Java SDK
Before you use the Java SDK, you must configure the Maven environment. For information about how to configure the Maven environment, see SDK for Java. Sample code for calling the ali_rec_rnk_no_fg service:
import java.util.List;

import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.request.TFDataType;
import com.aliyun.openservices.eas.predict.request.TFRequest;
import com.aliyun.openservices.eas.predict.response.TFResponse;

public class TestEasyRec {
    public static TFRequest buildPredictRequest() {
        TFRequest request = new TFRequest();
        // The shape of each feed must match the number of values that you pass.
        request.addFeed("user_id", TFDataType.DT_STRING,
            new long[]{3}, new String[]{"u0001", "u0001", "u0001"});
        request.addFeed("age", TFDataType.DT_FLOAT,
            new long[]{3}, new float[]{18.0f, 18.0f, 18.0f});
        // Note: If you set the INPUT_TILE parameter to 2, you can simplify the code in the following manner:
        // request.addFeed("user_id", TFDataType.DT_STRING,
        //     new long[]{1}, new String[]{"u0001"});
        // request.addFeed("age", TFDataType.DT_FLOAT,
        //     new long[]{1}, new float[]{18.0f});
        request.addFeed("item_id", TFDataType.DT_STRING,
            new long[]{3}, new String[]{"i0001", "i0002", "i0003"});
        request.addFetch("probs");
        return request;
    }

    public static void main(String[] args) throws Exception {
        PredictClient client = new PredictClient(new HttpConfig());
        // Call setDirectEndpoint to access the service over a virtual private cloud (VPC) direct connection channel.
        // You must create the channel on the EAS page of the PAI console first.
        // Compared with using a gateway, a direct connection channel improves stability and performance.
        // client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
        client.setEndpoint("yourAccessKeySecretx.vpc.cn-hangzhou.pai-eas.aliyuncs.com");
        client.setModelName("ali_rec_rnk_no_fg");
        client.setToken("");
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < 100; i++) {
            try {
                TFResponse response = client.predict(buildPredictRequest());
                // probs: the name of the output field. You can run a cURL command to view the inputs and outputs of the model:
                // curl yourAccessKeySecretx.vpc.cn-hangzhou.pai-eas.aliyuncs.com -H "Authorization:{token}"
                List<Float> result = response.getFloatVals("probs");
                System.out.print("Predict Result: [");
                for (int j = 0; j < result.size(); j++) {
                    System.out.print(result.get(j).floatValue());
                    if (j != result.size() - 1) {
                        System.out.print(", ");
                    }
                }
                System.out.print("]\n");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");
        client.shutdown();
    }
}
Example of using the Python SDK
For more information about how to use the Python SDK, see SDK for Python. Due to its limited performance, we recommend that you use Python SDK only for debugging purposes. Sample code for calling the ali_rec_rnk_no_fg service:
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction.tf_request_pb2 import TFRequest

if __name__ == '__main__':
    client = PredictClient('http://yourAccessKeySecretx.vpc.cn-hangzhou.pai-eas.aliyuncs.com', 'ali_rec_rnk_no_fg')
    client.set_token('')
    client.init()
    req = TFRequest()
    # The shape of each feed must match the number of values that you pass.
    req.add_feed('user_id', [3], TFRequest.DT_STRING, ['u0001'] * 3)
    req.add_feed('age', [3], TFRequest.DT_FLOAT, [18.0] * 3)
    # Note: If you set the INPUT_TILE parameter to 2, you can simplify the code in the following way:
    # req.add_feed('user_id', [1], TFRequest.DT_STRING, ['u0001'])
    # req.add_feed('age', [1], TFRequest.DT_FLOAT, [18.0])
    req.add_feed('item_id', [3], TFRequest.DT_STRING,
                 ['i0001', 'i0002', 'i0003'])
    for x in range(0, 100):
        resp = client.predict(req)
        print(resp)
You can also create custom service requests. For more information, see Request syntax.
Step 5: Monitor and optimize service performance
After the service is deployed, we recommend that you run a performance test and tune the service for throughput, latency, and stability based on the results.
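As a starting point, you can measure client-side latency with a small concurrent driver. The sketch below is generic and hypothetical, not part of the EAS SDK: `predict` stands for any callable that sends one request, for example a closure around `client.predict(req)` from the Python SDK example above.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def benchmark(predict, requests=100, concurrency=4):
    """Call predict() `requests` times across `concurrency` threads and
    return simple latency statistics in milliseconds."""
    latencies = []

    def timed_call():
        start = time.perf_counter()
        predict()
        latencies.append(time.perf_counter() - start)  # list.append is thread-safe

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for _ in range(requests):
            pool.submit(timed_call)
    # Leaving the with-block waits for all submitted calls to finish.
    return {
        "avg_ms": statistics.mean(latencies) * 1000,
        "p50_ms": statistics.median(latencies) * 1000,
        "max_ms": max(latencies) * 1000,
    }

# Usage sketch: benchmark(lambda: client.predict(req), requests=100, concurrency=4)
```

Server-side metrics (QPS, queue size, instance load) remain the authoritative signal; this client-side view only complements them.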
Summary
Following the preceding steps, you can use EAS for model inference without using EasyRec for model training.