EasyRec - Platform for AI - Alibaba Cloud Documentation Center

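Elastic Algorithm Service (EAS) of Platform for AI (PAI) provides a built-in EasyRec processor. The processor supports deploying EasyRec or TensorFlow recommendation models as scoring services and integrates feature engineering capabilities. You can use the EasyRec processor to deploy high-performance scoring services that are optimized for both feature engineering and TensorFlow models. This topic describes how to deploy and call an EasyRec model service.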

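Background information

The following figure shows the architecture of a recommendation engine based on the EasyRec processor.

The EasyRec processor includes the following modules:

Item Feature Cache: This module caches features from FeatureStore in memory, which reduces the load that frequent requests place on FeatureStore. The item feature cache supports incremental updates, such as real-time feature updates.
Feature Generator (FG): This module uses the same implementation for real-time and offline feature engineering to ensure consistency. FG is designed based on extensive experience from Taobao.
TFModel: This module uses TensorFlow to load the SavedModel exported by EasyRec, and uses Blade to optimize inference on both CPUs and GPUs.
Feature tracking and incremental update: In most cases, these modules are used for real-time training. For more information, see "Online Deep Learning".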

Limits

The EasyRec processor can be used on GPU devices of the T4, A10, 3090, and 4090 types, and on general-purpose Elastic Compute Service (ECS) instance families that use Intel CPUs, such as g6, g7, and g8.

Processor versions

The EasyRec processor is continuously improved. Later versions provide enhanced features and better inference performance. For optimal results, we recommend that you use the latest version to deploy inference services. The following table lists the released versions and their basic information.

Processor name	Release date	TensorFlow version	New features
easyrec	20230608	2.10	Adds the Feature Generator and Item Feature Cache modules. Supports online deep learning. Supports Faiss vector recall. Supports GPU inference.
easyrec-1.2	20230721	2.10	Improves weighted category embedding.
easyrec-1.3	20230802	2.10	Supports loading item features from MaxCompute into the item feature cache.
easyrec-1.6	20231006	2.10	Supports automatic broadcasting of item features. Improves GPU placement. Supports saving requests to the model directory.
easyrec-1.7	20231013	2.10	Improves the performance of Keras models.
easyrec-1.8	20231101	2.10	Supports the cloud version of FeatureStore.
easyrec-kv-1.8	20231220	DeepRec (deeprec2310)	Supports DeepRec EmbeddingVariable.
easyrec-1.9	20231222	2.10	Fixes graph optimization issues with TagFeature and RawFeature.
easyrec-2.4	20240826	2.10	FeatureStore SDK for Cpp supports FeatureDB. FeatureStore SDK for Cpp supports STS tokens. Requests support the double (float64) type.

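Step 1: Deploy a service

When you use the EASCMD client to deploy an EasyRec model service, you must set the processor type to easyrec-{version}. For more information, see "Deploy model services by using EASCMD or DSW". The following code provides examples of the configuration file.

Sample code (FG enabled)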

bizdate=$1
cat << EOF > echo.json
{
  "name":"ali_rec_rnk_with_fg",
  "metadata": {
    "instance": 2,
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 100
    }
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.g7.large",
      "instances": null
    }
  },
  "model_config": {
    "remote_type": "hologres",
    "url": "postgresql://<AccessKeyID>:<AccessKeySecret>@<DomainName>:<port>/<database>",
    "tables": [{"name":"<schema>.<table_name>","key":"<index_column_name>","value": "<column_name>"}],
    "period": 2880,
    "fg_mode": "tf",
    "outputs":"probs_ctr,probs_cvr"
  },
  "model_path": "",
  "processor": "easyrec-1.9",
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final_with_fg"
      }
    }
  ]
}

EOF
# Run the deployment command. 
eascmd  create echo.json
# eascmd -i <AccessKeyID>  -k  <AccessKeySecret>   -e <endpoint> create echo.json
# Run the update command.
eascmd update ali_rec_rnk_with_fg -s echo.json

Sample code (FG disabled)

bizdate=$1
cat << EOF > echo.json
{
  "name":"ali_rec_rnk_no_fg",
  "metadata": {
    "instance": 2,
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 100
    }
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.g7.large",
      "instances": null
    }
  },
  "model_config": {
    "fg_mode": "bypass"
  },
  "processor": "easyrec-1.9",
  "processor_envs": [
    {
      "name": "INPUT_TILE",
      "value": "2"
    }
  ],
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final/"
      }
    }
  ],
  "warm_up_data_path": "oss://easyrec/ali_rec_sln_acc_rnk/rnk_warm_up.bin"
}

EOF
# Run the deployment command. 
eascmd  create echo.json
# eascmd -i <AccessKeyID>  -k  <AccessKeySecret>   -e <endpoint> create echo.json
# Run the update command.
eascmd update ali_rec_rnk_no_fg -s echo.json
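
Hand-written heredoc JSON is easy to break with a stray quote or trailing comma. As a minimal sketch, the configuration file can instead be generated from a Python dictionary so that the output is always valid JSON. The helper name build_service_config is hypothetical and not part of EASCMD, and the field values are copied from the samples above with some optional fields omitted.

```python
import json


def build_service_config(name, processor, model_oss_path, fg_mode, model_config=None):
    """Return an EAS service description as a dict (hypothetical helper)."""
    if fg_mode not in ("tf", "bypass"):
        raise ValueError("fg_mode must be 'tf' or 'bypass'")
    # fg_mode always goes into model_config; extra keys are merged on top.
    merged_model_config = {"fg_mode": fg_mode}
    merged_model_config.update(model_config or {})
    return {
        "name": name,
        "metadata": {
            "instance": 2,
            "rpc": {"enable_jemalloc": 1, "max_queue_size": 100},
        },
        "cloud": {"computing": {"instance_type": "ecs.g7.large"}},
        "model_config": merged_model_config,
        "processor": processor,
        "storage": [
            {
                "mount_path": "/home/admin/docker_ml/workspace/model/",
                "oss": {"path": model_oss_path},
            }
        ],
    }


if __name__ == "__main__":
    config = build_service_config(
        name="ali_rec_rnk_no_fg",
        processor="easyrec-1.9",
        model_oss_path="oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final/",
        fg_mode="bypass",
    )
    # Redirect this output to echo.json, then run: eascmd create echo.json
    print(json.dumps(config, indent=2))
```

Because json.dumps serializes a checked Python structure, syntax errors in the file are ruled out before eascmd ever reads it.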

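The following table describes the key parameters. For information about other parameters, see "Parameters of model services".

Parameter	Required	Description	Example
processor	Yes	The name of the EasyRec processor.	`"processor": "easyrec"`
fg_mode	Yes	The feature engineering mode. Valid values: tf: FG is enabled. In this mode, FG is embedded into the TensorFlow graph as an operator, and the graph is optimized to improve model performance. bypass: FG is disabled. In this mode, only the TensorFlow model is deployed. This mode suits scenarios in which you need to process features with custom logic. If you use this mode, you do not need to configure the parameters related to the item feature cache and FeatureStore.	`"fg_mode": "tf"`
outputs	Yes	The names of the output variables of the TensorFlow model. Example: probs_ctr. Separate multiple names with commas (,). To obtain the names of the output variables, run the TensorFlow command saved_model_cli.	"outputs":"probs_ctr,probs_cvr"
save_req	No	Specifies whether to save the returned data files to the model directory. These files can be used for warm-up and performance testing. Valid values: true: The returned data files are saved to the model directory. false (default): The returned data files are not saved to the model directory. To optimize performance, we recommend that you set this parameter to false in production environments.	"save_req": "false"
Parameters related to the item feature cache
period	Yes	The interval at which item features are updated. Unit: minutes. If item features are updated every few days, set this parameter to a value greater than one day, such as 2880. In this case, item features are refreshed when the service is updated each day.	`"period": 2880`
remote_type	Yes	The data source of item features. Valid values: hologres: Data is read from and written to a Hologres instance through SQL interfaces. This method suits the storage and querying of large amounts of data. none: Item features are passed in the request instead of being fetched from the item feature cache. If you set this parameter to none, set the tables parameter to [].	`"remote_type": "hologres"`
tables	No	The item feature tables. This parameter is required only when remote_type is set to hologres. This parameter contains the following fields: key: Required. The name of the item_id column. name: Required. The name of the feature table. value: Optional. The names of the columns to load. Separate multiple column names with commas (,). condition: Optional. A WHERE clause that filters items. Example: `style_id<10000` timekey: Optional. Specifies the time key that is used for incremental updates of item features. Supported data types: timestamp and int. static: Optional. Specifies that the table contains static item features that do not require periodic updates. To read item feature data from multiple tables, configure this parameter in the following format: `"tables": [{"key":"table1", ...},{ "key":"table2", ...}]` If the tables contain duplicate columns, the columns of a later table overwrite those of an earlier table.	`"tables": {"key": "goods_id", "name": "public.ali_rec_item_feature"}`
url	No	The endpoint that is used to connect to Hologres.	`"url": "postgresql://LTAIXXXXX:J6geXXXXXX@hgprecn-cn-xxxxx-cn-hangzhou-vpc.hologres.aliyuncs.com:80/bigdata_rec"`
Parameters related to FeatureStore
fs_project	No	The name of the FeatureStore project. This parameter is required if you use FeatureStore. For more information, see "Configure FeatureStore projects".	"fs_project": "fs_demo"
fs_model	No	The name of the model feature in FeatureStore.	"fs_model": "fs_rank_v1"
fs_entity	No	The name of the feature entity in FeatureStore.	"fs_entity": "item"
region	No	The region in which the FeatureStore project is deployed.	"region": "cn-beijing"
access_key_id	No	The AccessKey ID that is used to access FeatureStore.	"access_key_id": "xxxxx"
access_key_secret	No	The AccessKey secret that is used to access FeatureStore.	"access_key_secret": "xxxxx"
load_feature_from_offlinestore	No	Specifies whether offline features are read from FeatureStore OfflineStore. Valid values: True: Offline features are read from FeatureStore OfflineStore. False (default): Offline features are read from FeatureStore OnlineStore.	"load_feature_from_offlinestore": True
Parameters related to automatic broadcasting
INPUT_TILE	No	Enables automatic broadcasting for item feature arrays. If the value of a feature, such as user_id, is the same for every item in a request, you can specify the value once and it is replicated into an array. Automatic broadcasting reduces the request size, network transfer time, and computation time. To enable automatic broadcasting, set INPUT_TILE to 2. Note: This parameter is supported in easyrec-1.3 and later. If you set fg_mode to tf, automatic broadcasting is enabled by default and you do not need to set this parameter.	"processor_envs": [ { "name": "INPUT_TILE", "value": "2" } ]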
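
To make the effect of automatic broadcasting concrete, the following Python sketch expands single-value features to the number of items in a request. It is a conceptual illustration only: tile_features is a hypothetical name, and the processor's actual implementation is not shown in this topic.

```python
def tile_features(feeds, num_items):
    """Expand every length-1 feature list to num_items entries.

    feeds maps a feature name to a list of values. Features that already
    carry one value per item are passed through unchanged.
    """
    tiled = {}
    for name, values in feeds.items():
        if len(values) == 1:
            # Same value for every item: replicate it.
            tiled[name] = list(values) * num_items
        elif len(values) == num_items:
            tiled[name] = list(values)
        else:
            raise ValueError(
                f"feature {name!r} has {len(values)} values, expected 1 or {num_items}"
            )
    return tiled


if __name__ == "__main__":
    request = {
        "user_id": ["u0001"],                    # sent once
        "age": [18.0],                           # sent once
        "item_id": ["i0001", "i0002", "i0003"],  # one value per item
    }
    print(tile_features(request, num_items=3))
```

With broadcasting enabled, the client sends only the compact form, which is why the request size and transfer time drop as the number of items grows.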

Parameters for inference optimization of the EasyRec processor

Parameter	Required	Description	Example
TF_XLA_FLAGS	No	This parameter applies only to models that run on GPU devices. The Accelerated Linear Algebra (XLA) compiler framework can automatically fuse operators to compile and optimize the model.	"processor_envs": [ { "name": "TF_XLA_FLAGS", "value": "--tf_xla_auto_jit=2" }, { "name": "XLA_FLAGS", "value": "--xla_gpu_cuda_data_dir=/usr/local/cuda/" }, { "name": "XLA_ALIGN_SIZE", "value": "64" } ]
TensorFlow scheduling parameters	No	inter_op_parallelism_threads: the number of threads that are used to run different operations in parallel. intra_op_parallelism_threads: the number of threads that are used to run a single operation. If you use a 32-core CPU, set both fields to 16 to improve performance.	"model_config": { "inter_op_parallelism_threads": 16, "intra_op_parallelism_threads": 16 }
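
The inference optimization parameters in this section go into two different places in the service configuration: environment variables belong in processor_envs, and thread settings belong in model_config. The following Python sketch applies both to an existing configuration dictionary. The helper name with_inference_tuning is hypothetical, and the thread count must be chosen by the caller because this topic only gives the 32-core example.

```python
import copy

# Environment variables taken from the XLA example in this section.
XLA_ENVS = [
    {"name": "TF_XLA_FLAGS", "value": "--tf_xla_auto_jit=2"},
    {"name": "XLA_FLAGS", "value": "--xla_gpu_cuda_data_dir=/usr/local/cuda/"},
    {"name": "XLA_ALIGN_SIZE", "value": "64"},
]


def with_inference_tuning(config, threads=None, use_xla=False):
    """Return a copy of config with tuning parameters applied (hypothetical helper)."""
    tuned = copy.deepcopy(config)
    if threads is not None:
        model_config = tuned.setdefault("model_config", {})
        model_config["inter_op_parallelism_threads"] = threads
        model_config["intra_op_parallelism_threads"] = threads
    if use_xla:
        envs = tuned.setdefault("processor_envs", [])
        existing = {env["name"] for env in envs}
        # Keep entries that are already present, such as INPUT_TILE.
        envs.extend(copy.deepcopy(env) for env in XLA_ENVS if env["name"] not in existing)
    return tuned


if __name__ == "__main__":
    base = {
        "model_config": {"fg_mode": "bypass"},
        "processor_envs": [{"name": "INPUT_TILE", "value": "2"}],
    }
    # 16 threads matches the 32-core CPU example; enable XLA only on GPU instances.
    print(with_inference_tuning(base, threads=16, use_xla=True))
```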

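Step 2: Call the service

After you deploy the EasyRec model service, go to the Elastic Algorithm Service (EAS) page. On this page, click Invocation Method in the Service Type column to view the endpoint and token of the service.

The input and output of the EasyRec model service are in the Protocol Buffers (protobuf) format. Call the service in one of the following ways, depending on whether FG is enabled.

Sample code when FG is enabled `(fg_mode=tf)`

SDK for Java

Before you use SDK for Java, you must configure the Maven environment. For information about how to configure the Maven environment, see "SDK for Java". Sample code for calling the ali_rec_rnk_with_fg service: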

import java.util.Map;

import com.aliyun.openservices.eas.predict.http.*;
import com.aliyun.openservices.eas.predict.request.EasyRecRequest;

PredictClient client = new PredictClient(new HttpConfig());
// Specify the endpoint of the service that you want to call. The endpoint starts with your user ID. 
client.setEndpoint("xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com");
client.setModelName("ali_rec_rnk_with_fg");
// Specify the token of the service. 
client.setToken("******");

EasyRecRequest easyrecRequest = new EasyRecRequest(separator);
// userFeatures: Specify multiple user features at the same time. Separate multiple user features with \u0002 (CTRL_B). For each user feature, separate the feature name and feature value with a colon (:). 
//  user_fea0:user_fea0_val\u0002user_fea1:user_fea1_val
// For more information about the feature value format, visit https://easyrec.readthedocs.io/en/latest/feature/rtp_fg.html.
easyrecRequest.appendUserFeatureString(userFeatures);
// Alternatively, add one user feature at a time.
// easyrecRequest.addUserFeature(String userFeaName, T userFeaValue). 
// T: the type of the feature value. Valid values: String, float, long, and int. 

// contextFeatures: Specify multiple context features at the same time. Separate multiple context features with \u0002 (CTRL_B). For each context feature, separate the feature name and feature value with a colon (:). 
//   ctxt_fea0:ctxt_fea0_ival0:ctxt_fea0_ival1:ctxt_fea0_ival2\u0002ctxt_fea1:ctxt_fea1_ival0:ctxt_fea1_ival1:ctxt_fea1_ival2
easyrecRequest.appendContextFeatureString(contextFeatures);
// Alternatively, add one context feature at a time.
// easyrecRequest.addContextFeature(String ctxtFeaName, List<Object> ctxtFeaValue). 
// Valid data types of ctxtFeaValue: String, Float, Long, and Integer. 

// itemIdStr: the list of item IDs to be predicted. Separate multiple item IDs with commas (,). 
easyrecRequest.appendItemStr(itemIdStr, ",");
// Alternatively, add one item ID at a time.
// easyrecRequest.appendItemId(String itemId)

PredictProtos.PBResponse response = client.predict(easyrecRequest);

for (Map.Entry<String, PredictProtos.Results> entry : response.getResultsMap().entrySet()) {
    String key = entry.getKey();
    PredictProtos.Results value = entry.getValue();
    System.out.print("key: " + key);
    for (int i = 0; i < value.getScoresCount(); i++) {
        System.out.format("value: %.6g\n", value.getScores(i));
    }
}

// Obtain the features processed by FG to compare with the offline features. 
// Set DebugLevel to 1 to return the generated features. 
easyrecRequest.setDebugLevel(1);
// Reuse the response variable that is declared above.
response = client.predict(easyrecRequest);
Map<String, String> genFeas = response.getGenerateFeaturesMap();
for(String itemId: genFeas.keySet()) {
    System.out.println(itemId);
    System.out.println(genFeas.get(itemId));
}
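
The comments in the Java sample describe the string format that appendUserFeatureString and appendContextFeatureString expect. As a language-neutral illustration, the following Python sketch assembles such strings from dictionaries. The helper names are hypothetical and not part of any SDK, and the sketch assumes that feature values contain neither a colon nor the \u0002 separator.

```python
FEATURE_SEP = "\u0002"  # CTRL_B, separates features from each other


def build_user_features(features):
    """Join user features as name:value pairs separated by CTRL_B."""
    return FEATURE_SEP.join(f"{name}:{value}" for name, value in features.items())


def build_context_features(features):
    """Join context features as name:val0:val1:... separated by CTRL_B.

    Each context feature carries one value per item, in item order.
    """
    parts = []
    for name, values in features.items():
        parts.append(":".join([name] + [str(value) for value in values]))
    return FEATURE_SEP.join(parts)


if __name__ == "__main__":
    user = build_user_features({"user_fea0": "user_fea0_val", "user_fea1": "user_fea1_val"})
    ctx = build_context_features({"ctxt_fea0": [1, 2, 3], "ctxt_fea1": ["a", "b", "c"]})
    print(repr(user))
    print(repr(ctx))
```

The resulting strings can be passed unchanged to the Java methods named above.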

SDK for Python

For more information about how to use SDK for Python, see "SDK for Python". We recommend that you use SDK for Java in production environments. Sample code for calling the ali_rec_rnk_with_fg service:

from eas_prediction import PredictClient

from eas_prediction.easyrec_request import EasyRecRequest
from eas_prediction.easyrec_predict_pb2 import PBFeature
from eas_prediction.easyrec_predict_pb2 import PBRequest

if __name__ == '__main__':
    endpoint = 'http://xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com'
    service_name = 'ali_rec_rnk_with_fg'
    token = '******'

    client = PredictClient(endpoint, service_name)
    client.set_token(token)
    client.init()

    req = PBRequest()
    # user_features is a protobuf map with message values. Such entries cannot
    # be assigned directly, so set the fields on each entry in place.
    req.user_features['user_id'].string_feature = 'u0001'
    req.user_features['age'].int_feature = 12
    req.user_features['weight'].float_feature = 129.8

    req.item_ids.extend(['item_0001', 'item_0002', 'item_0003'])
    
    easyrec_req = EasyRecRequest()
    easyrec_req.add_feed(req, debug_level=0)
    res = client.predict(easyrec_req)
    print(res)

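Parameters that you need to set:

endpoint: the endpoint of the service that you want to call. The endpoint starts with your user ID. To obtain the endpoint, go to the Elastic Algorithm Service (EAS) page, find the service, and click Invocation Method in the Service Type column.
service_name: the name of the service. You can obtain the service name on the Elastic Algorithm Service (EAS) page.
token: the token of the service. You can obtain the token in the Invocation Method dialog box.

Sample code when FG is disabled `(fg_mode=bypass)`

SDK for Java

Before you use SDK for Java, you must configure the Maven environment. For information about how to configure the Maven environment, see "SDK for Java". Sample code for calling the ali_rec_rnk_no_fg service: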

import java.util.List;

import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.request.TFDataType;
import com.aliyun.openservices.eas.predict.request.TFRequest;
import com.aliyun.openservices.eas.predict.response.TFResponse;

public class TestEasyRec {
    public static TFRequest buildPredictRequest() {
        TFRequest request = new TFRequest();
 
        // The shape must match the number of values: three items are scored per request.
        request.addFeed("user_id", TFDataType.DT_STRING, 
                        new long[]{3}, new String []{ "u0001", "u0001", "u0001"});
        request.addFeed("age", TFDataType.DT_FLOAT, 
                        new long[]{3}, new float []{ 18.0f, 18.0f, 18.0f});
        // Note: If you set INPUT_TILE to 2, you can simplify the code in the following way:
        //    request.addFeed("user_id", TFDataType.DT_STRING,
        //            new long[]{1}, new String []{ "u0001" });
        //    request.addFeed("age", TFDataType.DT_FLOAT, 
        //            new long[]{1}, new float []{ 18.0f});
        request.addFeed("item_id", TFDataType.DT_STRING, 
                        new long[]{3}, new String []{ "i0001", "i0002", "i0003"});  
        request.addFetch("probs");
        return request;
    }

    public static void main(String[] args) throws Exception {
        PredictClient client = new PredictClient(new HttpConfig());

        // Call setDirectEndpoint to access the service by using a virtual private cloud (VPC) direct connection channel. 
        //   client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
        // You need to create a VPC direct connection channel on the EAS page of the PAI console. 
        // Compared with using a gateway, using the direct connection channel improves stability and performance. 
        client.setEndpoint("xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com");
        client.setModelName("ali_rec_rnk_no_fg");
        client.setToken("");
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < 100; i++) {
            try {
                TFResponse response = client.predict(buildPredictRequest());
                // probs: the name of the output field. You can run the cURL command to view the input and output of the model.
                //   curl xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com -H "Authorization:{token}"
                List<Float> result = response.getFloatVals("probs");
                System.out.print("Predict Result: [");
                for (int j = 0; j < result.size(); j++) {
                    System.out.print(result.get(j).floatValue());
                    if (j != result.size() - 1) {
                        System.out.print(", ");
                    }
                }
                System.out.print("]\n");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");
        client.shutdown();
    }
}

SDK for Python

For more information about how to use SDK for Python, see "SDK for Python". Because its performance is limited, we recommend that you use SDK for Python only for debugging. Sample code for calling the ali_rec_rnk_no_fg service:

#!/usr/bin/env python

from eas_prediction import PredictClient
from eas_prediction import StringRequest
from eas_prediction import TFRequest

if __name__ == '__main__':
    client = PredictClient('http://xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com', 'ali_rec_rnk_no_fg')
    client.set_token('')
    client.init()

    req = TFRequest('server_default') # Replace server_default with the signature_name of the actual model. For more information, see https://www.alibabacloud.com/help/en/pai/user-guide/sdk-for-python 
    req.add_feed('user_id', [3], TFRequest.DT_STRING, ['u0001'] * 3)
    req.add_feed('age', [3], TFRequest.DT_FLOAT, [18.0] * 3)
    # Note: If you set INPUT_TILE to 2, you can simplify the code in the following way:
    #   req.add_feed('user_id', [1], TFRequest.DT_STRING, ['u0001'])
    #   req.add_feed('age', [1], TFRequest.DT_FLOAT, [18.0])
    req.add_feed('item_id', [3], TFRequest.DT_STRING, 
        ['i0001', 'i0002', 'i0003'])
    for x in range(0, 100):
        resp = client.predict(req)
        print(resp)

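You can also create custom service requests. For more information, see "Request syntax".

Request syntax

For clients in languages other than Python, you must manually generate the prediction code from the .proto files. Use the following protobuf definitions to generate code for custom service requests.

tf_predict.proto: the protobuf definition for TensorFlow models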

syntax = "proto3";

option cc_enable_arenas = true;
option go_package = ".;tf";
option java_package = "com.aliyun.openservices.eas.predict.proto";
option java_outer_classname = "PredictProtos";

enum ArrayDataType {
  // Not a legal value for DataType. Used to indicate a DataType field
  // has not been set.
  DT_INVALID = 0;

  // Data types that all computation devices are expected to be
  // capable to support.
  DT_FLOAT = 1;
  DT_DOUBLE = 2;
  DT_INT32 = 3;
  DT_UINT8 = 4;
  DT_INT16 = 5;
  DT_INT8 = 6;
  DT_STRING = 7;
  DT_COMPLEX64 = 8;  // Single-precision complex
  DT_INT64 = 9;
  DT_BOOL = 10;
  DT_QINT8 = 11;     // Quantized int8
  DT_QUINT8 = 12;    // Quantized uint8
  DT_QINT32 = 13;    // Quantized int32
  DT_BFLOAT16 = 14;  // Float32 truncated to 16 bits.  Only for cast ops.
  DT_QINT16 = 15;    // Quantized int16
  DT_QUINT16 = 16;   // Quantized uint16
  DT_UINT16 = 17;
  DT_COMPLEX128 = 18;  // Double-precision complex
  DT_HALF = 19;
  DT_RESOURCE = 20;
  DT_VARIANT = 21;  // Arbitrary C++ data types
}

// Dimensions of an array
message ArrayShape {
  repeated int64 dim = 1 [packed = true];
}

// Protocol buffer representing an array
message ArrayProto {
  // Data Type.
  ArrayDataType dtype = 1;

  // Shape of the array.
  ArrayShape array_shape = 2;

  // DT_FLOAT.
  repeated float float_val = 3 [packed = true];

  // DT_DOUBLE.
  repeated double double_val = 4 [packed = true];

  // DT_INT32, DT_INT16, DT_INT8, DT_UINT8.
  repeated int32 int_val = 5 [packed = true];

  // DT_STRING.
  repeated bytes string_val = 6;

  // DT_INT64.
  repeated int64 int64_val = 7 [packed = true];

  // DT_BOOL.
  repeated bool bool_val = 8 [packed = true];
}

// PredictRequest specifies which TensorFlow model to run, as well as
// how inputs are mapped to tensors and how outputs are filtered before
// returning to user.
message PredictRequest {
  // A named signature to evaluate. If unspecified, the default signature
  // will be used
  string signature_name = 1;

  // Input tensors.
  // Names of input tensor are alias names. The mapping from aliases to real
  // input tensor names is expected to be stored as named generic signature
  // under the key "inputs" in the model export.
  // Each alias listed in a generic signature named "inputs" should be provided
  // exactly once in order to run the prediction.
  map<string, ArrayProto> inputs = 2;

  // Output filter.
  // Names specified are alias names. The mapping from aliases to real output
  // tensor names is expected to be stored as named generic signature under
  // the key "outputs" in the model export.
  // Only tensors specified here will be run/fetched and returned, with the
  // exception that when none is specified, all tensors specified in the
  // named signature will be run/fetched and returned.
  repeated string output_filter = 3;
  
  // Debug flags
  // 0: just return prediction results, no debug information
  // 100: return prediction results, and save request to model_dir 
  // 101: save timeline to model_dir
  int32 debug_level = 100;
}

// Response for PredictRequest on successful run.
message PredictResponse {
  // Output tensors.
  map<string, ArrayProto> outputs = 1;
}
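
In the definition above, ArrayProto keeps the tensor shape in array_shape.dim and the values in a flat repeated field such as float_val or string_val, so a client has to flatten nested data before filling the message. The following Python sketch derives both parts from a rectangular nested list, assuming the row-major order that TensorFlow uses. The function name to_shape_and_values is hypothetical.

```python
def to_shape_and_values(nested):
    """Return (shape, flat_values) for a rectangular nested list in row-major order."""
    if not isinstance(nested, list):
        return [], [nested]  # a scalar has an empty shape
    if not nested:
        return [0], []
    parts = [to_shape_and_values(item) for item in nested]
    first_shape = parts[0][0]
    flat = []
    for shape, values in parts:
        if shape != first_shape:
            raise ValueError("nested list is not rectangular")
        flat.extend(values)
    return [len(nested)] + first_shape, flat


if __name__ == "__main__":
    shape, values = to_shape_and_values([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    # shape goes into array_shape.dim, values into float_val
    print(shape, values)
```

The product of the shape entries always equals the number of flattened values, which is the invariant a request must satisfy for each input tensor.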

easyrec_predict.proto: the protobuf definition for TensorFlow models and FG

syntax = "proto3";

option cc_enable_arenas = true;
option go_package = ".;easyrec";
option java_package = "com.aliyun.openservices.eas.predict.proto";
option java_outer_classname = "EasyRecPredictProtos";

import "tf_predict.proto";

// context features
message ContextFeatures {
  repeated PBFeature features = 1;
}

message PBFeature {
  oneof value {
    int32 int_feature = 1;
    int64 long_feature = 2;
    string string_feature = 3;
    float float_feature = 4;
  }
}

// PBRequest specifies the request for aggregator
message PBRequest {
  // Debug flags
  // 0: just return prediction results, no debug information
  // 3: return features generated by FG module, string format, feature values are separated by \u0002, 
  //    could be used for checking feature consistency and generating online deep learning samples 
  // 100: return prediction results, and save request to model_dir 
  // 101: save timeline to model_dir
  // 102: for recall models such as DSSM and MIND, not only return Faiss retrieved results
  //      but also return user embedding vectors.
  int32 debug_level = 1;

  // user features
  map<string, PBFeature> user_features = 2;

  // item ids, static(daily updated) item features 
  // are fetched from the feature cache resides in 
  // each processor node by item_ids
  repeated string item_ids = 3;

  // context features for each item, realtime item features
  //    could be passed as context features.
  map<string, ContextFeatures> context_features = 4;

  // embedding retrieval neighbor number.
  int32 faiss_neigh_num = 5;
}

// return results
message Results {
  repeated double scores = 1 [packed = true];
}

enum StatusCode {
  OK = 0;
  INPUT_EMPTY = 1;
  EXCEPTION = 2;
}

// PBResponse specifies the response for aggregator
message PBResponse {
  // results
  map<string, Results> results = 1;

  // item features
  map<string, string> item_features = 2;

  // fg generate features
  map<string, string> generate_features = 3;

  // context features
  map<string, ContextFeatures> context_features = 4;

  string error_msg = 5;

  StatusCode status_code = 6;

  // item ids
  repeated string item_ids = 7;

  repeated string outputs = 8;

  // all fg input features
  map<string, string> raw_features = 9;

  // output tensors
  map<string, ArrayProto> tf_outputs = 10;
}
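
The PBFeature message above is a oneof, so each feature value must be written to exactly one of int_feature, long_feature, string_feature, or float_feature. The following Python sketch picks the field from the Python type of a value. The function names are hypothetical, and the rule of using int_feature for values that fit in 32 bits is an assumption made for this illustration.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def pb_feature_field(value):
    """Return the name of the PBFeature oneof field that fits value."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; PBFeature has no bool field.
        raise TypeError("bool has no matching PBFeature field")
    if isinstance(value, int):
        return "int_feature" if INT32_MIN <= value <= INT32_MAX else "long_feature"
    if isinstance(value, float):
        return "float_feature"  # stored as 32-bit float, so precision may be lost
    if isinstance(value, str):
        return "string_feature"
    raise TypeError(f"unsupported feature type: {type(value).__name__}")


def to_feature_fields(user_features):
    """Map each feature name to a (field_name, value) pair."""
    return {name: (pb_feature_field(value), value) for name, value in user_features.items()}


if __name__ == "__main__":
    print(to_feature_fields({"user_id": "u0001", "age": 12, "weight": 129.8}))
```

With the generated Python classes, each pair could then be applied with setattr(req.user_features[name], field_name, value), where req is a PBRequest instance.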
