Construct a request for a TensorFlow service - Platform For AI

This topic describes how to construct a request for a TensorFlow service that is based on a universal processor.

Usage notes for input data

Elastic Algorithm Service (EAS) provides a built-in TensorFlow processor for deploying TensorFlow models as services. To ensure performance, make sure that the input data and output data are in the Protocol Buffers format.

Example

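A public test model is deployed as a service in the China (Shanghai) region. The service name is mnist_saved_model_example. All users in virtual private clouds (VPCs) in this region can access the service, and no access token is specified. You can call the service by sending requests to the endpoint http://pai-eas-vpc.cn-shanghai.aliyuncs.com/api/predict/mnist_saved_model_example. The following steps describe how to call the service.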

Obtain the model information.
You can send a GET request to obtain model information such as signature_name, name, type, and shape. The following example shows a sample response.
```
$ curl http://pai-eas-vpc.cn-shanghai.aliyuncs.com/api/predict/mnist_saved_model_example | python -m json.tool
{
    "inputs": [
        {
            "name": "images",
            "shape": [
                -1,
                784
            ],
            "type": "DT_FLOAT"
        }
    ],
    "outputs": [
        {
            "name": "scores",
            "shape": [
                -1,
                10
            ],
            "type": "DT_FLOAT"
        }
    ],
    "signature_name": "predict_images"
} 
```
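The model is a classification model trained on the Modified National Institute of Standards and Technology (MNIST) dataset. The input data type is DT_FLOAT and the input shape is [-1,784]. The first dimension is batch_size. If a request contains only one image, batch_size is 1. The second dimension is a 784-dimensional vector. The test model was trained on inputs that were flattened into one-dimensional vectors, so a 28 × 28 pixel image must be flattened into a one-dimensional vector of length 28 × 28 = 784.

When you construct the input, you must flatten it into a one-dimensional vector regardless of the value of shape. In this example, a single input image is flattened into a one-dimensional vector of 1 × 784 values. If the input shape specified when the model was trained is [-1,28,28], the request input must still be flattened into a one-dimensional vector of 1 × 28 × 28 values. If the input shape specified in the request does not match the input shape of the model, the request fails.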
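The shape rule described above can be checked on the client before a request is sent. The following is a minimal sketch, not part of EAS: the helper name `resolve_shape` and the constant `METADATA_JSON` are hypothetical, and the sketch assumes that the metadata shape contains exactly one -1 entry that marks the batch dimension, as in the sample response. It reads the input shape from the metadata, flattens one 28 × 28 image into 784 values, and derives the concrete shape to put in the request.

```python
import json

# Metadata copied from the sample GET response for mnist_saved_model_example.
METADATA_JSON = """
{
    "inputs": [{"name": "images", "shape": [-1, 784], "type": "DT_FLOAT"}],
    "outputs": [{"name": "scores", "shape": [-1, 10], "type": "DT_FLOAT"}],
    "signature_name": "predict_images"
}
"""


def resolve_shape(model_shape, num_values):
    """Replace the -1 batch dimension with the size implied by num_values.

    Hypothetical helper. Raises ValueError when the number of flattened
    values cannot fill a whole number of input items.
    """
    per_item = 1
    for dim in model_shape:
        if dim != -1:
            per_item *= dim
    if num_values == 0 or num_values % per_item != 0:
        raise ValueError(
            "%d values do not fit model shape %s" % (num_values, model_shape))
    batch_size = num_values // per_item
    return [batch_size if dim == -1 else dim for dim in model_shape]


metadata = json.loads(METADATA_JSON)
model_shape = metadata["inputs"][0]["shape"]

# A blank 28 x 28 image stands in for a decoded picture.
image = [[0.0] * 28 for _ in range(28)]
# Flatten row by row into a one-dimensional list of 784 values.
flat_values = [pixel for row in image for pixel in row]

request_shape = resolve_shape(model_shape, len(flat_values))
print(request_shape)
```

The same check applies to a model whose shape is [-1,28,28]: the values are still sent as one flat list, and only the shape recorded in the request changes.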

Install Protocol Buffers and call the service. This example uses a Python 2 client to call the TensorFlow service.

EAS provides a Protocol Buffers package for Python clients. Run the following command to install the package.

```
$ pip install http://eas-data.oss-cn-shanghai.aliyuncs.com/sdk/pai_tf_predict_proto-1.0-py2.py3-none-any.whl
```

Use the following sample code to call the service and make a prediction.

```
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# Python 2 sample: it relies on the urlparse module and print statements.
import json
from urlparse import urlparse
from com.aliyun.api.gateway.sdk import client
from com.aliyun.api.gateway.sdk.http import request
from com.aliyun.api.gateway.sdk.common import constant
from pai_tf_predict_proto import tf_predict_pb2
import cv2
import numpy as np

with open('2.jpg', 'rb') as infile:
    buf = infile.read()
    # Use NumPy to convert bytes to a NumPy array.
    x = np.frombuffer(buf, dtype='uint8')
    # Decode the array into a 28-by-28 matrix.
    img = cv2.imdecode(x, cv2.IMREAD_UNCHANGED)
    # The prediction API requires a one-dimensional vector of length 784, so reshape the matrix into such a vector.
    img = np.reshape(img, 784)


def predict(url, app_key, app_secret, request_data):
    cli = client.DefaultClient(app_key=app_key, app_secret=app_secret)
    body = request_data
    url_ele = urlparse(url)
    host = 'http://' + url_ele.hostname
    path = url_ele.path
    req_post = request.Request(host=host, protocol=constant.HTTP, url=path, method="POST", time_out=6000)
    req_post.set_body(body)
    req_post.set_content_type(constant.CONTENT_TYPE_STREAM)
    stat, header, content = cli.execute(req_post)
    return stat, dict(header) if header is not None else {}, content


def demo():
    # Enter the model information. Click the model name to obtain the information.
    app_key = 'YOUR_APP_KEY'
    app_secret = 'YOUR_APP_SECRET'
    url = 'YOUR_APP_URL'
    # Construct the request. The local name avoids shadowing the imported request module.
    predict_request = tf_predict_pb2.PredictRequest()
    predict_request.signature_name = 'predict_images'
    predict_request.inputs['images'].dtype = tf_predict_pb2.DT_FLOAT  # The type of the images parameter.
    predict_request.inputs['images'].array_shape.dim.extend([1, 784])  # The shape of the images parameter.
    predict_request.inputs['images'].float_val.extend(img)  # The data of the images parameter.
    predict_request.inputs['keep_prob'].dtype = tf_predict_pb2.DT_FLOAT  # The type of the keep_prob parameter.
    predict_request.inputs['keep_prob'].float_val.extend([0.75])  # The default value of shape is 1.
    # Serialize data in the Protocol Buffers format to a string and transfer the string.
    request_data = predict_request.SerializeToString()
    stat, header, content = predict(url, app_key, app_secret, request_data)
    if stat != 200:
        print 'Http status code: ', stat
        print 'Error msg in header: ', header['x-ca-error-message'] if 'x-ca-error-message' in header else ''
        print 'Error msg in body: ', content
    else:
        response = tf_predict_pb2.PredictResponse()
        response.ParseFromString(content)
        print(response)


if __name__ == '__main__':
    demo()
```

The following output is returned.

```
outputs {
  key: "scores"
  value {
    dtype: DT_FLOAT
    array_shape {
      dim: 1
      dim: 10
    }
    float_val: 0.0
    float_val: 0.0
    float_val: 1.0
    float_val: 0.0
    float_val: 0.0
    float_val: 0.0
    float_val: 0.0
    float_val: 0.0
    float_val: 0.0
    float_val: 0.0
  }
}
```

The output lists the scores of the 10 categories. When the input image is 2.jpg, all values except value[2] are 0. The final prediction is therefore 2, which is correct.
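Reading the prediction from the scores amounts to picking the index of the largest value in each row. The following is a minimal sketch of that step; the helper name `predicted_labels` is hypothetical, and the scores are copied from the sample output rather than taken from a live response.

```python
# Scores copied from the sample output; index i holds the score of digit i.
scores = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]


def predicted_labels(float_val, num_classes):
    """Split a flat score list into rows and return the best index per row.

    Hypothetical helper. float_val is the flat list of scores and
    num_classes is the size of the second output dimension (10 here).
    """
    if num_classes <= 0 or len(float_val) % num_classes != 0:
        raise ValueError(
            "%d scores cannot be split into rows of %d"
            % (len(float_val), num_classes))
    labels = []
    for start in range(0, len(float_val), num_classes):
        row = float_val[start:start + num_classes]
        # index() returns the first position of the maximum score.
        labels.append(row.index(max(row)))
    return labels


print(predicted_labels(scores, 10))
```

With a parsed PredictResponse, the inputs would presumably come from `response.outputs['scores'].float_val` and the second entry of `response.outputs['scores'].array_shape.dim`.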

Call the service by using clients in other languages

If you use a client in a language other than Python, create a .proto file and generate the request code from it. The following steps show how.

Prepare a request code file, such as tf.proto, that contains the following content.

```
syntax = "proto3";
option cc_enable_arenas = true;
option java_package = "com.aliyun.openservices.eas.predict.proto";
option java_outer_classname = "PredictProtos";
enum ArrayDataType {
  // Not a legal value for DataType. Used to indicate a DataType field
  // has not been set.
  DT_INVALID = 0;
  // Data types that all computation devices are expected to be
  // capable to support.
  DT_FLOAT = 1;
  DT_DOUBLE = 2;
  DT_INT32 = 3;
  DT_UINT8 = 4;
  DT_INT16 = 5;
  DT_INT8 = 6;
  DT_STRING = 7;
  DT_COMPLEX64 = 8;  // Single-precision complex.
  DT_INT64 = 9;
  DT_BOOL = 10;
  DT_QINT8 = 11;     // Quantized int8.
  DT_QUINT8 = 12;    // Quantized uint8.
  DT_QINT32 = 13;    // Quantized int32.
  DT_BFLOAT16 = 14;  // Float32 truncated to 16 bits.  Only for cast ops.
  DT_QINT16 = 15;    // Quantized int16.
  DT_QUINT16 = 16;   // Quantized uint16.
  DT_UINT16 = 17;
  DT_COMPLEX128 = 18;  // Double-precision complex.
  DT_HALF = 19;
  DT_RESOURCE = 20;
  DT_VARIANT = 21;  // Arbitrary C++ data types.
}
// Dimensions of an array.
message ArrayShape {
  repeated int64 dim = 1 [packed = true];
}
// Protocol buffer representing an array.
message ArrayProto {
  // Data Type.
  ArrayDataType dtype = 1;
  // Shape of the array.
  ArrayShape array_shape = 2;
  // DT_FLOAT.
  repeated float float_val = 3 [packed = true];
  // DT_DOUBLE.
  repeated double double_val = 4 [packed = true];
  // DT_INT32, DT_INT16, DT_INT8, DT_UINT8.
  repeated int32 int_val = 5 [packed = true];
  // DT_STRING.
  repeated bytes string_val = 6;
  // DT_INT64.
  repeated int64 int64_val = 7 [packed = true];
  // DT_BOOL.
  repeated bool bool_val = 8 [packed = true];
}
// PredictRequest specifies which TensorFlow model to run, as well as
// how inputs are mapped to tensors and how outputs are filtered before
// returning to user.
message PredictRequest {
  // A named signature to evaluate. If unspecified, the default signature
  // will be used.
  string signature_name = 1;
  // Input tensors.
  // Names of input tensor are alias names. The mapping from aliases to real
  // input tensor names is expected to be stored as named generic signature
  // under the key "inputs" in the model export.
  // Each alias listed in a generic signature named "inputs" should be provided
  // exactly once in order to run the prediction.
  map<string, ArrayProto> inputs = 2;
  // Output filter.
  // Names specified are alias names. The mapping from aliases to real output
  // tensor names is expected to be stored as named generic signature under
  // the key "outputs" in the model export.
  // Only tensors specified here will be run/fetched and returned, with the
  // exception that when none is specified, all tensors specified in the
  // named signature will be run/fetched and returned.
  repeated string output_filter = 3;
}
// Response for PredictRequest on successful run.
message PredictResponse {
  // Output tensors.
  map<string, ArrayProto> outputs = 1;
}
```

In this file, PredictRequest defines the input format of the TensorFlow service, and PredictResponse defines the output format of the service. For more information about Protocol Buffers, see "Protocol Buffers".
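The comments on ArrayProto in tf.proto indicate which repeated field carries the values for each data type. The following is a minimal sketch that records this mapping as a lookup table; the names `VALUE_FIELD_BY_DTYPE` and `value_field_for` are hypothetical, and the table covers only the data types that those comments mention.

```python
# Mapping taken from the field comments on ArrayProto in tf.proto.
VALUE_FIELD_BY_DTYPE = {
    "DT_FLOAT": "float_val",
    "DT_DOUBLE": "double_val",
    "DT_INT32": "int_val",
    "DT_INT16": "int_val",
    "DT_INT8": "int_val",
    "DT_UINT8": "int_val",
    "DT_STRING": "string_val",
    "DT_INT64": "int64_val",
    "DT_BOOL": "bool_val",
}


def value_field_for(dtype_name):
    """Return the ArrayProto field name that holds values of dtype_name.

    Hypothetical helper. Raises ValueError for data types that the
    ArrayProto comments do not assign to a value field.
    """
    try:
        return VALUE_FIELD_BY_DTYPE[dtype_name]
    except KeyError:
        raise ValueError("no value field documented for %s" % dtype_name)


print(value_field_for("DT_FLOAT"))
```

For the MNIST example, whose input type is DT_FLOAT, the lookup returns float_val, which matches the field that the Python sample fills.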

Install protoc.

```
#!/bin/bash
PROTOC_ZIP=protoc-3.3.0-linux-x86_64.zip
curl -OL https://github.com/google/protobuf/releases/download/v3.3.0/$PROTOC_ZIP
unzip -o $PROTOC_ZIP -d ./ bin/protoc
rm -f $PROTOC_ZIP
```

Generate the request code file.
- Java
```
$ bin/protoc --java_out=./ tf.proto
```
  After the command completes, the request code file com/aliyun/openservices/eas/predict/proto/PredictProtos.java is generated in the current directory. Import the file into your project.
- Python
```
$ bin/protoc --python_out=./ tf.proto
```
  After the command completes, the request code file tf_pb2.py is generated in the current directory. Use an import statement to import the file into your project.
- C++
```
$ bin/protoc --cpp_out=./ tf.proto
```
  After the command completes, the request code files tf.pb.cc and tf.pb.h are generated in the current directory. Add #include "tf.pb.h" to your code and add tf.pb.cc to the compile list.