EAS SDK allows you to call a model service in a simple and stable manner. This topic describes the methods of EAS SDK for Java and provides sample code for common use cases, such as string input and output, tensor input and output, the queue service, and request data compression.
Add dependencies
To integrate EAS SDK for Java in your project, add the eas-sdk dependency in the pom.xml file. For information about the latest version of the SDK, visit the Maven repository. Sample code:
<dependency>
<groupId>com.aliyun.openservices.eas</groupId>
<artifactId>eas-sdk</artifactId>
<version>2.0.20</version>
</dependency>
SDK versions 2.0.5 and later support the queue service, which processes asynchronous requests with priorities. To avoid compatibility issues when you use the queue service, add the following dependencies with the specified versions:
<dependency>
<groupId>org.java-websocket</groupId>
<artifactId>Java-WebSocket</artifactId>
<version>1.5.1</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.1</version>
</dependency>
Methods
Class | Method | Description
PredictClient | | Customizes the URL of a request.
HttpConfig | | Returns the status code of the last call.
HttpConfig | | Returns the error message of the last call.
TFRequest | |
TFResponse | |
QueueClient | | Stops the queue service.
DataFrame | |
Demos
Use string input and output
If you use custom processors to deploy a model, such as a Predictive Model Markup Language (PMML) model, the request content is often formatted as a string. Sample code:
import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
public class TestString {
public static void main(String[] args) throws Exception {
// Start and initialize a client. A PredictClient instance is shared by multiple requests. Do not create a PredictClient instance for each request.
PredictClient client = new PredictClient(new HttpConfig());
client.setToken("YWFlMDYyZDNmNTc3M2I3MzMwYmY0MmYwM2Y2MTYxMTY4NzBkNzdj****");
// To use the VPC direct connection feature, call the setDirectEndpoint method.
// Example: client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
// You must enable the VPC direct connection feature and configure a vSwitch in the PAI console. After you enable the feature, you can call the service without passing through a gateway, which improves stability and performance.
// Note: To call a service by using a gateway, use the endpoint that starts with your user ID. To obtain the endpoint, find the service that you want to call on the EAS-Online Model Services page and click Invocation Method in the Service Type column. In the dialog box that appears, you can view the endpoint. To call a service by using the VPC direct connection feature, use the endpoint in the pai-eas-vpc.{region_id}.aliyuncs.com format.
client.setEndpoint("182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com");
client.setModelName("scorecard_pmml_example");
// Define the input string.
String request = "[{\"money_credit\": 3000000}, {\"money_credit\": 10000}]";
System.out.println(request);
// EAS returns a string.
try {
String response = client.predict(request);
System.out.println(response);
} catch (Exception e) {
e.printStackTrace();
}
// Close the client.
client.shutdown();
return;
}
}
The preceding sample code performs the following steps:
1. Use the PredictClient class to create a client for the service. If multiple services are involved, create multiple clients.
2. Configure the token, endpoint, and model name parameters for the client.
3. Create a request variable of the String type as the input and call the client.predict method to send an HTTP request. The service returns the response string; see the parsing sketch after this list.
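The format of the returned string is determined by the processor. The scorecard example above returns a JSON array, so you can parse the response with a JSON library. The following is a minimal sketch that uses fastjson, which the queue service demo below also relies on; the response content and field names are placeholders for illustration and depend on your model.
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
public class ParseStringResponse {
    public static void main(String[] args) {
        // A hypothetical response string in the same shape as the scorecard example.
        String response = "[{\"score\": 0.55}, {\"score\": 0.83}]";
        // Parse the JSON array that the processor returns.
        JSONArray results = JSON.parseArray(response);
        for (int i = 0; i < results.size(); i++) {
            JSONObject item = results.getJSONObject(i);
            System.out.println("Result " + i + ": " + item.toJSONString());
        }
    }
}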
Use TensorFlow input and output
If your service uses TensorFlow models, the input must use the TFRequest format and the output must use the TFResponse format. Sample code:
import java.util.List;
import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.request.TFDataType;
import com.aliyun.openservices.eas.predict.request.TFRequest;
import com.aliyun.openservices.eas.predict.response.TFResponse;
public class TestTF {
public static TFRequest buildPredictRequest() {
TFRequest request = new TFRequest();
request.setSignatureName("predict_images");
float[] content = new float[784];
for (int i = 0; i < content.length; i++) {
content[i] = (float) 0.0;
}
request.addFeed("images", TFDataType.DT_FLOAT, new long[]{1, 784}, content);
request.addFetch("scores");
return request;
}
public static void main(String[] args) throws Exception {
PredictClient client = new PredictClient(new HttpConfig());
// To use the VPC direct connection feature, call the setDirectEndpoint method.
// Example: client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
// You must enable the VPC direct connection feature and configure a vSwitch in the PAI console. After you enable the feature, you can call the service without passing through a gateway, which improves stability and performance.
// Note: To call a service by using a gateway, use the endpoint that starts with your user ID. To obtain the endpoint, find the service that you want to call on the EAS-Online Model Services page and click Invocation Method in the Service Type column. In the dialog box that appears, you can view the endpoint. To call a service by using the VPC direct connection feature, use the endpoint in the pai-eas-vpc.{region_id}.aliyuncs.com format.
client.setEndpoint("182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com");
client.setModelName("mnist_saved_model_example");
client.setToken("YTg2ZjE0ZjM4ZmE3OTc0NzYxZDMyNmYzMTJjZTQ1YmU0N2FjMTAy****");
long startTime = System.currentTimeMillis();
int count = 1000;
for (int i = 0; i < count; i++) {
try {
TFResponse response = client.predict(buildPredictRequest());
List<Float> result = response.getFloatVals("scores");
System.out.print("Predict Result: [");
for (int j = 0; j < result.size(); j++) {
System.out.print(result.get(j).floatValue());
if (j != result.size() - 1) {
System.out.print(", ");
}
}
System.out.print("]\n");
} catch (Exception e) {
e.printStackTrace();
}
}
long endTime = System.currentTimeMillis();
System.out.println("Spend Time: " + (endTime - startTime) + "ms");
client.shutdown();
}
}
The preceding sample code performs the following steps:
1. Use the PredictClient class to create a client for the service. If multiple services are involved, create multiple clients.
2. Configure the token, endpoint, and model name parameters for the client.
3. Encapsulate the input by using the TFRequest class and read the output by using the TFResponse class. A batching variant is sketched after this list.
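The shape passed to the addFeed method determines how many samples one request carries: the first dimension is the batch size. The following sketch reuses only the calls from the demo above to build a request for a batch of two MNIST images; the input name images and the fetch name scores are the same as in the demo.
public static TFRequest buildBatchPredictRequest() {
    TFRequest request = new TFRequest();
    request.setSignatureName("predict_images");
    // Pack two flattened 28 x 28 images into a single buffer.
    int batchSize = 2;
    float[] content = new float[batchSize * 784];
    for (int i = 0; i < content.length; i++) {
        content[i] = (float) 0.0;
    }
    // The first dimension of the shape is the batch size.
    request.addFeed("images", TFDataType.DT_FLOAT, new long[]{batchSize, 784}, content);
    request.addFetch("scores");
    return request;
}
The values returned by response.getFloatVals("scores") then typically contain the scores for all samples in the batch, in order.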
Use the queue service
Use the QueueClient class to implement the queue service. Sample code:
import com.alibaba.fastjson.JSONObject;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.http.QueueClient;
import com.aliyun.openservices.eas.predict.queue_client.QueueUser;
import com.aliyun.openservices.eas.predict.queue_client.WebSocketWatcher;
public class DemoWatch {
public static void main(String[] args) throws Exception {
/** Create a client for the queue service. */
String queueEndpoint = "18*******.cn-hangzhou.pai-eas.aliyuncs.com";
String inputQueueName = "test_queue_service";
String sinkQueueName = "test_queue_service/sink";
String queueToken = "test-token";
/** Create the input queue. After you add data to the input queue, the inference service automatically reads the request data from the input queue. */
QueueClient inputQueue =
new QueueClient(queueEndpoint, inputQueueName, queueToken, new HttpConfig(), new QueueUser());
/** Create the output queue. After the inference service processes the input data, the result is written to the output queue. */
QueueClient sinkQueue =
new QueueClient(queueEndpoint, sinkQueueName, queueToken, new HttpConfig(), new QueueUser());
/** Clear data in the queue. Use with caution. */
inputQueue.clear();
sinkQueue.clear();
/** Add data to the input queue. */
int count = 10;
for (int i = 0; i < count; ++i) {
String data = Integer.toString(i);
inputQueue.put(data.getBytes(), null);
/** The queue service supports multi-priority queues. You can call the put method to set the data priority. The default priority is 0. */
// inputQueue.put(data.getBytes(), 0L, null);
}
/** Call the watch method to subscribe to the data of the output queue. The window size is 5. */
WebSocketWatcher watcher = sinkQueue.watch(0L, 5L, false, true, null);
/** You can configure the WatchConfig parameter to specify the number of retries, the retry interval (in seconds), and whether to retry indefinitely. If you do not configure the WatchConfig parameter, the default number of retries is 3 and the default retry interval is 5 seconds. */
// WebSocketWatcher watcher = sinkQueue.watch(0L, 5L, false, true, null, new WatchConfig(3, 1));
// WebSocketWatcher watcher = sinkQueue.watch(0L, 5L, false, true, null, new WatchConfig(true, 10));
/** Obtain output data. */
for (int i = 0; i < count; ++i) {
try {
/** Call the getDataFrame method to obtain data of the DataFrame type. If no data is available, the method blocks until data is available. */
byte[] data = watcher.getDataFrame().getData();
System.out.println("[watch] data = " + new String(data));
} catch (RuntimeException ex) {
System.out.println("[watch] error = " + ex.getMessage());
break;
}
}
/** Close the watcher. Each client can have only one watcher. If you do not close a watcher, an error is reported when you create another client for the queue service. */
watcher.close();
Thread.sleep(2000);
JSONObject attrs = sinkQueue.attributes();
System.out.println(attrs.toString());
/** Close the client. */
inputQueue.shutdown();
sinkQueue.shutdown();
}
}
The preceding sample code performs the following steps:
1. Use the QueueClient class to create clients for the queue service. Make sure that you create clients for both the input queue and the output queue, which are required for an inference service.
2. Call the put() method to send data to the input queue, and call the watch() method to subscribe to data in the output queue.
Note: For the convenience of demonstration, this example sends data and subscribes to data in the same thread. In your actual implementation, you can send data and subscribe to data in different threads, as shown in the sketch after this list.
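A minimal sketch of this pattern is shown below. It uses only the QueueClient and WebSocketWatcher calls from the demo above: one thread keeps sending data to the input queue while another thread subscribes to the output queue and consumes the results. The endpoint, queue names, and token are placeholders.
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.http.QueueClient;
import com.aliyun.openservices.eas.predict.queue_client.QueueUser;
import com.aliyun.openservices.eas.predict.queue_client.WebSocketWatcher;
public class DemoSeparateThreads {
    public static void main(String[] args) throws Exception {
        String queueEndpoint = "18*******.cn-hangzhou.pai-eas.aliyuncs.com";
        String queueToken = "test-token";
        QueueClient inputQueue =
            new QueueClient(queueEndpoint, "test_queue_service", queueToken, new HttpConfig(), new QueueUser());
        QueueClient sinkQueue =
            new QueueClient(queueEndpoint, "test_queue_service/sink", queueToken, new HttpConfig(), new QueueUser());
        int count = 10;
        // Producer thread: sends request data to the input queue.
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; ++i) {
                    inputQueue.put(Integer.toString(i).getBytes(), null);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
        // Consumer thread: subscribes to the output queue and reads the results.
        Thread consumer = new Thread(() -> {
            try {
                WebSocketWatcher watcher = sinkQueue.watch(0L, 5L, false, true, null);
                for (int i = 0; i < count; ++i) {
                    byte[] data = watcher.getDataFrame().getData();
                    System.out.println("[consumer] data = " + new String(data));
                }
                watcher.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        inputQueue.shutdown();
        sinkQueue.shutdown();
    }
}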
Compress request data
If the size of the request data is large, EAS SDK allows you to compress the data in the Zlib or Gzip format before you send the data to the server. To use the data compression feature, configure the rpc.decompressor parameter when you deploy the service.
Sample configuration for service deployment:
"metadata": {
"rpc": {
"decompressor": "zlib"
}
}
Sample code for sending compressed data:
package com.aliyun.openservices.eas.predict;
import com.aliyun.openservices.eas.predict.http.Compressor;
import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
public class TestString {
public static void main(String[] args) throws Exception{
// Start and initialize a client.
PredictClient client = new PredictClient(new HttpConfig());
client.setEndpoint("18*******.cn-hangzhou.pai-eas.aliyuncs.com");
client.setModelName("echo_compress");
client.setToken("YzZjZjQwN2E4NGRkMDMxNDk5NzhhZDcwZDBjOTZjOGYwZDYxZGM2****");
// You can also set the compressor to Compressor.Gzip.
client.setCompressor(Compressor.Zlib);
// Define the input string.
String request = "[{\"money_credit\": 3000000}, {\"money_credit\": 10000}]";
System.out.println(request);
// EAS returns a string.
String response = client.predict(request);
System.out.println(response);
// Close the client.
client.shutdown();
return;
}
}
References
The status code in the response indicates whether the request is successful. For more information, see Appendix: Status codes.
For information about other methods to call a model service, see Methods for calling services.