Elastic Algorithm Service (EAS) provides official SDKs for calling the services that you deploy from your models. The SDKs reduce the time required to define call logic and improve call stability. This topic describes EAS SDK for Python and provides demos that show how to use it to call services whose inputs and outputs are of commonly used types.
Install the SDK
pip install -U eas-prediction --user
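To confirm that the package is installed, you can check that it imports (a quick sanity check, not part of the official installation steps):

python -c "from eas_prediction import PredictClient"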
Methods
| Class | Method | Description |
| --- | --- | --- |
| PredictClient | PredictClient(endpoint, service_name) | Creates a client object of the PredictClient class. endpoint: the endpoint of the server. service_name: the name of the service to call. |
|  | set_endpoint(endpoint) | Sets the endpoint of the server. |
|  | set_service_name(service_name) | Sets the name of the service to call. |
|  | set_endpoint_type(endpoint_type) | Sets the network mode of the server. Valid values: ENDPOINT_TYPE_GATEWAY (default) and ENDPOINT_TYPE_DIRECT (VPC direct connection channel). |
|  | set_token(token) | Sets the token that is used for service access authentication. |
|  | set_retry_count(max_retry_count) | Sets the maximum number of retries allowed after a request fails. |
|  | set_timeout(timeout) | Sets the request timeout period. Unit: milliseconds. |
|  | init() | Initializes a client object. After all the preceding methods that are used to set parameters are called, the parameters take effect only after you call the init() method. |
|  | predict(request) | Sends a prediction request to the server and returns the response. |
| StringRequest | StringRequest(request_data) | Creates an object of the StringRequest class. request_data: the request string to send. |
| StringResponse | to_string() | Returns the response body of the StringResponse class as a string. |
| TFRequest | TFRequest(signature_name) | Creates an object of the TFRequest class. signature_name: the signature name of the model to call. |
|  | add_feed(input_name, shape, data_type, content) | Specifies an input tensor of the TensorFlow model. |
|  | add_fetch(output_name) | Specifies the name of an output tensor of the TensorFlow model to export. |
| TFResponse | get_tensor_shape(output_name) | Queries the shape of the output tensor that is identified by the specified name. |
|  | get_values(output_name) | Queries the data of the output tensor that is identified by the specified name. |
| TorchRequest | TorchRequest() | Creates an object of the TorchRequest class. |
|  | add_feed(index, shape, data_type, content) | Specifies the input tensor at the specified index of the PyTorch model. |
|  | add_fetch(output_index) | Specifies the index of an output tensor of the PyTorch model to export. Optional. |
| TorchResponse | get_tensor_shape(output_index) | Queries the shape of the output tensor at the specified index. |
|  | get_values(output_index) | Queries the data of the output tensor at the specified index. |
| QueueClient | QueueClient(endpoint, queue_name) | Creates a client object of the QueueClient class. endpoint: the endpoint of the queuing service. queue_name: the name of the queue. |
|  | set_token(token) | Sets the token that is used for queue access authentication. |
|  | set_timeout(timeout) | Sets the request timeout period. Unit: milliseconds. |
|  | init() | Initializes the client object. |
|  | put(data, tags) | Sends data to the queue and returns the index of the data and the ID of the request. |
|  | get() | Queries data in the queue. You can specify the start index, the amount of data to query, and whether to delete the data after it is queried. |
|  | delete(index) | Deletes the data that is identified by the specified index from the queue. |
|  | truncate(index) | Deletes all data in the queue whose indexes are smaller than the specified index. |
|  | attributes() | Queries the attributes of the queue, such as the index of the most recent entry (stream.lastEntry). |
|  | watch(index, window, index_only, auto_commit) | Subscribes to the data that is pushed by the queue and returns a Watcher object. |
|  | commit(index) | Acknowledges that the data identified by the specified index is consumed so that the queue can delete it. Required if auto_commit is disabled. |
| Watcher | run() | Runs the watcher and returns a generator that yields the data pushed by the queue. |
|  | close() | Stops a watcher to close backend connections. Note: Only one watcher can be started for a single client. You must close the watcher before you can start another watcher. |
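The init() note above matters in practice: the set_* methods only stage parameters, so a typical call sequence configures the client first and calls init() last, before any predict() call. A minimal sketch using placeholder endpoint, service name, and token:

from eas_prediction import PredictClient, StringRequest

client = PredictClient('<endpoint>', '<service_name>')
client.set_token('<token>')
client.init()  # parameters take effect only after init()

resp = client.predict(StringRequest('<request body>'))
print(resp)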
Demos
Input and output as strings
If you use a custom processor to deploy a model as a service, strings are often used to call the service. An example is a service deployed based on a Predictive Model Markup Language (PMML) model. The following demo shows how:
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction import StringRequest

if __name__ == '__main__':
    # Initialize the client with the service endpoint and service name.
    client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'scorecard_pmml_example')
    client.set_token('YWFlMDYyZDNmNTc3M2I3MzMwYmY0MmYwM2Y2MTYxMTY4NzBkNzdj****')
    client.init()

    # Build a string request that carries the input features as JSON.
    request = StringRequest('[{"fea1": 1, "fea2": 2}]')
    for x in range(0, 1000000):
        resp = client.predict(request)
        print(resp)
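If your processor exchanges JSON, as the scorecard example does for its input, you may want the response back as Python objects. A minimal sketch, under two assumptions: that str(resp) yields the raw body (the demo's print(resp) suggests the response stringifies to it) and that the processor returns a JSON string:

import json

resp = client.predict(StringRequest('[{"fea1": 1, "fea2": 2}]'))
body = str(resp)           # assumption: stringifies to the raw response body
result = json.loads(body)  # assumption: the custom processor returns JSON
print(result)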
Input and output as tensors
If you use TensorFlow to deploy a model as a service, you must use the TFRequest and TFResponse classes to call the service. The following demo shows how:
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction import TFRequest

if __name__ == '__main__':
    client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'mnist_saved_model_example')
    client.set_token('YTg2ZjE0ZjM4ZmE3OTc0NzYxZDMyNmYzMTJjZTQ1YmU0N2FjMTAy****')
    client.init()

    # Build a TensorFlow request for the 'predict_images' signature and feed
    # a single 28 x 28 image flattened to 784 float values.
    req = TFRequest('predict_images')
    req.add_feed('images', [1, 784], TFRequest.DT_FLOAT, [1] * 784)
    for x in range(0, 1000000):
        resp = client.predict(req)
        print(resp)
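Rather than printing the whole response, you can read individual output tensors through the TFResponse methods listed in the table above. A minimal sketch, where 'scores' is a hypothetical output tensor name; check your model's signature for the real one:

resp = client.predict(req)
# 'scores' is a hypothetical output name for this MNIST model.
print(resp.get_tensor_shape('scores'))  # e.g. a shape such as [1, 10]
print(resp.get_values('scores'))        # the output values themselves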
Use a VPC direct connection channel to call a service
You can use a VPC direct connection channel to access only the services that are deployed in the dedicated resource group for EAS. In addition, to use the channel, the dedicated resource group and the specified vSwitch must be connected to the VPC. For more information about how to purchase EAS dedicated resource groups and configure network connectivity, see Work with dedicated resource groups and Configure network connectivity. Compared with the regular mode, this mode requires one additional line of code: client.set_endpoint_type(ENDPOINT_TYPE_DIRECT). You can use this mode in high-concurrency and heavy-traffic scenarios. The following demo shows how:

#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction import TFRequest
from eas_prediction import ENDPOINT_TYPE_DIRECT

if __name__ == '__main__':
    # Use the VPC endpoint and enable the direct connection channel.
    client = PredictClient('http://pai-eas-vpc.cn-hangzhou.aliyuncs.com', 'mnist_saved_model_example')
    client.set_token('M2FhNjJlZDBmMzBmMzE4NjFiNzZhMmUxY2IxZjkyMDczNzAzYjFi****')
    client.set_endpoint_type(ENDPOINT_TYPE_DIRECT)
    client.init()

    request = TFRequest('predict_images')
    request.add_feed('images', [1, 784], TFRequest.DT_FLOAT, [1] * 784)
    for x in range(0, 1000000):
        resp = client.predict(request)
        print(resp)
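In heavy-traffic scenarios, you can also tune the client's timeout and retry behavior before calling init(). A minimal sketch; set_timeout and set_retry_count are the methods listed in the table above, and the values shown are illustrative:

client.set_timeout(10000)   # request timeout in milliseconds
client.set_retry_count(5)   # maximum retries after a request failure
client.init()               # parameters take effect only after init()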
Call a PyTorch model
If you use PyTorch to deploy a model as a service, you must use the TorchRequest and TorchResponse classes to call the service. The following demo shows how:
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction import TorchRequest
import time

if __name__ == '__main__':
    client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'pytorch_gpu_wl')
    client.init()

    # Feed one 3 x 224 x 224 image (150528 float values) as input tensor 0.
    req = TorchRequest()
    req.add_feed(0, [1, 3, 224, 224], TorchRequest.DT_FLOAT, [1] * 150528)
    # req.add_fetch(0)

    st = time.time()
    timer = 0
    for x in range(0, 10):
        resp = client.predict(req)
        timer += (time.time() - st)
        st = time.time()
        print(resp.get_tensor_shape(0))
        # print(resp)
    print("average response time: %s s" % (timer / 10))
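TorchResponse addresses output tensors by index rather than by name. A minimal sketch of reading the first output, mirroring the get_tensor_shape call in the demo above:

resp = client.predict(req)
print(resp.get_tensor_shape(0))  # shape of output tensor 0
print(resp.get_values(0))        # data of output tensor 0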
Call a Blade processor-based model
If you use a Blade processor to deploy a model as a service, you must use the BladeRequest and BladeResponse classes to call the service. The following demo shows how:
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction import BladeRequest
import time

if __name__ == '__main__':
    client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'nlp_model_example')
    client.init()

    req = BladeRequest()
    req.add_feed('input_data', 1, [1, 360, 128], BladeRequest.DT_FLOAT, [0.8] * 85680)
    req.add_feed('input_length', 1, [1], BladeRequest.DT_INT32, [187])
    req.add_feed('start_token', 1, [1], BladeRequest.DT_INT32, [104])
    req.add_fetch('output', BladeRequest.DT_FLOAT)

    st = time.time()
    timer = 0
    for x in range(0, 10):
        resp = client.predict(req)
        timer += (time.time() - st)
        st = time.time()
        # print(resp)
        # print(resp.get_values('output'))
        print(resp.get_tensor_shape('output'))
    print("average response time: %s s" % (timer / 10))
Call a Blade processor-based model that is compatible with default TensorFlow methods
You can use the TFRequest and TFResponse classes to call a Blade processor-based model that is compatible with the default TensorFlow methods supported by EAS. The following demo shows how:
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction.blade_tf_request import TFRequest  # import the Blade-compatible TFRequest
import time

if __name__ == '__main__':
    client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'nlp_model_example')
    client.init()

    req = TFRequest(signature_name='predict_words')
    req.add_feed('input_data', [1, 360, 128], TFRequest.DT_FLOAT, [0.8] * 85680)
    req.add_feed('input_length', [1], TFRequest.DT_INT32, [187])
    req.add_feed('start_token', [1], TFRequest.DT_INT32, [104])
    req.add_fetch('output')

    st = time.time()
    timer = 0
    for x in range(0, 10):
        resp = client.predict(req)
        timer += (time.time() - st)
        st = time.time()
        # print(resp)
        # print(resp.get_values('output'))
        print(resp.get_tensor_shape('output'))
    print("average response time: %s s" % (timer / 10))
Use the queuing service to send and subscribe to data
You can use the queuing service to send data to a queue, query data and the state of a queue, and subscribe to data pushed by a queue. In the following demo, one thread pushes data to a queue and another thread subscribes to the pushed data through a watcher:
#!/usr/bin/env python
from eas_prediction import QueueClient
import threading

if __name__ == '__main__':
    endpoint = '182848887922****.cn-shanghai.pai-eas.aliyuncs.com'
    queue_name = 'test_group.qservice/sink'
    token = 'YmE3NDkyMzdiMzNmMGM3ZmE4ZmNjZDk0M2NiMDA3OTZmNzc1MTUx****'

    queue = QueueClient(endpoint, queue_name)
    queue.set_token(token)
    queue.init()
    queue.set_timeout(30000)

    # truncate all messages in the queue
    attributes = queue.attributes()
    if 'stream.lastEntry' in attributes:
        queue.truncate(int(attributes['stream.lastEntry']) + 1)

    count = 100

    # create a thread to send messages to the queue
    def send_thread():
        for i in range(count):
            index, request_id = queue.put('[{}]')
            print('send: ', i, index, request_id)

    # create a thread to watch messages from the queue
    def watch_thread():
        watcher = queue.watch(0, 5, auto_commit=True)
        i = 0
        for x in watcher.run():
            print('recv: ', i, x.index, x.tags['requestId'])
            i += 1
            if i == count:
                break
        watcher.close()

    thread1 = threading.Thread(target=watch_thread)
    thread2 = threading.Thread(target=send_thread)
    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()
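The demo uses auto_commit=True, so received entries are acknowledged automatically. A minimal sketch of the manual alternative, assuming commit(index) acknowledges an entry so that the queue can delete it, as described in the table above; handle_data is a hypothetical handler:

watcher = queue.watch(0, 5, auto_commit=False)
for x in watcher.run():
    handle_data(x)         # handle_data is a hypothetical processing function
    queue.commit(x.index)  # assumption: acknowledges the entry for deletion
watcher.close()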