
Platform for AI: Python method

Last Updated: Feb 27, 2024

Machine Learning Platform for AI (PAI)-Blade provides a Python method that you can use to integrate the model optimization process into a pipeline. This topic describes the Python method, including its syntax, input parameters, and response.

optimize

PAI-Blade provides the optimize method that you can use to optimize a model.

  • Syntax

    def optimize(
        model: Any,
        optimization_level: str,
        device_type: str,
        config: Optional[Config] = None,
        inputs: Optional[List[str]] = None,
        outputs: Optional[List[str]] = None,
        input_shapes: Optional[List[List[str]]] = None,
        input_ranges: Optional[List[List[str]]] = None,
        test_data: List[Dict[str, np.ndarray]] = [],
        calib_data: List[Dict[str, np.ndarray]] = [],
        custom_ops: List[str] = [],
        verbose: bool = False,
    ) -> Tuple[Any, OptimizeSpec, OptimizeReport]:
        pass
  • Input parameters

    The following list describes the input parameters. For each parameter, the name is followed by its type, whether it is required, its default value, and a description.

    model
    Type: multiple types. Required: yes. Default value: N/A.
    The model to be optimized.
    • If the model is a TensorFlow model, the following formats are supported:
      • A GraphDef object.
      • The path of a serialized GraphDef file whose name is suffixed with .pb or .pbtxt.
      • A string that specifies the directory of a SavedModel file.
    • If the model is a PyTorch model, the following formats are supported:
      • A torch.nn.Module object.
      • The path of an exported torch.nn.Module file whose name is suffixed with .pt.

    optimization_level
    Type: STRING. Required: yes. Default value: N/A.
    The optimization level. Valid values (case-insensitive):
    • o1: lossless optimization, such as graph rewriting and compilation optimization.
    • o2: quantization.

    device_type
    Type: STRING. Required: yes. Default value: N/A.
    The type of the device on which the model is run. Valid values (case-insensitive):
    • gpu
    • cpu
    • edge (not supported for PyTorch models)

    inputs
    Type: LIST[STRING]. Required: no. Default value: None.
    The names of the input nodes. If you do not specify this parameter, the system infers them automatically.

    outputs
    Type: LIST[STRING]. Required: no. Default value: None.
    The names of the output nodes. If you do not specify this parameter, the system infers them automatically.

    input_shapes
    Type: LIST[LIST[STRING]]. Required: no. Default value: None.
    The possible shapes of the input tensors. You can specify this parameter to improve the optimization results in specific scenarios. The number of elements in an inner list must equal the number of input tensors of the model, and each element is a string that specifies one input shape, such as '1*512'. To specify multiple groups of possible shapes, add more inner lists to the outer list. The following examples specify one group and three groups of possible shapes for a model with two input tensors:
    • [['1*512', '3*256']]
    • [
          ['1*512', '3*256'],
          ['5*512', '9*256'],
          ['10*512', '27*256']
      ]
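
    Each shape string is simply the tensor's dimensions joined by '*'. A minimal sketch of deriving such strings from example input arrays (shape_strings is a hypothetical helper for illustration, not part of the Blade API):

```python
import numpy as np

def shape_strings(*arrays):
    """Build Blade-style shape strings (e.g. '1*512') from example arrays.

    Hypothetical helper, not part of the Blade API.
    """
    return ['*'.join(str(d) for d in a.shape) for a in arrays]

# One group of possible shapes for a model with two input tensors.
group = shape_strings(np.zeros((1, 512)), np.zeros((3, 256)))
input_shapes = [group]
print(input_shapes)  # [['1*512', '3*256']]
```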

    input_ranges
    Type: LIST[LIST[STRING]]. Required: no. Default value: None.
    The value ranges of the input tensors. The number of elements in an inner list must equal the number of input tensors of the model, and each element is a string that specifies one value range. A range is written in brackets and can contain real numbers or characters, such as '[1,2]', '[0.3,0.9]', or '[a,f]'. To specify multiple groups of value ranges, add more inner lists to the outer list. The following examples specify one group and three groups of value ranges for a model with two input tensors:
    • [['[0.1,0.4]', '[a,f]']]
    • [
          ['[0.1,0.4]', '[a,f]'],
          ['[1.1,1.4]', '[h,l]'],
          ['[2.1,2.4]', '[n,z]']
      ]

    test_data
    Type: multiple types. Required: no. Default value: [].
    The test data that is used to measure the inference speed of the model. The data type of the test data depends on the type of the model:
    • For a TensorFlow model, the test data consists of multiple groups of feed_dict arguments. The corresponding data type is LIST[DICT[STRING, np.ndarray]].
    • For a PyTorch model, the test data consists of multiple tuples of input tensors. The corresponding data type is LIST[TUPLE[torch.Tensor, ...]].
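
    For a TensorFlow model, each element of test_data mirrors a feed_dict. A minimal sketch, assuming a model with a single input tensor named input:0 (the tensor name is an assumption; use the input names of your own graph):

```python
import numpy as np

# Two groups of test data, each keyed by the (assumed) input tensor name.
test_data = [
    {"input:0": np.random.rand(1, 512).astype(np.float32)},
    {"input:0": np.random.rand(4, 512).astype(np.float32)},
]

# Each group is an ordinary feed_dict-style mapping from tensor name to array.
assert all(isinstance(group, dict) for group in test_data)
```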

    calib_data
    Type: multiple types. Required: no. Default value: [].
    The calibration data that is used to quantize the model. This parameter is required if you set the optimization_level parameter to o2. The data type of the calibration data is the same as that of the test data.

    custom_ops
    Type: LIST[STRING]. Required: no. Default value: [].
    The paths of custom operator libraries. If the model depends on a custom operator library, you must add the path of that library to the list.

    verbose
    Type: BOOL. Required: no. Default value: False.
    Specifies whether to display verbose logs. Valid values:
    • True: displays verbose logs.
    • False: does not display verbose logs.

    config
    Type: blade.Config. Required: no. Default value: N/A.
    The advanced configurations. For more information, see the following table that describes the parameters of blade.Config.

    The blade.Config data type is used to set advanced parameters for the optimization. The following sample code shows the syntax of the constructor:

    class Config(ABC):
        def __init__(
            self,
            disable_fp16_accuracy_check: bool = False,
            disable_fp16_perf_check: bool = False,
            enable_static_shape_compilation_opt: bool = False,
            enable_dynamic_shape_compilation_opt: bool = True,
            quant_config: Optional[Dict[str, str]] = None,
        ) -> None:
            pass

    The following list describes these parameters.

    Table 1. blade.Config

    disable_fp16_accuracy_check
    Type: BOOL. Required: no. Default value: False.
    Specifies whether to disable accuracy verification during FP16 optimization. Valid values:
    • False: performs accuracy verification.
    • True: skips accuracy verification.

    disable_fp16_perf_check
    Type: BOOL. Required: no. Default value: False.
    Specifies whether to disable performance verification during FP16 optimization. Valid values:
    • False: performs performance verification.
    • True: skips performance verification.

    enable_static_shape_compilation_opt
    Type: BOOL. Required: no. Default value: False.
    Specifies whether to enable static shape compilation. Valid values:
    • False: disables static shape compilation.
    • True: enables static shape compilation.

    enable_dynamic_shape_compilation_opt
    Type: BOOL. Required: no. Default value: True.
    Specifies whether to enable dynamic shape compilation. Valid values:
    • False: disables dynamic shape compilation.
    • True: enables dynamic shape compilation.

    quant_config
    Type: DICT[STRING, STRING]. Required: no. Default value: None.
    The quantization configurations. Only the weight_adjustment key is supported. This key specifies whether to reduce the precision loss by adjusting the model weights. Valid values for weight_adjustment:
    • "true": enabled.
    • "false": disabled.
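
    Note that the values of quant_config are the strings "true" and "false", not Python booleans. A minimal sketch (the blade.Config and blade.optimize calls appear only in comments because they require the PAI-Blade runtime):

```python
# Quantization configuration: a plain string-to-string dictionary in which
# only the weight_adjustment key is supported, per the table above.
quant_config = {"weight_adjustment": "true"}

# It would be passed to the constructor shown above, for example:
#   config = blade.Config(quant_config=quant_config)
#   optimized, spec, report = blade.optimize(
#       model, 'o2', 'gpu', config=config, calib_data=calib_data)
assert quant_config["weight_adjustment"] in ("true", "false")
```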

  • Response

    A tuple that contains three elements is returned. The data type of the tuple is Tuple[Any, OptimizeSpec, OptimizeReport]. The following list describes the three elements.

    1. Optimized model (multiple types): The optimized model is returned in the same format as the original model. For example, if you pass a TensorFlow GraphDef object, an optimized GraphDef object is returned.

    2. External dependencies (OptimizeSpec): The external dependencies, such as environment variables and the compilation cache, that ensure the optimization results meet your expectations. You can make the external dependencies take effect by using the with statement in Python. This element is not required if you use the SDK.

    3. Optimization report (OptimizeReport): For more information, see Optimization report.
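
    OptimizeSpec is produced by the optimization itself, so it cannot be constructed here; however, its with usage follows the standard Python context-manager protocol. The following generic sketch (apply_env is an illustration of that protocol, not Blade's actual implementation) shows how environment-variable dependencies can be applied inside a with block and rolled back afterwards:

```python
import os
from contextlib import contextmanager

@contextmanager
def apply_env(env):
    """Set environment variables for the duration of a `with` block,
    then restore the previous values. Illustrative only; OptimizeSpec
    provides this behavior for you."""
    saved = {k: os.environ.get(k) for k in env}
    os.environ.update(env)
    try:
        yield
    finally:
        for k, v in saved.items():
            if v is None:
                os.environ.pop(k, None)
            else:
                os.environ[k] = v

# Usage mirrors `with opt_spec:` from the description above.
with apply_env({"DEMO_FLAG": "1"}):
    inside = os.environ["DEMO_FLAG"]   # dependency is active here
after = "DEMO_FLAG" in os.environ       # restored after the block
```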