
Platform for AI: Python method

Last Updated: Feb 27, 2024

Machine Learning Platform for AI (PAI)-Blade provides a Python method that you can use to integrate the model optimization process into a pipeline. This topic describes the Python method, including its syntax, input parameters, and response.

optimize

PAI-Blade provides the optimize method that you can use to optimize a model.

  • Syntax

    def optimize(
        model: Any,
        optimization_level: str,
        device_type: str,
        config: Optional[Config] = None,
        inputs: Optional[List[str]] = None,
        outputs: Optional[List[str]] = None,
        input_shapes: Optional[List[List[str]]] = None,
        input_ranges: Optional[List[List[str]]] = None,
        test_data: List[Dict[str, np.ndarray]] = [],
        calib_data: List[Dict[str, np.ndarray]] = [],
        custom_ops: List[str] = [],
        verbose: bool = False,
    ) -> Tuple[Any, OptimizeSpec, OptimizeReport]:
        pass
  • Input parameters

    The following list describes the input parameters. For each parameter, the name is followed by its type, whether it is required, its default value, and a description.

    model
    Type: multiple types. Required: yes. Default value: N/A.
    The model to be optimized.
    • If the model is a TensorFlow model, the following formats are supported:
      • A GraphDef object.
      • The path of a serialized GraphDef file whose name is suffixed with .pb or .pbtxt.
      • A string that specifies the directory of a SavedModel file.
    • If the model is a PyTorch model, the following formats are supported:
      • A torch.nn.Module object.
      • The path of an exported torch.nn.Module file whose name is suffixed with .pt.

    optimization_level
    Type: STRING. Required: yes. Default value: N/A.
    The optimization level. Valid values (case-insensitive):
    • o1: lossless optimization, such as graph rewriting and compilation optimization.
    • o2: quantization.

    device_type
    Type: STRING. Required: yes. Default value: N/A.
    The type of the device on which the model is run. Valid values (case-insensitive):
    • gpu
    • cpu
    • edge (not supported for PyTorch models)

    inputs
    Type: LIST[STRING]. Required: no. Default value: None.
    The names of the input nodes. If you do not specify this parameter, the system infers them automatically.

    outputs
    Type: LIST[STRING]. Required: no. Default value: None.
    The names of the output nodes. If you do not specify this parameter, the system infers them automatically.

    input_shapes
    Type: LIST[LIST[STRING]]. Required: no. Default value: None.
    The possible shapes of the input tensors. You can specify this parameter to improve the optimization results in specific scenarios. The number of elements in an inner list must equal the number of input tensors of the model, and each element is a string that specifies one input shape, such as '1*512'. To specify multiple groups of possible shapes, add more inner lists to the outer list. The following examples specify one group and three groups of possible shapes for a model with two input tensors:
    • [['1*512', '3*256']]
    • [
          ['1*512', '3*256'],
          ['5*512', '9*256'],
          ['10*512', '27*256']
      ]
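
    Each shape string is simply the tensor's dimensions joined by '*'. A minimal sketch of deriving such strings from example input arrays (shape_strings is a hypothetical helper for illustration, not part of the Blade API):

```python
import numpy as np

def shape_strings(*arrays):
    """Build Blade-style shape strings (e.g. '1*512') from example arrays.

    Hypothetical helper, not part of the Blade API.
    """
    return ['*'.join(str(d) for d in a.shape) for a in arrays]

# One group of possible shapes for a model with two input tensors.
group = shape_strings(np.zeros((1, 512)), np.zeros((3, 256)))
input_shapes = [group]
print(input_shapes)  # [['1*512', '3*256']]
```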

    input_ranges
    Type: LIST[LIST[STRING]]. Required: no. Default value: None.
    The value ranges of the input tensors. The number of elements in an inner list must equal the number of input tensors of the model, and each element is a string that specifies one value range. A range is written in brackets and can contain real numbers or characters, such as '[1,2]', '[0.3,0.9]', or '[a,f]'. To specify multiple groups of value ranges, add more inner lists to the outer list. The following examples specify one group and three groups of value ranges for a model with two input tensors:
    • [['[0.1,0.4]', '[a,f]']]
    • [
          ['[0.1,0.4]', '[a,f]'],
          ['[1.1,1.4]', '[h,l]'],
          ['[2.1,2.4]', '[n,z]']
      ]

    test_data
    Type: multiple types. Required: no. Default value: [].
    The test data that is used to measure the inference speed of the model. The data type of the test data depends on the type of the model:
    • For a TensorFlow model, the test data consists of multiple groups of feed_dict arguments. The corresponding data type is LIST[DICT[STRING, np.ndarray]].
    • For a PyTorch model, the test data consists of multiple tuples of input tensors. The corresponding data type is LIST[TUPLE[torch.Tensor, ...]].
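
    For a TensorFlow model, each element of test_data mirrors a feed_dict. A minimal sketch, assuming a model with a single input tensor named input:0 (the tensor name is an assumption; use the input names of your own graph):

```python
import numpy as np

# Two groups of test data, each keyed by the (assumed) input tensor name.
test_data = [
    {"input:0": np.random.rand(1, 512).astype(np.float32)},
    {"input:0": np.random.rand(4, 512).astype(np.float32)},
]

# Each group is an ordinary feed_dict-style mapping from tensor name to array.
assert all(isinstance(group, dict) for group in test_data)
```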

    calib_data
    Type: multiple types. Required: no. Default value: [].
    The calibration data that is used to quantize the model. This parameter is required if you set the optimization_level parameter to o2. The data type of the calibration data is the same as that of the test data.

    custom_ops
    Type: LIST[STRING]. Required: no. Default value: [].
    The paths of custom operator libraries. If the model depends on a custom operator library, you must add the path of that library to the list.

    verbose
    Type: BOOL. Required: no. Default value: False.
    Specifies whether to display verbose logs. Valid values:
    • True: displays verbose logs.
    • False: does not display verbose logs.

    config
    Type: blade.Config. Required: no. Default value: N/A.
    The advanced configurations. For more information, see the following table that describes the parameters of blade.Config.

    The blade.Config data type is used to set advanced parameters for the optimization. The following sample code shows the syntax of the constructor:

    class Config(ABC):
        def __init__(
            self,
            disable_fp16_accuracy_check: bool = False,
            disable_fp16_perf_check: bool = False,
            enable_static_shape_compilation_opt: bool = False,
            enable_dynamic_shape_compilation_opt: bool = True,
            quant_config: Optional[Dict[str, str]] = None,
        ) -> None:
            pass

    The following list describes these parameters.

    Table 1. blade.Config

    disable_fp16_accuracy_check
    Type: BOOL. Required: no. Default value: False.
    Specifies whether to disable accuracy verification during FP16 optimization. Valid values:
    • False: performs accuracy verification.
    • True: skips accuracy verification.

    disable_fp16_perf_check
    Type: BOOL. Required: no. Default value: False.
    Specifies whether to disable performance verification during FP16 optimization. Valid values:
    • False: performs performance verification.
    • True: skips performance verification.

    enable_static_shape_compilation_opt
    Type: BOOL. Required: no. Default value: False.
    Specifies whether to enable static shape compilation. Valid values:
    • False: disables static shape compilation.
    • True: enables static shape compilation.

    enable_dynamic_shape_compilation_opt
    Type: BOOL. Required: no. Default value: True.
    Specifies whether to enable dynamic shape compilation. Valid values:
    • False: disables dynamic shape compilation.
    • True: enables dynamic shape compilation.

    quant_config
    Type: DICT[STRING, STRING]. Required: no. Default value: None.
    The quantization configurations. Only the weight_adjustment key is supported. This key specifies whether to reduce the precision loss by adjusting the model weights. Valid values for weight_adjustment:
    • "true": enabled.
    • "false": disabled.
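
    Note that the values of quant_config are the strings "true" and "false", not Python booleans. A minimal sketch (the blade.Config and blade.optimize calls appear only in comments because they require the PAI-Blade runtime):

```python
# Quantization configuration: a plain string-to-string dictionary in which
# only the weight_adjustment key is supported, per the table above.
quant_config = {"weight_adjustment": "true"}

# It would be passed to the constructor shown above, for example:
#   config = blade.Config(quant_config=quant_config)
#   optimized, spec, report = blade.optimize(
#       model, 'o2', 'gpu', config=config, calib_data=calib_data)
assert quant_config["weight_adjustment"] in ("true", "false")
```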

  • Response

    A tuple that contains three elements is returned. The data type of the tuple is Tuple[Any, OptimizeSpec, OptimizeReport]. The following list describes the three elements.

    1. Optimized model (multiple types): The optimized model is returned in the same format as the original model. For example, if you pass a TensorFlow GraphDef object, an optimized GraphDef object is returned.

    2. External dependencies (OptimizeSpec): The external dependencies, such as environment variables and the compilation cache, that ensure the optimization results meet your expectations. You can make the external dependencies take effect by using the with statement in Python. This element is not required if you use the SDK.

    3. Optimization report (OptimizeReport): For more information, see Optimization report.
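
    OptimizeSpec is produced by the optimization itself, so it cannot be constructed here; however, its with usage follows the standard Python context-manager protocol. The following generic sketch (apply_env is an illustration of that protocol, not Blade's actual implementation) shows how environment-variable dependencies can be applied inside a with block and rolled back afterwards:

```python
import os
from contextlib import contextmanager

@contextmanager
def apply_env(env):
    """Set environment variables for the duration of a `with` block,
    then restore the previous values. Illustrative only; OptimizeSpec
    provides this behavior for you."""
    saved = {k: os.environ.get(k) for k in env}
    os.environ.update(env)
    try:
        yield
    finally:
        for k, v in saved.items():
            if v is None:
                os.environ.pop(k, None)
            else:
                os.environ[k] = v

# Usage mirrors `with opt_spec:` from the description above.
with apply_env({"DEMO_FLAG": "1"}):
    inside = os.environ["DEMO_FLAG"]   # dependency is active here
after = "DEMO_FLAG" in os.environ       # restored after the block
```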