Platform for AI: Image metric learning (raw)

Last Updated: Aug 26, 2024

If your business involves metric learning, you can use the image metric learning (raw) component of Platform for AI (PAI) to build metric learning models for inference. This topic describes how to configure the image metric learning (raw) component and provides an example of how to use it.

Prerequisites

OSS is activated, and Machine Learning Designer is authorized to access OSS. For more information, see Activate OSS and Grant the permissions that are required to use Machine Learning Designer.

Limits

You can use the image metric learning (raw) component only with the computing resources of Deep Learning Containers (DLC).

Overview

The image metric learning (raw) component provides mainstream models such as ResNet50, ResNet18, ResNet34, ResNet101, swint_tiny, swint_small, swint_base, vit_tiny, vit_small, vit_base, xcit_tiny, xcit_small, and xcit_base.

Configure the component in the PAI console

  • Input ports

    Input port (from left to right) | Data type | Recommended upstream component | Required
    --- | --- | --- | ---
    data annotation path for training | OSS | Read File Data | No
    data annotation path for evaluation | OSS | Read File Data | No

  • Component parameters

    Tab

    Parameter

    Required

    Description

    Default value

    Fields Setting

    model type

    Yes

    The model type used for training. Valid values:

    • DataParallelMetricLearning

    • ModelParallelMetricLearning

    DataParallelMetricLearning

    oss dir to save model

    Yes

    The Object Storage Service (OSS) directory in which the trained model is saved. Example: oss://examplebucket/yun****/designer_test.

    None

    oss annotation path for training data

    No

    If you do not specify the labeled training data by using an input port, you must configure this parameter.

    Note

    If you use both an input port and this parameter to specify the labeled training data, the value specified by the input port takes precedence.

    The OSS path in which the labeled training data is stored. Example: oss://examplebucket/yun****/data/imagenet/meta/train_labeled.txt.

    Each data record in the train_labeled.txt file uses the following format: absolute path/image name.jpg label_id.

    Important

    The image storage path and label_id are separated by a space.

    None

    oss annotation path for evaluation data

    No

    If you do not use the data annotation path for evaluation input port to specify the labeled evaluation data, you must configure this parameter.

    Note

    If you use both an input port and this parameter to specify the labeled evaluation data, the value specified by the input port takes precedence.

    The OSS path in which the labeled evaluation data is stored. Example: oss://examplebucket/yun****/data/imagenet/meta/val_labeled.txt.

    Each data record in the val_labeled.txt file uses the following format: absolute path/image name.jpg label_id.

    Important

    The image storage path and label_id are separated by a space.

    None

    class list file

    No

    The class names. You can enter the class names directly or specify the OSS path of the TXT file that contains the class names.

    None

    Data Source Type

    Yes

    The type of input data. Valid values: ClsSourceImageList and ClsSourceItag.

    ClsSourceImageList

    oss path for pretrained model

    No

    The OSS path of your pre-trained model. If you do not configure this parameter, the default pre-trained model provided by PAI is used.

    None

    Parameters Setting

    backbone

    Yes

    The backbone model that you want to use. Valid values:

    • resnet_50

    • resnet_18

    • resnet_34

    • resnet_101

    • swin_transformer_tiny

    • swin_transformer_small

    • swin_transformer_base

    resnet_50

    image size after resizing

    Yes

    The size of the resized image. Unit: pixels.

    224

    backbone output channels

    Yes

    The number of feature dimensions output by the backbone. The value must be an integer.

    2048

    neck output channels

    Yes

    The number of feature dimensions output by the neck. The value must be an integer.

    1536

    training data classification label range

    Yes

    The number of dimensions in the output data.

    None

    metric loss

    Yes

    The loss function, which measures the inconsistency between the values predicted by the model and the actual values. Valid values:

    • AMSoftmax recommend margin 0.4 scale 30

    • ArcFaceLoss recommend margin 28.6 scale 64

    • CosFaceLoss recommend margin 0.35 scale 64

    • LargeMarginSoftmaxLoss recommend margin 4 scale 1

    • SphereFaceLoss recommend margin 4 scale 1

    • ModelParallel AMSoftmax

    • ModelParallel Softmax

    AMSoftmax recommend margin 0.4 scale 30

    metric learning loss scale parameter

    Yes

    The scale that you want to use for the loss function. Configure this parameter based on the loss function that you select.

    30

    metric learning loss margin parameter

    Yes

    The margin that you want to use for the loss function. Configure this parameter based on the loss function that you select.

    0.4

    metric learning loss weight in all losses

    No

    The weight of the metric learning loss in the total loss. The weight controls the balance between optimizing the metric learning objective and the classification objective.

    1.0

    optimizer

    Yes

    The optimization method for model training. Valid values:

    • SGD

    • AdamW

    SGD

    initial learning rate

    Yes

    The initial learning rate. The value is a floating-point number.

    0.03

    batch size

    Yes

    The size of a training batch, which indicates the number of data samples used for model training in each iteration.

    None

    total train epochs

    Yes

    The total number of training epochs. An epoch is one complete pass over all training samples.

    200

    save checkpoint epoch

    No

    The interval, in epochs, at which a checkpoint is saved. A value of 1 indicates that a checkpoint is saved each time an epoch ends.

    10

    Execution Tuning

    io thread num for training.

    No

    The number of threads used to read the training data.

    4

    use fp 16

    No

    Specifies whether to enable FP16 to reduce memory usage during model training.

    None

    single worker or distributed on MaxCompute or DLC

    Yes

    The compute engine that is used to run the component. You can select a compute engine based on your business requirements. Valid values:

    • single_on_dlc

    • distribute_on_dlc

    single_on_dlc

    number of worker

    No

    If you select distribute_on_dlc for the single worker or distributed on MaxCompute or DLC parameter, configure this parameter.

    The number of concurrent workers used in computing.

    1

    gpu machine type

    Yes

    The GPU specifications that you want to use.

    8vCPU+60GB Mem+1xp100-ecs.gn5-c8g1.2xlarge
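
The metric loss, scale, and margin parameters above interact, so a small worked example may help. The following Python sketch is not the component's implementation, which this topic does not document. It uses the standard published formulations of AM-Softmax (subtract the margin from the target-class cosine, then multiply by the scale) and ArcFace (add the margin to the target-class angle). The function names are hypothetical, and reading the recommended ArcFace margin of 28.6 as degrees (about 0.5 radians) is an assumption.

```python
import math

def am_softmax_logits(cosines, target, margin=0.4, scale=30.0):
    """AM-Softmax logits: subtract the margin from the target-class
    cosine, then multiply every cosine by the scale."""
    return [scale * (c - margin if i == target else c)
            for i, c in enumerate(cosines)]

def arcface_logits(cosines, target, margin_deg=28.6, scale=64.0):
    """ArcFace logits: add the margin to the target-class angle.
    Assumption: the margin is given in degrees. This sketch ignores the
    theta + margin > pi edge case that full implementations handle."""
    m = math.radians(margin_deg)
    out = []
    for i, c in enumerate(cosines):
        if i == target:
            theta = math.acos(max(-1.0, min(1.0, c)))  # clamp for safety
            c = math.cos(theta + m)
        out.append(scale * c)
    return out

def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]

# Cosine similarity of one embedding to three class centers (made-up values).
cosines = [0.80, 0.30, -0.10]
plain = cross_entropy([30.0 * c for c in cosines], target=0)
with_margin = cross_entropy(am_softmax_logits(cosines, target=0), target=0)
print(f"loss without margin:     {plain:.6f}")
print(f"loss with margin of 0.4: {with_margin:.6f}")
```

With the margin applied, the same embedding produces a larger loss, which is what pushes the model to separate classes more widely. A larger scale sharpens the softmax, which is why each loss option is listed with its own recommended margin and scale pair.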

Examples

The following figure shows a sample pipeline in which the image metric learning (raw) component is used.

[Figure: Workflow]

In this example, configure the components in the preceding figure by performing the following steps:

  1. Prepare data. Label images by using iTAG provided by PAI. For more information, see iTAG.

  2. Use the Read File Data-4 and Read File Data-5 components to read the labeled training data and labeled evaluation data. To read the data, set the OSS Data Path parameter of each component to the OSS path in which the data that you want to retrieve is stored.

  3. Draw lines from the preceding two components to the image metric learning (raw) component and configure the parameters for the image metric learning (raw) component. For more information, see the "Configure the component in the PAI console" section of this topic.
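
Before you run the pipeline, it can be useful to check that the annotation files read in step 2 follow the record format described for the annotation path parameters: an image path and a label_id separated by a space. The following Python sketch is one way to do that locally. The function names and sample paths are hypothetical, and it assumes that label_id is an integer and that the last space on a line is the separator.

```python
def parse_annotation_line(line):
    """Split one record into (image_path, label_id).

    Assumption: the split is on the last space, so an image path that
    itself contains spaces still parses.
    """
    path, sep, label = line.strip().rpartition(" ")
    if not sep or not path:
        raise ValueError(f"expected '<image path> <label_id>': {line!r}")
    return path, int(label)  # int() raises ValueError on a non-numeric label

def load_annotations(lines):
    """Parse all non-empty lines into a list of (image_path, label_id)."""
    return [parse_annotation_line(line) for line in lines if line.strip()]

# Hypothetical records in the train_labeled.txt format.
sample = [
    "/data/imagenet/train/cat_001.jpg 0",
    "/data/imagenet/train/dog_001.jpg 1",
    "/data/imagenet/train/dog_002.jpg 1",
]
records = load_annotations(sample)
num_classes = len({label for _, label in records})
print(f"{len(records)} records, {num_classes} distinct labels")
```

To check a real file, download it from OSS and pass the open file object to load_annotations. A ValueError then points to the first malformed line.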