iTAG of Machine Learning Platform for AI (PAI) provides labeling templates for optical character recognition (OCR), object detection, and image classification. When you create an image labeling job, you can select a labeling template based on your business scenario. This topic describes the scenarios to which each image labeling template applies and the data structures of the input and output data for each template.
Background information
iTAG provides image labeling templates that support the following features:
OCR
OCR is used to extract text from input images, and then classify the images based on the text.
Scenarios
This labeling template applies to scenarios such as the recognition of identity cards, tickets, license plates, and bank cards.
Data structures
Input data
Each row in the input .manifest file is a JSON object that describes one image to label. Each row must contain the source field, nested in the data object, which specifies the OSS path of the image.
{"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img6.jpeg"}} ...
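A manifest in this shape can be generated with a few lines of Python. The following is a minimal sketch, not an iTAG utility: the function name write_manifest and the bucket and object names are hypothetical placeholders, and the only structure taken from this topic is the data.source nesting shown in the sample row.

```python
import io
import json


def write_manifest(image_uris, out):
    """Write one JSON object per line, each holding the image URI in data.source."""
    for uri in image_uris:
        out.write(json.dumps({"data": {"source": uri}}) + "\n")


# Hypothetical bucket and object names, used only for illustration.
uris = [
    "oss://example-bucket.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img1.jpeg",
    "oss://example-bucket.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img2.jpeg",
]

# An in-memory buffer stands in for a real file opened with open(path, "w").
buffer = io.StringIO()
write_manifest(uris, buffer)
manifest_text = buffer.getvalue()
```

Passing a file object instead of a path keeps the sketch easy to try without touching the file system; in practice the same function can write to a file opened in text mode.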
Output data
Each row in the .manifest file of output data contains an object and the labeling results for the object. The following code provides an example of the JSON string in each row. The example is formatted across multiple lines for readability:
{
  "data": {
    "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img6.jpeg"
  },
  "label-144863699223676****": {
    "results": [
      {
        "questionId": "1",
        "data": [
          {
            "id": "ecdb7552-2a4e-4d0e-8abb-0f1a2dc0****",
            "type": "image/polygon",
            "value": [
              [368.1112214498511, 71.72740814299901],
              [444.34359483614696, 71.72740814299901],
              [444.34359483614696, 106.26762661370405],
              [368.1112214498511, 106.26762661370405]
            ],
            "labels": {
              "OCR result": "Financial consultant",
              "Single-choice": "Label 1"
            }
          }
        ],
        "rotation": 0,
        "markTitle": "Label configuration for OCR",
        "width": 1024,
        "type": "image",
        "height": 1024
      }
    ]
  }
}
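To show how a row like this might be consumed, the sketch below collects each labeled region's polygon and labels. It assumes that the results sit under a top-level key that starts with "label-", which is inferred from the example only; the function name extract_ocr_regions is hypothetical, and the label names "OCR result" and "Single-choice" depend on how the labeling job is configured.

```python
import json


def extract_ocr_regions(row):
    """Return a list of (polygon_points, labels) pairs from one output row."""
    record = json.loads(row)
    regions = []
    for key, value in record.items():
        # Assumption: labeling results live under keys prefixed with "label-".
        if not key.startswith("label-"):
            continue
        for result in value.get("results", []):
            for item in result.get("data", []):
                if isinstance(item, dict) and item.get("type") == "image/polygon":
                    regions.append((item["value"], item.get("labels", {})))
    return regions


# A trimmed copy of the example row, serialized back to a single line.
sample_row = json.dumps({
    "data": {"source": "oss://****.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img6.jpeg"},
    "label-144863699223676****": {
        "results": [{
            "questionId": "1",
            "data": [{
                "id": "ecdb7552-2a4e-4d0e-8abb-0f1a2dc0****",
                "type": "image/polygon",
                "value": [
                    [368.1112214498511, 71.72740814299901],
                    [444.34359483614696, 71.72740814299901],
                    [444.34359483614696, 106.26762661370405],
                    [368.1112214498511, 106.26762661370405],
                ],
                "labels": {"OCR result": "Financial consultant", "Single-choice": "Label 1"},
            }],
            "type": "image",
        }]
    },
})

regions = extract_ocr_regions(sample_row)
polygon, labels = regions[0]
text = labels.get("OCR result")
```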
Object detection
Object detection is used to locate specific objects in an image. Labelers commonly use the rectangle selection tool to mark each object.
Scenarios
This labeling template applies to scenarios such as vehicle detection, passenger detection, and image search.
Data structures
Input data
Each row in the input .manifest file is a JSON object that describes one image to label. Each row must contain the source field, nested in the data object, which specifies the OSS path of the image.
{"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/pic_ocr/img17.jpeg"}} ...
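Because every input row must contain the source field, it can help to check a manifest before it is used in a labeling job. The following sketch is one possible check, not part of iTAG: the function name find_invalid_rows and the reason strings are hypothetical, and the check covers only the structure described in this topic.

```python
import json


def find_invalid_rows(manifest_text):
    """Return (line_number, reason) for each row without a usable data.source."""
    problems = []
    for number, line in enumerate(manifest_text.splitlines(), start=1):
        if not line.strip():
            continue  # Ignore blank lines.
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        data = record.get("data") if isinstance(record, dict) else None
        source = data.get("source") if isinstance(data, dict) else None
        if not isinstance(source, str) or not source:
            problems.append((number, "missing data.source"))
    return problems


# Three illustrative rows: one valid, one without source, one truncated.
sample_manifest = "\n".join([
    '{"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/pic_ocr/img17.jpeg"}}',
    '{"data":{}}',
    '{"data":',
])

problems = find_invalid_rows(sample_manifest)
```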
Output data
Each row in the .manifest file of output data contains an object and the labeling results for the object. The following code provides an example of the JSON string in each row. The example is formatted across multiple lines for readability:
{
  "data": {
    "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/pic_ocr/img17.jpeg"
  },
  "label-144853549785619****": {
    "results": [
      {
        "questionId": "1",
        "data": [
          {
            "id": "e02a574b-9fd9-45e9-8c8a-9682567b****",
            "type": "image/polygon",
            "value": [
              [499.93454545454546, 255.0981818181818],
              [911.0109090909091, 255.0981818181818],
              [911.0109090909091, 338.6836363636363],
              [499.93454545454546, 338.6836363636363]
            ],
            "labels": {
              "Single-choice": "Label 1"
            }
          }
        ],
        "rotation": 0,
        "markTitle": "Label configuration for object detection",
        "width": 1024,
        "type": "image",
        "height": 1024
      }
    ]
  }
}
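The rectangle in this example is stored as four corner points, whereas many detection pipelines expect a box of the form (x_min, y_min, x_max, y_max). The sketch below shows one way to convert between the two. It assumes that the coordinates are pixel positions within the reported width and height, which the example suggests but this topic does not state; the function names are hypothetical.

```python
def polygon_to_bbox(points):
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) around the points."""
    xs = [point[0] for point in points]
    ys = [point[1] for point in points]
    return (min(xs), min(ys), max(xs), max(ys))


def normalize_bbox(bbox, width, height):
    """Scale a pixel box to the 0..1 range, assuming pixel coordinates."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min / width, y_min / height, x_max / width, y_max / height)


# Corner points, width, and height copied from the example row.
polygon = [
    [499.93454545454546, 255.0981818181818],
    [911.0109090909091, 255.0981818181818],
    [911.0109090909091, 338.6836363636363],
    [499.93454545454546, 338.6836363636363],
]

bbox = polygon_to_bbox(polygon)
normalized = normalize_bbox(bbox, width=1024, height=1024)
```

Taking the minimum and maximum of each axis also works if the points are listed in a different order, as long as the labeled shape is a rectangle aligned with the image axes.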
Image classification
Image classification is used to find one or more labels that match an input image from a set of labels and add the labels to the image. This template supports single-label and multi-label image classification.
Scenarios
This labeling template applies to scenarios such as image classification, image recognition, image search, and content recommendation.
Data structures
Input data
Each row in the input .manifest file is a JSON object that describes one image to label. Each row must contain the source field, nested in the data object, which specifies the OSS path of the image.
{"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/iTAG/pic/1.jpg"}} ...
Output data
Each row in the .manifest file of output data contains an object and the labeling results for the object. The following code provides an example of the JSON string in each row. The example is formatted across multiple lines for readability:
{
  "data": {
    "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/pic/3.jpg"
  },
  "label-143082452899667****": {
    "results": [
      {
        "questionId": "2",
        "data": ["Label 1", "Label 2"],
        "markTitle": "Multiple-choice",
        "type": "survey/multivalue"
      }
    ]
  }
}
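Unlike the OCR and object detection results, the data field here is a plain list of label names. The following sketch gathers those labels per question title. It assumes the "label-" key prefix seen in the example, and it also accepts a bare string in data as a guess at how a single-label answer might appear, which this topic does not confirm; the function name extract_class_labels is hypothetical.

```python
import json


def extract_class_labels(row):
    """Return {markTitle: [labels]} for one classification output row."""
    record = json.loads(row)
    labels_by_title = {}
    for key, value in record.items():
        # Assumption: labeling results live under keys prefixed with "label-".
        if not key.startswith("label-"):
            continue
        for result in value.get("results", []):
            data = result.get("data")
            # Assumption: a single-label answer could be a bare string.
            if isinstance(data, str):
                data = [data]
            labels_by_title[result.get("markTitle")] = list(data or [])
    return labels_by_title


# The example row, serialized back to a single line.
sample_row = json.dumps({
    "data": {"source": "oss://****.oss-cn-hangzhou.aliyuncs.com/pic/3.jpg"},
    "label-143082452899667****": {
        "results": [{
            "questionId": "2",
            "data": ["Label 1", "Label 2"],
            "markTitle": "Multiple-choice",
            "type": "survey/multivalue",
        }]
    },
})

image_labels = extract_class_labels(sample_row)
```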