Use EasyVision to detect objects

Platform for AI (PAI) provides EasyVision, an enhanced algorithm framework for visual intelligence that supports model training and prediction. You can use EasyVision to train and apply models for your computer vision applications. This topic describes how to use EasyVision in Data Science Workshop (DSW) to detect objects.

Prerequisites

A development environment that uses the following software versions is prepared:

- Python 2.7, or Python 3.4 or later
- PAI-TensorFlow, or TensorFlow 1.8 or later

Note

If you use a DSW instance, we recommend that you select a TensorFlow 1.12 image and an instance type that has more than 16 GB of memory.

ossutil is downloaded and installed. For more information, see Install ossutil.

Important

After you download ossutil, you must set the endpoint parameter to https://oss-cn-zhangjiakou.aliyuncs.com in the configuration file.
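The endpoint setting can also be applied from a script. The following is a minimal sketch, not an official procedure: it assumes Python 3 and assumes that the ossutil configuration file is an INI-style file with a [Credentials] section that holds the endpoint key. The helper name set_ossutil_endpoint is hypothetical, and the demonstration writes to a temporary file instead of a real configuration file.

```python
import configparser
import os
import tempfile

ENDPOINT = "https://oss-cn-zhangjiakou.aliyuncs.com"


def set_ossutil_endpoint(config_path, endpoint=ENDPOINT):
    """Set the endpoint key in the [Credentials] section of an INI-style file.

    Assumption: the ossutil configuration file uses this layout.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original case of existing keys
    parser.read(config_path)  # a missing file is treated as empty
    if not parser.has_section("Credentials"):
        parser.add_section("Credentials")
    parser.set("Credentials", "endpoint", endpoint)
    with open(config_path, "w") as config_file:
        parser.write(config_file, space_around_delimiters=False)
    return parser.get("Credentials", "endpoint")


# Demonstration on a throwaway file so that no real configuration is modified.
demo_path = os.path.join(tempfile.mkdtemp(), "ossutilconfig")
configured_endpoint = set_ossutil_endpoint(demo_path)
```

To apply the sketch to a real installation, pass the path of your own ossutil configuration file and verify the result with a small ossutil command before you download the datasets.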

Step 1: Prepare data

Use ossutil to download the Pascal dataset to the current directory.

```
ossutil64 cp -r oss://pai-vision-data-hz/data/voc0712_tfrecord/ data/voc0712_tfrecord
```

Download the ResNet50 pre-trained model to the current directory.

```
mkdir -p pretrained_models/
ossutil64 cp -r oss://pai-vision-data-hz/pretrained_models/resnet_v1d_50/ pretrained_models/resnet_v1d_50
```
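Before you start training, it can help to confirm that both downloads produced non-empty directories. The following sketch checks the two local paths used in the preceding commands. The helper name missing_or_empty is hypothetical, and the demonstration runs against a temporary directory with a placeholder file rather than the real dataset.

```python
import os
import tempfile

# Local target directories taken from the download commands in this step.
EXPECTED_DIRS = ["data/voc0712_tfrecord", "pretrained_models/resnet_v1d_50"]


def missing_or_empty(base_dir, relative_dirs=EXPECTED_DIRS):
    """Return the expected directories that are absent or contain no entries."""
    problems = []
    for relative_dir in relative_dirs:
        full_path = os.path.join(base_dir, relative_dir)
        if not os.path.isdir(full_path) or not os.listdir(full_path):
            problems.append(relative_dir)
    return problems


# Demonstration: only the dataset directory exists and holds a placeholder file.
demo_base = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_base, "data/voc0712_tfrecord"))
open(os.path.join(demo_base, "data/voc0712_tfrecord", "placeholder.tfrecord"), "w").close()
demo_problems = missing_or_empty(demo_base)
```

In a real run, call missing_or_empty(".") from the directory in which you ran the download commands. An empty list means that both directories are present and contain files.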

Step 2: Start a training task in the current directory

Single-machine mode
```
import easy_vision
easy_vision.train_and_evaluate(easy_vision.RFCN_SAMPLE_CONFIG)
```
In the training process, the model is evaluated every 5,000 rounds of training.
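To get a feel for how often evaluation runs, the following sketch lists the evaluation points for a given training length. It rests on two assumptions that this topic does not confirm: one round corresponds to one training step, and evaluation is triggered exactly at each multiple of the interval. The total of 75,000 steps matches the global_step value in the sample output in Step 4.

```python
def evaluation_steps(total_steps, eval_interval=5000):
    """List the training steps at which an evaluation would be triggered."""
    return list(range(eval_interval, total_steps + 1, eval_interval))


# With 75,000 training steps, evaluation would run at 5000, 10000, ..., 75000.
steps = evaluation_steps(75000)
evaluation_count = len(steps)
```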

Multi-machine mode

You can use multiple servers to train the model. Make sure that each server has at least two GPUs. In multi-machine mode, you must start the following child processes:

ps: the parameter server.
master: the master node that writes summaries, saves checkpoints, and periodically evaluates the model.
worker: the worker node that processes specific data.

Run the following code to start a training task:

```
# -*- encoding:utf-8 -*-
import json
import logging
import os
import subprocess
import time

import easy_vision

# Show INFO-level messages; the default logging level hides them.
logging.basicConfig(level=logging.INFO)

# Training configuration for distributed settings.
config = easy_vision.RFCN_DISTRIBUTE_SAMPLE_CONFIG

# The configuration of the cluster.
TF_CONFIG = {'cluster': {
               'ps': ['localhost:12921'],
               'master': ['localhost:12922'],
               'worker': ['localhost:12923']
             }
            }

def job(task, gpu):
  task_name = task['type']
  # Redirect the Python log and TensorFlow log of the child process to
  # logs/master.log, logs/worker.log, or logs/ps.log.
  log_file_name = "logs/%s.log" % task_name

  # The child process inherits the environment variables set here.
  TF_CONFIG['task'] = task
  os.environ['TF_CONFIG'] = json.dumps(TF_CONFIG)
  os.environ['CUDA_VISIBLE_DEVICES'] = gpu
  train_cmd = 'python -m easy_vision.python.train_eval --pipeline_config_path %s' % config
  logging.info('%s > %s 2>&1 ' % (train_cmd, log_file_name))
  with open(log_file_name, 'w') as lfile:
    return subprocess.Popen(train_cmd.split(' '), stdout=lfile, stderr=subprocess.STDOUT)


if __name__ == '__main__':
  # Create the log directory if it does not exist.
  if not os.path.isdir('logs'):
    os.makedirs('logs')

  procs = {}
  # Start the ps job on CPU.
  task = {'type':'ps', 'index':0}
  procs['ps'] = job(task, '')
  # Start the master job on GPU 0.
  task = {'type':'master', 'index':0}
  procs['master'] = job(task, '0')
  # Start the worker job on GPU 1.
  task = {'type':'worker', 'index':0}
  procs['worker'] = job(task, '1')

  # Number of non-ps tasks (master and worker) that must finish.
  num_worker = 2
  for k, proc in procs.items():
    logging.info('%s pid: %d' %(k, proc.pid))

  task_failed = None
  task_finish_cnt = 0
  task_has_finished = {k:False for k in procs.keys()}
  while True:
    for k, proc in procs.items():
      if proc.poll() is None:
        # The process is still running. Stop it if another task has failed.
        if task_failed is not None:
          logging.error('task %s failed, %s quit' % (task_failed, k))
          proc.terminate()
          if k != 'ps':
            task_has_finished[k] = True
            task_finish_cnt += 1
          logging.info('task_finish_cnt %d' % task_finish_cnt)
      else:
        if not task_has_finished[k]:
          # The process quit by itself.
          if k != 'ps':
            task_finish_cnt += 1
            task_has_finished[k] = True
          logging.info('task_finish_cnt %d' % task_finish_cnt)
          if proc.returncode != 0:
            logging.error('%s failed' %k)
            task_failed = k
          else:
            logging.info('%s run successfully' % k)

    if task_finish_cnt >= num_worker:
      break
    time.sleep(1)

  # Stop the ps process if it is still running after the other tasks finish.
  if procs['ps'].poll() is None:
    procs['ps'].terminate()
```
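Each child process learns its role from the TF_CONFIG environment variable that the launcher sets before the process starts. The following sketch shows how that JSON value can be decoded to find the role and the address of a task. It only illustrates the data layout used above and is not EasyVision's own parsing logic; the helper name describe_task is hypothetical.

```python
import json
import os


def describe_task(environ=os.environ):
    """Return (task_type, task_index, address) parsed from a TF_CONFIG value."""
    tf_config = json.loads(environ.get("TF_CONFIG", "{}"))
    task = tf_config.get("task", {})
    task_type = task.get("type")
    task_index = task.get("index", 0)
    addresses = tf_config.get("cluster", {}).get(task_type, [])
    address = addresses[task_index] if task_index < len(addresses) else None
    return task_type, task_index, address


# The same cluster layout as the launcher script, seen from the worker process.
example_env = {"TF_CONFIG": json.dumps({
    "cluster": {
        "ps": ["localhost:12921"],
        "master": ["localhost:12922"],
        "worker": ["localhost:12923"],
    },
    "task": {"type": "worker", "index": 0},
})}
worker_info = describe_task(example_env)
```

Because all three tasks share one cluster definition and differ only in the task entry, the launcher can reuse the same dictionary and overwrite the task key before it starts each process.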

Step 3: Use TensorBoard to monitor the training task

The checkpoints and event files of the model are saved in the directory in which pascal_resnet50_rfcn_model resides. Run the following command to obtain the URL of TensorBoard, and then open the URL in a browser to view the loss and mean average precision (mAP) of the training task.

Important

You must run the command in Linux. Before you run the command, switch to the directory in which pascal_resnet50_rfcn_model resides, or replace the path that follows --logdir with the actual path of pascal_resnet50_rfcn_model. Otherwise, the command fails.

```
tensorboard --port 6006 --logdir pascal_resnet50_rfcn_model [ --host 0.0.0.0 ]
```
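The square brackets in the command mark --host as optional. The following sketch builds the same command as an argument list and converts the log directory to an absolute path, which avoids the dependency on the current working directory. The helper name tensorboard_command is hypothetical, and the sketch only assembles the arguments without starting TensorBoard.

```python
import os


def tensorboard_command(logdir, port=6006, host=None):
    """Build the TensorBoard argument list; --host is added only when given."""
    args = ["tensorboard", "--port", str(port), "--logdir", os.path.abspath(logdir)]
    if host is not None:
        args += ["--host", host]
    return args


# Default form, and the form that listens on all network interfaces.
local_cmd = tensorboard_command("pascal_resnet50_rfcn_model")
remote_cmd = tensorboard_command("pascal_resnet50_rfcn_model", host="0.0.0.0")
```

The resulting list can be passed to subprocess.Popen if you want to start TensorBoard from Python.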

View the following information in TensorBoard:

Training loss metrics:
- loss: the total loss of the training task.
- loss/loss/rcnn_cls: the classification loss.
- loss/loss/rcnn_reg: the regression loss.
- loss/loss/regularization_loss: the regularization loss.
- loss/loss/rpn_cls: the classification loss of region proposal network (RPN).
- loss/loss/rpn_reg: the regression loss of RPN.

Test mAP: PascalBoxes07 and PascalBoxes are used as metrics to calculate the test mAP as shown in the preceding figure. PascalBoxes07 is commonly used in related studies.

Step 4: Test and evaluate the model

After the training task is completed, you can test and evaluate the trained model.

Run the following command to install the easy_vision package:

```
pip install https://pai-vision-data-hz.oss-accelerate.aliyuncs.com/release/easy_vision-1.12.4.1-py2.py3-none-any.whl
```

Use other datasets to test the model. Then, check the detection result of each image.
```
import easy_vision
test_filelist = 'path/to/filelist.txt' # each line is an image file path
detect_results = easy_vision.predict(easy_vision.RFCN_SAMPLE_CONFIG, test_filelist=test_filelist)
```
The filelist.txt file lists the local paths of the images to test, one image file path per line. The detection result of each image is returned in detect_results in the [detection_boxes, box_probability, box_class] format. detection_boxes indicates the location of the detected object, box_class indicates the category of the object, and box_probability indicates the confidence level of the detection result.
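A common next step is to keep only confident detections. The following sketch filters one image result by probability. It assumes that each result is a list of three parallel sequences in the documented order [detection_boxes, box_probability, box_class]; the exact container types and the box coordinate order are not specified here, so boxes are passed through unchanged. The helper name filter_detections and the sample values are made up for illustration.

```python
def filter_detections(image_result, min_probability=0.5):
    """Keep the detections whose probability reaches the threshold."""
    detection_boxes, box_probability, box_class = image_result
    kept = []
    for box, probability, label in zip(detection_boxes, box_probability, box_class):
        if probability >= min_probability:
            kept.append({"box": box, "probability": probability, "class": label})
    return kept


# Made-up result for one image: two boxes, one above the threshold.
example_result = [
    [[10, 20, 110, 220], [5, 5, 50, 50]],  # detection_boxes
    [0.92, 0.31],                          # box_probability
    ["aeroplane", "bicycle"],              # box_class
]
confident = filter_detections(example_result)
```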

Evaluate the trained model.

```
import easy_vision
eval_metrics = easy_vision.evaluate(easy_vision.RFCN_SAMPLE_CONFIG)
```

eval_metrics contains the evaluation metrics, including PascalBoxes07, PascalBoxes, global_step, and the following loss metrics: loss, loss/loss/rcnn_cls, loss/loss/rcnn_reg, loss/loss/rpn_cls, loss/loss/rpn_reg, and loss/loss/total_loss. The following examples show the metrics.

PascalBoxes07 Metric

PascalBoxes07_PerformanceByCategory/AP@0.5IOU/aeroplane = 0.74028647
PascalBoxes07_PerformanceByCategory/AP@0.5IOU/bicycle = 0.77216494
......
PascalBoxes07_PerformanceByCategory/AP@0.5IOU/train = 0.771075
PascalBoxes07_PerformanceByCategory/AP@0.5IOU/tvmonitor = 0.70221454
PascalBoxes07_Precision/mAP@0.5IOU = 0.6975172

PascalBoxes Metric

PascalBoxes_PerformanceByCategory/AP@0.5IOU/aeroplane = 0.7697732
PascalBoxes_PerformanceByCategory/AP@0.5IOU/bicycle = 0.80088705
......
PascalBoxes_PerformanceByCategory/AP@0.5IOU/train = 0.8002225
PascalBoxes_PerformanceByCategory/AP@0.5IOU/tvmonitor = 0.72775906
PascalBoxes_Precision/mAP@0.5IOU = 0.7182514
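If you collect these metrics from logs, the printed lines can be turned back into a dictionary for comparison across runs. The following sketch parses lines in the "name = value" form shown above and groups the per-category AP values. It works on the printed text only and makes no assumption about the type that easy_vision.evaluate returns; the helper names are hypothetical, and the elided categories stay elided.

```python
def parse_metrics(lines):
    """Parse 'name = value' lines into a dict, skipping any other lines."""
    metrics = {}
    for line in lines:
        name, separator, value = line.partition(" = ")
        if separator:  # lines such as the '......' elision have no separator
            metrics[name.strip()] = float(value)
    return metrics


def per_category_ap(metrics, prefix):
    """Return {category: AP} for one metric family, such as PascalBoxes07."""
    marker = prefix + "_PerformanceByCategory/AP@0.5IOU/"
    return {name[len(marker):]: value
            for name, value in metrics.items() if name.startswith(marker)}


sample_lines = [
    "PascalBoxes07_PerformanceByCategory/AP@0.5IOU/aeroplane = 0.74028647",
    "PascalBoxes07_PerformanceByCategory/AP@0.5IOU/bicycle = 0.77216494",
    "......",
    "PascalBoxes07_PerformanceByCategory/AP@0.5IOU/train = 0.771075",
    "PascalBoxes07_PerformanceByCategory/AP@0.5IOU/tvmonitor = 0.70221454",
    "PascalBoxes07_Precision/mAP@0.5IOU = 0.6975172",
]
parsed = parse_metrics(sample_lines)
ap_by_category = per_category_ap(parsed, "PascalBoxes07")
```

Sorting ap_by_category by value is a quick way to find the categories on which the model performs worst.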

global_step and loss

global_step = 75000
loss = 0.51076376
loss/loss/rcnn_cls = 0.23392382
loss/loss/rcnn_reg = 0.12589474
loss/loss/rpn_cls = 0.13748208
loss/loss/rpn_reg = 0.013463326
loss/loss/total_loss = 0.51076376
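In this sample, the four component losses add up to the reported total loss within rounding. The following sketch performs that arithmetic on the values shown above. It is an observation about this sample only, not a statement about how EasyVision defines the total loss; note that the sample output lists no regularization term.

```python
# Values copied from the sample output above.
sample = {
    "loss": 0.51076376,
    "loss/loss/rcnn_cls": 0.23392382,
    "loss/loss/rcnn_reg": 0.12589474,
    "loss/loss/rpn_cls": 0.13748208,
    "loss/loss/rpn_reg": 0.013463326,
    "loss/loss/total_loss": 0.51076376,
}

component_names = [
    "loss/loss/rcnn_cls",
    "loss/loss/rcnn_reg",
    "loss/loss/rpn_cls",
    "loss/loss/rpn_reg",
]
component_sum = sum(sample[name] for name in component_names)

# Gap between the summed components and the reported total loss.
difference = abs(component_sum - sample["loss/loss/total_loss"])
```

A check of this kind shows which component dominates: here the RCNN classification loss accounts for almost half of the total.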

Step 5: Export the model

Run the following code to export the model as a SavedModel file:

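```
import easy_vision
# export_dir, pipeline_config_path, and checkpoint_path are placeholders.
# Set them to your own paths before you run the code.
easy_vision.export(export_dir, pipeline_config_path, checkpoint_path)
```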

After you run the preceding code, a model directory is created in the export_dir directory. The name of the model directory contains the UNIX timestamp that indicates the time when the directory is created. All checkpoints of the model are exported to a SavedModel file in the model directory.
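Because every export creates a new timestamped directory, later steps usually need the most recent one. The following sketch selects it under the assumption that the directory name consists only of the UNIX timestamp digits. The source states only that the name contains the timestamp, so adjust the filter if your directory names carry extra text. The helper name latest_export and the sample directory names are hypothetical.

```python
import os
import tempfile


def latest_export(export_dir):
    """Return the path of the numerically largest all-digit subdirectory."""
    candidates = [
        name for name in os.listdir(export_dir)
        if name.isdigit() and os.path.isdir(os.path.join(export_dir, name))
    ]
    if not candidates:
        return None
    return os.path.join(export_dir, max(candidates, key=int))


# Demonstration with made-up timestamps in a temporary directory.
demo_export_dir = tempfile.mkdtemp()
for name in ("1546300800", "1577836800", "notes"):
    os.makedirs(os.path.join(demo_export_dir, name))
newest = latest_export(demo_export_dir)
```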

Step 6: Evaluate the SavedModel file

Run the following code to evaluate the exported SavedModel file. All metrics of the model are contained in the evaluation result file and logs.

```
from easy_vision.python.main import predictor_evaluate
predictor_evaluate(predictor_eval_config)
```

In the preceding code, predictor_eval_config indicates the .proto file that is used for the evaluation. For more information, see Protocol Documentation. You can also use the following files for evaluation:

detector_eval.config for object detection
text_detector_eval.config for text detection
text_recognizer_eval.config for text recognition
text_spotter_eval.config for end-to-end text recognition

Step 7: Deploy the model as a service

Save the SavedModel file in Object Storage Service (OSS) and use the file to deploy a service in Elastic Algorithm Service (EAS). For more information, see Create a service.
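The upload to OSS can reuse the ossutil cp -r form from Step 1. The following sketch only assembles the command as an argument list and does not run it. The bucket name, the remote prefix, and the local export path are hypothetical placeholders, as is the helper name oss_upload_command.

```python
def oss_upload_command(local_dir, bucket, remote_prefix, ossutil="ossutil64"):
    """Build an 'ossutil64 cp -r' argument list that uploads a local directory."""
    destination = "oss://%s/%s/" % (bucket, remote_prefix.strip("/"))
    return [ossutil, "cp", "-r", local_dir, destination]


# Hypothetical paths: replace them with your own export directory and bucket.
upload_cmd = oss_upload_command("export/1577836800", "my-example-bucket", "models/rfcn")
```

Pass the list to subprocess.check_call to run the upload, and then reference the resulting OSS path when you create the service in EAS.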
