
Introduction to Built-in Algorithms

Updated at: Aug 12, 2019 GMT+08:00

ModelArts integrates built-in algorithms, based on common AI engines in the industry, for common use cases. You can select these algorithms directly when creating training jobs, without having to develop models yourself.

The built-in algorithms of ModelArts use the MXNet and TensorFlow engines and are mainly used for object detection (class and location), image classification, and semantic image segmentation.

Viewing Built-in Algorithms

In the left navigation pane of the ModelArts management console, click Training Jobs. On the displayed page, click Built-in Algos. In the built-in algorithm list, click the expand icon next to an algorithm name to view its details.

You can click Create Training Job in the Operation column of an algorithm to quickly create a training job that uses this algorithm as its Algorithm Source.

Figure 1 Built-in algorithm list

The available built-in algorithms are as follows:

yolo_v3

Table 1 Algorithm description

  • Name: yolo_v3
  • Usage: Object class and location
  • Engine Type: MXNet, MXNet-1.2.1-python2.7
  • Precision: 81.7% (mAP)
    mAP is a metric that measures the effect of an object detection algorithm. For each object class, precision (Precision) and recall (Recall) are calculated at multiple thresholds to obtain a P-R curve. The area under the P-R curve is the average precision (AP) of that class, and mAP is the mean of the AP values across all classes. A minimal calculation sketch follows this table.
  • Training Dataset: PASCAL VOC2007, detection of 20 classes of objects
  • Data Format: shape: [H>=224, W>=224, C>=1]; type: int8
  • Running Parameter: lr=0.0001 ; mom=0.9 ; wd=0.0005
    For more available running parameters, see Table 2.
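
The following is a minimal sketch of how AP and mAP can be computed from per-class precision-recall points (an illustration only, not the evaluation code used by the built-in algorithm; it assumes detections have already been matched against ground truth at a fixed IoU threshold):

    import numpy as np

    def average_precision(recall, precision):
        """Area under a P-R curve, given points sorted by increasing recall."""
        # Pad the curve so it starts at recall 0 and ends at recall 1.
        r = np.concatenate(([0.0], recall, [1.0]))
        p = np.concatenate(([0.0], precision, [0.0]))
        # Make precision monotonically decreasing (standard VOC-style smoothing).
        for i in range(len(p) - 2, -1, -1):
            p[i] = max(p[i], p[i + 1])
        # Sum the rectangle areas where recall changes.
        idx = np.where(r[1:] != r[:-1])[0]
        return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

    def mean_average_precision(per_class_curves):
        """mAP is the mean of the per-class AP values.
        per_class_curves: {class_name: (recall_array, precision_array)}"""
        aps = [average_precision(r, p) for r, p in per_class_curves.values()]
        return sum(aps) / len(aps)

    # Hypothetical recall/precision points for two classes:
    curves = {
        "car":    (np.array([0.2, 0.5, 0.8]), np.array([1.0, 0.8, 0.6])),
        "person": (np.array([0.3, 0.6, 0.9]), np.array([0.9, 0.7, 0.5])),
    }
    print(mean_average_precision(curves))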

Table 2 Running parameters

  • lr: Learning rate. Default value: 0.0001
  • mom: Momentum of the training network. Default value: 0.9
  • wd: L2 weight decay coefficient. Default value: 0.0005
  • num_classes: Total number of image classes in training. You do not need to add 1 here. Default value: none
  • split_spec: Split ratio of the training set and validation set. Default value: 0.8
  • batch_size: Number of training images used for each parameter update. Default value: 4
  • eval_frequence: Frequency for validating the model. By default, validation is performed once every epoch. Default value: 1
  • num_epoch: Number of training epochs. Default value: 10
  • num_examples: Total number of images used for training. For example, if there are 1,000 images in total, 800 of them are used for training under the 0.8 split ratio. Default value: 16551
  • disp_batches: The loss and training speed of the model are displayed every N batches. Default value: 20
  • warm_up_epochs: Number of epochs used by the warm-up strategy to reach the target learning rate. Default value: 0
  • lr_steps: Epochs at which the learning rate decays in the multi-factor strategy. By default, the learning rate decays to 0.1 times its previous value at the 10th and 15th epochs. A sketch of this schedule (together with warm-up) follows this table. Default value: 10,15
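
To make warm_up_epochs and lr_steps concrete, the following is a minimal sketch of a linear warm-up followed by multi-factor decay, based only on the descriptions above (an assumption for illustration, not the algorithm's actual implementation):

    def learning_rate(epoch, base_lr=0.0001, warm_up_epochs=0, lr_steps=(10, 15), factor=0.1):
        """Learning rate at a given epoch under linear warm-up plus multi-factor decay."""
        if warm_up_epochs > 0 and epoch < warm_up_epochs:
            # Linearly ramp up to the target (base) learning rate.
            return base_lr * (epoch + 1) / warm_up_epochs
        # Multiply by `factor` (0.1) at each epoch listed in lr_steps.
        decays = sum(1 for step in lr_steps if epoch >= step)
        return base_lr * (factor ** decays)

    # With the defaults lr=0.0001 and lr_steps=10,15:
    for epoch in (0, 9, 10, 14, 15):
        # 0.0001 until epoch 9, 1e-05 from epoch 10, 1e-06 from epoch 15
        print(epoch, learning_rate(epoch))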

retinanet_resnet_v1_50

Table 3 Algorithm description

  • Name: retinanet_resnet_v1_50
  • Usage: Object class and location
  • Engine Type: TensorFlow, TF-1.8.0-python2.7
  • Precision: 83.15% (mAP)
    mAP is a metric that measures the effect of an object detection algorithm. For each object class, precision (Precision) and recall (Recall) are calculated at multiple thresholds to obtain a P-R curve. The area under the P-R curve is the average precision (AP) of that class, and mAP is the mean of the AP values across all classes.
  • Training Dataset: ImageNet-1k; [H, W, C=3]
  • Data Format: shape: [H, W, C>=1]; type: int8
  • Running Parameter: By default, no running parameters are set for this algorithm. For more available running parameters, see Table 4.

Table 4 Running parameters

  • split_spec: Split ratio of the training set and validation set. Default value: train:0.8,eval:0.2
  • num_gpus: Number of GPUs used. Default value: 1
  • batch_size: Number of images in each training iteration (standalone). Default value: 32
  • eval_batch_size: Number of images read in each step during validation (standalone). Default value: 32
  • learning_rate_strategy: Learning rate strategy. For example, 10:0.001,20:0.0001 indicates that the learning rate for epochs 0 to 10 is 0.001 and the learning rate for epochs 10 to 20 is 0.0001. A parsing sketch follows this table. Default value: 0.002
  • evaluate_every_n_epochs: A validation is performed after every N training epochs. Default value: 1
  • save_interval_secs: Interval for saving the model, in seconds. Default value: 2000000
  • max_epochs: Maximum number of training epochs. Default value: 100
  • log_every_n_steps: Logs are printed every N steps. By default, logs are printed every 10 steps. Default value: 10
  • save_summaries_steps: The summary is saved every N steps. By default, the summary is saved every 5 steps. Default value: 5
  • weight_decay: L2 regularization weight decay. Default value: 0.00004
  • optimizer: Optimizer. The options are dymomentumw, sgd, adam, and momentum. Default value: momentum
  • momentum: Optimizer parameter momentum. Default value: 0.9
  • patience: Number of epochs to wait for a precision improvement before the learning rate decays. Default value: 8
  • decay_patience: Training stops after the learning rate has decayed N times. Default value: 1
  • decay_min_delta: Minimum precision improvement between learning rate decays. The precision is considered improved only if it increases by more than this value. Default value: 0.001
  • min_delta: Minimum change in the monitored metric that counts as an improvement. If the change is smaller than this value, training is considered not to have improved. Default value: 0.001
  • rcnn_iou_threshold: IoU threshold used for calculating mAP when SSD or Faster R-CNN is used. Default value: 0.5
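
To make the learning_rate_strategy format concrete, the following is a minimal parsing sketch that interprets the epoch:learning-rate pairs as described above (an illustration only, not the algorithm's own parser):

    def parse_lr_strategy(strategy):
        """Parse a string such as '10:0.001,20:0.0001' into sorted (end_epoch, lr) pairs."""
        pairs = []
        for item in strategy.split(","):
            end_epoch, lr = item.split(":")
            pairs.append((int(end_epoch), float(lr)))
        return sorted(pairs)

    def lr_for_epoch(pairs, epoch):
        """Return the learning rate of the first segment whose end epoch is beyond the given epoch."""
        for end_epoch, lr in pairs:
            if epoch < end_epoch:
                return lr
        return pairs[-1][1]  # keep the final learning rate after the last boundary

    pairs = parse_lr_strategy("10:0.001,20:0.0001")
    print(lr_for_epoch(pairs, 5))   # 0.001  (epochs 0 to 10)
    print(lr_for_epoch(pairs, 15))  # 0.0001 (epochs 10 to 20)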

inception_v3

Table 5 Algorithm description

  • Name: inception_v3
  • Usage: Image classification
  • Engine Type: TensorFlow, TF-1.8.0-python2.7
  • Precision: 78.00% (top1), 93.90% (top5)
    • top1 indicates that a prediction is considered correct only when the class with the highest predicted probability is the correct class.
    • top5 indicates that a prediction is considered correct when the correct class is among the five classes with the highest predicted probabilities.
    A minimal calculation sketch follows this table.
  • Training Dataset: ImageNet, classification of 1,000 image classes
  • Data Format: shape: [H, W, C>=1]; type: int8
  • Running Parameter: batch_size=32 ; split_spec=train:0.8,eval:0.2
    For more available running parameters, see Table 6.
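
The following is a minimal illustration of how top1 and top5 accuracy can be computed from classification scores (an illustration only, not the built-in evaluation code):

    import numpy as np

    def topk_accuracy(scores, labels, k):
        """scores: (N, num_classes) prediction scores; labels: (N,) correct class indices."""
        # Indices of the k highest-scoring classes for each sample.
        topk = np.argsort(scores, axis=1)[:, -k:]
        hits = [labels[i] in topk[i] for i in range(len(labels))]
        return sum(hits) / float(len(labels))

    # Hypothetical scores for 3 samples over 6 classes:
    scores = np.array([[0.10, 0.50, 0.20, 0.10, 0.05, 0.05],
                       [0.30, 0.10, 0.40, 0.10, 0.05, 0.05],
                       [0.05, 0.15, 0.10, 0.10, 0.20, 0.40]])
    labels = np.array([1, 0, 5])
    print(topk_accuracy(scores, labels, k=1))  # top1 accuracy
    print(topk_accuracy(scores, labels, k=5))  # top5 accuracy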

Table 6 Running parameters

  • split_spec: Split ratio of the training set and validation set. Default value: train:0.8,eval:0.2
  • num_gpus: Number of GPUs used. Default value: 1
  • batch_size: Number of images in each training iteration (standalone). Default value: 32
  • eval_batch_size: Number of images read in each step during validation (standalone). Default value: 32
  • learning_rate_strategy: Learning rate strategy. For example, 10:0.001,20:0.0001 indicates that the learning rate for epochs 0 to 10 is 0.001 and the learning rate for epochs 10 to 20 is 0.0001. Default value: 0.002
  • evaluate_every_n_epochs: A validation is performed after every N training epochs. Default value: 1
  • save_interval_secs: Interval for saving the model, in seconds. Default value: 2000000
  • max_epoches: Maximum number of training epochs. Default value: 100
  • log_every_n_steps: Logs are printed every N steps. By default, logs are printed every 10 steps. Default value: 10
  • save_summaries_steps: The summary is saved every N steps. By default, the summary is saved every 5 steps. Default value: 5
  • weight_decay: L2 regularization weight decay. Default value: 0.00004
  • optimizer: Optimizer. The options are dymomentumw, sgd, adam, and momentum. Default value: momentum
  • momentum: Optimizer parameter momentum. Default value: 0.9
  • patience: Number of epochs to wait for a precision improvement before the learning rate decays. A rough sketch of this decay-on-plateau behavior follows this table. Default value: 8
  • decay_patience: Training stops after the learning rate has decayed N times. Default value: 1
  • decay_min_delta: Minimum precision improvement between learning rate decays. The precision is considered improved only if it increases by more than this value. Default value: 0.001
  • min_delta: Minimum change in the monitored metric that counts as an improvement. If the change is smaller than this value, training is considered not to have improved. Default value: 0.001
  • image_size: Size of the input image. If this parameter is set to None, the model's default image size is used. Default value: None
  • lr_warmup_strategy: Warm-up strategy (linear or exponential). Default value: linear
  • num_readers: Number of threads for reading data. Default value: 64
  • fp16: Whether to use FP16 for training. Default value: FALSE
  • max_lr: Maximum learning rate for the dymomentum and dymomentumw optimizers, or when use_lr_schedule is used. Default value: 6.4
  • min_lr: Minimum learning rate for the dymomentum and dymomentumw optimizers, or when use_lr_schedule is used. Default value: 0.005
  • warmup: Proportion of warm-up steps in the total number of training steps. This parameter is valid when use_lr_schedule is lcd or poly. Default value: 0.1
  • cooldown: Minimum learning rate in the warm-up phase. Default value: 0.05
  • max_mom: Maximum momentum. This parameter is valid for dynamic momentum. Default value: 0.98
  • min_mom: Minimum momentum. This parameter is valid for dynamic momentum. Default value: 0.85
  • use_lars: Whether to use LARS. Default value: FALSE
  • use_nesterov: Whether to use Nesterov momentum. Default value: TRUE
  • preprocess_threads: Number of threads for image preprocessing. Default value: 12
  • use_lr_schedule: Learning rate adjustment policy ('lcd': linear_cosine_decay, 'poly': polynomial_decay). Default value: None
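
The patience, decay_patience, and min_delta parameters describe a decay-on-plateau behavior. The following rough sketch shows one way such a loop can be written; it is an assumption pieced together from the parameter descriptions above, not the built-in algorithm's actual logic (the evaluate callback is hypothetical):

    def train_with_plateau_decay(evaluate, lr=0.002, patience=8, decay_patience=1,
                                 min_delta=0.001, max_epochs=100, factor=0.1):
        """evaluate(epoch, lr) -> validation precision for that epoch (hypothetical callback)."""
        best = float("-inf")
        epochs_without_improvement = 0
        decays = 0
        for epoch in range(max_epochs):
            precision = evaluate(epoch, lr)
            if precision - best > min_delta:      # improvement must exceed min_delta
                best = precision
                epochs_without_improvement = 0
            else:
                epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                lr *= factor                      # decay the learning rate
                decays += 1
                epochs_without_improvement = 0
                if decays >= decay_patience:      # stop after the learning rate decays N times
                    break
        return best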

darknet_53

Table 7 Algorithm description

  • Name: darknet_53
  • Usage: Image classification
  • Engine Type: MXNet, MXNet-1.2.1-python2.7
  • Precision: 78.56% (top1), 94.43% (top5)
    • top1 indicates that a prediction is considered correct only when the class with the highest predicted probability is the correct class.
    • top5 indicates that a prediction is considered correct when the correct class is among the five classes with the highest predicted probabilities.
  • Training Dataset: ImageNet, classification of 1,000 image classes
  • Data Format: shape: [H>=224, W>=224, C>=1]; type: int8
  • Running Parameter: split_spec=0.8 ; batch_size=4
    For more available running parameters, see Table 8.

Table 8 Running parameters

  • num_classes: Total number of image classes in training. Default value: none
  • num_epoch: Number of training epochs. Default value: 10
  • batch_size: Number of input images used for each parameter update. Default value: 4
  • lr: Learning rate. Default value: 0.0001
  • image_shape: Shape of the input image. Default value: 3,224,224
  • split_spec: Split ratio of the training set and validation set. Default value: 0.8
  • save_frequency: Interval for saving the model; the model is saved every N epochs. Default value: 1

SegNet_VGG_BN_16

Table 9 Algorithm description

  • Name: SegNet_VGG_BN_16
  • Usage: Image semantic segmentation
  • Engine Type: MXNet, MXNet-1.2.1-python2.7
  • Precision: 89% (pixel acc)
    pixel acc is the ratio of correctly classified pixels to total pixels. A minimal calculation sketch follows this table.
  • Training Dataset: CamVid
  • Data Format: shape: [H=360, W=480, C==3]; type: int8
  • Running Parameter: deploy_on_terminal=False
    For more available running parameters, see Table 10.
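
The following is a minimal illustration of the pixel accuracy metric (an illustration only, not the built-in evaluation code):

    import numpy as np

    def pixel_accuracy(pred, label):
        """Ratio of correctly classified pixels to total pixels.
        pred and label are integer class maps of shape (H, W)."""
        return float(np.sum(pred == label)) / label.size

    # Hypothetical 2x3 prediction and ground-truth label maps:
    pred  = np.array([[0, 1, 1], [2, 2, 0]])
    label = np.array([[0, 1, 2], [2, 2, 0]])
    print(pixel_accuracy(pred, label))  # 5 of 6 pixels correct -> about 0.83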

Table 10 Running parameters

  • lr: Learning rate. Default value: 0.0001
  • mom: Momentum of the training network. Default value: 0.9
  • wd: L2 weight decay coefficient. Default value: 0.0005
  • num_classes: Total number of image classes in training. You do not need to add 1 here. Default value: 11
  • batch_size: Number of training images used for each parameter update. Default value: 8
  • num_epoch: Number of training epochs. Default value: 15
  • save_frequency: Interval for saving the model; the model is saved every N epochs. Default value: 1
  • num_examples: Total number of images used for training, that is, the number of files listed in train.txt. Default value: 2953
  • data_shape: Shape of the input image. Default value: 3,256,256
  • optimizer: Optimizer. The options are sgd and nag. Default value: sgd
  • lr_steps: Epochs at which the learning rate decays to 0.1 times its previous value in the multi-factor strategy. Default value: 7,12

ResNet_v2_50

Table 11 Algorithm description

  • Name: ResNet_v2_50
  • Usage: Image classification
  • Engine Type: MXNet, MXNet-1.2.1-python2.7
  • Precision: 75.55% (top1), 92.6% (top5)
    • top1 indicates that a prediction is considered correct only when the class with the highest predicted probability is the correct class.
    • top5 indicates that a prediction is considered correct when the correct class is among the five classes with the highest predicted probabilities.
  • Training Dataset: ImageNet, classification of 1,000 image classes
  • Data Format: shape: [H>=32, W>=32, C>=1]; type: int8
  • Running Parameter: split_spec=0.8 ; batch_size=4
    The available running parameters are the same as those for the darknet_53 algorithm. For details, see Table 8.

ResNet_v1_50

Table 12 Algorithm description

  • Name: ResNet_v1_50
  • Usage: Image classification
  • Engine Type: TensorFlow, TF-1.8.0-python2.7
  • Precision: 74.2% (top1), 91.7% (top5)
    • top1 indicates that a prediction is considered correct only when the class with the highest predicted probability is the correct class.
    • top5 indicates that a prediction is considered correct when the correct class is among the five classes with the highest predicted probabilities.
  • Training Dataset: ImageNet, classification of 1,000 image classes
  • Data Format: shape: [H>=600, W<=1024, C>=1]; type: int8
  • Running Parameter: batch_size=32 ; split_spec=train:0.8,eval:0.2
    The available running parameters are the same as those for the inception_v3 algorithm. For details, see Table 6.

Faster_RCNN_ResNet_v2_50

Table 13 Algorithm description

  • Name: Faster_RCNN_ResNet_v2_50
  • Usage: Object class and location
  • Engine Type: MXNet, MXNet-1.2.1-python2.7
  • Precision: 80.05% (mAP)
    mAP is a metric that measures the effect of an object detection algorithm. For each object class, precision (Precision) and recall (Recall) are calculated at multiple thresholds to obtain a P-R curve. The area under the P-R curve is the average precision (AP) of that class, and mAP is the mean of the AP values across all classes.
  • Training Dataset: PASCAL VOC2007, detection of 20 classes of objects
  • Data Format: shape: [H, W, C==3]; type: int8
  • Running Parameter: lr=0.0001 ; eval_frequence=1
    For more available running parameters, see Table 14.

Table 14 Running parameters

  • num_classes: Total number of image classes in training, plus 1 for the background class. For example, for PASCAL VOC2007 with 20 object classes, set this parameter to 21. Default value: none
  • eval_frequence: Frequency for validating the model. By default, validation is performed once every epoch. Default value: 1
  • lr: Learning rate. Default value: 0.0001
  • mom: Momentum of the training network. Default value: 0.9
  • wd: L2 weight decay coefficient. Default value: 0.0005
  • export_model: Whether to export the trained model in the format required for deploying it as an inference service. Default value: TRUE
  • split_spec: Split ratio of the training set and validation set. Default value: 0.8
  • optimizer: Optimizer. The options are sgd and nag. Default value: sgd

Faster_RCNN_ResNet_v1_50

Table 15 Algorithm description

  • Name: Faster_RCNN_ResNet_v1_50
  • Usage: Object class and location
  • Engine Type: TensorFlow, TF-1.8.0-python2.7
  • Precision: 73.6% (mAP)
    mAP is a metric that measures the effect of an object detection algorithm. For each object class, precision (Precision) and recall (Recall) are calculated at multiple thresholds to obtain a P-R curve. The area under the P-R curve is the average precision (AP) of that class, and mAP is the mean of the AP values across all classes.
  • Training Dataset: PASCAL VOC2007, detection of 20 classes of objects
  • Data Format: shape: [H>=600, W<=1024, C>=1]; type: int8
  • Running Parameter: batch_size=32 ; split_spec=train:0.8,eval:0.2
    The available running parameters are the same as those for the retinanet_resnet_v1_50 algorithm. For details, see Table 4.
