# renom_img.api.detection

class  Yolov1  ( class_map=None , cells=7 , bbox=2 , imsize=(224 , 224) , load_pretrained_weight=False , train_whole_network=False )

Yolov1 object detection algorithm.

 Parameters:
• class_map (list, dict) – List of class names.
• cells (int or tuple) – Number of grid cells per side of the image.
• bbox (int) – Number of boxes predicted per cell.
• imsize (int, tuple) – Image size.
• load_pretrained_weight (bool, str) – If True, pretrained weights will be downloaded to the current directory and loaded as the initial weight values. If a string is given, weight values will be loaded from the file with the given name.
• train_whole_network (bool) – Flag specifying whether to freeze or train the base layers of the model during training. If True, all layers of the model are trained. If False, the convolutional base is frozen during training.
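
The `cells` and `bbox` arguments determine the size of the prediction grid. As a rough sketch, assuming the output layout of the original YOLO paper (each cell predicts `bbox` boxes of 5 values plus one score per class), the number of output values per image can be computed as follows. The helper `yolov1_output_size` is hypothetical and not part of ReNomIMG, whose internal layout may differ.

```python
def yolov1_output_size(num_class, cells=7, bbox=2):
    """Number of values predicted per image under the YOLO paper's layout.

    Assumption: each grid cell predicts `bbox` boxes (x, y, w, h, confidence)
    plus `num_class` class scores. `cells` may be an int or a (rows, cols) tuple.
    """
    if isinstance(cells, int):
        rows, cols = cells, cells
    else:
        rows, cols = cells
    return rows * cols * (bbox * 5 + num_class)


# With the paper's settings (7x7 grid, 2 boxes, 20 classes):
print(yolov1_output_size(20))  # 7 * 7 * (2 * 5 + 20) = 1470
```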

Example

>>> from renom_img.api.detection.yolo_v1 import Yolov1
>>> from renom_img.api.utility.load import parse_xml_detection
>>>
>>> train_label_path_list = ...  # Provide list of training label paths
>>> annotation_list, class_map = parse_xml_detection(train_label_path_list)
>>>
>>> model = Yolov1(class_map, cells=7, bbox=2, imsize=(224,224), load_pretrained_weight=True, train_whole_network=True)


References

Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
You Only Look Once: Unified, Real-Time Object Detection

 build_data  ( )

This function returns a function which creates input data and target data specified for Yolov1.

 Returns: Function which creates input data and target data. (function)

Example

>>> builder = model.build_data()  # This will return function.
>>> x, y = builder(image_path_list, annotation_list)
>>> z = model(x)
>>> loss = model.loss(z, y)
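
To give an intuition for the target data that the builder produces, the sketch below assigns a ground-truth box to a grid cell in the way the YOLO paper describes. It assumes the box is given as a center point and size in pixels. `assign_cell` is a hypothetical helper, not the ReNomIMG implementation.

```python
def assign_cell(box, imsize=(224, 224), cells=7):
    """Return the (row, col) of the grid cell that contains the box center
    and the center offset inside that cell, each in [0, 1].

    Assumption: `box` is [center_x, center_y, width, height] in pixels and
    `imsize` is (width, height).
    """
    cx, cy, _, _ = box
    img_w, img_h = imsize
    # Position of the center measured in cell units.
    gx = cx / img_w * cells
    gy = cy / img_h * cells
    # Clamp so that a center on the right/bottom edge stays inside the grid.
    col = min(int(gx), cells - 1)
    row = min(int(gy), cells - 1)
    return (row, col), (gx - col, gy - row)


cell, offset = assign_cell([112, 56, 40, 30])
print(cell)    # (1, 3)
print(offset)  # (0.5, 0.75)
```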

 fit  ( train_img_path_list=None , train_annotation_list=None , valid_img_path_list=None , valid_annotation_list=None , epoch=136 , batch_size=64 , optimizer=None , augmentation=None , callback_end_epoch=None )

This function performs training with the given data and hyperparameters.

 Parameters:
• train_img_path_list (list) – List of image paths.
• train_annotation_list (list) – List of annotations.
• valid_img_path_list (list) – List of image paths for validation.
• valid_annotation_list (list) – List of annotations for validation.
• epoch (int) – Number of training epochs.
• batch_size (int) – Batch size.
• augmentation (Augmentation) – Augmentation object.
• callback_end_epoch (function) – Function called at the end of each epoch.
 Returns: Training loss list and validation loss list. (tuple)

Example

>>> train_img_path_list, train_annot_list = ... # Define train data
>>> valid_img_path_list, valid_annot_list = ... # Define validation data
>>> class_map = ... # Define class map
>>> model = Yolov1(class_map) # Specify any algorithm provided by ReNomIMG API here
>>> model.fit(
...     # Feeds image and annotation data
...     train_img_path_list,
...     train_annot_list,
...     valid_img_path_list,
...     valid_annot_list,
...     epoch=8,
...     batch_size=8)
>>>


The following arguments will be given to the function  callback_end_epoch  .

• epoch (int) - Current epoch number.
• model (Model) - Model object.
• avg_train_loss_list (list) - List of average train loss of each epoch.
• avg_valid_loss_list (list) - List of average valid loss of each epoch.
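
A minimal sketch of such a callback is shown here. It records and prints the latest losses, assuming the four arguments are passed positionally in the order listed and that the loss lists already contain an entry for the epoch that just finished. The names `log_epoch` and `history` are hypothetical.

```python
history = []  # (epoch, train_loss, valid_loss) tuples collected during training


def log_epoch(epoch, model, avg_train_loss_list, avg_valid_loss_list):
    """Record and print the most recent average losses."""
    train_loss = avg_train_loss_list[-1] if avg_train_loss_list else None
    valid_loss = avg_valid_loss_list[-1] if avg_valid_loss_list else None
    history.append((epoch, train_loss, valid_loss))
    print("epoch {}: train={} valid={}".format(epoch, train_loss, valid_loss))


# It would then be passed to fit, for example:
# model.fit(train_img_path_list, train_annot_list,
#           valid_img_path_list, valid_annot_list,
#           callback_end_epoch=log_epoch)
```
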
 forward  ( x )

Performs forward propagation. You can call this function using the  __call__  method.

 Parameters: x ( ndarray , Node ) – Input to Yolov1.
 Returns: Raw output of Yolov1. (Node)

Example

>>> import numpy as np
>>> x = np.random.rand(1, 3, 224, 224)
>>>
>>> class_map = ["dog", "cat"]
>>> model = Yolov1(class_map)
>>>
>>> y = model.forward(x) # Forward propagation.
>>> y = model(x)  # Same as above result.

 get_bbox  ( z , score_threshold=0.3 , nms_threshold=0.4 )

Calculates the bounding box location, size and class information for model predictions.

 Parameters:
• z (ndarray) – Output array of neural network.
• score_threshold (float) – The threshold for the confidence score. Predicted boxes with a confidence score lower than the threshold are discarded. The default is 0.3.
• nms_threshold (float) – The threshold for non-maximum suppression. The default is 0.4.
 Returns: List of predicted bbox, score and class for each image. The format of the return value is shown below. Box coordinates and size are returned as ratios to the original image size, so the values of ‘box’ are in the range [0, 1]. (list)
# An example of a return value.
[
[ # Prediction for first image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
[ # Prediction for second image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
...
]
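
The two thresholds can be illustrated with a small, library-independent sketch of score filtering followed by greedy non-maximum suppression. It assumes boxes in the `[x, y, w, h]` ratio format with `(x, y)` as the box center, and that suppression is applied across all boxes regardless of class. The actual ReNomIMG implementation may differ in both respects, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two [center_x, center_y, w, h] boxes."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(preds, score_threshold=0.3, nms_threshold=0.4):
    """Drop low-score boxes, then greedily suppress overlapping ones."""
    candidates = [p for p in preds if p['score'] >= score_threshold]
    candidates.sort(key=lambda p: p['score'], reverse=True)
    kept = []
    for p in candidates:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(p['box'], k['box']) <= nms_threshold for k in kept):
            kept.append(p)
    return kept


preds = [
    {'box': [0.50, 0.50, 0.20, 0.20], 'score': 0.9, 'class': 1, 'name': 'dog'},
    {'box': [0.51, 0.50, 0.20, 0.20], 'score': 0.6, 'class': 1, 'name': 'dog'},
    {'box': [0.10, 0.10, 0.10, 0.10], 'score': 0.2, 'class': 0, 'name': 'cat'},
]
result = filter_boxes(preds)
print([p['score'] for p in result])  # [0.9]
```

The second box is suppressed because it overlaps the first almost completely, and the third is discarded by the score threshold.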


Example

>>> z = model(x)
>>> model.get_bbox(z)
[[{'box': [0.21, 0.44, 0.11, 0.32], 'score':0.823, 'class':1, 'name':'dog'}],
[{'box': [0.87, 0.38, 0.84, 0.22], 'score':0.423, 'class':0, 'name':'cat'}]]


Note

Box coordinates and size will be returned as ratios to the original image size. Therefore, the values of ‘box’ are in the range [0 ~ 1].
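
Because the boxes are ratios, they have to be scaled by the original image size before drawing. The sketch below does this, assuming `(x, y)` is the center of the box (worth verifying against your installed version). `to_pixel_corners` is a hypothetical helper.

```python
def to_pixel_corners(box, img_width, img_height):
    """Convert a ratio box [x, y, w, h] to integer pixel corners.

    Assumption: (x, y) is the box center. Returns (left, top, right, bottom).
    """
    x, y, w, h = box
    left = int(round((x - w / 2) * img_width))
    top = int(round((y - h / 2) * img_height))
    right = int(round((x + w / 2) * img_width))
    bottom = int(round((y + h / 2) * img_height))
    # Clip to the image in case a prediction extends past the border.
    left, right = max(0, left), min(img_width, right)
    top, bottom = max(0, top), min(img_height, bottom)
    return left, top, right, bottom


print(to_pixel_corners([0.5, 0.5, 0.25, 0.5], 640, 480))  # (240, 120, 400, 360)
```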

 load  ( filename )

Load saved weights to model.

 Parameters: filename ( str ) – File name of saved model.

Example

>>> import renom as rm
>>> model = rm.Dense(2)
>>> model.load("model.hd5")

 loss  ( x , y )

Loss function specified for Yolov1.

 Parameters:
• x (Node, ndarray) – Output data of neural network.
• y (Node, ndarray) – Target data.
 Returns: Loss between x and y. (Node)

Example

>>> z = model(x)
>>> model.loss(z, y)

 predict  ( img_list , batch_size=1 , score_threshold=0.3 , nms_threshold=0.4 )

This method accepts an array of image paths, list of image paths, or a path to an image.

 Parameters:
• img_list (string, list, ndarray) – Path to an image, list of paths, or ndarray.
• score_threshold (float) – The threshold for the confidence score. Predicted boxes with a confidence score lower than the threshold are discarded. The default is 0.3.
• nms_threshold (float) – The threshold for non-maximum suppression. The default is 0.4.
 Returns: List of predicted bbox, score and class of each image. The format of the return value is shown below. Box coordinates and size are returned as ratios to the original image size, so the values of ‘box’ are in the range [0, 1]. (list)
# An example of a return value.
[
[ # Prediction for first image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
[ # Prediction for second image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
...
]


Example

>>>
>>> model.predict(['img01.jpg', 'img02.jpg'])
[[{'box': [0.21, 0.44, 0.11, 0.32], 'score':0.823, 'class':1, 'name':'dog'}],
[{'box': [0.87, 0.38, 0.84, 0.22], 'score':0.423, 'class':0, 'name':'cat'}]]


Note

Box coordinates and size will be returned as ratios to the original image size. Therefore, the values of ‘box’ are in the range [0 ~ 1].
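
The nested list returned by `predict` has one inner list per image. As a small sketch of how it might be consumed, the following counts detections per class name for each image, using the sample output from the example as input. `count_by_name` is a hypothetical helper.

```python
from collections import Counter


def count_by_name(predictions):
    """Return one Counter per image mapping class name to detection count."""
    return [Counter(det['name'] for det in image_preds)
            for image_preds in predictions]


predictions = [
    [{'box': [0.21, 0.44, 0.11, 0.32], 'score': 0.823, 'class': 1, 'name': 'dog'}],
    [{'box': [0.87, 0.38, 0.84, 0.22], 'score': 0.423, 'class': 0, 'name': 'cat'}],
]
counts = count_by_name(predictions)
print(counts[0]['dog'], counts[1]['cat'])  # 1 1
```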

 preprocess  ( x )

Performs preprocessing for a given array.

 Parameters: x ( ndarray , Node ) – Image array for preprocessing.
 regularize  ( )

Adds a regularization term to the loss function.

Example

>>> import numpy as np
>>> x = np.random.rand(1, 3, 224, 224)  # Input image
>>> y = ...  # Ground-truth label
>>>
>>> class_map = ['cat', 'dog']
>>> model = Yolov1(class_map)
>>>
>>> z = model(x)  # Forward propagation
>>> loss = model.loss(z, y)  # Loss calculation
>>> reg_loss = loss + model.regularize()  # Add weight decay term.
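
The comment in the example describes the returned value as a weight decay term. As a rough illustration of what such a term commonly looks like (an L2 penalty on the weights), and not a description of ReNomIMG's internals, consider the sketch below. The function `l2_penalty` and its default `decay` value are hypothetical.

```python
def l2_penalty(weights, decay=0.0005):
    """Sum of squared weights scaled by `decay`.

    `weights` is a list of flat lists of floats, one per layer. The default
    `decay` value is a placeholder, not the value ReNomIMG uses.
    """
    return decay * sum(w * w for layer in weights for w in layer)


layers = [[1.0, -2.0], [3.0]]
print(l2_penalty(layers, decay=0.5))  # 0.5 * (1 + 4 + 9) = 7.0
```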

class  Yolov2  ( class_map=None , anchor=None , imsize=(320 , 320) , load_pretrained_weight=False , train_whole_network=False )

Yolov2 object detection algorithm.

 Parameters:
• class_map (list, dict) – List of class names.
• anchor (AnchorYolov2) – Anchors.
• imsize (tuple, list) – Image size(s). This can be either a single image size, e.g. (320, 320), or a list of image sizes, e.g. [(288, 288), (320, 320)]. If a list of image sizes is provided, the prediction method uses the last image size in the list.
• load_pretrained_weight (bool, str) – Argument specifying whether or not to load pretrained weight values. If True, pretrained weights will be downloaded to the current directory and loaded as the initial weight values. If a string is given, weight values will be loaded from the file with the given name.
• train_whole_network (bool) – Flag specifying whether to freeze or train the base layers of the model during training. If True, all layers of the model are trained. If False, the convolutional base is frozen during training.

Example

>>> from renom_img.api.detection.yolo_v2 import Yolov2, create_anchor
>>> from renom_img.api.utility.load import parse_xml_detection
>>>
>>> train_label_path_list = ...  # provide list of paths to training data
>>> annotation_list, class_map = parse_xml_detection(train_label_path_list)
>>> my_anchor = create_anchor(annotation_list)
>>>
>>> model = Yolov2(class_map, anchor=my_anchor, imsize=(320,320), load_pretrained_weight=True, train_whole_network=True)
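
`create_anchor` derives anchors from the annotation list. The YOLO9000 paper obtains anchors by k-means clustering of ground-truth box sizes with an IoU-based distance, and the sketch below illustrates that idea on plain `(w, h)` pairs. It is not the `create_anchor` implementation, and `kmeans_anchors` and its arguments are hypothetical.

```python
def wh_iou(a, b):
    """IoU of two boxes given only as (w, h), aligned at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(sizes, k, iterations=20):
    """Cluster (w, h) pairs into k anchors using 1 - IoU as the distance.

    Centroids start from the first k sizes so the result is deterministic;
    a real implementation would usually pick them at random.
    """
    centroids = [tuple(s) for s in sizes[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for s in sizes:
            # Highest IoU means smallest distance.
            best = max(range(k), key=lambda i: wh_iou(s, centroids[i]))
            clusters[best].append(s)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(m[0] for m in members) / len(members)
                h = sum(m[1] for m in members) / len(members)
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster as is
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return centroids


sizes = [(10, 10), (100, 100), (12, 12), (90, 110), (11, 9)]
anchors = kmeans_anchors(sizes, k=2)
print(anchors)  # two anchors: one for the small boxes, one for the large ones
```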


References

Joseph Redmon, Ali Farhadi
YOLO9000: Better, Faster, Stronger

Note

If you save this model using the ‘save’ method, anchor information (list of anchors and their base sizes) will be saved. Therefore, when you load your own saved model, you do not need to provide the ‘anchor’ and ‘anchor_size’ arguments.

 build_data  ( imsize_list=None )

This function returns a function which creates input data and target data specified for Yolov2.

 Returns: Function which creates input data and target data. (function)

Example

>>> builder = model.build_data()  # This will return function.
>>> x, buffer, y = builder(image_path_list, annotation_list)
>>> z = model(x)
>>> loss = model.loss(z, buffer, y)

 fit  ( train_img_path_list , train_annotation_list , valid_img_path_list=None , valid_annotation_list=None , epoch=160 , batch_size=16 , optimizer=None , imsize_list=None , augmentation=None , callback_end_epoch=None )

This function performs training with the given data and hyperparameters. Yolov2 is trained using multiple scale images. Therefore, this function requires a list of image sizes. If this is not provided, the model will be trained using a fixed image size.
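
Multi-scale training means that the input resolution changes during training. A minimal sketch of the idea follows, assuming a new size is drawn from `imsize_list` for each batch (the library may switch sizes on a different schedule). The helper `pick_imsize` and the concrete list of sizes are hypothetical.

```python
import random


def pick_imsize(imsize_list, rng=random):
    """Choose one (width, height) pair for the next batch."""
    return rng.choice(imsize_list)


# Hypothetical list of sizes. YOLOv2 downsamples by a factor of 32,
# so multiples of 32 are used here.
imsize_list = [(size, size) for size in range(288, 417, 32)]
rng = random.Random(0)  # seeded so the run is reproducible
for batch_index in range(3):
    print(batch_index, pick_imsize(imsize_list, rng))
```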

 Parameters:
• train_img_path_list (list) – List of image paths.
• train_annotation_list (list) – List of annotations.
• valid_img_path_list (list) – List of image paths for validation.
• valid_annotation_list (list) – List of annotations for validation.
• epoch (int) – Number of training epochs.
• batch_size (int) – Batch size.
• imsize_list (list) – List of image sizes.
• augmentation (Augmentation) – Augmentation object.
• callback_end_epoch (function) – Function called at the end of each epoch.
 Returns: Training loss list and validation loss list. (tuple)

Example

>>> from renom_img.api.detection.yolo_v2 import Yolov2
>>> train_img_path_list, train_annot_list = ... # Define train data.
>>> valid_img_path_list, valid_annot_list = ... # Define validation data.
>>> class_map = ... # List of class names.
>>> model = Yolov2(class_map)
>>> model.fit(
...     # Feeds image and annotation data.
...     train_img_path_list,
...     train_annot_list,
...     valid_img_path_list,
...     valid_annot_list,
...     epoch=8,
...     batch_size=8)
>>>


The following arguments will be given to the function  callback_end_epoch  .

• epoch (int) - Current epoch number.
• model (Model) - Yolov2 object.
• avg_train_loss_list (list) - List of average train loss of each epoch.
• avg_valid_loss_list (list) - List of average valid loss of each epoch.
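
One possible use of the callback is to keep the weights of the best epoch. The sketch below assumes the model's 'save' method takes a file name, and that the validation loss list already contains the entry for the epoch that just finished. `make_save_best` and the file name in the usage comment are hypothetical.

```python
def make_save_best(path):
    """Build a callback that saves the model when validation loss improves.

    `path` is an output file name chosen by the caller.
    """
    state = {'best': float('inf')}

    def callback(epoch, model, avg_train_loss_list, avg_valid_loss_list):
        if not avg_valid_loss_list:
            return  # nothing to compare without validation data
        current = avg_valid_loss_list[-1]
        if current < state['best']:
            state['best'] = current
            model.save(path)  # assumes save(filename) on the model

    return callback


# Usage sketch:
# model.fit(train_img_path_list, train_annot_list,
#           valid_img_path_list, valid_annot_list,
#           callback_end_epoch=make_save_best("best_yolov2.h5"))
```
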
 forward  ( x )

Performs forward propagation. You can call this function using the  __call__  method.

 Parameters: x ( ndarray , Node ) – Input to Yolov2.
 get_bbox  ( z , score_threshold=0.3 , nms_threshold=0.4 )

Calculates the bounding box location, size and class information for model predictions.

Example

>>> z = model(x)
>>> model.get_bbox(z)
[[{'box': [0.21, 0.44, 0.11, 0.32], 'score':0.823, 'class':1, 'name':'dog'}],
[{'box': [0.87, 0.38, 0.84, 0.22], 'score':0.423, 'class':0, 'name':'cat'}]]

 Parameters: z ( ndarray ) – Output array of neural network.
 Returns: List of predicted bbox, score and class of each image. The format of the return value is shown below. Box coordinates and size are returned as ratios to the original image size, so the values in ‘box’ are in the range [0, 1]. (list)
# An example of a return value.
[
[ # Prediction for first image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
[ # Prediction for second image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
...
]


Note

Box coordinates and size will be returned as ratios to the original image size. Therefore, the values in ‘box’ are in the range [0 ~ 1].

 load  ( filename )

Load saved weights to model.

 Parameters: filename ( str ) – File name of saved model.

Example

>>> import renom as rm
>>> model = rm.Dense(2)
>>> model.load("model.hd5")

 loss  ( x , buffer , y )

Loss function of Yolov2 algorithm.

 Parameters:
• x (ndarray, Node) – Output of model.
• y (ndarray, Node) – Target array.
 Returns: Loss between x and y. (Node)

Example

>>> builder = model.build_data()  # This will return a builder function.
>>> x, buffer, y = builder(image_path_list, annotation_list)
>>> z = model(x)
>>> loss = model.loss(z, buffer, y)

 predict  ( img_list , batch_size=1 , score_threshold=0.3 , nms_threshold=0.4 )

This method accepts an array of image paths, list of image paths, or a path to an image.

 Parameters:
• img_list (string, list, ndarray) – Path to an image, list of paths, or ndarray.
• score_threshold (float) – The threshold for the confidence score. Predicted boxes with a confidence score lower than the threshold are discarded. The default is 0.3.
• nms_threshold (float) – The threshold for non-maximum suppression. The default is 0.4.
 Returns: List of predicted bbox, score and class of each image. The format of the return value is shown below. Box coordinates and size are returned as ratios to the original image size, so the values of ‘box’ are in the range [0, 1]. (list)
# An example of a return value.
[
[ # Prediction for first image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
[ # Prediction for second image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
...
]


Example

>>>
>>> model.predict(['img01.jpg', 'img02.jpg'])
[[{'box': [0.21, 0.44, 0.11, 0.32], 'score':0.823, 'class':1, 'name':'dog'}],
[{'box': [0.87, 0.38, 0.84, 0.22], 'score':0.423, 'class':0, 'name':'cat'}]]


Note

Box coordinates and size will be returned as ratios to the original image size. Therefore, the values of ‘box’ are in the range [0 ~ 1].
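
The `batch_size` argument of `predict` appears in the signature without a description. Assuming it controls how many images are processed per forward pass, a list of paths would be split roughly as in this sketch. `iter_batches` is a hypothetical helper and not part of the API.

```python
def iter_batches(img_list, batch_size=1):
    """Yield successive slices of img_list with at most batch_size items."""
    if isinstance(img_list, str):
        img_list = [img_list]  # a single path is treated as a one-item list
    for start in range(0, len(img_list), batch_size):
        yield img_list[start:start + batch_size]


paths = ['img01.jpg', 'img02.jpg', 'img03.jpg']
print(list(iter_batches(paths, batch_size=2)))
# [['img01.jpg', 'img02.jpg'], ['img03.jpg']]
```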

 preprocess  ( x )

Performs preprocessing for a given array.

 Parameters: x ( ndarray , Node ) – Image array for preprocessing.
 regularize  ( )

Adds a regularization term to the loss function.

Example

>>> import numpy as np
>>> x = np.random.rand(1, 3, 224, 224)  # Input image
>>> buffer, y = ...  # Buffer and ground-truth label
>>>
>>> class_map = ['cat', 'dog']
>>> model = Yolov2(class_map)
>>>
>>> z = model(x)  # Forward propagation
>>> loss = model.loss(z, buffer, y)  # Loss calculation
>>> reg_loss = loss + model.regularize()  # Add weight decay term.

class  SSD  ( class_map=None , imsize=(300 , 300) , overlap_threshold=0.5 , load_pretrained_weight=False , train_whole_network=False )

SSD object detection algorithm.

 Parameters:
• class_map (list, dict) – List of class names.
• imsize (int or tuple) – Image size. Must be 300x300. This can be specified either as an integer, e.g. 300, or as a tuple, e.g. (300, 300).
• overlap_threshold (float) – Threshold used in selecting the best prior box. This threshold should be between 0 and 1. The default is 0.5.
• load_pretrained_weight (bool, str) – Whether or not to load pretrained weights for the backbone model. If True, pretrained weights will be downloaded to the current directory and loaded into the model. If a string is provided, pretrained weights will be loaded from the specified filename. The default is False.
• train_whole_network (bool) – Whether or not to train the whole network. If True, all network layers will be trained. If False, the backbone network layers will be set to inference mode, and no updates will be performed for the backbone network weights. The default is False.
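
`overlap_threshold` controls which prior boxes are treated as matches for a ground-truth box. The sketch below follows the general matching strategy described in the SSD paper: a prior is matched to the truth it overlaps most if that IoU exceeds the threshold, and every truth additionally keeps its single best prior. It is an illustration with hypothetical names and corner-format boxes, not ReNomIMG's code.

```python
def corner_iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_priors(priors, truths, overlap_threshold=0.5):
    """Return, for each prior, the index of its matched truth or -1."""
    matches = [-1] * len(priors)
    if not truths or not priors:
        return matches
    for p, prior in enumerate(priors):
        overlaps = [corner_iou(prior, t) for t in truths]
        best = max(range(len(truths)), key=lambda t: overlaps[t])
        if overlaps[best] > overlap_threshold:
            matches[p] = best
    # Every truth keeps its single best prior, even below the threshold.
    for t, truth in enumerate(truths):
        best_prior = max(range(len(priors)),
                         key=lambda p: corner_iou(priors[p], truth))
        matches[best_prior] = t
    return matches


priors = [(0.0, 0.0, 0.5, 0.5), (0.4, 0.4, 1.0, 1.0), (0.0, 0.5, 0.5, 1.0)]
truths = [(0.05, 0.05, 0.5, 0.5), (0.6, 0.6, 0.9, 0.9)]
print(match_priors(priors, truths))  # [0, 1, -1]
```

The second prior overlaps its truth with an IoU of only 0.25, so it is matched by the best-prior rule rather than by the threshold.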

Example

>>> from renom_img.api.detection.ssd import SSD
>>> from renom_img.api.utility.load import parse_xml_detection
>>>
>>> train_label_path_list = ...  # provide list of paths to training data
>>> annotation_list, class_map = parse_xml_detection(train_label_path_list)
>>>
>>> model = SSD(class_map, imsize=(300,300), load_pretrained_weight=True, train_whole_network=True)


References

Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg
SSD: Single Shot MultiBox Detector

 build_data  ( )

This function returns a function which creates input data and target data specified for SSD.

 Returns: Function which creates input data and target data. (function)

Example

>>> builder = model.build_data()  # This will return a builder function.
>>> x, y = builder(image_path_list, annotation_list)
>>> z = model(x)
>>> loss = model.loss(z, y)

 fit  ( train_img_path_list=None , train_annotation_list=None , valid_img_path_list=None , valid_annotation_list=None , epoch=136 , batch_size=64 , optimizer=None , augmentation=None , callback_end_epoch=None )

This function performs training with the given data and hyperparameters.

 Parameters:
• train_img_path_list (list) – List of image paths.
• train_annotation_list (list) – List of annotations.
• valid_img_path_list (list) – List of image paths for validation.
• valid_annotation_list (list) – List of annotations for validation.
• epoch (int) – Number of training epochs.
• batch_size (int) – Batch size.
• augmentation (Augmentation) – Augmentation object.
• callback_end_epoch (function) – Function called at the end of each epoch.
 Returns: Training loss list and validation loss list. (tuple)

Example

>>> train_img_path_list, train_annot_list = ... # Define train data
>>> valid_img_path_list, valid_annot_list = ... # Define validation data
>>> class_map = ... # Define class map
>>> model = SSD(class_map) # Specify any algorithm provided by ReNomIMG API here
>>> model.fit(
...     # Feeds image and annotation data
...     train_img_path_list,
...     train_annot_list,
...     valid_img_path_list,
...     valid_annot_list,
...     epoch=8,
...     batch_size=8)
>>>


The following arguments will be given to the function  callback_end_epoch  .

• epoch (int) - Current epoch number.
• model (Model) - Model object.
• avg_train_loss_list (list) - List of average train loss of each epoch.
• avg_valid_loss_list (list) - List of average valid loss of each epoch.
 forward  ( x )

Performs forward propagation. You can call this function using the  __call__  method.

 Parameters: x ( ndarray , Node ) – Input to SSD.
 Returns: Raw output of SSD. (Node)

Example

>>> import numpy as np
>>> x = np.random.rand(1, 3, 300, 300)
>>>
>>> class_map = ["dog", "cat"]
>>> model = SSD(class_map)
>>>
>>> y = model.forward(x) # Forward propagation.
>>> y = model(x)  # Same as above result.

 get_bbox  ( z , score_threshold=0.6 , nms_threshold=0.45 )

Calculates the bounding box location, size and class information for model predictions.

Example

>>> z = model(x)
>>> model.get_bbox(z)
[[{'box': [0.21, 0.44, 0.11, 0.32], 'score':0.823, 'class':1, 'name':'dog'}],
[{'box': [0.87, 0.38, 0.84, 0.22], 'score':0.423, 'class':0, 'name':'cat'}]]

 Parameters: z ( ndarray ) – Output array of neural network.
 Returns: List of predicted bbox, score and class for each image. The format of the return value is shown below. Box coordinates and size are returned as ratios to the original image size, so the values of ‘box’ are in the range [0, 1]. (list)
# An example of a return value.
[
[ # Prediction for first image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
[ # Prediction for second image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
...
]


Note

Box coordinates and size will be returned as ratios to the original image size. Therefore, the values of ‘box’ are in the range [0 ~ 1].

 load  ( filename )

Load saved weights to model.

 Parameters: filename ( str ) – File name of saved model.

Example

>>> import renom as rm
>>> model = rm.Dense(2)
>>> model.load("model.hd5")

 loss  ( x , y , neg_pos_ratio=3.0 )

Loss function specified for SSD.

 Parameters:
• x (Node, ndarray) – Output data of neural network.
• y (Node, ndarray) – Target data.
• neg_pos_ratio (float) – Ratio of negative to positive boxes used for hard negative mining. After matching with the true boxes, most of the prior boxes are negative. The loss function uses this ratio to reduce the imbalance between positive and negative boxes. The default value is 3.0.
 Returns: Loss between x and y. (Node)
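
Hard negative mining with `neg_pos_ratio` can be sketched independently of the library. For each image, only the negatives with the largest confidence loss are kept, and their number is at most `neg_pos_ratio` times the number of positives. The function name and input format below are hypothetical and only illustrate the selection step.

```python
def select_hard_negatives(conf_losses, is_positive, neg_pos_ratio=3.0):
    """Return sorted indices of the negative boxes kept for the loss.

    conf_losses: per-prior confidence loss values.
    is_positive: per-prior booleans, True where the prior matched a truth.
    """
    num_pos = sum(is_positive)
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    num_neg = min(int(neg_pos_ratio * num_pos), len(negatives))
    # Keep the negatives with the largest loss (the "hardest" ones).
    hardest = sorted(negatives, key=lambda i: conf_losses[i], reverse=True)[:num_neg]
    return sorted(hardest)


conf_losses = [0.1, 2.0, 0.3, 1.5, 0.05, 0.9, 0.7]
is_positive = [True, False, False, False, False, False, False]
print(select_hard_negatives(conf_losses, is_positive))  # [1, 3, 5]
```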

Example

>>> z = model(x)
>>> loss = model.loss(z, y)

 predict  ( img_list , batch_size=1 , score_threshold=0.6 , nms_threshold=0.45 )

This method accepts an ndarray of image paths, a list of image paths, or an image path as a string.

 Parameters:
• img_list (string, list, ndarray) – Path to an image, list of paths, or ndarray.
• score_threshold (float) – The threshold for the confidence score. Predicted boxes with a confidence score lower than the threshold are discarded. The default is 0.6.
• nms_threshold (float) – The threshold for non-maximum suppression. The default is 0.45.
 Returns: List of predicted bbox, score and class for each image. The format of the return value is shown below. Box coordinates and size are returned as ratios to the original image size, so the values of ‘box’ are in the range [0, 1]. (list)
# An example of a return value.
[
[ # Prediction for first image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
[ # Prediction for second image.
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
{'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
...
],
...
]


Example

>>>
>>> model.predict(['img01.jpg', 'img02.jpg'])
[[{'box': [0.21, 0.44, 0.11, 0.32], 'score':0.823, 'class':1, 'name':'dog'}],
[{'box': [0.87, 0.38, 0.84, 0.22], 'score':0.423, 'class':0, 'name':'cat'}]]


Note

Box coordinates and size will be returned as ratios to the original image size. Therefore, the values of ‘box’ are in the range [0 ~ 1].

 preprocess  ( x )

Performs preprocessing for a given array.

 Parameters: x ( ndarray , Node ) – Image array for preprocessing.
 regularize  ( )

Adds a regularization term to the loss function.

Example

>>> import numpy as np
>>> x = np.random.rand(1, 3, 300, 300)  # Input image
>>> y = ...  # Ground-truth label
>>>
>>> class_map = ['cat', 'dog']
>>> model = SSD(class_map)
>>>
>>> z = model(x)  # Forward propagation
>>> loss = model.loss(z, y)  # Loss calculation
>>> reg_loss = loss + model.regularize()  # Add weight decay term.