Plant Phenomics / 2020 / Article / Tab 1

Review Article

Convolutional Neural Networks for Image-Based High-Throughput Plant Phenotyping: A Review

Table 1

Summary of major CNN architectures developed for image classification, object detection, and semantic and instance segmentation.

| Model | Vision task | Key concept | Source code (third-party implementation) |
|---|---|---|---|
| AlexNet | Image classification | A CNN architecture with five convolutional layers | (TensorFlow) (PyTorch) |
| ZFNet | Image classification | Feature visualization for model improvement | (Caffe2) |
| VGGNet | Image classification | Small (3 by 3) convolutional filters to increase the depth of CNNs (up to 19 layers) | (Caffe) |
| Inception family | Image classification | Inception modules to increase the width of CNNs and therefore their capability of feature representation | (TensorFlow) (PyTorch) |
| ResNet family | Image classification | Residual representation and skip connection scheme to enable the training of very deep CNNs (up to 1000 layers) | (TensorFlow) (PyTorch) |
| DenseNet | Image classification | Dense block modules to substantially decrease the number of model parameters (and therefore computational cost) and strengthen feature propagation (and therefore feature learning capability) | (supports multiple DL frameworks) |
| NASNet | Image classification | Reinforcement learning on a small dataset to find optimal convolutional cells that are then used to build a CNN architecture for a large dataset | (TensorFlow) |
| RCNN family | Object detection | A two-stage framework that generates regions of interest (ROIs) and then predicts the class label and calculates the bounding box coordinates for each ROI | (TensorFlow) for Faster RCNN; (Caffe2) for R-FCN and Fast/Faster RCNN |
| YOLO family | Object detection | A one-stage framework that regresses both class labels and bounding box coordinates for each grid cell on the last feature map | (C++) |
| SSD | Object detection | A one-stage framework that regresses class labels and bounding box coordinates for anchors in each grid cell on feature maps extracted from different convolution layers (and thus at different resolutions) | (Caffe) (TensorFlow) for SSD |
| RetinaNet | Object detection | A one-stage framework that uses focal loss, a new loss function, to address the foreground-background class imbalance problem | (Caffe2) for RetinaNet |
| FCN | Semantic segmentation | Fully convolutional architecture to train and predict classes at the pixel level in an end-to-end manner | (Caffe) (TensorFlow) (PyTorch) |
| U-Net | Semantic segmentation | An encoder-decoder architecture for semantic segmentation | (Caffe) (TensorFlow) (PyTorch) |
| DeepLab family | Semantic segmentation | Atrous convolution to increase the receptive field while reducing computational complexity, improving segmentation accuracy; fully connected conditional random field (CRF) as a postprocessing method to further improve segmentation accuracy | (Caffe) (TensorFlow) (PyTorch) |
| Mask RCNN | Instance segmentation | Masking head with ROI align operation on top of the Faster RCNN model to significantly improve segmentation accuracy | (Caffe2) (TensorFlow) for Mask RCNN |

Note: source code provided by original authors.
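
Two of the key concepts in Table 1, the small 3 by 3 filters of VGGNet and the atrous convolution of the DeepLab family, come down to receptive-field arithmetic. The following is a minimal sketch of that arithmetic in plain Python, not code from any of the listed implementations. The function names (`effective_kernel`, `receptive_field`, `conv_weights`) and the 64-channel example are illustrative assumptions, and the weight count ignores biases.

```python
def effective_kernel(kernel: int, dilation: int = 1) -> int:
    """Spatial extent covered by a (possibly atrous) kernel.

    A dilation rate r inserts r - 1 gaps between adjacent kernel taps,
    so a k-tap kernel spans k + (k - 1) * (r - 1) input positions.
    """
    return kernel + (kernel - 1) * (dilation - 1)


def receptive_field(layers) -> int:
    """Receptive field (along one axis) of a stack of convolution layers.

    `layers` is a sequence of (kernel, stride, dilation) tuples,
    listed from the input side to the output side.
    """
    rf = 1    # a single output position initially sees one input position
    jump = 1  # distance in input pixels between adjacent positions at this depth
    for kernel, stride, dilation in layers:
        rf += (effective_kernel(kernel, dilation) - 1) * jump
        jump *= stride
    return rf


def conv_weights(kernel: int, channels: int) -> int:
    """Weights in one square conv layer with `channels` inputs and outputs (no bias)."""
    return kernel * kernel * channels * channels


# Hypothetical comparison: three stacked 3x3 layers versus one 7x7 layer.
stacked_3x3 = [(3, 1, 1)] * 3
single_7x7 = [(7, 1, 1)]
channels = 64  # illustrative channel count

print(receptive_field(stacked_3x3), receptive_field(single_7x7))
print(3 * conv_weights(3, channels), conv_weights(7, channels))

# Atrous convolution: same 9 weights per channel pair, wider coverage.
for rate in (1, 2, 4):
    print(rate, effective_kernel(3, rate))
```

Under these assumptions, the three stacked 3 by 3 layers cover the same 7 by 7 window as the single 7 by 7 layer while using 27C² rather than 49C² weights for C channels, and a 3 by 3 kernel with dilation rate 4 spans 9 input positions without any additional weights.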