A Review of Deep Learning based Object Detection Techniques

Jash Sheth
2019 International Journal for Research in Applied Science and Engineering Technology  
The aim of object detection is to recognize instances of semantic objects belonging to a certain class within an image, accurately predict the location of each object in the image, and assign it the corresponding class label. In the past few years, many new and steadily improving models have been proposed for this task. Deep learning based approaches, especially those involving deep convolutional neural networks, have been the most popular, and for good reason. In this paper, we review the latest approaches to the problem of object detection, examining the drawbacks of each approach as well as the improvements observed in subsequent models. We then compare the results obtained by each model on popular datasets. In the last part, we offer ideas for future work, scope for improvement, and potential application areas.

C. Two Families of Object Detectors

Recently, two approaches have emerged among deep learning based object detectors: i) the one stage approach and ii) the two stage approach. The two stage approach first generates region proposals from components of the input image and then classifies each proposal into its corresponding class, generally with an ROI pooling layer in between. Such object detectors tend to achieve higher accuracy than their one stage counterparts. Examples include R-CNN [14], SPPNet [15], Fast R-CNN [16], Faster R-CNN [17] and Mask R-CNN [18]. Object detectors following the one stage approach make predictions directly in a single step within a unified framework. Such object detectors are much faster, making them suitable for real time applications. Some of the most widely used one stage object detectors are YOLO [19], RetinaNet [20], SSD [21], RefineDet [22], and YOLOv2 [23].
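The difference in control flow between the two families described above can be illustrated with a toy sketch. This is not R-CNN, YOLO, or any model from the paper: the "image" is a small grid of integers (0 for background, k > 0 for class k), and the names `grid_boxes`, `window_label`, `two_stage_detect` and `one_stage_detect` are hypothetical helpers introduced only for illustration. The stand-in classifier is a simple majority vote, so the sketch shows the structure of each pipeline, not the accuracy or speed trade-off.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]   # (row0, col0, row1, col1), end-exclusive
Detection = Tuple[Box, int]       # (box, class label)
Image = List[List[int]]           # toy image: 0 = background, k > 0 = class k


def grid_boxes(image: Image, cell: int) -> List[Box]:
    """Tile the image into non-overlapping cell x cell boxes."""
    h, w = len(image), len(image[0])
    return [(r, c, min(r + cell, h), min(c + cell, w))
            for r in range(0, h, cell)
            for c in range(0, w, cell)]


def window_label(image: Image, box: Box) -> int:
    """Stand-in classifier: most frequent non-zero value in the box, else 0."""
    r0, c0, r1, c1 = box
    counts: Dict[int, int] = {}
    for r in range(r0, r1):
        for c in range(c0, c1):
            v = image[r][c]
            if v:
                counts[v] = counts.get(v, 0) + 1
    return max(counts, key=counts.get) if counts else 0


def has_foreground(image: Image, box: Box) -> bool:
    """Class-agnostic 'objectness' test used by the proposal stage."""
    r0, c0, r1, c1 = box
    return any(image[r][c] for r in range(r0, r1) for c in range(c0, c1))


def two_stage_detect(image: Image, cell: int = 2) -> List[Detection]:
    # Stage 1: generate class-agnostic region proposals.
    proposals = [b for b in grid_boxes(image, cell) if has_foreground(image, b)]
    # Stage 2: classify each proposed region separately.
    return [(b, window_label(image, b)) for b in proposals]


def one_stage_detect(image: Image, cell: int = 2) -> List[Detection]:
    # Single pass: every grid cell is assigned a class directly,
    # and background cells are dropped; there is no proposal step.
    detections: List[Detection] = []
    for b in grid_boxes(image, cell):
        label = window_label(image, b)
        if label:
            detections.append((b, label))
    return detections


if __name__ == "__main__":
    img = [[1, 1, 0, 0],
           [1, 0, 0, 0],
           [0, 0, 2, 2],
           [0, 0, 2, 2]]
    print(two_stage_detect(img))
    print(one_stage_detect(img))
```

In this toy setting both functions return the same detections; the point is only that the two stage version runs a separate proposal step before classification, while the one stage version produces boxes and labels in one sweep over the grid.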
doi:10.22214/ijraset.2019.9165 fatcat:3jt7rr3uuvdtbgilhg52cdiarm