Robust Head Detection in Complex Videos Using Two-stage Deep Convolution Framework

Sultan Daud Khan, Yasir Ali, Basim Zafar, Abdulfattah Noorwali
2020 IEEE Access  
Pedestrian head detection plays an important role in identifying and localizing individuals in real-world visual data. Head detection is a nontrivial problem due to considerable variance in camera viewpoints, scales, human poses, and appearances in the scene. The translation invariance property of convolutional neural networks (CNNs) enables large-capacity CNNs to handle appearance and pose variations in the scene. However, scale invariance is still an open issue. To address this problem, this paper presents a two-stage head detection framework that uses a fully convolutional network (FCN) to generate scale-aware proposals, followed by a CNN that classifies each proposal into one of two classes, i.e. head and background. Experimental results show that using the scale-aware proposals obtained by the FCN improves both the object recall rate and the mean average precision (mAP). Additionally, we demonstrate that our framework achieves state-of-the-art results on four challenging benchmark datasets, i.e. HollywoodHeads, Casablanca, SHOCK, and WIDERFACE.

INDEX TERMS Convolutional neural networks, non-maximal suppression, head detection, crowd counting, motion analysis.

VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
doi:10.1109/access.2020.2995764 fatcat:pyifzfobbvfgffhprrtbbfrn7e
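
The two-stage flow described in the abstract (proposals first, then head/background classification) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `detect_heads`, `nms`, `iou`, the thresholds, and the `classify` callable that stands in for the second-stage CNN. The proposal list stands in for the FCN output. Non-maximal suppression appears only in the index terms, so applying it as a final greedy step after classification is an assumption.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_thresh: float = 0.5) -> List[int]:
    """Greedy non-maximal suppression; returns indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


def detect_heads(proposals: Sequence[Box],
                 classify: Callable[[Box], float],
                 score_thresh: float = 0.5,
                 iou_thresh: float = 0.5) -> List[Tuple[Box, float]]:
    """Stage 2 of the sketch: score each proposal, drop background, apply NMS.

    `proposals` stands in for the stage-1 (FCN) output, and `classify`
    stands in for the CNN's head-vs-background score in [0, 1].
    """
    scored = [(box, classify(box)) for box in proposals]
    heads = [(box, s) for box, s in scored if s >= score_thresh]
    boxes = [box for box, _ in heads]
    scores = [s for _, s in heads]
    return [(boxes[i], scores[i]) for i in nms(boxes, scores, iou_thresh)]


# Toy usage with made-up proposals and scores (no real model involved).
toy_scores = {
    (0, 0, 10, 10): 0.9,
    (1, 1, 11, 11): 0.8,      # overlaps the first box heavily
    (50, 50, 60, 60): 0.7,
    (100, 100, 110, 110): 0.1,  # scored as background
}
detections = detect_heads(list(toy_scores), lambda box: toy_scores[box])
```

In this toy run the second box is suppressed because it overlaps the higher-scoring first box, and the last box is discarded as background. A real system would replace the lookup table with trained networks operating on image crops.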