A multivariate intersection over union of SiamRPN network for visual tracking

Zhihui Huang, Huimin Zhao, Jin Zhan, Huakang Li
2021 The Visual Computer  
Finally, based on the SiamRPN tracker, we compared the tracking performance of $\ell_1$-smooth loss, IOU loss, GIOU loss, DIOU loss, and MIOU loss.  ...  location of bounding box.  ...  is cross-entropy (CE) loss.  ... 
doi:10.1007/s00371-021-02150-1 fatcat:2zjmshjiurdnhmofyncgkujlem

Deep Neural Network-based Speaker-Aware Information Logging (SAIL) for Augmentative and Alternative Communication

Gang Hu, Szu-Han Kay Chen, Neal Mazur
2021 Special Issue: Blockchain and Artificial Intelligence Applications  
People with complex communication needs can use a high-technology Augmentative and Alternative Communication (AAC) device to communicate with others.  ...  Therefore, this paper presents a solution using a deep neural network-based visual analysis approach to process videos to detect different AAC users in practice sessions.  ...  The weight term α helps us in balancing the contribution of the location loss.  ... 
doi:10.37965/jait.2021.0017 fatcat:2his7f34yfeu3gzq5p6lyvzpu4

Real-Time Pattern-Recognition of GPR Images with YOLO v3 Implemented by Tensorflow

Yuanhong Li, Zuoxi Zhao, Yangfan Luo, Zhi Qiu
2020 Sensors  
This paper deals with the vacillating signal similarity intersection over union (IoU) (V-IoU) methods.  ...  Currently, a few robust AI approaches can detect targets in real time with high precision or automation for GPR image recognition.  ...  Figure 12 showed that the parabola with signal oscillation due to some highly conductive targets can be identified and located by the YOLO v3 detector with V-IoU.  ... 
doi:10.3390/s20226476 pmid:33198420 pmcid:PMC7696763 fatcat:eegkpidhxfegheg5zreprzi6bq

Towards Balanced Learning for Instance Recognition

Jiangmiao Pang, Kai Chen, Qi Li, Zhihai Xu, Huajun Feng, Jianping Shi, Wanli Ouyang, Dahua Lin
2021 International Journal of Computer Vision  
Instance recognition is rapidly advanced along with the developments of various deep convolutional neural networks.  ...  It integrates IoU-balanced sampling, balanced feature pyramid, and objective re-weighting, respectively for reducing the imbalance at sample, feature, and objective level.  ...  Experiments also show that the performance is not sensitive to K, as long as the samples with higher IoU are more likely selected.  ... 
doi:10.1007/s11263-021-01434-2 fatcat:lvblzs5cnzfz7daf2exgx6u3pq

UnitBox: An Advanced Object Detection Network


Jiahui Yu, Yuning Jiang, Zhangyang Wang, Zhimin Cao, Thomas Huang
2016 Proceedings of the 2016 ACM on Multimedia Conference - MM '16  
To address the issue, we firstly introduce a novel Intersection over Union (IoU) loss function for bounding box prediction, which regresses the four bounds of a predicted box as a whole unit.  ...  By taking the advantages of IoU loss and deep fully convolutional networks, the UnitBox is introduced, which performs accurate and efficient localization, is robust to objects of varied shapes and scales  ...  Note that with 0 ≤ IoU ≤ 1, L = −ln(IoU) is essentially a cross-entropy loss with input of IoU: we can view IoU as a kind of random variable sampled from Bernoulli distribution, with p(IoU = 1) = 1,  ... 
doi:10.1145/2964284.2967274 dblp:conf/mm/YuJWCH16 fatcat:noc64tqf2bawzhaoqgxj43mgfe
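
The loss quoted in the snippet above, L = −ln(IoU), can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as corner coordinates (x1, y1, x2, y2), which is a simplification: the snippet only says the four bounds are regressed as a whole unit. The function names `iou` and `iou_loss` and the `eps` guard are illustrative choices, not taken from the paper.

```python
import math


def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target, eps=1e-9):
    """L = -ln(IoU); eps (an assumption) avoids log(0) for disjoint boxes."""
    return -math.log(iou(pred, target) + eps)
```

Because the loss depends on all four bounds jointly through the overlap ratio, it is close to zero for a perfect match and grows as the overlap shrinks, regardless of the absolute box size.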

An Integrated Design for Classification and Localization of Diabetic Foot Ulcer based on CNN and YOLOv2-DFU Models

Javaria Amin, Muhammad Sharif, Muhammad Almas Anjum, Habib Ullah Khan, Muhammad Sheraz Arshad Malik, Seifedine Kadry
2020 IEEE Access  
as compared with other classifiers.  ...  In addition, after the classification, the Gradient-weighted class activation mapping (Grad-Cam) model is used to visualize the high-level features of the infected region for better understanding.  ...  Figures 16 and 17 show that the exact location of the infected region is localized with confidence scores.  ... 
doi:10.1109/access.2020.3045732 fatcat:4lwslorainb2fhhunzntql2n24

An Improved Character Recognition Framework for Containers Based on DETR Algorithm

Xiaofang Zhao, Peng Zhou, Ke Xu, Liyun Xiao
2021 Sensors  
In addition, multi-scale location encoding is introduced on the basis of the original sinusoidal position encoding model, improving the sensitivity of input position information for the transformer structure  ...  An improved DETR (detection with transformers) object detection framework is proposed to realize accurate detection and recognition of characters on shipping containers.  ...  The boxes that do not contain the target and boxes with recognition error are matched with the background class, and the other boxes are matched with the ground truth.  ... 
doi:10.3390/s21134612 fatcat:p6hgaehp5retlnghvhmfegfvdy

Automatic Recognition and Classification System of Thyroid Nodules in CT Images Based on CNN

Wenjun Li, Siyi Cheng, Kai Qian, Keqiang Yue, Hao Liu, Elpida Keravnou
2021 Computational Intelligence and Neuroscience  
In the test set, the segmentation IOU reaches 0.855, and the classification output accuracy reaches 85.92%.  ...  After each module is connected in series with the algorithm, the automatic classification of each nodule can be realized.  ...  In the model training, the loss function used in segmentation network is Dice Loss + Binary Cross Entropy, the optimizer is Adam, and the performance evaluation index is IOU.  ... 
doi:10.1155/2021/5540186 pmid:34135949 pmcid:PMC8175135 fatcat:kq6x4veljzgt7phubkeaacmtya
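
The combined segmentation loss named in the snippet above (Dice Loss + Binary Cross Entropy) can be sketched as follows. This is a minimal illustration on flat lists of per-pixel probabilities and binary labels. The unweighted sum, the `smooth` term, and the `eps` clipping are assumptions, since the snippet does not give the exact formulation.

```python
import math


def dice_loss(probs, targets, smooth=1.0):
    """1 - Dice coefficient; smooth (an assumption) stabilises empty masks."""
    inter = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over pixels, with probabilities clipped."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_bce_loss(probs, targets):
    """Unweighted sum of the two terms (the weighting is an assumption)."""
    return dice_loss(probs, targets) + bce_loss(probs, targets)
```

The Dice term measures region overlap and so is less affected by class imbalance between lesion and background, while the cross-entropy term supplies a per-pixel gradient.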

An Efficient Method for High-Speed Railway Dropper Fault Detection Based on Depthwise Separable Convolution

Shiwang Liu, Long Yu, Dongkai Zhang
2019 IEEE Access  
INDEX TERMS OCS, dropper location, fault recognition, depthwise separable convolution.  ...  First, a dropper progressive location network (DPLN) was adopted to obtain the dropper. The DPLN was mainly composed of a pantograph location network (PLN) and a dropper location network (DLN).  ...  Each anchor is responsible for locating objects with IOU greater than the threshold.  ... 
doi:10.1109/access.2019.2942079 fatcat:xuia47f6zzaubmahhwb6r36snq

Hierarchical Attentive Recurrent Tracking [article]

Adam R. Kosiorek, Alex Bewley, Ingmar Posner
2017 arXiv   pre-print
To improve training convergence, we augment the loss function with terms for a number of auxiliary tasks relevant for tracking.  ...  Inspired by how the human visual cortex employs spatial attention and separate "where" and "what" processing pathways to actively suppress irrelevant visual features, this work develops a hierarchical  ...  Acknowledgements We would like to thank Oiwi Parker Jones and Martin Engelcke for discussions and valuable insights and Neil Dhir for his help with editing the paper.  ... 
arXiv:1706.09262v2 fatcat:gvhhwavwjnek5ijpddesyjeipe

Using Deep Learning to Identify Utility Poles with Crossarms and Estimate Their Locations from Google Street View Images

Weixing Zhang, Chandi Witharana, Weidong Li, Chuanrong Zhang, Xiaojiang Li, Jason Parent
2018 Sensors  
Traditional methods of detecting and mapping utility poles are inefficient and costly because of the demand for visual interpretation with quality data sources or intense field inspection.  ...  In general, this study indicates that even in a complex background, most utility poles can be detected with the use of DL, and the LOB measurement method can estimate the locations of most UPCs.  ...  The authors would like to thank Krista Rogers for her helpful reviews of this manuscript and thank Shahearn Philemon for visually interpreting reference utility poles from high-resolution aerial images  ... 
doi:10.3390/s18082484 pmid:30071580 fatcat:insamw62wfddldmsmqesm3r3xu

Use HiResCAM instead of Grad-CAM for faithful explanations of convolutional neural networks [article]

Rachel Lea Draelos, Lawrence Carin
2021 arXiv   pre-print
Overall, this work advances convolutional neural network explanation approaches and may aid in the development of trustworthy models for sensitive applications.  ...  on PASCAL VOC 2012, including crowd-sourced evaluations, illustrate that while HiResCAM's explanations faithfully reflect the model, Grad-CAM often expands the attention to create bigger and smoother visualizations  ...  Grad-CAM intuition: visualizing features, not locations. Grad-CAM does not directly visualize important locations, and Grad-CAM does not reflect the model's computations.  ... 
arXiv:2011.08891v4 fatcat:yo6gawyxdraedbe3nmwnwzvxu4

Geometry-Aware Recurrent Neural Networks for Active Visual Recognition [article]

Ricson Cheng, Ziyan Wang, Katerina Fragkiadaki
2018 arXiv   pre-print
physical locations in the world scene and latent feature locations.  ...  Combined with active view selection policies, our model learns to select informative viewpoints to integrate information from by "undoing" cross-object occlusions, seamlessly combining geometry with learning  ...  We train with cross-entropy loss between the average logit and the ground truth class label of the (groundtruth) shape in the scene which has the highest Intersection over Union (IoU) with each voxel cluster  ... 
arXiv:1811.01292v2 fatcat:3caf2xhu7ngyzbvh5jwdqw2ooe

Stroke Lesion Segmentation with Visual Cortex Anatomy Alike Neural Nets [article]

Chuanlong Li
2021 arXiv   pre-print
Fast and precise stroke lesion detection and location is an extremely important process with regards to stroke diagnosis, treatment, and prognosis.  ...  Intuitively, this work presents a more brain alike model which mimics the anatomical structure of the human visual cortex.  ...  This paper utilizes the EMLLoss function proposed by [35] and combines it with the Binary Cross Entropy Loss. The EMLLoss is constructed with Focal Loss and Dice Loss.  ... 
arXiv:2105.06544v2 fatcat:hsj5uf2yuzhlfo3vjtq7glxzcu

Finding the Evidence: Localization-aware Answer Prediction for Text Visual Question Answering [article]

Wei Han and Hantao Huang and Tao Han
2020 arXiv   pre-print
Text-based visual question answering (text VQA) task focuses on visual questions that require reading text in images.  ...  Existing text VQA systems generate an answer by selecting from optical character recognition (OCR) texts or a fixed vocabulary.  ...  The IoU as an evidence score is also shown in each image. For Figure (c), IoU 0 (0.32) indicates OCR recognition error, 0.32 is the IoU with the GT bounding box.  ... 
arXiv:2010.02582v1 fatcat:uet3xdoftfetjficvjki2ixkdi