3,663 Hits in 4.3 sec

Adaptive Dilated Network With Self-Correction Supervision for Counting

Shuai Bai, Zhiqun He, Yu Qiao, Hanzhe Hu, Wei Wu, Junjie Yan
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
In this paper, we propose an adaptive dilated convolution and a novel supervised learning framework named self-correction (SC) supervision.  ...  In the feature level, the proposed adaptive dilated convolution predicts a continuous value as the specific dilation rate for each location, which adapts the scale variation better than a discrete and  ...  Methodology We propose a framework for objects counting, which is shown in Fig. 2. It consists of the adaptive dilated convolution network and the self-correction supervision.  ... 
doi:10.1109/cvpr42600.2020.00465 dblp:conf/cvpr/BaiHQHWY20 fatcat:ygf34vaqjfhz3axdbsmp52us34

PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting [article]

Xiaoshuang Chen, Yiru Zhao, Yu Qin, Fei Jiang, Mingyuan Tao, Xiansheng Hua, Hongtao Lu
2021 arXiv   pre-print
Different from most previous works which use Gaussian kernels to generate the density map as the supervised information, we propose the self-distilling supervision (SDS) training method.  ...  The framework is able to adjust the receptive field by the dilated convolution parameters according to the input image, which helps the model to extract more discriminative features for each local region  ...  After this adjustment, the student network is trained with L1 loss again with new supervision targets for the final count estimation.  ... 
arXiv:2111.00406v1 fatcat:bwiqzljatbhblowktj5lxznrqi

CNN-based Density Estimation and Crowd Counting: A Survey [article]

Guangshuai Gao, Junyu Gao, Qingjie Liu, Qi Wang, Yunhong Wang
2020 arXiv   pre-print
These works must be helpful for the development of crowd counting. However, the question we should consider is why they are effective for this task.  ...  Through our analysis, we expect to make reasonable inference and prediction for the future development of crowd counting, and meanwhile, it can also provide feasible solutions for the problem of object  ...  ACKNOWLEDGMENT The authors would like to thank reviewers for their valuable suggestions and comments.  ... 
arXiv:2003.12783v1 fatcat:uqsoismxkzft7audwvdpr3dt7q

Contextual Phonetic Pretraining for End-to-end Utterance-level Language and Speaker Recognition [article]

Shaoshi Ling, Julian Salazar, Katrin Kirchhoff
2019 arXiv   pre-print
Results remain competitive when using a novel dilated convolutional model for language recognition, or when ASR pretraining is done with character labels only.  ...  We first train the model on the Fisher English corpus with context-independent phoneme labels, then use its representations at inference time as features for task-specific models on the NIST LRE07 closed-set  ...  One could relax the supervised ASR task to semi-supervised or self-supervised learning, as is done with language representation modeling [4].  ... 
arXiv:1907.00457v1 fatcat:cqbrqf6bxfgr3frxyjko2rpkem

Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers [article]

Lixiang Ru and Yibing Zhan and Baosheng Yu and Bo Du
2022 arXiv   pre-print
Weakly-supervised semantic segmentation (WSSS) with image-level labels is an important and challenging task.  ...  In addition, to efficiently derive reliable affinity labels for supervising AFA and ensure the local consistency of pseudo labels, we devise a Pixel-Adaptive Refinement module that incorporates low-level  ...  N+ and N− count the number of R+ and R−. Intuitively, Eq. 5 enforces the network to learn highly confident semantic affinity relations from MHSA.  ... 
arXiv:2203.02664v1 fatcat:baaptaavajdftnac5s2f4zdsje

Learning to Count Anything: Reference-less Class-agnostic Counting with Weak Supervision [article]

Michael Hobley, Victor Prisacariu
2022 arXiv   pre-print
Specifically, we demonstrate that self-supervised vision transformer features combined with a lightweight count regression head achieve competitive results when compared to other class-agnostic counting  ...  tasks without the need for point-level supervision or reference images.  ...  This requires an individually trained network for each type of object with limited to no capacity to adapt to previously unseen classes.  ... 
arXiv:2205.10203v1 fatcat:gikmzpdgineuvnocobrnnrdae4

Self-Supervised Nuclei Segmentation in Histopathological Images Using Attention [article]

Mihir Sahasrabudhe, Stergios Christodoulidis, Roberto Salgado, Stefan Michiels, Sherene Loi, Fabrice André, Nikos Paragios, Maria Vakalopoulou
2020 arXiv   pre-print
In this study, we present a self-supervised approach for segmentation of nuclei for whole slide histopathology images.  ...  We show that the identification of the magnification level for tiles can generate a preliminary self-supervision signal to locate nuclei.  ...  In this paper, we proposed a self-supervised method for nuclei segmentation exploiting magnification level determination as a self-supervision signal.  ... 
arXiv:2007.08373v1 fatcat:s7go5znc2nfndohmvjukbuqj6a

Semi-Supervised Crowd Counting via Self-Training on Surrogate Tasks [article]

Yan Liu, Lingqiao Liu, Peng Wang, Pingping Zhang, Yinjie Lei
2020 arXiv   pre-print
This paper tackles the semi-supervised crowd counting problem from the perspective of feature learning.  ...  To reduce the annotation cost, one attractive solution is to leverage a large number of unlabeled images to build a crowd counting model in semi-supervised fashion.  ...  Several works [5, 17] combine the VGG [18] structure with dilated convolution to assemble the semantic features for density regression.  ... 
arXiv:2007.03207v2 fatcat:fgah5xk37fdk3auj62gkpo4zqq

Crowd Scene Analysis by Output Encoding [article]

Yao Xue, Siming Liu, Yonghui Li, Xueming Qian
2020 arXiv   pre-print
Also, we develop an Adaptive Receptive Field Weighting (ARFW) module, which further deals with scale variation issue by adaptively emphasizing informative channels that have proper receptive field size  ...  Grasping the accurate crowd location (rather than merely crowd count) is important for spatially identifying high-risk regions in congested scenes.  ...  For example, Zhang et al. [17] solve the cross-scene crowd counting problem with a deep convolutional neural network fed with density map and global count datasets.  ... 
arXiv:2001.09556v1 fatcat:yellmcx2trcfvpjrjx2biz54ia

Adaptive Ensembling: Unsupervised Domain Adaptation for Political Document Analysis

Shrey Desai, Barea Sinno, Alex Rosenfeld, Junyi Jessy Li
2019 Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)  
To bridge this gap, we present adaptive ensembling, an unsupervised domain adaptation framework, equipped with a novel text classification model and time-aware training to ensure our methods work well  ...  with diachronic corpora.  ...  Thanks as well to Greg Durrett, Katrin Erk, and the anonymous reviewers for their helpful comments. This work was partially supported by the NSF Grant IIS-1850153.  ... 
doi:10.18653/v1/d19-1478 dblp:conf/emnlp/DesaiSRL19 fatcat:hhqzsequhbfezecnu62frzinsu

Adaptive Ensembling: Unsupervised Domain Adaptation for Political Document Analysis [article]

Shrey Desai, Barea Sinno, Alex Rosenfeld, Junyi Jessy Li
2019 arXiv   pre-print
To bridge this gap, we present adaptive ensembling, an unsupervised domain adaptation framework, equipped with a novel text classification model and time-aware training to ensure our methods work well  ...  with diachronic corpora.  ...  Thanks as well to Greg Durrett, Katrin Erk, and the anonymous reviewers for their helpful comments. This work was partially supported by the NSF Grant IIS-1850153.  ... 
arXiv:1910.12698v1 fatcat:2xdgjt3k6ndebiitkud7lqbshm

Deep learning with self-supervision and uncertainty regularization to count fish in underwater images [article]

Penny Tarling, Mauricio Cantor, Albert Clapés, Sergio Escalera
2021 arXiv   pre-print
We utilise abundant unlabelled data in a self-supervised task to improve the supervised counting task.  ...  From experiments on both contrasting datasets, we demonstrate our network outperforms the few other deep learning models implemented for solving this task.  ...  This way the model can learn the correct order within a pair according to number of fish [60]. It is not necessary to know the exact count of either image, hence enabling the self-supervised task.  ... 
arXiv:2104.14964v1 fatcat:6gokk6cixfhi3kdr3pll67pk2i

Attention Mechanism Guided Deep Regression Model for Acne Severity Grading

Saeed Alzahrani, Baidaa Al-Bander, Waleed Al-Nuaimy
2022 Computers  
The proposed fully convolutional regressor module adapts UNet with dilated convolution filters to systematically aggregate multi-scale contextual information for density maps generation.  ...  To this end, we develop a multi-scale dilated fully convolutional regressor for density map generation integrated with an attention mechanism.  ...  In this fashion, we merge the dilated UNet dense regressor with Faster R-CNN network for density map regression allowing us to determine the count of acne lesions and subsequently grade the severity.  ... 
doi:10.3390/computers11030031 fatcat:dg3qmhnyvjesnmuzsvxognvgmu

Grayscale Images and RGB Video: Compression by Morphological Neural Network [chapter]

Osvaldo de Souza, Paulo César Cortez, Francisco A. T. F. da Silva
2012 Lecture Notes in Computer Science  
Network application results are presented for grayscale images and RGB video with a 352 × 288 pixel size.  ...  This paper investigates image and RGB video compression by a supervised morphological neural network.  ...  In [4], the authors discussed various ANN architectures for image compression and presented the results for a back-propagation network (BPN), hierarchical back-propagation network (HBPN), and adaptive  ... 
doi:10.1007/978-3-642-33212-8_20 fatcat:i6jtkshj4bf2rdju3seaibypb4

VisDrone-CC2020: The Vision Meets Drone Crowd Counting Challenge Results [article]

Dawei Du, Longyin Wen, Pengfei Zhu, Heng Fan, Qinghua Hu, Haibin Ling, Mubarak Shah, Junwen Pan, Ali Al-Ali, Amr Mohamed, Bakour Imene, Bin Dong (+43 others)
2021 arXiv   pre-print
The collected dataset is formed by 3,360 images, including 2,460 images for training, and 900 images for testing. Specifically, we manually annotate persons with points in each video frame.  ...  To this end, we collect a large-scale dataset and organize the Vision Meets Drone Crowd Counting Challenge (VisDrone-CC2020) in conjunction with the 16th European Conference on Computer Vision (ECCV 2020  ...  [42] propose a Relational Attention Network with a self-attention mechanism. Jiang et al.  ... 
arXiv:2107.08766v1 fatcat:kdwlstqthvaybpxgpb5vaqmxne
Showing results 1 to 15 out of 3,663 results