10 Hits in 0.98 sec

CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images [article]

Sheng Guo, Weilin Huang, Haozhi Zhang, Chenfan Zhuang, Dengke Dong, Matthew R. Scott, Dinglong Huang
2018 arXiv   pre-print
We present a simple yet efficient approach capable of training deep neural networks on large-scale weakly-supervised web images, which are crawled raw from the Internet by using text queries, without any  ...  This allows for an efficient implementation of curriculum learning on large-scale web images, resulting in a high-performance CNN model, where the negative impact of noisy labels is reduced substantially  ...  Conclusion We have presented CurriculumNet, a new training strategy able to train CNN models more efficiently on large-scale weakly-supervised web images, where no human annotation is provided.  ...
arXiv:1808.01097v4 fatcat:2wpmcqnemrehjo7chvlvpeosra
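The curriculum strategy summarized in this entry — train first on the cleanest samples, then progressively mix in noisier ones — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the bucketing by a per-sample confidence score and the thresholds are assumptions for the sketch (CurriculumNet derives its curriculum from density-based clustering).

```python
# Minimal sketch of a three-stage curriculum over weakly labeled data.
# Samples are bucketed by an (assumed) label-confidence score; training
# starts on the cleanest bucket and progressively adds noisier ones.

def build_curriculum(samples, thresholds=(0.8, 0.5)):
    """Split (x, label, confidence) triples into clean/medium/noisy stages."""
    hi, mid = thresholds
    clean  = [s for s in samples if s[2] >= hi]
    medium = [s for s in samples if mid <= s[2] < hi]
    noisy  = [s for s in samples if s[2] < mid]
    # Each stage's training set is a superset of the previous stage's.
    return [clean, clean + medium, clean + medium + noisy]

def run_curriculum(samples, train_one_stage):
    """Train in stages; each stage sees a strictly larger training set."""
    for stage, subset in enumerate(build_curriculum(samples)):
        train_one_stage(stage, subset)  # e.g. a few epochs of SGD per stage

if __name__ == "__main__":
    # Toy usage: record how many samples each stage sees.
    data = [("img%d" % i, i % 10, c)
            for i, c in enumerate([0.95, 0.9, 0.6, 0.55, 0.3, 0.1])]
    seen = []
    run_curriculum(data, lambda stage, subset: seen.append(len(subset)))
    print(seen)  # stage sizes grow monotonically: [2, 4, 6]
```

In a real pipeline, `train_one_stage` would continue optimizing the same model across stages, so knowledge from the clean stage regularizes learning on the noisy stages.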

Weakly Supervised Learning with Side Information for Noisy Labeled Images [article]

Lele Cheng, Xiangzeng Zhou, Liming Zhao, Dangwei Li, Hong Shang, Yun Zheng, Pan Pan, Yinghui Xu
2020 arXiv   pre-print
In this paper, we present an efficient weakly supervised learning approach using a Side Information Network (SINet), which aims to effectively carry out large-scale classification with severely noisy labels  ...  Besides, we released a fine-grained product dataset called AliProducts, which contains more than 2.5 million noisy web images crawled from the internet by using queries generated from 50,000 fine-grained  ...  Particularly, we investigate the learning capability on large-scale web images without any human annotation. Datasets.  ...
arXiv:2008.11586v2 fatcat:336pp4msefeatjnt6bxd2k73ry

Suppressing Mislabeled Data via Grouping and Self-Attention [article]

Xiaojiang Peng, Kai Wang, Zhaoyang Zeng, Qing Li, Jianfei Yang, Yu Qiao
2020 arXiv   pre-print
Deep networks achieve excellent results on large-scale clean data but degrade significantly when learning from noisy labels.  ...  The AFM has several appealing benefits for noise-robust deep learning. (i) It does not rely on any assumptions or an extra clean subset.  ...  Introduction In recent years, deep neural networks (DNNs) have achieved great success in various tasks, particularly in supervised learning tasks on large-scale image recognition challenges, such as ImageNet  ...
arXiv:2010.15603v1 fatcat:zkm7py2bgjg45crdvl5dw7wzdq

Learning from Web Data with Self-Organizing Memory Module [article]

Yi Tu, Li Niu, Junjie Chen, Dawei Cheng, Liqing Zhang
2020 arXiv   pre-print
Learning from web data has attracted lots of research interest in recent years.  ...  Particularly, we formulate our method under the framework of multi-instance learning by grouping ROIs (i.e., images and their region proposals) from the same category into bags.  ...  Datasets Clothing1M: Clothing1M [54] is a large-scale fashion dataset designed for webly supervised learning.  ... 
arXiv:1906.12028v5 fatcat:75qvpao3jfcgljanwdrg5rallm

Webly Supervised Image Classification with Metadata: Automatic Noisy Label Correction via Visual-Semantic Graph

Jingkang Yang, Weirong Chen, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang
2020 Proceedings of the 28th ACM International Conference on Multimedia  
Experiments on realistic webly supervised learning datasets Webvision-1000 and NUS-81-Web show the effectiveness and robustness of VSGraph-LC.  ...  Webly supervised learning has recently become attractive for its efficiency in data expansion without expensive human labeling.  ...  Introduction Deep convolutional neural networks (CNNs) are successful by virtue of large-scale datasets with human annotation [23].  ...
doi:10.1145/3394171.3413952 dblp:conf/mm/YangCFYZZ20 fatcat:iithcxh27rbcvov4m7uqa4hv4u

Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels [article]

Lu Jiang, Di Huang, Mason Liu, Weilong Yang
2020 arXiv   pre-print
First, we establish the first benchmark of controlled real-world label noise from the web, which enables us to study web label noise in a controlled setting.  ...  Performing controlled experiments on noisy data is essential in understanding deep learning across noise levels.  ...  Curriculumnet: Weakly supervised learning from large-scale web images. In European Conference on Computer Vision (ECCV), 2018.  ...
arXiv:1911.09781v3 fatcat:l2yhsf4oebcs3b7wkxl2uamcfu
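The controlled-noise experiments this entry refers to are typically contrasted with the synthetic baseline: flipping labels at a chosen rate. A minimal sketch of symmetric synthetic label-noise injection (the function name and signature are illustrative, not from the paper) looks like this:

```python
import random

def inject_symmetric_noise(labels, num_classes, noise_rate, seed=0):
    """Flip each label to a uniformly random *different* class
    with probability noise_rate; seeded for reproducibility."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Sample from all classes except the true one.
            y = rng.choice([c for c in range(num_classes) if c != y])
        noisy.append(y)
    return noisy

# Usage: corrupt 40% of a 10-class label list at a controlled rate.
clean = [i % 10 for i in range(1000)]
corrupted = inject_symmetric_noise(clean, num_classes=10, noise_rate=0.4)
```

The paper's point is precisely that such uniform synthetic noise behaves differently from real web noise, which is class- and instance-dependent; the sketch shows the baseline being moved beyond.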

When Do Curricula Work? [article]

Xiaoxia Wu and Ethan Dyer and Behnam Neyshabur
2021 arXiv   pre-print
We first investigate the implicit curricula resulting from architectural and optimization bias and find that samples are learned in a highly consistent order.  ...  Inspired by common use cases of curriculum learning in practice, we investigate the role of limited training time budget and noisy data in the success of curriculum learning.  ...  We appreciate the help with the experiments from Zachary Cain, Sam Ritchie, Ambrose Slone, and Vaibhav Singh. Lastly, XW would like to thank the team for all of the hospitality.  ... 
arXiv:2012.03107v3 fatcat:4ledippvgjbt7oaxhtnlvqp2ua

A Survey of Label-noise Representation Learning: Past, Present and Future [article]

Bo Han, Quanming Yao, Tongliang Liu, Gang Niu, Ivor W. Tsang, James T. Kwok, Masashi Sugiyama
2021 arXiv   pre-print
We first clarify a formal definition for LNRL from the perspective of machine learning.  ...  Classical machine learning implicitly assumes that labels of the training data are sampled from a clean distribution, which can be too restrictive for real-world scenarios.  ...  As far as we know, LNRL is a special case of machine learning, which belongs to weakly supervised learning [44] .  ... 
arXiv:2011.04406v2 fatcat:76np6wyzvvag7ehy23cwyzdozm

Multi-label Iterated Learning for Image Classification with Label Ambiguity [article]

Sai Rajeswar, Pau Rodriguez, Soumye Singhal, David Vazquez, Aaron Courville
2021 arXiv   pre-print
Transfer learning from large-scale pre-trained models has become essential for many computer vision tasks.  ...  Recent studies have shown that datasets like ImageNet are weakly labeled since images with multiple object classes present are assigned a single label.  ...  crawled from the web [50].  ...
doi:10.48550/arxiv.2111.12172 fatcat:3nqqsh3arjc6bbcodb54cx6yji

Exploiting Curriculum Learning in Unsupervised Neural Machine Translation

Jinliang Lu, Jiajun Zhang
2021 Findings of the Association for Computational Linguistics: EMNLP 2021   unpublished
Curriculumnet: Weakly supervised learning from large-scale web images.  ...  To perform such fine-grained learning from easy to difficult, we borrow the idea from self-  ...  where x and y indicate sentences sampled from monolingual datasets φl1 and φl2. l1 and l2  ...
doi:10.18653/v1/2021.findings-emnlp.79 fatcat:jls247yfpfcnbocow7r6u4muvi