2,875 Hits in 3.8 sec

Exploring the Limits of Weakly Supervised Pretraining [article]

Dhruv Mahajan, Ross Girshick, Vignesh Ramanathan, Kaiming He, Manohar Paluri, Yixuan Li, Ashwin Bharambe, Laurens van der Maaten
2018 arXiv   pre-print
State-of-the-art visual perception models for a wide range of tasks rely on supervised pretraining. ImageNet classification is the de facto pretraining task for these models.  ...  Even so, relatively little is known about the behavior of pretraining with datasets that are multiple orders of magnitude larger.  ...  Discussion We have attempted to explore the limits of supervised pretraining.  ... 
arXiv:1805.00932v1 fatcat:3qro6b6sindwtnvfzjbna4lufm

Exploring the Limits of Weakly Supervised Pretraining [chapter]

Dhruv Mahajan, Ross Girshick, Vignesh Ramanathan, Kaiming He, Manohar Paluri, Yixuan Li, Ashwin Bharambe, Laurens van der Maaten
2018 Lecture Notes in Computer Science  
State-of-the-art visual perception models for a wide range of tasks rely on supervised pretraining. ImageNet classification is the de facto pretraining task for these models.  ...  Even so, relatively little is known about the behavior of pretraining with datasets that are multiple orders of magnitude larger.  ...  Exploring the Limits of Weakly Supervised Pretraining  ... 
doi:10.1007/978-3-030-01216-8_12 fatcat:vldy33o7tfchzix7gvqajujala

Semi-weakly Supervised Contrastive Representation Learning for Retinal Fundus Images [article]

Boon Peng Yap, Beng Koon Ng
2021 arXiv   pre-print
We explore the value of weak labels in learning transferable representations for medical images.  ...  We consider weak labels in the form of pseudo-labels and propose a semi-weakly supervised contrastive learning (SWCL) framework for representation learning using semi-weakly annotated images.  ...  prompts us to explore the use of weak labels to inject information about lesions into image patches.  ... 
arXiv:2108.02122v1 fatcat:mb6tdf6sgjenpkrybq6s4rcvmi

Open-Vocabulary Object Detection Using Captions [article]

Alireza Zareian, Kevin Dela Rosa, Derek Hao Hu, Shih-Fu Chang
2021 arXiv   pre-print
Weakly supervised and zero-shot learning techniques have been explored to scale object detectors to more categories with less supervision, but they have not been as successful and widely adopted as supervised  ...  In this paper, we put forth a novel formulation of the object detection problem, namely open-vocabulary object detection, which is more general, more practical, and more effective than weakly supervised  ...  We are inspired by the rich literature of weakly supervised visual grounding methods [42, 9, 5, 1] to design our image-caption pretraining technique.  ... 
arXiv:2011.10678v2 fatcat:ven4oegqnrdilb4reguistgxnm

Self-supervised Pretraining of Visual Features in the Wild [article]

Priya Goyal, Mathilde Caron, Benjamin Lefaudeux, Min Xu, Pengchao Wang, Vivek Pai, Mannat Singh, Vitaliy Liptchinsky, Ishan Misra, Armand Joulin, Piotr Bojanowski
2021 arXiv   pre-print
However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset.  ...  Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1%  ...  While the benefit of pretraining has been demonstrated in computer vision, it has been in the limited scope of curated datasets originally collected for supervised or weakly supervised learning.  ... 
arXiv:2103.01988v2 fatcat:u4qjr2aq45ahxiokncoi75jif4

Weakly Supervised Context Encoder using DICOM metadata in Ultrasound Imaging [article]

Szu-Yeu Hu, Shuhang Wang, Wei-Hung Weng, JingChao Wang, XiaoHong Wang, Arinc Ozturk, Qian Li, Viksit Kumar, Anthony E. Samir
2020 arXiv   pre-print
In this work, we leverage DICOM metadata from ultrasound images to help learn representations of the ultrasound image.  ...  Modern deep learning algorithms geared towards clinical adaptation rely on a significant amount of high-fidelity labeled data.  ...  The authors are solely responsible for the content, and the work does not represent the official views of the National Institutes of Health.  ... 
arXiv:2003.09070v1 fatcat:x7orhc5hwfb4njhn464zmbcmqm

Pretrained Encoders are All You Need [article]

Mina Khan, P Srivatsa, Advait Rane, Shriram Chenniappa, Rishabh Anand, Sherjil Ozair, Pattie Maes
2021 arXiv   pre-print
Our results show that pretrained representations are at par with state-of-the-art self-supervised methods trained on domain-specific data.  ...  We also explore fine-tuning pretrained representations with self-supervised techniques, i.e., contrastive predictive coding, spatio-temporal contrastive learning, and augmentations.  ...  We also investigate self-supervised finetuning of pretrained representations using state-of-the-art (SotA) self-supervised methods.  ... 
arXiv:2106.05139v1 fatcat:qkfqyxpkindlxollaxucut56uu

Watch, Listen and Tell: Multi-modal Weakly Supervised Dense Event Captioning [article]

Tanzila Rahman, Bicheng Xu, Leonid Sigal
2019 arXiv   pre-print
Specifically, we focus on the problem of weakly-supervised dense event captioning in videos and show that audio on its own can nearly rival performance of a state-of-the-art visual model and, combined  ...  However, much of the research has been limited to approaches that either do not take audio corresponding to video into account at all, or those that model the audio-visual correlations in service of sound  ...  Acknowledgments: This work was funded in part by the Vector Institute for AI, Canada CIFAR AI Chair, NSERC Canada Research Chair (CRC) and an NSERC Discovery and Discovery Accelerator Supplement Grants  ... 
arXiv:1909.09944v2 fatcat:paj6fq6mwvdmfk73ubfz55jg4a

Depth CNNs for RGB-D scene recognition: learning from scratch better than transferring from RGB-CNNs [article]

Xinhang Song, Luis Herranz, Shuqiang Jiang
2018 arXiv   pre-print
In contrast, we focus on the bottom layers, and propose an alternative strategy to learn depth features combining local weakly supervised training from patches followed by global fine tuning with images  ...  In contrast, current RGB-D scene data is much more limited, so often leverages RGB large datasets, by transferring pretrained RGB CNN models and fine-tuning with the target RGB-D dataset.  ...  Weakly supervised pretrained CNN It is difficult to learn deep CNN from scratch with depth images, due to the lack of enough training data.  ... 
arXiv:1801.06797v1 fatcat:bde4sdx7arehhhqvada57rifnm

SimVLM: Simple Visual Language Model Pretraining with Weak Supervision [article]

Zirui Wang, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, Yuan Cao
2022 arXiv   pre-print
However, the requirement for expensive annotations including clean image captions and regional labels limits the scalability of existing approaches, and complicates the pretraining procedure with the introduction  ...  Without utilizing extra data or task-specific customization, the resulting model significantly outperforms previous pretraining methods and achieves new state-of-the-art results on a wide range of discriminative  ...  Srinivasan, Samira Daruki, Nan Du and Aashi Jain for help with data preparation, Chao Jia, Zhen Li, Jonathan Shen, Colin Raffel and Sharan Narang for assistance on experimental settings, and others in the  ... 
arXiv:2108.10904v3 fatcat:glozbeeytvdyvcgl7ersyz4i34

Billion-Scale Pretraining with Vision Transformers for Multi-Task Visual Representations [article]

Josh Beal, Hao-Yu Wu, Dong Huk Park, Andrew Zhai, Dmitry Kislyuk
2021 arXiv   pre-print
In this work, we describe how we (1) generate a dataset with over a billion images via large weakly-supervised pretraining to improve the performance of these visual representations, and (2) leverage Transformers  ...  To support this backbone model, we detail a systematic approach to deriving weakly-supervised image annotations from heterogeneous text signals, demonstrating the benefits of clustering techniques to handle  ...  The authors would like to thank Eric Tzeng, Raymond Shiau, Kofi Boakye, Vahid Kazemi, and Chuck Rosenberg for valuable discussions regarding the paper, and the anonymous reviewers and ACs for their helpful  ... 
arXiv:2108.05887v1 fatcat:gm5lzf4pkrg3zez7unuq7epp3a

Learning Effective RGB-D Representations for Scene Recognition

Xinhang Song, Shuqiang Jiang, Luis Herranz, Chengpeng Chen
2019 IEEE Transactions on Image Processing  
The first limitation is the lack of depth data for training deep learning models.  ...  using weak supervision via patches.  ...  A smaller architecture and a weakly supervised pretraining strategy for the bottom layers enables us to overcome the problem of very limited depth data.  ... 
doi:10.1109/tip.2018.2872629 pmid:30281448 fatcat:lnhv5g46s5dpngmstriivjncgy

Low-Resource Machine Translation Training Curriculum Fit for Low-Resource Languages [article]

Garry Kuwanto, Afra Feyza Akyürek, Isidora Chara Tourni, Siyang Li, Alexander Gregory Jones, Derry Wijaya
2021 arXiv   pre-print
to Kazakh, showcasing the potential of weakly-supervised NMT for the low-resource languages.  ...  When trained on supervised data, our training curriculum achieves a new state-of-the-art result on the Somali dataset (BLEU of 29.3 for Somali to English).  ...  After pretraining the LM, we train a NMT model in an unsupervised or weakly-supervised (using comparable data) manner.  ... 
arXiv:2103.13272v2 fatcat:l7ttvdozivalpdjj4jnvdduaju

Weakly Supervised 3D Object Detection from Point Clouds [article]

Zengyi Qin, Jinglu Wang, Yan Lu
2020 arXiv   pre-print
The source code and pretrained models are publicly available at  ...  Weakly supervised learning is a promising approach to reducing the annotation requirement, but existing weakly supervised object detectors are mostly for 2D detection rather than 3D.  ...  image recognition network pretrained on existing  ...  Figure 1: Overview of the proposed weakly supervised 3D object detection framework.  ... 
arXiv:2007.13970v1 fatcat:2n6irzxxh5cj7mjckign3knlsq

Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model [article]

Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
2019 arXiv   pre-print
Recent breakthroughs of pretrained language models have shown the effectiveness of self-supervised learning for a wide range of natural language processing (NLP) tasks.  ...  Moreover, we propose a simple yet effective weakly supervised pretraining objective, which explicitly forces the model to incorporate knowledge about real-world entities.  ...  CONCLUSION We introduce a weakly supervised method to encourage pretrained language models to learn entity-level knowledge.  ... 
arXiv:1912.09637v1 fatcat:jgh27a23fnfj5bcxjlbktxpq3m
Showing results 1 — 15 out of 2,875 results