
Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization [article]

Sang Michael Xie, Tengyu Ma, Percy Liang
2021 arXiv   pre-print
While labeled input-output pairs are expensive to obtain, "unlabeled" outputs, i.e. outputs without corresponding inputs, are freely available (e.g. code on GitHub) and provide information about output  ...  Pre-training captures this structure by training a denoiser to denoise corrupted versions of unlabeled outputs.  ...  However, we utilize unlabeled output data and use the frozen denoiser in the final model, improving OOD generalization.  ... 
arXiv:2006.16205v3 fatcat:6cg3pl4bv5gxdgeiqrwe3fmpku
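
To make the composed-model idea in the snippet concrete, here is a minimal PyTorch-style sketch, assuming a trainable base predictor and a pre-trained denoiser (both hypothetical modules): the denoiser is frozen and applied on top of the base model's output, as the snippet describes.

```python
import torch.nn as nn

class ComposedModel(nn.Module):
    """Trainable base predictor composed with a frozen pre-trained denoiser.

    The denoiser was pre-trained to reconstruct clean unlabeled outputs
    from corrupted versions; keeping it frozen preserves the output
    structure it learned, while only the base model is fine-tuned.
    """
    def __init__(self, base_predictor: nn.Module, denoiser: nn.Module):
        super().__init__()
        self.base = base_predictor
        self.denoiser = denoiser
        for p in self.denoiser.parameters():   # freeze the denoiser
            p.requires_grad = False

    def forward(self, x):
        rough = self.base(x)           # possibly noisy prediction
        return self.denoiser(rough)    # project onto the learned output space
```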

Semi-Supervised Learning with GANs for Device-Free Fingerprinting Indoor Localization [article]

Kevin M. Chen, Ronald Y. Chang
2020 arXiv   pre-print
The proposed system uses a small amount of labeled data and a large amount of unlabeled data (i.e., semi-supervised), thus considerably reducing the expensive data labeling effort.  ...  and significantly superior performance with equal, highly limited amount of labeled data.  ...  The main difference between DCGAN and CNN is that CNN accepts labeled data only (i.e., supervised), while DCGAN can be trained with labeled data as well as unlabeled data (i.e., semi-supervised).  ... 
arXiv:2008.07111v1 fatcat:zfsiipyuvnd5bh3e4i3om4llpa
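
One common way to let a GAN discriminator consume both labeled and unlabeled data, as the snippet contrasts with a purely supervised CNN, is the standard K+1-class semi-supervised GAN objective. The sketch below shows that formulation, which is not necessarily this paper's exact loss:

```python
import torch
import torch.nn.functional as F

def sgan_d_loss(logits_lab, y_lab, logits_unlab, logits_fake, k):
    """Discriminator loss with K+1 classes (index k = 'generated/fake').

    - labeled batch: cross-entropy over the K real classes
    - unlabeled batch: maximize P(any real class) = 1 - P(fake)
    - generated batch: classify as the fake class
    """
    loss_lab = F.cross_entropy(logits_lab, y_lab)
    # log P(real) via log-sum-exp over the K real-class logits
    log_p_real = (torch.logsumexp(logits_unlab[:, :k], dim=1)
                  - torch.logsumexp(logits_unlab, dim=1))
    loss_unlab = -log_p_real.mean()
    fake_targets = torch.full((logits_fake.size(0),), k,
                              dtype=torch.long, device=logits_fake.device)
    loss_fake = F.cross_entropy(logits_fake, fake_targets)
    return loss_lab + loss_unlab + loss_fake
```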

Master-Teacher-Student: A Weakly Labelled Semi-Supervised Framework for Audio Tagging and Sound Event Detection

Yuzhuo Liu, Hangting Chen, Qingwei Zhao, Pengyuan Zhang
2022 IEICE Transactions on Information and Systems  
A popular method is teacher-student learning, making student models learn from pseudo-labels generated by teacher models from unlabelled data.  ...  To generate high-quality pseudo-labels, we propose a master-teacher-student framework trained with a dual-lead policy.  ...  The teacher model, then, can learn from unlabelled data with these labels: L_MT = BCE(M(x_u), T(x_u)), (5) where M(x_u), T(x_u) are the outputs of the master and teacher models.  ... 
doi:10.1587/transinf.2021edl8082 fatcat:awcfbv5envg2thzxcti2nprbfy
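
A minimal sketch of the L_MT term in eq. (5) of the snippet, assuming PyTorch modules that output per-class probabilities (e.g., sigmoid activations, as is usual for audio tagging); treating the master's predictions as the BCE targets follows the snippet's statement that the teacher learns from the master's labels.

```python
import torch
import torch.nn.functional as F

def master_teacher_loss(master, teacher, x_u):
    """Eq. (5): L_MT = BCE(M(x_u), T(x_u)).

    The teacher learns from pseudo-labels produced by the master on
    unlabelled clips; the master is not updated by this term.
    """
    with torch.no_grad():
        targets = master(x_u)          # M(x_u): master pseudo-labels
    preds = teacher(x_u)               # T(x_u): teacher predictions
    return F.binary_cross_entropy(preds, targets)
```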

Knowledge Distillation and Data Selection for Semi-Supervised Learning in CTC Acoustic Models [article]

Prakhar Swarup, Debmalya Chakrabarty, Ashtosh Sapru, Hitesh Tulsiani, Harish Arsikere, Sri Garimella
2020 arXiv   pre-print
mechanisms for leveraging unlabelled data to boost performance of student models.  ...  On a semi-supervised ASR setting with 40000 hours of carefully selected unlabelled data, our CTC-SSL approach gives 17% relative WER improvement over a baseline CTC system trained with labelled data.  ...  seed model with additional unlabelled data [23, 24] in the self-learning framework.  ... 
arXiv:2008.03923v1 fatcat:ry26pyteyzdtbj6iye5bsf2maq

Confidence Learning for Semi-Supervised Acoustic Event Detection

Yuzhuo Liu, Hangting Chen, Jian Wang, Pei Wang, Pengyuan Zhang
2021 Applied Sciences  
The classic self-training method carries out predictions for unlabeled data and then selects predictions with high probabilities as pseudo-labels for retraining.  ...  In recent years, the involvement of synthetic strongly labeled data, weakly labeled data, and unlabeled data has drawn much research attention in semi-supervised acoustic event detection (SAED).  ...  Data Availability Statement: The DCASE 2019 TASK4 dataset was analyzed in this study.  ... 
doi:10.3390/app11188581 fatcat:onydv6gbzrfzlocjw6lxmndcbe
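
The classic self-training selection step the snippet describes, as a short sketch; the scikit-learn-style predict_proba interface and the 0.9 threshold are illustrative assumptions.

```python
def select_pseudo_labels(model, x_unlabeled, threshold=0.9):
    """Classic self-training: predict on unlabeled data, then keep only
    the predictions whose confidence exceeds a threshold as pseudo-labels
    for retraining. Assumes `model.predict_proba` returns a NumPy array
    of shape (N, num_classes)."""
    probs = model.predict_proba(x_unlabeled)
    confidence = probs.max(axis=1)     # highest class probability per example
    pseudo = probs.argmax(axis=1)      # predicted class as pseudo-label
    keep = confidence >= threshold
    return x_unlabeled[keep], pseudo[keep]
```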

Supervised and Semi-Supervised Text Categorization using LSTM for Region Embeddings [article]

Rie Johnson, Tong Zhang
2016 arXiv   pre-print
The best results were obtained by combining region embeddings in the form of LSTM and convolution layers trained on unlabeled data.  ...  We view it as a special case of a general framework which jointly trains a linear model with a non-linear feature generator consisting of 'text region embedding + pooling'.  ...  with unlabeled data and were fine-tuned with labeled data; pre-training used either the language model objective or autoencoder objective.  ... 
arXiv:1602.02373v2 fatcat:px3n2gudyrbmtebaqhtlu7alla
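
A sketch of the "text region embedding + pooling" feature generator joined with a linear model, as the snippet describes. A 1-D convolution stands in for the region embedding here (the paper also uses LSTM variants), and all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class RegionEmbeddingClassifier(nn.Module):
    """Linear model over a non-linear 'region embedding + pooling' generator."""
    def __init__(self, vocab_size, emb_dim=128, region_dim=256,
                 region_size=3, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.region = nn.Conv1d(emb_dim, region_dim, kernel_size=region_size,
                                padding=region_size // 2)
        self.linear = nn.Linear(region_dim, num_classes)

    def forward(self, token_ids):                   # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)   # (batch, emb, seq)
        regions = torch.relu(self.region(x))        # region embeddings
        pooled = regions.max(dim=2).values          # max-pooling over positions
        return self.linear(pooled)                  # linear classifier on top
```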

Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction [article]

Daniel Stoller, Sebastian Ewert, Simon Dixon
2018 arXiv   pre-print
With only a few datasets available, extensive data augmentation is often used to combat overfitting.  ...  Based on this idea, we drive the separator towards outputs deemed realistic by discriminator networks that are trained to tell apart real samples from separator samples.  ...  However, early approaches [3, 4] often have to make many simplifying assumptions about the data generation process to constrain the generative model such that the difficult problem of posterior inference  ... 
arXiv:1711.00048v2 fatcat:j3k55vkw2zactle2cnjv7vbxxe
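
The adversarial term the snippet describes, sketched as a non-saturating generator-style loss that pushes separator outputs toward what the discriminator deems realistic; the exact GAN objective used in the paper may differ.

```python
import torch.nn.functional as F

def separator_adversarial_loss(discriminator, separator_outputs):
    """Drive the separator toward outputs the discriminator scores as real.

    `discriminator` returns a realism logit per example; minimizing
    softplus(-logit) = -log(sigmoid(logit)) is the non-saturating GAN
    objective for the generator (here, the separator), used alongside
    the usual supervised loss on labeled source pairs.
    """
    logits = discriminator(separator_outputs)
    return F.softplus(-logits).mean()
```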

Adversarial Semi-Supervised Audio Source Separation Applied to Singing Voice Extraction

Daniel Stoller, Sebastian Ewert, Simon Dixon
2018 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
With only a few datasets available, extensive data augmentation is often used to combat overfitting.  ...  Based on this idea, we drive the separator towards outputs deemed realistic by discriminator networks that are trained to tell apart real samples from separator samples.  ...  However, early approaches [3, 4] often have to make many simplifying assumptions about the data generation process to constrain the generative model such that the difficult problem of posterior inference  ... 
doi:10.1109/icassp.2018.8461722 dblp:conf/icassp/StollerED18 fatcat:me566fosvndg5n3ucyg5l456va

Unsupervised Neural Text Simplification

Sai Surya, Abhijit Mishra, Anirban Laha, Parag Jain, Karthik Sankaranarayanan
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics  
Our analysis (both quantitative and qualitative, involving human evaluators) on public test data shows that the proposed model can perform text simplification at both lexical and syntactic levels, competitive  ...  The framework is trained using unlabeled text collected from the en-Wikipedia dump.  ...  Models with high word-diff, SARI, and BLEU are picked during model selection (with validation data). Model selection also involved manually examining the quality and relevance of generations.  ... 
doi:10.18653/v1/p19-1198 dblp:conf/acl/SuryaMLJS19 fatcat:mvbgh5vzzndenbx3zx6e63mwpu

Be Consistent! Improving Procedural Text Comprehension using Label Consistency

Xinya Du, Bhavana Dalvi, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark, Claire Cardie
2019 Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)  
We present a new learning framework that leverages label consistency during training, allowing consistency bias to be built into the model.  ...  (e.g., their location) change with time given a procedural text (e.g., a paragraph about photosynthesis, a recipe).  ...  For testing, the input is the model M and a set of unlabeled (and ungrouped) examples x_t, and the output is their predicted labels ŷ_t.  ... 
doi:10.18653/v1/n19-1244 dblp:conf/naacl/DuDTBYCC19 fatcat:q3hrjer7nveh5pcu7hhe5szy4u

A Study on machine learning methods and applications in genetics and genomics

K Jayanthi, C Mahesh
2018 International Journal of Engineering & Technology  
One of the most complex data types is genetic and genomic data, which requires computers to analyse various sets of functions automatically.  ...  Machine learning enables computers to help humans in analysing knowledge from large, complex data sets.  ...  The learning algorithm can also compare its output with the correct, intended output and find errors in order to modify the model accordingly.  ... 
doi:10.14419/ijet.v7i1.7.10653 fatcat:5ujd2eydezaodilse4ifqsbk5e

Power Pooling Operators and Confidence Learning for Semi-Supervised Sound Event Detection [article]

Yuzhuo Liu and Hangting Chen and Pengyuan Zhang
2020 arXiv   pre-print
In recent years, the involvement of synthetic strongly labeled data, weakly labeled data, and unlabeled data has drawn much research attention in semi-supervised sound event detection (SSED).  ...  Self-training models carry out predictions without strong annotations and then take predictions with high probabilities as pseudo-labels for retraining.  ...  In comparison, synthetic strongly labeled data, weakly labeled data with clip-level categories only, and unlabeled data are widely available.  ... 
arXiv:2005.11459v1 fatcat:z3j4hmepizhfjplmksxjla6kc4

Semi-Supervised Learning with Data Augmentation for End-to-End ASR

Felix Weninger, Franco Mana, Roberto Gemello, Jesús Andrés-Ferrer, Puming Zhan
2020 Interspeech 2020  
Specifically, we generate the pseudo labels for the unlabeled data on the fly with a seq2seq model after perturbing the input features with DA.  ...  As a result, the Noisy Student algorithm with soft labels and consistency regularization achieves 10.4% word error rate (WER) reduction when adding 475 h of unlabeled data, corresponding to a recovery  ...  Another possible implementation of SSL is via teacher-student training, where a 'student' model is trained to replicate the outputs of a powerful 'teacher' model on the unlabeled data [20].  ... 
doi:10.21437/interspeech.2020-1337 dblp:conf/interspeech/WeningerMGAZ20 fatcat:wbhkrqglmnbzhfvexzbk7shevy
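
A sketch of the on-the-fly soft pseudo-labeling with consistency that the snippet describes. Here the teacher labels the clean features and the student matches them on DA-perturbed features, which is one common arrangement; the paper's exact ordering of DA and labeling may differ.

```python
import torch
import torch.nn.functional as F

def consistency_step(teacher, student, x_u, augment):
    """Soft pseudo-labels generated on the fly, matched under DA.

    The teacher (a trained model) labels the unlabeled batch; the student
    is trained to reproduce those soft labels on a DA-perturbed view of
    the same inputs (consistency regularization).
    """
    with torch.no_grad():
        soft_targets = F.softmax(teacher(x_u), dim=-1)   # on-the-fly labels
    x_aug = augment(x_u)                                  # input perturbation
    log_probs = F.log_softmax(student(x_aug), dim=-1)
    return F.kl_div(log_probs, soft_targets, reduction='batchmean')
```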

Semi-Supervised Learning with Data Augmentation for End-to-End ASR [article]

Felix Weninger, Franco Mana, Roberto Gemello, Jesús Andrés-Ferrer, Puming Zhan
2020 arXiv   pre-print
Specifically, we generate the pseudo labels for the unlabeled data on the fly with a seq2seq model after perturbing the input features with DA.  ...  As a result, the Noisy Student algorithm with soft labels and consistency regularization achieves 10.4% word error rate (WER) reduction when adding 475 h of unlabeled data, corresponding to a recovery  ...  Another possible implementation of SSL is via teacher-student training, where a 'student' model is trained to replicate the outputs of a powerful 'teacher' model on the unlabeled data [20].  ... 
arXiv:2007.13876v1 fatcat:hbcpvsgi55cevpawp7vpyzvxcq

Semi-Supervised Learning by Label Gradient Alignment [article]

Jacob Jackson, John Schulman
2019 arXiv   pre-print
We present label gradient alignment, a novel algorithm for semi-supervised learning which imputes labels for the unlabeled data and trains on the imputed labels.  ...  We then formulate an optimization problem whose objective is to minimize the distance between the labeled and the unlabeled data in this space, and we solve it by gradient descent on the imputed labels  ...  SSL seeks to use unlabeled data together with a small amount of labeled data to produce a better model than could be obtained from the labeled data alone.  ... 
arXiv:1902.02336v1 fatcat:mktd2h6rqfckzpsbujf4kurci4
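
A rough sketch of one label-gradient-alignment step as the snippet describes it, assuming the imputed labels are continuous (soft) tensors with requires_grad=True and a loss that is differentiable in its targets (e.g., soft cross-entropy). Each batch is represented by the gradient its loss induces on the model parameters, and the imputed labels (not the weights) are updated to align those gradients.

```python
import torch

def label_gradient_alignment_step(model, loss_fn, x_lab, y_lab,
                                  x_unlab, y_imputed, lr_label=0.1):
    """Update imputed labels to pull the unlabeled-batch gradient
    toward the labeled-batch gradient in parameter-gradient space."""
    params = [p for p in model.parameters() if p.requires_grad]

    g_lab = torch.autograd.grad(loss_fn(model(x_lab), y_lab), params)
    g_unlab = torch.autograd.grad(loss_fn(model(x_unlab), y_imputed),
                                  params, create_graph=True)

    # squared distance between the two gradient representations
    dist = sum(((gu - gl.detach()) ** 2).sum()
               for gu, gl in zip(g_unlab, g_lab))

    (grad_y,) = torch.autograd.grad(dist, y_imputed)
    with torch.no_grad():
        y_imputed -= lr_label * grad_y     # gradient descent on the labels
    return y_imputed
```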
Showing results 1 — 15 out of 18,205 results