Utilization of unlabeled development data for speaker verification

Gang Liu, Chengzhu Yu, Navid Shokouhi, Abhinav Misra, Hua Xing, John H. L. Hansen
2014 2014 IEEE Spoken Language Technology Workshop (SLT)  
This study investigates a potential solution to resolve this challenge by effectively utilizing unlabeled development data with universal imposter clustering.  ...  Unfortunately, this efficiency is obtained at the cost of a required large corpus of labeled development data, which is too expensive/unrealistic in many cases.  ...  CONCLUSIONS In this study, we investigated a series of algorithms for speaker verification involving unlabeled development data. We explored two approaches to utilize the unlabeled development data.  ... 
doi:10.1109/slt.2014.7078611 dblp:conf/slt/LiuYSMXH14 fatcat:xigylza7izf53kzblgx7s6k6sm

Graph-based Label Propagation for Semi-Supervised Speaker Identification [article]

Long Chen, Venkatesh Ravichandran, Andreas Stolcke
2021 arXiv   pre-print
Speaker identification in the household scenario (e.g., for smart speakers) is typically based on only a few enrollment utterances but a much larger set of unlabeled data, suggesting semisupervised learning  ...  We show in experiments on the VoxCeleb dataset that this approach makes effective use of unlabeled data and improves speaker identification accuracy compared to two state-of-the-art scoring methods as  ...  Rationales for comparing these methods are: 1) CS and CSEA are the most commonly used methods for speaker verification in previous works, but they do not use unlabeled data for prediction. 2) 2-CS and  ... 
arXiv:2106.08207v1 fatcat:5kjp7bohtfbp3lqgn3wd2emmwe

Weakly Supervised PLDA Training [article]

Lantian Li, Yixiang Chen, Dong Wang, Chenghui Zhao
2017 arXiv   pre-print
PLDA is a popular normalization approach for the i-vector model, and it has delivered state-of-the-art performance in speaker verification.  ...  However, PLDA training requires a large amount of labelled development data, which is highly expensive in most cases.  ...  Another approach to utilizing unlabelled data is to produce labels for these data automatically.  ... 
arXiv:1609.08441v2 fatcat:odryucpaajavlct4grirlle3iu

Cross-lingual Text-independent Speaker Verification Using Unsupervised Adversarial Discriminative Domain Adaptation

Wei Xia, Jing Huang, John H.L. Hansen
2019 ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Being able to improve cross-lingual speaker verification system using unlabeled data can greatly increase the robustness of the system and reduce human labeling costs.  ...  Further data analysis of ADDA adapted speaker embedding shows that the learned speaker embeddings can perform well on speaker classification for the target domain data, and are less dependent with respect  ...  To accomplish this, a set of unlabeled data for the new language is needed. We use the target domain AISHELL unlabeled training data.  ... 
doi:10.1109/icassp.2019.8682259 dblp:conf/icassp/XiaHH19 fatcat:ulwrq5klbbad7dfbzdtg3b6sga

Domain adaptation based Speaker Recognition on Short Utterances [article]

Ahilan Kanagasundaram, David Dean, Sridha Sridharan, Clinton Fookes
2016 arXiv   pre-print
short utterances are used for evaluation, the performance gain of in-domain speaker verification reduces at an increasing rate.  ...  over out-domain PLDA speaker verification when SWB and NIST data are respectively used for S normalization.  ...  Unfortunately, the performance of many of these approaches degrades rapidly as the available amount of enrolment and/or verification speech decreases [7, 8, 9], limiting the utility of speaker verification  ... 
arXiv:1610.02831v2 fatcat:26kuyr52oragrdttyty7tficqy

Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data [article]

Fuchuan Tong, Siqi Zheng, Min Zhang, Yafeng Chen, Hongbin Suo, Qingyang Hong, Lin Li
2022 arXiv   pre-print
In reality, we are often presented with enormous amounts of unlabeled data from multi-party meetings and discussions.  ...  We show that this proposed approach effectively uses unlabeled data and improves speaker recognition accuracy.  ...  In this work, we present techniques for exploiting unlabeled meeting data. We demonstrate that learning from unlabeled utterances is indeed a practical avenue for improving speaker verification.  ... 
arXiv:2204.11501v1 fatcat:4hjc5rgvxnhrjovhwpc4etjdau

Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization [article]

Mufan Sang, Haoqi Li, Fang Liu, Andrew O. Arnold, Li Wan
2022 arXiv   pre-print
Training speaker-discriminative and robust speaker verification systems without speaker labels is still challenging and worthwhile to explore.  ...  We also explore the effectiveness of alternative online data augmentation strategies on both the time domain and frequency domain.  ...  To investigate the effectiveness of data augmentation, models are trained with different data augmentation strategies on the development set of VoxCeleb1 which contains 148,642 utterances for 1211 speakers  ... 
arXiv:2112.04459v2 fatcat:amiklcmgjzdsnl4vssn6t7icwi

An iterative framework for unsupervised learning in the PLDA based speaker verification

Wenbo Liu, Zhiding Yu, Ming Li
2014 The 9th International Symposium on Chinese Spoken Language Processing  
To automatically retrieve the speaker labels of unlabeled training data, we propose to use the Affinity Propagation (AP), a clustering method that takes pairwise data similarity as input, to generate the  ...  We present an iterative and unsupervised learning approach for the speaker verification task.  ...  We first perform clustering on the unlabeled development data to estimate the labels and train a PLDA model.  ... 
doi:10.1109/iscslp.2014.6936726 dblp:conf/iscslp/LiuYL14 fatcat:hbysaus6o5drdcagfrdwasilwu

Semi-Supervised Contrastive Learning with Generalized Contrastive Loss and Its Application to Speaker Recognition [article]

Nakamasa Inoue, Keita Goto
2020 arXiv   pre-print
In experiments, we applied the proposed framework to text-independent speaker verification on the VoxCeleb dataset.  ...  We demonstrate that GCL enables the learning of speaker embeddings in three manners, supervised learning, semi-supervised learning, and unsupervised learning, without any changes in the definition of the  ...  ACKNOWLEDGMENT This work was partially supported by the Japan Science and Technology Agency, ACT-X Grant JPMJAX1905, and the Japan Society for the Promotion of Science, KAKENHI Grant 19K22865.  ... 
arXiv:2006.04326v1 fatcat:5mfzgyfluzfq3jznyjmml67kxa

Local Training for PLDA in Speaker Verification [article]

Chenghui Zhao, Lantian Li, Dong Wang, April Pu
2016 arXiv   pre-print
PLDA is a popular normalization approach for the i-vector model, and it has delivered state-of-the-art performance in speaker verification.  ...  However, PLDA training requires a large amount of labeled development data, which is highly expensive in most cases.  ...  How to use unlabeled data is a critical problem in particular for practical systems.  ... 
arXiv:1609.08433v1 fatcat:yq7zl7dxh5ae3difb6spknuvia

Speaker verification using end-to-end adversarial language adaptation [article]

Johan Rohdin, Themos Stafylakis, Anna Silnova, Hossein Zeinali, Lukas Burget, Oldrich Plchot
2018 arXiv   pre-print
In this paper we investigate the use of adversarial domain adaptation for addressing the problem of language mismatch between speaker recognition corpora.  ...  In the context of speaker verification, adversarial domain adaptation methods aim at minimizing certain divergences between the distribution that the utterance-level features follow (i.e. speaker embeddings  ...  DA methods for speaker verification are of particular interest, as for many real-world applications large amounts of target domain labeled data are rarely available.  ... 
arXiv:1811.02331v1 fatcat:5mjkpfvyxnbvtmkobd2whumkda

Neural Predictive Coding using Convolutional Neural Networks towards Unsupervised Learning of Speaker Characteristics [article]

Arindam Jati, Panayiotis Georgiou
2019 arXiv   pre-print
We train a convolutional deep siamese network to produce "speaker embeddings" by learning to separate 'same' vs 'different' speaker pairs which are generated from an unlabeled data of audio streams.  ...  This paper provides a novel approach, we term Neural Predictive Coding (NPC), to learn speaker-specific characteristics in a completely unsupervised manner from large amounts of unlabeled training data  ...  Data for speaker verification experiment A recently released large speaker verification corpus, VoxCeleb (version 1) [59] is employed for the speaker verification experiment.  ... 
arXiv:1802.07860v2 fatcat:6mhav3jkb5ewfno7z4nzfdkguu

Exploring the Use of an Unsupervised Autoregressive Model as a Shared Encoder for Text-Dependent Speaker Verification

Vijay Ravi, Ruchao Fan, Amber Afshan, Huanhua Lu, Abeer Alwan
2020 Interspeech 2020  
In this paper, we propose a novel way of addressing text-dependent automatic speaker verification (TD-ASV) by using a shared-encoder with task-specific decoders.  ...  Index Terms: speaker verification, unsupervised-learning, feature-representation, shared-encoder, domain-adaptation.  ...  Since we utilize out of domain data which do not have phrase-ID labels, we add an extra category for all utterances whose contents do not match the given 10 phrases of the evaluation data.  ... 
doi:10.21437/interspeech.2020-2957 dblp:conf/interspeech/RaviFALA20 fatcat:qbunf4hytrd6vpb3f4hiwhm6bi

Exploring the Use of an Unsupervised Autoregressive Model as a Shared Encoder for Text-Dependent Speaker Verification [article]

Vijay Ravi, Ruchao Fan, Amber Afshan, Huanhua Lu, Abeer Alwan
2020 arXiv   pre-print
In this paper, we propose a novel way of addressing text-dependent automatic speaker verification (TD-ASV) by using a shared-encoder with task-specific decoders.  ...  We show that the proposed approach can leverage from large, unlabeled, data-rich domains, and learn speech patterns independent of downstream tasks.  ...  Since we utilize out of domain data which do not have phrase-ID labels, we add an extra category for all utterances whose contents do not match the given 10 phrases of the evaluation data.  ... 
arXiv:2008.03615v1 fatcat:ke3hgnm5zbcizhbz2z6khw5j4e

Analysis of Language Dependent Front-End for Speaker Recognition

Srikanth Madikeri, Subhadeep Dey, Petr Motlicek
2018 Interspeech 2018  
In this paper, we address the scenario in which one can develop an Automatic Speech Recognizer with limited resources for a language present in the evaluation condition, thus enabling the use of a DNN acoustic  ...  In Deep Neural Network (DNN) i-vector based speaker recognition systems, acoustic models trained for Automatic Speech Recognition are employed to estimate sufficient statistics for i-vector modeling.  ...  This mean is estimated from the unlabeled development data in SRE2016. This data will be referred to as SRE16U.  ... 
doi:10.21437/interspeech.2018-2071 dblp:conf/interspeech/MadikeriDM18 fatcat:2z26w7klmfddfefytrvzxy5zsy