Utilization of unlabeled development data for speaker verification
2014
2014 IEEE Spoken Language Technology Workshop (SLT)
This study investigates a potential solution to this challenge by effectively utilizing unlabeled development data with universal imposter clustering. ...
Unfortunately, this efficiency is obtained at the cost of a required large corpus of labeled development data, which is too expensive/unrealistic in many cases. ...
CONCLUSIONS: In this study, we investigated a series of algorithms for speaker verification involving unlabeled development data. We explored two approaches to utilizing the unlabeled development data. ...
doi:10.1109/slt.2014.7078611
dblp:conf/slt/LiuYSMXH14
fatcat:xigylza7izf53kzblgx7s6k6sm
Graph-based Label Propagation for Semi-Supervised Speaker Identification
[article]
2021
arXiv
pre-print
Speaker identification in the household scenario (e.g., for smart speakers) is typically based on only a few enrollment utterances but a much larger set of unlabeled data, suggesting semi-supervised learning ...
We show in experiments on the VoxCeleb dataset that this approach makes effective use of unlabeled data and improves speaker identification accuracy compared to two state-of-the-art scoring methods as ...
Rationales for comparing these methods are: 1) CS and CSEA are the most commonly used methods for speaker verification in previous works, but they do not use unlabeled data for prediction. 2) 2-CS and ...
arXiv:2106.08207v1
fatcat:5kjp7bohtfbp3lqgn3wd2emmwe
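A minimal sketch of the graph-based label propagation idea in the entry above, assuming speaker embeddings (e.g. x-vectors) are already extracted; the data, dimensions, and hyperparameters are illustrative, not the authors' configuration.

```python
import numpy as np
from sklearn.semi_supervised import LabelSpreading

# Illustrative data: 200 household utterances with 192-dim speaker embeddings.
embeddings = np.random.randn(200, 192)

# Only a few enrollment utterances are labeled; -1 marks unlabeled utterances.
labels = np.full(200, -1)
labels[:10] = np.repeat(np.arange(5), 2)   # 5 speakers, 2 enrollment utterances each

# Build a k-NN graph over embeddings and propagate the enrollment labels
# to the unlabeled nodes.
propagator = LabelSpreading(kernel="knn", n_neighbors=10, alpha=0.2)
propagator.fit(embeddings, labels)

predicted_speakers = propagator.transduction_  # inferred label for every utterance
```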
Weakly Supervised PLDA Training
[article]
2017
arXiv
pre-print
PLDA is a popular normalization approach for the i-vector model, and it has delivered state-of-the-art performance in speaker verification. ...
However, PLDA training requires a large amount of labelled development data, which is highly expensive in most cases. ...
Another approach to utilizing unlabelled data is to produce labels for these data automatically. ...
arXiv:1609.08441v2
fatcat:odryucpaajavlct4grirlle3iu
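As a hedged illustration of "producing labels automatically" for unlabelled development data, one generic recipe is to cluster length-normalized i-vectors and treat cluster ids as pseudo speaker labels for PLDA training; the clustering method and threshold below are assumptions, not the paper's procedure.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.preprocessing import normalize

# Illustrative: 1000 unlabelled development i-vectors, length-normalized.
ivectors = normalize(np.random.randn(1000, 400))

# Cluster with cosine distance; each cluster id becomes a pseudo speaker label.
# (scikit-learn >= 1.2 uses `metric`; older versions call this argument `affinity`.)
clusterer = AgglomerativeClustering(
    n_clusters=None,
    distance_threshold=0.8,   # illustrative cut-off, tuned on held-out data in practice
    metric="cosine",
    linkage="average",
)
pseudo_labels = clusterer.fit_predict(ivectors)

# pseudo_labels can then be passed to a PLDA trainer (e.g. Kaldi) as speaker ids.
```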
Cross-lingual Text-independent Speaker Verification Using Unsupervised Adversarial Discriminative Domain Adaptation
2019
ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Being able to improve cross-lingual speaker verification system using unlabeled data can greatly increase the robustness of the system and reduce human labeling costs. ...
Further data analysis of ADDA adapted speaker embedding shows that the learned speaker embeddings can perform well on speaker classification for the target domain data, and are less dependent with respect ...
To accomplish this, a set of unlabeled data for the new language is needed. We use the target domain AISHELL unlabeled training data. ...
doi:10.1109/icassp.2019.8682259
dblp:conf/icassp/XiaHH19
fatcat:ulwrq5klbbad7dfbzdtg3b6sga
Domain adaptation based Speaker Recognition on Short Utterances
[article]
2016
arXiv
pre-print
short utterances are used for evaluation, the performance gain of in-domain speaker verification reduces at an increasing rate. ...
over out-domain PLDA speaker verification when SWB and NIST data are respectively used for S normalization. ...
Unfortunately, the performance of many of these approaches degrades rapidly as the available amount of enrolment and/or verification speech decreases [7, 8, 9] , limiting the utility of speaker verification ...
arXiv:1610.02831v2
fatcat:26kuyr52oragrdttyty7tficqy
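The "S normalization" mentioned above is symmetric score normalization; a minimal sketch, assuming a cohort of (e.g. SWB or NIST) utterances has already been scored against the enrollment model and the test segment:

```python
import numpy as np

def s_norm(raw_score, enroll_cohort_scores, test_cohort_scores):
    """Symmetric score normalization: the average of z-norm and t-norm terms."""
    mu_e, sd_e = enroll_cohort_scores.mean(), enroll_cohort_scores.std()
    mu_t, sd_t = test_cohort_scores.mean(), test_cohort_scores.std()
    return 0.5 * ((raw_score - mu_e) / sd_e + (raw_score - mu_t) / sd_t)

# enroll_cohort_scores: scores of the enrollment model against the cohort
# test_cohort_scores:   scores of the test utterance against the same cohort
```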
Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data
[article]
2022
arXiv
pre-print
In reality, we are often presented with enormous amounts of unlabeled data from multi-party meetings and discussions. ...
We show that this proposed approach effectively uses unlabeled data and improves speaker recognition accuracy. ...
In this work, we present techniques for exploiting unlabeled meeting data. We demonstrate that learning from unlabeled utterances is indeed a practical avenue for improving speaker verification. ...
arXiv:2204.11501v1
fatcat:4hjc5rgvxnhrjovhwpc4etjdau
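A minimal sketch of the kind of graph-convolution layer such semi-supervised approaches build on (a simplified mean-aggregation variant in PyTorch, not the authors' exact architecture):

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One simplified graph-convolution layer: aggregate neighbours, then project."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, adj, x):
        # adj: (N, N) adjacency matrix with self-loops; x: (N, in_dim) node features
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        x = (adj @ x) / deg                      # mean aggregation over neighbours
        return torch.relu(self.linear(x))

# Nodes would be utterance embeddings from a meeting; labeled nodes supervise a
# classification loss while unlabeled nodes receive labels through the graph.
```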
Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization
[article]
2022
arXiv
pre-print
Training speaker-discriminative and robust speaker verification systems without speaker labels is still challenging and worthwhile to explore. ...
We also explore the effectiveness of alternative online data augmentation strategies on both the time domain and frequency domain. ...
To investigate the effectiveness of data augmentation, models are trained with different data augmentation strategies on the development set of VoxCeleb1, which contains 148,642 utterances from 1,211 speakers ...
arXiv:2112.04459v2
fatcat:amiklcmgjzdsnl4vssn6t7icwi
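A hedged sketch of the Simple Siamese (SimSiam) objective referred to above: negative cosine similarity between a predictor output and a stop-gradient target, averaged over two augmented views. The encoder, predictor, and augmentation pipeline are assumed to exist elsewhere.

```python
import torch.nn.functional as F

def simsiam_loss(p1, p2, z1, z2):
    """p1, p2: predictor outputs; z1, z2: encoder outputs for two augmented views."""
    def neg_cos(p, z):
        # detach() applies the stop-gradient that prevents representational collapse
        return -F.cosine_similarity(p, z.detach(), dim=-1).mean()
    return 0.5 * (neg_cos(p1, z2) + neg_cos(p2, z1))
```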
An iterative framework for unsupervised learning in the PLDA based speaker verification
2014
The 9th International Symposium on Chinese Spoken Language Processing
To automatically retrieve the speaker labels of unlabeled training data, we propose to use Affinity Propagation (AP), a clustering method that takes pairwise data similarity as input, to generate the ...
We present an iterative and unsupervised learning approach for the speaker verification task. ...
We first perform clustering on the unlabeled development data to estimate the labels and train a PLDA model. ...
doi:10.1109/iscslp.2014.6936726
dblp:conf/iscslp/LiuYL14
fatcat:hbysaus6o5drdcagfrdwasilwu
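A minimal sketch of the Affinity Propagation step the entry describes, using scikit-learn with a precomputed pairwise similarity matrix (e.g. cosine or PLDA scores between i-vectors); the damping value is illustrative.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Illustrative symmetric matrix of pairwise similarity scores between utterances.
scores = np.random.rand(300, 300)
similarity = 0.5 * (scores + scores.T)

ap = AffinityPropagation(affinity="precomputed", damping=0.7, random_state=0)
pseudo_speakers = ap.fit_predict(similarity)

# The resulting pseudo speaker labels seed an initial PLDA model, which the
# iterative framework then refines by re-clustering with updated PLDA scores.
```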
Semi-Supervised Contrastive Learning with Generalized Contrastive Loss and Its Application to Speaker Recognition
[article]
2020
arXiv
pre-print
In experiments, we applied the proposed framework to text-independent speaker verification on the VoxCeleb dataset. ...
We demonstrate that GCL enables the learning of speaker embeddings in three manners, supervised learning, semi-supervised learning, and unsupervised learning, without any changes in the definition of the ...
ACKNOWLEDGMENT This work was partially supported by the Japan Science and Technology Agency, ACT-X Grant JPMJAX1905, and the Japan Society for the Promotion of Science, KAKENHI Grant 19K22865. ...
arXiv:2006.04326v1
fatcat:5mfzgyfluzfq3jznyjmml67kxa
Local Training for PLDA in Speaker Verification
[article]
2016
arXiv
pre-print
PLDA is a popular normalization approach for the i-vector model, and it has delivered state-of-the-art performance in speaker verification. ...
However, PLDA training requires a large amount of labeled development data, which is highly expensive in most cases. ...
How to use unlabeled data is a critical problem in particular for practical systems. ...
arXiv:1609.08433v1
fatcat:yq7zl7dxh5ae3difb6spknuvia
Speaker verification using end-to-end adversarial language adaptation
[article]
2018
arXiv
pre-print
In this paper we investigate the use of adversarial domain adaptation for addressing the problem of language mismatch between speaker recognition corpora. ...
In the context of speaker verification, adversarial domain adaptation methods aim at minimizing certain divergences between the distribution that the utterance-level features follow (i.e. speaker embeddings ...
DA methods for speaker verification are of particular interest, as for many real-world applications large amounts of target domain labeled data are rarely available. ...
arXiv:1811.02331v1
fatcat:5mjkpfvyxnbvtmkobd2whumkda
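One common way to implement such adversarial domain adaptation is a DANN-style gradient reversal layer in front of a domain (language) classifier; this is a generic sketch, not necessarily the formulation used in the paper.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips and scales gradients in the backward pass."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradients push the encoder toward language-invariant embeddings.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# Usage: language_logits = language_classifier(grad_reverse(speaker_embedding, lam=0.5))
```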
Neural Predictive Coding using Convolutional Neural Networks towards Unsupervised Learning of Speaker Characteristics
[article]
2019
arXiv
pre-print
We train a convolutional deep siamese network to produce "speaker embeddings" by learning to separate 'same' vs 'different' speaker pairs which are generated from unlabeled audio streams. ...
This paper provides a novel approach, which we term Neural Predictive Coding (NPC), to learn speaker-specific characteristics in a completely unsupervised manner from large amounts of unlabeled training data ...
Data for the speaker verification experiment: A recently released large speaker verification corpus, VoxCeleb (version 1) [59], is employed. ...
arXiv:1802.07860v2
fatcat:6mhav3jkb5ewfno7z4nzfdkguu
Exploring the Use of an Unsupervised Autoregressive Model as a Shared Encoder for Text-Dependent Speaker Verification
2020
Interspeech 2020
In this paper, we propose a novel way of addressing text-dependent automatic speaker verification (TD-ASV) by using a shared-encoder with task-specific decoders. ...
Index Terms: speaker verification, unsupervised-learning, feature-representation, shared-encoder, domain-adaptation. ...
Since we utilize out-of-domain data, which do not have phrase-ID labels, we add an extra category for all utterances whose contents do not match the given 10 phrases of the evaluation data. ...
doi:10.21437/interspeech.2020-2957
dblp:conf/interspeech/RaviFALA20
fatcat:qbunf4hytrd6vpb3f4hiwhm6bi
Exploring the Use of an Unsupervised Autoregressive Model as a Shared Encoder for Text-Dependent Speaker Verification
[article]
2020
arXiv
pre-print
In this paper, we propose a novel way of addressing text-dependent automatic speaker verification (TD-ASV) by using a shared-encoder with task-specific decoders. ...
We show that the proposed approach can leverage from large, unlabeled, data-rich domains, and learn speech patterns independent of downstream tasks. ...
Since we utilize out-of-domain data, which do not have phrase-ID labels, we add an extra category for all utterances whose contents do not match the given 10 phrases of the evaluation data. ...
arXiv:2008.03615v1
fatcat:ke3hgnm5zbcizhbz2z6khw5j4e
Analysis of Language Dependent Front-End for Speaker Recognition
2018
Interspeech 2018
In this paper, we address the scenario in which one can develop an Automatic Speech Recognizer with limited resources for a language present in the evaluation condition, thus enabling the use of a DNN acoustic ...
In Deep Neural Network (DNN) i-vector based speaker recognition systems, acoustic models trained for Automatic Speech Recognition are employed to estimate sufficient statistics for i-vector modeling. ...
This mean is estimated from the unlabeled development data in SRE2016. This data will be referred to as SRE16U. ...
doi:10.21437/interspeech.2018-2071
dblp:conf/interspeech/MadikeriDM18
fatcat:2z26w7klmfddfefytrvzxy5zsy
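A minimal sketch of the centering step described above, assuming i-vectors for the unlabeled SRE16U development data and for the evaluation trials are already extracted; the dimensions are illustrative.

```python
import numpy as np

sre16u_ivectors = np.random.randn(5000, 600)   # unlabeled in-domain development i-vectors
eval_ivectors = np.random.randn(100, 600)      # evaluation i-vectors

# Estimate the mean from the unlabeled in-domain data and subtract it before
# length normalization and PLDA scoring.
in_domain_mean = sre16u_ivectors.mean(axis=0)
centered_eval = eval_ivectors - in_domain_mean
```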
Showing results 1–15 of 648.