Automatic Error Correction for Speaker Embedding Learning with Noisy Labels

Fuchuan Tong, Yan Liu, Song Li, Jie Wang, Lin Li, Qingyang Hong
2021 Conference of the International Speech Communication Association  
Although deep neural networks have achieved superior performance in speaker verification tasks, much of their success depends on the availability of large-scale, carefully labeled datasets. However, noisy labels often occur during data collection. In this paper, we propose an automatic error correction method for deep speaker embedding learning with noisy labels. Specifically, a label noise correction loss is proposed that leverages a model's generalization capability to correct noisy labels during training. In addition, we improve the vanilla AM-Softmax to estimate a more robust speaker posterior by introducing sub-centers. When applied to the VoxCeleb dataset, the proposed method performs gracefully when noisy labels are introduced. Moreover, when combined with the Bayesian estimation of PLDA with noisy training labels at the back-end, the whole system performs better under conditions in which noisy labels are present.
doi:10.21437/interspeech.2021-2021 dblp:conf/interspeech/TongLLWLH21 fatcat:6ynqdzftlrckhbzuvo3fbu42ey
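
The abstract mentions extending AM-Softmax with sub-centers to obtain a more robust speaker posterior, but does not give the formulation. The sketch below is one common way such a layer can be written, not necessarily the authors' exact method: each speaker is assumed to own K weight vectors, the class score is assumed to be the maximum cosine similarity over those K sub-centers, and the additive margin is applied to the labeled class only. The function name `subcenter_am_softmax` and the parameter defaults (`scale=30.0`, `margin=0.2`) are illustrative assumptions.

```python
import math

def _normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def _cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(_normalize(u), _normalize(v)))

def subcenter_am_softmax(embedding, sub_centers, label, scale=30.0, margin=0.2):
    """Return (loss, posterior) for a single utterance embedding.

    sub_centers[c] is a list of K weight vectors for speaker c.
    Assumption (not stated in the abstract): the class score is the
    maximum cosine similarity over that speaker's sub-centers.
    """
    # One score per speaker: best-matching sub-center.
    cos = [max(_cosine(embedding, w) for w in centers) for centers in sub_centers]
    # Additive margin is subtracted from the labeled class only.
    logits = [scale * (c - margin if i == label else c) for i, c in enumerate(cos)]
    # Numerically stable softmax over speakers.
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    posterior = [e / total for e in exps]
    loss = -math.log(posterior[label])
    return loss, posterior

# Toy example: two speakers, two sub-centers each, 2-D embeddings.
centers = [
    [[1.0, 0.0], [0.0, 1.0]],    # speaker 0
    [[-1.0, 0.0], [0.0, -1.0]],  # speaker 1
]
loss_clean, post = subcenter_am_softmax([0.1, 0.9], centers, label=0)
loss_noisy, _ = subcenter_am_softmax([0.1, 0.9], centers, label=1)
```

In this toy setup the embedding lies close to one of speaker 0's sub-centers, so the loss under label 0 is much smaller than under the (deliberately wrong) label 1. A gap of this kind between the given label and the model's posterior is the sort of signal a label correction scheme could exploit, although how the paper's correction loss uses it is not described in the abstract.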