Deep Speaker Embedding Extraction with Channel-Wise Feature Responses and Additive Supervision Softmax Loss Function
2019
Interspeech 2019
In speaker verification, convolutional neural networks (CNNs) have been successfully leveraged to achieve strong performance. Most CNN-based models focus primarily on learning distinctive speaker embeddings along the horizontal direction (the time axis). However, the feature relationships between channels are usually neglected. In this paper, we first aim toward an alternative direction of recalibrating the channel-wise features by introducing the recently proposed
doi:10.21437/interspeech.2019-1704
dblp:conf/interspeech/ZhouJLLH19
fatcat:6va5knr4cnf4lhh2mjlpecybua