Dynamic Margin Softmax Loss for Speaker Verification

Dao Zhou, Longbiao Wang, Kong Aik Lee, Yibo Wu, Meng Liu, Jianwu Dang, Jianguo Wei
Interspeech 2020
We propose a dynamic-margin softmax loss for training deep speaker embedding neural networks. Our proposal is inspired by the additive-margin softmax (AM-Softmax) loss reported earlier. In AM-Softmax loss, a constant margin is used for all training samples. However, the angle between the feature vector and the ground-truth class center is rarely the same for all samples. Furthermore, the angle also changes during training. Thus, it is more reasonable to set a dynamic margin for each training sample. In this paper, we propose to dynamically set the margin of each training sample commensurate with the cosine angle of that sample, hence the name dynamic-additive-margin softmax (DAM-Softmax) loss. More specifically, the smaller the cosine angle is, the larger the margin between the training sample and the corresponding class in the feature space should be to promote intra-class compactness. Experimental results show that the proposed DAM-Softmax loss achieves state-of-the-art performance on the VoxCeleb dataset, reaching 1.94% in equal error rate (EER). In addition, our method also outperforms AM-Softmax loss when evaluated on the Speakers in the Wild (SITW) corpus.
doi:10.21437/interspeech.2020-1106 dblp:conf/interspeech/ZhouWLWLDW20 fatcat:iaqjbpklbjgnln7scxamd3ju7m
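
The contrast the abstract draws between a constant margin and a per-sample margin can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not state the margin formula, so `hypothetical_dynamic_margin`, the `base_margin` and `scale` values, and the reading of "cosine angle" as the cosine between the embedding and its ground-truth class center are all assumptions made here for illustration only.

```python
import numpy as np


def l2_normalize(x, axis):
    """Scale vectors along `axis` to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def margin_softmax_loss(embeddings, class_weights, labels, margins, scale=30.0):
    """Additive-margin softmax loss with a scalar or per-sample margin.

    embeddings:    (batch, dim) speaker embeddings
    class_weights: (num_classes, dim) class centers
    labels:        (batch,) integer speaker labels
    margins:       scalar or (batch,) margin subtracted from the target cosine
    scale:         assumed scaling factor applied to all cosines
    """
    cos = l2_normalize(embeddings, 1) @ l2_normalize(class_weights, 1).T
    rows = np.arange(len(labels))
    logits = scale * cos
    # Only the ground-truth class is penalized by the margin.
    logits[rows, labels] = scale * (cos[rows, labels] - margins)
    # Numerically stable log-softmax over classes.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[rows, labels].mean()


def hypothetical_dynamic_margin(target_cos, base_margin=0.2):
    """Illustrative rule only (not taken from the paper):
    the margin grows as the target cosine shrinks."""
    return base_margin * np.exp(1.0 - target_cos)
```

Passing a scalar as `margins` gives the constant-margin (AM-Softmax style) case; passing the output of `hypothetical_dynamic_margin` evaluated on each sample's target cosine gives a per-sample margin. Under this particular rule the margin is never smaller than `base_margin`, so the resulting loss is at least as large as in the constant-margin case.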