A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
The file type is application/pdf
.
Phone-centric local variability vector for text-constrained speaker verification
2015
Interspeech 2015
unpublished
This paper investigates the use of frame alignment given by a deep neural network (DNN) for text-constrained speaker verification task, where the lexical contents of the test utterances are limited to a finite set of vocabulary. The DNN makes use of information carried by the target and its contextual frames to assign it probabilistically to one of the phonetic states. The frame alignment is therefore more precise and less ambiguous than that generated by a Gaussian mixture model (GMM). Using
doi:10.21437/interspeech.2015-90
fatcat:s4rz2p33djeitcf7np2fmhf7du