I-vectors in the context of phonetically-constrained short utterances for speaker verification

Anthony Larcher, Pierre-Michel Bousquet, Kong Aik Lee, Driss Matrouf, Haizhou Li, Jean-Francois Bonastre
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Short speech duration remains a critical factor of performance degradation when deploying a speaker verification system. To overcome this difficulty, a large number of commercial applications impose the use of fixed pass-phrases. In this context, we show that the performance of the popular i-vector approach can be greatly improved by taking advantage of the phonetic information that they convey. Moreover, as i-vectors require a conditioning process to reach high accuracy, we show that further improvements are possible by taking advantage of this phonetic information within the normalisation process. We compare two methods, Within Class Covariance Normalization (WCCN) and Eigen Factor Radial (EFR), both relying on parameters estimated on the same development data. Our study suggests that WCCN is more robust to data mismatch but less efficient than EFR when the development data has a better match with the test data.
doi:10.1109/icassp.2012.6288986 dblp:conf/icassp/LarcherBLMLB12 fatcat:qkkwi46wr5hd7lkfc56dxw56gq