Speaker identification with whispered speech based on modified LFCC parameters and feature mapping

Xing Fan, John H.L. Hansen
2009 IEEE International Conference on Acoustics, Speech and Signal Processing
Much recent research in speaker recognition has been devoted to robustness against microphone and channel effects. However, changes in vocal effort, especially whispered speech, present significant challenges in maintaining system performance. Because whisper lacks any periodic excitation, its spectral structure differs from that of neutral speech. As a result, speaker ID systems, which are trained mainly on high-energy voiced phonemes, degrade when tested on whisper.
This study considers a front-end feature compensation method for whispered speech to improve speaker recognition using a neutral-trained system. First, an alternative feature vector based on linear frequency cepstral coefficients (LFCC) is introduced, motivated by spectral analysis of both speech modes. Next, a feature mapping is proposed, for the first time, to reduce whisper/neutral mismatch in speaker ID. The mapping is applied frame by frame between two speaker-independent Gaussian mixture models (GMMs), one of whispered and one of neutral speech. Text-independent, closed-set speaker ID results show an absolute 20% improvement in accuracy over a traditional MFCC-based system. This result confirms a viable approach to improving speaker ID performance between neutral and whispered speech conditions.
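
The frame-by-frame mapping between two GMMs can be illustrated with a minimal sketch. The abstract does not give the mapping equations, so the code below assumes the common top-1 Gaussian shift-and-scale form of feature mapping, diagonal covariances, and a one-to-one pairing of components by index between the whisper and neutral models. The function names and the dictionary layout (`weights`, `means`, `vars`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def log_gauss_diag(x, means, variances):
    """Log-density of one frame under each diagonal-covariance Gaussian."""
    diff = x - means
    return -0.5 * (np.sum(np.log(2.0 * np.pi * variances), axis=1)
                   + np.sum(diff * diff / variances, axis=1))

def map_frame(x, whisper, neutral):
    """Map one whispered frame toward the neutral feature space.

    Assumption: component k of `whisper` corresponds to component k of
    `neutral` (for example, if one model was adapted from the other).
    """
    # Pick the whisper component that best explains the frame.
    scores = np.log(whisper["weights"]) + log_gauss_diag(
        x, whisper["means"], whisper["vars"])
    k = int(np.argmax(scores))
    # Shift and scale the frame from whisper component k to neutral component k.
    scale = np.sqrt(neutral["vars"][k] / whisper["vars"][k])
    return (x - whisper["means"][k]) * scale + neutral["means"][k]

def map_utterance(frames, whisper, neutral):
    """Apply the mapping independently to every frame (rows of `frames`)."""
    return np.stack([map_frame(x, whisper, neutral) for x in frames])
```

In such a setup, the mapped frames (for instance LFCC vectors) would then be scored against speaker models trained on neutral speech; how the two speaker-independent GMMs are trained and paired in the paper itself is not stated in the abstract.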
doi:10.1109/icassp.2009.4960643 dblp:conf/icassp/FanH09 fatcat:udizeng46bdetmdivf5toidznq