New single-ended objective measure for non-intrusive speech quality evaluation

Abdulhussain E. Mahdi, Dorel Picovici
2008 Signal, Image and Video Processing  
This article proposes a new output-based method for non-intrusive assessment of the speech quality of voice communication systems and evaluates its performance. The method requires access only to the processed (degraded) speech, and is based on measuring perception-motivated objective auditory distances between the voiced parts of the output speech and appropriately matching references extracted from a pre-formulated codebook. The codebook is formed by optimally clustering a large number of [...] speech vectors extracted from a database of clean speech records. The auditory distances are then mapped into objective Mean Opinion listening quality scores. An efficient data-mining tool known as the Self-Organizing Map (SOM) performs the required clustering and mapping/reference-matching processes. To obtain a perception-based, speaker-independent parametric representation of the speech, three domain transformation techniques have been investigated. The first is based on a Perceptual Linear Prediction (PLP) model, the second uses a Bark Spectrum (BS) analysis, and the third uses Mel-Frequency Cepstrum Coefficients (MFCC). Reported evaluation results show that the proposed method correlates highly with subjective listening quality scores, yielding accuracy similar to that of ITU-T P.563 while maintaining relatively low computational complexity. Results also demonstrate that the method outperforms PESQ in a number of distortion conditions, such as speech degraded by channel impairments.
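The codebook-and-distance idea described above can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption: a small one-dimensional SOM written directly in NumPy, plain Euclidean distance standing in for the perception-motivated auditory distance, and generic feature vectors standing in for PLP, BS or MFCC frames. The function names `train_som` and `mean_codebook_distance` are hypothetical. The final mapping from distance to a listening quality score is left out, since the abstract does not specify it.

```python
import numpy as np


def train_som(data, n_units=8, epochs=20, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map; returns an (n_units, dim) codebook.

    Assumes len(data) >= n_units. All hyperparameters are illustrative.
    """
    rng = np.random.default_rng(seed)
    # Initialise the codebook from randomly chosen training vectors.
    weights = data[rng.choice(len(data), n_units, replace=False)].astype(float)
    positions = np.arange(n_units)
    sigma0 = n_units / 2.0
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                    # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)    # shrinking neighbourhood
            # Best matching unit: the codebook entry closest to x.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighbourhood along the 1-D map around the BMU.
            h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def mean_codebook_distance(features, codebook):
    """Average distance from each feature vector to its nearest codebook entry.

    Euclidean distance is used here only as a stand-in for an auditory distance.
    """
    diffs = features[:, None, :] - codebook[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return float(dists.min(axis=1).mean())
```

In use, the codebook would be trained once on feature vectors from clean speech, and `mean_codebook_distance` would then be evaluated on feature vectors from the voiced frames of a degraded signal; a larger value indicates that the degraded speech lies further from the clean references.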
doi:10.1007/s11760-008-0092-1 fatcat:4tbjxdislbhe3e35q42qhhewaa