VOICED SPEECH ENHANCEMENT BASED ON ADAPTIVE FILTERING OF SELECTED INTRINSIC MODE FUNCTIONS

KAIS KHALDI, MONIA TURKI-HADJ ALOUANE, ABDEL-OUAHAB BOUDRAA
2010 Advances in Adaptive Data Analysis  
This is an author-deposited version published in: http://sam.ensam.eu Handle ID: http://hdl.handle.net/10985/8987

In this paper, a new method for voiced speech enhancement combining the Empirical Mode Decomposition (EMD) and the Adaptive Center Weighted Average (ACWA) filter is introduced. The noisy signal is decomposed adaptively into intrinsic oscillatory components called Intrinsic Mode Functions (IMFs). Since the structure of voiced speech is mostly distributed over the low and medium frequencies, the shorter scale IMFs of the noisy signal are dominated by noise, whereas the longer scale ones are less noisy. Therefore, the main idea of the proposed approach is to filter only the shorter scale IMFs and to keep the longer scale ones unchanged. In fact, filtering the longer scale IMFs would introduce distortion rather than reduce noise. The denoising method is applied to several voiced speech signals with different noise levels, and the results are compared with those of the wavelet approach, the ACWA filter, and EMD-ACWA (filtering of all IMFs using the ACWA filter). Relying on exhaustive simulations, we show the efficiency of the proposed method for reducing noise and its superiority over the other denoising methods: it improves the Signal-to-Noise Ratio (SNR) and offers better listening quality as measured by the Perceptual Evaluation of Speech Quality (PESQ). The present study is limited to signals corrupted by additive white Gaussian noise.

Keywords: Voiced speech enhancement; Empirical Mode Decomposition; ACWA filter.

Recorded and transmitted speech signals contain a considerable amount of acoustic background noise. Furthermore, with the growth of mobile communication applications, the problem of reducing the background noise has become increasingly important. Different strategies have been proposed for noise reduction, such as the Wiener filter [Proakis and Manolakis (1996)] or subspace filtering [Hermus et al. (2007)]. These linear methods have attracted significant interest and investigation due to their easy design and implementation.
However, these approaches are not very effective when signals contain sharp shapes or impulses of short duration. To overcome these limits, nonlinear approaches, such as wavelet analysis, have been proposed [Donoho (1995)]. However, the fixed basis functions limit the performance of wavelets on particular classes of nonstationary signals. Recently, a new data-driven method, called Empirical Mode Decomposition (EMD), has been introduced by Huang et al. [1998] for analyzing nonlinear and nonstationary signals. The EMD adaptively decomposes a signal into intrinsic oscillatory components called Intrinsic Mode Functions (IMFs). The basis functions of EMD are derived from the signal itself; hence, the analysis is adaptive, in contrast to traditional methods where the basis functions are fixed.
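The selective-filtering idea can be illustrated with a minimal sketch. It assumes the IMFs and the residue have already been computed by some EMD routine, so that the noisy signal equals the sum of its IMFs plus the residue. The smoother below is a generic local-statistics adaptive filter used as a stand-in; it is not the ACWA filter of the paper. The names `local_adaptive_smooth`, `enhance`, `n_filtered`, `noise_var` and `half_width` are hypothetical, and the rule for choosing how many IMFs to filter is not given in this excerpt, so it is left as a parameter.

```python
from statistics import fmean, pvariance


def local_adaptive_smooth(x, noise_var, half_width=2):
    """Stand-in smoother based on local mean and variance.

    NOTE: an illustrative substitute, not the paper's ACWA filter.
    noise_var is assumed to be known or estimated elsewhere.
    """
    n = len(x)
    out = []
    for t in range(n):
        window = x[max(0, t - half_width): min(n, t + half_width + 1)]
        m = fmean(window)
        v = pvariance(window, m)
        # Portion of the local variance attributed to the signal, clipped at 0.
        signal_var = max(v - noise_var, 0.0)
        gain = signal_var / v if v > 0 else 0.0
        # gain near 0: output the local mean; gain near 1: keep the sample.
        out.append(m + gain * (x[t] - m))
    return out


def enhance(imfs, residue, n_filtered, noise_var):
    """Filter only the first n_filtered (shorter scale) IMFs, then re-sum.

    imfs is assumed to be ordered from shortest to longest scale, as EMD
    produces them. The longer scale IMFs and the residue pass unchanged.
    """
    y = list(residue)
    for j, imf in enumerate(imfs):
        comp = local_adaptive_smooth(imf, noise_var) if j < n_filtered else imf
        for t in range(len(y)):
            y[t] += comp[t]
    return y
```

With `n_filtered = 0` the function simply reconstructs the input signal, and with `n_filtered = len(imfs)` it corresponds to filtering all IMFs, the configuration the abstract calls EMD-ACWA (here with the stand-in smoother).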
doi:10.1142/s1793536910000409 fatcat:jhz4425trvfh7mrsmmdhwzh7de