A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit the original URL.
The file type is application/pdf
.
Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality
2015
Journal of the Acoustical Society of America
As a means of speech separation, time-frequency masking applies a gain function to the timefrequency representation of noisy speech. On the other hand, nonnegative matrix factorization (NMF) addresses separation by linearly combining basis vectors from speech and noise models to approximate noisy speech. This paper presents an approach for improving the perceptual quality of speech separated from background noise at low signal-to-noise ratios. An ideal ratio mask is estimated, which separates
doi:10.1121/1.4928612
pmid:26428778
pmcid:PMC5392055
fatcat:2ydwgowujfbh7a4f5jaziqtmva