A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is application/pdf
.
On Training Targets for Deep Learning Approaches to Clean Speech Magnitude Spectrum Estimation
[post]
2020
unpublished
The estimation of the clean speech short-time magnitude spectrum (MS) is key for speech enhancement and separation. Moreover, an automatic speech recognition (ASR) system that employs a front-end relies on clean speech MS estimation to remain robust. Training targets for deep learning approaches to clean speech MS estimation fall into three main categories: computational auditory scene analysis (CASA), MS, and minimum mean-square error (MMSE) training targets. In this study, we aim to determine
doi:10.36227/techrxiv.13012760
fatcat:yuxdewgmujcmlppvle7afj56oy