Auxiliary Loss Function for Target Speech Extraction and Recognition with Weak Supervision Based on Speaker Characteristics
2021
Conference of the International Speech Communication Association
Automatic speech recognition systems deteriorate in the presence of overlapped speech. A popular approach to alleviate this is target speech extraction. The extraction system is usually trained with a loss function measuring the discrepancy between the estimated and the reference target speech. This often leads to distortions of the target signal, which are detrimental to recognition accuracy. Additionally, it is necessary to have the strong supervision provided by parallel data consisting of
doi:10.21437/interspeech.2021-986
dblp:conf/interspeech/ZmolikovaDR0C21
fatcat:5z4xpdjwmffbnby2ari7bf4uay