Speech Signal Enhancement in Cocktail Party Scenarios by Deep Learning based Virtual Sensing of Head-Mounted Microphones

Tim Fischer, Marco Caversaccio, Wilhelm Wimmer
2021 Hearing Research  
The cocktail party effect refers to the ability of human hearing to attend to a single conversation while filtering out all other background noise. To mimic this ability for people with hearing loss, scientists integrate beamforming algorithms into the signal processing path of hearing aids or the audio processors of implants. Although the performance of these algorithms depends strongly on the number and spatial arrangement of the microphones, most devices are equipped with a small number of microphones mounted close to each other on the audio processor housing. We measured and evaluated the impact of the number and spatial arrangement of hearing aid or head-mounted microphones on the performance of the established Minimum Variance Distortionless Response (MVDR) beamformer in cocktail party scenarios. The measurements revealed that the optimal microphone placement exploits monaural cues (pinna effect), is close to the target signal, and has a large distance spread owing to its spatial arrangement. However, this microphone placement is impractical for hearing aid or implant users, as it includes microphone positions such as the forehead. To avoid placing microphones at impractical positions, we propose a deep learning based virtual sensing estimation of the corresponding audio signals. The results of objective measures and a subjective listening test with 20 participants showed that the virtually sensed microphone signals significantly improved speech quality, especially in cocktail party scenarios with low signal-to-noise ratios. Subjective speech quality was assessed using a 3-alternative forced choice procedure to determine which of the presented speech mixtures was most pleasant to understand. Hearing aid and cochlear implant (CI) users might benefit from the presented approach using virtually sensed microphone signals, especially in noisy environments.
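For readers unfamiliar with the MVDR beamformer named in the abstract, the following is a minimal sketch of the textbook weight formula w = R^{-1} d / (d^H R^{-1} d) for a single frequency bin. It is not the authors' implementation: the three-microphone steering vector `d`, the noise covariance `R`, and all function names are hypothetical values chosen for illustration, and only the Python standard library is used.

```python
import cmath


def solve(A, b):
    """Solve A x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    M = [[complex(v) for v in A[i]] + [complex(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def mvdr_weights(R, d):
    """Textbook MVDR weights: w = R^{-1} d / (d^H R^{-1} d)."""
    Rinv_d = solve(R, d)
    denom = sum(di.conjugate() * v for di, v in zip(d, Rinv_d))
    return [v / denom for v in Rinv_d]


def beamform(w, x):
    """Beamformer output for one frequency bin: y = w^H x."""
    return sum(wi.conjugate() * xi for wi, xi in zip(w, x))


def output_power(w, R):
    """Noise power at the beamformer output: w^H R w (real for Hermitian R)."""
    Rw = [sum(R[i][j] * w[j] for j in range(len(w))) for i in range(len(w))]
    return beamform(w, Rw).real


# Hypothetical toy setup: 3 microphones, one frequency bin.
d = [cmath.exp(-1j * 0.5 * m) for m in range(3)]   # assumed steering vector
R = [[1.0, 0.2, 0.0],                              # assumed noise covariance
     [0.2, 1.0, 0.2],
     [0.0, 0.2, 1.0]]

w_mvdr = mvdr_weights(R, d)
w_das = [di / len(d) for di in d]                  # delay-and-sum baseline

gain = beamform(w_mvdr, d)                         # response towards the target
p_mvdr = output_power(w_mvdr, R)
p_das = output_power(w_das, R)
```

Under these assumptions, `gain` equals one up to rounding (the distortionless constraint), and `p_mvdr` does not exceed `p_das`, because MVDR minimizes output noise power among all weight vectors with unit response towards the target. The weights depend on the microphone geometry only through `d` and `R`, which is why the number and placement of the microphones affect performance.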
doi:10.1016/j.heares.2021.108294 pmid:34182232 fatcat:te6gzp4cxnbmtg6f4hrqjj2orq