Voice familiarity via training affects listening effort during a voice cue sensitivity task with vocoder-degraded speech
Understanding speech in real life can be challenging and effortful when multiple people speak at the same time. In speech-on-speech (SoS) perception, normal-hearing (NH) listeners can use fundamental frequency (F0) and vocal-tract length (VTL) voice cues to separate speech streams spoken by different talkers. However, such voice segregation can be challenging for cochlear implant (CI) users, who have reduced sensitivity to F0 and VTL voice cues. Additionally, vocoder studies show
that listening effort increases with increasing spectral degradation of the speech signal. In SoS listening, familiarity with a talker's voice can improve speech intelligibility for NH listeners. However, it is unknown whether voice familiarity improves sensitivity to F0 and VTL voice cues and affects listening effort, especially when the speech signal is vocoder-degraded. In this study, we aimed to induce voice familiarity through implicit short-term voice training. During training, participants listened to an audiobook segment of approximately 30 minutes comprising 13 chapters, and after each chapter they answered a context-related question. Voice sensitivity, namely just-noticeable differences (JNDs) for combined F0 and VTL voice cues (F0+VTL), was measured with an odd-one-out task in a three-alternative forced-choice adaptive paradigm. Simultaneously, listening effort was measured via pupillometry. Our results showed that voice training did not improve sensitivity to small F0+VTL voice cue differences measured at threshold level in either the non-vocoded or the vocoded condition. However, according to the Generalized Additive Mixed Model (GAMM) analysis, listening effort for vocoded speech was lower for trained (familiar) than for untrained voices. These findings suggest that voice familiarity gained through implicit voice training can benefit voice cue perception by reducing listening effort for vocoded speech, even in the absence of a behavioral effect.
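
To make the adaptive odd-one-out procedure concrete, the following is a minimal sketch of one possible three-interval adaptive track. The abstract does not state the up-down rule, step size, starting difference, or stopping criterion, so everything here is an assumption chosen for illustration: a 2-down-1-up rule (which targets roughly 70.7% correct), a multiplicative step of sqrt(2), a hypothetical starting difference of 12 (in arbitrary units of combined F0+VTL difference), and a threshold taken as the geometric mean of the last reversals. The names `run_staircase` and `make_ideal_listener` are hypothetical, and the simulated listener stands in for a real participant.

```python
import math
import random


def run_staircase(respond, start=12.0, step_factor=math.sqrt(2.0),
                  n_reversals=8, n_averaged=6, max_trials=200, rng=None):
    """Run a 2-down-1-up adaptive track and return a JND estimate.

    respond(diff, target) must return the interval (0, 1 or 2) the listener
    picks as the odd one out. All default parameters are illustrative
    assumptions, not values taken from the study.
    """
    rng = rng or random.Random(0)
    diff = start          # current voice difference between odd and standard
    streak = 0            # consecutive correct responses at this difference
    direction = 0         # -1: track is getting harder, +1: getting easier
    reversals = []        # differences at which the track changed direction

    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        target = rng.randrange(3)              # interval holding the odd voice
        correct = respond(diff, target) == target
        if correct:
            streak += 1
            if streak == 2:                    # two correct in a row: harder
                streak = 0
                if direction == +1:
                    reversals.append(diff)
                direction = -1
                diff /= step_factor
        else:                                  # one error: easier
            streak = 0
            if direction == -1:
                reversals.append(diff)
            direction = +1
            diff *= step_factor

    if not reversals:
        raise ValueError("no reversals reached; cannot estimate a threshold")
    tail = reversals[-n_averaged:]
    # Geometric mean, because the step is multiplicative.
    return math.exp(sum(math.log(r) for r in tail) / len(tail))


def make_ideal_listener(true_jnd):
    """Simulated listener: always correct at or above true_jnd, wrong below."""
    def respond(diff, target):
        return target if diff >= true_jnd else (target + 1) % 3
    return respond


if __name__ == "__main__":
    estimate = run_staircase(make_ideal_listener(true_jnd=2.5))
    print(f"Estimated JND: {estimate:.2f} (arbitrary units)")
```

With the deterministic listener above, the track descends until the difference drops below the simulated JND and then oscillates around it, so the estimate lands between the two step values that bracket the true value. A real listener responds probabilistically near threshold, and the actual study may have used a different rule or stopping criterion.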