Multi-Microphone Neural Speech Separation for Far-Field Multi-Talker Speech Recognition

Takuya Yoshioka, Hakan Erdogan, Zhuo Chen, Fil Alleva
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
These features are normalized on a per-utterance basis:
• The spectral features are mean- and variance-normalized.
• The spatial features are mean-normalized.
Simply feeding multi-microphone STFT coefficients resulted in performance degradation (see Tab. 2).
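A minimal sketch of this per-utterance normalization, assuming each feature stream is a frames-by-dimensions NumPy array; the function name and array layout are hypothetical, not from the paper.

```python
import numpy as np

def normalize_utterance(spectral, spatial, eps=1e-8):
    """Per-utterance feature normalization (hypothetical helper).

    spectral: (frames, dims) array -> mean- and variance-normalized
    spatial:  (frames, dims) array -> mean-normalized only
    """
    # Mean-and-variance normalization of the spectral features
    spec_mean = spectral.mean(axis=0, keepdims=True)
    spec_std = spectral.std(axis=0, keepdims=True)
    spectral_norm = (spectral - spec_mean) / (spec_std + eps)

    # Mean normalization of the spatial features (variance left intact,
    # since scaling could distort inter-channel level/phase cues)
    spatial_norm = spatial - spatial.mean(axis=0, keepdims=True)

    return spectral_norm, spatial_norm
```

Statistics are computed over the frame axis only, so each utterance is normalized independently of the rest of the corpus.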
doi:10.1109/icassp.2018.8462081