Factorial Hidden Restricted Boltzmann Machines for noise robust speech recognition

Steven J. Rennie, Petr Fousek, Pierre L. Dognin
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
We present the Factorial Hidden Restricted Boltzmann Machine (FHRBM) for robust speech recognition. Speech and noise are modeled as independent RBMs, and the interaction between them is explicitly modeled to capture how speech and noise combine to generate observed noisy speech features. In contrast with RBMs, where the bottom layer of random variables is observed, inference in the FHRBM is intractable, scaling exponentially with the number of hidden units. We introduce variational algorithms for efficient approximate inference that scale linearly with the number of hidden units. Compared to traditional factorial models of noisy speech, which are based on GMMs, the FHRBM has the advantage that the representations of both speech and noise are highly distributed, allowing the model to learn a parts-based representation of noisy speech data that can generalize better to previously unseen noise compositions. Preliminary results suggest that the approach is promising.
doi:10.1109/icassp.2012.6288869 dblp:conf/icassp/RennieFD12 fatcat:cbfbcdys55eghi3rxut63z5pdy