Benchmarking methods for audio-visual recognition using tiny training sets

Xavier Alameda-Pineda, Jordi Sanchez-Riera, Radu Horaud
2013 IEEE International Conference on Acoustics, Speech and Signal Processing
The problem of choosing a classifier for audio-visual command recognition is addressed. Because such commands are culture- and user-dependent, methods need to learn new commands from a few examples. We benchmark three state-of-the-art discriminative classifiers based on bag of words and SVM. The comparison is made on monocular and monaural recordings of a publicly available dataset. We seek the best trade-off between speed, robustness and size of the training set. In the light of over 150,000 experiments, we conclude that this is a promising direction of work towards a flexible methodology that must be easily adaptable to a large variety of users.
doi:10.1109/icassp.2013.6638341 dblp:conf/icassp/Alameda-PinedaSH13 fatcat:cx54ril6gfcnzge6ncl4vrglau