Inter-modality mapping in robot with recurrent neural network

Tetsuya Ogata, Shun Nishide, Hideki Kozima, Kazunori Komatani, Hiroshi G. Okuno
Pattern Recognition Letters, 2010
A system for mapping between different sensory modalities was developed for a robot, enabling it to generate motions expressing auditory signals and to generate sounds expressing object movements. A recurrent neural network model with parametric bias (RNNPB), which has good generalization ability, is used as the learning model. Since the correspondences between auditory and visual signals are too numerous to memorize, the ability to generalize is indispensable. The system was implemented in the "Keepon" robot, which was shown horizontal reciprocating or rotating motions accompanied by friction sounds, and falling or overturning motions accompanied by collision sounds, produced by manipulating a box object. Keepon responded appropriately not only to learned events but also to unknown events, and it generated various sounds in accordance with observed motions.

… the robot to handle all the data simultaneously, which is the approach we have taken. People deal with "cross-modal information" by, for example, expressing auditory information (e.g., sounds of collision) with visual expressions such as gestures (e.g., moving the hand quickly and stopping it sharply). Such gestures are apparently related to the development of onomatopoeia [3]. We call this process "inter-modality mapping." Arsenio and Fitzpatrick proposed an interesting method for object recognition using "periodic dynamics" in multi-modal information [4]. Using this method, a humanoid robot called Cog recognizes objects by coupling data from different modes. For example, a hammer is recognized …
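The learning model named in the abstract, an RNN with parametric bias (following Tani's RNNPB), can be sketched roughly as below. The layer sizes, names, and initialization are illustrative assumptions, not the authors' implementation: each training sequence owns a small parametric-bias vector that is fed to the shared network at every time step, so different sequences are encoded by different bias vectors while the weights stay common.

```python
import numpy as np

rng = np.random.default_rng(0)

class RNNPB:
    """Minimal RNN-with-parametric-bias sketch (illustrative, not the paper's code)."""

    def __init__(self, n_in, n_hid, n_pb, n_out):
        s = 0.1  # small random initialization scale (assumed)
        self.W_in = rng.normal(0, s, (n_hid, n_in))    # input -> hidden
        self.W_hid = rng.normal(0, s, (n_hid, n_hid))  # hidden recurrence
        self.W_pb = rng.normal(0, s, (n_hid, n_pb))    # parametric bias -> hidden
        self.W_out = rng.normal(0, s, (n_out, n_hid))  # hidden -> output

    def forward(self, xs, pb):
        """Run one sequence xs (T, n_in) with a fixed bias vector pb (n_pb,)."""
        h = np.zeros(self.W_hid.shape[0])
        ys = []
        for x in xs:
            # The same pb enters the hidden update at every step of the sequence.
            h = np.tanh(self.W_in @ x + self.W_hid @ h + self.W_pb @ pb)
            ys.append(self.W_out @ h)
        return np.array(ys)  # (T, n_out)

net = RNNPB(n_in=2, n_hid=8, n_pb=2, n_out=3)
ys = net.forward(rng.normal(size=(5, 2)), pb=np.ones(2))
```

In recognition mode the weights are frozen and pb is instead inferred by gradient descent to fit an observed sequence; interpolating between learned pb vectors then yields intermediate, unseen behaviors, which is the kind of generalization to unknown events the abstract relies on.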
doi:10.1016/j.patrec.2010.05.002