Toward Interactive Grounded Language Acquisition

Thomas Kollar, Jayant Krishnamurthy, Grant Strimel
2013 Robotics: Science and Systems IX  
This paper addresses the problem of enabling robots to interactively learn visual and spatial models from multi-modal interactions involving speech, gesture, and images. Our approach, called Logical Semantics with Perception (LSP), provides a natural and intuitive interface by significantly reducing the amount of supervision that a human is required to provide. This paper demonstrates LSP in an interactive setting. Given speech and gesture input, LSP is able to learn object and relation classifiers for objects like mugs and relations like left and right. We extend LSP to generate complex natural language descriptions of selected objects using adjectives, nouns, and relations, such as "the orange mug to the right of the green book." Furthermore, we extend LSP to incorporate determiners (e.g., "the") into its training procedure, enabling the model to generate acceptable relational language 20% more often than the unaugmented model.
doi:10.15607/rss.2013.ix.005 dblp:conf/rss/KollarKS13