8,014 Hits in 2.3 sec

Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis [article]

Yao-Hung Hubert Tsai, Martin Q. Ma, Muqiao Yang, Ruslan Salakhutdinov, Louis-Philippe Morency
2020 arXiv preprint
Recent multimodal learning approaches with strong performance on human-centric tasks such as sentiment analysis and emotion recognition are often black-box, with very limited interpretability.  ...  In this paper we propose Multimodal Routing, which dynamically adjusts weights between input modalities and output representations differently for each input sample.  ...  In human multimodal language, such routing dynamically changes weights between modalities and output labels for each sample, as shown in Fig. 1.  ... 
arXiv:2004.14198v2 fatcat:2lcirbrwmjh4fozfqtjitbhyi4

Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis

Yao-Hung Hubert Tsai, Martin Ma, Muqiao Yang, Ruslan Salakhutdinov, Louis-Philippe Morency
2020 Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)  
Recent multimodal learning approaches with strong performance on human-centric tasks such as sentiment analysis and emotion recognition are often black-box, with very limited interpretability.  ...  In this paper we propose Multimodal Routing, which dynamically adjusts weights between input modalities and output representations differently for each input sample.  ...  We perform two iterations of routing between features and concepts with dimension d_c = 64, where d_c is the dimension of concepts.  ... 
doi:10.18653/v1/2020.emnlp-main.143 pmid:33969363 pmcid:PMC8106385 fatcat:fknnqv6a6zbx7fub66j6mvilvy
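
The routing mechanism described in both abstracts is easy to sketch. The NumPy toy below is a hypothetical illustration, not the authors' code: it assumes dot-product agreement between projected modality features and concept vectors, softmax-normalized weights, and random projections standing in for learned encoders, while keeping the paper's two routing iterations and concept dimension d_c = 64.

```python
import numpy as np

def multimodal_routing(features, n_concepts=8, d_c=64, n_iters=2, seed=0):
    """Toy per-sample routing between modality features and concept vectors.

    features: list of 1-D arrays, one per modality (e.g. text, audio, vision).
    Returns (concepts, weights); the weights expose, per modality, how much
    each concept is driven by that modality, which is what makes the
    modality-to-prediction associations inspectable.
    """
    rng = np.random.default_rng(seed)
    # Random projections stand in for learned encoders: map each modality
    # into the shared concept dimension d_c.
    projected = np.stack([
        rng.standard_normal((f.shape[0], d_c)).T @ f / np.sqrt(f.shape[0])
        for f in features
    ])                                       # (n_modalities, d_c)
    concepts = rng.standard_normal((n_concepts, d_c))
    weights = None
    for _ in range(n_iters):
        logits = projected @ concepts.T      # agreement, (n_modalities, n_concepts)
        logits -= logits.max(axis=1, keepdims=True)   # softmax stability
        weights = np.exp(logits)
        weights /= weights.sum(axis=1, keepdims=True)
        concepts = weights.T @ projected     # concepts re-estimated from features
    return concepts, weights

# One sample with text, audio, and vision features of different sizes.
feats = [np.ones(300), np.ones(74) * 0.5, np.ones(35) * -1.0]
concepts, weights = multimodal_routing(feats)
print(weights.round(2))                      # routing weights per modality
```

Because the weights are recomputed for every input sample, inspecting them gives a local explanation; averaging them over a dataset gives the global view the title refers to.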

QuickSet

Philip R. Cohen, Michael Johnston, David McGee, Sharon Oviatt, Jay Pittman, Ira Smith, Liang Chen, Josh Clow
1997 Proceedings of the Fifth Conference on Applied Natural Language Processing
The paper briefly describes the system and illustrates its use in multimodal simulation setup.  ...  This paper presents a novel multimodal system applied to the setup and control of distributed interactive simulations.  ...  Multimodal integration agent: The multimodal interpretation agent accepts typed feature structure meaning representations from the language and gesture recognition agents, and produces a unified multimodal  ... 
doi:10.3115/974557.974562 dblp:conf/anlp/CohenJMOPSCC97 fatcat:u5ol7gf7avhpvlqjrqi6ghrfvm

Unification-based multimodal parsing

Michael Johnston
1998 Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics
This is an effective solution for a broad class of systems, but limits multimodal utterances to combinations of a single spoken phrase with a single gesture.  ...  We show how the unification-based approach can be scaled up to provide a full multimodal grammar formalism.  ...  Conclusion The multimodal language processing architecture presented here enables parsing and interpretation of natural human input distributed across two or three spatial dimensions, time, and the acoustic  ... 
doi:10.3115/980845.980949 dblp:conf/acl/Johnston98 fatcat:s7vgogrdhrbehphlgencufv74y
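
The unification step these QuickSet-era papers build on can be sketched in a few lines. The Python below is a minimal untyped illustration with invented slot names; real systems unify typed feature structures and add variables and temporal constraints, which this toy version omits.

```python
def unify(fs1, fs2):
    """Unify two feature structures represented as nested dicts.

    Returns the merged structure, or None if any shared slot conflicts.
    Simplified: no types, no variables, no temporal constraints.
    """
    if not isinstance(fs1, dict) or not isinstance(fs2, dict):
        return fs1 if fs1 == fs2 else None   # atoms must match exactly
    merged = dict(fs1)
    for key, value in fs2.items():
        if key in merged:
            sub = unify(merged[key], value)
            if sub is None:
                return None                  # conflict: unification fails
            merged[key] = sub
        else:
            merged[key] = value
    return merged

# Speech supplies the command and object type; the pen gesture supplies
# the location; unification produces one complete instruction.
speech = {"cmd": "create", "object": {"type": "platoon"}}
gesture = {"object": {"location": {"x": 42.0, "y": 17.5}}}
print(unify(speech, gesture))
```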

QuickSet

Philip R. Cohen, Michael Johnston, David McGee, Sharon Oviatt, Jay Pittman, Ira Smith, Liang Chen, Josh Clow
1997 Proceedings of the Fifth ACM International Conference on Multimedia (MULTIMEDIA '97)
The paper describes the overall system architecture, a novel multimodal integration strategy offering mutual compensation among modalities, and provides examples of multimodal simulation setup.  ...  running on a hand-held PC, communicating via wireless LAN through an agent architecture to a number of systems, including NRaD's LeatherNet system, a distributed interactive training simulator built for  ...  For example, it was discovered there that multimodal interaction would lead to simpler language than unimodal speech.  ... 
doi:10.1145/266180.266328 dblp:conf/mm/CohenJMOPSCC97 fatcat:ffni4loiybcrnd2zneby5evfau

Confirmation in multimodal systems

David R. McGee, Philip R. Cohen, Sharon Oviatt
1998 Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics
Multimodal systems, those that combine simultaneous input from more than one modality (for example, speech and gesture), have historically been designed so that they either request confirmation of speech  ...  Systems that attempt to understand natural human input make mistakes, as even humans do. However, humans avoid misunderstandings by confirming doubtful input.  ...  Special thanks to Donald Hanley for his insightful editorial comment and friendship. Finally, sincere thanks to the people who volunteered to participate as subjects in this research.  ... 
doi:10.3115/980691.980705 dblp:conf/acl/McGeeCO98 fatcat:h2p7cts5tzbqzajvxkfjepezde

Unification-based multimodal integration

Michael Johnston, Philip R. Cohen, David McGee, Sharon L. Oviatt, James A. Pittman, Ira Smith
1997 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics
This paper describes a multimodal language processing architecture which supports interfaces allowing simultaneous input from speech and gesture recognition.  ...  Recent empirical research has shown conclusive advantages of multimodal interaction over speech-only interaction for map-based tasks.  ...  between humans and machines.  ... 
doi:10.3115/976909.979653 dblp:conf/acl/JohnstonCMOPS97 fatcat:4hajgxa5drbztkda5bem3pnufq

Speech centric multimodal interfaces for disabled users

Knut Kvale, Narada Dilp Warakagoda, Klaus Fellbaum
2008 Technology and Disability  
design for all.  ...  This paper explores how multimodal interfaces make it easier for people with sensory impairments to interact with mobile terminals such as PDAs and 3rd generation mobile phones (3G/UMTS).  ...  This work has been financed by the BRAGE-project of the research program "Knowledge development for Norwegian language technology" (KUNSTI) of the Norwegian Research Council.  ... 
doi:10.3233/tad-2008-20204 fatcat:nrolxb5oojf2ddiewslcdqtbhe

Multimodal application for foreign language teaching

Teresa Magal-Royo, Jose Luis Gimenez-Lopez, Blas Pairy, Jesus Garcia-Laborda, Jimena Gonzalez-del Rio
2011 2011 14th International Conference on Interactive Collaborative Learning  
This paper shows the possibility of establishing multimodal architectures within the applications for specific language learning areas with ubiquitous devices, evidencing the technical and formal aspects  ...  The current development of educational applications for language learning has experienced a qualitative change in the criteria of interaction between users and devices due to the technological advances  ...  Multimodal integration based on specific or finite situations is applied directly to applications with established information routes.  ... 
doi:10.1109/icl.2011.6059564 fatcat:nikrqtmtf5b5tcpwoqjt763ohu

A user interface framework for multimodal VR interactions

Marc Erich Latoschik
2005 Proceedings of the 7th international conference on Multimodal interfaces - ICMI '05  
This article presents a User Interface (UI) framework for multimodal interactions targeted at immersive virtual environments.  ...  Specialized node types use these facilities to implement required processing tasks like gesture detection, preprocessing of the visual scene for multimodal integration, or translation of movements into  ...  AI-representation for the KRL as well as a neural network layer which will support the KRL as well as the matching stage of the gesture processing.  ... 
doi:10.1145/1088463.1088479 dblp:conf/icmi/Latoschik05 fatcat:u6l5l7zqyzdszosbxd4qrne6di

User-centered modeling for spoken language and multimodal interfaces

S. Oviatt
1996 IEEE Multimedia  
Such work is yielding more user-centered and robust interfaces for next-generation spoken language and multimodal systems.  ...  The present article summarizes recent research on user-centered modeling of human language and performance during spoken and multimodal interaction, as well as interface design aimed at next-generation  ...  In this approach, spoken language understanding compensates for ambiguity and potential errors in gestural interpretation, and vice versa, through a statistically-ranked unification of semantic interpretations  ... 
doi:10.1109/93.556458 fatcat:s4wrmd2ifbglnbihttjnitgtry
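
The "statistically-ranked unification of semantic interpretations" in the last snippet amounts to pairing every speech hypothesis with every gesture hypothesis, discarding incompatible pairs, and ranking the rest by joint score. The sketch below uses flat dicts and invented slot names and scores; it illustrates the idea only and matches no particular system's interface.

```python
from itertools import product

def rank_joint_interpretations(speech_nbest, gesture_nbest):
    """Rank all cross-modal hypothesis pairs by joint score.

    Each n-best list holds (frame, probability) pairs, where a frame is a
    flat dict of semantic slots. Pairs whose shared slots agree are merged;
    the rest are discarded; survivors are ranked by the product of the two
    recognizers' scores.
    """
    joint = []
    for (s_frame, s_p), (g_frame, g_p) in product(speech_nbest, gesture_nbest):
        if all(s_frame[k] == g_frame[k] for k in s_frame.keys() & g_frame.keys()):
            joint.append(({**s_frame, **g_frame}, s_p * g_p))
    return sorted(joint, key=lambda pair: pair[1], reverse=True)

# The speech recognizer slightly prefers the wrong word ("sheep"), but the
# confident gesture hypothesis only unifies with "jeep" and corrects it.
speech_nbest = [({"cmd": "create", "type": "sheep"}, 0.55),
                ({"cmd": "create", "type": "jeep"}, 0.45)]
gesture_nbest = [({"type": "jeep", "x": 3, "y": 7}, 0.9)]
print(rank_joint_interpretations(speech_nbest, gesture_nbest)[0])
# -> ({'cmd': 'create', 'type': 'jeep', 'x': 3, 'y': 7}, 0.405)
```

This is the mutual compensation the QuickSet entries mention: a lower-ranked speech hypothesis that unifies with a high-confidence gesture can outrank the speech recognizer's first choice.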

Causal inference in graph-text constellations: Designing verbally annotated graphs

Christopher Habel, Cengiz Acarturk
2011 Tsinghua Science and Technology  
Multimodal documents combining language and graphs are widespread in print media as well as in electronic media.  ...  Based on the experimental investigation of readers' inferences under different conditions, guidelines for the design of multimodal documents including text and statistical information graphics are suggested  ...  In addition, we thank the Human Computer Interaction Research and Application Laboratory of the Middle East Technical University for their generous technical support.  ... 
doi:10.1016/s1007-0214(11)70002-5 fatcat:rbgxzhwf2fgzlmd6wxzu25quaa

Multimodal Interfaces to Mobile Terminals – A Design-For-All Approach [chapter]

Knut Kvale, Narada Dilp Warakagoda
2010 User Interfaces  
Acknowledgements We would like to express our thanks to Tone Finne, Eli Qvenild and Bjørgulv Høigaard at Bredtvet Resource Centre for helping us with the user evaluation and for valuable discussions and  ...  We are grateful to our colleagues Ragnhild Halvorsrud, Jon Emil Natvig and Gunhild Luke at Telenor for their inspiration and help.  ...  EMMA markup language is intended for use by systems that provide semantic interpretations for a variety of inputs, including but not necessarily limited to, speech, natural language text, GUI and ink input  ... 
doi:10.5772/9499 fatcat:7epdy5o56zhblfhg3b2226ck2e

Multimodality and Ambient Intelligence [chapter]

Anton Nijholt
2004 Philips Research  
interface has perceptual competence that includes being able to interpret what is going on in the environment.  ...  In this report we surveyed part of our research on multimodal interfaces in the last three years. Many students and researchers contributed to this research.  ... 
doi:10.1007/978-94-017-0703-9_2 fatcat:xtmpw73y5bed7dffm265koanpy

Multimodal interfaces: Challenges and perspectives

Nicu Sebe
2009 Journal of Ambient Intelligence and Smart Environments  
However, the newly developed multimodal interfaces are using recognition-based technologies that must interpret human speech, gesture, gaze, movement patterns, and other behavioral cues.  ...  In this paper we review the major approaches to multimodal Human Computer Interaction, giving an overview of user and task modeling and of multimodal fusion.  ...  These results, typically an n-best list of conjectured lexical items and related timestamp information, are then routed to appropriate agents for further language processing.  ... 
doi:10.3233/ais-2009-0003 fatcat:y2zimhpxrngznf64a2yulttkhm
Showing results 1 — 15 of 8,014