
Joint Audio-Visual Tracking Using Particle Filters

Dmitry N. Zotkin, Ramani Duraiswami, Larry S. Davis
2002 EURASIP Journal on Advances in Signal Processing  
We implement the algorithm in the context of a videoconferencing and meeting recording system.  ...  One advantage of our proposed tracker is its ability to seamlessly handle temporary absence of some measurements (e.g., camera occlusion or silence).  ...  We collected multimodal data during three simulated meetings of different types (lecture-type meeting where there is one primary speaker and occasional short interruptions occur, seminar-type meeting where  ... 
doi:10.1155/s1110865702206058 fatcat:c62nsokbcrhy3cenkd73wc5pj4
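The tracker described above is a joint audio-visual particle filter; one highlighted feature is graceful handling of temporarily missing measurements. As a hedged illustration of the underlying bootstrap particle-filter idea (not the paper's actual multimodal implementation), here is a minimal 1-D sketch in which `None` stands for a dropped measurement such as occlusion or silence; the motion model and noise parameters are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_pf_step(particles, weights, measurement, noise_std=0.5, meas_std=1.0):
    """One predict/update/resample step of a bootstrap particle filter.

    `measurement` may be None (e.g. occlusion or silence); in that case
    only the prediction step runs and the weights are left untouched.
    """
    # Predict: propagate each particle through a random-walk motion model.
    particles = particles + rng.normal(0.0, noise_std, size=particles.shape)

    if measurement is not None:
        # Update: reweight particles by the Gaussian measurement likelihood.
        lik = np.exp(-0.5 * ((particles - measurement) / meas_std) ** 2)
        weights = weights * lik
        weights /= weights.sum()
        # Resample: draw particles proportionally to their weights.
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights

# Track a speaker drifting from position 0 toward 2.5, with one dropout.
particles = rng.normal(0.0, 1.0, size=500)
weights = np.full(500, 1.0 / 500)
for z in [0.5, 1.0, None, 2.0, 2.5]:   # None = missing measurement
    particles, weights = bootstrap_pf_step(particles, weights, z)
estimate = np.average(particles, weights=weights)
```

Skipping the update/resample step when a measurement is absent is what lets the filter coast through occlusions or silence and re-lock once observations return.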

A realtime multimodal system for analyzing group meetings by combining face pose tracking and speaker diarization

Kazuhiro Otsuka, Shoko Araki, Kentaro Ishizuka, Masakiyo Fujimoto, Martin Heinrich, Junji Yamato
2008 Proceedings of the 10th international conference on Multimodal interfaces - IMCI '08  
The face position/pose data output by the face tracker is used to estimate the focus of attention in the group.  ...  This paper also presents new 3-D visualization schemes for meeting scenes and the results of an analysis.  ...  Multimodal smart rooms Recently, a number of multimodal systems for meeting applications have been developed by a number of research groups; they are often referred to as "smart rooms".  ... 
doi:10.1145/1452392.1452446 dblp:conf/icmi/OtsukaAIFHY08 fatcat:ecx55jf4irbqhfczemblazofzq

ATTRACkTIVE Advanced Travel Companion and Tracking Services

Daniel Schmidt, Achim Von Der Embse, Leyre Merle Carrera
2018 Zenodo  
Trackers.  ...  In order to meet this need, it is structured in a modular way, to allow both the easy creation and integration of new modules and the upgrade of existing ones.  ... 
doi:10.5281/zenodo.1456540 fatcat:rrbhpsq7hjhj5mdhrz4ek3qfnq

Speech Enhancement and Recognition in Meetings With an Audio–Visual Sensor Array

Hari Krishna Maganti, Daniel Gatica-Perez, Iain McCowan
2007 IEEE Transactions on Audio, Speech, and Language Processing  
In this paper, we present an integrated approach, in which an audio-visual multiperson tracker is used to track active speakers with high accuracy.  ...  This paper addresses the problem of distant speech acquisition in multiparty meetings, using multiple microphones and cameras.  ...  The current article examines the use of multimodal sensor arrays in the context of instrumented meeting rooms.  ... 
doi:10.1109/tasl.2007.906197 fatcat:4qqpwox7l5drte4c4fembnru74
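The entry above acquires distant speech with a microphone array steered toward the tracked speaker. As a hedged sketch of the classical delay-and-sum beamforming idea such systems build on (not the paper's actual enhancement pipeline), the example below aligns two channels by integer-sample steering delays and averages them; the signal and delays are invented for the example.

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Align each channel by its integer-sample steering delay and average.

    `signals` is (n_mics, n_samples); `delays` holds the per-channel
    delays in samples toward the hypothesised speaker position.
    """
    n_mics, _ = signals.shape
    out = np.zeros(signals.shape[1])
    for sig, d in zip(signals, delays):
        out += np.roll(sig, -d)      # advance each delayed channel
    return out / n_mics

# Two mics hear the same pulse; the second mic receives it 3 samples later.
n = 64
clean = np.zeros(n)
clean[10] = 1.0
mics = np.vstack([clean, np.roll(clean, 3)])
steered = delay_and_sum(mics, [0, 3])     # steer toward the source
unsteered = delay_and_sum(mics, [0, 0])   # no steering
```

When the delays match the true propagation differences, the speaker's signal adds coherently while uncorrelated noise averages down; a wrong steering direction smears the pulse across two positions at half amplitude.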

ERmed – Towards Medical Multimodal Cyber-Physical Environments [chapter]

Daniel Sonntag
2014 Lecture Notes in Computer Science  
In this paper, we discuss the design and implementation of our resulting Medical Multimodal Cyber-Physical Environment and focus on how situation awareness provided by the environmental sensors effectively  ...  With new technologies towards medical cyber-physical systems, such as networked head-mounted displays (HMDs) and eye trackers, new interaction opportunities arise for real-time interaction between cyber-physical  ...  In this paper, we discuss the design and implementation of a first Medical Multimodal Cyber-Physical Environment and focus on how situation awareness meets mutual knowledge, which is defined by grounded  ... 
doi:10.1007/978-3-319-07527-3_34 fatcat:ap62zhyrynhdhon7eyivk3pj3u

Hierarchical audio-visual cue integration framework for activity analysis in intelligent meeting rooms

Shankar T. Shivappa, Mohan M. Trivedi, Bhaskar D. Rao
2009 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops  
We demonstrate this in a smart meeting room context equipped with 3 cameras and 16 microphones.  ...  Scene understanding in the context of a smart meeting room involves the extraction of various kinds of cues at different levels of semantic abstraction.  ...  General and specific multimodal fusion schemes have also been proposed [18], [6], [16].  ... 
doi:10.1109/cvprw.2009.5204224 dblp:conf/cvpr/ShivappaTR09 fatcat:wywnweppbfahhbalu4wh3hzbri

Toward Mobile Eye-Based Human-Computer Interaction

Andreas Bulling, Hans Gellersen
2010 IEEE pervasive computing  
In contrast to their stationary counterparts, mobile eye trackers must conserve power to meet operating times required for long-term studies in research and commercial applications.  ...  Multimodal interfaces could automatically select input modalities best suited for the situation at hand.  ... 
doi:10.1109/mprv.2010.86 fatcat:kwb5wqd5rrevth7br3agneveoy

Target Detection and Tracking With Heterogeneous Sensors

Huiyu Zhou, M. Taj, A. Cavallaro
2008 IEEE Journal on Selected Topics in Signal Processing  
We present a multimodal detection and tracking algorithm for sensors composed of a camera mounted between two microphones.  ...  Index Terms-Multimodal detection and tracking, Kalman filter, particle filter, heterogeneous sensors, low bit rate communication.  ...  for (1) MPEG-1, (2) MPEG-2, (3) MPEG-4 and (4) the metadata generated by the proposed multimodal tracker.  ... 
doi:10.1109/jstsp.2008.2001429 fatcat:6jttgi4scja4tpzyxzdjxpknnu
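This entry's index terms name Kalman and particle filters for fusing camera and microphone measurements. As a hedged illustration of sequential Kalman fusion of two heterogeneous sensors (a generic textbook sketch, not the paper's algorithm), the example below runs a 1-D constant-velocity filter and applies the video and audio position measurements one after the other, each with its own noise covariance; all matrices and noise values are invented for the example.

```python
import numpy as np

def kf_predict(x, P, F, Q):
    # Constant-velocity prediction step.
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    # Standard Kalman measurement update.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # state: [position, velocity]
Q = 0.01 * np.eye(2)
H = np.array([[1.0, 0.0]])              # both sensors observe position only
R_video = np.array([[0.1]])             # accurate video localisation
R_audio = np.array([[1.0]])             # noisier audio localisation

x = np.array([0.0, 0.0])
P = 10.0 * np.eye(2)
for z_video, z_audio in [(1.0, 1.3), (2.1, 1.8), (3.0, 3.4)]:
    x, P = kf_predict(x, P, F, Q)
    x, P = kf_update(x, P, np.array([z_video]), H, R_video)  # fuse camera
    x, P = kf_update(x, P, np.array([z_audio]), H, R_audio)  # fuse microphones
```

Processing each modality as a separate update with its own `R` lets the filter weight the accurate sensor more heavily, and a modality can simply be skipped when its measurement is unavailable.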

SNAG: Spoken Narratives and Gaze Dataset

Preethi Vaidyanathan, Emily T. Prud'hommeaux, Jeff B. Pelz, Cecilia O. Alm
2018 Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  
In this paper, we describe a new multimodal dataset that consists of gaze measurements and spoken descriptions collected in parallel during an image inspection task.  ...  The eye tracker is under the display. The observer wears a lapel microphone connected to a TASCAM recorder. Figure 2: Example of multimodal data.  ...  We describe the data collection procedure using a high-quality eye-tracker, summary statistics of the multimodal data, and the results of applying a visual-linguistic alignment framework to automatically  ... 
doi:10.18653/v1/p18-2022 dblp:conf/acl/VaidyanathanPPA18 fatcat:6p4gjw5ttjdlrf62osyejujawu

Natural communication with information systems

I. Marsic, A. Medl, J. Flanagan
2000 Proceedings of the IEEE  
An experimental multimodal system is developed to study several aspects of natural-style human-computer communication.  ...  The multimodal enhancement components (multimodal connector, speech, tactile glove, and gaze tracker) and the multimodal manager provide for interface customization.  ...  Snapshot of a user view during collaboration in a meeting place.  ... 
doi:10.1109/5.880088 fatcat:zziia4ixb5gsxpzflpi6wbl5ma

Multimodal active speaker detection and virtual cinematography for video conferencing [article]

Ross Cutler, Ramin Mehran, Sam Johnson, Cha Zhang, Adam Kirk, Oliver Whyte, Adarsh Kowdle
2022 arXiv   pre-print
The system was tuned and evaluated using extensive crowdsourcing techniques and evaluated on a dataset with N=100 meetings, each 2-5 minutes in length.  ...  The primary goal for the data capture of meetings is to capture a large variety of meeting data, similar to that which the device will see in actual usage.  ...  We use a state-of-the-art tracker to interpolate the location of participants between labeled frames (Figure 8).  ... 
arXiv:2002.03977v3 fatcat:w4atnhwdmvdi7cv6n2ca7xsyna

An Empirical Study of Multichannel Communication: Russian Narratives and Conversations about Pears

A. Kibrik, O. Fedorova
2018 Psychology. Journal of the Higher School of Economics  
Judging by the published metadata, the largest multimodal corpus is the AMI Meeting Corpus, 100 hours long (Carletta, 2006); however, most of the information in this corpus is presented in the form  ...  The terms "multimodal communication" and "multimodal corpus" first appeared in the 1980s; cf. Taylor, 1989.  ... 
doi:10.17323/1813-8918-2018-2-191-200 fatcat:bvykca2jxvdtjfhjwxl2db4t4e

VACE Multimodal Meeting Corpus [chapter]

Lei Chen, R. Travis Rose, Ying Qiao, Irene Kimbara, Fey Parrill, Haleema Welji, Tony Xu Han, Jilin Tu, Zhongqiang Huang, Mary Harper, Francis Quek, Yingen Xiong (+3 others)
2006 Lecture Notes in Computer Science  
With our focus on multimodality, we investigate the interaction among speech, gesture, posture, and gaze in meetings. For this purpose, a high-quality multimodal corpus is being produced.  ...  In this paper, we report on the infrastructure we have developed to support our research on multimodal cues for understanding meetings.  ...  This research has been supported by the Advanced Research and Development Activity (ARDA) VACE II grant 665661: From Video to Information: Cross-Modal Analysis of Planning Meetings.  ... 
doi:10.1007/11677482_4 fatcat:nj4vz67sorfi3f26n5jwvscu24

From conversational tooltips to grounded discourse

Louis-Philippe Morency, Trevor Darrell
2004 Proceedings of the 6th international conference on Multimodal interfaces - ICMI '04  
The head pose tracker returns the rotational and translational velocity at each frame.  ...  Justine Cassell [5, 6] and Candace Sidner [24, 25, 22] have developed rich models of multimodal output in the context of embodied natural language conversation, including multimodal representations  ... 
doi:10.1145/1027933.1027940 dblp:conf/icmi/MorencyD04 fatcat:35gfhg4hjfefdg7k5kiyb57i2i

Reading paths and visual perception in multimodal research, psychology and brain sciences

Tuomo Hiippala
2012 Journal of Pragmatics  
Applicable state-of-the-art theories of multimodal analysis are then described, along with the technological requirements for the eye tracker and its software.  ...  XML annotation, output and transformations are proposed for combining the results of multimodal analysis and the observer behaviour captured using an eye tracker.  ...  (Kappas & Olk 2008, p. 162) Based on this view, several focal points may be identified where the research interests of multimodality meet those of psychology and brain sciences.  ... 
doi:10.1016/j.pragma.2011.12.008 fatcat:5vjdfzrrtzcw3asiksvtypibru