A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL. The file type is application/pdf.
Multi-Modal Perception Attention Network with Self-Supervised Learning for Audio-Visual Speaker Tracking
[article]
2021
arXiv
pre-print
In this paper, we propose a novel Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities. ...
Multi-modal fusion is proven to be an effective method to improve the accuracy and robustness of speaker tracking, especially in complex scenarios. ...
Introduction: Speaker tracking is the foundational task for intelligent systems to implement behavior analysis and human-computer interaction. ...
arXiv:2112.07423v1
fatcat:3ed73idqsbggxhpmdmamfh3dxe
Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction
2012
International Journal of Advanced Robotic Systems
Although multi-stream Dynamic Bayesian Network and coupled HMM are widely used for audio-visual speech recognition, they fail to learn the shared features between modalities and ignore the dependency of ...
In this paper, we propose a Deep Dynamic Bayesian Network (DDBN) to perform unsupervised extraction of spatial-temporal multimodal features from Tibetan audio-visual speech data and build an accurate audio-visual ...
input layer and the single modality features layer, one for audio data and one for visual data. ...
doi:10.5772/54000
fatcat:264bssoirfhn7jcn7whuqottjm
Sensor Fusion and Environmental Modelling for Multimodal Sentient Computing
2007
2007 IEEE Conference on Computer Vision and Pattern Recognition
Adaptive Multi-modal Fusion of Tracking Hypotheses The dynamic component of the world model benefits from a high-level fusion of the visual and ultrasonic modalities for robust multi-object tracking and ...
The achieved spatial granularity is better than 3cm for > 95% of Bat observations (assuming only small motion) and Bats may be polled using radio base stations and a variable quality of service to give ...
doi:10.1109/cvpr.2007.383526
dblp:conf/cvpr/TownZ07
fatcat:dfkkliujlnfxdconr5su6infhm
Semantic Interpretation of Multi-Modal Human-Behaviour Data
2017
Künstliche Intelligenz
of large-scale, dynamic, multi-modal sensory data, or data streams. ...
multi-modal sensory data relevant to a range of application domains and problem contexts where interpreting human behaviour is central. ...
We remain grateful to all contributing authors for their persistence and effort across the two review rounds that were undertaken for the preparation of this special issue. ...
doi:10.1007/s13218-017-0511-y
fatcat:dkfdb2pxgrcafatws6ibf7hute
Multi-Modal Data Fusion Techniques and Applications
[chapter]
2009
Multi-Camera Networks
The integration of heterogeneous sensors can provide complementary and redundant information that, fused with visual cues, allows the system to obtain an enriched and more robust scene interpretation. ...
A discussion of possible architectures and algorithms is presented, showing through system examples the benefits of combining other sensor typologies with camera network-based applications. ...
Multi-modal Tracking using Bayesian Filtering: Multisensor data fusion for target tracking is a very active and widely investigated domain because of its utility in a wide range of applications (for a survey on ...
doi:10.1016/b978-0-12-374633-7.00011-2
fatcat:drnoysj2fbaifcjdbiiukzbq6u
Multi-sensory and Multi-modal Fusion for Sentient Computing
2006
International Journal of Computer Vision
This paper presents an approach to multi-sensory and multi-modal fusion in which computer vision information obtained from calibrated cameras is integrated with a large-scale sentient computing system ...
It is shown that the fusion process significantly enhances the capabilities and robustness of both sensory modalities, thus enabling the system to maintain a richer and more accurate world model. ...
Acknowledgements The author gratefully acknowledges financial support from the Royal Commission for the Exhibition of 1851. ...
doi:10.1007/s11263-006-7834-8
fatcat:xmgx64sgsncpnjm6dlrxlrs6su
Preface: 3D GeoInfo 2021
2021
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Topics included: 3D data creation and acquisition; 3D data processing and analysis; 3D data management (data quality, metadata, provenance and trust); data integration, information fusion, multi-modal data analysis; 3D visualization, including gamification, virtual reality, augmented reality; 3D and Artificial Intelligence/Machine Learning; 3D and Big Data, parallel computing, cloud computing; 3D city modeling ...
doi:10.5194/isprs-archives-xlvi-4-w4-2021-1-2021
fatcat:kvd7ez4se5eshhbes33wfeqxgi
Preface: 3D GeoInfo 2021
2021
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Topics included: 3D data creation and acquisition; 3D data processing and analysis; 3D data management (data quality, metadata, provenance and trust); data integration, information fusion, multi-modal data analysis; 3D visualization, including gamification, virtual reality, augmented reality; 3D and Artificial Intelligence/Machine Learning; 3D and Big Data, parallel computing, cloud computing; 3D city modeling ...
doi:10.5194/isprs-annals-viii-4-w2-2021-1-2021
fatcat:74th5oq5zvhoxd4npwby447fhm
Temporal Aggregation for Adaptive RGBT Tracking
[article]
2022
arXiv
pre-print
performance. As for multi-modal tracking, constrained by the limited RGBT datasets, the adaptive fusion sub-network is appended to our method at the decision level to reflect the complementary characteristics ...
In this paper, we propose an RGBT tracker which takes spatio-temporal clues into account for robust appearance model learning and, simultaneously, constructs an adaptive fusion sub-network for cross-modal ...
Commonly, the key point in processing multi-modal data lies in the fusion of multi-modal representations. ...
arXiv:2201.08949v2
fatcat:shfvqqqixncltcbfaaofbhfi5a
Socializing Multimodal Sensors for Information Fusion
2015
Proceedings of the 23rd ACM international conference on Multimedia - MM '15
multimodal sensors and 3) a new social-cyberphysical paradigm where humans and sensors collaborate for event fusion. ...
However, social sensor fusion for situation awareness is still in its infancy and lacks a unified framework to aggregate and composite real-time media streams from diverse sensors and social network platforms ...
Moreover, multi-modal sensor fusion has been studied for combining multiple levels of information for various multimedia tasks [1]. ...
doi:10.1145/2733373.2807995
dblp:conf/mm/Wang15
fatcat:7g7rgxylprfijfgi7wwhwilig4
A Survey for Deep RGBT Tracking
[article]
2022
arXiv
pre-print
Visual object tracking with the visible (RGB) and thermal infrared (TIR) electromagnetic waves, abbreviated as RGBT tracking, has recently drawn increasing attention in the tracking community. ...
This survey can be treated as a look-up table for researchers who are concerned with RGBT tracking. ...
For fusion at the feature level, the multi-modal fusion takes place between feature extraction and similarity evaluation. ...
arXiv:2201.09296v2
fatcat:bcmxwou4bjgupdipfzoh33q3ta
Multi-modal biometrics for real-life person-specific emotional human-robot-interaction
2014
2014 IEEE International Conference on Robotics and Biomimetics (ROBIO 2014)
The robot can interact with its interaction partner emotionally by analyzing the partner's emotions as expressed either visually (facial expression) or auditively (speech prosody). ...
The second way is targeting such systems at merely one person. ...
In current classification fusion research, two types of multi-modal fusion strategies are usually applied, namely feature-level fusion and decision-level fusion. ...
doi:10.1109/robio.2014.7090354
dblp:conf/robio/RabieH14
fatcat:x723wfn52vczletebx37gx7ibu
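The distinction between the two fusion strategies named in the entry above (feature-level vs. decision-level fusion) can be sketched as follows. This is a minimal illustration with hypothetical linear classifiers, not the method of any paper listed here:

```python
import numpy as np

def feature_level_fusion(audio_feat, visual_feat, classify):
    # Feature-level: concatenate modality features into one vector,
    # then run a single classifier on the joint representation.
    joint = np.concatenate([audio_feat, visual_feat])
    return classify(joint)

def decision_level_fusion(audio_feat, visual_feat,
                          classify_audio, classify_visual,
                          weights=(0.5, 0.5)):
    # Decision-level: classify each modality independently,
    # then combine the per-class scores (weighted sum here).
    scores_a = classify_audio(audio_feat)
    scores_v = classify_visual(visual_feat)
    return weights[0] * scores_a + weights[1] * scores_v

# Toy stand-in classifiers: random linear maps onto 3 emotion classes.
rng = np.random.default_rng(0)
W_joint = rng.normal(size=(3, 8))  # acts on the concatenated (8-dim) vector
W_a = rng.normal(size=(3, 4))      # audio-only classifier
W_v = rng.normal(size=(3, 4))      # visual-only classifier

audio = rng.normal(size=4)
visual = rng.normal(size=4)

fused_feat = feature_level_fusion(audio, visual, lambda x: W_joint @ x)
fused_dec = decision_level_fusion(audio, visual,
                                  lambda x: W_a @ x, lambda x: W_v @ x)
print(fused_feat.shape, fused_dec.shape)  # both yield one score per class
```

The trade-off, as the surveyed literature generally frames it: feature-level fusion can exploit cross-modal correlations but requires synchronized features, while decision-level fusion is robust to a missing or degraded modality since each classifier operates independently.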
Interactive Multi-scale Fusion of 2D and 3D Features for Multi-object Tracking
[article]
2022
arXiv
pre-print
Moreover, the ablation studies indicate the effectiveness of multi-scale feature fusion and of pre-training on a single modality. ...
Specifically, through multi-scale interactive query and fusion between pixel-level and point-level features, our method can obtain more distinguishing features to improve the performance of multiple object ...
INTERACTIVE FEATURE FUSION FOR MULTI-OBJECT TRACKING
arXiv:2203.16268v1
fatcat:4egeregsjvavfgbrc2marm2wsm
Machine Perception and Learning Grand Challenge: Situational Intelligence Using Cross-Sensory Fusion
2014
Frontiers in Robotics and AI
Blasch et al. (2013) survey recent research efforts to accommodate the effects of context in information fusion for target tracking applications. ...
The notion of context itself is often incoherent and ill-defined across sensing modalities and applications: image processing research generally assumes only the visual scene to be the context for object ...
FA9550-12-1-0270 and by the Office of Naval Research (ONR) under Grant No N00014-11-1-0893. ...
doi:10.3389/frobt.2014.00007
fatcat:4ihbm6yd3vb7tkwhsvmmwcmowu
Benchmark Driven Framework for Development of Emotion Sensing Support Systems
2012
2012 European Intelligence and Security Informatics Conference
This paper presents a new framework for the development of emotion sensing support systems that is a complete, easily extendible, flexible, and configurable environment with intensive benchmark capabilities ...
It provides: 1) an effective collaboration platform between technological and psychological researchers, and 2) intensive benchmarking capabilities to test the performance of the entire system as well as individual ...
It also provides a complete overview of all tracking results across time, useful for debugging and data visualization. ...
doi:10.1109/eisic.2012.27
dblp:conf/eisic/PolikovskyQKOB12
fatcat:owbugw6mn5gzvguu36bpzfmvbm
Showing results 1 — 15 out of 10,141 results