3D for Free: Crossmodal Transfer Learning using HD Maps [article]

Benjamin Wilson, Zsolt Kira, James Hays
2020 arXiv   pre-print
Critically, we constrain this ill-posed 2D-to-3D mapping by using high-definition maps and object size priors. The result of the mining process is 3D cuboids with varying confidence.  ...  3D object detection is a core perceptual challenge for robotics and autonomous driving.  ...  Thus, our method is better thought of as a crossmodality transfer learning method that relies on (1) the relative maturity of 2D instance segmentation and corresponding datasets compared to 3D detection  ... 
arXiv:2008.10592v1 fatcat:kr5q5djjyzfylbnkhio7v3lde4

Is Multimedia Multisensorial? - A Review of Mulsemedia Systems

Alexandra Covaci, Longhao Zou, Irina Tal, Gabriel-Miro Muntean, Gheorghita Ghinea
2018 ACM Computing Surveys  
-makes possible the inclusion of layered sensory stimulation and interaction through multiple sensory channels. The recent upsurge in technology and wearables provides mulsemedia researchers a vehicle for  ...  Mobile phones and tablets can be used for stereo-based depth measurement opening up new possibilities for 3D reconstruction [94]. 3D printers started to be used to provide a tactile dimension to traditional  ...  In [200], the authors used crossmodal stimuli in their study, and found an improvement of free recall of crossmodal audiovisual stimuli compared to modality-specific, audio or visual stimuli. These findings  ... 
doi:10.1145/3233774 fatcat:dmov3hwt5fbxfktysxuarhezia

Crossmodal Language Grounding in an Embodied Neurocognitive Model [article]

Stefan Heinrich, Yuan Yao, Tobias Hinz, Zhiyuan Liu, Thomas Hummel, Matthias Kerzel, Cornelius Weber, Stefan Wermter
2020 arXiv   pre-print
It addresses developmental robotic interaction and extends its learning capabilities using larger-scale knowledge-based data.  ...  This model can also provide the basis for further crossmodal integration of perceptually grounded cognitive representations.  ...  FUNDING The authors gratefully acknowledge partial support from the German Research Foundation (DFG) and the National Science Foundation of China (NSFC) under project Crossmodal Learning (TRR-169).  ... 
arXiv:2006.13546v1 fatcat:ok7lhtpparg3ni33bxagjuoyae

Unsupervised Black-Box Model Domain Adaptation for Brain Tumor Segmentation

Xiaofeng Liu, Chaehwa Yoo, Fangxu Xing, C.-C. Jay Kuo, Georges El Fakhri, Je-Won Kang, Jonghye Woo
2022 Frontiers in Neuroscience  
Unsupervised domain adaptation (UDA) is an emerging technique that enables the transfer of domain knowledge learned from a labeled source domain to unlabeled target domains, providing a way of coping with  ...  We extensively validated our framework on a few datasets and deep learning backbones, demonstrating the potential for our framework to be applied in challenging yet realistic clinical settings.  ...  Kuzborskij and Orabona (2013) proposed a detailed theoretical analysis of hypothesis transfer learning for linear regression, which is the basis for subsequent UDA solutions that do not rely on source  ... 
doi:10.3389/fnins.2022.837646 pmid:35720708 pmcid:PMC9201342 fatcat:srpmcqkmybhedcqxu64ef7bm7u

Image Translation for Medical Image Generation – Ischemic Stroke Lesions [article]

Moritz Platscher and Jonathan Zopes and Christian Federau
2021 arXiv   pre-print
We demonstrate with the example of ischemic stroke that an improvement in lesion segmentation is feasible using deep learning based augmentation.  ...  Deep learning based disease detection and segmentation algorithms promise to improve many clinical processes.  ...  ACKNOWLEDGMENTS The authors would like to thank Sebastian Kozerke and Thomas Joyce for useful discussions.  ... 
arXiv:2010.02745v2 fatcat:h72525ipibctfe3plskcisjape

Multi-Modal 3D Object Detection in Autonomous Driving: a Survey [article]

Yingjie Wang, Qiuyu Mao, Hanqi Zhu, Yu Zhang, Jianmin Ji, Yanyong Zhang
2021 arXiv   pre-print
To bridge this gap and motivate future research, this survey devotes to review recent fusion-based 3D detection deep learning models that leverage multiple sensor data sources, especially cameras and LiDARs  ...  Next, we discuss some popular datasets for multi-modal 3D object detection, with a special focus on the sensor data included in each dataset.  ...  It also provides HD maps for automatic map creation aka. map automation.  ... 
arXiv:2106.12735v2 fatcat:5twzbk4yhrcfzddp7zghnsivna

Multimodal Learning with Transformers: A Survey [article]

Peng Xu, Xiatian Zhu, David A. Clifton
2022 arXiv   pre-print
, and multimodal Transformers, from a geometrically topological perspective, (3) a review of multimodal Transformer applications, via two important paradigms, i.e., for multimodal pretraining and for specific  ...  Transformer is a promising neural network learner, and has achieved great success in various machine learning tasks.  ...  We hope that this survey gives a helpful and detailed overview for new researchers and practitioners, provides a convenient reference for relevant experts (e.g., multimodal machine learning researchers  ... 
arXiv:2206.06488v1 fatcat:6aoaczzbtvc43my2kmobo7glvy

2021 Index IEEE Robotics and Automation Letters Vol. 6

2021 IEEE Robotics and Automation Letters  
The Author Index contains the primary entry for each item, listed under the first author's name.  ...  ., +, LRA July 2021 4433-4440 HD Map Update for Autonomous Driving With Crowdsourced Data.  ...  ., +, LRA Jan. 2021 32-39 Crowdsourcing HD Map Update for Autonomous Driving With Crowdsourced Data.  ... 
doi:10.1109/lra.2021.3119726 fatcat:lsnerdofvveqhlv7xx7gati2xu

Direct Structural Connections between Voice- and Face-Recognition Areas

H. Blank, A. Anwander, K. von Kriegstein
2011 Journal of Neuroscience  
Using probabilistic tractography, we show evidence that the FFA is structurally connected with voice-sensitive areas in STS.  ...  Currently, there are two opposing models for how voice and face information is integrated in the human brain to recognize person identity.  ...  The training, including learning and test, took 25 min. Training was repeated twice for all participants.  ... 
doi:10.1523/jneurosci.2091-11.2011 pmid:21900569 pmcid:PMC6623403 fatcat:cljg6q6dufbcxnvhazt2le6k6i

Deep 3D human pose estimation: A review

Jinbao Wang, Shujie Tan, Xiantong Zhen, Shuo Xu, Feng Zheng, Zhenyu He, Ling Shao
2021 Computer Vision and Image Understanding  
In this paper, we provide a thorough review of existing deep learning based works for 3D pose estimation, summarize the advantages and disadvantages of these methods and provide an in-depth understanding  ...  Furthermore, we also explore the commonly-used benchmark datasets on which we conduct a comprehensive study for comparison and analysis.  ...  Mehta et al. (2017a) address the generalization problem of 3D pose estimation by transfer learning.  ... 
doi:10.1016/j.cviu.2021.103225 fatcat:hvlgjuxd2zfgji6k4y4g65cs7y

TWO!EARS Deliverable D4.1 - Feedback-loop selection and listing (WP4: Active listening, feedback loops & integration of cross-modal information; FP7-ICT-2013-C TWO!EARS FET-Open Project 618075)

Jens Blauert, Thomas Walther
2019 Zenodo  
This deliverable mainly entails our advance on the key task for the current project period, namely, task 4.1.  ...  Also, multi-modal approaches have been reviewed and evaluated with regard to their value for Two!Ears. Her [...]  ...  As a first step, an active hearing process is performed while the learning an auditory-motor map. Next, this map is used for a-priori passive sound localization.  ... 
doi:10.5281/zenodo.2595244 fatcat:3oocvxholvgr3ecgubmq3uwxqa

Audiovisual reproduction in surrounding display: Effect of spatial width of audio and video

Olli Rummukainen, Ville Pulkki
2012 2012 Fourth International Workshop on Quality of Multimedia Experience  
Constrained correspondence analysis of the free description data suggests the reasons for highest perceived degradation to be caused by wrong audio direction, reduced video width and missing essential  ...  In addition, free descriptions of the most prominent degrading factors were collected.  ...  The machine learning algorithm uses supervised learning to eventually learn how the QoS parameters affect the perceived MOS.  ... 
doi:10.1109/qomex.2012.6263861 dblp:conf/qomex/RummukainenP12 fatcat:rv5ag4yqdbaxrelsfqnwcmnbwa

Detection and assessment of Parkinson's disease based on gait analysis: A survey

Yao Guo, Jianxin Yang, Yuxuan Liu, Xun Chen, Guang-Zhong Yang
2022 Frontiers in Aging Neuroscience  
These impairments have been used as a clinical sign for the early detection of PD, as well as an objective index for pervasive monitoring of the PD patients in daily life.  ...  The intervening measures for improving gait performance are summarized, in which the smart devices for gait intervention are emphasized.  ...  Yunbo Li and Miss Mengnan He for the contribution of literature search.  ... 
doi:10.3389/fnagi.2022.916971 fatcat:pmugk3unr5g3vktd4e5g73usza

Crowding is similar for eye movements and manual responses

F. Yildirim, F. W. Cornelissen
2014 Journal of Vision  
visual and memory search for words Sage E P Boettcher, Jeremy M Wolfe 53.305 Updating for free?  ...  Exogenous attention enables visual perceptual learning and task transfer S F A Szpiro, S Cohen, M Carrasco 56.328 Spatial attention generalizes perceptual learning to untrained locations in an acuity  ... 
doi:10.1167/14.10.789 fatcat:uxwzbzfhbjc3ddyfhbs2ilr4qe

A Comprehensive Review of the Video-to-Text Problem [article]

Jesus Perez-Martin and Benjamin Bustos and Silvio Jamil F. Guimarães and Ivan Sipiran and Jorge Pérez and Grethel Coello Said
2021 arXiv   pre-print
When the visual information is related to videos, this takes us into Video-Text Research, which includes several challenging tasks such as video question answering, video summarization with natural language  ...  We analyze twenty-six benchmark datasets, showing their drawbacks and strengths for the problem requirements.  ...  In our recent work (Perez-Martin et al. 2021), we have applied crossmodal retrieval for learning representations with implicit syntactic information and generate more syntactically correct descriptions  ... 
arXiv:2103.14785v3 fatcat:xwzziozwjbghfobtowu5bny6bu