3D for Free: Crossmodal Transfer Learning using HD Maps
[article]
2020
arXiv
pre-print
Critically, we constrain this ill-posed 2D-to-3D mapping by using high-definition maps and object size priors. The result of the mining process is 3D cuboids with varying confidence. ...
3D object detection is a core perceptual challenge for robotics and autonomous driving. ...
Thus, our method is better thought of as a crossmodality transfer learning method that relies on (1) the relative maturity of 2D instance segmentation and corresponding datasets compared to 3D detection ...
arXiv:2008.10592v1
fatcat:kr5q5djjyzfylbnkhio7v3lde4
Is Multimedia Multisensorial? - A Review of Mulsemedia Systems
2018
ACM Computing Surveys
-makes possible the inclusion of layered sensory stimulation and interaction through multiple sensory channels. The recent upsurge in technology and wearables provides mulsemedia researchers a vehicle for ...
Mobile phones and tablets can be used for stereo-based depth measurement opening up new possibilities for 3D reconstruction [94]. 3D printers started to be used to provide a tactile dimension to traditional ...
In [200], the authors used crossmodal stimuli in their study, and found an improvement of free recall of crossmodal audiovisual stimuli compared to modality-specific, audio or visual stimuli. These findings ...
doi:10.1145/3233774
fatcat:dmov3hwt5fbxfktysxuarhezia
Crossmodal Language Grounding in an Embodied Neurocognitive Model
[article]
2020
arXiv
pre-print
It addresses developmental robotic interaction and extends its learning capabilities using larger-scale knowledge-based data. ...
This model can also provide the basis for further crossmodal integration of perceptually grounded cognitive representations. ...
FUNDING The authors gratefully acknowledge partial support from the German Research Foundation (DFG) and the National Science Foundation of China (NSFC) under project Crossmodal Learning (TRR-169). ...
arXiv:2006.13546v1
fatcat:ok7lhtpparg3ni33bxagjuoyae
Unsupervised Black-Box Model Domain Adaptation for Brain Tumor Segmentation
2022
Frontiers in Neuroscience
Unsupervised domain adaptation (UDA) is an emerging technique that enables the transfer of domain knowledge learned from a labeled source domain to unlabeled target domains, providing a way of coping with ...
We extensively validated our framework on a few datasets and deep learning backbones, demonstrating the potential for our framework to be applied in challenging yet realistic clinical settings. ...
Kuzborskij and Orabona (2013) proposed a detailed theoretical analysis of hypothesis transfer learning for linear regression, which is the basis for subsequent UDA solutions that do not rely on source ...
doi:10.3389/fnins.2022.837646
pmid:35720708
pmcid:PMC9201342
fatcat:srpmcqkmybhedcqxu64ef7bm7u
Image Translation for Medical Image Generation – Ischemic Stroke Lesions
[article]
2021
arXiv
pre-print
We demonstrate with the example of ischemic stroke that an improvement in lesion segmentation is feasible using deep learning based augmentation. ...
Deep learning based disease detection and segmentation algorithms promise to improve many clinical processes. ...
ACKNOWLEDGMENTS The authors would like to thank Sebastian Kozerke and Thomas Joyce for useful discussions. ...
arXiv:2010.02745v2
fatcat:h72525ipibctfe3plskcisjape
Multi-Modal 3D Object Detection in Autonomous Driving: a Survey
[article]
2021
arXiv
pre-print
To bridge this gap and motivate future research, this survey devotes to review recent fusion-based 3D detection deep learning models that leverage multiple sensor data sources, especially cameras and LiDARs ...
Next, we discuss some popular datasets for multi-modal 3D object detection, with a special focus on the sensor data included in each dataset. ...
It also provides HD maps for automatic map creation aka. map automation. ...
arXiv:2106.12735v2
fatcat:5twzbk4yhrcfzddp7zghnsivna
Multimodal Learning with Transformers: A Survey
[article]
2022
arXiv
pre-print
, and multimodal Transformers, from a geometrically topological perspective, (3) a review of multimodal Transformer applications, via two important paradigms, i.e., for multimodal pretraining and for specific ...
Transformer is a promising neural network learner, and has achieved great success in various machine learning tasks. ...
We hope that this survey gives a helpful and detailed overview for new researchers and practitioners, provides a convenient reference for relevant experts (e.g., multimodal machine learning researchers ...
arXiv:2206.06488v1
fatcat:6aoaczzbtvc43my2kmobo7glvy
2021 Index IEEE Robotics and Automation Letters Vol. 6
2021
IEEE Robotics and Automation Letters
The Author Index contains the primary entry for each item, listed under the first author's name. ...
., +, LRA July 2021 4433-4440 HD Map Update for Autonomous Driving With Crowdsourced Data. ...
., +, LRA Jan. 2021 32-39 Crowdsourcing HD Map Update for Autonomous Driving With Crowdsourced Data. ...
doi:10.1109/lra.2021.3119726
fatcat:lsnerdofvveqhlv7xx7gati2xu
Direct Structural Connections between Voice- and Face-Recognition Areas
2011
Journal of Neuroscience
Using probabilistic tractography, we show evidence that the FFA is structurally connected with voice-sensitive areas in STS. ...
Currently, there are two opposing models for how voice and face information is integrated in the human brain to recognize person identity. ...
The training, including learning and test, took 25 min. Training was repeated twice for all participants. ...
doi:10.1523/jneurosci.2091-11.2011
pmid:21900569
pmcid:PMC6623403
fatcat:cljg6q6dufbcxnvhazt2le6k6i
Deep 3D human pose estimation: A review
2021
Computer Vision and Image Understanding
In this paper, we provide a thorough review of existing deep learning based works for 3D pose estimation, summarize the advantages and disadvantages of these methods and provide an in-depth understanding ...
Furthermore, we also explore the commonly-used benchmark datasets on which we conduct a comprehensive study for comparison and analysis. ...
Mehta et al. (2017a) address the generalization problem of 3D pose estimation by transfer learning. ...
doi:10.1016/j.cviu.2021.103225
fatcat:hvlgjuxd2zfgji6k4y4g65cs7y
TWO!EARS Deliverable D4.1 - Feedback-loop selection and listing (WP4: Active listening, feedback loops & integration of cross-modal information; FP7-ICT-2013-C TWO!EARS FET-Open Project 618075)
2019
Zenodo
This deliverable mainly entails our advance on the key task for the current project period, namely, task 4.1. ...
Also, multi-modal approaches have been reviewed and evaluated with regard to their value for Two!Ears. Her [...] ...
As a first step, an active hearing process is performed while learning an auditory-motor map. Next, this map is used for a-priori passive sound localization. ...
doi:10.5281/zenodo.2595244
fatcat:3oocvxholvgr3ecgubmq3uwxqa
Audiovisual reproduction in surrounding display: Effect of spatial width of audio and video
2012
2012 Fourth International Workshop on Quality of Multimedia Experience
Constrained correspondence analysis of the free description data suggests the reasons for highest perceived degradation to be caused by wrong audio direction, reduced video width and missing essential ...
In addition, free descriptions of the most prominent degrading factors were collected. ...
The machine learning algorithm uses supervised learning to eventually learn how the QoS parameters affect the perceived MOS. ...
doi:10.1109/qomex.2012.6263861
dblp:conf/qomex/RummukainenP12
fatcat:rv5ag4yqdbaxrelsfqnwcmnbwa
Detection and assessment of Parkinson's disease based on gait analysis: A survey
2022
Frontiers in Aging Neuroscience
These impairments have been used as a clinical sign for the early detection of PD, as well as an objective index for pervasive monitoring of the PD patients in daily life. ...
The intervening measures for improving gait performance are summarized, in which the smart devices for gait intervention are emphasized. ...
Yunbo Li and Miss Mengnan He for the contribution of literature search. ...
doi:10.3389/fnagi.2022.916971
fatcat:pmugk3unr5g3vktd4e5g73usza
Crowding is similar for eye movements and manual responses
2014
Journal of Vision
visual and memory search for words Sage E P Boettcher, Jeremy M Wolfe 53.305 Updating for free? ...
Exogenous attention enables visual perceptual learning and task transfer S F A Szpiro, S Cohen, M Carrasco 56.328 Spatial attention generalizes perceptual learning to untrained locations in an acuity ...
doi:10.1167/14.10.789
fatcat:uxwzbzfhbjc3ddyfhbs2ilr4qe
A Comprehensive Review of the Video-to-Text Problem
[article]
2021
arXiv
pre-print
When the visual information is related to videos, this takes us into Video-Text Research, which includes several challenging tasks such as video question answering, video summarization with natural language ...
We analyze twenty-six benchmark datasets, showing their drawbacks and strengths for the problem requirements. ...
In our recent work (Perez-Martin et al. 2021), we have applied crossmodal retrieval for learning representations with implicit syntactic information and generate more syntactically correct descriptions ...
arXiv:2103.14785v3
fatcat:xwzziozwjbghfobtowu5bny6bu