8 Hits in 1.9 sec

OmniNet: A unified architecture for multi-modal multi-task learning [article]

Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
2020 arXiv   pre-print
The proposed architecture further enables a single model to support tasks with multiple input modalities as well as asynchronous multi-task learning, thus we refer to it as OmniNet.  ...  We introduce an extended and unified architecture that can be used for tasks involving a variety of modalities like image, text, videos, etc.  ...  To address this gap, we extend Transformer towards a unified architecture, namely OmniNet, which enables a single model to support tasks with multiple input modalities and asynchronous multi-task learning  ... 
arXiv:1907.07804v2 fatcat:xb4326fqjfar5okp7w6dlfyrgm

Software/Hardware Co-design for Multi-modal Multi-task Learning in Autonomous Systems [article]

Cong Hao, Deming Chen
2021 arXiv   pre-print
Therefore, autonomous systems essentially require multi-modal multi-task (MMMT) learning which must be aware of hardware performance and implementation strategies.  ...  multi-modal data from different sensors, requiring diverse data preprocessing, sensor fusion, and feature aggregation.  ...  Similarly, OmniNet [25] is another unified architecture for MMMT. Most recently, Lu et al. [26] introduce a multi-task model that handles twelve different datasets simultaneously.  ...
arXiv:2104.04000v1 fatcat:vf673pujtvhg7nyqrlppmsk2fa

SCENIC: A JAX Library for Computer Vision Research and Beyond [article]

Mostafa Dehghani and Alexey Gritsenko and Anurag Arnab and Matthias Minderer and Yi Tay
2021 arXiv   pre-print
Scenic supports a diverse range of vision tasks (e.g., classification, segmentation, detection) and facilitates working on multi-modal problems, along with GPU/TPU support for multi-host, multi-device large-scale  ...  Scenic also offers optimized implementations of state-of-the-art research models spanning a wide range of modalities.  ...  Sunayana Rane, Josip Djolonga, Lucas Beyer, Alexander Kolesnikov, Xiaohua Zhai, Rob Romijnders, Rianne van den Berg, Jonathan Heek, Olivier Teboul, Marco Cuturi, Lu Jiang, Mario Lučić, and Neil Houlsby for  ...
arXiv:2110.11403v1 fatcat:x2t6wy54vzh3zmuj5poh6qehkm

SOLIS – The MLOps journey from data acquisition to actionable insights [article]

Razvan Ciobanu, Alexandru Purdila, Laurentiu Piciu, Andrei Damian
2022 arXiv   pre-print
Machine Learning Operations is unarguably one of the most important, and lately one of the hottest, topics in Artificial Intelligence.  ...  Being able to define very clear hypotheses for actual real-life problems that can be addressed by machine learning models, collecting and curating large amounts of data for model training and validation  ...  Inference OmniNet approach The main objective of our proposed OmniNet DAG deployment architecture is to have multiple task-oriented neural models cooperate and similarly integrate with each other to  ...
arXiv:2112.11925v2 fatcat:bgkdomsm3rcmzn2x6bmzmppdmq

12-in-1: Multi-Task Vision and Language Representation Learning

Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Our approach culminates in a single model on 12 datasets from four broad categories of tasks, including visual question answering, caption-based image retrieval, grounding referring expressions, and multi-modal  ...  required for success at these tasks overlap significantly.  ...  Conclusion In this work, we develop a training regime and experimental setting for large-scale, multi-modal, multi-task learning.  ...
doi:10.1109/cvpr42600.2020.01045 dblp:conf/cvpr/LuGRPL20 fatcat:kmcnv5rwdjcflfgjwqy3cugv7u

12-in-1: Multi-Task Vision and Language Representation Learning [article]

Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee
2020 arXiv   pre-print
Our approach culminates in a single model on 12 datasets from four broad categories of tasks, including visual question answering, caption-based image retrieval, grounding referring expressions, and multi-modal  ...  required for success at these tasks overlap significantly.  ...  Conclusion In this work, we develop a training regime and experimental setting for large-scale, multi-modal, multi-task learning.  ...
arXiv:1912.02315v2 fatcat:bjlhdvftabdfdpskqwzd5yzia4

Multimodal Intelligence: Representation Learning, Information Fusion, and Applications [article]

Chao Zhang, Zichao Yang, Xiaodong He, Li Deng
2020 arXiv   pre-print
Regarding multimodal fusion, this review focuses on special architectures for the integration of representations of unimodal signals for a particular task.  ...  Regarding multimodal representation learning, we review the key concepts of embedding, which unify multimodal signals into a single vector space and thereby enable cross-modality signal processing.  ...  ACKNOWLEDGEMENT The authors are grateful to the editor and anonymous reviewers for their valuable suggestions that helped to make this paper better.  ... 
arXiv:1911.03977v3 fatcat:ojazuw3qzvfqrdweul6qdpxuo4

The Origins of Informatics

M. F. Collen
1994 JAMIA Journal of the American Medical Informatics Association  
A digital computer required a central processing unit with a primary or main memory to hold the data being processed; a program of instructions for processing the data; and circuitry to  ...  The digital computer has profound implications for the development and practice of clinical medicine. J Am Med Informatics Assoc. 1994;1:91-107.  ...  Greenes defined a programming language ". . . as a formal language used to facilitate a description, by a human user, of a procedure for solving a problem or a task."  ...
doi:10.1136/jamia.1994.95236152 pmid:7719803 pmcid:PMC116189 fatcat:tb4zyqkskncu5opdutblapneje