8,902 Hits in 6.7 sec

Gated Ensemble of Spatio-temporal Mixture of Experts for Multi-task Learning in Ride-hailing System [article]

M. H. Rahman, S. M. Rifaat, S. N. Sadeek, M. Abrar, D. Wang
2021 arXiv   pre-print
Therefore, a multi-task learning architecture is proposed in this study by developing a gated ensemble of spatio-temporal mixture of experts network (GESME-Net) with a convolutional recurrent neural network  ...  Furthermore, an input-agnostic feature weighting layer is integrated with the architecture for learning a joint representation in multi-task learning and revealing the contribution of the input features  ...  $f^{t}_{ME}(\mathbf{X}) = \sum_i f^{t}_{gate,i}(\mathbf{X})\, f^{t}_{i}(\mathbf{X})$ (5) Although such an ensemble of experts is proven to learn task relationships in multi-task learning, the experts utilized are feedforward neural networks that are unable  ... 
arXiv:2012.15408v2 fatcat:kzrvaxlbrzdg7bdcmzghk4grgu
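The gated combination in Eq. (5) follows the usual mixture-of-experts form: a gating network produces per-expert weights that blend the expert outputs into a single representation. A minimal sketch of that combination in PyTorch (the expert type, layer sizes, and class names are illustrative assumptions, not the GESME-Net implementation):

```python
import torch
import torch.nn as nn

class GatedExpertEnsemble(nn.Module):
    """Dense gated ensemble: output = sum_i gate_i(x) * expert_i(x), as in Eq. (5)."""

    def __init__(self, in_dim, out_dim, n_experts=4):
        super().__init__()
        # Hypothetical feedforward experts; GESME-Net uses convolutional recurrent experts.
        self.experts = nn.ModuleList(nn.Linear(in_dim, out_dim) for _ in range(n_experts))
        self.gate = nn.Linear(in_dim, n_experts)

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)            # (batch, n_experts)
        outputs = torch.stack([e(x) for e in self.experts], 1)   # (batch, n_experts, out_dim)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)      # (batch, out_dim)
```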

Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners [article]

Shashank Gupta, Subhabrata Mukherjee, Krishan Subudhi, Eduardo Gonzalez, Damien Jose, Ahmed H. Awadallah, Jianfeng Gao
2022 arXiv   pre-print
This results in a sparsely activated multi-task model with a large number of parameters, but with the same computational cost as that of a dense model.  ...  In this work, we study whether sparsely activated Mixture-of-Experts (MoE) improve multi-task learning by specializing some weights for learning shared representations and using the others for learning  ...  The output of the sparse MoE layer is given by $h(x_s) = \sum_i G(x_s)_i \, E_i(x_s)$ (2), where $G(x_s)_i$ denotes the probability of selecting expert $E_i$ for $x_s$. 3 Sparse Multi-task Learning with Mixture-of-Experts  ... 
arXiv:2204.07689v1 fatcat:ohylqqsrnbhxdksqma7snz63gy
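Equation (2) above is the standard sparse MoE layer: a gate scores all experts, only the top-scoring ones are activated, and their outputs are combined with the gate probabilities. A rough sketch of that idea (the top-k routing, layer sizes, and names are generic assumptions, not the authors' code; an efficient implementation dispatches each input only to its selected experts instead of evaluating all of them):

```python
import torch
import torch.nn as nn

class SparseMoELayer(nn.Module):
    """Top-k gated MoE: h(x) = sum_i G(x)_i * E_i(x), with G sparse over experts."""

    def __init__(self, dim, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.gate = nn.Linear(dim, n_experts)
        self.k = k

    def forward(self, x):                                   # x: (batch, dim)
        logits = self.gate(x)                                # (batch, n_experts)
        topk_val, topk_idx = logits.topk(self.k, dim=-1)
        # Softmax over the selected experts only; all others get weight 0.
        gates = torch.zeros_like(logits).scatter(
            -1, topk_idx, torch.softmax(topk_val, dim=-1)
        )
        # For clarity every expert is evaluated here; a real sparse implementation
        # routes each input only to its k selected experts.
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, E, dim)
        return (gates.unsqueeze(-1) * expert_out).sum(dim=1)
```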

Multi-task feature learning by using trace norm regularization

Zhang Jiangmei, Yu Binfeng, Ji Haibo, Kunpeng Wang
2017 Open Physics  
We propose a new learning approach, which employs the mixture of experts model to divide a learning task into several related sub-tasks, and then uses trace norm regularization to extract common features  ...  Multi-task learning can exploit the correlation among multiple related machine learning problems to improve performance.  ...  Consequently, with the aid of the MoE model, we can apply the multi-task learning technique to a single-task learning problem.  ... 
doi:10.1515/phys-2017-0079 fatcat:u7xk7avnmnahhjpwmxo7huakdy
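The trace (nuclear) norm mentioned here is the sum of singular values of the stacked task-parameter matrix; penalizing it pushes the per-task weight vectors toward a shared low-rank feature subspace. A small illustrative sketch, assuming the sub-task parameters are stacked row-wise into a matrix W (the variable names and regularization weight are hypothetical):

```python
import torch

def trace_norm(W):
    """Trace (nuclear) norm: the sum of singular values of W, a convex
    surrogate for rank(W) that encourages a shared low-rank feature space."""
    return torch.linalg.svdvals(W).sum()

# Hypothetical usage: W stacks the weight vectors of 5 sub-tasks over 32 features.
W = torch.randn(5, 32, requires_grad=True)
task_loss = torch.tensor(0.0)            # placeholder for the sum of per-task losses
loss = task_loss + 0.01 * trace_norm(W)
loss.backward()
```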

Adaptive and Dynamic Knowledge Transfer in Multi-task Learning with Attention Networks [chapter]

Tao Ma, Ying Tan
2020 Communications in Computer and Information Science  
in multi-task learning.  ...  Knowledge transfer mainly depends on task relationships. Most existing multi-task learning methods guide the learning process based on predefined task relationships.  ...  - Multi-gate Mixture-of-Experts (MMoE): it applies the multi-gate mixture-of-experts structure to MTL [13].  ... 
doi:10.1007/978-981-15-7205-0_1 fatcat:n33tj4hepnb4vlwpcjhv7fkzuy

Phenotypical Ontology Driven Framework for Multi-Task Learning [article]

Mohamed Ghalwash, Zijun Yao, Prithwish Chakraborty, James Codella, Daby Sow
2020 arXiv   pre-print
In this paper, we propose OMTL, an Ontology-driven Multi-Task Learning framework designed to overcome such data limitations.  ...  Despite the large number of patients in Electronic Health Records (EHRs), the subset of usable data for modeling outcomes of specific phenotypes is often imbalanced and of modest size.  ...  The multi-task mixture-of-experts model [3] is the state-of-the-art multi-task model, and it shares some of the properties of our model, such as learning a shared representation across all outcomes.  ... 
arXiv:2009.02188v1 fatcat:f3fd5t5rlbgrlnflwzfzrnmiya

Prototype Feature Extraction for Multi-task Learning

Shen Xin, Yuhang Jiao, Cheng Long, Yuguang Wang, Xiaowei Wang, Sen Yang, Ji Liu, Jie Zhang
2022 Proceedings of the ACM Web Conference 2022  
To address this issue, we propose a novel multi-task learning model based on Prototype Feature Extraction (PFE) to balance task-specific objectives and inter-task relationships.  ...  Our model utilizes the learned prototype features and task-specific experts for MTL. We implement PFE on two public datasets.  ...  Multi-gate Mixture-of-Experts (MMoE) improves MoE by adding a gating network g_k for each task k. We define q_k as the network of the k-th task.  ... 
doi:10.1145/3485447.3512119 fatcat:n6vd3f6s7fhsbmjboug2ycuxry
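The MMoE structure referenced in this and several other entries above keeps one shared pool of experts but gives each task its own gating network g_k, so every task mixes the same experts with task-specific weights before its own tower. A compact sketch of that structure (the expert and tower architectures and sizes are assumptions for illustration):

```python
import torch
import torch.nn as nn

class MMoE(nn.Module):
    """Multi-gate Mixture-of-Experts: shared experts, one gate g_k and tower per task."""

    def __init__(self, in_dim, expert_dim, n_experts, n_tasks):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, expert_dim), nn.ReLU())
            for _ in range(n_experts)
        )
        self.gates = nn.ModuleList(nn.Linear(in_dim, n_experts) for _ in range(n_tasks))
        self.towers = nn.ModuleList(nn.Linear(expert_dim, 1) for _ in range(n_tasks))

    def forward(self, x):
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)   # (batch, E, D)
        task_outputs = []
        for gate, tower in zip(self.gates, self.towers):
            w = torch.softmax(gate(x), dim=-1).unsqueeze(-1)            # (batch, E, 1)
            task_outputs.append(tower((w * expert_out).sum(dim=1)))     # (batch, 1)
        return task_outputs
```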

MTNet: A Multi-Task Neural Network for On-Field Calibration of Low-Cost Air Monitoring Sensors [article]

Haomin Yu and Yangli-ao Geng and Yingjun Zhang and Qingyong Li and Jiayu Zhou
2021 arXiv   pre-print
Specifically, in the shared module, we extend the multi-gate mixture-of-experts structure to harmonize the task conflicts and correlations among different tasks; in each task-specific module, we introduce  ...  In this paper, we propose a multi-task calibration network (MTNet) to calibrate multiple sensors (e.g., carbon monoxide and nitrogen oxide sensors) simultaneously, modeling the interactions among tasks  ...  The parameters of DeepCM are set according to the recommendations in the original paper. • MMoE: Multi-gate mixture-of-experts (MMoE) [24] is a multi-task learning architecture, which explicitly learns  ... 
arXiv:2105.04425v1 fatcat:j4jjaryh6nbiplwosl2axcvuua

Scenario Adaptive Mixture-of-Experts for Promotion-Aware Click-Through Rate Prediction [article]

Xiaofeng Pan, Yibin Shen, Jing Zhang, Keren Yu, Hong Wen, Shui Liu, Chengjun Mao, Bo Cao
2022 arXiv   pre-print
Technically, it follows the idea of Mixture-of-Experts by adopting multiple experts to learn feature representations, which are modulated by a Feature Gated Network (FGN) via an attention mechanism.  ...  In this work, we propose Scenario Adaptive Mixture-of-Experts (SAME), a simple yet effective model that serves both promotion and normal scenarios.  ...  representative model of multi-task learning methods.  ... 
arXiv:2112.13747v2 fatcat:qu7qtomqfbc47j4wrwbb2d7quq

Restoring Spatially-Heterogeneous Distortions using Mixture of Experts Network [article]

Sijin Kim, Namhyuk Ahn, Kyung-Ah Sohn
2020 arXiv   pre-print
In addition, we propose a mixture of experts network to effectively restore a multi-distortion image.  ...  Motivated by multi-task learning, we design our network to have multiple paths that learn both common and distortion-specific representations.  ...  Multi-task Learning Multi-task learning (MTL) guides a model to learn both common and distinct representations across different tasks [21, 22].  ... 
arXiv:2009.14563v1 fatcat:tmvg6ckt25bhbm5nj2vwb46ogu

Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss [article]

Xing Cheng, Hezheng Lin, Xiangyu Wu, Fan Yang, Dong Shen
2021 arXiv   pre-print
In this paper, we propose a multi-stream Corpus Alignment network with single-gate Mixture-of-Experts (CAMoE) and a novel Dual Softmax Loss (DSL) to address the two types of heterogeneity.  ...  The CAMoE employs Mixture-of-Experts (MoE) to extract multi-perspective video representations, including action, entity, scene, etc., then aligns them with the corresponding parts of the text.  ...  As shown in Table 4, the multi-task setting with distinct inputs and a single gate outperforms the others, which indicates the effectiveness of experts learning information from particular aspects.  ... 
arXiv:2109.04290v3 fatcat:3nh7fdmsyrae7fdpfedyvfgc3y

Latent Domain Learning with Dynamic Residual Adapters [article]

Lucas Deecke, Timothy Hospedales, Hakan Bilen
2020 arXiv   pre-print
While recent techniques in domain adaptation and multi-domain learning enable the learning of more domain-agnostic features, their success relies on the presence of domain labels, typically requiring manual  ...  In this scenario, standard model training leads to the overfitting of large domains, while disregarding smaller ones.  ...  In this case, best performances are usually obtained by fitting a collection of models, with each model solving an individual sub-task.  ... 
arXiv:2006.00996v1 fatcat:ubrws7swezhennpzjzh3quxnye

M3E2: Multi-gate Mixture-of-experts for Multi-treatment Effect Estimation [article]

Raquel Aoki, Yizhou Chen, Martin Ester
2022 arXiv   pre-print
This work proposes M3E2, a multi-task learning neural network model to estimate the effects of multiple treatments.  ...  We compared M3E2 with three baselines on three synthetic benchmark datasets: two with multiple treatments and one with a single treatment.  ...  In a multi-gate mixture-of-experts (MMoE) architecture [15], a hard-parameter-sharing network can be interpreted as a single-expert model.  ... 
arXiv:2112.07574v2 fatcat:nwflmz2bffcgve2co4n4agfeoa

Modular Learning System and Scheduling for Behavior Acquisition in Multi-agent Environment [chapter]

Yasutake Takahashi, Kazuhiro Edazawa, Minoru Asada
2005 Lecture Notes in Computer Science  
This paper presents a method of modular learning in a multiagent environment, by which the learning agent can adapt its behaviors to situations that arise from the other agents' behaviors.  ...  Existing reinforcement learning approaches have suffered from the policy alternation of other agents in dynamic multiagent environments such as RoboCup competitions, since other agents' behaviors may  ...  Jacobs and Jordan [4] proposed the mixture of experts, in which a set of expert modules learns and the gating system weights the output of each expert module to form the final system output.  ... 
doi:10.1007/978-3-540-32256-6_51 fatcat:livhqjq5nvbfdjzxckxsj5sjai

A Multitask Learning Model with Multiperspective Attention and Its Application in Recommendation

Yingshuai Wang, Dezheng Zhang, Aziguli Wulamu, Jin Jing
2021 Computational Intelligence and Neuroscience  
To achieve more flexible parameter sharing while maintaining the specific feature advantages of each task, we improve the attention mechanism from the view of expert interaction.  ...  We train models to predict click and order targets at the same time. For better user satisfaction and business effectiveness, multitask learning is one of the most important methods in e-commerce.  ...  On the basis of the shared bottom, multi-gate mixture of experts (MMoE) [2] designs different gate networks for different tasks.  ... 
doi:10.1155/2021/8550270 pmid:34691173 pmcid:PMC8536436 fatcat:4fggvnrr3rcjhfjlowf5qqmi2i

Cycled Compositional Learning between Images and Text [article]

Jongseok Kim, Youngjae Yu, Seunghwan Lee, Gunhee Kim
2021 arXiv   pre-print
We participated in the Fashion IQ 2020 challenge and won first place with an ensemble of our model.  ...  Since this one-way mapping is highly under-constrained, we couple it with inverse relation learning via the Correction Network and introduce a cycled relation for a given image.  ...  Since attributes of fashion correspond to various parts of the image, we utilize a mixture of experts to solve the task.  ... 
arXiv:2107.11509v1 fatcat:cdmsc7yozrgxndgexkyoatftzu
Showing results 1 — 15 out of 8,902 results