
Multitask Bandit Learning Through Heterogeneous Feedback Aggregation [article]

Zhi Wang, Chicheng Zhang, Manish Kumar Singh, Laurel D. Riek, Kamalika Chaudhuri
2021 arXiv   pre-print
In many real-world applications, multiple agents seek to learn how to perform highly related yet slightly different tasks in an online bandit learning protocol.  ...  We develop an upper confidence bound-based algorithm, RobustAgg(ϵ), that adaptively aggregates rewards collected by different players.  ...  Multitask Bandit Learning Through Heterogeneous Feedback Aggregation E Proof of the lower bounds E.1 Gap-independent lower bound with known We first restate Theorem 10. Theorem 10.  ... 
arXiv:2010.15390v2 fatcat:obhmf2l7zvcjbns6ailmpbwt7u

Gaussian process decentralized data fusion meets transfer learning in large-scale distributed cooperative perception

Ruofei Ouyang, Bryan Kian Hsiang Low
2019 Autonomous Robots  
Zheng, Zhou Zhao, Fanwei Zhu, Kevin Chen-Chuan Chang, Minghui Wu, Jing Ying Distant-supervision of Heterogeneous Multitask Learning for Social Event Forecasting with Multilingual Indicators liang Zhao*  ...  Katsunori Ohnishi, Yoshitaka Ushiku, Tatsuya Harada High Rank Matrix Completion with Side Information Ehsan Elhamifar*, Yugang Wang HodgeRank with Information Maximization for Crowdsourced Pairwise Ranking Aggregation  ... 
doi:10.1007/s10514-018-09826-z fatcat:67yqhwmgozccxni56rxmuapjgm

Joint Demand Forecasting and DQN-Based Control for Energy-Aware Mobile Traffic Offloading

Chih-Wei Huang, Po-Chen Chen
2020 IEEE Access  
INDEX TERMS Heterogeneous network, mobile traffic offloading, mobile traffic forecasting, deep learning, deep reinforcement learning, big data.  ...  Due to increasing system complexity, network operators are facing severe challenges and looking for machine learning-based solutions.  ...  The environment feedbacks a reward to the agent and enters the next epoch. Therefore, the agent learns the properties of the environment by interacting with it.  ... 
doi:10.1109/access.2020.2985679 fatcat:abcazsuerjd5nlyakt3u7mjm6u

Mobile Traffic Offloading with Forecasting using Deep Reinforcement Learning [article]

Chih-Wei Huang, Po-Chen Chen
2019 arXiv   pre-print
Due to increasing system complexity, network operators are facing severe challenges and looking for machine learning-based solutions.  ...  In this work, we propose an energy-aware mobile traffic offloading scheme in the heterogeneous network jointly apply deep Q network (DQN) decision making and advanced traffic demand forecasting.  ...  The environment feedbacks a reward to the agent and enters the next epoch. Therefore, the agent learns the properties of the environment by interacting with it.  ... 
arXiv:1911.07452v1 fatcat:ifnzestljrgpflwnqpnprfwkt4

2019 Index IEEE Transactions on Automatic Control Vol. 64

2019 IEEE Transactions on Automatic Control  
., +, TAC Nov. 2019 4470-4483 Online Convex Optimization With Time-Varying Constraints and Bandit Feedback.  ...  Stella, L., +, TAC Feb. 2019 697-711 Online Convex Optimization With Time-Varying Constraints and Bandit Feedback.  ... 
doi:10.1109/tac.2020.2967132 fatcat:o2hd2t4jz5fbpkcemjt5aj7xrm

The Internet of Federated Things (IoFT): A Vision for the Future and In-depth Survey of Data-driven Approaches for Federated Learning [article]

Raed Kontar, Naichen Shi, Xubo Yue, Seokhyun Chung, Eunshin Byon, Mosharaf Chowdhury, Judy Jin, Wissam Kontar, Neda Masoud, Maher Noueihed, Chinedum E. Okwudire, Garvesh Raskutti (+3 others)
2021 arXiv   pre-print
We end by describing the vision and challenges of IoFT in reshaping different industries through the lens of domain experts.  ...  model that quickly adapts to new devices or learning tasks.  ...  are unsupervised multitask learners.  ... 
arXiv:2111.05326v1 fatcat:bbgdhtuqcrhstgakt2vxuve2ca

Fast-adapting and Privacy-preserving Federated Recommender System [article]

Qinyong Wang, Hongzhi Yin, Tong Chen, Junliang Yu, Alexander Zhou, Xiangliang Zhang
2021 arXiv   pre-print
On the other hand, to better embrace the data heterogeneity commonly existing in FL, we innovatively introduce a first-order meta-learning method that enables fast in-device personalization with only few  ...  To this end, we propose a DNN-based recommendation model called PrivRec running on the decentralized federated learning (FL) environment, which ensures that a user's data never leaves his/her during the  ...  Multi-task learning The second category of such work is to view the personalization problem as multitask learning [91, 8].  ... 
arXiv:2104.00919v3 fatcat:u4io4mmvfzg6rnt5u3gw7kajue

2018 Index IEEE Transactions on Automatic Control Vol. 63

2018 IEEE Transactions on Automatic Control  
., +, TAC March 2018 742-751 Deadline Scheduling as Restless Bandits.  ...  Cruz-Zavala, E., +, TAC Dec. 2018 4309-4316 On Weight-Prioritized Multitask Control of Humanoid Robots.  ... 
doi:10.1109/tac.2019.2896796 fatcat:bwmqasulnzbwhin5hv4547ypfe

Table of Contents

2020 2020 IEEE Symposium Series on Computational Intelligence (SSCI)  
Finite-time Adaptive Optimal Output Feedback Control of Linear Systems with Intermittent Feedback (Avimanyu Sahoo, Vignesh Narayanan and Qiming Zhao), p. 233; Revisiting Maximum Entropy Inverse Reinforcement  ...  Algorithm for Military Workforce Planning Problems: A Simulation-Optimization Approach (Karam Sallam, Hasan Turan, Ripon Chakrabortty, Sondoss Elsawah and Michael Ryan), p. 2504; A Multi-Armed Bandit  ... 
doi:10.1109/ssci47803.2020.9308155 fatcat:hyargfnk4vevpnooatlovxm4li

Deep Learning based Recommender System: A Survey and New Perspectives [article]

Shuai Zhang, Lina Yao, Aixin Sun, Yi Tay
2018 arXiv   pre-print
property of learning feature representations from scratch.  ...  Evidently, the field of deep learning in recommender system is flourishing. This article aims to provide a comprehensive review of recent research efforts on deep learning based recommender systems.  ...  Reinforcement Learning techniques such as contextual-bandit approach [86] had shown superior recommendation performance in real-world applications.  ... 
arXiv:1707.07435v6 fatcat:2q2dbfy2jvdydhbrmmbyrzctnq

2021 Index IEEE Internet of Things Journal Vol. 8

2021 IEEE Internet of Things Journal  
He, Z., +, JIoT June 15, 2021 9706-9716 Privacy-Preserving Collaborative Learning for Multiarmed Bandits in IoT.  ...  ., +, JIoT Feb. 15, 2021 2364-2378 Peer Offloading With Delayed Feedback in Fog Networks. Yang, M., +, JIoT Sept. 1, 2021 13690-13702 Securing SDN-Controlled IoT Networks Through Edge Blockchain.  ... 
doi:10.1109/jiot.2022.3141840 fatcat:42a2qzt4jnbwxihxp6rzosha3y

Phasic norepinephrine is a neural interrupt signal for unexpected events in rapidly unfolding sensory sequences – evidence from pupillometry [article]

Sijia Zhao, Maria Chait, Fred Dick, Peter Dayan, Shigeto Furukawa, Hsin-I Liao
2018 biorxiv/medrxiv   pre-print
The rewards associated to each option depend on the observable features through an initially unknown function which can be learned through experience.  ...  Statistical analysis of aggregate accuracy revealed a main effect of both described probability and magnitude.  ...  Learning a behavior in presence of positive and negative rewards is often referred to as reinforcement learning [1].  ... 
doi:10.1101/466367 fatcat:a3bquw6n55amhodwmfoa2frboa

IA Meets CRNs: A Prospective Review on the Application of Deep Architectures in Spectrum Management

Mduduzi C. Hlophe, Bodhaswar T. Maharaj
2021 IEEE Access  
THE PROBLEM WITH REINFORCEMENT LEARNING Wireless networks are controlled through feedback signals in order to avoid instability and malfunctioning, which can be a challenge in distributed spectrum management  ...  RANDOMIZATION AND FEEDBACK CLASSES The efficient extensions of the foundational algorithms differ mainly in the way in which feedback from the environment is utilized to speed up the learning process.  ... 
doi:10.1109/access.2021.3104099 fatcat:ucyvpx36drdj5dcl2npxipkhd4

Federated Residual Learning [article]

Alekh Agarwal, John Langford, Chen-Yu Wei
2020 arXiv   pre-print
Our framework is robust to data heterogeneity, addressing the slow convergence problem traditional federated learning methods face when the data is non-i.i.d. across clients.  ...  We study a new form of federated learning where the clients train personalized local models and make predictions jointly with the server-side shared model.  ...  Feature hashing for large scale multitask learning.  ... 
arXiv:2003.12880v1 fatcat:omtz7hchhvg6haepck3r4o5juq

Bayesian Optimization for Policy Search via Online-Offline Experimentation [article]

Benjamin Letham, Eytan Bakshy
2019 arXiv   pre-print
We measure empirical learning curves which show substantial gains from including data from biased offline experiments, and show how these learning curves are consistent with theoretical results for multi-task  ...  Online field experiments are the gold-standard way of evaluating changes to real-world interactive machine learning systems.  ...  Finally, we thank Roberto Calandra, Mohammad Ghavamzadeh, Ron Kohavi, Alex Deng, and our anonymous reviewers for helpful feedback on this work.  ... 
arXiv:1904.01049v2 fatcat:u4ym64dx7jcfjlhxxmsdhmx4ya