Distributed Bayesian optimization of deep reinforcement learning algorithms

M. Todd Young, Jacob Hinkle, Ramakrishnan Kannan, Arvind Ramanathan
2020 Journal of Parallel and Distributed Computing  
., Distributed Bayesian optimization of deep reinforcement learning algorithms, Journal of Parallel and Distributed Computing (2020), doi: https://doi.  ...  Now, recent work has brought the techniques of deep learning to bear on sequential decision processes in the area of deep reinforcement learning (DRL).  ...  The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (  ... 
doi:10.1016/j.jpdc.2019.07.008 fatcat:gwibvufibjhwjojaiqa7pj2xde

Efficient Hyperparameter Optimization for Differentially Private Deep Learning [article]

Aman Priyanshu, Rakshit Naidu, Fatemehsadat Mireshghallah, Mohammad Malekzadeh
2021 arXiv   pre-print
framework: evolutionary, Bayesian, and reinforcement learning.  ...  As we believe our work has implications to be utilized in the pipeline of private deep learning, we open-source our code at  ...  [19] uses reinforcement learning to efficiently tune hyperparameters needed for quantization of deep neural networks and find the bit-widths for weights of each layer that would provide optimal computation-accuracy  ... 
arXiv:2108.03888v1 fatcat:gomewnongbei3dptgmyniiniz4

Deep Bayesian Reward Learning from Preferences [article]

Daniel S. Brown, Scott Niekum
2019 arXiv   pre-print
Bayesian inverse reinforcement learning (IRL) methods are ideal for safe imitation learning, as they allow a learning agent to reason about reward uncertainty and the safety of a learned policy.  ...  We demonstrate that B-REX learns imitation policies that are competitive with a state-of-the-art deep imitation learning method that only learns a point estimate of the reward function.  ...  One of our contributions is to propose the algorithm B-REX, first deep Bayesian IRL algorithm that can scale to complex control problems with visual observations.  ... 
arXiv:1912.04472v1 fatcat:c2ouhzmearhupopywchlvp7ckq

Automatic tuning of hyper-parameters of reinforcement learning algorithms using Bayesian optimization with behavioral cloning [article]

Juan Cruz Barsce, Jorge A. Palombarini, Ernesto C. Martínez
2021 arXiv   pre-print
Also, by tightly integrating Bayesian optimization in a reinforcement learning agent design, the number of state transitions needed to converge to the optimal policy for a given task is reduced.  ...  Optimal setting of several hyper-parameters in machine learning algorithms is key to make the most of available data.  ...  Distributed Bayesian Optimization of Deep Reinforcement Learning Algorithms. Journal of Parallel and Distributed Computing. 2020;139:43–52. 48.  ... 
arXiv:2112.08094v1 fatcat:hd4bvvjrpzgn3oweaq2l747k6m

Quantity vs. Quality: On Hyperparameter Optimization for Deep Reinforcement Learning [article]

Lars Hertel, Pierre Baldi, Daniel L. Gillen
2020 arXiv   pre-print
From our experiments we conclude that Bayesian optimization with a noise robust acquisition function is the best choice for hyperparameter optimization in reinforcement learning tasks.  ...  Reinforcement learning algorithms can show strong variation in performance between training runs with different random seeds.  ...  An algorithm that closely resembles SHA is used to tune hyperparameters of deep reinforcement learning algorithms in the Stable-baselines [Hill et al., 2018] package.  ... 
arXiv:2007.14604v2 fatcat:wiulxfb4c5ha7ink2limugggfe

Meta Learning for Hyperparameter Optimization in Dialogue System

Jen-Tzung Chien, Wei Xiang Lieow
2019 Interspeech 2019  
The performance of dialogue system based on deep reinforcement learning (DRL) highly depends on the selected hyperparameters in DRL algorithms.  ...  This paper presents a meta learning approach to carry out multifidelity Bayesian optimization where a two-level recurrent neural network (RNN) is developed for sequential learning and optimization.  ...  Experiments Experimental setup Neural meta learning based on multifidelity Bayesian optimization was evaluated for hyperparameter optimization in deep reinforcement learning (DRL) for task-oriented dialogue  ... 
doi:10.21437/interspeech.2019-1383 dblp:conf/interspeech/ChienL19 fatcat:fyloues5dbd3dmpwt6vidiyal4

Survivable Hyper-Redundant Robotic Arm with Bayesian Policy Morphing

Sayyed Jaffar Ali Raza, Apan Dastider, Mingjie Lin
2020 2020 IEEE 16th International Conference on Automation Science and Engineering (CASE)  
In this paper we present a Bayesian reinforcement learning framework that allows robotic manipulators to adaptively recover from random mechanical failures autonomously, hence being survivable.  ...  To this end, we formulate the framework of Bayesian Policy Morphing (BPM) that enables a robot agent to self-modify its learned policy after the diminution of its maneuvering dimensionality.  ...  Bayesian Reinforcement Learning for Robotics For statically constrained working environment, utilizing deep RL-based algorithm to optimize motion trajectory for robotic systems with unknown dynamics has  ... 
doi:10.1109/case48305.2020.9216963 dblp:conf/case/RazaDL20 fatcat:3c2ommhpczbfzmrqyour4l2t44

Randomized Prior Functions for Deep Reinforcement Learning [article]

Ian Osband, John Aslanides, Albin Cassirer
2018 arXiv   pre-print
Dealing with uncertainty is essential for efficient reinforcement learning.  ...  There is a growing literature on uncertainty estimation for deep learning from fixed datasets, but many of the most popular approaches are poorly-suited to sequential decision problems.  ...  This paper can be thought of as a specific type of 'deep exploration via randomized value functions', whose line of research has been crucially driven by the contributions of (and conversations with) Benjamin  ... 
arXiv:1806.03335v2 fatcat:zkly3q224zad5cpqk7esoazr3e

UCB Exploration via Q-Ensembles [article]

Richard Y. Chen, Szymon Sidor, Pieter Abbeel, John Schulman
2017 arXiv   pre-print
We show how an ensemble of Q^*-functions can be leveraged for more effective exploration in deep reinforcement learning.  ...  We build on well established algorithms from the bandit setting, and adapt them to the Q-learning setting. We propose an exploration strategy based on upper-confidence bounds (UCB).  ...  Introduction Deep reinforcement learning seeks to learn mappings from high-dimensional observations to actions. Deep Q-learning (Mnih et al.  ... 
arXiv:1706.01502v3 fatcat:v3ury7x35zcntiiij4niyrcebi

Universal Reinforcement Learning Algorithms: Survey and Experiments [article]

John Aslanides, Jan Leike, Marcus Hutter
2017 arXiv   pre-print
Many state-of-the-art reinforcement learning (RL) algorithms typically assume that the environment is an ergodic Markov Decision Process (MDP).  ...  In contrast, the field of universal reinforcement learning (URL) is concerned with algorithms that make as few assumptions as possible about the environment.  ...  Right: AIξ compared to the (MC-approximated) optimal policy AIµ with θ = 0.75.  ... 
arXiv:1705.10557v1 fatcat:aptsmnq6ajdpvobxerzqlisr3m

Universal Reinforcement Learning Algorithms: Survey and Experiments

John Aslanides, Jan Leike, Marcus Hutter
2017 Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence  
Many state-of-the-art reinforcement learning (RL) algorithms typically assume that the environment is an ergodic Markov Decision Process (MDP).  ...  In contrast, the field of universal reinforcement learning (URL) is concerned with algorithms that make as few assumptions as possible about the environment.  ...  Right: AIξ compared to the (MC-approximated) optimal policy AIµ with θ = 0.75.  ... 
doi:10.24963/ijcai.2017/194 dblp:conf/ijcai/AslanidesLH17 fatcat:vxhz4tjl2bbqrhnizf3yhy7wfu

Evolving the Materials Genome: How Machine Learning Is Fueling the Next Generation of Materials Discovery

Changwon Suh, Clyde Fare, James A. Warren, Edward O. Pyzer-Knapp
2020 Annual review of materials research (Print)  
algorithms, tools, and methods.  ...  Machine learning, applied to chemical and materials data, is transforming the field of materials discovery and design, yet significant work is still required to fully take advantage of machine learning  ...  Here, we look at methodologies that fall into both camps, with particular attention to deep reinforcement learning and Bayesian optimization.  ... 
doi:10.1146/annurev-matsci-082019-105100 fatcat:dyxljg2mu5grzlakeeatvyymd4

Calibration Improves Bayesian Optimization [article]

Shachi Deshpande, Volodymyr Kuleshov
2021 arXiv   pre-print
We propose a simple algorithm to calibrate the uncertainty of posterior distributions over the objective function as part of the Bayesian optimization process.  ...  We show that by improving the uncertainty estimates of the posterior distribution with calibration, Bayesian optimization makes better decisions and arrives at the global optimum in fewer steps.  ...  Practical bayesian optimization of machine learning algorithms.  ... 
arXiv:2112.04620v1 fatcat:wghcueqhgzbi5fn5iofcdd3yyy

What can Machine Learning do for Radio Spectrum Management?

Ebtesam Almazrouei, Gabriele Gianini, Nawaf Almoosa, Ernesto Damiani
2020 Proceedings of the 16th ACM Symposium on QoS and Security for Wireless and Mobile Networks  
We survey Machine learning and Deep Learning algorithms with possible radio applications, and highlight the corresponding challenges.  ...  The opening of the unlicensed radio spectrum creates new opportunities and new challenges for communication technology that can be faced by Machine Learning techniques.  ...  Reinforcement learning (RL) [62], Deep Neural Networks (DNN), and deep learning (DL) [23] are machine learning models that are applied also for radio signals.  ... 
doi:10.1145/3416013.3426443 dblp:conf/mswim/AlmazroueiGAD20 fatcat:2hjionukbjgo5gx6bnknaiymse

Hyper-parameter optimization based on soft actor critic and hierarchical mixture regularization [article]

Chaoyue Liu, Yulai Zhang
2021 arXiv   pre-print
Hyper-parameter optimization is a crucial problem in machine learning as it aims to achieve the state-of-the-art performance in any model.  ...  In this paper, we model hyper-parameter optimization process as a Markov decision process, and tackle it with reinforcement learning.  ...  Related Work An algorithm that optimizes the hyper-parameters by DQN (Deep Q-Learning Network) has been proposed in [3].  ... 
arXiv:2112.04084v1 fatcat:u3hgnr3vgrabhobuxoco6hj424