Improving reinforcement learning algorithms: towards optimal learning rate policies
[article]
2021
arXiv
pre-print
This paper investigates to what extent one can improve reinforcement learning algorithms. Our study is split into three parts. ...
Second, we propose a dynamic optimal policy for the choice of the learning rate (γ_k)_{k ≥ 0} used in stochastic approximation (SA). ...
However, in this section, we present an optimal dynamic policy for the choice of the learning rate (γ_k)_{k ∈ ℕ}. ...
arXiv:1911.02319v6
fatcat:66q2tkqo5jclpawcjglpapdihe
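To make the learning-rate question in the entry above concrete, here is a minimal stochastic-approximation loop with a decaying step size. The quadratic objective, the noise model, and the Robbins-Monro schedule γ_k = c/(k+1) are illustrative assumptions; the paper derives its own optimal dynamic policy for γ_k, which is not reproduced here.

import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(x):
    # Gradient of f(x) = 0.5 * (x - 3)^2 observed with additive Gaussian noise.
    return (x - 3.0) + rng.normal(scale=1.0)

x = 0.0
c = 1.0  # illustrative constant, not the paper's optimized choice
for k in range(10_000):
    gamma_k = c / (k + 1)          # classic Robbins-Monro decay
    x -= gamma_k * noisy_grad(x)   # SA update: x_{k+1} = x_k - γ_k g_k

print(f"estimate after 10k steps: {x:.3f} (true minimizer is 3.0)")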
Reinforcement Learning for Improving Agent Design
[article]
2018
arXiv
pre-print
In many reinforcement learning tasks, the goal is to learn a policy to manipulate an agent, whose design is fixed, to maximize some notion of cumulative reward. ...
The design of the agent's physical structure is rarely optimized for the task at hand. ...
While the original design is symmetric, the learned design (Table 1) breaks symmetry and biases towards larger rear legs while jointly learning the navigation policy using an asymmetric body. ...
arXiv:1810.03779v2
fatcat:tpronrfxxvazfjgfqgcpihfzzy
An Improved Sarsa(λ) Reinforcement Learning Algorithm for Wireless Communication Systems
2019
IEEE Access
Numerical results demonstrate that the proposed algorithm offers higher learning efficiency and a wider learning-rate tolerance range than Q-learning, Sarsa, Expected Sarsa, and Sarsa(λ) in ...
It does not require prior environmental information and relies only on interaction with the environment to conduct the trial-and-error process and accumulates experience to learn the optimal control policy ...
The use of eligibility traces reduces the number of episodes required by the algorithm to find the optimal policy, which improves the learning efficiency. ...
doi:10.1109/access.2019.2935255
fatcat:ugvxdekwjvhj7fqto3idebtyjy
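For the entry above, the eligibility-trace mechanism credited with reducing the number of episodes can be sketched as standard tabular Sarsa(λ). The environment interface (reset/step returning integer states), the state and action counts, and the hyperparameters below are placeholders, not the wireless-communication setup or the paper's improved variant.

import numpy as np

def sarsa_lambda(env, n_states, n_actions, episodes=500,
                 alpha=0.1, gamma=0.95, lam=0.9, epsilon=0.1):
    # Generic tabular Sarsa(λ) with accumulating eligibility traces.
    Q = np.zeros((n_states, n_actions))
    rng = np.random.default_rng(0)

    def eps_greedy(s):
        if rng.random() < epsilon:
            return int(rng.integers(n_actions))
        return int(np.argmax(Q[s]))

    for _ in range(episodes):
        E = np.zeros_like(Q)              # eligibility traces, reset each episode
        s = env.reset()
        a = eps_greedy(s)
        done = False
        while not done:
            s2, r, done = env.step(a)
            a2 = eps_greedy(s2)
            delta = r + gamma * Q[s2, a2] * (not done) - Q[s, a]
            E[s, a] += 1.0                # mark the visited pair as eligible
            Q += alpha * delta * E        # update every pair in proportion to its trace
            E *= gamma * lam              # decay all traces
            s, a = s2, a2
    return Q

Because the TD error is propagated to all recently visited state-action pairs at once, fewer episodes are typically needed than with one-step updates, which is the efficiency gain the snippet refers to.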
SIBRE: Self Improvement Based REwards for Adaptive Feedback in Reinforcement Learning
[article]
2020
arXiv
pre-print
We propose a generic reward shaping approach for improving the rate of convergence in reinforcement learning (RL), called Self Improvement Based REwards, or SIBRE. ...
Experiments on several well-known benchmark environments with different RL algorithms show that SIBRE converges to the optimal policy faster and more stably. ...
We assume the existence of a reinforcement learning algorithm for learning the optimal mapping S → A. ...
arXiv:2004.09846v3
fatcat:2dqbb5kktzatnoin3oztaqgq3q
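The self-improvement idea in the entry above can be illustrated with a small reward-shaping wrapper that compares each episode's return against a running baseline of the agent's own recent returns. The EMA baseline and the ±1 shaped reward below are assumptions for illustration, not necessarily the exact SIBRE formulation.

class SelfImprovementReward:
    # Hedged sketch of self-improvement-based reward shaping.
    def __init__(self, beta=0.9):
        self.beta = beta       # smoothing factor for the performance baseline
        self.baseline = None   # running estimate of the agent's typical past return

    def shape(self, episode_return):
        if self.baseline is None:
            self.baseline = episode_return
            return 0.0
        shaped = 1.0 if episode_return > self.baseline else -1.0
        # Move the baseline so future episodes are judged against recent performance.
        self.baseline = self.beta * self.baseline + (1 - self.beta) * episode_return
        return shaped

The shaped value would augment or replace the terminal reward fed to whatever base RL algorithm is being trained, leaving that algorithm otherwise unchanged.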
Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction
[article]
2019
arXiv
pre-print
Text-based games are a natural challenge domain for deep reinforcement learning algorithms. ...
Empirically, we find that these techniques improve the performance of a baseline deep reinforcement learning agent applied to text-based games. ...
Conclusions and future work: We introduced two algorithmic improvements for deep reinforcement learning applied to interactive fiction (IF). ...
arXiv:1911.12511v1
fatcat:jbjomsgubneyhaksbbwgyus33e
Improving Reinforcement Learning with Human Input
2018
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Reinforcement learning (RL) has had many successes when learning autonomously. This paper and accompanying talk consider how to make use of a non-technical human participant, when available. ...
In particular, we consider the case where a human could 1) provide demonstrations of good behavior, 2) provide online evaluative feedback, or 3) define a curriculum of tasks for the agent to learn on. ...
We therefore updated our curriculum learning algorithm so that learning on the target task is biased towards the concepts seen most frequently in the curriculum. ...
doi:10.24963/ijcai.2018/817
dblp:conf/ijcai/Taylor18
fatcat:fkj3vl77pva3bez2uvtfaxrcfu
Algorithmic Improvements for Deep Reinforcement Learning Applied to Interactive Fiction
2020
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference
Text-based games are a natural challenge domain for deep reinforcement learning algorithms. ...
Empirically, we find that these techniques improve the performance of a baseline deep reinforcement learning agent applied to text-based games. ...
Conclusions and future work: We introduced two algorithmic improvements for deep reinforcement learning applied to interactive fiction (IF). ...
doi:10.1609/aaai.v34i04.5857
fatcat:tlufceqrlfbunkccc6yeor2wx4
Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching
[chapter]
1992
Reinforcement Learning
that will speed up reinforcement learning. ...
To date, reinforcement learning has mostly been studied on simple learning tasks. The reinforcement learning methods studied so far typically converge slowly. ...
number of hidden units of the evaluation, policy, and utility networks; • ne, rp, and tu: the learning rates of the backpropagation algorithm for the evaluation, policy, and utility networks; • the momentum ...
doi:10.1007/978-1-4615-3618-5_5
fatcat:rth3jyl4lfcnvltmknuoxijram
Reinforcement Learning for Improving Object Detection
[article]
2020
arXiv
pre-print
In this paper, we introduce an algorithm called ObjectRL to choose the amount of a particular pre-processing to be applied in order to improve the object detection performance of pre-trained networks. ...
The main motivation for ObjectRL is that an image which looks good to a human eye may not necessarily be the optimal one for a pre-trained object detector to detect objects. ...
We use the Adam optimizer [12] with a learning rate of 10⁻³. We use an ε-greedy method for exploration, where we anneal ε linearly with the number of episodes until it reaches 0.05. ...
arXiv:2008.08005v1
fatcat:6wwpphxmgrcspjegbjov5gqf5e
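The exploration schedule quoted in the entry above (ε-greedy with ε annealed linearly until it reaches 0.05) can be written as a one-line helper. The starting value of ε and the annealing horizon are not given in the snippet and are assumed here.

def epsilon_schedule(episode, total_episodes, eps_start=1.0, eps_final=0.05):
    # Linear annealing of ε with the episode index, clipped at eps_final.
    frac = min(episode / max(total_episodes - 1, 1), 1.0)
    return eps_start + frac * (eps_final - eps_start)

# Example: ε at a few points of an assumed 1000-episode run.
for ep in (0, 250, 500, 999):
    print(ep, round(epsilon_schedule(ep, 1000), 3))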
Reinforcement Learning for Improving Agent Design
2019
Artificial Life
In many reinforcement learning tasks, the goal is to learn a policy to manipulate an agent, whose design is fixed, to maximize some notion of cumulative reward. ...
The design of the agent's physical structure is rarely optimized for the task at hand. ...
While the original design is symmetric, the learned design (Table 1) breaks symmetry and biases towards larger rear legs while jointly learning the navigation policy using an asymmetric body. ...
doi:10.1162/artl_a_00301
pmid:31697584
fatcat:xxf3gcdnojgnlag3og72omums4
Improving the dynamics of quantum sensors with reinforcement learning
[article]
2019
arXiv
pre-print
Here, we use the cross entropy method of reinforcement learning to optimize the strength and position of control pulses. ...
By visualizing the evolution of the quantum state, the mechanism exploited by the reinforcement learning method is identified as a kind of spin-squeezing strategy that is adapted to the superradiant damping ...
For training we use the Adam optimizer [54] with learning rate 0.001. ...
arXiv:1908.08416v1
fatcat:sjclnxh2cjgwza7xjqyypxlelu
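The cross-entropy method named in the entry above is a generic derivative-free optimizer; a minimal version is sketched below. The toy objective stands in for a simulated score of the sensor dynamics, and the Gaussian search distribution and hyperparameters are generic defaults, not the paper's settings.

import numpy as np

def cross_entropy_method(score_fn, dim, iterations=50, pop=100, elite_frac=0.1, seed=0):
    # Maximize score_fn over parameter vectors (e.g. pulse strengths and positions).
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    n_elite = max(1, int(pop * elite_frac))
    for _ in range(iterations):
        samples = rng.normal(mean, std, size=(pop, dim))
        scores = np.array([score_fn(s) for s in samples])
        elites = samples[np.argsort(scores)[-n_elite:]]              # keep the best candidates
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6   # refit the sampler
    return mean

# Toy stand-in objective with its peak at (1, -2); a real use would score simulated dynamics.
best = cross_entropy_method(lambda p: -np.sum((p - np.array([1.0, -2.0])) ** 2), dim=2)
print(best)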
Improving Reinforcement Learning Speed for Robot Control
2006
2006 IEEE/RSJ International Conference on Intelligent Robots and Systems
Reinforcement Learning (RL) is an intuitive way of programming that is well-suited for use on autonomous robots because it does not require specifying how the task has to be achieved. ...
In this paper, we develop a theoretical study of the influence of some RL parameters over the learning speed. ...
Under some conditions [12], the Q-learning algorithm is guaranteed to converge to the optimal value function Q*. ...
doi:10.1109/iros.2006.282341
dblp:conf/iros/MatignonLF06
fatcat:xnhnia3uyzdsrofkmiuseaysia
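The convergence statement in the entry above refers to the standard tabular Q-learning update, shown below as a single step; the step size, discount factor, and environment interface are placeholders.

import numpy as np

def q_learning_step(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.95):
    # Q(s,a) ← Q(s,a) + α [ r + γ max_a' Q(s',a') − Q(s,a) ]
    # Under the usual conditions (all pairs visited infinitely often, step sizes
    # satisfying the Robbins-Monro conditions), Q converges to the optimal Q*.
    target = r + gamma * np.max(Q[s_next]) * (not done)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q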
Likelihood Quantile Networks for Coordinating Multi-Agent Reinforcement Learning
[article]
2020
arXiv
pre-print
In particular, each agent considers the likelihood that other agents' exploration and policy changes are occurring, essentially using the agent's own estimates to weight the learning rate that should be applied to the given samples. ...
Conclusion: This paper describes a novel distributional RL method for improving performance in cooperative multi-agent reinforcement learning settings. ...
arXiv:1812.06319v6
fatcat:6lsfmhoww5ffjlkwvuwrw4z5je
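One way to read the per-sample learning-rate weighting described in the entry above is sketched below: a scalar weight in [0, 1], standing in for the agent's estimated likelihood that a sample reflects other agents' settled (rather than exploratory or shifting) behavior, scales the effective step size. The distributional machinery that produces this weight in the paper is not reproduced here.

def weighted_td_update(q_value, target, base_alpha, likelihood_weight):
    # likelihood_weight in [0, 1] is assumed to come from the agent's own estimate;
    # samples judged unreliable receive a smaller effective learning rate.
    alpha = base_alpha * likelihood_weight
    return q_value + alpha * (target - q_value)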
Cautious Policy Programming: Exploiting KL Regularization in Monotonic Policy Improvement for Reinforcement Learning
[article]
2022
arXiv
pre-print
In this paper, we propose cautious policy programming (CPP), a novel value-based reinforcement learning (RL) algorithm that can ensure monotonic policy improvement during learning. ...
Based on the nature of entropy-regularized RL, we derive a new entropy regularization-aware lower bound of policy improvement that only requires estimating the expected policy advantage function. ...
CPP marks a step towards practical, monotonically improving RL by leveraging entropy-regularized RL. However, there is still room for improvement. ...
arXiv:2107.05798v3
fatcat:g3uicog4tzhgjph2g6xwlujkby
Improving Maneuver Strategy in Air Combat by Alternate Freeze Games with a Deep Reinforcement Learning Algorithm
2020
Mathematical Problems in Engineering
Agents are trained by alternate freeze games with a deep reinforcement learning algorithm to deal with nonstationarity. ...
Middleware which connects the agents and air combat simulation software is developed to provide a reinforcement learning environment for agent training. ...
aircraft model and other functions are the same as those proposed in this paper. This environment and the RL agent are packaged as supplementary material. Through this material, the alternate freeze game DQN algorithm ...
doi:10.1155/2020/7180639
fatcat:lmszqmkfanavtchi6ymx5bghte
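The alternate freeze scheme mentioned in the entry above can be sketched as a simple training schedule: in each round one agent learns against a frozen opponent, then the roles swap, which keeps the environment approximately stationary for whichever agent is currently learning. The train_fn callback below is a placeholder for the underlying DQN training loop.

def alternate_freeze_training(agent_a, agent_b, train_fn, rounds=10, steps_per_round=10_000):
    # train_fn(learner, frozen, steps) is assumed to update only the learner's weights.
    for r in range(rounds):
        learner, frozen = (agent_a, agent_b) if r % 2 == 0 else (agent_b, agent_a)
        train_fn(learner, frozen, steps_per_round)   # opponent's policy is held fixed
    return agent_a, agent_b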
Showing results 1 — 15 out of 34,699 results