Observational Overfitting in Reinforcement Learning
[article]
2019
arXiv
pre-print
A major component of overfitting in model-free reinforcement learning (RL) involves the case where the agent may mistakenly correlate reward with certain spurious features from the observations generated ...
When an agent overfits to different observation spaces even if the underlying MDP dynamics is fixed, we term this observational overfitting. ...
Observational Overfitting in CoinRun In order to verify that this is inherently a feature learning problem rather than a combinatorial problem involving objects, such as in (Santoro et al., 2018), we ...
arXiv:1912.02975v2
fatcat:tspzwdtir5ayzgsdamtgnhilgq
On overfitting and asymptotic bias in batch reinforcement learning with partial observability
[article]
2019
arXiv
pre-print
learning with partial observability. ...
tradeoff between asymptotic bias and overfitting in the partially observable context. ...
The authors thank the Walloon Region (Belgium) that funded this research in the context of the BATWAL project. ...
arXiv:1709.07796v2
fatcat:hm77udfhj5cz5m7ogszho73jsm
On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability
2019
The Journal of Artificial Intelligence Research
learning with partial observability. ...
tradeoff between asymptotic bias and overfitting in the partially observable context. ...
The authors thank the Walloon Region (Belgium) that funded this research in the context of the BATWAL project. ...
doi:10.1613/jair.1.11478
fatcat:pf7ojjswdvhl5k6psy53hgaf64
On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract)
2020
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
In the context of reinforcement learning with partial observability, this paper provides an analysis of the tradeoff between these two error sources. ...
In particular, our theoretical analysis formally characterizes how a smaller state representation increases the asymptotic bias while decreasing the risk of overfitting. ...
The first term in (1) denotes the approximation errors, and the last two terms are regularizations used to prevent overfitting. ...
doi:10.24963/ijcai.2020/695
dblp:conf/ijcai/0001Z20
fatcat:yx2wihhuobgmjjh4aevkbr33g4
Protecting against evaluation overfitting in empirical reinforcement learning
2011
2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
We argue that reinforcement learning is particularly vulnerable to environment overfitting and propose as a remedy generalized methodologies, in which evaluations are based on multiple environments sampled ...
Empirical evaluations play an important role in machine learning. However, the usefulness of any evaluation depends on the empirical methodology employed. ...
In contrast, data overfitting is typically not a concern in reinforcement learning. ...
doi:10.1109/adprl.2011.5967363
dblp:conf/adprl/WhitesonTTS11
fatcat:tczpx35tqfcbpbf465tjfw5xha
A Dissection of Overfitting and Generalization in Continuous Reinforcement Learning
[article]
2018
arXiv
pre-print
In this work, we aim to offer new perspectives on the characterization and prevention of overfitting in deep Reinforcement Learning (RL) methods, with a particular focus on continuous domains. ...
The risks and perils of overfitting in machine learning are well known. However most of the treatment of this, including diagnostic tools and remedies, was developed for the supervised learning case. ...
Those observations advocate for the development of new benchmarks in order to study more thorough overfitting in deep RL.
Technical Background Reinforcement Learning. ...
arXiv:1806.07937v2
fatcat:hqapx5lddnazxep7v4uzp4fscm
Decoupling Value and Policy for Generalization in Reinforcement Learning
[article]
2021
arXiv
pre-print
Standard deep reinforcement learning algorithms use a shared representation for the policy and value function, especially when training directly from images. ...
Consequently, the use of a shared representation for the policy and value function can lead to overfitting. ...
Observational overfitting in reinforcement learning. ArXiv, abs/1912.02975, 2020. Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. ...
arXiv:2102.10330v2
fatcat:45lqpgliszcohjt5enhsttb374
A Brief Look at Generalization in Visual Meta-Reinforcement Learning
[article]
2020
arXiv
pre-print
We also observe that scalability to high-dimensional tasks with sparse rewards remains a significant problem among many of the current meta-reinforcement learning algorithms. ...
Due to the realization that deep reinforcement learning algorithms trained on high-dimensional tasks can strongly overfit to their training environments, there have been several studies that investigated ...
Related Work Overfitting in Reinforcement Learning. ...
arXiv:2006.07262v3
fatcat:7zkmbze76vb7rjl4vd4judrqyu
Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning
[article]
2019
arXiv
pre-print
In this paper, we propose Rogue-Gym, a simple and classic style roguelike game built for evaluating generalization in reinforcement learning (RL). ...
In our experiments, we evaluate a standard reinforcement learning method, PPO, with and without enhancements for generalization. ...
It is improvement in that gradual overfitting (decrease in generalization score) was observed in the case with 10 or 20 training seeds.
D. ...
arXiv:1904.08129v2
fatcat:qz2bmpfy6rehholb2hxwwik75y
Bootstrapped Q-learning with Context Relevant Observation Pruning to Generalize in Text-based Games
[article]
2020
arXiv
pre-print
We show that Reinforcement Learning (RL) methods for solving Text-Based Games (TBGs) often fail to generalize on unseen games, especially in small data regimes. ...
Our method first trains a base model using Q-learning, which typically overfits the training games. ...
Learn what not to learn: Action elimination with deep reinforcement learning. In Advances in Neural Information Processing Systems, pages 3562-3573. ...
arXiv:2009.11896v1
fatcat:3g6uuzi7g5ac7fsz3md4rxrvdu
Reinforcing Medical Image Classifier to Improve Generalization on Small Datasets
[article]
2019
arXiv
pre-print
Besides an overall improvement in classification performance, the proposed classifier showed remarkable characteristics of generalized learning, which can have great potential in medical classification ...
Such classification models can easily overfit when applied for medical images because of limited training data, which is a common problem in the field of medical image analysis. ...
The overfitting situation with such small dataset is clear in the learning curves in Figure 2. ...
arXiv:1909.05630v2
fatcat:rzbsgytedjfkvc267dyrdrlz3y
Quantifying Generalization in Reinforcement Learning
[article]
2018
arXiv
pre-print
In this paper, we investigate the problem of overfitting in deep reinforcement learning. ...
Most notably, we introduce a new environment called CoinRun, designed as a benchmark for generalization in RL. Using CoinRun, we find that agents overfit to surprisingly large training sets. ...
prevalent in supervised learning, can reduce overfitting in our benchmark. ...
arXiv:1812.02341v2
fatcat:nxtdqlnecvaibc4yhediff5kra
The Primacy Bias in Deep Reinforcement Learning
[article]
2022
arXiv
pre-print
This work identifies a common flaw of deep reinforcement learning (RL) algorithms: a tendency to rely on early interactions and ignore useful evidence encountered later. ...
Because of training on progressively growing datasets, deep RL agents incur a risk of overfitting to earlier experiences, negatively affecting the rest of the learning process. ...
Overfitting in RL Generalization and overfitting have many faces in deep reinforcement learning. ...
arXiv:2205.07802v1
fatcat:fy4ikvhvc5aj3hednivsqcnulq
Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation
[article]
2018
arXiv
pre-print
Deep reinforcement learning (RL) has shown impressive results in a variety of domains, learning directly from high-dimensional sensory streams. ...
When RL models overfit, even slight modifications to the environment can result in poor agent performance. ...
First, we show that deep reinforcement learning overfits to a large degree on 2D arcade games when trained on a fixed set of levels. ...
arXiv:1806.10729v5
fatcat:ollaxeh7rnb67g4xiib5yl4dsy
OpenAI Gym
[article]
2016
arXiv
pre-print
OpenAI Gym is a toolkit for reinforcement learning research. ...
Background Reinforcement learning assumes that there is an agent that is situated in an environment. ...
To build on recent progress in reinforcement learning, the research community needs good benchmarks on which to compare algorithms. ...
arXiv:1606.01540v1
fatcat:az3nsnngzncobo26r4l6rqxe2y