Observational Overfitting in Reinforcement Learning [article]

Xingyou Song, Yiding Jiang, Stephen Tu, Yilun Du, Behnam Neyshabur
2019 arXiv   pre-print
A major component of overfitting in model-free reinforcement learning (RL) involves the case where the agent may mistakenly correlate reward with certain spurious features from the observations generated  ...  When an agent overfits to different observation spaces even if the underlying MDP dynamics is fixed, we term this observational overfitting.  ...  Observational Overfitting in CoinRun In order to verify that this is inherently a feature learning problem rather than a combinatorial problem involving objects, such as in (Santoro et al., 2018), we  ... 
arXiv:1912.02975v2 fatcat:tspzwdtir5ayzgsdamtgnhilgq

On overfitting and asymptotic bias in batch reinforcement learning with partial observability [article]

Vincent Francois-Lavet, Guillaume Rabusseau, Joelle Pineau, Damien Ernst, Raphael Fonteneau
2019 arXiv   pre-print
learning with partial observability.  ...  tradeoff between asymptotic bias and overfitting in the partially observable context.  ...  The authors thank the Walloon Region (Belgium) that funded this research in the context of the BATWAL project.  ... 
arXiv:1709.07796v2 fatcat:hm77udfhj5cz5m7ogszho73jsm

On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability

Vincent Francois-Lavet, Guillaume Rabusseau, Joelle Pineau, Damien Ernst, Raphael Fonteneau
2019 The Journal of Artificial Intelligence Research  
learning with partial observability.  ...  tradeoff between asymptotic bias and overfitting in the partially observable context.  ...  The authors thank the Walloon Region (Belgium) that funded this research in the context of the BATWAL project.  ... 
doi:10.1613/jair.1.11478 fatcat:pf7ojjswdvhl5k6psy53hgaf64

On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract)

Vincent Francois-Lavet, Guillaume Rabusseau, Joelle Pineau, Damien Ernst, Raphael Fonteneau
2020 Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence  
In the context of reinforcement learning with partial observability, this paper provides an analysis of the tradeoff between these two error sources.  ...  In particular, our theoretical analysis formally characterizes how a smaller state representation increases the asymptotic bias while decreasing the risk of overfitting.  ...  The first term in (1) denotes the approximation errors, and the last two terms are regularizations used to prevent overfitting.  ... 
doi:10.24963/ijcai.2020/695 dblp:conf/ijcai/0001Z20 fatcat:yx2wihhuobgmjjh4aevkbr33g4

Protecting against evaluation overfitting in empirical reinforcement learning

Shimon Whiteson, Brian Tanner, Matthew E. Taylor, Peter Stone
2011 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)  
We argue that reinforcement learning is particularly vulnerable to environment overfitting and propose as a remedy generalized methodologies, in which evaluations are based on multiple environments sampled  ...  Empirical evaluations play an important role in machine learning. However, the usefulness of any evaluation depends on the empirical methodology employed.  ...  In contrast, data overfitting is typically not a concern in reinforcement learning.  ... 
doi:10.1109/adprl.2011.5967363 dblp:conf/adprl/WhitesonTTS11 fatcat:tczpx35tqfcbpbf465tjfw5xha

A Dissection of Overfitting and Generalization in Continuous Reinforcement Learning [article]

Amy Zhang, Nicolas Ballas, Joelle Pineau
2018 arXiv   pre-print
In this work, we aim to offer new perspectives on the characterization and prevention of overfitting in deep Reinforcement Learning (RL) methods, with a particular focus on continuous domains.  ...  The risks and perils of overfitting in machine learning are well known. However most of the treatment of this, including diagnostic tools and remedies, was developed for the supervised learning case.  ...  Those observations advocate for the development of new benchmarks in order to study more thorough overfitting in deep RL. Technical Background Reinforcement Learning.  ... 
arXiv:1806.07937v2 fatcat:hqapx5lddnazxep7v4uzp4fscm

Decoupling Value and Policy for Generalization in Reinforcement Learning [article]

Roberta Raileanu, Rob Fergus
2021 arXiv   pre-print
Standard deep reinforcement learning algorithms use a shared representation for the policy and value function, especially when training directly from images.  ...  Consequently, the use of a shared representation for the policy and value function can lead to overfitting.  ...  Observational overfitting in reinforcement learning. ArXiv, abs/1912.02975, 2020. Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R.  ... 
arXiv:2102.10330v2 fatcat:45lqpgliszcohjt5enhsttb374

A Brief Look at Generalization in Visual Meta-Reinforcement Learning [article]

Safa Alver, Doina Precup
2020 arXiv   pre-print
We also observe that scalability to high-dimensional tasks with sparse rewards remains a significant problem among many of the current meta-reinforcement learning algorithms.  ...  Due to the realization that deep reinforcement learning algorithms trained on high-dimensional tasks can strongly overfit to their training environments, there have been several studies that investigated  ...  Related Work Overfitting in Reinforcement Learning.  ... 
arXiv:2006.07262v3 fatcat:7zkmbze76vb7rjl4vd4judrqyu

Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning [article]

Yuji Kanagawa, Tomoyuki Kaneko
2019 arXiv   pre-print
In this paper, we propose Rogue-Gym, a simple and classic style roguelike game built for evaluating generalization in reinforcement learning (RL).  ...  In our experiments, we evaluate a standard reinforcement learning method, PPO, with and without enhancements for generalization.  ...  It is improvement in that gradual overfitting (decrease in generalization score) was observed in the case with 10 or 20 training seeds. D.  ... 
arXiv:1904.08129v2 fatcat:qz2bmpfy6rehholb2hxwwik75y

Bootstrapped Q-learning with Context Relevant Observation Pruning to Generalize in Text-based Games [article]

Subhajit Chaudhury, Daiki Kimura, Kartik Talamadupula, Michiaki Tatsubori, Asim Munawar, Ryuki Tachibana
2020 arXiv   pre-print
We show that Reinforcement Learning (RL) methods for solving Text-Based Games (TBGs) often fail to generalize on unseen games, especially in small data regimes.  ...  Our method first trains a base model using Q-learning, which typically overfits the training games.  ...  Learn what not to learn: Action elimination with deep reinforcement learning. In Advances in Neural Information Processing Systems, pages 3562-3573.  ... 
arXiv:2009.11896v1 fatcat:3g6uuzi7g5ac7fsz3md4rxrvdu

Reinforcing Medical Image Classifier to Improve Generalization on Small Datasets [article]

Walid Abdullah Al, Il Dong Yun
2019 arXiv   pre-print
Besides an overall improvement in classification performance, the proposed classifier showed remarkable characteristics of generalized learning, which can have great potential in medical classification  ...  Such classification models can easily overfit when applied for medical images because of limited training data, which is a common problem in the field of medical image analysis.  ...  The overfitting situation with such small dataset is clear in the learning curves in Figure 2.  ... 
arXiv:1909.05630v2 fatcat:rzbsgytedjfkvc267dyrdrlz3y

Quantifying Generalization in Reinforcement Learning [article]

Karl Cobbe, Oleg Klimov, Chris Hesse, Taehoon Kim, John Schulman
2018 arXiv   pre-print
In this paper, we investigate the problem of overfitting in deep reinforcement learning.  ...  Most notably, we introduce a new environment called CoinRun, designed as a benchmark for generalization in RL. Using CoinRun, we find that agents overfit to surprisingly large training sets.  ...  prevalent in supervised learning, can reduce overfitting in our benchmark.  ... 
arXiv:1812.02341v2 fatcat:nxtdqlnecvaibc4yhediff5kra

The Primacy Bias in Deep Reinforcement Learning [article]

Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville
2022 arXiv   pre-print
This work identifies a common flaw of deep reinforcement learning (RL) algorithms: a tendency to rely on early interactions and ignore useful evidence encountered later.  ...  Because of training on progressively growing datasets, deep RL agents incur a risk of overfitting to earlier experiences, negatively affecting the rest of the learning process.  ...  Overfitting in RL Generalization and overfitting have many faces in deep reinforcement learning.  ... 
arXiv:2205.07802v1 fatcat:fy4ikvhvc5aj3hednivsqcnulq

Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation [article]

Niels Justesen, Ruben Rodriguez Torrado, Philip Bontrager, Ahmed Khalifa, Julian Togelius, Sebastian Risi
2018 arXiv   pre-print
Deep reinforcement learning (RL) has shown impressive results in a variety of domains, learning directly from high-dimensional sensory streams.  ...  When RL models overfit, even slight modifications to the environment can result in poor agent performance.  ...  First, we show that deep reinforcement learning overfits to a large degree on 2D arcade games when trained on a fixed set of levels.  ... 
arXiv:1806.10729v5 fatcat:ollaxeh7rnb67g4xiib5yl4dsy

OpenAI Gym [article]

Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, Wojciech Zaremba
2016 arXiv   pre-print
OpenAI Gym is a toolkit for reinforcement learning research.  ...  Background Reinforcement learning assumes that there is an agent that is situated in an environment.  ...  To build on recent progress in reinforcement learning, the research community needs good benchmarks on which to compare algorithms.  ... 
arXiv:1606.01540v1 fatcat:az3nsnngzncobo26r4l6rqxe2y