Observe and Look Further: Achieving Consistent Performance on Atari [article]

Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Večerík, Matteo Hessel, Rémi Munos (+1 others)
2018 arXiv   pre-print
When tested on a set of 42 Atari games, our algorithm exceeds the performance of an average human on 40 games using a common set of hyperparameters.  ...  In this paper, we propose an algorithm that addresses each of these challenges and is able to learn human-level policies on nearly all Atari games.  ...  Conclusion In this paper, we presented a deep Reinforcement Learning (RL) algorithm that achieves human-level performance on a wide variety of MDPs on the Atari 2600 benchmark.  ... 
arXiv:1805.11593v1 fatcat:kluue6jfprhknf47gwxppud5sa

Tournament Selection Improves Cartesian Genetic Programming for Atari Games

Tim Cofala, Lars Elend, Oliver Kramer
2020 The European Symposium on Artificial Neural Networks  
Experimental studies on four exemplary Atari games show that the modifications decrease premature stagnation during the evolutionary optimization process and result in more robust agent strategies.  ...  Based upon preliminary work on the use of CGP playing Atari games, we propose extensions like the repeated evaluation of elite solutions.  ...  Section 2 gives a short introduction to ALE and Atari games. Related work on CGP and Atari games is introduced in Section 3.  ... 
dblp:conf/esann/CofalaEK20 fatcat:cdz6mecoynbc3cftz3gmr6reci

Structured Control Nets for Deep Reinforcement Learning [article]

Mario Srouji, Jian Zhang, Ruslan Salakhutdinov
2018 arXiv   pre-print
We validated our hypothesis with competitive results on simulations from OpenAI MuJoCo, Roboschool, Atari, and a custom 2D urban driving environment, with various ablation and generalization tests, trained  ...  Intuitively, the nonlinear control is for forward-looking and global control, while the linear control stabilizes the local dynamics around the residual of global control.  ...  We also thank Russ Webb, Jerremy Holland, Barry Theobald, and Megan Maher for helpful feedback on the manuscript.  ... 
arXiv:1802.08311v1 fatcat:moh7ywgbibaf5pn7r2zdhvflre

Many-Goals Reinforcement Learning [article]

Vivek Veeriah, Junhyuk Oh, Satinder Singh
2018 arXiv   pre-print
First, in a direct adaptation of Kaelbling's approach we explore if many-goals updating can be used to achieve mastery in non-tabular visual-observation domains.  ...  Second, we explore whether many-goals updating can be used to pre-train a network to subsequently learn faster and better on a single main task of interest.  ...  We evaluate the performance achieved by different agents after 10M steps of training on 49 Atari games.  ... 
arXiv:1806.09605v1 fatcat:we6o4t337vgjrpeydiho6ynk3i

Benchmarking End-to-End Behavioural Cloning on Video Games [article]

Anssi Kanervisto, Joonas Pussinen, Ville Hautamäki
2020 arXiv   pre-print
This also includes end-to-end approaches, where a computer plays a video game like humans do: by looking at the image displayed on the screen, and sending keystrokes to the game.  ...  Behavioural cloning, where a computer is taught to perform a task based on demonstrations, has been successfully applied to various video games and robotics tasks, with and without reinforcement learning  ...  To further study the effect that the quantity of data has on the results, we ran experiments with datasets that only contained the top 1, 2 and 3 episodes of the Atari Grand Challenge dataset.  ... 
arXiv:2004.00981v2 fatcat:vuucimtf3jhnflk7jxkf3cld5m

Natural Environment Benchmarks for Reinforcement Learning [article]

Amy Zhang, Yuxin Wu, Joelle Pineau
2018 arXiv   pre-print
The proposed domains also permit a characterization of generalization through fair train/test separation, and easy comparison and replication of results.  ...  By testing increasingly complex RL algorithms on low-complexity simulation environments, we often end up with brittle RL policies that generalize poorly beyond the very specific domain.  ...  Our results show that visual comprehension is still a difficult task even though we can achieve record scores in Atari on pixel observation.  ... 
arXiv:1811.06032v1 fatcat:3ga5wmkynfbrfemcarxmnxzw5e

Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay [article]

Ionel-Alexandru Hosu, Traian Rebedea
2016 arXiv   pre-print
We tested our method on Montezuma's Revenge and Private Eye, two of the most challenging games from the Atari platform.  ...  The proposed method, human checkpoint replay, consists in using checkpoints sampled from human gameplay as starting points for the learning process.  ...  The first method to achieve human-level performance in an Atari game is deep reinforcement learning [15, 16].  ... 
arXiv:1607.05077v1 fatcat:37xjpfxltffmxkcznizjum36ue

Large-Scale Study of Curiosity-Driven Learning [article]

Yuri Burda, Harri Edwards, Deepak Pathak, Amos Storkey, Trevor Darrell, Alexei A. Efros
2018 arXiv   pre-print
Our results show surprisingly good performance, and a high degree of alignment between the intrinsic curiosity objective and the hand-designed extrinsic rewards of many game environments.  ...  In this paper: (a) We perform the first large-scale study of purely curiosity-driven learning, i.e. without any extrinsic rewards, across 54 standard benchmark environments, including the Atari game suite  ...  Acknowledgments We would like to thank Chris Lu for helping with the Unity environment, Phillip Isola and Alex Nichols for feedback on the paper.  ... 
arXiv:1808.04355v1 fatcat:nocnlbafbfcalhqxymd7tjgqfa

On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [article]

Zhao Mandi, Pieter Abbeel, Stephen James
2022 arXiv   pre-print
We hence investigate meta-RL approaches in a variety of vision-based benchmarks, including Procgen, RLBench, and Atari, where evaluations are made on completely novel tasks.  ...  Our findings show that when meta-learning approaches are evaluated on different tasks (rather than different variations of the same task), multi-task pretraining with fine-tuning on new tasks performs  ...  The authors would like to thank Shikun Liu, Danijar Hafnar, Olivia Watkins, Yuqing Du, and Luisa Zintgraf for their help and feedback on initial drafts of the paper.  ... 
arXiv:2206.03271v1 fatcat:654ockasbjbq3p2nnq4jfo2vmy

A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games [article]

Felix Leibfried, Nate Kushman, Katja Hofmann
2017 arXiv   pre-print
Empirical evaluations on five Atari games demonstrate accurate cumulative reward prediction of up to 200 frames.  ...  State-of-the-art reinforcement learning approaches, such as deep Q-networks, are model-free and learn to act effectively across a wide range of environments such as Atari games, but require huge amounts  ...  The networks were hence initially trained on one-step ahead prediction only and later on fine-tuned on further-step ahead prediction.  ... 
arXiv:1611.07078v2 fatcat:brs5gs7r4raslcqibwmillfwri

The Arcade Learning Environment: An Evaluation Platform for General Agents

M. G. Bellemare, Y. Naddaf, J. Veness, M. Bowling
2013 The Journal of Artificial Intelligence Research  
ALE provides an interface to hundreds of Atari 2600 game environments, each one different, interesting, and designed to be a challenge for human players.  ...  In doing so, we also propose an evaluation methodology made possible by ALE, reporting empirical results on over 55 different games.  ...  We would also like to thank our reviewers for their helpful feedback and enthusiasm about the Atari 2600 as a research platform.  ... 
doi:10.1613/jair.3912 fatcat:yudan5ti4rdxtghbdtbswapnzm

Effects of Different Optimization Formulations in Evolutionary Reinforcement Learning on Diverse Behavior Generation [article]

Victor Villin, Naoki Masuyama, Yusuke Nojima
2021 arXiv   pre-print
To better understand how one guides multiple policies toward distinct strategies and benefit from diversity, we need to analyze further the influence of the reward signal modulation and other evolutionary  ...  Experiments on the Atari games stress that optimization formulations which do not consider objectives equally fail at generating diversity and even output agents that are worse at solving the problem at  ...  The original solution provided by the EMOGI paper was observed to have consistent results in terms of task performance.  ... 
arXiv:2110.08122v2 fatcat:vog57trckneopigso5uivompdi

BYOL-Explore: Exploration by Bootstrapped Prediction [article]

Zhaohan Daniel Guo, Shantanu Thakoor, Miruna Pîslar, Bernardo Avila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos (+2 others)
2022 arXiv   pre-print
As further evidence of the generality of BYOL-Explore, we show that it achieves superhuman performance on the ten hardest exploration games in Atari while having a much simpler design than other competitive  ...  We show that BYOL-Explore is effective in DM-HARD-8, a challenging partially-observable continuous-action hard-exploration benchmark with visually-rich 3-D environments.  ...  , Kyriacos Nikiforou, Georg Ostrovski, Razvan Pascanu, Doina Precup, Satinder Singh, Hubert Soyer, Pablo Sprechmann, and Karl Tuyls for their support and advice in developing and publishing this work.  ... 
arXiv:2206.08332v1 fatcat:imi5psnpz5fdbnz7ukzcbmuhti

An Empirical Study of Implicit Regularization in Deep Offline RL [article]

Caglar Gulcehre, Srivatsan Srinivasan, Jakub Sygnowski, Georg Ostrovski, Mehrdad Farajtabar, Matt Hoffman, Razvan Pascanu, Arnaud Doucet
2022 arXiv   pre-print
In this work, we conduct a careful empirical study on the relation between effective rank and performance on three offline RL datasets: bsuite, Atari, and DeepMind lab.  ...  Further, we show that several other factors could confound the relationship between effective rank and performance and conclude that studying this association under simplistic assumptions could be highly  ...  We want to thank Mark Rowland, Rishabh Agarwal and Aviral Kumar for the feedback on the early draft version of the paper.  ... 
arXiv:2207.02099v2 fatcat:23kvutruqvahzdoodbas7eyeiu

Relevance-Guided Modeling of Object Dynamics for Reinforcement Learning [article]

William Agnew, Pedro Domingos
2021 arXiv   pre-print
We also highlight the potential of this framework on several Atari games, using our object representation and standard RL and planning algorithms to learn dramatically faster than existing deep RL algorithms  ...  visual consistency.  ...  We look forward to our OLRL framework and grounding of objectness in behavior serving as a foundation for further advancements in this area.  ... 
arXiv:2003.01384v3 fatcat:jzp2y75bkfgd5nm62oty2clfqy