
Retrospective Analysis of the 2019 MineRL Competition on Sample Efficient Reinforcement Learning [article]

Stephanie Milani, Nicholay Topin, Brandon Houghton, William H. Guss, Sharada P. Mohanty, Keisuke Nakata, Oriol Vinyals, Noboru Sean Kuno
2020 arXiv pre-print
To facilitate research in the direction of sample efficient reinforcement learning, we held the MineRL Competition on Sample Efficient Reinforcement Learning Using Human Priors at the Thirty-third Conference … The primary goal of this competition was to promote the development of algorithms that use human demonstrations alongside reinforcement learning to reduce the number of samples needed to solve complex, … Finally, we thank everyone who provided advice or helped organize the competition. …
arXiv:2003.05012v4

The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors [article]

William H. Guss, Mario Ynocente Castro, Sam Devlin, Brandon Houghton, Noboru Sean Kuno, Crissman Loomis, Stephanie Milani, Sharada Mohanty, Keisuke Nakata, Ruslan Salakhutdinov, John Schulman, Shinya Shiroshita (+3 others)
2021 arXiv pre-print
The primary goal of the competition is to foster the development of algorithms which can efficiently leverage human demonstrations to drastically reduce the number of samples needed to solve complex, hierarchical … Resolution of these limitations requires new, sample-efficient methods. To facilitate research in this direction, we propose this second iteration of the MineRL Competition. … challenges exhibited on the primary competition task, ObtainDiamond, we believe that the competition will bring work on the Minecraft domain to the fore of sample-efficient reinforcement learning research …
arXiv:2101.11071v1

JueWu-MC: Playing Minecraft with Sample-efficient Hierarchical Reinforcement Learning [article]

Zichuan Lin, Junyou Li, Jianing Shi, Deheng Ye, Qiang Fu, Wei Yang
2021 arXiv pre-print
Notably, we won the championship of the NeurIPS MineRL 2021 research competition and achieved the highest performance score ever. … To address this, we propose JueWu-MC, a sample-efficient hierarchical RL approach equipped with representation learning and imitation learning to deal with perception and exploration. … Sample-efficient Reinforcement Learning. Our work is to build a sample-efficient RL agent for playing Minecraft, and we thereby develop a combination of efficient learning techniques. …
arXiv:2112.04907v1

Towards robust and domain agnostic reinforcement learning competitions [article]

William Hebgen Guss, Stephanie Milani, Nicholay Topin, Brandon Houghton, Sharada Mohanty, Andrew Melnik, Augustin Harter, Benoit Buschmaas, Bjarne Jaster, Christoph Berganski, Dennis Heitkamp, Marko Henning (+17 others)
2021 arXiv pre-print
To demonstrate the efficacy of this design, we proposed, organized, and ran the MineRL 2020 Competition on Sample-Efficient Reinforcement Learning. … Reinforcement learning competitions have formed the basis for standard research benchmarks, galvanized advances in the state-of-the-art, and shaped the direction of the field. … We especially thank Shivam Khandelwal for his help in developing the competition starter-kit and providing constant assistance to the organizers and the participants during the competition. …
arXiv:2106.03748v1

The MineRL BASALT Competition on Learning from Human Feedback [article]

Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell (+1 others)
2021 arXiv pre-print
The MineRL BASALT competition aims to spur forward research on this important class of techniques. … While multiple solutions have been proposed, in this competition we focus on one in particular: learning from human feedback. … The MineRL Diamond challenge focuses on environment sample efficiency, whereas we focus on solving tasks and sample efficiency of human feedback. …
arXiv:2107.01969v1

Hierarchical Deep Q-Network from Imperfect Demonstrations in Minecraft [article]

Alexey Skrynnik, Aleksey Staroverov, Ermek Aitygulov, Kirill Aksenov, Vasilii Davydov, Aleksandr I. Panov
2020 arXiv pre-print
We present Hierarchical Deep Q-Network (HDQfD) that took first place in the MineRL competition. HDQfD works on imperfect demonstrations and utilizes the hierarchical structure of expert trajectories. … In this paper, we present the details of the HDQfD algorithm and give the experimental results in the Minecraft domain. … Acknowledgments: This work was supported by the Russian Science Foundation, project no. 18-71-00143. We thank the AIM Tech Company for its organizational and computing support. …
arXiv:1912.08664v4

Retrospective on the 2021 BASALT Competition on Learning from Human Feedback [article]

Rohin Shah, Steven H. Wang, Cody Wild, Stephanie Milani, Anssi Kanervisto, Vinicius G. Goecks, Nicholas Waytowich, David Watkins-Valls, Bharat Prakash, Edmund Mills, Divyansh Garg, Alexander Fries (+4 others)
2022 arXiv pre-print
The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks. … We held the first-ever MineRL Benchmark for Agents that Solve Almost-Lifelike Tasks (MineRL BASALT) Competition at the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021). … Acknowledgments: This competition would not have been possible to run without the help of many people and organizations. …
arXiv:2204.07123v1

Applied Machine Learning for Games: A Graduate School Course [article]

Yilei Zeng, Aayush Shah, Jameson Thai, Michael Zyda
2021 arXiv pre-print
Student projects cover use-cases such as training AI-bots in gaming benchmark environments and competitions, understanding human decision patterns in gaming, and creating intelligent non-playable characters … Our students gained hands-on experience in applying state of the art machine learning techniques to solve real-life problems in gaming. … MineRL Environments built on Malmo are released for NeurIPS competitions and MineRL imitation learning datasets …
arXiv:2012.01148v2

Unsupervised Skill-Discovery and Skill-Learning in Minecraft [article]

Juan José Nieto, Roger Creus, Xavier Giro-i-Nieto
2021 arXiv pre-print
Pre-training Reinforcement Learning agents in a task-agnostic manner has shown promising results. … In our work, we learn a compact latent representation by making use of variational and contrastive techniques. … This work was partially supported by the Postgraduate on Artificial Intelligence with Deep Learning of UPC School, and the Spanish Ministry of Economy and Competitivity under contract TEC2016-75976-R. …
arXiv:2107.08398v1

Reinforcement Learning in Practice: Opportunities and Challenges [article]

Yuxi Li
2022 arXiv pre-print
This article is a gentle discussion about the field of reinforcement learning in practice, about opportunities and challenges, touching a broad range of topics, with perspectives and without technical … The article is based on both historical and recent research papers, surveys, tutorials, talks, blogs, books, (panel) discussions, and workshops/conferences. … and sample-efficiency. …
arXiv:2202.11296v2

OpenHoldem: A Benchmark for Large-Scale Imperfect-Information Game Research [article]

Kai Li, Hang Xu, Enmin Zhao, Zhe Wu, Junliang Xing
2021 arXiv   pre-print
modeling and human-computer interactive learning.  ...  We have released OpenHoldem at holdem.ia.ac.cn, hoping it facilitates further studies on the unsolved theoretical and computational issues in this area and cultivate crucial research problems like opponent  ...  Bowling, “Monte Carlo deep reinforcement learning,” arXiv preprint arXiv:2003.13590, 2020. sampling for regret minimization in extensive games,” in Advances in [14] D. Ye, Z.  ... 
arXiv:2012.06168v4 fatcat:f3asym2j55gobcbwmg3m6tcway

Megaverse: Simulating Embodied Agents at One Million Experiences per Second [article]

Aleksei Petrenko, Erik Wijmans, Brennan Shacklett, Vladlen Koltun
2021 arXiv pre-print
We present Megaverse, a new 3D simulation platform for reinforcement learning and embodied AI research. … The efficient design of our engine enables physics-based simulation with high-dimensional egocentric observations at more than 1,000,000 actions per second on a single 8-GPU node. … The agent was trained on a total of 2 × 10^9 frames of experience, which is equivalent to 2.5 × 10^8 frames on each of the environments. …
arXiv:2107.08170v2

Say "Sul Sul!" to SimSim, A Sims-Inspired Platform for Sandbox Game AI [article]

Megan Charity, Dipika Rajesh, Rachel Ombok, L. B. Soros
2020 arXiv pre-print
Importantly, the large number of objects available to the player (whether human or automated) affords a wide variety of solutions to the underlying design problem. … This paper proposes environment design in the life simulation game The Sims as a novel platform and challenge for testing divergent search algorithms. … Acknowledgements: This work was supported by the National Science Foundation (Award number 1717324, RI: Small: General Intelligence through Algorithm Invention and Selection). …
arXiv:2008.11258v1

The NetHack Learning Environment [article]

Heinrich Küttler and Nantas Nardelli and Alexander H. Miller and Roberta Raileanu and Marco Selvatici and Edward Grefenstette and Tim Rocktäschel
2020 arXiv pre-print
Progress in Reinforcement Learning (RL) algorithms goes hand-in-hand with the development of challenging environments that test the limits of current methods. … Here, we present the NetHack Learning Environment (NLE), a scalable, procedurally generated, stochastic, rich, and challenging environment for RL research based on the popular single-player terminal-based … Torr, Vegard Mella, Arthur Szlam, Sebastian Riedel, Antoine Bordes, Gabriel Synnaeve, Jeremy Reizenstein, Florian Mayer, as well as ICML 2020 and BeTR-RL 2020 reviewers for their valuable feedback. …
arXiv:2006.13760v2

CraftAssist Instruction Parsing: Semantic Parsing for a Voxel-World Assistant

Kavya Srinet, Yacine Jernite, Jonathan Gray, Arthur Szlam
2020 Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
We propose a semantic parsing dataset focused on instruction-driven communication with an agent in the game Minecraft. The dataset consists of 7K human utterances and their corresponding parses. … Given proper world state, the parses can be interpreted and executed in game. … The MineRL competition on sample efficient reinforcement learning using human priors. arXiv preprint arXiv:1904.10079. …
doi:10.18653/v1/2020.acl-main.427
Showing results 1–15 of 16