
Towards robust and domain agnostic reinforcement learning competitions [article]

William Hebgen Guss, Stephanie Milani, Nicholay Topin, Brandon Houghton, Sharada Mohanty, Andrew Melnik, Augustin Harter, Benoit Buschmaas, Bjarne Jaster, Christoph Berganski, Dennis Heitkamp, Marko Henning (+17 others)
2021-06-07, arXiv, pre-print
To demonstrate the efficacy of this design, we proposed, organized, and ran the MineRL 2020 Competition on Sample-Efficient Reinforcement Learning.  ...  Reinforcement learning competitions have formed the basis for standard research benchmarks, galvanized advances in the state-of-the-art, and shaped the direction of the field.  ...  We especially thank Shivam Khandelwal for his help in developing the competition starter-kit and providing constant assistance to the organizers and the participants during the competition.  ... 
Identifiers: arXiv:2106.03748v1 (https://arxiv.org/abs/2106.03748v1); fatcat:6y6am5deljdytd3ng6as2qq4cq (https://fatcat.wiki/release/6y6am5deljdytd3ng6as2qq4cq)
Full text (Web Archive PDF): https://web.archive.org/web/20210609203339/https://arxiv.org/pdf/2106.03748v1.pdf

Augment Valuate : A Data Enhancement Pipeline for Data-Centric AI [article]

Youngjune Lee, Oh Joon Kwon, Haeju Lee, Joonyoung Kim, Kangwook Lee, Kee-Eung Kim
2021-12-07, arXiv, pre-print
In order to serve as the basis for this automation, we suggest a domain-agnostic pipeline for refining the quality of data in image classification problems.  ...  Data scarcity and noise are important issues in industrial applications of machine learning.  ...  . 3 Methodology Although we focus on the competition, we concentrated on domain-agnostic techniques and relied only on the given data (i.e. we do not consider generative or collection technique) to  ... 
Identifiers: arXiv:2112.03837v1 (https://arxiv.org/abs/2112.03837v1); fatcat:wxku3ppvzzfflhichdy2hoe76q (https://fatcat.wiki/release/wxku3ppvzzfflhichdy2hoe76q)
Full text (Web Archive PDF): https://web.archive.org/web/20211209011600/https://arxiv.org/pdf/2112.03837v1.pdf

Domain-Level Explainability – A Challenge for Creating Trust in Superhuman AI Strategies [article]

Jonas Andrulis, Ole Meyer, Grégory Schott, Samuel Weinbach, Volker Gruhn
2020-11-12, arXiv, pre-print
For strategic problems, intelligent systems based on Deep Reinforcement Learning (DRL) have demonstrated an impressive ability to learn advanced solutions that can go far beyond human capabilities, especially  ...  Explainable AI (XAI) has successfully increased transparency for modern AI systems through a variety of measures, however, XAI research has not yet provided approaches enabling domain level insights for  ...  Introduction Deep Reinforcement Learning (DRL) is an area of machine learning where the system learns from interacting with the environment and actions are reinforced based on reward values.  ... 
Identifiers: arXiv:2011.06665v1 (https://arxiv.org/abs/2011.06665v1); fatcat:qwqpfzjz7nhmjg2ngx2bkip57i (https://fatcat.wiki/release/qwqpfzjz7nhmjg2ngx2bkip57i)
Full text (Web Archive PDF): https://web.archive.org/web/20201117004527/https://arxiv.org/pdf/2011.06665v1.pdf

The Challenges of Exploration for Offline Reinforcement Learning [article]

Nathan Lambert, Markus Wulfmeier, William Whitney, Arunkumar Byravan, Michael Bloesch, Vibhavari Dasagi, Tim Hertweck, Martin Riedmiller
2022-02-19, arXiv, pre-print
Offline Reinforcement Learning (ORL) enables us to separately study the two interlinked processes of reinforcement learning: collecting informative experience and inferring optimal behaviour.  ...  The task-agnostic setting for data collection, where the task is not known a priori, is of particular interest due to the possibility of collecting a single dataset and using it to solve several downstream  ...  Methodology Reinforcement Learning Reinforcement Learning (RL) is a framework where an agent interacts with an environment to solve a task by trial and error.  ... 
Identifiers: arXiv:2201.11861v2 (https://arxiv.org/abs/2201.11861v2); fatcat:sajzyrnxuze6lo2lozj4szy4um (https://fatcat.wiki/release/sajzyrnxuze6lo2lozj4szy4um)
Full text (Web Archive PDF): https://web.archive.org/web/20220521190925/https://arxiv.org/pdf/2201.11861v2.pdf

The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors [article]

William H. Guss, Mario Ynocente Castro, Sam Devlin, Brandon Houghton, Noboru Sean Kuno, Crissman Loomis, Stephanie Milani, Sharada Mohanty, Keisuke Nakata, Ruslan Salakhutdinov, John Schulman, Shinya Shiroshita (+3 others)
2021-01-26, arXiv, pre-print
Further we aim to prompt domain agnostic submissions by implementing several novel competition mechanics including action-space randomization and desemantization of observations and actions.  ...  Although deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples, affording only a shrinking segment of the AI  ...  To maximize the development of domain-agnostic techniques that enable the application of deep reinforcement learning to sample-limited, real-world domains, such as robotics, we carefully developed a novel  ... 
Identifiers: arXiv:2101.11071v1 (https://arxiv.org/abs/2101.11071v1); fatcat:gzd6vohfavaypnz2vqey6tgkqa (https://fatcat.wiki/release/gzd6vohfavaypnz2vqey6tgkqa)
Full text (Web Archive PDF): https://web.archive.org/web/20210131005641/https://arxiv.org/pdf/2101.11071v1.pdf

The MineRL 2019 Competition on Sample Efficient Reinforcement Learning using Human Priors [article]

William H. Guss, Cayden Codel, Katja Hofmann, Brandon Houghton, Noboru Kuno, Stephanie Milani, Sharada Mohanty, Diego Perez Liebana, Ruslan Salakhutdinov, Nicholay Topin, Manuela Veloso, Phillip Wang
2021-01-19, arXiv, pre-print
Though deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples.  ...  To facilitate research in this direction, we introduce the MineRL Competition on Sample Efficient Reinforcement Learning using Human Priors.  ...  Workshop on the competition with approximately 250 seats. We will reserve spots for guest speakers, organizers, Round 2 participants, and Round 1 participants attending NeurIPS.  ... 
Identifiers: arXiv:1904.10079v3 (https://arxiv.org/abs/1904.10079v3); fatcat:n3xfk2fyfnhe7l5oafdmccfbza (https://fatcat.wiki/release/n3xfk2fyfnhe7l5oafdmccfbza)
Full text (Web Archive PDF): https://web.archive.org/web/20210123054846/https://arxiv.org/pdf/1904.10079v3.pdf

Battlesnake Challenge: A Multi-agent Reinforcement Learning Playground with Human-in-the-loop [article]

Jonathan Chung, Anna Luo, Xavier Raffin, Scott Perry
2020-07-20, arXiv, pre-print
Our framework is agent-agnostic and heuristics-agnostic such that researchers can design their own algorithms, train their models, and demonstrate in the online Battlesnake competition.  ...  We present the Battlesnake Challenge, a framework for multi-agent reinforcement learning with Human-In-the-Loop Learning (HILL).  ...  Our proposed framework is agent-agnostic and heuristics-agnostic such that researchers can design their own algorithms, train their models, and demonstrate in the real Battlesnake competition.  ... 
Identifiers: arXiv:2007.10504v1 (https://arxiv.org/abs/2007.10504v1); fatcat:2clw7xibqzbgherabti6gykhka (https://fatcat.wiki/release/2clw7xibqzbgherabti6gykhka)
Full text (Web Archive PDF): https://web.archive.org/web/20200908112410/https://arxiv.org/pdf/2007.10504v1.pdf

Task-Agnostic Dynamics Priors for Deep Reinforcement Learning [article]

Yilun Du, Karthik Narasimhan
2019-07-11, arXiv, pre-print
While model-based deep reinforcement learning (RL) holds great promise for sample efficiency and generalization, learning an accurate dynamics model is often challenging and requires substantial interaction  ...  In this work, we propose an approach to learn task-agnostic dynamics priors from videos and incorporate them into an RL agent.  ...  Acknowledgements We would like to thank Alexander Botev, John Schulman, Tejas Kulkarni, Bowen Baker and the OpenAI team for providing helpful comments and sug-  ... 
Identifiers: arXiv:1905.04819v4 (https://arxiv.org/abs/1905.04819v4); fatcat:rupfjffiozg67ijipshvv5abr4 (https://fatcat.wiki/release/rupfjffiozg67ijipshvv5abr4)
Full text (Web Archive PDF, not primary version): https://web.archive.org/web/20200915093726/https://arxiv.org/pdf/1905.04819v2.pdf

ARC: Adversarially Robust Control Policies for Autonomous Vehicles [article]

Sampo Kuutti, Saber Fallah, Richard Bowden
2021-07-09, arXiv, pre-print
We introduce Adversarially Robust Control (ARC), which trains the protagonist policy and the adversarial policy end-to-end on the same loss.  ...  Therefore, there is a need to develop techniques to learn control policies that are robust against adversaries.  ...  Combining concepts of competing networks from GANs and adversarial training, Robust Adversarial Reinforcement Learning (RARL) [21] - [23] uses two DNNs trained through Reinforcement Learning (RL), where  ... 
Identifiers: arXiv:2107.04487v1 (https://arxiv.org/abs/2107.04487v1); fatcat:y3vhs26kvnglljkm4wldpwvcna (https://fatcat.wiki/release/y3vhs26kvnglljkm4wldpwvcna)
Full text (Web Archive PDF): https://web.archive.org/web/20210714032502/https://arxiv.org/pdf/2107.04487v1.pdf

RRL: Resnet as representation for Reinforcement Learning [article]

Rutav Shah, Vikash Kumar
2021-11-11, arXiv, pre-print
The appeal of RRL lies in its simplicity in bringing together progress from the fields of Representation Learning, Imitation Learning, and Reinforcement Learning.  ...  Its effectiveness in learning behaviors directly from visual inputs with performance and sample efficiency matching learning directly from the state, even in complex high dimensional domains, is far from  ...  We demonstrate that features learned by image classification models are general towards different task (Figure 2 ), robust to visual distractors, and when used in conjunction with standard IL and RL pipelines  ... 
Identifiers: arXiv:2107.03380v3 (https://arxiv.org/abs/2107.03380v3); fatcat:bqv2dd3mk5csxcet7ptzz6q7xq (https://fatcat.wiki/release/bqv2dd3mk5csxcet7ptzz6q7xq)
Full text (Web Archive PDF, not primary version): https://web.archive.org/web/20210713170807/https://arxiv.org/pdf/2107.03380v2.pdf

CATCH: Context-based Meta Reinforcement Learning for Transferrable Architecture Search [article]

Xin Chen, Yawen Duan, Zewei Chen, Hang Xu, Zihao Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li
2020-07-22, arXiv, pre-print
It is also capable of handling cross-domain architecture search as competitive networks on ImageNet, COCO, and Cityscapes are identified.  ...  The combination of meta-learning and RL allows CATCH to efficiently adapt to new tasks while being agnostic to search spaces.  ...  As a task-agnostic transferrable NAS framework, CATCH possesses great potentials in scaling NAS to large datasets and various domains efficiently.  ... 
Identifiers: arXiv:2007.09380v3 (https://arxiv.org/abs/2007.09380v3); fatcat:3mqj2rjlgvaprpxr6knxlx5x44 (https://fatcat.wiki/release/3mqj2rjlgvaprpxr6knxlx5x44)
Full text (Web Archive PDF): https://web.archive.org/web/20200724011038/https://arxiv.org/pdf/2007.09380v3.pdf

Open Compound Domain Adaptation [article]

Ziwei Liu, Zhongqi Miao, Xingang Pan, Xiaohang Zhan, Dahua Lin, Stella X. Yu, Boqing Gong
2020-03-29, arXiv, pre-print
Our experiments on digit classification, facial expression recognition, semantic segmentation, and reinforcement learning demonstrate the effectiveness of our approach.  ...  ) a memory module to increase the model's agility towards novel domains.  ...  Domain generalization [52, 23, 22] and domain agnostic learning [39, 5] aim to learn universal representations that can be applied in a domain-invariant manner.  ... 
Identifiers: arXiv:1909.03403v2 (https://arxiv.org/abs/1909.03403v2); fatcat:ax627aerszhxnk2fwziugjwdjy (https://fatcat.wiki/release/ax627aerszhxnk2fwziugjwdjy)
Full text (Web Archive PDF): https://web.archive.org/web/20200401000440/https://arxiv.org/pdf/1909.03403v2.pdf

The Arcade Learning Environment: An Evaluation Platform for General Agents

M. G. Bellemare, Y. Naddaf, J. Veness, M. Bowling
2013-06-14, The Journal of Artificial Intelligence Research (AI Access Foundation)
We illustrate the promise of ALE by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning.  ...  ALE presents significant research challenges for reinforcement learning, model learning, model-based planning, imitation learning, transfer learning, and intrinsic motivation.  ...  Acknowledgments We would like to thank Marc Lanctot, Erik Talvitie, and Matthew Hausknecht for providing suggestions on helping debug and improving the Arcade Learning Environment source code.  ... 
Identifiers: doi:10.1613/jair.3912 (https://doi.org/10.1613/jair.3912); fatcat:yudan5ti4rdxtghbdtbswapnzm (https://fatcat.wiki/release/yudan5ti4rdxtghbdtbswapnzm)
Full text (Web Archive PDF): https://web.archive.org/web/20190429024550/https://jair.org/index.php/jair/article/download/10819/25823

Bootstrapping $Q$-Learning for Robotics From Neuro-Evolution Results

Matthieu Zimmer, Stephane Doncieux
2018, IEEE Transactions on Cognitive and Developmental Systems (Institute of Electrical and Electronics Engineers (IEEE))
Once this is done, the robot can apply reinforcement learning (1) to be more robust to new domains and, if required, (2) to learn faster than a direct policy search.  ...  , and then learning with an adapted representation to be faster and more robust.  ... 
Identifiers: doi:10.1109/tcds.2016.2628817 (https://doi.org/10.1109/tcds.2016.2628817); fatcat:afwp2m22wvagngylysgivbumci (https://fatcat.wiki/release/afwp2m22wvagngylysgivbumci)
Full text (Web Archive PDF): https://web.archive.org/web/20190503211448/https://hal.archives-ouvertes.fr/hal-01494744/document

Data Valuation using Reinforcement Learning [article]

Jinsung Yoon, Sercan O. Arik, Tomas Pfister
2019-09-25, arXiv, pre-print
Data valuation has multiple important use cases: (1) building insights about the learning task, (2) domain adaptation, (3) corrupted sample discovery, and (4) robust learning.  ...  The corrupted sample discovery performance of DVRL is close to optimal in many regimes (i.e. as if the noisy samples were known a priori), and for domain adaptation and robust learning DVRL significantly  ...  and robust learning.  ... 
Identifiers: arXiv:1909.11671v1 (https://arxiv.org/abs/1909.11671v1); fatcat:tltgkffp5vf77ig7njn4bnjojq (https://fatcat.wiki/release/tltgkffp5vf77ig7njn4bnjojq)
Full text (Web Archive PDF): https://web.archive.org/web/20200828032715/https://arxiv.org/pdf/1909.11671v1.pdf