940 Hits in 2.3 sec

Autonomous Driving Path Planning based on Sarsa-Dyna Algorithm

Aboul Ella Hassanien, Jennefer Mononteliza
2020-07-31 · Asia-pacific Journal of Convergent Research Interchange (Future Convergence Technology Research Society)
The Dyna framework in reinforcement learning can solve the problem of planning efficiency. ... The analysis of convergence speed and collision times has been done between the proposed Sarsa-Dyna, Q-learning, Sarsa and Dyna-Q algorithms. ... is better than the algorithm without the framework, which reflects the efficiency of planning. ...
doi:10.47116/apjcri.2020.07.06 · fatcat:rdo7ipgr7nac5pkctwsmnoctcy
Fulltext [PDF]: https://web.archive.org/web/20201106004114/http://fucos.or.kr/journal/APJCRI/Articles/v6n7/6.pdf

An Architectural Framework for Integrated Multiagent Planning, Reacting, and Learning [chapter]

Gerhard Weiß
2001 · Lecture Notes in Computer Science (Springer Berlin Heidelberg)
Dyna is a single-agent architectural framework that integrates learning, planning, and reacting. Well-known instantiations of Dyna are Dyna-AC and Dyna-Q. ... This extension, called M-Dyna-Q, constitutes a novel coordination framework that bridges the gap between plan-based and reactive coordination in multiagent systems. ... The research reported in this paper has been supported by Deutsche Forschungsgemeinschaft DFG (German National Science Foundation) under contract We1718/6-3. ...
doi:10.1007/3-540-44631-1_22 · fatcat:366lhu3gmbak3bxabctcc4nhc4
Fulltext [PDF]: https://web.archive.org/web/20170922010306/https://www7.in.tum.de/~weissg/Docs/weissg-atal00.pdf

Dyna-T: Dyna-Q and Upper Confidence Bounds Applied to Trees [article]

Tarek Faycal, Claudio Zito
2022-01-19 · arXiv (pre-print)
In this work we present a preliminary investigation of a novel algorithm called Dyna-T. In reinforcement learning (RL) a planning agent has its own representation of the environment as a model. ... Experience can be used for learning a better model or improve directly the value function and policy. ... Active RL and the role of policy in exploration: There are two approaches to learning within the MDP framework: active and passive. ...
arXiv:2201.04502v2 · fatcat:x6u3xryf4ba5hiz7vhe7u2c7ke
Fulltext [PDF]: https://web.archive.org/web/20220123145437/https://arxiv.org/pdf/2201.04502v2.pdf

RTMBA: A Real-Time Model-Based Reinforcement Learning Architecture for robot control

Todd Hester, Michael Quinlan, Peter Stone
2012 · IEEE International Conference on Robotics and Automation (IEEE)
In this paper, we present a novel parallel architecture for model-based RL that runs in real-time by 1) taking advantage of sample-based approximate planning methods and 2) parallelizing the acting, model learning, and planning processes in a novel way such that the acting process is sufficiently fast for typical robot control cycles. ... Our ongoing research agenda includes testing RTMBA on other robotic platforms, as well as testing other model learning and MCTS planning algorithms within the framework. ...
doi:10.1109/icra.2012.6225072 · dblp:conf/icra/HesterQS12 · fatcat:ifxetlh4qnaz5h6n5degashuxy
Fulltext [PDF]: https://web.archive.org/web/20131019150023/http://www.cs.utexas.edu/~pstone/Papers/bib2html-links/ICRA12-hester.pdf

Learning the structure of Factored Markov Decision Processes in reinforcement learning problems

Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin
2006 · Proceedings of the 23rd International Conference on Machine Learning - ICML '06 (ACM Press)
In this paper, we propose sdyna, a general framework for addressing large reinforcement learning problems by trial-and-error and with no initial knowledge of their structure. sdyna integrates incremental planning algorithms based on fmdps with supervised learning techniques building structured representations of the problem. ... Acknowledgement: Thanks to the anonymous referees for their suggestions. We also wish to thank Christophe Marsala and Vincent Corruble for useful discussions. ...
doi:10.1145/1143844.1143877 · dblp:conf/icml/DegrisSW06 · fatcat:mfvy3wbddjfmnkj2qh64u3qzcm
Fulltext [PDF]: https://web.archive.org/web/20170810141140/http://machinelearning.wustl.edu/mlpapers/paper_files/icml2006_DegrisSW06.pdf

Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains

Yangchen Pan, Muhammad Zaheer, Adam White, Andrew Patterson, Martha White
2018 · Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (International Joint Conferences on Artificial Intelligence Organization)
Dyna is a planning paradigm that naturally interleaves learning and planning, by simulating one-step experience to update the action-value function. ... This elegant planning strategy has been mostly explored in the tabular setting. The aim of this paper is to revisit sample-based planning, in stochastic and continuous domains with learned models. ... We highlight criteria for learned models used within Dyna, and propose Reweighted Experience Models (REMs) that are data-efficient, efficient to sample and can be learned incrementally. ...
doi:10.24963/ijcai.2018/666 · dblp:conf/ijcai/PanZWPW18 · fatcat:evdqx3572fbfdnbckh4s555wjy
Fulltext [PDF]: https://web.archive.org/web/20190429122940/https://www.ijcai.org/proceedings/2018/0666.pdf
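Several of the entries in this list describe the same core loop: tabular Dyna-Q learns from each real transition, stores it in a one-step model, and then replays simulated transitions from that model to update the action-value function. The sketch below is a generic tabular Dyna-Q on a toy deterministic chain MDP of my own construction; the environment, hyperparameters, and helper names are illustrative assumptions, not taken from any listed paper.

```python
import random
from collections import defaultdict

def dyna_q(step, n_actions, episodes=50, planning_steps=10,
           alpha=0.1, gamma=0.95, epsilon=0.1, start=0):
    """Tabular Dyna-Q: Q-learning on real transitions, plus extra
    Q-learning updates replayed from a learned one-step model."""
    Q = defaultdict(float)   # (state, action) -> estimated return
    model = {}               # (state, action) -> (reward, next_state, done)

    def greedy(s):
        # break ties randomly so the untrained agent still explores
        vals = [Q[(s, a)] for a in range(n_actions)]
        best = max(vals)
        return random.choice([a for a, v in enumerate(vals) if v == best])

    def backup(s, a, r, s2, done):
        target = r if done else r + gamma * max(Q[(s2, a2)] for a2 in range(n_actions))
        Q[(s, a)] += alpha * (target - Q[(s, a)])

    for _ in range(episodes):
        s, done = start, False
        while not done:
            a = random.randrange(n_actions) if random.random() < epsilon else greedy(s)
            r, s2, done = step(s, a)          # real experience
            backup(s, a, r, s2, done)         # direct RL update
            model[(s, a)] = (r, s2, done)     # deterministic model learning
            for _ in range(planning_steps):   # planning: replay simulated steps
                (ps, pa), (pr, ps2, pdone) = random.choice(list(model.items()))
                backup(ps, pa, pr, ps2, pdone)
            s = s2
    return Q

# Toy deterministic 5-state chain: action 1 moves right, action 0 moves left;
# entering state 4 yields reward 1 and ends the episode.
def chain_step(s, a):
    s2 = min(s + 1, 4) if a == 1 else max(s - 1, 0)
    return (1.0, s2, True) if s2 == 4 else (0.0, s2, False)

random.seed(0)
Q = dyna_q(chain_step, n_actions=2)
```

The `planning_steps` parameter is the knob the listed papers vary: with it set to 0 this reduces to plain Q-learning, and increasing it trades computation for sample efficiency.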

Switch-based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning [article]

Yuexin Wu, Xiujun Li, Jingjing Liu, Jianfeng Gao, Yiming Yang
2018-11-19 · arXiv (pre-print)
Our results show that by combining switcher and active learning, the new framework, named Switch-based Active Deep Dyna-Q (Switch-DDQ), leads to significant improvement over DDQ and Q-learning baselines. ... The Dyna-Q algorithm extends Q-learning by integrating a world model, and thus can effectively boost training efficiency using simulated experiences generated by the world model. ... Acknowledgement: We thank the reviewers for their helpful comments, and we would like to acknowledge the volunteers for helping us with the human experiments. ...
arXiv:1811.07550v1 · fatcat:kocv3bf2trenhonffnqllsxuoi
Fulltext [PDF]: https://web.archive.org/web/20200824224744/https://arxiv.org/pdf/1811.07550v1.pdf

Organizing Experience: A Deeper Look at Replay Mechanisms for Sample-based Planning in Continuous State Domains [article]

Yangchen Pan, Muhammad Zaheer, Adam White, Andrew Patterson, Martha White
2018-06-12 · arXiv (pre-print)
Dyna is a planning paradigm that naturally interleaves learning and planning, by simulating one-step experience to update the action-value function. ... This elegant planning strategy has been mostly explored in the tabular setting. The aim of this paper is to revisit sample-based planning, in stochastic and continuous domains with learned models. ... Efficient Learning and Planning Within the Dyna Framework. Adaptive Behavior, 1993. [Pires and Szepesvári, 2016] Bernardo Avila Pires and Csaba Szepesvári. ...
arXiv:1806.04624v1 · fatcat:mvpvb3l6wjhs7dw4uc2bcie5zi
Fulltext [PDF]: https://web.archive.org/web/20191022100725/https://arxiv.org/pdf/1806.04624v1.pdf

Page 2917 of Psychological Abstracts Vol. 82, Issue 6 [page]

1995 · Psychological Abstracts (American Psychological Association)
(Northeastern U, Coll of Computer Science, Boston, MA) Efficient learning and planning within the Dyna framework. Adaptive Behavior, 1993(Spr), Vol 1(4), 437-454. ... Examines the incremental Dyna reinforcement learning algorithm for machine learning (R. S. ...
Fulltext [Microfilm]: https://archive.org/details/sim_psychological-abstracts_1995-06_82_6/page/2917

Evaluating techniques for learning a feedback controller for low-cost manipulators

Oliver M. Cliff, Sildomar T. Monteiro
2013 · IEEE/RSJ International Conference on Intelligent Robots and Systems (IEEE)
Learning algorithms to control robotic arms have introduced elegant solutions to the complexities faced in such systems. ... The agents were tested in a simulated domain for learning closed-loop policies of a simple task with no prior information. ... We would also like to extend our gratitude to Marc Deisenroth for supplying the PILCO framework source code and ongoing assistance with its implementation; and Todd Hester for his RL package in ROS and ...
doi:10.1109/iros.2013.6696428 · dblp:conf/iros/CliffSM13 · fatcat:i3krcmnf5zfqnomketugis2rsm
Fulltext [PDF]: https://web.archive.org/web/20170812071816/http://vigir.missouri.edu/~gdesouza/Research/Conference_CDs/IEEE_IROS_2013/media/files/2718.pdf

Guiding Robot Exploration in Reinforcement Learning via Automated Planning [article]

Yohei Hayamizu, Saeid Amiri, Kishan Chandan, Keiki Takadama, Shiqi Zhang
2021-03-16 · arXiv (pre-print)
Focusing on improving RL agents' learning efficiency, we develop Guided Dyna-Q (GDQ) to enable RL agents to reason with action knowledge to avoid exploring less-relevant states. ... Reinforcement learning (RL) enables an agent to learn from trial-and-error experiences toward achieving long-term goals; automated planning aims to compute plans for accomplishing tasks using action knowledge. ... AIR research is supported in part by grants from the National Science Foundation (NRI-1925044), Ford Motor Company (URP Awards 2019 and 2020), OPPO (Faculty Research Award 2020), and SUNY Research Foundation. ...
arXiv:2004.11456v2 · fatcat:jehwm6urbvdcxiedrsnpgehdzm
Fulltext [PDF]: https://web.archive.org/web/20210319192959/https://arxiv.org/pdf/2004.11456v2.pdf

Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning [article]

Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Kam-Fai Wong, Shang-Yu Su
2018-05-23 · arXiv (pre-print)
To address these issues, we present Deep Dyna-Q, which to our knowledge is the first deep RL framework that integrates planning for task-completion dialogue policy learning. ... During dialogue policy learning, the world model is constantly updated with real user experience to approach real user behavior, and in turn, the dialogue agent is optimized using both real experience ... We would like to acknowledge the volunteers from Microsoft Research for helping us with the human-in-the-loop experiments. This work was done when Baolin Peng and Shang-Yu Su were visiting Microsoft. ...
arXiv:1801.06176v3 · fatcat:o2xfwqirmzeazizjxlii66uwkm
Fulltext [PDF]: https://web.archive.org/web/20200930071755/https://arxiv.org/pdf/1801.06176v3.pdf

A Real-Time Model-Based Reinforcement Learning Architecture for Robot Control [article]

Todd Hester, Michael Quinlan, Peter Stone
2011-05-21 · arXiv (pre-print)
In this paper, we present a novel parallel architecture for model-based RL that runs in real-time by 1) taking advantage of sample-based approximate planning methods and 2) parallelizing the acting, model learning, and planning processes such that the acting process is sufficiently fast for typical robot control cycles. ... Acknowledgments: This work has taken place in the Learning Agents Research Group (LARG) at the Artificial Intelligence Laboratory, The University of Texas at Austin. ...
arXiv:1105.1749v2 · fatcat:7j7dnqntt5ac5p7oej5vk5obxa
Fulltext [PDF]: https://archive.org/download/arxiv-1105.1749/1105.1749.pdf

Switch-Based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning

Yuexin Wu, Xiujun Li, Jingjing Liu, Jianfeng Gao, Yiming Yang
2019-07-17 · Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference (AAAI)
Our results show that by combining switcher and active learning, the new framework, named Switch-based Active Deep Dyna-Q (Switch-DDQ), leads to significant improvement over DDQ and Q-learning baselines. ... The Dyna-Q algorithm extends Q-learning by integrating a world model, and thus can effectively boost training efficiency using simulated experiences generated by the world model. ... Acknowledgement: We thank the reviewers for their helpful comments, and we would like to acknowledge the volunteers for helping us with the human experiments. ...
doi:10.1609/aaai.v33i01.33017289 · fatcat:im24jsofcbg4jentrbcx2awsdu
Fulltext [PDF]: https://web.archive.org/web/20220304103601/https://ojs.aaai.org/index.php/AAAI/article/download/4715/4593

TeXDYNA: Hierarchical Reinforcement Learning in Factored MDPs [chapter]

Olga Kozlova, Olivier Sigaud, Christophe Meyer
2010 · Lecture Notes in Computer Science (Springer Berlin Heidelberg)
Reinforcement learning is one of the main adaptive mechanisms that is both well documented in animal behaviour and giving rise to computational studies in animats and robots. ... Learning and factorization techniques of Factored Reinforcement Learning. ... This framework is built on three main ideas: the use of the transition function structure, represented as decision trees, to discover options results in efficient learning and planning capabilities that ...
doi:10.1007/978-3-642-15193-4_46 · fatcat:5moziwlpdnantdq4dp5qim4k34
Fulltext [PDF]: https://web.archive.org/web/20170922223947/http://www.isir.upmc.fr/files/sabpaper.pdf
Showing results 1-15 of 940.