Memory Bounded Open-Loop Planning in Large POMDPs Using Thompson Sampling

Thomy Phan, Lenz Belzner, Marie Kiermeier, Markus Friedrich, Kyrill Schmid, Claudia Linnhoff-Popien
<span title="2019-07-17">2019</span> <i title="Association for the Advancement of Artificial Intelligence (AAAI)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wtjcymhabjantmdtuptkk62mlq" style="color: black;">PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE</a> </i> &nbsp;
In this paper, we propose Partially Observable Stacked Thompson Sampling (POSTS), a memory bounded approach to open-loop planning in large POMDPs, which optimizes a fixed size stack of Thompson Sampling  ...  We show that POSTS achieves competitive performance compared to tree-based open-loop planning and offers a performance-memory tradeoff, making it suitable for partially observable planning with highly restricted  ...  Conclusion and Future Work In this paper, we proposed Partially Observable Stacked Thompson Sampling (POSTS), a memory bounded approach to open-loop planning in large POMDPs, which optimizes a fixed size  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1609/aaai.v33i01.33017941">doi:10.1609/aaai.v33i01.33017941</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ovikgxwjzbamjafunmbl4l2mwm">fatcat:ovikgxwjzbamjafunmbl4l2mwm</a> </span>

Memory Bounded Open-Loop Planning in Large POMDPs using Thompson Sampling [article]

Thomy Phan, Lenz Belzner, Marie Kiermeier, Markus Friedrich, Kyrill Schmid, Claudia Linnhoff-Popien
<span title="2019-05-10">2019</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
In this paper, we propose Partially Observable Stacked Thompson Sampling (POSTS), a memory bounded approach to open-loop planning in large POMDPs, which optimizes a fixed size stack of Thompson Sampling  ...  We show that POSTS achieves competitive performance compared to tree-based open-loop planning and offers a performance-memory tradeoff, making it suitable for partially observable planning with highly  ...  Conclusion and Future Work In this paper, we proposed Partially Observable Stacked Thompson Sampling (POSTS), a memory bounded approach to open-loop planning in large POMDPs, which optimizes a fixed size  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1905.04020v1">arXiv:1905.04020v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/fjxh6p3d45hptgmj7ysr5qmpum">fatcat:fjxh6p3d45hptgmj7ysr5qmpum</a> </span>

Universal Reinforcement Learning Algorithms: Survey and Experiments [article]

John Aslanides, Jan Leike, Marcus Hutter
<span title="2017-05-30">2017</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
The universal Bayesian agent AIXI and a family of related URL algorithms have been developed in this setting.  ...  We also present an open-source reference implementation of the algorithms which we hope will facilitate further understanding of, and experimentation with, these ideas.  ...  Acknowledgements We wish to thank Sean Lamont for his assistance in developing the gridworld visualizations used in Figures 1 and 4.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1705.10557v1">arXiv:1705.10557v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/aptsmnq6ajdpvobxerzqlisr3m">fatcat:aptsmnq6ajdpvobxerzqlisr3m</a> </span>

A Partially Observable MDP Approach for Sequential Testing for Infectious Diseases such as COVID-19 [article]

Rahul Singh, Fang Liu, Ness B. Shroff
<span title="2020-07-25">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We investigate fundamental performance bounds, and ensure that our solution is robust to errors in the input graph as well as in the tests themselves.  ...  Countries that have been more successful in corralling the virus typically use a "test, treat, trace, test" strategy that begins with testing individuals with symptoms, traces contacts of positively tested  ...  Open-Loop Policy π_0: At time t = 0 the user picks T nodes out of N nodes, arranges them in some order and decides to sample them according to this order.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2007.13023v1">arXiv:2007.13023v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/h2zawjqvlzhu7o2pgarjwqofbe">fatcat:h2zawjqvlzhu7o2pgarjwqofbe</a> </span>

Proactive Action Preparation: Seeing Action Preparation as a Continuous and Proactive Process

Giovanni Pezzulo, Dimitri Ognibene
<span title="">2012</span> <i title="Human Kinetics"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2dx5kr7fa5hgzk4qv7wiuquirq" style="color: black;">Motor Control</a> </i> &nbsp;
Specifically, we discuss how prior knowledge and prospective abilities can be used to maximize utility even before deciding what to do.  ...  In this paper, we aim to elucidate the processes that occur during action preparation from both a conceptual and a computational point of view.  ...  Alternative to the idea of fast feedback loops is the proposal that motor execution is delegated to open-loop motor primitives (Flash & Hochner, 2005).  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1123/mcj.16.3.386">doi:10.1123/mcj.16.3.386</a> <a target="_blank" rel="external noopener" href="https://www.ncbi.nlm.nih.gov/pubmed/22643383">pmid:22643383</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/5vbiss5fpnhqhdgncrmz3utulq">fatcat:5vbiss5fpnhqhdgncrmz3utulq</a> </span>

Bayesian Reinforcement Learning: A Survey

Mohammed Ghavamzadeh, Shie Mannor, Joelle Pineau, Aviv Tamar
<span title="">2015</span> <i title="Now Publishers"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/ka2h7lkphrfvjlabybgqbnn2jq" style="color: black;">Foundations and Trends® in Machine Learning</a> </i> &nbsp;
In this survey, we provide an in-depth review of the role of Bayesian methods for the reinforcement learning (RL) paradigm.  ...  The major incentives for incorporating Bayesian reasoning in RL are: 1) it provides an elegant approach to action-selection (exploration/exploitation) as a function of the uncertainty in learning; and  ...  In the bandit case (single-step planning horizon), this method is in fact equivalent to Thompson sampling.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1561/2200000049">doi:10.1561/2200000049</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/xrgut7tqjbf5le7h5otjwcwkry">fatcat:xrgut7tqjbf5le7h5otjwcwkry</a> </span>

Sampling-based robotic information gathering algorithms

Geoffrey A. Hollinger, Gaurav S. Sukhatme
<span title="2014-06-27">2014</span> <i title="SAGE Publications"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/uhsvnr5ecvb4die3422lvgaz6q" style="color: black;">The international journal of robotics research</a> </i> &nbsp;
Our proposed rapidly-exploring information gathering (RIG) algorithms combine ideas from sampling-based motion planning with branch and bound techniques to achieve efficient information gathering in continuous  ...  We propose three sampling-based motion planning algorithms for generating informative mobile robot trajectories.  ...  Sampling-based approaches have been applied to POMDPs in the past (Thrun, 1999).  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1177/0278364914533443">doi:10.1177/0278364914533443</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/nkonq4d6bffytcqqo5ofmy2mua">fatcat:nkonq4d6bffytcqqo5ofmy2mua</a> </span>

AIXIjs: A Software Demo for General Reinforcement Learning [article]

John Aslanides
<span title="2017-05-22">2017</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Many of the obstacles and open questions are conceptual: What does it mean to be intelligent? How does one explore and learn optimally in general, unknown environments?  ...  sampling (Leike et al., 2016), and optimism (Sunehag and Hutter, 2015).  ...  Thompson sampling is asymptotically optimal in mean in general environments. 1: t ← 1; 2: loop; 3: sample ρ ∼ w(· | æ_{&lt;t}); 4: d ← H_t(ε_t); 5: for i = 1 → d do; 6: act π_ρ; 7: end for; 8: end loop. So much  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1705.07615v1">arXiv:1705.07615v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/hu5axpkgzrcdrijqetf6pmkjua">fatcat:hu5axpkgzrcdrijqetf6pmkjua</a> </span>

Leveraging Statistical Multi-Agent Online Planning with Emergent Value Function Approximation [article]

Thomy Phan, Lenz Belzner, Thomas Gabor, Kyrill Schmid
<span title="2018-04-17">2018</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Many online planning algorithms rely on statistical sampling to avoid searching the whole state space, while still being able to make acceptable decisions.  ...  In this paper, we propose Emergent Value function Approximation for Distributed Environments (EVADE), an approach to integrate global experience into multi-agent online planning in stochastic domains to  ...  An approach to open-loop planning in MAS is proposed in [Belzner and Gabor, 2017a].  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1804.06311v1">arXiv:1804.06311v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/alsd3274mrfgndl3qhbtqnl6ne">fatcat:alsd3274mrfgndl3qhbtqnl6ne</a> </span>

Model-based Reinforcement Learning: A Survey [article]

Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker
<span title="2022-03-31">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
, and how to integrate planning in the learning and acting loop.  ...  Sequential decision making, commonly formalized as Markov Decision Process (MDP) optimization, is an important challenge in artificial intelligence.  ...  Much theoretical work tries to quantify the rate at which algorithms converge, which we can largely split up in sample complexity bounds (PAC bounds) and regret bounds.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2006.16712v4">arXiv:2006.16712v4</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/qyb4auoqovdeji4ov65sv6f3fq">fatcat:qyb4auoqovdeji4ov65sv6f3fq</a> </span>

Reinforcement Learning in Practice: Opportunities and Challenges [article]

Yuxi Li
<span title="2022-04-22">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Then we discuss challenges, in particular, 1) foundation, 2) representation, 3) reward, 4) exploration, 5) model, simulation, planning, and benchmarks, 6) off-policy/offline learning, 7) learning to learn  ...  We conclude with a discussion, attempting to answer: "Why has RL not been widely adopted in practice yet?" and "When is RL helpful?".  ...  agent's memory of the observed space are used in the action selection).  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2202.11296v2">arXiv:2202.11296v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/xdtsmme22rfpfn6rgfotcspnhy">fatcat:xdtsmme22rfpfn6rgfotcspnhy</a> </span>

Deep Reinforcement Learning [article]

Yuxi Li
<span title="2018-10-15">2018</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We discuss deep reinforcement learning in an overview style. We draw a big picture, filled with details.  ...  Then we discuss important mechanisms for RL, including attention and memory, unsupervised learning, hierarchical RL, multi-agent RL, relational RL, and learning to learn.  ...  distributions using empirical game-theoretic analysis.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1810.06339v1">arXiv:1810.06339v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/kp7atz5pdbeqta352e6b3nmuhy">fatcat:kp7atz5pdbeqta352e6b3nmuhy</a> </span>

Universal Artificial Intelligence [chapter]

Tom Everitt, Marcus Hutter
<span title="">2018</span> <i title="Springer International Publishing"> Foundations of Trusted Autonomy </i> &nbsp;
Artificial intelligence (AI) bears the promise of making us all healthier, wealthier, and happier by reducing the need for human labour and by vastly increasing our scientific and technological progress  ...  Since the inception of the AI research field in the mid-twentieth century, a range of practical and theoretical approaches have been investigated.  ...  Partially observable MDPs (POMDPs) [35] is another popular approach. However, the learning of POMDPs is still an open question.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-319-64816-3_2">doi:10.1007/978-3-319-64816-3_2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/pvbspss75bcftktbrhbyjozyom">fatcat:pvbspss75bcftktbrhbyjozyom</a> </span>

A Gentle Introduction to Reinforcement Learning and its Application in Different Fields

Muddasar Naeem, S. Tahir H. Rizvi, Antonio Coronato
<span title="">2020</span> <i title="Institute of Electrical and Electronics Engineers (IEEE)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/q7qi7j4ckfac7ehf3mjbso4hne" style="color: black;">IEEE Access</a> </i> &nbsp;
Myopic value of information [17], policy gradient, POMDP discretization, upper confidence bound, Bayesian sparse sampling, BEETLE and Thompson sampling [22] are some of the famous methods that are  ...  not fully covered in used samples.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/access.2020.3038605">doi:10.1109/access.2020.3038605</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/febm7kz525adpcvkfmnim2yha4">fatcat:febm7kz525adpcvkfmnim2yha4</a> </span>

Nonparametric General Reinforcement Learning [article]

Jan Leike
<span title="2016-11-28">2016</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Hence Thompson sampling achieves sublinear regret in these environments.  ...  We construct a large but limit computable class containing a grain of truth and show that agents based on Thompson sampling over this class converge to play Nash equilibria in arbitrary unknown computable  ...  Thompson Sampling In this section we prove that the Thompson sampling policy defined in Section 4.3.4 is asymptotically optimal.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1611.08944v1">arXiv:1611.08944v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/qaagmvpfbfecdessxa65b7d7n4">fatcat:qaagmvpfbfecdessxa65b7d7n4</a> </span>
Showing results 1 &mdash; 15 out of 65 results