Machine Discovery of Comprehensible Strategies for Simple Games Using Meta-interpretive Learning

Stephen H. Muggleton, Celine Hocquette
<span title="2019-04-25">2019</span> <i title="Springer Nature"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/tg7qlmhqujdc7af72ot22mh7y4" style="color: black;">New generation computing</a> </i> &nbsp;
We use these games to compare Cumulative Minimax Regret for variants of both standard and deep reinforcement learning against two variants of a new Meta-interpretive Learning system called MIGO.  ...  One advantage of considering simple games is that there is a tractable approach to calculating minimax regret.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s00354-019-00054-2">doi:10.1007/s00354-019-00054-2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/5ne62g4hljc47nlorlmf4h6zgy">fatcat:5ne62g4hljc47nlorlmf4h6zgy</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200310145427/https://link.springer.com/content/pdf/10.1007%2Fs00354-019-00054-2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/aa/e2/aae274ab86bf74aa7db7499407f061e16857a4c1.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s00354-019-00054-2"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> springer.com </button> </a>

Can Meta-Interpretive Learning outperform Deep Reinforcement Learning of Evaluable Game strategies? [article]

Céline Hocquette, Stephen H. Muggleton
<span title="2019-02-26">2019</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We use these games to compare Cumulative Minimax Regret for variants of both standard and deep reinforcement learning against two variants of a new Meta-Interpretive Learning system called MIGO.  ...  However, owing to tractability considerations minimax regret of a learning system cannot be evaluated in such games.  ...  For instance, MIGO would first learn a simple definition of win 1/1 for winning in one move.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1902.09835v1">arXiv:1902.09835v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/xe24mjababbkbibpsrssjhmnte">fatcat:xe24mjababbkbibpsrssjhmnte</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200930093132/https://arxiv.org/pdf/1902.09835v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/a0/b4/a0b486978a613c2c99f7c9d5b247d212701c7322.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1902.09835v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

CI in General Game Playing - To Date Achievements and Perspectives [chapter]

Karol Walȩdzik, Jacek Mańdziuk
<span title="">2010</span> <i title="Springer Berlin Heidelberg"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2w3awgokqne6te4nvlofavy5a4" style="color: black;">Lecture Notes in Computer Science</a> </i> &nbsp;
In this paper, we concentrate on the General Game Playing Competition which defines a universal game description language and acts as a framework for comparison of various approaches to the problem.  ...  Multigame playing agents are programs capable of autonomously learning to play new, previously unknown games.  ...  In the simplest approach, game agent could attempt to store metaparameters that proved most successful in evaluation function learning for each game and attempt to select one of those sets whenever a new  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-642-13232-2_82">doi:10.1007/978-3-642-13232-2_82</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/rlkz4ysvfzc4vncolu2rzjhfia">fatcat:rlkz4ysvfzc4vncolu2rzjhfia</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20180820125444/http://www.mini.pw.edu.pl/~mandziuk/PRACE/ICAISC10.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/c4/75/c475e351f8b90e4cf268816a263d8106132e46e1.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-642-13232-2_82"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> springer.com </button> </a>

A Survey of Game Theoretic Approaches for Adversarial Machine Learning in Cybersecurity Tasks

Prithviraj Dasgupta, Joseph Collins
<span title="2019-06-24">2019</span> <i title="Association for the Advancement of Artificial Intelligence (AAAI)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/27wksbinzzhjfow2wuy6m2iefm" style="color: black;">The AI Magazine</a> </i> &nbsp;
We also discuss open problems and challenges and possible directions for further research that would make deep machine learning-based systems more robust and reliable for cybersecurity tasks.  ...  This article provides a detailed survey of the state-of-the-art techniques that are used to make a machine learning algorithm robust against adversarial attacks by using the computational framework of game  ...  Acknowledgments The authors would like to acknowledge support from the US Office of Naval Research Summer Faculty Research program for supporting the work of Prithviraj Dasgupta at the US Naval Research  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1609/aimag.v40i2.2847">doi:10.1609/aimag.v40i2.2847</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/aptetzccqfcwpcszm6s4kj7vtu">fatcat:aptetzccqfcwpcszm6s4kj7vtu</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210427102103/https://ojs.aaai.org/index.php/aimagazine/article/download/2847/3418" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/ad/96/ad96b5efc8a61b39da77c7770ad984b196b0284b.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1609/aimag.v40i2.2847"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> Publisher / doi.org </button> </a>

Accelerating Multiagent Reinforcement Learning by Equilibrium Transfer

Yujing Hu, Yang Gao, Bo An
<span title="">2015</span> <i title="Institute of Electrical and Electronics Engineers (IEEE)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/snhjqrgxbff5teva5lfasxmfr4" style="color: black;">IEEE Transactions on Cybernetics</a> </i> &nbsp;
By introducing transfer loss and transfer condition, a novel framework called equilibrium transfer-based MARL is proposed.  ...  For the first time, this paper finds that during the learning process of equilibrium-based MARL, the one-shot games corresponding to each state's successive visits often have the same or similar equilibria  ...  Minimax-Q [10] is widely considered as the first equilibrium-based MARL algorithm, which uses a minimax rule for action selection and updating value functions.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/tcyb.2014.2349152">doi:10.1109/tcyb.2014.2349152</a> <a target="_blank" rel="external noopener" href="https://www.ncbi.nlm.nih.gov/pubmed/25181517">pmid:25181517</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/umqzftx37naytcg4ynbemredli">fatcat:umqzftx37naytcg4ynbemredli</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170829151407/http://www.ntu.edu.sg/home/boan/papers/TC14-transfer.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/54/b4/54b4da5c5b2a5fd91a9a4d657e83618ada1c2c88.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/tcyb.2014.2349152"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

GTDM-CSAT: an LTE-U self Coexistence Solution based on Game Theory and Reinforcement Learning

Pedro Santana, José Neto, Fuad Abinader Jr., Vicente Sousa Jr.
<span title="">2019</span> <i title="Sociedad Brasileira de Telecomunicacoes"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/a2clstunbbdoxk4u5ffodjz2yy" style="color: black;">Journal of Communication and Information Systems</a> </i> &nbsp;
The solution for the best ON-OFF time ratio is defined by applying a modified Minimax Q-learning algorithm for finding the game equilibrium.  ...  For this, we formulate the problem as a Markovian game, where the LTE-U operators coexist on a two-zero-sum game.  ...  Littman proposes a new algorithm that widens Q-Learning technique for solving stochastic games, specifically two-player zero-sum stochastic games, called Minimax Q-Learning.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.14209/jcis.2019.17">doi:10.14209/jcis.2019.17</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/iroq76jvcnc65gfxjywb4wssra">fatcat:iroq76jvcnc65gfxjywb4wssra</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200307080217/https://jcis.sbrt.org.br/jcis/article/download/655/463" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/ae/8c/ae8cca3bc3838f72b5ffafd9a8c66320b56f69db.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.14209/jcis.2019.17"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> Publisher / doi.org </button> </a>

Robust Reinforcement Learning using Adversarial Populations [article]

Eugene Vinitsky, Yuqing Du, Kanaad Parvate, Kathy Jang, Pieter Abbeel, Alexandre Bayen
<span title="2020-09-22">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
The Robust RL formulation tackles this by adding worst-case adversarial noise to the dynamics and constructing the noise distribution as the solution to a zero-sum minimax game.  ...  Reinforcement Learning (RL) is an effective tool for controller design but can struggle with issues of robustness, failing catastrophically when the underlying system dynamics are perturbed.  ...  Acknowledgments The authors would like to thank Lerrel Pinto for help understanding and reproducing "Robust Adversarial Reinforcement Learning" as well as insightful discussions of our problem.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2008.01825v2">arXiv:2008.01825v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/xa6n2sf7cffwvlwsnaivz42quq">fatcat:xa6n2sf7cffwvlwsnaivz42quq</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200925001926/https://arxiv.org/pdf/2008.01825v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2008.01825v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Replay-Guided Adversarial Environment Design [article]

Minqi Jiang, Michael Dennis, Jack Parker-Holder, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel
<span title="2022-01-13">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
This connection allows us to develop novel theory for PLR, providing a version with a robustness guarantee at Nash equilibria.  ...  Indeed, our experiments confirm that our new method, PLR^⊥, obtains better results on a suite of out-of-distribution, zero-shot transfer tasks, in addition to demonstrating that PLR^⊥ improves the performance  ...  Further, we are grateful to our anonymous reviewers for their valuable feedback. MJ is supported by the FAIR PhD program. This work was funded by Facebook.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2110.02439v2">arXiv:2110.02439v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/mdqyigkfwrd7dk5mhtzesxc3ba">fatcat:mdqyigkfwrd7dk5mhtzesxc3ba</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20211011130038/https://arxiv.org/pdf/2110.02439v1.pdf" title="fulltext PDF download [not primary version]" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <span style="color: #f43e3e;">&#10033;</span> <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/d6/3f/d63fb9c8cf9339f70a1ea4e2074ec0a0264cdbcd.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2110.02439v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Strategy Selection for Moving Target Defense in Incomplete Information Game

Huan Zhang, Kangfeng Zheng, Xiujuan Wang, Shoushan Luo, Bin Wu
<span title="">2019</span> <i title="Computers, Materials and Continua (Tech Science Press)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/amujz7fcqna6do727z6ev3ueo4" style="color: black;">Computers Materials &amp; Continua</a> </i> &nbsp;
Moreover, the performances of the Minimax-Q learning algorithm and Naive-Q learning algorithm were compared and analyzed in the MTD environment.  ...  Thus, the selection of an optimal defense strategy based on MTD has become the focus of research.  ...  Acknowledgement: Thanks for the valuable review comments of every expert and editor.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.32604/cmc.2020.06553">doi:10.32604/cmc.2020.06553</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/kwnsktbnqnewzedhvdlfw7bmve">fatcat:kwnsktbnqnewzedhvdlfw7bmve</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200709193515/https://www.techscience.com/cmc/v62n2/38275/pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/e3/22/e3229f16777a42436ca5dbf246648e0be27e315c.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.32604/cmc.2020.06553"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> Publisher / doi.org </button> </a>

Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design [article]

Michael Dennis, Natasha Jaques, Eugene Vinitsky, Alexandre Bayen, Stuart Russell, Andrew Critch, Sergey Levine
<span title="2021-02-04">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
A wide range of reinforcement learning (RL) problems - including robustness, transfer learning, unsupervised RL, and emergent complexity - require specifying a distribution of tasks or environments in  ...  To generate structured, solvable environments for our protagonist agent, we introduce a second, antagonist agent that is allied with the environment-generating adversary.  ...  We are grateful for funding of this work as a gift from the Berkeley Existential Risk Initiative. We are also grateful to Google Research for funding computation expenses associated with this work.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2012.02096v2">arXiv:2012.02096v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/dz7hfau6dfdszllldix6ldc3ga">fatcat:dz7hfau6dfdszllldix6ldc3ga</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20201206042424/https://arxiv.org/pdf/2012.02096v1.pdf" title="fulltext PDF download [not primary version]" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <span style="color: #f43e3e;">&#10033;</span> <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/93/b2/93b2788fb1f2aed0e545d9f9d7dca1c05a63208a.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2012.02096v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Frontmatter

Simon M. Lucas, Michael Mateas, Mike Preuss, Pieter Spronck, Julian Togelius, Michael Wagner
<span title="2013-11-12">2013</span> <i > <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/mc2rq5incjf43fsp54ib3idijy" style="color: black;">Dagstuhl Publications</a> </i> &nbsp;
have been highly successful in a wide range of application areas, to address a broad range of problems arising in video games.  ...  For simplicity, we will consider primarily the objective of maximising playing strength, and consider games where this is a challenging task, which results in interesting gameplay.  ...  range of leading researchers in AI for search and abstraction in games.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.4230/dfu.vol6.12191.i">doi:10.4230/dfu.vol6.12191.i</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/dagstuhl/X13b.html">dblp:conf/dagstuhl/X13b</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/i4isdb5w4fastcbnczbtsdulkm">fatcat:i4isdb5w4fastcbnczbtsdulkm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20160624050159/https://skatgame.net/mburo/ps/DFU6-chapter1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/22/79/22792d6a1f35769b49632db846294e7bdf129486.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.4230/dfu.vol6.12191.i"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> Publisher / doi.org </button> </a>

The Effects of Cultural Learning in Populations of Neural Networks

Dara Curran, Colm O'Riordan
<span title="">2007</span> <i title="MIT Press - Journals"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/trtzv54dg5e5lpy55iml6bgv6u" style="color: black;">Artificial Life</a> </i> &nbsp;
Population learning can be described as the iterative Darwinian process of fitness-based selection and genetic transfer of information leading to populations of higher fitness and is often simulated using  ...  Our model explores the effect of a cultural learning on a population and employs three benchmark sequential decision tasks as the evolutionary task for the population: connect-four, tic-tac-toe and blackjack  ...  Acknowledgements We wish to thank the reviewers for their many constructive comments and suggestions.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1162/artl.2007.13.1.45">doi:10.1162/artl.2007.13.1.45</a> <a target="_blank" rel="external noopener" href="https://www.ncbi.nlm.nih.gov/pubmed/17204012">pmid:17204012</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/7hmkkgi6wfgr3fkgmr6n7fsykm">fatcat:7hmkkgi6wfgr3fkgmr6n7fsykm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20121115060306/http://www.cs.ucc.ie:80/~dc17/pubs/CurranALIFE2006.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/8d/16/8d1602af47a3b0f5c5eb118bb3803710db684453.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1162/artl.2007.13.1.45"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> mitpressjournals.org </button> </a>

Cooperative reinforcement learning based on zero-sum games

Kao-Shing Hwang, Jeng-Yih Chiou, Tse-Yu Chen
<span title="">2008</span> <i title="IEEE"> 2008 SICE Annual Conference </i> &nbsp;
The Q(λ)-learning is a modified Q-learning method with the eligibility skill, and the minimax-Q further combines Q-learning with a simple game theory.  ...  These robots carrying out tasks of a helper or defender improve action policy at play based on Q-learning inspired by the game theory.  ...  Cooperative Reinforcement Learning Based on Zero-Sum Games, Mobile Robots - Control Architectures, Bio-Interfacing, Navigation, Multi Robot Motion Planning and Operator Training, Dr.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/sice.2008.4655172">doi:10.1109/sice.2008.4655172</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/vgnliviuvnacdg3rtdrrii23eu">fatcat:vgnliviuvnacdg3rtdrrii23eu</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20190501100134/https://cdn.intechopen.com/pdfs/24663.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/35/0f/350f921023154f7beb0835fb01acddc69b41674b.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/sice.2008.4655172"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

Cooperative Reinforcement Learning Based on Zero-Sum Games [chapter]

Kao-Shing Hwang, Wei-Cheng Jiang, Hung-Hsiu Yu, Shin-Yi Li
<span title="2011-12-02">2011</span> <i title="InTech"> Mobile Robots - Control Architectures, Bio-Interfacing, Navigation, Multi Robot Motion Planning and Operator Training </i> &nbsp;
The Q(λ)-learning is a modified Q-learning method with the eligibility skill, and the minimax-Q further combines Q-learning with a simple game theory.  ...  These robots carrying out tasks of a helper or defender improve action policy at play based on Q-learning inspired by the game theory.  ...  Cooperative Reinforcement Learning Based on Zero-Sum Games, Mobile Robots - Control Architectures, Bio-Interfacing, Navigation, Multi Robot Motion Planning and Operator Training, Dr.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.5772/26620">doi:10.5772/26620</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/u53pcolpmrdhnmxayihlgiqrxm">fatcat:u53pcolpmrdhnmxayihlgiqrxm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170705062756/http://cdn.intechopen.com/pdfs/24663/InTech-Cooperative_reinforcement_learning_based_on_zero_sum_games.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/53/12/531226c7e7248e0b92179866b32d966bbfe39b5e.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.5772/26620"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> Publisher / doi.org </button> </a>

Minimax and Neyman-Pearson Meta-Learning for Outlier Languages [article]

Edoardo Maria Ponti, Rahul Aralikatte, Disha Shrivastava, Siva Reddy, Anders Søgaard
<span title="2021-06-02">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Both criteria constitute fully differentiable two-player games. In light of this, we propose a new adaptive optimiser solving for a local approximation to their Nash equilibrium.  ...  Model-agnostic meta-learning (MAML) has been recently put forth as a strategy to learn resource-poor languages in a sample-efficient fashion.  ...  Acknowledgements We thank the reviewers for their valuable feedback. Rahul Aralikatte and Anders Søgaard are funded by a Google Focused Research Award.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2106.01051v1">arXiv:2106.01051v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/7mtaaat3e5auvowumypc3vzr5u">fatcat:7mtaaat3e5auvowumypc3vzr5u</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210605092329/https://arxiv.org/pdf/2106.01051v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/30/2a/302a691914b1e000ba260f88e6859d1b0ae35557.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2106.01051v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>