
Reinforcement Learning for Blackjack [chapter]

Saqib A. Kakvi
2009 Lecture Notes in Computer Science  
The current artificial intelligence in the SKCards Blackjack is highly flawed. Reinforcement Learning was chosen as the method to be employed.  ...  This will initially be developed for Blackjack, with possible extensions to other games.  ...  Reinforcement Learning Reinforcement Learning learns from rewards for taking a sequence of actions in an environment, based on its knowledge. This will lead to a change and eventually a reward.  ... 
doi:10.1007/978-3-642-04052-8_43 fatcat:5bflopenmbcbzj7q5uarqynrym


Mazda Ahmadi, Matthew E. Taylor, Peter Stone
2007 Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems - AAMAS '07  
Reinforcement learning is a popular and successful framework for many agent-related problems because only limited environmental feedback is necessary for learning.  ...  One effective method for speeding-up reinforcement learning algorithms is to leverage expert knowledge.  ...  IFSA outperforms Sarsa for the first 1,000,000 episodes in the Blackjack domain. Comparing different orders of adding features for IFSA in the Blackjack domain.  ... 
doi:10.1145/1329125.1329351 dblp:conf/atal/AhmadiTS07 fatcat:wqqoxz6mebgzddu5dftrhnucpu

Reinforcement Learning with Quantum Variational Circuits [article]

Owen Lockwood, Mei Si
2020 arXiv   pre-print
This work explores the potential for quantum computing to facilitate reinforcement learning problems.  ...  reinforcement learning.  ...  In this work, we use a quantum simulator to explore the potential for using quantum computing to solve reinforcement learning tasks.  ... 
arXiv:2008.07524v3 fatcat:wjcgukys7bdbpdvlgphyougdrq

What Did You Think Would Happen? Explaining Agent Behaviour Through Intended Outcomes [article]

Herman Yau, Chris Russell, Simon Hadfield
2020 arXiv   pre-print
We provide a simple proof that general methods for post-hoc explanations of this nature are impossible in traditional reinforcement learning.  ...  We present a novel form of explanation for Reinforcement Learning, based around the notion of intended outcome. These explanations describe the outcome an agent is trying to achieve by its actions.  ...  Background We set out the minimal background of reinforcement learning necessary to define our approach to explainable Reinforcement Learning.  ... 
arXiv:2011.05064v1 fatcat:7dhpg24fifdavhfp3qed7yowam

Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains [article]

David Abel, Alekh Agarwal, Fernando Diaz, Akshay Krishnamurthy, Robert E. Schapire
2016 arXiv   pre-print
High-dimensional observations and complex real-world dynamics present major challenges in reinforcement learning for both function approximation and exploration.  ...  We address both of these challenges with two complementary techniques: First, we develop a gradient-boosting style, non-parametric function approximator for learning on Q-function residuals.  ...  This section details our evaluation on the two standard reinforcement learning benchmarks of Blackjack and n-Chain.  ... 
arXiv:1603.04119v1 fatcat:dn2jtdr4pncadc6sddkkcrytcm

Page 1650 of Journal of Cognitive Neuroscience Vol. 21, Issue 8 [page]

2009 Journal of Cognitive Neuroscience  
ACC activity then reflects the demand to use the reinforcement learning information stored from previous occasions.  ...  In line with reinforcement learning theory, it may be suggested that a biasing signal is accumulated in the ACC during experiences of punishment (negative TD errors) and reinforcement (positive TD errors  ... 

Page 1077 of Psychological Abstracts Vol. 57, Issue 5 [page]

1977 Psychological Abstracts  
In the 3 experimental conditions the child received either money, an award, or positive verbal reinforcement for his/her performance on the target activity.  ...  This effect is discussed in terms of attribution and learning theory. —Journal abstract. 9555. Bond, Nicholas A. (California State U, Sacramento) Basic strategy and expectation in casino Blackjack.  ... 

RLCard: A Toolkit for Reinforcement Learning in Card Games [article]

Daochen Zha, Kwei-Herng Lai, Yuanpu Cao, Songyi Huang, Ruzhe Wei, Junyu Guo, Xia Hu
2020 arXiv   pre-print
RLCard is an open-source toolkit for reinforcement learning research in card games.  ...  The goal of RLCard is to bridge reinforcement learning and imperfect information games, and push forward the research of reinforcement learning in domains with multiple agents, large state and action space  ...  ., LTD for the generous support.  ... 
arXiv:1910.04376v2 fatcat:en7mvko3ebdhrk7jl3qhr663zy

Spike-based Decision Learning of Nash Equilibria in Two-Player Games

Johannes Friedrich, Walter Senn, Olaf Sporns
2012 PLoS Computational Biology  
The suggested population reinforcement learning reproduces data from human behavioral experiments for the blackjack and the inspector game.  ...  In contrast, temporal-difference (TD) learning, covariance-learning, and basic reinforcement learning fail to perform optimally for the stochastic strategy.  ...  Dorris for providing the code of the computer algorithm used in [11], Johannes Hewig for the data of humans playing blackjack [10], and Yonatan Loewenstein for helpful feedback on the covariance rules  ... 
doi:10.1371/journal.pcbi.1002691 pmid:23028289 pmcid:PMC3459907 fatcat:ku47p5i6fjcyrhcfeun5h4lw7y

A New Multi-Agent Reinforcement Learning Method based on Evolving Dynamic Correlation Matrix

Xingli Gan, Hongliang Guo, Zhan Li
2019 IEEE Access  
Multi-agent reinforcement learning approaches can be roughly classified into two categories.  ...  INDEX TERMS Multi-agent reinforcement learning, dynamic correlation matrix, convergence, metaparameter evolution.  ...  TABLE 1. The meta-parameter combination for each generation in the blackjack game. TABLE 2. The meta-parameter combination for each generation in UTC.  ... 
doi:10.1109/access.2019.2946848 fatcat:m45ylhmavzhndh7jtsba53mcdm

Integrating theory development with design evaluation

1992 Behavior and Information Technology  
In this paper, we recruit the construct of psychological design rationale as a framework for integrating theory development with design evaluation in HCI.  ...  Acknowledgments We are grateful to Susan Chipman, John Karat, Wendy Kellogg, Bob Mack, and Linda Tetzlaff for comments on this work.  ...  We also verified that the blackjack application did provide a good learning-by-doing model for programmers: though they often did "just" play several rounds of blackjack, this experience demonstrably helped  ... 
doi:10.1080/01449299208924345 fatcat:qsts3baflzhz5fg2r7qnsl4eoe

The Effects of Cultural Learning in Populations of Neural Networks

Dara Curran, Colm O'Riordan
2007 Artificial Life  
Our model explores the effect of a cultural learning on a population and employs three benchmark sequential decision tasks as the evolutionary task for the population: connect-four, tic-tac-toe and blackjack  ...  Experiments are conducted with populations employing population learning alone and populations combining population and cultural learning.  ...  Acknowledgements We wish to thank the reviewers for their many constructive comments and suggestions.  ... 
doi:10.1162/artl.2007.13.1.45 pmid:17204012 fatcat:7hmkkgi6wfgr3fkgmr6n7fsykm

Continuous Blackjack: Equilibrium, Deviation and Adaptive Strategy [article]

Mu Zhao
2020 arXiv   pre-print
Finally, we apply reinforcement learning techniques to the game and address several associated engineering challenges.  ...  We introduce a variant of the classic poker game blackjack -- the continuous blackjack. We study the Nash Equilibrium as well as the case where players deviate from it.  ...  Finally, in chapter 5, we model the game on contextual bandits and apply reinforcement learning techniques to it.  ... 
arXiv:2011.10315v4 fatcat:fdfwzztmerdglm7aq5wsnjjo7a

Decision-making under Risk: An fMRI Study

Johannes Hewig, Thomas Straube, Ralf H. Trippe, Nora Kretschmer, Holger Hecht, Michael G. H. Coles, Wolfgang H. R. Miltner
2009 Journal of Cognitive Neuroscience  
In a recent study, we used a realistic Blackjack gambling task to further examine the function of the RERN and FERN in reinforcement learning (Hewig et al., 2007).  ...  ACC activity then reflects the demand to use the reinforcement learning information stored from previous occasions.  ... 
doi:10.1162/jocn.2009.21112 pmid:18823238 fatcat:v2qjazuj35chbmqvb6dmlhwt7a

On the Convergence of the Monte Carlo Exploring Starts Algorithm for Reinforcement Learning [article]

Che Wang, Keith Ross
2020 arXiv   pre-print
The convergence results presented here make progress for this long-standing open problem in reinforcement learning.  ...  A simple and natural algorithm for reinforcement learning is Monte Carlo Exploring Starts (MCES), where the Q-function is estimated by averaging the Monte Carlo returns, and the policy is improved by choosing  ...  ) reinforcement learning problems.  ... 
arXiv:2002.03585v1 fatcat:k2smibawmzdfhorsxr7jfbyvmq