582 Hits in 1.8 sec

Bayesian Optimization in AlphaGo [article]

Yutian Chen, Aja Huang, Ziyu Wang, Ioannis Antonoglou, Julian Schrittwieser, David Silver, Nando de Freitas
2018 arXiv   pre-print
During the development of AlphaGo, its many hyper-parameters were tuned with Bayesian optimization multiple times.  ...  It is our hope that this brief case study will be of interest to Go fans, and also provide Bayesian optimization practitioners with some insights and inspiration.  ...  In particular, Bayesian optimization was a significant factor in the strength of AlphaGo in the highly publicized match against Lee Sedol.  ... 
arXiv:1812.06855v1 fatcat:oj7syr5uvrftvdxihwkmzblyyy

A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play

David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan (+1 others)
2018 Science  
By contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go by reinforcement learning from self-play.  ...  In this paper, we generalize this approach into a single AlphaZero algorithm that can achieve superhuman performance in many challenging games.  ...  The hyperparameters of AlphaGo Zero were tuned by Bayesian optimization.  ... 
doi:10.1126/science.aar6404 pmid:30523106 fatcat:3ojohsnggndppnfm5akucp7pve

Mastering the game of Go with deep neural networks and tree search

David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe (+8 others)
2016 Nature  
Asymptotically, this policy converges to optimal play, and the evaluations converge to the optimal value function [12].  ...  These games may be solved by recursively computing the optimal value function in a search tree containing approximately b d possible sequences of moves, where b is the game's breadth (number of legal  ...  Acknowledgements We thank Fan Hui for agreeing to play against AlphaGo; Toby Manning for refereeing the match; Ostrovski for reviewing the paper; and the rest of the DeepMind team for their support, ideas  ... 
doi:10.1038/nature16961 pmid:26819042 fatcat:hhxixsirtjairjuwqivmi3gcga

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm [article]

David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis
2017 arXiv   pre-print
In contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go, by tabula rasa reinforcement learning from games of self-play.  ...  In this paper, we generalise this approach into a single AlphaZero algorithm that can achieve, tabula rasa, superhuman performance in many challenging domains.  ...  AlphaGo Zero tuned the hyper-parameter of its search by Bayesian optimisation. In AlphaZero we reuse the same hyper-parameters for all games without game-specific tuning.  ... 
arXiv:1712.01815v1 fatcat:flj56adezzf6xepdezevbo24xq

Hyper-Parameter Sweep on AlphaZero General [article]

Hui Wang, Michael Emmerich, Mike Preuss, Aske Plaat
2019 arXiv   pre-print
Since AlphaGo and AlphaGo Zero have achieved groundbreaking successes in the game of Go, the programs have been generalized to solve other tasks.  ...  However, AlphaZero contains many parameters, and for neither AlphaGo, AlphaGo Zero, nor AlphaZero is there sufficient discussion of how to set parameter values in these algorithms.  ...  Therefore, parameter optimization seems to be necessary. In our work, we choose the most general framework among the aforementioned AlphaGo-series algorithms, AlphaZero, to study.  ... 
arXiv:1903.08129v1 fatcat:ostlc34i3nan3eajxj36nveqwa

Frequentist and Bayesian Learning Approaches to Artificial Intelligence

Sunghae Jun
2016 International Journal of Fuzzy Logic and Intelligent Systems  
In general, the AI system has to reach the optimal decision under uncertainty. But it is difficult for the AI system to derive the best conclusion.  ...  In addition, we have trouble representing the intelligent capacity of AI in numeric values. Statistics has the ability to quantify uncertainty by the two approaches of frequentist and Bayesian inference.  ...  From small and big data, we get the intelligence for the optimal decision in AI using statistics as a learning tool. In addition, Bayesian statistics has its strong learning power.  ... 
doi:10.5391/ijfis.2016.16.2.111 fatcat:saigaezjzjc4dmnoqzsafodur4

Microstructure Characterization of Al-Si Cast Alloys Using Machine Learning with Image Recognition

Sang-Jun Jeong, In-Kyu Hwang, Hee-Soo Kim
2018 Journal of the Reports of the Japan Foundry Engineering Society Meeting  
The algorithms for the machine learning include Decision Tree, Bayesian Network, Support Vector Machine, and Artificial Neural Network. The famous machine learning system adapted to AlphaGo is called  ...  Smart factory and optimization of the industrial process can be good examples. Image recognition with machine learning is also popular.  ... 
doi:10.11279/jfeskouen.172_148 fatcat:rntyxqthvjbwdmp6y7geeoxodq

Evaluation Function Approximation for Scrabble [article]

Rishabh Agarwal
2019 arXiv   pre-print
In this work, we experimented with evolutionary algorithms and Bayesian optimization to learn the weights for an approximate feature-based evaluation function.  ...  However, these optimization methods were not quite effective, which led us to explore the given problem from an imitation learning point of view.  ...  Bayesian optimization [4] is a method of optimizing expensive cost functions without calculating derivatives.  ... 
arXiv:1901.08728v1 fatcat:ongfxpuwjzcwri5ynq5bhxiogq
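The snippet above characterizes Bayesian optimization as derivative-free optimization of expensive cost functions. As a rough illustration only (not code from any paper listed here; the RBF kernel, UCB acquisition, grid of candidates, and all parameter values are assumptions of this sketch), a minimal Gaussian-process-based loop in pure Python might look like:

```python
import math
import random

def rbf(a, b, ls=0.3):
    # Squared-exponential kernel with length scale ls (assumed value)
    return math.exp(-((a - b) ** 2) / (2 * ls ** 2))

def solve(A, b):
    # Gaussian elimination with partial pivoting for small dense systems
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(xs, ys, x, noise=1e-6):
    # GP regression posterior mean and std at a single query point x
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k = [rbf(a, x) for a in xs]
    mu = sum(ki * ai for ki, ai in zip(k, solve(K, ys)))
    var = rbf(x, x) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mu, math.sqrt(max(var, 1e-12))

def bayes_opt(f, n_init=3, n_iter=10, kappa=2.0):
    # Maximize f on [0, 1] without derivatives: fit a GP surrogate to the
    # evaluations so far, then pick the next point by upper confidence bound.
    random.seed(0)
    xs = [random.random() for _ in range(n_init)]
    ys = [f(x) for x in xs]
    grid = [i / 200 for i in range(201)]
    for _ in range(n_iter):
        def ucb(x):
            m, s = gp_posterior(xs, ys, x)
            return m + kappa * s
        x_next = max(grid, key=ucb)
        xs.append(x_next)
        ys.append(f(x_next))
    best = max(range(len(ys)), key=lambda i: ys[i])
    return xs[best], ys[best]
```

The `kappa` term trades off exploitation (high posterior mean) against exploration (high posterior uncertainty); real implementations typically use expected improvement and a proper acquisition optimizer instead of a fixed grid.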

ACS Central Science Virtual Issue on Machine Learning

Andrew L. Ferguson
2018 ACS Central Science  
the mold of AlphaGo Zero?  ...  A prominent example is provided by Google's Go-playing computer program AlphaGo Zero.  ... 
doi:10.1021/acscentsci.8b00528 pmid:30159387 pmcid:PMC6107860 fatcat:qtism3iabbbsvg5osjofishkj4

Practical Massively Parallel Monte-Carlo Tree Search Applied to Molecular Design [article]

Xiufeng Yang and Tanuj Kr Aasawat and Kazuki Yoshizoe
2021 arXiv   pre-print
However, while massively parallel computing is often used for training models, it is rarely used for searching for solutions to combinatorial optimization problems.  ...  Existing work on large-scale parallel MCTS shows efficient scalability in terms of the number of rollouts up to 100 workers, but suffers from degradation in the quality of the solutions.  ...  Extensive experiments have shown that an efficient distributed MCTS significantly outperforms other approaches that use more complex DNN models combined with optimizations such as Bayesian optimization  ... 
arXiv:2006.10504v3 fatcat:xbo2c4rcsnghlaciwg3m5jf6my

Micro-Data Learning: The Other End of the Spectrum [article]

Jean-Baptiste Mouret
2016 arXiv   pre-print
Shahriari et al.: "Taking the human out of the loop: A review of Bayesian optimization", Proc. of the IEEE, 2016. [2] M. P. Deisenroth, D. Fox, C. E.  ...  Bayesian optimisation [1] is one such data-efficient algorithm, and has recently attracted a lot of interest in the machine learning community.  ... 
arXiv:1610.00946v1 fatcat:dgwohn4lujbenlm6jwp5hdej3u

Deep Reinforcement Learning: An Overview [article]

Yuxi Li
2018 arXiv   pre-print
Then we discuss various applications of RL, including games, in particular, AlphaGo, robotics, natural language processing, including dialogue systems, machine translation, and text generation, computer  ...  Next we discuss core RL elements, including value function, in particular, Deep Q-Network (DQN), policy, reward, model, planning, and exploration.  ...  , in context", after AlphaGo defeated Ke Jie in May 2017.  ... 
arXiv:1701.07274v6 fatcat:x2es3yf3crhqblbbskhxelxf2q

Research Progress in Bayesian Program Learning

Zong-jian ZHU, Ming-qiang PAN, Cheng SUN, Lei JIU, Li-ning SUN
2018 DEStech Transactions on Computer Science and Engineering  
In recent years, the small-sample learning model with BPL as its core has made breakthroughs in methodology and performance, attracting great attention from both industry and academia.  ...  Bayesian Program Learning (BPL) is an important area of machine learning.  ...  For example, AlphaGo was trained on millions of human master games [49] [50]. In contrast, Bayesian program learning requires very little data.  ... 
doi:10.12783/dtcse/cnai2018/24187 fatcat:tpaeoq5b2rbrfawf5sjf5sezom

Machines Imitating Human Thinking Using Bayesian Learning and Bootstrap

Sunghae Jun
2021 Symmetry  
Therefore, because of deep learning, we can expect faster growth of AI technology such as AlphaGo in optimal decision-making.  ...  However, humans sometimes think and act not optimally but emotionally. In this paper, we propose a method for building thinking machines that imitate humans using Bayesian decision theory and learning.  ...  In 2016, AlphaGo defeated Lee Sedol, the world Go champion [10].  ... 
doi:10.3390/sym13030389 fatcat:fyyffwdiibbidjys676o62klde

A brief introduction to the Grey Machine Learning [article]

Xin Ma
2019 arXiv   pre-print
A short discussion of the advantages of this new framework over the existing grey models and LSSVM is also included in this paper.  ...  In the works mentioned above, the Bayesian framework has been used by Brenden Lake for the handwriting tasks, and AlphaGo is actually built on neural networks and Monte Carlo tree search.  ...  But we cannot always find an optimal formulation of the function f (·) in real-world applications.  ... 
arXiv:1805.01745v2 fatcat:syf2ixw2rngtjfk4jqfdf3nftu
Showing results 1 — 15 out of 582 results