DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning [article]

Daochen Zha, Jingru Xie, Wenye Ma, Sheng Zhang, Xiangru Lian, Xia Hu, Ji Liu
2021 arXiv pre-print
Starting from scratch in a single server with four GPUs, DouZero outperformed all the existing DouDizhu AI programs in days of training and was ranked first in the Botzone leaderboard among 344 AI ... In this work, we propose a conceptually simple yet effective DouDizhu AI system, namely DouZero, which enhances traditional Monte-Carlo methods with deep neural networks, action encoding, and parallel ... Acknowledgements: We thank our colleagues in Kuai Inc. for building the DouDizhu environment and the helpful discussions, Qiqi Jiang from DeltaDou team for helping us set up DeltaDou models, and Songyi ...
arXiv:2106.06135v1 fatcat:juy7mpjt7zcntjrgw77uh5flk4

DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning [article]

Youpeng Zhao, Jian Zhao, Xunhan Hu, Wengang Zhou, Houqiang Li
2022 arXiv pre-print
Trained using traditional Monte Carlo method with deep neural networks and self-play procedure without the abstraction of human prior knowledge, DouZero has outperformed all the existing DouDizhu AI programs ... Recently, a DouDizhu AI system called DouZero has been proposed. ... DeltaDou [19] is the first bot that reaches top human-level performance compared to human experts. ...
arXiv:2204.02558v1 fatcat:ulm3gmorebgctfmoh6qzpf5gbm

PerfectDou: Dominating DouDizhu with Perfect Information Distillation [article]

Guan Yang, Minghuan Liu, Weijun Hong, Weinan Zhang, Fei Fang, Guangjun Zeng, Yue Lin
2022 arXiv pre-print
In this paper, we propose PerfectDou, a state-of-the-art DouDizhu AI system that dominates the game, in an actor-critic framework with a proposed technique named perfect information distillation. ... In experiments we show how and why PerfectDou beats all existing AI programs, and achieves state-of-the-art performance. ... Minghuan Liu is also supported by Wu Wen Jun Honorary Doctoral Scholarship, AI Institute, SJTU. ...
arXiv:2203.16406v4 fatcat:sdvndgowubcn7pxfe5tz7b2npy

Combining Tree Search and Action Prediction for State-of-the-Art Performance in DouDiZhu

Yunsheng Zhang, Dong Yan, Bei Shi, Haobo Fu, Qiang Fu, Hang Su, Jun Zhu, Ning Chen
2021 Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence unpublished
When comparing with state-of-the-art DouDiZhu AIs, the Elo rating of AP-MCTS is 50 to 200 higher than them. The ablation study shows that accurate action prediction is the key to AP-MCTS winning. ... When playing against experienced human players, AP-MCTS achieved a 65.65% winning rate, which is almost twice the human's winning rate. ... We have tried to train DeltaDou through self-play reinforcement learning from random neural network parameters, but the performance of the agent does not improve. ...
doi:10.24963/ijcai.2021/470 fatcat:jtspuvcoh5hhteq52u3m56rsdu