Learning Tetris Using the Noisy Cross-Entropy Method

István Szita, András Lörincz
2006 Neural Computation
The cross-entropy method is an efficient and general optimization algorithm. However, its applicability in reinforcement learning (RL) seems to be limited because it often converges to suboptimal policies. We add noise to prevent early convergence of the cross-entropy method and use the computer game Tetris to demonstrate the approach. The resulting policy outperforms previous RL algorithms by almost two orders of magnitude.
doi:10.1162/neco.2006.18.12.2936 pmid:17052153 fatcat:ivlucsytgjcyfk4kf7d3p5pufq
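
The idea summarized in the abstract can be illustrated with a minimal sketch: a Gaussian cross-entropy search whose variance is inflated by an extra noise term after every update, so the distribution does not collapse too early. Everything specific here is an assumption made for illustration and is not taken from this record: the function names noisy_cross_entropy and toy_score, the sample count, the elite fraction, the initial variance, the decaying noise schedule, and the toy quadratic objective that stands in for a Tetris policy evaluator.

```python
import numpy as np


def noisy_cross_entropy(score, dim, n_samples=100, elite_frac=0.1,
                        n_iters=50, noise=lambda t: max(1.0 - t / 20.0, 0.0),
                        seed=0):
    """Maximize score(w) over w in R^dim with a noisy cross-entropy search.

    All default values are illustrative assumptions, not the paper's settings.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)          # mean of the sampling distribution
    var = np.full(dim, 100.0)     # per-dimension variance (assumed wide start)
    n_elite = max(1, int(round(n_samples * elite_frac)))

    for t in range(n_iters):
        # Draw candidate parameter vectors from the current Gaussian.
        samples = rng.normal(mean, np.sqrt(var), size=(n_samples, dim))
        scores = np.array([score(w) for w in samples])

        # Keep the best-scoring candidates (the elite set).
        elite = samples[np.argsort(scores)[-n_elite:]]

        # Refit the Gaussian to the elite set, then add noise to the
        # variance so it cannot shrink to zero prematurely.
        mean = elite.mean(axis=0)
        var = elite.var(axis=0) + noise(t)

    return mean


def toy_score(w):
    # Hypothetical objective with its maximum at (1, -2, 3); a real use
    # would return the average game score of a policy with weights w.
    target = np.array([1.0, -2.0, 3.0])
    return -np.sum((w - target) ** 2)


best = noisy_cross_entropy(toy_score, dim=3)
print(best)
```

Passing noise=lambda t: 0.0 gives the plain cross-entropy update, which makes it easy to compare the two variants on the same objective.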