Policy Learning for Malaria Control [article]

Van Bach Nguyen, Belaid Mohamed Karim, Bao Long Vu, Jörg Schlötterer, Michael Granitzer
2019 arXiv pre-print
Sequential decision making is a typical problem in reinforcement learning, and many algorithms exist to solve it. However, only a few of them work effectively with a very small number of observations. In this report, we describe our progress in learning a policy for malaria control, framed as a reinforcement learning problem, in the KDD Cup Challenge 2019, and we propose several solutions to the limited-observations problem. We apply a Genetic Algorithm, Bayesian Optimization, and Q-learning with sequence breaking to find the optimal policy for five consecutive years with only 20 episodes/100 evaluations. We evaluate these algorithms and compare their performance against Random Search as a baseline. Among them, Q-learning with sequence breaking was submitted to the challenge and ranked 7th in the KDD Cup.
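
To make the evaluation budget concrete, the following is a minimal sketch of a Random Search baseline run under the constraint stated in the abstract (20 episodes of five yearly decisions, i.e. 100 evaluations). It is not the authors' implementation: `toy_reward` is a hypothetical stand-in for the challenge simulator, and representing each yearly action as a pair of values in [0, 1] is an assumption made purely for illustration.

```python
import random

YEARS = 5      # one action per year (from the abstract)
EPISODES = 20  # episode budget; 20 episodes * 5 years = 100 evaluations


def toy_reward(year, action):
    """Hypothetical reward; NOT the real challenge environment."""
    a, b = action
    return 1.0 - (a - 0.2 * year) ** 2 - (b - 0.5) ** 2


def random_search(reward_fn, episodes=EPISODES, years=YEARS, seed=0):
    """Sample whole policies at random and keep the best one seen."""
    rng = random.Random(seed)
    best_policy, best_return, evaluations = None, float("-inf"), 0
    for _ in range(episodes):
        # Assumed action format: two values in [0, 1] per year.
        policy = [(rng.random(), rng.random()) for _ in range(years)]
        total = 0.0
        for year, action in enumerate(policy):
            total += reward_fn(year, action)
            evaluations += 1  # each yearly decision costs one evaluation
        if total > best_return:
            best_policy, best_return = policy, total
    return best_policy, best_return, evaluations


if __name__ == "__main__":
    policy, ret, used = random_search(toy_reward)
    print(f"evaluations used: {used}, best return: {ret:.3f}")
```

Because every sampled policy consumes five evaluations, any method compared against this baseline sees at most 20 complete policies, which is why sample efficiency dominates the comparison.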
arXiv:1910.08926v1 fatcat:43svnlhhyfaj5opuqm4siqqiwy