Distributed Deep Reinforcement Learning: Learn How to Play Atari Games in 21 minutes [chapter]

Igor Adamski, Robert Adamski, Tomasz Grel, Adam Jędrych, Kamil Kaczmarek, Henryk Michalewski
2018 Lecture Notes in Computer Science  
We present a study in Distributed Deep Reinforcement Learning (DDRL) focused on the scalability of a state-of-the-art Deep Reinforcement Learning algorithm known as Batch Asynchronous Advantage Actor-Critic (BA3C). We show that using the Adam optimization algorithm with a batch size of up to 2048 is a viable choice for carrying out large-scale machine learning computations. This, combined with a careful reexamination of the optimizer's hyperparameters, synchronous training at the node level (while keeping the local, single-node part of the algorithm asynchronous), and minimizing the model's memory footprint, allowed us to achieve linear scaling for up to 64 CPU nodes. This corresponds to a training time of 21 minutes on 768 CPU cores, as opposed to the 10 hours required by a baseline single-node implementation running on 24 cores. The source code, along with game-play videos, can be found at: https://github.com/deepsense-ai/Distributed-BA3C.
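
The timing figures quoted in the abstract can be turned into a rough speedup estimate. The short Python sketch below does only that arithmetic. The "per-core efficiency" ratio is an informal measure introduced here for illustration, not a metric defined by the authors, and it treats the rounded "10 hours" and "21 minutes" as exact.

```python
# Figures quoted in the abstract; both times are rounded values.
baseline_minutes = 10 * 60    # baseline: single node with 24 cores
baseline_cores = 24
distributed_minutes = 21      # distributed run on 768 cores
distributed_cores = 768

# Wall-clock speedup of the distributed run over the baseline.
speedup = baseline_minutes / distributed_minutes

# How many times more cores the distributed run used.
core_ratio = distributed_cores / baseline_cores

# Informal per-core efficiency (assumption: not a metric from the paper).
efficiency = speedup / core_ratio

print(f"speedup: {speedup:.1f}x")                  # speedup: 28.6x
print(f"core ratio: {core_ratio:.0f}x")            # core ratio: 32x
print(f"per-core efficiency: {efficiency:.2f}")    # per-core efficiency: 0.89
```

Under these assumptions, 32 times as many cores give roughly a 28.6-fold reduction in training time. The abstract states its linear-scaling claim in terms of nodes, so this per-core figure is a different and cruder comparison against the single-node baseline.
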
doi:10.1007/978-3-319-92040-5_19 fatcat:armgndw6u5afvcpg64rl2kyqk4