Terrain-adaptive locomotion skills using deep reinforcement learning

Xue Bin Peng, Glen Berseth, Michiel van de Panne
2016 ACM Transactions on Graphics  
Figure 1: Terrain traversal using a learned actor-critic ensemble. The color-coding of the center-of-mass trajectory indicates the choice of actor used for each leap.

Abstract

Reinforcement learning offers a promising methodology for developing skills for simulated characters, but typically requires working with sparse hand-crafted features. Building on recent progress in deep reinforcement learning (DeepRL), we introduce a mixture of actor-critic experts (MACE) approach that learns terrain-adaptive dynamic locomotion skills using high-dimensional state and terrain descriptions as input, and parameterized leaps or steps as output actions. MACE learns more quickly than a single actor-critic approach and results in actor-critic experts that exhibit specialization. Additional elements of our solution that contribute towards efficient learning include Boltzmann exploration and the use of initial actor biases to encourage specialization. Results are demonstrated for multiple planar characters and terrain classes.
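The abstract mentions Boltzmann exploration over the actor-critic experts. As a minimal sketch of that idea (not the paper's implementation; `boltzmann_select` and its signature are hypothetical), one can sample which expert acts by applying a softmax to the critics' value estimates for the current state:

```python
import numpy as np

def boltzmann_select(critic_values, temperature=1.0, rng=None):
    """Sample an expert index from a Boltzmann (softmax) distribution
    over the critics' value estimates.

    Hypothetical helper, not taken from the paper. Low temperature
    approaches greedy selection of the best-valued expert; high
    temperature approaches uniform exploration."""
    rng = np.random.default_rng() if rng is None else rng
    v = np.asarray(critic_values, dtype=float)
    # Subtract the max before exponentiating for numerical stability.
    logits = (v - v.max()) / temperature
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(v), p=probs))
```

At a very low temperature this reliably picks the expert whose critic predicts the highest value, while a higher temperature keeps the other experts in play during training, which is what lets individual actor-critic pairs specialize without being starved of experience.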
doi:10.1145/2897824.2925881