Empirical Evaluation of Contextual Policy Search with a Comparison-based Surrogate Model and Active Covariance Matrix Adaptation [article]

Alexander Fabisch
<span title="2018-10-26">2018</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Contextual policy search (CPS) is a class of multi-task reinforcement learning algorithms that is particularly useful for robotic applications. A recent state-of-the-art method is Contextual Covariance Matrix Adaptation Evolution Strategies (C-CMA-ES). It is based on the standard black-box optimization algorithm CMA-ES. There are two useful extensions of CMA-ES that we will transfer to C-CMA-ES and evaluate empirically: ACM-ES, which uses a comparison-based surrogate model, and aCMA-ES, which uses an active update of the covariance matrix. We will show that the improvements from these methods can be impressive in terms of sample efficiency, although this is no longer relevant for the robotic domain.
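
To illustrate why a comparison-based surrogate model can be sufficient for this family of optimizers, the following is a minimal sketch of a rank-based evolution strategy. It is not C-CMA-ES, ACM-ES, or aCMA-ES: it has no covariance adaptation, no step-size control, no context, and no surrogate. The names `rank_based_es`, `sphere`, and `transformed_sphere` are hypothetical, and the weight formula is an assumption chosen only for illustration. The sketch shows that the search only uses the ordering of the sampled candidates, so any strictly increasing transformation of the objective yields the same search trajectory.

```python
import numpy as np


def rank_based_es(objective, x0, sigma=0.5, n_samples=10, n_iter=50, seed=0):
    """Simplified evolution strategy (assumption: fixed isotropic step size)."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(x0, dtype=float)
    mu = n_samples // 2
    # Positive, decreasing recombination weights for the mu best samples
    weights = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    weights /= weights.sum()
    for _ in range(n_iter):
        # Sample candidates around the current mean
        samples = mean + sigma * rng.standard_normal((n_samples, mean.size))
        values = np.array([objective(x) for x in samples])
        # Only the ordering of the values is used from here on
        order = np.argsort(values, kind="stable")
        # New mean is the weighted average of the mu best candidates
        mean = weights @ samples[order[:mu]]
    return mean


def sphere(x):
    """Simple test objective to minimize."""
    return float(np.sum(x ** 2))


def transformed_sphere(x):
    """Strictly increasing transformation of sphere: same ranking of candidates."""
    return float(np.exp(sphere(x)))


start = np.array([2.0, 2.0])
result_plain = rank_based_es(sphere, start)
result_transformed = rank_based_es(transformed_sphere, start)
```

With the same seed, both runs rank their candidates identically, so `result_plain` and `result_transformed` coincide. A surrogate model for such an optimizer therefore only needs to predict which of two candidates is better, not the objective values themselves.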
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1810.11491v1">arXiv:1810.11491v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ipz7yc6i4bedpazsvjdxkkyobq">fatcat:ipz7yc6i4bedpazsvjdxkkyobq</a> </span>
