A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2010; you can also visit <a rel="external noopener" href="http://www6.in.tum.de:80/pub/Main/Publications/Sehnke2010a.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
Multimodal Parameter-exploring Policy Gradients
<span title="">2010</span>
<i title="IEEE">
<a target="_blank" rel="noopener" href="https://fatcat.wiki/container/ijxfcr5nm5bezm7iddvoa2pajm" style="color: black;">2010 Ninth International Conference on Machine Learning and Applications</a>
</i>
Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estimates encountered in normal policy gradient methods. It has been shown to drastically speed up convergence for several large-scale reinforcement learning tasks. However, the independent normal distributions used by PGPE to search through parameter space are inadequate for some problems with multimodal reward surfaces. This paper
<span class="external-identifiers">
<a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/icmla.2010.24">doi:10.1109/icmla.2010.24</a>
<a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/icmla/SehnkeGOS10.html">dblp:conf/icmla/SehnkeGOS10</a>
<a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/6cbpsaxav5fsbasrxo6nlywafa">fatcat:6cbpsaxav5fsbasrxo6nlywafa</a>
</span>
extends the basic PGPE algorithm to use multimodal mixture distributions for each parameter, while remaining efficient. Experimental results on the Rastrigin function and the inverted pendulum benchmark demonstrate the advantages of this modification, with faster convergence to better optima.
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20101128210438/http://www6.in.tum.de:80/pub/Main/Publications/Sehnke2010a.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext">
<button class="ui simple right pointing dropdown compact black labeled icon button serp-button">
<i class="icon ia-icon"></i>
Web Archive
[PDF]
</button>
</a>
<a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/icmla.2010.24">
<button class="ui left aligned compact blue labeled icon button serp-button">
<i class="external alternate icon"></i>
ieee.com
</button>
</a>