A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit <a rel="external noopener" href="https://link.springer.com/content/pdf/10.1007%2Fs10994-019-05788-0.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
TD-regularized actor-critic methods
<span title="2019-02-21">2019</span>
<i title="Springer Nature">
<a target="_blank" rel="noopener" href="https://fatcat.wiki/container/h4nnd7sxwzcwhetu5qkjbcdh6u" style="color: black;">Machine Learning</a>
</i>
Actor-critic methods can achieve incredible performance on difficult reinforcement learning problems, but they are also prone to instability. This is partly due to the interaction between the actor and critic during learning, e.g., an inaccurate step taken by one of them might adversely affect the other and destabilize the learning. To avoid such issues, we propose to regularize the learning objective of the actor by penalizing the temporal difference (TD) error of the critic. This improves
<span class="external-identifiers">
<a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s10994-019-05788-0">doi:10.1007/s10994-019-05788-0</a>
<a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/osifv5utpnft5kjlmh2xfnxktu">fatcat:osifv5utpnft5kjlmh2xfnxktu</a>
</span>
stability by avoiding large steps in the actor update whenever the critic is highly inaccurate. The resulting method, which we call the TD-regularized actor-critic method, is a simple plug-and-play approach to improve the stability and overall performance of actor-critic methods. Evaluations on standard benchmarks confirm this.
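The idea in the abstract, penalizing the critic's TD error inside the actor's objective, can be illustrated with a small sketch. This is not the authors' implementation: the squared-error form of the penalty, the coefficient `eta`, the discount `gamma`, and the function names are all assumptions chosen for illustration, and the sketch only evaluates the scalar objective rather than taking policy gradients.

```python
def td_error(reward, q_sa, q_next, gamma=0.99):
    """One-step TD error of the critic: r + gamma * Q(s', a') - Q(s, a)."""
    return reward + gamma * q_next - q_sa


def td_regularized_objective(transitions, eta=0.1, gamma=0.99):
    """Actor objective with a TD-error penalty (illustrative form only).

    `transitions` is a list of (reward, q_sa, q_next) tuples, where q_sa and
    q_next are the critic's estimates for the current and next state-action
    pairs. The objective is the mean critic value minus `eta` times the mean
    squared TD error, so it shrinks when the critic is inaccurate.
    """
    n = len(transitions)
    mean_q = sum(q_sa for _, q_sa, _ in transitions) / n
    mean_sq_td = sum(
        td_error(r, q_sa, q_next, gamma) ** 2 for r, q_sa, q_next in transitions
    ) / n
    return mean_q - eta * mean_sq_td
```

With a critic whose TD error is zero on the sampled transitions, the penalty vanishes and the objective reduces to the usual mean critic value; a large TD error lowers the objective, which is what discourages large actor steps while the critic is unreliable.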
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20190505004431/https://link.springer.com/content/pdf/10.1007%2Fs10994-019-05788-0.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext">
<button class="ui simple right pointing dropdown compact black labeled icon button serp-button">
<i class="icon ia-icon"></i>
Web Archive
[PDF]
<div class="menu fulltext-thumbnail">
<img src="https://blobs.fatcat.wiki/thumbnail/pdf/eb/10/eb10c2da8271da82084779b55f3afa31ccd710ad.180px.jpg" alt="fulltext thumbnail" loading="lazy">
</div>
</button>
</a>