A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit the original URL.
In certain cases where the environment dynamics change dramatically, due to moving obstacles or partial agent damage, a single policy may not be sufficient. ... Therefore, maintaining a diversity of policies is necessary to provide alternatives for the system to function normally. ... Specifically, these representations are motivated by a geometric perspective of the policy as a curved surface of the policy distribution, and learned through contrastive learning and action prediction ...
doi:10.25560/96985 fatcat:wtx7usqrybavnkbvpbfsknq6xm