A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2011; you can also visit the original URL.
The file type is
Learning algorithms often obtain relatively low average payoffs in repeated general-sum games between other learning agents due to a focus on myopic best-response and one-shot Nash equilibrium (NE) strategies. A less myopic approach places focus on NEs of the repeated game, which suggests that (at the least) a learning agent should possess two properties. First, an agent should never learn to play a strategy that produces average payoffs less than the minimax value of the game. Second, an agentdoi:10.1145/1102351.1102372 dblp:conf/icml/CrandallG05 fatcat:twtgdip535budmnwcj2kz2hsce