1998 IEEE International Joint Conference on Neural Networks Proceedings. IEEE World Congress on Computational Intelligence (Cat. No.98CH36227)
In this paper we describe how an actor-critic reinforcement learning agent in a non-Markovian domain finds an optimal sequence of actions in a totally model-free fashion; that is, the agent neither learns transition probabilities and associated rewards, nor by how much the state space should be augmented so that the Markov property holds. In particular, we employ an Elman-type recurrent neural network to solve non-Markovian problems, since an Elman-type network is able to implicitly and

doi:10.1109/ijcnn.1998.687169
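The idea the abstract describes can be illustrated with a minimal sketch: an Elman-style recurrent network whose hidden (context) units feed back into the next step, so history is encoded implicitly rather than by hand-augmenting the state. The class below, with its actor and critic heads and TD-error update, is an illustrative assumption of mine, not the paper's actual architecture or learning rule; all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


class ElmanActorCritic:
    """Sketch of an Elman-style recurrent actor-critic (illustrative only).

    The hidden state h is fed back as context units, so the network can
    disambiguate observations that look identical but follow different
    histories -- the non-Markovian case the abstract refers to.
    """

    def __init__(self, n_obs, n_hidden, n_actions, lr=0.1):
        self.W_in = rng.normal(0.0, 0.5, (n_hidden, n_obs))     # input weights
        self.W_ctx = rng.normal(0.0, 0.5, (n_hidden, n_hidden))  # context (recurrent) weights
        self.w_v = np.zeros(n_hidden)                # critic head: state value
        self.W_pi = np.zeros((n_actions, n_hidden))  # actor head: action preferences
        self.lr = lr
        self.h = np.zeros(n_hidden)

    def reset(self):
        """Clear the context units at the start of an episode."""
        self.h = np.zeros_like(self.h)

    def step(self, obs):
        """Elman update: new hidden state depends on input AND previous hidden state."""
        self.h = np.tanh(self.W_in @ obs + self.W_ctx @ self.h)
        value = float(self.w_v @ self.h)
        logits = self.W_pi @ self.h
        probs = np.exp(logits - logits.max())  # numerically stable softmax
        probs /= probs.sum()
        return value, probs, self.h.copy()

    def td_update(self, h, value, reward, next_value, action, probs, gamma=0.9):
        """Actor-critic update driven by the TD error (heads only -- a
        simplification; a full method would also adapt the recurrent layer)."""
        delta = reward + gamma * next_value - value
        self.w_v += self.lr * delta * h                      # critic moves toward TD target
        grad_log = -probs[:, None] * h[None, :]              # d log pi / d W_pi
        grad_log[action] += h
        self.W_pi += self.lr * delta * grad_log              # actor follows TD error
```

A quick check of the memory property: feed two different first observations followed by the same ambiguous second observation, and the context units keep the hidden states distinct even though the current input is identical.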