A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit the original URL.
The file type is
An agent that must learn to act in the world by trial and error faces the reinforcement learning problem, which is quite different from standard concept learning. Although good algorithms exist for this problem in the general case, they are often quite inefficient and do not exhibit generalization. One strategy is to find restricted classes of action policies that can be learned more efficiently. This paper pursues that strategy by developing an algorithm that performans an on-line searchdoi:10.1007/bf00993348 fatcat:t7hrpohocngzndkl3sdreg5uca