Bounded Incremental Real-Time Dynamic Programming

Changjie Fan, Xiaoping Chen
2007 2007 Frontiers in the Convergence of Bioscience and Information Technologies  
A real-time multi-step planning problem is characterized by alternating decision-making and execution phases, by a total online decision-making time that is divided into slices between executions, and by a pressing need for a policy that relates only to the current step. We propose a new criterion for judging the optimality of a policy, based on upper and lower bound theory. This criterion guarantees that the agent can act earlier in a real-time decision process while an optimal policy with sufficient ... n still remains. We prove that, under certain conditions, one can obtain an optimal policy with arbitrary precision using such an incremental method. We present a Bounded Incremental Real-Time Dynamic Programming algorithm (BIRTDP). In experiments on two typical real-time simulation systems, BIRTDP outperforms the other state-of-the-art RTDP algorithms tested.
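The abstract does not spell out the criterion itself, so the following is only an illustrative sketch of how a bound-based test for the current step might look in general, not the authors' BIRTDP procedure. It assumes a cost-minimization setting in which each action at the current state has a lower and an upper bound on its expected cost. The function name `provably_best_action`, the dictionaries `q_lower` and `q_upper`, and the tolerance `epsilon` are all hypothetical names introduced for this example.

```python
from typing import Dict, Hashable, Optional

Action = Hashable


def provably_best_action(
    q_lower: Dict[Action, float],
    q_upper: Dict[Action, float],
    epsilon: float = 0.0,
) -> Optional[Action]:
    """Return an action certified epsilon-optimal by the bounds, else None.

    Assumption (not from the source): q_lower[a] <= Q*(s, a) <= q_upper[a]
    holds for every action a at the current state s, and lower cost is better.
    """
    # Try candidates in order of increasing upper bound on expected cost.
    for candidate in sorted(q_upper, key=q_upper.get):
        rivals = [q_lower[a] for a in q_lower if a != candidate]
        # If the candidate's worst case is no worse than every rival's best
        # case (up to epsilon), it is safe to act without further planning.
        if not rivals or q_upper[candidate] <= min(rivals) + epsilon:
            return candidate
    # Bounds still overlap: more planning time is needed before acting.
    return None


# Hypothetical bounds for two actions at the current state.
decided = provably_best_action(
    {"left": 4.0, "right": 6.5}, {"left": 6.0, "right": 9.0}
)
undecided = provably_best_action(
    {"left": 4.0, "right": 6.5}, {"left": 7.0, "right": 9.0}
)
```

In the first call the upper bound of `left` (6.0) is below the lower bound of `right` (6.5), so the agent could commit to `left` immediately. In the second call the intervals overlap and the sketch returns `None`, signalling that the bounds need further tightening. Raising `epsilon` trades precision for an earlier decision.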
doi:10.1109/fbit.2007.14 dblp:conf/fbit/FanC07 fatcat:5okhrvtu6rbzvc4r3iouvupzai