A fast learning variable lambda TD model: Used to realize home aware robot navigation

Abdulrahman Altahhan
2014 2014 International Joint Conference on Neural Networks (IJCNN)  
This work describes a fast-learning, goal-aware robot navigation model that employs both gradient and conjugate gradient Temporal Difference methods (TD and TD-conj). It builds on the fact that TD-conj was proven to be equivalent, under certain conditions, to a gradient TD method with a variable lambda. Based on a straightforward feature extraction process combined with goal-aware capabilities provided by a whole-image measure, the model solves what we call the u-turn homing benchmark problem without using ... landmarks. Only one goal snapshot was used, taken with the agent facing the goal directly. A novel threshold stopping formula, which is less sensitive to the agent-goal orientation problem, was therefore used to recognize the goal. Unlike other models, this model refrains from artificially manipulating the environment or assuming a priori knowledge about it, two constraints that widely restrict the applicability of existing models in realistic scenarios. An on-line control method was used to train a set of neural networks. With the aid of variable and fixed eligibility traces, these networks approximate the agent's action-value function, allowing it to take close-to-optimal actions to reach its home. The effectiveness of the model was experimentally verified on an agent.
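The role of the eligibility trace in an on-line TD control update can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a linear action-value estimate in place of the neural networks, uses the textbook Sarsa(lambda) update with an accumulating trace, and the function name `sarsa_lambda_step` and all parameter names are hypothetical. Lambda is passed in at every step, so a caller could supply either a fixed value or a value that varies over time; the specific variable-lambda rule equivalent to TD-conj is not reproduced here.

```python
import numpy as np

def sarsa_lambda_step(w, e, phi, phi_next, reward, alpha, gamma, lam, done=False):
    """One on-line Sarsa(lambda) update for a linear action-value estimate.

    w        : weight vector; Q(s, a) is approximated as w @ phi(s, a)
    e        : accumulating eligibility trace, same shape as w
    phi      : feature vector of the current state-action pair
    phi_next : feature vector of the next state-action pair
    lam      : trace-decay parameter for this step (fixed or varying per step)
    """
    q = w @ phi
    q_next = 0.0 if done else w @ phi_next   # no bootstrap from a terminal state
    delta = reward + gamma * q_next - q      # TD error
    e = gamma * lam * e + phi                # decay the old trace, add the current features
    w = w + alpha * delta * e                # credit recent state-action pairs via the trace
    return w, e, delta

# Illustrative usage with made-up two-dimensional features.
w = np.zeros(2)
e = np.zeros(2)
w, e, delta = sarsa_lambda_step(
    w, e,
    phi=np.array([1.0, 0.0]),
    phi_next=np.array([0.0, 1.0]),
    reward=1.0, alpha=0.5, gamma=0.9, lam=0.8,
)
```

Setting `lam` to 0 reduces the update to one-step Sarsa, while values closer to 1 spread each TD error further back over previously visited state-action pairs.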
doi:10.1109/ijcnn.2014.6889845 dblp:conf/ijcnn/Altahhan14 fatcat:nxikdv3wajagzgow7ch5rhgzzi