Under-Approximating Expected Total Rewards in POMDPs
2022
Tools and Algorithms for the Construction and Analysis of Systems: 28th International Conference (TACAS 2022)
We consider the problem: is the optimal expected total reward to reach a goal state in a partially observable Markov decision process (POMDP) below a given threshold? We tackle this (generally undecidable) problem by computing under-approximations of these expected total rewards. This is done by abstracting finite unfoldings of the infinite belief MDP of the POMDP. The key issue is to find a suitable under-approximation of the value function. We provide two techniques: a simple (cut-off) technique that uses a good policy on the POMDP, and a more advanced technique (belief clipping) that uses minimal shifts of probabilities between beliefs. We use mixed-integer linear programming (MILP) to find such minimal probability shifts and experimentally show that our techniques scale quite well while providing tight lower bounds on the expected total reward.
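To make the cut-off idea concrete, here is a minimal Python sketch: unfold the belief MDP of a POMDP to a finite depth and, at the frontier, substitute a sound lower bound on the remaining value. The toy model (T, R, obs), the function names, and the trivial cut-off value 0 are illustrative assumptions, not the paper's implementation; the paper derives tighter cut-off values from a fixed POMDP policy.

# Sketch of the cut-off technique on a hypothetical toy POMDP.
# Rewards are non-negative, so cutting off with value 0 is a sound
# under-approximation of the remaining expected total reward.

GOAL = 2                    # absorbing goal state; states 0 and 1 look alike
ACTIONS = ("a", "b")

# T[s][act] -> list of (successor state, probability)
T = {
    0: {"a": [(2, 0.9), (0, 0.1)], "b": [(1, 1.0)]},
    1: {"a": [(0, 1.0)],           "b": [(2, 0.9), (1, 0.1)]},
    2: {"a": [(2, 1.0)],           "b": [(2, 1.0)]},
}

# R[s][act] -> immediate reward collected when taking act in s
R = {
    0: {"a": 1.0, "b": 0.0},
    1: {"a": 0.0, "b": 1.0},
    2: {"a": 0.0, "b": 0.0},
}

def obs(s):
    # states 0 and 1 yield the same observation; the goal is observable
    return "goal" if s == GOAL else "?"

def successors(belief, act):
    """Bayesian belief update: split the post-action distribution by
    observation. Returns a list of (observation probability, successor belief)."""
    post = {}
    for s, p in belief.items():
        for t, q in T[s][act]:
            post[t] = post.get(t, 0.0) + p * q
    result = []
    for z in {obs(t) for t in post}:
        pz = sum(p for t, p in post.items() if obs(t) == z)
        result.append((pz, {t: p / pz for t, p in post.items() if obs(t) == z}))
    return result

def exp_reward(belief, act):
    return sum(p * R[s][act] for s, p in belief.items())

def lower_bound(belief, depth):
    """Value of the finite unfolding; beliefs at the depth bound are cut off
    with value 0, so the result under-approximates the optimal value."""
    if belief.get(GOAL, 0.0) == 1.0:
        return 0.0   # goal reached: no further reward is collected
    if depth == 0:
        return 0.0   # cut-off: replace the unexplored subtree by a lower bound
    return max(
        exp_reward(belief, act)
        + sum(pz * lower_bound(b, depth - 1) for pz, b in successors(belief, act))
        for act in ACTIONS
    )

# Initial belief: uniformly uncertain between the two indistinguishable states.
print(lower_bound({0: 0.5, 1: 0.5}, depth=6))

Increasing the depth tightens the bound monotonically. The paper's techniques improve on the trivial cut-off value used here: cut-offs take the value of a fixed observation-based policy, and belief clipping shifts small amounts of probability between nearby beliefs, with the minimal shifts found via MILP.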
doi:10.18154/rwth-2022-03987