A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit <a rel="external noopener" href="http://cfins.au.tsinghua.edu.cn/personalhg/caoxiren/papers/journal_partially%20observable%20markov%20decision%20processes%20with%20reward%20info.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
Partially Observable Markov Decision Processes With Reward Information: Basic Ideas and Models
<span title="">2007</span>
<i title="Institute of Electrical and Electronics Engineers (IEEE)">
<a target="_blank" rel="noopener" href="https://fatcat.wiki/container/tiaci7xy45hczhz755zlpu5h7q" style="color: black;">IEEE Transactions on Automatic Control</a>
</i>
In a partially observable Markov decision process (POMDP), if the reward can be observed at each step, then the observed reward history contains information about the unknown state. This information, in addition to the information contained in the observation history, can be used to update the state probability distribution. The policy thus obtained is called a reward-information policy (RI-policy); an optimal RI-policy performs no worse than any normal optimal policy that depends only on the observation history. The above observation leads to four different problem formulations for POMDPs, depending on whether the reward function is known and whether the reward at each step is observable. This exploratory work may attract attention to these interesting problems. Index Terms: Partially observable Markov decision process (POMDP), reward-information policy.
<span class="external-identifiers">
<a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/tac.2007.894520">doi:10.1109/tac.2007.894520</a>
<a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/eqscs76qpvhndcdsirro66qcxi">fatcat:eqscs76qpvhndcdsirro66qcxi</a>
</span>
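The belief update the abstract describes can be sketched as follows. This is a minimal toy illustration, not the paper's algorithm: it assumes a known, deterministic reward function `R[s, a]`, so that observing the reward `r` after action `a` simply rules out every state inconsistent with it before the usual POMDP belief update. All names and the array layout are hypothetical.

```python
import numpy as np

def belief_update_with_reward(b, a, o, r, T, O, R, tol=1e-9):
    """Belief update that also conditions on the observed reward.

    Hypothetical toy interface (not from the paper):
      b : (S,)      prior belief over states
      T : (A, S, S) transition probabilities T[a, s, s']
      O : (A, S, O) observation probabilities O[a, s', o]
      R : (S, A)    known deterministic reward function
    """
    # Reward likelihood: 1 for states whose reward matches the observation.
    reward_lik = (np.abs(R[:, a] - r) < tol).astype(float)
    b_r = b * reward_lik
    if b_r.sum() == 0:
        raise ValueError("observed reward inconsistent with current belief")
    b_r /= b_r.sum()
    # Standard POMDP update, applied to the reward-filtered belief.
    predicted = T[a].T @ b_r           # predicted next-state distribution
    b_new = O[a, :, o] * predicted     # weight by observation likelihood
    return b_new / b_new.sum()
```

For example, with two states whose rewards differ, observing the reward collapses the belief onto the consistent state, whereas a normal update using only the observation would leave residual uncertainty; this is the extra information an RI-policy can exploit.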
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170811223424/http://cfins.au.tsinghua.edu.cn/personalhg/caoxiren/papers/journal_partially%20observable%20markov%20decision%20processes%20with%20reward%20info.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext">
<button class="ui simple right pointing dropdown compact black labeled icon button serp-button">
<i class="icon ia-icon"></i>
Web Archive
[PDF]
<div class="menu fulltext-thumbnail">
<img src="https://blobs.fatcat.wiki/thumbnail/pdf/4d/52/4d5298d706dfe3e969d9cbcf87ca324cac5d6592.180px.jpg" alt="fulltext thumbnail" loading="lazy">
</div>
</button>
</a>
<a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/tac.2007.894520">
<button class="ui left aligned compact blue labeled icon button serp-button">
<i class="external alternate icon"></i>
ieee.org
</button>
</a>