Decentralization has been widely applied in Q-learning, which is one of the most well-known model-free techniques for learning in unknown environments, especially when the systems are naturally distributed [14, 15]. In Q-learning, the learning agent maintains estimates of the optimal values for all state-action entries in its Q-table. In each state, the learning agent chooses the action with the highest Q-table entry for that state. After each visit, the learning agent updates the ...

doi:10.1109/smc.2017.8122624 dblp:conf/smc/NguyenM17 fatcat:6tkaunvhrbh7zlw7taigpft5o4
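
The Q-table and greedy action choice described in the abstract can be sketched as follows. This is a minimal, single-agent illustration in Python, not the decentralized method of the paper. Because the source text is cut off before it states the update, the sketch assumes the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α (r + γ max Q(s', a') − Q(s, a)). The class name `QTable`, its method names, and the default values of `alpha` and `gamma` are hypothetical choices made for illustration.

```python
from collections import defaultdict


class QTable:
    """Hypothetical tabular Q-learning helper (illustrative sketch only)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.actions = list(actions)   # finite action set shared by all states
        self.alpha = alpha             # learning rate (assumed value)
        self.gamma = gamma             # discount factor (assumed value)
        self.q = defaultdict(float)    # (state, action) -> estimate, 0.0 if unseen

    def greedy_action(self, state):
        # Choose the action with the highest Q-table entry for this state.
        # Ties are broken in favour of the action listed first.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update, assumed here because the source
        # sentence describing the update is truncated.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

For example, after `table = QTable(["left", "right"], alpha=0.5)` and a single call `table.update("s0", "right", 1.0, "s1")`, the entry for `("s0", "right")` moves halfway from 0 toward the observed reward, and `table.greedy_action("s0")` selects `"right"`. A practical agent would usually add some exploration (for instance, occasionally choosing a random action), which the abstract does not discuss.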