
RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents [article]

Wei Qiu, Xinrun Wang, Runsheng Yu, Xu He, Rundong Wang, Bo An, Svetlana Obraztsova, Zinovi Rabinovich
2021 arXiv pre-print
To address these issues, we propose RMIX, a novel cooperative MARL method with the Conditional Value at Risk (CVaR) measure over the learned distributions of individuals' Q values.  ...  Current value-based multi-agent reinforcement learning methods optimize individual Q values to guide individuals' behaviours via centralized training with decentralized execution (CTDE).  ...  Conclusion and Future Work In this paper, we propose RMIX, a novel and practical MARL method with CVaR over the learned distributions of individuals' Q values as risk-sensitive policies for cooperative  ... 
arXiv:2102.08159v3 fatcat:ccp4foqc6zfdrae2pfltaykd3y
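
The CVaR measure named in the snippet above can be illustrated with a minimal sketch. This is not RMIX's implementation: it assumes each action's return distribution is already available as a plain list of sampled Q values, and the helper names `cvar` and `risk_sensitive_action` are hypothetical. It computes the empirical lower-tail CVaR (the mean of the worst `alpha` fraction of samples) and picks the action whose CVaR is highest.

```python
import math


def cvar(samples, alpha):
    """Empirical lower-tail CVaR: mean of the worst ceil(alpha * n) samples.

    `alpha` is the risk level in (0, 1]; alpha = 1 reduces to the plain mean.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)  # ascending, so the worst outcomes come first
    k = max(1, math.ceil(alpha * len(ordered)))
    tail = ordered[:k]
    return sum(tail) / len(tail)


def risk_sensitive_action(q_samples_per_action, alpha):
    """Return the index of the action with the highest CVaR (hypothetical helper)."""
    scores = [cvar(samples, alpha) for samples in q_samples_per_action]
    return max(range(len(scores)), key=scores.__getitem__)


# Action 0 has a higher mean but a bad worst case; action 1 is safe.
q_samples = [[0.0, 10.0, 10.0, 10.0], [4.0, 4.0, 4.0, 4.0]]
risk_averse_choice = risk_sensitive_action(q_samples, alpha=0.25)
risk_neutral_choice = risk_sensitive_action(q_samples, alpha=1.0)
```

With a small `alpha` the selection focuses on the worst outcomes and prefers the safe action; with `alpha = 1.0` it falls back to comparing means.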

Learning Generalizable Risk-Sensitive Policies to Coordinate in Decentralized Multi-Agent General-Sum Games [article]

Ziyi Liu, Xian Guo, Yongchun Fang
2022 arXiv pre-print
While various multi-agent reinforcement learning methods have been proposed in cooperative settings, few works investigate how self-interested learning agents achieve mutual coordination in decentralized  ...  general-sum games and generalize pre-trained policies to non-cooperative opponents during execution.  ...  [43] propose a decentralized risk-sensitive policy LH-IQN for all agents to seek higher team rewards.  ... 
arXiv:2205.15859v1 fatcat:ev6fyiei3vh6pd2yve22idwepa

Disentangling Sources of Risk for Distributional Multi-Agent Reinforcement Learning

Kyunghwan Son, Junsu Kim, Sungsoo Ahn, Roben Delos Reyes, Yung Yi, Jinwoo Shin
2022 International Conference on Machine Learning  
To this end, we propose Disentangled RIsk-sensitive Multi-Agent reinforcement learning (DRIMA) to separately access the risk sources.  ...  In cooperative multi-agent reinforcement learning, the outcomes of agent-wise policies are highly stochastic due to the two sources of risk: (a) random actions taken by teammates and (b) random transition  ...  For such environments in single-agent settings, risk-sensitive reinforcement learning (RL) (Chow & Ghavamzadeh, 2014) has shown remarkable results by using policies that consider risk rather than simple  ... 
dblp:conf/icml/SonKARYS22 fatcat:6h3din6thfbbzdzy3gstrmzk5u

Off-Beat Multi-Agent Reinforcement Learning [article]

Wei Qiu, Weixun Wang, Rundong Wang, Bo An, Yujing Hu, Svetlana Obraztsova, Zinovi Rabinovich, Jianye Hao, Yingfeng Chen, Changjie Fan
2022 arXiv pre-print
We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent, i.e., all actions have pre-set execution durations.  ...  It boosts multi-agent learning by addressing the challenging temporal credit assignment problem raised by the off-beat actions via our novel reward redistribution scheme, alleviating the issue of non-Markovian  ...  It replaces the Q value policy with CVaR [43] for risk-sensitive policy learning.  ... 
arXiv:2205.13718v2 fatcat:3jpom4oiazeldj7ogbp4oi5zji

The Russian Academy of Sciences, 2006 Update [article]

Iuri S. Osipov, The University of Texas at Austin
The scientific, educational and organizational activities of the great scientist and man of encyclopaedic learning, Mikhail Lomonosov, amounted to a whole era in the Academy's history.  ...  The press was assigned to publish all kinds of literature in the country, except ecclesiastical.  ...  Along with the models for evaluating the radiation risk, IBRAE, in cooperation with the IRFChP of the National Academy of Sciences of Belarus, develops computer codes for risk analysis related to chemically  ... 
doi:10.26153/tsw/10405 fatcat:tqi3ovf3uvefja5jxzi3e7x7ea