
A multiagent variant of Dyna-Q

G. Weiss
2000 Proceedings of the Fourth International Conference on MultiAgent Systems
This paper describes a multiagent variant of Dyna-Q called M-Dyna-Q. Dyna-Q is an integrated single-agent framework for planning, reacting, and learning.  ...  Like Dyna-Q, M-Dyna-Q employs two key ideas: learning results can serve as a valuable input for both planning and reacting, and results of planning and reacting can serve as a valuable input to learning  ...  M-Dyna-Q in its current form is a rather straightforward multiagent realization of Dyna-Q that still shows several limitations.  ... 
doi:10.1109/icmas.2000.858525 dblp:conf/icmas/Weiss00 fatcat:gi7ese7vojepzh6qtan24burny
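The Dyna-Q loop that M-Dyna-Q generalizes — act, apply a direct Q-learning update, remember the transition in a model, then replay simulated transitions from that model — can be sketched in tabular form as follows (a minimal single-agent sketch; the function name, signature, and hyperparameters are illustrative, not taken from the paper):

```python
import random
from collections import defaultdict

def dyna_q(env_step, actions, episodes=50, n_planning=10,
           alpha=0.1, gamma=0.95, epsilon=0.1, start=0, goal=4):
    """Tabular Dyna-Q: learn from real steps, then replay n_planning
    simulated steps drawn from a learned deterministic model."""
    Q = defaultdict(float)   # Q[(state, action)]
    model = {}               # model[(state, action)] = (reward, next_state)
    for _ in range(episodes):
        s = start
        while s != goal:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda b: Q[(s, b)])
            r, s2 = env_step(s, a)
            # direct RL update (one-step Q-learning)
            best_next = max(Q[(s2, b)] for b in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            # learning feeds the model; the model feeds planning
            model[(s, a)] = (r, s2)
            for _ in range(n_planning):
                (ps, pa), (pr, ps2) = random.choice(list(model.items()))
                pbest = max(Q[(ps2, b)] for b in actions)
                Q[(ps, pa)] += alpha * (pr + gamma * pbest - Q[(ps, pa)])
            s = s2
    return Q
```

On a small deterministic chain (states 0–4, reward 1 on reaching state 4), the planning replays propagate value back along the chain far faster than direct updates alone would.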

Deep Residual Reinforcement Learning [article]

Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson
2020 arXiv   pre-print
We propose the bidirectional target network technique to stabilize residual algorithms, yielding a residual version of DDPG that significantly outperforms vanilla DDPG in the DeepMind Control Suite benchmark  ...  Compared with the existing TD(k) method, our residual-based method makes weaker assumptions about the model and yields a greater performance boost.  ...  We define a shorthand g_t ≐ ηγ∇_θ Q(s_{t+1}, µ(s_{t+1})) − ∇_θ Q(s_t, a_t), and the update rule for θ is θ ← θ − α_1 (r_{t+1} + Δ) g_t, where Δ differs across the variants.  ... 
arXiv:1905.01072v3 fatcat:x46i7xwbgbdnxdembxejerii3u
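The shorthand in the snippet above, g_t = ηγ∇_θQ(s_{t+1}, µ(s_{t+1})) − ∇_θQ(s_t, a_t), is easiest to see in the linear case, where η interpolates between the semi-gradient TD update (η = 0) and the full residual-gradient update (η = 1). A minimal sketch of that linear case, as an illustration of the residual idea rather than the paper's DDPG variant (function name and defaults are assumptions):

```python
import numpy as np

def td_update(theta, phi_s, phi_s2, r, gamma=0.9, eta=0.5, alpha=0.1):
    """One update step under a linear value function V(s) = theta @ phi(s).

    eta = 0 recovers semi-gradient TD; eta = 1 is the full residual
    gradient; intermediate eta interpolates between the two."""
    delta = r + gamma * theta @ phi_s2 - theta @ phi_s  # TD error
    # gradient of the (half) squared Bellman residual, with the
    # bootstrap term's gradient scaled by eta
    g = eta * gamma * phi_s2 - phi_s
    return theta - alpha * delta * g
```

Setting eta = 0 makes g = −phi_s, so the update reduces to the familiar theta + alpha * delta * phi_s of semi-gradient TD.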

Model-Based Reinforcement Learning in Multiagent Systems with Sequential Action Selection

Ali AKRAMIZADEH, Ahmad AFSHAR, Mohammad Bagher MENHAJ, Samira JAFARI
2011 IEICE transactions on information and systems  
This is especially interesting in multiagent systems, since a large number of experiences is necessary to achieve good performance.  ...  The algorithm is proved to be convergent and is discussed in light of new results on the convergence of traditional prioritized sweeping. key words: multiagent systems, Markov games, model-based reinforcement  ...  We refer to the set of preferences of all agents as extended Q-values, Q = [Q_1, . . . , Q_i, . . . , Q_N].  ... 
doi:10.1587/transinf.e94.d.255 fatcat:cpjopvqr6fesjmoxfzaaz5p2t4
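Traditional single-agent, tabular prioritized sweeping — whose convergence results the snippet refers to — keeps a priority queue of state–action pairs ordered by Bellman error and propagates value changes backward through known predecessors. A minimal sketch of a few such planning sweeps (name, signature, and hyperparameters are illustrative):

```python
import heapq
from collections import defaultdict

def prioritized_sweeps(Q, model, predecessors, s0, a0, actions,
                       alpha=0.5, gamma=0.95, theta=1e-4, n_updates=20):
    """Run up to n_updates planning backups of tabular prioritized sweeping.

    model[(s, a)] = (reward, next_state); predecessors[s2] is the set of
    (s, a) pairs known to lead to s2."""
    pq = []  # max-heap via negated priorities
    r, s2 = model[(s0, a0)]
    p = abs(r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s0, a0)])
    if p > theta:
        heapq.heappush(pq, (-p, s0, a0))
    for _ in range(n_updates):
        if not pq:
            break
        _, s, a = heapq.heappop(pq)
        r, s2 = model[(s, a)]
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions)
                              - Q[(s, a)])
        # re-prioritize predecessors whose error now exceeds the threshold
        for (ps, pa) in predecessors.get(s, ()):
            pr, _ = model[(ps, pa)]
            pp = abs(pr + gamma * max(Q[(s, b)] for b in actions)
                     - Q[(ps, pa)])
            if pp > theta:
                heapq.heappush(pq, (-pp, ps, pa))
    return Q
```

Because backups are ordered by error magnitude, a reward discovered at one state is pushed backward to its predecessors in a few sweeps instead of waiting for uniform replay.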

Reinforcement Learning for Hybrid and Plug-In Hybrid Electric Vehicle Energy Management: Recent Advances and Prospects

Xiaosong Hu, Teng Liu, Xuewei Qi, Matthew Barth
2019 IEEE Industrial Electronics Magazine  
In this article, we describe the energy-management issues of HEVs/PHEVs and summarize a variety of potential DRL applications for onboard energy management.  ...  Energy management is a critical technology in plug-in hybrid-electric vehicles (PHEVs) for maximizing efficiency, fuel economy, and range, as well as reducing pollutant emissions.  ...  Dyna and Q-learning are each applied to the energy management of a hybrid tracked vehicle, and their performance is compared in simulation.  ... 
doi:10.1109/mie.2019.2913015 fatcat:m5hln6n74zc3fcccmuw5ah3omy

A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity [article]

Pablo Hernandez-Leal, Michael Kaisers, Tim Baarslag, Enrique Munoz de Cote
2019 arXiv   pre-print
The key challenge in multiagent learning is learning a best response to the behaviour of other agents, which may be non-stationary: if the other agents adapt their strategy as well, the learning target  ...  A wide range of state-of-the-art algorithms is classified into a taxonomy, using these categories and key characteristics of the environment (e.g., observability) and adaptation behaviour of the opponents  ...  This work is part of the Veni research programme with project number 639.021.751, which is financed by the Netherlands Organisation for Scientific Research (NWO).  ... 
arXiv:1707.09183v2 fatcat:mnducjpn7zawpnw3u6wnhhc6k4

2021 Index IEEE Transactions on Neural Networks and Learning Systems Vol. 32

2021 IEEE Transactions on Neural Networks and Learning Systems  
The primary entry includes the coauthors' names, the title of the paper or other item, and its location, specified by the publication abbreviation, year, month, and inclusive pagination.  ...  Adaptive Neural Control for a Class of Nonlinear Multiagent Systems.  ...  Cao, J., Li, Q., and Liu, Q., Neural-Network-Based Fully Distributed ...; Zhang, H., see Liang, C., TNNLS Sept. 2021 3831-3845; Zhang, H., Adaptive Consensus for a Class of Uncertain Multiagent Systems; TNNLS  ... 
doi:10.1109/tnnls.2021.3134132 fatcat:2e7comcq2fhrziselptjubwjme

A Multi-Agent Approach to Communication Management in Wireless Sensor Networks (Une approche multi-agent pour la gestion de la communication dans les réseaux de capteurs sans fil)

Jean-Paul Jamont, Michel Occello
2006 Techniques et sciences informatiques  
It proposes an original approach to open multiagent systems in the context of wireless networks of intelligent sensors. Such networks are composed of sensors with independent energy sources.  ...  This paper deals with multiagent self-organization, giving adaptive features to a distributed system embodied in an aggressive environment.  ...  automatic and dynamic.  ... 
doi:10.3166/tsi.25.661-690 fatcat:2kj5exkh35b57c4lusko4wbele

A Survey of Planning and Learning in Games

Fernando Fradique Duarte, Nuno Lau, Artur Pereira, Luis Paulo Reis
2020 Applied Sciences  
This paper presents a survey of the multiple methodologies that have been proposed to integrate planning and learning in the context of games.  ...  In general, games pose interesting and complex problems for the implementation of intelligent agents and are a popular domain in the study of artificial intelligence.  ...  Dyna-PI, based on the policy iteration method, and Dyna-Q, based on the Q-learning algorithm, are two examples of this [264].  ... 
doi:10.3390/app10134529 fatcat:wc27eo2wmvd6lclar7yteyj6cm

Reinforcement learning meets minority game: Toward optimal resource allocation

Si-Ping Zhang, Jia-Qi Dong, Li Liu, Zi-Gang Huang, Liang Huang, Ying-Cheng Lai
2019 Physical review. E  
A ubiquitous dynamical phenomenon is the emergence of herding, where a vast majority of the users concentrate on a small number of resources, leading to low efficiency in resource allocation.  ...  Previous works focused on control strategies that rely on external interventions, such as pinning control, where a fraction of users are forced to choose a certain action.  ... 
doi:10.1103/physreve.99.032302 fatcat:pelohehctjbmvmqrjtnzv2ykia

Reinforcement Learning-Empowered Mobile Edge Computing for 6G Edge Intelligence [article]

Peng Wei, Kun Guo, Ye Li, Jue Wang, Wei Feng, Shi Jin, Ning Ge, Ying-Chang Liang
2022 arXiv   pre-print
This paper provides a comprehensive research review on RL-enabled MEC and offers insight for development in this area.  ...  Mobile edge computing (MEC) is considered a novel paradigm for computation-intensive and delay-sensitive tasks in fifth generation (5G) networks and beyond.  ...  To accelerate the search speed of the Q-learning algorithm, based on multiagent RL, a cooperative Q-learning algorithm was proposed in [142], where new agents can obtain efficient training and learning  ... 
arXiv:2201.11410v4 fatcat:24igkq4kbrb2pjzwf3mf3n7qtq

Reinforcement Learning for IoT Security: A Comprehensive Survey

Aashma Uprety, Danda B. Rawat
2020 IEEE Internet of Things Journal  
Securing the billions of connected devices in the IoT is a necessary task for realizing the full potential of IoT applications. Recently, researchers have proposed many security solutions for the IoT.  ...  With this paper, readers can gain a more thorough understanding of IoT security attacks and countermeasures using reinforcement learning, as well as research trends in this area.  ...  Here, the authors compare the performance of the RL agent under two algorithms: Q-learning and Dyna-Q.  ... 
doi:10.1109/jiot.2020.3040957 fatcat:qtm2emhqmjhlncyczuij6nw3oa

Reinforcement Learning-Empowered Mobile Edge Computing for 6G Edge Intelligence

Peng Wei, Kun Guo, Ye Li, Jue Wang, Wei Feng, Shi Jin, Ning Ge, Ying-Chang Liang
2022 IEEE Access  
This paper provides a comprehensive research review on RL-enabled MEC and offers insight for development in this area.  ...  Mobile edge computing (MEC) is considered a novel paradigm for computation-intensive and delay-sensitive tasks in fifth generation (5G) networks and beyond.  ...  To accelerate the search speed of the Q-learning algorithm, based on multiagent RL, a cooperative Q-learning algorithm was proposed in [144], where new agents can obtain efficient training and learning  ... 
doi:10.1109/access.2022.3183647 fatcat:pd5z6q4innd5jl25g4r7b4nq3i

Centralized Model and Exploration Policy for Multi-Agent RL [article]

Qizhen Zhang, Chris Lu, Animesh Garg, Jakob Foerster
2022 arXiv   pre-print
robots or a team of quadcopters.  ...  Our key insight is that using just a polynomial number of samples, one can learn a centralized model that generalizes across different policies.  ...  Acknowledgements Authors thank Wendelin Böhmer, Amir-massoud Farahmand, Keiran Paster, Claas Voelcker, and Stephen Zhao for insightful discussions and/or feedbacks on the drafts of the paper.  ... 
arXiv:2107.06434v2 fatcat:lyemu3umn5defm2hwxae2mgara

Dynamical systems as a level of cognitive analysis of multi-agent learning

Wolfram Barfuss
2021 Neural Computing & Applications
I find that its deterministic dynamical systems description follows a minimum free-energy principle and unifies a boundedly rational account of game theory with decision-making under uncertainty.  ...  I find that this algorithm serves as a micro-foundation of the deterministic learning equations by showing that its learning trajectories approach the ones of the deterministic learning equations under  ...  It is based on a previously published extended abstract [5] . I thank Richard P. Mann for helpful comments on the manuscript.  ... 
doi:10.1007/s00521-021-06117-0 pmid:35221541 pmcid:PMC8827307 fatcat:y3oxx2kglvfpjd5onhriky5g44

Deep Model-Based Reinforcement Learning for High-Dimensional Problems, a Survey [article]

Aske Plaat, Walter Kosters, Mike Preuss
2020 arXiv   pre-print
We use these approaches to organize a comprehensive overview of important recent developments such as latent models.  ...  We propose a taxonomy based on three approaches: using explicit planning on given transitions, using explicit planning on learned transitions, and end-to-end learning of both planning and transitions.  ...  ACKNOWLEDGMENTS We thank the members of the Leiden Reinforcement Learning Group, and especially Thomas Moerland and Mike Huisman, for many discussions and insights.  ... 
arXiv:2008.05598v2 fatcat:5xmwmemv5bfinkw57avf5ghhxq
Showing results 1 — 15 out of 51 results