1,679 Hits in 4.0 sec

Market-Based Reinforcement Learning in Partially Observable Worlds [article]

Ivo Kwee, Marcus Hutter, Juergen Schmidhuber
2001 arXiv   pre-print
Here we reimplement a recent approach to market-based RL and for the first time evaluate it in a toy POMDP setting.  ...  Unlike traditional reinforcement learning (RL), market-based RL is in principle applicable to worlds described by partially observable Markov Decision Processes (POMDPs), where an agent needs to learn  ...  Conclusion We have started to evaluate market-based RL in POMDP settings, focusing on the Hayek machine as a vehicle for learning to memorize relevant events in short-term memory.  ... 
arXiv:cs/0105025v1 fatcat:w22o4vqeerhn3ciz35f22gpdhu

Market-Based Reinforcement Learning in Partially Observable Worlds [chapter]

Ivo Kwee, Marcus Hutter, Jürgen Schmidhuber
2001 Lecture Notes in Computer Science  
Here we reimplement a recent approach to market-based RL and for the first time evaluate it in a toy POMDP setting.  ...  Unlike traditional reinforcement learning (RL), market-based RL is in principle applicable to worlds described by partially observable Markov Decision Processes (POMDPs), where an agent needs to learn  ...  We first give an overview of this approach and its history, then evaluate it in a POMDP setting, and discuss its potential and limitations. 2 Market-based RL: History & State of the Art Classifier systems  ... 
doi:10.1007/3-540-44668-0_120 fatcat:vdhc7xlfhbfrtazqveyxuabuue
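The bidding mechanism behind market-based RL can be sketched in miniature. The following is an illustrative bucket-brigade-style credit flow, not the Hayek machine from the paper itself; the `bid_frac` parameter and the dict-based bookkeeping are assumptions for illustration:

```python
def bucket_brigade(strengths, rewards, bid_frac=0.1):
    """Toy bucket-brigade credit assignment (illustrative sketch).

    strengths: dict mapping classifier name -> current strength.
    rewards:   per-step external reward from the environment.
    Each step the strongest classifier wins control, pays a fraction
    of its strength as a bid to the previous owner, and collects the
    environment reward for that step.
    """
    prev = None
    for r in rewards:
        winner = max(strengths, key=strengths.get)
        bid = bid_frac * strengths[winner]
        strengths[winner] += r - bid      # pay the bid, collect env reward
        if prev is not None:
            strengths[prev] += bid        # bid flows back to the predecessor
        prev = winner
    return strengths
```

Over many episodes this payment chain propagates late rewards back to early-acting classifiers, which is what makes the market metaphor usable when state is only partially observable.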

Viewing Classifier Systems as Model Free Learning in POMDPs

Akira Hayashi, Nobuo Suematsu
1998 Neural Information Processing Systems  
In order to solve the problems, we have developed a hybrid classifier system: GLS (Generalization Learning System).  ...  In designing GLS, we view CSs as model free learning in POMDPs and take a hybrid approach to finding the best generalization, given the total number of rules.  ...  LEARNING IN POMDPS Given a policy π, the value of a state s, V^π(s), is defined for POMDPs just as for MDPs. Then, the value of a message m under policy π, V^π  ... 
dblp:conf/nips/HayashiS98 fatcat:aff5thqmrff4npke7jlrw52nwe
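The value definitions in the snippet above can be written out explicitly; assuming the standard discounted criterion, and assuming the message value is the posterior-weighted state value (the form of P(s | m) is an assumption here, not taken from the paper):

```latex
V^\pi(s) = \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^t r_t \,\middle|\, s_0 = s,\ \pi\right],
\qquad
V^\pi(m) = \sum_{s} P(s \mid m)\, V^\pi(s).
```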

Aliased States Discerning in POMDPs and Improved Anticipatory Classifier System

Tomohiro Hayashida, Ichiro Nishizaki, Ryosuke Sakato
2014 Procedia Computer Science  
ACSM achieves better experimental results than existing classifier systems on maze problems.  ...  This paper improves a classifier system, ACS (Anticipatory Classifier System).  ...  As mentioned above, the fitness values of the classifiers corresponding to the aliased states in a POMDP converge after a sufficient number of learning steps.  ... 
doi:10.1016/j.procs.2014.08.082 fatcat:kp3z6swgefbtfiyz7qe2ciwnai

Active cooperative perception in network robot systems using POMDPs

M. T. J. Spaan, T. S. Veiga, P. U. Lima
2010 IEEE/RSJ International Conference on Intelligent Robots and Systems  
Partially observable Markov decision processes (POMDPs) form an attractive framework to address planning in the uncertain environments that typify NRS.  ...  Network robot systems (NRS) provide many scientific and technological challenges, given that robots interact with each other as well as with sensors present in the environment to accomplish certain tasks  ...  Positive reward for classification is awarded only when C_f still has the value "not yet classified".  ... 
doi:10.1109/iros.2010.5648856 dblp:conf/iros/SpaanVL10 fatcat:xztckckpozduff3h2laddqzcby
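POMDP planning of the kind used in these NRS tasks operates on a belief state that is updated by Bayes' rule after each action and observation. A minimal sketch for a discrete model; the list-based layouts of `T` and `O` are assumptions for illustration, not the authors' representation:

```python
def belief_update(b, a, o, T, O):
    """One Bayes-filter step for a discrete POMDP.

    b: belief, a probability over states (list of floats).
    a: action index taken; o: observation index received.
    T[a][s][s2]: transition probability s -> s2 under action a.
    O[a][s2][o]: probability of observing o in state s2 after action a.
    Returns the new belief b'(s2) ∝ O[a][s2][o] * sum_s T[a][s][s2] * b(s).
    """
    nb = [O[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(len(b)))
          for s2 in range(len(b))]
    z = sum(nb)
    if z == 0:
        raise ValueError("observation has zero probability under this belief")
    return [x / z for x in nb]
```

A policy for the POMDP then maps beliefs (rather than states) to actions, which is what lets the robots reason about what they do not yet know.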

Machine Learning for Adaptive Power Management

George Theocharous
2006 Intel Technology Journal  
To improve performance of the direct approach we partition data based on the context and train the learning algorithms separately for each context.  ...  We propose a system that learns when to turn off components based on different user patterns. We describe the challenges of building such a system and progressively explore a range of solutions.  ...  ACKNOWLEDGMENTS We thank Leslie Pack Kaelbling from MIT for helpful discussions and our reviewers for valuable suggestions that significantly improved this paper.  ... 
doi:10.1535/itj.1004.05 fatcat:wyo772ywircv7n4ndhax2b36ru

Sensor Planning for Mobile Robot Localization---A Hierarchical Approach Using a Bayesian Network and a Particle Filter

Hongjun Zhou, Shigeyuki Sakane
2008 IEEE Transactions on Robotics  
In this paper we propose a hierarchical approach to solving sensor planning for the global localization of a mobile robot. Our system consists of two subsystems: a lower layer and a higher layer.  ...  The higher layer uses a Bayesian network for probabilistic inference. The sensor planning takes into account both localization belief and sensing cost.  ...  [19] also proposed integrating a Kalman filter-based metric map and a POMDP-based topological map for robot localization and navigation.  ... 
doi:10.1109/tro.2007.912091 fatcat:zgq5g6vkybbyta4f6i6nudxnpm

Sensor planning for mobile robot localization - a hierarchical approach using Bayesian network and particle filter

Hongjun Zhou, S. Sakane
2005 IEEE/RSJ International Conference on Intelligent Robots and Systems  
In this paper we propose a hierarchical approach to solving sensor planning for the global localization of a mobile robot. Our system consists of two subsystems: a lower layer and a higher layer.  ...  The higher layer uses a Bayesian network for probabilistic inference. The sensor planning takes into account both localization belief and sensing cost.  ...  [19] also proposed integrating a Kalman filter-based metric map and a POMDP-based topological map for robot localization and navigation.  ... 
doi:10.1109/iros.2005.1545154 dblp:conf/iros/ZhouS05 fatcat:iioeqf2nkfbwvmovoubmjzagpm
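The particle filter in the lower layer of this localization system follows the usual predict-weight-resample cycle. A generic sketch, assuming user-supplied motion and observation-likelihood functions rather than the authors' specific sensor models:

```python
import random

def particle_filter_step(particles, motion, likelihood):
    """One predict-weight-resample cycle of a particle filter.

    particles:  list of state samples.
    motion:     function state -> sampled next state (motion model).
    likelihood: function state -> p(current observation | state).
    """
    # Predict: propagate each particle through the motion model.
    moved = [motion(p) for p in particles]
    # Weight: score each particle against the current observation.
    w = [likelihood(p) for p in moved]
    total = sum(w)
    if total == 0:
        return moved  # observation uninformative; keep the prediction
    w = [x / total for x in w]
    # Resample: draw particles with probability proportional to weight.
    return random.choices(moved, weights=w, k=len(moved))
```

Particles that disagree with the observation are weighted down and tend to vanish at resampling, so the cloud concentrates on plausible robot poses.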

Multi agent Markov games

K. Vimala, S. Bharathi
2022 International Journal of Health Sciences  
In the sense of decision processes, it would be preferable to handle their entities more as agents than stand-alone systems.  ...  Modern computing systems are totally different from those of the last decade. They are distributed, large, and heterogeneous in structure.  ...  Numeric Methods: numeric techniques for multi-agent reinforcement learning in MGs based on value functions and policy gradients.  ... 
doi:10.53730/ijhs.v6ns2.5531 fatcat:rec4k2imojar5atwqunv2mxzaa

Online Service Migration in Edge Computing with Incomplete Information: A Deep Recurrent Actor-Critic Method [article]

Jin Wang, Jia Hu, Geyong Min, Qiang Ni, Tarek El-Ghazawi
2022 arXiv   pre-print
Many existing studies make centralized migration decisions based on complete system-level information, which is time-consuming and also lacks desirable scalability.  ...  To address these challenges, we propose a novel learning-driven method, which is user-centric and can make effective online migration decisions by utilizing incomplete system-level information.  ...  They modeled the MEC environment as a POMDP and proposed a multi-agent DRL method based on independent Q-learning to learn the policy.  ... 
arXiv:2012.08679v4 fatcat:qtngy5kzwzbbdbfdpchhvog7qy
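Independent Q-learning, as referenced in the snippet, has each agent run an ordinary tabular Q-update on its own observations, treating the other agents as part of the environment. A sketch under assumed nested-dict tables (the paper itself uses a deep recurrent actor-critic, not this tabular form):

```python
def iql_update(Q, agent, s, a, r, s2, alpha=0.1, gamma=0.9):
    """One independent Q-learning update for one agent.

    Q[agent][state][action] holds that agent's own Q-table; the update
    is the standard single-agent rule, ignoring other agents' actions.
    """
    best_next = max(Q[agent][s2].values()) if Q[agent][s2] else 0.0
    Q[agent][s][a] += alpha * (r + gamma * best_next - Q[agent][s][a])
    return Q[agent][s][a]
```

The simplicity is the appeal: no agent needs the others' states or policies, which matches the incomplete-information setting, at the cost of a non-stationary learning target as the other agents adapt.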

Probabilistic co-adaptive brain–computer interfacing

Matthew J Bryan, Stefan A Martin, Willy Cheung, Rajesh P N Rao
2013 Journal of Neural Engineering  
based on the output of classifiers or regression techniques.  ...  ') over brain and environment state, and (2) actions are selected based on entire belief distributions in order to maximize total expected reward; by employing methods from reinforcement learning, the  ...  W911NF-11-1-0307, the Office of Naval Research (ONR) grant N000140910097, NSF award no. 0930908, NSF Center for Sensorimotor Neural Engineering (EEC-1028725), and a Mary Gates Research Scholarship awarded  ... 
doi:10.1088/1741-2560/10/6/066008 pmid:24140680 fatcat:4xrhgpwhg5aq3nwamfikekkzi4

Decision Theoretic Modeling of Human Facial Displays [chapter]

Jesse Hoey, James J. Little
2004 Lecture Notes in Computer Science  
This avoids the need for human intervention in training data collection, and allows the models to be used without modification for facial display learning in any context without prior knowledge of the  ...  The learned model correctly predicts human actions during a simple cooperative card game based, in part, on their facial displays.  ...  Supported by the Institute for Robotics and Intelligent Systems (IRIS), and a Precarn scholarship. We thank our anonymous reviewers, Pascal Poupart, Nicole Arksey and Don Murray.  ... 
doi:10.1007/978-3-540-24672-5_3 fatcat:xoc4u67kabbczcofaccjqrog44

A Human–Robot Cooperative Learning System for Easy Installation of Assistant Robots in New Working Environments

María Elena López, Rafael Barea, Luis Miguel Bergasa, María Soledad Escudero
2004 Journal of Intelligent and Robotic Systems  
The proposed learning method, based on a modification of the EM algorithm, is able to robustly explore new environments with a low number of corridor traversals, as shown in some experiments carried out  ...  To cope with robustness and reliability requirements, the navigation system uses probabilistic reasoning (POMDPs) to globally localize the robot and to direct its goal-oriented actions.  ...  Acknowledgements The authors wish to acknowledge the contribution of the Ministerio de Ciencia y Tecnología (MCyT) for SIRAPEM project financing (DPI2002-02193).  ... 
doi:10.1023/b:jint.0000038952.66083.d1 fatcat:k2hwkkoe2jethi7mal2x3pn65q

Developing Communication Strategy for Multi-Agent Systems with Incremental Fuzzy Model

Sam Hamzeloo, Mansoor Zolghadri
2018 International Journal of Advanced Computer Science and Applications  
An incremental method is also presented to create and tune our fuzzy model that reduces the high computational complexity of the multi-agent systems.  ...  In this paper, we introduce an algorithm to develop a communication strategy for cooperative multi-agent systems in which the communication is limited.  ...  Our incremental method has reduced the high computational complexity of the multi-agent systems by constructing a compact fuzzy rule-base.  ... 
doi:10.14569/ijacsa.2018.090822 fatcat:rwld75dp7jhd5bnv5nlj7tdrae

Machine Learning for Social Multiparty Human--Robot Interaction

Simon Keizer, Mary Ellen Foster, Zhuoran Wang, Oliver Lemon
2014 ACM transactions on interactive intelligent systems (TiiS)  
...  action selection for generating socially appropriate robot behaviour, which is based on reinforcement learning, using a data-driven simulation of multiple users to train execution policies for social  ...  Finally, we present an alternative unsupervised learning framework that combines social state recognition and social skills execution, based on hierarchical Dirichlet processes and an infinite POMDP interaction  ...  In addition, on the second data set, a supervised learning based POMDP model is also trained as a baseline system.  ... 
doi:10.1145/2600021 fatcat:fd7y377tpjdhbpe6jqtb4mwuly
Showing results 1–15 of 1,679 results