2,432 Hits in 25.2 sec

Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach [article]

Xuezhou Zhang, Yuda Song, Masatoshi Uehara, Mengdi Wang, Alekh Agarwal, Wen Sun
2022 arXiv   pre-print
We present BRIEE (Block-structured Representation learning with Interleaved Explore Exploit), an algorithm for efficient reinforcement learning in Markov Decision Processes with block-structured dynamics  ...  (i.e., Block MDPs), where rich observations are generated from a set of unknown latent states.  ...  Function approximation Our representation learning approach to learn Block MDPs requires a feature class {Φ_h}_{h=0}^{H−1}.  ... 
arXiv:2202.00063v2 fatcat:a4tvd3cq7zb37evpyr242vz2le
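The defining property of the Block MDP setting this abstract describes is that rich observations are emitted from disjoint "blocks" of the observation space, one per latent state, so the latent state is decodable from the observation. A minimal toy sketch of that generative structure (all names, dimensions, and the emission scheme here are illustrative, not from the paper):

```python
import random

# Toy block MDP: a few latent states, each emitting rich observations
# from a disjoint block of the observation space.
LATENT_STATES = [0, 1, 2]

def emit(latent, rng):
    """Emit an observation from the block belonging to `latent`.

    Blocks are disjoint by construction: the hundreds digit identifies
    the latent state, so a decoder can recover it exactly.
    """
    return 100 * latent + rng.randrange(100)

def decode(observation):
    """The (in practice unknown) inverse map from observation to latent state."""
    return observation // 100

rng = random.Random(0)
for s in LATENT_STATES:
    assert decode(emit(s, rng)) == s  # disjoint blocks make decoding exact
```

Representation-learning algorithms such as the one in this entry must learn a decoder like `decode` from data rather than being handed it.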

Invariant Causal Prediction for Block MDPs [article]

Amy Zhang, Clare Lyle, Shagun Sodhani, Angelos Filos, Marta Kwiatkowska, Joelle Pineau, Yarin Gal, Doina Precup
2020 arXiv   pre-print
In this paper, we consider the problem of learning abstractions that generalize in block MDPs, families of environments with a shared latent state space and dynamics structure over that latent space, but  ...  We leverage tools from causal inference to propose a method of invariant prediction to learn model-irrelevance state abstractions (MISA) that generalize to novel observations in the multi-environment setting  ...  The authors would also like to thank Marlos Machado for helpful feedback in the writing process.  ... 
arXiv:2003.06016v2 fatcat:hnqf7cfkergp3fsaoi6lbsxa6u

Denoised MDPs: Learning World Models Better Than the World Itself [article]

Tongzhou Wang, Simon S. Du, Antonio Torralba, Phillip Isola, Amy Zhang, Yuandong Tian
2022 arXiv   pre-print
This framework clarifies the kinds of information removed by various prior work on representation learning in reinforcement learning (RL), and leads to our proposed approach of learning a Denoised MDP that  ...  With this ability, humans can efficiently perform real world tasks without considering all possible nuisance factors. How can artificial agents do the same?  ...  We are very thankful to Alex Lamb for suggestions and catching our typo in the conditioning of Equation (1).  ... 
arXiv:2206.15477v4 fatcat:he66y45mgfcjvp6l6d253hzkn4

Representation Learning for Online and Offline RL in Low-rank MDPs [article]

Masatoshi Uehara, Xuezhou Zhang, Wen Sun
2022 arXiv   pre-print
and exploitation, in a sample efficient manner.  ...  This work studies the question of Representation Learning in RL: how can we learn a compact low-dimensional representation such that on top of the representation we can perform RL procedures such as exploration  ...  Compared to prior representation learning works on low-rank MDPs and block MDPs that rely on a forward step-by-step reward-free exploration framework, our algorithm interleaves representation learning  ... 
arXiv:2110.04652v3 fatcat:uqexoogxkbgjboja4ik5cj3mgy

FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs [article]

Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun
2020 arXiv   pre-print
Algorithmically, we develop FLAMBE, which engages in exploration and representation learning for provably efficient RL in low rank transition models.  ...  Structurally, we make precise connections between these low rank MDPs and latent variable models, showing how they significantly generalize prior formulations for representation learning in RL.  ...  Our results raise a number of promising directions for future work. On the theoretical side, can we develop provably efficient model-free algorithms for representation learning in the low rank MDP?  ... 
arXiv:2006.10814v2 fatcat:stvbyny3prbrbddy7ye74gdnza
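The low-rank transition model FLAMBE operates over factors the dynamics as T(s′ | s, a) = ⟨φ(s, a), μ(s′)⟩ for feature maps φ and μ of small dimension d. A minimal numpy sketch of such a factorization (the sizes and the Dirichlet construction are illustrative choices, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, d = 6, 3, 2  # states, actions, latent rank (illustrative sizes)

# phi[s, a] is a distribution over d latent factors and mu[k] is a
# distribution over next states, so their inner product is a mixture:
# T[s, a] = phi[s, a] @ mu is a valid transition distribution.
phi = rng.dirichlet(np.ones(d), size=(S, A))  # shape (S, A, d)
mu = rng.dirichlet(np.ones(S), size=d)        # shape (d, S)

T = phi @ mu                                  # shape (S, A, S)

assert np.allclose(T.sum(axis=-1), 1.0)       # each row is a distribution
assert np.linalg.matrix_rank(T.reshape(S * A, S)) <= d  # rank at most d
```

The point of the representation-learning problem is the converse direction: given only sampled transitions from T, recover feature maps φ and μ of this form.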

Feature Reinforcement Learning: Part I. Unstructured MDPs

Marcus Hutter
2009 Journal of Artificial General Intelligence  
On the other hand, reinforcement learning is well-developed for small finite state Markov decision processes (MDPs).  ...  Before we can think of mechanizing this search for suitable MDPs, we need a formal objective criterion. The main contribution of this article is to develop such a criterion.  ...  Finally, I presented a complete feature reinforcement learning algorithm ΦMDP-Agent().  ... 
doi:10.2478/v10229-011-0002-8 fatcat:gqe3pv22xjclrjij5xapwwc2zi

Making Linear MDPs Practical via Contrastive Representation Learning

Tianjun Zhang, Tongzheng Ren, Mengjiao Yang, Joseph Gonzalez, Dale Schuurmans, Bo Dai
2022 International Conference on Machine Learning  
However, most approaches require a given representation under unrealistic assumptions about the normalization of the decomposition or introduce unresolved computational challenges in practice.  ...  Instead, we consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning via contrastive estimation.  ...  In this paper, we address the lingering computational efficiency issues in representation learning for linear MDPs.  ... 
dblp:conf/icml/ZhangRYGSD22 fatcat:6qzyykjwazgjjatspss2cqjnlq

A Physics-Based Model Prior for Object-Oriented MDPs

Jonathan Scholz, Martin Levihn, Charles Lee Isbell Jr., David Wingate
2014 International Conference on Machine Learning  
One of the key challenges in using reinforcement learning in robotics is the need for models that capture natural world structure.  ...  We present a physics-based approach that exploits modern simulation tools to efficiently parameterize physical dynamics.  ...  We hope that this approach helps to close the representational gap between the sorts of models used in Reinforcement Learning and the models that robotics engineers use in practice.  ... 
dblp:conf/icml/ScholzLIW14 fatcat:mpr6zylcgfailitehfpgs73rti

Lifted Model Checking for Relational MDPs [article]

Wen-Chi Yang, Jean-François Raskin, Luc De Raedt
2022 arXiv   pre-print
It extends REBEL, a relational model-based reinforcement learning technique, toward relational pCTL model checking.  ...  On the other hand, it is commonly required to make relational abstractions in planning and reinforcement learning.  ...  There has been a significant interest in such relational representations in reinforcement learning and planning.  ... 
arXiv:2106.11735v2 fatcat:rhz4q7otqjao3o7uwiitwkyq4m

Feature Reinforcement Learning: Part II. Structured MDPs

Marcus Hutter
2021 Journal of Artificial General Intelligence  
I discuss all building blocks required for a complete general learning algorithm, and compare the novel ΦDBN model to the prevalent POMDP approach.  ...  The Feature Markov Decision Processes (ΦMDPs) model developed in Part I (Hutter, 2009b) is well-suited for learning agents in general environments.  ...  Relation between POMDP and ΦMDP/DBN. In the following I compare the prevalent POMDP approach to our novel ΦMDP approach.  ... 
doi:10.2478/jagi-2021-0003 fatcat:42r47alhf5d5vj5xugjrycqwb4

DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs [article]

Aayam Shrestha, Stefan Lee, Prasad Tadepalli, Alan Fern
2020 arXiv   pre-print
We study an approach to offline reinforcement learning (RL) based on optimally solving finitely-represented MDPs derived from a static dataset of experience.  ...  DAC-MDPs are a non-parametric model that can leverage deep representations and account for limited data by introducing costs for exploiting under-represented parts of the model.  ...  On the other hand, model-based reinforcement learning (MBRL) aims to learn grounded models to improve RL's data efficiency.  ... 
arXiv:2010.08891v1 fatcat:5rwfx7z2sncs7gfudsc3lnc5t4
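The DAC-MDP construction in this entry derives a finite MDP from a static dataset and "introduces costs for exploiting under-represented parts of the model." A toy sketch of that idea, with empirical mean rewards penalized by visitation count (the penalty form, cost proportional to 1/√count, and all data here are illustrative, not the paper's exact construction):

```python
from collections import Counter, defaultdict

# Hypothetical static dataset of (state, action, reward, next_state).
dataset = [
    (0, 0, 1.0, 1), (0, 0, 1.0, 1), (0, 1, 5.0, 2),
    (1, 0, 0.0, 0), (2, 0, 0.0, 0),
]

counts = Counter((s, a) for s, a, _, _ in dataset)
rewards = defaultdict(float)
for s, a, r, _ in dataset:
    rewards[(s, a)] += r / counts[(s, a)]  # empirical mean reward

# Pessimistic reward: subtract a cost that grows as data thins out,
# discouraging the planner from exploiting rarely visited pairs.
penalty = 2.0
pess = {sa: r - penalty / counts[sa] ** 0.5 for sa, r in rewards.items()}
```

Here the once-visited pair (0, 1) incurs the full penalty of 2.0, while the twice-visited (0, 0) is penalized only by 2/√2, illustrating how costs concentrate on under-represented parts of the derived model.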

Model-free Representation Learning and Exploration in Low-rank MDPs [article]

Aditya Modi, Jinglin Chen, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal
2022 arXiv   pre-print
The low rank MDP has emerged as an important model for studying representation learning and exploration in reinforcement learning.  ...  In this work, we present the first model-free representation learning algorithms for low rank MDPs.  ...  Acknowledgements Part of this work was done while AM was at University of Michigan and was supported in part by a grant from the Open Philanthropy Project to the Center for Human-Compatible AI, and in  ... 
arXiv:2102.07035v2 fatcat:dizamv2qazarxggnttuwhr6pwu
Showing results 1–15 of 2,432.