612 Hits in 3.6 sec

Efficient Empowerment Estimation for Unsupervised Stabilization [article]

Ruihan Zhao, Kevin Lu, Pieter Abbeel, Stas Tiomkin
2021 arXiv   pre-print
We demonstrate our solution for sample-based unsupervised stabilization on different dynamical control systems and show the advantages of our method by comparing it to the existing VLB approaches.  ...  In this work, we propose an alternative solution based on a trainable representation of a dynamical system as a Gaussian channel, which allows us to efficiently calculate an unbiased estimator of empowerment  ...  Then, we demonstrate the advantages of our method in terms of sample efficiency and stability.  ... 
arXiv:2007.07356v2 fatcat:xnplmrqqq5ftraedme3dakolcm
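The snippet above describes representing one-step dynamics as a Gaussian channel so that empowerment can be computed in closed form. As a hedged illustration (not the paper's learned-representation pipeline), for a linear Gaussian channel s' = A a + ε with ε ~ N(0, σ²I) and input a ~ N(0, Σ), the mutual information has the standard log-det form:

```python
import numpy as np

def gaussian_channel_capacity(A, input_cov, noise_var):
    """Mutual information (in nats) of the linear Gaussian channel
    s' = A @ a + eps, with a ~ N(0, input_cov) and eps ~ N(0, noise_var * I):

        I(a; s') = 0.5 * log det(I + A @ input_cov @ A.T / noise_var)
    """
    n = A.shape[0]
    sign, logdet = np.linalg.slogdet(
        np.eye(n) + A @ input_cov @ A.T / noise_var
    )
    return 0.5 * logdet

# identity channel, unit input covariance, unit noise:
# 0.5 * log det(2 * I_2) = log 2 nats
c = gaussian_channel_capacity(np.eye(2), np.eye(2), 1.0)
```

Empowerment is then the maximum of this quantity over the input covariance Σ subject to a power constraint; the matrix A here stands in for whatever (possibly learned) linearization of the dynamics is available.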

Efficient Empowerment [article]

Maximilian Karl, Justin Bayer, Patrick van der Smagt
2015 arXiv   pre-print
In this work, we propose to use an efficient approximation for marginalising out the actions in the case of continuous environments.  ...  This is a natural candidate for an intrinsic reward signal in the context of reinforcement learning: the agent will place itself in a situation where its actions have maximum stability and maximum influence  ...  This value can be used in reinforcement learning as the reward function and serves as an unsupervised type of control which moves the robot towards states with high stability and maximal influence.  ... 

arXiv:1509.08455v1 fatcat:fey74hyn7jb4hpwu36jjzmqagq

Learning Efficient Representation for Intrinsic Motivation [article]

Ruihan Zhao, Stas Tiomkin, Pieter Abbeel
2020 arXiv   pre-print
In this work, we develop a novel approach for the estimation of empowerment in unknown dynamics from visual observation only, without the need to sample for MIAS.  ...  This allows us to efficiently compute empowerment with the "Water-Filling" algorithm from information theory.  ...  In this work, we introduce a new method for efficient estimation of a certain type of information-theoretic intrinsic motivation, known as empowerment.  ... 
arXiv:1912.02624v3 fatcat:otkkjcmvpzgebhtm6f3g237ef4
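This entry mentions computing empowerment with the "Water-Filling" algorithm from information theory. As a generic sketch of that algorithm (assuming known per-channel noise levels and a total power budget, not the paper's learned visual representation), water-filling allocates power across parallel Gaussian subchannels by bisecting on the water level:

```python
import numpy as np

def water_filling(noise, total_power):
    """Water-filling power allocation for parallel Gaussian channels.

    Bisects on the water level mu so that sum(max(0, mu - noise_i)) equals
    total_power, then returns the per-channel powers and the resulting
    capacity in bits.
    """
    noise = np.asarray(noise, dtype=float)
    lo, hi = noise.min(), noise.max() + total_power
    for _ in range(100):  # bisection on the water level
        mu = 0.5 * (lo + hi)
        power = np.maximum(0.0, mu - noise)
        if power.sum() > total_power:
            hi = mu
        else:
            lo = mu
    capacity = 0.5 * np.sum(np.log2(1.0 + power / noise))
    return power, capacity

# two equal-noise channels share the power evenly:
# capacity = 2 * 0.5 * log2(1 + 1/1) = 1 bit
p, c = water_filling([1.0, 1.0], total_power=2.0)
```

The noise levels would typically come from a singular-value decomposition of the (learned) channel matrix; channels with noise above the water level receive zero power.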


Sammar Z. Allam
2021 Zenodo  
Demonstrating an artificial intelligence (AI) algorithm to predict energy consumption can reflect the amount of energy required for storage to cover energy needs.  ...  Methodology: this research methodology revolves around three main aspects through which an energy-efficient district stabilizes, including distributed stational technology, renewable power plants, and the Internet of Things  ... 
Type          Classifier    Accuracy (%)   Execution (s)
Supervised    SVM           97             0.010
Supervised    ANN           96             13.000
Unsupervised  Mean Shift    94             0.013
Unsupervised  Silhouette    98             0.012
Unsupervised  K-Means       ...
doi:10.5281/zenodo.4662691 fatcat:xoelwglowfachbovpwhvgjc744

Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning

Jongwook Choi, Archit Sharma, Honglak Lee, Sergey Levine, Shixiang Shane Gu
2021 International Conference on Machine Learning  
state representations for goal reaching.  ...  comparing empowerment algorithms with different choices of latent dimensionality and discriminator parameterization.  ...  JC was partly supported by Korea Foundation for Advanced Studies.  ... 
dblp:conf/icml/ChoiSLLG21 fatcat:clrxiv5zmvgxxoka3thtb6xzqm

Reset-Free Lifelong Learning with Skill-Space Planning [article]

Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
2021 arXiv   pre-print
We learn the skills in an unsupervised manner using intrinsic rewards and plan over the learned skills using a learned dynamics model.  ...  Moreover, our framework permits skill discovery even from offline data, thereby reducing the need for excessive real-world interactions.  ...  ACKNOWLEDGEMENTS We would like to thank Archit Sharma for advice on implementing DADS. REFERENCES Joshua Achiam, Harrison Edwards, Dario Amodei, and Pieter Abbeel. Variational option discovery  ... 
arXiv:2012.03548v3 fatcat:ri3ayz6kyvfonhc4vqrripg7si

Active Hierarchical Exploration with Stable Subgoal Representation Learning [article]

Siyuan Li, Jin Zhang, Jianhao Wang, Yang Yu, Chongjie Zhang
2022 arXiv   pre-print
In this paper, we propose a novel regularization that contributes to both stable and efficient subgoal representation learning.  ...  Building upon the stable representation, we design measures of novelty and potential for subgoals, and develop an active hierarchical exploration strategy that seeks out new promising subgoals and states  ...  The stability regularization enables us to use prioritized sampling (Hinton, 2007) to improve representation learning efficiency without hurting its stability.  ... 
arXiv:2105.14750v3 fatcat:6u5ijsizizaipoq6ay2a567o4e

Braxlines: Fast and Interactive Toolkit for RL-driven Behavior Engineering beyond Reward Maximization [article]

Shixiang Shane Gu, Manfred Diaz, Daniel C. Freeman, Hiroki Furuta, Seyed Kamyar Seyed Ghasemipour, Anton Raichuk, Byron David, Erik Frey, Erwin Coumans, Olivier Bachem
2021 arXiv   pre-print
In reinforcement learning (RL)-driven approaches, this is often accomplished through careful task reward engineering for efficient exploration and running an off-the-shelf RL algorithm.  ...  control environments, and set of stable and well-tested baselines for two families of algorithms -- mutual information maximization (MiMax) and divergence minimization (DMin) -- supporting unsupervised  ...  [45] to estimate empowerment of a trained policy. It uses straightforward discretization for tractable non-parametric estimation in 1D and 2D.  ... 
arXiv:2110.04686v1 fatcat:iq3hr4d56jbbllvgdlcutesoci
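The snippet above mentions straightforward discretization for tractable non-parametric estimation in 1D and 2D. A minimal sketch of that idea (a generic plug-in histogram estimator of mutual information, assumed here for illustration rather than taken from the Braxlines codebase):

```python
import numpy as np

def mi_histogram(x, y, bins=16):
    """Plug-in mutual information estimate (bits) from a 2D histogram.

    Discretizes the samples into a bins x bins grid, normalizes to a joint
    distribution p(x, y), and sums p * log2(p / (p(x) * p(y))) over the
    non-empty cells.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal over rows
    py = p.sum(axis=0, keepdims=True)   # marginal over columns
    nz = p > 0
    return np.sum(p[nz] * np.log2(p[nz] / (px @ py)[nz]))

# y is a deterministic function of x, so the estimate equals the entropy
# of the discretized x: log2(4) = 2 bits for 4 evenly filled bins
x = np.arange(16.0)
mi = mi_histogram(x, x, bins=4)
```

Like all plug-in estimators, this is biased for small sample sizes, which is why it is practical mainly in 1D and 2D as the snippet notes.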

Diversity is All You Need: Learning Skills without a Reward Function [article]

Benjamin Eysenbach, Abhishek Gupta, Julian Ibarz, Sergey Levine
2018 arXiv   pre-print
Our results suggest that unsupervised discovery of skills can serve as an effective pretraining mechanism for overcoming challenges of exploration and data efficiency in reinforcement learning.  ...  On a variety of simulated robotic tasks, we show that this simple objective results in the unsupervised emergence of diverse skills, such as walking and jumping.  ...  Akin to the use of pre-trained models in computer vision, we propose that DIAYN can serve as unsupervised pre-training for more sample-efficient finetuning of task-specific policies. Question 5.  ... 
arXiv:1802.06070v6 fatcat:giahsx3wjbhkteblz75rsidnei
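DIAYN's unsupervised skill discovery trains a discriminator q(z|s) to infer the active skill z from the visited state, and rewards the policy for being identifiable. A hedged sketch of the resulting intrinsic reward, assuming raw discriminator logits and the uniform skill prior used in the paper (the function name is illustrative):

```python
import numpy as np

def diayn_reward(logits, skill, num_skills):
    """DIAYN-style intrinsic reward r = log q(z|s) - log p(z), with a
    uniform prior p(z) = 1 / num_skills over skills.

    `logits` are the discriminator's unnormalized scores for the current
    state; `skill` is the index of the skill the agent is executing.
    """
    logits = np.asarray(logits, dtype=float)
    log_q = logits - np.log(np.sum(np.exp(logits - logits.max()))) - logits.max()
    return log_q[skill] - np.log(1.0 / num_skills)

# an uninformative discriminator (equal logits) yields zero reward;
# a confident, correct discriminator yields a positive reward
r_flat = diayn_reward(np.zeros(4), skill=0, num_skills=4)
r_conf = diayn_reward(np.array([10.0, 0.0, 0.0, 0.0]), skill=0, num_skills=4)
```

The reward is positive exactly when the discriminator assigns the executed skill more probability than the prior, which pushes skills toward visiting distinguishable states.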

Changing the Environment Based on Empowerment as Intrinsic Motivation

Christoph Salge, Cornelius Glackin, Daniel Polani
2014 Entropy  
For this purpose, we introduce an approximation of the established empowerment formalism based on sparse sampling, which is simpler and significantly faster to compute for deterministic dynamics.  ...  In this paper we investigate how the information-theoretic measure of agent empowerment can provide a task-independent, intrinsic motivation to restructure the world.  ...  Acknowledgments This research was supported by the European Commission as part of the CORBYS (Cognitive Control Framework for Robotic Systems) project under contract FP7 ICT-270219 (  ... 
doi:10.3390/e16052789 fatcat:74agankv5fd4ndd2uczurvgn44

A Mobile App to Stabilize Daily Functional Activity of Breast Cancer Patients in Collaboration With the Physician: A Randomized Controlled Clinical Trial

Marco Egbring, Elmira Far, Malgorzata Roos, Michael Dietrich, Mathis Brauchbar, Gerd A Kullak-Ublick, Andreas Trojan
2016 Journal of Medical Internet Research  
Conclusions: The mobile app was associated with stabilized daily functional activity when used under collaborative review.  ...  Patient status was self-measured using Eastern Cooperative Oncology Group scoring and Common Terminology Criteria for Adverse Events.  ...  A mean of 80% (SD 13%) was estimated by reviewing medical records of randomly selected patients at the Breast-Center Zürich.  ... 
doi:10.2196/jmir.6414 pmid:27601354 pmcid:PMC5030453 fatcat:njyizpr5jratzelonr6lljfvc4

Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning [article]

Silviu Pitis, Harris Chan, Stephen Zhao, Bradly Stadie, Jimmy Ba
2020 arXiv   pre-print
We show that our strategy achieves an order of magnitude better sample efficiency than the prior state of the art on long-horizon multi-goal tasks including maze navigation and block stacking.  ...  Zhang and the anonymous reviewers for their helpful comments.  ...  In FetchStack2 we see that OMEGA's eventual focus on the desired goal distribution is necessary for long run stability.  ... 
arXiv:2007.02832v1 fatcat:i75hyfiownchvbfp5bdjzgavky

Hierarchical intrinsically motivated agent planning behavior with dreaming in grid environments

Evgenii Dzhivelikian, Artem Latyshev, Petr Kuderov, Aleksandr I. Panov
2022 Brain Informatics  
Abstract: Biologically plausible models of learning may provide a crucial insight for building autonomous intelligent agents capable of performing a wide range of tasks.  ...  We found that our proposed algorithm for the empowerment estimate cannot handle the case when from a single state different actions lead to itself, which is typical for corner positions.  ...  Empowerment is a utility function that estimates the agent's capability to influence the environment from a specified state.  ... 
doi:10.1186/s40708-022-00156-6 pmid:35366128 pmcid:PMC8976870 fatcat:2dwmi5cnszbapbcjypmrvnlb4a
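The snippet above notes that empowerment estimates break down when several actions from one state lead back to that state, as in grid corners. A toy illustration of why (not the paper's algorithm): for deterministic dynamics, one-step empowerment is log2 of the number of *distinct* successor states, so actions that collapse onto the same state reduce it. The grid and move table below are hypothetical:

```python
import math

def one_step_empowerment(state, actions, step):
    """Deterministic one-step empowerment: log2 of the number of distinct
    successor states, i.e. the capacity of the deterministic channel from
    actions to next states."""
    return math.log2(len({step(state, a) for a in actions}))

# toy 3x3 grid world with moves clamped at the walls
MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

def grid_step(state, action):
    dx, dy = MOVES[action]
    x, y = state
    return (min(2, max(0, x + dx)), min(2, max(0, y + dy)))

# centre (1, 1): all 4 moves reach distinct cells -> 2 bits.
# corner (0, 0): S and W both bounce back to (0, 0), leaving only
# 3 distinct successors -> log2(3) ~ 1.585 bits.
e_centre = one_step_empowerment((1, 1), MOVES, grid_step)
e_corner = one_step_empowerment((0, 0), MOVES, grid_step)
```

This is exactly the self-transition case the snippet describes: multiple actions mapping a corner state to itself lower the count of distinct successors, which some sample-based estimators handle poorly.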

Deep Intrinsically Motivated Continuous Actor-Critic for Efficient Robotic Visuomotor Skill Learning

Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter
2019 Paladyn: Journal of Behavioral Robotics  
Our approach is more data-efficient and inherently more stable than the existing actor-critic methods for continuous control from pixel data.  ...  Both networks receive the hidden representation of a deep convolutional autoencoder which is trained to reconstruct the visual input, while the centre-most hidden representation is also optimized to estimate  ...  Other studies demonstrate that unsupervised learning from self-generated reward leads to efficient exploration.  ... 
doi:10.1515/pjbr-2019-0005 fatcat:vngwa4ig7bamnpbchzp43sv7cm

Unsupervised Curricula for Visual Meta-Reinforcement Learning [article]

Allan Jabri, Kyle Hsu, Ben Eysenbach, Abhishek Gupta, Sergey Levine, Chelsea Finn
2019 arXiv   pre-print
functions and serves as pre-training for more efficient supervised meta-learning of test task distributions.  ...  We develop an unsupervised algorithm for inducing an adaptive meta-training task distribution, i.e. an automatic curriculum, by modeling unsupervised interaction in a visual environment.  ...  unsupervised meta-learning that transfers to downstream tasks specified by hand-crafted reward functions and serves as pre-training for more efficient  ... 
arXiv:1912.04226v1 fatcat:cer4sn5jm5ezrfmcvsvfs3t6ii
Showing results 1–15 out of 612 results