
Active Advice Seeking for Inverse Reinforcement Learning

Phillip Odom, Sriraam Natarajan
We consider the problem of actively soliciting human advice in an inverse reinforcement learning setting where the utilities are learned from demonstrations.  ...  While the active learning formalism allows for these systems to incrementally acquire demonstrations from the human expert, most learning systems require all the advice about the domain in advance.  ...  Active Advice Seeking The main difficulty for active advice seeking is deciding the parts of the state/action space where advice would be most beneficial.  ... 
doi:10.1609/aaai.v29i1.9722 fatcat:6noqv7c4yrgjxnjedjrqywt36m

Object Affordance Driven Inverse Reinforcement Learning Through Conceptual Abstraction and Advice

Rupam Bhattacharyya, Shyamanta M. Hazarika
2018 Paladyn: Journal of Behavioral Robotics  
An architecture for recognizing human intent is presented which consists of an extended Maximum Likelihood Inverse Reinforcement Learning agent.  ...  Within human Intent Recognition (IR), a popular approach to learning from demonstration is Inverse Reinforcement Learning (IRL).  ...  Acknowledgement: The authors would also like to thank Zubin Bhuyan, University of Massachusetts Lowell, for the discussion regarding IRL.  ... 
doi:10.1515/pjbr-2018-0021 fatcat:m6a2fm5ja5elnp25prgy6mnjp4

Explore, Exploit or Listen: Combining Human Feedback and Policy Model to Speed up Deep Reinforcement Learning in 3D Worlds [article]

Zhiyu Lin, Brent Harrison, Aaron Keech, Mark O. Riedl
2021 arXiv   pre-print
We describe a method to use discrete human feedback to enhance the performance of deep learning agents in virtual three-dimensional environments by extending deep-reinforcement learning to model the confidence  ...  This enables deep reinforcement learning algorithms to determine the most appropriate time to listen to the human feedback, exploit the current policy model, or explore the agent's environment.  ...  Inverse reinforcement learning (Abbeel and Ng 2004; El Asri et al. 2016), on the other hand, seeks to directly engineer a reward function based on examples of optimal behavior provided by human trainers  ... 
arXiv:1709.03969v2 fatcat:inras67hrzdbrbdzbdrmrcpety

A Structural Equation Model of Achievement Emotions, Coping Strategies and Engagement-Burnout in Undergraduate Students: A Possible Underlying Mechanism in Facets of Perfectionism

Jesús de la Fuente, Francisca La Hortiga-Ramos, Carmen Laspra-Solís, Cristina Maestro-Martín, Irene Alustiza, Enrique Aubá, Raquel Martín-Lanas
2020 International Journal of Environmental Research and Public Health  
Achievement emotions that the university student experiences in the learning process can be significant in facilitating or interfering with learning.  ...  The present research looked for linear and predictive relations between university students' achievement emotions, coping strategies, and engagement-burnout, in three different learning situations (classroom  ...  Seeking alternative reinforcements 0.80).  ... 
doi:10.3390/ijerph17062106 pmid:32235741 pmcid:PMC7143652 fatcat:zc4l5rxrozgdnlabv324vsuto4

Preference Learning in Assistive Robotics: Observational Repeated Inverse Reinforcement Learning

Bryce Woodworth, Francesco Ferrari, Teofilo E. Zosa, Laurel D. Riek
2018 Machine Learning in Health Care  
As robots become more affordable and more common in everyday life, there will be an ever-increasing demand for adaptive behavior that is personalized to the individual needs of users.  ...  To accomplish this, robots will need to learn about their users' unique preferences through interaction.  ...  Narboneta for designing the study robot's interface layout.  ... 
dblp:conf/mlhc/WoodworthFZR18 fatcat:f3jraxywsze55b24uetgl4r6lq

Generative Adversarial Imitation Learning for Empathy-based AI [article]

Pratyush Muthukumar, Karishma Muthukumar, Deepan Muthirayan, Pramod Khargonekar
2021 arXiv   pre-print
Our novel GAIL model utilizes a sentiment analysis history-based reinforcement learning approach to empathetically respond to human interactions in a personalized manner.  ...  In this paper, we utilize the GAIL model for text generation to develop empathy-based context-aware conversational AI.  ...  Note that ψ-regularized inverse reinforcement learning implicitly seeks a policy where its occupancy measure ρ is close to the expert.  ... 
arXiv:2105.13328v1 fatcat:agbxrbl4rbhrhbz4zsgkll3w2q

Guiding Autonomous Agents to Better Behaviors through Human Advice

Gautam Kunapuli, Phillip Odom, Jude W. Shavlik, Sriraam Natarajan
2013 2013 IEEE 13th International Conference on Data Mining  
Inverse Reinforcement Learning (IRL) is an approach for domain-reward discovery from demonstration, where an agent mines the reward function of a Markov decision process by observing an expert acting in  ...  In the standard setting, it is assumed that the expert acts (nearly) optimally, and a large number of trajectories, i.e., training examples are available for reward discovery (and consequently, learning  ...  CONCLUSIONS AND FUTURE WORK We propose a novel methodology for incorporating expert advice into the inverse reinforcement learning framework.  ... 
doi:10.1109/icdm.2013.79 dblp:conf/icdm/KunapuliOSN13 fatcat:gbfi5qti55d63csdwybb2rpmuu

Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion

J. Zico Kolter, Pieter Abbeel, Andrew Y. Ng
2007 Neural Information Processing Systems  
In this paper we propose a method for hierarchical apprenticeship learning, which allows the algorithm to accept isolated advice at different hierarchical levels of the control task.  ...  This type of advice is often feasible for experts to give, even if the expert is unable to demonstrate complete trajectories.  ...  Acknowledgements We gratefully acknowledge the anonymous reviewers for helpful suggestions. This work was supported by the DARPA Learning Locomotion program under contract number FA8650-05-C-7261.  ... 
dblp:conf/nips/KolterAN07 fatcat:iejybdrsnnazdmyoz424ciaqw4

Attenuation of cocaine and heroin seeking by μ-opioid receptor antagonism

Chiara Giuliano, Trevor W. Robbins, David R. Wille, Edward T. Bullmore, Barry J. Everitt
2013 Psychopharmacology  
GSK1521498, but not NTX, dose-dependently reduced heroin seeking both before and after infusion of the drug although both increased heroin self-administration under continuous reinforcement.  ...  Objectives The aim of the present experiments was to investigate the effects of a novel selective μ-opioid receptor antagonist GSK1521498 on cocaine and heroin seeking and the primary reinforcement of  ...  The authors thank Daina Economidou for advice and assistance on the experiments; Kristin Patterson and Ramprakash Govindarajan for providing the GSK1521498 solutions; and members of the GSK1521498 project  ... 
doi:10.1007/s00213-012-2949-9 pmid:23299095 pmcid:PMC3622002 fatcat:lhrkutnnarevhe625nyvhsaqya

Reinforcement learning across development: What insights can we draw from a decade of research?

Kate Nussenbaum, Catherine A. Hartley
2019 Developmental Cognitive Neuroscience  
The past decade has seen the emergence of the use of reinforcement learning models to study developmental change in value-based learning.  ...  We focus specifically on learning rates and inverse temperature parameter estimates, and find evidence that from childhood to adulthood, individuals become better at optimally weighting recent outcomes  ...  Across reinforcement learning studies, developmental change in the inverse temperature parameter follows a somewhat consistent pattern.  ... 
doi:10.1016/j.dcn.2019.100733 pmid:31770715 pmcid:PMC6974916 fatcat:wj3w3xzypvh3thj23heurzlfzq

Hierarchical prediction errors in midbrain and septum during social learning

Andreea O. Diaconescu, Christoph Mathys, Lilian A. E. Weber, Lars Kasper, Jan Mauer, Klaas E. Stephan
2017 Social Cognitive and Affective Neuroscience  
Low-level prediction errors (PEs) about advice accuracy not only activated regions known to support 'theory of mind', but also the dopaminergic midbrain.  ...  These findings, replicated in both samples, have important implications: They suggest that social learning rests on hierarchically related PEs encoded by midbrain and septum activity, respectively, in  ...  Acknowledgements We are grateful for support by the UZH Forschungskredit (AOD), the René and Susanne Braginsky Foundation (KES), the University of Zurich (KES) and the UZH Clinical Research Priority Program  ... 
doi:10.1093/scan/nsw171 pmid:28119508 pmcid:PMC5390746 fatcat:4qoodxpkdfeo7mvndusefq3k4m

Page 718 of Linguistics and Language Behavior Abstracts: LLBA Vol. 23, Issue 2 [page]

1989 Linguistics and Language Behavior Abstracts: LLBA  
revisions; qualitative case study; upperclass college students; 8902622 written revision practices, word processing influence; formal observations; 6th graders; 8902618 Computer-Generated Language Analysis advice-seeking  ...  children; 8902204 computer-assisted story production, formal systems theory application; 8903446 content area reading instruction, computer-assisted activities described; 8902414 English as a second  ... 

Model-Free Risk-Sensitive Reinforcement Learning [article]

Grégoire Delétang, Jordi Grau-Moya, Markus Kunesch, Tim Genewein, Rob Brekelmans, Shane Legg, Pedro A. Ortega
2021 arXiv   pre-print
We extend temporal-difference (TD) learning in order to obtain risk-sensitive, model-free reinforcement learning algorithms.  ...  As a result, one obtains a stochastic approximation rule for estimating the free energy from i.i.d. samples generated by a Gaussian distribution with unknown mean and variance.  ...  Risk-sensitive control has a long history in control theory [Coraluppi, 1997] and is an active area of research within reinforcement learning (RL).  ... 
arXiv:2111.02907v1 fatcat:sw3d2mske5hhtkwbln5wz25xj4

Activating the Informational Capabilities of Information Technology for Organizational Change

Paul M. Leonardi
2007 Organization science (Providence, R.I.)  
Armed with such information, technicians began to seek advice differently than they had before, which led to an overall transformation in the organization's social structure.  ...  I characterize appropriations of a technology's features as a set of practices that activate the informational capabilities of a new technology through advice networks.  ...  Three anonymous reviewers and Senior Editor Ann Majchrzak also offered invaluable advice for improving this manuscript.  ... 
doi:10.1287/orsc.1070.0284 fatcat:onffzwulw5eipoi773dohp4724

How Active Inference Could Help Revolutionise Robotics

Lancelot Da Costa, Pablo Lanillos, Noor Sajid, Karl Friston, Shujhat Khan
2022 Entropy  
In short, active inference leverages the processes thought to underwrite human behaviour to build effective autonomous systems.  ...  In this paper, we explain how active inference—a well-known description of sentient behaviour from neuroscience—can be exploited in robotics.  ...  Acknowledgments: The authors thank Jeroen Infographics for designing the Figure . The authors thank Areeb Mian and Sima Al-Asad for helpful input on a previous version of the manuscript.  ... 
doi:10.3390/e24030361 pmid:35327872 pmcid:PMC8946999 fatcat:qiyc4hul3ffjjmq5buq524mgiu