Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning [article]

Valerie Chen, Abhinav Gupta, Kenneth Marino
2021 arXiv   pre-print
Complex, multi-task problems have proven to be difficult to solve efficiently in a sparse-reward reinforcement learning setting.  ...  To facilitate the automatic decomposition of hierarchical tasks, we propose the use of step-by-step human demonstrations in the form of natural language instructions and action trajectories.  ...  Approaches that aim to learn new tasks from humans must be able to use human-generated instructions.  ... 
arXiv:2011.00517v3 fatcat:uhgnxm3czfcazptknw5iw4lnk4

Ask me in your own words: paraphrasing for multitask question answering

G. Thomas Hudson, Noura Al Moubayed
2021 PeerJ Computer Science  
Multitask learning has led to significant advances in Natural Language Processing, including the decaNLP benchmark where question answering is used to frame 10 natural language understanding tasks in a  ...  We explore the addition of paraphrase detection and paraphrase generation tasks, and find that while both models are able to learn these new tasks, knowledge about paraphrasing does not transfer to other  ...  Dong et al. (2017) perform this via back-translation, while Buck et al. (2018) explore an approach based on an agent which has been trained using reinforcement learning to reformulate the input to maximise  ... 
doi:10.7717/peerj-cs.759 pmid:34805510 pmcid:PMC8576550 fatcat:grwv562m7napjji22ti6cuts2q

Successful young adults are asked – 'In your experience, what builds confidence?'

Karen Fagan, Helen Simmons, Mary Nash
2016 Aotearoa New Zealand Social Work  
This article summarises the findings from Karen Fagan's research (Successful young adults are asked – In your experience, what builds confidence?  ...  who you are (inside and out), and being able to portray who you are to others'.  ...  The research design was approved by the Massey University Ethics Committee (09/48) in July 2009. A full PDF of this research can be located at: http://muir.massey.ac.nz/handle/10179/568/browse?  ... 
doi:10.11157/anzswj-vol24iss2id130 fatcat:enzqiubyefatxbhrjnzvldtplm

Human Resource Development

1982 American Journal of Agricultural Economics  
This study investigates differences in instructional and learner factors between two groups of learners exposed to online only and blended delivery formats, respectively, in an effort to compare learning  ...  Findings indicated that no significant differences existed in learning outcomes; however, significant differences existed in several instructional and learner factors between the two delivery format groups  ...  The open-ended part of the questionnaire also asked the learners' satisfaction with instructional factors such as instructor, learning activities, group work, learning support, and suggestions to improve  ... 
doi:10.1093/ajae/64.2.410 fatcat:ymctufu5qrbgrnpaqy4tv72jje

Decision making by humans in a behavioral task: Do humans, like pigeons, show suboptimal choice?

Mikael Molet, Holly C. Miller, Jennifer R. Laude, Chelsea Kirk, Brandon Manning, Thomas R. Zentall
2012 Learning & Behavior  
In the present research, we asked whether humans would show suboptimal choice on a task involving choices with probabilities similar to those for pigeons.  ...  In Experiment 2, we found that when the inhibiting abilities of typical humans were impaired by a self-regulatory depletion manipulation, they were more likely to choose the gambling-like alternative.  ...  It is your job to destroy the enemy spaceships by using your laser-gun (the mouse). To shoot, click the left button of the mouse while the pointer is on the target.  ... 
doi:10.3758/s13420-012-0065-7 pmid:22328280 fatcat:diodkjcn4fc75ifshlbgt5o6nm

Human Resources Management [chapter]

2010 Managing Complex Projects  
Assessment - General Assessment is a part of the learning process and is designed to reinforce the course material and take you beyond.  ...  practice. • To promote excellence in practice through improved management knowledge, complementary to technical knowledge. and relate to: • Increasing your awareness of the significance people have in  ...  The groups and topics will be posted in Blackboard at the start of session. Some ground rules: 1. Contact, as soon as possible, other members in your group by using their z email addresses.  ... 
doi:10.1002/9780470927977.ch6 fatcat:gcfwofemevdpxazcuziofs5vye

Dialogue Learning With Human-In-The-Loop [article]

Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, Jason Weston
2017 arXiv   pre-print
In this paper we explore this direction in a reinforcement learning setting where the bot improves its question-answering ability from feedback a teacher gives following its generated responses.  ...  An important aspect of developing conversational agents is to give a bot the ability to improve through communicating with humans and to learn from the mistakes that it makes.  ...  REINFORCEMENT LEARNING In this section, we present the algorithms we used to train MemN2N in an online fashion. Our learning setup can be cast as a particular form of Reinforcement Learning.  ... 
arXiv:1611.09823v3 fatcat:jzmspwwlfbdchn6kwxmlch7hy4

Deep reinforcement learning from human preferences [article]

Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei
2017 arXiv   pre-print
For sophisticated reinforcement learning (RL) systems to interact usefully with real-world environments, we need to communicate complex goals to these systems.  ...  In this work, we explore goals defined in terms of (non-expert) human preferences between pairs of trajectory segments.  ...  We thank Tyler Adkisson, Mandy Beri, Jessica Richards, Heather Tran, and other contractors for providing the data used to train our agents.  ... 
arXiv:1706.03741v3 fatcat:b2phuyaq7fay7chweuqdkbo4ae

Human Instruction-Following with Deep Reinforcement Learning via Transfer-Learning from Text [article]

Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
2020 arXiv   pre-print
Recent work has described neural-network-based agents that are trained with reinforcement learning (RL) to execute language-like commands in simulated worlds, as a step towards an intelligent agent or  ...  Here, we propose a conceptually simple method for training instruction-following agents with deep RL that are robust to natural human instructions.  ...  If that's the case, just make your best guess. Full human instructions This is a task called Ask to put. You will find yourself in a room.  ... 
arXiv:2005.09382v1 fatcat:ftss3t5a3nbvlcmtiemgbqxa6u

Human Patient Simulation in a Pharmacotherapy Course

Amy L. Seybert, Lawrence R. Kobulinsky, Teresa P. McKaveney
2008 American Journal of Pharmaceutical Education  
Human patient simulation provided a unique opportunity for students to apply what they learned and allowed them to practice problem-solving skills.  ...  Students showed improvement in knowledge and ability to resolve patient treatment problems, as well as in self-confidence.  ...  a Pharmacotherapy Courses Using Human Patient Simulation (N = 89) Question Student Response a Pre-simulation Post-simulation P How confident are you in your ability to interpret a basic electrocardiogram  ... 
doi:10.5688/aj720237 pmid:18483603 pmcid:PMC2384212 fatcat:52dssztl6falvg26whobdmnxvu

Abolishing the effect of reinforcement delay on human causal learning

Marc J. Buehner, Jon May
2004 Quarterly Journal of Experimental Psychology Section B: Comparative and Physiological Psychology  
Associative learning theory postulates two main determinants for human causal learning: contingency and contiguity.  ...  Temporal contiguity is thus not essential for human causal learning.  ...  GENERAL DISCUSSION Associative learning, stimulus saliency, and prior knowledge In this paper we pitched two theoretical approaches to human causal learning against each other: associationism and causal  ... 
doi:10.1080/02724990344000123 pmid:15204115 fatcat:hu7x6batbzcy7fvzkc4r7cubhq

The MineRL BASALT Competition on Learning from Human Feedback [article]

Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell (+1 others)
2021 arXiv   pre-print
While multiple solutions have been proposed, in this competition we focus on one in particular: learning from human feedback.  ...  To help participants get started, we provide a dataset of human demonstrations on each of the four tasks, as well as an imitation learning baseline that leverages these demonstrations.  ...  He is interested in learning from human feedback, and hopes that this competition improves the efficacy of such methods.  ... 
arXiv:2107.01969v1 fatcat:jy7epmcm2zeibgucqbwg2ze5qq

Learning to summarize from human feedback [article]

Nisan Stiennon, Long Ouyang, Jeff Wu, Daniel M. Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano
2022 arXiv   pre-print
policy using reinforcement learning.  ...  In this work, we show that it is possible to significantly improve summary quality by training a model to optimize for human preferences.  ...  Acknowledgements We'd like to thank Beth Barnes for help with labeler hiring and general encouragement; Geoffrey Irving for guidance on earlier iterations of the project and inspiring conversations; Ben  ... 
arXiv:2009.01325v3 fatcat:bppzwov6gzamff3h7pyeeprlfe

Reinforcement Learning with Human Teachers: Understanding How People Want to Teach Robots

Andrea L. Thomaz, Guy Hoffman, Cynthia Breazeal
2006 ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication  
While Reinforcement Learning (RL) is not traditionally designed for interactive supervisory input from a human teacher, several works in both robot and software agents have adapted it for human input by  ...  We report three main observations on how people administer feedback when teaching a robot a task through Reinforcement Learning: (a) they use the reward channel not only for feedback, but also for future-directed  ...  If you click anywhere else, Sophie assumes your feedback pertains to everything in general.  ... 
doi:10.1109/roman.2006.314459 dblp:conf/ro-man/ThomazHB06 fatcat:hmssozyf7rdk7ghuzzi6vaiu34

Students' Perception of Cell Phones in the Classroom

2017 International Journal of Humanities Social Sciences and Education  
Katz & Lambert (2016) studied the use of positive reinforcement, in the form of extra credit, to promote the use of not using a phone during class.  ...  There are multitudes of apps and programs that allow curriculum materials to be delivered digitally and across platforms that can differentiate instruction and learning.  ... 
doi:10.20431/2349-0381.0411016 fatcat:hbvy4gxsenhizbrbsfwiollwem