Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning
[article]
2021
arXiv
pre-print
Complex, multi-task problems have proven to be difficult to solve efficiently in a sparse-reward reinforcement learning setting. ...
To facilitate the automatic decomposition of hierarchical tasks, we propose the use of step-by-step human demonstrations in the form of natural language instructions and action trajectories. ...
Approaches that aim to learn new tasks from humans must be able to use human-generated instructions. ...
arXiv:2011.00517v3
fatcat:uhgnxm3czfcazptknw5iw4lnk4
Ask me in your own words: paraphrasing for multitask question answering
2021
PeerJ Computer Science
Multitask learning has led to significant advances in Natural Language Processing, including the decaNLP benchmark where question answering is used to frame 10 natural language understanding tasks in a ...
We explore the addition of paraphrase detection and paraphrase generation tasks, and find that while both models are able to learn these new tasks, knowledge about paraphrasing does not transfer to other ...
Dong et al. (2017) perform this via back-translation, while Buck et al. (2018) explore an approach based on an agent which has been trained using reinforcement learning to reformulate the input to maximise ...
doi:10.7717/peerj-cs.759
pmid:34805510
pmcid:PMC8576550
fatcat:grwv562m7napjji22ti6cuts2q
Successful young adults are asked – 'In your experience, what builds confidence?'
2016
Aotearoa New Zealand Social Work
This article summarises the findings from Karen Fagan's research (Successful young adults are asked – In your experience, what builds confidence? ...
who you are (inside and out), and being able to portray who you are to others'. ...
The research design was approved by the Massey University Ethics Committee (09/48) in July 2009. A full pdf of this research can be located at: http://muir.massey.ac.nz/handle/10179/568/browse? ...
doi:10.11157/anzswj-vol24iss2id130
fatcat:enzqiubyefatxbhrjnzvldtplm
Human Resource Development
1982
American Journal of Agricultural Economics
This study investigates differences in instructional and learner factors between two groups of learners exposed to online only and blended delivery formats, respectively, in an effort to compare learning ...
Findings indicated that no significant differences existed in learning outcomes; however, significant differences existed in several instructional and learner factors between the two delivery format groups ...
The open-ended part of the questionnaire also asked the learners' satisfaction with instructional factors such as instructor, learning activities, group work, learning support, and suggestions to improve ...
doi:10.1093/ajae/64.2.410
fatcat:ymctufu5qrbgrnpaqy4tv72jje
Decision making by humans in a behavioral task: Do humans, like pigeons, show suboptimal choice?
2012
Learning & Behavior
In the present research, we asked whether humans would show suboptimal choice on a task involving choices with probabilities similar to those for pigeons. ...
In Experiment 2, we found that when the inhibiting abilities of typical humans were impaired by a self-regulatory depletion manipulation, they were more likely to choose the gambling-like alternative. ...
It is your job to destroy the enemy spaceships by using your laser-gun (the mouse). To shoot, click the left button of the mouse while the pointer is on the target. ...
doi:10.3758/s13420-012-0065-7
pmid:22328280
fatcat:diodkjcn4fc75ifshlbgt5o6nm
Human Resources Management
[chapter]
2010
Managing Complex Projects
Assessment - General. Assessment is a part of the learning process and is designed to reinforce the course material and take you beyond. ...
practice. • To promote excellence in practice through improved management knowledge, complementary to technical knowledge. and relate to: • Increasing your awareness of the significance people have in ...
The groups and topics will be posted in Blackboard at the start of session. Some ground rules: 1. Contact, as soon as possible, other members in your group by using their z email addresses. ...
doi:10.1002/9780470927977.ch6
fatcat:gcfwofemevdpxazcuziofs5vye
Dialogue Learning With Human-In-The-Loop
[article]
2017
arXiv
pre-print
In this paper we explore this direction in a reinforcement learning setting where the bot improves its question-answering ability from feedback a teacher gives following its generated responses. ...
An important aspect of developing conversational agents is to give a bot the ability to improve through communicating with humans and to learn from the mistakes that it makes. ...
REINFORCEMENT LEARNING In this section, we present the algorithms we used to train MemN2N in an online fashion. Our learning setup can be cast as a particular form of Reinforcement Learning. ...
arXiv:1611.09823v3
fatcat:jzmspwwlfbdchn6kwxmlch7hy4
Deep reinforcement learning from human preferences
[article]
2017
arXiv
pre-print
For sophisticated reinforcement learning (RL) systems to interact usefully with real-world environments, we need to communicate complex goals to these systems. ...
In this work, we explore goals defined in terms of (non-expert) human preferences between pairs of trajectory segments. ...
We thank Tyler Adkisson, Mandy Beri, Jessica Richards, Heather Tran, and other contractors for providing the data used to train our agents. ...
arXiv:1706.03741v3
fatcat:b2phuyaq7fay7chweuqdkbo4ae
Human Instruction-Following with Deep Reinforcement Learning via Transfer-Learning from Text
[article]
2020
arXiv
pre-print
Recent work has described neural-network-based agents that are trained with reinforcement learning (RL) to execute language-like commands in simulated worlds, as a step towards an intelligent agent or ...
Here, we propose a conceptually simple method for training instruction-following agents with deep RL that are robust to natural human instructions. ...
If that's the case, just make your best guess.
Full human instructions This is a task called Ask to put. You will find yourself in a room. ...
arXiv:2005.09382v1
fatcat:ftss3t5a3nbvlcmtiemgbqxa6u
Human Patient Simulation in a Pharmacotherapy Course
2008
American Journal of Pharmaceutical Education
Human patient simulation provided a unique opportunity for students to apply what they learned and allowed them to practice problem-solving skills. ...
Students showed improvement in knowledge and ability to resolve patient treatment problems, as well as in self-confidence. ...
a Pharmacotherapy Courses Using Human Patient Simulation (N = 89). Table columns: Question; Student Response (Pre-simulation, Post-simulation); P. How confident are you in your ability to interpret a basic electrocardiogram ...
doi:10.5688/aj720237
pmid:18483603
pmcid:PMC2384212
fatcat:52dssztl6falvg26whobdmnxvu
Abolishing the effect of reinforcement delay on human causal learning
2004
Quarterly Journal of Experimental Psychology Section B: Comparative and Physiological Psychology
Associative learning theory postulates two main determinants for human causal learning: contingency and contiguity. ...
Temporal contiguity is thus not essential for human causal learning. ...
GENERAL DISCUSSION Associative learning, stimulus saliency, and prior knowledge In this paper we pitched two theoretical approaches to human causal learning against each other: associationism and causal ...
doi:10.1080/02724990344000123
pmid:15204115
fatcat:hu7x6batbzcy7fvzkc4r7cubhq
The MineRL BASALT Competition on Learning from Human Feedback
[article]
2021
arXiv
pre-print
While multiple solutions have been proposed, in this competition we focus on one in particular: learning from human feedback. ...
To help participants get started, we provide a dataset of human demonstrations on each of the four tasks, as well as an imitation learning baseline that leverages these demonstrations. ...
He is interested in learning from human feedback, and hopes that this competition improves the efficacy of such methods. ...
arXiv:2107.01969v1
fatcat:jy7epmcm2zeibgucqbwg2ze5qq
Learning to summarize from human feedback
[article]
2022
arXiv
pre-print
policy using reinforcement learning. ...
In this work, we show that it is possible to significantly improve summary quality by training a model to optimize for human preferences. ...
Acknowledgements We'd like to thank Beth Barnes for help with labeler hiring and general encouragement; Geoffrey Irving for guidance on earlier iterations of the project and inspiring conversations; Ben ...
arXiv:2009.01325v3
fatcat:bppzwov6gzamff3h7pyeeprlfe
Reinforcement Learning with Human Teachers: Understanding How People Want to Teach Robots
2006
ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication
While Reinforcement Learning (RL) is not traditionally designed for interactive supervisory input from a human teacher, several works in both robot and software agents have adapted it for human input by ...
We report three main observations on how people administer feedback when teaching a robot a task through Reinforcement Learning: (a) they use the reward channel not only for feedback, but also for future-directed ...
If you click anywhere else, Sophie assumes your feedback pertains to everything in general. ...
doi:10.1109/roman.2006.314459
dblp:conf/ro-man/ThomazHB06
fatcat:hmssozyf7rdk7ghuzzi6vaiu34
Students' Perception of Cell Phones in the Classroom
2017
International Journal of Humanities Social Sciences and Education
Katz & Lambert (2016) studied the use of positive reinforcement, in the form of extra credit, to encourage students not to use their phones during class. ...
There are multitudes of apps and programs that allow curriculum materials to be delivered digitally and across platforms that can differentiate instruction and learning. ...
doi:10.20431/2349-0381.0411016
fatcat:hbvy4gxsenhizbrbsfwiollwem