
Learning to summarize from human feedback [article]

Nisan Stiennon, Long Ouyang, Jeff Wu, Daniel M. Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano
2022 arXiv   pre-print
We hope the evidence from our paper motivates machine learning researchers to pay closer attention to how their training loss affects the model behavior they actually want. ... We collect a large, high-quality dataset of human comparisons between summaries, train a model to predict the human-preferred summary, and use that model as a reward function to fine-tune a summarization policy using reinforcement learning. ...
arXiv:2009.01325v3 fatcat:bppzwov6gzamff3h7pyeeprlfe
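
The Stiennon et al. entry above describes the now-standard recipe: collect pairwise human comparisons between summaries, fit a reward model to predict the preferred one, then fine-tune the summarization policy against that reward with reinforcement learning. Below is a minimal sketch of only the reward-model stage, using the usual pairwise logistic (Bradley-Terry style) loss; the `RewardModel` class, its encoder, and the tensor names are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (not the paper's code): a reward model trained on pairwise
# human comparisons with a Bradley-Terry style preference loss.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, encoder, hidden_size):
        super().__init__()
        self.encoder = encoder                     # any text encoder returning [batch, hidden]
        self.score_head = nn.Linear(hidden_size, 1)

    def forward(self, input_ids):
        hidden = self.encoder(input_ids)           # assumed to pool to [batch, hidden]
        return self.score_head(hidden).squeeze(-1) # scalar reward per (post, summary)

def preference_loss(reward_model, chosen_ids, rejected_ids):
    """Negative log-likelihood that the human-preferred summary scores higher."""
    r_chosen = reward_model(chosen_ids)
    r_rejected = reward_model(rejected_ids)
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
```

In the full pipeline this scalar reward then drives an RL step (the paper uses PPO) on the summarization policy; that stage is omitted here.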

Make The Most of Prior Data: A Solution for Interactive Text Summarization with Preference Feedback [article]

Duy-Hung Nguyen and Nguyen Viet Dung Nghiem and Bao-Sinh Nguyen and Dung Tien Le and Shahab Sabahi and Minh-Tien Nguyen and Hung Le
2022 arXiv   pre-print
For summarization, human preference is critical for steering the summarizer's outputs toward human interests, as ground-truth summaries are scarce and ambiguous. ... In this paper, we introduce a new framework to train summarization models with preference feedback interactively. ...
arXiv:2204.05512v2 fatcat:4qg5icc2cjcvzlaqoq26j6cftm

Training Language Models with Language Feedback [article]

Jérémy Scheurer, Jon Ander Campos, Jun Shern Chan, Angelica Chen, Kyunghyun Cho, Ethan Perez
2022 arXiv   pre-print
Using only 100 samples of human-written feedback, our learning algorithm fine-tunes a GPT-3 model to roughly human-level summarization ability. ... Here, we propose to learn from natural language feedback, which conveys more information per human evaluation. We learn from language feedback on model outputs using a three-step learning algorithm. ... We follow prior work on learning from human preferences (Stiennon et al., 2020) and learn to summarize Reddit posts from Völske et al. (2017). ...
arXiv:2204.14146v3 fatcat:27w63k7wofchdna7cazqbepxey
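
The snippet above sketches a three-step procedure: condition the model on its own output plus the written feedback to generate refinements, select the refinement that best incorporates the feedback, and fine-tune on the selected refinements. The skeleton below mirrors that loop under those assumptions; `refine`, `score`, and `finetune` are hypothetical caller-supplied callables, not an API from the paper.

```python
def learn_from_language_feedback(examples, refine, score, finetune, n=5):
    """Hedged sketch of the three-step loop described above.

    refine(post, draft, feedback, n) -> list of candidate refinements (e.g. an LM)
    score(candidate, feedback)       -> how well the candidate reflects the feedback
    finetune(pairs)                  -> supervised fine-tuning on (post, refinement) pairs
    All three are placeholders supplied by the caller, not APIs from the paper.
    """
    pairs = []
    for post, draft, feedback in examples:
        candidates = refine(post, draft, feedback, n)               # Step 1: generate refinements
        best = max(candidates, key=lambda c: score(c, feedback))    # Step 2: select the best one
        pairs.append((post, best))
    return finetune(pairs)                                          # Step 3: fine-tune on selections
```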

Putting Humans in the Natural Language Processing Loop: A Survey [article]

Zijie J. Wang, Dongjin Choi, Shenyu Xu, Diyi Yang
2021 arXiv   pre-print
How can we design Natural Language Processing (NLP) systems that learn from human feedback? ... HITL NLP research is nascent but multifarious: solving various NLP problems, collecting diverse feedback from different people, and applying different methods to learn from collected feedback. ... We summarize recent literature on HITL NLP from both NLP and HCI communities, and position each work with respect to its task, goal, human interaction, and feedback learning method. ...
arXiv:2103.04044v1 fatcat:bnwj25lwofcwrnjtvlta64niq4

NetReAct: Interactive Learning for Network Summarization [article]

Sorour E. Amiri, Bijaya Adhikari, John Wenskovitch, Alexander Rodriguez, Michelle Dowling, Chris North, B. Aditya Prakash
2020 arXiv   pre-print
NetReAct incorporates human feedback with reinforcement learning to summarize and visualize document networks. ... How can we use this feedback to improve the network summary quality? ... We proposed a novel and effective network summarization algorithm, NetReAct, which leverages a feedback-based reinforcement learning approach to incorporate human input. ...
arXiv:2012.11821v1 fatcat:m2phbutgsba4npxnnpd7fcjxaa

Evaluation of Unsupervised Learning based Extractive Text Summarization Technique for Large Scale Review and Feedback Data

Jai Prakash Verma, Atul Patel
2017 Indian Journal of Science and Technology  
Background/Objectives: Supervised techniques use human-generated summaries to select features and parameters for summarization. ... Due to the diversity of large-scale datasets, summarization based on supervised techniques also fails to meet the requirements. ...
doi:10.17485/ijst/2017/v10i17/106493 fatcat:qwbaxugabzfclanq4m6ajxc7vy

Improving Factual Consistency of Abstractive Summarization on Customer Feedback [article]

Yang Liu, Yifei Sun, Vincent Gao
2021 arXiv   pre-print
E-commerce stores collect customer feedback to let sellers learn about customer concerns and enhance the customer order experience. ... In this work, we introduce a set of methods to enhance the factual consistency of abstractive summarization on customer feedback. ...
arXiv:2106.16188v1 fatcat:4s7xtd7t7jgw5kzv6h2vxaqomq

Joint Optimization of User-desired Content in Multi-document Summaries by Learning from User Feedback

Avinesh PVS, Christian M. Meyer
2017 Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
In this paper, we propose an extractive multi-document summarization (MDS) system using joint optimization and active learning for content selection grounded in user feedback. ... Our method interactively obtains user feedback to gradually improve the results of a state-of-the-art integer linear programming (ILP) framework for MDS. ...
doi:10.18653/v1/p17-1124 dblp:conf/acl/AvineshM17 fatcat:xgri2evu7vhe5n3sj3imrwi2xa
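
The entry above builds on a concept-based integer linear programming (ILP) formulation for extractive MDS and uses interactive user feedback to steer content selection. As a rough illustration of how such feedback can enter an ILP (not the authors' exact model), the sketch below sets up a Gillick-and-Favre-style coverage objective with the `pulp` package, where per-concept weights would be raised or lowered after each round of feedback before re-solving; the sentences, lengths, concepts, and budget are toy placeholders.

```python
# Illustrative concept-coverage ILP for extractive summarization (pip install pulp).
# User feedback would update the concept weights below, then the program is re-solved.
import pulp

sentences = {0: "...", 1: "...", 2: "..."}       # sentence id -> text (toy)
lengths = {0: 12, 1: 18, 2: 9}                   # token lengths
concepts = {"c1": 2.0, "c2": 1.0, "c3": 0.5}     # concept -> weight (adjusted by feedback)
occurs = {("c1", 0), ("c1", 2), ("c2", 1), ("c3", 2)}   # concept appears in sentence
LENGTH_BUDGET = 25

prob = pulp.LpProblem("mds", pulp.LpMaximize)
x = {s: pulp.LpVariable(f"x_{s}", cat="Binary") for s in sentences}   # select sentence s
y = {c: pulp.LpVariable(f"y_{c}", cat="Binary") for c in concepts}    # concept c covered

prob += pulp.lpSum(concepts[c] * y[c] for c in concepts)              # maximize covered weight
prob += pulp.lpSum(lengths[s] * x[s] for s in sentences) <= LENGTH_BUDGET
for c in concepts:
    # A concept counts as covered only if at least one selected sentence contains it.
    prob += y[c] <= pulp.lpSum(x[s] for (cc, s) in occurs if cc == c)

prob.solve(pulp.PULP_CBC_CMD(msg=False))
summary = [s for s in sentences if x[s].value() > 0.5]
```

Each round of feedback only changes the concept weights, so the same program can be rebuilt and re-solved to produce the next candidate summary.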

Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks [article]

Julia Kreutzer, Stefan Riezler, Carolin Lawrence
2021 arXiv   pre-print
Using such interaction logs in an offline reinforcement learning (RL) setting is a promising approach. ... Large volumes of interaction logs can be collected from NLP systems that are deployed in the real world. How can this wealth of information be leveraged? ... For instance, learning from pairwise human preferences has been advertised for summarization [9, 36], but the reliability of the signal has not been evaluated. ...
arXiv:2011.02511v3 fatcat:n5quenu4qfbbtjuyoa7eutn6wa
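
The entry above is about reusing logged interactions offline rather than querying users online. A common estimator in this setting is inverse propensity scoring (IPS), which reweights the logged reward by how likely the current policy is to produce the logged output relative to the logging policy. The sketch below shows a differentiable IPS objective under the assumption that each log entry stores the logging propensity; `model.log_prob` is a hypothetical interface, and this is a generic counterfactual-learning sketch, not the paper's method.

```python
# Hedged sketch: counterfactual (off-policy) learning from logged feedback via
# inverse propensity scoring. Each log entry is (x, y, reward, logging_propensity).
import torch

def ips_loss(model, logged_batch):
    """Negative IPS estimate of expected reward; minimizing it maximizes reward."""
    terms = []
    for x, y, reward, logging_propensity in logged_batch:
        log_p = model.log_prob(y, x)                    # log pi_theta(y | x), placeholder API
        ratio = torch.exp(log_p) / logging_propensity   # pi_theta(y | x) / pi_0(y | x)
        terms.append(-ratio * reward)
    return torch.stack(terms).mean()
```

In practice the raw IPS estimator has high variance, so clipped or self-normalized variants are usually preferred.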

Hone as You Read: A Practical Type of Interactive Summarization [article]

Tanner Bohn, Charles X. Ling
2021 arXiv   pre-print
Our approaches range from simple heuristics to preference learning, and their analysis provides insight into this important task. Human evaluation additionally supports the practicality of HARE. ... This task is related to interactive summarization, where personalized summaries are produced following a long feedback stage in which users may read the same sentences many times. ... Böhm et al. (2019) consider learning a reward function from existing human ratings. ...
arXiv:2105.02923v1 fatcat:ipvcxzs6jjgdjd2o75b2qpz7ey

Automatic Summarization of Student Course Feedback [article]

Wencan Luo, Fei Liu, Zitao Liu, Diane Litman
2018 arXiv   pre-print
In this work, we propose a new approach to summarizing student course feedback based on the integer linear programming (ILP) framework. ... Experimental results on a student feedback corpus show that our approach outperforms a range of baselines in terms of both ROUGE scores and human evaluation. ...
arXiv:1805.10395v1 fatcat:v4dcbpwcnrfndehtjszqmsxaxy

Developing Summarization Skills through the Use of LSA-Based Feedback

Eileen Kintsch, Dave Steinhart, Gerry Stahl, LSA Research Group LSA Research Group, Cindy Matthews, Ronald Lamb
2000 Interactive Learning Environments  
The feedback allows students to engage in extensive, independent practice in writing and revising without placing excessive demands on teachers for feedback. ... This paper describes a series of classroom trials during which we developed Summary Street, an educational software system that uses Latent Semantic Analysis to support writing and revision activities. ... Thus, based on evidence from a single trial, the summarization software did not appear either to benefit or to harm students' learning or writing. ...
doi:10.1076/1049-4820(200008)8:2;1-b;ft087 fatcat:4tckegjnzvhxnf4ztyhbqtcnoy
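
Summary Street, described above, gives students content feedback by comparing their draft summary to the source text in a latent semantic space. The sketch below approximates that idea with TF-IDF vectors reduced by truncated SVD in scikit-learn and a per-section cosine similarity; the texts, the two-dimensional latent space, and the 0.4 coverage threshold are illustrative assumptions, not the system's actual parameters.

```python
# Rough sketch of LSA-style content feedback (not Summary Street itself):
# compare a student summary to each source section in a latent semantic space.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

source_sections = ["section 1 text ...", "section 2 text ...", "section 3 text ..."]
student_summary = "student's draft summary ..."

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(source_sections + [student_summary])

svd = TruncatedSVD(n_components=2)            # tiny latent space for the toy example
latent = svd.fit_transform(tfidf)
sections_latent, summary_latent = latent[:-1], latent[-1:]

similarities = cosine_similarity(summary_latent, sections_latent)[0]
for name, sim in zip(["section 1", "section 2", "section 3"], similarities):
    status = "covered" if sim >= 0.4 else "needs more content"   # illustrative threshold
    print(f"{name}: similarity {sim:.2f} -> {status}")
```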

Automatic Summarization of Student Course Feedback

Wencan Luo, Fei Liu, Zitao Liu, Diane Litman
2016 Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies  
In this work, we propose a new approach to summarizing student course feedback based on the integer linear programming (ILP) framework. ... Experimental results on a student feedback corpus show that our approach outperforms a range of baselines in terms of both ROUGE scores and human evaluation. ...
doi:10.18653/v1/n16-1010 dblp:conf/naacl/LuoLLL16 fatcat:y7krdw7h55a2rjwrh5eutodpxa

How Can Psychology Inform the Design of Learning Experiences?

Milos Kravcik, Ralf Klamma, Zinayda Petrushyna
2011 2011 IEEE 11th International Conference on Advanced Learning Technologies  
We have reviewed literature on human decision-making processes, organized a survey and a workshop with PhD students to collect various opinions on these issues, and here we summarize the outcomes. ... Our aim is to analyze results from behavioral and cognitive psychology to help designers of learning experiences with the specification of requirements. ... The learner must receive clear feedback on the learning progress to be able to distinguish success from failure. Recommendations can come either from experts or from the crowd. ...
doi:10.1109/icalt.2011.92 dblp:conf/icalt/KravcikKP11 fatcat:cos37vrrc5amdek5oesfq5a3gu

Learning Improvised Chatbots from Adversarial Modifications of Natural Language Feedback [article]

Makesh Narsimhan Sreedhar, Kun Ni, Siva Reddy
2020 arXiv   pre-print
The generator's goal is to convert the feedback into a response that answers the user's previous utterance and to fool the discriminator, which distinguishes feedback from natural responses. ... We show that augmenting the original training data with these modified feedback responses improves the original chatbot performance from 69.94% to 75.96% in ranking correct responses on the Personachat dataset ... Annotating data to convert feedback text into a natural response is also expensive and defeats the purpose of learning from feedback text. ...
arXiv:2010.07261v2 fatcat:lc6rhnbqynd77iugpjzu3f6t4u
Showing results 1–15 of 225,363.