Including summaries in system evaluation

Andrew Turpin, Falk Scholer, Kalervo Järvelin, Mingfang Wu, J. Shane Culpepper
2009 Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '09  
In batch evaluation of retrieval systems, performance is calculated based on predetermined relevance judgements applied to a list of documents returned by the system for a query.  ...  Given that system orderings alter when summaries are taken into account, the small amount of effort required to judge summaries in addition to documents (19 seconds vs 88 seconds on average in our data  ...  In this paper we explicitly examine the effect of including the summary examination stage of the retrieval process in batch evaluations.  ... 
doi:10.1145/1571941.1572029 dblp:conf/sigir/TurpinSJWC09 fatcat:hf4mbs22prgazcvq2auzx3jb6q

Automatic text summarization in TIPSTER

Thérèse Firmin, Inderjeet Mani
1996 Proceedings of a workshop held at Baltimore, Maryland, October 13-15, 1998
Each of the systems is described in detail elsewhere in the proceedings. In coordination with the various research efforts, DARPA sponsored an evaluation of text summarization systems.  ...  the various systems in either evaluation, however the results do show some encouraging trends.  ... 
doi:10.3115/1119089.1119119 dblp:conf/tipster/FirminM98 fatcat:rbg3vd3dojb23ky37o2lbjcdpi

A Survey of Text Summarization Techniques [chapter]

Ani Nenkova, Kathleen McKeown
2012 Mining Text Data  
It also presents the evaluation measures of a text summarizer.  ...  In today's fast growing information world, text summarization has become an important matter for interpreting text information.  ...  Precision (P) is calculated as the number of sentences occurring in both the system summary and human generated summary divided by the number of sentences in the system summary.  ... 
doi:10.1007/978-1-4614-3223-4_3 fatcat:dzt5m34qivgf3id763q2kwk2e4
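
The sentence-level precision described in the snippet above can be sketched as follows. This is a minimal illustration, not code from the chapter: the function name `sentence_precision` is hypothetical, sentences are assumed to match only when their strings are identical, and an empty system summary is assumed to score 0.0.

```python
def sentence_precision(system_sentences, reference_sentences):
    """Share of system-summary sentences that also occur in the human summary.

    Assumption: two sentences match only if their strings are identical.
    """
    system = set(system_sentences)
    reference = set(reference_sentences)
    if not system:
        # Assumption: an empty system summary scores 0.0 rather than raising.
        return 0.0
    return len(system & reference) / len(system)


# Toy data: 2 of the 4 system sentences also appear in the human summary.
system_summary = ["Sentence A.", "Sentence B.", "Sentence C.", "Sentence D."]
human_summary = ["Sentence A.", "Sentence C.", "Sentence E."]
precision = sentence_precision(system_summary, human_summary)
print(precision)
```

With the toy data, two of the four system sentences are shared, so the precision is 2 / 4 = 0.5.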

DUC in context

Paul Over, Hoa Dang, Donna Harman
2007 Information Processing & Management  
Recent years have seen increased interest in text summarization with emphasis on evaluation of prototype systems.  ...  The themes are extrinsic and intrinsic evaluation, evaluation procedures and methods, generic versus focused summaries, single- and multi-document summaries, length and compression issues, extracts versus  ...  This task was extended in (NTCIR3, 2002) to include testing of multi-document summaries, using a similar evaluation.  ... 
doi:10.1016/j.ipm.2007.01.019 fatcat:ycjd3eebvzd5lemkt277fpsjma

Development of Algorithm and System for Automatic Generation of Nursing Summaries from Nursing Care Plans

Misao Miyagawa, Yuko Yasuhara, Tetsuya Tanioka, Hirokazu Ito, Motoyuki Suzuki, Rozzano Locsin
2014 Intelligent Information Management  
Advantages of this system are that it enables nursing summaries to be generated automatically in real time, simplifies the process, and permits the standardization of useful nursing summaries that reflect  ...  the course of the nursing care provided and its evaluation.  ...  Consequently, the information in the nursing care plan includes information needed for the nursing summary.  ... 
doi:10.4236/iim.2014.63011 fatcat:2hhawoebmndf7nqfhqzv5klkdm

Keyphrase based Evaluation of Automatic Text Summarization

Fatma Elghannam, Tarek El-Shishtawy
2015 International Journal of Computer Applications  
The development of methods to deal with the informative contents of the text units in the matching process is a major challenge in automatic summary evaluation systems that use fixed n-gram matching.  ...  The system was applied to evaluate different summaries of the Arabic multi-document data set presented at TAC2011.  ...  The data includes the peer summaries, human summaries, and results of Rouge-1, Rouge-2, Rouge-SU4, and AutoSummENG-MeMoG evaluation scores. The data set is available in 7 languages including Arabic.  ... 
doi:10.5120/20564-2953 fatcat:kgrx3ju36rcvxp3wt3jpx35tvu

Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors [article]

Liyan Tang, Tanya Goyal, Alexander R. Fabbri, Philippe Laban, Jiacheng Xu, Semih Yavuz, Wojciech Kryściński, Justin F. Rousseau, Greg Durrett
2022 arXiv   pre-print
The propensity of abstractive summarization systems to make factual errors has been the subject of significant study, including work on models to detect factual errors and annotation of errors in current  ...  In this work, we collect labeled factuality errors from across nine datasets of annotated summary outputs and stratify them in a new way, focusing on what kind of base summarization model was used.  ...  Introduction Although abstractive summarization systems (Liu and Lapata, 2019; have improved dramatically over the past few years, these systems still often include factual errors in generated summaries  ... 
arXiv:2205.12854v1 fatcat:kjo2pzkdtfem7lxzzdbyz3tbgy

A Lemma Based Evaluator for Semitic Language Text Summarization Systems [article]

Tarek El-Shishtawy, Fatma El-Ghannam
2014 arXiv   pre-print
The system is an extension of the ROUGE test in which texts are matched at the token lemma level.  ...  Matching texts in highly inflected languages such as Arabic by a simple stemming strategy is unlikely to perform well.  ...  DataSet1 included the source texts, system summaries, and human summaries. The data set is available in 7 languages including Arabic. It was derived from publicly available WikiNews English texts.  ... 
arXiv:1403.5596v1 fatcat:amniuj4srfhxdj5rqhfy32gkh4
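
The idea of matching on lemmas instead of surface tokens can be illustrated with a ROUGE-1-style unigram recall that accepts a pluggable normalizer. This is only a sketch under stated assumptions and not the authors' system: `unigram_recall`, `toy_lemma`, and the `TOY_LEMMAS` table are hypothetical names, the lookup table stands in for a real lemmatizer, and English words are used purely for readability.

```python
from collections import Counter


def unigram_recall(system_text, reference_text, normalize=str.lower):
    """ROUGE-1-style recall: clipped unigram matches / reference unigrams."""
    sys_counts = Counter(normalize(tok) for tok in system_text.split())
    ref_counts = Counter(normalize(tok) for tok in reference_text.split())
    total = sum(ref_counts.values())
    if total == 0:
        # Assumption: an empty reference scores 0.0 rather than raising.
        return 0.0
    # Each reference unigram is credited at most as often as the system has it.
    overlap = sum(min(count, sys_counts[tok]) for tok, count in ref_counts.items())
    return overlap / total


# Hypothetical stand-in for a lemmatizer: a tiny hand-written lookup table.
TOY_LEMMAS = {"cats": "cat", "sat": "sit", "sitting": "sit"}


def toy_lemma(token):
    token = token.lower()
    return TOY_LEMMAS.get(token, token)


surface_score = unigram_recall("cats sat", "the cat sitting")
lemma_score = unigram_recall("cats sat", "the cat sitting", normalize=toy_lemma)
print(surface_score, lemma_score)
```

In this toy case no surface tokens match, while lemma-level matching credits two of the three reference tokens, which is the kind of gap the snippet attributes to highly inflected languages.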

The trecvid 2008 BBC rushes summarization evaluation

Paul Over, Alan F. Smeaton, George Awad
2008 Proceeding of the 2nd ACM workshop on Video summarization - TVS '08  
This paper describes an evaluation of automatic video summarization systems run on rushes from several BBC dramatic series.  ...  Additional objective measures included: how long it took the system to create the summary, how long it took the assessor to judge it against the ground truth, and what the summary's duration was.  ...  summary judging, to Carnegie Mellon Univer-  ... 
doi:10.1145/1463563.1463564 dblp:conf/mm/OverSA08 fatcat:o7raw7wluzhnhillawgg3zo2vu

University of California, San Francisco School of Medicine

Catherine R. Lucey, Karen Hauer, Patricia O'Sullivan, Ann Poncelet, Kevin H. Souza, John Davis
2020 Academic Medicine  
Collaborate to coordinate patient care within and across healthcare systems, including patient hand-offs  ...  Participate in a systematic approach to promote patient safety; F1: CMC systems improvement template, QIKAT SBP 3 (graduation).  ... 
doi:10.1097/acm.0000000000003469 pmid:33626649 fatcat:4sdmxqvz3zfa7h2a2yfzv57dle

A Proposed Methodology for Subjective Evaluation of Video and Text Summarization [chapter]

Begona Garcia-Zapirain, Cristian Castillo, Aritz Badiola, Sofia Zahia, Amaia Mendez, David Langlois, Denis Jouvet, Juan-Manuel Torres, Mikołaj Leszczuk, Kamel Smaili
2018 Lecture Notes in Computer Science  
and Protocol for the Whole Integrated System. Our proposed experiment includes two different lines: subjective evaluation (questionnaires) and objective evaluation.  ...  This article therefore presents a complete method for evaluating this type of system efficiently.  ... 
doi:10.1007/978-3-319-98678-4_40 fatcat:272ljee6hbhhhe6d2npipffswi

Older versions of the ROUGEeval summarization evaluation system were easier to fool

Jonas Sjöbergh
2007 Information Processing & Management  
By a simple greedy word selection strategy, summaries with high ROUGE-scores are generated. These summaries would however not be considered good by human readers.  ...  We show some limitations of the ROUGE evaluation method for automatic summarization. We present a method for automatic summarization based on a Markov model of the source text.  ...  An example of a system generated summary is shown in Figure 1.  ... 
doi:10.1016/j.ipm.2007.01.014 fatcat:nayyj6nnwbhbfmt3khqhir6njy

Evaluation of Summarization Systems across Gender, Age, and Race [article]

Anna Jørgensen, Anders Søgaard
2021 arXiv   pre-print
For two different evaluation scenarios -- evaluation against gold summaries and system output ratings -- we show that summary evaluation is sensitive to protected attributes.  ...  Summarization systems are ultimately evaluated by human annotators and raters.  ...  We did so in two different evaluation scenarios: automatic evaluation against gold summaries and system output ratings by human evaluators.  ... 
arXiv:2110.04384v1 fatcat:eygyp34enfewbcxqzunmg226jq

The trecvid 2007 BBC rushes summarization evaluation pilot

Paul Over, Alan F. Smeaton, Philip Kelly
2007 Proceedings of the international workshop on TRECVID video summarization - TVS '07  
Additional objective measures included: how long it took the system to create the summary, how long it took the assessor to judge it against the ground truth, and what the summary's duration was.  ...  This paper provides an overview of a pilot evaluation of video summaries using rushes from several BBC dramatic series. It was carried out under the auspices of TRECVID.  ...  We are grateful to the BBC archives and to Richard Wright for providing the data, to NIST and DTO for sponsoring the evaluation campaign, to the assessors at NIST who judged the summaries, to Science Foundation  ... 
doi:10.1145/1290031.1290032 dblp:conf/mm/OverSK07 fatcat:yscn3od5rzhrjl3horbrzxpctq

Question-Driven Summarization of Answers to Consumer Health Questions [article]

Max Savery, Asma Ben Abacha, Soumya Gayen, Dina Demner-Fushman
2020 arXiv   pre-print
In order to benchmark the dataset, we include results of baseline and state-of-the-art deep learning summarization models, demonstrating that this dataset can be used to effectively evaluate question-driven  ...  However, to evaluate the quality of automatically generated summaries of health information, gold-standard, human generated summaries are required.  ...  S.G managed the interface for generating the summaries, and provided data processing support.  ... 
arXiv:2005.09067v2 fatcat:ouwgy6mxrfdqjplqcsbi3x62wq