Including summaries in system evaluation
2009
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '09
In batch evaluation of retrieval systems, performance is calculated based on predetermined relevance judgements applied to a list of documents returned by the system for a query. ...
Given that system orderings alter when summaries are taken into account, the small amount of effort required to judge summaries in addition to documents (19 seconds vs 88 seconds on average in our data ...
In this paper we explicitly examine the effect of including the summary examination stage of the retrieval process in batch evaluations. ...
doi:10.1145/1571941.1572029
dblp:conf/sigir/TurpinSJWC09
fatcat:hf4mbs22prgazcvq2auzx3jb6q
Automatic text summarization in TIPSTER
1996
Proceedings of a workshop held at Baltimore, Maryland, October 13-15, 1998
Each of the systems is described in detail elsewhere in the proceedings. In coordination with the various research efforts, DARPA sponsored an evaluation of text summarization systems. ...
the various systems in either evaluation, however the results do show some encouraging trends. ...
doi:10.3115/1119089.1119119
dblp:conf/tipster/FirminM98
fatcat:rbg3vd3dojb23ky37o2lbjcdpi
A Survey of Text Summarization Techniques
[chapter]
2012
Mining Text Data
It also presents the evaluation measures of a text summarizer. ...
In today's fast growing information world, text summarization has become an important matter for interpreting text information. ...
Precision (P) is calculated as the number of sentences occurring in both the system summary and human generated summary divided by the number of sentences in the system summary. ...
doi:10.1007/978-1-4614-3223-4_3
fatcat:dzt5m34qivgf3id763q2kwk2e4
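The sentence-level precision described in the snippet above can be illustrated with a minimal sketch. It assumes that summaries are lists of sentences and that two sentences match only when their strings are identical. The function name `sentence_precision`, the sample data, and the choice to score an empty system summary as 0 are illustrative assumptions, not taken from the chapter.

```python
def sentence_precision(system_sentences, reference_sentences):
    """Share of system-summary sentences that also occur in the human summary."""
    system = list(system_sentences)
    if not system:
        # Assumption: an empty system summary is scored as 0 rather than undefined.
        return 0.0
    reference = set(reference_sentences)
    # Count system sentences that appear verbatim in the human summary.
    overlap = sum(1 for sentence in system if sentence in reference)
    return overlap / len(system)


# Hypothetical summaries: two of the four system sentences match.
system_summary = ["A", "B", "C", "D"]
human_summary = ["B", "D", "E"]
precision = sentence_precision(system_summary, human_summary)
print(precision)
```

Exact string matching is the simplest possible matching rule; a real evaluator would likely normalize or align sentences before comparing them.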
DUC in context
2007
Information Processing & Management
Recent years have seen increased interest in text summarization with emphasis on evaluation of prototype systems. ...
The themes are extrinsic and intrinsic evaluation, evaluation procedures and methods, generic versus focused summaries, single-and multi-document summaries, length and compression issues, extracts versus ...
This task was extended in (NTCIR3, 2002) to include testing of multi-document summaries, using a similar evaluation. ...
doi:10.1016/j.ipm.2007.01.019
fatcat:ycjd3eebvzd5lemkt277fpsjma
Development of Algorithm and System for Automatic Generation of Nursing Summaries from Nursing Care Plans
2014
Intelligent Information Management
Advantages of this system are that it enables nursing summaries to be generated automatically in real time, simplifies the process, and permits the standardization of useful nursing summaries that reflect ...
the course of the nursing care provided and its evaluation. ...
Consequently, the information in the nursing care plan includes information needed for the nursing summary. ...
doi:10.4236/iim.2014.63011
fatcat:2hhawoebmndf7nqfhqzv5klkdm
Keyphrase based Evaluation of Automatic Text Summarization
2015
International Journal of Computer Applications
The development of methods to deal with the informative contents of the text units in the matching process is a major challenge in automatic summary evaluation systems that use fixed n-gram matching. ...
The system was applied to evaluate different summaries of Arabic multi-document data set presented at TAC2011. ...
The data includes the peer summaries, human summaries, and results of Rouge-1, Rouge-2, Rouge-SU4, and AutoSummENG-MeMoG evaluation scores. The data set available in 7 languages including Arabic. ...
doi:10.5120/20564-2953
fatcat:kgrx3ju36rcvxp3wt3jpx35tvu
Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors
[article]
2022
arXiv
pre-print
The propensity of abstractive summarization systems to make factual errors has been the subject of significant study, including work on models to detect factual errors and annotation of errors in current ...
In this work, we collect labeled factuality errors from across nine datasets of annotated summary outputs and stratify them in a new way, focusing on what kind of base summarization model was used. ...
Introduction Although abstractive summarization systems (Liu and Lapata, 2019; have improved dramatically over the past few years, these systems still often include factual errors in generated summaries ...
arXiv:2205.12854v1
fatcat:kjo2pzkdtfem7lxzzdbyz3tbgy
A Lemma Based Evaluator for Semitic Language Text Summarization Systems
[article]
2014
arXiv
pre-print
The system is an extension of ROUGE test in which texts are matched on token's lemma level. ...
Matching texts in highly inflected languages such as Arabic by simple stemming strategy is unlikely to perform well. ...
DataSet1 included the source texts, system summaries, and human summaries. The data set is available in 7 languages including Arabic. It was derived from publicly available WikiNews English texts. ...
arXiv:1403.5596v1
fatcat:amniuj4srfhxdj5rqhfy32gkh4
The trecvid 2008 BBC rushes summarization evaluation
2008
Proceeding of the 2nd ACM workshop on Video summarization - TVS '08
This paper describes an evaluation of automatic video summarization systems run on rushes from several BBC dramatic series. ...
Additional objective measures included: how long it took the system to create the summary, how long it took the assessor to judge it against the ground truth, and what the summary's duration was. ...
summary judging, to Carnegie Mellon Univer- ...
doi:10.1145/1463563.1463564
dblp:conf/mm/OverSA08
fatcat:o7raw7wluzhnhillawgg3zo2vu
University of California, San Francisco School of Medicine
2020
Academic Medicine
Collaborate to coordinate patient care within and across healthcare systems, including patient handoffs • ...
Participate in a systematic approach to promote patient safety • F1: CMC systems improvement template, QIKAT SBP 3 (graduation). ...
doi:10.1097/acm.0000000000003469
pmid:33626649
fatcat:4sdmxqvz3zfa7h2a2yfzv57dle
A Proposed Methodology for Subjective Evaluation of Video and Text Summarization
[chapter]
2018
Lecture Notes in Computer Science
and Protocol for the Whole Integrated System Our proposed experiment includes two different lines: subjective evaluation (questionnaires) and objective evaluation. ...
The present article therefore presents a complete way to evaluate this type of system efficiently. ...
doi:10.1007/978-3-319-98678-4_40
fatcat:272ljee6hbhhhe6d2npipffswi
Older versions of the ROUGEeval summarization evaluation system were easier to fool
2007
Information Processing & Management
By a simple greedy word selection strategy, summaries with high ROUGE-scores are generated. These summaries would however not be considered good by human readers. ...
We show some limitations of the ROUGE evaluation method for automatic summarization. We present a method for automatic summarization based on a Markov model of the source text. ...
An example of a system generated summary is shown in Figure 1. ...
doi:10.1016/j.ipm.2007.01.014
fatcat:nayyj6nnwbhbfmt3khqhir6njy
Evaluation of Summarization Systems across Gender, Age, and Race
[article]
2021
arXiv
pre-print
For two different evaluation scenarios -- evaluation against gold summaries and system output ratings -- we show that summary evaluation is sensitive to protected attributes. ...
Summarization systems are ultimately evaluated by human annotators and raters. ...
We did so in two different evaluation scenarios: automatic evaluation against gold summaries and system output ratings by human evaluators. ...
arXiv:2110.04384v1
fatcat:eygyp34enfewbcxqzunmg226jq
The trecvid 2007 BBC rushes summarization evaluation pilot
2007
Proceedings of the international workshop on TRECVID video summarization - TVS '07
Additional objective measures included: how long it took the system to create the summary, how long it took the assessor to judge it against the ground truth, and what the summary's duration was. ...
This paper provides an overview of a pilot evaluation of video summaries using rushes from several BBC dramatic series. It was carried out under the auspices of TRECVID. ...
We are grateful to the BBC archives and to Richard Wright for providing the data, to NIST and DTO for sponsoring the evaluation campaign, to the assessors at NIST who judged the summaries, to Science Foundation ...
doi:10.1145/1290031.1290032
dblp:conf/mm/OverSK07
fatcat:yscn3od5rzhrjl3horbrzxpctq
Question-Driven Summarization of Answers to Consumer Health Questions
[article]
2020
arXiv
pre-print
In order to benchmark the dataset, we include results of baseline and state-of-the-art deep learning summarization models, demonstrating that this dataset can be used to effectively evaluate question-driven ...
However, to evaluate the quality of automatically generated summaries of health information, gold-standard, human generated summaries are required. ...
S.G managed the interface for generating the summaries, and provided data processing support. ...
arXiv:2005.09067v2
fatcat:ouwgy6mxrfdqjplqcsbi3x62wq