Biology based alignments of paraphrases for sentence compression

João Cordeiro, Gaël Dias, Guillaume Cleuziou
2007 Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing - RTE '07   unpublished
In this paper, we present a study for extracting and aligning paraphrases in the context of Sentence Compression.  ...  Finally, we will provide some results of different biology based methodologies for pairwise paraphrase alignment.  ... 
doi:10.3115/1654536.1654573 fatcat:do6yqfup6jburkray3h4ihrhoq

Rule Induction for Sentence Reduction [chapter]

João Cordeiro, Gaël Dias, Pavel Brazdil
2013 Lecture Notes in Computer Science  
Paraphrases are first discovered within a collection of automatically crawled Web News Stories and then textually aligned in order to extract interchangeable text fragment candidates, in particular reduction  ...  As only positive examples exist, Inductive Logic Programming (ILP) provides an interesting learning paradigm for the extraction of sentence reduction rules.  ...  para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) within project "FCOMP -01-0124-FEDER-022701".  ... 
doi:10.1007/978-3-642-40669-0_45 fatcat:b2j54v4umfhibmaametgbeofga

Sentence Fusion for Multidocument News Summarization

Regina Barzilay, Kathleen R. McKeown
2005 Computational Linguistics  
Sentence fusion moves the summarization field from the use of purely extractive methods to the generation of abstracts that contain sentences not found in any of the input documents and can synthesize  ...  Sentence fusion involves bottom-up local multisequence alignment to identify phrases conveying similar information and statistical generation to combine common phrases into a sentence.  ...  This article is based upon work supported in part by the National Science Foundation under grant IIS-0448168, DARPA grant N66001-00-1-8919 and a Louis Morin scholarship.  ... 
doi:10.1162/089120105774321091 fatcat:xpjmqhqkj5g7fjef34oxghrhnu

Bootstrapping lexical choice via multiple-sequence alignment

Regina Barzilay, Lillian Lee
2002 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - EMNLP '02  
Typically, labor-intensive knowledge-based methods are used to construct the dictionary.  ...  For example, in evaluations involving a dozen human judges, our system produced output whose readability and faithfulness to the semantic input rivaled that of a traditional generation system.  ...  We are grateful to Amanda Holland-Minkley for help running the comparison experiments. Portions of this work were done while the first author was visiting Cornell University.  ... 
doi:10.3115/1118693.1118715 dblp:conf/emnlp/BarzilayL02 fatcat:74cxpqmtljgx5kcjakegovlixm

Bootstrapping Lexical Choice via Multiple-Sequence Alignment [article]

Regina Barzilay, Lillian Lee
2002 arXiv   pre-print
For example, in evaluations involving a dozen human judges, our system produced output whose readability and faithfulness to the semantic input rivaled that of a traditional generation system.  ...  Typically, labor-intensive knowledge-based methods are used to construct the dictionary.  ...  We are grateful to Amanda Holland-Minkley for help running the comparison experiments. Portions of this work were done while the first author was visiting Cornell University.  ... 
arXiv:cs/0205065v1 fatcat:vabvzal5pzcc7iweb3cjhfuxhe

Automatic Medical Text Simplification: Challenges of Data Quality and Curation

Chandrayee Basu, Rosni Vasu, Michihiro Yasunaga, Sohyeong Kim, Qian Yang
2021 AAAI Fall Symposia  
The topmost reason for low health literacy is the vocabulary gap between providers and patients.  ...  It is, however, extremely challenging to curate quality corpus for this natural language processing (NLP) task.  ...  We measured semantic diversity of the MSD validation data and the entire SIMPWIKI corpus using Sentence-BERT based corpus alignment.  ... 
dblp:conf/aaaifs/BasuVYKY21 fatcat:urhfj3i4wrc5ff26rqldgz4yuy

Paraphrasing for condensation in journal abstracting

Richard Kittredge
2002 Journal of Biomedical Informatics  
Some paraphrase operations may use both lexical functions and rhetorical relations between sentences to reformulate larger chunks of text in a concise abstract sentence.  ...  representations of selected source sentences.  ...  Acknowledgments I am grateful to the editor of this special issue for many helpful suggestions, and to Elena Matitashvili for explaining terms used in Genome Biology.  ... 
doi:10.1016/s1532-0464(03)00016-9 pmid:12755521 fatcat:c5icguhl5neyjn5fzllpooyzoy

Contextualized Rewriting for Text Summarization [article]

Guangsheng Bao, Yue Zhang
2021 arXiv   pre-print
We formalize contextualized rewriting as a seq2seq problem with group alignments, introducing group tag as a solution to model the alignments, identifying extracted summaries through content-based addressing  ...  Existing work shows that abstractive rewriting for extractive summaries can improve the conciseness and readability.  ...  Acknowledgments We would like to thank the anonymous reviewers for their valuable feedback and Wenyu Du for the inspiring discussion.  ... 
arXiv:2102.00385v2 fatcat:vjdaogwn55htzpm3xobawyde2u

A Multilingual Study of Multi-Sentence Compression using Word Vertex-Labeled Graphs and Integer Linear Programming

Elvys Linhares Pontes, Stephane Huet, Juan-Manuel Torres-Moreno, Thiago G. da Silva, Andrea Carneiro Linhares
2020 Zenodo  
Multi-Sentence Compression (MSC) aims to generate a short sentence with the key information from a cluster of similar sentences.  ...  We led both automatic and manual evaluations to determine the informativeness and the grammaticality of compressions for each dataset.  ...  Filippova and Strube [12] also used dependency trees to align each cluster of related sentences and generated a new tree, this time with ILP, to compress the information.  ... 
doi:10.5281/zenodo.3759285 fatcat:xkv6klon5bb37h62uwzgop5p6m

A Multilingual Study of Multi-Sentence Compression using Word Vertex-Labeled Graphs and Integer Linear Programming [article]

Elvys Linhares Pontes, Stéphane Huet, Juan-Manuel Torres-Moreno, Thiago G. da Silva, Andréa Carneiro Linhares
2020 arXiv   pre-print
Multi-Sentence Compression (MSC) aims to generate a short sentence with the key information from a cluster of similar sentences.  ...  We led both automatic and manual evaluations to determine the informativeness and the grammaticality of compressions for each dataset.  ...  Filippova and Strube [12] also used dependency trees to align each cluster of related sentences and generated a new tree, this time with ILP, to compress the information.  ... 
arXiv:2004.04468v1 fatcat:avk6yn6eq5csbhxa5wzn7ucbke

A Multilingual Study of Multi-Sentence Compression using Word Vertex-Labeled Graphs and Integer Linear Programming

Elvys Linhares Pontes, Stéphane Huet, Juan Manuel Torres Moreno, Thiago Gouveia da Silva, Andréa Carneiro Linhares
2020 Zenodo  
Multi-Sentence Compression (MSC) aims to generate a short sentence with the key information from a cluster of similar sentences.  ...  We led both automatic and manual evaluations to determine the informativeness and the grammaticality of compressions for each dataset.  ...  Filippova and Strube [12] also used dependency trees to align each cluster of related sentences and generated a new tree, this time with ILP, to compress the information.  ... 
doi:10.5281/zenodo.3930868 fatcat:vclxvriiqfgyfeyz75lhp6dqia

Translating Questions into Answers using DBPedia n-triples [article]

Mihael Arcan
2018 arXiv   pre-print
Although the automatic evaluation shows a low overlap of the generated answers compared to the gold standard set, a manual inspection of the showed promising outcomes from the experiment for further work  ...  Acknowledgement This publication has emanated from research conducted with the financial support of Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289 (Insight).  ...  Methodology The training approach to the sequence-tosequence neural model requires an aligned dataset of questions and answers, which are aligned on a sentence level.  ... 
arXiv:1803.02914v1 fatcat:ed2e3a37tbbnpkwx6rixvrmruu

Plagiarism Detection for Indonesian Texts

Lucia D. Krisnawati, Klaus U. Schulz
2013 Proceedings of International Conference on Information Integration and Web-based Applications & Services - IIWAS '13  
We plan to incorporate sentence alignment which collects contextual evidence and exploits word similarity introduced in [172] to increase system's recognition on heavily paraphrased passages.  ...  We plan to improve seed alignment by regarding the offsets of sentences in which the seeds occur. This is to address the drawback of the passage boundary detection.  ...  , frequency of special characters and compression rate. • Word-based lexical features (wblf ): such as word length average, sentence length average, average number of syllable or words and term frequency  ... 
doi:10.1145/2539150.2539213 dblp:conf/iiwas/KrisnawatiS13 fatcat:r6p2h4oiq5fi3mhlazokatknrq

Induction of Word and Phrase Alignments for Automatic Document Summarization [article]

Hal Daumé III, Daniel Marcu
2009 arXiv   pre-print
Our model for the alignment task is based on an extension of the standard hidden Markov model, and learns to create alignments in a completely unsupervised fashion.  ...  This paper describes experiments we have carried out to analyze the ability of humans to perform such alignments, and based on these analyses, we describe experiments for creating them automatically.  ...  Acknowledgments We wish to thank David Blei for helpful theoretical discussions related to this project and Franz Josef Och for sharing his technical expertise on issues that made the computations discussed  ... 
arXiv:0907.0804v1 fatcat:seslhdxrhzcgzhkitij27mjv5a

Induction of Word and Phrase Alignments for Automatic Document Summarization

Hal Daumé, Daniel Marcu
2005 Computational Linguistics  
Our model for the alignment task is based on an extension of the standard hidden Markov model and learns to create alignments in a completely unsupervised fashion.  ...  This paper describes experiments we have carried out to analyze the ability of humans to perform such alignments, and based on these analyses, we describe experiments for creating them automatically.  ...  Acknowledgments We wish to thank David Blei for helpful theoretical discussions related to this project and Franz Josef Och for sharing his technical expertise on issues that made the computations discussed  ... 
doi:10.1162/089120105775299140 fatcat:6zh2ule65vgpzgpwxhjmwod56a