Style-transfer and Paraphrase: Looking for a Sensible Semantic Similarity Metric [article]

Ivan P. Yamshchikov, Viacheslav Shibaev, Nikolay Khlebnikov, Alexey Tikhonov
2020 arXiv   pre-print
The rapid development of such natural language processing tasks as style transfer, paraphrase, and machine translation often calls for the use of semantic similarity metrics.  ...  In recent years a lot of methods to measure the semantic similarity of two short texts were developed. This paper provides a comprehensive analysis for more than a dozen of such methods.  ...  Conclusion In this paper, we examine more than a dozen metrics for semantic similarity in the context of NLP tasks of style transfer and paraphrase.  ... 
arXiv:2004.05001v3 fatcat:33m5e5lwbnccpkrx4pelcpx24a

Rethinking Crowd Sourcing for Semantic Similarity [article]

Shaul Solomon and Adam Cohn and Hernan Rosenblum and Chezi Hershkovitz and Ivan P. Yamshchikov
2021 arXiv   pre-print
Estimation of semantic similarity is crucial for a variety of natural language processing (NLP) tasks.  ...  In the absence of a general theory of semantic information, many papers rely on human annotators as the source of ground truth for semantic similarity estimation.  ...  Ivan P Yamshchikov, Viacheslav Shibaev, Nikolay Khlebnikov, and Alexey Tikhonov. 2021. Style-transfer and paraphrase: Looking for a sensible semantic similarity metric.  ... 
arXiv:2109.11969v1 fatcat:4ojmomlfofhazaqnhngfvemwte

Deep Learning for Text Style Transfer: A Survey [article]

Di Jin, Zhijing Jin, Zhiting Hu, Olga Vechtomova, Rada Mihalcea
2021 arXiv   pre-print
In this paper, we present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017.  ...  Text style transfer is an important task in natural language generation, which aims to control certain attributes in the generated text, such as politeness, emotion, humor, and many others.  ...  Looking for a sensible semantic similarity metric. In Thirty-Fifth AAAI Conference on  ...  A dataset for low-resource stylized sequence-to-sequence generation.  ... 
arXiv:2011.00416v5 fatcat:wfw3jfh2mjfupbzrmnztsqy4ny

Methods for Detoxification of Texts for the Russian Language

Daryna Dementieva, Daniil Moskovskiy, Varvara Logacheva, David Dale, Olga Kozlova, Nikita Semenov, Alexander Panchenko
2021 Multimodal Technologies and Interaction  
This kind of textual style transfer can be used for processing toxic content on social media or for eliminating toxicity in automatically generated texts.  ...  In addition, we provide the training datasets and describe the evaluation setup and metrics for automatic and manual evaluation.  ...  The style change and the content preservation are crucial for style transfer.  ... 
doi:10.3390/mti5090054 fatcat:xo4snfbjbbexhictmp3syb3cnq

Deep Learning for Text Style Transfer: A Survey

Di Jin, Zhijing Jin, Zhiting Hu, Olga Vechtomova, Rada Mihalcea
2021 Computational Linguistics  
In this paper, we present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017.  ...  Text style transfer is an important task in natural language generation, which aims to control certain attributes in the generated text, such as politeness, emotion, humor, and many others.  ...  Style-transfer and paraphrase: Looking for a sensible semantic similarity metric.  ...  Wu, Yu, Yunli Wang, and Shujie Liu. 2020. A dataset for low-resource stylized  ... 
doi:10.1162/coli_a_00426 fatcat:v7vmb62ckfcu5k5mpu2pydnrxy

Text Analysis in Adversarial Settings: Does Deception Leave a Stylistic Trace? [article]

Tommi Gröndahl, N. Asokan
2019 arXiv   pre-print
Textual deception constitutes a major problem for online security.  ...  We suggest that deceptiveness as such leaves no content-invariant stylistic trace, and textual similarity measures provide superior means of classifying texts as potentially deceptive.  ...  For assessing semantic similarity, they use the MT evaluation metric Meteor [38], which measures n-gram overlap using additional paraphrase tables. They receive scores of 0.69, 0.79, and 0.29 in the gender  ... 
arXiv:1902.08939v2 fatcat:qjbxcq5fpjaubj5z5xii3v44mu

Text Analysis in Adversarial Settings

Tommi Gröndahl, N. Asokan
2019 ACM Computing Surveys  
Textual deception constitutes a major problem for online security.  ...  We suggest that deceptiveness as such leaves no content-invariant stylistic trace, and textual similarity measures provide superior means of classifying texts as potentially deceptive.  ...  For assessing semantic similarity, they use the MT evaluation metric Meteor [38], which measures n-gram overlap using additional paraphrase tables.  ... 
doi:10.1145/3310331 fatcat:563vjvd63fcdnnswmvmsxthu7e

Simple is Better! Lightweight Data Augmentation for Low Resource Slot Filling and Intent Classification [article]

Samuel Louvan, Bernardo Magnini
2020 arXiv   pre-print
We show that lightweight augmentation, a set of augmentation methods involving word span and sentence level operations, alleviates data scarcity problems.  ...  Our experiments on limited data settings show that lightweight augmentation yields significant performance improvement on slot filling on the ATIS and SNIPS datasets, and achieves competitive performance  ...  However, we do not use D to look for substitute candidates, instead we use a large pre-trained language model to generate the slot value candidates, using the fill-in-the-blank style (Donahue et al.,  ... 
arXiv:2009.03695v1 fatcat:5mc3lhxnyzh2lam3dkzp6mfj7u

Hurdles to Progress in Long-form Question Answering [article]

Kalpesh Krishna, Aurko Roy, Mohit Iyyer
2021 arXiv   pre-print
quality and can be easily gamed; and (4) human evaluations used for other text generation tasks are unreliable for LFQA.  ...  ELI5 contains significant train / validation overlap, as at least 81% of ELI5 validation questions occur in paraphrased form in the training set; (3) ROUGE-L is not an informative metric of generated answer  ...  We are very grateful to Vidhisha Balachandran, Niki Parmar, and Ashish Vaswani for weekly meetings discussing progress and the REALM team (Kenton Lee, Kelvin Guu, Ming-Wei  ... 
arXiv:2103.06332v2 fatcat:gdogrseicrbefbshlfm7nnv5zy

Question Answering Infused Pre-training of General-Purpose Contextualized Representations [article]

Robin Jia, Mike Lewis, Luke Zettlemoyer
2022 arXiv   pre-print
We propose a pre-training objective based on question answering (QA) for learning general-purpose contextual representations, motivated by the intuition that the representation of a phrase in a passage  ...  We show large improvements over both RoBERTa-large and previous state-of-the-art results on zero-shot and few-shot paraphrase detection on four datasets, few-shot named entity recognition on two datasets  ...  , Sebastian Riedel, Sewon Min, Patrick Lewis, Scott Yih, and our anonymous reviewers for their feedback.  ... 
arXiv:2106.08190v2 fatcat:bh63p5or5zcixh63hpedad7s7i

Analyzing Non-Textual Content Elements to Detect Academic Plagiarism

Norman Meuschke, Bela Gipp, Harald Reiterer, Michael L. Nelson
2021 Zenodo  
Detection approaches proposed so far analyze lexical, syntactical, and semantic text similarity. These approaches find copied, moderately reworded, and literally translated text.  ...  However, reliably detecting disguised plagiarism, such as strong paraphrases, sense-for-sense translations, and the reuse of non-textual content and ideas, is an open research problem.  ...  It combines the analysis of syntactic and semantic sentence similarity using a linear combination of two similarity metrics: i) the cosine similarity of semantic vectors and ii) the similarity of syntactic  ... 
doi:10.5281/zenodo.4913344 fatcat:xmpaahvwuva53l5l5i2gaidvi4

A Survey of the Usages of Deep Learning in Natural Language Processing [article]

Daniel W. Otter, Julian R. Medina, Jugal K. Kalita
2019 arXiv   pre-print
This survey provides a brief introduction to the field and a quick overview of deep learning architectures and methods.  ...  A discussion of the current state of the art is then provided along with recommendations for future research in the field.  ...  In this section, neural semantic processing research is separated into two distinct areas: Work on comparing the semantic similarity of two portions of text, and work on capturing and transferring meaning  ... 
arXiv:1807.10854v3 fatcat:ajyv5o743naixeo5c5y6p6tg3e

From hyperlinks to Semantic Web properties using Open Knowledge Extraction

Valentina Presutti, Andrea Giovanni Nuzzolese, Sergio Consoli, Aldo Gangemi, Diego Reforgiato Recupero, Stefan Schlobach, Krzysztof Janowicz
2016 Semantic Web Journal  
This work proposes a novel paradigm, named Open Knowledge Extraction, and its implementation (Legalo) that performs unsupervised, open domain, and abstractive knowledge extraction from text for producing  ...  such semantic relations, their subjects and objects, can be revealed by processing their linguistic traces (i.e. the sentences that embed the hyperlinks) and formalised as Semantic Web triples and ontology  ...  Two different similarity measures were computed: a string similarity score based on Jaccard distance measure, and a semantic similarity measure based on the SimLibrary framework [37].  ... 
doi:10.3233/sw-160221 fatcat:q5elzu73zbep3ncy4ygm7p5t6e

From Hyperlinks To Semantic Web Properties Using Open Knowledge Extraction

Valentina Presutti, Andrea Giovanni Nuzzolese, Sergio Consoli, Aldo Gangemi, Diego Reforgiato Recupero
2016 Zenodo  
This work proposes a novel paradigm, named Open Knowledge Extraction, and its implementation (Legalo) that performs unsupervised, open domain, and abstractive knowledge extraction from text for producing  ...  such semantic relations, their subjects and objects, can be revealed by processing their linguistic traces (i.e. the sentences that embed the hyperlinks) and formalised as Semantic Web triples and ontology  ...  Two different similarity measures were computed: a string similarity score based on Jaccard distance measure, and a semantic similarity measure based on the SimLibrary framework [36].  ... 
doi:10.5281/zenodo.1204398 fatcat:fncpq3ictzgxzkxdwfttlypofu

Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges [article]

Shikib Mehri, Jinho Choi, Luis Fernando D'Haro, Jan Deriu, Maxine Eskenazi, Milica Gasic, Kallirroi Georgila, Dilek Hakkani-Tur, Zekang Li, Verena Rieser, Samira Shaikh, David Traum (+4 others)
2022 arXiv   pre-print
The workshop explored the current state of the art along with its limitations and suggested promising directions for future work in this important and very rapidly changing area of research.  ...  This is a report on the NSF Future Directions Workshop on Automatic Evaluation of Dialog.  ...  ., 2021c) measures both semantic similarity and response fluency and the PARADISE-style model of (Walker et al., 2021) uses both predicted user ratings and dialog length.  ... 
arXiv:2203.10012v1 fatcat:c6ckt5of35andgw22q4tcvxm7u
Showing results 1 to 15 out of 831 results