1,299 Hits in 1.8 sec

Unsupervised Paraphrasing with Pretrained Language Models [article]

Tong Niu, Semih Yavuz, Yingbo Zhou, Nitish Shirish Keskar, Huan Wang, Caiming Xiong
2021 arXiv   pre-print
To address this drawback, we adopt a transfer learning approach and propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.  ...  We also demonstrate that our model transfers to paraphrasing in other languages without any additional finetuning.  ...  The effectiveness of BERT-score in identifying text similarity hints that pretrained language models are equipped with extensive knowledge in paraphrasing.  ... 
arXiv:2010.12885v2 fatcat:wcronrkhx5cidasbpv7uvmmbdu

Unsupervised Paraphrasing with Pretrained Language Models

Tong Niu, Semih Yavuz, Yingbo Zhou, Nitish Shirish Keskar, Huan Wang, Caiming Xiong
2021 Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing   unpublished
To address this drawback, we adopt a transfer learning approach and propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.  ...  We also demonstrate that our model transfers to paraphrasing in other languages without any additional finetuning.  ...  The effectiveness of BERT-score in identifying text similarity hints that pretrained language models are equipped with extensive knowledge in paraphrasing.  ... 
doi:10.18653/v1/2021.emnlp-main.417 fatcat:3vnlpy4vwna7xnu35cmghxf63q
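
Both records above point to BERT-score as evidence that pretrained language models already encode paraphrase knowledge. A minimal sketch of that kind of similarity check, assuming the pip-installable bert-score package (the sentence pair and default model choice are illustrative, not taken from the paper):

# pip install bert-score  (assumed dependency; not part of the paper's pipeline)
from bert_score import score

candidates = ["The movie was fantastic."]
references = ["The film was great."]

# An F1 close to 1.0 means the pretrained model treats the pair as near-paraphrases.
P, R, F1 = score(candidates, references, lang="en")
print(f"BERTScore F1: {F1.item():.3f}")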

MUSS: Multilingual Unsupervised Sentence Simplification by Mining Paraphrases [article]

Louis Martin, Angela Fan, Éric de la Clergerie, Antoine Bordes, Benoît Sagot
2021 arXiv   pre-print
These models leverage unsupervised pretraining and controllable generation mechanisms to flexibly adjust attributes such as length and lexical complexity at inference time.  ...  MUSS uses a novel approach to sentence simplification that trains strong models using sentence-level paraphrase data instead of proper simplification data.  ...  Subsequently, we finetune pretrained models augmented with controllable mechanisms on the paraphrase corpora to achieve sentence simplification models in any language.  ... 
arXiv:2005.00352v2 fatcat:m2dyquni35d7fi37q3rygoyzza
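
A rough sketch of the "controllable generation mechanisms" mentioned above: discrete control tokens prepended to the input so a seq2seq model can be conditioned on target attributes such as length and lexical complexity at inference time. The token names and ratio values below are hypothetical illustrations, not MUSS's actual format:

# Hypothetical control-token scheme; token names and values are illustrative only.
def add_control_tokens(source: str, length_ratio: float = 0.8, lexical_complexity: float = 0.75) -> str:
    # Prepend discrete control tokens so a fine-tuned seq2seq model can be steered at inference time.
    controls = f"<LENGTH_{length_ratio:.2f}> <LEXCOMPLEX_{lexical_complexity:.2f}>"
    return f"{controls} {source}"

print(add_control_tokens("The committee has postponed its decision until next month."))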

Unsupervised Text Generation by Learning from Search [article]

Jingjing Li, Zichao Li, Lili Mou, Xin Jiang, Michael R. Lyu, Irwin King
2020 arXiv   pre-print
Our model significantly outperforms unsupervised baseline methods in both tasks. In particular, it achieves performance comparable to state-of-the-art supervised methods in paraphrase generation.  ...  We demonstrate the effectiveness of TGLS on two real-world natural language generation tasks, paraphrase generation and text formalization.  ...  All variants use pretrained language models.  ... 
arXiv:2007.08557v1 fatcat:l4obb6esbvdfbhwdxwt2haicgy

ConRPG: Paraphrase Generation using Contexts as Regularizer [article]

Yuxian Meng, Xiang Ao, Qing He, Xiaofei Sun, Qinghong Han, Fei Wu, Chun Fan, Jiwei Li
2021 arXiv   pre-print
Inspired by this fundamental idea, we propose a pipelined system which consists of paraphrase candidate generation based on contextual language models, candidate filtering using scoring functions, and  ...  In this paper, we propose an unsupervised paradigm for paraphrase generation based on the assumption that the probabilities of generating two sentences with the same meaning given the same context should  ...  More recently, large-scale language model pretraining has also been proven to benefit paraphrase generation in both supervised learning (Witteveen and Andrews, 2019) and unsupervised learning (Hegde  ... 
arXiv:2109.00363v1 fatcat:d6whrwejwbbeznwjmyrq3x5w4q
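
The generate-then-filter pipeline described above can be sketched as follows. This is a hedged illustration using the Hugging Face transformers API with an arbitrary seq2seq checkpoint and a placeholder scorer; it is not the authors' context-conditioned models or scoring functions:

# Rough sketch only: "t5-base" and the overlap-based scorer are stand-ins,
# not the models or scoring functions used in ConRPG.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

def generate_candidates(sentence: str, n: int = 8) -> list:
    # Over-generate paraphrase candidates by sampling from the language model.
    inputs = tokenizer("paraphrase: " + sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(
            **inputs, do_sample=True, top_p=0.95,
            num_return_sequences=n, max_new_tokens=64,
        )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

def score_candidate(candidate: str, source: str) -> float:
    # Placeholder scorer: real systems combine semantic-similarity and diversity criteria.
    overlap = len(set(candidate.lower().split()) & set(source.lower().split()))
    return overlap - 10.0 * float(candidate.strip() == source.strip())

def paraphrase(sentence: str) -> str:
    candidates = generate_candidates(sentence)
    return max(candidates, key=lambda c: score_candidate(c, sentence))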

Several Experiments on Investigating Pretraining and Knowledge-Enhanced Models for Natural Language Inference [article]

Tianda Li, Xiaodan Zhu, Quan Liu, Qian Chen, Zhigang Chen, Si Wei
2019 arXiv   pre-print
Recent work on unsupervised pretraining that leverages unsupervised signals such as language-model and sentence prediction objectives has been shown to be very effective on a wide range of NLP problems.  ...  In addition, external knowledge that does not exist in the limited amount of NLI training data may be added to NLI models in two typical ways, e.g., from human-created resources or an unsupervised pretraining  ...  As a recent advance in learning representations for natural language, unsupervised pretraining that leverages large unannotated data using language-model or sentence prediction objectives has been shown to  ... 
arXiv:1904.12104v1 fatcat:qszwftfahnacli7zejfqsufkze

Simple Unsupervised Summarization by Contextual Matching

Jiawei Zhou, Alexander Rush
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics  
We propose an unsupervised method for sentence summarization using only language modeling.  ...  The approach employs two language models, one that is generic (i.e. pretrained), and the other that is specific to the target domain.  ...  The key aspect of this technique is the use of a pretrained language model for unsupervised contextual matching, i.e. unsupervised paraphrasing.  ... 
doi:10.18653/v1/p19-1503 dblp:conf/acl/ZhouR19 fatcat:wg5luxk52rgytpw3lqdcuuqlse

Simple Unsupervised Summarization by Contextual Matching [article]

Jiawei Zhou, Alexander M. Rush
2019 arXiv   pre-print
We propose an unsupervised method for sentence summarization using only language modeling.  ...  The approach employs two language models, one that is generic (i.e. pretrained), and the other that is specific to the target domain.  ...  The key aspect of this technique is the use of a pretrained language model for unsupervised contextual matching, i.e. unsupervised paraphrasing.  ... 
arXiv:1907.13337v1 fatcat:rc4lic4mdbao3nmfjjrw6yqapu
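
The two records above describe combining a generic pretrained LM with a domain-specific LM when scoring summary candidates. The helper below is a generic product-of-experts illustration with hypothetical scorer callables; the paper's generic model actually drives contextual matching rather than plain log-probabilities:

from typing import Callable

def combine_lm_scores(
    candidate: str,
    generic_score: Callable[[str], float],   # e.g. a contextual-matching score from a generic pretrained LM
    domain_logprob: Callable[[str], float],  # e.g. log-probability under a small domain-specific LM
    lam: float = 0.5,
) -> float:
    # Weighted combination in log space: both models must find the candidate plausible.
    return lam * generic_score(candidate) + (1.0 - lam) * domain_logprob(candidate)

# Toy usage with stand-in scorers (length penalties, purely illustrative):
fake_generic = lambda s: -0.5 * len(s.split())
fake_domain = lambda s: -0.7 * len(s.split())
print(combine_lm_scores("stocks rally after strong earnings", fake_generic, fake_domain))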

Reflective Decoding: Beyond Unidirectional Generation with Off-the-Shelf Language Models [article]

Peter West, Ximing Lu, Ari Holtzman, Chandra Bhagavatula, Jena Hwang, Yejin Choi
2021 arXiv   pre-print
Publicly available, large pretrained Language Models (LMs) generate text with remarkable quality, but only sequentially from left to right.  ...  Comprehensive empirical results demonstrate that Reflective Decoding outperforms strong unsupervised baselines on both paraphrasing and abductive text infilling, significantly narrowing the gap between  ...  Further, in abductive natural language generation it outperforms unsupervised baselines by a significant margin and halves the gap with supervised models.  ... 
arXiv:2010.08566v4 fatcat:oq76p62zgfblpnw45b34l4tlqe

From Paraphrasing to Semantic Parsing: Unsupervised Semantic Parsing via Synchronous Semantic Decoding [article]

Shan Wu, Bo Chen, Chunlei Xin, Xianpei Han, Le Sun, Weipeng Zhang, Jiansong Chen, Fan Yang, Xunliang Cai
2021 arXiv   pre-print
Specifically, we reformulate semantic parsing as a constrained paraphrasing problem: given an utterance, our model synchronously generates its canonical utterance and meaning representation.  ...  In this paper, we propose an unsupervised semantic parsing method - Synchronous Semantic Decoding (SSD), which can simultaneously resolve the semantic gap and the structure gap by jointly leveraging paraphrasing  ...  Effect of Pretrained Language Models: To analyze the effect of PLMs, we show the results with different PLM settings: instead of T5 we use GPT-2 or randomly initialized transformers to construct paraphrasing  ... 
arXiv:2106.06228v1 fatcat:qjz7uhi4ivctpnzcdazge5jsm4

Data Augmentation Approaches in Natural Language Processing: A Survey [article]

Bohan Li, Yutai Hou, Wanxiang Che
2021 arXiv   pre-print
It is widely applied in computer vision and was then introduced to natural language processing, where it achieves improvements in many tasks.  ...  In this survey, we frame DA methods into three categories based on the diversity of augmented data, including paraphrasing, noising, and sampling.  ...  More Exploration on Pretrained Language Models.  ... 
arXiv:2110.01852v2 fatcat:io56z2tfoffa7bm3ay764a2nwe

MOVER: Mask, Over-generate and Rank for Hyperbole Generation [article]

Yunxiang Zhang, Xiaojun Wan
2022 arXiv   pre-print
Automatic and human evaluation results show that our model is effective at generating hyperbolic paraphrase sentences and outperforms several baseline systems.  ...  Despite being a common figure of speech, hyperbole is under-researched in Figurative Language Processing.  ...  Unsupervised Paraphrase Generation Unsupervised paraphrase generation models (Wieting et al., 2017; Zhang et al., 2019a; Roy and Grangier, 2019; Huang and Chang, 2021) do not require paraphrase pairs  ... 
arXiv:2109.07726v2 fatcat:debr3n6yezhv5hxzyn3xc45tdu

An Unsupervised Method for Building Sentence Simplification Corpora in Multiple Languages [article]

Xinyu Lu and Jipeng Qiang and Yun Li and Yunhao Yuan and Yi Zhu
2021 arXiv   pre-print
Building SS corpora with an unsupervised approach can satisfy the expectation that the aligned sentences preserve the same meaning while differing in text complexity level.  ...  the source and target language of a translation corpus.  ...  For the BART-based model used for French and Spanish, we adopt a multilingual pretrained BART (mBART) with weights pretrained on 25 languages.  ... 
arXiv:2109.00165v1 fatcat:swgwctaobbek5dhktltcnrgryy

DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations [article]

John Giorgi, Osvald Nitski, Bo Wang, Gary Bader
2021 arXiv   pre-print
When used to extend the pretraining of transformer-based language models, our approach closes the performance gap between unsupervised and supervised pretraining for universal sentence encoders.  ...  Our code and pretrained models are publicly available and can be easily adapted to new domains or used to embed unseen text.  ...  Bold: best scores. ∆: difference to DeCLUTR-base average score. ↑ and ↓ denote increased or decreased performance with respect to the underlying pretrained model. *: Unsupervised evaluations.  ... 
arXiv:2006.03659v4 fatcat:poh7gosywrh2ressnn2yfjtdry
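
The contrastive pretraining extension described above can be sketched with a standard InfoNCE loss over span embeddings. This is a generic PyTorch formulation (the temperature and batch construction are illustrative), not the authors' exact training code:

import torch
import torch.nn.functional as F

def info_nce_loss(anchors: torch.Tensor, positives: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    # anchors / positives: (batch, dim) embeddings of two spans sampled from the same document;
    # spans from other documents in the batch act as in-batch negatives.
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.t() / temperature                      # pairwise cosine similarities
    targets = torch.arange(a.size(0), device=a.device)    # the matching span is the positive
    return F.cross_entropy(logits, targets)

# Toy usage with random embeddings:
print(info_nce_loss(torch.randn(4, 128), torch.randn(4, 128)).item())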

On Systematic Style Differences between Unsupervised and Supervised MT and an Application for High-Resource Machine Translation [article]

Kelly Marchisio, Markus Freitag, David Grangier
2022 arXiv   pre-print
We compare translations from supervised and unsupervised MT systems of similar quality, finding that unsupervised output is more fluent and more structurally different in comparison to human translation  ...  Modern unsupervised machine translation (MT) systems reach reasonable translation quality under clean and controlled data conditions.  ...  MASS is an encoder-decoder trained jointly with a masked language modeling objective on monolingual data. Iterative back-translation (BT) follows pretraining.  ... 
arXiv:2106.15818v2 fatcat:2r76w6wwq5e4rdguubtmenzone
Showing results 1 — 15 out of 1,299 results