14 Hits in 17.1 sec

Deep Learning for Text Style Transfer: A Survey [article]

Di Jin, Zhijing Jin, Zhiting Hu, Olga Vechtomova, Rada Mihalcea
2021 arXiv   pre-print
It has a long history in the field of natural language processing, and recently has re-gained significant attention thanks to the promising performance brought by deep neural models.  ...  We also provide discussions on a variety of important topics regarding the future development of this task. Our curated paper list is at https://github.com/zhijing-jin/Text_Style_Transfer_Survey  ...
arXiv:2011.00416v5 fatcat:wfw3jfh2mjfupbzrmnztsqy4ny

Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little [article]

Koustuv Sinha, Robin Jia, Dieuwke Hupkes, Joelle Pineau, Adina Williams, Douwe Kiela
2021 arXiv   pre-print
A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in classical NLP pipelines  ...  Overall, our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper  ...  We also thank the anonymous reviewers for their constructive feedback during the reviewing phase, which helped polish the paper to its current state.  ... 
arXiv:2104.06644v2 fatcat:6arq2lp37zbctln637ommaiqvu
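
The entry above hinges on comparing models pre-trained on natural text against models pre-trained on text with perturbed word order. Below is a minimal, hypothetical sketch of that kind of perturbation; the function and corpus are purely illustrative, not the paper's actual pipeline.

```python
import random

def shuffle_word_order(sentence: str, seed: int = 0) -> str:
    """Return the sentence with its tokens randomly permuted."""
    rng = random.Random(seed)
    tokens = sentence.split()
    rng.shuffle(tokens)
    return " ".join(tokens)

corpus = [
    "the cat sat on the mat",
    "masked language models learn from co-occurrence statistics",
]
shuffled_corpus = [shuffle_word_order(s, seed=i) for i, s in enumerate(corpus)]

# An MLM pre-trained on `shuffled_corpus` sees the same word co-occurrences as
# one trained on `corpus`, but no syntactic order; comparing the two isolates
# how much downstream performance is explained by distributional cues alone.
```

If the order-shuffled model recovers most downstream performance, that supports the snippet's claim that purely distributional information largely explains the success of pre-training.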

Deep Learning for Text Style Transfer: A Survey

Di Jin, Zhijing Jin, Zhiting Hu, Olga Vechtomova, Rada Mihalcea
2021 Computational Linguistics  
It has a long history in the field of natural language processing, and recently has re-gained significant attention thanks to the promising performance brought by deep neural models.  ...  In this paper, we present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017.  ...
doi:10.1162/coli_a_00426 fatcat:v7vmb62ckfcu5k5mpu2pydnrxy

DoCoGen: Domain Counterfactual Generation for Low Resource Domain Adaptation [article]

Nitay Calderon and Eyal Ben-David and Amir Feder and Roi Reichart
2022 arXiv   pre-print
Natural language processing (NLP) algorithms have become very successful, but they still struggle when applied to out-of-distribution examples.  ...  Our model outperforms strong baselines and improves the accuracy of a state-of-the-art unsupervised DA algorithm.  ...  Acknowledgements We would like to thank the action editor and the reviewers, as well as the members of the IE@Technion NLP group for their valuable feedback and advice.  ... 
arXiv:2202.12350v2 fatcat:7uomvkwjuvcctpghiy5kxbcnpe

Paradigm Shift in Natural Language Processing [article]

Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang
2021 arXiv   pre-print
With the rapid progress of pre-trained language models, recent years have observed a rising trend of Paradigm Shift, which is solving one NLP task by reformulating it as another one.  ...  For example, we usually adopt the sequence labeling paradigm to solve a bundle of tasks such as POS-tagging, NER, Chunking, and adopt the classification paradigm to solve tasks like sentiment analysis.  ...
arXiv:2109.12575v1 fatcat:vckeva3u3va3vjr6okhuztox4y
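
As a concrete, toy illustration of the paradigms mentioned in the entry above, the sketch below contrasts a per-token sequence-labeling output with a single-label classification output; the tags and labels are made up for clarity.

```python
tokens = ["Barack", "Obama", "visited", "Paris"]

# Sequence labeling paradigm (e.g., NER): one output tag per input token.
ner_tags = ["B-PER", "I-PER", "O", "B-LOC"]

# Classification paradigm (e.g., sentiment analysis): one label per input text.
sentence = "I really enjoyed this movie"
sentiment_label = "positive"

# A "paradigm shift" reformulates one task in the mold of another, e.g. casting
# NER as span classification or text generation rather than per-token tagging.
```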

A Primer on Pretrained Multilingual Language Models [article]

Sumanth Doddapaneni, Gowtham Ramesh, Mitesh M. Khapra, Anoop Kunchukuttan, Pratyush Kumar
2021 arXiv   pre-print
variety of tasks and languages for evaluating MLLMs, (iii) analysing the performance of MLLMs on monolingual, zero-shot cross-lingual and bilingual tasks, (iv) understanding the universal language patterns (if any) learnt by MLLMs, and (v) augmenting the (often) limited capacity of MLLMs to improve their performance on seen or even unseen languages.  ...
arXiv:2107.00676v2 fatcat:jvvt6wwitvg2lc7bmttvv3aw6m

UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [article]

Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
2021 arXiv   pre-print
Massively multilingual language models such as multilingual BERT offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.  ...  Relying on matrix factorization, our methods capitalize on the existing latent knowledge about multiple languages already available in the pretrained model's embedding matrix.  ...  We thank Laura Rimell, Nils Reimers, Michael Bugert and the anonymous reviewers for insightful feedback and suggestions on a draft of this paper.  ... 
arXiv:2012.15562v3 fatcat:4zrgue5xyfbu5ft4fun5jaspx4
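
The snippet above mentions matrix factorization over the pretrained embedding matrix. The sketch below shows one generic way such a factorization could be used to initialize embeddings for a new script; the sizes, the SVD-based factorization, and the initialization scheme are assumptions for illustration, not the authors' exact method.

```python
import numpy as np

vocab_size, dim, rank = 30_000, 128, 32      # illustrative sizes only
rng = np.random.default_rng(0)
E = rng.standard_normal((vocab_size, dim))   # stands in for pretrained embeddings

# Low-rank factorization E ~ F @ B, where B is a shared, language-general basis.
U, S, Vt = np.linalg.svd(E, full_matrices=False)
F = U[:, :rank] * S[:rank]                   # per-token coefficients (vocab_size x rank)
B = Vt[:rank]                                # shared basis (rank x dim)

# For a new script, only a small coefficient matrix over the same basis needs
# to be learned; the latent knowledge captured in B is reused as-is.
new_vocab_size = 5_000
F_new = rng.standard_normal((new_vocab_size, rank)) * 0.02
E_new = F_new @ B                            # embeddings for the new vocabulary
```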

Contrastive Learning of Sociopragmatic Meaning in Social Media [article]

Chiyu Zhang, Muhammad Abdul-Mageed, Ganesh Jawahar
2022 arXiv   pre-print
Our framework outperforms other contrastive learning frameworks for both in-domain and out-of-domain data, across both the general and few-shot settings.  ...  For example, compared to two popular pre-trained language models, our method obtains an improvement of 11.66 average F_1 on 16 datasets when fine-tuned on only 20 training samples per dataset.  ...  Acknowledgements We gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada (NSERC; RGPIN-2018-04267), the Social Sciences and Humanities Research Council of  ... 
arXiv:2203.07648v2 fatcat:6zmhiogvirdlznoaqonyuesc54
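
As background for the entry above, the sketch below shows a generic InfoNCE-style contrastive objective over paired representations; it is a common member of the contrastive-learning family, not necessarily the paper's exact loss or positive-pair construction for sociopragmatic meaning.

```python
import numpy as np

def info_nce_loss(anchors: np.ndarray, positives: np.ndarray, temperature: float = 0.07) -> float:
    """Mean cross-entropy of identifying each anchor's positive among all candidates in the batch."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature             # (batch, batch) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))   # diagonal entries are the true pairs

rng = np.random.default_rng(0)
z_a = rng.standard_normal((8, 128))  # e.g. representations of 8 posts
z_p = rng.standard_normal((8, 128))  # representations of their positive pairs
loss = info_nce_loss(z_a, z_p)
```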

Intrinsic Gradient Compression for Federated Learning [article]

Luke Melas-Kyriazi, Franklyn Wang
2021 arXiv   pre-print
One of the largest barriers to wider adoption of federated learning is the communication cost of sending model updates from and to the clients, which is accentuated by the fact that many of these devices  ...  Specifically, we present three algorithms in this family with different levels of upload and download bandwidth for use in various federated settings, along with theoretical guarantees on their performance  ...
arXiv:2112.02656v1 fatcat:bmkxosl22rgnln5ikdbaayzofi
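
The entry above describes a family of algorithms that reduce communication by sending compressed model updates. The sketch below is a hedged illustration of the general idea, compressing a gradient through a shared low-dimensional projection; it is not necessarily one of the paper's three algorithms, and all sizes are illustrative.

```python
import numpy as np

d, k = 10_000, 64                     # full parameter count vs. intrinsic dimension (toy sizes)
rng = np.random.default_rng(0)

# A projection matrix regenerated from a seed shared by client and server, so
# only the k-dimensional compressed update needs to be communicated.
A = rng.standard_normal((d, k)) / np.sqrt(k)

grad = rng.standard_normal(d)         # stands in for a client's local gradient
compressed = A.T @ grad               # upload k floats instead of d
reconstructed = A @ compressed        # server-side approximate reconstruction
```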

Evaluating Explanations: How much do explanations from the teacher aid students? [article]

Danish Pruthi, Rachit Bansal, Bhuwan Dhingra, Livio Baldini Soares, Michael Collins, Zachary C. Lipton, Graham Neubig, William W. Cohen
2021 arXiv   pre-print
In this work, we introduce a framework to quantify the value of explanations via the accuracy gains that they confer on a student model trained to simulate a teacher model.  ...  Using our framework, we compare numerous attribution methods for text classification and question answering, and observe quantitative differences that are consistent (to a moderate to high degree) across  ...
arXiv:2012.00893v2 fatcat:czmvmj4525fcdffimhaxqxtgdu
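
The framework in the entry above values an explanation by how much it improves a student's ability to simulate the teacher. Below is a minimal sketch of that metric; `student`, `teacher`, and their `predict` methods are hypothetical placeholders, not the paper's implementation.

```python
def simulation_accuracy(student, teacher, examples):
    """Fraction of held-out examples on which the student reproduces the teacher's prediction."""
    agree = sum(student.predict(x) == teacher.predict(x) for x in examples)
    return agree / len(examples)

# student_plain     : trained on (input, teacher prediction) pairs only
# student_explained : trained on the same pairs plus the teacher's explanations
# The value assigned to an explanation method is the simulation-accuracy gain:
#   gain = simulation_accuracy(student_explained, teacher, test_set)
#        - simulation_accuracy(student_plain, teacher, test_set)
```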

ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models [article]

Junyi Li, Tianyi Tang, Zheng Gong, Lixin Yang, Zhuohao Yu, Zhipeng Chen, Jingyuan Wang, Wayne Xin Zhao, Ji-Rong Wen
2022 arXiv   pre-print
Moreover, the prediction results of PLMs in our experiments are released as an open resource for deeper and more detailed analysis on the language abilities of PLMs.  ...  In this paper, we present a large-scale empirical study on general language ability evaluation of PLMs (ElitePLM).  ...
arXiv:2205.01523v1 fatcat:d2qusgoj75aefa32btqgtkybdi

DoCoGen: Domain Counterfactual Generation for Low Resource Domain Adaptation

Nitay Calderon, Eyal Ben-David, Amir Feder, Roi Reichart
2022 Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)   unpublished
Natural language processing (NLP) algorithms have become very successful, but they still struggle when applied to out-of-distribution examples.  ...  Our model outperforms strong baselines and improves the accuracy of a state-of-the-art unsupervised DA algorithm.  ...  Acknowledgements We would like to thank the action editor and the reviewers, as well as the members of the IE@Technion NLP group for their valuable feedback and advice.  ...
doi:10.18653/v1/2022.acl-long.533 fatcat:lkukck3wn5d4hdcbn6xidk7smm

UNKs Everywhere: Adapting Multilingual Language Models to New Scripts

Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
2021 Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing   unpublished
Massively multilingual language models such as multilingual BERT offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.  ...  Relying on matrix factorization, our methods capitalize on the existing latent knowledge about multiple languages already available in the pretrained model's embedding matrix.  ...
doi:10.18653/v1/2021.emnlp-main.800 fatcat:yfk6loxis5agbj2or3rrjjlyiq

Softmax Bottleneck Makes Language Models Unable to Represent Multi-mode Word Distributions

Haw-Shiuan Chang, Andrew McCallum
2022 Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)   unpublished
The softmax layer produces the distribution based on the dot products of a single hidden state and the embeddings of words in the vocabulary.  ...  Neural language models (LMs) such as GPT-2 estimate the probability distribution over the next word by a softmax over the vocabulary.  ...  Technology Collaborative, in part by the National Science Foundation (NSF) grant numbers DMR-1534431, IIS-1763618, and IIS-1955567, and in part by the Office of Naval  ... 
doi:10.18653/v1/2022.acl-long.554 fatcat:oz46bx624vhnpd2jh6gboma5ju
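
The abstract snippet above describes the standard softmax output layer: one hidden state is dotted with every word embedding and normalized. Below is a small sketch of that computation with toy sizes; because a single hidden state induces the entire distribution, its expressiveness is limited in the way the paper's title calls the softmax bottleneck.

```python
import numpy as np

def next_word_distribution(hidden_state: np.ndarray, word_embeddings: np.ndarray) -> np.ndarray:
    """Softmax over dot products between one hidden state and every word embedding."""
    logits = word_embeddings @ hidden_state      # (vocab_size,)
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

rng = np.random.default_rng(0)
vocab_size, dim = 10_000, 256                    # toy sizes
E = rng.standard_normal((vocab_size, dim))       # output word embeddings
h = rng.standard_normal(dim)                     # final hidden state for one position
p = next_word_distribution(h, E)                 # a single hidden state fixes the whole distribution
```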