5,992 Hits in 6.6 sec

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline [article]

Xiangyang Liu, Tianxiang Sun, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
2022 arXiv   pre-print
To that end, this work presents ELUE (Efficient Language Understanding Evaluation), a standard evaluation, and a public leaderboard for efficient NLP models.  ...  With ElasticBERT, the proposed ELUE has a strong Pareto Frontier and makes a better evaluation for efficient NLP models.  ...  ELUE: A Standard Benchmark for Efficient NLP Models ELUE aims to offer a standard evaluation for various efficient NLP models, such that they can be fairly and comprehensively compared.  ... 
arXiv:2110.07038v2 fatcat:etpicb7quzcmrjfbhsizixy5um

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline

Xiangyang Liu, Tianxiang Sun, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
2022 Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies   unpublished
To that end, this work presents ELUE (Efficient Language Understanding Evaluation), a standard evaluation, and a public leaderboard for efficient NLP models.  ...  With ElasticBERT, the proposed ELUE has a strong Pareto Frontier and makes a better evaluation for efficient NLP models.  ...  ELUE: A Standard Benchmark for Efficient NLP Models ELUE aims to offer a standard evaluation for various efficient NLP models, such that they can be fairly and comprehensively compared.  ... 
doi:10.18653/v1/2022.naacl-main.240 fatcat:c4xjrckk7fh2bi7ktsvoffr5xi

MixKD: Towards Efficient Distillation of Large-scale Language Models [article]

Kevin J Liang, Weituo Hao, Dinghan Shen, Yufan Zhou, Weizhu Chen, Changyou Chen, Lawrence Carin
2021 arXiv   pre-print
To verify its effectiveness, we conduct experiments on the GLUE benchmark, where MixKD consistently leads to significant gains over the standard KD training, and outperforms several competitive baselines  ...  To address these issues, we propose MixKD, a data-agnostic distillation framework that leverages mixup, a simple yet efficient data augmentation approach, to endow the resulting model with stronger generalization  ...  We conclude that the mixup ratio does not have a strong effect on overall performance.  ... 
arXiv:2011.00593v2 fatcat:z2egcdo2wva2ljuekat65dv5wu

FLEX: Unifying Evaluation for Few-Shot NLP [article]

Jonathan Bragg, Arman Cohan, Kyle Lo, Iz Beltagy
2021 arXiv   pre-print
In response, we formulate the FLEX Principles, a set of requirements and best practices for unified, rigorous, valid, and cost-sensitive few-shot NLP evaluation.  ...  Following the principles, we release the FLEX benchmark, which includes four few-shot transfer settings, zero-shot evaluation, and a public leaderboard that covers diverse NLP tasks.  ...  and feedback.  ... 
arXiv:2107.07170v2 fatcat:yioumxf2tfebdmzbesz5kriqti

Improving NLP through Marginalization of Hidden Syntactic Structure

Jason Naradowsky, Sebastian Riedel, David A. Smith
2012 Conference on Empirical Methods in Natural Language Processing  
Results show that this approach provides significant gains over a syntactically uninformed baseline, outperforming models that observe syntax on an English relation extraction task, and performing comparably  ...  We propose a novel method which avoids the need for any syntactically annotated data when predicting a related NLP task.  ...  This work was supported in part by the Center for Intelligent Information Retrieval and in part by Army prime contract number W911NF-07-1-0216 and University of Pennsylvania subaward number 103-548106.  ... 
dblp:conf/emnlp/NaradowskyRS12 fatcat:axhvgqmxnbeureo6jbtmwjwyni

Robustness Gym: Unifying the NLP Evaluation Landscape [article]

Karan Goel, Nazneen Rajani, Jesse Vig, Samson Tan, Jason Wu, Stephan Zheng, Caiming Xiong, Mohit Bansal, Christopher Ré
2021 arXiv   pre-print
In this work, we identify challenges with evaluating NLP systems and propose a solution in the form of Robustness Gym (RG), a simple and extensible evaluation toolkit that unifies 4 standard evaluation  ...  By providing a common platform for evaluation, Robustness Gym enables practitioners to compare results from all 4 evaluation paradigms with just a few clicks, and to easily develop and share novel evaluation  ...  Acknowledgements This work was part of a collaboration between Stanford, UNC, and Salesforce Research and was supported by Salesforce AI Research grants to MB and CR.  ... 
arXiv:2101.04840v1 fatcat:gky3ggyvavgr3atajqjnhgs52i

Advancing NLP with Cognitive Language Processing Signals [article]

Nora Hollenstein, Maria Barrett, Marius Troendle, Francesco Bigiolli, Nicolas Langer, Ce Zhang
2019 arXiv   pre-print
These methods significantly outperform the baselines and show the potential and current limitations of employing human language processing data for NLP.  ...  We present an extensive investigation of the benefits and limitations of using cognitive processing data for NLP.  ...  Learning word frequency as an auxiliary task is a strong baseline.  ... 
arXiv:1904.02682v1 fatcat:vtsqievrxvhrvl6sjsnetjkq54

Self-Explaining Structures Improve NLP Models [article]

Zijun Sun, Chun Fan, Qinghong Han, Xiaofei Sun, Yuxian Meng, Fei Wu, Jiwei Li
2020 arXiv   pre-print
To deal with these two issues, in this paper, we propose a simple yet general and effective self-explaining framework for deep learning models in NLP.  ...  nature, achieving a new SOTA performance of 59.1 on SST-5 and a new SOTA performance of 92.3 on SNLI.  ...  Towards these three purposes, in this paper, we propose a self-explainable framework for deep neural models in the context of NLP.  ... 
arXiv:2012.01786v2 fatcat:pn4bay6dvzhuvilod6seatcjfq

Semantic Representation and Inference for NLP [article]

Dongsheng Wang
2021 arXiv   pre-print
Semantic representation and inference is essential for Natural Language Processing (NLP).  ...  Motivated by this, we operationalize the compositionality of a phrase contextually by enriching the phrase representation with external word embeddings and knowledge graphs.  ...  Acknowledgments This research is partially supported by QUARTZ (721321, EU H2020 MSCA-ITN) and DABAI (5153-00004A, Innovation Fund Denmark).  ... 
arXiv:2106.08117v1 fatcat:qi3546wlhfd2xhqj3f776wa6km

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [article]

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela
2021 arXiv   pre-print
We fine-tune and evaluate our models on a wide range of knowledge-intensive NLP tasks and set the state-of-the-art on three open domain QA tasks, outperforming parametric seq2seq models and task-specific  ...  For language generation tasks, we find that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.  ...  The authors would also like to thank Kyunghyun Cho and Sewon Min for productive discussions and advice. EP thanks supports from the NSF Graduate Research Fellowship.  ... 
arXiv:2005.11401v4 fatcat:g3lsxujnybbbdp7lcv5tmusuri

Efficient NLP Inference at the Edge via Elastic Pipelining [article]

Liwei Guo, Wonkyo Choe, Felix Xiaozhu Lin
2022 arXiv   pre-print
Atop two commodity SoCs, we build WRX and evaluate it against a wide range of NLP tasks, under a practical range of target latencies, and on both CPU and GPU.  ...  Yet, the unprecedented size of an NLP model stresses both latency and memory, the two key resources of a mobile device.  ...  Our approach towards high efficiency is through adjusting IO workloads of model shards to the computation.  ... 
arXiv:2207.05022v2 fatcat:xfdp45nwufck3m75qdkklwjyfi

Recent advances in conversational NLP: Towards the standardization of Chatbot building [article]

Maali Mnasri
2019 arXiv   pre-print
Finally, we present an opinion piece suggesting to orient research towards the standardization of dialogue-system building.  ...  Their use is becoming more fluid and easy over time, thanks to improvements in the NLP and AI fields.  ...  Such standards should be interoperable and should allow NLP researchers to plug their NLP components into them.  ... 
arXiv:1903.09025v1 fatcat:cocknrgvdvguvjykorzmeu5zse

Post-hoc Interpretability for Neural NLP: A Survey [article]

Andreas Madsen, Siva Reddy, Sarath Chandar
2022 arXiv   pre-print
Neural networks for NLP are becoming increasingly complex and widespread, and there is growing concern about whether these models are responsible to use.  ...  Additionally, post-hoc methods provide explanations after a model is learned and are generally model-agnostic.  ...  Although human-grounded evaluation is much more efficient than application-grounded evaluation, the human aspect still takes time.  ... 
arXiv:2108.04840v4 fatcat:twveq6lt7vgahi5fbibc4sue5e

Towards Scalable and Reliable Capsule Networks for Challenging NLP Applications

Wei Zhao, Haiyun Peng, Steffen Eger, Erik Cambria, Min Yang
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics  
We validate our approach on two NLP tasks, namely: multi-label text classification and question answering.  ...  Obstacles hindering the development of capsule networks for challenging NLP applications include poor scalability to large output spaces and less reliable routing processes.  ...  outperforms strong baselines on multi-label text classification and question answering.  ... 
doi:10.18653/v1/p19-1150 dblp:conf/acl/ZhaoPECY19 fatcat:4cqppc7j55gqhbiopxywpyvfiq

Learning Structured Predictors from Bandit Feedback for Interactive NLP

Artem Sokolov, Julia Kreutzer, Christopher Lo, Stefan Riezler
2016 Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
Structured prediction from bandit feedback describes a learning scenario where, instead of having access to a gold-standard structure, a learner only receives partial feedback in the form of the loss value of a predicted structure.  ...  Acknowledgments This research was supported in part by the German research foundation (DFG), and in part by a research cooperation grant with the Amazon Development Center Germany.  ... 
doi:10.18653/v1/p16-1152 dblp:conf/acl/SokolovKLR16 fatcat:fv25oaz545gmbkzis6acdtgbs4
Showing results 1 — 15 out of 5,992 results