192 Hits in 5.1 sec

Large-Scale News Classification using BERT Language Model: Spark NLP Approach [article]

Kuncahyo Setyo Nugroho, Anantha Yullian Sukmadewa, Novanto Yudistira
2021 arXiv   pre-print
This research aims to study the effect of big data processing on NLP tasks based on a deep learning approach. We classify a large corpus of news topics by fine-tuning pre-trained BERT models.  ...  The average accuracy and training time of all models using BERT are 0.9187 and 35 minutes, while using BERT with the Spark NLP pipeline they are 0.8444 and 9 minutes.  ...  CONCLUSION: The BERT model is well suited to large-scale NLP tasks such as news classification. Larger models give higher accuracy, but take more time to complete the task.  ... 
arXiv:2107.06785v2 fatcat:wtfhguxkobaprgu3xiljqvwogu

Testing pre-trained Transformer models for Lithuanian news clustering [article]

Lukas Stankevičius, Mantas Lukoševičius
2020 arXiv   pre-print
However, non-English languages cannot directly leverage such new opportunities from models pre-trained on English text.  ...  We compare pre-trained multilingual BERT, XLM-R, and older learned text representation methods as encodings for the task of Lithuanian news clustering.  ...  Lithuanian does not yet have a BERT-scale monolingual NLP model, as it is spoken by relatively few people worldwide.  ... 
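The clustering setup this paper describes (encode each news item as a vector, then compare items by similarity) can be illustrated with one of the "older learned text representation methods" it benchmarks against. A minimal sketch, using a stdlib-only TF-IDF encoding as a stand-in for BERT embeddings; the headlines are invented examples, not data from the paper:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each document to a sparse TF-IDF vector (dict: term -> weight)."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()                       # document frequency per term
    for toks in tokenized:
        df.update(set(toks))
    n = len(tokenized)
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        vecs.append({t: (tf[t] / len(toks)) * math.log(n / df[t]) for t in tf})
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    num = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return num / (na * nb) if na and nb else 0.0

headlines = [
    "the team won the basketball game",
    "the basketball team lost the game",
    "parliament passed the new tax law",
]
vecs = tfidf_vectors(headlines)
# The two sports headlines should be closer to each other than to the politics one.
assert cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2])
```

Swapping the TF-IDF encoder for multilingual BERT or XLM-R sentence embeddings, while keeping the same similarity-based clustering step, is the comparison the paper performs.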
arXiv:2004.03461v1 fatcat:eqgj7x7zuncnbcexndj77vg2im

From English To Foreign Languages: Transferring Pre-trained Language Models [article]

Ke Tran
2020 arXiv   pre-print
With a single GPU, our approach can obtain a foreign BERT base model within a day and a foreign BERT large within two days.  ...  Pre-trained models have demonstrated their effectiveness in many downstream natural language processing (NLP) tasks.  ...  A common practice to train a large-scale multilingual model is to do so from scratch. But do multilingual models always need to be trained from scratch?  ... 
arXiv:2002.07306v2 fatcat:s444t3yg75cpppuix65lip7lii

A Roadmap for Big Model [article]

Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han (+88 others)
2022 arXiv   pre-print
We introduce 16 specific BM-related topics across those four parts; they are Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory & Interpretability  ...  With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks has become a popular paradigm.  ...  Recently, Big Language Models (BLMs) [26, 18] have become a new paradigm for learning universal language representations from large-scale unlabeled data.  ... 
arXiv:2203.14101v4 fatcat:rdikzudoezak5b36cf6hhne5u4

RobBERTje: a Distilled Dutch BERT Model [article]

Pieter Delobelle, Thomas Winters, Bettina Berendt
2022 arXiv   pre-print
Pre-trained large-scale language models such as BERT have gained a lot of attention thanks to their outstanding performance on a wide range of natural language tasks.  ...  We found that the performance of models trained on the shuffled versus non-shuffled datasets is similar for most tasks, and that randomly merging subsequent sentences in a corpus creates models that train  ...  As these models scale exponentially, distilling such large language models has received a lot of attention.  ... 
arXiv:2204.13511v1 fatcat:r7tvql6btjf6dhhil4e4ogegxe

Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey [article]

Bonan Min, Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heintz, Dan Roth
2021 arXiv   pre-print
Large, pre-trained transformer-based language models such as BERT have drastically changed the Natural Language Processing (NLP) field.  ...  We present a survey of recent work that uses these large language models to solve NLP tasks via pre-training then fine-tuning, prompting, or text generation approaches.  ...  In this section, we first provide a primer on pre-trained large language models (PLMs), then describe approaches that use frozen or fine-tuned PLMs for NLP tasks.  ... 
arXiv:2111.01243v1 fatcat:4xfjkkby2bfnhdrhmrdlliy76m

Deeper Clinical Document Understanding Using Relation Extraction [article]

Hasham Ul Haq, Veysel Kocaman, David Talby
2021 arXiv   pre-print
The system is built using the Spark NLP library which provides a production-grade, natively scalable, hardware-optimized, trainable & tunable NLP framework.  ...  First, we introduce two new RE model architectures -- an accuracy-optimized one based on BioBERT and a speed-optimized one utilizing crafted features over a Fully Connected Neural Network (FCNN).  ...  Spark NLP: Natural Language Understanding at Scale. Software Impacts, 8: 100058. WHO. 2019. ICD10. classifications/classification-of-diseases.  ... 
arXiv:2112.13259v1 fatcat:5larnhre6bd5zkuga2p5vs24ra

CodeBERT: A Pre-Trained Model for Programming and Natural Languages [article]

Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou
2020 arXiv   pre-print
We present CodeBERT, a bimodal pre-trained model for programming language (PL) and natural language (NL).  ...  This enables us to utilize both bimodal data of NL-PL pairs and unimodal data, where the former provides input tokens for model training while the latter helps to learn better generators.  ...  Successful approaches train deep neural networks on large-scale plain texts with self-supervised learning objectives.  ... 
arXiv:2002.08155v4 fatcat:zgz4uic7fvazdm5fbnkjfrefqa

The great transformer: Examining the role of large language models in the political economy of AI

Dieuwertje Luitse, Wiebke Denkena
2021 Big Data & Society  
In natural language processing (NLP), this tendency is reflected in the emergence of large language models (LLMs) like GPT-3.  ...  that is increasingly divided by access to large-scale computing power.  ...  language models to automatically generate fake news or violence-inciting posts on social media at scale (McGuffie and Newhouse, 2020).  ... 
doi:10.1177/20539517211047734 fatcat:57lewolflnadhjgynrbz3fmn24

Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling

Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen (+4 others)
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics  
We conduct the first large-scale systematic study of candidate pretraining tasks, comparing 19 different tasks both as alternatives and complements to language modeling.  ...  language modeling.  ...  For these experiments, we largely follow the procedure and architecture used by ELMo rather than BERT, but we expect similar trends with BERT-style models.  ... 
doi:10.18653/v1/p19-1439 dblp:conf/acl/WangHXPMPKTHYJC19 fatcat:rqhxn6jtgjcepclspzad3odir4

Commonsense Knowledge Reasoning and Generation with Pre-trained Language Models: A Survey [article]

Prajjwal Bhargava, Vincent Ng
2022 arXiv   pre-print
While commonsense knowledge acquisition and reasoning has traditionally been a core research topic in the knowledge representation and reasoning community, recent years have seen a surge of interest in the natural language processing community in developing pre-trained models and testing their ability to address a variety of newly designed commonsense knowledge reasoning and generation tasks.  ...  The advent of the neural natural language processing (NLP) era has revolutionized virtually all areas of NLP research.  ... 
arXiv:2201.12438v1 fatcat:tpqwhagvnvdzlhzmbm6ajhokcm

Pretrained Language Models for Text Generation: A Survey [article]

Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen
2022 arXiv   pre-print
The resurgence of deep learning has greatly advanced this field, in particular, with the help of neural generation models based on pre-trained language models (PLMs).  ...  We also include a summary of various useful resources and typical text generation applications based on PLMs.  ...  This observation sparked the development of large-scale PLMs in text generation [14, 207] .  ... 
arXiv:2201.05273v4 fatcat:pnffabspsnbhvo44gbaorhxc3a

A Word Cloud Model based on Hate Speech in an Online Social Media Environment

Valentina Ibrahim, Juhaid Abu Bakar, Nor Hazlyna Harun, Alaa Fareed Abdulateef
2021 Baghdad Science Journal  
This research aims to develop a word cloud model based on hateful words in online social media environments such as Google News.  ...  Social media are known as detector platforms that are used to measure the activities of users in the real world.  ...  The proposed approach achieved a promising outcome with a special Spark function for big data.  ... 
doi:10.21123/bsj.2021.18.2(suppl.).0937 fatcat:vqkihyqofnfa7m3gb5qhmcdfu4

Measuring and Improving Consistency in Pretrained Language Models

Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, Yoav Goldberg
2021 Transactions of the Association for Computational Linguistics  
In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge?  ...  Consistency of a model—that is, the invariance of its behavior under meaning-preserving alternations in its input—is a highly desirable property in natural language processing.  ...  conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the NSF, DARPA, or the US  ... 
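The consistency notion in this paper (invariance of a model's behavior under meaning-preserving paraphrases of its input) can be sketched as a simple pairwise-agreement score. A minimal illustration, where a lookup table of invented cloze prompts and answers stands in for a real pretrained language model:

```python
def consistency(model, paraphrase_groups):
    """Fraction of paraphrase pairs for which the model's answer is unchanged."""
    agree = total = 0
    for group in paraphrase_groups:
        preds = [model(p) for p in group]
        for i in range(len(preds)):
            for j in range(i + 1, len(preds)):
                total += 1
                agree += preds[i] == preds[j]
    return agree / total if total else 1.0

# Hypothetical stand-in for a PLM's cloze-style predictions.
answers = {
    "Seattle is located in [MASK]": "Washington",
    "Seattle can be found in [MASK]": "Washington",
    "Homeland premiered on [MASK]": "Showtime",
    "Homeland is a series from [MASK]": "HBO",
}
groups = [
    ["Seattle is located in [MASK]", "Seattle can be found in [MASK]"],
    ["Homeland premiered on [MASK]", "Homeland is a series from [MASK]"],
]
score = consistency(answers.get, groups)  # 1 agreeing pair out of 2 -> 0.5
```

A factually knowledgeable but inconsistent model can score high on accuracy yet low on this metric, which is why the paper treats consistency as a separate, desirable property.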
doi:10.1162/tacl_a_00410 fatcat:xirjzobvhngebmqq4tuow575gq

On the validity of pre-trained transformers for natural language processing in the software engineering domain [article]

Julian von der Mosel, Alexander Trautsch, Steffen Herbold
2022 arXiv   pre-print
Such models are pre-trained on large amounts of data, usually from the general domain.  ...  Transformers are the current state-of-the-art of natural language processing in many domains and are gaining traction within software engineering research as well.  ...  In conclusion, we recommend ensuring that a large amount of SE data is used for pre-training large NLP models when these models are used for SE tasks.  ... 
arXiv:2109.04738v2 fatcat:kjgg3abyvvf4jjjsgdmi47ynm4
Showing results 1–15 of 192