
Cross-lingual COVID-19 Fake News Detection [article]

Jiangshu Du, Yingtong Dou, Congying Xia, Limeng Cui, Jing Ma, Philip S. Yu
2021 arXiv   pre-print
In this paper, we make the first attempt to detect COVID-19 misinformation in a low-resource language (Chinese) only using the fact-checked news in a high-resource language (English).  ...  We start by curating a Chinese real&fake news dataset according to existing fact-checking information.  ...  This work is supported in part by NSF under grants III-1763325, III-1909323, III-2106758, and SaTC-1930941. Jing Ma was supported by HKBU direct grant (Ref. AIS 21-22/02).  ... 
arXiv:2110.06495v2 fatcat:ry3vjtglkjhzhawwpphu4a6ggy

No Rumours Please! A Multi-Indic-Lingual Approach for COVID Fake-Tweet Detection [article]

Debanjana Kar, Mohit Bhardwaj, Suranjana Samanta, Amar Prakash Azad
2020 arXiv   pre-print
We also propose a zero-shot learning approach to alleviate the data scarcity issue for such low resource languages.  ...  Towards this, we propose an approach to detect fake news about COVID-19 early on from social media, such as tweets, for multiple Indic-Languages besides English.  ...  CONCLUDING REMARKS In this work, we propose a multilingual approach to detect fake news about COVID-19 from Twitter posts for multiple Indic-Languages.  ... 
arXiv:2010.06906v1 fatcat:fyfih46qcrh3teazrf6dahtlhq

Overcoming Language Disparity in Online Content Classification with Multimodal Learning [article]

Gaurav Verma, Rohit Mujumdar, Zijie J. Wang, Munmun De Choudhury, Srijan Kumar
2022 arXiv   pre-print
Our comparative analyses on three detection tasks focusing on crisis information, fake news, and emotion recognition, as well as five high-resource non-English languages, demonstrate that: (a) detection  ...  frameworks based on pre-trained large language models like BERT and multilingual-BERT systematically perform better on the English language compared against non-English languages, and (b) including images  ...  We acknowledge Shivangi Singhal (IIIT-Delhi, India) for providing us with the pre-processed multimodal fake news dataset.  ... 
arXiv:2205.09744v1 fatcat:gpjomtyxljcgfgzrkvrnrqsite

Coarse and Fine-Grained Hostility Detection in Hindi Posts using Fine Tuned Multilingual Embeddings [article]

Arkadipta De, Venkatesh E, Kaushal Kumar Maurya, Maunendra Sankar Desarkar
2021 arXiv   pre-print
We view this hostility detection as a multi-label multi-class classification problem. We propose an effective neural network-based technique for hostility detection in Hindi posts.  ...  The hostility detection task has been well explored for resource-rich languages like English, but is unexplored for resource-constrained languages like Hindi due to the unavailability of large suitable  ...  A lexicon-based approach is proposed by [7] to hate speech detection in web discourses viz. web forums, blogs, etc.  ... 
arXiv:2101.04998v1 fatcat:z4bdmqg7mzdlbggjh7am472o2q

Semi-automatic Generation of Multilingual Datasets for Stance Detection in Twitter

Elena Zotova, Rodrigo Agerri, German Rigau
2021 Expert systems with applications  
Although some efforts have recently been made to develop annotated data in other languages, there is a telling lack of resources to facilitate multilingual and crosslingual research on stance detection  ...  This paper presents a method to obtain multilingual datasets for stance detection in Twitter.  ...  Rodrigo Agerri is also funded by the RYC-2017-23647 fellowship and acknowledges the donation of a Titan V GPU by the NVIDIA Corporation.  ... 
doi:10.1016/j.eswa.2020.114547 fatcat:ej4fykkhnjbhdcwocwah3meedq

Evaluation of Deep Learning Models for Hostility Detection in Hindi Text [article]

Ramchandra Joshi, Rushabh Karnavat, Kaustubh Jirapure, Raviraj Joshi
2021 arXiv   pre-print
The problem is more pronounced in languages like Hindi which are low in resources. In this work, we present approaches for hostile text detection in the Hindi language.  ...  Two variations of pre-trained multilingual transformer language models mBERT and IndicBERT are used. We show that the performance of BERT based models is best.  ...  We would like to express our gratitude towards our mentors at L3Cube for their continuous support and encouragement.  ... 
arXiv:2101.04144v4 fatcat:2334yi6zlzflpdv7noy7n7qigq

Arabic fake news detection based on deep contextualized embedding models

Ali Bou Nassif, Ashraf Elnagar, Omar Elgendy, Yaman Afadar
2022 Neural computing & applications (Print)  
Many studies have been conducted to help detect fake news in English, but research conducted on fake news detection in the Arabic language is scarce.  ...  Second, we have developed and evaluated transformer-based classifiers to identify fake news while utilizing eight state-of-the-art Arabic contextualized embedding models.  ...  Arabic Language Processing  ... 
doi:10.1007/s00521-022-07206-4 pmid:35529091 pmcid:PMC9063258 fatcat:7fbddkahbnftpeqm77urtiboiq

Current Approaches and Applications in Natural Language Processing

Arturo Montejo-Ráez, Salud María Jiménez-Zafra
2022 Applied Sciences  
Artificial Intelligence has gained a lot of popularity in recent years thanks to the advent of, mainly, Deep Learning techniques [...]  ...  This is used in a multimodal classification system applied in fake news detection.  ...  It is remarkable how multilinguality is fostering research to cover what are considered "low-resourced" languages (i.e., those different from English or Chinese).  ... 
doi:10.3390/app12104859 fatcat:yhoyyoqcazflrbx7veksnkrrdq

Detecting Deceptive Utterances Using Deep Pre-Trained Neural Networks

Aleksander Wawer, Justyna Sarzyńska-Wawer
2022 Applied Sciences  
Our study aims to investigate the performance of automated lie detection methods, namely the most recent breed of pre-trained transformer neural networks capable of processing the Polish language.  ...  We also explored model interpretability based on integrated gradient to shed light on classifier decisions.  ...  , or in the decision to publish the results.  ... 
doi:10.3390/app12125878 fatcat:76whuoaz5vhevimomo4ggzrkly

Text Analytics in Bulgarian: An Overview and Future Directions

Gloria Hristova
2021 Cybernetics and Information Technologies  
A review of key research articles in two main directions is provided – development of language resources for Bulgarian and experimenting with Bulgarian text data in practical applications.  ...  By summarizing the results of a large literature review, we draw conclusions about the degree of development of the field, the availability of language resources for the Bulgarian language and the extent  ...  [32] are the first to address the problem of fake news detection as a task to distinguish between serious news and news that are designed to make the reader believe they are real (as opposed to humorous  ... 
doi:10.2478/cait-2021-0027 fatcat:qtpliu7q7fhchhi5wv3ca4di6q

Matching Tweets With Applicable Fact-Checks Across Languages [article]

Ashkan Kazemi, Zehua Li, Verónica Pérez-Rosas, Scott A. Hale, Rada Mihalcea
2022 arXiv   pre-print
An important challenge for news fact-checking is the effective dissemination of existing fact-checks. This in turn brings the need for reliable methods to detect previously fact-checked claims.  ...  We conduct both classification and retrieval experiments, in monolingual (English only), multilingual (Spanish, Portuguese), and cross-lingual (Hindi-English) settings using multilingual transformer models  ...  More recently, Kazemi et al. (2021a) focused on matching claims that can be served with one fact-check in five low and high-resource languages.  ... 
arXiv:2202.07094v1 fatcat:2k7e6e3gk5es7hvyznodb3ezbi

Detecting Social Media Manipulation in Low-Resource Languages [article]

Samar Haider, Luca Luceri, Ashok Deb, Adam Badawy, Nanyun Peng, Emilio Ferrara
2020 arXiv   pre-print
Here, we investigate whether and to what extent malicious actors can be detected in low-resource language settings.  ...  We first learn an embedding model for each language, namely a high-resource language (English) and a low-resource one (Tagalog), independently.  ...  In this paper, we posed the problem of detecting social media abuse in low-resource languages.  ... 
arXiv:2011.05367v1 fatcat:vxqo2b5bijgvllzygv7vpwuxsq

The Impact of Translating Resource-Rich Datasets to Low-Resource Languages Through Multi-Lingual Text Processing

Abdul Ghafoor, Ali Shariq Imran, Sher Muhammad Daudpota, Zenun Kastrati, Abdullah, Rakhi Batra, Mudasir Ahmad Wani
2021 IEEE Access  
Our study shows 2-3 percentage points performance degradation in sentiment classification due to polarity shift as a result of translation from resource-rich languages to low-resource languages.  ...  This study evaluates the effect of translation on the sentiment classification task from a resource-rich language to a low-resource language.  ...  This approach uses machine translation technique to translate a dataset from resource-rich language to low-resource language.  ... 
doi:10.1109/access.2021.3110285 fatcat:iqckm343qfd6pmwrgaboaevtqe

Deep Learning Models for Multilingual Hate Speech Detection [article]

Sai Saketh Aluru, Binny Mathew, Punyajoy Saha, Animesh Mukherjee
2020 arXiv   pre-print
In this paper, we conduct a large scale analysis of multilingual hate speech in 9 languages from 16 different sources.  ...  Hate speech detection is a challenging problem with most of the datasets available in only one language: English.  ...  In general, LASER + LR performs well in low resource setting and BERT based models are better in high resource settings.  Language Low resource High resource Arabic Monolingual, LASER + LR Multilingual  ... 
arXiv:2004.06465v3 fatcat:t2ds5n3tqjc4te3aj2tvxubl7a

BanFakeNews: A Dataset for Detecting Fake News in Bangla [article]

Md Zobaer Hossain, Md Ashraful Rahman, Md Saiful Islam, Sudipta Kar
2020 arXiv   pre-print
In this work, we propose an annotated dataset of ~50K news that can be used for building automated fake news detection systems for a low resource language like Bangla.  ...  We expect this dataset will be a valuable resource for building technologies to prevent the spreading of fake news and contribute in research with low resource languages.  ...  To evaluate the scope of such a language model in our work, we use the multilingual BERT model to classify news documents.  ... 
arXiv:2004.08789v1 fatcat:loagpv34nfgsxmng7npowtshha