Mitigation of Unintended Biases against Non-Native English Texts in Sentiment Analysis

Alina Zhiltsova, Simon Caton, Catherine Mulway
2019 Irish Conference on Artificial Intelligence and Cognitive Science  
This research investigates previously undefined non-native speaker bias in sentiment analysis, i.e. unintended discrimination against English texts written by non-native speakers of English.  ...  The tools gave significantly different scores to English texts with features of non-native speakers.  ...  and, where necessary, mitigating bias in lexicon-based sentiment analysis systems against English texts written by non-native speakers (speakers of French, Italian and Spanish)?'  ... 
dblp:conf/aics/ZhiltsovaCM19 fatcat:4eacai3t75e2lawk3kwvbouox4

BiasFinder: Metamorphic Test Generation to Uncover Bias for Sentiment Analysis Systems [article]

Muhammad Hilmi Asyrofi, Zhou Yang, Imam Nur Bani Yusuf, Hong Jin Kang, Ferdian Thung, David Lo
2021 arXiv   pre-print
Such biases manifest in an SA system when it predicts a different sentiment for similar texts that differ only in the characteristic of individuals described.  ...  Artificial Intelligence (AI) software systems, such as Sentiment Analysis (SA) systems, typically learn from large amounts of data that may reflect human biases.  ...  In general, sentiment analysis can be viewed as a specialized text classification with class labels corresponding to the sentiment polarities of the text.  ... 
arXiv:2102.01859v2 fatcat:3hsuzvu57nc5lj2kxljs3sedzi

Language (Technology) is Power: A Critical Survey of "Bias" in NLP [article]

Su Lin Blodgett and Solon Barocas and Hal Daumé III and Hanna Wallach
2020 arXiv   pre-print
We further find that these papers' proposed quantitative techniques for measuring or mitigating "bias" are poorly matched to their motivations and do not engage with the relevant literature outside of  ...  "---i.e., what kinds of system behaviors are harmful, in what ways, to whom, and why, as well as the normative reasoning underlying these statements---and to center work around the lived experiences of  ...  Any opinion, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.  ... 
arXiv:2005.14050v2 fatcat:nyzr4fj5gne55jzhzm2fy5xeqa

How to keep text private? A systematic review of deep learning methods for privacy-preserving natural language processing [article]

Samuel Sousa, Roman Kern
2022 arXiv   pre-print
Further, we discuss open challenges in privacy-preserving NLP regarding data traceability, computation overhead, dataset size, the prevalence of human biases in embeddings, and the privacy-utility tradeoff  ...  Deep learning (DL) models for natural language processing (NLP) tasks often handle private data, demanding protection against breaches and disclosures.  ...  Statements and declarations All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter  ... 
arXiv:2205.10095v1 fatcat:rksy7oxxlbde5bol3ay44yycru

Race, Language, and the Passive Voice: Hardship narratives in U.S. Social Studies Textbooks from 1860 to the present

Jeremy Jimenez
2020 Journal of Social Studies Education Research  
Adapting Mark Phillips' (2013) concept of historical distance coupled with a form of linguistic analysis known as stylistics, I examine 50 U.S. history textbooks from 1860 to 2016 in order to analyze which  ...  While United States historians' inclination to write in affect-inducing ways has waxed and waned throughout the past 150 years, racial biases concerning such writing have persisted through today.  ...  Researchers had also documented considerable racial biases in media accounts, noting journalists were more likely to write passively when describing crimes against people of color (Smitherman-Donaldson  ... 
doaj:4f469a9b0bd34221ace69c42b12896bb fatcat:5jlagokbafdc7hvrcttcwa2b5e

Towards generalisable hate speech detection: a review on obstacles and solutions [article]

Wenjie Yin, Arkaitz Zubiaga
2021 arXiv   pre-print
future research to improve generalisation in hate speech detection.  ...  Hate speech is one type of harmful online content which directly attacks or promotes hate towards a group or an individual member based on their actual or perceived aspects of identity, such as ethnicity  ...  These attempts to address biases against minority social groups differ by how they measure biases and their approaches to mitigate them.  ... 
arXiv:2102.08886v1 fatcat:7uemuqrehnfvnbenp657hv3fpy

From Theories on Styles to their Transfer in Text: Bridging the Gap with a Hierarchical Survey [article]

Enrica Troiano and Aswathy Velutharambath and Roman Klinger
2022 arXiv   pre-print
We organize them in a hierarchy, highlighting the challenges for the definition of each of them, and pointing out gaps in the current research landscape. The hierarchy comprises two main groups.  ...  They can, for instance, re-phrase a formal letter in an informal way, convey a literal message with the use of figures of speech or edit a novel by mimicking the style of some well-known authors.  ...  Systems capable of style transfer could also improve the readability of texts by paraphrasing them in simpler terms (Cao et al. 2020), and help in this way non-native speakers (Wang et al. 2019b).  ... 
arXiv:2110.15871v4 fatcat:zqzpmd6ennhqzhwedwdlqizf7y

TrollHunter [Evader]: Automated Detection [Evasion] of Twitter Trolls During the COVID-19 Pandemic [article]

Peter Jachim and Filipo Sharevski and Paige Treebridge
2020 arXiv   pre-print
To counter the COVID-19 infodemic, the TrollHunter leverages a unique linguistic analysis of a multi-dimensional set of Twitter content features to detect whether or not a tweet was meant to troll.  ...  Without a final resolution of the pandemic in sight, it is unlikely that the trolls will go away, although they might be forced to evade automated hunting.  ...  is done to help non-native English speakers make sense of English slang on social media [39].  ... 
arXiv:2012.02586v1 fatcat:wirawysmzneora3ynlqpwev3pi

HateCheck: Functional Tests for Hate Speech Detection Models [article]

Paul Röttger, Bertram Vidgen, Dong Nguyen, Zeerak Waseem, Helen Margetts, Janet B. Pierrehumbert
2021 arXiv   pre-print
It also risks overestimating generalisable model performance due to increasingly well-evidenced systematic gaps and biases in hate speech datasets.  ...  We specify 29 model functionalities motivated by a review of previous research and a series of interviews with civil society stakeholders.  ...  PhD). 70% were native English speakers and 30% were non-native but fluent. Annotators had a range of nationalities: 60% were British and 10% each were Polish, Spanish, Argentinian and Irish.  ... 
arXiv:2012.15606v2 fatcat:uq4e5gl6djga7iuszekimn5s64

Analyzing Right-wing YouTube Channels

Raphael Ottoni, Evandro Cunha, Gabriel Magno, Pedro Bernardina, Wagner Meira Jr., Virgílio Almeida
2018 Proceedings of the 10th ACM Conference on Web Science - WebSci '18  
analyze (a) lexicon, (b) topics and (c) implicit biases present in the texts.  ...  demonstrate more discriminatory bias against Muslims (in videos) and towards LGBT people (in comments).  ...  [16] to study unconscious, subtle and often unintended biases in individuals.  ... 
doi:10.1145/3201064.3201081 dblp:conf/websci/OttoniCMBMA18 fatcat:lrmq7tu7unay3h34iodt3g65xy

Analyzing the Limits of Self-Supervision in Handling Bias in Language [article]

Lisa Bauer, Karthik Gopalakrishnan, Spandana Gella, Yang Liu, Mohit Bansal, Dilek Hakkani-Tur
2022 arXiv   pre-print
This also helps gain insight into how well language models capture the semantics of a wide range of downstream tasks purely from self-supervised pre-training on massive corpora of unlabeled text.  ...  In this paper, we define and comprehensively evaluate how well such language models capture the semantics of four tasks for bias: diagnosis, identification, extraction and rephrasing.  ...  Rephrase Data Collection We use native English speaking crowd-workers from the US to collect rephrases of CAD rationales such that bias in each rationale is removed.  ... 
arXiv:2112.08637v2 fatcat:n4l7sdo5hrdjlmhvh5nfjo477y

Deep Learning for Text Style Transfer: A Survey [article]

Di Jin, Zhijing Jin, Zhiting Hu, Olga Vechtomova, Rada Mihalcea
2021 arXiv   pre-print
We discuss the task formulation, existing datasets and subtasks, evaluation, as well as the rich methodologies in the presence of parallel and non-parallel data.  ...  In this paper, we present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017.  ...  by Sharma et al. (2021)); • Non-native-to-native transfer (i.e., reformulating grammatical error correction with TST); • Sentence disambiguation, to resolve nuance in text.  ... 
arXiv:2011.00416v5 fatcat:wfw3jfh2mjfupbzrmnztsqy4ny

Training language models to follow instructions with human feedback [article]

Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton (+8 others)
2022 arXiv   pre-print
In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback.  ...  In other words, these models are not aligned with their users.  ...  We'd also like to thank Owain Evans and Stephanie Lin for pointing out the fact that the automatic TruthfulQA metrics were overstating the gains of our PPO models.  ... 
arXiv:2203.02155v1 fatcat:nsjth3nazzeithrsgpggfbchci

Teach For America's long arc: A critical race theory textual analysis of Wendy Kopp's works

Michael Barnes, Emily Germain, Angela Valenzuela
2016 Education Policy Analysis Archives  
Specifically, we answer the following questions: What evidence of institutional and epistemological racism is exposed by a CRT textual analysis of TFA's founding document and later works by Wendy Kopp?  ...  And, to what extent does TFA's contribution to a "culture of achievement" (Kopp & Farr, 2011) constitute an actual "poverty of culture" (Ladson-Billings, 2006a) that enacts real harms on communities of  ...  Applying our sentiment analysis for Kopp's 2001 text, we find that the 20.6% divide between positive language in TFA Members as compared to the Students, Community is reduced to a divide of just 1.6%.  ... 
doi:10.14507/epaa.24.2046 fatcat:w36eqxfdcnd3xm3t7mthfmxpam

Large Language Models Can Be Strong Differentially Private Learners [article]

Xuechen Li, Florian Tramèr, Percy Liang, Tatsunori Hashimoto
2022 arXiv   pre-print
Differentially Private (DP) learning has seen limited success for building large deep learning models of text, and attempts at straightforwardly applying Differentially Private Stochastic Gradient Descent  ...  We show that this performance drop can be mitigated with (1) the use of large pretrained models; (2) hyperparameters that suit DP optimization; and (3) fine-tuning objectives aligned with the pretraining  ...  We thank members of the Stanford statistical machine learning group for comments on early versions of the abstract. We thank Guodong Zhang and Mayee Chen for comments on an early draft.  ... 
arXiv:2110.05679v4 fatcat:xjwodam7rbeopjaz7k5e52v67e