22 Hits in 9.3 sec

Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings [article]

Thomas Manzini, Yao Chong Lim, Yulia Tsvetkov, Alan W Black
2019 arXiv   pre-print
Word embeddings, trained on these texts, perpetuate and amplify these stereotypes, and propagate biases to machine learning models that use word embeddings as features.  ...  In this work, we propose a method to debias word embeddings in multiclass settings such as race and religion, extending the work of Bolukbasi et al. (2016) from the binary setting, such as binary gender.  ...  Finally, we are greatly appreciative of the anonymous reviewers for their time and constructive comments.  ... 
arXiv:1904.04047v3 fatcat:xhhisxxfxjfzxoyza3bes3krlm
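The multiclass debiasing this entry describes builds on hard debiasing, which removes a word vector's component along an identified bias subspace. A minimal sketch of that projection step, assuming NumPy vectors and a precomputed subspace (the names `debias` and `bias_subspace` are illustrative, not taken from the paper):

```python
import numpy as np

def debias(vec: np.ndarray, bias_subspace: np.ndarray) -> np.ndarray:
    """Remove the component of `vec` lying in the bias subspace.

    `bias_subspace` is a (k, d) array of orthonormal directions, e.g.
    principal components of differences between attribute-word vectors
    (gendered terms in the binary case; race or religion terms here).
    """
    for direction in bias_subspace:
        vec = vec - np.dot(vec, direction) * direction  # project out
    return vec / np.linalg.norm(vec)  # re-normalize to unit length
```

In the multiclass extension, the subspace is estimated from sets of attribute words rather than word pairs; the projection step itself is unchanged.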

Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings

Thomas Manzini, Lim Yao Chong, Alan W Black, Yulia Tsvetkov
2019 Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Word embeddings, trained on these texts, perpetuate and amplify these stereotypes, and propagate biases to machine learning models that use word embeddings as features.  ...  In this work, we propose a method to debias word embeddings in multiclass settings such as race and religion, extending the work of Bolukbasi et al. (2016) from the binary setting, such as binary gender.  ...  Finally, we are greatly appreciative of the anonymous reviewers for their time and constructive comments.  ... 
doi:10.18653/v1/n19-1062 dblp:conf/naacl/ManziniLBT19 fatcat:2fv5rpdwovggxjebk4r2gsv5um

Word Embeddings via Causal Inference: Gender Bias Reducing and Semantic Information Preserving [article]

Lei Ding, Dengdeng Yu, Jinhan Xie, Wenxing Guo, Shenggang Hu, Meichen Liu, Linglong Kong, Hongsheng Dai, Yanchun Bao, Bei Jiang
2021 arXiv   pre-print
To address these issues, we propose a novel methodology that leverages a causal inference framework to effectively remove gender bias.  ...  Previous studies have shown that word embeddings trained on human-generated corpora have strong gender biases that can produce discriminative results in downstream tasks.  ...  Manzini, T.; Lim, Y. C.; Tsvetkov, Y.; and Black, A. W. 2019. Black is to criminal as caucasian is to police: Detecting and removing multiclass bias in word embeddings.  ... 
arXiv:2112.05194v1 fatcat:r6oik4br3rdxplwjjmsjzoab6q

A Survey of Race, Racism, and Anti-Racism in NLP [article]

Anjalie Field, Su Lin Blodgett, Zeerak Waseem, Yulia Tsvetkov
2021 arXiv   pre-print
However, persistent gaps in research on race and NLP remain: race has been siloed as a niche topic and remains ignored in many NLP tasks; most work operationalizes race as a fixed single-dimensional variable  ...  By identifying where and how NLP literature has and has not considered race, especially in comparison to related fields, our work calls for inclusion and racial justice in NLP research practices.  ...  Z.W. has been supported in part by the Canada 150 Research  ... 
arXiv:2106.11410v2 fatcat:z5lme4aayradfmvtmosi3ip2re

Language (Technology) is Power: A Critical Survey of "Bias" in NLP [article]

Su Lin Blodgett and Solon Barocas and Hal Daumé III and Hanna Wallach
2020 arXiv   pre-print
"---i.e., what kinds of system behaviors are harmful, in what ways, to whom, and why, as well as the normative reasoning underlying these statements---and to center work around the lived experiences of  ...  We survey 146 papers analyzing "bias" in NLP systems, finding that their motivations are often vague, inconsistent, and lacking in normative reasoning, despite the fact that analyzing "bias" is an inherently  ...  In an ideal world, shared task papers would engage with "bias" more critically, but given the nature of shared tasks it is understandable that they do not.  ... 
arXiv:2005.14050v2 fatcat:nyzr4fj5gne55jzhzm2fy5xeqa

Latent Hatred: A Benchmark for Understanding Implicit Hate Speech [article]

Mai ElSherief, Caleb Ziems, David Muchlinski, Vaishnavi Anupindi, Jordyn Seybolt, Munmun De Choudhury, Diyi Yang
2021 arXiv   pre-print
We present systematic analyses of our dataset using contemporary baselines to detect and explain implicit hate speech, and we discuss key features that challenge existing models.  ...  This dataset will continue to serve as a useful benchmark for understanding this multifaceted issue.  ...  The work is supported in part by the Russell Sage Foundation.  ... 
arXiv:2109.05322v1 fatcat:kh7nwzssjbc4zcpx5tiulfi7ji

Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings [article]

Vaibhav Kumar, Tenzin Singhay Bhotia, Vaibhav Kumar, Tanmoy Chakraborty
2020 arXiv   pre-print
Existing post-processing methods for debiasing word embeddings are unable to mitigate gender bias hidden in the spatial arrangement of word vectors.  ...  Unfortunately, these models have been shown to exhibit undesirable word associations resulting from gender, racial, and religious biases.  ...  Black is to criminal as caucasian is to police: Detecting and removing multiclass bias in word embeddings.  ... 
arXiv:2006.01938v1 fatcat:dsfrpxexinah3ahkhyrqyr3bfm

Assessing the Reliability of Word Embedding Gender Bias Measures [article]

Yupei Du, Qixiang Fang, Dong Nguyen
2021 arXiv   pre-print
Various measures have been proposed to quantify human-like social biases in word embeddings. However, bias scores based on these measures can suffer from measurement error.  ...  In this paper, we assess three types of reliability of word embedding gender bias measures, namely test-retest reliability, inter-rater consistency and internal consistency.  ...  Acknowledgements We thank all anonymous reviewers for their constructive and helpful feedback. We also thank Anna Wegmann for the proofreading and productive discussions.  ... 
arXiv:2109.04732v1 fatcat:kle7qhoq3ndqzfwalof36rofni
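Bias scores of the kind this entry evaluates are typically built from cosine similarities between target words and gendered anchor vectors. A minimal illustrative score, assuming NumPy vectors (the function names are mine, not the paper's):

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def gender_bias_score(word: np.ndarray, male: np.ndarray, female: np.ndarray) -> float:
    # Difference in association with the two anchor vectors: the sign
    # gives the direction of the measured association, the magnitude
    # its strength.
    return cosine(word, male) - cosine(word, female)
```

Measurement error enters through choices such as the anchor words and the embedding training run, which is what the test-retest and internal-consistency analyses probe.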

"You are grounded!": Latent Name Artifacts in Pre-trained Language Models [article]

Vered Shwartz, Rachel Rudinger, Oyvind Tafjord
2020 arXiv   pre-print
Pre-trained language models (LMs) may perpetuate biases originating in their training corpus to downstream models.  ...  As a silver lining, our experiments suggest that additional pre-training on different corpora may mitigate this bias.  ...  Thomas Manzini, Lim Yao Chong, Alan W Black, and Yulia Tsvetkov. 2019. Black is to criminal as caucasian is to police: Detecting and removing multiclass bias in word embeddings.  ... 
arXiv:2004.03012v2 fatcat:fw4gsqbffvbbxenarrdy6xtkyy

How Do You Speak about Immigrants? Taxonomy and StereoImmigrants Dataset for Identifying Stereotypes about Immigrants

Javier Sánchez-Junquera, Berta Chulvi, Paolo Rosso, Simone Paolo Ponzetto
2021 Applied Sciences  
We propose a new approach to detect stereotypes about immigrants in texts, focusing not on the personal attributes assigned to the minority but on the frames, that is, the narrative scenarios, in which  ...  We carried out two preliminary experiments: first, to evaluate the automatic detection of stereotypes; and second, to distinguish between the two supracategories of immigrants' stereotypes.  ...  Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings.  ... 
doi:10.3390/app11083610 fatcat:bxnmljgcqvhw3c7rj33rbdvxay

Super-Fine Attributes with Crowd Prototyping

Daniel Martinho-Corbishley, Mark Nixon, John N. Carter
2018 IEEE Transactions on Pattern Analysis and Machine Intelligence  
...reporting up to 11.2% and 14.8% mAP improvements for gender and age, further surpassed by ethnicity.  ...  We aim to discover more relevant and precise subject descriptions, improving image retrieval and closing the semantic gap.  ...  Gender, age and ethnicity are the most commonly reported characteristics in policing [4], criminal record keeping [5] and identity science [2], [6], [7], [8], and are proven to be critical in  ... 
doi:10.1109/tpami.2018.2836900 pmid:29994759 fatcat:z7cf52y4jrdmnl7fke5yvekshe

Exploring Text Specific and Blackbox Fairness Algorithms in Multimodal Clinical NLP

John Chen, Ian Berlot-Attwell, Xindi Wang, Safwan Hossain, Frank Rudzicz
2020 Proceedings of the 3rd Clinical Natural Language Processing Workshop   unpublished
Clinical machine learning is increasingly multimodal, collected in both structured tabular formats and unstructured forms such as free text.  ...  To this end, we investigate a modality-agnostic fairness algorithm, equalized odds post-processing, and compare it to a text-specific fairness algorithm: debiased clinical word embeddings.  ...  Black is to criminal as caucasian is to police: Detecting and removing multiclass bias in word embeddings.  ... 
doi:10.18653/v1/2020.clinicalnlp-1.33 fatcat:xijpztpyyrcoxhphvzei5zajda
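Equalized odds, the criterion behind the post-processing method compared in this entry, asks that true-positive and false-positive rates match across groups. A minimal check of how far binary predictions deviate from it, assuming each group contains both classes (names are illustrative):

```python
import numpy as np

def equalized_odds_gaps(y_true, y_pred, group):
    """Return (TPR gap, FPR gap) across groups for binary predictions.

    A gap of zero on both rates means equalized odds is satisfied;
    post-processing methods adjust predictions to shrink these gaps.
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs, fprs = [], []
    for g in np.unique(group):
        mask = group == g
        tprs.append(y_pred[mask & (y_true == 1)].mean())  # group TPR
        fprs.append(y_pred[mask & (y_true == 0)].mean())  # group FPR
    return max(tprs) - min(tprs), max(fprs) - min(fprs)
```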

Bad Seeds: Evaluating Lexical Methods for Bias Measurement

Maria Antoniak, David Mimno
2021 Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)   unpublished
A common factor in bias measurement methods is the use of hand-curated seed lexicons, but there remains little guidance for their selection.  ...  We gather seeds used in prior work, documenting their common sources and rationales, and in case studies of three English-language corpora, we enumerate the different types of social biases and linguistic  ...  Acknowledgements Thank you to our anonymous reviewers whose comments substantially influenced and improved this paper.  ... 
doi:10.18653/v1/2021.acl-long.148 fatcat:w27musun5fe75bx6zq4l6qyfma

Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer

Jieyu Zhao, Subhabrata Mukherjee, Saghar Hosseini, Kai-Wei Chang, Ahmed Hassan Awadallah
2020 Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics   unpublished
These embeddings have been widely used in various settings, such as cross-lingual transfer, where a natural language processing (NLP) model trained on one language is deployed to another language.  ...  In this paper, we study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications.  ...  We would like to thank Maria De-Arteaga and Andi Peng for the helpful discussion, and thank all the reviewers for their feedback.  ... 
doi:10.18653/v1/2020.acl-main.260 fatcat:lsyvbmlsdrbmlj64luu5cw7x2e

A Survey of Race, Racism, and Anti-Racism in NLP

Anjalie Field, Su Lin Blodgett, Zeerak Waseem, Yulia Tsvetkov
2021 Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)   unpublished
However, persistent gaps in research on race and NLP remain: race has been siloed as a niche topic and remains ignored in many NLP tasks; most work operationalizes race as a fixed single-dimensional variable  ...  By identifying where and how NLP literature has and has not considered race, especially in comparison to related fields, our work calls for inclusion and racial justice in NLP research practices.  ...  Acknowledgements We gratefully thank Hanna Kim, Kartik Goyal, Artidoro Pagnoni, Qinlan Shen, and Michael Miller Yoder for their feedback on this work. Z.W. has  ... 
doi:10.18653/v1/2021.acl-long.149 fatcat:k6o7zcif2na6rew3n5x7sqszu4
Showing results 1 — 15 out of 22 results