30 Hits in 11.2 sec

Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models [article]

Hannah Kirk, Yennie Jun, Haider Iqbal, Elias Benussi, Filippo Volpin, Frederic A. Dreyer, Aleksandar Shtedritski, Yuki M. Asano
2021 arXiv   pre-print
Specifically, we conduct an in-depth analysis of GPT-2, which is the most downloaded text generation model on HuggingFace, with over half a million downloads per month.  ...  , especially for intersections; (ii) Intersectional interactions are highly relevant for occupational associations, which we quantify by fitting 262 logistic models; (iii) For most occupations, GPT-2 reflects  ...  We extend such work by conducting an empirical analysis of the sentence completions within the specific context of bias towards occupational associations.  ... 
arXiv:2102.04130v3 fatcat:ytv3674owzftfl4f7abk27xan4

Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models [article]

Hannah Kirk, Yennie Jun, Haider Iqbal, Elias Benussi, Filippo Volpin, Frederic A. Dreyer, Aleksandar Shtedritski, Yuki M. Asano
2021
Specifically, we conduct an in-depth analysis of GPT-2, which is the most downloaded text generation model on HuggingFace, with over half a million downloads per month.  ...  , especially for intersections; (ii) Intersectional interactions are highly relevant for occupational associations, which we quantify by fitting 262 logistic models; (iii) For most occupations, GPT-2 reflects  ...  Building on these methodologies, we focus on intersectional biases of GPT-2 with regards to the domain of occupations. Intersectional biases.  ... 
doi:10.48550/arxiv.2102.04130 fatcat:wsutxvn3cbcp3lt2x2c7uooprq
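
The two Kirk et al. records above mention quantifying intersectional interactions by fitting logistic models. As an illustration only, here is a minimal sketch of one such fit with an interaction term; the dataframe, column names, and synthetic outcome below are invented for the example and are not taken from the paper.

```python
# Hypothetical sketch: one logistic model with an intersectional interaction term.
# The data are synthetic placeholders, not the paper's GPT-2 completions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "gender": rng.choice(["woman", "man"], size=n),
    "ethnicity": rng.choice(["asian", "black", "white"], size=n),
})
# Illustrative binary outcome: whether a completion mentions a given occupation.
df["predicted_occupation"] = rng.integers(0, 2, size=n)

# `gender * ethnicity` expands to both main effects plus their interaction,
# so the interaction coefficients capture intersectional effects.
model = smf.logit("predicted_occupation ~ gender * ethnicity", data=df).fit(disp=0)
print(model.summary())
```
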

Assessing Social and Intersectional Biases in Contextualized Word Representations [article]

Yi Chern Tan, L. Elisa Celis
2019 arXiv   pre-print
In this paper, we analyze the extent to which state-of-the-art models for contextual word representations, such as BERT and GPT-2, encode biases with respect to gender, race, and intersectional identities  ...  We demonstrate evidence of bias at the corpus level, find varying evidence of bias in embedding association tests, show in particular that racial bias is strongly encoded in contextual word models, and  ...  Acknowledgments We would like to thank Jessica Ambrosio and Annique Wong for providing feedback on an early version of this paper, and the Data Science Ethics class (S&DS 150) at Yale for insightful conversations  ... 
arXiv:1911.01485v1 fatcat:j7k3ubdnsnbytk5ezqp3yprqi4
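
The Tan and Celis entry refers to embedding association tests. A minimal sketch of the standard WEAT-style effect size over embedding vectors is below, assuming the (contextual) embeddings have already been extracted; the toy arrays are placeholders, not data from the paper.

```python
import numpy as np

def cos(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def assoc(w, A, B):
    # Differential association of one target vector w with attribute sets A and B.
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # WEAT effect size: difference of mean associations of the two target sets,
    # normalized by the pooled standard deviation over all targets.
    s_x = [assoc(x, A, B) for x in X]
    s_y = [assoc(y, A, B) for y in Y]
    return (np.mean(s_x) - np.mean(s_y)) / np.std(s_x + s_y)

# Placeholder vectors standing in for embeddings of target and attribute words.
rng = np.random.default_rng(0)
X, Y, A, B = (rng.normal(size=(5, 16)) for _ in range(4))
print(weat_effect_size(X, Y, A, B))
```
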

Towards Understanding and Mitigating Social Biases in Language Models [article]

Paul Pu Liang, Chiyu Wu, Louis-Philippe Morency, Ruslan Salakhutdinov
2021 arXiv   pre-print
As machine learning methods are deployed in real-world settings such as healthcare, legal systems, and social science, it is crucial to recognize how they shape social biases and stereotypes in these sensitive  ...  As a step towards improving the fairness of LMs, we carefully define several sources of representational biases before proposing new benchmarks and metrics to measure them.  ...  Acknowledgements This material is based upon work partially supported by the National Science Foundation (Awards #1750439, #1734868, and #1722822) and the National Institutes of Health.  ... 
arXiv:2106.13219v1 fatcat:yjkjuktjyjbejjp2axyc3wprhy

Societal Biases in Language Generation: Progress and Challenges [article]

Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng
2021 arXiv   pre-print
To better understand these challenges, we present a survey on societal biases in language generation, focusing on how data and techniques contribute to biases and progress towards reducing biases.  ...  Language generation presents unique challenges for biases in terms of direct user interaction and the structure of decoding techniques.  ...  Acknowledgments We would like to thank Seraphina Goldfarb-Tarrant, Sunipa Dev, Jason Teoh, members of the Plus Lab, and our anonymous reviewers for the many helpful suggestions that went into this paper  ... 
arXiv:2105.04054v3 fatcat:dhwma4hvfbf7jke3lt227qb73i

A Survey on Bias in Deep NLP

Ismael Garrido-Muñoz, Arturo Montejo-Ráez, Fernando Martínez-Santiago, L. Alfonso Ureña-López
2021 Applied Sciences  
We introduce bias in a formal way and explore how it has been treated in several networks, in terms of detection and correction.  ...  These networks, somehow, learn a probability distribution of words and relations across the training collection used, inheriting the potential flaws, inconsistencies and biases contained in such a collection  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/app11073184 fatcat:cdaihoicdzeolovw77yltdmpeq

Image Representations Learned With Unsupervised Pre-Training Contain Human-like Biases [article]

Ryan Steed, Aylin Caliskan
2020 arXiv   pre-print
We find that state-of-the-art unsupervised models trained on ImageNet, a popular benchmark image dataset curated from internet images, automatically learn racial, gender, and intersectional biases.  ...  For the first time, we develop a novel method for quantifying biased associations between representations of social concepts and attributes in images.  ...  Just like the GPT-2 transformer architecture, iGPT is composed of L blocks: n_l = layer_norm(h_l); a_l = h_l + multihead_attention(n_l); h_{l+1} = a_l + mlp(layer_norm(a_l)), where h_l is the input tensor  ... 
arXiv:2010.15052v1 fatcat:7z7r5ewwefgjlevryrw6c7ku6i
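
The snippet above spells out the pre-norm residual block that iGPT shares with GPT-2. A minimal PyTorch sketch of one such block, written directly from those three equations, follows; the layer sizes and MLP width are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """One pre-norm transformer block: n = LN(h); a = h + MHA(n); h' = a + MLP(LN(a))."""
    def __init__(self, d_model=256, n_heads=8):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, h):
        n = self.ln1(h)                    # n_l = layer_norm(h_l)
        a = h + self.attn(n, n, n)[0]      # a_l = h_l + multihead_attention(n_l)
        return a + self.mlp(self.ln2(a))   # h_{l+1} = a_l + mlp(layer_norm(a_l))

h = torch.randn(2, 10, 256)  # (batch, sequence, features)
print(Block()(h).shape)
```
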

Scaling Language Models: Methods, Analysis Insights from Training Gopher [article]

Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan (+68 others)
2022 arXiv   pre-print
We provide a holistic analysis of the training dataset and model's behaviour, covering the intersection of model scale with bias and toxicity.  ...  In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter  ...  Using an estimate of 283 W drawn per chip, this leads to a total of 380 net tCO2e, compared to 552 net tCO2e for GPT-3 (Patterson et al., 2021) or roughly 300 tCO2e per passenger jet round trip  ... 
arXiv:2112.11446v2 fatcat:wtajhbesibbetikkpow2vwiwqq

"I'm sorry to hear that": finding bias in language models with a holistic descriptor dataset [article]

Eric Michael Smith, Melissa Hall, Melanie Kambadur, Eleonora Presani, Adina Williams
2022 arXiv   pre-print
We demonstrate that our dataset is highly efficacious for measuring previously unmeasurable biases in token likelihoods and generations from language models, as well as in an offensiveness classifier.  ...  Many datasets for measuring bias currently exist, but they are restricted in their coverage of demographic axes, and are commonly used with preset bias tests that presuppose which types of biases the models  ...  Acknowledgments We thank the following people for their feedback on this work and on our list of HOLISTICBIAS descriptors: Andrew Rayner, Anya Drabkin, Bran-  ... 
arXiv:2205.09209v1 fatcat:paxhthlznfcptoa5xluefbligi
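
The Smith et al. entry describes measuring bias in token likelihoods. A minimal sketch is below, using the Hugging Face `transformers` GPT-2 checkpoint as a stand-in for the models scored in the paper; the template and descriptor words are invented examples, not the HolisticBias descriptor list.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def total_log_prob(text):
    # Sum of log-probabilities GPT-2 assigns to the tokens of `text`.
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean NLL per predicted token
    return -loss.item() * (ids.shape[1] - 1)

# Compare likelihoods of the same template filled with different descriptors.
for descriptor in ["deaf", "blind", "young", "elderly"]:
    sentence = f"I'm sorry to hear that you are a {descriptor} person."
    print(descriptor, round(total_log_prob(sentence), 2))
```
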

The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models [article]

Ian Tenney, James Wexler, Jasmijn Bastings, Tolga Bolukbasi, Andy Coenen, Sebastian Gehrmann, Ellen Jiang, Mahima Pushkarna, Carey Radebaugh, Emily Reif, Ann Yuan
2020 arXiv   pre-print
We present the Language Interpretability Tool (LIT), an open-source platform for visualization and understanding of NLP models.  ...  LIT supports a wide range of models--including classification, seq2seq, and structured prediction--and is highly extensible through a declarative, framework-agnostic API.  ...  et al., 2019) and GPT-2 (Radford et al., 2019) .  ... 
arXiv:2008.05122v1 fatcat:3psi3vbxafhjvcpwph3jqkgsmm

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models [article]

Maribeth Rauh, John Mellor, Jonathan Uesato, Po-Sen Huang, Johannes Welbl, Laura Weidinger, Sumanth Dathathri, Amelia Glaese, Geoffrey Irving, Iason Gabriel, William Isaac, Lisa Anne Hendricks
2022 arXiv   pre-print
However, recent literature and, increasingly, real world observations, have demonstrated that these models can generate language that is toxic, biased, untruthful or otherwise harmful.  ...  Finally, we apply them in a case study of the Perspective API, a toxicity classifier that is widely used in harm benchmarks.  ...  Acknowledgments and Disclosure of Funding The authors received no specific funding for this work.  ... 
arXiv:2206.08325v1 fatcat:emuzdvwytnfplb7wbv3xuntrom
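
The Rauh et al. entry applies its criteria in a case study of the Perspective API. A minimal sketch of scoring one piece of generated text with that API is below, assuming a valid API key and the publicly documented `comments:analyze` endpoint; the text being scored is a placeholder.

```python
import requests

API_KEY = "YOUR_PERSPECTIVE_API_KEY"  # placeholder; issued through Google Cloud
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

payload = {
    "comment": {"text": "Example model output to benchmark for toxicity."},
    "requestedAttributes": {"TOXICITY": {}},
}
resp = requests.post(URL, json=payload).json()
print(resp["attributeScores"]["TOXICITY"]["summaryScore"]["value"])
```
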

Underspecification Presents Challenges for Credibility in Modern Machine Learning [article]

Alexander D'Amour, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D. Hoffman, Farhad Hormozdiari, Neil Houlsby (+28 others)
2020 arXiv   pre-print
An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain.  ...  Underspecification is common in modern ML pipelines, such as those based on deep learning.  ...  We also appreciate the advice of our DeepMind collaborator Dr. Nenad Tomasev, Prof. Finale Doshi-Velez and the wider Google Health Research UK team led by Dr. Alan Karthikesalingam.  ... 
arXiv:2011.03395v2 fatcat:xr6wi6e7pjbkrjp5uh6qep5no4
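
The D'Amour et al. entry defines underspecification as a pipeline returning many predictors with equivalently strong held-out performance. A minimal sketch of that phenomenon on synthetic data follows, assuming small MLPs that differ only in random seed: held-out accuracy is nearly identical while behaviour on a crudely shifted "stress test" diverges. All data and model choices are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Crude stand-in for a distribution shift / stress test.
X_shift = X_te + np.random.default_rng(1).normal(0.0, 1.0, X_te.shape)

for seed in range(5):
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=seed)
    clf.fit(X_tr, y_tr)
    print(f"seed={seed}  held-out={clf.score(X_te, y_te):.3f}  shifted={clf.score(X_shift, y_te):.3f}")
```
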

Ethical and social risks of harm from Language Models [article]

Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown (+11 others)
2021 arXiv   pre-print
In order to foster advances in responsible innovation, an in-depth understanding of the potential risks posed by these models is needed.  ...  We discuss the points of origin of different risks and point to potential mitigation approaches.  ...  Acknowledgements The authors thank Phil Blunsom, Shane Legg, Jack Rae, Aliya Ahmad, Richard Ives, Shelly Bensal and Ben Zevenbergen for comments on earlier drafts of this report.  ... 
arXiv:2112.04359v1 fatcat:excmnsvm7fcm7aeze2pryaz7nq

Post-hoc Interpretability for Neural NLP: A Survey [article]

Andreas Madsen, Siva Reddy, Sarath Chandar
2022 arXiv   pre-print
This survey provides a categorization of how recent post-hoc interpretability methods communicate explanations to humans, it discusses each method in-depth, and how they are validated, as the latter is  ...  Explaining models helps to address the safety and ethical concerns and is essential for accountability.  ...  [123] apply Natural Indirect Effect to a small GPT-2 model, where the mediator is an attention head. By doing this, Vig et al.  ... 
arXiv:2108.04840v4 fatcat:twveq6lt7vgahi5fbibc4sue5e

Survey of Generative Methods for Social Media Analysis [article]

Stan Matwin, Aristides Milios, Paweł Prałat, Amilcar Soares, François Théberge
2021 arXiv   pre-print
This survey draws a broad-stroke, panoramic picture of the State of the Art (SoTA) of the research in generative methods for the analysis of social media data.  ...  Social dynamics are important for understanding the spreading of influence or diseases, formation of friendships, the productivity of teams, etc.  ...  The GROVER model is approximately the same size as GPT-2.  ... 
arXiv:2112.07041v1 fatcat:xgmduwctpbddfo67y6ack5s2um
Showing results 1 — 15 out of 30 results