A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022; you can also visit the original URL.
The file type is application/pdf.
Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
[article]
2022
arXiv
pre-print
Pre-trained language models (LMs) are shown to easily generate toxic language. In this work, we systematically explore domain-adaptive training to reduce the toxicity of language models. ...
the large-scale models. ...
poor coverage of language for different topics, which may hurt the LM's quality after domain-adaptive training. ...
arXiv:2202.04173v1
fatcat:vbxdns4glbffbh3korec5whooq
Reward Modeling for Mitigating Toxicity in Transformer-based Language Models
[article]
2022
arXiv
pre-print
The experiments demonstrate that the Reinforce-Detoxify method for language model detoxification outperforms existing detoxification approaches in automatic evaluation metrics, indicating the ability of ...
In this study, we propose Reinforce-Detoxify, a reinforcement learning-based method for mitigating toxicity in language models. ...
BOLD is a large-scale dataset that consists of 23,679 English text generation prompts for bias benchmarking across five domains: profession, gender, race, religion, and political ideology. ...
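The RL objective behind detoxification methods of this kind can be sketched as a reward-weighted log-likelihood (a minimal REINFORCE-style sketch, not this paper's exact formulation; the toxicity classifier that would supply `rewards` is an assumption):

```python
import numpy as np

def reinforce_loss(logprobs, rewards):
    """REINFORCE-style surrogate loss (sketch): minimizing the negative
    reward-weighted log-probability of sampled continuations pushes the
    policy toward high-reward (here: low-toxicity) text."""
    return -float(np.mean(np.asarray(rewards) * np.asarray(logprobs)))
```

A continuation the classifier scores as non-toxic (high reward) lowers the loss more when the model already assigns it high probability, so gradient steps raise the likelihood of non-toxic samples.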
arXiv:2202.09662v5
fatcat:bfvdyo6x2bedvhnd6ovgydncui
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
[article]
2021
arXiv
pre-print
We propose DExperts: Decoding-time Experts, a decoding-time method for controlled text generation that combines a pretrained language model with "expert" LMs and/or "anti-expert" LMs in a product of experts ...
Our work highlights the promise of tuning small LMs on text with (un)desirable attributes for efficient decoding-time steering. ...
We also thank the UW NLP group for helpful feedback on the work. ...
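The product-of-experts combination described in the snippet can be sketched on next-token logits (a minimal sketch with a toy two-token vocabulary; `alpha` and the toy logit values are illustrative assumptions):

```python
import numpy as np

def dexperts_logits(base, expert, anti_expert, alpha=2.0):
    # Product of experts in logit space: boost tokens the expert LM favors,
    # suppress tokens the anti-expert (e.g. a toxic LM) favors.
    return base + alpha * (expert - anti_expert)

def softmax(z):
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

# Toy vocabulary ["great", "awful"]: the anti-expert assigns a high logit
# to the undesirable token, so the combination steers away from it.
base = np.array([1.0, 1.0])
expert = np.array([2.0, 0.0])
anti = np.array([0.0, 2.0])
p_base = softmax(base)
p_steered = softmax(dexperts_logits(base, expert, anti))
```

Because only output logits are combined, the small (anti-)experts can steer a much larger frozen base LM at decoding time.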
arXiv:2105.03023v2
fatcat:jacmrcsmlneexgsxpqkr64o7j4
Text Detoxification using Large Pre-trained Neural Models
[article]
2021
arXiv
pre-print
Finally, we present the first large-scale comparative study of style transfer models on the task of toxicity removal. We compare our models with a number of methods for style transfer. ...
We use a well-performing paraphraser guided by style-trained language models to keep the text content and remove toxicity. ...
Acknowledgements This research was conducted under the framework of the joint MTS-Skoltech laboratory. ...
arXiv:2109.08914v2
fatcat:umhz66ev3jep3e3uo73al5vqae
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
[article]
2020
arXiv
pre-print
Pretrained neural language models (LMs) are prone to generating racist, sexist, or otherwise toxic language which hinders their safe deployment. ...
We investigate the extent to which pretrained LMs can be prompted to generate toxic language, and the effectiveness of controllable text generation algorithms at preventing such toxic degeneration. ...
Decoding-Based Detoxification Noting the additional cost of training language models further, we explore three detoxifying strategies that only rely on altering the decoding algorithm and are therefore ...
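One strategy in that family, word banning, can be sketched as masking blocklisted tokens at each decoding step (the vocabulary and blocklist below are illustrative assumptions):

```python
import numpy as np

def ban_tokens(logits, vocab, blocklist):
    # Word banning: set the logits of blocklisted tokens to -inf so the
    # softmax assigns them exactly zero probability at this decoding step.
    out = np.array(logits, dtype=float)
    for i, tok in enumerate(vocab):
        if tok in blocklist:
            out[i] = -np.inf
    return out

vocab = ["hello", "slur", "world"]
masked = ban_tokens([0.5, 2.0, 0.1], vocab, {"slur"})
probs = np.exp(masked - np.max(masked))
probs = probs / probs.sum()
```

No retraining is needed; the cost is that surface-level blocklists miss toxicity expressed without any banned word.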
arXiv:2009.11462v2
fatcat:sdzqn6oumjgwvheetr2jrgggqq
Cognitive Impairments in Early-Detoxified Alcohol-Dependent Inpatients and Their Associations with Socio-Demographic, Clinical and Psychological Factors: An Exploratory Study
2020
Neuropsychiatric Disease and Treatment
The aim of this study was to describe qualitatively the cognitive deficits in early-detoxified AUD patients undergoing rehabilitation and to explore relevant associations with socio-demographic, clinical ...
Overall, 31.7% of AUD patients showed cognitive impairments according to the global score scale. ...
Some limitations of this study should be mentioned. ...
doi:10.2147/ndt.s254369
pmid:32764946
pmcid:PMC7369414
fatcat:4ed6t55tlrf53ckbi55hg66nle
Quark: Controllable Text Generation with Reinforced Unlearning
[article]
2022
arXiv
pre-print
Large-scale language models often learn behaviors that are misaligned with user expectations. ...
We consider the task of unlearning these misalignments by fine-tuning the language model on signals of what not to do. ...
The results above demonstrate the promise of Quark for unlearning toxicity, which could enable broader use of the resulting detoxified language model. ...
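The relabeling step behind this kind of reward-conditioned fine-tuning can be sketched as tagging sampled continuations with reward-quantile control tokens (the token names and `k` are illustrative assumptions, not Quark's exact vocabulary):

```python
import numpy as np

def quantile_tokens(rewards, k=3):
    # Sort sampled continuations into k reward quantiles and tag each with
    # a control token; fine-tuning on the tagged data lets generation later
    # condition on the highest-reward token ("write like the best samples").
    rewards = np.asarray(rewards, dtype=float)
    edges = np.quantile(rewards, np.linspace(0.0, 1.0, k + 1)[1:-1])
    return [f"<RWD_{int(np.searchsorted(edges, r, side='left'))}>" for r in rewards]
```

Unlearning toxicity then amounts to sampling, scoring with a toxicity reward, retagging, and fine-tuning in a loop, rather than deleting anything from the original training data.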
arXiv:2205.13636v1
fatcat:petwegpqpzbghgkuvx4s7fmf5u
A neural classification method for supporting the creation of BioVerbNet
2019
Journal of Biomedical Semantics
VerbNet, an extensive computational verb lexicon for English, has proved useful for supporting a wide range of Natural Language Processing tasks requiring information about the behaviour and meaning of ...
Because VerbNet-style classification is extremely time consuming, we start from a small manual classification of biomedical verbs and apply a state-of-the-art neural representation model, specifically ...
Acknowledgements We would like to thank all participants who devoted their time to completing the study. We also wish to thank the reviewers for their valuable and detailed feedback.
Funding ...
doi:10.1186/s13326-018-0193-x
pmid:30658707
pmcid:PMC6339329
fatcat:d765jyeasjg3jit5rp7edqrdrq
Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling
[article]
2021
arXiv
pre-print
However, these models are often trained on large datasets from the internet, and as a result, may learn undesirable behaviors from this data, such as toxic or otherwise harmful language. ...
Researchers must thus wrestle with the issue of how and when to release these models. ...
He is a member and the scientific director of the Data and Marketing Insights Unit of the Bocconi Institute for Data Science and Analysis. ...
arXiv:2107.03451v3
fatcat:ofl2i3btmzbmxlpozqy5iifd3e
Residual Energy-Based Models for Text
[article]
2020
arXiv
pre-print
Current large-scale auto-regressive language models display impressive fluency and can generate convincing text. ...
We find experimentally that the answer is affirmative when we have access to the training data for the model, and guardedly affirmative even if we do not. ...
In-domain Generalization In Table 4 we report the results of an in-domain generalization experiment using our large language model, TransfBig. ...
arXiv:2004.10188v2
fatcat:iw22p3w5fvfxvf5qrvtrmnxreq
GeDi: Generative Discriminator Guided Sequence Generation
[article]
2020
arXiv
pre-print
While large-scale language models (LMs) are able to imitate the distribution of natural language well enough to generate realistic text, it is difficult to control which regions of the distribution they ...
Lastly, we show that GeDi can make GPT-2 (1.5B parameters) significantly less toxic without sacrificing linguistic quality, making it by far the most practical existing method for detoxifying large language ...
ACKNOWLEDGMENTS The authors thank Semih Yavuz and Yu Bai for helpful discussions and feedback on this project. ...
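The Bayes-rule reweighting GeDi applies at each decoding step can be sketched as follows (a simplified per-step sketch with a uniform class prior; `omega` and the toy log-probabilities are illustrative assumptions):

```python
import numpy as np

def gedi_step(lm_logp, pos_logp, neg_logp, omega=30.0):
    # Two class-conditional LMs give per-token likelihoods for the desired
    # and undesired attribute; Bayes' rule turns them into a class posterior,
    # which then biases the base LM's next-token distribution, raised to omega.
    log_post = pos_logp - np.logaddexp(pos_logp, neg_logp)  # log p(desired | token)
    scores = lm_logp + omega * log_post
    scores = scores - scores.max()
    p = np.exp(scores)
    return p / p.sum()
```

Since the small discriminator LM scores every vocabulary entry in one forward pass, this is far cheaper than rescoring candidate continuations one at a time.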
arXiv:2009.06367v2
fatcat:wzbl2tdhznfrfnvp3s6by7qafy
Methods for Detoxification of Texts for the Russian Language
2021
Multimodal Technologies and Interaction
We introduce the first study of the automatic detoxification of Russian texts to combat offensive language. ...
While much work has been done for the English language in this field, there are no works on detoxification for the Russian language. ...
detoxGPT: GPT-2 [28] is a powerful language model that can be adapted to a wide range of NLP tasks using a very small task-specific dataset. Until recently, there were no such models for Russian. ...
doi:10.3390/mti5090054
fatcat:xo4snfbjbbexhictmp3syb3cnq
Mental Imagery Skills in Alcohol-Dependent Subjects and Their Associations With Cognitive Performance: An Exploratory Study During Residential Rehabilitation
2021
Frontiers in Psychiatry
The global score at the MIT did not reach pathological values. 11.1% of AUD patients showed an impaired global score in cognitive performance, and 5.7% scored at the limits of the norm. ...
This pilot study aims to observe the cognitive abilities useful for the inspection, maintenance, generation and manipulation of images in these patients during residential rehabilitation and investigate ...
Finally, a limitation has to be noted as to the multiple regression model. ...
doi:10.3389/fpsyt.2021.741900
pmid:34912249
pmcid:PMC8666508
fatcat:emtkwvcjjbfkda4vn3dd54qkzy
Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do
[article]
2022
arXiv
pre-print
Artificial writing is permeating our lives due to recent advances in large-scale, transformer-based language models (LMs) such as BERT, its variants, GPT-2/3, and others. ...
Using them as pre-trained models and fine-tuning them for specific tasks, researchers have extended state of the art for many NLP tasks and shown that they capture not only linguistic knowledge but also ...
Hessian Ministry of Higher Education, Research and the Arts (HMWK) cluster projects "The Adaptive Mind" and "The Third Wave of AI". ...
arXiv:2103.11790v3
fatcat:kq7sduihbzcklch3uqg4vn55le
Recipes for Safety in Open-domain Chatbots
[article]
2021
arXiv
pre-print
We then discuss the limitations of this work by analyzing failure cases of our models. ...
We investigate a variety of methods to mitigate these issues in the context of open-domain generative dialogue models. ...
(2019) train a large-scale controllable model that can modulate generations through control tokens, but also don't look at offensiveness. ...
arXiv:2010.07079v3
fatcat:qvbchivryrcdrj2evt6awl37fm
Showing results 1 — 15 out of 540 results