Are Clinical BERT Models Privacy Preserving? The Difficulty of Extracting Patient-Condition Associations
2021
AAAI Fall Symposia
Language models may be trained on data containing personal information, such as clinical records. For privacy reasons, such sensitive data must not leak. This article explores whether BERT models trained on clinical data are susceptible to training data extraction attacks. Multiple large sets of sentences are generated from the models using top-k sampling and nucleus sampling, and the generated sentences are examined to determine the degree to which they associate patients with their conditions.
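The two decoding strategies named in the abstract can be sketched as follows. This is a minimal illustration over a toy next-token distribution, not the authors' actual extraction pipeline; the function names and the example probabilities are assumptions made for the sketch.

```python
import numpy as np

def top_k_filter(probs, k):
    """Top-k sampling: keep only the k most probable tokens, renormalize."""
    idx = np.argsort(probs)[::-1][:k]
    out = np.zeros_like(probs)
    out[idx] = probs[idx]
    return out / out.sum()

def nucleus_filter(probs, p):
    """Nucleus (top-p) sampling: keep the smallest set of tokens whose
    cumulative probability mass reaches p, renormalize."""
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1  # include the token that crosses p
    keep = order[:cutoff]
    out = np.zeros_like(probs)
    out[keep] = probs[keep]
    return out / out.sum()

rng = np.random.default_rng(0)
probs = np.array([0.5, 0.2, 0.15, 0.1, 0.05])  # toy next-token distribution
tk = top_k_filter(probs, k=2)      # mass restricted to the 2 likeliest tokens
nu = nucleus_filter(probs, p=0.8)  # smallest token set covering 80% of mass
token = rng.choice(len(probs), p=nu)  # sample one token id from the nucleus
```

In an extraction study, such filtered sampling would be applied repeatedly at each decoding step to generate the large sentence sets the abstract describes; unlike greedy decoding, both methods preserve diversity while truncating the unreliable low-probability tail.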
dblp:conf/aaaifs/VakiliD21
fatcat:zxdosrh6i5gkbn4ysrolorph24