A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022; you can also visit the original URL.
The file type is application/pdf.
Feature-Based Explanations Don't Help People Detect Misclassifications of Online Toxicity
2020
International Conference on Web and Social Media
We present an experimental assessment of the impact of feature-attribution-style explanations on human performance in predicting the consensus toxicity of social media posts, given advice from an unreliable machine learning model. In doing so, we add to a small but growing body of literature examining the utility of interpretable machine learning in terms of human outcomes. We also evaluate interpretable machine learning for the first time in the important domain of online toxicity, where
dblp:conf/icwsm/CartonMR20
fatcat:22mhnmx6g5hizeman2j2sqduyi