A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
The file type is application/pdf.
Self-Attention Attribution: Interpreting Information Interactions Inside Transformer
[article]
2021, arXiv pre-print
The great success of Transformer-based models benefits from the powerful multi-head self-attention mechanism, which learns token dependencies and encodes contextual information from the input. Prior work strives to attribute model decisions to individual input features with different saliency measures, but it fails to explain how these input features interact with each other to reach predictions. In this paper, we propose a self-attention attribution method to interpret the information interactions inside Transformer.
arXiv:2004.11207v2
fatcat:uztxgswjurg7jjcpmsbabcozt4