A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019.
The file type is application/pdf.
A causal framework for explaining the predictions of black-box sequence-to-sequence models
2017
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
We interpret the predictions of any black-box structured input-structured output model around a specific input-output pair. Our method returns an "explanation" consisting of groups of input-output tokens that are causally related. These dependencies are inferred by querying the black-box model with perturbed inputs, generating a graph over tokens from the responses, and solving a partitioning problem to select the most relevant components. We focus the general approach on sequence-to-sequence
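The perturb, query, graph, and partition steps named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `toy_model` is a hypothetical word-by-word lookup standing in for the black box, perturbation is plain random token deletion, edge weights are differences of empirical frequencies, and the partitioning step is replaced by thresholding followed by connected components.

```python
import random
from collections import defaultdict


def toy_model(tokens):
    """Hypothetical black box: a word-by-word lookup 'translator'."""
    lexicon = {"the": "le", "cat": "chat", "sleeps": "dort"}
    return [lexicon[t] for t in tokens if t in lexicon]


def dependency_weights(model, x, y, n_samples=500, drop_prob=0.3, seed=0):
    """Estimate w[i][j] = P(y[j] in output | x[i] kept) - P(y[j] in output | x[i] dropped).

    Assumption: random token deletion is used as the perturbation scheme.
    """
    rng = random.Random(seed)
    n = {True: [0] * len(x), False: [0] * len(x)}
    hits = {k: [[0] * len(y) for _ in x] for k in (True, False)}
    for _ in range(n_samples):
        mask = [rng.random() >= drop_prob for _ in x]
        # Query the black box with the perturbed input.
        out = set(model([t for t, keep in zip(x, mask) if keep]))
        for i, keep in enumerate(mask):
            n[keep][i] += 1
            for j, tok in enumerate(y):
                if tok in out:
                    hits[keep][i][j] += 1
    w = [[0.0] * len(y) for _ in x]
    for i in range(len(x)):
        for j in range(len(y)):
            p_keep = hits[True][i][j] / max(n[True][i], 1)
            p_drop = hits[False][i][j] / max(n[False][i], 1)
            w[i][j] = p_keep - p_drop
    return w


def group_tokens(x, y, w, threshold=0.5):
    """Keep edges with weight >= threshold and return connected components.

    Assumption: this stands in for the partitioning problem in the abstract.
    """
    edges = [(i, j) for i in range(len(x)) for j in range(len(y))
             if w[i][j] >= threshold]
    parent = list(range(len(x) + len(y)))  # input nodes first, then output nodes

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i, j in edges:
        parent[find(i)] = find(len(x) + j)
    groups = defaultdict(lambda: ([], []))
    for i, j in edges:
        ins, outs = groups[find(i)]
        if x[i] not in ins:
            ins.append(x[i])
        if y[j] not in outs:
            outs.append(y[j])
    return sorted(groups.values())


x = ["the", "cat", "sleeps"]
y = toy_model(x)
w = dependency_weights(toy_model, x, y)
groups = group_tokens(x, y, w)
print(groups)
```

With this toy lookup model, each input word should end up grouped with its own output word, because an output token appears exactly when its source token survives the deletion. A real sequence-to-sequence model would produce noisier, many-to-many dependencies, so both the weight estimate and the grouping step would need to be more careful than this sketch.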
doi:10.18653/v1/d17-1042
dblp:conf/emnlp/Alvarez-MelisJ17
fatcat:j3yi5abponhpzn6shjnmdknf5a