Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods
[article]
2020
arXiv
pre-print
In this paper, we demonstrate that post hoc explanation techniques that rely on input perturbations, such as LIME and SHAP, are not reliable. ...
such as LIME and SHAP into generating innocuous explanations which do not reflect the underlying biases. ...
ACKNOWLEDGEMENTS We would like to thank the anonymous reviewers for their feedback, and Scott Lundberg for insightful discussions. ...
arXiv:1911.02508v2
fatcat:ybh7s6qyvjhuje6zzwftci2dpu
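The attack described in the entry above exploits the fact that LIME and SHAP query the model on synthetic perturbations that are easy to tell apart from real data. A minimal sketch of that scaffolding idea, assuming hypothetical biased_model, innocuous_model, and ood_detector objects (an illustration, not the authors' released code):

```python
# Sketch of the scaffolding attack on perturbation-based explainers.
# The three component models are assumed to exist already; the detector is
# trained to separate real data points from LIME/SHAP-style perturbations.
import numpy as np

class ScaffoldedModel:
    """Route real inputs to a biased model and perturbation-like
    (out-of-distribution) inputs to an innocuous one, so the explainer
    only ever observes the innocuous behaviour."""

    def __init__(self, biased_model, innocuous_model, ood_detector):
        self.biased_model = biased_model        # e.g. depends on a sensitive feature
        self.innocuous_model = innocuous_model  # e.g. depends on an unrelated feature
        self.ood_detector = ood_detector        # predicts 1 for real samples, 0 for perturbations

    def predict(self, X):
        is_real = self.ood_detector.predict(X) == 1
        return np.where(is_real,
                        self.biased_model.predict(X),
                        self.innocuous_model.predict(X))
```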
On the Tractability of SHAP Explanations
[article]
2021
arXiv
pre-print
Despite a lot of recent interest from both academia and industry, it is not known whether SHAP explanations of common machine learning models can be computed efficiently. ...
First, we consider fully-factorized data distributions, and show that the complexity of computing the SHAP explanation is the same as the complexity of computing the expected value of the model. ...
The authors would like to thank YooJung Choi for valuable discussions on the proof of Theorem 5. ...
arXiv:2009.08634v2
fatcat:qrqyevl2dzhhhgvwj4uobvvh4y
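For orientation, the quantity whose complexity is analyzed above is the standard SHAP value; under the fully-factorized (independent-feature) distributions considered in the first result, each conditional expectation below reduces to a plain expectation of the model over the unobserved features (notation follows the usual Lundberg-and-Lee formulation, not this paper's exact statement):

```latex
% SHAP value of feature i for model F at instance x over feature set [n].
\[
\phi_i(F,x) \;=\; \sum_{S \subseteq [n]\setminus\{i\}}
  \frac{|S|!\,\bigl(n-|S|-1\bigr)!}{n!}
  \Bigl( \mathbb{E}\bigl[F(X)\mid X_{S\cup\{i\}}=x_{S\cup\{i\}}\bigr]
       \;-\; \mathbb{E}\bigl[F(X)\mid X_{S}=x_{S}\bigr] \Bigr)
\]
```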
Feature Attributions and Counterfactual Explanations Can Be Manipulated
[article]
2021
arXiv
pre-print
We demonstrate how adversaries can design biased models that manipulate model agnostic feature attribution methods (e.g., LIME & SHAP) and counterfactual explanations that hill-climb during the counterfactual ...
We evaluate the manipulations on real world data sets, including COMPAS and Communities & Crime, and find explanations can be manipulated in practice. ...
fooling LIME or SHAP into generating innocuous explanations. ...
arXiv:2106.12563v2
fatcat:6eidicjv2vaxdb6f6vftjscp64
Opportunities and Challenges in Explainable Artificial Intelligence (XAI): A Survey
[article]
2020
arXiv
pre-print
We start by proposing a taxonomy and categorizing the XAI techniques based on their scope of explanations, methodology behind the algorithms, and explanation level or usage which helps build trustworthy ...
After explaining each category of algorithms and approaches in detail, we then evaluate the explanation maps generated by eight XAI algorithms on image data, discuss the limitations of this approach, and ...
Shapley sampling methods [64] are also post-hoc and model agnostic. ...
arXiv:2006.11371v2
fatcat:6eaz3rbaenflxchjdynmvwlc4i
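Since the snippet above singles out Shapley sampling methods, here is a minimal Monte Carlo estimator in that spirit (function name and signature are illustrative, not the implementation from the cited reference [64]):

```python
# Monte Carlo estimate of one feature's Shapley value via random permutations
# and random reference instances (Shapley sampling sketch).
import numpy as np

def shapley_sample(model_predict, x, background, feature, n_samples=1000, seed=None):
    """model_predict: callable mapping an (n, d) array to a length-n array.
    x: instance to explain, shape (d,).
    background: reference rows drawn from the data, shape (m, d).
    feature: index of the feature whose contribution is estimated."""
    rng = np.random.default_rng(seed)
    contributions = np.empty(n_samples)
    for k in range(n_samples):
        perm = rng.permutation(x.shape[0])                  # random feature ordering
        z = background[rng.integers(len(background))].copy()
        preceding = perm[:np.flatnonzero(perm == feature)[0]]
        z[preceding] = x[preceding]                         # features already "revealed"
        without_f = model_predict(z[None, :])[0]
        z[feature] = x[feature]                             # now also reveal the feature itself
        with_f = model_predict(z[None, :])[0]
        contributions[k] = with_f - without_f
    return contributions.mean()
```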
Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations
[article]
2022
arXiv
pre-print
In addition, we also observe that certain post hoc explanation methods (e.g., Integrated Gradients, SHAP) are more likely to exhibit the aforementioned disparities. ...
As post hoc explanation methods are increasingly being leveraged to explain complex models in high-stakes settings, it becomes critical to ensure that the quality of the resulting explanations is consistently ...
[30] and Slack et al. [66] demonstrated that methods such as LIME and SHAP may result in explanations that are not only inconsistent and unstable, but also prone to adversarial attacks. ...
arXiv:2205.07277v1
fatcat:3logqufk2fdqxnj37jcaaviyv4
Fooling Partial Dependence via Data Poisoning
[article]
2021
arXiv
pre-print
Many methods have been developed to understand complex predictive models and high expectations are placed on post-hoc model explainability. ...
It turns out that such explanations are neither robust nor trustworthy, and they can be fooled. ...
Acknowledgments and Disclosure of Funding We would like to thank the anonymous reviewers for many insightful comments and suggestions. ...
arXiv:2105.12837v2
fatcat:c2ndcqmed5fe5djg2vi5fo7bcq
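For context, the explanation being fooled here is the empirical partial-dependence curve, which averages the model's prediction over the dataset with one feature clamped; because that average runs over the data, poisoning the data shifts the curve even when the model itself is untouched. A minimal estimator (variable names are illustrative):

```python
# Empirical partial dependence: PD_j(v) = mean_i f(x_i with feature j set to v).
import numpy as np

def partial_dependence(model_predict, X, feature, grid):
    """Average prediction over (possibly poisoned) data X with `feature` clamped to each grid value."""
    curve = []
    for value in grid:
        X_clamped = X.copy()
        X_clamped[:, feature] = value
        curve.append(model_predict(X_clamped).mean())
    return np.asarray(curve)
```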
Explainable Artificial Intelligence Approaches: A Survey
[article]
2021
arXiv
pre-print
While many popular Explainable Artificial Intelligence (XAI) methods or approaches are available to facilitate a human-friendly explanation of the decision, each has its own merits and demerits, with a ...
insight on quantifying explainability, and recommend paths towards responsible or human-centered AI using XAI as a medium. ...
ACKNOWLEDGMENTS
Our sincere thanks to Christoph Molnar for his open Ebook on Interpretable Machine Learning and contribution to the open-source R package "iml". ...
arXiv:2101.09429v1
fatcat:emnotqoj3zhs3lemwz7kbi45um
Explainable AI: A Review of Machine Learning Interpretability Methods
2020
Entropy
This study focuses on machine learning interpretability methods; more specifically, a literature review and taxonomy of these methods are presented, as well as links to their programming implementations ...
As a result, scientific interest in the field of Explainable Artificial Intelligence (XAI), a field that is concerned with the development of new methods that explain and interpret machine learning models ...
To conclude, they dealt with transparent models and post-hoc interpretation, as they believed that post-hoc interpretability could be used to elevate the predictive accuracy of a model and that transparent ...
doi:10.3390/e23010018
pmid:33375658
pmcid:PMC7824368
fatcat:gv42gcovm5cxzl2kmdsluiegdi
Towards Explainable Evaluation Metrics for Natural Language Generation
[article]
2022
arXiv
pre-print
We hope that our work can help catalyze and guide future research on explainable evaluation metrics and, mediately, also contribute to better and more transparent text generation systems. ...
Further, we conduct own novel experiments, which (among others) find that current adversarial NLP techniques are unsuitable for automatically identifying limitations of high-quality black-box evaluation ...
Methods for extracting explanations in this case are called post-hoc explanation methods. ...
arXiv:2203.11131v1
fatcat:lcfy3vs445btdd4am3sakroek4
The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective
[article]
2022
arXiv
pre-print
As various post hoc explanation methods are increasingly being leveraged to explain complex models in high-stakes settings, it becomes critical to develop a deeper understanding of if and when the explanations ...
We then leverage this framework to carry out a rigorous empirical analysis with four real-world datasets, six state-of-the-art post hoc explanation methods, and eight different predictive models, to measure ...
[49] demonstrated that methods such as LIME and SHAP may result in explanations that are not only inconsistent and unstable, but also prone to adversarial attacks and fair washing [8] . ...
arXiv:2202.01602v3
fatcat:4xwkf6gxn5axtc5om4hvdli4na
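Two of the simplest disagreement measures such a framework can formalize are top-k feature agreement and rank correlation between attribution vectors; a sketch (the paper's own metric definitions may differ in detail):

```python
# Simple disagreement measures between two feature-attribution vectors.
import numpy as np
from scipy.stats import spearmanr

def top_k_agreement(attr_a, attr_b, k=5):
    """Overlap fraction of the k most important features of two explanations."""
    top_a = set(np.argsort(-np.abs(attr_a))[:k])
    top_b = set(np.argsort(-np.abs(attr_b))[:k])
    return len(top_a & top_b) / k

def rank_correlation(attr_a, attr_b):
    """Spearman rank correlation between two attribution vectors."""
    rho, _ = spearmanr(attr_a, attr_b)
    return rho
```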
Explainable Deep Learning in Healthcare: A Methodological Survey from an Attribution View
[article]
2021
arXiv
pre-print
DL models and choose the optimal one accordingly. ...
Besides the methods' details, we also include a discussion of advantages and disadvantages of these methods and which scenarios each of them is suitable for, so that interested readers can know how to ...
., and Leskovec, J. (2019). GNNExplainer: A tool for post-hoc explanation of graph neural networks. Neural Information Processing Systems (NeurIPS). ...
arXiv:2112.02625v1
fatcat:omcm44vj2ffthcpna27typyvau
Sentence-Based Model Agnostic NLP Interpretability
[article]
2020
arXiv
pre-print
Today, interpretability of Black-Box Natural Language Processing (NLP) models based on surrogates, like LIME or SHAP, uses word-based sampling to build the explanations. ...
By using sentences, the altered text remains in-distribution and the dimensionality of the problem is reduced for better fidelity to the black-box at comparable computational complexity. ...
Compared to other, word-based black-box post-hoc NLP interpretability methods like LIME (Ribeiro et al., 2016) and SHAP (Lundberg and Lee, 2017), we have a much smaller search space (Section 2.2). ...
arXiv:2012.13189v2
fatcat:p3auhaugare7blxmcqtbdblmbi
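The key move in the entry above is to perturb at the sentence level rather than the word level when building the surrogate's neighbourhood. A minimal sampler in that spirit (illustrative only; the authors' actual sampler and tokenization may differ):

```python
# Sentence-level perturbation sampling for a LIME/SHAP-style surrogate:
# each perturbed text keeps or drops whole sentences, which keeps the altered
# text closer to the data distribution and shrinks the search space.
import numpy as np

def sentence_perturbations(sentences, n_samples=100, seed=None):
    """Yield (mask, text) pairs; mask[i] == 1 means sentence i is kept."""
    rng = np.random.default_rng(seed)
    for _ in range(n_samples):
        mask = rng.integers(0, 2, size=len(sentences))
        text = " ".join(s for s, keep in zip(sentences, mask) if keep)
        yield mask, text
```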
Explainable Deep Learning: A Field Guide for the Uninitiated
[article]
2021
arXiv
pre-print
) places explainability in the context of other related deep learning research areas, and iv) finally elaborates on user-oriented explanation designing and potential future directions on explainable deep ...
The development of methods and studies enabling the explanation of a DNN's decisions has thus blossomed into an active, broad area of research. ...
participants of the Schloss Dagstuhl − Leibniz Center for Informatics Seminar 17192 on Human-Like Neural-Symbolic Computing for providing the environment to develop the ideas in this paper. ...
arXiv:2004.14545v2
fatcat:4qvtfw6unbfgpkqmeosq737ghq
Explainable Deep Learning: A Field Guide for the Uninitiated
2022
The Journal of Artificial Intelligence Research
The development of methods and studies enabling the explanation of a DNN's decisions has thus blossomed into an active and broad area of research. ...
) places explainability in the context of other related deep learning research areas, and iv) discusses user-oriented explanation design and future directions. ...
Adversarial attack methods are about generating adversarial examples that can fool a DNN. ...
doi:10.1613/jair.1.13200
fatcat:qylru2n7tbepljxi72qah62bzy
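As a concrete instance of the adversarial attacks mentioned in the last snippet, the fast gradient sign method perturbs an input along the sign of the loss gradient; a framework-agnostic sketch, assuming the caller supplies a loss_gradient function from their own DL library:

```python
# Fast Gradient Sign Method (FGSM): a one-step adversarial perturbation.
import numpy as np

def fgsm(x, loss_gradient, epsilon=0.01):
    """Return x perturbed by epsilon along the sign of the loss gradient at x."""
    return x + epsilon * np.sign(loss_gradient(x))
```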
Analysis of Explainers of Black Box Deep Neural Networks for Computer Vision: A Survey
[article]
2019
arXiv
pre-print
Deep Learning is a state-of-the-art technique to make inference on extensive or complex data. ...
Hence, scientists developed several so-called explanators or explainers which try to point out the connection between input and output to represent in a simplified way the inner structure of machine learning ...
CIE [71]: any model, feature importance, local, post-hoc; DeepRed [69]: any model, rule extraction, global, ante-hoc; LIME [7]: any model, feature importance, local, post-hoc, agnostic. ...
arXiv:1911.12116v1
fatcat:qgeg6rz6qzgrfikhsgah77yz2a