47 Hits in 6.5 sec

Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods [article]

Dylan Slack, Sophie Hilgard, Emily Jia, Sameer Singh, Himabindu Lakkaraju
2020 arXiv   pre-print
In this paper, we demonstrate that post hoc explanation techniques that rely on input perturbations, such as LIME and SHAP, are not reliable.  ...  such as LIME and SHAP into generating innocuous explanations which do not reflect the underlying biases.  ...  ACKNOWLEDGEMENTS We would like to thank the anonymous reviewers for their feedback, and Scott Lundberg for insightful discussions.  ...
arXiv:1911.02508v2 fatcat:ybh7s6qyvjhuje6zzwftci2dpu
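The excerpt only hints at the attack mechanism. Below is a minimal sketch of one way a perturbation-based explainer could be steered toward an innocuous model, assuming a hypothetical out-of-distribution detector that flags the synthetic samples LIME/SHAP generate; all class and function names are illustrative, not taken from the paper.

```python
import numpy as np

class ScaffoldedClassifier:
    """Hypothetical adversarial wrapper: behaves like a biased model on real
    inputs, but routes LIME/SHAP-style perturbation samples (flagged by an
    out-of-distribution detector) to an innocuous model, so perturbation-based
    explanations surface only the innocuous features."""

    def __init__(self, biased_model, innocuous_model, ood_detector):
        self.biased_model = biased_model        # e.g. relies on a sensitive feature
        self.innocuous_model = innocuous_model  # e.g. relies on an uncorrelated feature
        self.ood_detector = ood_detector        # returns True for off-manifold samples

    def predict(self, X):
        X = np.asarray(X)
        perturbed = self.ood_detector(X)        # boolean mask over rows
        return np.where(perturbed,
                        self.innocuous_model(X),
                        self.biased_model(X))
```

On real inputs the wrapper behaves like the biased model, while the synthetic neighborhood that LIME and SHAP actually score is answered by the innocuous model, so the resulting attributions do not reflect the underlying bias.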

On the Tractability of SHAP Explanations [article]

Guy Van den Broeck, Anton Lykov, Maximilian Schleich, Dan Suciu
2021 arXiv   pre-print
Despite a lot of recent interest from both academia and industry, it is not known whether SHAP explanations of common machine learning models can be computed efficiently.  ...  First, we consider fully-factorized data distributions, and show that the complexity of computing the SHAP explanation is the same as the complexity of computing the expected value of the model.  ...  The authors would like to thank YooJung Choi for valuable discussions on the proof of Theorem 5.  ... 
arXiv:2009.08634v2 fatcat:qrqyevl2dzhhhgvwj4uobvvh4y
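For reference, the SHAP attribution of feature i is the Shapley value of the conditional-expectation game; under a fully factorized (feature-independent) distribution the value function reduces to a plain expectation over the remaining features, which is why the hardness of SHAP is tied to the hardness of computing the expected value of the model:

```latex
\phi_i(f, x) \;=\; \sum_{S \subseteq F \setminus \{i\}}
  \frac{|S|!\,(|F|-|S|-1)!}{|F|!}\,
  \bigl( v(S \cup \{i\}) - v(S) \bigr),
\qquad
v(S) \;=\; \mathbb{E}\bigl[ f(X) \mid X_S = x_S \bigr]
\;\overset{\text{indep.}}{=}\;
\mathbb{E}_{X_{\bar S}}\bigl[ f(x_S, X_{\bar S}) \bigr].
```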

Feature Attributions and Counterfactual Explanations Can Be Manipulated [article]

Dylan Slack, Sophie Hilgard, Sameer Singh, Himabindu Lakkaraju
2021 arXiv   pre-print
We demonstrate how adversaries can design biased models that manipulate model agnostic feature attribution methods (e.g., LIME & SHAP) and counterfactual explanations that hill-climb during the counterfactual  ...  We evaluate the manipulations on real world data sets, including COMPAS and Communities & Crime, and find explanations can be manipulated in practice.  ...  fooling LIME or SHAP into generating innocuous explanations.  ... 
arXiv:2106.12563v2 fatcat:6eidicjv2vaxdb6f6vftjscp64

Opportunities and Challenges in Explainable Artificial Intelligence (XAI): A Survey [article]

Arun Das, Paul Rad
2020 arXiv   pre-print
We start by proposing a taxonomy and categorizing the XAI techniques based on their scope of explanations, methodology behind the algorithms, and explanation level or usage which helps build trustworthy  ...  After explaining each category of algorithms and approaches in detail, we then evaluate the explanation maps generated by eight XAI algorithms on image data, discuss the limitations of this approach, and  ...  Shapley sampling methods [64] are also post-hoc and model agnostic.  ... 
arXiv:2006.11371v2 fatcat:6eaz3rbaenflxchjdynmvwlc4i
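The "Shapley sampling methods" mentioned in the excerpt typically refer to Monte Carlo permutation sampling of Shapley values. A minimal sketch, assuming a black-box prediction function f, a background dataset to draw replacement values from, and one instance x to explain; the function name and signature are illustrative.

```python
import numpy as np

def shapley_sampling(f, x, X_bg, n_samples=1000, rng=None):
    """Monte Carlo estimate of Shapley values for one instance.

    f     : callable mapping an (n, d) array to (n,) predictions
    x     : (d,) instance to explain
    X_bg  : (m, d) background data used to fill in 'absent' features
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    X_bg = np.asarray(X_bg, dtype=float)
    d = x.shape[0]
    phi = np.zeros(d)
    for _ in range(n_samples):
        perm = rng.permutation(d)                  # random feature ordering
        x_with = X_bg[rng.integers(len(X_bg))].copy()  # start from a background row
        for j in perm:
            x_without = x_with.copy()              # feature j still 'absent'
            x_with[j] = x[j]                       # now include feature j
            # marginal contribution of feature j under this ordering
            phi[j] += f(x_with[None, :])[0] - f(x_without[None, :])[0]
    return phi / n_samples
```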

Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations [article]

Jessica Dai, Sohini Upadhyay, Ulrich Aivodji, Stephen H. Bach, Himabindu Lakkaraju
2022 arXiv   pre-print
In addition, we also observe that certain post hoc explanation methods (e.g., Integrated Gradients, SHAP) are more likely to exhibit the aforementioned disparities.  ...  As post hoc explanation methods are increasingly being leveraged to explain complex models in high-stakes settings, it becomes critical to ensure that the quality of the resulting explanations is consistently  ...  [30] and Slack et al. [66] demonstrated that methods such as LIME and SHAP may result in explanations that are not only inconsistent and unstable, but also prone to adversarial attacks.  ... 
arXiv:2205.07277v1 fatcat:3logqufk2fdqxnj37jcaaviyv4
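For context, Integrated Gradients (one of the attribution methods named above) attributes feature i by integrating the model's gradient along a straight-line path from a baseline x' to the input x, approximated in practice with a Riemann sum:

```latex
\mathrm{IG}_i(x) \;=\; (x_i - x'_i)\int_0^1
  \frac{\partial f\bigl(x' + \alpha\,(x - x')\bigr)}{\partial x_i}\, d\alpha
\;\approx\;
(x_i - x'_i)\,\frac{1}{m}\sum_{k=1}^{m}
  \frac{\partial f\bigl(x' + \tfrac{k}{m}(x - x')\bigr)}{\partial x_i}.
```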

Fooling Partial Dependence via Data Poisoning [article]

Hubert Baniecki, Wojciech Kretowicz, Przemyslaw Biecek
2021 arXiv   pre-print
Many methods have been developed to understand complex predictive models and high expectations are placed on post-hoc model explainability.  ...  It turns out that such explanations are neither robust nor trustworthy, and they can be fooled.  ...  Acknowledgments and Disclosure of Funding We would like to thank the anonymous reviewers for many insightful comments and suggestions.  ...
arXiv:2105.12837v2 fatcat:c2ndcqmed5fe5djg2vi5fo7bcq
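For reference, the partial dependence profile being attacked here is just an average of model predictions with one feature held fixed over a grid. A minimal sketch of the empirical estimator, with illustrative names not taken from the paper:

```python
import numpy as np

def partial_dependence(f, X, feature, grid):
    """Empirical partial dependence of `f` on one feature.

    f       : callable mapping an (n, d) array to (n,) predictions
    X       : (n, d) reference dataset
    feature : column index whose effect is profiled
    grid    : 1-D array of values at which to fix that feature
    """
    pd_values = []
    for z in grid:
        Xz = X.copy()
        Xz[:, feature] = z               # force the feature to value z for every row
        pd_values.append(f(Xz).mean())   # average prediction over the reference data
    return np.array(pd_values)
```

As the title suggests, the attack studied in the paper poisons this reference data rather than the model itself, so the averaged curve changes while the model's predictions on genuine inputs stay fixed.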

Explainable Artificial Intelligence Approaches: A Survey [article]

Sheikh Rabiul Islam, William Eberle, Sheikh Khaled Ghafoor, Mohiuddin Ahmed
2021 arXiv   pre-print
While many popular Explainable Artificial Intelligence (XAI) methods or approaches are available to facilitate a human-friendly explanation of the decision, each has its own merits and demerits, with a  ...  insight on quantifying explainability, and recommend paths towards responsible or human-centered AI using XAI as a medium.  ...  ACKNOWLEDGMENTS Our sincere thanks to Christoph Molnar for his open Ebook on Interpretable Machine Learning and contribution to the open-source R package "iml".  ... 
arXiv:2101.09429v1 fatcat:emnotqoj3zhs3lemwz7kbi45um

Explainable AI: A Review of Machine Learning Interpretability Methods

Pantelis Linardatos, Vasilis Papastefanopoulos, Sotiris Kotsiantis
2020 Entropy  
This study focuses on machine learning interpretability methods; more specifically, a literature review and taxonomy of these methods are presented, as well as links to their programming implementations  ...  As a result, scientific interest in the field of Explainable Artificial Intelligence (XAI), a field that is concerned with the development of new methods that explain and interpret machine learning models  ...  To conclude, they dealt with transparent models and post-hoc interpretation, as they believed that post-hoc interpretability could be used to elevate the predictive accuracy of a model and that transparent  ... 
doi:10.3390/e23010018 pmid:33375658 pmcid:PMC7824368 fatcat:gv42gcovm5cxzl2kmdsluiegdi

Towards Explainable Evaluation Metrics for Natural Language Generation [article]

Christoph Leiter and Piyawat Lertvittayakumjorn and Marina Fomicheva and Wei Zhao and Yang Gao and Steffen Eger
2022 arXiv   pre-print
We hope that our work can help catalyze and guide future research on explainable evaluation metrics and, mediately, also contribute to better and more transparent text generation systems.  ...  Further, we conduct own novel experiments, which (among others) find that current adversarial NLP techniques are unsuitable for automatically identifying limitations of high-quality black-box evaluation  ...  Methods for extracting explanations in this case are called post-hoc explanation methods.  ... 
arXiv:2203.11131v1 fatcat:lcfy3vs445btdd4am3sakroek4

The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective [article]

Satyapriya Krishna, Tessa Han, Alex Gu, Javin Pombra, Shahin Jabbari, Steven Wu, Himabindu Lakkaraju
2022 arXiv   pre-print
As various post hoc explanation methods are increasingly being leveraged to explain complex models in high-stakes settings, it becomes critical to develop a deeper understanding of if and when the explanations  ...  We then leverage this framework to carry out a rigorous empirical analysis with four real-world datasets, six state-of-the-art post hoc explanation methods, and eight different predictive models, to measure  ...  [49] demonstrated that methods such as LIME and SHAP may result in explanations that are not only inconsistent and unstable, but also prone to adversarial attacks and fair washing [8] .  ... 
arXiv:2202.01602v3 fatcat:4xwkf6gxn5axtc5om4hvdli4na
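One simple disagreement measure of the kind such a framework needs is top-k feature agreement between two attribution vectors for the same prediction. A minimal sketch; the metric name and signature are illustrative and not necessarily the paper's exact definition.

```python
import numpy as np

def topk_feature_agreement(attr_a, attr_b, k=5):
    """Fraction of overlap between the k most important features
    (by absolute attribution) of two explanations for the same prediction."""
    top_a = set(np.argsort(-np.abs(np.asarray(attr_a)))[:k])
    top_b = set(np.argsort(-np.abs(np.asarray(attr_b)))[:k])
    return len(top_a & top_b) / k
```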

Explainable Deep Learning in Healthcare: A Methodological Survey from an Attribution View [article]

Di Jin and Elena Sergeeva and Wei-Hung Weng and Geeticka Chauhan and Peter Szolovits
2021 arXiv   pre-print
DL models and choose the optimal one accordingly.  ...  Besides the methods' details, we also include a discussion of advantages and disadvantages of these methods and which scenarios each of them is suitable for, so that interested readers can know how to  ...  ., and Leskovec, J. (2019). Gnn explainer: A tool for post-hoc explanation of graph neural networks. Neural Information Processing Systems (NeurIPS).  ... 
arXiv:2112.02625v1 fatcat:omcm44vj2ffthcpna27typyvau

Sentence-Based Model Agnostic NLP Interpretability [article]

Yves Rychener, Xavier Renard, Djamé Seddah, Pascal Frossard, Marcin Detyniecki
2020 arXiv   pre-print
Today, interpretability of Black-Box Natural Language Processing (NLP) models based on surrogates, like LIME or SHAP, uses word-based sampling to build the explanations.  ...  By using sentences, the altered text remains in-distribution and the dimensionality of the problem is reduced for better fidelity to the black-box at comparable computational complexity.  ...  Compared to other, word-based black-box post-hoc NLP interpretability methods like LIME (Ribeiro et al., 2016) and SHAP (Lundberg and Lee, 2017), we have a much smaller search space (Section 2.2).  ... 
arXiv:2012.13189v2 fatcat:p3auhaugare7blxmcqtbdblmbi
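A minimal sketch of the sentence-level perturbation idea described in the excerpt, treating each sentence (rather than each word) as one interpretable unit; the helper name and its use with a surrogate explainer are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def sentence_perturbations(sentences, n_samples=500, rng=None):
    """Sample binary masks over sentences and build perturbed documents.

    Each interpretable unit is a whole sentence, so perturbed texts stay
    closer to the training distribution than word-level deletions and the
    search space shrinks from the number of words to the number of sentences.
    """
    rng = np.random.default_rng(rng)
    masks = rng.integers(0, 2, size=(n_samples, len(sentences)))
    texts = [" ".join(s for s, keep in zip(sentences, m) if keep) for m in masks]
    return masks, texts

# The masks and the black-box scores of `texts` can then be fed to any
# surrogate-based explainer, e.g. a weighted linear model as in LIME.
```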

Explainable Deep Learning: A Field Guide for the Uninitiated [article]

Gabrielle Ras, Ning Xie, Marcel van Gerven, Derek Doran
2021 arXiv   pre-print
) places explainability in the context of other related deep learning research areas, and iv) finally elaborates on user-oriented explanation designing and potential future directions on explainable deep  ...  The development of methods and studies enabling the explanation of a DNN's decisions has thus blossomed into an active, broad area of research.  ...  participants of the Schloss Dagstuhl − Leibniz Center for Informatics Seminar 17192 on Human-Like Neural-Symbolic Computing for providing the environment to develop the ideas in this paper.  ... 
arXiv:2004.14545v2 fatcat:4qvtfw6unbfgpkqmeosq737ghq

Explainable Deep Learning: A Field Guide for the Uninitiated

Gabrielle Ras, Ning Xie, Marcel Van Gerven, Derek Doran
2022 The Journal of Artificial Intelligence Research  
The development of methods and studies enabling the explanation of a DNN's decisions has thus blossomed into an active and broad area of research.  ...  ) places explainability in the context of other related deep learning research areas, and iv) discusses user-oriented explanation design and future directions.  ...  Adversarial attack methods are about generating adversarial examples that can fool a DNN.  ... 
doi:10.1613/jair.1.13200 fatcat:qylru2n7tbepljxi72qah62bzy

Analysis of Explainers of Black Box Deep Neural Networks for Computer Vision: A Survey [article]

Vanessa Buhrmester, David Münch, Michael Arens
2019 arXiv   pre-print
Deep Learning is a state-of-the-art technique to make inference on extensive or complex data.  ...  Hence, scientists developed several so-called explanators or explainers which try to point out the connection between input and output to represent in a simplified way the inner structure of machine learning  ...  CIE [71]: any model, feature importance, local, post-hoc; DeepRed [69]: any model, rule extraction, global, ante-hoc; LIME [7]: any model, feature importance, local, post-hoc, model-agnostic.  ...
arXiv:1911.12116v1 fatcat:qgeg6rz6qzgrfikhsgah77yz2a
Showing results 1–15 out of 47 results