Interpretation of Neural Networks Is Fragile

Ghorbani, Amirata; Abid, Abubakar; Zou, James

doi:10.1609/aaai.v33i01.33013681

Article | Proceedings of the AAAI Conference on Artificial Intelligence | Jul 17, 2019 | Diamond OA

Amirata Ghorbani, Abubakar Abid, James Zou

Stanford University

Indexed in Crossref

Abstract

In order for machine learning to be trusted in many applications, it is critical to be able to reliably explain why the machine learning algorithm makes certain predictions. For this reason, a variety of methods have been developed recently to interpret neural network predictions by providing, for example, feature importance maps. For both scientific robustness and security reasons, it is important to know to what extent the interpretations can be altered by small systematic perturbations to the input data, which might be generated by adversaries or by measurement biases. In this paper, we demonstrate how to generate adversarial perturbations that produce perceptually indistinguishable inputs that are assigned…

Citation impact

Total citations: 675
FWCI: 48.67
Percentile: 100%
References: 26

Authors: 3

Topics & keywords

Keywords

Robustness (evolution)
Computer science
Artificial intelligence
Adversarial system
Hessian matrix
Machine learning
Artificial neural network
Interpretation (philosophy)
