Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
Indexed in arXiv
Abstract
The interpretation of deep learning models is a challenge due to their size, complexity, and often opaque internal state. In addition, many systems, such as image classifiers, operate on low-level features rather than high-level concepts. To address these challenges, we introduce Concept Activation Vectors (CAVs), which provide an interpretation of a neural net's internal state in terms of human-friendly concepts. The key idea is to view the high-dimensional internal state of a neural net as an aid, not an obstacle. We show how to use CAVs as part of a technique, Testing with CAVs (TCAV), that uses directional derivatives to quantify the degree to which a user-defined concept is important to a…
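The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names `compute_cav` and `tcav_score` are hypothetical, a least-squares linear separator stands in for whatever linear classifier one would actually train, and the activations and gradients in the usage lines are synthetic arrays rather than values taken from a real network. Under those assumptions, a CAV is taken as the unit normal of a linear boundary separating a layer's activations for concept examples from its activations for random examples, and the score is the fraction of inputs whose class logit has a positive directional derivative along that vector.

```python
import numpy as np


def compute_cav(concept_acts, random_acts):
    """Return a unit vector normal to a least-squares linear separator.

    concept_acts, random_acts: arrays of shape (n_examples, n_units) holding
    layer activations. The vector points toward the concept examples.
    Least squares is an assumption here; any linear classifier could be used.
    """
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), -np.ones(len(random_acts))])
    X1 = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(X1, y, rcond=None)
    v = w[:-1]  # drop the bias weight, keep the normal direction
    return v / np.linalg.norm(v)


def tcav_score(logit_grads, cav):
    """Fraction of inputs with a positive directional derivative along the CAV.

    logit_grads: array of shape (n_inputs, n_units), the gradient of the class
    logit with respect to the layer activations, one row per input.
    """
    directional = logit_grads @ cav
    return float(np.mean(directional > 0))


# Synthetic usage: the "concept" shifts activations along the first unit.
rng = np.random.default_rng(0)
n_units = 8
shift = np.zeros(n_units)
shift[0] = 3.0
concept_acts = rng.normal(size=(50, n_units)) + shift
random_acts = rng.normal(size=(50, n_units))
cav = compute_cav(concept_acts, random_acts)

# Made-up gradients that mostly increase the logit along the first unit.
grads = rng.normal(scale=0.1, size=(100, n_units))
grads[:, 0] += 1.0
score = tcav_score(grads, cav)
print(f"TCAV-style score: {score:.2f}")
```

With a real model, `concept_acts` and `random_acts` would come from a forward pass up to the chosen layer, and `logit_grads` from automatic differentiation of the class logit with respect to that layer's activations.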
Citation impact
- Total citations: 735
- References: 0
Authors: 7
Topics & keywords
Keywords
- Interpretability
- Computer science
- Artificial intelligence
- Interpretation (philosophy)
- Image (mathematics)
- Domain (mathematical analysis)
- Machine learning
- Feature (linguistics)