STRIP
Commonwealth Scientific and Industrial Research Organisation · Swinburne University of Technology · +1 more institution
Abstract
A recent trojan attack on deep neural network (DNN) models is an insidious variant of data poisoning attacks. Trojan attacks exploit an effective backdoor created in a DNN model, leveraging the difficulty of interpreting the learned model, to misclassify any input signed with the attacker's chosen trojan trigger. Since the trojan trigger is a secret guarded and exploited by the attacker, detecting such trojan inputs is a challenge, especially at run-time when models are in active operation. This work builds a STRong Intentional Perturbation (STRIP) based run-time trojan attack detection system and focuses on vision systems. We intentionally perturb the incoming input, for instance by superimposing…
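The perturbation idea in the abstract can be illustrated with a rough sketch. The abstract is cut off before it says what is superimposed or how the outcome is scored, so everything below is an assumption made for illustration: inputs are flat lists of pixel values, the overlay is a held-out clean sample blended linearly, and an input is flagged when the Shannon entropy of the model's predictions across perturbed copies stays low. The names `superimpose`, `mean_prediction_entropy`, `looks_trojaned`, the `predict` callable, and the `alpha` and `threshold` parameters are hypothetical and not taken from the paper.

```python
import math


def superimpose(x, overlay, alpha=0.5):
    """Linearly blend an incoming input with an overlay (assumed perturbation)."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(x, overlay)]


def shannon_entropy(probs):
    """Shannon entropy in bits of one probability vector; zero terms are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def mean_prediction_entropy(predict, x, overlays, alpha=0.5):
    """Average prediction entropy over all perturbed copies of x.

    `predict` is a hypothetical callable mapping an input to class probabilities.
    """
    entropies = [
        shannon_entropy(predict(superimpose(x, overlay, alpha)))
        for overlay in overlays
    ]
    return sum(entropies) / len(entropies)


def looks_trojaned(predict, x, overlays, threshold, alpha=0.5):
    """Flag x when predictions stay nearly constant under strong perturbation.

    Assumption: a trigger-carrying input keeps forcing the same class, so its
    entropy is low, while a clean input's prediction varies with the overlay.
    `threshold` is a free parameter that would need calibrating on clean data.
    """
    return mean_prediction_entropy(predict, x, overlays, alpha) < threshold
```

In this sketch a model that returns the same one-hot prediction for every perturbed copy scores an entropy of zero and is flagged, whereas a model whose output spreads across classes is not.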
Citation impact
- FWCI: 35.85
- Percentile: 100%
- References: 26
Authors
6
Topics & keywords
- Trojan
- Backdoor
- Interpretability
- Computer science
- Exploit
- Robustness (evolution)
- Artificial intelligence
- Randomness