Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples

Papernot, Nicolas; McDaniel, Patrick; Goodfellow, Ian

doi:10.48550/arxiv.1605.07277

Preprint | arXiv (Cornell University) | May 24, 2016 | Green OA

Indexed in: arXiv, DataCite

Abstract

Many machine learning models are vulnerable to adversarial examples: inputs that are specially crafted to cause a machine learning model to produce an incorrect output. Adversarial examples that affect one model often affect another model, even if the two models have different architectures or were trained on different training sets, so long as both models were trained to perform the same task. An attacker may therefore train their own substitute model, craft adversarial examples against the substitute, and transfer them to a victim model, with very little information about the victim. Recent work has further developed a technique that uses the victim model as an oracle to label a synthetic training set for…
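The substitute-model attack outlined in the abstract can be illustrated on a toy problem. The sketch below is not the paper's implementation. It assumes a hypothetical black-box victim (`victim_predict`, a fixed linear classifier that returns labels only), a logistic-regression substitute trained on inputs labeled by querying that victim, and a fast-gradient-sign perturbation chosen here for simplicity. All names, sizes, and parameter values (`eps`, `steps`, `lr`) are illustrative assumptions.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical black-box victim: the attacker only ever sees its output labels.
_VICTIM_W, _VICTIM_B = [2.0, -1.0], 0.3

def victim_predict(x):
    score = sum(w * xi for w, xi in zip(_VICTIM_W, x)) + _VICTIM_B
    return 1 if score > 0 else 0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sign(g):
    return (g > 0) - (g < 0)

def train_substitute(inputs, labels, steps=300, lr=0.5):
    """Fit a 2-feature logistic regression by full-batch gradient descent."""
    w, b = [0.0, 0.0], 0.0
    n = len(inputs)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in zip(inputs, labels):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            grad_w[0] += err * x[0]
            grad_w[1] += err * x[1]
            grad_b += err
        w = [w[0] - lr * grad_w[0] / n, w[1] - lr * grad_w[1] / n]
        b -= lr * grad_b / n
    return w, b

def substitute_predict(x, w, b):
    return 1 if sigmoid(w[0] * x[0] + w[1] * x[1] + b) > 0.5 else 0

def fgsm(x, y, w, b, eps):
    """Move x by eps along the sign of the substitute's loss gradient."""
    err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y  # d(loss)/d(logit)
    return [xi + eps * sign(err * wi) for xi, wi in zip(x, w)]

def sample(n):
    return [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(n)]

# 1. Label a synthetic training set by querying the victim as an oracle.
train_x = sample(200)
train_y = [victim_predict(x) for x in train_x]

# 2. Train the attacker's own substitute on those oracle labels.
sub_w, sub_b = train_substitute(train_x, train_y)

# 3. Craft adversarial inputs against the substitute, then replay them on the victim.
test_x = sample(200)
test_y = [victim_predict(x) for x in test_x]
adv_x = [fgsm(x, y, sub_w, sub_b, eps=0.3) for x, y in zip(test_x, test_y)]

agreement = sum(
    substitute_predict(x, sub_w, sub_b) == y for x, y in zip(test_x, test_y)
) / len(test_x)
transfer_rate = sum(
    victim_predict(a) != y for a, y in zip(adv_x, test_y)
) / len(test_x)

print(f"substitute/victim agreement: {agreement:.2f}")
print(f"transfer rate to victim:     {transfer_rate:.2f}")
```

Here `transfer_rate` is the fraction of inputs whose victim label changes after a perturbation computed only from the substitute. Both models are linear in this toy setting, so it says nothing about how well transfer works across the different model families the abstract refers to.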

Citation impact

Total citations: 1,418
References: 16

Authors: 3

Topics & keywords

Keywords

Transferability
Adversarial system
Black box
Computer science
Artificial intelligence
Adversarial machine learning
Machine learning

UN Sustainable Development Goals

Peace, Justice and Strong Institutions
