Disclaimer: These are my personal notes on this paper. I am in no way affiliated with this paper; all credit goes to the authors.
Label Sanitization against Label Flipping Poisoning Attacks
Oct. 2, 2018 -
Paper Link -
Tags: Data-Poisoning, Defense, Label-Flipping
Summary
Notes
- Defined "optimal poisoning attacks" pretty well at the start of section 2.
- Attack Method:
- "worst-case scenario". Assumes full white-box access to everything. Model + dataset + separate validation dataset from the same distribution of the dataset. Admits that this is unrealistic.
- Used a binary linear classifier, but the attack applies to multi-class problems as well.
- The attacker's goal is to maximize the classifier's loss on the validation set.
- Algorithm 1 gives the attack: greedily select the training examples whose label flips most increase the validation loss (rough sketch below).
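My rough Python sketch of the greedy attack, just to make the selection step concrete. Everything here is my own assumption rather than the paper's code: a binary {-1, +1} task, scikit-learn's LinearSVC as the linear classifier, hinge loss on the validation set, and a `flip_budget` parameter for how many labels the attacker may flip.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import hinge_loss

def greedy_label_flip_attack(X_train, y_train, X_val, y_val, flip_budget):
    """Greedily flip the training labels that most increase validation loss.

    X_train, y_train, X_val, y_val are NumPy arrays; labels are in {-1, +1}.
    """
    y_poisoned = y_train.copy()
    flipped = set()
    for _ in range(flip_budget):
        best_idx, best_loss = None, -np.inf
        for i in range(len(y_poisoned)):
            if i in flipped:
                continue
            y_trial = y_poisoned.copy()
            y_trial[i] = -y_trial[i]  # candidate flip of one binary label
            clf = LinearSVC().fit(X_train, y_trial)
            loss = hinge_loss(y_val, clf.decision_function(X_val))
            if loss > best_loss:  # keep the most damaging flip this round
                best_idx, best_loss = i, loss
        y_poisoned[best_idx] = -y_poisoned[best_idx]
        flipped.add(best_idx)
    return y_poisoned
```

Note the inner loop retrains the classifier for every candidate flip, so this is expensive; it fits the paper's framing of a worst-case, white-box baseline rather than something an attacker would realistically run.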
- Defense Method:
- Label sanitization based
- Considers flipped samples as outliers
- Algorithm 2 gives the defense, based on k-NN: for each training point, look at its k nearest neighbors; if the fraction of neighbors sharing the most common label meets a given threshold, the point is relabeled with that majority label (sketch after this list).
- "Poisoning points that are far from the decision boundary are likely to be relabelled, mitigating their malicious effect on the performance of the classifier"
- The authors admit that genuine samples can be relabeled where the two classes overlap.
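A minimal sketch of the k-NN relabeling defense as I read it, using scikit-learn's NearestNeighbors. The function name is mine; `k` is the neighborhood size and `eta` the majority threshold discussed above.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_label_sanitization(X, y, k=10, eta=0.5):
    """Relabel points whose k nearest neighbors confidently disagree with them.

    X and y are NumPy arrays (y holds the possibly poisoned labels).
    """
    y_clean = y.copy()
    # Query k + 1 neighbors so each point's own entry can be dropped.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)
    for i in range(len(y)):
        neighbor_labels = y[idx[i][1:]]  # exclude the point itself
        labels, counts = np.unique(neighbor_labels, return_counts=True)
        majority_label = labels[np.argmax(counts)]
        # Relabel only when the majority label's share reaches the threshold.
        if counts.max() / k >= eta:
            y_clean[i] = majority_label
    return y_clean
```

With eta = 0.5 every point gets relabeled to its neighborhood majority, which matches the "minority is always relabeled" setting that worked best in the experiments below.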
- Experimental Results:
- Used 3 datasets: BreastCancer (30 features), MNIST (1 vs 7) (784 features), SpamBase (54 features)
- With 20% of the training data poisoned, the classification error increases by a factor of between 2.8 and 6.
- The defensive strategy does slightly degrade the classifier's performance.
- Figure 1 shows the results of the attack and defense (LF = attack, kNN = defense)
- Figure 2 shows the impact of k in k-NN and the threshold value.
- More neighbors per point (larger k) leads to a better defense.
- A threshold of 50% led to the best defense, i.e., the minority label is always relabeled to match the majority.
Interesting References
- "manipulate samples at test time to evade detection or inject malicious data into the training set to poison the learning algorithm"
LINK
- Data poisoning attacks are very relevant as an emerging security threat for data-driven technologies.
LINK
- Label flipping attacks can degrade the performance of deep learning based networks significantly.
LINK
- Label flipping attack paper. Adversary selects a subset of the training data to maximize the error.
LINK
- Defense Strategies:
- "Defences against optimal poisoning attacks typically consist either in identifying malicious examples and discarding them from the training data or they require to solve some robust optimization problem"
- Outlier detection method for label flipping attacks that identifies and removes suspicious samples. LINK
- Solve optimization problem. Assumes knowledge of the amount of poisoned data.
LINK
- Solve optimization problem. Requires an estimation of the amount of poisoned data. Iteratively removes poisoned data based on current error estimates.
LINK
- Evaluates the degree to which each training sample affects the learning algorithm's performance. Samples that negatively affect the loss function are discarded. Does not scale well. LINK
- "detect the most influential training points". Similar to ^, but scales better. Does not retrain the model. LINK
Citation: Paudice, Andrea, Luis Muñoz-González, and Emil C. Lupu. "Label sanitization against label flipping poisoning attacks." Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, Cham, 2018.