A Survey on Machine Learning Approaches to Detect Label Flipping Adversarial Poisoning Attacks

Authors

  • Rucha Gurav

Keywords

Dynamic label poisoning, adversarial attacks, machine learning robustness, data poisoning evaluation, stealth attack strategies

Abstract

Label-flipping adversarial poisoning attacks pose a substantial threat to the integrity and security of machine learning (ML) models by deliberately altering labels in the training dataset. Such manipulation can significantly distort model predictions, degrading performance and undermining decision-making. Detecting these attacks is essential to maintaining the robustness and trustworthiness of ML systems, particularly in critical domains such as cybersecurity, finance, and healthcare, where model reliability is paramount. This paper offers an extensive review of contemporary methods for detecting label-flipping adversarial poisoning attacks across a range of machine learning algorithms. We conduct a comparative analysis of the strengths and limitations of existing detection strategies, covering both supervised and unsupervised learning paradigms. We further examine how factors such as feature engineering, model interpretability, and class imbalance affect detection effectiveness. Finally, the review highlights current challenges, identifies existing research gaps, and outlines future directions for advancing detection mechanisms, thereby contributing to the development of more resilient and secure machine learning models capable of withstanding adversarial manipulation.
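
To make the threat model and a typical countermeasure concrete, the short Python sketch below illustrates both halves of the problem. It is a minimal illustration of ours, not a method taken from any single surveyed paper: it flips the labels of a random 10% of a synthetic training set, then flags suspicious points with a simple k-nearest-neighbour label-agreement heuristic. The dataset, flip rate, and neighbourhood size k are arbitrary choices made for demonstration.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Toy binary classification data standing in for a clean training set.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Attack: flip the labels of a random 10% of the training points.
n_flip = int(0.10 * len(y))
flip_idx = rng.choice(len(y), size=n_flip, replace=False)
y_poisoned = y.copy()
y_poisoned[flip_idx] = 1 - y_poisoned[flip_idx]  # binary label flip

# Detection heuristic: flag a point as suspicious when most of its
# k nearest neighbours carry a different label than it does.
k = 10
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
neighbour_idx = nn.kneighbors(X, return_distance=False)[:, 1:]  # drop self-match
agreement = (y_poisoned[neighbour_idx] == y_poisoned[:, None]).mean(axis=1)
suspected = np.where(agreement < 0.5)[0]

true_positives = np.intersect1d(suspected, flip_idx)
print(f"flagged {len(suspected)} points; {len(true_positives)} were actually flipped")

Detectors in the literature vary the scoring signal (loss behaviour during training, clustering in feature space, certified bounds, and so on), but many follow the same overall pattern sketched here: score each training label against some notion of consistency, then quarantine or relabel the outliers.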

Published

31-03-2025

How to Cite

Rucha Gurav. (2025). A Survey on Machine Learning Approaches to Detect Label Flipping Adversarial Poisoning Attacks. Vidhyayana - An International Multidisciplinary Peer-Reviewed E-Journal - ISSN 2454-8596, 10(si4). Retrieved from http://www.j.vidhyayanaejournal.org/index.php/journal/article/view/2159