Unveiling vulnerabilities in deep learning-based malware detection: Differential privacy driven adversarial attacks

Rahim Taheri*, Mohammad Shojafar, Farzad Arabikhan, Alexander Gegov

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The exponential growth of Android malware poses a severe threat, motivating the development of machine learning and, in particular, deep learning-based classifiers to detect and mitigate malicious applications. However, these classifiers are susceptible to adversarial attacks that manipulate input data to deceive the classifier and degrade its performance. This paper investigates the vulnerability of deep learning-based Android malware classifiers to two adversarial attacks: Data Poisoning with Noise Injection (DP-NI) and Gradient-based Data Poisoning (GDP). In these attacks, we explore how attackers can use differential privacy techniques to compromise the effectiveness of deep learning-based Android malware classifiers. We propose and evaluate a novel defense mechanism, Differential Privacy-Based Noise Clipping (DP-NC), designed to enhance the robustness of Android malware classifiers against these adversarial attacks. By leveraging deep neural networks and adversarial training techniques, DP-NC demonstrates remarkable efficacy in mitigating the impact of both DP-NI and GDP attacks. Through extensive experimentation on three diverse Android datasets (Drebin, Contagio, and Genome), we evaluate the performance of DP-NC against the proposed adversarial attacks. Our results show that DP-NC significantly reduces the false-positive rate and improves classification accuracy across all datasets and attack scenarios. For instance, on the Drebin dataset, accuracy drops to 51% and 30% after applying the DP-NI and GDP attacks, respectively; applying the DP-NC defense restores accuracy to approximately 70% in both cases. Furthermore, employing the DP-NC defense against the DP-NI and GDP attacks reduces the false-positive rate by 45.46% and 7.67%, respectively. Similar results are obtained on the other two datasets, Contagio and Genome. These results underscore the effectiveness of DP-NC in enhancing the robustness of deep learning-based Android malware classifiers against adversarial attacks.
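The abstract names the DP-NI attack and the DP-NC defense without giving implementation detail. The NumPy sketch below is only a rough, minimal illustration of the underlying ideas: differential-privacy-style noise injection (here, a Laplace mechanism) as a data-poisoning step, and norm clipping of the perturbation as a defense. The function names, the choice of the Laplace mechanism, and the epsilon/clip-norm parameters are assumptions for illustration, not the paper's actual algorithms.

```python
# Illustrative sketch only; not the authors' implementation of DP-NI or DP-NC.
import numpy as np

def dp_noise_injection(X, epsilon=0.5, sensitivity=1.0, rng=None):
    """Poisoning sketch: add Laplace noise with scale = sensitivity / epsilon
    to feature vectors, mimicking a differential-privacy-driven injection."""
    rng = np.random.default_rng() if rng is None else rng
    scale = sensitivity / epsilon
    return X + rng.laplace(loc=0.0, scale=scale, size=X.shape)

def noise_clip(X_poisoned, X_reference, clip_norm=1.0):
    """Defense sketch: bound each sample's perturbation by clipping its
    L2 norm, loosely analogous to the noise-clipping idea in DP-NC."""
    delta = X_poisoned - X_reference
    norms = np.linalg.norm(delta, axis=1, keepdims=True)
    factor = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return X_reference + delta * factor

# Tiny usage example on random binary malware-style feature vectors.
X = np.random.default_rng(0).integers(0, 2, size=(4, 10)).astype(float)
X_attacked = dp_noise_injection(X, epsilon=0.5)
X_defended = noise_clip(X_attacked, X, clip_norm=1.0)
```

In this sketch a smaller epsilon yields larger noise (a stronger attack), while a smaller clip norm bounds the perturbation more tightly (a stronger defense); the paper's actual methods additionally involve gradient-based poisoning and adversarial training, which are not shown here.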
Original language: English
Journal: Computers and Security
Early online date: 8 Aug 2024
DOIs
Publication status: Early online - 8 Aug 2024

Keywords

  • Adversarial attacks
  • Android malware detection
  • Deep learning
  • Differential privacy
  • Gradient perturbation
