Data augmentation using conditional generative adversarial network (Cgan) for android malware binary and multi-class classification

Fawaz Mohammad Hayel Othman, Bander Ali Saleh Al-Rimy, Sultan Ahmed Almalki, Tami Abdulrahman Alghamdi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In mobile malware detection and classification, various datasets have been constructed, particularly for the Android operating system, which experiences the highest average number of malware attacks. Traditional detection and classification methods that rely on malware signatures are becoming increasingly ineffective, especially against zero-day malware, which calls for smarter detection methods. Machine learning models have evolved to address this problem as new malware becomes more sophisticated. Improving the accuracy of these models for malware classification and detection requires new datasets. However, constructing such datasets manually is tedious and makes it harder to anticipate zero-day malware patterns. There is also a lack of comprehensive Android malware datasets. In this study, a new synthetic multiclass Android malware dataset was generated from one of the most recent datasets, CCCS-CIC-AndMal2020, using two labeling approaches as conditional inputs to a Conditional Generative Adversarial Network (CGAN) model: binary classification and multiclass classification. To test the quality of the generated data, a machine learning classification model was developed to evaluate the two approaches on the generated data combined with the original data. The multiclass approach achieved low accuracy, while the binary approach achieved 85% accuracy using LightGBM and 78% using Random Forest, compared with 85% on the original data using both LightGBM and Random Forest.
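The two conditioning approaches named in the abstract can be illustrated with a minimal sketch of how a CGAN generator input is commonly formed: a noise vector concatenated with a one-hot encoding of the class label. This is not the authors' implementation. The class names, the noise dimension, and the helper functions `one_hot`, `generator_input`, and `to_binary` are hypothetical, and treating the binary scheme as benign versus malware is an assumption that the abstract does not state.

```python
import random

# Hypothetical label sets; the real class names come from the dataset.
BINARY_CLASSES = ["benign", "malware"]
MULTI_CLASSES = ["benign", "family_a", "family_b", "family_c"]


def one_hot(index, num_classes):
    """Return a list of length num_classes with 1.0 at position index."""
    if not 0 <= index < num_classes:
        raise ValueError("class index out of range")
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def to_binary(label):
    """Collapse every non-benign label to "malware" (assumed binary scheme)."""
    return "benign" if label == "benign" else "malware"


def generator_input(label, classes, noise_dim, rng):
    """Concatenate Gaussian noise with the one-hot condition for label."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + one_hot(classes.index(label), len(classes))


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Same sample label, conditioned under each of the two schemes.
z_binary = generator_input(to_binary("family_b"), BINARY_CLASSES, 8, rng)
z_multi = generator_input("family_b", MULTI_CLASSES, 8, rng)
```

Under these assumptions the only difference between the two approaches is the width and content of the condition appended to the noise: two entries for the binary scheme and one entry per class for the multiclass scheme.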

Original language: English
Title of host publication: 2025 IEEE 15th Annual Computing and Communication Workshop and Conference, CCWC 2025
Editors: Rajashree Paul, Arpita Kundu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 584-592
Number of pages: 9
ISBN (Electronic): 9798331507695
ISBN (Print): 9798331507701
Publication status: Published - 5 Mar 2025
Event: 15th IEEE Annual Computing and Communication Workshop and Conference, CCWC 2025 - Las Vegas, United States
Duration: 6 Jan 2025 to 8 Jan 2025

Conference

Conference: 15th IEEE Annual Computing and Communication Workshop and Conference, CCWC 2025
Country/Territory: United States
City: Las Vegas
Period: 6/01/25 to 8/01/25
