Leveraging Machine Learning for Enhanced Android Malware Detection and Analysis

Manjari Sharma (VIT Bhopal University); Muneeswaran V; Narottam Das Patel; Ajay Kumar Phulre

doi:10.63169/GCARED2025.p29

Abstract

The worldwide spread of Android devices has been accompanied by a sharp increase in targeted, specialized cyberattacks that threaten device operation, data security, and user privacy. Because traditional signature-based malware detection methods are frequently ineffective against evolving threats, machine learning (ML)-based techniques are needed to detect malicious behavior reliably. To build reliable malware detection models, this study uses a large dataset of Android application attributes, including permissions, operating system characteristics, security precautions, and data destinations. Eight machine learning classifiers were assessed: Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), XGBoost, Naive Bayes, and three hybrid models (SVM+DT, SVM+Naive Bayes, and SVM+XGBoost). In addition, ensemble techniques such as RandomForest + CatBoost were used to improve detection accuracy. Each model was evaluated on accuracy, precision, recall, and F1 score. Among the individual classifiers, XGBoost achieved the highest accuracy (95.61%) and F1 score (96.13%), with Random Forest close behind (accuracy: 95.49%, F1 score: 96.00%). The hybrid models showed improved resilience, with SVM+XGBoost obtaining an F1 score of 96.28% and SVM+DT producing reliable results (F1 score: 94.39%). The RandomForest + CatBoost combination proved to be the most effective malware detection method, surpassing all individual and hybrid classifiers in both accuracy and F1 score.
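
For readers unfamiliar with the four evaluation measures named in the abstract, the following is a minimal sketch of how they can be computed for a binary malware/benign classifier. It is illustrative only and is not the authors' evaluation code: the function name `evaluate`, the `Metrics` container, and the convention that label `1` marks malware are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    `positive` is the label treated as "malware" (assumed to be 1 here).
    Ratios with a zero denominator are reported as 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Toy labels, made up purely for illustration (1 = malware, 0 = benign).
    truth = [1, 1, 1, 0, 0, 1, 0, 1]
    preds = [1, 0, 1, 0, 1, 1, 0, 1]
    print(evaluate(truth, preds))
```

Because F1 combines precision and recall, it can rank models differently from accuracy alone when the malware and benign classes are unevenly represented, which is one reason both figures are typically reported side by side.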