SHAP-Based Android Malware Detection Using Ensemble Learning
DOI:
https://doi.org/10.47392/IRJASH.2025.077
Keywords:
Android malware detection, Sensitive Function Call Graph, NetworkX, Word2Vec, Smali code, API semantic analysis, SHAP interpreter, social network analysis
Abstract
Android malware remains a critical threat to mobile security, demanding robust and transparent detection mechanisms. This paper proposes an end-to-end method for identifying malicious Android apps that combines code analysis with graph-based techniques, making detection both more precise and more interpretable. The workflow starts with a detailed pre-processing stage in which APK samples are unpacked: we retrieve the DEX files and use Baksmali to disassemble them into Smali code, which exposes program behaviour and program flow. Androguard is additionally used to retrieve metadata and permission specifications, supporting inspection of code semantics. We then build a Sensitive Function Call Graph (SFCG) for each Android app, in which vertices are functions that call sensitive APIs and edges are the calls between those functions. We enrich the graphs with structural features, such as degree centrality, closeness centrality, and clustering coefficients, and with permission patterns found in the Smali code. Semantic features are extracted by transforming the Smali code and embedding it with Word2Vec. These features are then used to train a robust ensemble learning system composed of multiple individual classifiers. To make the detection system more transparent, we employ SHAP to explain the model, yielding feature-level explanations for each malware classification result. Experiments on a large reference dataset show that the proposed approach delivers accurate, interpretable, and scalable Android malware detection, with a reported detection performance of approximately 99.9%. The system not only improves security but also promotes transparency, which is crucial in security-critical applications.
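The SFCG construction and the structural features named in the abstract can be illustrated with a minimal NetworkX sketch. It is not the authors' implementation: the sensitive API list, the call edges, and the helper names `build_sfcg` and `structural_features` are hypothetical, and computing closeness and clustering on an undirected copy of the graph is a simplifying assumption.

```python
import networkx as nx

# Hypothetical sensitive APIs (short names for readability; illustrative only).
SENSITIVE_APIS = {
    "getDeviceId",
    "openConnection",
    "getLastKnownLocation",
    "sendTextMessage",
}

# Hypothetical (caller, callee) pairs, as might be parsed from Smali invoke-* instructions.
CALL_EDGES = [
    ("main", "collect"),
    ("main", "upload"),
    ("main", "sms"),
    ("upload", "collect"),
    ("collect", "getDeviceId"),
    ("upload", "openConnection"),
    ("main", "getLastKnownLocation"),
    ("sms", "sendTextMessage"),
    ("render", "draw"),  # no sensitive API involved, so it is dropped from the SFCG
]


def build_sfcg(call_edges, sensitive_apis):
    """Keep only functions that directly call a sensitive API, plus the calls between them."""
    call_graph = nx.DiGraph()
    call_graph.add_edges_from(call_edges)
    callers = {caller for caller, callee in call_graph.edges if callee in sensitive_apis}
    # Induced subgraph: vertices are sensitive-API-calling functions, edges are their mutual calls.
    return call_graph.subgraph(callers).copy()


def structural_features(sfcg):
    """Per-function degree centrality, closeness centrality, and clustering coefficient."""
    undirected = sfcg.to_undirected()  # assumption: ignore call direction for closeness/clustering
    degree = nx.degree_centrality(sfcg)  # (in-degree + out-degree) / (n - 1)
    closeness = nx.closeness_centrality(undirected)
    clustering = nx.clustering(undirected)
    return {
        node: {
            "degree": degree[node],
            "closeness": closeness[node],
            "clustering": clustering[node],
        }
        for node in sfcg.nodes
    }


sfcg = build_sfcg(CALL_EDGES, SENSITIVE_APIS)
features = structural_features(sfcg)
for name in sorted(features):
    print(name, features[name])
```

In a pipeline like the one described, per-function values such as these would typically be aggregated per app (for example by mean or maximum) and concatenated with the permission and Word2Vec features before being passed to the ensemble classifiers; that aggregation step is likewise an assumption here.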
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.