Experimental Study with Real-world Data for Android App Security Analysis using Machine Learning

Sankardas Roy, Jordan DeLoach, Yuping Li, Nic Herndon, Doina Caragea, Xinming Ou, Venkatesh Prasad Ranganath, Hongmin Li, Nicolais Guevara
2015, Proceedings of the 31st Annual Computer Security Applications Conference (ACSAC 2015)  
Although Machine Learning (ML) based approaches have shown promise for Android malware detection, a set of critical challenges remains unaddressed. Some of these challenges concern the proper evaluation of a detection approach, while others concern its design decisions. In this paper, we systematically study the impact of these challenges as a set of research questions (i.e., hypotheses). We design an experimentation framework in which we can reliably vary several parameters while evaluating ML-based Android malware detection approaches. The results of the experiments are then used to answer the research questions. We also demonstrate the impact of some of these challenges on existing ML-based approaches. The large (market-scale) dataset of benign and malicious apps used in these experiments represents the scale of real-world Android app security analysis. We envision that this study will encourage better evaluation strategies and better designs in future ML-based approaches for Android malware detection.
• RQ5: Does the presence of adware in the dataset affect performance?
doi:10.1145/2818000.2818038 dblp:conf/acsac/RoyDLHCORLG15 fatcat:gadjthcjsjfyjp2qcr6lbfz4u4