Should You Consider Adware as Malware in Your Study?
2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER)
Empirical validations of research approaches eventually require a curated ground truth. In studies related to Android malware, such a ground truth is built by leveraging Anti-Virus (AV) scanning reports which are often provided free through online services such as VirusTotal. Unfortunately, these reports do not offer precise information for appropriately and uniquely assigning classes to samples in app datasets: AV engines indeed do not have a consensus on specifying information in labels.
... ion in labels. Furthermore, labels often mix information related to families, types, etc. In particular, the notion of "adware" is currently blurry when it comes to maliciousness. There is thus a need to thoroughly investigate cases where adware samples can actually be associated with malware (e.g., because they are tagged as adware but could be considered as malware as well). In this work, we present a large-scale analytical study of Android adware samples to quantify to what extent "adware should be considered as malware". Our analysis is based on the Androzoo repository of 5 million apps with associated AV labels and leverages a state-of-the-art label harmonization tool to infer the malicious type of apps before confronting it against the ad families that each adware app is associated with. We found that all adware families include samples that are actually known to implement specific malicious behavior types. Up to 50% of samples in an ad family could be flagged as malicious. Overall the study demonstrates that adware is not necessarily benign.