Android malware detection with weak ground truth data

Jordan DeLoach, Doina Caragea, Xinming Ou
2016 2016 IEEE International Conference on Big Data (Big Data)  
For Android malware detection, precise ground truth is a rare commodity.  ...  Our work is focused on approaches for learning classifiers for Android malware detection in a manner that is methodologically sound with regard to the uncertain and ever-changing ground truth in the problem  ...  ACKNOWLEDGMENT Part of the computing for this project was performed on the Beocat Research Cluster at Kansas State University, which is funded in part by the NSF grants MRI-1429316 and CC-IIE-1440548.  ... 
doi:10.1109/bigdata.2016.7841008 dblp:conf/bigdataconf/DeLoachCO16 fatcat:em3whjy3jrfjnaym5tt6qlnoce

On the Lack of Consensus in Anti-Virus Decisions: Metrics and Insights on Building Ground Truths of Android Malware [chapter]

Médéric Hurier, Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein, Yves Le Traon
2016 Lecture Notes in Computer Science  
a malware ground truth, a foundation stone of any malware detection approach.  ...  This challenges the building of authoritative ground-truth datasets.  ...  [10] have proposed weighting techniques towards deriving better, authoritative, ground truth based on AV labels.  ... 
doi:10.1007/978-3-319-40667-1_8 fatcat:a65osdn3pvc55ku7afdq6vy2ty

Euphony: Harmonious Unification of Cacophonous Anti-Virus Vendor Labels for Android Malware

Mederic Hurier, Guillermo Suarez-Tangil, Santanu Kumar Dash, Tegawende F. Bissyande, Yves Le Traon, Jacques Klein, Lorenzo Cavallaro
2017 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR)  
In this paper, we mine Anti-Virus labels and analyze the associations between all labels given by different vendors to systematically unify common samples into family groups.  ...  On the one hand, samples are often mis-labeled as different parties use distinct naming schemes for the same sample.  ...  We also thank VirusTotal [43] for their help and to let us use their service for research purposes.  ... 
doi:10.1109/msr.2017.57 dblp:conf/msr/HurierSDBTKC17 fatcat:6eqyljqfhbej3aswtehnc34hh4

Reviewer Integration and Performance Measurement for Malware Detection [chapter]

Brad Miller, Alex Kantchelian, Michael Carl Tschantz, Sadia Afroz, Rekha Bachwani, Riyaz Faizullabhoy, Ling Huang, Vaishaal Shankar, Tony Wu, George Yiu, Anthony D. Joseph, J. D. Tygar
2016 Lecture Notes in Computer Science  
We present and evaluate a large-scale malware detection system integrating machine learning with expert reviewers, treating reviewers as a limited labeling resource.  ...  We find that using training labels obtained well after samples appear, and thus unavailable in practice for current training data, inflates measured detection by almost 20 percentage points.  ...  An alternate approach would be to withhold a vendor's labels when evaluating that vendor, effectively creating a separate ground truth for each vendor.  ... 
doi:10.1007/978-3-319-40667-1_7 fatcat:fwmxfmjtgneblbsa3zcbkcc4pe

Reviewer Integration and Performance Measurement for Malware Detection [article]

Brad Miller, Alex Kantchelian, Michael Carl Tschantz, Sadia Afroz, Rekha Bachwani, Riyaz Faizullabhoy, Ling Huang, Vaishaal Shankar, Tony Wu, George Yiu, Anthony D. Joseph, J. D. Tygar
2016 arXiv   pre-print
We present and evaluate a large-scale malware detection system integrating machine learning with expert reviewers, treating reviewers as a limited labeling resource.  ...  We find that using training labels obtained well after samples appear, and thus unavailable in practice for current training data, inflates measured detection by almost 20 percentage points.  ...  An alternate approach would be to withhold a vendor's labels when evaluating that vendor, effectively creating a separate ground truth for each vendor.  ... 
arXiv:1510.07338v2 fatcat:kr6r3uocrjgulcfwme4gxenyi4

VAMO

Roberto Perdisci, ManChon U
2012 Proceedings of the 28th Annual Computer Security Applications Conference on - ACSAC '12  
Previous studies propose to evaluate malware clustering results by leveraging the labels assigned to the malware samples by multiple anti-virus scanners (AVs).  ...  In fact, clustering can be viewed as an unsupervised learning process over a dataset for which the complete ground truth is usually not available.  ...  While "anti-malware" is probably a more appropriate term, we use "anti-virus" because that is the way in which many vendors of malware scanners and defense solutions still advertise their products.  ... 
doi:10.1145/2420950.2420999 dblp:conf/acsac/PerdisciU12 fatcat:ltosjjf5sfccnl6todxskyihbu

Familial Clustering For Weakly-labeled Android Malware Using Hybrid Representation Learning

Yanxin Zhang, Yulei Sui, Shirui Pan, Zheng Zheng, Baodi Ning, Ivor Tsang, Wanlei Zhou
2019 IEEE Transactions on Information Forensics and Security  
ground-truth samples.  ...  We have evaluated our approach using 5,416 ground-truth malware from Drebin and 9,000 malware from VIRUSSHARE (uploaded between Mar. 2017 and Feb. 2018), consisting of 3324 weakly-labeled malware.  ...  We would like to thank the anonymous reviewers for their helpful comments.  ... 
doi:10.1109/tifs.2019.2947861 fatcat:yps5spdsyresnepfjqi4kz236m

Malytics: A Malware Detection Scheme

Mahmood Yousefi-Azar, Len Hamey, Vijay Varadharajan, Shiping Chen
2018 IEEE Access  
Besides good precision and recognition rate, a malware detection scheme needs to be able to generalize well for novel malware families (a.k.a zero-day attacks).  ...  Malytics outperforms a wide range of learning-based techniques and also individual state-of-the-art models on both platforms.  ...  anti-virus vendors.  ... 
doi:10.1109/access.2018.2864871 fatcat:weigfkfppfd2jf6666s7ihugvu

Learning from Context: Exploiting and Interpreting File Path Information for Better Malware Detection [article]

Adarsh Kyadige, Ethan M. Rudd, Konstantin Berlin
2019 arXiv   pre-print
Machine learning (ML) used for static portable executable (PE) malware detection typically employs per-file numerical feature vector representations as input with one or more target labels during training  ...  We find that our model learns useful aspects of the file path for classification, while also learning artifacts from customers testing the vendor's product, e.g., by downloading a directory of malware  ...  and 100 with a benign ground truth label.  ... 
arXiv:1905.06987v1 fatcat:e46n6oxt6raidbiwu2pohhiwby

Beyond Labeling: Using Clustering to Build Network Behavioral Profiles of Malware Families [chapter]

Azqa Nadeem, Christian Hammerschmidt, Carlos H. Gañán, Sicco Verwer
2020 Malware Analysis Using Artificial Intelligence and Deep Learning  
We introduce temporal heatmaps, a data-driven and visualization-based cluster evaluation method that requires no ground truth; 4.  ...  We show the behavioral relationships between malwares using a Directed Acyclic Graph, which also uncovers discrepancies between behavioral clusters and traditional family labels;  ...  [37] propose a method to find inconsistencies in malware family labels generated by Anti-Virus (AV) scanners. Mohaisen et al.  ... 
doi:10.1007/978-3-030-62582-5_15 fatcat:v5kchsi3x5fjlilrlwv56tavee

Detecting Malware with Information Complexity [article]

Nadia Alshahwan and Earl T. Barr and David Clark and George Danezis
2015 arXiv   pre-print
We compare our results to the results of applying the 59 anti-malware programs used on the VirusTotal web site to our malware.  ...  Given a zoo of labelled malware and benign-ware, we ask whether a suspect program is more similar to our malware or to our benign-ware.  ...  The malware profiler is intended to test samples gathered from various sources against various anti-virus vendors to establish some ground truth. then the sample, once identified as malware, is run in  ... 
arXiv:1502.07661v1 fatcat:md46ei62mnbx5dijxcrwsuvhny

Unveiling Zeus [article]

Abedelaziz Mohaisen, Omar Alrawi
2013 arXiv   pre-print
Malware family classification is an age old problem that many Anti-Virus (AV) companies have tackled. There are two common techniques used for classification, signature based and behavior based.  ...  Our main class of malware we are interested in classifying is the popular Zeus malware. For its classification we identify 65 features that are unique and robust for identifying malware families.  ...  Their ground truth relies on anti-virus reported classification which is mostly signature-based, and they do not include manual classification like in our case.  ... 
arXiv:1303.7012v1 fatcat:u3jbnduj4vcbfnvrj375ghbfxa

Malware Detection via Extended Label Propagation through Graph Inference

Yitu Fu, Ju Xu
2019 IEEE Access  
' and hosts' connections to detect malware.  ...  Also, we theoretically show that, under some mild conditions, our propagation method could reveal the actual labels of unlabeled nodes in the complete graph.  ...  Each of the buckets contains several labeled datasets that can provide ground truth for testing. F.  ... 
doi:10.1109/access.2019.2948374 fatcat:w32fdza5o5fspgywtfemqsukpq

Screening smartphone applications using malware family signatures

Jehyun Lee, Suyeon Lee, Heejo Lee
2015 Computers & security  
We evaluated our mechanism with 5846 real world Android malware samples belonging to 48 families collected in April 2014 at an anti-virus company; experimental results showed that; our mechanism achieved  ...  A conventional technique for defeating malware is the use of signature matching which is efficient from a time perspective but not very practical because of its lack of robustness against the malware variants  ...  Thus, we utilize the family classification information labeling from multiple AV vendors as the ground truth for our signature construction and evaluation.  ... 
doi:10.1016/j.cose.2015.02.003 fatcat:u65qngvo4veb7f4mzmp7eupcke

Lens on the Endpoint: Hunting for Malicious Software Through Endpoint Data Analysis [chapter]

Ahmet Salih Buyukkayhan, Alina Oprea, Zhou Li, William Robertson
2017 Lecture Notes in Computer Science  
The large majority of our findings are confirmed as malicious by anti-virus tools and manual investigation by experienced security analysts.  ...  truth (small number of malicious software available for training), and coarse-grained data collection (strict requirements are imposed on agents' performance overhead).  ...  We would like to thank Justin Lamarre, Robin Norris, Todd Leetham, and Christopher Harrington for their help with system design and evaluation of our findings, as well as Kevin Bowers and Martin Rosa for  ... 
doi:10.1007/978-3-319-66332-6_4 fatcat:td7cn42ta5b6bewh5y23eaoh24