A Survey of Binary Code Similarity [article]

Irfan Ul Haq, Juan Caballero
2019 arXiv   pre-print
The ability to compare binary code enables many real-world applications on scenarios where source code may not be available such as patch analysis, bug search, and malware detection and analysis.  ...  It analyzes 61 binary code similarity approaches, which are systematized on four aspects: (1) the applications they enable, (2) their approach characteristics, (3) how the approaches are implemented, and  ...  However, of the 24 approaches evaluated on malware, only five use packed malware samples (SMIT, BEAGLE, MUTANTX-S, CXZ2014, BINSIM).  ... 
arXiv:1909.11424v1 fatcat:dry5hbq3qjdvdnvrjaoxwoztlq

The rise of machine learning for detection and classification of malware: Research developments, trends and challenges

Daniel Gibert, Carles Mateu, Jordi Planes
2020 Journal of Network and Computer Applications  
Current state-of-the-art research focus on the development and application of machine learning techniques for malware detection due to its ability to keep pace with malware evolution.  ...  The main contributions of the paper are: (1) it provides a complete description of the methods and features in a traditional machine learning workflow for malware detection and classification, (2) it explores  ...  MutantX-S improves the scalability on handling very large numbers of malware with high-dimensional features by applying a hashing trick and a close-to-linear clustering algorithm.  ... 
doi:10.1016/j.jnca.2019.102526 fatcat:3bf6afjqpnb53eoeghfxjeaus4

On the Impact of Sample Duplication in Machine-Learning-Based Android Malware Detection

Yanjie Zhao, Li Li, Haoyu Wang, Haipeng Cai, Tegawendé F. Bissyandé, Jacques Klein, John Grundy
2021 ACM Transactions on Software Engineering and Methodology  
., malware family clustering).  ...  Our experimental results reveal that duplication in published datasets has a limited impact on supervised malware classification models.  ...  [29] designed and implemented a framework, namely MutantX-S, to cluster samples into families based on code instruction sequences efficiently.  ... 
doi:10.1145/3446905 fatcat:dyxzy3riabew5m7fg5cyj3w5ky

Unleashing the Hidden Power of Compiler Optimization on Binary Code Difference: An Empirical Study [article]

Xiaolei Ren, Michael Ho, Jiang Ming, Yu Lei, Li Li
2021 arXiv   pre-print
We tailor search-based iterative compilation for the auto-tuning of binary code differences.  ...  We run BinTuner with GCC 10.2 and LLVM 11.0 on SPEC benchmarks (CPU2006 & CPU2017), Coreutils, and OpenSSL.  ...  We also thank VirusTotal for providing the academic API and malware samples. This research was supported by the National Science Foundation (NSF) under grant CNS-1850434.  ... 
arXiv:2103.12357v2 fatcat:7e2jq53kijc4nlayfa4ljqc2um

Guilt by association

Acar Tamersoy, Kevin Roundy, Duen Horng Chau
2014 Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '14  
that often appear together on machines.  ...  We present Aesop, a scalable algorithm that identifies malicious executable files by applying Aesop's moral that "a man is known by the company he keeps."  ...  Symantec's MutantX-S [15] system clusters executables according to their static and dynamic properties.  ... 
doi:10.1145/2623330.2623342 dblp:conf/kdd/TamersoyRC14 fatcat:wiivg6ad4zcbbi26ndncghpr24

Secure Learning In Adversarial Environments

Bo Li
2020 Proceedings of the 1st ACM Workshop on Security and Privacy on Artificial Intelligence  
] systems) or have relied exclusively on static or dynamic features derived from the file itself (e.g., MutantX-S [95]).  ...  This baseline is similar to prior work in malware classification based on static features [95].  ... 
doi:10.1145/3385003.3410927 fatcat:jejk6x63hfhnxfzbddpavtsmsy

Precise system-wide concatic malware unpacking [article]

David Korczynski
2019 arXiv   pre-print
The problem has received much attention, and so far, solutions based on dynamic analysis have been the most successful.  ...  Minerva introduces a unified approach to precisely uncover execution waves in a packed malware sample and produce PE files that are well-suited for follow-up static analysis.  ...  automatic unpackers rely on third-party applications to perform  ...  MutantX-S: Scalable Malware Clustering Based on Static Features.  ... 
arXiv:1908.09204v1 fatcat:r7ivlcxpj5h5lp2jnatld5dfyq

Toward Metric Indexes for Incremental Insertion and Querying [article]

Edward Raff, Charles Nicholas
2018 arXiv   pre-print
This use-case is inspired by a real-life need in malware analysis triage, and is surprisingly understudied.  ...  Existing literature tends to either focus on only final query efficiency, often does not support incremental insertion, or does not support arbitrary distance metrics.  ...  MutantX-S: Scalable Malware Clustering Based on Static Features. In Presented as part of the 2013 USENIX Annual Technical Conference (USENIX ATC 13), pages 187–198, San Jose, CA, 2013. USENIX.  ... 
arXiv:1801.05055v1 fatcat:cdzqxoypwndp7b4qg6uiqvga7q

jTrans: Jump-Aware Transformer for Binary Code Similarity [article]

Hao Wang, Wenjie Qu, Gilad Katz, Wenyu Zhu, Zeyu Gao, Han Qiu, Jianwei Zhuge, Chao Zhang
2022 arXiv   pre-print
Evaluation results show that jTrans outperforms state-of-the-art (SOTA) approaches on this more challenging dataset by 30.5% (i.e., from 32.0% to 62.5%).  ...  In this study, we propose a novel Transformer-based approach, namely jTrans, to learn representations of binary code.  ...  Methods such as BinClone [19], ILine [34], MutantX-S [31], BinSign [47], and Kam1n0 [13] use categorized operands or instructions as static features for the computation of binary similarity.  ... 
arXiv:2205.12713v1 fatcat:oloboad7gfenrbnqo4hibabywi