A Survey of Binary Code Similarity
[article]
2019
arXiv
pre-print
The ability to compare binary code enables many real-world applications in scenarios where source code may not be available, such as patch analysis, bug search, and malware detection and analysis. ...
It analyzes 61 binary code similarity approaches, which are systematized on four aspects: (1) the applications they enable, (2) their approach characteristics, (3) how the approaches are implemented, and ...
However, of the 24 approaches evaluated on malware, only five use packed malware samples (SMIT, BEAGLE, MUTANTX-S, CXZ2014, BINSIM). ...
arXiv:1909.11424v1
fatcat:dry5hbq3qjdvdnvrjaoxwoztlq
The rise of machine learning for detection and classification of malware: Research developments, trends and challenges
2020
Journal of Network and Computer Applications
Current state-of-the-art research focuses on the development and application of machine learning techniques for malware detection due to their ability to keep pace with malware evolution. ...
The main contributions of the paper are: (1) it provides a complete description of the methods and features in a traditional machine learning workflow for malware detection and classification, (2) it explores ...
MutantX-S improves scalability in handling very large numbers of malware samples with high-dimensional features by applying a hashing trick and a close-to-linear clustering algorithm. ...
doi:10.1016/j.jnca.2019.102526
fatcat:3bf6afjqpnb53eoeghfxjeaus4
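The hashing trick mentioned in the MutantX-S snippet above can be illustrated with a minimal sketch. It is not the MutantX-S implementation: the function name `hash_features`, the choice of MD5 as the hash, the vector size, and the sample n-gram strings are all assumptions made for illustration. The idea is only that an arbitrarily large feature vocabulary is folded into a fixed-length vector by hashing each feature name to an index.

```python
import hashlib

def hash_features(feature_counts, dim=1024):
    """Fold a sparse {feature_name: count} mapping into a fixed-length vector.

    Hypothetical helper: each feature name is hashed to a bucket index,
    and colliding features simply add their counts together.
    """
    vec = [0] * dim
    for feature, count in feature_counts.items():
        digest = hashlib.md5(feature.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % dim
        vec[index] += count
    return vec

# Made-up opcode n-gram counts for one sample (illustrative only).
sample = {"push|mov|sub": 3, "mov|call|test": 1}
hashed = hash_features(sample, dim=16)
```

The output length depends only on `dim`, not on how many distinct features exist across the corpus, which is what makes the downstream clustering tractable for high-dimensional inputs. The trade-off is that collisions merge unrelated features.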
On the Impact of Sample Duplication in Machine-Learning-Based Android Malware Detection
2021
ACM Transactions on Software Engineering and Methodology
., malware family clustering). ...
Our experimental results reveal that duplication in published datasets has a limited impact on supervised malware classification models. ...
[29] designed and implemented a framework, namely MutantX-S, to cluster samples into families based on code instruction sequences efficiently. ...
doi:10.1145/3446905
fatcat:dyxzy3riabew5m7fg5cyj3w5ky
Unleashing the Hidden Power of Compiler Optimization on Binary Code Difference: An Empirical Study
[article]
2021
arXiv
pre-print
We tailor search-based iterative compilation for the auto-tuning of binary code differences. ...
We run BinTuner with GCC 10.2 and LLVM 11.0 on SPEC benchmarks (CPU2006 & CPU2017), Coreutils, and OpenSSL. ...
We also thank VirusTotal for providing the academic API and malware samples. This research was supported by the National Science Foundation (NSF) under grant CNS-1850434. ...
arXiv:2103.12357v2
fatcat:7e2jq53kijc4nlayfa4ljqc2um
Guilt by association
2014
Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '14
that often appear together on machines. ...
We present Aesop, a scalable algorithm that identifies malicious executable files by applying Aesop's moral that "a man is known by the company he keeps." ...
Symantec's MutantX-S [15] system clusters executables according to their static and dynamic properties. ...
doi:10.1145/2623330.2623342
dblp:conf/kdd/TamersoyRC14
fatcat:wiivg6ad4zcbbi26ndncghpr24
Secure Learning In Adversarial Environments
2020
Proceedings of the 1st ACM Workshop on Security and Privacy on Artificial Intelligence
] systems) or have relied exclusively on static or dynamic features derived from the file itself (e.g., MutantX-S [95]). ...
This baseline is similar to prior work in malware classification based on static features [95]. ...
doi:10.1145/3385003.3410927
fatcat:jejk6x63hfhnxfzbddpavtsmsy
Precise system-wide concatic malware unpacking
[article]
2019
arXiv
pre-print
The problem has received much attention, and so far, solutions based on dynamic analysis have been the most successful. ...
Minerva introduces a unified approach to precisely uncover execution waves in a packed malware sample and produce PE files that are well-suited for follow-up static analysis. ...
automatic unpackers rely on third-party applications to perform ...
MutantX-S: Scalable Malware Clustering Based on Static Features. ...
arXiv:1908.09204v1
fatcat:r7ivlcxpj5h5lp2jnatld5dfyq
Toward Metric Indexes for Incremental Insertion and Querying
[article]
2018
arXiv
pre-print
This use-case is inspired by a real-life need in malware analysis triage, and is surprisingly understudied. ...
Existing literature tends to either focus on only final query efficiency, often does not support incremental insertion, or does not support arbitrary distance metrics. ...
MutantX-S: Scalable Malware Clustering Based on Static Features. In Presented as part of the 2013 USENIX Annual Technical Conference (USENIX ATC 13), pages 187–198, San Jose, CA, 2013. USENIX. ...
arXiv:1801.05055v1
fatcat:cdzqxoypwndp7b4qg6uiqvga7q
jTrans: Jump-Aware Transformer for Binary Code Similarity
[article]
2022
arXiv
pre-print
Evaluation results show that jTrans outperforms state-of-the-art (SOTA) approaches on this more challenging dataset by 30.5% (i.e., from 32.0% to 62.5%). ...
In this study, we propose a novel Transformer-based approach, namely jTrans, to learn representations of binary code. ...
Methods such as BinClone [19], ILine [34], MutantX-S [31], BinSign [47], and Kam1n0 [13] use categorized operands or instructions as static features for the computation of binary similarity. ...
arXiv:2205.12713v1
fatcat:oloboad7gfenrbnqo4hibabywi
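The jTrans snippet above refers to using categorized operands or instructions as static features. A minimal sketch of that general idea follows. It does not reproduce the scheme of any tool named in the snippet: the category labels (`REG`, `MEM`, `IMM`, `OTHER`), the small x86 register set, and the helper names are assumptions chosen for illustration.

```python
import re

# Assumed, deliberately incomplete set of 32-bit x86 register names.
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}

def categorize_operand(operand):
    """Map a concrete operand to a coarse category label."""
    op = operand.strip().lower()
    if op in REGISTERS:
        return "REG"
    if op.startswith("[") and op.endswith("]"):
        return "MEM"
    if re.fullmatch(r"0x[0-9a-f]+|\d+", op):
        return "IMM"
    return "OTHER"

def normalize(instruction):
    """Replace each operand of an Intel-syntax instruction with its category."""
    mnemonic, _, rest = instruction.strip().partition(" ")
    operands = [categorize_operand(o) for o in rest.split(",")] if rest.strip() else []
    return " ".join([mnemonic.lower()] + operands)

def ngrams(items, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]

code = ["mov eax, [ebp+8]", "add eax, 0x10", "push eax", "ret"]
normalized = [normalize(ins) for ins in code]
features = ngrams(normalized, 2)
```

Normalizing operands this way makes two code fragments that differ only in register allocation or constant values yield the same feature sequence, which is the usual motivation for this style of static feature.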