Augmenting Decompiler Output with Learned Variable Names and Types [article]

Qibin Chen and Jeremy Lacomis and Edward J. Schwartz and Claire Le Goues and Graham Neubig and Bogdan Vasilescu
2021 arXiv   pre-print
In this paper we present DIRTY (DecompIled variable ReTYper), a novel technique for improving the quality of decompiler output that automatically generates meaningful variable names and types.  ...  Decompilers are able to deterministically reconstruct structural properties of code, but comments, variable names, and custom variable types are technically impossible to recover.  ...  DIRTY takes a decompiled function as input, and outputs probable names and types for all of its variables.  ... 
arXiv:2108.06363v1 fatcat:up2d6ciynnhevok5cracznn6yq

Neutron: an attention-based neural decompiler

Ruigang Liang, Ying Cao, Peiwei Hu, Kai Chen
2021 Cybersecurity  
Decompilation plays a vital role in the cyberspace security fields such as software vulnerability discovery and analysis, malicious code detection and analysis, and software engineering fields such as  ...  Unfortunately, the existing decompilers mainly rely on experts to write rules, which leads to bottlenecks such as low scalability, development difficulties, and long cycles.  ...  Authors' contributions All authors have contributed to this manuscript and approve of this submission. Ruigang Liang and Ying Cao participated in all the work and drafting the article.  ... 
doi:10.1186/s42400-021-00070-0 fatcat:5362toppdbcb5gyr2ofs2o7rku

DIRE: A Neural Approach to Decompiled Identifier Naming [article]

Jeremy Lacomis, Pengcheng Yin, Edward J. Schwartz, Miltiadis Allamanis, Claire Le Goues, Graham Neubig, Bogdan Vasilescu
2019 arXiv   pre-print
Decompilers can reconstruct much of the information that is lost during the compilation process (e.g., structure and type information).  ...  We propose the Decompiled Identifier Renaming Engine (DIRE), a novel probabilistic technique for variable name recovery that uses both lexical and structural information recovered by the decompiler.  ...  This material is based upon work supported in part by the Software Engineering Institute (LINE project 6-18-001) and National Science Foundation (awards 1815287 and 1910067).  ... 
arXiv:1909.09029v2 fatcat:ijuhryanpbbnnd27bnosfuseou

Sum-Product Network Decompilation [article]

Cory J. Butz, Jhonatan S. Oliveira, Robert Peharz
2020 arXiv   pre-print
Secondly, the output BN produced by SPN2BN can be precisely characterized with respect to a compiled BN.  ...  Most significantly, the BNs returned by SPN2BN are minimal independence-maps that are more parsimonious with respect to the introduction of latent variables.  ...  A decompilation algorithm for SPNs should account for this fact, and generate BNs with a plausible set of latent variables.  ... 
arXiv:1912.10092v2 fatcat:acuuv47yvfcqrd2enqv332fmeq

Towards Neural Decompilation [article]

Omer Katz, Yuval Olshaker, Yoav Goldberg, Eran Yahav
2019 arXiv   pre-print
We used our framework to decompile both LLVM IR and x86 assembly to C code with high success rates.  ...  We present a novel approach to decompilation based on neural machine translation. The main idea is to automatically learn a decompiler from a given compiler.  ...  We implemented an instance of our framework for decompiling LLVM IR and x86 assembly to C. We evaluated these instances on randomly generated inputs with high success rates.  ... 
arXiv:1905.08325v1 fatcat:nmd45uzjrbfqpj2meaxyu2u2ge

Finding Inlined Functions in Optimized Binaries [article]

Toufique Ahmed, Premkumar Devanbu, Anand Ashok Sawant
2021 arXiv   pre-print
Decompilation involves several desirable steps, including recreating source-language constructions, variable names, and perhaps even comments.  ...  We augment this large but limited labeled dataset with a pre-training step, which learns the decompiled code statistics from a much larger unlabeled dataset.  ...  ACKNOWLEDGMENTS We gratefully acknowledge support from NSF CISE (SHF) Grant No. 1414172, and from Sandia National Laboratories.  ... 
arXiv:2103.05221v1 fatcat:5sqqyyyiczdttli6fvz5apcu6q

Neural Reverse Engineering of Stripped Binaries using Augmented Control Flow Graphs [article]

Yaniv David, Uri Alon, Eran Yahav
2020 arXiv   pre-print
The main idea is to use static analysis to obtain augmented representations of call sites; encode the structure of these call sites using the control-flow graph (CFG) and finally, generate a target name  ...  We present a novel approach for predicting procedure names in stripped executables. Our approach combines static analysis with neural models.  ...  ACKNOWLEDGEMENTS We would like to thank Jingxuan He and Martin Vechev for their help in running Debin, and Emery Berger for his useful advice.  ... 
arXiv:1902.09122v4 fatcat:26kcqnoc7vbthcmovombp76ijm

Applications of Graph Integration to Function Comparison and Malware Classification [article]

Michael A. Slawinski, Andy Wortman
2019 arXiv   pre-print
The median time needed for decompilation and scoring was 24ms.  ...  The result is a fast, intuitive, and easy-to-compute glass-box vectorization scheme, which can be leveraged for training a standalone classifier or to augment an existing feature space.  ...  ACKNOWLEDGMENT The authors would like to thank former colleague Brian Wallace for both deduplicating our .NET corpus and applying the decompiler at scale.  ... 
arXiv:1810.04789v6 fatcat:6q7ulju2vbajzimul7t2ih65fe

From MinX to MinC: semantics-driven decompilation of recursive datatypes

Ed Robbins, Andy King, Tom Schrijvers
2016 Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages - POPL 2016  
We formalise and implement the approach for reversing MINX, an abstraction of x86, to MINC, a type-safe dialect of C with recursive datatypes.  ...  Moreover, the approach effectively yields a type-directed decompiler.  ...  Acknowledgments This work was supported by grant EP/K031929/1 funded by GCHQ in association with EPSRC, and partly funded by the Flemish Fund for Scientific Research (FWO).  ... 
doi:10.1145/2837614.2837633 dblp:conf/popl/RobbinsKS16 fatcat:xgykohvkovdlhalngrcqjufz7u

From MinX to MinC: semantics-driven decompilation of recursive datatypes

Ed Robbins, Andy King, Tom Schrijvers
2016 SIGPLAN notices  
We formalise and implement the approach for reversing MINX, an abstraction of x86, to MINC, a type-safe dialect of C with recursive datatypes.  ...  Moreover, the approach effectively yields a type-directed decompiler.  ...  Acknowledgments This work was supported by grant EP/K031929/1 funded by GCHQ in association with EPSRC, and partly funded by the Flemish Fund for Scientific Research (FWO).  ... 
doi:10.1145/2914770.2837633 fatcat:ifszk325ove2bprmsqni6ge7zm

When Coding Style Survives Compilation: De-anonymizing Programmers from Executable Binaries [article]

Aylin Caliskan, Fabian Yamaguchi, Edwin Dauber, Richard Harang, Konrad Rieck, Rachel Greenstadt, Arvind Narayanan
2016 arXiv   pre-print
Many distinguishing features present in source code, e.g. variable names, are removed in the compilation process, and compiler optimization may alter the structure of a program, further obscuring features  ...  We examine programmer de-anonymization from the standpoint of machine learning, using a novel set of features that include ones obtained by decompiling the executable binary to source code.  ...  Army Research Office) Grant W911NF-15-2-0055 and AWS in Education Research Grant award.  ... 
arXiv:1512.08546v2 fatcat:xpqswbzugnae3lg5kiq3k7vkgu

Searching a Database of Source Codes Using Contextualized Code Search [article]

Rohan Mukherjee, Swarat Chaudhuri, Chris Jermaine
2020 arXiv   pre-print
We cast contextualized code search as a learning problem, where the goal is to learn a distribution function computing the likelihood that each database code completes the program, and propose a neural  ...  We assume a database containing a large set of program source codes and consider the problem of contextualized code search over that database.  ...  (8) Return type of method with the missing code. (9) Formal parameter list of method with missing code; including formal parameter type and name, split using camel-case. (10) Set of types within method  ... 
arXiv:2001.03277v1 fatcat:okkyy6vvfvd6dbtiotk2vjxes4

Feature-level Malware Obfuscation in Deep Learning [article]

Keith Dillon
2020 arXiv   pre-print
We consider the problem of detecting malware with deep learning models, where the malware may be combined with significant amounts of benign code.  ...  Hence we focus on the use of static features, particularly Intents, Permissions, and API calls, which we presume cannot be ultimately hidden from the Android system, but only augmented with yet more such  ...  Network Architecture We ultimately used a network consisting of 20 dense layers with 1024 nodes each, plus a single-node output layer to form the binary classifier.  ... 
arXiv:2002.05517v1 fatcat:bwknouci35d6vb6zvvkro3yf5i

A new technique for intent elicitation in Android applications

Mohamed A. El-Zawawy
2019 Iran Journal of Computer Science  
The paper shows comparisons between results obtained by IntGet and those obtained by Androguard.  ...  IntGet was implemented and tested on 359461 smali files of 40 applications. The experimental results revealed that IntGet can be used for designing efficient malware detection methods.  ...  ICCDetector is a machine learning method that needs training using groups of benign and malicious applications. Salvia [14] augmented the static analyzer Julia with an intent analysis.  ... 
doi:10.1007/s42044-019-00032-3 fatcat:m55veikkf5fq7amiqbdhuuqwzi

Enhancing Fidelity of Description in Android Apps with Category-based Common Permissions

Zhiqiang Wu, Xin Chen, Muhammad Umair Khan, Scott Uk-Jin Lee
2021 IEEE Access  
The last hidden h_l in the Bi-LSTM layer concatenates with the outputs of the attention and pooling layers as its output.  ...  Y (x ∈ X and y ∈ Y), respectively, and p(x, y) is the joint probability for random variables X and Y.  ... 
doi:10.1109/access.2021.3100118 fatcat:tluxtdazmzailpeq2bmqk2yo4q