Nightingale: Translating Embedded VM Code in x86 Binary Executables [chapter]

Xie Haijiang, Zhang Yuanyuan, Li Juanru, Gu Dawu
2017 Lecture Notes in Computer Science  
Code protection schemes nowadays adopt language embedding, a technique in which a customized language is built within a general-purpose one, often referred to as the host language, to obfuscate original code through transforming it into a customized form with which the analyst is not familiar. The transformed code is then interpreted by a so-called Embedded VM. This type of transformation does increase the cost of code comprehending and maintaining, and introduces extra runtime overhead. In
more » ... paper, we conduct an in-depth study on embedded VM based code protection and propose a de-obfuscation approach that aims to recover the original code form. Our approach first pinpoints the interpretation procedure and partitions handlers of the embedded VM, and then employs a VM-state based handler translating, which represents the VM-state-updated behaviors of handlers. Finally, the translated operations of each handler is optimized and transformed into host code. After this process, we can obtain a clear and runtime efficient code representation. We build Nightingale, a binary translation tool, to fulfil this de-obfuscation automatically with x86 binary executables. We test our approach on the latest commercial code obfuscators, embedded domainspecific languages and a set of home brewed obfuscation schemes. The results demonstrate that this kind of obfuscated code can be simplified with host language effectively.
doi:10.1007/978-3-319-69659-1_21 fatcat:q6o6e5kb4vcvxghcerwecffvlq