Automatically Locating ARM Instructions Deviation between Real Devices and CPU Emulators [article]

Muhui Jiang, Tianyi Xu, Yajin Zhou, Yufeng Hu, Ming Zhong, Lei Wu, Xiapu Luo, Kui Ren
2021 arXiv   pre-print
Emulator is widely used to build dynamic analysis frameworks due to its fine-grained tracing capability, full system monitoring functionality, and scalability of running on different operating systemsand architectures. However, whether the emulator is consistent with real devices is unknown. To understand this problem, we aim to automatically locate inconsistent instructions, which behave differently between emulators and real devices. We target ARM architecture, which provides machine readable
more » ... specification. Based on the specification, we propose a test case generator by designing and implementing the first symbolic execution engine for ARM architecture specification language (ASL). We generate 2,774,649 representative instruction streams and conduct differential testing with these instruction streams between four ARM real devices in different architecture versions (i.e., ARMv5, ARMv6, ARMv7-a, and ARMv8-a) and the state-of-the-art emulators (i.e., QEMU). We locate 155,642 inconsistent instruction streams, which cover 30% of all instruction encodings and 47.8% of the instructions. We find undefined implementation in ARM manual and implementation bugs of QEMU are the major causes of inconsistencies. Furthermore, we discover four QEMU bugs, which are confirmed and patched by thedevelopers, covering 13 instruction encodings including the most commonly used ones (e.g.,STR,BLX). With the inconsistent instructions, we build three security applications and demonstrate thecapability of these instructions on detecting emulators, anti-emulation, and anti-fuzzing.
arXiv:2105.14273v2 fatcat:7apz3eyr5vf2jip4pk5a67ehhq