Identifying Ubiquitious Third-Party Libraries in Compiled Executables Using Annotated and Translated Disassembled Code with Supervised Machine Learning

Jedediah Haile, Sage Havens
2020 2020 IEEE Security and Privacy Workshops (SPW)  
The size and complexity of the software ecosystem is a major challenge for vendors, asset owners and cybersecurity professionals who need to understand the security posture of these systems. Annotated and Translated Disassembled Code is a graph based datastore designed to organize firmware and software analysis data across builds, packages and systems, providing a highly scalable platform enabling automated binary software analysis tasks including corpora construction and storage for machine
more » ... rning. This paper describes an approach for the identification of ubiquitous third-party libraries in firmware and software using Annotated and Translated Disassembled Code and supervised machine learning. Annotated and Translated Disassembled Code provide matched libraries, function names and addresses of previously unidentified code in software as it is being automatically analyzed. This data can be ingested by other software analysis tools to improve accuracy and save time. Defenders can add the identified libraries to their vulnerability searches and add effective detection and mitigation into their operating environment.
doi:10.1109/spw50608.2020.00042 fatcat:5lh62vq4cjgxxmdrreyz7f3ewi