An executable formal semantics of PHP with applications to program analysis

Daniele Filaretti, Sergio Maffeis, Engineering And Physical Sciences Research Council
2016
Nowadays, many important activities in our lives involve the web. However, the software and protocols on which web applications are based were not designed with the appropriate level of security in mind. Many web applications have reached a level of complexity for which testing, code reviews and human inspection are no longer sufficient quality-assurance guarantees. Tools that employ static analysis techniques are needed in order to explore all possible execution paths through an application
more » ... guarantee the absence of undesirable behaviours. To make sure that an analysis captures the properties of interest, and to navigate the trade-offs between efficiency and precision, it is necessary to base the design and the development of static analysis tools on a firm understanding of the language to be analysed. When this underlying knowledge is missing or erroneous, tools can't be trusted no matter what advanced techniques they use to perform their task. In this Thesis, we introduce KPHP, the first executable formal semantics of PHP, one of the most popular languages for server-side web programming. Then, we demonstrate its practical relevance by developing two verification tools, of increasing complexity, on top of it - a simple verifier based on symbolic execution and LTL model checking and a general purpose, fully configurable and extensible static analyser based on Abstract Interpretation. Our LTL-based tool leverages the existing symbolic execution and model checking support offered by K, our semantics framework of choice, and constitutes a first proof-of-concept of the usefulness of our semantics. Our abstract interpreter, on the other hand, represents a more significant and novel contribution to the field of static analysis of dynamic scripting languages (PHP in particular). Although our tool is still a prototype and therefore not well suited for handling large real-world codebases, we demonstrate how our semantics-based, principled approach to the development of verification tools has lead to the design of sta [...]
doi:10.25560/40922 fatcat:6j7bs5eydzcgjh276yxpqe3dia