Abstract Interpretation-Based Protection [chapter]

Roberto Giacobazzi
2010 Lecture Notes in Computer Science  
Hiding information means both hiding, in the sense of making it imperceptible, and obscuring, in the sense of making it incomprehensible [9]. In programming, perception and comprehension of a program's structure and behaviour are deep semantic concepts: they depend on the observer's relative degree of abstraction, which corresponds precisely to program semantics. In this tutorial we show that abstract interpretation can be used as an adequate model for developing a unifying theory of information hiding in software, by modeling observers (i.e., malicious host attackers) O as suitable abstract interpreters. An observation can be any static or dynamic interpretation of a program intended to extract properties from its semantics, and abstract interpretation [2] provides the best framework for understanding semantics at different levels of abstraction.

The long-standing experience in digital media protection by obscurity is inspiring here. It is known that practical steganography is an issue where compression methods are inefficient: "Where efficient compression is available, information hiding becomes vacuous." [1]. This means that the gain provided by compression can be used for hiding information. In contrast to cryptography, this relies strongly on understanding the supporting medium: if a source is completely understandable, i.e., it can be perfectly compressed, then steganography becomes trivial.

In programming languages, a complete understanding of semantics means that no loss of precision is introduced by approximating data and control components while analysing computations. Complete abstractions [3, 8] model precisely the complete understanding of program semantics by an approximate observer, which corresponds to the possibility of replacing, with no loss of precision, concrete computations with abstract ones: a sort of perfect semantic compressibility around a given property. This includes, for instance, both static and dynamic (via monitoring) approaches to information disclosure and reverse engineering [4]. The lack of completeness of the observer therefore corresponds to its poor understanding of program semantics, and provides the key to understanding and designing a new family of methods and tools for software steganography and obfuscation. Consider the simple statement C: x = a * b, which multiplies a and b and stores the result in x.
An automated sign analysis that replaces concrete computations with approximate ones (i.e., the rule of signs) captures, with no loss of precision, the intended sign behaviour of C, because the sign abstraction O = {+, 0, −} is complete for integer multiplication. If we replace C with O(C): x = 0; if b ≤ 0 then {a = −a; b = −b};
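The completeness claim for the sign abstraction can be checked on a small sample. The following is a minimal sketch in Python, not the author's formalisation: the names `sign`, `abs_mul`, `mul_is_complete`, `add_signs` and `SAMPLE` are hypothetical, and a check over a finite range illustrates the property rather than proving it. For contrast, it also tabulates the signs reachable by integer addition, an operation for which the same abstraction is known to lose precision.

```python
from itertools import product

# Hypothetical names, chosen for illustration only.
NEG, ZERO, POS = "-", "0", "+"


def sign(n: int) -> str:
    """Abstraction: map a concrete integer to an element of O = {+, 0, -}."""
    if n < 0:
        return NEG
    if n > 0:
        return POS
    return ZERO


def abs_mul(s: str, t: str) -> str:
    """Abstract multiplication on signs: the rule of signs."""
    if s == ZERO or t == ZERO:
        return ZERO
    return POS if s == t else NEG


def mul_is_complete(values) -> bool:
    """Check sign(a * b) == abs_mul(sign(a), sign(b)) on all sampled pairs."""
    return all(
        sign(a * b) == abs_mul(sign(a), sign(b))
        for a, b in product(values, repeat=2)
    )


def add_signs(s: str, t: str, values) -> set:
    """Collect every sign a + b can take when sign(a) == s and sign(b) == t."""
    return {
        sign(a + b)
        for a, b in product(values, repeat=2)
        if sign(a) == s and sign(b) == t
    }


# Assumed sample of concrete integers; any range containing both signs works.
SAMPLE = range(-5, 6)

if __name__ == "__main__":
    print("multiplication complete on sample:", mul_is_complete(SAMPLE))
    print("signs of (+) + (-):", sorted(add_signs(POS, NEG, SAMPLE)))
```

On this sample the abstract product always agrees with the sign of the concrete product, so an observer who sees only signs loses nothing on x = a * b. The sum of a positive and a negative value, by contrast, can take any of the three signs, so no single element of O describes it.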
doi:10.1007/978-3-642-11319-2_4 fatcat:ecyxmhfuujbl7ovgtg6ntdtb54