CBMC Path: A Symbolic Execution Retrofit of the C Bounded Model Checker
[chapter]
2019
Lecture Notes in Computer Science
Khazem-Juror.
© The Author(s) 2019 D. Beyer et al. ...
doi:10.1007/978-3-030-17502-3_13
fatcat:sw3agomoxvbhnmqt4jqavp3kte
Model Checking Boot Code from AWS Data Centers
[chapter]
2018
Lecture Notes in Computer Science
This paper describes our experience with symbolic model checking in an industrial setting. We have proved that the initial boot code running in data centers at Amazon Web Services is memory safe, an essential step in establishing the security of any data center. Standard static analysis tools cannot be easily used on boot code without modification owing to issues not commonly found in higher-level code, including memory-mapped device interfaces, byte-level memory access, and linker scripts.
doi:10.1007/978-3-319-96142-2_28
fatcat:33kkqaeezbdehpikxihtzrzez4
... paper describes automated solutions to these issues and their implementation in the C Bounded Model Checker (CBMC). CBMC is now the first source-level static analysis tool to extract the memory layout described in a linker script for use in its analysis.
1. memory-mapped input/output (MMIO) for accessing devices,
2. device behavior behind these MMIO regions,
3. byte-level memory access as the dominant form of memory access, and
4. linker scripts used during the build process.
Not handling MMIO or linker scripts results in imprecision (false positives), and not modeling device behavior is unsound (false negatives). We describe the solutions to these challenges that we developed. We implemented our solutions in the C Bounded Model Checker (CBMC) [20]. We achieve
Model checking boot code from AWS data centers
2020
Formal methods in system design
This paper describes our experience with symbolic model checking in an industrial setting. We have proved that the initial boot code running in data centers at Amazon Web Services is memory safe, an essential step in establishing the security of any data center. Standard static analysis tools cannot be easily used on boot code without modification owing to issues not commonly found in higher-level code, including memory-mapped device interfaces, byte-level memory access, and linker scripts. This
doi:10.1007/s10703-020-00344-2
fatcat:lx63mgkbyja3bfsfafo7h4s65i
... paper describes automated solutions to these issues and their implementation in the C Bounded Model Checker (CBMC). CBMC is now the first source-level static analysis tool to extract the memory layout described in a linker script for use in its analysis.
3. byte-level memory access as the dominant form of memory access, and
4. linker scripts used during the build process.
Not handling MMIO or linker scripts results in imprecision (false positives), and not modeling device behavior is unsound (false negatives). We describe the solutions to these challenges that we developed. We implemented our solutions in the C Bounded Model Checker (CBMC) [20]. We achieve soundness with CBMC by fully unrolling loops in the boot code. Our solutions automate boot code verification and require no changes to the code being analyzed. This makes our work particularly well-suited for deployment in a continuous validation environment to ensure that memory safety issues do not reappear in the code as it evolves during development. We use CBMC, but any other bit-precise, sound, automated static analysis tool could be used.
Related work. There are many approaches to finding memory safety errors in low-level code, from fuzzing [2] to static analysis [24, 30, 36, 54] to deductive verification [21, 32]. A key aspect of our work is soundness and precision in the presence of very low-level details. Furthermore, full automation is essential in our setting to operate in a continuous validation environment. This makes some form of model checking most appealing.
CBMC is a bounded model checker for C, C++, and Java programs, available on GitHub [13]. It features bit-precise reasoning, and it verifies array bounds (buffer overflows), pointer safety, arithmetic exceptions, and assertions in the code. A user can bound the model checking done by CBMC by specifying for a loop a maximum number of iterations of the loop.
CBMC can check that it is impossible for the loop to iterate more than the specified number of times by checking a loop-unwinding assertion. CBMC is sound when all loop-unwinding assertions hold. Loops in boot code typically iterate over arrays of known sizes, making it possible to choose loop unwinding limits such that all loop-unwinding assertions hold (see Sect. 5.6). BLITZ [16] or F-Soft [34] could be used in place of CBMC. SATABS [19], Ufo [3], Cascade [58], Blast [8], CPAchecker [9], Corral [31, 39, 40], and others [18, 43] might even enable unbounded verification. Our work applies to any sound, bit-precise, automated tool.
Note that boot code makes heavy use of pointers, bit vectors, and arrays, but not the heap. Thus, memory safety proof techniques based on three-valued logic [41] or separation logic as in [7] or other techniques [1, 22] that focus on the heap are less appropriate since boot code mostly uses simple arrays.
KLEE [12] is a symbolic execution engine for C that has been used to find bugs in firmware. Davidson et al. [25] built the tool FIE on top of KLEE for detecting bugs in firmware programs for the MSP430 family of microcontrollers for low-power platforms, and applied the tool to nearly a hundred open source firmware programs for nearly a dozen versions of the microcontroller to find bugs like buffer overflow and writing to read-only memory. Corin and Manzano [23] used KLEE to do taint analysis and prove confidentiality and integrity properties. KLEE and other tools like SMACK [49] based on the LLVM intermediate representation do not currently support the linker scripts that are a crucial part of building boot code (see Sect. 4.5). They support partial linking by concatenating object files and resolving symbols, but fail to make available to their analysis the addresses and constants assigned to symbols in linker scripts, resulting in an imprecise analysis of the code.
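The loop-unwinding bound described above can be illustrated with a small C sketch. This is not code from the paper: the names `TABLE_LEN`, `sum_table`, and `harness` are hypothetical, and the harness assumes CBMC's conventions that the `__CPROVER__` macro is defined during analysis and that an undefined `nondet_`-prefixed function returns an arbitrary value. The loop iterates over an array of known size, so a fixed unwinding limit is enough for the loop-unwinding assertion to hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed table size, in the spirit of the known-size arrays
 * that boot code typically iterates over. */
#define TABLE_LEN 8

/* Sum the first n bytes of a fixed-size table. The loop body runs at most
 * TABLE_LEN times, whatever value n has on entry. */
uint32_t sum_table(const uint8_t table[TABLE_LEN], size_t n)
{
    uint32_t sum = 0;
    if (n > TABLE_LEN)
        n = TABLE_LEN; /* clamp so every table[i] access stays in bounds */
    for (size_t i = 0; i < n; i++)
        sum += table[i];
    return sum;
}

#ifdef __CPROVER__
/* Assumption: CBMC treats an undefined nondet_* function as returning an
 * arbitrary value of its return type. */
size_t nondet_size_t(void);

/* Hypothetical proof harness, analysed only by CBMC. A possible invocation:
 *   cbmc example.c --function harness --unwind 9 \
 *        --unwinding-assertions --bounds-check --pointer-check
 * The limit 9 is chosen above TABLE_LEN so that the unwinding assertion
 * can hold for every input. */
void harness(void)
{
    uint8_t table[TABLE_LEN]; /* uninitialised: arbitrary contents under CBMC */
    size_t n = nondet_size_t();
    uint32_t sum = sum_table(table, n);
    assert(sum <= 255u * TABLE_LEN);
}
#endif
```

If the clamp on `n` were removed, the loop would no longer have a fixed bound, and one would expect both the bounds check and the unwinding assertion to fail for some inputs. That is the situation in which a bounded analysis would otherwise miss behaviors.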
S2E [15] is a symbolic execution engine for x86 binaries built on top of the QEMU [6] virtual machine and KLEE. S2E has been used on firmware. Parvez et al. [46] use symbolic
Making data-driven porting decisions with Tuscan
2018
Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis - ISSTA 2018
Software typically outlives the platform that it was originally written for. To smooth the transition to new tools and platforms, programs should depend on the underlying platform as little as possible. In practice, however, software build processes are highly sensitive to their build platform, notably the implementation of the compiler and standard library. This makes it difficult to port existing, mature software to emerging platforms: web-based runtimes like WebAssembly, resource-constrained
doi:10.1145/3213846.3213855
dblp:conf/issta/KhazemBH18
fatcat:op3tiknygvgy3bytoxjw5gjp5m
... environments for Internet-of-Things devices, or innovative new operating systems like Fuchsia. We present Tuscan, a framework for conducting automatic, deterministic, reproducible tests on build systems. Tuscan is the first framework to solve the problem of reproducibly testing builds cross-platform at massive scale. We also wrote a build wrapper, Red, which hijacks builds to tolerate common failures that arise from platform dependence, allowing the test harness to discover errors later in the build. Authors of innovative platforms can use Tuscan and Red to test the extent of unportability in the software ecosystem, and to quantify the effort necessary to port legacy software. We evaluated Tuscan by building an operating system distribution, consisting of 2,699 Red-wrapped programs, on four platforms, yielding a 'catalog' of the most common portability errors. This catalog informs data-driven porting decisions and motivates changes to programs, build systems, and language standards; systematically quantifies problems that platform writers have hitherto discovered only on an ad-hoc basis; and forms the basis for a common substrate of portability fixes that developers can apply to their software.
Code‐level model checking in the software development workflow at Amazon Web Services
2021
Software, Practice & Experience
This article describes a style of applying symbolic model checking developed over the course of four years at Amazon Web Services (AWS). Lessons learned are drawn from proving properties of numerous C-based systems, for example, custom hypervisors, encryption code, boot loaders, and an IoT operating system. Using our methodology, we find that we can prove the correctness of industrial low-level C-based systems with reasonable effort and predictability. Furthermore, AWS developers are
doi:10.1002/spe.2949
fatcat:3sirdpatwbdxvkard4fghvs3l4
... increasingly writing their own formal specifications. As part of this effort, we have developed a CI system that allows integration of the proofs into standard development workflows and extended the proof tools to provide better feedback to users. All proofs discussed in this article are publicly available on GitHub.
Code-level model checking in the software development workflow
2020
Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Practice
This experience report describes a style of applying symbolic model checking developed over the course of four years at Amazon Web Services (AWS). Lessons learned are drawn from proving properties of numerous C-based systems, e.g., custom hypervisors, encryption code, boot loaders, and an IoT operating system. Using our methodology, we find that we can prove the correctness of industrial low-level C-based systems with reasonable effort and predictability. Furthermore, AWS developers are
doi:10.1145/3377813.3381347
dblp:conf/icse/ChongCKKMSTTT20
fatcat:hu6kttl6azbc5bjdpjggyibxki
... increasingly writing their own formal specifications. All proofs discussed in this paper are publicly available on GitHub. CCS CONCEPTS • Software and its engineering → Formal software verification; Model checking; Correctness; • Theory of computation → Program reasoning.
Automatic Verification of C and Java Programs: SV-COMP 2019
[chapter]
2019
Lecture Notes in Computer Science
Khazem U. ...
... | ... member | Affiliation
2LS [49, 61] | Peter Schrammel | U. of Sussex, UK
AProVE [34, 38] | Jera Hensel | RWTH Aachen, Germany
CBMC [46] | Michael Tautschnig | Amazon Web Services, UK
CBMC-Path [44] | Kareem ...
doi:10.1007/978-3-030-17502-3_9
fatcat:nhfizu64uzhg7e4skftvgjrbyu
Experience report: growing and shrinking polygons for random testing of computational geometry algorithms
2016
SIGPLAN notices
Acknowledgments I would like to thank James Brotherston, Lewis Griffin, Robin Hirsch, Kareem Khazem, Gilles Rainer, Reuben Rowe and other members of UCL PPLV group for the fruitful discussions and for ...
doi:10.1145/3022670.2951927
fatcat:xe6j6ezu7ffmjjimhqdnkpsiie
Experience report: growing and shrinking polygons for random testing of computational geometry algorithms
2016
Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming - ICFP 2016
Acknowledgments I would like to thank James Brotherston, Lewis Griffin, Robin Hirsch, Kareem Khazem, Gilles Rainer, Reuben Rowe and other members of UCL PPLV group for the fruitful discussions and for ...
doi:10.1145/2951913.2951927
dblp:conf/icfp/Sergey16
fatcat:lnncbunvjbg4bpskc7w4h4425y
Designing collaborative musical experiences for broad audiences
2013
Proceedings of the 9th ACM Conference on Creativity & Cognition - C&C '13
ACKNOWLEDGMENTS We would like to thank Nela Brown, Kareem Khazem, Sara Heitlinger, Richard Kelly, Colin Powell, Mark Plumbley and Irini Papadimitriou. ...
doi:10.1145/2466627.2466633
dblp:conf/candc/BenglerB13
fatcat:wjjkjjqdevgc3dib3w3p5fjfzm
Exposing errors related to weak memory in GPU applications
2016
Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation - PLDI 2016
Acknowledgments We are grateful to: John Wickerson, Pantazis Deligiannis, Kareem Khazem and Peter Sewell for feedback and insightful discussions; Jade Alglave for discussion and support during the inception ...
doi:10.1145/2908080.2908114
dblp:conf/pldi/SorensenD16
fatcat:yonegl66kvgkvlc2zq3dqhz2he
Exposing errors related to weak memory in GPU applications
2016
SIGPLAN notices
Acknowledgments We are grateful to: John Wickerson, Pantazis Deligiannis, Kareem Khazem and Peter Sewell for feedback and insightful discussions; Jade Alglave for discussion and support during the inception ...
doi:10.1145/2980983.2908114
fatcat:ienrtnlv3jdgzknlrlibtqpvmi
Symbolic Execution Game Semantics
[article]
2020
arXiv
pre-print
Byron Cook, Kareem Khazem, Daniel Kroening, Serdar Tasiran, Michael Tautschnig, and Mark R. Tuttle. Model checking boot code from AWS data centers. 25 Andrzej S. Murawski, Steven J. ...
arXiv:2002.09115v1
fatcat:jfrcxjiamra4hllh6euhyhuv2a
Inter-workgroup barrier synchronisation on graphics processing units
2019
In particular, I thank Kareem Khazem for being a constant and reliable source of support, advice, and fun. I'm sure East Acton is happy to be rid of us! ...
doi:10.25560/68006
fatcat:6ogn37ahvnci3ooke5w42bxewu