An integrated approach to achieving high software reliability
1998 IEEE Aerospace Conference Proceedings (Cat. No.98TH8339)
In this paper we address the development, testing, and evaluation schemes for software reliability, and the integration of these schemes into a unified and consistent paradigm. Specifically, techniques and tools for the three phases of software reliability engineering will be described. The three phases are (1) modeling and analysis, (2) design and implementation, and (3) testing and measurement. In the modeling and analysis phase we describe Markov modeling and fault-tree analysis techniques.
We present system-level reliability models based on these techniques, and provide modeling examples for the reliability analysis of known system architectures. We describe how reliability block diagrams can be constructed for a real-world system for reliability prediction, and how critical components can be identified from the existing architecture. We also apply fault-tree models to fault-tolerant system architectures, and formulate the resulting reliability quantities. Finally, we describe two software tools, SHARPE and UltraSAN, which are available for reliability modeling and analysis. In the design and implementation phase we show specific fault-tolerance techniques for building reliable software systems with either single-version or multiple-version software. For single-version software we form a generic platform and a set of reusable software components that perform software fault-tolerance tasks in any application executing on that platform. These components, including watchd, libft, REPL, libckp, and addrejuv, provide a powerful set of building blocks to defend against software faults at various levels of a system. We describe the concept and implementation of these techniques. In addition, we examine multiple-version systems using design diversity, including recovery blocks and N-version programming techniques.
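As an illustration of the two design-diversity schemes named above, the following is a minimal sketch and not the implementation described in the paper. All function names (`recovery_block`, `n_version_vote`, and the three integer-square-root versions) are hypothetical and chosen only for this example. A recovery block runs alternates in sequence until one passes an acceptance test; N-version programming runs all versions and accepts the strict-majority result.

```python
import math
from collections import Counter


def recovery_block(alternates, acceptance_test, x):
    """Try each alternate in order; return the first result that passes the test."""
    for alternate in alternates:
        try:
            result = alternate(x)
        except Exception:
            continue  # a crashing alternate counts as a failed attempt
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def n_version_vote(versions, x):
    """Run every version and return the result produced by a strict majority.

    Assumes results are hashable so that they can be counted.
    """
    results = []
    for version in versions:
        try:
            results.append(version(x))
        except Exception:
            pass  # a crashing version simply casts no vote
    if results:
        value, count = Counter(results).most_common(1)[0]
        if count > len(versions) // 2:
            return value
    raise RuntimeError("no majority among versions")


# Three hypothetical, independently written integer square root versions.
def isqrt_faulty(n):
    return int(n ** 0.5) + 1  # deliberately wrong: off by one


def isqrt_library(n):
    return math.isqrt(n)


def isqrt_linear(n):
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


def isqrt_acceptable(n, r):
    """Acceptance test: r is the floor of the square root of n."""
    return r * r <= n < (r + 1) * (r + 1)


if __name__ == "__main__":
    print(recovery_block([isqrt_faulty, isqrt_library], isqrt_acceptable, 17))
    print(n_version_vote([isqrt_faulty, isqrt_library, isqrt_linear], 17))
```

In this sketch the recovery block depends on the quality of its acceptance test, whereas the voter needs no such test but requires at least a majority of versions to agree. Neither function models checkpointing or state restoration, which a real recovery block would need before retrying an alternate.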