Application-layer Fault-Tolerance Protocols [article]

Vincenzo De Florio
2016 arXiv   pre-print
The central topic of this book is application-level fault-tolerance, that is the methods, architectures, and tools that allow to express a fault-tolerant system in the application software of our computers. Application-level fault-tolerance is a sub-class of software fault-tolerance that focuses on the problems of expressing the problems and solutions of fault-tolerance in the top layer of the hierarchy of virtual machines that constitutes our computers. This book shows that application-level
more » ... ult-tolerance is a key ingredient to craft truly dependable computer systems--other approaches, such as hardware fault-tolerance, operating system fault-tolerance, or fault-tolerant middleware, are also important ingredients to achieve resiliency, but they are not enough. Failing to address the application layer means leaving a backdoor open to problems such as design faults, interaction faults, or malicious attacks, whose consequences on the quality of service could be as unfortunate as, e.g., a physical fault affecting the system platform. In other words, in most cases it is simply not possible to achieve complete coverage against a given set of faults or erroneous conditions without embedding fault-tolerance provisions also in the application layer.
arXiv:1611.02273v1 fatcat:my2uj2n2hrf4ljzpmbh4zlk57q