Improving End-to-End Availability Using Overlay Networks

David G. Andersen
2018
The end-to-end availability of Internet services is between two and three orders of magnitude worse than other important engineered systems, including the US airline system, the 911 emergency response system, and the US public telephone system. This dissertation explores three systems designed to mask Internet failures, and, through a study of three years of data collected on a 31-site testbed, why these failures happen and how effectively they can be masked. A core aspect of many of the
more » ... s that interrupt end-to-end communication is that they fall outside the expected domain of well-behaved network failures. Many traditional techniques cope with link and router failures; as a result, the remaining failures are those caused by software and hardware bugs, misconfiguration, malice, or the inability of current routing systems to cope with persistent congestion. The effects of these failures are exacerbated because Internet services depend upon the proper functioning of many components—wide-area routing, access links, the domain name system, and the servers themselves—and a failure in any of them can prove disastrous to the proper functioning of the service. This dissertation describes three complementary systems to increase Internet availability in the face of such failures. Each system builds upon the idea of an overlay network, a network created dynamically between a group of cooperating Internet hosts. The first two systems, Resilient Overlay Networks (RON) and Multi-homed Overlay Networks (MONET) determine whether the Internet path between two hosts is working on an end-to-end basis. Both systems exploit the considerable redundancy available in the underlying Internet to find failure-disjoint paths between nodes, and forward traffic along a working path. RON is able to avoid 50% of the Internet outages that interrupt communication between a small group of communicating nodes. MONET is more aggressive, combining an overlay network of Web proxies with explicitly engineered redundant links to the Internet to [...]
doi:10.1184/r1/6606386 fatcat:rekenzu7pzhhnoed23qqf5kn24