A Framework for Structured Peer-to-Peer Overlay Networks
Lecture Notes in Computer Science
Structured peer-to-peer overlay networks have recently emerged as a good candidate infrastructure for building novel large-scale and robust Internet applications in which participating peers share computing resources as equals. In the past three years, various structured peer-to-peer overlay networks have been proposed, and more are probably to come. We present a framework for understanding, analyzing and designing structured peer-to-peer overlay networks. The main objective of the paper is to
... ide practical guidelines for the design of structured overlay networks by identifying a fundamental element in the construction of overlay networks: the embedding of k-ary trees. Then, a number of effective techniques for maintaining these overlay networks are discussed. The proposed framework has been effective in the development of the DKS system, whose preliminary design appears in  .
Internet, can fail at any time. To cope with this dynamism, these systems should be stabilizing: despite the high dynamism, the system should converge to legitimate configurations without external intervention.
Peer-to-peer systems are attractive in at least two respects. First, from the user standpoint, peer-to-peer computing has huge potential, as it reduces the need for the expensive back-end servers typically used to perform complex tasks. Moreover, administrative costs are significantly reduced, as peer-to-peer systems are in general built on autonomous systems, without centralized administration. Second, from the scientific perspective, peer-to-peer systems are large-scale distributed systems that involve challenging issues such as fault tolerance, scalability and security.
The current trend in building P2P systems consists in providing an application-independent overlay network as a substrate on top of which novel large-scale applications can be constructed. An overlay network is a logical network on top of one or more networks. A well-known example of such networks is the Internet. The main purpose of an overlay network is to provide an effective means by which a huge number of computing resources can be linked together and accessed. As can be seen nowadays, various high-level distributed services can be built on top of an overlay network [6, 3, 13]. The performance of these high-level distributed services strongly depends on the properties of the underlying overlay network. Two main design approaches can be identified for building overlay networks.
On the one hand, there are unstructured overlay networks [14, 11], in which peers are extremely autonomous. That is, a peer joins the overlay network by connecting itself to any existing peers. We say that unstructured overlay networks are built in an uncontrolled fashion. Unstructured overlay networks have the advantage of providing flexibility when it comes to finding resources within the system. For instance, arbitrary queries can be handled easily. However, they provide only limited guarantees: even if a data item has been inserted into the system, there is no guarantee that it will be located when needed. Furthermore, these overlay networks tend to be inefficient, as they mainly use flooding for search.
On the other hand, there are structured overlay networks [26, 23, 24, 2, 1, 19], where a peer joins the overlay network by connecting itself to some other well-defined peers, based on its logical identifier. We say that structured overlay networks are built in a controlled manner. These overlay networks provide strong guarantees but have a limited query language. For example, complex queries are not supported in a "natural" way.
In this paper, our focus is on structured overlay networks [25, 2, 24]. Hence, we will use the term overlay network to mean structured overlay network. The core service that these overlay networks provide is location-independent, virtual-identifier-based routing. That is, given a message along with a virtual identifier vid, the overlay network routes the message to the ultimate destination dest(vid), which is related to vid in a well-defined manner. We discuss the relation between vid and dest(vid) in Section 2.3.
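To make the mapping from vid to dest(vid) concrete, the following minimal Python sketch assumes a successor rule on a circular identifier space: dest(vid) is taken to be the first peer whose identifier is equal to or follows vid, wrapping around at the end of the space. This rule, the function signature and the parameter names (peer_ids, id_space) are assumptions made for illustration only; the relation actually used is the one defined in Section 2.3.

```python
from bisect import bisect_left


def dest(vid: int, peer_ids: list[int], id_space: int) -> int:
    """Return the peer responsible for vid.

    ASSUMPTION: the responsible peer is the first peer identifier that is
    equal to or follows vid on a ring of size id_space (successor rule).
    """
    ring = sorted(peer_ids)
    if not ring:
        raise ValueError("the overlay has no peers")
    # Index of the first peer identifier >= vid; wrap to the smallest
    # identifier when vid lies past the largest one.
    i = bisect_left(ring, vid % id_space)
    return ring[i % len(ring)]
```

Under this assumed rule, with peers 1, 8 and 12 in an identifier space of size 16, a message addressed to vid 5 would be delivered to peer 8, and a message addressed to vid 13 would wrap around to peer 1. The sketch computes the destination from a global view of the peers; an actual overlay reaches the same destination hop by hop, with each peer knowing only a few others.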
On top of the core service mentioned above, a number of high-level services can be built, such as Distributed Hash Tables (DHTs), location-independent one-to-one (point-to-point) communication, one-to-many communication such as broadcast [13, 10] and multicast, and object replication and caching under various consistency models.
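As a rough illustration of how a DHT can be layered on identifier-based routing, the sketch below hashes each key to a virtual identifier and stores the item at the peer the overlay routes that identifier to. Everything in it is hypothetical: the ToyOverlay class, the 16-bit identifier space, the use of SHA-1 to derive identifiers, and the successor rule used for dest are assumptions chosen to keep the example self-contained, not the design of any particular system.

```python
import hashlib
from bisect import bisect_left

ID_SPACE = 2 ** 16  # ASSUMPTION: size of the virtual identifier space


class ToyOverlay:
    """In-memory stand-in for an overlay network (hypothetical).

    route(vid) returns the local store of dest(vid); a real overlay would
    forward a message across several peers instead.
    """

    def __init__(self, peer_ids):
        self.ring = sorted(peer_ids)
        self.stores = {p: {} for p in self.ring}

    def dest(self, vid):
        # ASSUMPTION: successor rule on a ring of size ID_SPACE.
        i = bisect_left(self.ring, vid % ID_SPACE)
        return self.ring[i % len(self.ring)]

    def route(self, vid):
        return self.stores[self.dest(vid)]


def key_to_vid(key: str) -> int:
    """Map an application key to a virtual identifier by hashing it."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big") % ID_SPACE


def put(overlay: ToyOverlay, key: str, value) -> None:
    """Store value at the peer responsible for the key's identifier."""
    overlay.route(key_to_vid(key))[key] = value


def get(overlay: ToyOverlay, key: str):
    """Look the key up at the responsible peer; None if it is absent."""
    return overlay.route(key_to_vid(key)).get(key)
```

Because put and get derive the same identifier from the same key, a lookup reaches exactly the peer that holds the item, which is the kind of guarantee that flooding-based search cannot give.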