Effective Use of Multiple Random Walks in P2P Networks

Zita Maria Almeida do Vale, Carlos Ramos, Rosslin John Robles
2014 Asia-pacific Journal of Multimedia services convergent with Art Humanities and Sociology  
Millions of users now search for and download data through Peer-to-Peer (P2P) file-sharing applications such as Napster and Gnutella. Replication strategies are used to improve performance in unstructured P2P networks. Efficient and effective full-text retrieval over unstructured P2P networks was previously developed as a novel strategy to address the problems of replication strategies that are independent of query popularity. In order to support random node sampling and network ... ize estimation a lightweight DHT with an unstructured P2P overlay. However, these well-organized techniques are executed without regard to topology or network size. To overcome this problem, we propose a query algorithm based on multiple random walks that resolves queries almost as quickly as the unstructured P2P overlay method while reducing network traffic by two orders of magnitude in many cases. We also present simulation results on a distributed replication strategy.

[Fig. 1] A peer-to-peer (P2P) network in which interconnected nodes

A peer-to-peer network is designed around the notion of equal peer nodes simultaneously functioning as both "clients" and "servers" to the other nodes on the network. This model of network arrangement differs from the client-server model, where communication is usually to and from a central server. A typical example of a file transfer that uses the client-server model is the File Transfer Protocol (FTP) service, in which the client and server programs are distinct: the clients initiate the transfer, and the servers satisfy these requests.

Routing and resource discovery

Peer-to-peer networks generally implement some form of virtual overlay network on top of the physical network topology, where the nodes in the overlay form a subset of the nodes in the physical network. Data is still exchanged directly over the underlying TCP/IP network, but at the application layer peers are able to communicate with each other directly, via the logical overlay links (each of which corresponds to a path through the underlying physical network). Overlays are used for indexing and peer discovery, and they make the P2P system independent of the physical network topology. Based on how the nodes are linked to each other within the overlay network, and how resources are indexed and located, networks can be classified as unstructured or structured (or as a hybrid between the two) [2-4].
Unstructured networks

Unstructured peer-to-peer networks do not impose a particular structure on the overlay network by design; instead, they are formed by nodes that randomly form connections to each other [5]. Gnutella, Gossip, and Kazaa are examples of unstructured P2P protocols [6]. Because there is no structure globally imposed upon them, unstructured networks are easy to build and allow for localized optimizations to different regions of the overlay [7]. Also, because the role of all peers in the network is the same, unstructured networks are highly robust in the face of high rates of "churn", that is, when large numbers of peers are frequently joining and leaving the network [8][9].

[Fig. 2] Overlay network diagram for an unstructured P2P network, illustrating the ad hoc nature of the connections between nodes

However, the primary limitations of unstructured networks also arise from this lack of structure. In particular, when a peer wants to find a desired piece of data in the network, the search query must be flooded through the network to find as many peers as possible that share the data. Flooding causes a very high amount of signaling traffic in the network and uses more CPU and memory (because every peer must process all search queries), yet it does not ensure that search queries will always be resolved. Furthermore, since there is no correlation between a peer and the content managed by it, there is no guarantee that flooding will find a peer that has the desired data. Popular content is likely to be available at several peers, so any peer searching for it is likely to find it. But if a peer is looking for rare data shared by only a few other peers, then it is highly unlikely that the search will be successful [10].
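
The trade-off between flooding and the multiple-random-walk search proposed in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the algorithm evaluated in this paper: the overlay generator `make_overlay`, the functions `flood` and `multi_random_walk`, and all parameter values (overlay size, degree, TTL, number of walkers) are hypothetical choices made for illustration. Each function returns whether the item was found and how many messages were sent.

```python
import random
from collections import deque


def make_overlay(n, degree, rng):
    """Build a connected, undirected overlay: a ring plus random extra links.

    The ring only guarantees connectivity; it is an assumption of this sketch.
    """
    graph = {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}
    for v in range(n):
        while len(graph[v]) < degree:
            u = rng.randrange(n)
            if u != v:
                graph[v].add(u)
                graph[u].add(v)
    # Sorted lists make neighbour selection reproducible for a seeded rng.
    return {v: sorted(nbrs) for v, nbrs in graph.items()}


def flood(graph, start, holders, ttl):
    """TTL-limited flooding. Returns (found, messages_sent)."""
    seen = {start}
    frontier = deque([(start, ttl)])
    messages = 0
    found = start in holders
    while frontier:
        node, hops_left = frontier.popleft()
        if hops_left == 0:
            continue
        for nbr in graph[node]:
            # Every forwarded copy costs one message, including duplicates
            # that the receiver has already seen and will discard.
            messages += 1
            if nbr not in seen:
                seen.add(nbr)
                found = found or nbr in holders
                frontier.append((nbr, hops_left - 1))
    return found, messages


def multi_random_walk(graph, start, holders, walkers, max_steps, rng):
    """Launch `walkers` independent random walks that advance in lockstep.

    Returns (found, messages_sent). The search stops as soon as any walker
    reaches a peer that holds the item (a simplification of this sketch).
    """
    if start in holders:
        return True, 0
    positions = [start] * walkers
    messages = 0
    for _ in range(max_steps):
        for i, node in enumerate(positions):
            positions[i] = rng.choice(graph[node])
            messages += 1  # one message per walker per step
            if positions[i] in holders:
                return True, messages
    return False, messages


if __name__ == "__main__":
    rng = random.Random(7)
    overlay = make_overlay(n=2000, degree=4, rng=rng)
    holders = set(rng.sample(range(2000), 20))  # item replicated on 1% of peers
    print("flooding    :", flood(overlay, 0, holders, ttl=7))
    print("random walks:", multi_random_walk(overlay, 0, holders, 16, 200, rng))
```

Flooding pays for every copy of the query that is forwarded, whereas the walkers send at most `walkers * max_steps` messages in total. The exact numbers depend on the assumed topology and replication level, so this sketch should not be read as reproducing the traffic reduction reported in the abstract.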
Structured Networks

In structured peer-to-peer networks the overlay is organized into a specific topology, and the protocol ensures that any node can efficiently search the network for a file or resource, even if the resource is extremely rare. The most common type of structured P2P network implements a distributed hash table (DHT), in which a variant of consistent hashing is used to assign ownership of each file to a particular peer [11]. This enables peers to search for resources on the network using a hash table: (key, value) pairs are stored in the DHT, and any participating node can efficiently retrieve the value associated with a given key.

[Fig. 3] Overlay network diagram for a structured P2P network, using a distributed hash table (DHT) to identify and locate nodes/resources

Related Work

Dongsheng Li states that, with the increasing popularity of the peer-to-peer (P2P) computing paradigm, many general range query schemes for distributed hash table (DHT)-based P2P systems have been proposed in recent years. Although those schemes can support range queries without modifying the underlying DHTs, they cannot guarantee that query results are returned with bounded delay. The query delay in these schemes depends on both the scale of the system and the size of the query space or the specific query. That work proposes Armada, an efficient range query processing scheme that supports delay-bounded single-attribute and multiple-attribute range queries. It first describes order-preserving naming algorithms that assign adjoining Object IDs to objects with close attribute values. It then presents the design of a forwarding tree that efficiently matches the search paths of range queries to the underlying DHT topology. Based on the tree, two query processing algorithms are proposed to process single-attribute and multiple-attribute range queries, respectively, within a bounded delay.
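
The idea behind order-preserving naming, that close attribute values receive adjoining Object IDs, can be shown with a simple linear mapping. This is only a minimal sketch of the general idea and not Armada's actual naming algorithm, which is not given in this text. The function names `order_preserving_id` and `id_range`, the attribute domain `[lo, hi]`, and the 16-bit ID width are all assumptions made for illustration.

```python
def order_preserving_id(value, lo, hi, bits=16):
    """Map a value in [lo, hi] to an integer ID in [0, 2**bits - 1].

    The mapping is monotone: a larger value never gets a smaller ID,
    so values that are close together receive adjoining IDs.
    """
    if not lo <= value <= hi:
        raise ValueError("value outside the attribute domain")
    span = (1 << bits) - 1
    return int((value - lo) / (hi - lo) * span)


def id_range(q_lo, q_hi, lo, hi, bits=16):
    """Translate a range query [q_lo, q_hi] into one contiguous ID interval."""
    return (order_preserving_id(q_lo, lo, hi, bits),
            order_preserving_id(q_hi, lo, hi, bits))


if __name__ == "__main__":
    # Hypothetical attribute: a price between 0 and 100.
    first, last = id_range(20, 30, lo=0, hi=100)
    inside = order_preserving_id(25, lo=0, hi=100)
    print(first <= inside <= last)
```

Because the mapping preserves order, a range query only has to visit the peers responsible for one contiguous ID interval. An ordinary hash function would scatter the same values across the whole identifier space.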
Analytical and simulation results show that Armada is an effective general range query scheme on constant-degree DHTs and can return query results within 2 log N hops in a P2P system with N peers, regardless of the queried range or the size of the query space.

Ion Stoica states that a fundamental problem confronting peer-to-peer applications is to efficiently locate the node that stores a particular data item. That work presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item and storing the key/data item pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and it can answer queries even if the system is continuously changing. Results from theoretical
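
The single operation described for Chord, mapping a key onto a node, can be sketched with consistent hashing on an identifier circle. This is a minimal sketch under stated assumptions: the class `Ring`, the helper `ident`, and the 32-bit identifier width are hypothetical, and the lookup uses a globally known, sorted node list. The real protocol resolves the same mapping in a distributed way by routing through per-node finger tables, which this sketch does not implement.

```python
import hashlib
from bisect import bisect_left

M = 32  # identifier bits (an assumption of this sketch)


def ident(name, m=M):
    """Hash a node name or a key onto the 2**m identifier circle."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << m)


class Ring:
    """Identifier circle with global knowledge of all nodes (illustration only)."""

    def __init__(self, node_names):
        self.nodes = sorted((ident(n), n) for n in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]

    def lookup(self, key):
        """Return the node whose identifier is the first at or after the key's.

        If no such identifier exists, wrap around to the smallest one.
        """
        idx = bisect_left(self.ids, ident(key)) % len(self.ids)
        return self.nodes[idx][1]


if __name__ == "__main__":
    ring = Ring([f"node{i}" for i in range(8)])
    print(ring.lookup("song.mp3"))
```

A useful property of this scheme is that removing a node other than the key's owner does not change which node the key maps to, which is one reason consistent hashing copes well with peers joining and leaving.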
doi:10.14257/ajmscahs.2014.06.04 fatcat:lihfkrnavfcmnbdejiudumwaai