Performance of a possible Grid message infrastructure

Shrideep Pallickara, Geoffrey Fox, Ahmet Uyar, Hongbin Liu, Xi Rao, David Walker, Beytullah Yildiz
2005 Concurrency and Computation  
Introduction
The Grid [1-4] has made dramatic progress recently with impressive technology and several large important applications initiated in high-energy physics [5, 6], earth science [7, 8] and other areas [9, 10]. At the same time, there have been equally impressive advances in broadly deployed Internet technology. We can cite the dramatic growth in the use of XML, the "disruptive" impact of peer-to-peer (P2P) approaches [11] that have resulted in a slew of powerful
..., and the more orderly but still widespread adoption of a universal Web Service approach to Web-based applications [12, 13].

Grids are exemplified by the infrastructure used to allow seamless access to supercomputers or large-scale integration of datasets. P2P technology facilitates sophisticated resource-sharing environments between peers over the "edges" of the Internet, enabling ad hoc communities of low-end clients to advertise and access resources on communal computers. The deployment and utilization of Web Services are driven by a slew of XML-based specifications that pertain to exposing services, discovering them and accessing them securely once the requestor is authenticated and authorized.

Grids, P2P networks and Web Services consist of a sea of message-based services. Services inject and extract messages whose transport and manipulation are supported by a logically distinct "MessageGrid" supplying the Grid Message layer. We can abstract such environments as a distributed system of "clients" or "end-points", which consist of "users", "resources" or proxies thereto. These clients must be linked together in a flexible, fault-tolerant, efficient and high-performance fashion. The messaging infrastructure linking clients (both users and resources) would provide the messaging backbone. The smallest unit of this messaging infrastructure should be able to intelligently process and route messages while working with multiple underlying communication protocols. We refer to this unit as a broker; we avoid the term server to distinguish it clearly from the application and system servers that would be among the sources and sinks of messages generated within the integrated system. For our purposes (registering, transporting and discovering information), we use the terms event and message interchangeably, where events are just messages, typically with time stamps.
We may enumerate the following requirements for the Grid messaging infrastructure:
1. Scaling: This is of paramount importance considering the number of devices, clients and services that would be aggregated in the P2P grid. The distributed broker network should scale to support the increase in these aggregated entities. However, the addition of brokers to aid scaling should not degrade performance by increasing communication path lengths or by causing ineffective bandwidth utilization between broker nodes within the system.
2. Efficient disseminations: The disseminations pertain to routing content, queries, invocations etc. to the relevant destinations in an efficient manner. The routing engine at each broker needs to ensure that the paths traversed within the broker network to reach destinations are efficient and eschew failed broker nodes.
3. Robust delivery mechanisms: This is to ensure robust delivery of messages in, and despite, the presence of failures and prolonged disconnects.
4. Location independence: To eliminate bandwidth degradations and bottlenecks stemming from entities accessing a certain known broker over and over again to gain access to services, it must be ensured that any broker within the broker network is just as good as any other. Services and functionality would then be accessible from any point within the broker network.
5. Support for P2P interactions: P2P systems tend to be autonomic, obviating the need for dedicated management. P2P systems incorporate sophisticated search and subsequent discovery mechanisms. Support for P2P interactions facilitates access to information resources and services hosted by peers at the "edge" of the network.
6. Interoperate with other messaging clients: Enterprises have several systems that are built around messaging. These clients could be based on enterprise vendors' products such as IBM's MQSeries or Microsoft's MSMQ.
Sometimes these would be clients conforming to mature messaging specifications such as the Java Message Service (JMS) [14]. JMS clients, existing in disparate enterprise realms, can utilize the distributed broker network as a JMS provider to communicate with each other.
7. Communication through proxies and firewalls: It is inevitable that the realms we try to federate would be protected by firewalls, stopping our elegant application channels dead in their tracks. The messaging infrastructure should thus be able to communicate across firewall, DHCP and NAT boundaries. Sometimes communications would also be through authenticating proxies.
8. Extensible transport framework: Here we consider the communication subsystem, which provides the messaging between the resources and services. Examining the growing power of optical networks, we see the increasing universal bandwidth that in fact motivates the thin client and server based application model. However, the real world also shows slow networks and links (such as dial-ups), leading to a high fraction of dropped packets. Thus the messaging infrastructure should manage the communication between external resources, services and clients to achieve the highest possible system performance and reliability. We suggest this problem is sufficiently hard that it should be solved only "once", i.e. that all communication, whether TCP/IP, UDP, RTP (A Transport Protocol for Real-Time Applications) [15], RMI, XML/SOAP or you-name-it, be handled by a single messaging or event subsystem.
9. Security infrastructure: Since it is entirely conceivable that messages (including queries, invocations and responses) would have to traverse hops where the underlying communication mechanisms are not necessarily secure, a security infrastructure that relies on message-level security needs to be in place.
Furthermore, the infrastructure should incorporate an authentication and authorization scheme to ensure restricted access to certain services. The infrastructure must also ensure a secure and efficient distribution of keys so that authorized clients can access content encapsulated in encrypted messages. The infrastructure will also handle denial-of-service attacks (message flooding, replay) and compromised endpoints and brokers.

In this paper we discuss the NaradaBrokering messaging infrastructure, which addresses several of the issues discussed above. The remainder of this paper is organized as follows. In section 2 we present an overview of NaradaBrokering. In section 3 we present results for NaradaBrokering's transport interfaces. In sections 4 and 5 we provide results for NaradaBrokering's support for JMS and JXTA clients. In section 6 we present results from our experiments with routing audio/video conferencing applications. In section 7 we present results from the assortment of matching engines within NaradaBrokering that are used for computing destinations based on the content encapsulated by the event. We are currently in the process of gathering performance numbers over transatlantic links in collaboration with the University of Cardiff. The final version of this paper will include comprehensive results from these benchmarks. Furthermore, in the final version of the paper the JXTA results will be updated and the security overheads in NaradaBrokering will be quantified.

NaradaBrokering
NaradaBrokering [16-26] is an event brokering system designed to run on a large network of cooperating broker nodes. Communication within NaradaBrokering is asynchronous, and the system can be used to support different interactions by encapsulating them in specialized events. NaradaBrokering efficiently routes any given event between the originators and registered consumers of the event in question.
Events could be used to encapsulate information pertaining to transactions, data interchange, system conditions and finally the search, discovery and subsequent sharing of resources. Events encapsulate expressive power at multiple levels. Where, when and how these events reveal their expressive power is what constitutes information flow. NaradaBrokering manages this information flow. NaradaBrokering places no constraints on the number, size or rate of these interactions. Scaling, availability and fault tolerance requirements entail that the messaging infrastructure managing this information flow be based on a distributed network of cooperating nodes.

Every event has an implicit or explicit destination list, comprising clients, associated with it. The brokering system as a whole is responsible for computing broker destinations (targets) and ensuring efficient delivery to these targeted brokers en route to the intended client(s). As an event passes through the broker network, it is updated to snapshot its dissemination within the network. The event dissemination traces eliminate continuous echoing and, in tandem with the BNM at each broker (used for computing shortest paths), are used to deploy a near-optimal routing solution. The routing is near optimal since, for every event, the associated targeted set of brokers are usually the only ones involved in disseminations. Furthermore, every broker, either targeted or en route to one, computes the shortest path to reach target destinations while employing only those links and brokers that have not failed.

In NaradaBrokering, stable storages existing in parts of the system are responsible for introducing state into the events. The arrival of events at clients advances the state associated with the corresponding clients. Brokers do not keep track of this state and are responsible only for ensuring the most efficient routing. Since the brokers are stateless, they can fail and remain failed forever.
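The routing behaviour described above can be illustrated with a minimal sketch. It is not NaradaBrokering's algorithm: it assumes unweighted links, uses a plain breadth-first search in place of the BNM-based computation, and treats the dissemination trace as a simple set of brokers that have already received the event. The names shortest_path, next_hops and LINKS, and the five-broker topology, are hypothetical.

```python
from collections import deque

def shortest_path(links, source, target,
                  failed_brokers=frozenset(), failed_links=frozenset()):
    """Breadth-first search over broker links (assumed unweighted),
    using only brokers and links that have not failed."""
    if source in failed_brokers or target in failed_brokers:
        return None
    previous = {source: None}
    queue = deque([source])
    while queue:
        broker = queue.popleft()
        if broker == target:
            path = []
            while broker is not None:
                path.append(broker)
                broker = previous[broker]
            return path[::-1]
        for neighbour in sorted(links.get(broker, ())):
            if neighbour in previous or neighbour in failed_brokers:
                continue
            if frozenset((broker, neighbour)) in failed_links:
                continue
            previous[neighbour] = broker
            queue.append(neighbour)
    return None  # target unreachable over live links

def next_hops(links, here, targets, trace,
              failed_brokers=frozenset(), failed_links=frozenset()):
    """Group the remaining target brokers by the neighbour that should
    carry the event toward them. Targets already recorded in the event's
    trace are skipped, so the event is not echoed back to them."""
    hops = {}
    for target in sorted(targets):
        if target in trace:
            continue
        path = shortest_path(links, here, target, failed_brokers, failed_links)
        if path is None or len(path) < 2:
            continue
        hops.setdefault(path[1], []).append(target)
    return hops

# Hypothetical topology: a-b, a-c, b-d, c-d, d-e.
LINKS = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

print(shortest_path(LINKS, "a", "e"))                        # ['a', 'b', 'd', 'e']
print(shortest_path(LINKS, "a", "e", failed_brokers={"b"}))  # ['a', 'c', 'd', 'e']
print(next_hops(LINKS, "a", {"d", "e"}, trace={"a"}))        # {'b': ['d', 'e']}
```

When broker b is marked as failed, the path is recomputed through c, which mirrors the requirement that only live links and brokers are employed.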
The guaranteed delivery scheme within NaradaBrokering does not require every broker to have access to a stable store or DBMS. The replication scheme is flexible and easily extensible. Stable storages can be added or removed, and the replication scheme can be updated. Stable stores can fail, but they do need to recover within a finite amount of time. During these failures the clients that are affected are those that were being serviced by the failed storage. Currently in NaradaBrokering we have both SQL and file-based implementations of the storage services needed by the robust delivery algorithms.

NaradaBrokering incorporates an extensible transport framework [19] and virtualizes the channels over which entities interact with each other. Entities are thus free to communicate across firewalls, proxies and NAT boundaries, which could otherwise prevent interactions from taking place. Furthermore, NaradaBrokering provides support for multiple transport protocols such as TCP (blocking and non-blocking), UDP, SSL, HTTP and RTP. NaradaBrokering incorporates a monitoring service [20] at individual broker nodes, which monitors the state of the links originating from a node. The performance metrics measured include loss rates, communication delays and jitter, among others. NaradaBrokering is JMS compliant [21] and also provides support for routing JXTA interactions [22]. Work is currently underway to provide support for routing Gnutella interactions. NaradaBrokering has also been used to support audio-video conferencing [23] applications.

To address the issues [24] of scaling, load balancing and failure resiliency, NaradaBrokering is implemented on a network of cooperating brokers. In NaradaBrokering we impose a hierarchical structure on the broker network, where a broker is part of a cluster that is part of a super-cluster, which in turn is part of a super-super-cluster and so on.
Figure 1 depicts a sub-system comprising a super-super-cluster SSC-A with 3 super-clusters SC-1, SC-2 and SC-3, each of which has clusters that in turn comprise broker nodes. Clusters comprise strongly connected brokers with multiple links to brokers in other clusters, ensuring alternate communication routes during failures. This organization scheme results in "small world networks" where the average communication path lengths between brokers increase logarithmically with geometric increases in network size, as opposed to exponential increases in uncontrolled settings.
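The hierarchy above can be sketched as a four-part broker address. This is an illustration only, not NaradaBrokering's addressing scheme: the class BrokerAddress, the functions shared_unit and levels_needed, and the cluster and broker labels ("c1", "b1", ...) are hypothetical. The last function assumes a uniform fan-out at every level, so that the number of levels (a rough proxy for path length) grows by one each time the broker count grows by a constant factor.

```python
from typing import NamedTuple, Optional

class BrokerAddress(NamedTuple):
    """Position of a broker in the hierarchy, outermost unit first."""
    super_super_cluster: str
    super_cluster: str
    cluster: str
    broker: str

def shared_unit(a: BrokerAddress, b: BrokerAddress) -> Optional[str]:
    """Name the smallest unit of the hierarchy that contains both brokers."""
    if a == b:
        return "broker"
    if a[:3] == b[:3]:
        return "cluster"
    if a[:2] == b[:2]:
        return "super-cluster"
    if a[:1] == b[:1]:
        return "super-super-cluster"
    return None  # no common unit within these four levels

def levels_needed(brokers: int, fan_out: int) -> int:
    """Levels required to hold `brokers` nodes, ASSUMING every unit
    contains exactly `fan_out` units of the level below it."""
    levels, capacity = 1, fan_out
    while capacity < brokers:
        capacity *= fan_out
        levels += 1
    return levels

x = BrokerAddress("SSC-A", "SC-1", "c1", "b1")
y = BrokerAddress("SSC-A", "SC-1", "c1", "b2")
z = BrokerAddress("SSC-A", "SC-3", "c1", "b1")

print(shared_unit(x, y))  # cluster
print(shared_unit(x, z))  # super-super-cluster
print([levels_needed(n, 10) for n in (10, 100, 1000, 10000)])  # [1, 2, 3, 4]
```

Under the uniform fan-out assumption, a tenfold increase in brokers adds a single level, which is the kind of logarithmic growth the text attributes to this organization scheme.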
doi:10.1002/cpe.924 fatcat:z3cy5bmrqneg5aiepnezq42zey