Java Fast Sockets: Enabling high-speed Java communications on high performance clusters

Guillermo L. Taboada, Juan Touriño, Ramón Doallo
2008 Computer Communications  
This paper presents Java Fast Sockets (JFS), an optimized Java socket implementation for high performance computing on clusters. Current socket libraries do not efficiently support high-speed cluster interconnects and impose substantial communication overhead. JFS overcomes these performance constraints by: (1) enabling high-speed communication on cluster networks such as Scalable Coherent Interface (SCI), Myrinet and Gigabit Ethernet; (2) avoiding the need for primitive data type array ... tion; (3) reducing buffering and unnecessary copies; and (4) reimplementing the protocol to boost shared memory (intra-node) communication. Its interoperability and its transparency to users and applications allow for immediate applicability to a wide range of parallel and distributed target applications. A performance evaluation conducted on a dual-core cluster has shown experimental evidence of throughput increases on SCI, Myrinet, Gigabit Ethernet and shared memory communication. The impact of this improvement on the overall performance of representative parallel codes has also been analyzed.

networks to boost communication performance, Java cannot take advantage of them, as shown in [1], because it has to resort to inefficient TCP/IP emulations for full networking support. These emulation libraries present high start-up latency (the 0-byte message latency), low bandwidth and high CPU load, as shown in [2]. The main reason behind this poor throughput is that the IP protocol was designed to cope with low-speed, unreliable and failure-prone links in WAN environments, whereas current cluster networks are high-speed, hardware-reliable and not prone to failure in LAN environments. Examples of IP emulations are IPoMX and IPoGM [3] on top of the Myrinet low-level libraries MX (Myrinet eXpress) and GM, the LANE driver [4] over Giganet, IP over Infiniband (IPoIB) [5], and ScaIP [6] and SCIP [7] on SCI. A direct implementation of native sockets on top of low-level communication libraries can avoid the TCP/IP overhead and thus increase performance. Representative examples are presented next. FastSockets [8] is a socket implementation on top of Active-Messages, a light-weight protocol with high-speed network access.
SOVIA [4] has been implemented on VIA (Virtual Interface Architecture), and Sockets over Gigabit Ethernet [9] and GAMMAsockets [10] have been developed for Gigabit Ethernet. The Socket Direct Protocol (SDP) over Infiniband [11] is the representative socket library of the Offload Sockets Framework (OSF). Sockets-MX and Sockets-GM [3] are the developments on Myrinet, where MX is intended to supersede GM thanks to a more efficient protocol implementation. The high performance native sockets library on SCI is SCI Sockets [12]. However, of these implementations, only SDP, Sockets-MX/GM and SCI Sockets are currently available. The Windows Sockets Direct components for Windows platforms provide access to certain high-speed networks. A related project is Xen-Socket [13], an optimized socket library restricted to Xen virtual machine intra-node communication that replaces TCP/IP with shared memory transfers. However, these socket libraries usually implement only a subset of socket functionality on top of low-level libraries, resorting to the system socket library for unimplemented functions. Thus, some applications, such as kernel-level network services and Java codes, can request features not present in the underlying libraries and therefore fall back to system sockets. In order to provide Java with full and more efficient support on high-speed networks, several approaches have been followed: (1) VIA-based projects, (2) RMI optimizations, (3) Java Distributed Shared Memory (DSM) middleware on clusters and (4) low-level libraries on high-speed networks. Javia [14] and Jaguar [15] provide access to high-speed cluster interconnects through VIA, a communication library implemented on Giganet, Myrinet, Gigabit Ethernet and SCI [16], among others. More specifically, Javia reduces data copying using native buffers, and Jaguar acts as a replacement for the Java Native Interface (JNI).
Their main drawbacks are the use of particular APIs, the need for modified Java compilers and the lack of non-VIA communication support. Additionally, Javia exposes programmers to buffer management and uses a specific garbage collector. Representative works on RMI optimization are Manta [17], a Java to native code compiler with a fast RMI protocol, and KaRMI [18], which improves RMI through a more efficient object serialization that reduces protocol latency. Serialization is the process of transforming objects into byte sequences, in this case to be sent across the network. However, the use of specific high-level solutions with substantial protocol overhead and a focus on Myrinet has restricted the applicability of these projects. In fact, their start-up latency is from several times up to an order of magnitude larger than socket latencies. Therefore, current Java communication middleware such as MPJ Express [19] and MPJ/Ibis [20], two Message-Passing in Java (MPJ) libraries, use sockets
doi:10.1016/j.comcom.2008.08.012 fatcat:jtribpsapzdklcw2fkmvnto3qy
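
As an illustration of the serialization step defined in the text above, the following minimal sketch contrasts generic Java object serialization of a primitive array with a plain bulk copy of its element values into a byte buffer. It is not the JFS implementation, and it does not reproduce how JFS avoids serialization; the class name `SerializationSketch` and the helper methods `serialize`, `deserialize`, `pack` and `unpack` are hypothetical names chosen for this example. It uses only standard `java.io` and `java.nio` classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration class; not part of JFS or any cited library.
public class SerializationSketch {

    // Generic object serialization: the array is written as a byte stream
    // that also carries stream and class metadata.
    static byte[] serialize(int[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(data);
        }
        return bytes.toByteArray();
    }

    // Reverse step: rebuild the array from the serialized byte stream.
    static int[] deserialize(byte[] raw) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (int[]) in.readObject();
        }
    }

    // Bulk copy of the raw element values: 4 bytes per int, no metadata.
    static byte[] pack(int[] data) {
        ByteBuffer buffer = ByteBuffer.allocate(data.length * Integer.BYTES);
        buffer.asIntBuffer().put(data);
        return buffer.array();
    }

    // Reverse step: read the ints back out of the packed bytes.
    static int[] unpack(byte[] raw) {
        int[] data = new int[raw.length / Integer.BYTES];
        ByteBuffer.wrap(raw).asIntBuffer().get(data);
        return data;
    }

    public static void main(String[] args) throws Exception {
        int[] payload = {1, 2, 3, 4};
        byte[] serialized = serialize(payload);
        byte[] packed = pack(payload);
        System.out.println("serialized bytes: " + serialized.length);
        System.out.println("packed bytes:     " + packed.length);
        System.out.println("round trips match: "
                + (Arrays.equals(deserialize(serialized), payload)
                   && Arrays.equals(unpack(packed), payload)));
    }
}
```

Both routes still copy the data into an intermediate byte array before it could be written to a socket, so the sketch only shows what the serialized form adds on top of the raw element values, under the assumptions stated above.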