Fast RPC on the SHRIMP Virtual Memory Mapped Network Interface

Angelos Bilas, Edward W. Felten
1997 Journal of Parallel and Distributed Computing  
The emergence of new network interface technology is enabling new approaches to the development of communications software. This paper evaluates the SHRIMP virtual memory mapped network interface by using it to build two fast implementations of remote procedure call (RPC). Our first implementation, called vRPC, is fully compatible with the SunRPC standard. We change the RPC runtime library; the operating system kernel is unchanged, and only a minimal change was needed in the stub generator to
... te a new protocol identifier. Despite these restrictions, our vRPC implementation is several times faster than existing SunRPC implementations. A round-trip null RPC with no arguments and results under vRPC takes about 33 microseconds. Our second implementation, called ShrimpRPC, is not compatible with SunRPC but offers much better performance. ShrimpRPC specializes the stub generator and runtime library to take full advantage of SHRIMP's features. The result is a round-trip null RPC latency of 9.5 microseconds, which is about one microsecond above the hardware minimum.

The SHRIMP network interface

The SHRIMP project at Princeton studies how to provide high-performance communication mechanisms to integrate commodity desktop systems such as PCs and workstations into inexpensive, high-performance multicomputers. End-to-end latency and bandwidth available to user processes are the primary performance metrics. The challenge is to provide a low-latency, high-bandwidth communication mechanism whose performance is competitive with or better than that of specially designed multicomputers.

The network interfaces of existing multicomputers and workstations require a significant amount of software overhead to implement message-passing protocols. The main reason for such high overheads is that these multicomputers use network interfaces that require a significant number of instructions at the operating system and user levels to provide protection, buffer management, and message-passing protocols. In these designs, communication is treated as a service of the operating system. This is expensive because it requires several crossings between user level and kernel level for each message, and also because it prevents applications from customizing their use of the communication hardware.

The computing nodes of SHRIMP are Pentium PCs, and the interconnection network is the same one used in the Intel Paragon.
The key hardware component is the network interface board, which supports the virtual memory mapped communication (VMMC) model to provide low-overhead, protected, user-level communication. For more details on the SHRIMP architecture the reader can consult [6, 7]. VMMC is discussed in the next section.

Virtual Memory-Mapped Communication

Virtual memory-mapped communication (VMMC) [10] was developed in response to the need for a basic multicomputer communication mechanism with extremely low latency and high bandwidth. These performance goals are achieved by allowing applications to transfer data directly between two virtual memory address spaces over the network. The basic mechanism is designed to efficiently support applications and common communication models such as message passing, shared memory, and client-server. The VMMC mechanism consists of several calls to support user-level buffer management, various data transfer strategies, and transfer of control.

Import-Export Mappings

In the VMMC model, an import-export mapping must be established before communication begins. A receiving process can export a region of its address space as a receive buffer together with a set of permissions to define access rights for the buffer. In order to send data to an exported receive buffer, a user process must import the buffer with the right permissions.
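
The export-then-import rule described above can be illustrated with a small in-process model. This is only a sketch under stated assumptions: the class `ToyVMMC` and the names `export_buffer`, `import_buffer`, `send`, and `read` are hypothetical and are not the SHRIMP or VMMC API, permissions are reduced to a set of allowed importer names, and the "network transfer" is a plain memory copy within one Python process.

```python
from dataclasses import dataclass, field


@dataclass
class ExportedBuffer:
    """A receive buffer exported by a receiving process (toy model)."""
    owner: str
    data: bytearray
    allowed_importers: set = field(default_factory=set)


class ToyVMMC:
    """Hypothetical stand-in for the mapping step; not the real VMMC calls."""

    def __init__(self):
        self._exports = {}      # buffer name -> ExportedBuffer
        self._imports = set()   # (importer, buffer name) pairs

    def export_buffer(self, owner, name, size, allowed_importers):
        # The receiver exports a region together with its access rights.
        self._exports[name] = ExportedBuffer(
            owner, bytearray(size), set(allowed_importers)
        )

    def import_buffer(self, importer, name):
        # A sender must import the buffer, and needs the right permissions.
        buf = self._exports.get(name)
        if buf is None:
            raise KeyError(f"no exported buffer named {name!r}")
        if importer not in buf.allowed_importers:
            raise PermissionError(f"{importer!r} may not import {name!r}")
        self._imports.add((importer, name))

    def send(self, sender, name, offset, payload):
        # Data transfer is only legal once the mapping has been established.
        if (sender, name) not in self._imports:
            raise PermissionError("buffer must be imported before sending")
        buf = self._exports[name]
        if offset < 0 or offset + len(payload) > len(buf.data):
            raise ValueError("write falls outside the exported region")
        buf.data[offset:offset + len(payload)] = payload

    def read(self, name):
        # The receiver inspects its own exported region.
        return bytes(self._exports[name].data)


if __name__ == "__main__":
    net = ToyVMMC()
    net.export_buffer("server", "rpc_in", 8, {"client"})
    net.import_buffer("client", "rpc_in")
    net.send("client", "rpc_in", 0, b"ping")
    print(net.read("rpc_in"))
```

In this model a process that never imported the buffer, or that is not listed in the export permissions, gets a `PermissionError`, which mirrors the ordering constraint in the text: the mapping is set up once, and only then can data move.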
doi:10.1006/jpdc.1996.1272 fatcat:qrw4pjf2mza3vat2ymkc6ll4vm