Yuqun Chen, Angelos Bilas, Stefanos N. Damianakis, Cezary Dubnicki, Kai Li
1998 SIGPLAN Notices
An important aspect of a high-speed network system is the ability to transfer data directly between the network interface and application buffers. Such a direct data path requires the network interface to "know" the virtual-to-physical address translation of a user buffer, i.e., the physical memory location of the buffer. This paper presents an efficient address translation architecture, User-managed TLB (UTLB), which eliminates system calls and device interrupts from the common communication path. UTLB also supports application-specific policies to pin and unpin application memory. We report micro-benchmark results for an implementation on Myrinet PC clusters. A trace-driven analysis is used to compare the UTLB approach with the interrupt-based approach. It is also used to study the effects of UTLB cache size, associativity, and prefetching. Our results show that the UTLB approach delivers robust performance with relatively small translation cache sizes.

Introduction

A computer cluster consists of multiple hosts connected by a high-speed network. The goal of a cluster is to exploit the aggregate computing power of many microprocessors and the throughput of multiple I/O buses. The key to achieving high performance in a cluster is an efficient communication subsystem. The communication subsystem includes the network interface and the software that enables applications to access it. A major responsibility of the communication software is to facilitate efficient data transfer between the network interface and applications.

There are two kinds of data paths between the network interface and application buffers: indirect and direct. The indirect path involves copying data between an application buffer and a dedicated system buffer. This path is straightforward to implement but incurs significant overhead due to data copying. In contrast, the direct path moves data between the network interface and application buffers without an intermediate copy, usually by programmed I/O or DMA. The direct path not only eliminates the copy overhead but also reduces the involvement of the host processor. User-level protocols that take advantage of the direct path can increase the end-to-end communication bandwidth by as much as 100% [13].

User-level communication allows an application to issue communication requests directly to the network interface, bypassing the operating system (OS).
It eliminates OS calls, and sometimes interrupts, from the common communication path. Fast user-level communication relies on the direct path to avoid copying data to and from a dedicated system buffer [46, 2, 35, 45, 7, 33].

Combining user-level communication and direct-path communication introduces an address translation problem. Address translation is necessary because the application process uses virtual memory whereas the network interface accesses physical memory. The virtual-to-physical mappings are kept in the operating system and are inaccessible to applications running at the user level.
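
To make the address translation problem concrete, the following is a minimal Python sketch of a small translation cache placed in front of an operating system that owns the real mappings. It is not the UTLB design presented in the paper. The names `ToyOS`, `TranslationCache`, `pin`, `unpin`, the 4 KB page size, and the LRU eviction policy are all assumptions chosen for illustration. The sketch only shows why caching translations for pinned pages can keep the OS off the common communication path.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


class ToyOS:
    """Hypothetical stand-in for the kernel, which owns the real mappings."""

    def __init__(self):
        self.page_table = {}     # virtual page number -> physical frame number
        self.pinned = set()      # pages currently locked in physical memory
        self.next_frame = 0x100  # arbitrary first frame number for the demo
        self.calls = 0           # number of times the "OS" was involved

    def pin(self, vpn):
        """Lock a page in memory and return its physical frame number."""
        self.calls += 1
        if vpn not in self.page_table:
            self.page_table[vpn] = self.next_frame
            self.next_frame += 1
        self.pinned.add(vpn)
        return self.page_table[vpn]

    def unpin(self, vpn):
        """Release a page so the OS may move or evict it again."""
        self.calls += 1
        self.pinned.discard(vpn)


class TranslationCache:
    """Toy fully associative, LRU cache of virtual-to-physical translations."""

    def __init__(self, os_, capacity):
        self.os = os_
        self.capacity = capacity
        self.entries = OrderedDict()  # vpn -> frame, oldest entry first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        """Return the physical address for a virtual address."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            # Common path: the translation is cached, no OS involvement.
            self.hits += 1
            self.entries.move_to_end(vpn)
        else:
            # Miss path: evict the least recently used page if full,
            # then ask the OS to pin the new page and report its frame.
            self.misses += 1
            if len(self.entries) >= self.capacity:
                victim, _ = self.entries.popitem(last=False)
                self.os.unpin(victim)
            self.entries[vpn] = self.os.pin(vpn)
        return self.entries[vpn] * PAGE_SIZE + offset


if __name__ == "__main__":
    os_ = ToyOS()
    cache = TranslationCache(os_, capacity=2)
    for vaddr in (0x5008, 0x5010, 0x6000, 0x7000):
        print(hex(vaddr), "->", hex(cache.translate(vaddr)))
    print("hits:", cache.hits, "misses:", cache.misses, "OS calls:", os_.calls)
```

In this toy model, repeated transfers from the same buffer hit the cache and never reach `ToyOS`, while a working set larger than the cache forces pin and unpin traffic on every miss. Cache size and replacement policy are the kind of parameters the abstract says the trace-driven analysis studies, though the policy used here is only a placeholder.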
doi:10.1145/291006.291046 fatcat:yhok4stgobdkllbofwp6ma2fmq