Evolution of the virtual interface architecture
Thorsten von Eicken and Werner Vogels, Cornell University
Cover Feature, Computer

To provide a faster path between applications and the network, most researchers have advocated removing the operating system kernel and its centralized networking stack from the critical path and creating a user-level network interface. With these interfaces, designers can tailor the communication layers each process uses to the demands of that process. Consequently, applications can send and receive network packets without operating system intervention, which greatly decreases communication latency and increases network throughput.

Unfortunately, the diversity of approaches and the lack of consensus have stalled progress in refining research results into products, a prerequisite to the widespread adoption of these interfaces. Recently, however, Intel, Microsoft, and Compaq have introduced the Virtual Interface Architecture (VIA) [1], an emerging standard for cluster or system-area networks. Products based on the VIA have already surfaced, notably GigaNet's GNN1000 network interface (http://www.giganet.com). As more products appear, research into application-level issues can proceed, and the technology of user-level network interfaces should mature.

Several prototypes, among them Cornell University's U-Net [2], have heavily influenced the VIA. In this article, we describe the architectural issues and design trade-offs at the core of these prototype designs, including ...

Low communication latency is key to using clusters in enterprise computing, where systems must be highly available and scalable. Cluster management and cluster-aware server applications rely on multiround protocols to reach agreement on the system's state when there are potential node and process failures. These protocols involve multiple participants, all of which must respond in each round, which makes the protocols extremely sensitive to latency. Cluster applications that require fault tolerance (for example, through a primary/backup scheme or through active replication) use extensive intracluster communication to synchronize replicated information internally. These cluster applications can be scaled up only if the intracluster communication is many times faster than the time in which the systems are expected to respond to their clients. Recent experiments with Microsoft's Cluster Server have shown that without low-latency intracluster communication, scalability is limited to eight nodes [3].

On the parallel computing front, many researchers use networks of workstations to provide the resources for computationally intensive parallel applications. However, these networks are difficult to program, and the communication costs across LANs must decrease by more than an order of magnitude to address this problem.

High bandwidth for small messages

The demand for high bandwidth when sending many small messages (typically less than 1 Kbyte each) is increasing for the same reasons industry needs low communication latency. Web servers, for example, often receive and send many small messages to many clients. By reducing the per-message overhead, user-level network interfaces attempt to provide full network bandwidth for the smallest messages possible.

Reducing the message size at which full bandwidth can be achieved may also benefit data-stream protocols like TCP, whose buffer requirements are directly proportional to the communication latency. The TCP window size, for example, is the product of the network bandwidth and the round-trip time. One way to have maximum bandwidth at minimum cost is to achieve low latency in local area networks, which will keep buffer consumption within reason.
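
To make the relationship between latency and buffer consumption concrete, the following minimal sketch computes the window size as the product of bandwidth and round-trip time. The function name `window_bytes`, the 1 Gbit/s link speed, and the two round-trip times are illustrative assumptions, not measurements from this article.

```python
def window_bytes(bandwidth_bits_per_s: int, rtt_us: int) -> int:
    """Return the bandwidth x round-trip-time product, in bytes.

    Integer arithmetic is used so the result is exact:
    bits/s * microseconds / (8 bits per byte * 1,000,000 us per s).
    """
    return bandwidth_bits_per_s * rtt_us // (8 * 1_000_000)


# Hypothetical link speed and round-trip times, chosen only for illustration.
LINK_BITS_PER_S = 1_000_000_000  # assumed 1 Gbit/s network
ROUND_TRIP_TIMES_US = {
    "assumed 1 ms round trip": 1_000,
    "assumed 100 us round trip": 100,
}

for label, rtt_us in ROUND_TRIP_TIMES_US.items():
    size = window_bytes(LINK_BITS_PER_S, rtt_us)
    print(f"{label}: window of {size} bytes per connection")
```

Because the product is linear in the round-trip time, cutting the latency by a factor of ten cuts the per-connection buffer requirement by the same factor at a fixed bandwidth, which is the sense in which low latency keeps buffer consumption within reason.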