Lightweight remote procedure call

Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, Henry M. Levy
1990 ACM Transactions on Computer Systems  
Lightweight Remote Procedure Call (LRPC) is a communication facility designed and optimized for communication between protection domains on the same machine. In contemporary small-kernel operating systems, existing RPC systems incur an unnecessarily high cost when used for the type of communication that predominates: communication between protection domains on the same machine. This cost leads system designers to coalesce weakly related subsystems into the same protection domain, trading safety for
... By reducing the overhead of same-machine communication, LRPC encourages both safety and performance. LRPC combines the control transfer and communication model of capability systems with the programming semantics and large-grained protection model of RPC. LRPC achieves a factor-of-three performance improvement over more traditional approaches based on independent threads exchanging messages, reducing the cost of same-machine communication to nearly the lower bound imposed by conventional hardware. LRPC has been integrated into the Taos operating system of the DEC SRC Firefly multiprocessor workstation.

General Terms: Design, Measurement, Performance
Additional Key Words and Phrases: Modularity, remote procedure call, small-kernel operating systems

LRPC combines the control transfer and communication model of capability systems with the programming semantics and large-grained protection model of RPC. For the common case of same-machine communication passing small, simple arguments, LRPC achieves a factor-of-three performance improvement over more traditional approaches.

The granularity of the protection mechanisms used by an operating system has a significant impact on the system's design and use. Some operating systems [10, 13] have large, monolithic kernels insulated from user programs by simple hardware boundaries. Within the operating system itself, though, there are no protection boundaries. The lack of strong firewalls, combined with the size and complexity typical of a monolithic system, makes these systems difficult to modify, debug, and validate. Furthermore, the shallowness of the protection hierarchy (typically only two levels) makes the underlying hardware directly vulnerable to a large mass of complicated operating system software.

Capability systems supporting fine-grained protection were suggested as a solution to the problems of large-kernel operating systems [5].
In a capability system, each fine-grained object exists in its own protection domain, but all live within a single name or address space. A process in one domain can act on an object in another only by making a protected procedure call, transferring control to the second domain. Parameter passing is simplified by the existence of a global name space containing all objects. Unfortunately, many found it difficult to efficiently implement and program systems that had such fine-grained protection.

In contrast to the fine-grained protection of capability systems, some distributed computing environments rely on relatively large-grained protection mechanisms: protection boundaries are defined by machine boundaries [12]. Remote Procedure Call (RPC) [1] facilitates the placement of subsystems onto separate machines. Subsystems present themselves to one another in terms of interfaces implemented by servers. The absence of a global address space is ameliorated by automatic stub generators and sophisticated run-time libraries that can transfer arbitrarily complex arguments in messages. RPC is a system structuring and programming style that has become widely successful, enabling efficient and convenient communication across machine boundaries.

Small-kernel operating systems have borrowed the large-grained protection and programming models used in distributed computing environments and have demonstrated these to be appropriate for managing subsystems, even those not primarily intended for remote operation [11]. In these small-kernel systems, separate components of the operating system can be placed in disjoint domains (or address spaces), with messages used for all interdomain communication. The advantages of this approach include modular structure, easing system design, implementation, and maintenance; failure isolation, enhancing debuggability and validation; and transparent access to network services, aiding and encouraging distribution.
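The stub-and-message structure of conventional RPC described above can be illustrated with a minimal sketch. It is not taken from the paper: the wire format, the procedure table, and the names `server_dispatch` and `make_client_stub` are all hypothetical, and the "transport" is an ordinary in-process function call standing in for whatever kernel or network path would actually carry the message.

```python
import struct

# Hypothetical wire format (an assumption for illustration only):
# a 1-byte procedure id followed by two signed 32-bit integers.
REQUEST = struct.Struct("!Bii")
REPLY = struct.Struct("!i")


# Procedures exported by a hypothetical server interface.
def add(a, b):
    return a + b


def sub(a, b):
    return a - b


PROCEDURES = {1: add, 2: sub}


def server_dispatch(message: bytes) -> bytes:
    """Server stub: unmarshal the request, call the procedure, marshal the result."""
    proc_id, a, b = REQUEST.unpack(message)
    result = PROCEDURES[proc_id](a, b)
    return REPLY.pack(result)


def make_client_stub(proc_id, transport):
    """Client stub: looks like a local procedure, but exchanges messages."""
    def stub(a, b):
        reply = transport(REQUEST.pack(proc_id, a, b))
        (result,) = REPLY.unpack(reply)
        return result
    return stub


# The transport here is a direct call; in a real system it would cross a
# domain or machine boundary.
remote_add = make_client_stub(1, server_dispatch)
remote_sub = make_client_stub(2, server_dispatch)
```

Even in this toy form, every call pays for marshaling, a message hand-off, dispatch, and unmarshaling, regardless of whether the caller and the server are on the same machine or how simple the arguments are.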
In addition to the large-grained protection model of distributed computing systems, small-kernel operating systems have adopted their control transfer and communication models: independent threads exchanging messages containing (potentially) large, structured values. In this paper, though, we show that most communication traffic in operating systems is (1) between domains on the same machine (cross-domain), rather than between domains located on separate machines (cross-machine), and (2) simple rather than complex. Cross-domain communication dominates because operating systems, even those supporting distribution, localize processing and resources to achieve acceptable performance at reasonable cost for the most common requests. Most communication is simple because complex data structures are concealed behind abstract system interfaces; communication tends to involve only handles to these structures and small value parameters (Booleans, integers, etc.).

Although the conventional message-based approach can serve the communication needs of both local and remote subsystems, it violates a basic tenet of system design by failing to isolate the common case [9]. A cross-domain procedure call can be considerably less complex than its cross-machine counterpart, yet conventional RPC systems have not fully exploited this fact. Instead, local communication is treated as an instance of remote communication, and simple operations are considered in the same class as complex ones. Because the conventional approach has high overhead, today's small-kernel operating systems have suffered from a loss in performance or a deficiency in structure or both. Usually structure suffers most; logically separate entities are packaged together into a single domain, increasing its size and complexity. Such
doi:10.1145/77648.77650 fatcat:5vs6ffqsinclrou67qlachavka