Stampede RT: Programming Abstractions for Live Streaming Applications

David Hilley, Umakishore Ramachandran
2007 27th International Conference on Distributed Computing Systems (ICDCS '07)  
We present Stampede RT, middleware designed to provide a natural programming model appropriate for live streaming applications. Such applications require pervasive access to multiple streaming data sources for distributed online analysis. One motivating example is a distributed robotics application which analyzes live camera feeds for control and planning. Most existing middlewares for streaming data focus on media streams and low-level transport characteristics such as delivery latency and efficient transfer, but do not define a programming model to succinctly express applications that manipulate and analyze the streaming content. Stampede RT provides for straightforward transport and manipulation of temporally-ordered data streams, enabling simple synchronization and correlation of data sources. We present an abstract programming model to support the aforementioned class of applications and then describe a concrete realization of the model as a distributed middleware architecture. We also evaluate our implementation of the architecture and present several motivating applications Stampede RT is designed to support.

... Section 5). Finally, we provide an experimental evaluation of our initial system prototype (Section 6).

Related Work

The development of more expressive and convenient programming models to ease the burden of distributed programming is a very common goal and there is an extremely large body of prior work in this area. Nonetheless, no existing system directly provides a programming model tailored for the natural expression of idioms common in the development of live streaming applications with appropriate performance.

Much work has gone into layering more convenient and structured abstractions on top of underlying unstructured transports. Remote Procedure Call [23] and Remote Method Invocation [24] are now nearly ubiquitous mechanisms to provide more structured network programming. Although very expressive and widely applicable, RPC/RMI are very general and typical implementations are unsuited for continuous bulk data transfers; the programming model of RPC makes it unnatural or impractical to exploit multicast for many kinds of interactions. Additionally, time-based manipulation of streaming data would have to be layered on top of a basic RPC system.

In many domains involving high-performance computing, message-passing systems are more common than RPC/RMI.
Systems like PVM [9] and MPI [12] provide more facilities for point-to-point messaging and various collective communication operations. Although significantly more convenient than raw transport-level operations and very general, message-passing systems like MPI and PVM are still fairly low-level; additionally, such systems have traditionally been narrowly targeted towards relatively static cluster-computing environments and may not handle failure or dynamism in a manner appropriate for more widely distributed environments. Various efforts have attempted to address related shortcomings: MPI-2 [18] addresses the issue of static participants by expanding the process model to allow runtime dynamism, and FT-MPI [10] ("Fault Tolerant MPI") attempts to address MPI's shortcomings with regard to failure tolerance.

Configurable communication systems like Isis and Horus [26] are often targeted at group communication, but are more appropriate for applications requiring heavyweight features such as group membership agreement or causal message ordering. CCL [3] also provides a number of powerful primitives for group communication. For our target class of applications, group communication is more appropriate than point-to-point messaging, but per-stream group broadcasts would still involve much redundant messaging because each item would be broadcast to each stream consumer, even those that may not need it. Additionally, many group communication systems are not designed to support a large number of groups or groups with quickly varying membership. Finally, the recognition of time as a first-class entity would still need to be layered on top of a group-based communication system.

Although they are not traditionally used for applications with high-volume communication requirements, tuplespace programming models like Linda [4] can provide a
doi:10.1109/icdcs.2007.140 dblp:conf/icdcs/HilleyR07 fatcat:vmlngxq4xjfj5b5qkfij26vjpm