SciStream

Joaquin Chung, Wojciech Zacherek, AJ Wisniewski, Zhengchun Liu, Tekin Bicer, Rajkumar Kettimuthu, Ian Foster
2022 Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing  
Modern scientific instruments, such as detectors at synchrotron light sources, generate data at such high rates that online processing is needed for data reduction, feature detection, experiment steering, and other purposes. The same high data rates also demand memory-to-memory streaming from instrument to remote computer, because local computational capacity is limited and data transmissions that engage the file system introduce unacceptable latencies. But efficient and secure memory-to-memory data streaming is challenging to realize in practice, because of a lack of direct external network connectivity for scientific instruments and because of authentication and security requirements. In response, we propose here SciStream, a middlebox-based architecture with control protocols to enable efficient and secure memory-to-memory data streaming between producers and consumers that lack direct network connectivity. We describe the protocols that SciStream uses to establish authenticated and transparent connections between producers and consumers, and we discuss the experiments that we have conducted to evaluate alternative implementation approaches for key SciStream components. Experiments on the Chameleon cloud show that SciStream improves the throughput of a streaming pipeline by an order of magnitude compared with state-of-the-art data transfer methods and adds only ∼4 μs latency compared with an ideal scenario in which producers and consumers have direct external connectivity.
doi:10.1145/3502181.3531475 fatcat:pbrg6mcx7zei3nv54vt5iqkh34