Multi-Way Windowed Streams θ-Joins Using Cluster

Xinchun Liu, Jing Li, Xiaopeng Fan, Jun Chen
2016 International Journal of Grid and Distributed Computing  
Recent years have witnessed increasing interest in data stream processing, in applications such as network monitoring, e-business, and advertising systems. Joins are used to explore the correlation among tuples from multiple streams. In this paper, we present a general method named Distributed Streams Join (DSJ) to process multi-way windowed stream θ-joins on a shared-nothing cluster. DSJ consists of a distribution method named Time-Slice Distribution Method (TDM) and a join method named ... sfer Join Method (TJM). Unlike previous work, DSJ can (1) process multi-way θ-joins under arbitrary predicates; (2) preserve the integrity of results and load balance while distributing tuples to different nodes for parallel joining; (3) carry out the join operation in a locally optimal order according to histograms maintained in real time. We have built DSJ on our own stream processing cluster to handle multi-way stream joins, and the experiments demonstrate that DSJ not only guarantees load balance among all the computing nodes but also effectively improves throughput.
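To make the term concrete, the following is a minimal single-node sketch of what a multi-way sliding-window θ-join computes. It is not DSJ, TDM, or TJM: it does no distribution, uses no histograms, and makes no attempt at join ordering. The class name `WindowedThetaJoin`, its parameters, and the assumption that tuples arrive in non-decreasing timestamp order are all hypothetical choices made for illustration.

```python
from collections import deque
from itertools import product


class WindowedThetaJoin:
    """Naive single-node multi-way sliding-window theta-join (illustrative only)."""

    def __init__(self, num_streams, window, predicate):
        self.window = window        # window length in time units (assumed)
        self.predicate = predicate  # arbitrary boolean function, one value per stream
        # One buffer of (timestamp, value) pairs per input stream.
        self.buffers = [deque() for _ in range(num_streams)]

    def insert(self, stream, ts, value):
        """Add a tuple to `stream` at time `ts` and return the new join results."""
        # Expire tuples that have fallen out of the window ending at ts.
        for buf in self.buffers:
            while buf and buf[0][0] <= ts - self.window:
                buf.popleft()
        # The new tuple stands for its own stream; the other streams
        # contribute everything still inside their windows.
        candidates = [
            [value] if i == stream else [v for _, v in buf]
            for i, buf in enumerate(self.buffers)
        ]
        # Brute-force cross product filtered by the theta predicate.
        results = [c for c in product(*candidates) if self.predicate(*c)]
        self.buffers[stream].append((ts, value))
        return results


# Three streams joined under the non-equality predicate a < b < c.
join = WindowedThetaJoin(3, window=10, predicate=lambda a, b, c: a < b < c)
join.insert(0, 1, 5)
join.insert(1, 2, 7)
print(join.insert(2, 3, 9))
```

The brute-force cross product is what makes arbitrary predicates easy to support here, and it is also why the cost grows quickly with the number of streams and the window size, which is the kind of cost that parallel distribution and join ordering aim to reduce.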
doi:10.14257/ijgdc.2016.9.2.10 fatcat:3v3s3udyenccnm7jd7yza6frhy