Truth Discovery in Data Streams

Zhou Zhao, James Cheng, Wilfred Ng
2014 Proceedings of the 23rd ACM International Conference on Information and Knowledge Management - CIKM '14
Truth discovery is the long-standing problem of assessing the validity of information from multiple data sources that may provide different and conflicting information. With the increasing prominence of data streams in a wide range of applications, such as weather forecasting and stock price prediction, effective techniques for truth discovery in data streams are in demand. However, existing work mainly focuses on truth discovery in static databases and is not applicable to applications involving streaming data. This motivates us to develop new techniques to tackle the problem of truth discovery in data streams. In this paper, we propose a probabilistic model that transforms the problem of truth discovery over data streams into a probabilistic inference problem. We first design a streaming algorithm that infers the truth as well as source quality in real time. Then, we develop a one-pass algorithm, in which the inference of source quality is proved to be convergent and the accuracy is further improved. Extensive experiments on real datasets verify both the efficiency and accuracy of our methods for truth discovery in data streams.
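
To make the problem setting concrete, the following is a minimal sketch of one-pass truth estimation over a stream of conflicting claims. It is not the probabilistic model or either algorithm from the paper, whose details the abstract does not give. It swaps in a simple reliability-weighted vote, and the class name `StreamingVoter`, the smoothing priors, and the toy weather stream are all assumptions made for illustration.

```python
from collections import defaultdict


class StreamingVoter:
    """Toy one-pass truth estimator (reliability-weighted voting).

    Illustrative only: this is NOT the probabilistic model from the paper.
    """

    def __init__(self, prior_correct=1.0, prior_total=2.0):
        # Hypothetical smoothing priors: every unseen source starts at 0.5.
        self.prior_correct = prior_correct
        self.prior_total = prior_total
        self.correct = defaultdict(float)  # times a source agreed with the estimate
        self.total = defaultdict(float)    # times a source made any claim

    def reliability(self, source):
        """Smoothed fraction of past claims that matched the estimated truth."""
        return (self.correct[source] + self.prior_correct) / (
            self.total[source] + self.prior_total
        )

    def observe(self, claims):
        """Process one timestamp: claims maps source -> claimed value.

        Returns the estimated truth, then updates source reliabilities.
        Each batch is seen exactly once and is not stored.
        """
        scores = defaultdict(float)
        for source, value in claims.items():
            scores[value] += self.reliability(source)
        # Sort candidates first so ties are broken deterministically.
        truth = max(sorted(scores, key=repr), key=lambda v: scores[v])
        for source, value in claims.items():
            self.total[source] += 1.0
            if value == truth:
                self.correct[source] += 1.0
        return truth


# Hypothetical stream: three sources reporting the weather at three timestamps.
voter = StreamingVoter()
stream = [
    {"a": "sunny", "b": "sunny", "c": "rain"},
    {"a": "rain", "b": "rain", "c": "sunny"},
    {"a": "cloudy", "c": "rain"},
]
estimates = [voter.observe(claims) for claims in stream]
```

At the third timestamp the two remaining sources disagree one to one, and the estimate follows source `a` only because its accumulated reliability is higher than that of `c`. This is the basic interplay between inferring the truth and inferring source quality that the abstract describes.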
doi:10.1145/2661829.2661892 dblp:conf/cikm/ZhaoCN14 fatcat:5km7ftat6ffyroubhlgvfyzmee