A Bayesian Approach to Discovering Truth from Conflicting Sources for Data Integration [article]

Bo Zhao, Benjamin I. P. Rubinstein, Jim Gemmell, Jiawei Han
2012 arXiv   pre-print
Consequently, a major challenge for data integration is to derive the most complete and accurate integrated records from diverse and sometimes conflicting sources.  ...  In practical data integration systems, it is common for the data sources being integrated to provide conflicting information about the same entity.  ...  ACKNOWLEDGEMENTS We thank Ashok Chandra, Duo Zhang, Sahand Negahban and three anonymous reviewers for their valuable comments. The work was supported in part by the U.S.  ... 
arXiv:1203.0058v1 fatcat:3pllovnmlzaijdnt2vqjmzf5nu

A Bayesian approach to discovering truth from conflicting sources for data integration

Bo Zhao, Benjamin I. P. Rubinstein, Jim Gemmell, Jiawei Han
2012 Proceedings of the VLDB Endowment  
Consequently, a major challenge for data integration is to derive the most complete and accurate integrated records from diverse and sometimes conflicting sources.  ...  In practical data integration systems, it is common for the data sources being integrated to provide conflicting information about the same entity.  ...  ACKNOWLEDGEMENTS We thank Ashok Chandra, Duo Zhang, Sahand Negahban and three anonymous reviewers for their valuable comments. The work was supported in part by the U.S.  ... 
doi:10.14778/2168651.2168656 fatcat:z376bflf4za3vaesddqndmweuq

An Integrated Bayesian Approach for Effective Multi-Truth Discovery

Xianzhi Wang, Quan Z. Sheng, Xiu Susie Fang, Lina Yao, Xiaofei Xu, Xue Li
2015 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management - CIKM '15  
Truth-finding is the fundamental technique for corroborating reports from multiple sources in both data integration and collective intelligent applications.  ...  Based on this insight, we propose an integrated Bayesian approach to the multi-truth-finding problem, by taking these features into account.  ...  Distinguishing from previous approaches, our approach features an integrated Bayesian model based on a reformulated MTF model.  ... 
doi:10.1145/2806416.2806443 dblp:conf/cikm/WangSFYXL15 fatcat:52hhimrek5cz3dght64sm5tga4

Sailing the Information Ocean with Awareness of Currents: Discovery and Application of Source Dependence [article]

Laure Berti-Equille, Anish Das Sarma, Amelie Marian
2009 arXiv   pre-print
We also discuss how this knowledge can benefit a variety of technologies, such as data integration and Web 2.0, that help users manage and access the totality of the available information from various  ...  Given the huge number of data sources and the vast volume of conflicting data available on the Web, doing so in a scalable manner is extremely challenging and has not been addressed by existing work yet  ...  is an underlying true value and one can seek to discover the truth from amongst the conflicting values.  ... 
arXiv:0909.1776v1 fatcat:an4xfadwsrcdxa72lvj4jervda

Truth Discovery via Exploiting Implications from Multi-Source Data

Xianzhi Wang, Quan Z. Sheng, Lina Yao, Xue Li, Xiu Susie Fang, Xiaofei Xu, Boualem Benatallah
2016 Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16  
by discovering the truth, which conforms to the reality, from the multi-source data.  ...  Data veracity is a grand challenge for various tasks on the Web.  ...  It therefore becomes important to discover the truth from the multi-source data to resolve the conflicts.  ... 
doi:10.1145/2983323.2983791 dblp:conf/cikm/WangSYLFXB16 fatcat:2fenc6a7bjcgdefoejt4bjmqai

VERA

Mouhamadou Lamine Ba, Laure Berti-Equille, Kushal Shah, Hossam M. Hammady
2016 Proceedings of the 25th International Conference Companion on World Wide Web - WWW '16 Companion  
VERA will be demonstrated through several real-world scenarios to show its potential value for fact-checking from Web data.  ...  Given a user query, VERA systematically extracts entities and relations from Web content, structures them as claims relevant to the query and gathers more conflicting/corroborating information.  ...  Recent approaches have been developed to discover true values extracted from textual content in a large corpus of Web sources using various information extractors [5, 14] .  ... 
doi:10.1145/2872518.2890536 dblp:conf/www/BaBSH16 fatcat:bgdidkh5urb2zcp7zpudhwd5mi

A Survey on Truth Discovery [article]

Yaliang Li, Jing Gao, Chuishi Meng, Qi Li, Lu Su, Bo Zhao, Wei Fan, Jiawei Han
2015 arXiv   pre-print
Thanks to information explosion, data for the objects of interest can be collected from increasingly more sources.  ...  To tackle this challenge, truth discovery, which integrates multi-source noisy information by estimating the reliability of each source, has emerged as a hot topic.  ...  One straightforward approach to eliminate conflicts among multi-source data is to conduct majority voting or averaging.  ... 
arXiv:1505.02463v2 fatcat:sqvfxldfqjbtlexi5gqaldtgqq

Influence-Aware Truth Discovery

Hengtong Zhang, Qi Li, Fenglong Ma, Houping Xiao, Yaliang Li, Jing Gao, Lu Su
2016 Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16  
In the Big Data era, truth discovery has served as a promising technique to solve conflicts in the facts provided by numerous data sources.  ...  We propose an integrated Bayesian approach to incorporate the domain expertise of data sources and confidence scores of value sets, aiming to find multiple possible truths without any supervision.  ...  AccuSim [1] applies Bayesian analysis to iteratively detect dependence between sources and discover the truth from conflicting information.  ... 
doi:10.1145/2983323.2983785 dblp:conf/cikm/ZhangLMXLGS16 fatcat:riu3y4eb2jhwfffz6a5djoqcyq

Experiment Design Frameworks for Accelerated Discovery of Targeted Materials Across Scales

Anjana Talapatra, Shahin Boluki, Pejman Honarmandi, Alexandros Solomou, Guang Zhao, Seyede Fatemeh Ghoreishi, Abhilash Molkeri, Douglas Allaire, Ankit Srivastava, Xiaoning Qian, Edward R. Dougherty, Dimitris C. Lagoudas (+1 others)
2019 Frontiers in Materials  
Over the last decade, there has been a paradigm shift away from labor-intensive and time-consuming materials discovery methods, and materials exploration through informatics approaches is gaining traction  ...  Such approaches, however, do not account for the practicalities of resource constraints which eventually result in bottlenecks at various stages of the workflow.  ...  ACKNOWLEDGMENTS Calculations were carried out in the Texas A&M Supercomputing Facility.  ... 
doi:10.3389/fmats.2019.00082 fatcat:jmpom5i5vjbqbc25apmrufi6nm

Automatically building probabilistic databases from the web

Lorenzo Blanco, Mirko Bronzi, Valter Crescenzi, Paolo Merialdo, Paolo Papotti
2011 Proceedings of the 20th international conference companion on World wide web - WWW '11  
, extracts and integrates the published data, and finally performs a probabilistic analysis to characterize the impreciseness of the data and the accuracy of the sources.  ...  There is a great chance to create applications that rely on a huge amount of data taken from the Web.  ...  entity instances; (ii) extract data from these pages; (iii) integrate the extracted data in a mediated schema; (iv) analyze the integrated data according to a probabilistic model and characterize the  ... 
doi:10.1145/1963192.1963285 dblp:conf/www/BlancoBCMP11 fatcat:jtlpz53wjraijh7mkl6lwqpxia

A novel method for data conflict resolution using multiple rules

Zhang Yong-Xin, Li Qing-Zhong, Peng Zhao-Hui
2013 Computer Science and Information Systems  
Our approach divides attributes according to their degree of conflict, then resolves data conflicts in the following two steps: (1) For the weakly conflicting attributes, we exploit a few common rules to  ...  Experimental results using a large amount of real-world data collected from two domains show that the proposed approach can significantly improve the accuracy of data conflict resolution.  ...  To provide high-quality data to users, it is essential for a data integration system to resolve data conflicts and distinguish the true values from false ones.  ... 
doi:10.2298/csis110613005y fatcat:o5rjglo4tjatfairhzzscgd7ea

Truth Discovery in Data Streams

Zhou Zhao, James Cheng, Wilfred Ng
2014 Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management - CIKM '14  
Truth discovery is a long-standing problem for assessing the validity of information from various data sources that may provide different and conflicting information.  ...  This motivates us to develop new techniques to tackle the problem of truth discovery in data streams.  ...  At each time t, the system collects a set of conflicting values for entity i as V t i = {v1, v2, . . . , v k } from multiple data sources.  ... 
doi:10.1145/2661829.2661892 dblp:conf/cikm/ZhaoCN14 fatcat:5km7ftat6ffyroubhlgvfyzmee

Information Fusion for Multi-Source Material Data: Progress and Challenges

Zhou, Hong, Jin
2019 Applied Sciences  
The integration and fusion of material data can offer a unified framework for material data representation, processing, storage and mining, which can further help to accomplish many tasks, including material  ...  The development of material science in the manufacturing industry has resulted in a huge amount of material data, which are often from different sources and vary in data format and semantics.  ...  Acknowledgments: We would like to thank the editors and anonymous reviewers for their suggestions and comments to improve the quality of the paper.  ... 
doi:10.3390/app9173473 fatcat:yq25vikkqfdflp36bqx5o3l6wu

Truth Discovery from Conflicting Multi-Valued Objects

Xiu Susie Fang
2017 Proceedings of the 26th International Conference on World Wide Web Companion - WWW '17 Companion  
Truth discovery is a fundamental research topic, which aims at identifying the true value(s) of objects of interest given the conflicting multi-sourced data.  ...  We also present a general approach, which utilizes Markov chain models with Bayesian inference, for comparing the existing truth discovery methods and validate our approach without ground truth.  ...  Formally, given a set of multi-valued objects (O), conflicting values V can be collected from a set of sources (S).  ... 
doi:10.1145/3041021.3053374 dblp:conf/www/Fang17 fatcat:gohttomo65bujlzawjdgelckwi

Resolving conflicts in heterogeneous data by truth discovery and source reliability estimation

Qi Li, Yaliang Li, Jing Gao, Bo Zhao, Wei Fan, Jiawei Han
2014 Proceedings of the 2014 ACM SIGMOD international conference on Management of data - SIGMOD '14  
In many applications, one can obtain descriptions about the same objects or events from a variety of sources. As a result, this will inevitably lead to data or information conflicts.  ...  One important problem is to identify the true information (i.e., the truths) among conflicting sources of data.  ...  ACKNOWLEDGMENTS We would like to thank the anonymous reviewers for their valuable comments and suggestions, which help us tremendously in improving the quality of the paper.  ... 
doi:10.1145/2588555.2610509 dblp:conf/sigmod/LiLGZFH14 fatcat:ouecjui24jeo5je5klxmloyzt4