A Bayesian Approach to Discovering Truth from Conflicting Sources for Data Integration
[article]
2012
arXiv
pre-print
Consequently, a major challenge for data integration is to derive the most complete and accurate integrated records from diverse and sometimes conflicting sources. ...
In practical data integration systems, it is common for the data sources being integrated to provide conflicting information about the same entity. ...
ACKNOWLEDGEMENTS We thank Ashok Chandra, Duo Zhang, Sahand Negahban and three anonymous reviewers for their valuable comments. The work was supported in part by the U.S. ...
arXiv:1203.0058v1
fatcat:3pllovnmlzaijdnt2vqjmzf5nu
A Bayesian approach to discovering truth from conflicting sources for data integration
2012
Proceedings of the VLDB Endowment
Consequently, a major challenge for data integration is to derive the most complete and accurate integrated records from diverse and sometimes conflicting sources. ...
In practical data integration systems, it is common for the data sources being integrated to provide conflicting information about the same entity. ...
ACKNOWLEDGEMENTS We thank Ashok Chandra, Duo Zhang, Sahand Negahban and three anonymous reviewers for their valuable comments. The work was supported in part by the U.S. ...
doi:10.14778/2168651.2168656
fatcat:z376bflf4za3vaesddqndmweuq
An Integrated Bayesian Approach for Effective Multi-Truth Discovery
2015
Proceedings of the 24th ACM International on Conference on Information and Knowledge Management - CIKM '15
Truth-finding is the fundamental technique for corroborating reports from multiple sources in both data integration and collective intelligent applications. ...
Based on this insight, we propose an integrated Bayesian approach to the multi-truth-finding problem, by taking these features into account. ...
Distinguishing from previous approaches, our approach features an integrated Bayesian model based on a reformulated MTF model. ...
doi:10.1145/2806416.2806443
dblp:conf/cikm/WangSFYXL15
fatcat:52hhimrek5cz3dght64sm5tga4
Sailing the Information Ocean with Awareness of Currents: Discovery and Application of Source Dependence
[article]
2009
arXiv
pre-print
We also discuss how this knowledge can benefit a variety of technologies, such as data integration and Web 2.0, that help users manage and access the totality of the available information from various ...
Given the huge number of data sources and the vast volume of conflicting data available on the Web, doing so in a scalable manner is extremely challenging and has not been addressed by existing work yet ...
is an underlying true value and one can seek to discover the truth from amongst the conflicting values. ...
arXiv:0909.1776v1
fatcat:an4xfadwsrcdxa72lvj4jervda
Truth Discovery via Exploiting Implications from Multi-Source Data
2016
Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16
by discovering the truth, which conforms to the reality, from the multi-source data. ...
Data veracity is a grand challenge for various tasks on the Web. ...
It therefore becomes important to discover the truth from the multi-source data to resolve the conflicts. ...
doi:10.1145/2983323.2983791
dblp:conf/cikm/WangSYLFXB16
fatcat:2fenc6a7bjcgdefoejt4bjmqai
VERA
2016
Proceedings of the 25th International Conference Companion on World Wide Web - WWW '16 Companion
VERA will be demonstrated through several real-world scenarios to show its potential value for fact-checking from Web data. ...
Given a user query, VERA systematically extracts entities and relations from Web content, structures them as claims relevant to the query and gathers more conflicting/corroborating information. ...
Recent approaches have been developed to discover true values extracted from textual content in a large corpus of Web sources using various information extractors [5, 14]. ...
doi:10.1145/2872518.2890536
dblp:conf/www/BaBSH16
fatcat:bgdidkh5urb2zcp7zpudhwd5mi
A Survey on Truth Discovery
[article]
2015
arXiv
pre-print
Thanks to information explosion, data for the objects of interest can be collected from increasingly more sources. ...
To tackle this challenge, truth discovery, which integrates multi-source noisy information by estimating the reliability of each source, has emerged as a hot topic. ...
One straightforward approach to eliminate conflicts among multi-source data is to conduct majority voting or averaging. ...
arXiv:1505.02463v2
fatcat:sqvfxldfqjbtlexi5gqaldtgqq
Influence-Aware Truth Discovery
2016
Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16
In the Big Data era, truth discovery has served as a promising technique to solve conflicts in the facts provided by numerous data sources. ...
We propose an integrated Bayesian approach to incorporate the domain expertise of data sources and confidence scores of value sets, aiming to find multiple possible truths without any supervision. ...
AccuSim [1] applies Bayesian analysis to iteratively detect dependence between sources and discover the truth from conflicting information. ...
doi:10.1145/2983323.2983785
dblp:conf/cikm/ZhangLMXLGS16
fatcat:riu3y4eb2jhwfffz6a5djoqcyq
Experiment Design Frameworks for Accelerated Discovery of Targeted Materials Across Scales
2019
Frontiers in Materials
Over the last decade, there has been a paradigm shift away from labor-intensive and time-consuming materials discovery methods, and materials exploration through informatics approaches is gaining traction ...
Such approaches, however, do not account for the practicalities of resource constraints which eventually result in bottlenecks at various stages of the workflow. ...
ACKNOWLEDGMENTS Calculations were carried out in the Texas A&M Supercomputing Facility. ...
doi:10.3389/fmats.2019.00082
fatcat:jmpom5i5vjbqbc25apmrufi6nm
Automatically building probabilistic databases from the web
2011
Proceedings of the 20th international conference companion on World wide web - WWW '11
, extracts and integrates the published data, and finally performs a probabilistic analysis to characterize the impreciseness of the data and the accuracy of the sources. ...
There is a great chance to create applications that rely on a huge amount of data taken from the Web. ...
entity instances; (ii) extract data from these pages; (iii) integrate the extracted data in a mediated schema; (iv) analyze the integrated data according to a probabilistic model and characterize the ...
doi:10.1145/1963192.1963285
dblp:conf/www/BlancoBCMP11
fatcat:jtlpz53wjraijh7mkl6lwqpxia
A novel method for data conflict resolution using multiple rules
2013
Computer Science and Information Systems
Our approach can divide attributes according to their conflict degree, and then resolves data conflicts in the following two steps: (1) For the weak conflicting attributes, we exploit a few common rules to ...
Experimental results using a large number of real-world data collected from two domains show that the proposed approach can significantly improve the accuracy of data conflict resolution. ...
To provide high-quality data to users, it is essential for a data integration system to resolve data conflicts and discover the true values from false ones. ...
doi:10.2298/csis110613005y
fatcat:o5rjglo4tjatfairhzzscgd7ea
Truth Discovery in Data Streams
2014
Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management - CIKM '14
Truth discovery is a long-standing problem for assessing the validity of information from various data sources that may provide different and conflicting information. ...
This motivates us to develop new techniques to tackle the problem of truth discovery in data streams. ...
At each time $t$, the system collects a set of conflicting values for entity $i$ as $V_i^t = \{v_1, v_2, \ldots, v_k\}$ from multiple data sources. ...
doi:10.1145/2661829.2661892
dblp:conf/cikm/ZhaoCN14
fatcat:5km7ftat6ffyroubhlgvfyzmee
Information Fusion for Multi-Source Material Data: Progress and Challenges
2019
Applied Sciences
The integration and fusion of material data can offer a unified framework for material data representation, processing, storage and mining, which can further help to accomplish many tasks, including material ...
The development of material science in the manufacturing industry has resulted in a huge amount of material data, which are often from different sources and vary in data format and semantics. ...
Acknowledgments: We would like to thank the editors and anonymous reviewers for their suggestions and comments to improve the quality of the paper. ...
doi:10.3390/app9173473
fatcat:yq25vikkqfdflp36bqx5o3l6wu
Truth Discovery from Conflicting Multi-Valued Objects
2017
Proceedings of the 26th International Conference on World Wide Web Companion - WWW '17 Companion
Truth discovery is a fundamental research topic, which aims at identifying the true value(s) of objects of interest given the conflicting multi-sourced data. ...
We also present a general approach, which utilizes Markov chain models with Bayesian inference, for comparing the existing truth discovery methods and validate our approach without ground truth. ...
Formally, given a set of multi-valued objects (O), conflicting values V can be collected from a set of sources (S). ...
doi:10.1145/3041021.3053374
dblp:conf/www/Fang17
fatcat:gohttomo65bujlzawjdgelckwi
Resolving conflicts in heterogeneous data by truth discovery and source reliability estimation
2014
Proceedings of the 2014 ACM SIGMOD international conference on Management of data - SIGMOD '14
In many applications, one can obtain descriptions about the same objects or events from a variety of sources. As a result, this will inevitably lead to data or information conflicts. ...
One important problem is to identify the true information (i.e., the truths) among conflicting sources of data. ...
ACKNOWLEDGMENTS We would like to thank the anonymous reviewers for their valuable comments and suggestions, which help us tremendously in improving the quality of the paper. ...
doi:10.1145/2588555.2610509
dblp:conf/sigmod/LiLGZFH14
fatcat:ouecjui24jeo5je5klxmloyzt4