Managing Probabilistic Data with MystiQ: The Can-Do, the Could-Do, and the Can't-Do [chapter]

Christopher Re, Dan Suciu
<span title="">2008</span> <i title="Springer Berlin Heidelberg"> <a target="_blank" rel="noopener" href="" style="color: black;">Lecture Notes in Computer Science</a> </i> &nbsp;
Introduction

MystiQ is a system that allows users to define a probabilistic database and then evaluate SQL queries over it. MystiQ is middleware: the data itself is stored in a standard relational database system, and MystiQ provides the probabilistic semantics. The advantage of middleware over a reimplementation from scratch is that it can leverage the infrastructure of an existing database engine, e.g. indexes, query evaluation, and query optimization. Furthermore, MystiQ attempts to perform most of the probabilistic inference inside the relational database engine. MystiQ is currently available from [...].

The MystiQ system resulted from research on probabilistic databases at the University of Washington [8, 11, 10, 13, 23, 14]. Some of these research results have been fully incorporated in MystiQ, such as the query evaluation techniques that allow it to evaluate SELECT-FROM-WHERE-GROUPBY queries over large probabilistic databases: this is what MystiQ can do. Other results are not implemented in the system, but they could either be implemented in a future version after only minor extensions, or be used even today by a database administrator to perform certain data management tasks manually; one example is the set of techniques for representing materialized views over probabilistic data. This is what MystiQ could do. Finally, other research results require more work before they can be implemented in a system. For example, our evaluation techniques for queries with a HAVING clause apply only to safe queries; as another example, we currently do not know of a good approach for extending safe queries and safe plans to queries with self-joins. This is what MystiQ can't do. In this paper we give a gentle introduction to the MystiQ system and describe the associated research that is used, could be used, or is not yet ready to be used in MystiQ.

Related Work

Our research has focused primarily on SQL query evaluation and on views. Other groups have studied different aspects of probabilistic or incomplete databases. The Trio project [29, 7, 6, 15] focused on the study of lineage in incomplete databases. The MayBMS project has focused on representation problems, query language design, and query evaluation [3, 4, 2, 20]. Other [...]
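To make the middleware idea from the introduction concrete, the following is a minimal sketch of pushing probabilistic inference into a relational engine. It is not MystiQ's interface: the table `sighting`, its probability column `p`, the aggregate name `ior`, and the function `answer_probabilities` are all hypothetical, and the sketch assumes every tuple is an independent event. Under that assumption, the probability that a value appears in the answer of a duplicate-eliminating projection is 1 minus the product of (1 - p) over the tuples that produce it, which the engine can compute as an aggregate during a GROUP BY.

```python
import sqlite3


class IndependentOr:
    """SQLite aggregate computing 1 - prod(1 - p) over a group.

    Assumption for illustration: the tuples in a group are
    independent probabilistic events.
    """

    def __init__(self):
        # Probability that none of the tuples seen so far is present.
        self.none_present = 1.0

    def step(self, p):
        self.none_present *= 1.0 - p

    def finalize(self):
        return 1.0 - self.none_present


def answer_probabilities(rows):
    """Return {species: probability that the species appears in the answer}.

    `rows` is an iterable of (species, site, p) tuples; the schema is
    hypothetical and exists only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    # Register the aggregate so the inference runs inside the engine.
    conn.create_aggregate("ior", 1, IndependentOr)
    conn.execute("CREATE TABLE sighting (species TEXT, site TEXT, p REAL)")
    conn.executemany("INSERT INTO sighting VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT species, ior(p) FROM sighting GROUP BY species"
    ).fetchall()
    conn.close()
    return dict(result)
```

Calling `answer_probabilities` with two independent sightings of the same species, each with probability 0.5, gives that species an answer probability of 0.75. The design point is that the probability computation is expressed as an ordinary aggregate, so the engine's grouping and indexing machinery does the work.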
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="">doi:10.1007/978-3-540-87993-0_3</a> <a target="_blank" rel="external noopener" href="">fatcat:h4xoijkrvjfrlilkbfth5nictm</a> </span>
<a target="_blank" rel="noopener" href="" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href=""> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> </button> </a>