In this paper, we take a more formal approach and propose the use of information algebra as a general theory to describe structured data sets and data cleaning. ... Existing data cleaning tools have focused on cleaning the errors at hand. ... In Proposition 3, we described how association rules and association functions can be applied towards cleaning dirty data. ...doi:10.1016/j.procs.2013.09.009 fatcat:ykop6amz3vc3xc3u7jc6pq7ogy
We eliminate the transform-and-load cost using in-situ query processing approaches which adapt to any data format and facilitate querying diverse datasets. ... To accurately execute queries over the transformed data, users have to remove any inconsistencies by applying cleaning operations. ... Then, the resulting monoid comprehension is rewritten to an algebraic plan of the nested relational algebra. ...dblp:journals/debu/OlmaGKA19 fatcat:qsxdoltagbcudect3m23rw54uu
The aim of this article is to present an overview of the major XML warehousing approaches from the literature, as well as the existing approaches for performing OLAP analyses over XML data (which is termed ... algebra. ... As a first step toward an XOLAP platform, we initiated a previously inexistent formal framework in the XML context by demonstrating how the TAX Tree Algebra for XML (Jagadish et al., 2001) could support ...arXiv:1701.08612v1 fatcat:ne65xezff5hltdo2sbdwlr6lay
We report here on an implementation of this approach and its application to the procaryotes. ... The algebraic approach to metabolic networks is suitable to study metabolic innovations in two sets of organisms, free living microbes and Pyrococci, as well as obligate intracellular pathogens. ... We illustrate our approach using pathogenic procaryotes as an example. ...doi:10.1186/1471-2105-7-67 pmid:16478540 pmcid:PMC1475643 fatcat:bwmztrkwx5hcbggrmysgdtn6dq
Label propagation aggressively gathers fact candidates, and an Integer Linear Program is used to clean out false hypotheses that violate temporal constraints. ... Acknowledgements This work is supported by the 7th Framework IST programme of the European Union through the focused research project (STREP) on Longitudinal Analytics of Web Archive data (LAWA) under ... An approach toward fact extraction based on coupled semi-supervised learning for information extraction (IE) is NELL (Carlson et al., 2010b). ...dblp:conf/acl/WangDSW12 fatcat:h6xkx4onx5b7rgqonyhp7y2jxm
V. 83b:68010 Toward a theory of loop cleaning. Programmirovanie 1980, no. 5, 3-16, 95 ( Russian); translated as Programming and Comput. Software 6 (1980), no. 5, 223-235 (1981). ... Pair, C. 83b:68013 Abstract data types and algebraic semantics of programming languages. Theoret. Comput. Sci. 18 (1982), no. 1, 1-31. ...
One approach to noisy speech recognition is to automatically remove the noise from the cepstrum sequence before feeding it in to a clean speech recognizer. ... In this paper, we show how the noise model can be learned even when the data contains speech. ... An additional benefit to this approach is that channel distortion is an additive effect in the log-spectrum domain. ...dblp:conf/nips/FreyKDA01 fatcat:bxpbknd3ufgtzpxp5k2whnkumm
Our tutorial gives an overview of progress the database community has made towards meeting this challenge. ... In particular, we start by discussing design requirements in building an enterprise IE system. ... as a Service: Extract and clean useful information hidden in publicly available documents, creating a valuable collection of structured data that can be rented or shared over the Internet. • Business ...doi:10.1145/1807167.1807339 dblp:conf/sigmod/ChiticariuLRR10 fatcat:6blwd4zdebdabhuzyhuhendfey
SA is a form of text mining that helps us to understand the attitude and behavior of a customer towards a product/service. ... Vector space model or term vector model is an algebraic model for representing and filtering the text documents. ... Text cleaning is removing unwanted data. ...doi:10.35940/ijitee.d1409.029420 fatcat:wg6vs4cdcba73oxdi2ixueqjqm
past editions of both VLDB and SIGMOD, and a special issue on Towards Quality Data with Fusion and Cleaning in the IEEE Internet Computing. ... models and algebra • Quality of linked data • Cleaning extremely large data sets • Data quality on the Web • Privacy-preserving data quality • Data quality benchmarks • Data quality on novel data management ... models and algebra QDB HISTORY Data and information quality has become an increasingly important and interesting topic for the database community. ...doi:10.1145/2430456.2430472 fatcat:yhzijillhfe35pwpi7jtdhcoje
Lecture Notes in Computer Science
Our core contribution is that we show how extending fact oriented modeling languages with the single concept of algebraic data types leads to a natural and straightforward modeling of complex information ... Notice that such a property cannot be expressed with an algebraic data type, but should be implemented by access functions of the data type. ... Algebraic Data Types We briefly introduce algebraic data types. In the next section we will show how algebraic data types can be part of FOM models. ...doi:10.1007/11915072_20 fatcat:nnkwav37gzhmrjbezin5suze54
Abstract data types: Stéphane Kaplan and Amir Pnueli, Specification and implementation of concurrently accessed data structures: an abstract data type approach (pp. 220-244); Christoph Beierle and Angelika ... Lescanne, Transformation ordering (pp. 69-80); Martin Gogolla, On parametric algebraic specifications with clean error handling (pp. 81-95); Donald Sannella and Andrzej Tarlecki, Toward formal development ...
This property is obtained through the use of the ML invariance property and leads to an approach to developing a classifier when training has been mislabeled: namely train the classifier on noisy data ... We investigate the problem of machine learning with mislabeled training data. ... Alternatively the noisy model can be transformed into the clean model and decisions made with a clean model using a decision threshold of 1/2, an approach that we consider briefly in this paper. ...arXiv:1909.09136v1 fatcat:oftptf64kjcs5isslfqqao7dsu
Physical review B
In this article, we consider the simplest problem to which the scratched-XY model relates: a single weak link in an otherwise clean system, with an intensity J_W which decreases algebraically with the ... cut for K<K_c, with an adjustable K_c=1/(1-α) depending on α. ... The model consists of a weak link whose strength decreases algebraically with the system size J_W ∼ L^{-α}, in an otherwise clean system. ...doi:10.1103/physrevb.99.054519 fatcat:6oaqfrvsnrda5bexn6oromo5s4
Next, using Lie algebra, we map the homography matrices to an intermediate vector space that preserves the intrinsic geometric structure of the transformation. ... To extract meaningful features from time-series, we propose an efficient linear dynamical system based technique. ... A schematic diagram showing the various processes involved in our proposed approach towards classification of a typical shot. ...doi:10.1109/tmm.2014.2300833 fatcat:qqqmd5ovhfh35gobqaovuxnwfa