24,252 Hits in 5.0 sec

An Algebraic Approach Towards Data Cleaning

Ridha Khedri, Fei Chiang, Khair Eddin Sabri
2013 Procedia Computer Science  
In this paper, we take a more formal approach and propose the use of information algebra as a general theory to describe structured data sets and data cleaning.  ...  Existing data cleaning tools have focused on cleaning the errors at hand.  ...  In Proposition 3, we described how association rules and association functions can be applied towards cleaning dirty data.  ... 
doi:10.1016/j.procs.2013.09.009 fatcat:ykop6amz3vc3xc3u7jc6pq7ogy

Toward Intelligent Query Engines

Matthaios Olma, Stella Giannakopoulou, Manos Karpathiotakis, Anastasia Ailamaki
2019 IEEE Data Engineering Bulletin  
We eliminate the transform-and-load cost using in-situ query processing approaches which adapt to any data format and facilitate querying diverse datasets.  ...  To accurately execute queries over the transformed data, users have to remove any inconsistencies by applying cleaning operations.  ...  Then, the resulting monoid comprehension is rewritten to an algebraic plan of the nested relational algebra [16].  ... 
dblp:journals/debu/OlmaGKA19 fatcat:qsxdoltagbcudect3m23rw54uu

XML Warehousing and OLAP [article]

Hadj Mahboubi, Jérôme Darmont
2017 arXiv   pre-print
The aim of this article is to present an overview of the major XML warehousing approaches from the literature, as well as the existing approaches for performing OLAP analyses over XML data (which is termed  ...  algebra.  ...  As a first step toward an XOLAP platform, we initiated a previously inexistent formal framework in the XML context by demonstrating how the TAX Tree Algebra for XML (Jagadish et al., 2001) could support  ... 
arXiv:1701.08612v1 fatcat:ne65xezff5hltdo2sbdwlr6lay

Algebraic comparison of metabolic networks, phylogenetic inference, and metabolic innovation

Christian V Forst, Christoph Flamm, Ivo L Hofacker, Peter F Stadler
2006 BMC Bioinformatics  
We report here on an implementation of this approach and its application to the procaryotes.  ...  The algebraic approach to metabolic networks is suitable to study metabolic innovations in two sets of organisms, free living microbes and Pyrococci, as well as obligate intracellular pathogens.  ...  We illustrate our approach using pathogenic procaryotes as an example.  ... 
doi:10.1186/1471-2105-7-67 pmid:16478540 pmcid:PMC1475643 fatcat:bwmztrkwx5hcbggrmysgdtn6dq

Coupling Label Propagation and Constraints for Temporal Fact Extraction

Yafang Wang, Maximilian Dylla, Marc Spaniol, Gerhard Weikum
2012 Annual Meeting of the Association for Computational Linguistics  
Label propagation aggressively gathers fact candidates, and an Integer Linear Program is used to clean out false hypotheses that violate temporal constraints.  ...  Acknowledgements This work is supported by the 7th Framework IST programme of the European Union through the focused research project (STREP) on Longitudinal Analytics of Web Archive data (LAWA) under  ...  An approach toward fact extraction based on coupled semi-supervised learning for information extraction (IE) is NELL (Carlson et al., 2010b).  ... 
dblp:conf/acl/WangDSW12 fatcat:h6xkx4onx5b7rgqonyhp7y2jxm

Page 753 of Mathematical Reviews Vol. , Issue 83b [page]

1983 Mathematical Reviews  
V. 83b:68010 Toward a theory of loop cleaning. Programmirovanie 1980, no. 5, 3-16, 95 (Russian); translated as Programming and Comput. Software 6 (1980), no. 5, 223-235 (1981).  ...  Pair, C. 83b:68013 Abstract data types and algebraic semantics of programming languages. Theoret. Comput. Sci. 18 (1982), no. 1, 1-31.  ... 

ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition

Brendan J. Frey, Trausti T. Kristjansson, Li Deng, Alex Acero
2001 Neural Information Processing Systems  
One approach to noisy speech recognition is to automatically remove the noise from the cepstrum sequence before feeding it in to a clean speech recognizer.  ...  In this paper, we show how the noise model can be learned even when the data contains speech.  ...  An additional benefit to this approach is that channel distortion is an additive effect in the log-spectrum domain.  ... 
dblp:conf/nips/FreyKDA01 fatcat:bxpbknd3ufgtzpxp5k2whnkumm

Enterprise information extraction

Laura Chiticariu, Yunyao Li, Sriram Raghavan, Frederick R. Reiss
2010 Proceedings of the 2010 international conference on Management of data - SIGMOD '10  
Our tutorial gives an overview of progress the database community has made towards meeting this challenge.  ...  In particular, we start by discussing design requirements in building an enterprise IE system.  ...  as a Service: Extract and clean useful information hidden in publicly available documents, creating a valuable collection of structured data that can be rented or shared over the Internet. • Business  ... 
doi:10.1145/1807167.1807339 dblp:conf/sigmod/ChiticariuLRR10 fatcat:6blwd4zdebdabhuzyhuhendfey

An Optimal Aggregation of Product Data using Vector Space Model

2020 VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE  
SA is a form of text mining that helps us to understand the attitude and behavior of a customer towards a product/service.  ...  Vector space model or term vector model is an algebraic model for representing and filtering the text documents.  ...  Text cleaning is removing unwanted data.  ... 
doi:10.35940/ijitee.d1409.029420 fatcat:wg6vs4cdcba73oxdi2ixueqjqm

10th international workshop on quality in databases

Xin Luna Dong, Eduard Constantin Dragut
2013 SIGMOD record  
past editions of both VLDB and SIGMOD, and a special issue on Towards Quality Data with Fusion and Cleaning in the IEEE Internet Computing.  ...  models and algebra • Quality of linked data • Cleaning extremely large data sets • Data quality on the Web • Privacy-preserving data quality • Data quality benchmarks • Data quality on novel data management  ...  models and algebra QDB HISTORY Data and information quality has become an increasingly important and interesting topic for the database community.  ... 
doi:10.1145/2430456.2430472 fatcat:yhzijillhfe35pwpi7jtdhcoje

Fact-Oriented Modeling from a Programming Language Designer's Perspective [chapter]

Betsy Pepels, Rinus Plasmeijer, H. A. (Erik) Proper
2006 Lecture Notes in Computer Science  
Our core contribution is that we show how extending fact oriented modeling languages with the single concept of algebraic data types leads to a natural and straightforward modeling of complex information  ...  Notice that such a property cannot be expressed with an algebraic data type, but should be implemented by access functions of the data type.  ...  Algebraic Data Types We briefly introduce algebraic data types. In the next section we will show how algebraic data types can be part of FOM models.  ... 
doi:10.1007/11915072_20 fatcat:nnkwav37gzhmrjbezin5suze54

Page 2638 of Mathematical Reviews Vol. , Issue 88e [page]

1988 Mathematical Reviews  
Abstract data types: Stéphane Kaplan and Amir Pnueli, Specification and implementation of concurrently accessed data structures: an abstract data type approach (pp. 220-244); Christoph Beierle and Angelika  ...  Lescanne, Transformation ordering (pp. 69-80); Martin Gogolla, On parametric algebraic specifications with clean error handling (pp. 81-95); Donald Sannella and Andrzej Tarlecki, Toward formal development  ... 

Towards a New Understanding of the Training of Neural Networks with Mislabeled Training Data [article]

Herbert Gish, Jan Silovsky, Man-Ling Sung, Man-Hung Siu, William Hartmann, Zhuolin Jiang
2019 arXiv   pre-print
This property is obtained through the use of the ML invariance property and leads to an approach to developing a classifier when training has been mislabeled: namely train the classifier on noisy data  ...  We investigate the problem of machine learning with mislabeled training data.  ...  Alternatively the noisy model can be transformed into the clean model and decisions made with a clean model using a decision threshold of 1/2, an approach that we consider briefly in this paper.  ... 
arXiv:1909.09136v1 fatcat:oftptf64kjcs5isslfqqao7dsu

Kane-Fisher weak link physics in the clean scratched XY model

G. Lemarié, I. Maccari, C. Castellani
2019 Physical review B  
In this article, we consider the simplest problem to which the scratched-XY model relates: a single weak link in an otherwise clean system, with an intensity J_W which decreases algebraically with the  ...  cut for K<K_c, with an adjustable K_c=1/(1-α) depending on α.  ...  The model consists of a weak link whose strength decreases algebraically with the system size J_W ∼ L^{−α}, in an otherwise clean system.  ... 
doi:10.1103/physrevb.99.054519 fatcat:6oaqfrvsnrda5bexn6oromo5s4

Classification of Cinematographic Shots Using Lie Algebra and its Application to Complex Event Recognition

Subhabrata Bhattacharya, Ramin Mehran, Rahul Sukthankar, Mubarak Shah
2014 IEEE transactions on multimedia  
Next, using Lie algebra, we map the homography matrices to an intermediate vector space that preserves the intrinsic geometric structure of the transformation.  ...  To extract meaningful features from time-series, we propose an efficient linear dynamical system based technique.  ...  A schematic diagram showing the various processes involved in our proposed approach towards classification of a typical shot.  ... 
doi:10.1109/tmm.2014.2300833 fatcat:qqqmd5ovhfh35gobqaovuxnwfa
Showing results 1 to 15 out of 24,252 results