Discovering data quality rules
2008
Proceedings of the VLDB Endowment
Data quality rules are known to be contextual, so we focus on the discovery of context-dependent rules. ...
Our discovery algorithm searches for minimal CFDs among the data values and prunes redundant candidates. No universal objective measures of data quality or data quality rules are known. ...
We thank Tasos Kementsietsidis and Xibei Jia for providing us with the tax data generator. ...
doi:10.14778/1453856.1453980
fatcat:kqsmykm3nffxzbo4x224cfc3bi
Discovering dynamic integrity rules with a rules-based tool for data quality analyzing
2010
Proceedings of the 11th International Conference on Computer Systems and Technologies and Workshop for PhD Students in Computing on International Conference on Computer Systems and Technologies - CompSysTech '10
Rule-based approaches for data quality solutions often use business rules or integrity rules for data monitoring purposes. ...
In this paper, we present our rule-based approach for data quality analyzing, in which we discuss a comprehensive method for discovering dynamic integrity rules. ...
INTRODUCTION Data quality (DQ) is an increasing concern for most businesses. High quality data helps the organisations to save costs, to make better decisions and to improve customer service. ...
doi:10.1145/1839379.1839396
dblp:conf/compsystech/ThiH10
fatcat:7gdwhojprbenhcjhg6o24kyzie
Data quality: The other face of Big Data
2014
2014 IEEE 30th International Conference on Data Engineering
With the variety of data, often from a diversity of sources, data quality rules cannot be specified a priori; one needs to let the "data to speak for itself" in order to discover the semantics of the data ...
This tutorial presents recent results that are relevant to big data quality management, focusing on the two major dimensions of (i) discovering quality issues from the data itself and (ii) trading-off ...
Since rules are discovered based on dirty data, inconsistencies may appear as an effect of faulty rules. ...
doi:10.1109/icde.2014.6816764
dblp:conf/icde/SahaS14
fatcat:eomux6d3vbaedflg2fhfhe7kpa
GPR: A Data Mining Tool Using Genetic Programming
2001
Communications of the Association for Information Systems
We present GPR, an inductive data-mining system we developed. GPR uses the technique of genetic programming to discover rules. ...
In the following section, we briefly define terminology and concepts related to knowledge discovery and the reasons for our focus on discovering production rules. ...
Certainty discovered a very large number of exact rules, including some that applied to only three or four ...
doi:10.17705/1cais.00506
fatcat:xbklzxyinjgtfj7ifchudm4j74
Comparative Analysis of Variations of Ant-Miner by Varying Input Parameters
2012
International Journal of Computer Applications
ACO can be applied to the data mining field to extract rule-based classifiers. ...
Three algorithms (Ant-Miner, Ant-Tree-Miner and cAnt-Miner) are compared against input parameters with respect to predictive accuracy and simplicity of the discovered rules. ...
Extend Quality Measures for classification 4. New Multi-class rule Quality measures 5. Modification for Multi-Label classification 6. Discovering fuzzy classification rules 7. ...
doi:10.5120/9673-4097
fatcat:upcbisdwhfhzlo5kqxztdjyiu4
A new version of the ant-miner algorithm discovering unordered rule sets
2006
Proceedings of the 8th annual conference on Genetic and evolutionary computation - GECCO '06
Hence, the proposed version facilitates the interpretation of discovered knowledge, an important point in data mining. ...
The Ant-Miner algorithm, first proposed by Parpinelli and colleagues, applies an ant colony optimization heuristic to the classification task of data mining to discover an ordered list of classification ...
The Ljubljana breast cancer data set was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data. ...
doi:10.1145/1143997.1144004
dblp:conf/gecco/SmaldonF06
fatcat:4okqumfgfjabfjekq5oc2gorh4
Methodology Design for Data Preparation in the Process of Discovering Patterns of Web Users Behaviour
2013
Applied Mathematics & Information Sciences
Data preparation represents the first inevitable step in the process of discovering users' behavioural patterns. ...
Considering the obtained results we propose a methodology for data preparation in the process of discovering patterns of web user behaviour based on the results of experiments we carried out. ...
and on the quality in terms of the basic quality characteristics of discovered rules. ...
doi:10.12785/amis/071l05
fatcat:dreogqdww5aqlgs4zmwrwmbchi
User Identification in the Process of Web Usage Data Preprocessing
2019
International Journal of Emerging Technologies in Learning (iJET)
This comparison was performed concerning the quality of the sequential rules generated, i.e., a comparison was made regarding the generation of useful, trivial and inexplicable rules. ...
There are multiple places where we can extract the necessary data. ...
Differences in the results of sequence rule analysis are not only in the quantity of discovered rules, but also in the quality (the value of support variable) of discovered rules in examined files. ...
doi:10.3991/ijet.v14i09.9854
fatcat:nvu2b5e74vf3nbl7ggxv3qh4ie
SEWEBAR-CMS: A System for Postprocessing Data Mining Models
2010
International Web Rule Symposium
The principal problem of the association rule (AR) mining task is the selection of rules that might be interesting for the domain expert from the many rules typically generated by the software. ...
-based Content Management System for post-processing AR models that supports the data analyst in this effort. ...
discovered association rules. ...
dblp:conf/ruleml/KliegrCHR10
fatcat:5wfqkqdibfcqref7dmhebnxbke
Evolutionary Mining for Image Classification Rules
[chapter]
2004
Lecture Notes in Computer Science
Classification rules, discovered by application of a genetic algorithm on remote sensing data, are able to identify spectral classes with comparable accuracy to that of a human expert. ...
In our case studies, the hyperspectral images contain voluminous, complex and frequently noisy data. ...
In this paper, a new data-driven approach is proposed in order to discover classification rules using the paradigm of genetic evolution. ...
doi:10.1007/978-3-540-24621-3_13
fatcat:bm22gin7c5erdpcuirocsulzu4
Enhanced cAntMinerPB Algorithm for Induction of Classification Rules using Ant Colony Approach
2014
IOSR Journal of Computer Engineering
Rule induction is a method used in data mining where the desired output is a set of Rules or Statements that characterize the data. ...
Mining classification rules from data is a key mission of data mining and is getting great attention in recent years. ...
... rule
16: add the rule in discovered list of rules
17: end while
18: if compare quality of discovered list of rules then
19: update list according to highest quality
20: end if
21: end for
22 ...
doi:10.9790/0661-16326372
fatcat:g7hs3opkkrc3dofra6j6anpyde
Improving the cAnt-MinerPB Classification Algorithm
[chapter]
2012
Lecture Notes in Computer Science
We have found that changing the rule quality function has little effect on the overall performance, but that by improving the rule-list quality function we can positively affect the discovered lists of ...
We aim to improve cAnt-MinerPB in two ways, firstly by dynamically finding the rule quality function which is used while the rules are being pruned, and secondly improving the rule-list quality function ...
In terms of the discovered model size, the use of the error-based rule-list quality (cAnt-Miner PB [E]) led to a statistically significant improvement in the size of the discovered lists, reducing the ...
doi:10.1007/978-3-642-32650-9_7
fatcat:ewf5tbrbdfex7nj7qyjjaps7ne
Data mining with an ant colony optimization algorithm
2002
IEEE Transactions on Evolutionary Computation
This work proposes an algorithm for data mining called Ant-Miner (Ant Colony-based Data Miner). The goal of Ant-Miner is to extract classification rules from data. ...
discovered by CN2. ...
In CN2 there is no mechanism to allow the quality of a discovered rule to be used as a feedback for constructing other rules. ...
doi:10.1109/tevc.2002.802452
fatcat:yxrqzgsp3re7tibmodkq7od3tm
Data Quality Measurement on Categorical Data Using Genetic Algorithm
2012
International Journal of Data Mining & Knowledge Management Process
Our basic idea is to employ association rule for the purpose of data quality measurement. Strong rule generation is an important area of data mining. ...
Data quality on categorical attribute is a difficult problem that has not received as much attention as numerical counterpart. ...
INTRODUCTION Data Mining is the most instrumental tool in discovering knowledge from transactions [1, 2]. The most important application of data mining is discovering association rules. ...
doi:10.5121/ijdkp.2012.2103
fatcat:p6lo2kk6wzchritsunhx6sj2zq
Heuristic Mining Revamped: An Interactive, Data-aware, and Conformance-aware Miner
2017
International Conference on Business Process Management
visualized as described in literature, and (5) existing tools do not give reliable quality diagnostics for discovered models. ...
It uses data attributes to improve the discovery procedure and provides built-in conformance checking to get direct feedback on the quality of the model. ...
(Step 4) Discover Decision Rules. Fourth, decision rules that determine which of the output bindings may be activated are discovered. ...
dblp:conf/bpm/MannhardtLR17
fatcat:3ea4ooyh3jguzaxqdvkkvc4hlu