111 Hits in 6.9 sec

Answering the Most Correlated N Association Rules Efficiently [chapter]

Jun Sese, Shinichi Morishita
2002 Lecture Notes in Computer Science  
In this paper, we propose the heuristics for the vertical decomposition of a database, for pruning unproductive itemsets, and for traversing a set-enumeration tree of itemsets that is tailored to the calculation  ...  We experimentally compared the combination of these three techniques with the previous statistical approach.  ...  of association rules with significant statistical metrics.  ... 
doi:10.1007/3-540-45681-3_34 fatcat:eyf6ct2iwfa2vhbkv4fzriphhi

MAFIA: a maximal frequent itemset algorithm

D. Burdick, M. Calimlim, J. Flannick, J. Gehrke, T. Yiu
2005 IEEE Transactions on Knowledge and Data Engineering  
The search strategy of the algorithm integrates a depth-first traversal of the itemset lattice with effective pruning mechanisms that significantly improve mining performance.  ...  We present a new algorithm for mining maximal frequent itemsets from a transactional database.  ...  First, in Section 3.1, we describe a simple depth-first traversal with no pruning.  ... 
doi:10.1109/tkde.2005.183 fatcat:vzcvvebm4zhwzasyyno5aynlyi

DualMiner

Cristian Bucila, Johannes Gehrke, Daniel Kifer, Walker White
2002 Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '02  
Constraint-based mining of itemsets for questions such as "find all frequent itemsets where the total price is at least $50" has received much attention recently.  ...  In this paper, we present the first algorithm (called DualMiner) that uses both monotone and antimonotone constraints to prune its search space.  ...  For example, consider the itemset lattice illustrated in Figure 1 (with the set of items M = {A, B, C, D}).  ... 
doi:10.1145/775052.775054 fatcat:uuv5x7uyl5fejewdwijf4iv23i

DualMiner

Cristian Bucila, Johannes Gehrke, Daniel Kifer, Walker White
2002 Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '02  
Constraint-based mining of itemsets for questions such as "find all frequent itemsets where the total price is at least $50" has received much attention recently.  ...  In this paper, we present the first algorithm (called DualMiner) that uses both monotone and antimonotone constraints to prune its search space.  ...  For example, consider the itemset lattice illustrated in Figure 1 (with the set of items M = {A, B, C, D}).  ... 
doi:10.1145/775047.775054 dblp:conf/kdd/BucilaGKW02 fatcat:gww5qer63jcf7jtd7qc4g3z2c4

AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery

Hong Cheng, Philip Yu, Jiawei Han
2006 IEEE International Conference on Data Mining. Proceedings  
By focusing on the so-called core patterns, integrated with a top-down mining and several effective pruning strategies, the algorithm narrows down the search space to those potentially interesting ones  ...  Recent studies have proposed methods to discover approximate frequent itemsets in the presence of random noise.  ...  Traverse upward in the lattice for 2 levels (i.e., levels 2 and 3), which constitute the extension space for {a, b, c, d}.  ... 
doi:10.1109/icdm.2006.10 dblp:conf/icdm/ChengYH06 fatcat:awxqz3jwl5aujaftavxpxed3ny

Efficient Analysis of Pattern and Association Rule Mining Approaches

Thabet Slimani, Amor Lazzez
2014 International Journal of Information Technology and Computer Science  
Frequent pattern mining has been a focused topic in data mining research with a good number of references in literature and for that reason an important progress has been made, varying from performant  ...  The most recognized data mining tasks are the process of discovering frequent itemsets, frequent sequential patterns, frequent sequential rules and frequent association rules.  ...  The authors introduce a novel depth-first search strategy that integrates a depth-first traversal of the search space with effective pruning mechanisms.  ... 
doi:10.5815/ijitcs.2014.03.09 fatcat:azuo5zey35flrc3disyiersjnu

A Comprehensive Survey of Frequent Itemsets Mining on Transactional Database with Weighted Items

Thanh Huan Phan, Hoài Bắc Lê
2021 Research and Development on Information and Communication Technology  
In this article, the authors present a survey of frequent itemsets mining algorithms on transactional database with weighted items over the past twenty years.  ...  In 1993, Agrawal et al. proposed the first algorithm for mining traditional frequent itemset on binary transactional database with unweighted items - This algorithm is essential in finding hidden relationships  ...  space pruning/reducing strategy and the method to calculate the relevant metrics in the mining process, the authors have the following recommendations: − First, the algorithms for mining frequent itemsets  ... 
doi:10.32913/mic-ict-research.v2021.n1.967 fatcat:bmc6mv643jhmdo3tknyr2lcdoa

Survey on Mining High Utility Patterns in One Phase

Harshita Taran, Shilpa Ghode
2017 International Journal of Engineering Research and  
High Utility Itemset Mining that discovers the itemsets considering not only the frequency of the itemset but also utility associated with the itemset.  ...  The linear data structure is to compute a tight bound for powerful pruning and to directly identify high utility patterns in an efficient and scalable way, which targets the root cause with prior algorithms  ...  Efficient lattice traversal techniques are presented which quickly identify all the long frequent itemsets and their subsets if required.  ... 
doi:10.17577/ijertv6is070111 fatcat:yn6s4mfwvzacvh5hep5k2hxn6y

An Experiment with Association Rules and Classification: Post-Bagging and Conviction [chapter]

Alípio M. Jorge, Paulo J. Azevedo
2005 Lecture Notes in Computer Science  
We do this with a particular kind of model: large sets of classification association rules, and in combination with ordinary best rule and weighted voting approaches.  ...  We also discuss the predictive power of different metrics used for association rule mining, such as confidence, lift, conviction and χ².  ...  The rule selection method RC [15] builds a decision list by traversing the generalization lattice of the rules and by looking at the training error of the rules.  ... 
doi:10.1007/11563983_13 fatcat:gsplc6uds5hydpddcagjh6rk2e

The Hows, Whys, and Whens of Constraints in Itemset and Rule Discovery [chapter]

Roberto J. Bayardo
2006 Lecture Notes in Computer Science  
"we" is to emphasize this paper is a personal position statement, along with a view of existing research in light of my position.  ...  In this paper, I propose various strategies for applying constraints within algorithms for itemset and rule mining in order to escape this pitfall. 1 1 My use of the informal "I" rather than the typical  ...  While it is strictly more powerful than pruning with closure, we are still plagued by "near equivalence" relationships between an itemset and its subsets.  ... 
doi:10.1007/11615576_1 fatcat:6cfuc3xakze4hewgfuud7kpzfq

Plant Protein Localization Using Discriminative and Frequent Partition-Based Subsequences

S. Vahid Jazayeri, Osmar R. Zaïane
2008 2008 IEEE International Conference on Data Mining Workshops  
The function of proteins in the living cells varies with respect to their localizations.  ...  Extracellular plant proteins are responsible for vital functions such as nutrition acquisition, protection from pathogens, communication with other soil organisms, etc.  ...  This technique prunes many rules on the fly especially using a depth-first search of the itemset lattice.  ... 
doi:10.1109/icdmw.2008.130 dblp:conf/icdm/JazayeriZ08 fatcat:dtmkmcaf7jd4lgvobf7zpk6kfu

Parallel mining of association rules using a lattice based approach

Wessel Thomas
2007 Proceedings 2007 IEEE SoutheastCon  
The Dynamic Distributed Rule Mining (DDRM) is a lattice-based algorithm that partitions the lattice into sublattices to be assigned to processors for processing and identification of frequent itemsets.  ...  However the costs associated with these algorithms are hash tree construction, hash tree traversal, communication overhead, input/output (I/O) cost and data movement respectively.  ...  In addition they also developed FDM with Local Pruning (FDM-LP), FDM with Local Upper Bound Pruning (FDM-LUP) and FDM with Local Pruning and Polling-Site-Pruning (FDM-LPP), which are based on different  ... 
doi:10.1109/secon.2007.342981 fatcat:z3xe4f6hs5cybcl4ttd5p5au2e

Discovering data quality rules

Fei Chiang, Renée J. Miller
2008 Proceedings of the VLDB Endowment  
Our discovery algorithm searches for minimal CFDs among the data values and prunes redundant candidates. No universal objective measures of data quality or data quality rules are known.  ...  Hence, to avoid returning an unnecessarily large number of CFDs and only those that are most interesting, we evaluate a set of interest metrics and present comparative results using real datasets.  ...  We thank Tasos Kementsietsidis and Xibei Jia for providing us with the tax data generator.  ... 
doi:10.14778/1453856.1453980 fatcat:kqsmykm3nffxzbo4x224cfc3bi

Profiling relational data: a survey

Ziawasch Abedjan, Lukasz Golab, Felix Naumann
2015 The VLDB Journal  
We conclude with an outlook on the future of data profiling beyond traditional profiling tasks and beyond relational databases.  ...  Among the simpler results are statistics, such as the number of null values and distinct values in a column, its data type, or the most frequent patterns of its data values.  ...  Column-based traversal of the column lattice: The problem of finding minimal uniques is comparable to the problem of finding frequent itemsets [8].  ... 
doi:10.1007/s00778-015-0389-y fatcat:ojj7blyqgrfrhmyi7yjtn6stia

A scalable association rule learning heuristic for large datasets

Haosong Li, Phillip C.-Y. Sheu
2021 Journal of Big Data  
To model the former, we apply the antimonotone property to the itemset lattice.  ...  ., T001: {1, 3} T002:{1, 4}), the Eclat (Equivalence Class Clustering and bottom-up Lattice Traversal) algorithm [5] uses a vertical dataset (e.g. Item1: {T001, T002}, Item3: {T001}, Item4:{T002}).  ...  Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.  ... 
doi:10.1186/s40537-021-00473-3 fatcat:ozhotg54jfbkdburkrtyfgnjuu
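
The vertical layout mentioned in the snippet above can be illustrated with a minimal sketch. It reuses the snippet's own example transactions (T001: {1, 3}, T002: {1, 4}); the function names `to_vertical` and `support` are hypothetical and chosen only for illustration, and this is not the authors' implementation or a full Eclat miner. It only shows how a horizontal database might be turned into per-item transaction-ID sets, and how the support of an itemset could then be read off by intersecting those sets.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Map each item to the set of transaction IDs that contain it."""
    vertical = defaultdict(set)
    for tid, items in transactions.items():
        for item in items:
            vertical[item].add(tid)
    return dict(vertical)


def support(vertical, itemset):
    """Count transactions containing every item, via tid-set intersection."""
    tidsets = [vertical.get(item, set()) for item in itemset]
    if not tidsets:
        return 0
    return len(set.intersection(*tidsets))


# Horizontal layout taken from the snippet's example.
horizontal = {"T001": {1, 3}, "T002": {1, 4}}
vertical = to_vertical(horizontal)
```

Under these assumptions, `vertical` maps item 1 to both transaction IDs and items 3 and 4 to one each, which matches the vertical dataset quoted in the snippet. Intersecting tid-sets avoids rescanning the transactions for each candidate itemset.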