Estimating the number of frequent itemsets in a large database

Ruoming Jin, Scott McCallen, Yuri Breitbart, Dave Fuhry, Dong Wang
2009 Proceedings of the 12th International Conference on Extending Database Technology Advances in Database Technology - EDBT '09  
Estimating the number of frequent itemsets for minimal support α in a large dataset is of great interest from both theoretical and practical perspectives.  ...  However, finding not only the number of frequent itemsets, but even the number of maximal frequent itemsets, is #P-complete.  ...  Indeed, if the number of frequent itemsets is large, the data miner may either increase a support level to reduce the number of frequent itemsets or use the number of frequent itemsets to determine an  ... 
doi:10.1145/1516360.1516420 dblp:conf/edbt/JinMBFW09 fatcat:hkchrchamzfapkfxptjb6r2kfi

Efficiently Mining Approximate Models of Associations in Evolving Databases [chapter]

Adriano Veloso, Bruno Gusmão, Wagner Meira, Marcio Carvalho, Srini Parthasarathy, Mohammed Zaki
2002 Lecture Notes in Computer Science  
Research on how the accuracy of a model changes as a function of dynamic updates to the databases is very limited.  ...  We propose a new approach to incrementally generate approximate models of associations in evolving databases.  ...  If there is a large number of invariant itemsets in the database, the set of popular itemsets generated will remain accurate for a long time.  ... 
doi:10.1007/3-540-45681-3_36 fatcat:cotcw2pzebbqnbz3v7zmfs6hcm

Analysis of Sequential Mining Algorithms

Surbhi Chandhok, Romil Anand, Soumay Gupta, Aatif Jamshed
2017 International Journal of Computer Applications  
As a result of which, mining association rules from enormous databases has been a significant topic in recent research for knowledge discovery in databases.  ...  The discovery of Association relationship seeks more attention in data mining due to the constantly increasing amount of data stored in the real application system.  ...  Therefore, in dynamic databases, the maintenance of large itemsets can be extremely expensive, in case rerun of previous mining algorithms on updated database is applied as it repeats a major portion of  ... 
doi:10.5120/ijca2017914085 fatcat:euevx2giqbc53h5rjpaalb4j3q

A New Approach for Approximately Mining Frequent Itemsets

Timur Valiullin, Joshua Zhexue Huang, Jianfei Yin, Dingming Wu
2019 International Conference on Data Analytics and Management in Data Intensive Domains  
In this paper, we propose a new approach for approximately mining frequent itemsets in a transaction database.  ...  Then, we randomly select a set of subsets and independently mine the frequent itemsets in each of them.  ...  In [10] and [11], the authors introduced two different approaches for mining frequent itemsets in a large database based on MapReduce.  ... 
dblp:conf/rcdl/ValiullinHY019 fatcat:xtx5mnwfyrenblv2pb6oufahji

Probabilistic Models for Local Patterns Analysis

Khiat Salim, Belbachir Hafida, Rahal Sid Ahmed
2014 Journal of Information Processing Systems  
It consists of mining different datasets in order to obtain frequent patterns, which are forwarded to a centralized place for global pattern analysis.  ...  In such situations we propose the application of a probabilistic model in the synthesizing process.  ...  The itemset inclusion-exclusion model provides an estimate quality in a very short amount of time, but it needs space memory to store the large number of itemsets in the ADTree.  ... 
doi:10.3745/jips.2014.10.1.145 fatcat:htyowt3ipjd7jkbp3zok65xioe

A Survey On Itemset Mining For Large Transaction Database

Ancy Jose*, Dr. John T Abraham
2016 Zenodo  
mining which can handle large transactions in the database.  ...  Mining itemsets from databases is an important data mining task. Frequent itemset mining refers to the mining of sets of items that occur frequently in the database. Utility itemset mining refers to the discovery  ...  In the case of frequent itemset mining, PFP and Apriori with smart splitting perform efficiently on large transactions.  ... 
doi:10.5281/zenodo.52500 fatcat:bucxzdtkpnefvm7g4hfyyn3b6i

A Survey Paper on Differentially Private Frequent Item Mining

Ms Chanchal Rathi, Ratnaraj Kumar
2016 International Journal of Engineering Research and  
Association rule mining is a method of discovering interesting correlations between variables in large databases. Mining frequent itemsets is among the most popular problems in data mining.  ...  To cover the information loss caused by smart splitting, we devise a run-time estimation to calculate the actual support of itemsets in the original database.  ...  Although a number of relevant approaches have been proposed in recent years, they incur the problem of producing a large number of candidate itemsets for high utility itemsets.  ... 
doi:10.17577/ijertv5is010630 fatcat:s524aicfbngxfepegpd6ydpnee

Synthesizing Global Exceptional Patterns in Different Data Sources

Animesh Adhikari
2012 Journal of Intelligent Systems  
In this paper, we propose type I and type II global exceptional frequent itemsets in multiple databases by extending the notion of global exceptional frequent itemset.  ...  The number of multi-branch companies as well as the number of branches of a multi-branch company is increasing over time. Thus, it is important to study data mining on multiple databases.  ...  The estimated support of a missing itemset usually reduces the error of synthesizing a frequent itemset in multiple databases.  ... 
doi:10.1515/jisys-2012-0013 fatcat:hgqlod4f45atzppt2srg2d4vhi

Space Lower Bounds for Itemset Frequency Sketches

Edo Liberty, Michael Mitzenmacher, Justin Thaler, Jonathan Ullman
2016 Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems - PODS '16  
independent of the number of rows in the original database.  ...  A uniform sample of rows is a good sketch of the database in the sense that all sufficiently frequent itemsets and their approximate frequencies are recoverable from the sample, and the sketch size is  ...  This parameter regime, with n a sufficiently large polynomial in d, k, and 1/ε, is consistent with typical usage scenarios, where the number of rows in a database far exceeds the number of attributes.  ... 
doi:10.1145/2902251.2902278 dblp:conf/pods/LibertyMTU16 fatcat:ex7vgevns5aqljvc2elcruyqki
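
The snippet above describes a uniform sample of rows as a sketch from which approximate itemset frequencies can be recovered. A minimal Python illustration of that idea follows. It is only a toy under stated assumptions, not the paper's construction or its bounds: the helper names `sample_rows` and `frequency`, the synthetic database, and the sample size are all hypothetical choices made for this example.

```python
import random


def sample_rows(rows, sample_size, seed=0):
    """Draw a uniform sample of rows (with replacement) to act as a sketch."""
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in range(sample_size)]


def frequency(rows, itemset):
    """Fraction of rows that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for row in rows if itemset <= row) / len(rows)


# Hypothetical toy database: 20,000 rows over 5 items,
# each item present in a row independently with probability 0.6.
_rng = random.Random(1)
database = [
    frozenset(i for i in range(5) if _rng.random() < 0.6)
    for _ in range(20000)
]

# The sketch keeps 2,000 sampled rows instead of all 20,000.
sketch = sample_rows(database, 2000)

exact = frequency(database, {0, 1})   # frequency in the full database
approx = frequency(sketch, {0, 1})    # estimate recovered from the sketch
print(f"exact={exact:.4f} approx={approx:.4f}")
```

The estimate's accuracy in this toy depends on the sample size rather than on the number of rows in the database, which is the property the snippet highlights.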

Probability-Based Incremental Association Rule Discovery Algorithm

Ratchadaporn Amornchewin, Worapoj Kreesuradej
2008 International Symposium on Computer Science and its Applications  
Mining frequent itemsets has proved to be very difficult because of its computational complexity.  ...  But it has gained a lot of popularity due to the usefulness of association rules, despite its huge processing cost.  ...  Our algorithm can reduce not only the number of scans of the original database but also the number of candidate itemsets needed to generate frequent 2-itemsets.  ... 
doi:10.1109/csa.2008.39 fatcat:f6i7vypbcnhldatk5kzxs24xki

Frequent pattern discovery with memory constraint

Kun-Ta Chuang, Ming-Syan Chen
2005 Proceedings of the 14th ACM international conference on Information and knowledge management - CIKM '05  
We explore in this paper a practicably interesting mining task to retrieve frequent itemsets with memory constraint.  ...  by mining frequent itemsets in this paper.  ...  A large memory, which may not be prevalent in most computers nowadays, is in general required when the database is large or the minimum support is small.  ... 
doi:10.1145/1099554.1099659 dblp:conf/cikm/ChuangC05 fatcat:p277xfsqgncyzmxeqyuzrhgqsi

A Survey on Approaches for Mining Frequent Itemsets

S. Neelima, N. Satyanarayana, P. Krishna Murthy
2014 IOSR Journal of Computer Engineering  
This paper presents a literature review on different techniques for mining frequent itemsets.  ...  In data mining, association rule mining is one of the important techniques for discovering meaningful patterns from large collection of data.  ...  Acknowledgements The authors would like to thank Anonymous Reviewers for their valuable suggestions and comments. This paper has greatly benefited from their Efforts.  ... 
doi:10.9790/0661-16473134 fatcat:7nxhpfqgg5cejbs4lb375cvbfm

Support Estimation in Frequent Itemset Mining by Locality Sensitive Hashing

Annika Pick, Tamás Horváth, Stefan Wrobel
2019 Lernen, Wissen, Daten, Analysen  
The main computational effort in generating all frequent itemsets in a transactional database is in the step of deciding whether an itemset is frequent or not.  ...  The support of a query itemset is then estimated by means of these summaries.  ...  The most time-consuming step of all frequent itemset mining algorithms is to decide the frequency of itemsets, i.e., whether an itemset is supported by at least a certain number of transactions in the  ... 
dblp:conf/lwa/Pick0W19 fatcat:aka6cmx6ejhdlibjiqwqpa2u5a

Computing the minimum-support for mining frequent patterns

Shichao Zhang, Xindong Wu, Chengqi Zhang, Jingli Lu
2007 Knowledge and Information Systems  
In this paper we propose a computational strategy for identifying frequent itemsets, consisting of polynomial approximation and fuzzy estimation.  ...  Frequent pattern mining is based on the assumption that users can specify the minimum-support for mining their databases.  ...  Number of transactions in database = 100,000; average transaction length = 15; number of items = 1,000; large Itemsets: Number of patterns = 10,000 AveSupp Lean a b Evaluate values 0.006627 0.496451  ... 
doi:10.1007/s10115-007-0081-7 fatcat:6a6wydmdwzhllcxw37rb3774qy

Anytime mining for multiuser applications

Shichao Zhang, Chengqi Zhang
2002 IEEE transactions on systems, man and cybernetics. Part A. Systems and humans  
Database systems have been designed to serve multiusers in real-world applications. There are essential differences between mono- and multi-user applications when a database is very large.  ...  He is a member of the editorial board of Asian Journal of Information Technology. Dr.  ...  ACKNOWLEDGMENT The authors would like to thank the three anonymous reviewers for their constructive comments on the first version of this paper.  ... 
doi:10.1109/tsmca.2002.804793 fatcat:pt2pxaiyingo7jvhek6x7uvkla