Variable-grain and dynamic work generation for Minimal Unique Itemset mining

Paraskevas Yiapanis, David J. Haglin, Anna M. Manning, Ken Mayes, John Keane
2008 2008 IEEE International Conference on Cluster Computing  
This paper investigates the effectiveness of variable-grained and dynamic work generation strategies for parallel SUDA2.  ...  SUDA2 is a recursive search algorithm for Minimal Unique Itemset detection. Such sets of items are formed via combinations of non-obvious attributes enabling individual record identification.  ...  ACKNOWLEDGMENT This work was supported in part by the National Science Foundation under grant CTS-0619641. The authors wish to acknowledge use of the SMP cluster at Minnesota State University.  ... 
doi:10.1109/clustr.2008.4663753 dblp:conf/cluster/YiapanisHMMK08 fatcat:ejrfwxlp35fdlepjsbcvnlvd24

Document stream clustering: experimenting an incremental algorithm and AR-based tools for highlighting dynamic trends [article]

Alain Lelu
2008 arXiv   pre-print
generating the main conclusions about the dynamics of a data-stream.  ...  conditions and ordering of the data-vectors stream, 2) the cognitive challenge: we have implemented a stringent selection process of association rules between clusters at time t-1 and time t for directly  ...  Acknowledgements This work has been set up in the framework of the project Ingénierie des Langues, du Document et de l'Information Scientifique, Technique et Culturelle (ILD&ISTC) of the AI pole in the  ... 
arXiv:0811.0340v1 fatcat:m5b36qdqzfh6jk7pqhvsuqrk24

Factors affecting the performance of parallel mining of minimal unique itemsets on diverse architectures

D. J. Haglin, K. R. Mayes, A. M. Manning, J. Feo, J. R. Gurd, M. Elliot, J. A. Keane
2009 Concurrency and Computation  
Three parallel implementations of a divide and conquer search algorithm (called SUDA2) for finding minimal unique itemsets are compared.  ...  The identification of minimal unique itemsets is used by national statistics agencies for statistical disclosure assessment.  ...  We would also like to thank members of the Centre for Novel Computing at the University of Manchester for supporting the work and for providing invaluable feedback.  ... 
doi:10.1002/cpe.1379 fatcat:cbt4jfjxofcfvm3uvc7tt27mq4

Big Data Frequent Pattern Mining [chapter]

David C. Anastasiu, Jeremy Iverson, Shaden Smith, George Karypis
2014 Frequent Pattern Mining  
We identify three areas as challenges to designing parallel frequent pattern mining algorithms: memory scalability, work partitioning, and load balancing.  ...  Frequent pattern mining is an essential data mining task, with a goal of discovering knowledge in the form of repeated patterns.  ...  Dynamic load balancing attempts to minimize the time that processes are idle by actively distributing work among processes.  ... 
doi:10.1007/978-3-319-07821-2_10 fatcat:an2ygoyxzrcavggp66u47vjhk4

An Enhanced Algorithm for Association Rule Mining in Huge Temporal Database

Abdel Rahman Mahmoud, Dr. Nagy Ramadan, Abdel Moniem Helmy
2019 Zenodo  
The result for that will be developing more efficient approach for mining temporal association rules on large data sets.  ...  The current algorithms of association rule mining has limitations in handling temporal data in different data sets for these two main reasons.  ...  The problem of mining general temporal association can be decomposed into two steps: (1) Generate all frequent maximal temporal itemsets and the corresponding maximal temporal sub-itemsets with their relative  ... 
doi:10.5281/zenodo.3372570 fatcat:34v4zq44frbhxkfj2rx6ze73f4

A Two-Armed Bandit Collective for Hierarchical Examplar Based Mining of Frequent Itemsets with Applications to Intrusion Detection [chapter]

Vegard Haugland, Marius Kjølleberg, Svein-Erik Larsen, Ole-Christoffer Granmo
2014 Lecture Notes in Computer Science  
Although several efficient techniques for generating frequent itemsets with a minimum frequency have been proposed, the number of itemsets produced is in many cases too large for effective usage in real-life  ...  Over the last decades, frequent itemset mining has become a major area of research, with applications including indexing and similarity search, as well as mining of data streams, web, and software bugs  ...  In this paper we introduce a completely different approach to frequent itemset mining that possesses several unique properties: -In contrast to being based on extensive and dynamically built data structures  ... 
doi:10.1007/978-3-662-44509-9_1 fatcat:xducldaulrblppyozaqc23rn4m

A two phased service oriented Broker for replica selection in data grids

Rafah M. Almuttairi, Rajeev Wankar, Atul Negi, C.R. Rao, Arun Agarwal, Rajkumar Buyya
2013 Future generations computer systems  
The second is a Fine-grain phase, used for extracting the replicas admissible for user requirements through applying Modified Minimum Cost and Delay Policy (MMCD).  ...  The motivation of this work is to introduce a novel Service Oriented Broker for Replica Selection in Data Grid.  ...  Yulia Sukonkina and Dr. Mahdi S. Almhanna for their help and suggestions.  ... 
doi:10.1016/j.future.2012.09.007 fatcat:ozszdbhwozhqpbf5z4zsdfvxvq

Association Rule Mining with the Micron Automata Processor

Ke Wang, Yanjun Qi, Jeffrey J. Fox, Mircea R. Stan, Kevin Skadron
2015 2015 IEEE International Parallel and Distributed Processing Symposium  
Association rule mining (ARM) is a widely used data mining technique for discovering sets of frequently associated items in large databases.  ...  The Apriori algorithm that ARM uses for discovering itemsets maps naturally to the massive parallelism of the AP.  ...  ACKNOWLEDGMENT This work was supported in part by the Virginia CIT CRCF program under grant no.  ... 
doi:10.1109/ipdps.2015.101 dblp:conf/ipps/WangQFSS15 fatcat:cpxy5oijozgkhlazyl4zorspnq

Parallel mining of association rules using a lattice based approach

Wessel Thomas
2007 Proceedings 2007 IEEE SoutheastCon  
The Dynamic Distributed Rule Mining (DDRM) is a lattice-based algorithm that partitions the lattice into sublattices to be assigned to processors for processing and identification of frequent itemsets.  ...  The goal of this research was to develop and implement a parallel algorithm for mining association rules.  ...  Acknowledgements I thank my advisor, Professor Junping Sun for his extraordinary patience while modifying and editing the earlier drafts of this dissertation.  ... 
doi:10.1109/secon.2007.342981 fatcat:z3xe4f6hs5cybcl4ttd5p5au2e

Quality-driven resource-adaptive data stream mining?

Conny Junghans, Marcel Karnstedt, Michael Gertz
2011 SIGKDD Explorations  
In this paper, we propose a general model to achieve resource and quality awareness for stream mining algorithms in dynamic setups.  ...  Data stream processing is therefore required to work under virtually any dynamic resource constraints.  ...  Most work in this area has focused solely on minimizing resource utilization.  ... 
doi:10.1145/2031331.2031342 fatcat:vd73wioqpjftxetxsdx6h4lzce

DEMIDS: A Misuse Detection System for Database Systems [chapter]

Christina Yip Chung, Michael Gertz, Karl Levitt
2000 IFIP Advances in Information and Communication Technology  
Distance measures are used to guide the search for frequent itemsets describing the working scopes of users.  ...  In DEMIDS such frequent itemsets are computed efficiently from audit logs using the database system's data management and query processing features.  ...  Theorem 4.3 Minimality All itemsets in ℐ are minimal frequent itemsets for AuditS, that is, there are no two itemsets I, I′ ∈ ℐ such that I ≠ I′ and I ⊂ I′.  ... 
doi:10.1007/978-0-387-35501-6_12 fatcat:vkbxvt6h2rgxbppiwvqnagsnyq

Beyond market baskets

Sergey Brin, Rajeev Motwani, Craig Silverstein
1997 SIGMOD record  
This leads to a measure that is upward closed in the itemset lattice, enabling us to reduce the mining problem to the search for a border between correlated and uncorrelated itemsets in the lattice.  ...  consider both the absence and presence of items as a basis for generating rules.  ...  Acknowledgements We are grateful to Jeff Ullman for many valuable discussions. We would also like to thank members of the Stanford Data Mining group, particularly Shalom Tsur, for helpful discussions.  ... 
doi:10.1145/253262.253327 fatcat:rgovge7aongxpl6krl2ibdvwiu

Beyond market baskets

Sergey Brin, Rajeev Motwani, Craig Silverstein
1997 Proceedings of the 1997 ACM SIGMOD international conference on Management of data - SIGMOD '97  
This leads to a measure that is upward closed in the itemset lattice, enabling us to reduce the mining problem to the search for a border between correlated and uncorrelated itemsets in the lattice.  ...  consider both the absence and presence of items as a basis for generating rules.  ...  Acknowledgements We are grateful to Jeff Ullman for many valuable discussions. We would also like to thank members of the Stanford Data Mining group, particularly Shalom Tsur, for helpful discussions.  ... 
doi:10.1145/253260.253327 dblp:conf/sigmod/BrinMS97 fatcat:kvimgob7krentgpnjqyu7qdbhq

I/O conscious algorithm design and systems support for data analysis on emerging architectures

G. Buehrer, A. Ghoting, Xi Zhang, S. Tatikonda, S. Parthasarathy, T. Kurc, J. Saltz
2006 Proceedings 20th IEEE International Parallel & Distributed Processing Symposium  
Advances in data collection and storage technologies have given rise to large dynamic data stores.  ...  In this article, we present a top-down view of how one can achieve this goal for next generation data analysis centers.  ...  Conclusion In this work, we present a top-down view of our systems framework for next generation data analysis centers which will accommodate the dynamic and large scale nature of future workloads.  ... 
doi:10.1109/ipdps.2006.1639586 dblp:conf/ipps/BuehrerGZSPKS06 fatcat:yzagxb56mje25kklkkhjhumdpq

A log mining approach for process monitoring in SCADA

Dina Hadžiosmanović, Damiano Bolzoni, Pieter H. Hartel
2012 International Journal of Information Security  
SCADA (supervisory control and data acquisition) systems are used for controlling and monitoring industrial processes.  ...  Process-related threats take place when an attacker gains user access rights and performs actions, which look legitimate, but which are intended to disrupt the SCADA process.  ...  For mining a k-size itemset, an algorithm that uses candidate generation may need up to 2^k scans of the data set.  ... 
doi:10.1007/s10207-012-0163-8 fatcat:d2hna6wybje75jlka6tjrsb6ba