Enabling efficient process mining on large data sets: realizing an in-database process mining operator

Remco Dijkman, Juntao Gao, Alifah Syamsiyah, Boudewijn van Dongen, Paul Grefen, Arthur ter Hofstede
2019 Distributed and parallel databases  
We also created a plugin for the ProM process mining tool [14] that enables process mining directly on an H2 Database.  ...  follows' operator proposed in this paper and the process mining tool supports on-database process mining.  ... 
doi:10.1007/s10619-019-07270-1 fatcat:m4b6dgtnmbbmvgqf7iabw67slu

Decision Trees in Large Data Sets

Zeynep ÇETİNKAYA, Fahrettin HORASAN
2021 Uluslararası Muhendislik Arastirma ve Gelistirme Dergisi  
One of the important problems encountered in this process is the classification process in large data sets.  ...  In this article, various decision tree structures and algorithms used for classification process in large data sets are discussed.  ...  Conclusion One of the important problems encountered in the data mining process is the difficulties encountered in the classification process in large data sets.  ... 
doi:10.29137/umagd.763490 fatcat:dcn4yqk47rakrbjfpugei2fbiu

Fuzzy sets in pattern recognition and machine intelligence

Sushmita Mitra, Sankar K. Pal
2005 Fuzzy sets and systems (Print)  
In this position paper we seek to outline the contribution of fuzzy sets to pattern recognition, image processing, and machine intelligence over the last 40 years.  ...  Integration of fuzzy sets with other soft computing tools has led to the generation of more powerful, intelligent and efficient systems.  ...  The success of fuzzy sets has been mainly vindicated by the commercial popularity in Japan of fuzzy logic and control systems, where both pattern recognition and image processing provide direct interaction  ... 
doi:10.1016/j.fss.2005.05.035 fatcat:3uvawntwirhtvkqgmoldvc6vtm

Data Deduplication in Parallel Mining of Frequent Item sets using MapReduce

Pavithra. K
2016 International Journal Of Engineering And Computer Science  
We aim to implement a recommendation algorithm using Mahout, a machine learning device, on the Hadoop platform to provide a scalable system for processing large data sets efficiently.  ...  In this paper, we apply a deduplication technique in the third MapReduce job to avoid the replication of data in frequent item sets and improve the performance.  ...  The design aim of FiDoop is to construct a mechanism that enables repeated parallelization, load balancing, and data sharing for parallel mining of frequent itemsets on huge clusters.  ... 
doi:10.18535/ijecs/v5i11.12 fatcat:zy4cbcvdhjfnjd6h4vklwxdqm4

A Data Mining Based Approach to Customer Behaviour in an Electronic Settings

A. Tope-Oke, C. A. Afolalu, O. Omofade
2019 Journal of Computer and Communications  
A brief description of the background of e-commerce and data mining, previous work of researchers who have worked on data mining in e-commerce settings, was reviewed and the relationship between their  ...  Furthermore, the interaction between the data mining system and the customer's dataset on an ecommerce website was defined.  ...  Acknowledgements We hereby acknowledge the constructive criticism by all lecturers in the Department of Mathematical and Physical Sciences, Afe Babalola University. Thank you all.  ... 
doi:10.4236/jcc.2019.75004 fatcat:apss762m6rhtzczv3rpcr3sipu

CLASSIFICATION PROCEDURES FOR INTRUSION DETECTION BASED ON KDD CUP 99 DATA SET

Shaker El-Sappagh, Ahmed Saad Mohammed, Tarek Ahmed AlSheshtawy
2019 International journal of network security and its applications  
is a consideration as analysis procedure utilized in a large data e.g.  ...  This paper presented various data mining classification for handling false alerts in intrusion detection as reviewed.  ...  DATA MINING AND INTRUSION DETECTION Data mining is the process of discovering interesting knowledge from large amounts of data stored either in databases, data warehouses, or other information repositories  ... 
doi:10.5121/ijnsa.2019.11302 fatcat:clyis6rtjvghjap5s7jc6en3f4

Performance Evaluation of Predictive Classifiers For Knowledge Discovery From Engineering Materials Data Sets [article]

Hemanth K. S Doreswamy
2012 arXiv   pre-print
In this paper, naive Bayesian and C4.5 Decision Tree Classifiers (DTC) are successively applied on materials informatics to classify the engineering materials into different classes for the selection of  ...  The knowledge discovered by the naive Bayesian classifier can be employed for decision making in materials selection in manufacturing industries.  ...  Data Mining [16], [21] is a process of extracting previously unknown and potentially useful knowledge from the large volume of data sets.  ... 
arXiv:1209.2501v1 fatcat:dvegjusmrraihjus2gbootbuye

Data structure set-trie for storing and querying sets: Theoretical and empirical analysis

Iztok Savnik, Mikita Akulich, Matjaž Krnc, Riste Škrekovski, Unil Yun
2021 PLoS ONE  
Set containment operations form an important tool in various fields such as information retrieval, AI systems, object-relational databases, and Internet applications.  ...  In the paper, a set-trie data structure for storing sets is considered, along with the efficient algorithms for the corresponding set containment operations.  ...  Acknowledgments The authors would like to thank anonymous referees for helpful comments, and to the PLoS ONE editorial board for the patience throughout the finalizing process of this paper.  ... 
doi:10.1371/journal.pone.0245122 pmid:33566827 fatcat:ivztz4szhrex3gaxlp7q7shtgi

A rough set approach to multiple dataset analysis

Ken Kaneiwa
2011 Applied Soft Computing  
In the area of data mining, the discovery of valuable changes and connections (e.g., causality) from multiple data sets has been recognized as an important issue.  ...  This issue essentially differs from finding statistical associations in a single data set because it is complicated by the different data behaviors and relationships across multiple data sets.  ...  This research has been partially supported by the Japanese Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Scientific Research (20700147).  ... 
doi:10.1016/j.asoc.2010.08.021 fatcat:isvalxpnircnjj3atuve6s365i

Fuzzy sets in machine learning and data mining

Eyke Hüllermeier
2011 Applied Soft Computing  
In this connection, some advantages of fuzzy methods for representing and mining vague patterns in data are especially emphasized.  ...  Automated knowledge acquisition of that kind has been an essential aspect of artificial intelligence for a long time and has more recently also attracted considerable attention in the fuzzy sets community  ...  It is likely to become more important in the future, once the enabling data engineering technology, allowing to acquire, store, and process fuzzy data on a large scale, has been established.  ... 
doi:10.1016/j.asoc.2008.01.004 fatcat:5nfpsiwn5bfs3k25t5eg5fc5pe

Decision Tree Algorithm Based on OLAP Multidimensional Data Set System

2016 Revista Técnica de la Facultad de Ingeniería Universidad del Zulia  
Based on the demands of statistical analysis and the decision support service information management system on the information data of Party Members in University, this paper presents a solution for multidimensional  ...  scheme of OLAP multidimensional data and data presentation method.  ...  Data warehouse and the related technology, OLAP (On-line Analytic Processing), and database mining technology is developing and improving.  ... 
doi:10.21311/001.39.6.22 fatcat:qx4uv5ajfvb6nhmvibxshway3i

Automated construction of fuzzy event sets and its application to active databases

Y. Saygin, O. Ulusoy
2001 IEEE transactions on fuzzy systems  
In this paper, we propose a method for automated construction of fuzzy event sets out of event histories via data mining techniques.  ...  Sometimes the number of events in an event-driven system may become very high and unmanageable.  ...  Yazici for his help in establishing the fuzzy concept background and Dr. E. Başçi for his comments in the mathematical formulations of this work.  ... 
doi:10.1109/91.928741 fatcat:hap5ad4nezfutauzxqeu6gvjey

Application of Improved Decision Tree Method based on Rough Set in Building Smart Medical Analysis CRM System

Hongsheng Xu, Lan Wang, Wenli Gan
2016 International Journal of Smart Home  
Decision tree learning is an inductive learning algorithm based example. Rough set theory is used to process uncertain and imprecise information.  ...  In this paper, a decision tree algorithm based on rough set is proposed, and the improved decision tree algorithm based on rough classification is better than the standard C4.5 algorithm in classification  ...  Acknowledgments This paper is supported by Scientific and technological projects of Henan Province in China (142102310482), and also is supported by the science and technology research major project of  ... 
doi:10.14257/ijsh.2016.10.1.23 fatcat:tymdc3mxvfbwheckscu7ujtn4y

Automating Genomic Data Mining via a Sequence-based Matrix Format and Associative Rule Set

Jonathan D Wren, David Johnson, Le Gruenwald
2005 BMC Bioinformatics  
Genomic features are varied in their data types and annotation of these features is spread across multiple databases.  ...  There is an enormous amount of information encoded in each genome, enough to create living, responsive and adaptive organisms.  ...  Acknowledgements This work was funded in part by NSF-EPSCoR grant # EPS-0132534 (JDW).  ... 
doi:10.1186/1471-2105-6-s2-s2 pmid:16026599 pmcid:PMC1637034 fatcat:mrxunf4dlfc3jdb65wgbpeyfo4

Design and Implementation of Rough Set Algorithms on FPGA: A Survey

Kanchan Shailendra, Ashwin. G.
2014 International Journal of Advanced Research in Artificial Intelligence (IJARAI)  
Conventional rough set information processing, such as discovering data dependencies, data reduction, and approximate set classification, involves the use of software running on a general-purpose processor.  ...  Rough set theory, developed by Z. Pawlak, is a powerful soft computing tool for extracting meaningful patterns from vague, imprecise, inconsistent and large chunks of data.  ...  In Table 3, objects 2 and 5 make the database inconsistent. III. NEED OF HARDWARE ACCELERATORS In data mining, processing of large volumes of data using complex algorithms is increasingly common.  ... 
doi:10.14569/ijarai.2014.030903 fatcat:t4wxelzcizau7f4txgf2iekleq