The Minimum Description Length Principle for Pattern Mining: A Survey [article]

Esther Galbrun
2021 arXiv   pre-print
After giving an outline of relevant concepts from information theory and coding, as well as of work on the theory behind the MDL and similar principles, we review MDL-based methods for mining various types  ...  of data and patterns.  ...  The Predictive Paradigm -Compression and Model Bias in Human Cognition.  ... 
arXiv:2007.14009v3 fatcat:v7zhxwfa5zhc7gd6msyews73ze

Summarizing categorical data by clustering attributes

Michael Mampaey, Jilles Vreeken
2011 Data mining and knowledge discovery  
While low-order statistics only provide very limited insight, downright mining the data rapidly provides too much detail for such a quick glance.  ...  Besides providing a practical overview of which attributes interact most strongly, these summaries can also be used as surrogates for the data, and can easily be queried.  ...  Acknowledgments The authors thank i-ICT of Antwerp University Hospital (uza) for providing the MCADD data and expertise.  ... 
doi:10.1007/s10618-011-0246-6 fatcat:to3yvei5tzguno5tsprbi3pt6a

AI Approaches to Environmental Impact Assessments (EIAs) in the Mining and Metals Sector Using AutoML and Bayesian Modeling

Saki Gerassis, Eduardo Giráldez, María Pazo-Rodríguez, Ángeles Saavedra, Javier Taboada
2021 Applied Sciences  
environmental factors, finding relevant patterns in the data and minimizing the influence of human bias.  ...  Mining engineers and environmental experts around the world still identify and evaluate environmental risks associated with mining activities using field-based, basic qualitative methods. The main objective  ...  In practical terms, minimizing the MDL score consists of finding the best trade-off between complexity and data representation [45].  ... 
doi:10.3390/app11177914 fatcat:ixiyr2gmw5ca5db7vq64fcdhcu

Unsupervised interaction-preserving discretization of multivariate data

Hoang-Vu Nguyen, Emmanuel Müller, Jilles Vreeken, Klemens Böhm
2014 Data mining and knowledge discovery  
It is an important and general pre-processing technique, and a critical element of many data mining and data management tasks.  ...  In particular, our method examines consecutive multivariate regions and combines them if (a) their multivariate data distributions are statistically similar, and (b) this merge reduces the MDL encoding  ...  Acknowledgments We thank the anonymous reviewers for their insightful comments. Hoang-Vu Nguyen is supported by the German Research Foundation (DFG) within GRK 1194.  ... 
doi:10.1007/s10618-014-0350-5 fatcat:pwv4t4snw5djvnwd6qybqsrvjy

Simplicity, Scientific Inference and Econometric Modelling

Hugo A. Keuzenkamp, Michael McAleer
1995 Economic Journal  
Then the minimal description length (MDL) of the data encoded with help of the theory, and the theory itself, is (up to a term of order (log k)/N): MDL = min_{θ,k} { −log P(x|θ) + ... }.  ...  In particular, use will be made of insights in algorithmic information theory, the theory of inductive reasoning, and Kolmogorov complexity theory.  ... 
doi:10.2307/2235317 fatcat:yj6qxe64ifc4thik7x5yxbrz7a

The blind men and the elephant: on meeting the problem of multiple truths in data from clustering and pattern mining perspectives

Arthur Zimek, Jilles Vreeken
2013 Machine Learning  
In this position paper, we discuss how different branches of research on clustering and pattern mining, while rather different at first glance, in fact have a lot in common and can learn a lot from each  ...  Second, we relate a representative of these areas, subspace clustering, to pattern mining.  ...  In the middle plot, we show both modern pattern mining and subspace clustering. Both these branches of exploratory data mining research focus on identifying informative submatrices of the data.  ... 
doi:10.1007/s10994-013-5334-y fatcat:lmqlklqeevcmtmuq4urn74is6u

Support vector machines in the prediction of mutagenicity of chemical compounds

Thomas Ferrari, Giuseppina Gini, Emilio Benfenati
2009 NAFIPS 2009 - 2009 Annual Meeting of the North American Fuzzy Information Processing Society  
The classifier, that we derived from SVM methods, outperforms the available methods in performance and simplicity.  ...  In this paper we introduce the problem of predicting the mutagenic toxicity property of chemical compounds and we discuss how this can be partially formulated as a computational intelligence problem.  ...  Bursi for supplying the Salmonella assay data and the partners of the EU CAESAR project (in particular BCX) for checking the chemical data and computing the descriptors.  ... 
doi:10.1109/nafips.2009.5156478 fatcat:ayupgo3zfvgivfpkjsb6qyjdqq

Algorithmic Information Dynamics of Cellular Automata [article]

Hector Zenil, Alyssa Adams
2022 arXiv   pre-print
We demonstrate the sensitivity of the Block Decomposition Method on 1D and 2D CA, including Conway's Game of Life, against measures of statistical nature such as compression (LZW) and Shannon Entropy in  ...  two different contexts (1) perturbation analysis and (2) dynamic-state colliding CA.  ...  In practice, however, MDL and AID can complement each other because AID is more difficult to estimate while MDL provides some statistical shortcuts, and this is similar to and mostly already exemplified  ... 
arXiv:2112.13177v2 fatcat:vw43d7rfnrcoja5rzzzqh5abga

Criminal Network Community Detection Using Graphical Analytic Methods: A Survey

Theyvaa Sangkaran, Azween Abdullah, NZ. JhanJhi
2018 EAI Endorsed Transactions on Energy Web  
Criminal network analysis has attracted a number of researchers as network analysis has gained popularity among professionals and researchers.  ...  Thus, it becomes obvious through this study that more research activity is necessary and expected in order to further grow this research area.  ...  an insight.  ... 
doi:10.4108/eai.13-7-2018.162690 fatcat:objcbuwp4bamlcr7asc4r3ejzq

From "Cases" to "Litigation"

Judith Resnik
1991 Law & Contemporary Problems  
While in theory distinct, claim expedition and claim enabling are not severable in practice.  ...  While in theory and in form each case is separate, in practice lawyers on both sides deal with the cases as a group, sometimes making "block settlements"-in which defendants give a lawyer representing  ... 
doi:10.2307/1191922 fatcat:bmtzluo7tzgzrgjyfilr3nxboe

Identifying roles of clinical pharmacy with survey evaluation [article]

Andreja Čufar, Aleš Mrhar, Marko Robnik-Šikonja
2014 arXiv   pre-print
The survey data sets are important sources of data and their successful exploitation is of key importance for informed policy-decision making.  ...  We show how the OrdEval algorithm exploits the information hidden in the ordering of class and attribute values and their inherent correlation.  ...  Aleš Mrhar and Marko Robnik-Šikonja were supported by the Slovenian Research Agency, ARRS, through research programmes P1-0189 (B) and P2-0209, respectively. References  ... 
arXiv:1406.4287v1 fatcat:grcinn4zlnfn3pmxpcono2b6fi

Graph-based data mining: A new tool for the analysis and comparison of scientific domains represented as scientograms

Arnaud Quirin, Oscar Cordón, Benjamín Vargas-Quesada, Félix de Moya-Anegón
2010 Journal of Informetrics  
In this paper, we aim to show that graph-based data mining tools are useful to deal with scientogram analysis.  ...  several countries, the task is titanically complex as the amount of data to analyze becomes huge and complex.  ...  We would like to thank Elsevier for its permission to use the SCOPUS-SJR data in order to build and compare the scientograms.  ... 
doi:10.1016/j.joi.2010.01.004 fatcat:ch344ki5fvafjd3sd4wldxvk6i

Assessment of surveys for the management of hospital clinical pharmacy services

Andreja Čufar, Aleš Mrhar, Marko Robnik-Šikonja
2015 Artificial Intelligence in Medicine  
Methods and material: We use a data mining analytical approach to extract relevant managerial consequences.  ...  Some marketing scholars have questioned this assumption on the basis of economic and psychological theory as well as on a better empirical insight in the satisfaction response function [18] . offered as  ...  In a recent review [44] medical data mining was divided into six medical tasks (screening, diagnosis, treatment, prognosis, monitoring, and management) and for each task five data mining approaches are  ... 
doi:10.1016/j.artmed.2015.04.003 pmid:25940855 fatcat:6g25ihzwpfavtce2n4mmuhv55q

Mining Sparse and Big Data by Case-based Reasoning

Petra Perner
2014 Procedia Computer Science  
Therefore, CBR can mine sparse and big data.  ...  We will develop the bridge between CBR and Statistics and show how case-based reasoning can mine big and sparse data. Examples are given based on multimedia applications.  ...  must not meet the true values of and of feature i.  ... 
doi:10.1016/j.procs.2014.08.081 fatcat:iprxdlu3zfcnfm3wfqazbobwm4

Application of Bayesian Networks in Consumer Service Industry and Healthcare [chapter]

Le Zhang, Yuan Gao, Balmatee Bidassie, Vincent G. Duffy
2014 Lecture Notes in Computer Science  
Backed by information theory and learning algorithms, Bayesian network has seen extensive applications in data mining, especially for complicated systems involving association and causal relationships  ...  Compared to an artificial neural network (ANN), another widely used technique in modern data mining, a Bayesian network reflects complex systems like the case in this study more accurately.  ... 
doi:10.1007/978-3-319-07725-3_48 fatcat:avnc3hczajcvrpmgknedcqkffi