Automated Empirical Selection of Rule Induction Methods Based on Recursive Iteration of Resampling Methods [chapter]

Shusaku Tsumoto, Shoji Hirano, Hidenao Abe
2010 IFIP Advances in Information and Communication Technology  
For this purpose, we introduce multiple testing based on recursive iteration of resampling methods for rule-induction (MULT-RECITE-R).  ...  One of the most important problems in rule induction methods is how to estimate which method is the best to use in an applied domain.  ... 
doi:10.1007/978-3-642-16327-2_19 fatcat:ufq5kyylbrb3vbu7aftjjbylpy

Data mining with decision trees and decision rules

Chidanand Apté, Sholom Weiss
1997 Future generations computer systems  
This paper describes the use of decision tree and rule induction in data mining applications.  ...  Of methods for classification and regression that have been developed in the fields of pattern recognition, statistics, and machine learning, these are of particular interest for data mining since they  ...  Using train and test evaluation methods, the initial covering rule set is scaled back to the most statistically accurate subset of rules.  ... 
doi:10.1016/s0167-739x(97)00021-6 fatcat:ou2pqwutqnh2bdmxcfhziy5raa

Machine learning-based clinical prediction modeling – A practical guide for clinicians [article]

Julius M. Kernbach, Victor E. Staartjes
2020 arXiv   pre-print
In further sections, we review the importance of resampling, overfitting and model generalizability as well as feature reduction and selection (Part II), strategies for model evaluation, reporting and  ...  In the first section, we provide explanations on the general principles of machine learning, as well as analytical steps required for successful machine learning-based predictive modelling - which is the  ...  The use of multiple measures of performance (AUC, F1 etc.) are recommended. Resampling Resampling methods fit a model multiple times on different subsets of the training data.  ... 
arXiv:2006.15069v1 fatcat:ovhpshrz5nbn7dwjrrzrvouraq

Auto-GNN: Neural Architecture Search of Graph Neural Networks [article]

Kaixiong Zhou, Qingquan Song, Xiao Huang, Xia Hu
2019 arXiv   pre-print
Experiments on real-world benchmark datasets demonstrate that the GNN architecture identified by AGNN achieves the best performance, comparing with existing handcrafted models and traditional search methods  ...  First, the search space of GNN is different from the ones in existing NAS work.  ...  Under the inductive learning, the training process has no idea about the graph structure and node features on both validation and testing sets.  ... 
arXiv:1909.03184v2 fatcat:lvfwckwir5cdngznmuwbhjkvpa

Ensemble Decision Making System for Breast Cancer Data

D. Lavanya, K. Usha Rani
2012 International Journal of Computer Applications  
In this study, a hybrid approach: CART decision tree classifier with feature selection and boosting ensemble method has been considered to evaluate the performance of classifier.  ...  Various Breast cancer data sets are considered for this study as breast cancer is one of the leading causes of death in women.  ...  The accuracy of the hybrid approach which is a combination of feature selection and CART with Boosting is tested on the selected breast cancer datasets with best feature selection method.  ... 
doi:10.5120/8134-1823 fatcat:p22ad7wvsrdfxb7ujtrro3d474

Minimally-Supervised Morphological Segmentation using Adaptor Grammars

Kairit Sirts, Sharon Goldwater
2013 Transactions of the Association for Computational Linguistics  
We evaluate on five languages and show that semi-supervised training provides a boost over unsupervised training, while the model selection method yields the best average results over all languages and  ...  We compare three training methods: unsupervised training, semisupervised training, and a novel model selection method.  ...  We thank Constantine Lignos for releasing his Morsel code to us, Sami Virpioja for evaluating test set results, and Federico Sangati for providing useful scripts.  ... 
doi:10.1162/tacl_a_00225 fatcat:pfxwqf6hv5gutff26olruk5bqq

Application of statistics and machine learning for risk stratification of heritable cardiac arrhythmias

P.S. Wasan, M. Uttamchandani, S. Moochhala, V.B. Yap, P.H. Yap
2013 Expert systems with applications  
We then explore less common and more recent statistical and machine learning methods adopted by other biological studies and assess their applicability in the study of HCA.  ...  They have been adopted for feature selection of predictor variables in risk stratification studies, and in some cases, prove better than classical methods.  ...  Jiang et al. (2007) utilized the ensemble learning scheme, as used in random forests to develop a rule voting method and used it as the underlying scoring mechanism in their multiple selection rule voting  ... 
doi:10.1016/j.eswa.2012.10.054 fatcat:mqvrvbrpbncwfb7cgm2p452yqu

Exploratory data analysis in the context of data mining and resampling

Chong Ho Yu
2010 International Journal of Psychological Research  
In this article, EDA is introduced in the context of data mining and resampling with an emphasis on three goals: cluster detection, variable selection, and pattern recognition.  ...  In addition, the nature of EDA has been changing due to the emergence of new methods and convergence between EDA and other methodologies, such as data mining and resampling.  ...  Thus, this iterative loop continues one layer at a time until the errors are minimized. Neural networks use multiple paths for model construction.  ... 
doi:10.21500/20112084.819 fatcat:qeezpshxw5gorg2hyhcowwghn4

Size matters: three methods for estimating nuclear size in mycorrhizal roots of Medicago truncatula by image analysis

Gennaro Carotenuto, Ivan Sciascia, Ludovica Oddi, Veronica Volpe, Andrea Genre
2019 BMC Plant Biology  
The intracellular accommodation of arbuscular mycorrhizal (AM) fungi involves a profound molecular reprogramming of the host cell architecture and metabolism, based on the activation of a symbiotic signaling  ...  analysing the correlation in space and time between the induction of cortical cell division and endoreduplication upon AM colonization.  ...  in the design of the study, collection, analysis, and interpretation of data or in writing the manuscript.  ... 
doi:10.1186/s12870-019-1791-1 fatcat:yfijij6xeje7xe6atjkyjgfr7m

Classifying Protein Fingerprints [chapter]

Melanie Hilario, Alex Mitchell, Jee-Hyub Kim, Paul Bradley, Terri Attwood
2004 Lecture Notes in Computer Science  
The final model's error rate was estimated at 14.1% on a blind test set, representing a 26% accuracy gain over PRECIS' handcrafted rules.  ...  This paper reports on an attempt to build more accurate classifiers based on information drawn from the fingerprints themselves and from the SWISS-PROT database.  ...  Acknowledgements The work reported above was partially funded by the European Commission and the Swiss Federal Office for Education and Science in the framework of the BioMinT project.  ... 
doi:10.1007/978-3-540-30116-5_20 fatcat:3lwuzkru5bd33ikmszj7upsm5u

Credit Decision Support Based on Real Set of Cash Loans Using Integrated Machine Learning Algorithms

Paweł Ziemba, Jarosław Becker, Aneta Becker, Aleksandra Radomska-Zalas, Mateusz Pawluk, Dariusz Wierzba
2021 Electronics  
During experiments, we analyzed the impact of feature selection on the results of binary classification, and the impact of data resampling with feature discretization on the results of feature selection  ...  In particular, processing pipeline was designed, which consists of methods for data resampling, feature discretization, feature selection, and binary classification.  ...  Only one processing method used on training cases without testing cases was data resampling.  ... 
doi:10.3390/electronics10172099 fatcat:wssm4xwtnfe3tgjmvubjighqdu

Remote Sensing Image Registration Techniques: A Survey [chapter]

Suma Dawn, Vikas Saxena, Bhudev Sharma
2010 Lecture Notes in Computer Science  
Despite numerous techniques being developed for image registration, only a handful has proved to be useful for registration of remote sensing images due to their characteristic of being computationally  ...  This paper presents a comprehensive survey of such literatures including recently developed techniques.  ...  of a transformation function and (e) Resampling.  ... 
doi:10.1007/978-3-642-13681-8_13 fatcat:52sbajz7cff2bjrxz5mr2h2cem

Improving scenario discovery by bagging random boxes

J.H. Kwakkel, S.C. Cunningham
2016 Technological forecasting & social change  
This improved version is based on the idea of performing multiple PRIM analyses based on randomly selected features and combining these results using a bagging technique.  ...  The most frequently used algorithm is the Patient Rule Induction Method (PRIM).  ...  An earlier version of this paper has been presented at the 2014 PICMET conference. Based on comments and suggestions received there, we substantially enhanced the paper.  ... 
doi:10.1016/j.techfore.2016.06.014 fatcat:qk655ezfprg3hiai62r6nvfhpa

Learning Proposals for Probabilistic Programs with Inference Combinators [article]

Sam Stites, Heiko Zimmermann, Hao Wu, Eli Sennesh, Jan-Willem van de Meent
2021 arXiv   pre-print
We demonstrate the flexibility of this framework by implementing advanced variational methods based on amortized Gibbs sampling and annealing.  ...  Inference combinators define a grammar over importance samplers that compose primitive operations such as application of a transition kernel and importance resampling.  ...  ACKNOWLEDGEMENTS This work was supported by the Intel Corporation, the 3M Corporation, NSF award 1835309, startup funds from Northeastern University, the Air Force Research Laboratory (AFRL), and DARPA  ... 
arXiv:2103.00668v3 fatcat:wc5yn2njabbzpeayykqk55cymu

Mapping land-cover modifications over large areas: A comparison of machine learning algorithms

John Rogan, Janet Franklin, Doug Stow, Jennifer Miller, Curtis Woodcock, Dar Roberts
2008 Remote Sensing of Environment  
Comparisons were based on several criteria: overall accuracy, sensitivity to data set size and variation, and noise.  ...  The objective of this research is to compare the performance of three machine learning algorithms (MLAs); two classification tree software routines (S-plus and C4.5) and an artificial neural network (ARTMAP  ...  The trees were generated in C4.5 using the 'iterative' mode, which generates a series of decision trees based on randomly-selected subsets of the data (termed the 'window').  ... 
doi:10.1016/j.rse.2007.10.004 fatcat:u7nuvvxs2rb5doovs52fbck4j4