Automated Empirical Selection of Rule Induction Methods Based on Recursive Iteration of Resampling Methods
[chapter]
2010
IFIP Advances in Information and Communication Technology
For this purpose, we introduce multiple testing based on recursive iteration of resampling methods for rule-induction (MULT-RECITE-R). ...
One of the most important problems in rule induction methods is how to estimate which method is the best to use in an applied domain. ...
doi:10.1007/978-3-642-16327-2_19
fatcat:ufq5kyylbrb3vbu7aftjjbylpy
Data mining with decision trees and decision rules
1997
Future generations computer systems
This paper describes the use of decision tree and rule induction in data mining applications. ...
Of methods for classification and regression that have been developed in the fields of pattern recognition, statistics, and machine learning, these are of particular interest for data mining since they ...
Using train and test evaluation methods, the initial covering rule set is scaled back to the most statistically accurate subset of rules. ...
doi:10.1016/s0167-739x(97)00021-6
fatcat:ou2pqwutqnh2bdmxcfhziy5raa
Machine learning-based clinical prediction modeling – A practical guide for clinicians
[article]
2020
arXiv
pre-print
In further sections, we review the importance of resampling, overfitting and model generalizability as well as feature reduction and selection (Part II), strategies for model evaluation, reporting and ...
In the first section, we provide explanations on the general principles of machine learning, as well as analytical steps required for successful machine learning-based predictive modelling - which is the ...
The use of multiple measures of performance (AUC, F1, etc.) is recommended.
Resampling: Resampling methods fit a model multiple times on different subsets of the training data. ...
arXiv:2006.15069v1
fatcat:ovhpshrz5nbn7dwjrrzrvouraq
Auto-GNN: Neural Architecture Search of Graph Neural Networks
[article]
2019
arXiv
pre-print
Experiments on real-world benchmark datasets demonstrate that the GNN architecture identified by AGNN achieves the best performance compared with existing handcrafted models and traditional search methods ...
First, the search space of GNN is different from the ones in existing NAS work. ...
Under the inductive learning, the training process has no idea about the graph structure and node features on both validation and testing sets. ...
arXiv:1909.03184v2
fatcat:lvfwckwir5cdngznmuwbhjkvpa
Ensemble Decision Making System for Breast Cancer Data
2012
International Journal of Computer Applications
In this study, a hybrid approach that combines a CART decision tree classifier with feature selection and a boosting ensemble method is considered to evaluate classifier performance. ...
Various breast cancer data sets are considered for this study, as breast cancer is one of the leading causes of death in women. ...
The accuracy of the hybrid approach which is a combination of feature selection and CART with Boosting is tested on the selected breast cancer datasets with best feature selection method. ...
doi:10.5120/8134-1823
fatcat:p22ad7wvsrdfxb7ujtrro3d474
Minimally-Supervised Morphological Segmentation using Adaptor Grammars
2013
Transactions of the Association for Computational Linguistics
We evaluate on five languages and show that semi-supervised training provides a boost over unsupervised training, while the model selection method yields the best average results over all languages and ...
We compare three training methods: unsupervised training, semi-supervised training, and a novel model selection method. ...
We thank Constantine Lignos for releasing his Morsel code to us, Sami Virpioja for evaluating test set results, and Federico Sangati for providing useful scripts. ...
doi:10.1162/tacl_a_00225
fatcat:pfxwqf6hv5gutff26olruk5bqq
Application of statistics and machine learning for risk stratification of heritable cardiac arrhythmias
2013
Expert systems with applications
We then explore less common and more recent statistical and machine learning methods adopted by other biological studies and assess their applicability in the study of HCA. ...
They have been adopted for feature selection of predictor variables in risk stratification studies, and in some cases, prove better than classical methods. ...
Jiang et al. (2007) utilized the ensemble learning scheme, as used in random forests to develop a rule voting method and used it as the underlying scoring mechanism in their multiple selection rule voting ...
doi:10.1016/j.eswa.2012.10.054
fatcat:mqvrvbrpbncwfb7cgm2p452yqu
Exploratory data analysis in the context of data mining and resampling
2010
International Journal of Psychological Research
In this article, EDA is introduced in the context of data mining and resampling with an emphasis on three goals: cluster detection, variable selection, and pattern recognition. ...
In addition, the nature of EDA has been changing due to the emergence of new methods and convergence between EDA and other methodologies, such as data mining and resampling. ...
Thus, this iterative loop continues one layer at a time until the errors are minimized. Neural networks use multiple paths for model construction. ...
doi:10.21500/20112084.819
fatcat:qeezpshxw5gorg2hyhcowwghn4
Size matters: three methods for estimating nuclear size in mycorrhizal roots of Medicago truncatula by image analysis
2019
BMC Plant Biology
The intracellular accommodation of arbuscular mycorrhizal (AM) fungi involves a profound molecular reprogramming of the host cell architecture and metabolism, based on the activation of a symbiotic signaling ...
analysing the correlation in space and time between the induction of cortical cell division and endoreduplication upon AM colonization. ...
in the design of the study, collection, analysis, and interpretation of data or in writing the manuscript. ...
doi:10.1186/s12870-019-1791-1
fatcat:yfijij6xeje7xe6atjkyjgfr7m
Classifying Protein Fingerprints
[chapter]
2004
Lecture Notes in Computer Science
The final model's error rate was estimated at 14.1% on a blind test set, representing a 26% accuracy gain over PRECIS' handcrafted rules. ...
This paper reports on an attempt to build more accurate classifiers based on information drawn from the fingerprints themselves and from the SWISS-PROT database. ...
Acknowledgements: The work reported above was partially funded by the European Commission and the Swiss Federal Office for Education and Science in the framework of the BioMinT project. ...
doi:10.1007/978-3-540-30116-5_20
fatcat:3lwuzkru5bd33ikmszj7upsm5u
Credit Decision Support Based on Real Set of Cash Loans Using Integrated Machine Learning Algorithms
2021
Electronics
During experiments, we analyzed the impact of feature selection on the results of binary classification, and the impact of data resampling with feature discretization on the results of feature selection ...
In particular, processing pipeline was designed, which consists of methods for data resampling, feature discretization, feature selection, and binary classification. ...
The only processing method applied to the training cases but not to the testing cases was data resampling. ...
doi:10.3390/electronics10172099
fatcat:wssm4xwtnfe3tgjmvubjighqdu
Remote Sensing Image Registration Techniques: A Survey
[chapter]
2010
Lecture Notes in Computer Science
Despite numerous techniques being developed for image registration, only a handful has proved to be useful for registration of remote sensing images due to their characteristic of being computationally ...
This paper presents a comprehensive survey of such literatures including recently developed techniques. ...
of a transformation function and (e) Resampling. ...
doi:10.1007/978-3-642-13681-8_13
fatcat:52sbajz7cff2bjrxz5mr2h2cem
Improving scenario discovery by bagging random boxes
2016
Technological forecasting & social change
This improved version is based on the idea of performing multiple PRIM analyses based on randomly selected features and combining these results using a bagging technique. ...
The most frequently used algorithm is the Patient Rule Induction Method (PRIM). ...
An earlier version of this paper has been presented at the 2014 PICMET conference. Based on comments and suggestions received there, we substantially enhanced the paper. ...
doi:10.1016/j.techfore.2016.06.014
fatcat:qk655ezfprg3hiai62r6nvfhpa
Learning Proposals for Probabilistic Programs with Inference Combinators
[article]
2021
arXiv
pre-print
We demonstrate the flexibility of this framework by implementing advanced variational methods based on amortized Gibbs sampling and annealing. ...
Inference combinators define a grammar over importance samplers that compose primitive operations such as application of a transition kernel and importance resampling. ...
ACKNOWLEDGEMENTS This work was supported by the Intel Corporation, the 3M Corporation, NSF award 1835309, startup funds from Northeastern University, the Air Force Research Laboratory (AFRL), and DARPA ...
arXiv:2103.00668v3
fatcat:wc5yn2njabbzpeayykqk55cymu
Mapping land-cover modifications over large areas: A comparison of machine learning algorithms
2008
Remote Sensing of Environment
Comparisons were based on several criteria: overall accuracy, sensitivity to data set size and variation, and noise. ...
The objective of this research is to compare the performance of three machine learning algorithms (MLAs): two classification tree software routines (S-plus and C4.5) and an artificial neural network (ARTMAP ...
The trees were generated in C4.5 using the 'iterative' mode, which generates a series of decision trees based on randomly-selected subsets of the data (termed the 'window'). ...
doi:10.1016/j.rse.2007.10.004
fatcat:u7nuvvxs2rb5doovs52fbck4j4