Cascade Generalization: One versus Many

Nahla Barakat
Journal of Computers, Volume 12, Number 3, May 2017
The choice of the best classification algorithm for a specific problem domain has been extensively researched. This issue has also been one of the main motivations behind the ever-increasing interest in ensemble methods, as well as in the choice of ensemble base and meta classifiers. In this paper, we extend and further evaluate a hybrid method for classifier fusion. The method uses only two learning algorithms: a Support Vector Machine (SVM) as the base-level classifier and a different classification algorithm at the meta-level. This is then followed by a final voting stage. Results on nine benchmark data sets confirm that the proposed algorithm, though simple, is a promising ensemble classifier that compares favourably to other well-established techniques.

After solving for α, the training examples with non-zero α values are called the support vectors (SVs). The separating hyperplane is completely defined by the SVs, and they are the only samples which contribute to the classification decision [16].

• It integrates cascade generalization and voting for classifier fusion, using simple majority voting of both base-level and meta-level classifiers to decide the final class of a test example;
• The base level has two classifiers, C0 and SVM0, where only the output of SVM0 is used to create the meta-data (to train the meta-level classifier C1), while the output of C0 is only considered at the voting stage. To obtain good generalization performance, the classifier C is chosen from a different family of ...

• Pima Indians diabetes: A subset of 438 samples from the original dataset was used, after removing all samples with a zero value for the features 2-hour OGTT plasma glucose, diastolic blood pressure and triceps skin fold thickness, since zero values for these features are clinically insignificant;
• Heart diseases: The reduced Cleveland heart diseases dataset was used. All samples with missing values were discarded;
• Breast cancer: The Wisconsin breast cancer dataset was used. All repeated samples were discarded to avoid the bias resulting from the effect of those samples;
• Hypothyroid: The experiments were executed as a binary classification task: normal against all other class labels. All samples with missing values were discarded;
• Australian Credit Approval: This dataset represents credit card applications, with a good mixture of attribute types. All samples with missing values were discarded;
• German: This dataset represents German Credit data. It has 7 numerical and 13 categorical features;
• Wine: This dataset represents wine classification data. It has 13 continuously valued features. The experiments were conducted as a binary classification task (first class/other wine);
• Ionosphere: This dataset represents radar data. All of the features are continuous in value;
• Glass: This dataset is used for classification of types of glass. All features are continuous in value and the experiments were conducted as a binary classification task (window/non-window glass).
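
The preprocessing steps listed above (dropping samples with zero or missing values, discarding repeated samples, and recasting multi-class labels as one class against the rest) might be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the column names, the use of `None` to mark a missing value, and the toy rows are all assumptions made up for the example. In the excerpt each step applies only to some of the datasets; they are chained on one toy table here purely for brevity.

```python
def drop_rows_with_zero(rows, columns):
    """Keep only rows whose value is non-zero in every listed column."""
    return [r for r in rows if all(r[c] != 0 for c in columns)]


def drop_rows_with_missing(rows):
    """Keep only rows with no missing value (assumed to be encoded as None)."""
    return [r for r in rows if all(v is not None for v in r.values())]


def drop_repeated_rows(rows):
    """Keep the first occurrence of each distinct row, preserving order."""
    seen, unique = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def binarize_labels(labels, positive):
    """Map one chosen class to 1 and every other class to 0."""
    return [1 if label == positive else 0 for label in labels]


# Hypothetical toy rows; the column names are illustrative only.
toy_rows = [
    {"glucose": 148, "blood_pressure": 72, "skin_fold": 35},
    {"glucose": 85, "blood_pressure": 0, "skin_fold": 29},     # zero value
    {"glucose": 0, "blood_pressure": 66, "skin_fold": 23},     # zero value
    {"glucose": 148, "blood_pressure": 72, "skin_fold": 35},   # repeated row
    {"glucose": 120, "blood_pressure": None, "skin_fold": 30}, # missing value
]

complete = drop_rows_with_missing(toy_rows)
non_zero = drop_rows_with_zero(
    complete, ["glucose", "blood_pressure", "skin_fold"]
)
cleaned = drop_repeated_rows(non_zero)

# One-against-the-rest labels, e.g. window versus non-window glass.
binary = binarize_labels(["window", "non-window", "window"], positive="window")
```

Filtering before binarizing keeps the label list aligned with the surviving rows when both are derived from the same table.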
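
The fusion scheme described earlier (SVM0 and C0 at the base level, C1 trained on meta-data derived from the output of SVM0, then a majority vote) can be outlined roughly as below. This is a structural sketch under stated assumptions, not the paper's implementation. The excerpt does not say how the meta-data is built, so the sketch assumes the classic cascade-generalization form in which the predicted label of SVM0 is appended to the original features. It also assumes that exactly the three classifiers C0, SVM0 and C1 vote. Any object with `fit` and `predict` methods can be plugged in; the two tiny learners here are hypothetical placeholders, and `NearestCentroid` merely stands in for the SVM (it is not an SVM).

```python
from collections import Counter


def _dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


class NearestCentroid:
    """Placeholder for SVM0: predicts the class with the closest feature mean."""

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        return [
            min(self.centroids, key=lambda c: _dist2(x, self.centroids[c]))
            for x in X
        ]


class OneNearestNeighbour:
    """Placeholder for classifier C: copies the label of the closest sample."""

    def fit(self, X, y):
        self.X, self.y = [list(x) for x in X], list(y)
        return self

    def predict(self, X):
        return [
            self.y[min(range(len(self.X)), key=lambda i: _dist2(x, self.X[i]))]
            for x in X
        ]


def extend_with_output(X, base_output):
    """Assumed meta-data: original features plus the base-level prediction."""
    return [list(x) + [o] for x, o in zip(X, base_output)]


def majority_vote(votes):
    """Return the most frequent label among the votes."""
    return Counter(votes).most_common(1)[0][0]


def cascade_vote_predict(svm0, c0, c1, X_train, y_train, X_test):
    # Base level: both classifiers are trained on the original features.
    svm0.fit(X_train, y_train)
    c0.fit(X_train, y_train)
    # Meta level: only the output of SVM0 feeds the meta-data used to train C1.
    c1.fit(extend_with_output(X_train, svm0.predict(X_train)), y_train)
    # Voting stage: C0 takes part here only, alongside SVM0 and C1.
    p_svm0 = svm0.predict(X_test)
    p_c0 = c0.predict(X_test)
    p_c1 = c1.predict(extend_with_output(X_test, p_svm0))
    return [majority_vote(v) for v in zip(p_svm0, p_c0, p_c1)]


# Hypothetical toy data with two well-separated classes.
X_train = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.8, 1.1], [1.2, 1.0]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[0.1, 0.1], [1.1, 1.0]]

predictions = cascade_vote_predict(
    NearestCentroid(), OneNearestNeighbour(), OneNearestNeighbour(),
    X_train, y_train, X_test,
)
print(predictions)  # [0, 1]
```

With three voters and a binary task the vote can never tie, which is one practical reason to keep an odd number of classifiers at the voting stage.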
doi:10.17706/jcp.12.3.238-249 fatcat:gje7er5dgfdmfieljoztgkokne