Comparative Analysis of Selected Heterogeneous Classifiers for Software Defects Prediction Using Filter-Based Feature Selection Methods

Abimbola G Akintola, Abdullateef Balogun, Fatimah B Lafenwa-Balogun, Hameed A Mojeed
FUOYE Journal of Engineering and Technology, Faculty of Engineering, Federal University Oye-Ekiti, 2018
Classification is a popular approach to software defect prediction: modules, each represented by a set of metrics or code attributes, are categorized as fault-prone (FP) or non-fault-prone (NFP) by a classification model. However, low-quality, unreliable, redundant and noisy data negatively affect the process of discovering knowledge and useful patterns. Therefore, researchers need to retrieve relevant data from huge records ... feature selection methods. Feature selection is the process of identifying the most relevant attributes and removing redundant and irrelevant ones. This study investigated the effect of filter-based feature selection on classification techniques in software defect prediction. Ten publicly available datasets from the NASA and Metric Data Program software repository were used. The most discriminatory attributes of each dataset were identified using Principal Component Analysis (PCA), Correlation-based Feature Selection (CFS) and FilterSubsetEval. The datasets were then classified by four classifiers chosen for their heterogeneity: Naïve Bayes from the Bayes category, KNN from the instance-based learner category, the J48 decision tree from the tree-based category, and the multilayer perceptron from the neural network category. The experimental results showed that applying feature selection to the datasets before classification gives better defect prediction results and should be encouraged; the multilayer perceptron with FilterSubsetEval achieved the best accuracy. It can be concluded that feature selection methods are capable of improving the performance of learning algorithms in software defect prediction.
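The general idea of filtering attributes before classification can be illustrated with a minimal sketch. It is not the study's setup: it swaps in a simple univariate filter (absolute Pearson correlation between each metric and the defect label) rather than PCA, CFS or FilterSubsetEval, uses a plain 1-nearest-neighbour classifier as a stand-in for KNN, and runs on a tiny made-up dataset whose column meanings (`loc`, `complexity`, `noise`) are hypothetical. All function names are illustrative.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def rank_features(X, y):
    """Filter step: order feature indices by |correlation with the label|, best first."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

def select_top_k(X, y, k):
    """Keep the k best-ranked features; returns (kept indices, reduced dataset)."""
    keep = sorted(rank_features(X, y)[:k])
    return keep, [[row[j] for j in keep] for row in X]

def loo_accuracy_1nn(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier (squared Euclidean distance)."""
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(xi, X[j])),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)

# Hypothetical toy data: each row is a module described by [loc, complexity, noise].
# The third column is deliberately unrelated to the label and on a much larger scale.
X = [
    [10, 1, 500], [12, 2, 100], [11, 1, 400], [13, 2, 200],  # non-fault-prone
    [50, 9, 500], [55, 8, 100], [52, 9, 400], [58, 8, 200],  # fault-prone
]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = fault-prone (FP), 0 = non-fault-prone (NFP)

keep, X_selected = select_top_k(X, y, k=2)
acc_all = loo_accuracy_1nn(X, y)           # classifier sees every attribute
acc_sel = loo_accuracy_1nn(X_selected, y)  # classifier sees only the filtered attributes

print("kept feature indices:", keep)
print("accuracy, all features:", acc_all)
print("accuracy, selected features:", acc_sel)
```

On this contrived data the irrelevant column dominates the distance computation, so the nearest-neighbour classifier does much better once the filter has dropped it. The size of that gap is an artifact of the toy example and says nothing about the results reported for the ten datasets in the study.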
doi:10.46792/fuoyejet.v3i1.178 · fatcat:m4cjd63o4jhtjhsbhxd57b5bge