A Comparative Study on Bioinformatics Feature Selection and Classification

Amal Tamer, Amr Badr
2012 International Journal of Computer Applications  
This paper presents an application of supervised machine learning approaches to the classification of colon cancer gene expression data. Established feature selection techniques based on principal component analysis (PCA), independent component analysis (ICA), genetic algorithm (GA) and support vector machine (SVM) are, for the first time, applied to this data set to support learning and classification. Different classifiers are implemented to investigate the impact of combining feature ... ction and classification methods. The learning classifiers implemented include K-Nearest Neighbors (KNN) and support vector machine. Results of comparative studies are provided, demonstrating that effective feature selection is essential to the development of classifiers intended for use in high-dimensional domains. This research also shows that feature selection helps increase computational efficiency while improving classification accuracy.
doi:10.5120/6081-8219 fatcat:ky26i3xcwra73cvl7qj47rx6um