EXPLORING FEATURES AND CLASSIFIERS TO CLASSIFY GENE EXPRESSION PROFILES OF ACUTE LEUKEMIA

SUNG-BAE CHO
2002 International Journal of Pattern Recognition and Artificial Intelligence
Bioinformatics has recently drawn considerable attention as a way to analyze biological genomic information efficiently with information technology, especially pattern recognition. In this paper, we explore a wide range of features and classifiers through a comparative study of the most promising feature selection methods and machine learning classifiers. Gene expression information from a patient's marrow, measured by DNA microarray, is used to predict the cancer class, which is either acute myeloid leukemia or acute lymphoblastic leukemia. Pearson's and Spearman's correlation coefficients, Euclidean distance, cosine coefficient, information gain, mutual information and signal-to-noise ratio have been used for feature selection. Backpropagation neural network, self-organizing map, structure-adaptive self-organizing map, support vector machine, inductive decision tree and k-nearest neighbor have been used for classification. Experimental results indicate that the backpropagation neural network with Pearson's correlation coefficient produces the best result, a recognition rate of 97.1% on the test data.
doi:10.1142/s0218001402002015 fatcat:d3gljkjvcvcehcbsnjrpmxl2hm
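
The pipeline named in the abstract (rank genes with a feature selection score, then classify the reduced profiles) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`pearson_gene_scores`, `signal_to_noise_scores`, `top_genes`, `knn_predict`), the 0/1 label encoding, the signal-to-noise form (difference of class means over the sum of class standard deviations), the majority-vote k-nearest neighbor rule and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def pearson_gene_scores(X, y):
    """Pearson correlation between each gene (column of X) and labels y (0/1)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Genes with zero variance get a score of 0 instead of NaN.
    return np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)


def signal_to_noise_scores(X, y):
    """Assumed form: (mean_1 - mean_0) / (std_1 + std_0), computed per gene."""
    pos, neg = X[y == 1], X[y == 0]
    diff = pos.mean(axis=0) - neg.mean(axis=0)
    spread = pos.std(axis=0) + neg.std(axis=0)
    return np.divide(diff, spread, out=np.zeros(X.shape[1]), where=spread > 0)


def top_genes(scores, n):
    """Indices of the n genes with the largest absolute score."""
    return np.argsort(-np.abs(scores))[:n]


def knn_predict(X_train, y_train, X_test, k=3):
    """Majority-vote k-nearest neighbor with Euclidean distance (0/1 labels)."""
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return (y_train[nearest].mean(axis=1) > 0.5).astype(int)


# Synthetic stand-in for expression profiles: 40 samples, 200 genes,
# with only the first 5 genes carrying class information (an assumption).
rng = np.random.default_rng(0)
y = np.array([0] * 20 + [1] * 20)
X = rng.normal(size=(40, 200))
X[y == 1, :5] += 4.0

train = np.arange(0, 40, 2)  # even-indexed samples for training
test = np.arange(1, 40, 2)   # odd-indexed samples for testing

# Select genes on the training split only, then classify the test split.
selected = top_genes(pearson_gene_scores(X[train], y[train]), 5)
pred = knn_predict(X[train][:, selected], y[train], X[test][:, selected], k=3)
accuracy = (pred == y[test]).mean()
print(sorted(selected.tolist()), accuracy)
```

Selecting genes on the training split alone keeps the test samples out of the feature ranking. Swapping `pearson_gene_scores` for `signal_to_noise_scores` changes only the ranking step, which mirrors the kind of feature-by-classifier comparison the abstract describes.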