Metasample-Based Sparse Representation for Tumor Classification

Chun-Hou Zheng, Lei Zhang, To-Yee Ng, Chi Keung Shiu, De-Shuang Huang
2011 IEEE/ACM Transactions on Computational Biology & Bioinformatics  
Reliable and accurate identification of tumor type is crucial to the proper treatment of cancer. In recent years, it has been shown that sparse representation (SR) by l1-norm minimization is robust to noise, outliers, and even incomplete measurements, and SR has been successfully used for classification. This paper presents a new SR-based method for tumor classification using gene expression data. A set of metasamples is extracted from the training samples, and then an input testing sample is represented as a linear combination of these metasamples by the l1-regularized least squares method. Classification is achieved by using a discriminating function defined on the representation coefficients. Since l1-norm minimization leads to a sparse solution, the proposed method is called metasample-based SR classification (MSRC). Extensive experiments on publicly available gene expression datasets show that MSRC is efficient for tumor classification, achieving higher accuracy than many existing representative schemes.

... generalize well to new data from the same class of cancer [11]. Sparse representation (SR) is a new and powerful data processing method, inspired by recent progress in l1-norm minimization based methods such as basis pursuit [12], compressive sensing for signal reconstruction [13-15], and the least absolute shrinkage and selection operator (LASSO) algorithm for feature selection [16]. By using the SR technique to represent the input testing face image as a sparse linear combination of the training samples, an SR-based classification (SRC) method was proposed in [18] for face recognition. Ideally, in SRC it is expected that a testing sample ...
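
The MSRC pipeline summarized in the abstract (extract metasamples, code a testing sample over them with l1-regularized least squares, then classify from the coefficients) can be illustrated with a minimal sketch. Everything specific here is an assumption, not the authors' implementation: metasamples are taken as the leading left singular vectors of each class's training matrix, the l1 problem is solved with plain ISTA (iterative soft thresholding), and the discriminating function is the class-wise reconstruction residual used in SRC-style classifiers. The function names, the toy data, and the values of `k` and `lam` are hypothetical.

```python
import numpy as np

def extract_metasamples(X, y, k):
    """Assumed extraction: top-k left singular vectors per class.
    X is genes x samples; y holds one class label per column."""
    blocks, owners = [], []
    for c in np.unique(y):
        U, _, _ = np.linalg.svd(X[:, y == c], full_matrices=False)
        M = U[:, :k]
        blocks.append(M)
        owners.extend([c] * M.shape[1])  # class that owns each metasample
    return np.hstack(blocks), np.array(owners)

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_least_squares(W, x, lam, n_iter=500):
    """ISTA for min_a 0.5 * ||x - W a||_2^2 + lam * ||a||_1."""
    step = 1.0 / np.linalg.norm(W, 2) ** 2  # 1 / Lipschitz constant
    a = np.zeros(W.shape[1])
    for _ in range(n_iter):
        gradient = W.T @ (W @ a - x)
        a = soft_threshold(a - step * gradient, step * lam)
    return a

def classify(W, owners, x, lam=0.01):
    """Assumed decision rule: pick the class whose metasamples,
    weighted by their own coefficients, best reconstruct x."""
    a = l1_least_squares(W, x, lam)
    classes = np.unique(owners)
    residuals = [
        np.linalg.norm(x - W[:, owners == c] @ a[owners == c])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]

# Toy data (hypothetical): 6 "genes", 3 training samples per class.
X = np.column_stack([
    [1.0, 0.5, 0, 0, 0, 0], [0.5, 1.0, 0, 0, 0, 0], [1.0, 1.0, 0, 0, 0, 0],
    [0, 0, 1.0, 0.5, 0, 0], [0, 0, 0.5, 1.0, 0, 0], [0, 0, 1.0, 1.0, 0, 0],
])
y = np.array([0, 0, 0, 1, 1, 1])
W, owners = extract_metasamples(X, y, k=2)

print(classify(W, owners, np.array([0, 0, 2.0, 1.0, 0, 0])))
print(classify(W, owners, np.array([1.0, 2.0, 0, 0, 0, 0])))
```

In this sketch a larger `lam` yields sparser coefficients at the cost of reconstruction accuracy, and `k` controls how many metasamples represent each class. Both would need tuning on real gene expression data, for example by cross-validation.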
doi:10.1109/tcbb.2011.20 pmid:21282864 fatcat:v7ytlshvjzcxvaidhbdpt7gvpm