On Feature Selection for Genomic Signal Processing and Data Mining

S.Y. Kung
2007 Machine Learning for Signal Processing  
The effectiveness of a data mining system lies in its representation of pattern vectors. The most vital information to represent is the set of characteristics embedded in the raw data that are most essential for the intended applications. To extract a useful high-level representation, the representation should provide concise, invariant, and/or intelligible information about the input patterns. The curse of dimensionality has traditionally been a serious concern in many genomic applications. For example, the feature dimension of gene expression data is often on the order of thousands. This motivates exploration into feature selection and representation, both of which aim to reduce the feature dimensionality to facilitate training and prediction on genomic data. The challenge lies in how to reduce feature dimension while conceding minimum sacrifice ...
doi:10.1109/mlsp.2007.4414275 fatcat:at3rnrj7u5eyrkohbnuqr6jrny