Structure feature selection for chemical compound classification

Hongliang Fei, Jun Huan
2008 8th IEEE International Conference on BioInformatics and BioEngineering
With the development of highly efficient cheminformatics data collection technology, classification of chemical structure data has emerged as an important topic in cheminformatics. Toward building highly accurate predictive models for chemical data, we present an efficient feature selection method. In our method, we first represent a chemical structure by its 2D connectivity map. We then use frequent subgraph mining to identify structural fragments as features for graph classification. Different from existing methods, we consider the spatial distribution of the subgraph features in the graph data and select those that have consistent spatial locations. We have applied our feature selection method to several cheminformatics benchmarks. Our experimental results demonstrate a significant improvement in prediction compared to state-of-the-art feature selection methods.
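The selection idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not say how spatial location is measured, so everything below is an assumption made for illustration. Fragments are reduced to single labeled edges instead of general frequent subgraphs, an occurrence's position is taken to be the mean eccentricity of its endpoints divided by the graph diameter, and "consistent" means a small standard deviation of that position across graphs. The names `edge_locations`, `select_features`, `min_support`, and `max_spread` are hypothetical.

```python
from collections import defaultdict, deque
from statistics import pstdev


def bfs_depths(adj, start):
    """Hop distance from start to every node reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def edge_locations(labels, edges):
    """Map each labeled-edge fragment to its normalized positions in one graph.

    Assumed position measure (not from the paper): mean eccentricity of the
    edge's endpoints divided by the graph diameter.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    ecc = {u: max(bfs_depths(adj, u).values()) for u in adj}
    diameter = max(ecc.values(), default=0) or 1
    locs = defaultdict(list)
    for u, v in edges:
        frag = tuple(sorted((labels[u], labels[v])))
        locs[frag].append((ecc[u] + ecc[v]) / (2 * diameter))
    return locs


def select_features(graphs, min_support=2, max_spread=0.1):
    """Keep fragments that are frequent and sit at consistent positions."""
    per_fragment = defaultdict(list)  # fragment -> one mean position per graph
    for labels, edges in graphs:
        for frag, positions in edge_locations(labels, edges).items():
            per_fragment[frag].append(sum(positions) / len(positions))
    return sorted(
        frag
        for frag, pos in per_fragment.items()
        if len(pos) >= min_support and pstdev(pos) <= max_spread
    )


# Two toy molecules as (atom labels, bond list): C-C-O and C-C-N.
g1 = ({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)])
g2 = ({0: "C", 1: "C", 2: "N"}, [(0, 1), (1, 2)])
print(select_features([g1, g2]))  # [('C', 'C')]
```

In this toy input only the C-C fragment occurs in both graphs, and it sits at the same normalized position in each, so it is the only feature kept. A real pipeline would replace the labeled-edge stand-in with a frequent subgraph miner and a position measure suited to the data.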
doi:10.1109/bibe.2008.4696655 dblp:conf/bibe/FeiH08 fatcat:73plpknoi5gp5owinpglgoxqti