GPU-accelerated machine learning techniques enable QSAR modeling of large HTS data

E. W. Lowe, M. Butkiewicz, N. Woetzel, J. Meiler
2012 2012 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB)  
Quantitative structure-activity relationship (QSAR) modeling using high-throughput screening (HTS) data is a powerful technique that enables the construction of predictive models. These models are used for the in silico screening of libraries of molecules for which experimental screening methods are both cost- and time-expensive. Machine learning techniques excel in QSAR modeling, where the relationship between structure and activity is often complex and non-linear. As these HTS data sets continue to increase in the number of compounds screened, extensive feature selection and cross-validation become computationally expensive. Leveraging massively parallel architectures such as graphics processing units (GPUs) to accelerate the training algorithms for these machine learning techniques is a cost-efficient way to address this problem. In this work, several machine learning techniques are ported to OpenCL for GPU acceleration to enable construction of QSAR ensemble models using HTS data. We report computational performance numbers using several HTS data sets freely available from the PubChem database. We also report results of a case study using HTS data for a target of pharmacological and pharmaceutical relevance, cytochrome P450 3A4, for which an enrichment of 94% of the theoretical maximum is achieved.
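The abstract reports enrichment as a percentage of the theoretical maximum but does not define the metric. As a rough illustration only, the sketch below assumes a common virtual-screening definition: the active rate among the top-ranked fraction of compounds divided by the active rate in the whole data set, with the maximum reached when every available top slot is filled by an active. The function name `enrichment`, its parameters, and the example data are hypothetical and are not taken from the paper.

```python
import math


def enrichment(scores, labels, fraction):
    """Return (enrichment, max_enrichment) for the top `fraction` of compounds.

    Assumed definition (not confirmed by the source): the active rate among
    the top-ranked compounds divided by the active rate in the full set.
    `labels` holds 1 for active compounds and 0 for inactive ones.
    """
    n_total = len(labels)
    n_active = sum(labels)
    if n_total == 0 or n_active == 0 or len(scores) != n_total:
        raise ValueError("need equal-length inputs with at least one active")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Number of compounds selected from the top of the ranked list.
    n_top = max(1, math.ceil(fraction * n_total))

    # Rank compounds by predicted score, highest first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_top])

    base_rate = n_active / n_total
    observed = (hits / n_top) / base_rate
    # Best case: every selected slot holds an active, as far as actives last.
    maximum = (min(n_top, n_active) / n_top) / base_rate
    return observed, maximum


# Hypothetical toy data: ten compounds, two of them active.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
observed, maximum = enrichment(scores, labels, fraction=0.2)
percent_of_maximum = 100.0 * observed / maximum
```

Under this assumed definition, the ratio of the two returned values is the quantity that would be reported as a percentage of the theoretical maximum. The cutoff fraction is a free parameter here, since the abstract does not state which one was used.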
doi:10.1109/cibcb.2012.6217246 dblp:conf/cibcb/LoweBWM12 fatcat:wvn6gdiuijcmpgcv5k2ubo4myi