Modeling Chemical Interaction Profiles: I. Spectral Data-Activity Relationship and Structure-Activity Relationship Models for Inhibitors and Non-inhibitors of Cytochrome P450 CYP3A4 and CYP2D6 Isozymes

Brooks McPhail, Yunfeng Tie, Huixiao Hong, Bruce A. Pearce, Laura K. Schnackenberg, Weigong Ge, Luis G. Valerio, James C. Fuscoe, Weida Tong, Dan A. Buzatu, Jon G. Wilkes, Bruce A. Fowler (+2 others)
2012 Molecules  
An interagency collaboration was established to model chemical interactions that may cause adverse health effects when exposure to a mixture of chemicals occurs. Many of these chemicals (drugs, pesticides, and environmental pollutants) interact at the level of metabolic biotransformations mediated by cytochrome P450 (CYP) enzymes. In the present work, spectral data-activity relationship (SDAR) and structure-activity relationship (SAR) approaches were used to develop machine-learning classifiers of inhibitors and non-inhibitors of the CYP3A4 and CYP2D6 isozymes. The models were built upon 602 reference pharmaceutical compounds whose interactions have been deduced from clinical data, and 100 additional chemicals that were used to evaluate model performance in an external validation (EV) test. SDAR is an innovative modeling approach that relies on discriminant analysis applied to binned nuclear magnetic resonance (NMR) spectral descriptors. In the present work, both 1D ¹³C-NMR and 1D ¹⁵N-NMR spectra were used together in a novel implementation of the SDAR technique. It was found that increasing the bin size of the 1D ¹³C-NMR and ¹⁵N-NMR spectra increased the tenfold cross-validation (CV) performance in terms of both the rate of correct classification and sensitivity. The results of SDAR modeling were verified using SAR. For SAR modeling, a decision forest approach with 6 to 17 Mold2 descriptors per tree was used. Average rates of correct classification of SDAR and SAR models in a hundred CV tests were 60% and 61% for CYP3A4, and 62% and 70% for CYP2D6, respectively. The rates of correct classification of SDAR and SAR models in the EV test were 73% and 86% for CYP3A4, and 76% and 90% for CYP2D6, respectively. Thus, both SDAR and SAR methods demonstrated comparable performance in modeling a large set of structurally diverse data. Based on unique NMR structural descriptors, the new SDAR modeling method complements the existing SAR techniques, providing an independent estimator that can increase confidence in a structure-activity assessment. When modeling was applied to hazardous environmental chemicals, it was found that up to 20% of them may be substrates and up to 10% of them may be inhibitors of the CYP3A4 and CYP2D6 isoforms. The developed models provide a rare opportunity for the environmental health branch of the public health service to extrapolate to hazardous chemicals directly from human clinical data. Therefore, the pharmacological and environmental health branches are both expected to benefit from these reported models.
doi:10.3390/molecules17033383 pmid:22421792 pmcid:PMC6268752 fatcat:dsrnpaqiafavlejw5kzk5ytuha
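
The abstract describes SDAR descriptors as binned NMR spectra and reports performance as the rate of correct classification and sensitivity. The following is a minimal sketch of those two ideas only, not the authors' implementation. The function names (`bin_shifts`, `classification_metrics`), the 0 to 220 ppm range, and the bin widths are all assumptions chosen for illustration; the abstract does not state the actual ranges or bin sizes, and the discriminant-analysis step itself is not shown.

```python
import math


def bin_shifts(shifts_ppm, bin_width=2.0, lo=0.0, hi=220.0):
    """Turn a list of chemical shifts (ppm) into peak counts per fixed-width bin.

    The range [lo, hi) and bin_width are illustrative assumptions,
    not values taken from the paper.
    """
    n_bins = math.ceil((hi - lo) / bin_width)
    counts = [0] * n_bins
    for s in shifts_ppm:
        if lo <= s < hi:
            # Clamp the index to guard against floating-point edge cases.
            idx = min(int((s - lo) // bin_width), n_bins - 1)
            counts[idx] += 1
    return counts


def classification_metrics(y_true, y_pred):
    """Return the rate of correct classification and the sensitivity.

    Labels are 1 (inhibitor) and 0 (non-inhibitor).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct_rate = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"correct_rate": correct_rate, "sensitivity": sensitivity}


# Hypothetical 13C shifts for one compound, binned at two widths.
example_shifts = [10.0, 11.5, 128.3]
narrow = bin_shifts(example_shifts, bin_width=2.0)
wide = bin_shifts(example_shifts, bin_width=10.0)

# Hypothetical true and predicted labels for five compounds.
metrics = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Wider bins give shorter, denser descriptor vectors, which is one plausible reading of why the abstract reports better cross-validation performance at larger bin sizes. In a fuller pipeline, ¹³C and ¹⁵N bin vectors would presumably be combined and passed to a discriminant classifier.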