How Does the Methodology of 3D Structure Preparation Influence the Quality of pKa Prediction?

Stanislav Geidl, Radka Svobodová Vařeková, Veronika Bendová, Lukáš Petrusek, Crina-Maria Ionescu, Zdeněk Jurka, Ruben Abagyan, Jaroslav Koča
2015 Journal of Chemical Information and Modeling  
The acid dissociation constant is an important molecular property and it can be successfully predicted by Quantitative Structure-Property Relationship (QSPR) models, even for in silico designed molecules. We analyzed how the methodology of in silico 3D structure preparation influences the quality of QSPR models. Specifically, we evaluated and compared QSPR models based on six different 3D structure sources (DTP NCI, Pubchem, Balloon, Frog2, OpenBabel and RDKit) combined with four different ... of optimization. These analyses were performed for three classes of molecules (phenols, carboxylic acids, anilines) and the QSPR model descriptors were quantum mechanical (QM) and empirical partial atomic charges. Specifically, we developed 516 QSPR models and afterwards systematically analyzed the influence of the 3D structure source and other factors on their quality. Our results confirmed that QSPR models based on partial atomic charges are able to predict pKa with high accuracy. We also confirmed that ab-initio and semiempirical QM charges provide very accurate QSPR models, and using empirical charges based on electronegativity equalization is also acceptable, as well as advantageous, since their calculation is very fast. On the other hand, Gasteiger-Marsili empirical charges are not applicable for pKa prediction. We later found that QSPR models for some classes of molecules (carboxylic acids) are less accurate. In this context, we compared the influence of different 3D structure sources. We found that an appropriate selection of 3D structure source and optimization method is essential for the successful QSPR modeling of pKa. Specifically, the 3D structures from the DTP NCI and Pubchem databases performed the best, as they provided very accurate QSPR models for all the tested molecular classes and charge calculation approaches, and they do not require optimization. Also Frog2 performed very well. Other 3D structure sources can also be used, but are not so robust, and an
doi:10.1021/ci500758w pmid:26010215 pmcid:PMC5098400 fatcat:yyfxfrsnsbe3xc6j5rmiiradya