Predicting activities without computing descriptors: graph machines for QSAR

A. Goulon, T. Picot, A. Duprat, G. Dreyfus
2007 SAR and QSAR in Environmental Research (Print)
We describe graph machines, an alternative to traditional machine-learning-based QSAR that circumvents the problem of designing, computing and selecting molecular descriptors. In this approach, which is similar in spirit to recursive networks, molecules are treated as structured data represented as graphs. For each example of the data set, a mathematical function (graph machine) is built, whose structure reflects the structure of the molecule under consideration; it is the ... ion of identical parameterized functions, called "node functions" (e.g. a feedforward neural network). The parameters of the node functions, shared both within and across the graph machines, are adjusted during training with the "shared weights" technique. Model selection is then performed by traditional cross-validation. Therefore, the designer's main task consists in finding the optimal complexity for the node function. The efficiency of this new approach has been demonstrated in many QSAR or QSPR tasks, as well as in modeling the activities of complex chemicals (e.g. the toxicity of a family of phenols or the anti-HIV activities of HEPT derivatives), generally outperforming traditional techniques without requiring the selection and computation of descriptors.
doi:10.1080/10629360601054313 pmid:17365965 fatcat:53gdkwazrvexvdmryckpgpzhkm
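
The idea in the abstract of building one function per molecule from a single shared node function can be illustrated with a minimal Python sketch. It is not the authors' implementation. The assumptions are: each molecule has already been turned into a rooted tree (ring handling is not shown), every node passes one scalar to its parent, a node has at most `MAX_CHILDREN` children, and atom labels are one-hot `[is_carbon, is_oxygen]` vectors. All names (`make_params`, `node_function`, `graph_machine`) are hypothetical, and training of the shared weights is omitted.

```python
import math
import random

MAX_CHILDREN = 3  # assumed upper bound on children per node (missing ones are zero-padded)


def make_params(n_hidden, n_labels, seed=0):
    """Randomly initialise the parameters shared by every node of every graph machine."""
    rng = random.Random(seed)
    n_in = MAX_CHILDREN + n_labels + 1  # child outputs + node labels + bias
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                     for _ in range(n_hidden)],
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)],
    }


def node_function(params, child_values, labels):
    """One-hidden-layer feedforward network used as the node function."""
    padded = list(child_values) + [0.0] * (MAX_CHILDREN - len(child_values))
    x = padded + list(labels) + [1.0]  # trailing 1.0 is the bias input
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
              for row in params["w_hidden"]]
    hidden.append(1.0)  # bias unit for the linear output
    return sum(w * h for w, h in zip(params["w_out"], hidden))


def graph_machine(params, tree, labels, node):
    """Evaluate the graph machine rooted at `node`.

    `tree` maps a node to the list of its children; the output of the root
    node is taken as the predicted activity or property.
    """
    child_values = [graph_machine(params, tree, labels, child)
                    for child in tree.get(node, [])]
    return node_function(params, child_values, labels[node])


# Heavy-atom skeletons, rooted at the central atom (assumed encoding).
CARBON, OXYGEN = [1.0, 0.0], [0.0, 1.0]
ethanol_tree, ethanol_labels = {1: [0, 2]}, {0: CARBON, 1: CARBON, 2: OXYGEN}
propane_tree, propane_labels = {1: [0, 2]}, {0: CARBON, 1: CARBON, 2: CARBON}

params = make_params(n_hidden=3, n_labels=2)  # the same weights serve both molecules
ethanol_pred = graph_machine(params, ethanol_tree, ethanol_labels, node=1)
propane_pred = graph_machine(params, propane_tree, propane_labels, node=1)
print(ethanol_pred, propane_pred)
```

In this sketch the only design choice left open is `n_hidden`, which corresponds to the node-function complexity that the abstract says is selected by cross-validation. Fitting `params` would require minimising a loss summed over all molecules with respect to the shared weights, which is not shown here.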