A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is
The structured nature of chemical data means machine learning models trained to predict protein-ligand binding risk overfitting the data, impairing their ability to generalise and make accurate predictions for novel candidate ligands. To address this limitation, data debiasing algorithms systematically partition the data to reduce bias. When models are trained using debiased data splits, the reward for simply memorising the training data is reduced, suggesting that the ability of the model todoi:10.26434/chemrxiv.8139194 fatcat:lxyuxokshngmjhqw4hadlsno7i