Levenshtein Augmentation Improves Performance of SMILES Based Deep-Learning Synthesis Prediction
[post]
2020
unpublished
SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method, which we call "Levenshtein augmentation", that considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models - transformer and
doi:10.26434/chemrxiv.12562121.v2
fatcat:hmvwfwtbabaafdcxlt3w2dzjem
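
The abstract describes pairing reactant and product SMILES by their string similarity but is truncated before any procedure is given. The following is only a minimal sketch of the general idea, not the authors' method: it assumes character-level edit distance, and it uses hand-written equivalent SMILES for ethanol and acetaldehyde in place of randomized SMILES that a cheminformatics toolkit would normally generate. The names `levenshtein`, `closest_pairs`, `ETHANOL`, and `ACETALDEHYDE` are hypothetical.

```python
from typing import Iterable, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_pairs(reactant_variants: Iterable[str],
                  product_variants: Iterable[str],
                  keep: int = 1) -> List[Tuple[int, str, str]]:
    """Rank every (reactant, product) string pair by edit distance.

    Assumption: similarity is measured per character, so multi-character
    atoms such as Cl or Br would need a proper SMILES tokenizer instead.
    """
    products = list(product_variants)
    scored = sorted((levenshtein(r, p), r, p)
                    for r in reactant_variants for p in products)
    return scored[:keep]


# Hand-written equivalent SMILES (assumed stand-ins for randomized SMILES).
ETHANOL = ["CCO", "OCC", "C(O)C"]
ACETALDEHYDE = ["CC=O", "O=CC", "C(=O)C"]

if __name__ == "__main__":
    for distance, reactant, product in closest_pairs(ETHANOL, ACETALDEHYDE, keep=3):
        print(distance, reactant, ">>", product)
```

In this toy case, the three closest pairs each differ by a single inserted `=` character, which illustrates how choosing similarly written reactant and product strings keeps the unchanged part of the molecule aligned in a training pair.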