Exploring the Limits of Graph Invariant- and Spectrum-Based Discrimination of (Sub)structures

Christoph Rücker, Gerta Rücker, Markus Meringer
2002 Journal of Chemical Information and Computer Sciences
The limits of a recently proposed computer method for finding all distinct substructures of a chemical structure are systematically explored within comprehensive graph samples which serve as supersets of the graphs corresponding to saturated hydrocarbons, both acyclic (up to n = 20) and (poly)cyclic (up to n = 10). Several pairs of smallest graphs and compounds are identified that cannot be distinguished using selected combinations of invariants such as combinations of Balaban's index J and ... h matrix eigenvalues. As the most important result, it can now be stated that the computer program NIMSG, using J and distance eigenvalues, is safe within the domain of mono- through tetracyclic saturated hydrocarbon substructures up to n = 10 (oligocyclic decanes) and of all acyclic alkane substructures up to n = 19 (nonadecanes), i.e., it will not miss any of these substructures. For the regions surrounding this safe domain, upper limits are found for the numbers of substructures that may be lost in the worst case, and these are low. Taken together, this means that the computer program can be reasonably employed in chemistry whenever one is interested in finding the saturated hydrocarbon substructures. As to unsaturated and heteroatom-containing substructures, there are reasons to conjecture that the method's resolving power for them is similar.
doi:10.1021/ci010121y pmid:12086526 fatcat:sf45hbkckbdnnbos7wacmqz2zi
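
The abstract's idea of discriminating graphs by Balaban's index J together with distance-matrix eigenvalues can be illustrated with a minimal sketch. This is not the NIMSG program and makes no claim about how NIMSG is implemented; the function names (`distance_matrix`, `balaban_j`, `symmetric_eigenvalues`, `same_invariants`), the edge-list input format, and the tolerance are all assumptions chosen for illustration. The sketch assumes a connected, unweighted hydrogen-suppressed graph with vertices numbered from 0, uses the standard definition J = m/(mu + 1) * sum over edges of (s_u * s_v)^(-1/2), where s is the distance sum of a vertex and mu = m - n + 1, and computes eigenvalues with a plain cyclic Jacobi iteration so that it needs only the standard library. In practice a linear algebra routine such as `numpy.linalg.eigvalsh` would likely be used instead.

```python
import math
from collections import deque


def distance_matrix(n, edges):
    """Topological distance matrix of a connected graph, via BFS from each vertex."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = []
    for src in range(n):
        row = [-1] * n
        row[src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if row[v] < 0:
                    row[v] = row[u] + 1
                    queue.append(v)
        dist.append(row)
    return dist


def balaban_j(n, edges):
    """Balaban's index J from distance sums (row sums of the distance matrix)."""
    sums = [sum(row) for row in distance_matrix(n, edges)]
    m = len(edges)
    mu = m - n + 1  # cyclomatic number of a connected graph
    return m / (mu + 1) * sum(1.0 / math.sqrt(sums[u] * sums[v]) for u, v in edges)


def symmetric_eigenvalues(matrix, sweeps=100, tol=1e-20):
    """Sorted eigenvalues of a real symmetric matrix (cyclic Jacobi rotations)."""
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q], kept within [-pi/4, pi/4].
                x, y = a[q][q] - a[p][p], 2.0 * a[p][q]
                if x < 0:
                    x, y = -x, -y
                theta = 0.5 * math.atan2(y, x)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def same_invariants(g1, g2, tol=1e-8):
    """True if two graphs (n, edges) agree in J and in their distance spectra."""
    (n1, e1), (n2, e2) = g1, g2
    if n1 != n2 or len(e1) != len(e2):
        return False
    if abs(balaban_j(n1, e1) - balaban_j(n2, e2)) > tol:
        return False
    ev1 = symmetric_eigenvalues(distance_matrix(n1, e1))
    ev2 = symmetric_eigenvalues(distance_matrix(n2, e2))
    return all(abs(x - y) <= tol for x, y in zip(ev1, ev2))


# Usage: n-butane and isobutane skeletons are told apart; a relabeled
# n-butane is not, since the invariants do not depend on vertex numbering.
butane = (4, [(0, 1), (1, 2), (2, 3)])
isobutane = (4, [(0, 1), (0, 2), (0, 3)])
butane_relabeled = (4, [(2, 0), (0, 3), (3, 1)])
print(same_invariants(butane, isobutane))
print(same_invariants(butane, butane_relabeled))
```

Agreement of these invariants does not prove that two graphs are isomorphic. The abstract's point is precisely that pairs of non-isomorphic graphs with identical invariant values exist, and that they first appear outside the stated safe domain.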