Hierarchical Organization of Molecular Structure Computations
Journal of Computational Biology
The task of computing molecular structure from combinations of experimental and theoretical constraints is expensive because of the large number of estimated parameters (the 3D coordinates of each atom) and the rugged landscape of many objective functions. For large molecular ensembles with multiple protein and nucleic acid components, the problem of maintaining tractability in structural computations becomes critical. A well-known strategy for solving difficult problems is divide and conquer. For molecular computations, there are two ways in which problems can be divided: (1) using the natural hierarchy within biological macromolecules (taking advantage of primary sequence, secondary structural subunits, and tertiary structural motifs, when they are known), and (2) using the hierarchy that results from analyzing the distribution of structural constraints (providing information about which substructures are constrained to one another). In this paper, we show that these two hierarchies can be complementary and can provide information for efficient decomposition of structural computations. We demonstrate four methods for building such hierarchies (two automated heuristics that use both natural and empirical hierarchies, one knowledge-based process using both hierarchies, and one method based on the natural hierarchy alone) and apply them to a data set for the procaryotic 30S ribosomal subunit using our probabilistic least squares structure estimation algorithm. We show that the three methods that combine natural hierarchies with empirical hierarchies create decompositions that increase the empirical efficiency of computations by as much as 50-fold. The natural decomposition alone yields only half this gain. Although the knowledge-based method performs marginally better, the automatic heuristics are easier to use, scale more reliably to larger problems, and can match the performance of knowledge-based methods if provided with basic structural information.
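To make the idea of an empirical hierarchy concrete, the following minimal sketch groups substructures by how many constraints link them. It is an illustrative assumption, not one of the four methods described in this paper: the function names (`count_links`, `build_hierarchy`), the greedy merging rule, and the toy data are all hypothetical. Natural units (for example, helices or proteins) are the leaves, and the two clusters that share the most constraints are merged at each step.

```python
from itertools import combinations


def count_links(atoms_a, atoms_b, constraints):
    """Count constraints with one atom in each of the two atom sets."""
    return sum(
        1
        for i, j in constraints
        if (i in atoms_a and j in atoms_b) or (i in atoms_b and j in atoms_a)
    )


def build_hierarchy(units, constraints):
    """Greedy agglomerative grouping (hypothetical heuristic).

    units: dict mapping a natural unit name to its atom ids.
    constraints: list of (atom_i, atom_j) pairs.
    Returns a nested tuple describing the merge tree.
    """
    clusters = [(name, frozenset(atoms)) for name, atoms in units.items()]
    while len(clusters) > 1:
        # Pick the pair of clusters joined by the most constraints.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: count_links(clusters[p[0]][1], clusters[p[1]][1], constraints),
        )
        if count_links(clusters[a][1], clusters[b][1], constraints) == 0:
            # Nothing links the remaining clusters: place them under one root.
            return tuple(tree for tree, _ in clusters)
        merged = ((clusters[a][0], clusters[b][0]), clusters[a][1] | clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0][0]


# Toy data (invented for illustration): two helices and one protein.
units = {"helix1": [0, 1], "helix2": [2, 3], "protein": [4, 5]}
constraints = [(0, 2), (1, 3), (1, 2), (3, 4)]
tree = build_hierarchy(units, constraints)
```

In this toy case the two helices share three constraints and are grouped first; the protein, linked by a single constraint, joins them at the next level. A structure computation could then be carried out bottom-up along such a tree, solving small, tightly constrained groups before combining them.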