Reversible Folding of Hyperstable RNA Tetraloops Using Molecular Dynamics Simulations

Angel E. Garcia, Jacob Miner, Alan A. Chen
2015 Biophysical Journal  
The ongoing transformation of biology into a quantitative discipline has drastically increased our opportunities to unravel the mechanisms that relate the dynamics of biological systems to their functions, as it allows for the investigation of such systems at spatial and temporal scales never observed before. The biggest challenge today is to assimilate the wealth of information generated in this process into a conceptual framework. We face issues with the volume of data generated (a Big Data challenge) as well as with the complexity of the systems they represent. In this talk I will show examples for which a combination of mathematics, physics, and biology provides solutions to these challenges. I will focus specifically on the concept of networks in biology, their morphologies and dynamic behaviors.

1854-Wkshp Glycan Biosynthesis: Structure, Information, and Heterogeneity

The surfaces of all living cells are decorated with branched sugar polymers known as glycans. These information-rich structures confer on cells a recognizable molecular identity, and underlie many specific cell-cell interactions. Analytic methods (including NMR, mass spectrometry, and glycan arrays) now permit the routine profiling of glycans associated with various cells or proteins. This has stimulated efforts to build comprehensive and searchable glycan databases. However, from an informatics perspective glycans present multiple challenges. First, whereas nucleotide and amino acid chains are efficiently represented as strings, sugars can polymerize into complex tree-like objects. The potential combinatorial space of glycans is therefore much larger than that of proteins. Second, many specific molecular interactions appear to be mediated by groups of closely related glycan variants rather than by a single well-defined structure. This phenomenon of "micro-heterogeneity" makes it difficult to rigorously characterize the glycan repertoire of a cell. In this workshop, I will use ideas from algorithmic self-assembly to show that glycan structure and diversity are best understood through the lens of glycan biosynthesis. I will demonstrate that a specific glycan structure is the outcome of glycosyltransferase enzymes acting according to simple rules in a specific order, like workers on a factory floor. Errors in this process produce a well-defined spectrum of glycan by-products, precisely matching the observed micro-heterogeneity in real glycan profiles.
This predictive theoretical framework allows us to use glycans as sensitive cell-biological probes. It provides a unifying perspective within which the rich and growing datasets of glycan structures can be organized and fully utilized.

Extracting knowledge from large, heterogeneous, unstructured, and high-dimensional data is one of the major challenges for large-scale machine learning algorithms. In this talk, I will present our recent results developing unsupervised machine learning approaches to explore such datasets. A large number of these datasets follow heavy-tailed distributions, characterized by long-range dependencies. We quantify the tails of these distributions using higher-order statistics and use tensor-based representations to build data mining algorithms for: (1) online detection of events that signify anomalies in spatio-temporal patterns; (2) building low-dimensional latent variable models to capture the intrinsic multiscale structure; and (3) hierarchical clustering and visual organization of the data to gain relevant insights. We will illustrate these approaches on a variety of applications, including the integration of sparse experimental observations with atomistic-scale information for understanding the function of cellular systems. We will also discuss how these approaches can be widely applied to other domains.
doi:10.1016/j.bpj.2014.11.2029 fatcat:edclqo6iijdwvjduhk3r3w5tkq