Exploring the repertoire of RNA secondary motifs using graph theory; implications for RNA design

H. H. Gan
2003 Nucleic Acids Research  
Understanding the structural repertoire of RNA is crucial for RNA genomics research. Yet current methods for finding novel RNAs are limited to small or known RNA families. To expand known RNA structural motifs, we develop a two-dimensional graphical representation approach for describing and estimating the size of RNA's secondary structural repertoire, including naturally occurring and other possible RNA motifs. We employ tree graphs to describe RNA tree motifs and more general (dual) graphs to describe both RNA tree and pseudoknot motifs. Our estimates of RNA's structural space are vastly smaller than the nucleotide sequence space, suggesting a new avenue for finding novel RNAs. Specifically, our survey shows that known RNA trees and pseudoknots represent only a small subset of all possible motifs, implying that some of the 'missing' motifs may represent novel RNAs. To help pinpoint RNA-like motifs, we show that the motifs of existing functional RNAs are clustered in a narrow range of topological characteristics. We also illustrate the applications of our approach to the design of novel RNAs and automated comparison of RNA structures; we report several occurrences of RNA motifs within larger RNAs. Thus, our graph theory approach to RNA structures has implications for RNA genomics, structure analysis and design.
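As a rough illustration of the tree-graph idea in the abstract, the sketch below maps a pseudoknot-free secondary structure to a tree in which each stem becomes an edge and each loop (plus the exterior region holding the 5' and 3' ends) becomes a vertex. The dot-bracket input format and the function name `dot_bracket_to_tree` are assumptions made for this example and do not come from the record. The sketch also simplifies by starting a new loop vertex at every interruption of a stem, so it should not be read as the paper's exact mapping rules.

```python
def dot_bracket_to_tree(structure):
    """Map a pseudoknot-free dot-bracket string to a tree graph.

    Returns (n_vertices, edges). Vertex 0 is the exterior region; every
    other vertex is a loop enclosed by a stem, and every edge is one stem.
    Simplification (an assumption of this sketch): any break in base-pair
    stacking ends a stem, so even a one-nucleotide bulge makes a vertex.
    """
    # Pair each opening bracket with its closing bracket.
    partner = {}
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            partner[stack.pop()] = i
        elif ch != ".":
            raise ValueError("unexpected character: %r" % ch)
    if stack:
        raise ValueError("unbalanced structure")

    edges = []
    n_vertices = 1  # vertex 0 is the exterior region

    def walk(lo, hi, vertex):
        """Scan positions lo..hi-1, which all belong to loop `vertex`."""
        nonlocal n_vertices
        i = lo
        while i < hi:
            if structure[i] == "(":
                end = partner[i]
                a, b = i, end
                # Follow stacked pairs to the innermost pair of this stem.
                while structure[a + 1] == "(" and partner[a + 1] == b - 1:
                    a, b = a + 1, b - 1
                child = n_vertices
                n_vertices += 1
                edges.append((vertex, child))  # one stem = one edge
                walk(a + 1, b, child)          # region enclosed by the stem
                i = end + 1
            else:
                i += 1

    walk(0, len(structure), 0)
    return n_vertices, edges


if __name__ == "__main__":
    # A three-way junction: one outer stem enclosing two hairpins.
    print(dot_bracket_to_tree("((..((...))..((...))..))"))
```

Because every stem adds exactly one vertex and one edge, the result always has one fewer edge than vertices, which is the defining count for a tree graph.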
doi:10.1093/nar/gkg365 pmid:12771219 pmcid:PMC156709 fatcat:otds6qix2rbodnjtmiama37fwy