Sequence-Based Prediction of RNA-Binding Residues in Proteins [chapter]

Rasna R. Walia, Yasser EL-Manzalawy, Vasant G. Honavar, Drena Dobbs
2016 Msphere  
Identifying individual residues in the interfaces of protein-RNA complexes is important for understanding the molecular determinants of protein-RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein-RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein-RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA, are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner.

... 16] (at least as many RBPs as DNA-binding transcription factors [17]), our understanding of the cellular roles of RBPs, how they recognize their targets, and how they are regulated has lagged far behind our understanding of transcription factors. Recent exciting developments have begun to close this gap, providing proteome-wide catalogs and databases of RNA-binding proteins, "RNA interactomes" or "RBPomes" [18-21], an impressive compendium of RNA recognition sites [22], detailed views of the architecture and dynamics of important RNP complexes and RNA viruses, e.g., refs. [23, 24], and substantial progress in engineering RBPs with customized functions and high specificity for desired RNA targets [25, 26]. RNA-binding proteins are often modular, and many well-characterized RBPs contain one or more conserved RNA-binding domains or motifs [1, 27]. The RNA recognition motif (RRM), for example, is one of the most abundant structural motifs in vertebrate proteins, and is found in ~2 % of all human proteins [25]. Other abundant RNA-binding domains and motifs include the KH, dsRBD, DEAD-Box, PUF, SAM, and ZnF domains [1, 27], all of which have conserved structures and can be recognized in the primary sequences of proteins (see Subheading 3.1, step 6 below).
However, only ~50 % of the mRNA-binding proteins identified by "interactome capture" in HeLa cells contain a characterized RNA-binding domain [19]. Also, many RBPs bind RNA through intrinsically disordered regions (IDRs), which are thought to promote formation of extended interaction interfaces and contribute to the generation of higher order assemblies and the formation of RNA granules [28, 29]. Finally, a survey of available structures for protein-RNA complexes revealed that the majority of amino acids in the protein-RNA interface are not part of a characterized RNA-binding motif [30], and the presence of an RNA-binding signature does not conclusively identify the specific amino acids involved in RNA recognition and binding.

The most definitive way to identify RNA-binding residues (i.e., residues that directly contact RNA) (see Note 1) is to extract them from a high-resolution experimental structure of a protein-RNA complex. Three-dimensional structures are available for only a small fraction of the known protein-RNA complexes [31]. As of December 16, 2015, the number of solved structures in the Protein Data Bank (PDB) for protein-RNA complexes is only 1661 out of 114,402 total structures (about 1.5 %), and ~40 % of the RNA-containing structures in the PDB correspond to ribosomes. Protein-RNA complexes can be very difficult to crystallize and many are too large for structure determination using NMR spectroscopy [32, 33]. Fortunately, recent advances in NMR [34], cryo-electron microscopy [35], and small-angle X-ray scattering (SAXS) [36] offer tremendous promise for providing structural details for RNPs that have been recalcitrant to experimental structure determination. At present, in the absence of a 3D structure, several types of ...
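
The idea of extracting RNA-binding residues from a solved complex, mentioned above, is commonly implemented as a distance test: a protein residue counts as RNA-binding if any of its atoms lies within a cutoff of any RNA atom. The sketch below illustrates this under stated assumptions. The 5 Å cutoff, the function name `rna_binding_residues`, and the toy coordinates are all hypothetical; the chapter's own definition is given in its Note 1, which is not part of this excerpt.

```python
from math import dist

def rna_binding_residues(protein_atoms, rna_atoms, cutoff=5.0):
    """Return labels of protein residues with any atom within `cutoff`
    angstroms of any RNA atom (cutoff value is an assumption)."""
    binding = []
    for label, xyz in protein_atoms:
        if label in binding:
            continue  # residue already flagged by an earlier atom
        if any(dist(xyz, rna_xyz) <= cutoff for _, rna_xyz in rna_atoms):
            binding.append(label)
    return binding

# Toy data (hypothetical): (residue label, (x, y, z)) in angstroms.
protein_atoms = [
    ("ARG12", (0.0, 0.0, 0.0)),
    ("ARG12", (1.5, 0.0, 0.0)),
    ("LEU13", (20.0, 0.0, 0.0)),
    ("LYS14", (3.0, 4.0, 0.0)),
]
rna_atoms = [("U1", (4.0, 0.0, 0.0)), ("G2", (3.0, 8.0, 0.0))]

print(rna_binding_residues(protein_atoms, rna_atoms))
```

With real data, the atom lists would come from parsing a PDB or mmCIF entry; the all-pairs loop here is adequate for a small example but would typically be replaced by a spatial index for large complexes such as ribosomes.
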
doi:10.1007/978-1-4939-6406-2_15 pmid:27787829 pmcid:PMC5796408 fatcat:552w4ny4rvcntj3kv55phjpfii