A Polynomial-Time Nuclear Vector Replacement Algorithm for Automated NMR Resonance Assignments

Christopher James Langmead, Anthony Yan, Ryan Lilien, Lincong Wang, Bruce Randall Donald
2004 Journal of Computational Biology  
High-throughput NMR structural biology can play an important role in structural genomics. We report an automated procedure for high-throughput NMR resonance assignment for a protein of known structure, or of a homologous structure. These assignments are a prerequisite for probing protein-protein interactions, protein-ligand binding, and dynamics by NMR. Assignments are also the starting point for structure determination and refinement. A new algorithm, called Nuclear Vector Replacement (NVR), is introduced to compute assignments that optimally correlate experimentally measured NH residual dipolar couplings (RDCs) to a given a priori whole-protein 3D structural model. The algorithm requires only uniform 15N-labeling of the protein and processes unassigned HN-15N HSQC spectra, HN-15N RDCs, and sparse HN-HN NOEs (dNNs), all of which can be acquired in a fraction of the time needed to record the traditional suite of experiments used to perform resonance assignments. NVR runs in minutes and efficiently assigns the (HN, 15N) backbone resonances as well as the dNNs of the 3D 15N-NOESY spectrum, in O(n^3) time. The algorithm is demonstrated on NMR data from a 76-residue protein, human ubiquitin, matched to four structures, including one mutant (homolog), determined either by x-ray crystallography or by different NMR experiments (without RDCs). NVR achieves an assignment accuracy of 92-100%. We further demonstrate the feasibility of our algorithm for different and larger proteins, using NMR data for hen lysozyme (129 residues, 97-100% accuracy) and streptococcal protein G (56 residues, 100% accuracy), matched to a variety of 3D structural models. Finally, we extend NVR to a second application, 3D structural homology detection, and demonstrate that NVR is able to identify structural homologies between proteins with remote amino acid sequences using a database of structural models.
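The core idea of correlating measured RDCs with a structural model can be illustrated with a toy sketch. This is not the NVR algorithm itself. It assumes the alignment (Saupe) tensor is already known, uses the standard relation D = v^T S v for a unit NH bond vector v, and finds the best peak-to-residue matching by brute force over permutations, which is exponential and only suitable for a handful of residues. A cubic-time bipartite matching solver would be the natural replacement at realistic sizes. All names and numbers below (`SAUPE`, `NH_VECTORS`, `best_assignment`, the noise values) are hypothetical.

```python
import itertools
import math


def unit(v):
    """Normalize a 3-vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)


def back_calculate_rdc(saupe, v):
    """Predicted RDC (Hz) for unit bond vector v: D = v^T S v.

    `saupe` is a symmetric, traceless 3x3 matrix, assumed here to be
    pre-scaled by the maximum dipolar coupling constant.
    """
    return sum(v[i] * saupe[i][j] * v[j] for i in range(3) for j in range(3))


def best_assignment(observed, predicted):
    """Brute-force matching of peaks to residues (toy sizes only).

    Returns a list p where peak i is assigned to residue p[i], chosen to
    minimize the total absolute difference between observed and predicted RDCs.
    """
    n = len(observed)
    best = min(
        itertools.permutations(range(n)),
        key=lambda p: sum(abs(observed[i] - predicted[p[i]]) for i in range(n)),
    )
    return list(best)


# Hypothetical alignment tensor (Hz), diagonal and traceless.
SAUPE = [[10.0, 0.0, 0.0],
         [0.0, 5.0, 0.0],
         [0.0, 0.0, -15.0]]

# Hypothetical NH bond vectors taken from a structural model.
NH_VECTORS = [unit(v) for v in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]]

predicted = [back_calculate_rdc(SAUPE, v) for v in NH_VECTORS]

# Simulated unassigned peaks: residues in shuffled order plus small errors.
true_order = [2, 0, 3, 1]            # peak i truly belongs to residue true_order[i]
noise = [0.3, -0.2, 0.1, -0.4]       # made-up measurement errors (Hz)
observed = [predicted[r] + e for r, e in zip(true_order, noise)]

assignment = best_assignment(observed, predicted)
print(assignment)
```

In this toy case the predicted couplings are well separated, so the matching recovers the shuffled order. With real data many residues have similar RDCs, which is why additional information such as the sparse dNNs mentioned in the abstract matters.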
doi:10.1089/1066527041410436 pmid:15285893 fatcat:cdfib2ztkbbxtl4anuukjp3lba