Have we seen all structures corresponding to short protein fragments in the Protein Data Bank? An update
Protein Engineering Design & Selection
Assembling short fragments from known structures has been a widely used approach to construct novel protein structures. To what extent there exist structurally similar fragments in the database of known structures for short fragments of a novel protein is a question that is fundamental to this approach. This work addresses that question for seven-, nine- and 15-residue fragments. For each fragment size, two databases, a query database and a template database of fragments from high-quality protein structures in SCOP20 and SCOP90, respectively, were constructed. For each fragment in the query database, the template database was scanned to find the lowest r.m.s.d. fragment among non-homologous structures. For seven-residue fragments, there is a 99% probability that there exists such a fragment within 0.7 Å r.m.s.d. for each loop fragment. For nine-residue fragments there is a 96% probability of a fragment within 1 Å r.m.s.d., while for 15-residue fragments there is a 91% probability of a fragment within 2 Å r.m.s.d. These results, which update previous studies, show that there exists sufficient coverage to model even a novel fold using fragments from the Protein Data Bank, as the current database of known structures has increased enormously in the last few years. We have also explored the use of a grid search method for loop homology modeling and make some observations about the use of a grid search compared with a database search for the loop modeling problem.
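The scan described above (for each query fragment, find the closest template fragment, then report the fraction of queries matched within a cutoff) can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names are hypothetical, chains are assumed to be lists of Cα (x, y, z) coordinates, and the template list is assumed to have already been filtered to structures non-homologous to the query. To stay within the standard library, the sketch uses distance r.m.s.d. (the r.m.s. difference of intra-fragment pairwise distances), a superposition-free stand-in for the coordinate r.m.s.d. reported in the abstract, so its values are not directly comparable with the 0.7, 1 and 2 Å cutoffs quoted there.

```python
import math
from itertools import combinations


def fragments(coords, size):
    """Yield every contiguous, overlapping window of `size` residues."""
    for start in range(len(coords) - size + 1):
        yield coords[start:start + size]


def drmsd(frag_a, frag_b):
    """Distance r.m.s.d. between two equal-length fragments.

    Assumption: this replaces superposition-based coordinate r.m.s.d.;
    it is invariant to rotation and translation, so no fitting is needed.
    """
    pairs = list(combinations(range(len(frag_a)), 2))
    total = 0.0
    for i, j in pairs:
        d_a = math.dist(frag_a[i], frag_a[j])
        d_b = math.dist(frag_b[i], frag_b[j])
        total += (d_a - d_b) ** 2
    return math.sqrt(total / len(pairs))


def best_match(query_frag, template_chains):
    """Lowest distance r.m.s.d. over all same-length template fragments.

    Returns math.inf when no template chain is long enough.
    """
    size = len(query_frag)
    return min(
        (drmsd(query_frag, t)
         for chain in template_chains
         for t in fragments(chain, size)),
        default=math.inf,
    )


def coverage(query_chains, template_chains, size, cutoff):
    """Fraction of query fragments with a template match within `cutoff`."""
    best = [
        best_match(q, template_chains)
        for chain in query_chains
        for q in fragments(chain, size)
    ]
    return sum(b <= cutoff for b in best) / len(best)
```

Calling `coverage(query_chains, template_chains, 7, cutoff)` with the two chain lists would give the analogue of the seven-residue statistic. The scan is exhaustive, so its cost grows with the product of the query and template fragment counts.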