Real-time structural motif searching in proteins using an inverted index strategy [article]

Sebastian Bittrich, Stephen K. Burley, Alexander S. Rose
<span title="2020-09-12">2020</span> <i title="Cold Spring Harbor Laboratory"> bioRxiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Biochemical and biological functions of proteins are the product of both the overall fold of the polypeptide chain and, typically, structural motifs made up of smaller numbers of amino acids constituting a catalytic center or a binding site. Detection of such structural motifs can provide valuable insights into the function(s) of previously uncharacterized proteins. Technically, this remains an extremely challenging problem because of the size of the Protein Data Bank (PDB) archive. Existing methods depend on a clustering by sequence similarity and can be computationally slow. We have developed a new approach that uses an inverted index strategy capable of analyzing >160,000 PDB structures with unmatched speed. The efficiency of the inverted index method depends critically on identifying the small number of structures containing the query motif and ignoring most of the structures that are irrelevant. Our approach (implemented at https://motif.rcsb.org) enables real-time retrieval and superposition of structural motifs, either extracted from a reference structure or uploaded by the user. Herein, we describe the method and present five case studies that exemplify its efficacy and speed for analyzing 3D structures of both proteins and nucleic acids.
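The abstract does not spell out how the index is organized, so the following is only a minimal sketch of the general inverted-index idea, not the authors' implementation. It assumes (hypothetically) that each pair of residues is described by its two residue types plus a binned distance between one representative atom per residue, and that the index maps each such descriptor to the set of structures containing it. The names `descriptor`, `build_index`, `candidates`, the 1 Å bin width, the tolerance, and the toy coordinates are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations
from math import dist

BIN_WIDTH = 1.0  # assumed distance bin width in angstroms (illustrative)


def descriptor(res_a, res_b, bin_width=BIN_WIDTH):
    """Order-independent key: sorted residue types plus a binned distance.

    Each residue is a (type, (x, y, z)) tuple with one representative atom.
    """
    (type_a, xyz_a), (type_b, xyz_b) = res_a, res_b
    first, second = sorted((type_a, type_b))
    return (first, second, int(dist(xyz_a, xyz_b) // bin_width))


def build_index(structures, max_distance=20.0):
    """Map every residue-pair descriptor to the structure IDs containing it."""
    index = defaultdict(set)
    for structure_id, residues in structures.items():
        for a, b in combinations(residues, 2):
            if dist(a[1], b[1]) <= max_distance:
                index[descriptor(a, b)].add(structure_id)
    return index


def candidates(index, motif, tolerance=1):
    """Return structures that contain every residue pair of the motif.

    Each pair is looked up within +/- `tolerance` distance bins, and the
    resulting posting lists are intersected. Structures lacking any pair
    are never touched, which is where the speed-up comes from.
    """
    result = None
    for a, b in combinations(motif, 2):
        first, second, d_bin = descriptor(a, b)
        hits = set()
        for k in range(d_bin - tolerance, d_bin + tolerance + 1):
            hits |= index.get((first, second, k), set())
        result = hits if result is None else result & hits
        if not result:
            break  # no structure can match any more
    return result or set()


# Toy data with made-up identifiers and coordinates (not real PDB entries).
structures = {
    "toy1": [("HIS", (0, 0, 0)), ("ASP", (5.2, 0, 0)), ("SER", (0, 7.1, 0))],
    "toy2": [("HIS", (0, 0, 0)), ("ASP", (12, 0, 0)), ("SER", (0, 7, 0))],
    "toy3": [("GLY", (0, 0, 0)), ("ALA", (4, 0, 0))],
}
motif = [("HIS", (1, 1, 1)), ("ASP", (6, 1, 1)), ("SER", (1, 8, 1))]

index = build_index(structures)
print(candidates(index, motif))
```

In this sketch the lookup only yields candidate structures. Matching pairs individually does not guarantee that the same residues form the whole motif, so a real system would still need a geometric verification and superposition step on the few candidates that survive.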
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1101/2020.09.11.293977">doi:10.1101/2020.09.11.293977</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/qzxwtcodinco5i4q4lfelt4tdm">fatcat:qzxwtcodinco5i4q4lfelt4tdm</a> </span>