MMTF—An efficient file format for the transmission, visualization, and analysis of macromolecular structures

Anthony R. Bradley, Alexander S. Rose, Antonín Pavelka, Yana Valasatava, Jose M. Duarte, Andreas Prlić, Peter W. Rose, Dina Schneidman
2017 PLoS Computational Biology  
Recent advances in experimental techniques have led to rapid growth in the complexity, size, and number of macromolecular structures that are made available through the Protein Data Bank. This creates a challenge for macromolecular visualization and analysis. Macromolecular structure files, such as PDB or PDBx/mmCIF files, can be slow to transfer and parse, and hard to incorporate into third-party software tools. Here, we present a new binary and compressed data representation, the MacroMolecular Transmission Format (MMTF), as well as software implementations in several languages that have been developed around it, which address these issues. We describe the new format and its APIs and demonstrate that it is several times faster to parse, and about a quarter of the file size of the current standard format, PDBx/mmCIF. As a consequence of the new data representation, it is now possible to visualize structures with millions of atoms in a web browser, and to keep the whole PDB archive in memory or parse it within a few minutes on average computers, which opens up a new way of thinking about how to design and implement efficient algorithms in structural bioinformatics. The PDB archive is available in MMTF file format through web services and data that are updated on a weekly basis.

This is a PLOS Computational Biology Software paper.

Citation: Bradley AR, Rose AS, Pavelka A, Valasatava Y, Duarte JM, Prlić A, et al. (2017) MMTF-An efficient file format for the transmission, visualization, and analysis of macromolecular structures. PLoS Comput Biol 13 (6): e1005575. https://doi.org/10.
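
The abstract states that the binary, compressed representation is about a quarter of the size of PDBx/mmCIF but does not spell out the encoding. As a rough illustration of why a binary encoding of coordinates can be much smaller than text, the sketch below applies fixed-point integer encoding followed by delta encoding, then packs the result as 16-bit integers. This is a generic technique chosen for illustration and is not taken from the MMTF specification; the function names, the scale factor of 1000, and the 16-bit width are all assumptions.

```python
import struct


def encode_coords(values, scale=1000):
    """Convert floats to fixed-point integers, then store successive differences.

    Hypothetical helper for illustration; `scale` is an assumed precision
    (three decimal places), not a value taken from the paper.
    """
    ints = [round(v * scale) for v in values]
    if not ints:
        return []
    # First entry is kept as-is; the rest are differences to the previous value.
    return [ints[0]] + [b - a for a, b in zip(ints, ints[1:])]


def decode_coords(deltas, scale=1000):
    """Invert encode_coords: cumulative sum, then rescale to floats."""
    out, running = [], 0
    for d in deltas:
        running += d
        out.append(running / scale)
    return out


def pack_int16(deltas):
    """Pack deltas as big-endian signed 16-bit integers.

    Raises struct.error if a value falls outside the int16 range, so a real
    format would need a strategy for large values; that is omitted here.
    """
    return struct.pack(f">{len(deltas)}h", *deltas)


if __name__ == "__main__":
    coords = [12.345, 12.401, 12.398, 13.002]
    deltas = encode_coords(coords)
    packed = pack_int16(deltas)
    as_text = " ".join(f"{c:.3f}" for c in coords)
    print("deltas:", deltas)
    print("binary bytes:", len(packed), "text bytes:", len(as_text))
    print("round trip:", decode_coords(deltas))
```

Because neighbouring atoms tend to have similar coordinates, the differences are small integers that fit in fewer bytes than their decimal text form, and they also compress well with a general-purpose compressor.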
doi:10.1371/journal.pcbi.1005575 pmid:28574982 pmcid:PMC5473584 fatcat:niriuttkdnhcdp2l5mfc6vocdq