Models of protein–ligand crystal structures: trust, but verify

Marc C. Deller, Bernhard Rupp
2015 Journal of Computer-Aided Molecular Design  
X-ray crystallography provides the most accurate models of protein-ligand structures. These models serve as the foundation of many computational methods including structure prediction, molecular modelling, and structure-based drug design. The success of these computational methods ultimately depends on the quality of the underlying protein-ligand models. X-ray crystallography offers the unparalleled advantage of a clear mathematical formalism relating the experimental data to the protein-ligand model. In the case of X-ray crystallography, the primary experimental evidence is the electron density of the molecules forming the crystal. The first step in the generation of an accurate and precise crystallographic model is the interpretation of the electron density of the crystal, typically carried out by construction of an atomic model. The atomic model must then be validated for fit to the experimental electron density and also for agreement with prior expectations of stereochemistry. Stringent validation of protein-ligand models has become possible as a result of the mandatory deposition of primary diffraction data, and many computational tools are now available to aid in the validation process. Validation of protein-ligand complexes has revealed some instances of overenthusiastic interpretation of ligand density. Fundamental concepts and metrics of protein-ligand quality validation are discussed and we highlight software tools to assist in this process. It is essential that end users select high quality protein-ligand models for their computational and biological studies, and we provide an overview of how this can be achieved.

Keywords: Crystal structure · Protein structure · Protein-ligand complex · Quality control · Structure validation · Structure-based drug design

Protein-ligand models

Models of biomolecular structures determined experimentally, by X-ray diffraction or Nuclear Magnetic Resonance (NMR) spectroscopy, are deposited in a world-wide public repository, the Protein Data Bank (PDB, http://www.pdb.org) [1-4]. Electron Microscopy (EM) models are deposited in a separate database, EMDB (http://www.emdatabank.org/) [5, 6]. As a basis for computational studies, and ultimately structure-based drug design (SBDD), atomic models of the highest quality are the most desirable [7, 8].
Atomic models determined by X-ray crystallography are often preferable to those generated by NMR spectroscopy, especially for use in SBDD, although both techniques have their own specific strengths and weaknesses [9, 10]. Accurate atomic models are also essential for computational methods such as ligand docking, active site identification, and in silico lead optimization. Multiple structures of a protein target in complex with small molecule compounds (or fragments) [11], as well as complementary apo (ligand-free) structures, are often needed in order to obtain a comprehensive atomic-level view of the
doi:10.1007/s10822-015-9833-8 pmid:25665575 pmcid:PMC4531100 fatcat:aciit6767faunktpk4mtk5m3oq