Many biological databases that provide comparative genomics information and tools are now available on the internet. While certainly quite useful, to our knowledge none of the existing databases combine results from multiple comparative genomics methods with manually curated information from the literature. Here we describe the Princeton Protein Orthology Database (P-POD, http://ortholog.princeton.edu), a user-friendly database system that allows users to find and visualize the phylogenetic relationships among predicted orthologs (based on the OrthoMCL method) to a query gene from any of eight eukaryotic organisms, and to see the orthologs in a wider evolutionary context (based on the Jaccard clustering method). In addition to the phylogenetic information, the database contains experimental results manually collected from the literature that can be compared to the computational analyses, as well as links to relevant human disease and gene information via the OMIM, model organism, and sequence databases. Our aim is for the P-POD resource to be extremely useful to typical experimental biologists wanting to learn more about the evolutionary context of their favorite genes. P-POD is based on the commonly used Generic Model Organism Database (GMOD) schema and can be downloaded in its entirety for installation on one's own system. Thus, bioinformaticians and software developers may also find P-POD useful because they can use the P-POD database infrastructure when developing their own comparative genomics resources and database tools.
doi:10.1371/journal.pone.0000766 pmid:17712414 pmcid:PMC1942082 fatcat:75wtjbkedrax3mt52xvscn4xtm
The Biological General Repository for Interaction Datasets (BioGRID: https://thebiogrid.org) is an open access database dedicated to the curation and archival storage of protein, genetic and chemical interactions for all major model organism species and humans. As of September 2018 (build 3.4.164), BioGRID contains records for 1 598 688 biological interactions manually annotated from 55 809 publications for 71 species, as classified by an updated set of controlled vocabularies for experimental detection methods. BioGRID also houses records for >700 000 post-translational modification sites. BioGRID now captures chemical interaction data, including chemical-protein interactions for human drug targets drawn from the DrugBank database and manually curated bioactive compounds reported in the literature. A new dedicated aspect of BioGRID annotates genome-wide CRISPR/Cas9-based screens that report gene-phenotype and gene-gene relationships. An extension of the BioGRID resource called the Open Repository for CRISPR Screens (ORCS) database (https://orcs.thebiogrid.org) currently contains over 500 genome-wide screens carried out in human or mouse cell lines. All data in BioGRID is made freely available without restriction, is directly downloadable in standard formats and can be readily incorporated into existing applications via our web service platforms. BioGRID data are also freely distributed through partner model organism databases and meta-databases.
doi:10.1093/nar/gky1079 pmid:30476227 pmcid:PMC6324058 fatcat:72ppbdg3vbbhzfeugxmqu7athm
Rose Oughtred, Nathalie Bédard, Alice Vrielink and Simon S. ...
doi:10.1074/jbc.273.29.18435 pmid:9660812 fatcat:2febauniyja3baoykzuo2xxete
The Biological General Repository for Interaction Datasets (BioGRID: https://thebiogrid.org) is an open access database dedicated to the annotation and archival of protein, genetic and chemical interactions for all major model organism species and humans. As of September 2016 (build 3.4.140), the BioGRID contains 1 072 173 genetic and protein interactions, and 38 559 post-translational modifications, as manually annotated from 48 114 publications. This dataset represents interaction records for 66 model organisms and represents a 30% increase compared to the previous 2015 BioGRID update. BioGRID curates the biomedical literature for major model organism species, including humans, with a recent emphasis on central biological processes and specific human diseases. To facilitate network-based approaches to drug discovery, BioGRID now incorporates 27 501 chemical-protein interactions for human drug targets, as drawn from the DrugBank database. A new dynamic interaction network viewer allows the easy navigation and filtering of all genetic and protein interaction data, as well as for bioactive compounds and their established targets. BioGRID data are directly downloadable without restriction in a variety of standardized formats and are freely distributed through partner model organism databases and meta-databases.
doi:10.1093/nar/gks1158 pmid:23203989 pmcid:PMC3531226 fatcat:nmn2ijk3qje6jfpsgpzgk2p2nm
A detailed step-by-step guide to the BioGRID web interface is now available (Oughtred et al., submitted). ...
doi:10.1093/nar/gku1204 pmid:25428363 pmcid:PMC4383984 fatcat:jeok6meldzhmphu3mpxqxpeoiy
We can now routinely identify coding variants within individual human genomes. A pressing challenge is to determine which variants disrupt the function of disease-associated genes. Both experimental and computational methods exist to predict pathogenicity of human genetic variation. However, a systematic performance comparison between them has been lacking. Therefore, we developed and exploited a panel of 26 yeast-based functional complementation assays to measure the impact of 179 variants (... disease- and 78 non-disease-associated variants) from 22 human disease genes. Using the resulting reference standard, we show that experimental functional assays in a 1-billion-year diverged model organism can identify pathogenic alleles with significantly higher precision and specificity than current computational methods.
doi:10.1101/gr.192526.115 pmid:26975778 pmcid:PMC4864455 fatcat:cxsftwnkznburcpopbj2bopwze
Determining the complete Arabidopsis (Arabidopsis thaliana) protein-protein interaction network is essential for understanding the functional organization of the proteome. Numerous small-scale studies and a couple of large-scale ones have elucidated a fraction of the estimated 300,000 binary protein-protein interactions in Arabidopsis. In this study, we provide evidence that a docking algorithm has the ability to identify real interactions using both experimentally determined and predicted protein structures. We ranked 0.91 million interactions generated by all possible pairwise combinations of 1,346 predicted structure models from an Arabidopsis predicted "structure-ome" and found a significant enrichment of real interactions for the top-ranking predicted interactions, as shown by cosubcellular enrichment analysis and yeast two-hybrid validation. Our success rate for computationally predicted, structure-based interactions was 63% of the success rate for published interactions naively tested using the yeast two-hybrid system and 2.7 times better than for randomly picked pairs of proteins. This study provides another perspective in interactome exploration and biological network reconstruction using protein structural information. We have made these interactions freely accessible through an improved Arabidopsis Interactions Viewer and have created community tools for accessing these and ~2.8 million other protein-protein and protein-DNA interactions for hypothesis generation by researchers worldwide. The Arabidopsis Interactions Viewer is freely available at ...
Figure 8. Example outputs from the updated Arabidopsis Interactions Viewer. A, "Stacked" layout from outside the cell ("extracellular") to nucleus. Circular nodes represent proteins, and the numbers in each node represent MapMan terms, while colored "doughnuts" around the nodes represent subcellular localizations (predicted localizations have been turned off in this example) from SUBA. Clicking on the square chromosomal containers (chromosome 5 in this example) calls up the protein-DNA ...
doi:10.1104/pp.18.01216 fatcat:jzbo36dvmrh5fjyotmamvewyry
These authors contributed equally to this work.
Citation details: Islamaj Doğan, R., Kim, S., Chatr-Aryamontri, A. et al. The BioC-BioGRID corpus: full text articles annotated for curation of protein-protein and genetic interactions.
Abstract: A great deal of information on the molecular genetics and biochemistry of model organisms has been reported in the scientific literature. However, this data is typically described in free text form and is not readily amenable to computational analyses. To this end, the BioGRID database systematically curates the biomedical literature for genetic and protein interaction data. This data is provided in a standardized computationally tractable format and includes structured annotation of experimental evidence. BioGRID curation necessarily involves substantial human effort by expert curators who must read each publication to extract the relevant information. Computational text-mining methods offer the potential to augment and accelerate manual curation. To facilitate the development of practical text-mining strategies, a new challenge was organized in BioCreative V for the BioC task, the collaborative Biocurator Assistant Task. This was a noncompetitive, cooperative task in which the participants worked together to build BioC-compatible modules into an integrated pipeline to assist BioGRID curators. As an integral part of this task, a test collection of full text articles was developed that contained both biological entity annotations (gene/protein and organism/species) and molecular interaction annotations (protein-protein and genetic interactions (PPIs and GIs)). This collection, which we call the BioC-BioGRID corpus, was annotated by four BioGRID curators over three rounds of annotation and contains 120 full text articles curated in a dataset representing two major model organisms, namely budding yeast and human. The BioC-BioGRID corpus contains annotations for 6409 mentions of genes and their Entrez Gene IDs, 186 mentions of organism names and their NCBI Taxonomy IDs, 1867 mentions of PPIs and 701 annotations of PPI experimental evidence statements, 856 mentions of GIs and 399 annotations of GI evidence statements. The purpose, characteristics and possible future uses of the BioC-BioGRID corpus are detailed in this report.
doi:10.1093/database/baw147 pmid:28077563 pmcid:PMC5225395 fatcat:s7dpik24cfct3ihiz3sbghbbdq
Journal of Biology
The study of complex biological networks and prediction of gene function has been enabled by high-throughput (HTP) methods for detection of genetic and protein interactions. Sparse coverage in HTP datasets may, however, distort network properties and confound predictions. Although a vast number of well substantiated interactions are recorded in the scientific literature, these data have not yet been distilled into networks that enable system-level inference. We describe here a comprehensive database of genetic and protein interactions, and associated experimental evidence, for the budding yeast Saccharomyces cerevisiae, as manually curated from over 31,793 abstracts and online publications. This literature-curated (LC) dataset contains 33,311 interactions, on the order of all extant HTP datasets combined. Surprisingly, HTP protein-interaction datasets currently achieve only around 14% coverage of the interactions in the literature. The LC network nevertheless shares attributes with HTP networks, including scale-free connectivity and correlations between interactions, abundance, localization, and expression. We find that essential genes or proteins are enriched for interactions with other essential genes or proteins, suggesting that the global network may be functionally unified. This interconnectivity is supported by a substantial overlap of protein and genetic interactions in the LC dataset. We show that the LC dataset considerably improves the predictive power of network-analysis approaches. The full LC dataset is available at the BioGRID (http://www.thebiogrid.org) and SGD (http://www.yeastgenome.org/) databases. Comprehensive datasets of biological interactions derived from the primary literature provide critical benchmarks for HTP methods, augment functional prediction, and reveal system-level attributes of biological networks.
doi:10.1186/jbiol36 pmid:16762047 pmcid:PMC1561585 fatcat:uoapffbytfbc7fy7sqsh2gmrza
RELATED INFORMATION: For background on the BioGRID database, see Introduction: BioGRID: A Resource for Studying Biological Interactions in Yeast (Oughtred et al. 2015). ...
doi:10.1101/pdb.prot088880 pmid:26729909 pmcid:PMC5975959 fatcat:ublifsrndreuhomte4g7527r2e
Citation details: Sun Kim, S., Doğan, R.I., Chatr-Aryamontri, A. et al. BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID.
doi:10.1093/database/baw121 pmid:27589962 pmcid:PMC5009341 fatcat:bqe4o77fizadhf5e3xese3i23y