Underlying Aquaria is PSSH2, a Sequel database of sequence-to-structure alignments (Schafferhans and O'Donoghue 2020). ...
doi:10.1101/2021.09.10.459756 fatcat:6xck67x4rvbxvlr5ucrchspe4a
The COVID-19 pandemic spawned by SARS-CoV-2 requires quick characterisation of the protein structures comprising the viral proteome. As experimentally determined 3D structures become available, these data can be augmented by high-throughput generation of homology models, thereby helping researchers leverage structural data to gain detailed insights into the molecular mechanisms underlying COVID-19. These insights, in turn, help in generating hypotheses aimed at identifying druggable targets for the development of therapeutic interventions, including vaccines. We present an online resource that provides 872 structural models, derived from all current entries in the PDB that have detectable sequence similarity to any of the SARS-CoV-2 proteins. The matching sequence-to-structure alignments were generated by aligning pairs of Hidden Markov Models (HMMs) via HHblits. The structures are presented in the Aquaria molecular graphics system, which was designed to facilitate overlay of sequence features, e.g., SNPs and posttranslational modifications from UniProt. Aquaria has recently been enhanced to include a much richer set of sequence features, including predictions from the PredictProtein and CATH resources. The COVID-19 models, together with 32,717 sequence features, are available at https://aquaria.ws/covid19. Our resource provides researchers with a wealth of information on the molecular mechanisms of COVID-19; the information can easily be accessed and, to the best of our knowledge, is currently not available from other resources. The resource provides an immediate visual overview of what is known, and not known, about the 3D structure of the viral proteome, thereby helping direct future research.
doi:10.1101/2020.07.16.207308 fatcat:5jb7yrkdqffgzo7arlwkityuiy
To understand the molecular mechanisms that give rise to a protein's function, biologists often need to (i) find and access all related atomic-resolution 3D structures, and (ii) map sequence-based features (e.g., domains, single-nucleotide polymorphisms, post-translational modifications) onto these structures. Results: To streamline these processes we recently developed Aquaria, a resource offering unprecedented access to protein structure information based on an all-against-all comparison of SwissProt and PDB sequences. In this work, we provide a requirements analysis for several frequently occurring tasks in molecular biology and describe how design choices in Aquaria meet these requirements. Finally, we show how the interface can be used to explore features of a protein and gain biologically meaningful insights in two case studies conducted by domain experts. Conclusions: The user interface design of Aquaria enables biologists to gain unprecedented access to molecular structures and simplifies the generation of insight. The tasks involved in mapping sequence features onto structures can be conducted more easily and quickly using Aquaria.
doi:10.1186/1471-2105-16-s11-s7 pmid:26329268 pmcid:PMC4547178 fatcat:tzp3vv2qxbc3jpr4tbyrckqos4
et al, 2021), as is the full PSSH2 database (Data ref: Schafferhans & O'Donoghue, 2020). ... derived in this work can be directly accessed from links provided in Datasets EV1-EV3; additionally, the underlying SARS-CoV-2 sequence-to-structure alignments are available online for download (Data ref: Schafferhans ...
doi:10.15252/msb.202010079 pmid:34519429 fatcat:swflm5kc2rfwbffbomvbfub2uq
We investigated the influence of oxygen on the performance of P3HT:PCBM (poly(3-hexylthiophene):[6,6]-phenyl C61 butyric acid methyl ester) solar cells by current-voltage, thermally stimulated current (TSC) and charge extraction by linearly increasing voltage (CELIV) measurement techniques. The exposure to oxygen leads to an enhanced charge carrier concentration and a decreased charge carrier mobility. Further, an enhanced formation of deeper traps was observed, although the overall density of traps was found to be unaffected upon oxygen exposure. With the aid of macroscopic simulations, based on solving the differential equation system of Poisson, continuity and drift-diffusion equations in one dimension, we demonstrate the influence of a reduced charge carrier mobility and an increased charge carrier density on the main solar cell parameters, consistent with experimental findings.
doi:10.1016/j.orgel.2010.07.016 fatcat:4kzcr3ssbbg2rpnql5qgv7u2vq
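For orientation, the one-dimensional system of Poisson, continuity and drift-diffusion equations named in the P3HT:PCBM abstract is commonly written in the textbook form below. This is a generic sketch, not the specific model of the paper: the symbols (electrostatic potential \psi, electron and hole densities n and p, mobilities \mu_{n,p}, diffusion coefficients D_{n,p}, generation and recombination rates G and R, elementary charge q, permittivity \varepsilon) are assumptions introduced here, and trapped or fixed charge is omitted from the Poisson equation.

```latex
\begin{align}
\frac{\partial^2 \psi}{\partial x^2} &= -\frac{q}{\varepsilon}\,\bigl(p - n\bigr), \\
\frac{\partial n}{\partial t} &= \frac{1}{q}\frac{\partial J_n}{\partial x} + G - R, \\
\frac{\partial p}{\partial t} &= -\frac{1}{q}\frac{\partial J_p}{\partial x} + G - R, \\
J_n &= -q\,\mu_n\, n\,\frac{\partial \psi}{\partial x} + q\,D_n\,\frac{\partial n}{\partial x}, \\
J_p &= -q\,\mu_p\, p\,\frac{\partial \psi}{\partial x} - q\,D_p\,\frac{\partial p}{\partial x}.
\end{align}
```

In this form, a reduced mobility lowers the drift terms in J_n and J_p, while an increased carrier density enters both the space charge in the Poisson equation and the currents.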
The trap distribution in the conjugated polymer poly(3-hexylthiophene) was investigated by fractional thermally stimulated current measurements. Two defect states with activation energies of about 50 meV and 105 meV and Gaussian energy distributions were revealed. The first is assigned to the tail of the intrinsic density of states, whereas the concentration of the second trap is directly related to oxygen exposure. The impact of the oxygen-induced traps on the charge transport was examined by performing photo-induced charge carrier extraction by linearly increasing voltage measurements, which exhibited a strong decrease in the mobility with air exposure time.
doi:10.1063/1.2978237 fatcat:ojbrrpzu2vd4pp5arkc5l36z2i
Citation: Hücker SM, Ardern Z, Goldberg T, Schafferhans A, Bernhofer M, Vestergaard G, et al. (2017) Discovery of numerous novel small genes in the intergenic regions of the Escherichia coli ...
doi:10.1371/journal.pone.0184119 pmid:28902868 pmcid:PMC5597208 fatcat:qzpooj4q7zaadlubrtvmnwqorq
Genomes of E. coli, including that of the human pathogen Escherichia coli O157:H7 (EHEC) EDL933, still harbor undetected protein-coding genes which, apparently, have escaped annotation due to their small size and non-essential function. To find such genes, global gene expression of EHEC EDL933 was examined, using strand-specific RNAseq (transcriptome), ribosomal footprinting (translatome) and mass spectrometry (proteome). Results: Using the above methods, 72 short, non-annotated protein-coding genes were detected. All of these showed signals in the ribosomal footprinting assay indicating mRNA translation. Seven were verified by mass spectrometry. Fifty-seven genes are annotated in other Enterobacteriaceae, mainly as hypothetical genes; the remaining 15 genes constitute novel discoveries. In addition, protein structure and function were predicted computationally and compared between EHEC-encoded proteins and 100-times randomly shuffled proteins. Based on this comparison, 61 of the 72 novel proteins exhibit predicted structural and functional features similar to those of annotated proteins. Many of the novel genes show differential transcription when grown under eleven diverse growth conditions, suggesting environmental regulation. Three genes were found to confer a phenotype in previous studies, e.g., decreased cattle colonization. Conclusions: These findings demonstrate that ribosomal footprinting can be used to detect novel protein-coding genes, contributing to the growing body of evidence that hypothetical genes are not annotation artifacts and opening an additional way to study their functionality. All 72 genes are taxonomically restricted and, therefore, appear to have evolved relatively recently de novo.
doi:10.1186/s12864-016-2456-1 pmid:26911138 pmcid:PMC4765031 fatcat:gcvgh2w2f5eifhrznlb47f6nnu
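The comparison against "100-times randomly shuffled proteins" mentioned in the EHEC abstract can be illustrated with a minimal sketch. It assumes that shuffling means permuting the residues of each sequence so that amino-acid composition and length are preserved; the abstract does not state the exact procedure, and the function name `shuffled_controls` and the example sequence are hypothetical.

```python
import random
from typing import List


def shuffled_controls(sequence: str, n: int = 100, seed: int = 0) -> List[str]:
    """Return n random permutations of a protein sequence.

    Each control keeps the length and amino-acid composition of the
    input but destroys the residue order (an assumed shuffling scheme).
    """
    rng = random.Random(seed)  # seeded so the controls are reproducible
    controls = []
    for _ in range(n):
        residues = list(sequence)
        rng.shuffle(residues)  # in-place permutation of the residues
        controls.append("".join(residues))
    return controls


# Hypothetical short protein sequence, used only for illustration.
query = "MKTLLVAGSAVLLSGCQ"
controls = shuffled_controls(query)
```

Predictions made for `query` could then be compared with the distribution of the same predictions over `controls`; which predictors and which statistics the authors used is not specified in the abstract.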
We surveyed the "dark" proteome (that is, regions of proteins never observed by experimental structure determination and inaccessible to homology modeling). For 546,000 Swiss-Prot proteins, we found that 44-54% of the proteome in eukaryotes and viruses was dark, compared with only ∼14% in archaea and bacteria. Surprisingly, most of the dark proteome could not be accounted for by conventional explanations, such as intrinsic disorder or transmembrane regions. Nearly half of the dark proteome comprised dark proteins, in which the entire sequence lacked similarity to any known structure. Dark proteins fulfill a wide variety of functions, but a subset showed distinct and largely unexpected features, such as association with secretion, specific tissues, the endoplasmic reticulum, disulfide bonding, and proteolytic cleavage. Dark proteins also had short sequence length, low evolutionary reuse, and few known interactions with other proteins. These results suggest new research directions in structural and computational biology.
Keywords: structure prediction | protein disorder | transmembrane proteins | secreted proteins | unknown unknowns
The Protein Data Bank (PDB) (1) of experimentally determined macromolecular structures recently surpassed 110,000 entries, a landmark in understanding the molecular machinery of life. Structure determination lags far behind DNA sequencing, but high-throughput computational modeling (2, 3) can leverage the PDB to provide accurate structural predictions for a large fraction of protein sequences. Thus, structural data now scale with se-
doi:10.1073/pnas.1508380112 pmid:26578815 pmcid:PMC4702990 fatcat:u6kfcwbrwfasfj7polmlilavv4
Since 1992 PredictProtein (https://predictprotein.org) is a one-stop online resource for protein sequence analysis with its main site hosted at the Luxembourg Centre for Systems Biomedicine (LCSB) and queried monthly by over 3,000 users in 2020. PredictProtein was the first Internet server for protein predictions. It pioneered combining evolutionary information and machine learning. Given a protein sequence as input, the server outputs multiple sequence alignments, predictions of protein structure in 1D and 2D (secondary structure, solvent accessibility, transmembrane segments, disordered regions, protein flexibility, and disulfide bridges) and predictions of protein function (functional effects of sequence variation or point mutations, Gene Ontology (GO) terms, subcellular localization, and protein-, RNA-, and DNA binding). PredictProtein's infrastructure has moved to the LCSB, increasing throughput; the use of MMseqs2 sequence search reduced runtime five-fold; user interface elements improved usability, and new prediction methods were added. PredictProtein recently included predictions from deep learning embeddings (GO and secondary structure) and a method for the prediction of proteins and residues binding DNA, RNA, or other proteins. PredictProtein.org aspires to provide reliable predictions to computational and experimental biologists alike. All scripts and methods are freely available for offline execution in high-throughput settings.
doi:10.1101/2021.02.23.432527 fatcat:5tdh5vgujjbataxvwidial7fee
PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein-protein binding sites (ISIS2), protein-polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.
doi:10.1093/nar/gku366 pmid:24799431 pmcid:PMC4086098 fatcat:c5pmgwyx7zg4jjncj4eywixmf4
Since 1992 PredictProtein (https://predictprotein.org) is a one-stop online resource for protein sequence analysis with its main site hosted at the Luxembourg Centre for Systems Biomedicine (LCSB) and queried monthly by over 3,000 users in 2020. PredictProtein was the first Internet server for protein predictions. It pioneered combining evolutionary information and machine learning. Given a protein sequence as input, the server outputs multiple sequence alignments, predictions of protein structure in 1D and 2D (secondary structure, solvent accessibility, transmembrane segments, disordered regions, protein flexibility, and disulfide bridges) and predictions of protein function (functional effects of sequence variation or point mutations, Gene Ontology (GO) terms, subcellular localization, and protein-, RNA-, and DNA binding). PredictProtein's infrastructure has moved to the LCSB, increasing throughput; the use of MMseqs2 sequence search reduced runtime five-fold (apparently without lowering performance of prediction methods); user interface elements improved usability, and new prediction methods were added. PredictProtein recently included predictions from deep learning embeddings (GO and secondary structure) and a method for the prediction of proteins and residues binding DNA, RNA, or other proteins. PredictProtein.org aspires to provide reliable predictions to computational and experimental biologists alike. All scripts and methods are freely available for offline execution in high-throughput settings.
doi:10.1093/nar/gkab354 pmid:33999203 pmcid:PMC8265159 fatcat:3ozsvwjgmze35lzdfp46ortxjq
Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a ...
doi:10.1093/nar/gkv1116 pmid:26538599 pmcid:PMC4702812 fatcat:npzer6xh7fezlbcamfmdvv552m

Nucleic Acids Research, 2016, Vol. 44, Database issue D43

Table 3. Resource providers. A non-exhaustive list of collections that have contributed or will contribute to the registry. The list includes a cross-section of bioinformatics service providers including other catalogues such as SEQwiki and BioCatalogue.

Name (URL): Short description

CBS Prediction Servers (http://www.cbs.dtu.dk/biotools): A collection of on-line prediction services from CBS-DTU. The resource contains 75 tools for gene finding and splice sites, post-translational protein modification, immunological features, protein function and structure, protein sorting, genomic epidemiology and more. The tools can be used via interactive input forms, with many available as software packages and SOAP Web services.

DRCAT resource catalogue (http://drcat.sourceforge.net): The data resource catalogue is a collection of metadata on bioinformatics Web-based data resources. The catalog contains over 600 resources including bioinformatics and biomedical databases, ontologies, taxonomies and catalogues.

BiBiServ (http://bibiserv.cebitec.uni-bielefeld.de): BiBiServ is a collection of bioinformatics tools that emerged from the research at Bielefeld University. It contains over 40 mainly analysis and utility tools, including RNA structure prediction, metagenomics, genome rearrangement, alignments, evolutionary relationships, primer design and suffix trees. These are available as interactive web applications, HTTP Web services and downloadable software.

BINF.KU.DK Services and Software (http://www.binf.ku.dk/services): A collection of over 20 web services, databases and software packages from The Bioinformatics Centre at The University of Copenhagen. The resource covers sequence and structure analysis, prediction and modeling, gene regulation, population genetics and more.

ELIXIR-CZ Services collection (https://www.elixir-czech.cz/services): The Czech Bioinformatics Services resource is provided by members of ELIXIR CZ node. It contains over 30 bioinformatics tools and databases for analysis of sequence, topology and structure of nucleic acids and proteins to genomics, proteomics and benchmarks for small molecule interactions. The databases can be accessed via web GUIs while tools are available as web, standalone and command-line applications.