http://genomebiology.com/2007/8/6/R109 Genome Biology 2007, Volume 8, Issue 6, Article R109 Huerta-Cepas et al. ... Table Species included in the present phylome and their genomic ... doi:10.1186/gb-2007-8-6-r109 pmid:17567924 pmcid:PMC2394744 fatcat:b3btzemx3ncyhkkunyucv2jkeu
Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale. Results: Here we present the Environment for Tree Exploration (ETE), a Python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, supports the extended Newick format, provides an integrated node annotation system and permits linking trees to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations. Conclusions: ETE provides a complete set of methods to manipulate tree data structures that extends current functionality in other bioinformatic toolkits of a more general purpose. ETE is free software and can be downloaded from ... doi:10.1186/1471-2105-11-24 pmid:20070885 pmcid:PMC2820433 fatcat:4g6bz74wurc4pk4h5rgwi73kau
We extracted the prevalence information by annotating protein sequences in KEGG database (rel. 78, Apr 1st, 2016) with eggNOG-mapper (Huerta-Cepas et al., 2016). ... doi:10.1016/j.ymben.2017.02.010 pmid:28232136 pmcid:PMC5368410 fatcat:y24fpmqqb5gmzktj2gxv32xidy
Building on recent improvements made to the eggNOG orthology resource (Huerta-Cepas et al. 2016), we have created eggNOG-mapper, an application intended for fast functional annotation of novel sequences ... For each query sequence, HMMER 3 (Eddy 2011) is first used to search for significant matches in the precomputed collection of Hidden Markov Models (HMM) available from the eggNOG database (Huerta-Cepas ... doi:10.1101/076331 fatcat:gb7ez66vgjhrnnfbxj24i4auoi
Phylogenetic trees are routinely visualized to present and interpret the evolutionary relationships of species. Virtually all empirical evolutionary data studies contain a visualization of the inferred tree with branch support values. Ambiguous semantics in tree file formats can lead to erroneous tree visualizations and therefore to incorrect interpretations of phylogenetic analyses. Here, we discuss problems that can and do arise when displaying branch values on trees after re-rooting. Branch values are typically stored as node labels in the widely used Newick tree format. However, such values are attributes of branches. Storing them as node labels can therefore yield errors when re-rooting trees. This depends on the mostly implicit semantics that tools deploy to interpret node labels. We reviewed 10 tree viewers and 10 bioinformatics toolkits that can display and re-root trees. We found that 14 out of 20 of these tools do not permit users to select the semantics of node labels. Thus, unaware users might obtain incorrect results when rooting trees inferred by common phylogenetic inference programs. We illustrate such incorrect mappings for several test cases and real examples taken from the literature. This review has already led to improvements and workarounds in 8 of the tested tools. We suggest tools should provide an option that explicitly forces users to define the semantics of node labels. doi:10.1101/035360 fatcat:bgaxpwvyt5ctthntzfmj45ltju
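The re-rooting problem described in the abstract above can be sketched on a toy tree. This is only an illustration under stated assumptions: the tree, the node names (`r`, `x`, `y`) and the helper functions `orient` and `labels_after_reroot` are hypothetical and are not taken from the paper or from any of the reviewed tools. The sketch stores each support value once as a node label and once as an attribute of the branch above that node, then re-roots the tree at a different node.

```python
def orient(adjacency, root):
    """Return a {node: parent} map for the tree rooted at `root`."""
    parent = {root: None}
    stack = [root]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                stack.append(neighbour)
    return parent


def labels_after_reroot(branch_values, adjacency, new_root):
    """Re-derive node labels from branch values after re-rooting.

    Each value is attached to the child end of its branch in the new rooting.
    """
    parent = orient(adjacency, new_root)
    relabelled = {}
    for node, par in parent.items():
        if par is None:
            continue  # the root has no branch above it
        edge = frozenset((node, par))
        if edge in branch_values:
            relabelled[node] = branch_values[edge]
    return relabelled


# Hypothetical toy tree, originally rooted at "r": ((A,B)90,C,(D,E)60);
adjacency = {
    "r": ["x", "C", "y"],
    "x": ["r", "A", "B"],
    "y": ["r", "D", "E"],
    "A": ["x"], "B": ["x"], "C": ["r"], "D": ["y"], "E": ["y"],
}
node_labels = {"x": 90, "y": 60}  # support values as written in the Newick string

# Branch semantics: each label describes the branch between a node and its parent.
old_parent = orient(adjacency, "r")
branch_values = {
    frozenset((node, old_parent[node])): value
    for node, value in node_labels.items()
}

# Node-label semantics: labels simply stay on their nodes after re-rooting at "y".
naive = dict(node_labels)
# Branch semantics: values follow their branches to the new child node.
correct = labels_after_reroot(branch_values, adjacency, "y")

print("node-label semantics:", naive)
print("branch semantics:    ", correct)
```

In this toy case, the value 60 stays on `y` under node-label semantics, where it now sits on the root and describes no branch. Under branch semantics it moves to `r`, the node now below the branch between `r` and `y`.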
Notably, ETE-build was recently used to compute over one million phylogenetic trees for the EggNOG v4.5 database (Huerta-Cepas et al. 2016). ... convenience, raw output files produced by CodeML and SLR can also be visualized using ete-evol. ... doi:10.1093/molbev/msw046 pmid:26921390 pmcid:PMC4868116 fatcat:bu225g3htzccxo6qo5ijeyeg4e
Even though automated functional annotation of genes represents a fundamental step in most genomic and metagenomic workflows, it remains challenging at large scales. Here, we describe a major upgrade to eggNOG-mapper, a tool for functional annotation based on precomputed orthology assignments, now optimized for vast (meta)genomic data sets. Improvements in version 2 include a full update of both the genomes and functional databases underlying eggNOG v5, as well as several efficiency ... and new features. Most notably, eggNOG-mapper v2 now allows: (i) de novo gene prediction from raw contigs, (ii) built-in pairwise orthology prediction, (iii) fast protein domain discovery, and (iv) automated GFF decoration. eggNOG-mapper v2 is available as a standalone tool or as an online service at http://emapperdev.compgenomics.org. doi:10.1101/2021.06.03.446934 fatcat:do4oh4xcl5cerm73pmakdwlfbi
Modern sequencing technologies have massively increased the amount of data available for comparative genomics. Whole-transcriptome shotgun sequencing (RNA-seq) provides a powerful basis for comparative studies. In particular, this approach holds great promise for emerging model species in fields such as evolutionary developmental biology (evo-devo). Results: We have sequenced early embryonic transcriptomes of two non-drosophilid dipteran species: the moth midge Clogmia albipunctata, and the scuttle fly Megaselia abdita. Our analysis includes a third, published, transcriptome for the hoverfly Episyrphus balteatus. These emerging models for comparative developmental studies close an important phylogenetic gap between Drosophila melanogaster and other insect model systems. In this paper, we provide a comparative analysis of early embryonic transcriptomes across species, and use our data for a phylogenomic re-evaluation of dipteran phylogenetic relationships. Conclusions: We show how comparative transcriptomics can be used to create useful resources for evo-devo, and to investigate phylogenetic relationships. Our results demonstrate that de novo assembly of short (Illumina) reads yields high-quality, high-coverage transcriptomic data sets. We use these data to investigate deep dipteran phylogenetic relationships. Our results, based on a concatenation of 160 orthologous genes, provide support for the traditional view of Clogmia being the sister group of Brachycera (Megaselia, Episyrphus, Drosophila), rather than that of Culicomorpha (which includes mosquitoes and blackflies). doi:10.1186/1471-2164-14-123 pmid:23432914 pmcid:PMC3616871 fatcat:3itr3ebjlzcbjdux33mu22yysm
NGLess is a domain-specific language for describing next-generation sequence processing pipelines. It was developed with the goal of enabling user-friendly computational reproducibility. Using this framework, we developed 'NG-meta-profiler', a fast profiler for metagenomes which performs sequence preprocessing, mapping to bundled databases, filtering of the mapping results, and profiling (taxonomic and functional). It is significantly faster than either MOCAT2 or htseq-count and (as it builds on NGLess) its results are perfectly reproducible. These pipelines can easily be customized and extended with other tools. NGLess and NG-meta-profiler are open source software (under the liberal MIT licence) and can be downloaded from http://ngless.embl.de or installed through bioconda. doi:10.1101/367755 fatcat:g6tzdy2y6zfhpeqb5dsxsfgcia
Post-transcriptional regulation is essential for life, yet we are currently unable to investigate its role in complex microbiome samples. Here we discover that co-translational mRNA degradation, where the degradation machinery follows the last translating ribosome, is conserved across prokaryotes. By investigating 5′P mRNA decay intermediates, we obtain in vivo ribosome protection information that allows the study of codon- and gene-specific ribosome stalling in response to stress and drug treatment at single-nucleotide resolution. We use this approach to investigate in vivo species-specific ribosome footprints of clinical and environmental microbiomes and show for the first time that ribosome protection patterns can be used to phenotype microbiome perturbations. Our work paves the way for the study of the metatranslatome, and enables the investigation of fast, species-specific, post-transcriptional responses to environmental and chemical perturbations in unculturable microbial communities. doi:10.1101/2021.04.08.439066 fatcat:nhifr6eb6baenodtrtqdkyllwu
Cancer arises from the consecutive acquisition of genetic alterations. Increasing evidence suggests that as a consequence of these alterations, molecular interactions are reprogrammed in the context of highly connected and regulated cellular networks. Coordinated reprogramming would allow the cell to acquire the capabilities for malignant growth. Results: Here, we determine the coordinated function of cancer gene products (i.e., proteins encoded by differentially expressed genes in tumors relative to healthy tissue counterparts, hereafter referred to as "CGPs") defined as their topological properties and organization in the interactome network. We show that CGPs are central to information exchange and propagation and that they are specifically organized to promote tumorigenesis. Centrality is identified by both local (degree) and global (betweenness and closeness) measures, and systematically appears in down-regulated CGPs. Up-regulated CGPs do not consistently exhibit centrality, but both types of cancer products determine the overall integrity of the network structure. In addition to centrality, down-regulated CGPs show topological association that correlates with common biological processes and pathways involved in tumorigenesis. Conclusion: Given the current limited coverage of the human interactome, this study proposes that tumorigenesis takes place in a specific and organized way at the molecular systems-level and suggests a model that comprises the precise down-regulation of groups of topologically-associated proteins involved in particular functions, orchestrated with the up-regulation of specific proteins. doi:10.1186/1471-2164-8-185 pmid:17584915 pmcid:PMC1929080 fatcat:qfylkgyhsfczjkz2hmaex6avsu
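The local and global centrality measures named in the abstract above can be sketched on a small graph. This is a minimal illustration, not the authors' analysis: the toy network and its node names are hypothetical, the normalizations (degree divided by n - 1, closeness as reachable nodes divided by summed shortest-path distance) are common textbook conventions assumed here, and betweenness is omitted to keep the sketch short.

```python
from collections import deque


def degree_centrality(graph):
    """Local measure: fraction of the other nodes each node is directly linked to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """Global measure: reachable nodes divided by summed shortest-path distance."""
    result = {}
    for source in graph:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in graph[current]:
                if neighbour not in distance:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
        total = sum(distance.values())
        result[source] = (len(distance) - 1) / total if total else 0.0
    return result


# Hypothetical undirected toy network (not real interactome data).
toy_network = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"c"},
}

degree = degree_centrality(toy_network)
closeness = closeness_centrality(toy_network)

for node in sorted(toy_network):
    print(f"{node}: degree={degree[node]:.2f} closeness={closeness[node]:.2f}")
```

In this toy network `hub` ranks first on both measures. In a real interactome the local and global measures can disagree, which is presumably why the abstract reports both kinds.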
With the popularisation of high-throughput techniques, the need for procedures that help in the biological interpretation of results has increased enormously. Recently, new procedures inspired by systems biology criteria have started to be developed. Results: Here we present FatiScan, a web-based program which implements a threshold-independent test for the functional interpretation of large-scale experiments that does not depend on the pre-selection of genes based on the multiple application of independent tests to each gene. The test implemented aims to directly test the behaviour of blocks of functionally related genes, instead of focusing on single genes. In addition, the test does not depend on the type of the data used for obtaining significance values, and consequently different types of biologically informative terms (gene ontology, pathways, functional motifs, transcription factor binding sites or regulatory sites from CisRed) can be applied to different classes of genome-scale studies. We exemplify its application in microarray gene expression, evolution and interactomics. Conclusion: Methods for gene set enrichment which, in addition, are independent from the original data and experimental design constitute a promising alternative for the functional profiling of genome-scale experiments. A web server that performs the test described and other similar ones can be found at: http://www.babelomics.org. doi:10.1186/1471-2105-8-114 pmid:17407596 pmcid:PMC1853114 fatcat:l4qti6dx7va6ppgco7f6ejih4y
Shotgun metagenomes contain a sample of all the genomic material in an environment, allowing for the characterization of a microbial community. In order to understand these communities, bioinformatics methods are crucial. A common first step in processing metagenomes is to compute abundance estimates of different taxonomic or functional groups from the raw sequencing data. Given the breadth of the field, computational solutions need to be flexible and extensible, enabling the combination of different tools into a larger pipeline. doi:10.1186/s40168-019-0684-8 pmid:31159881 pmcid:PMC6547473 fatcat:a44vlibljvebhl2gs7vwfllbnq
They were constructed using the "sptree_raxml_all" workflow as implemented in ETE3 v3.0.0b36 (Huerta-Cepas et al, 2016a). ... From this study, we specifically used annotations to eggNOG (Huerta-Cepas et al, 2016b), KEGG (Kanehisa & Goto, 2000), and SEED (Overbeek et al, 2005) as indicated in the main text. ... doi:10.15252/msb.20177589 pmid:29242367 pmcid:PMC5740502 fatcat:gpiwif7sbzadxlu42rg4wsw2vy
Table 1. Databases from which functional properties are obtained. Columns: Proteins, Coverage, Precision, Recall, Reference. Protein domains and families: eggNOG, 7 449 593, 100, 100, 100, Huerta-Cepas et al ... doi:10.1093/bioinformatics/btw183 pmid:27153620 pmcid:PMC4978931 fatcat:njanfdmrb5fy7ocmj5x56i4yby