GreenPhylDB: A Gene Family Database for plant functional Genomics

Mathieu Rouard, Matthieu Conte, Marie-Angélique Laporte, Christophe Périn
2009 Nature Precedings  
Nowadays, most manual annotation in biology is done on gene sequences or protein patterns, but relatively little is done for gene families at large. However, a proper catalogue of homeomorphic gene families, that is, genes that evolved from a common ancestor and share full-length sequence similarity and a common domain architecture, would be a valuable resource for evolutionary studies and ortholog inference. GreenPhylDB v2.0a contains groups of protein-coding gene sequences automatically clustered from 12 complete plant genomes (fig. 1) that cover most of the taxonomy of green plants. Each cluster is first manually checked and then analyzed with a phylogenetic approach to predict orthologs. We add value with several annotations, including family names defined via a consensus from existing gene and protein pattern annotations (e.g. UniProt, InterPro, PIRSF, KEGG, GO) for the sequences composing the clusters. Here, we present our methodology and annotation tool for the curation of protein-coding gene sequence families, a critical step before any phylogenetic analysis.
doi:10.1038/npre.2009.3136.1 fatcat:5gmt6mvemvgbpmisarpvhvia3m