The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons

Ingo Braasch, Andrew R Gehrke, Jeramiah J Smith, Kazuhiko Kawasaki, Tereza Manousaki, Jeremy Pasquier, Angel Amores, Thomas Desvignes, Peter Batzel, Julian Catchen, Aaron M Berlin, Michael S Campbell (+49 others)
2016 Nature Genetics  
Teleost fishes represent about half of all living vertebrate species 1 and provide important models for human disease (for example, zebrafish and medaka) 2-9. Connecting teleost genes and gene functions to human biology (Fig. 1a) can be challenging given (i) the two rounds of early vertebrate genome duplication (VGD1 and VGD2 (ref. 10), but see ref. 11) followed by reciprocal loss of some ohnologs (gene duplicates derived from genome duplication 12) in teleosts and tetrapods, including [...] 13,14; (ii) the TGD, which resulted in duplicates of many human genes 15,16; and (iii) rapid teleost sequence evolution 17,18, often due to asymmetric rates of ohnolog evolution, that frustrates ortholog identification. To help connect teleost biomedicine to human biology, we sequenced the genome of spotted gar (L. oculatus, henceforth 'gar'; Supplementary Fig. 1 and Supplementary Note) because its lineage represents the unduplicated sister group of teleosts 19,20 (Fig. 1a). Gar informs the evolution of vertebrate genomes and gene functions after genome duplication and illuminates evolutionary mechanisms leading to teleost biodiversity. The gar genome evolved comparatively slowly and clarifies the evolution and orthology of problematic teleost protein-coding and microRNA (miRNA) gene families. Surprisingly, many entire gar chromosomes have been conserved with some tetrapods for 450 million years. Notably, gar facilitates the identification of CNEs, which are often regulatory, that teleosts and humans share but that are not detected by direct sequence comparisons. Global gene expression analyses show that expression domains and levels for TGD-generated duplicates usually sum to those for the corresponding gar gene, as expected if ancestral regulatory elements were partitioned after the TGD. By illuminating the legacy of genome duplication, the gar genome bridges teleost biology to human health, disease, development, physiology and evolution.

RESULTS
Genome assembly and annotation
The genome of a single adult gar female collected in Louisiana was sequenced to 90× coverage using Illumina technology. The ALLPATHS-LG 21 draft assembly covers 945 Mb with quality metrics comparable

To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD).
The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences. A full list of affiliations appears at the end of the paper.
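
The "bridge" reasoning for cryptic CNEs can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the dictionary inputs and all element identifiers below are hypothetical, and it assumes pairwise alignment hits (human to gar, gar to teleost, and direct human to teleost) have already been computed by some other tool. An element counts as cryptic here when it connects human to teleost through gar but has no direct human-teleost hit.

```python
def cryptic_cnes(human_to_gar, gar_to_teleost, direct_human_hits):
    """Return {human_cne: teleost_cne} pairs found only through the gar bridge.

    All inputs are hypothetical, precomputed alignment results:
      human_to_gar      -- dict mapping human CNE id -> gar CNE id
      gar_to_teleost    -- dict mapping gar CNE id -> teleost CNE id
      direct_human_hits -- set of human CNE ids already detected by a
                           direct human-teleost comparison
    """
    bridged = {}
    for human_id, gar_id in human_to_gar.items():
        teleost_id = gar_to_teleost.get(gar_id)
        if teleost_id is None:
            continue  # gar element has no teleost counterpart
        if human_id in direct_human_hits:
            continue  # already visible without gar, so not cryptic
        bridged[human_id] = teleost_id
    return bridged


# Toy identifiers, invented purely for illustration.
human_to_gar = {"h1": "g1", "h2": "g2", "h3": "g3"}
gar_to_teleost = {"g1": "t1", "g2": "t2"}
direct_human_hits = {"h1"}

result = cryptic_cnes(human_to_gar, gar_to_teleost, direct_human_hits)
```

In this toy input, `h1` is excluded because it is detected directly, `h3` is excluded because its gar counterpart has no teleost match, and only `h2` is recovered through gar.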
doi:10.1038/ng.3526 pmid:26950095 pmcid:PMC4817229 fatcat:oqldvyvguveell3dxg6ftsjleu