Bayesian transcriptome assembly

Lasse Maretty, Jonas Andreas Sibbesen, Anders Krogh
2014 Genome Biology  
RNA sequencing allows for simultaneous transcript discovery and quantification, but reconstructing complete transcripts from such data remains difficult. Here, we introduce Bayesembler, a novel probabilistic method for transcriptome assembly built on a Bayesian model of the RNA sequencing process. Under this model, samples from the posterior distribution over transcripts and their abundance values are obtained using Gibbs sampling. By using the frequency at which transcripts are observed during
... sampling to select the final assembly, we demonstrate marked improvements in sensitivity and precision over state-of-the-art assemblers on both simulated and real data. Bayesembler is available at
doi:10.1186/s13059-014-0501-4 pmid:25367074 pmcid:PMC4397945 fatcat:fxzrgy6lazgwfg6cjwmlprjff4

Genotyping structural variants in pangenome graphs using the vg toolkit [article]

Glenn Hickey, David Heller, Jean Monlong, Jonas Andreas Sibbesen, Jouni Siren, Jordan Eizenga, Eric Dawson, Erik Garrison, Adam Novak, Benedict Paten
2019 bioRxiv   pre-print
Structural variants (SVs) are significant components of genetic diversity and have been associated with diseases, but the technological challenges surrounding their representation and identification make them difficult to study relative to point mutations. Still, thousands of SVs have been characterized, and catalogs continue to improve with new technologies. In parallel, variation graphs have been proposed to represent human pangenomes, offering reduced reference bias and better mapping
... y than linear reference genomes. We contend that variation graphs provide an effective means for leveraging SV catalogs for short-read SV genotyping experiments. In this work, we extend vg (a software toolkit for working with variation graphs) to support SV genotyping. We show that it is capable of genotyping insertions, deletions and inversions, even in the presence of small errors in the location of the SV breakpoints. We then benchmark vg against state-of-the-art SV genotypers using three high-quality sequence-resolved SV catalogs generated by recent studies ranging up to 97,368 variants in size. We find that vg systematically produces the best genotype predictions in all datasets. In addition, we use assemblies from 12 yeast strains to show that graphs constructed directly from aligned de novo assemblies can improve genotyping compared to graphs built from intermediate SV catalogs in the VCF format. Our results demonstrate the power of variation graphs for SV genotyping. Beyond single nucleotide variants and short insertions/deletions, the vg toolkit now incorporates SVs in its unified variant calling framework and provides a natural solution to integrate high-quality SV catalogs and assemblies.
doi:10.1101/654566 fatcat:nanvfcphv5dqdfd3mruitwpbka

Coalescent inference using serially sampled, high-throughput sequencing data from intra-host HIV infection [article]

Kevin Dialdestoro, Jonas Andreas Sibbesen, Lasse Maretty, Jayna Raghwani, Astrid Gall, Paul Kellam, Oliver Pybus, Jotun Hein, Paul Jenkins
2015 bioRxiv   pre-print
Human immunodeficiency virus (HIV) is a rapidly evolving pathogen that causes chronic infections, so genetic diversity within a single infection can be very high. High-throughput "deep" sequencing can now measure this diversity in unprecedented detail, particularly since it can be performed at different timepoints during an infection, and this offers a potentially powerful way to infer the evolutionary dynamics of the intra-host viral population. However, population genomic inference from HIV
... quence data is challenging because of high rates of mutation and recombination, rapid demographic changes, and ongoing selective pressures. In this paper we develop a new method for inference from HIV deep sequencing data using an approach based on importance sampling of ancestral recombination graphs under a multi-locus coalescent model. The approach further extends recent progress in the approximation of so-called conditional sampling distributions, a quantity of key interest when approximating coalescent likelihoods. The chief novelties of our method are that it is able to infer rates of recombination and mutation, as well as the effective population size, while handling sampling over different timepoints and missing data without extra computational difficulty. We apply our method to a dataset of HIV-1, in which several hundred sequences were obtained from an infected individual at seven timepoints over two years. We find mutation rate and effective population size estimates to be comparable to those produced by the software BEAST. Additionally, our method is able to produce local recombination rate estimates. The software underlying our method, Coalescenator, is freely available.
doi:10.1101/020552 fatcat:gv7r42oxunf65aa3yvfx2sbj5q

Next-generation biology: Sequencing and data analysis approaches for non-model organisms

Rute R. da Fonseca, Anders Albrechtsen, Gonçalo Espregueira Themudo, Jazmín Ramos-Madrigal, Jonas Andreas Sibbesen, Lasse Maretty, M. Lisandra Zepeda-Mendoza, Paula F. Campos, Rasmus Heller, Ricardo J. Pereira
2016 Marine Genomics  
As sequencing technologies become more affordable, it is now realistic to propose studying the evolutionary history of virtually any organism on a genomic scale. However, when dealing with non-model organisms it is not always easy to choose the best approach given a specific biological question, a limited budget, and challenging sample material. Furthermore, although recent advances in technology offer unprecedented opportunities for research in non-model organisms, they also demand
... d awareness from the researcher regarding the assumptions and limitations of each method. In this review we present an overview of the current sequencing technologies and the methods used in typical high-throughput data analysis pipelines. Subsequently, we contextualize high-throughput DNA sequencing technologies within their applications in non-model organism biology. We include tips regarding managing unconventional sample material, comparative and population genetic approaches that do not require fully assembled genomes, and advice on how to deal with low depth sequencing data.
doi:10.1016/j.margen.2016.04.012 pmid:27184710 fatcat:j37svmhkkbhcjguswzwk5rpfim

RNA Sequencing of Trigeminal Ganglia in Rattus Norvegicus after Glyceryl Trinitrate Infusion with Relevance to Migraine

Sara Hougaard Pedersen, Lasse Maretty, Roshni Ramachandran, Jonas Andreas Sibbesen, Victor Yakimov, Rikke Elgaard-Christensen, Thomas Folkmann Hansen, Anders Krogh, Jes Olesen, Inger Jansen-Olesen, Emanuele Buratti
2016 PLoS ONE  
Introduction: Infusion of glyceryl trinitrate (GTN), a donor of nitric oxide, induces immediate headache in humans that in migraineurs is followed by a delayed migraine attack. To better understand the mechanisms activated during GTN-infusion, the present study investigates transcriptional responses to GTN-infusion in the rat trigeminal ganglia. Methods: Rats were infused with GTN or vehicle and trigeminal ganglia were isolated either 30 or 90 minutes post infusion. RNA
... quencing was used to investigate transcriptomic changes in response to the treatment. Furthermore, we developed a novel method for Gene Set Analysis Of Variance (GSANOVA) to identify gene sets associated with transcriptional changes across time. Results: 15 genes displayed significant changes in transcription levels in response to GTN-infusion. Ten of these genes showed either sustained up- or down-regulation in the 90-minute period after infusion. The GSANOVA analysis demonstrates enrichment of pathways pointing towards an increase in immune response, signal transduction, and neuroplasticity in response to GTN-infusion. Future functional in-depth studies of these mechanisms are expected to increase our understanding of migraine pathogenesis.
doi:10.1371/journal.pone.0155039 pmid:27213950 pmcid:PMC4877077 fatcat:vt2ldtou6jd7lblv4ynlfajbvm

Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

Lasse Maretty, Jacob Malte Jensen, Bent Petersen, Jonas Andreas Sibbesen, Siyang Liu, Palle Villesen, Laurits Skov, Kirstine Belling, Christian Theil Have, Jose M. G. Izarzugaza, Marie Grosjean, Jette Bork-Jensen (+47 others)
2017 Nature  
3 August 2017 | Vol 548 | Nature | 87 | Letter
doi:10.1038/nature23264 pmid:28746312 fatcat:p7m6a5ripfagrall2euqtokyim

BMC Bioinformatics reviewer acknowledgement 2015

Dirk Krüger
2016 BMC Bioinformatics  
Andreas Sibbesen Denmark Dennis Hazelett USA Dennis Welker USA Des Higgins Ireland Deyvid Amgarten Brazil Domenico L Gatti USA Damyanthi Herath Australia Dragos Horvath France  ...  USA David A Eberhard USA Davide Risso USA Danny Barash Israel Dukka Kc USA Debasisa Mohanty India Deepak Ayyala USA Deepak Singla USA Denis Bauer Australia Huangdi Yi China Jonas  ... 
doi:10.1186/s12859-016-0936-6 fatcat:shf2gqndefawpdj3dcm6ndoavi

Pangenome-based genome inference [article]

Jana Ebler, Wayne E. Clarke, Tobias Rausch, Peter A. Audano, Torsten Houwaart, Jan Korbel, Evan E. Eichler, Michael C. Zody, Alexander T. Dilthey, Tobias Marschall
2020 bioRxiv   pre-print
Genome research, 27(2):300-309, 2017. [25] Jonas Andreas Sibbesen, Lasse Maretty, and Anders Krogh. Accurate genotyping across variant classes and lengths using variant graphs.  ...  Genome Biology, 20:201, 2019. [21] Glenn Hickey, David Heller, Jean Monlong, Jonas Andreas Sibbesen, Jouni Siren, Jordan Eizenga, Eric Dawson, Erik Garrison, Adam Novak, and Benedict Paten.  ... 
doi:10.1101/2020.11.11.378133 fatcat:ris4w4rq65blpe4icda3zl6if4