Large-scale sequence comparisons with sourmash [article]

N Tessa Pierce, Luiz Irber, Taylor Reiter, Phillip Brooks, C. Titus Brown
2019 bioRxiv   pre-print
The sourmash software package uses MinHash-based sketching to create "signatures", compressed representations of DNA, RNA, and protein sequences, that can be stored, searched, explored, and taxonomically annotated. sourmash signatures can be used to estimate sequence similarity between very large data sets quickly and in low memory, and can be used to search large databases of genomes for matches to query genomes and metagenomes. sourmash is implemented in C++, Rust, and Python, and is freely
available under the BSD license at http://github.com/dib-lab/sourmash.
doi:10.1101/687285 fatcat:7sanndl4z5fpdirc5os7hpvdbu

Large-scale sequence comparisons with sourmash

N. Tessa Pierce, Luiz Irber, Taylor Reiter, Phillip Brooks, C. Titus Brown
2019 F1000Research  
The sourmash software package uses MinHash-based sketching to create "signatures", compressed representations of DNA, RNA, and protein sequences, that can be stored, searched, explored, and taxonomically annotated. sourmash signatures can be used to estimate sequence similarity between very large data sets quickly and in low memory, and can be used to search large databases of genomes for matches to query genomes and metagenomes. sourmash is implemented in C++, Rust, and Python, and is freely
available under the BSD license at http://github.com/dib-lab/sourmash.
doi:10.12688/f1000research.19675.1 pmid:31508216 pmcid:PMC6720031 fatcat:mpzjwte2djf5bhyrp5u45evqzq

Protein k-mers enable assembly-free microbial metapangenomics [article]

Taylor E. Reiter, N. Tessa Pierce-Ward, Luiz C. Irber, Olga Botvinnik, C. Titus Brown
2022 bioRxiv   pre-print
Luiz Irber, Phillip T Brooks, Taylor Reiter, N Tessa Pierce-Ward, Mahmudur Rahman Hera, David Koslicki, C Titus Brown Bioinformatics (2022-01-12) https://doi.org/gn34zt DOI: 10.1101/2022.01.11.475838 Genome-resolved  ... 
doi:10.1101/2022.06.27.497795 fatcat:wl25hdbvdvfxpjro6okzd7ncxm

Efficient cardinality estimation for k-mers in large DNA sequencing data sets [article]

Luiz Carlos Irber Junior, C. Titus Brown
2016 bioRxiv   pre-print
We present an open implementation of the HyperLogLog cardinality estimation sketch for counting fixed-length substrings of DNA strings (k-mers). The HyperLogLog sketch implementation is in C++ with a Python interface, and is distributed as part of the khmer software package. khmer is freely available from https://github.com/dib-lab/khmer under a BSD License. The features presented here are included in version 1.4 and later.
doi:10.1101/056846 fatcat:b7bkhrj54nbqpnlvx3xscszghi

Evaluating Metagenome Assembly on a Simple Defined Community with Many Strain Variants [article]

Sherine Awad, Luiz Irber, C. Titus Brown
2017 bioRxiv   pre-print
We evaluate the performance of three metagenome assemblers, IDBA, MetaSPAdes, and MEGAHIT, on short-read sequencing of a defined "mock" community containing 64 genomes (Shakya et al. (2013)). We update the reference metagenome for this mock community and detect several additional genomes in the read data set. We show that strain confusion results in significant loss in assembly of reference genomes that are otherwise completely present in the read data set. In agreement with previous studies,
we find that MEGAHIT performs best computationally; we also show that MEGAHIT tends to recover larger portions of the strain variants than the other assemblers.
doi:10.1101/155358 fatcat:nd7gc6o635hqhasbaf572vpnta

Streamlining Data-Intensive Biology With Workflow Systems [article]

Taylor Reiter, Phillip T. Brooks, Luiz Irber, Shannon E.K. Joslin, Charles M. Reid, Camille Scott, C. Titus Brown, N. Tessa Pierce
2020 bioRxiv   pre-print
As the scale of biological data generation has increased, the bottleneck of research has shifted from data generation to analysis. Researchers commonly need to build computational workflows that include multiple analytic tools and require incremental development as experimental insights demand tool and parameter modifications. These workflows can produce hundreds to thousands of intermediate files and results that must be integrated for biological insight. The maturation of data-centric
[...] systems that internally manage computational resources, software, and conditional execution of analysis steps are reshaping the landscape of biological data analysis, and empowering researchers to conduct reproducible analyses at scale. Adoption of these tools can facilitate and expedite robust data analysis, but knowledge of these techniques is still lacking. Here, we provide a series of practices and strategies for leveraging workflow systems with structured project, data, and resource management to streamline large-scale biological analysis.
doi:10.1101/2020.06.30.178673 fatcat:up6eozdxyjhlxmkllqa4deewfm

Lightweight compositional analysis of metagenomes with FracMinHash and minimum metagenome covers [article]

Luiz Carlos Irber, Phillip T Brooks, Taylor E Reiter, N Tessa Pierce-Ward, Mahmudur Rahman Hera, David Koslicki, C. Titus Brown
2022 bioRxiv   pre-print
Irber, Taylor Reiter, Phillip Brooks, C Titus Brown F1000Research (2019-07-04) https://doi.org/gf9v84 DOI: 10.12688/f1000research.19675.1 • PMID: 31508216 • PMCID: PMC6720031 luizirber/phd 2020.09.28 Luiz  ...  Adam M Phillippy Genome Biology (2016-06-20) https://doi.org/gfx74q DOI: 10.1186/s13059-016-0997-x • PMID: 27323842 • PMCID: PMC4915045 sourmash: a library for MinHash sketching of DNA C Titus Brown, Luiz  ... 
doi:10.1101/2022.01.11.475838 fatcat:qr2hee27hnbwdme5b7ch2wrody

Haplotype-phased synthetic long reads from short-read sequencing [article]

James A Stapleton, Jeongwoon Kim, John P Hamilton, Ming Wu, Luiz C Irber, Rohan Maddamsetti, Bryan Briney, Linsey Newton, Dennis R Burton, C Titus Brown, Christina Chan, C Robin Buell (+1 others)
2015 bioRxiv   pre-print
Next-generation DNA sequencing has revolutionized the study of biology. However, the short read lengths of the dominant instruments complicate assembly of complex genomes and haplotype phasing of mixtures of similar sequences. Here we demonstrate a method to reconstruct the sequences of individual nucleic acid molecules up to 11.6 kilobases in length from short (150-bp) reads. We show that our method can construct 99.97%-accurate synthetic reads from bacterial, plant, and animal genomic
[...] full-length mRNA sequences from human cancer cell lines, and individual HIV env gene variants from a mixture. The preparation of multiple samples can be multiplexed into a single tube, further reducing effort and cost relative to competing approaches. Our approach generates sequencing libraries in three days from less than one microgram of DNA in a single-tube format without custom equipment or specialized expertise.
doi:10.1101/022897 fatcat:wii6yzsa5fgijer4ruxy3fkbyi

Context-aware genomic surveillance reveals hidden transmission of a carbapenemase-producing Klebsiella pneumoniae [article]

Adrian Viehweger, Christian Blumenscheit, Norman Lippmann, Kelly L. Wyres, Christian Brandt, Jörg B. Hans, Martin Hölzer, Luiz Irber, Sören Gatermann, Christoph Lübbert, Mathias Pletz, Kathryn E. Holt (+1 others)
2021 bioRxiv   pre-print
Genomic surveillance can inform effective public health responses to pathogen outbreaks. However, integration of non-local data is rarely done. We investigate two large hospital outbreaks of a carbapenemase-carrying Klebsiella pneumoniae strain in Germany and show the value of contextual data. By screening more than ten thousand genomes, 500 thousand metagenomes, and two culture collections using in silico and in vitro methods, we identify a total of 415 closely related genomes reported
[...] in 28 studies. We identify the relationship between the two outbreaks through time-dated phylogeny, including their respective origin. One of the outbreaks presents extensive hidden transmission, with descendant isolates only identified in other studies. We then leverage the genome collection from this meta-analysis to identify genes under positive selection. We thereby identify an inner membrane transporter (ynjC) with a putative role in colistin resistance. Contextual data from other sources can thus enhance local genomic surveillance at multiple levels and should be integrated by default when available.
doi:10.1101/2021.06.07.447408 fatcat:g5ghhxbepfhcthzuiayryzvolu

Haplotype-Phased Synthetic Long Reads from Short-Read Sequencing

James A. Stapleton, Jeongwoon Kim, John P. Hamilton, Ming Wu, Luiz C. Irber, Rohan Maddamsetti, Bryan Briney, Linsey Newton, Dennis R. Burton, C. Titus Brown, Christina Chan, C. Robin Buell (+2 others)
2016 PLoS ONE  
Next-generation DNA sequencing has revolutionized the study of biology. However, the short read lengths of the dominant instruments complicate assembly of complex genomes and haplotype phasing of mixtures of similar sequences. Here we demonstrate a method to reconstruct the sequences of individual nucleic acid molecules up to 11.6 kilobases in length from short (150-bp) reads. We show that our method can construct 99.97%-accurate synthetic reads from bacterial, plant, and animal genomic
[...] full-length mRNA sequences from human cancer cell lines, and individual HIV env gene variants from a mixture. The preparation of multiple samples can be multiplexed into a single tube, further reducing effort and cost relative to competing approaches. Our approach generates sequencing libraries in three days from less than one microgram of DNA in a single-tube format without custom equipment or specialized expertise.
doi:10.1371/journal.pone.0147229 pmid:26789840 pmcid:PMC4720449 fatcat:k3dhbedgdbawtoowdhdkvrzq5i

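VARIABILIDADE E CONFIABILIDADE DOS PARÂMETROS CINEMÁTICOS DA MARCHA DE IDOSOS APÓS UM TROPEÇO CONTROLADO: ESTUDO PRELIMINAR [Variability and reliability of gait kinematic parameters in older adults after a controlled trip: a preliminary study]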

Roberta Castilhos Detanico Bohrer, Angélica Lodovico, Marcia Regina Irber Kertscher, Gleber Pereira, André Luiz Felix Rodacki
2018 Journal of Physical Education  
Approximately 21% of falls in older adults occur due to tripping while walking. There is a paucity of information regarding gait variability and reliability when a trip is induced on different days, particularly in older adults. The aim was to analyze the variability and the reliability (intra- and inter-day) of spatiotemporal gait parameters and joint angles after controlled tripping in older adults. Eight healthy older women participated. The trip was induced during the early-mid swing phase on the transposing segment and the kinematic data were obtained from trials. The variability and reliability of spatiotemporal gait parameters and joint angles during the gait cycle were checked through the coefficient of variation (CV), the intraclass correlation coefficient (ICC) and the standard error of measurement (SEM). The intra- and inter-day variability of spatiotemporal and angular parameters was low for most variables, except for plantar flexion. The SEM was low for all parameters. Intra-day reliability was moderate to high for the spatiotemporal and angular parameters. Inter-day reliability was considered low to moderate for all parameters. The variables did not differ between instants and days. Experimental procedures demonstrate that the walking pattern did not change, but should be considered with caution in studies that include intervention, particularly for angular parameters during gait. Keywords: Tripping. Falls. Aging.
doi:10.4025/jphyseduc.v29i1.2906 fatcat:zgxp676adrhopb4wum4ozg7ite

khmer release v2.1: software for biological sequence analysis

Daniel Standage, Ali Aliyari, Lisa J. Cohen, Michael R. Crusoe, Tim Head, Luiz Irber, Shannon EK Joslin, N. B. Kingsley, Kevin D. Murray, Russell Neches, Camille Scott, Ryan Shean (+3 others)
2017 Journal of Open Source Software  
doi:10.21105/joss.00272 fatcat:dvsn2kh7rjgf7ejmar4doczwle

Meta-analysis of metagenomes via machine learning and assembly graphs reveals strain switches in Crohn's disease [article]

Taylor E. Reiter, Luiz Irber, Alicia A. Gingrich, Dylan Haynes, N. Tessa Pierce-Ward, Phillip T. Brooks, Yosuke Mizutani, Dominik Moritz, Felix Reidl, Amy D. Willis, Blair D. Sullivan, C. Titus Brown
2022 bioRxiv   pre-print
Microbial strains have closely related genomes but may have different phenotypes in the same environment. Shotgun metagenomic sequencing can capture the genomes of all strains present in a community but strain-resolved analysis from shotgun sequencing alone remains difficult. We developed an approach to identify and interrogate strain-level differences in groups of metagenomes. We use this approach to perform a meta-analysis of stool microbiomes from individuals with and without
inflammatory bowel disease (IBD; Crohn's disease, ulcerative colitis; n = 605), a disease for which there are not specific microbial biomarkers but some evidence that microbial strain variation may stratify by disease state. We first developed a machine learning classifier based on compressed representations of complete metagenomes (FracMinHash sketches) and identified genomes that correlate with IBD subtype. To rescue variation that may not have been present in the genomes, we then used assembly graph genome queries to recover strain variation for correlated genomes. Lastly, we developed a novel differential abundance framework that works directly on the assembly graph to uncover all sequence variants correlated with IBD. We refer to this approach as dominating set differential abundance analysis and have implemented it in the spacegraphcats software package. Using this approach, we identified five bacterial strains that are associated with Crohn's disease. Our method captures variation within the entire sequencing data set, allowing for discovery of previously hidden disease associations.
doi:10.1101/2022.06.30.498290 fatcat:cxhazztgsrdtjancwxtdubc24u

Climate Simulation and Change in the Brazilian Climate Model

Paulo Nobre, Leo S. P. Siqueira, Roberto A. F. de Almeida, Marta Malagutti, Emanuel Giarolla, Guilherme P. Castelão, Marcus J. Bottino, Paulo Kubota, Silvio N. Figueroa, Mabel C. Costa, Manoel Baptista, Luiz Irber (+1 others)
2013 Journal of Climate  
The response of the global climate system to atmospheric CO2 concentration increase in time is scrutinized employing the Brazilian Climate Model (BESM-OA2.3). Through the achievement of over two thousand years of coupled model integrations in ensemble mode, it is shown that the model simulates the signal of recent changes of global climate trends, depicting a steady atmospheric and oceanic temperature increase and corresponding marine ice retreat. The model simulations encompass the time period from 1960 to 2105, following the Coupled Model Intercomparison Project 5 (CMIP5) protocol.
doi:10.1175/jcli-d-12-00580.1 fatcat:2bmobauyw5cwbotrvb3bi3r3fy

The khmer software package: enabling efficient nucleotide sequence analysis

Michael R. Crusoe, Hussien F. Alameldin, Sherine Awad, Elmar Boucher, Adam Caldwell, Reed Cartwright, Amanda Charbonneau, Bede Constantinides, Greg Edvenson, Scott Fay, Jacob Fenton, Thomas Fenzl (+49 others)
2015 F1000Research  
A manuscript on this implementation is in progress (Irber and Brown, unpublished).  ... 
doi:10.12688/f1000research.6924.1 pmid:26535114 pmcid:PMC4608353 fatcat:oudtrf4aufeexic6rxmthh4pk4