Large-scale sequence comparisons with sourmash

N. Tessa Pierce, Luiz Irber, Taylor Reiter, Phillip Brooks, C. Titus Brown
2019 F1000Research  
The sourmash software package uses MinHash-based sketching to create "signatures", compressed representations of DNA, RNA, and protein sequences, that can be stored, searched, explored, and taxonomically  ...  annotated. sourmash signatures can be used to estimate sequence similarity between very large data sets quickly and in low memory, and can be used to search large databases of genomes for matches to query  ...  Here we present version 2.0 of sourmash [9], a Python library for building and utilizing MinHash sketches of DNA, RNA, and protein data. sourmash incorporates and extends standard MinHash techniques for  ... 
doi:10.12688/f1000research.19675.1 pmid:31508216 pmcid:PMC6720031 fatcat:mpzjwte2djf5bhyrp5u45evqzq

Large-scale sequence comparisons with sourmash [article]

N Tessa Pierce, Luiz Irber, Taylor Reiter, Phillip Brooks, C. Titus Brown
2019 bioRxiv   pre-print
The sourmash software package uses MinHash-based sketching to create "signatures", compressed representations of DNA, RNA, and protein sequences, that can be stored, searched, explored, and taxonomically  ...  annotated. sourmash signatures can be used to estimate sequence similarity between very large data sets quickly and in low memory, and can be used to search large databases of genomes for matches to query  ...  Here we present version 2.0 of sourmash (9), a Python library for building and utilizing MinHash sketches of DNA, RNA, and protein data. sourmash incorporates and extends standard MinHash techniques  ... 
doi:10.1101/687285 fatcat:7sanndl4z5fpdirc5os7hpvdbu

Lightweight compositional analysis of metagenomes with FracMinHash and minimum metagenome covers [article]

Luiz Carlos Irber, Phillip T Brooks, Taylor E Reiter, N Tessa Pierce-Ward, Mahmudur Rahman Hera, David Koslicki, C. Titus Brown
2022 bioRxiv   pre-print
We implement a greedy approximate solution using FracMinHash sketches, and evaluate its accuracy for taxonomic assignment using a CAMI community benchmark.  ...  We first investigate the FracMinHash sketching technique, a derivative of modulo hash that supports Jaccard containment estimation between sets of different sizes.  ...  FracMinHash sketches were created for DNA sequence inputs using the sourmash sketch dna command with the scaled parameter.  ... 
doi:10.1101/2022.01.11.475838 fatcat:qr2hee27hnbwdme5b7ch2wrody

Genomic characterization of a diazotrophic microbiota associated with maize aerial root mucilage

Shawn M. Higdon, Tania Pozzo, Nguyet Kong, Bihua C. Huang, Mai Lee Yang, Richard Jeannotte, C. Titus Brown, Alan B. Bennett, Bart C. Weimer, Jen-Tsung Chen
2020 PLoS ONE  
develops an extensive network of mucilage-secreting aerial roots that harbors a diazotrophic (N2-fixing) microbiota.  ...  We examined each diazotroph genome for the presence of nif genes essential to nitrogen fixation (nifHDKENB) and carbohydrate utilization genes relevant to the mucilage polysaccharide digestion.  ...  All by all comparison of MinHash sketches of draft genome assemblies from 588 bacterial isolates using Sourmash [19].  ... 
doi:10.1371/journal.pone.0239677 pmid:32986754 fatcat:3g6qlowvvve25bia4k4qfxm2m4

Streaming histogram sketching for rapid microbiome analytics

Will PM Rowe, Anna Paola Carrieri, Cristina Alcon-Giner, Shabhonam Caim, Alex Shaw, Kathleen Sim, J. Simon Kroll, Lindsay J. Hall, Edward O. Pyzer-Knapp, Martyn D. Winn
2019 Microbiome  
To address this need, we propose a new method for the compact representation of microbiome sequencing data using similarity-preserving sketches of streaming k-mer spectra.  ...  can process huge amounts of data in a short amount of time.  ...  This work was funded via a Wellcome Trust Investigator Award to LJH (100/974/C/13/Z), and support of the BBSRC Norwich Research Park Bioscience Doctoral Training Grant (BB/M011216/1, supervisor LJH, student  ... 
doi:10.1186/s40168-019-0653-2 pmid:30878035 pmcid:PMC6420756 fatcat:dmumukc3ljcyhfx242zrb3fme4

Genomic characterization of a diazotrophic microbiota associated with maize aerial root mucilage [article]

Shawn M Higdon, Tania Pozzo, Nguyet Kong, Bihua C Huang, Mai Lee Yang, Richard Jeannotte, C Titus Brown, Alan B Bennett, Bart C Weimer
2020 biorxiv/medrxiv   pre-print
develops an extensive network of mucilage-secreting aerial roots that harbors a diazotrophic microbiota.  ...  We examined each diazotroph genome for the presence of nif genes essential to nitrogen fixation (nifHDKENB) and carbohydrate utilization genes relevant to the mucilage polysaccharide digestion.  ...  All by all comparison of MinHash sketches of draft genome assemblies from 588 bacterial isolates using Sourmash [19].  ... 
doi:10.1101/2020.04.27.064337 fatcat:35vldsk7tfdepcpuenuncsxeby

To Petabytes and beyond: recent advances in probabilistic and signal processing algorithms and their application to metagenomics

R A Leo Elworth, Qi Wang, Pavan K Kota, C J Barberan, Benjamin Coleman, Advait Balaji, Gaurav Gupta, Richard G Baraniuk, Anshumali Shrivastava, Todd J Treangen
2020 Nucleic Acids Research  
For instance, sketching algorithms such as MinHash have seen a rapid and widespread adoption.  ...  As computational biologists continue to be inundated by ever increasing amounts of metagenomic data, the need for data analysis approaches that keep up with the pace of sequence archives has remained a  ...  implied, of the ODNI, IARPA, ARO or the US Government.  ... 
doi:10.1093/nar/gkaa265 pmid:32338745 fatcat:v4julw322nc47kgb7q7urkkkci

Alignment-free microbiome-based classification of fresh produce safety and quality [article]

Chao Liao, Luxin Wang, Gerald Quon
2022 bioRxiv   pre-print
(ASV) strategy that uses a typical denoising step.  ...  Here, we explored an alignment-free analysis strategy using k-mer hashes to identify DNA signatures predictive of produce safety and produce quality, and compared it against the amplicon sequence variant  ...  Acknowledgements This work was partially funded through a UC Davis CeDAR Innovative Data Science Seed Funding Program Grant. G.Q. was supported by NSF CAREER award 1846559.  ... 
doi:10.1101/2022.08.25.505309 fatcat:kn5bwng5hfhehkbaowpbsyc724

Dashing: fast and accurate genomic distances with HyperLogLog

Daniel N. Baker, Ben Langmead
2019 Genome Biology  
Dashing summarizes genomes more rapidly than previous MinHash-based methods while providing greater accuracy across a wide range of input sizes and sketch sizes.  ...  Dashing is a fast and accurate software tool for estimating similarities of genomes or sequencing datasets.  ...  Acknowledgements We thank Florian Breitwieser for HLL implementation discussions and Nikita Ivkin for insights with regard to sketch data structure theory and implementation.  ... 
doi:10.1186/s13059-019-1875-0 pmid:31801633 pmcid:PMC6892282 fatcat:ujldsdngora6hkkyclds5xibmq

Single cell genome sequencing of laboratory mouse microbiota improves taxonomic and functional resolution of this model microbial community [article]

Svetlana Lyalina, Ramunas Stepanauskas, Frank Wu, Shomyseh Sanjabi, Katherine S Pollard
2021 bioRxiv   pre-print
However, much of the taxonomic and functional diversity of the mouse gut microbiome is missed in current metagenomic studies, because genome databases have not achieved a balanced representation of the  ...  From these, we generated 298 high-coverage microbial genome assemblies, which we annotated for open reading frames and phylogenetic placement.  ...  Acknowledgments We thank the staff of the Bigelow Laboratory Single Cell Genomics Center for the generation of single cell genomics data  ... 
doi:10.1101/2021.12.13.472402 fatcat:tz6wsvzsqfbhhaw2xyipimvmm4

Streaming histogram sketching for rapid microbiome analytics [article]

Will PM Rowe, Anna Paola Carrieri, Cristina Alcon-Giner, Shabhonam Caim, Alex Shaw, Kathleen Sim, J Simon Kroll, Lindsay Hall, Edward O Pyzer-Knapp, Martyn D Winn
2018 bioRxiv   pre-print
To address this need, we propose a new method for the compact representation of microbiome sequencing data using similarity-preserving sketches of streaming k-mer spectra.  ...  These sketches allow for dissimilarity estimation, rapid microbiome catalogue searching, and classification of microbiome samples in near real-time.  ...  Acknowledgements Availability and implementation The source code for our implementation, as well as the code used to run the analyses and plot the manuscript figures, can be found in the HULK ( github.com  ... 
doi:10.1101/408070 fatcat:f7glkk7uabbflgtwouv7f3oc5m

Short- and long-read metagenomics of South African gut microbiomes reveal a transitional composition and novel taxa [article]

Fiona B Tamburini, Dylan Maghini, Ovokeraye H Oduaran, Ryan Brewster, Michaella R Hulley, Venesa Sahibdeen, Shane A Norris, Stephen Tollman, Kathleen Kahn, Ryan Wagner, Alisha N Wade, Floidy Wafawanaka (+5 others)
2020 bioRxiv   pre-print
Furthermore, we demonstrate that current reference collections are incomplete for nonwestern microbiomes and as a result, patterns of within-cohort beta diversity are reversed compared to the ground truth  ...  Yet, the majority of the world's population resides along a continuum between these two extremes.  ...  We thank Karen Andrade for her contributions in planning the 2019 Community Advisory Group workshop. We thank the INDEPTH consortium for their support of this project.  ... 
doi:10.1101/2020.05.18.099820 fatcat:yhkrb32vdverlf7topqsawetji

Transcriptomics provides a genetic signature of vineyard site with insight into vintage-independent regional wine characteristics [article]

Taylor Reiter, Rachel Montpetit, Shelby Byer, Isadora Frias, Esmeralda Leon, Robert Viano, Michael Mcloughlin, Thomas Halligan, Desmond Hernandez, Rosa Figueroa-Balderas, Dario Cantu, Kerri Steenwerth (+2 others)
2021 bioRxiv   pre-print
Ribosomal DNA amplicon sequencing of grape musts has demonstrated that microorganisms occur non-randomly and are associated with the vineyard of origin, suggesting a role for the vineyard, grape, and wine  ...  We used ribosomal DNA amplicon sequencing of grape must and RNA sequencing of primary fermentations to profile fermentations from 15 vineyards in California and Oregon across two vintages.  ...  Brown CT, Irber L. 2016. sourmash: a library for MinHash sketching of DNA. Hoff KJ, Stanke M. 2013. WebAUGUSTUS-a web service for training AUGUSTUS and predicting genes in eukaryotes.  ... 
doi:10.1101/2021.01.07.425830 fatcat:y5vc6xle4jhdrlrz6qe3upt3oi

Kssd: sequence dimensionality reduction by k-mer substring space sampling enables real-time large-scale datasets analysis

Huiguang Yi, Yanling Lin, Chengqi Lin, Wenfei Jin
2021 Genome Biology  
Here, we develop k-mer substring space decomposition (Kssd), a sketching technique which is significantly faster and more accurate than current sketching methods.  ...  Using Kssd, we prioritize references for all 1,019,179 bacteria whole genome sequencing (WGS) runs from NCBI Sequence Read Archive and find misidentification or contamination in 6164 of these.  ...  Acknowledgements We thank three anonymous referees for their constructive comments for this paper. Review history The review history is available as Additional file 5.  ... 
doi:10.1186/s13059-021-02303-4 pmid:33726811 pmcid:PMC7962209 fatcat:ylta5ntqqjflno5wen5r22665u

Integrating Culture-based Antibiotic Resistance Profiles with Whole-genome Sequencing Data for 11,087 Clinical Isolates

Valentina Galata, Cédric C. Laczny, Christina Backes, Georg Hemmrich-Stanisak, Susanne Schmolke, Andre Franke, Eckart Meese, Mathias Herrmann, Lutz von Müller, Achim Plum, Rolf Müller, Cord Stähler (+2 others)
2019 Genomics, Proteomics & Bioinformatics  
The isolate collection and the analysis results have been integrated into GEAR-base, a resource available for academic research use free of charge at https://gear-base.com.  ...  isolates including 18 main species spanning a time period of 30 years.  ...  We would like to thank Siemens Healthcare and the Curetis Group for their support and for the datasets provided. We are grateful to Laura Smoot, Andrea L. Mrotz, Khoa D. Nguyen, Michael A.  ... 
doi:10.1016/j.gpb.2018.11.002 pmid:31100356 pmcid:PMC6624217 fatcat:6cdvk4vrwjd6nonsav6t3awabe