A scalable data analysis platform for metagenomics

Wei Tang, Jared Wilkening, Narayan Desai, Wolfgang Gerlach, Andreas Wilke, Folker Meyer
2013 2013 IEEE International Conference on Big Data  
Shock and AWE can be used to build a scalable and reproducible data analysis infrastructure for upper-level biological data analysis services.  ...  To address the computational challenges posed by this workload, we developed a new data analysis platform, including a data management system (Shock) for biological sequence data and a workflow management  ...  As a product, we have developed the Shock data management and AWE workflow management system which can be used to build a scalable data analysis platform.  ... 
doi:10.1109/bigdata.2013.6691723 dblp:conf/bigdataconf/TangWDGWM13 fatcat:n5w2eavmtraehhzkx35tydliai

NanoSPC: a scalable, portable, cloud compatible viral nanopore metagenomic data processing pipeline

Yifei Xu, Fan Yang-Turner, Denis Volk, Derrick Crook
2020 Nucleic Acids Research  
Here we introduce NanoSPC, a scalable, portable and cloud compatible pipeline for analyzing Nanopore sequencing data.  ...  Moreover, we deploy NanoSPC to our scalable pathogen pipeline platform, enabling elastic computing for high throughput Nanopore data on HPC cluster as well as multiple cloud platforms, such as Google Cloud  ...  CONCLUSION We report NanoSPC, a scalable, portable, and cloud compatible pipeline for analyzing metagenomic sequencing data generated using ONT.  ... 
doi:10.1093/nar/gkaa413 pmid:32442274 fatcat:6dtk4z4yqjdtjnocm3xnjozsia

META-pipe - Pipeline Annotation, Analysis and Visualization of Marine Metagenomic Sequence Data [article]

Espen Mikal Robertsen, Tim Kahlke, Inge Alexander Raknes, Edvard Pedersen, Erik Kjærner Semb, Martin Ernstsen, Lars Ailo Bongo, Nils Peder Willassen
2016 arXiv   pre-print
We provide a new pipeline, META-pipe, for marine metagenomics analysis. It offers pre-processing, assembly, taxonomic classification and functional analysis.  ...  We have evaluated the scalability and performance of the analysis pipeline.  ...  There is therefore a need to develop a scalable pipeline for the marine metagenomics field.  ... 
arXiv:1604.04103v1 fatcat:i5qw7db6jfgohpbtq3t4oymmca

Web Resources for Metagenomics Studies

Pravin Dudhagara, Sunil Bhavsar, Chintan Bhagat, Anjana Ghelani, Shreyas Bhatt, Rajesh Patel
2015 Genomics, Proteomics & Bioinformatics  
The development of next-generation sequencing (NGS) platforms spawned an enormous volume of data. This explosion in data has unearthed new scalability challenges for existing bioinformatics tools.  ...  In this article, we review several commonly-used online tools for metagenomics data analysis with respect to their quality and detail of analysis using simulated metagenomics data.  ...  Acknowledgments The authors are thankful to BIT Virtual Centre, Patan Node, Hemchandracharya North Gujarat University, India for providing servers and computational facilities.  ... 
doi:10.1016/j.gpb.2015.10.003 pmid:26602607 pmcid:PMC4678780 fatcat:uldztj42fbdqvbq55u3poyigeq

Workload characterization for MG-RAST metagenomic data analytics service in the cloud

Wei Tang, Jared Bischof, Narayan Desai, Kanak Mahadik, Wolfgang Gerlach, Travis Harrison, Andreas Wilke, Folker Meyer
2014 2014 IEEE International Conference on Big Data (Big Data)  
For example, MG-RAST, a production open-public metagenome annotation service, has experienced an increasingly large amount of data submission and has demanded scalable resources for the computational needs  ...  The consequent data deluge has imposed big burdens for data analysis applications.  ...  ACKNOWLEDGMENTS This work was supported in part by the NIH award U01HG006537 "OSDF: Support infrastructure for NextGen sequence storage, analysis, and management", and U.S.  ... 
doi:10.1109/bigdata.2014.7004394 dblp:conf/bigdataconf/TangBDMGHWM14 fatcat:4qcliocqhbfyxam2ch26dmst2u

A Review of Scalable Bioinformatics Pipelines

Bjørn Fjukstad, Lars Ailo Bongo
2017 Data Science and Engineering  
Scalability is increasingly important for bioinformatics analysis services, since these must handle larger datasets, more jobs, and more users.  ...  We also discuss current trends for bioinformatics pipeline development.  ...  GESALL Variant Calling Pipeline GESALL [21] is a genomic analysis platform for unmodified analysis tools that use the POSIX file system interface.  ... 
doi:10.1007/s41019-017-0047-z fatcat:7wyzccy7ffhjdd46pfmljrzioy

Flexible metagenome analysis using the MGX framework

Sebastian Jaenicke, Stefan P. Albaum, Patrick Blumenkamp, Burkhard Linke, Jens Stoye, Alexander Goesmann
2018 Microbiome  
Conclusions: With MGX, we provide a novel metagenome analysis platform giving researchers access to the most recent analysis tools.  ...  Results: We present MGX, a flexible and extensible client/server-framework for the management and analysis of metagenomic datasets; MGX features a comprehensive set of adaptable workflows required for  ...  Acknowledgements Funding for the operation and maintenance of MGX is provided by the German Federal Ministry of Education and Research (BMBF) project "Bielefeld-Gießen Center for Microbial Bioinformatics-BiGi  ... 
doi:10.1186/s40168-018-0460-1 pmid:29690922 pmcid:PMC5937802 fatcat:a5sd3o6lpbbcjdioaoo6jar2zq

Metagenome2Vec: Building Contextualized Representations for Scalable Metagenome Analysis [article]

Sathyanarayanan N. Aakur, Vineela Indla, Vennela Indla, Sai Narayanan, Arunkumar Bagavathi, Vishalini Laguduva Ramnath, Akhilesh Ramachandran
2021 arXiv   pre-print
Given the high volume of metagenome sequences, there is a need for scalable frameworks to analyze and segment metagenome sequences from clinical samples, which can be highly imbalanced.  ...  There is an increased need for learning robust representations from metagenome reads since pathogens within a family can have highly similar genome structures (some more than 90%) and hence enable the  ...  Andres Espindola (Institute of Biosecurity and Microbial Forensics, Oklahoma State University) for providing access and assisting with use of the MiFi platform.  ... 
arXiv:2111.08001v1 fatcat:4cj5uwdumveynpyivhnt53gi3q

METAREP: JCVI metagenomics reports—an open source tool for high-performance comparative metagenomics

Johannes Goll, Douglas B. Rusch, David M. Tanenbaum, Mathangi Thiagarajan, Kelvin Li, Barbara A. Methé, Shibu Yooseph
2010 Computer applications in the biosciences : CABIOS  
A data management layer allows collaborative data analysis and result sharing. Availability: Website  ...  JCVI Metagenomics Reports (METAREP) is a Web 2.0 application designed to help scientists analyze and compare annotated metagenomics data sets.  ...  WEB ANALYSIS FEATURES The METAREP View pages provide high level summaries for a dataset (Fig 1A). Each tab provides a ranked list and bar chart for the respective data type.  ... 
doi:10.1093/bioinformatics/btq455 pmid:20798169 pmcid:PMC2951084 fatcat:yrcth3x7rbeuzdifk6tmdicyzm

SparkBLAST: scalable BLAST processing using in-memory operations

Marcelo Rodrigo de Castro, Catherine dos Santos Tostes, Alberto M. R. Dávila, Hermes Senger, Fabricio A. B. da Silva
2017 BMC Bioinformatics  
The demand for processing ever increasing amounts of genomic data has raised new challenges for the implementation of highly scalable and efficient computational systems.  ...  As a proof of concept, some radionuclide-resistant bacterial genomes were selected for similarity analysis.  ...  Acknowledgements The authors would like to thank Thais Martins for help with input data preparation, and Rodrigo Jardim for data preparation and preliminary RBH analysis.  ... 
doi:10.1186/s12859-017-1723-8 pmid:28655296 pmcid:PMC5488373 fatcat:gn5gxf5oxvdgdiri6xhquoe4my

Analysis of Metagenomics Data [chapter]

Elizabeth M. Glass, Folker Meyer
2011 Bioinformatics for High Throughput Sequencing  
This service has removed one of the primary bottlenecks in metagenome sequence analysis, the availability of high-performance computing for annotating data.  ...  In MG-RAST, all users retain full control of their data, and everything is available for download in a variety of formats.  ...  The MG-RAST server is the most widely used tool for the analysis of shotgun metagenomics and provides a basis for sequence analysis of large, complex data sets.  ... 
doi:10.1007/978-1-4614-0782-9_13 fatcat:62me5g4mbre2pjgpeh3zko5aa4

BugSeq: a highly accurate cloud platform for long-read metagenomic analyses [article]

Jeremy Fan, Steven Huang, Samuel D Chorlton
2020 bioRxiv   pre-print
As the use of nanopore sequencing for metagenomic analysis increases, tools capable of performing long-read taxonomic classification in a fast and accurate manner are needed.  ...  Results: We present BugSeq, a novel, highly accurate metagenomic classifier for nanopore reads.  ...  Resulting data, including quality control and metagenomic classification, was packaged into two HTML files (Supplementary Material), and showed superior accuracy compared with the original WIMP analysis  ... 
doi:10.1101/2020.10.08.329920 fatcat:vs3tkdimwrfgrdunb5t2mkrale

MetaGeniE: Characterizing Human Clinical Samples Using Deep Metagenomic Sequencing

Arun Rawat, David M. Engelthaler, Elizabeth M. Driebe, Paul Keim, Jeffrey T. Foster, Patrick Tang
2014 PLoS ONE  
Among the primary challenges of clinical metagenomic sequencing is the rapid filtering of human reads to survey for pathogens with high specificity and sensitivity.  ...  This variation in metagenomes typically manifests in sequencing datasets as low pathogen abundance, a high number of host reads, and the presence of close relatives and complex microbial communities.  ...  Analyzed the data: AR. Contributed reagents/materials/analysis tools: EMD PK JTF AR. Wrote the paper: AR DME JTF.  ... 
doi:10.1371/journal.pone.0110915 pmid:25365329 pmcid:PMC4218713 fatcat:n76wfhkklrfnpcyj6fvoal73ze

SpaRC: Scalable Sequence Clustering using Apache Spark [article]

Lizhen Shi, Xiandong Meng, Elizabeth Tseng, Michael Mascagni, Zhong Wang
2018 bioRxiv   pre-print
with rapid development/deployment cycles for similar large scale sequence data analysis problems.  ...  It achieved a near linear scalability with respect to input data size and number of compute nodes.  ...  For data scalability test we use 20GB, 40GB, 60GB, 80GB, and 100GB fastq datasets from Cow Rumen metagenome.  ... 
doi:10.1101/246496 fatcat:3qdr2rvcmzcbxiqrjjlm3lh5ry

Towards Solving the Metagenomics Reproducibility Crisis with CWL and RO

Folker Meyer
2018 Zenodo  
Both metagenome data and computation with metagenomes are expensive [Thomas]; significant degrees of freedom exist for the computational analysis, underscoring the need for reproducibility in the field  ...  The MG-RAST portal [Meyer] and its European sister project MGnify [Mitchell] at the European Bioinformatics Institute (EMBL-EBI) provide metagenome analysis services to a large, international community  ...  Microbiome puzzle challenge Distributed containerized workflows Skyport: a scalable platform for distributed data centric reproducible computing • Multiple backends ("cloud", bare-metal, legacy  ... 
doi:10.5281/zenodo.1484480 fatcat:zillrhyup5hsbc7vobrjmlz3ai