
Group-based variant calling leveraging next-generation supercomputing for large-scale whole-genome sequencing studies

Kristopher A. Standish, Tristan M. Carland, Glenn K. Lockwood, Wayne Pfeiffer, Mahidhar Tatineni, C Chris Huang, Sarah Lamberth, Yauheniya Cherkas, Carrie Brodmerkel, Ed Jaeger, Lance Smith, Gunaretnam Rajagopal (+2 others)
2015 BMC Bioinformatics  
on 437 whole human genomes generated as part of large association study.  ...  Results: We describe our experience implementing and evaluating a group-based approach to calling variants on large numbers of whole human genomes.  ...  Whole-genome sequencing Whole blood samples were processed at the Beijing Genomic Institute (BGI) for DNA sequencing.  ... 
doi:10.1186/s12859-015-0736-4 pmid:26395405 pmcid:PMC4580299 fatcat:q4otwefczjg2tfi7su2holnstu

A hybrid computational strategy to address WGS variant analysis in >5000 samples

Zhuoyi Huang, Navin Rustagi, Narayanan Veeraraghavan, Andrew Carroll, Richard Gibbs, Eric Boerwinkle, Manjunath Gorentla Venkata, Fuli Yu
2016 BMC Bioinformatics  
The decreasing costs of sequencing are driving the need for cost effective and real time variant calling of whole genome sequencing data.  ...  Other infrastructures like the cloud AWS environment and supercomputers also have limitations due to which large scale joint variant calling becomes infeasible, and infrastructure specific variant calling  ...  We would like to thank Dr Marek Kimmel and Rice Supercomputing Research Center for help with access to the Blue BioU supercomputing facility.  ... 
doi:10.1186/s12859-016-1211-6 pmid:27612449 pmcid:PMC5018196 fatcat:ihflzgl4rrbabc7qrx2w3eeyge

Cyberinfrastructure resources enabling creation of the loblolly pine reference transcriptome

Le-Shin Wu, Carrie L. Ganote, Thomas G. Doak, William Barnett, Keithanne Mockaitis, Craig A. Stewart
2015 Proceedings of the 2015 XSEDE Conference on Scientific Advancements Enabled by Enhanced Cyberinfrastructure - XSEDE '15  
, with and without whole genome sequence references.  ...  The Mason cluster, an XSEDE second tier resource at Indiana University, provides the necessary fast CPU cycles, large memory, and high I/O throughput for conducting large-scale genomics research.  ...  This large memory profile is particularly suitable for assembly of data from next-generation sequencers, large-scale phylogenetic software, or other genome analysis applications that require large amounts  ... 
doi:10.1145/2792745.2792748 dblp:conf/xsede/WuGDBMS15 fatcat:oz2hnvquifahblc6ykf4wjcppa

Experiences building Globus Genomics: a next-generation sequencing analysis service using Galaxy, Globus, and Amazon Web Services

Ravi K. Madduri, Dinanath Sulakhe, Lukasz Lacinski, Bo Liu, Alex Rodriguez, Kyle Chard, Utpal J. Dave, Ian T. Foster
2014 Concurrency and Computation  
We describe Globus Genomics, a system that we have developed for rapid analysis of large quantities of next-generation sequencing (NGS) genomic data.  ...  The system allows biomedical researchers to perform rapid analysis of large NGS datasets in a fully automated manner, without software installation or a need for any local computing infrastructure.  ...  ., for an award of Amazon Web Services time that facilitated early experiments. We thank Globus Genomics users for their invaluable contributions.  ... 
doi:10.1002/cpe.3274 pmid:25342933 pmcid:PMC4203657 fatcat:glcie6spdzdllakjibkyniqo5y

Large-Scale Uniform Analysis of Cancer Whole Genomes in Multiple Computing Environments [article]

Christina K. Yung, Brian D. O'Connor, Sergei Yakneen, Junjun Zhang, Kyle Ellrott, Kortine Kleinheinz, Naoki Miyoshi, Keiran M. Raine, Romina Royo, Gordon B. Saksena, Matthias Schlesner, Solomon I. Shorser (+14 others)
2017 bioRxiv   pre-print
To provide this dataset to the research working groups for downstream analysis, the PCAWG Technical Working Group marshalled ~800TB of sequencing data from distributed geographical locations; developed  ...  portable software for uniform alignment, variant calling, artifact filtering and variant merging; performed the analysis in a geographically and technologically disparate collection of compute environments  ...  Acknowledgements The authors would like to acknowledge the donation of the following compute resources: the PRACE Research Infrastructure resource MareNostrum3 at Barcelona Supercomputing Center  ... 
doi:10.1101/161638 fatcat:lgpfe77jmbf6bnljteztvqla5i

ELIXIR-IT HPC@CINECA: high performance computing resources for the bioinformatics community

Tiziana Castrignanò, Silvia Gioiosa, Tiziano Flati, Mirko Cestari, Ernesto Picardi, Matteo Chiara, Maddalena Fratelli, Stefano Amente, Marco Cirilli, Marco Antonio Tangaro, Giovanni Chillemi, Graziano Pesole (+1 others)
2020 BMC Bioinformatics  
The advent of Next Generation Sequencing (NGS) technologies and the concomitant reduction in sequencing costs allows unprecedented high throughput profiling of biological systems in a cost-efficient manner  ...  Starting from April 2016, CINECA and ELIXIR-IT launched the pilot Call "ELIXIR-IT HPC@CINECA", offering streamlined access to HPC resources for bioinformatics.  ...  Acknowledgements We would like to thank all HPC@Cineca users for their wonderful feedback.  ... 
doi:10.1186/s12859-020-03565-8 pmid:32838759 fatcat:dcoic2ot4zhatjj34jeunp5ecq

Some experiences and opportunities for big data in translational research

Christopher G. Chute, Mollie Ullman-Cullere, Grant M. Wood, Simon M. Lin, Min He, Jyotishman Pathak
2013 Genetics in Medicine  
ACKNOWLEDGMENTS We are grateful for the grant support in part from the National Human Genome Research Institute as eMERGE consortium members, specifically U01-HG06379 (Mayo Clinic) and U01-HG006389 (Marshfield  ...  NGS technologies can quickly generate the sequence of a whole genome or can be more targeted using an approach called exome sequencing.  ...  next-generation sequencing (NGS) methods.  ... 
doi:10.1038/gim.2013.121 pmid:24008998 pmcid:PMC3906918 fatcat:ayzsjqtp35fupfwazpwbnwuisa

Genomics and data science: an application within an umbrella

Fábio C. P. Navarro, Hussein Mohsen, Chengfei Yan, Shantao Li, Mengting Gu, William Meyerson, Mark Gerstein
2019 Genome Biology  
Finally, we discuss how data value, privacy, and ownership are pressing issues for data science applications, in general, and are especially relevant to genomics, due to the persistent nature of DNA.  ...  Data science allows the extraction of practical insights from large-scale data. Here, we contextualize it as an umbrella term, encompassing several disparate subdomains.  ...  Dashed lines indicate projections of future growth in data volume and infrastructure capacity for the next decade. b Cumulative number of datasets being generated for whole genome sequencing (WGS) and  ... 
doi:10.1186/s13059-019-1724-1 pmid:31142351 pmcid:PMC6540394 fatcat:do2a6yrfkndg3eua2xu26llxx4

TCGA Expedition: A Data Acquisition and Management System for TCGA Data

Uma R. Chandran, Olga P. Medvedeva, M. Michael Barmada, Philip D. Blood, Anish Chakka, Soumya Luthra, Antonio Ferreira, Kim F. Wong, Adrian V. Lee, Zhihui Zhang, Robert Budden, J. Ray Scott (+4 others)
2016 PLoS ONE  
TCGA data are currently over 1.2 Petabyte in size and include whole genome sequence (WGS), whole exome sequence, methylation, RNA expression, proteomic, and clinical datasets.  ...  The Cancer Genome Atlas Project (TCGA) is a National Cancer Institute effort to profile at least 500 cases of 20 different tumor types using genomic platforms and to make these data, both raw and processed  ...  Security for their help in ensuring that we meet the dbGAP security best practices, and Brian Stengel (CSSD) for his assistance with Science DMZ integration.  ... 
doi:10.1371/journal.pone.0165395 pmid:27788220 pmcid:PMC5082933 fatcat:7nl57f5arratdlm6bmx7tk2qu4

Managing genomic variant calling workflows with Swift/T

Azza E. Ahmed, Jacob Heldenbrand, Yan Asmann, Faisal M. Fadlelmola, Daniel S. Katz, Katherine Kendig, Matthew C. Kendzior, Tiffany Li, Yingxue Ren, Elliott Rodriguez, Matthew R. Weber, Justin M. Wozniak (+3 others)
2019 PLoS ONE  
Additionally, we formalized a set of design criteria for quality, robust, maintainable workflows that must function at-scale in a production setting, such as a large genomic sequencing facility or a major  ...  The code for our implementation of a variant calling workflow using Swift/T can be found on GitHub at https://github.com/ncsa/Swift-T-Variant-Calling, with full documentation provided at http://swift-t-variant-calling.readthedocs.io  ...  Supercomputing Applications. LSM was awarded an allocation on the Blue Waters supercomputer, which was used for some of the computational tests.  ... 
doi:10.1371/journal.pone.0211608 pmid:31287816 pmcid:PMC6615596 fatcat:rr25otjjrjbl3bnz4i7uh5mdsi

Next generation sequencing in clinical medicine: Challenges and lessons for pathology and biomedical informatics

Michael J. Becich, Lucas Santana-Santos, Rama R. Gullapalli, Ketaki V. Desai, Jeffrey A. Kant
2012 Journal of Pathology Informatics  
Routine whole exome or even whole genome sequencing of clinical patients is well within the realm of affordability for many academic institutions across the country.  ...  Today, Next Generation Sequencing (NGS) techniques represent the next phase in the evolution of DNA sequencing technology at dramatically reduced cost compared to traditional Sanger sequencing.  ...  ACKNOWLEDGEMENTS The authors would like to acknowledge the following individuals of the Pittsburgh NGS task force for their role in the planning and execution of the 25  ... 
doi:10.4103/2153-3539.103013 pmid:23248761 pmcid:PMC3519097 fatcat:qoxdhbrhtzbgvhxgpjl5jz2bym

The role of High Performance Computing in Bioinformatics

Horacio Emilio Pérez Sánchez, José M. Cecilia, Ivan Merelli
2014 International Work-Conference on Bioinformatics and Biomedical Engineering  
The consolidation of heterogeneous systems at different levels (from desktop computers to large-scale systems such as supercomputers, clusters or grids, through all kinds of low-power devices) is providing  ...  This introductory article shows the last tendencies of this active research field and our perspectives for the forthcoming years.  ...  We also thank NVIDIA for hardware donation under CUDA Teaching Center 2014  ... 
dblp:conf/iwbbio/SanchezCM14 fatcat:kfg7ta2dcjd5tegsbqhu3ygq2a

WGS Analysis and Interpretation in Clinical and Public Health Microbiology Laboratories: What Are the Requirements and How Do Existing Tools Compare?

Kelly Wyres, Thomas Conway, Saurabh Garg, Carlos Queiroz, Matthias Reumann, Kathryn Holt, Laura Rusu
2014 Pathogens  
Here we consider the requirements of microbiology laboratories for the analysis, clinical interpretation and management of bacterial whole-genome sequence (WGS) data.  ...  Recent advances in DNA sequencing technologies have the potential to transform the field of clinical and public health microbiology, and in the last few years numerous case studies have demonstrated successful  ...  analyses from first generation but not next-generation sequence data; d The Center for Genomic Epidemiology snpTree tool deals only with variant call information and thus multiple sequence alignment sensu  ... 
doi:10.3390/pathogens3020437 pmid:25437808 pmcid:PMC4243455 fatcat:qwvy7jgbvnd65nrbfholtyfolu

Integrated genome sizing (IGS) approach for the parallelization of whole genome analysis

Peter Sona, Jong Hui Hong, Sunho Lee, Byong Joon Kim, Woon-Young Hong, Jongcheol Jung, Han-Na Kim, Hyung-Lae Kim, David Christopher, Laurent Herviou, Young Hwan Im, Kwee-Yum Lee (+2 others)
2018 BMC Bioinformatics  
However, storing raw sequence reads to perform large-scale genome analysis poses hardware challenges.  ...  In this study, an Integrated Genome Sizing (IGS) approach is adopted to speed up multiple whole genome analysis in a high-performance computing (HPC) environment.  ...  Availability of data and materials The low coverage sequence alignment BAM formatted mapped datasets generated and analyzed during the current study are available on the web link ftp://ftp-trace.ncbi.nih.gov  ... 
doi:10.1186/s12859-018-2499-1 fatcat:o5irzxpbtjb7netlt6yanfkza4

Accelerating K-mer Frequency Counting with GPU and Non-Volatile Memory

Nicola Cadenelli, Jorda Polo, David Carrera
2017 2017 IEEE 19th International Conference on High Performance Computing and Communications; IEEE 15th International Conference on Smart City; IEEE 3rd International Conference on Data Science and Systems (HPCC/SmartCity/DSS)  
The emergence of Next Generation Sequencing (NGS) platforms has increased the throughput of genomic sequencing and in turn the amount of data that needs to be processed, requiring highly efficient computation  ...  This paper presents a redesign of the main component of a state-of-the-art reference-free method for variant calling, SMUFIN, which has been adapted to make the most of GPUs and NVM devices.  ...  We are also grateful to SanDisk for lending the FusionIO cards and to Nvidia who donated the Tesla K40c.  ... 
doi:10.1109/hpcc-smartcity-dss.2017.57 dblp:conf/hpcc/CadenelliPC17 fatcat:u6wz5qrrjncwhp6hu4wrvn4yhe

Showing results 1 to 15 of 414.