Efficient Parallel and Out of Core Algorithms for Constructing Large Bi-directed de Bruijn Graphs [article]

Vamsi Kundeti, Sanguthevar Rajasekaran, Hieu Dinh
2010 arXiv   pre-print
Efficient algorithms for building these massive de Bruijn graphs are essential in large sequencing projects based on short reads. In Jackson et al.  ...  We also provide efficient algorithms for the bi-directed chain compaction problem.  ...  This work has been supported in part by the following grants: NSF 0326155, NSF 0829916 and NIH 1R01GM079689-01A1.  ... 
arXiv:1003.1940v1 fatcat:tbq37yuvxfd4jh2bx2aim4xpla

Efficient parallel and out of core algorithms for constructing large bi-directed de Bruijn graphs

Vamsi K Kundeti, Sanguthevar Rajasekaran, Hieu Dinh, Matthew Vaughn, Vishal Thapar
2010 BMC Bioinformatics  
Our algorithms for constructing Bi-directed de Bruijn graphs are efficient in parallel and out of core settings. These algorithms can be used in building large scale bi-directed de Bruijn graphs.  ...  Finally our out-of-core algorithm is extremely memory efficient and can replace the existing graph construction algorithm in VELVET.  ...  Our algorithm is optimal in the sequential, parallel, and out-of-core models. A recent work by Jackson and Aluru [9] yielded parallel algorithms to build these de Bruijn graphs efficiently.  ... 
doi:10.1186/1471-2105-11-560 pmid:21078174 pmcid:PMC2996408 fatcat:3j3u2vh4pfgilhy7l234wpsybi

Small World Asynchronous Parallel Model for Genome Assembly [chapter]

Jintao Meng, Jianrui Yuan, Jiefeng Cheng, Yanjie Wei, Shengzhong Feng
2012 Lecture Notes in Computer Science  
YAGA constructs the distributed bi-directed De Bruijn graph by maintaining edge tuples in a community of servers.  ...  Efficient and scalable frameworks or libraries for distributed graphs are essential to parallel assembly based on De Bruijn graph.  ...  Chin from HKU for their suggestions on this work.  ... 
doi:10.1007/978-3-642-35606-3_17 fatcat:aevndmkcwffqvii5euwczktsva

Parallelized short read assembly of large genomes using de Bruijn graphs

Yongchao Liu, Bertil Schmidt, Douglas L Maskell
2011 BMC Bioinformatics  
Results: We present PASHA, a parallelized short read assembler using de Bruijn graphs, which takes advantage of hybrid computing architectures consisting of both shared-memory multi-core CPUs and distributed-memory  ...  Conclusions: Developing parallel assemblers for large genomes has been garnering significant research efforts due to the explosive size growth of high-throughput short read datasets.  ...  Liu Weiguo for providing the experimental environments, and thank the anonymous reviewers whose constructive comments helped to improve the manuscript.  ... 
doi:10.1186/1471-2105-12-354 pmid:21867511 pmcid:PMC3167803 fatcat:uoonaom4cnbhzj5hmt4v6xaw3u

SWAP-Assembler: scalable and efficient genome assembly towards thousands of cores

Jintao Meng, Bingqiang Wang, Yanjie Wei, Shengzhong Feng, Pavan Balaji
2014 BMC Bioinformatics  
In the paper, a mathematical description of multi-step bi-directed graph (MSG) is provided to resolve the computational interdependence on merging edges, and a highly scalable computational framework for  ...  Results: This paper presents a highly scalable assembler named as SWAP-Assembler for processing massive sequencing data using thousands of cores, where SWAP is an acronym for Small World Asynchronous Parallel  ...  Leung, Yu Peng and Yi Wang from the University of Hongkong, and anonymous reviewers invited by Recomb-seq 2014.  ... 
doi:10.1186/1471-2105-15-s9-s2 pmid:25253533 pmcid:PMC4168705 fatcat:immjno62unhjlnewnsaibozmjq

FastEtch: A Fast Sketch-based Assembler for Genomes

Priyanka Ghosh, Ananth Kalyanaraman
2018 IEEE/ACM Transactions on Computational Biology & Bioinformatics  
One of the major computational steps in modern day short read assemblers involves the construction and use of a string data structure called the de Bruijn graph.  ...  In this paper, we present a fast algorithm, FastEtch, that uses sketching to build an approximate version of the de Bruijn graph for the purpose of generating an assembly.  ...  Algorithm 1: Approximate de Bruijn Graph Construction Algorithm -Baseline Input: Input set of reads: R, Width: w, Depth: d, Threshold: τ Output: Approximate de Bruijn graph (Ĝ) for each r ∈ R in parallel  ... 
doi:10.1109/tcbb.2017.2737999 pmid:28910776 fatcat:6yttdozncjdmzcliby3dr5xg5a

GPU-Euler: Sequence Assembly Using GPGPU

Syed Faraz Mahmood, Huzefa Rangwala
2011 2011 IEEE International Conference on High Performance Computing and Communications  
Our work was largely motivated by a growing need in the genomic community for sequence assemblers and increasing use of GPUs for general purpose computing applications.  ...  We investigated the implementation challenges, and possible solutions for a data parallel approach for sequence assembly.  ...  [17, 19] proposed a parallel implementation for bi-directed string graph assembly on large number of processors available on supercomputers like the IBM Blue Gene /L.  ... 
doi:10.1109/hpcc.2011.29 dblp:conf/hpcc/MahmoodR11 fatcat:krrrqqc7lzaf7hlokafcnt6ybq

PANDA: Processing-in-MRAM Accelerated De Bruijn Graph based DNA Assembly [article]

Shaahin Angizi, Naima Ahmed Fahmi, Wei Zhang, Deliang Fan
2020 arXiv   pre-print
In this work, we present an efficient Processing-in-MRAM Accelerated De Bruijn Graph based DNA Assembly platform named PANDA based on an optimized and hardware-friendly genome assembly algorithm.  ...  We then develop a highly parallel and step-by-step hardware-friendly DNA assembly algorithm for PANDA that only requires the developed in-memory logic operations.  ...  First, creating a hash table out of chopped short reads (k-mers) and keeping a count of each distinct k-mer; second, constructing a de Bruijn Graph with Hashmap; third, traversing through de Bruijn Graph  ... 
arXiv:2008.06177v1 fatcat:5oebprh5ezgbnhe2kjrd2l66gi
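
The three steps named in the snippet above (hashing k-mers with counts, building the graph from the hash map, traversing it) can be sketched in software as follows. This is a minimal illustration only, not PANDA's in-memory hardware algorithm; the function names `count_kmers`, `build_de_bruijn`, and `walk` are hypothetical, and the traversal assumes error-free reads and simply follows unambiguous edges.

```python
from collections import Counter, defaultdict


def count_kmers(reads, k):
    """Step 1: hash table mapping each distinct k-mer to its count."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def build_de_bruijn(kmer_counts):
    """Step 2: nodes are (k-1)-mers; each distinct k-mer adds one edge
    from its prefix to its suffix."""
    graph = defaultdict(list)
    for kmer in kmer_counts:
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def walk(graph, start):
    """Step 3 (simplified): extend a contig while the current node has
    exactly one outgoing edge and no node is revisited."""
    contig = start
    node = start
    seen = {start}
    while len(graph.get(node, [])) == 1:
        node = graph[node][0]
        if node in seen:
            break
        seen.add(node)
        contig += node[-1]
    return contig


reads = ["ACGTT", "CGTTA"]          # toy, error-free reads (assumption)
counts = count_kmers(reads, k=3)
graph = build_de_bruijn(counts)
contig = walk(graph, "AC")
```

Real assemblers additionally filter low-count k-mers, handle both strands, and resolve branches; none of that is attempted here.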

Simplitigs as an efficient and scalable representation of de Bruijn graphs

Karel Břinda, Michael Baym, Gregory Kucherov
2021 Genome Biology  
Here, we introduce simplitigs as a compact, efficient, and scalable representation, and ProphAsm, a fast algorithm for their computation.  ...  Abstract: de Bruijn graphs play an essential role in bioinformatics, yet they lack a universal scalable representation.  ...  Acknowledgements The authors thank Jasmijn Baaijens, Roman Cheplyaka, Donald Halstead, Paul Medvedev, and Rayan Chikhi for their valuable comments and Kamil Salikhov and Simone Pignotti for the helpful  ... 
doi:10.1186/s13059-021-02297-z pmid:33823902 pmcid:PMC8025321 fatcat:kf3iqgji2vctjfur5tfj5e24vy

Simplitigs as an efficient and scalable representation of de Bruijn graphs [article]

Karel Břinda, Michael Baym, Gregory Kucherov
2020 bioRxiv   pre-print
Subsequently, an important question is how to efficiently represent, compress, and transmit de Bruijn graphs of the most common types of genomic data sets, such as sequencing reads, genomes, and pan-genomes  ...  Results: We introduce simplitigs, an effective representation of de Bruijn graphs for alignment-free applications.  ...  Acknowledgements The authors thank Jasmijn Baaijens and Roman Cheplyaka for careful reading and valuable comments, and Kamil Salikhov and Simone Pignotti for helpful discussions at the initial stage of  ... 
doi:10.1101/2020.01.12.903443 fatcat:6e24csokn5av7ammus3gernfvm

Identification of Significant Computational Building Blocks through Comprehensive Investigation of NGS Secondary Analysis Methods [article]

Md Vasimuddin, Sanchit Misra, Srinivas Aluru
2018 bioRxiv   pre-print
search, de Bruijn graph construction and traversal, and pairwise hidden Markov model algorithm - covering 80.5%-98.2%, 63.9%-99.4% and 72%-93% of the runtime, respectively, for sequence mapping, de novo  ...  Moreover, a majority of the software tools used for secondary analysis do not use the hardware efficiently.  ...  de Bruijn graph is constructed as follows.  ... 
doi:10.1101/301903 fatcat:yemhjrczozfq5boqnyz6nqx5cm

Advantages of distributed and parallel algorithms that leverage Cloud Computing platforms for large-scale genome assembly

Priti Kumari, Raja Mazumder, Vahan Simonyan, Konstantinos Krampis
2015 F1000Research  
However, for larger datasets Velvet requires large-memory compute servers on the order of 1000 GB or more.  ...  On the other hand, Contrail is implemented using Hadoop, which performs the assembly in parallel across nodes of a compute cluster.  ...  A de Bruijn graph can either be uni-directed or bi-directed, with a single edge connecting two nodes, or edges with two directions for the 5' or 3' genome strands.  ... 
doi:10.12688/f1000research.6016.1 fatcat:bylmprujzjeqxjpokoabolxwpi
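
As a rough illustration of the bi-directed case mentioned in the snippet above: one common convention (an assumption here, not something stated in the listed paper) is to treat a k-mer and its reverse complement as the same node, keyed by the lexicographically smaller of the two. The helper names `reverse_complement` and `canonical` are hypothetical.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence read from the opposite strand."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Pick one representative for a k-mer and its reverse complement,
    so both strands map to the same graph node."""
    return min(kmer, reverse_complement(kmer))


rc = reverse_complement("AACG")
node = canonical("CGTT")
```

With this keying, a read and its opposite-strand copy contribute to the same nodes, which is why bi-directed edges must also record the orientation in which each endpoint is traversed.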

Succinct Data Structures for Assembling Large Genomes [article]

Thomas C Conway, Andrew J Bromage
2010 arXiv   pre-print
De novo assembly not only provides a tool for understanding wide scale biological variation, but within human bio-medicine, it offers a direct way of observing both large scale structural variation and  ...  In particular we show that when stored succinctly, the de Bruijn assembly graph for homo sapiens requires only 23 gigabytes of storage.  ...  CONCLUSION We have presented a memory-efficient representation of the de Bruijn assembly graph using succinct data structures which allow us to represent the graph in close to the minimum number of bits  ... 
arXiv:1008.2555v1 fatcat:ehakzzbovnawboxg3brov63ple

Assembly algorithms for next-generation sequencing data

Jason R. Miller, Sergey Koren, Granger Sutton
2010 Genomics  
Acknowledgments The authors receive funding for assembly research from the National Institutes of Health via grant 2R01GM077117-04A1 from the National Institute of General Medical Sciences.  ...  In summary, ABYSS is scalable assembly software for Solexa short reads and paired end reads. The de Bruijn Graph in AllPaths AllPaths is a DBG assembler intended for application to large genomes.  ...  The seed & extend heuristic algorithm is used for efficiency.  ... 
doi:10.1016/j.ygeno.2010.03.001 pmid:20211242 pmcid:PMC2874646 fatcat:4xfcqejrcjcm5b5y2alb3mb3xy

Accurate determination of node and arc multiplicities in de bruijn graphs using conditional random fields

Aranka Steyaert, Pieter Audenaert, Jan Fostier
2020 BMC Bioinformatics  
Background De Bruijn graphs are key data structures for the analysis of next-generation sequencing data.  ...  Results To improve the accuracy with which node and arc multiplicities in a de Bruijn graph are inferred, we developed a conditional random field (CRF) model to efficiently combine the coverage information  ...  BCALM 2 is capable of constructing de Bruijn graphs even for large genomes with relatively low memory requirements. For all results presented in this paper we use k-mer size k = 21.  ... 
doi:10.1186/s12859-020-03740-x pmid:32928110 fatcat:tizlg63ur5eiderddbntsrjh2q