Disk-based k-mer counting on a PC

Sebastian Deorowicz, Agnieszka Debudaj-Grabysz, Szymon Grabowski
2013 BMC Bioinformatics  
Results: We propose a simple, yet efficient, parallel disk-based algorithm for counting k-mers.  ...  Conclusions: By making use of cheap disk space and exploiting CPU and I/O parallelism we propose a very competitive k-mer counting procedure, called KMC.  ...  Very recently, a disk-based k-mer counting algorithm, named DSK, was presented [9].  ... 
doi:10.1186/1471-2105-14-160 pmid:23679007 pmcid:PMC3680041 fatcat:z5374zj3ufchdolldb6ij3kidu

KMC 2: fast and resource-frugal k-mer counting

Sebastian Deorowicz, Marek Kokot, Szymon Grabowski, Agnieszka Debudaj-Grabysz
2015 Computer applications in the biosciences : CABIOS  
For example, KMC 2 allows to count the 28-mers of a human reads collection with 44-fold coverage (106 GB of compressed size) in about 20 minutes, on a 6-core Intel i7 PC with an SSD.  ...  Our disk-based method bears some resemblance to MSPKmerCounter, yet replacing the original minimizers with signatures (a carefully selected subset of all minimizers) and using (k, x)-mers allows to significantly  ...  In the experiments, we count only k-mers with counts at least 2, since the k-mers with a single occurrence in a read collection most likely contain erroneous base(s).  ... 
doi:10.1093/bioinformatics/btv022 pmid:25609798 fatcat:nteraui7abdk5bicld3vq2cqda

Enabling Genomics Pipelines in Commodity Personal Computers With Flash Storage

Nicola Cadenelli, Sang-Woo Jun, Jordà Polo, Andrew Wright, David Carrera, Arvind
2021 Frontiers in Genetics  
We believe our technique should apply to other k-mer or n-gram-based algorithms.  ...  We construct and access large histograms of k-mers efficiently on external storage (SSDs) and apply our technique to a state-of-the-art reference-free genomics algorithm, SMUFIN, to create SMUFIN-F.  ...  K-mer Counting With Sort-Reduce In order to construct a multi-terabyte histogram of k-mers using only a small amount of memory, SMUFIN-F performs k-mer counting in secondary storage using Sort-Reduce  ... 
doi:10.3389/fgene.2021.615958 pmid:33995473 pmcid:PMC8116887 fatcat:rw62agrxxrb43ieayyskqvy5qq

DE-kupl: exhaustive capture of biological variation in RNA-seq data through k-mer decomposition

Jérôme Audoux, Nicolas Philippe, Rayan Chikhi, Mikaël Salson, Mélina Gallopin, Marc Gabriel, Jérémy Le Coz, Emilie Drouineau, Thérèse Commes, Daniel Gautheret
2017 Genome Biology  
PCs were produced with normalized log-transformed counts. For genes and transcripts, counts were generated with Kallisto based on GENCODE V25.  ... 
doi:10.1186/s13059-017-1372-2 pmid:29284518 pmcid:PMC5747171 fatcat:3mhh4ekdhvdbbdtc2q66uennum

Formation and Evolution of Compact-object Binaries in AGN Disks

Hiromichi Tagawa, Zoltán Haiman, Bence Kocsis
2020 Astrophysical Journal  
In active galactic nuclei (AGNs), compact-object binaries form, evolve, and interact with a dense star cluster and a gas disk.  ...  This "gas-capture" binary formation channel contributes up to 97% of gas-driven mergers and leads to a high merger rate in AGN disks even without preexisting binaries.  ...  The radius of the AGN disk based on mid-infrared observations is given by Equation (75). We assume that the allowed range of  ...  Table 3 shows the value of f_BH,mer for the different models.  ... 
doi:10.3847/1538-4357/ab9b8c fatcat:ijpukikipfcnfdeuyth656glem

Formation and Evolution of Compact Object Binaries in AGN Disks [article]

Hiromichi Tagawa, Zoltan Haiman, Bence Kocsis
2020 arXiv   pre-print
In active galactic nuclei (AGN), compact-object binaries form, evolve, and interact with a dense star cluster and a gas disk.  ...  This "gas capture" binary formation channel contributes up to 97% of gas-driven mergers and leads to a high merger rate in AGN disks even without pre-existing binaries.  ...  Based on these results, we consider a reasonable range of uncertainty to be f_BH,mer/t_AGN ∼ 3×10^−10 yr^−1 to 4×10^−8 yr^−1, so that 0.018 ≲ (f_BH,mer/0.5)(t_AGN/30 Myr)^−1 ≲ 2.4 in Eq. (81).  ... 
arXiv:1912.08218v2 fatcat:bi44wclrpzfvxpfljlak2o6d74

A Very Fast Algorithm for Detecting Partially Plagiarized Documents Using FM-Index

Chang SeokOck, JongKyu Seo, Sung-Hwan Kim, Hwan-Gue Cho
2013 International Journal of Computer and Communication Engineering  
The method is based on the Burrows-Wheeler Transform (BWT) and the FM-index for BWT search.  ...  We use disk-based techniques and Genome assembly used in Next Generation Sequencing (NGS) to overcome this disadvantage.  ...  If disk-based BWT cannot process entire job on the limited memory in the processing time, it generates temporary results and stores them on the disk.  ... 
doi:10.7763/ijcce.2013.v2.194 fatcat:lvt5msu6gnbz5jnwrlmfr3m4aq

Workstations and mainframe computers working together

J. K. Kravitz, D. Lieber, F. H. Robbins, J. M. Palermo
1986 IBM Systems Journal  
Although the virtual disk would actually be a file on the host, to the user the virtual disk would be used just as any other disk on the IBM PC.  ...  Whenever an IBM PC program changed a sector on a virtual disk, CMS was requested to rewrite a 512-byte sector image in the virtual disk file.  ... 
doi:10.1147/sj.251.0116 fatcat:k7abxmmgyjaufms43ll3spgxb4

PaKman: A Scalable Algorithm for Generating Genomic Contigs on Distributed Memory Machines

Priyanka Ghosh, Sriram Krishnamoorthy, Ananth Kalyanaraman
2020 IEEE Transactions on Parallel and Distributed Systems  
PaKman presents a solution for the two most time-consuming phases in the full genome assembly pipeline, namely, k-mer counting and contig generation.  ...  In this article, we present a novel algorithm, called PaKman, to address the problem of performing large-scale genome assemblies on a distributed memory parallel computer.  ...  When tested with k=32 (and l = 8), we observed the min-lmer (calculated based on k-mer frequency) across consecutive k-mers for a given read, to change once every forty base pairs on average, thus offering  ... 
doi:10.1109/tpds.2020.3043241 fatcat:dzfmisnswjawfi3n7n4c2g65gq

Indexes of Large Genome Collections on a PC

Agnieszka Danek, Sebastian Deorowicz, Szymon Grabowski, Stephen Moore
2014 PLoS ONE  
For example, the exact matching queries are handled in average time of 39 μs and with up to 3 mismatches in 373 μs on the test PC with the index size of 13.4 GB.  ...  name a few applications.  ...  In our scheme, the two largest subarrays, kMA_0 and kMA_1, are kept in sparse form, based on preceding positions of k-mers.  ... 
doi:10.1371/journal.pone.0109384 pmid:25289699 pmcid:PMC4188820 fatcat:24fxdonhangc7pooav4c4tavii

PaKman: Scalable Assembly of Large Genomes on Distributed Memory Machines [article]

Priyanka Ghosh, Sriram Krishnamoorthy, Ananth Kalyanaraman
2019 bioRxiv   pre-print
PaKman presents a solution for the two most time-consuming phases in the full genome assembly pipeline, namely, k-mer counting and contig generation. A key aspect of our algorithm is its graph data structure  ...  In this paper, we present a novel algorithm, called PaKman, to address the problem of performing large-scale genome assemblies on a distributed memory parallel computer.  ...  Then, we perform a simple threshold-based pruning: we remove k-mers that have a count below a certain threshold τ. Such k-mers are deemed "poor quality" from the assembly perspective.  ... 
doi:10.1101/523068 fatcat:4tfsylflybcttozouex4sraxlu

Biosequence Similarity Search on the Mercury System

Praveen Krishnamurthy, Jeremy Buhler, Roger Chamberlain, Mark Franklin, Kwame Gyang, Arpith Jacob, Joseph Lancaster
2007 Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology  
Here, we present the design of BLASTN, the version of BLAST that searches DNA sequences, on the Mercury system, an architecture that supports high-volume, high-throughput data movement off a data store  ...  The most widely used similarity search tool for biosequences is BLAST, a program designed to compare query sequences to a database.  ...  RDisk [16] is one such FPGA-based approach which claims a 60 Mbases/sec throughput for stage 1 of BLAST using a single disk.  ... 
doi:10.1007/s11265-007-0087-0 pmid:18846267 pmcid:PMC2564817 fatcat:6zsefgb7ubcrtaw5pmjy7u4tje

Massively parallel FPGA-based implementation of BLASTp with the two-hit method

Lars Wienbrandt, Stefan Baumgart, Jost Bissel, Florian Schatz, Manfred Schimmler
2011 Procedia Computer Science  
In this paper, we focus on a massive parallelization using the FPGA-based hardware architecture RIVYERA [2].  ...  The aim is to reach speedups in orders of magnitude with a flexible implementation while saving energy costs compared to PC-based database search.  ...  BLOSUM62, this k-mer can be taken as a k-word if its score exceeds T. Otherwise, all substitutions of amino acids in this k-mer would also fail the threshold T.  ... 
doi:10.1016/j.procs.2011.04.215 fatcat:qrrxv5qmfnacflc4tru24feh54

Page 177 of SMPTE Motion Imaging Journal Vol. 102, Issue 4 [page]

1993 SMPTE Motion Imaging Journal  
Mirage has now assembled a staff of audio professionals to support the MicroSound product, a professional PC-based multitrack hard disk recording and editing system featuring MicroEditor Software which  ...  based in London, Ont., Canada.  ... 

Centrifuge: rapid and sensitive classification of metagenomic sequences [article]

Daehwan Kim, Li Song, Florian P Breitwieser, Steven L Salzberg
2016 bioRxiv   pre-print
of 69 GB, in contrast to k-mer based indexing schemes, which require far more extensive space.  ...  The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem.  ...  Classification based on the FM-index The FM-index provides several advantages over k-mer based indexing schemes that store all k-mers in the target genomes.  ... 
doi:10.1101/054965 fatcat:35gxyhk6lfdpjlsxlew3vvbbxy