A Fast Randomized Algorithm for Finding the Maximal Common Subsequences
[article]
2020
arXiv
pre-print
In this paper, we develop a randomized algorithm, referred to as Random-MCS, for finding a random instance of Maximal Common Subsequence (MCS) of multiple strings. ...
A well-known result states that finding a Longest Common Subsequence (LCS) for L strings is NP-hard, i.e., the computational complexity is exponential in L. ...
For this experiment, we use the basic dynamic programming method to compute the LCS, run our Random-MCS algorithm 1000 times, select the longest result, and compare it with the true LCS. ...
arXiv:2009.03352v1
fatcat:w4gyd2r53nb23k6t5date6wviu
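The snippet above mentions a "basic dynamic programming method" for computing the LCS as the baseline. A minimal sketch of the textbook two-string version follows; it is not taken from the paper, and the function name `lcs` is hypothetical. Random-MCS itself is not reproduced here, since the snippet does not describe it.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (textbook O(len(a)*len(b)) DP)."""
    m, n = len(a), len(b)
    # dp[i][j] = length of an LCS of a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back from the bottom-right corner to recover one optimal subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


if __name__ == "__main__":
    print(lcs("AGGTAB", "GXTXAYB"))
```

For more than two strings the table grows by one dimension per string, which is one way to see why the exact approach becomes impractical as the number of strings increases.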
Approximating the true evolutionary distance between two genomes
2008
ACM Journal of Experimental Algorithmics
good enough to enable the simple neighbor-joining procedure to reconstruct our test trees with high accuracy. ...
In this paper we generalize our approach to compute distances between two arbitrary genomes, but focus on approximating the true evolutionary distance rather than the edit distance. ...
Acknowledgments This work is supported by the National Science Foundation under grants DEB 01-20709 (on a subcontract to U. ...
doi:10.1145/1227161.1402297
fatcat:bzmtyf7t75ha7pn62bpe5e4xae
Objective Assessment of Surgical Technical Skill and Competency in the Operating Room
2017
Annual Review of Biomedical Engineering
The algorithms and validation methodologies used for OCASE-T are highly varied; there is no uniform consensus. ...
Traditional models to train surgeons are being challenged by rapid advances in technology, an intensified patient-safety culture, and a need for value-driven health systems. ...
Narges Ahmidi for her insightful comments on earlier versions of this review and assistance with illustrations.
LITERATURE CITED ...
doi:10.1146/annurev-bioeng-071516-044435
pmid:28375649
pmcid:PMC5555216
fatcat:rlspmdequzgdjlxkaqt4vcp4la
Mining Time Series Data
[chapter]
2009
Data Mining and Knowledge Discovery Handbook
This chapter gives a high-level survey of time series Data Mining tasks, with an emphasis on time series representations. ...
While the many approaches to these problems use a multitude of different techniques, they all have one common factor: they require some high-level representation of the data, rather ...
Longest Common Subsequence Similarity The longest common subsequence similarity measure, or LCSS, is a variation of edit distance used in speech recognition and text pattern matching. ...
doi:10.1007/978-0-387-09823-4_56
fatcat:52km7o7aw5dw7awgsa4qa6badm
Efficient haplotype matching and storage using the positional Burrows-Wheeler transform (PBWT)
2014
Bioinformatics
Motivation: Over the last few years, methods based on suffix arrays using the Burrows-Wheeler Transform have been widely used for DNA sequence read matching and assembly. ...
Meanwhile, algorithmic development for genotype data has concentrated on statistical methods for phasing and imputation, based on probabilistic matching to hidden Markov model representations of the reference ...
One approach to more efficient phasing and imputation may be to use computationally efficient approaches such as the positional prefix array methods to seed matches for statistical genotype algorithms, ...
doi:10.1093/bioinformatics/btu014
pmid:24413527
pmcid:PMC3998136
fatcat:vems5carsfdxpivtg5l4o7dq7e
Learning deterministic context free grammars: The Omphalos competition
2006
Machine Learning
Our approach integrates an information theoretic constituent likelihood measure together with more traditional heuristics based on substitutability and frequency. ...
We discuss a class of deterministic grammars, the Non-terminally Separated (NTS) grammars, that have a property relied on by our algorithm, and consider the possibilities of extending the algorithm to ...
We also would like to thank Remi Eyraud and Jean Christophe Janodet for pointing out the literature on NTS grammars. ...
doi:10.1007/s10994-006-9592-9
fatcat:dipqknik5needkx4cpm3owjloy
Pluribus—Exploring the Limits of Error Correction Using a Suffix Tree
2017
IEEE/ACM Transactions on Computational Biology & Bioinformatics
In this paper, we present a novel and effective method called PLURIBUS, for correcting sequencing errors using a generalized suffix trie. ...
Furthermore, PLURIBUS can be used in conjunction with other contemporary error correction methods to achieve higher levels of accuracy than either tool alone. ...
Libraries of Medicine, the Center for Science of Information (CSoI), a US National Science Foundation Science and Technology Center, under grant agreement CCF-0939370, and by American Cancer Society ...
doi:10.1109/tcbb.2016.2586060
pmid:27362987
pmcid:PMC5754272
fatcat:b25k3nb4hfbt3kruiiouysqrma
A Heuristic Approach for Finding Similarity Indexes of Multivariate Data Sets
2020
IEEE Access
INDEX TERMS Similarity index, multivariate data set, outliers, the longest common subsequence. ...
Therefore, the development of an efficient and reliable algorithm for MDSs, with minimum time and space complexity, is highly encouraged by the research community. ...
In the literature, different methods were proposed to solve the longest common subsequence problem, particularly for multivariate data sets [19]. Some of these techniques are described below. ...
doi:10.1109/access.2020.2968222
fatcat:x4dqd47nvrd7vbsqs7geww3ara
Parameterized Algorithms in Bioinformatics: An Overview
2019
Algorithms
This work surveys recent developments of parameterized algorithms and complexity for important NP-hard problems in bioinformatics. ...
Bioinformatics regularly poses new challenges to algorithm engineers and theoretical computer scientists. ...
Acknowledgments: The authors want to thank Fran Rosamond for suggesting the topic as well as everyone who helped collect interesting results for the manuscript, in particular Jesper Jansson, Steven Kelk ...
doi:10.3390/a12120256
fatcat:4dhjdnpibzh43iifgan2fu6bwa
Bitpacking techniques for indexing genomes: II. Enhanced suffix arrays
2016
Algorithms for Molecular Biology
Our results on the fly, chicken, and human genomes show that bytecoding with an exception guide array is the fastest method for retrieving auxiliary information. ...
Enhanced suffix arrays (ESAs) provide fast search speed, but require large auxiliary data structures for storing longest common prefix and child interval information. ...
Acknowledgements The author thanks Simon Gog for advice on using his SDSL package.
Competing interests The author declares that he has no competing interests. ...
doi:10.1186/s13015-016-0068-6
pmid:27110277
pmcid:PMC4842304
fatcat:pavthg3vu5dfrn5lqkppnzaosy
The Deletion-Insertion model applied to the genome rearrangement problem
2019
Pure Mathematics and Applications
We use combinatorial reasoning and permutation statistics to develop a polynomial-time algorithm to approximate the minimum number of transpositions required in the transposition model and to analyze the ...
Applying one restriction to this model, we obtain the transposition model for genome rearrangement, which was shown to be NP-hard in [4]. ...
Using the method of [10], getLdc can be run in O(n log log n) time, so algMinLdc can be run in O(n^5 log log n). ...
doi:10.1515/puma-2015-0030
fatcat:44lksv2gifbgrhrw2onyidbb54
The Average Common Substring Approach to Phylogenomic Reconstruction
2006
Journal of Computational Biology
We present an algorithm for efficiently computing these distances. In principle, the distance of two long sequences can be calculated in O( ) time. We implemented the algorithm, using suffix arrays. ...
The core of our method is a new measure of pairwise distances between sequences. This measure is based on computing the average lengths of maximum common substrings. ...
ACKNOWLEDGEMENTS We would like to thank Eran Bacharach, Tal Pupko, and Jacob Ziv for helpful discussions. ...
doi:10.1089/cmb.2006.13.336
pmid:16597244
fatcat:l2y4ypheo5bbncforxo55ffqdi
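The snippet above describes a measure based on the average lengths of maximum common substrings. A naive sketch of one reading of that quantity follows: for each position in the first sequence, take the longest substring starting there that also occurs somewhere in the second sequence, then average over positions. This reading, the function names, and the quadratic-time search are all assumptions for illustration; the entry says the authors' implementation uses suffix arrays, which this sketch does not.

```python
def longest_match_lengths(a: str, b: str) -> list[int]:
    """For each start i in a, length of the longest substring a[i:i+k] that occurs in b."""
    lengths = []
    for i in range(len(a)):
        k = 0
        # Extend the match one character at a time while it still occurs in b.
        while i + k < len(a) and a[i:i + k + 1] in b:
            k += 1
        lengths.append(k)
    return lengths


def average_common_substring(a: str, b: str) -> float:
    """Average of the per-position longest match lengths (assumed reading of the measure)."""
    if not a:
        raise ValueError("first sequence must be non-empty")
    return sum(longest_match_lengths(a, b)) / len(a)


if __name__ == "__main__":
    print(longest_match_lengths("abcab", "xabcy"))
    print(average_common_substring("abcab", "xabcy"))
```

Note that the quantity is not symmetric in its two arguments, so a distance built on it would need an explicit symmetrization step; how the authors handle that is not stated in the snippet.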
Efficient and Effective Similar Subtrajectory Search with Deep Reinforcement Learning
[article]
2020
arXiv
pre-print
We conduct experiments on real-world trajectory datasets, which verify the effectiveness and efficiency of the proposed algorithms. ...
Among those approximate algorithms, two that are based on deep reinforcement learning stand out and outperform those non-learning based algorithms in terms of effectiveness and efficiency. ...
The authors would like to thank Eamonn Keogh for pointing out some references to the time series literature and also the anonymous reviewers for their constructive comments. ...
arXiv:2003.02542v2
fatcat:wupyxy3odremho5okuvg7ymolq
Analysis of Work-Stealing and Parallel Cache Complexity
[article]
2021
arXiv
pre-print
Our second and main contribution is some new parallel cache complexity for algorithms using the RWS scheduler. ...
The theoretical efficiency of the RWS scheduler has been analyzed for a variety of settings, but most of them are quite complicated. ...
A counterexample is the edit distance problem (or longest common subsequence). The recurrence for the divide-and-conquer algorithm is ( ) = 4 ( /2) + (1). ...
arXiv:2111.04994v1
fatcat:yjvrumbacjdcnifdf6qvwwblmq
Information Theoretic Approaches to Whole Genome Phylogenies
[chapter]
2005
Lecture Notes in Computer Science
We present an algorithm for efficiently computing these distances. In principle, the distance of two long sequences can be calculated in O( ) time. We implemented the algorithm, using suffix arrays. ...
The core of our method is a new measure of pairwise distances between sequences. This measure is based on computing the average lengths of maximum common substrings. ...
Acknowledgements We would like to thank Eran Bacharach, Tal Pupko, and Jacob Ziv for helpful discussions. ...
doi:10.1007/11415770_22
fatcat:fhdjl343eje7de2y2bheevty44
Showing results 1 — 15 out of 299 results