A tool for aligning very similar DNA sequences

Kun-Mao Chao, Jinghui Zhang, James Ostell, Webb Miller
1997 Bioinformatics  
Results: We have produced a computer program, named sim3, that solves the following computational problem. Two DNA sequences are given, where the shorter sequence is very similar to some contiguous region of the longer sequence. Sim3 determines such a similar region of the longer sequence, and then computes an optimal set of single-nucleotide changes (i.e. insertions, deletions or substitutions) that will convert the shorter sequence to that region. Thus, the alignment scoring scheme is [...] to model sequencing errors, rather than evolutionary processes. The program can align a 100 kb sequence to a 1 megabase sequence in a few seconds on a workstation, provided that there are very few differences between the shorter sequence and some region in the longer sequence. The program has been used to assemble sequence data for the Genomes Division at the National Center for Biotechnology Information.
Availability: A version of sim3 for UNIX machines can be obtained by anonymous ftp from ncbi.nlm.nih.gov, in the pub/sim3 directory.
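Illustration: The computational problem stated in the Results can be made concrete with a small sketch. The code below is not sim3 and makes no claim about how sim3 works internally; it is a plain quadratic-time dynamic program, assuming unit cost for every insertion, deletion and substitution, and the name fit_align is hypothetical. It finds a region of the longer sequence and the minimum number of single-nucleotide changes that convert the shorter sequence to that region.

```python
def fit_align(short, long):
    """Return (edits, start, end) such that long[start:end] is a region
    of `long` reachable from `short` with the fewest single-nucleotide
    edits (unit-cost insertions, deletions, substitutions).

    Illustrative O(len(short) * len(long)) sketch only; the unit costs
    and the function name are assumptions, not taken from sim3.
    """
    n = len(long)
    # prev[j] = (cost, start): best way to convert short[:i] into a
    # region of `long` that ends at position j and begins at `start`.
    # Row 0: the empty prefix matches an empty region anywhere for free.
    prev = [(0, j) for j in range(n + 1)]
    for i in range(1, len(short) + 1):
        # Converting short[:i] into an empty region costs i deletions.
        cur = [(i, 0)]
        for j in range(1, n + 1):
            mismatch = 0 if short[i - 1] == long[j - 1] else 1
            cur.append(min(
                (prev[j - 1][0] + mismatch, prev[j - 1][1]),  # match or substitution
                (prev[j][0] + 1, prev[j][1]),                 # delete short[i-1]
                (cur[j - 1][0] + 1, cur[j - 1][1]),           # insert long[j-1]
            ))
        prev = cur
    # The region may end anywhere in `long`; take the cheapest end point.
    end = min(range(n + 1), key=lambda j: prev[j][0])
    cost, start = prev[end]
    return cost, start, end


if __name__ == "__main__":
    # An exact copy of a region needs no edits.
    print(fit_align("TTGACC", "ACGTACGTTTGACCA"))  # (0, 8, 14)
```

A table of this size is impractical for the sequence lengths quoted in the Results (100 kb against 1 megabase), so the sketch only pins down what is being computed, not how to compute it quickly.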
doi:10.1093/bioinformatics/13.1.75 fatcat:3brmm33avjfcholmsfkmdyz4am