Genoogle: an indexed and parallelized search engine for similar DNA sequences
[article]
2015
arXiv
pre-print
The search for similar genetic sequences is one of the main bioinformatics tasks. ...
To verify the viability of using these two techniques simultaneously, software that combines parallelization techniques with inverted indexes was developed. ...
A genetic sequence is defined as a sequence over the alphabet Σ = {A, C, G, T} (DNA) or Σ = {A, C, G, U} (RNA), and a sub-sequence is a sequence which is contained partially or fully in another sequence. ...
arXiv:1507.02987v1
fatcat:sx7vv2jravbxhhwcpfooc3xtwi
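The inverted-index idea mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration, not Genoogle's actual implementation: the function names `build_index` and `find_seeds`, the sub-sequence length `k = 4`, and the toy database are all assumptions made for the example, and parallelization is left out.

```python
from collections import defaultdict

def build_index(sequences, k=4):
    """Map every length-k sub-sequence to the (sequence id, offset) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in sequences.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((seq_id, pos))
    return index

def find_seeds(index, query, k=4):
    """Return (sequence id, database offset, query offset) for each shared sub-sequence."""
    hits = []
    for qpos in range(len(query) - k + 1):
        # .get avoids creating empty entries in the defaultdict for unseen sub-sequences
        for seq_id, pos in index.get(query[qpos:qpos + k], []):
            hits.append((seq_id, pos, qpos))
    return hits

# Hypothetical toy database; real collections would hold far longer sequences.
database = {"s1": "ACGTACGT", "s2": "TTGACCA"}
index = build_index(database)
seeds = find_seeds(index, "GACC")
```

Each seed only marks a place where the query and a stored sequence share a sub-sequence; a full search tool would typically extend and score such seeds afterwards.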
A new DNA alignment method based on inverted index
[article]
2013
arXiv
pre-print
This paper presents a novel DNA sequence alignment method based on an inverted index. Most large-scale information retrieval systems now use an inverted index as their basic data structure. ...
However, its application to DNA sequence alignment has not yet been reported. This paper discusses such applications. ...
This feature may also be appropriate for indexing DNA sequences. ...
arXiv:1307.0194v1
fatcat:vftrg7d5uramzennbxww3gdo5i
Indexing DNA Sequences Using q-Grams
[chapter]
2005
Lecture Notes in Computer Science
Contributing to the interest, this paper presents a method for indexing the DNA sequences efficiently based on q-grams to facilitate similarity search in a DNA database and sidestep the need for linear ...
A two-level index (a hash table and c-trees) is proposed based on the q-grams of DNA sequences. ...
Conclusion: We have devised a novel two-level index structure based on q-grams of the DNA sequences which can support efficient similarity search in a DNA sequence database. ...
doi:10.1007/11408079_4
fatcat:sczrqs2aingcxaka5wzavcqp7q
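As a rough sketch of the first (hash table) level only, each q-gram can be packed into an integer with 2 bits per base and used as a direct table slot. The snippet gives no details of the c-tree level, so it is omitted here; the names `qgram_codes` and `build_table` and the base-to-integer mapping are assumptions for illustration, not the paper's design.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit encoding of the DNA alphabet

def qgram_codes(seq, q):
    """Yield (offset, integer code) for each q-gram, updating the code in O(1) per base."""
    mask = (1 << (2 * q)) - 1  # keeps only the last q bases
    code = 0
    for i, base in enumerate(seq):
        code = ((code << 2) | CODE[base]) & mask
        if i >= q - 1:
            yield i - q + 1, code

def build_table(seq, q):
    """Direct-address table with 4**q slots; slot c lists offsets of the q-gram with code c."""
    table = [[] for _ in range(4 ** q)]
    for offset, code in qgram_codes(seq, q):
        table[code].append(offset)
    return table

table = build_table("ACGTAC", 2)
```

With `q = 2`, the q-gram `AC` has code 1, so `table[1]` holds both offsets where `AC` starts. The table has 4**q slots, which is only practical for small q.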
Characterizing self-similarity in bacteria DNA sequences
1998
Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics
In this paper some parametric methods are introduced to characterize the self-similarity of DNA sequences. ...
Long-range correlation properties in the nucleotide density distribution along these DNA sequences are explored. ...
It has been discussed in [3,5,9,11] that the DNA sequences act as self-similar processes. ...
doi:10.1103/physreve.58.3578
fatcat:43ir7hghbfhpjoefoqwsv2k5ie
An interactive DNA barcode browser
2020
Zenodo
By using n-gram indexing of DNA sequences, and alignment-free phylogeny construction, the user can interactively explore DNA barcode data in real time. ...
This paper describes an interactive web application to display DNA barcode data. It supports both query by sequence and query by geographic area. ...
The application searches the Elasticsearch index of DNA barcodes for similar sequences, if it finds any it returns them and then computes a phylogenetic tree for those sequences. ...
doi:10.5281/zenodo.4266481
fatcat:7q65dmo37jfdhp6imqrm7mshxe
Applying Shannon's information theory to bacterial and phage genomes and metagenomes
2013
Scientific Reports
A sequence with more information (higher uncertainty) has a higher probability of being significantly similar to other sequences in the database. ...
Here, Shannon's index of complete phage and bacterial genomes was examined. ...
The strong relationship between Shannon's index and |GC% − 0.5| for word length 1 to 5 nt suggests that Shannon's index is strongly influenced by the GC composition of the DNA sequence (Figure 4b). ...
doi:10.1038/srep01033
pmid:23301154
pmcid:PMC3539204
fatcat:weo4z5wuw5ghfaawjukybu6wqa
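For orientation, Shannon's index over words of a given length can be computed as in the sketch below. It uses the standard entropy formula H = −Σ p·log2(p) over overlapping words; whether the paper applies additional normalization is not stated in the snippet, and the function names `shannon_index` and `gc_deviation` are assumptions.

```python
import math
from collections import Counter

def shannon_index(seq, word_len=1):
    """Shannon entropy (in bits) of the overlapping words of length word_len in seq."""
    words = [seq[i:i + word_len] for i in range(len(seq) - word_len + 1)]
    counts = Counter(words)
    total = len(words)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def gc_deviation(seq):
    """Absolute deviation of the GC fraction from 0.5."""
    gc = sum(base in "GC" for base in seq) / len(seq)
    return abs(gc - 0.5)

h_uniform = shannon_index("ACGT")   # four equally frequent bases
h_constant = shannon_index("AAAA")  # a single repeated base
```

A sequence with equal base frequencies reaches the maximum of 2 bits at word length 1 and has zero GC deviation, which is in line with the relationship the snippet describes.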
Stretch Profile: A pruning technique to accelerate DNA sequence search
2020
Informatics in Medicine Unlocked
DNA sequence similarity search has been used by scientists to facilitate biological research. Over the years, more sequences are added to databases, making them constantly larger. ...
This paper presents a pruning technique to accelerate DNA sequence search based on a novel Stretch Profile created from stretches of consecutive base characters: A-Stretch, C-Stretch, G-Stretch, and T-Stretch ...
Thus, the similarity comparison of DNA sequences is faster and processes less data [14]. Many other indexing techniques have been proposed under different names. ...
doi:10.1016/j.imu.2020.100323
fatcat:f76ksnsyurbnfbjkcsn6nzhhou
A perceptual hash function to store and retrieve large scale DNA sequences
[article]
2014
arXiv
pre-print
The similarity distance between two hashes is estimated with the Hamming Distance, which is used to retrieve DNA sequences. ...
The method is based on a perceptual hash function, commonly used to determine the similarity between digital images, that we adapted for DNA sequences. ...
Even if 64-bit hashes are more precise, the comparison of 32-bit hashes could be an acceptable similarity index, including for DNA sequences with a length of 10,000 bp. ...
arXiv:1412.5517v1
fatcat:a5ubwtc2mfdx3nywhf37siv7nq
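The retrieval step described here, comparing hashes by Hamming distance, might look like the following sketch. It assumes the hashes are already available as integers; how the perceptual hash itself is derived from a DNA sequence is not given in the snippet and is not reproduced. The names `hamming_distance` and `nearest` and the stored values are hypothetical.

```python
def hamming_distance(h1, h2):
    """Number of bit positions in which two integer hashes differ."""
    return bin(h1 ^ h2).count("1")

def nearest(query_hash, stored):
    """Return the key in `stored` whose hash is closest to query_hash."""
    return min(stored, key=lambda name: hamming_distance(query_hash, stored[name]))

# Hypothetical 8-bit hashes, kept short for readability.
stored = {"seqA": 0b10110010, "seqB": 0b01001101}
best = nearest(0b10110011, stored)
```

A smaller distance means the two hashes, and presumably the underlying sequences, are more alike; a linear scan like `nearest` is only a baseline for small collections.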
2D graphical representation of DNA sequence based on horizon lines from a probabilistic view
2018
Bioscience Journal
Following the new approach, we perform the similarity analysis among coding sequences of the first exon of beta-globin gene from eleven species. Our results coincide with current biological analyses. ...
We also compare our method with some existing DNA sequence comparison algorithms and find that ours is more intuitive and effective. ...
Thus, a smaller d indicates that the DNA sequences are more similar. ...
doi:10.14393/bj-v34n3a2018-39932
fatcat:lvikzokrgfg77aaqrd742qf6xq
How to build a DNA search engine like Google?
[article]
2011
arXiv
pre-print
This paper proposes a new method for building a large-scale DNA sequence search system based on web search engine technology. ...
Since there is no local alignment process, this system is able to provide millisecond-level search over billions of DNA sequences on a typical server. ...
Most current DNA search and comparison methods are similar to the BLAST/FASTA algorithms, which compare one sequence with the other sequences one by one [1, 2]. ...
arXiv:1006.4114v4
fatcat:5earjms6vbhhzmp5puwnia6qve
Association between Chloroplast and Mitochondrial DNA sequences in Chinese Prunus genotypes (Prunus persica, Prunus domestica, and Prunus avium)
2015
BMC Plant Biology
In the case of cpSSR, the Hong Tao (Peach) and L1 Tai Yang Li (Plum) genotypes demonstrated a similarity index of 0.85, and Huang Tao has the lowest similarity index of 0.50. ...
The Y2 Wu Xing (Cherry) and L2 Hong Xin Li (Plum) genotypes have a high similarity index (0.89), followed by Zi Ye Li (0.85), whereas L1 Tai Yang Li (Plum) has the lowest genetic similarity (0.35). ...
Tao (peach) showed the lowest similarity index i.e., 0.52. ...
doi:10.1186/s12870-014-0402-4
pmid:25592231
pmcid:PMC4310034
fatcat:f3f7xu7hlvfxphoq25444vgiwe
Phylogenetic Analysis Of Endophytic Fungi Isolate from Bellucia pentamera Naudin Based On ITS rDNA
2019
IJEMS (Indonesian Journal of Environmental Management and Sustainability)
From the results of identification and analysis of DNA sequencing of endophytic fungi DKJ1, DKJ3a, DKJ3c and DKJ4 with the primary pair of ITS and Beta-tubulin shows that the phylogenetic tree is different ...
The purpose of this study is to identify and analyze the sequencing of endophytic fungi isolates with ITS and Beta-tubulin markers and phylogenetic trees. ...
The highest sequence similarity index values were found in all samples of DKJ1, DKJ3c, DKJ4 and DKJ3a, while the lowest sequence similarity index values were found in all Aspergillus and Penicillium ...
doi:10.26554/ijems.2019.3.4.100-106
fatcat:p6sbj2htdzfx5p4tntit32fkyi
An efficient approach for sequence matching in large DNA databases
2006
Journal of information science
Since DNA databases contain a huge volume of sequences, fast indexes are essential for efficient processing of DNA sequence matching. ...
In molecular biology, DNA sequence matching is one of the most crucial operations. ...
DNA sequence matching is an operation that finds DNA sequences whose base arrangement is similar to that of a DNA sequence given in a query from a DNA database. ...
doi:10.1177/0165551506059229
fatcat:l2ea2b57jjd5xdztqhx3tjef6m
Studies on the Genetic Variations between Field Strains of Pectinophora gossypiella (SAUNDERS) using Polymerase Chain Reaction (PCR) Technique
2019
Journal of Plant Protection and Pathology
The highest similarity index between field strains and laboratory strain of the pink bollworm appeared in primer P8, while the lowest similarity index between field strains and laboratory strain of the ...
The RAPD patterns resulting from amplification of DNA of the field colony strains and laboratory strain of the pink bollworm P. gossypiella revealed that the lowest value of similarity index (0.0%), which ...
Similarity index: The similarity index was used to compare patterns within as well as between populations. ...
doi:10.21608/jppp.2019.40565
fatcat:cqcga6mhzbdrjhh54qcgghbc4a
Information Theory and Multivariate Techniques for Analyzing DNA Sequence Data: An Example from Tomato Genes
2010
Nepal Journal of Biotechnology
Shannon-Weaver index (Shannon Entropy, H') and mutual information (MI) index were estimated from DNA sequences of 22 genes, consisting of two gene families of tomato, namely disease resistance and fruit ...
Sequence similarity among genes was observed within a family. ...
This indicates that the nature of variation in DNA sequences reflects similar variation at the phenotypic level. Genes within a cluster were phenotypically similar. ...
doi:10.3126/njb.v1i1.3867
fatcat:wtrb2e2hkvf2dbmt2mlpup66jy