Genoogle: an indexed and parallelized search engine for similar DNA sequences [article]

Felipe Albrecht
2015 arXiv   pre-print
The search for similar genetic sequences is one of the main bioinformatics tasks.  ...  To verify the viability of using these two techniques simultaneously, software that uses parallelization techniques with inverted indexes was developed.  ...  Defining a genetic sequence as a sequence where Σ = {A, C, G, T} (DNA), Σ = {A, C, G, U} (RNA) and a sub-sequence is a sequence which is contained partially or fully in another sequence.  ... 
arXiv:1507.02987v1 fatcat:sx7vv2jravbxhhwcpfooc3xtwi

A new DNA alignment method based on inverted index [article]

Wang Liang, Zhao KaiYong
2013 arXiv   pre-print
This paper presents a novel DNA sequence alignment method based on an inverted index. Most large-scale information retrieval systems now use an inverted index as their basic data structure.  ...  But its application to DNA sequence alignment has not yet been reported. This paper discusses such applications.  ...  This feature may also be appropriate for indexing DNA sequences.  ... 
arXiv:1307.0194v1 fatcat:vftrg7d5uramzennbxww3gdo5i

Indexing DNA Sequences Using q-Grams [chapter]

Xia Cao, Shuai Cheng Li, Anthony K. H. Tung
2005 Lecture Notes in Computer Science  
Contributing to the interest, this paper presents a method for indexing the DNA sequences efficiently based on q-grams to facilitate similarity search in a DNA database and sidestep the need for linear  ...  A two-level index, consisting of a hash table and c-trees, is proposed based on the q-grams of DNA sequences.  ...  Conclusion: We have devised a novel two-level index structure based on q-grams of the DNA sequences which can support efficient similarity search in a DNA sequence database.  ... 
doi:10.1007/11408079_4 fatcat:sczrqs2aingcxaka5wzavcqp7q

Characterizing self-similarity in bacteria DNA sequences

Xin Lu, Zhirong Sun, Huimin Chen, Yanda Li
1998 Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics  
In this paper some parametric methods are introduced to characterize the self-similarity of DNA sequences.  ...  Long-range correlation properties in the nucleotide density distribution along these DNA sequences are explored.  ...  It has been discussed in [3,5,9,11] that the DNA sequences act as self-similar processes.  ... 
doi:10.1103/physreve.58.3578 fatcat:43ir7hghbfhpjoefoqwsv2k5ie

An interactive DNA barcode browser

Roderic D. M. Page
2020 Zenodo  
By using n-gram indexing of DNA sequences, and alignment-free phylogeny construction, the user can interactively explore DNA barcode data in real time.  ...  This paper describes an interactive web application to display DNA barcode data. It supports both query by sequence and query by geographic area.  ...  The application searches the Elasticsearch index of DNA barcodes for similar sequences, if it finds any it returns them and then computes a phylogenetic tree for those sequences.  ... 
doi:10.5281/zenodo.4266481 fatcat:7q65dmo37jfdhp6imqrm7mshxe

Applying Shannon's information theory to bacterial and phage genomes and metagenomes

Sajia Akhter, Barbara A. Bailey, Peter Salamon, Ramy K. Aziz, Robert A. Edwards
2013 Scientific Reports  
A sequence with more information (higher uncertainty) has a higher probability of being significantly similar to other sequences in the database.  ...  Here, Shannon's index of complete phage and bacterial genomes was examined.  ...  The strong relationship between Shannon's index and |GC% − 0.5| for word length 1 to 5 nt suggests that Shannon's index is strongly influenced by the GC composition of the DNA sequence (Figure 4b).  ... 
doi:10.1038/srep01033 pmid:23301154 pmcid:PMC3539204 fatcat:weo4z5wuw5ghfaawjukybu6wqa

Stretch Profile: A pruning technique to accelerate DNA sequence search

Nalakkhana Khitmoh, Sucha Smanchat, Sissades Tongsima
2020 Informatics in Medicine Unlocked  
DNA sequence similarity search has been used by scientists to facilitate biological research. Over the years, more sequences are added to databases, making them constantly larger.  ...  This paper presents a pruning technique to accelerate DNA sequence search based on a novel Stretch Profile created from stretches of consecutive base characters: A-Stretch, C-Stretch, G-Stretch, and T-Stretch  ...  Thus, the similarity comparison of DNA sequences is faster and processes less data [14]. Many other indexing techniques have been proposed under different names.  ... 
doi:10.1016/j.imu.2020.100323 fatcat:f76ksnsyurbnfbjkcsn6nzhhou

A perceptual hash function to store and retrieve large scale DNA sequences [article]

Jocelyn De Goer De Herve, Myoung-Ah Kang, Xavier Bailly, Engelbert Mephu Nguifo
2014 arXiv   pre-print
The similarity distance between two hashes is estimated with the Hamming distance, which is used to retrieve DNA sequences.  ...  The method is based on a perceptual hash function, commonly used to determine the similarity between digital images, that we adapted for DNA sequences.  ...  Even if 64-bit hashes are more precise, the comparison of 32-bit hashes could be an acceptable similarity index, including for DNA sequences having a length of 10 000 bp.  ... 
arXiv:1412.5517v1 fatcat:a5ubwtc2mfdx3nywhf37siv7nq

2D graphical representation of DNA sequence based on horizon lines from a probabilistic view

Huili Liu
2018 Bioscience Journal  
Following the new approach, we perform the similarity analysis among coding sequences of the first exon of beta-globin gene from eleven species. Our results coincide with current biological analyses.  ...  We also compare our method with some existing DNA sequence comparison algorithms and find that ours is more intuitive and effective.  ...  Thus the smaller d reflects that the DNA sequences are more similar.  ... 
doi:10.14393/bj-v34n3a2018-39932 fatcat:lvikzokrgfg77aaqrd742qf6xq

How to build a DNA search engine like Google? [article]

Wang Liang, Fang Bo
2011 arXiv   pre-print
This paper proposes a new method to build a large-scale DNA sequence search system based on web search engine technology.  ...  Since there is no local alignment process, this system is able to provide millisecond-level search services for billions of DNA sequences on a typical server.  ...  Most current DNA search and comparison methods are similar to the BLAST/FASTA algorithm, which compares one sequence with the other sequences one by one [1, 2].  ... 
arXiv:1006.4114v4 fatcat:5earjms6vbhhzmp5puwnia6qve

Association between Chloroplast and Mitochondrial DNA sequences in Chinese Prunus genotypes (Prunus persica, Prunus domestica, and Prunus avium)

Tariq Pervaiz, Xin Sun, Yanyi Zhang, Ran Tao, Junhuan Zhang, Jinggui Fang
2015 BMC Plant Biology  
In the case of cpSSR, Hong Tao (Peach) and L1 Tai Yang Li (Plum) genotypes demonstrated a similarity index of 0.85 and Huang Tao has the lowest similarity index of 0.50.  ...  The Y2 Wu Xing (Cherry) and L2 Hong Xin Li (Plum) genotypes have a high similarity index (0.89), followed by Zi Ye Li (0.85), whereas L1 Tai Yang Li (Plum) has the lowest genetic similarity (0.35).  ...  Tao (peach) showed the lowest similarity index, i.e., 0.52.  ... 
doi:10.1186/s12870-014-0402-4 pmid:25592231 pmcid:PMC4310034 fatcat:f3f7xu7hlvfxphoq25444vgiwe

Phylogenetic Analysis Of Endophytic Fungi Isolate from Bellucia pentamera Naudin Based On ITS rDNA

Andika Puspita Dewi, Elisa Nurnawati, Laila Hanum, Hary Widjajanti
2019 IJEMS (Indonesian Journal of Environmental Management and Sustainability)  
The results of identification and analysis of DNA sequencing of endophytic fungi DKJ1, DKJ3a, DKJ3c and DKJ4 with the primary pair of ITS and Beta-tubulin show that the phylogenetic tree is different  ...  The purpose of this study is to identify and analyze the sequencing of endophytic fungi isolates with ITS and Beta-tubulin markers and phylogenetic trees.  ...  The highest sequence similarity index values were found in all samples of DKJ1, DKJ3c, DKJ4 and DKJ3a, while the lowest sequence similarity index values were found in all Aspergillus and Penicillium  ... 
doi:10.26554/ijems.2019.3.4.100-106 fatcat:p6sbj2htdzfx5p4tntit32fkyi

An efficient approach for sequence matching in large DNA databases

Jung-Im Won, Sanghyun Park, Jee-Hee Yoon, Sang-Wook Kim
2006 Journal of information science  
Since DNA databases contain a huge volume of sequences, fast indexes are essential for efficient processing of DNA sequence matching.  ...  In molecular biology, DNA sequence matching is one of the most crucial operations.  ...  DNA sequence matching is an operation that finds DNA sequences whose base arrangement is similar to that of a DNA sequence given in a query from a DNA database.  ... 
doi:10.1177/0165551506059229 fatcat:l2ea2b57jjd5xdztqhx3tjef6m

Studies on the Genetic Variations between Field Strains of Pectinophora gossypiella (SAUNDERS) using Polymerase Chain Reaction (PCR) Technique

M. Mahmoud, I. Ibrahim, A. Khidr, M. Abd-El Hameed
2019 Journal of Plant Protection and Pathology  
The highest similarity index between field strains and laboratory strain of the pink bollworm appeared in primer P8, while the lowest similarity index between field strains and laboratory strain of the  ...  The RAPD patterns resulted from amplification of DNA of the field colony strains and laboratory strain of the pink bollworm P. gossypiella revealed that the lowest value of similarity index (0.0%), which  ...  Similarity index: The similarity index was used to compare patterns within as well as between populations.  ... 
doi:10.21608/jppp.2019.40565 fatcat:cqcga6mhzbdrjhh54qcgghbc4a

Information Theory and Multivariate Techniques for Analyzing DNA Sequence Data: An Example from Tomato Genes

Bal K Joshi, Dilip R Panthee
2010 Nepal Journal of Biotechnology  
Shannon-Weaver index (Shannon Entropy, H') and mutual information (MI) index were estimated from DNA sequences of 22 genes, consisting of two gene families of tomato, namely disease resistance and fruit  ...  Sequence similarity among genes was observed within a family.  ...  This indicates that the nature of variation in DNA sequences reflects similar variation at phenotypic levels. Genes within a cluster were phenotypically similar.  ... 
doi:10.3126/njb.v1i1.3867 fatcat:wtrb2e2hkvf2dbmt2mlpup66jy