Efficient Algorithms for Handling Molecular Weighted Sequences [chapter]

Costas S. Iliopoulos, Christos Makris, Yannis Panagis, Katerina Perdikuri, Evangelos Theodoridis, Athanasios Tsakalidis
IFIP International Federation for Information Processing  
In this paper we introduce the Weighted Suffix Tree, an efficient data structure for computing string regularities in weighted sequences of molecular data.  ...  We present time and space efficient algorithms for constructing the weighted suffix tree and some applications of the proposed data structure to problems taken from the Molecular Biology area such as pattern  ...  Thus we need new and efficient algorithms in order to analyze molecular weighted sequences.  ... 
doi:10.1007/1-4020-8141-3_22 dblp:conf/ifipTCS/IliopoulosMPPTT04 fatcat:m4jchffezjhnrgblqivyddw33i

Handling Weighted Sequences Employing Inverted Files and Suffix Trees

Klev Diamanti, Andreas Kanavos, Christos Makris, Thodoris Tokis
2014 Proceedings of the 10th International Conference on Web Information Systems and Technologies  
In this paper, we address the problem of handling weighted sequences.  ...  Besides providing a handling of weighted sequences using n-grams, we also provide a study of constructing space efficient n-gram inverted indexes.  ...  The main motivation for handling weighted sequences comes from Computational Molecular Biology.  ... 
doi:10.5220/0004788502310238 dblp:conf/webist/DiamantiKMT14 fatcat:i5pqjujdpjbyfp6k2o6aoowylm

String Data Structures for Computational Molecular Biology [chapter]

Christos Makris, Evangelos Theodoridis
2010 Algorithms in Computational Molecular Biology  
The empty word is the empty sequence (of zero length) and is denoted by ε.  ...  The basic string algorithmic problems that develop in computational molecular biology are: Exact pattern matching: given a pattern P and a text T to locate the occurrences of P into T; Approximate pattern  ...  However, the probabilistic suffix tree is inefficient for efficiently handling weighted sequences, which is why the weighted suffix tree was introduced; however, it could be possible for a suitable combination  ... 
doi:10.1002/9780470892107.ch1 fatcat:ninej76wgrbmlncj2zivrt6s6u

A Dynamic Approach to Weighted Suffix Tree Construction Algorithm

Binay Kumar Pandey, Rajdeep Niyogi, Ankush Mittal
2011 International Journal of Distributed and Parallel systems  
In present time weighted suffix tree is consider as a one of the most important existing data structure used for analyzing molecular weighted sequence.  ...  Although a static partitioning based parallel algorithm existed for the construction of weighted suffix tree, but for very long weighted DNA sequences it takes significant amount of time.  ...  Therefore, a requirement of efficient algorithm is arises, in order to analyze molecular weighted sequences.  ... 
doi:10.5121/ijdps.2011.2103 fatcat:vz4h4z3rnzd45gjjq2rsw2ykmy

Feature Based Method for Predicting Pharmacological Interaction

Ansa Baiju, Linda Sara Mathew, Neethu Subash
2021 International journal of recent technology and engineering  
Finally, the processed data is given as input to the extreme gradient boosting classifier algorithm for predicting new drug target interaction pairs.  ...  The drug chemical structure information can be extracted through FP2 molecular fingerprint which describe the molecular structure information.  ...  The update the weight and normalize the samples in Equ.3 (3) This method is which handle the unbalancing data efficiently F.  ... 
doi:10.35940/ijrte.e5205.019521 fatcat:jwipqtxeqrfirphou2n3wxyirq

BMF: Bitmapped Mass Fingerprinting for Fast Protein Identification

Weikuan Yu, K. John Wu, Wei-Shinn Ku, Cong Xu, Juan Gao
2011 2011 IEEE International Conference on Cluster Computing  
With recent large-scale automation of genome sequencing and the explosion of protein databases, it is important to exploit latest data processing technologies and design highly scalable algorithms to speed  ...  In this study, we design, implement, and evaluate a new software tool, Bitmapped Mass Fingerprinting (BMF), that can efficiently construct a bitmap index for short peptides, and quickly identify candidate  ...  Xiao Qin from Auburn University for their comments on earlier drafts of this paper, and Xinyu Que from Auburn University for his help on running some experiments for this work.  ... 
doi:10.1109/cluster.2011.11 dblp:conf/cluster/YuWKXG11 fatcat:y52fdoydcnhvvcjy7a5t5mfj3e

Parallelization of Weighted Sequence Comparision By Using EBWT

Binay Kumar Pandey, Rajdeep Niyogi, Ankush Mittal
2011 International Journal of Distributed and Parallel systems  
In this paper, we describe the design of high-performance extended burrow wheeler transform based weighted sequence comparison algorithm for many core GPU s taking advantages of the full programmability  ...  Our extended burrow wheeler transform based weighted sequence comparison algorithm with thrust library implementation on CUDA is the fastest implementation of weighted sequence comparison algorithm than  ...  However, this algorithm took considerable amount of time even to compare two small length sequences and also not suitable for molecular weighted sequences.  ... 
doi:10.5121/ijdps.2011.2102 fatcat:qoeu3vjvenehnpcj7hn46akuxa

Delineation of Techniques to Implement on the Enhanced Proposed Model Using Data Mining for Protein Sequence Classification

Ananya Basu, Suprativ Saha
2014 International Journal of Database Management Systems  
The novelty of the proposed model is its combined use of intelligent techniques to classify the protein sequence faster and efficiently.  ...  Use of FFT, fuzzy classifier, String weighted algorithm, gram encoding method, neural network model and rough set classifier in a single model and in an appropriate place can enhance the quality of the  ...  For each protein sequence in the database molecular weight is calculated. Fast Fourier transform is then calculated for every sequence as good discriminating feature.  ... 
doi:10.5121/ijdms.2014.6105 fatcat:wo3wz57hcnh7hkfxpqnprxkuse

Delineation of Techniques to Implement on the Enhanced Proposed Model Using Data Mining for Protein Sequence Classification

Ananya Basu, Suprativ Saha
2014 International Journal of Database Management Systems  
The novelty of the proposed model is its combined use of intelligent techniques to classify the protein sequence faster and efficiently.  ...  Use of FFT, fuzzy classifier, String weighted algorithm, gram encoding method, neural network model and rough set classifier in a single model and in an appropriate place can enhance the quality of the  ...  For each protein sequence in the database molecular weight is calculated. Fast Fourier transform is then calculated for every sequence as good discriminating feature.  ... 
doi:10.5121/ijdms.2013.6105 fatcat:x2jkmr7iavfnfcnsb6e4qjj4yy

PHYLOViZ 2.0: providing scalable data integration and visualization for multiple phylogenetic inference methods

Marta Nascimento, Adriano Sousa, Mário Ramirez, Alexandre P. Francisco, João A. Carriço, Cátia Vaz
2016 Bioinformatics  
PHYLOViZ 2.0 incorporates new data analysis algorithms and new visualization modules, as well as the capability of saving projects for subsequent work or for dissemination of results.  ...  High Throughput Sequencing provides a cost effective means of generating high resolution data for hundreds or even thousands of strains, and is rapidly superseding methodologies based on a few genomic  ...  Introduction DNA sequencing facilitated obtaining comparable and reproducible microbial typing data, effectively replacing other molecular and phenotypic techniques.  ... 
doi:10.1093/bioinformatics/btw582 pmid:27605102 fatcat:prjdnlwg3rgxfkw4g5fcnkxzpm

Parallel implementation of a quartet-based algorithm for phylogenetic analysis

B.B. Zhou, D. Chu, M. Tarawneh, P. Wang, C. Wang, A.Y. Zomaya, R.P. Brent
2006 Proceedings 20th IEEE International Parallel & Distributed Processing Symposium  
This algorithm constructs evolutionary trees for a given set of DNA or protein sequences based on the topological information of every possible quartet trees.  ...  In our experiments, computation time, memory usage and communication costs of the algorithm were measured for different number of DNA sequences and across different number of CPUs.  ...  It constructs a tree for a given number of molecular sequences based on the topological properties of each subset of four molecular sequences.  ... 
doi:10.1109/ipdps.2006.1639534 dblp:conf/ipps/ZhouCTWWZB06 fatcat:ybwmxjol2nbsvjv7hizuii33yy

Descartes' fly: the geometry of genomic annotation

Junhyong Kim
2001 Functional & Integrative Genomics  
The rule of the game is how to generate the most efficient set of "handles" into the complexity of biologically relevant molecular classes.  ...  Many different algorithms and methods have been proposed for identifying biologically relevant features in genomic sequences.  ... 
doi:10.1007/s101420000025 pmid:11793243 fatcat:te62xn4wjbcsfmcegfvpuqjapy

Parallelization of Weighted Sequence Comparison by using EBWT [article]

Shashank Srikant
2011 arXiv pre-print
The Extended Burrows Wheeler transform (EBWT) helps to find the distance between two sequences. Implementation of an existing algorithm takes considerable amount of time for small size sequences.  ...  In this paper, we give a parallel implementation of this algorithm using NVIDIA Compute Unified Device Architecture (CUDA). We have obtained, on an average, a 2X improvement in the performance.  ...  However, this algorithm takes a considerable amount of time even to compare two small length sequences making comparison of molecular weighted sequences far less efficient.  ... 
arXiv:1011.0597v3 fatcat:x6yslbkxuvajbeon4raxvnewpu

Preface

Sorin Istrail, Pavel Pevzner, Ron Shamir
2007 Discrete Applied Mathematics  
Dynamic programming is used to obtain a sub-quadratic time algorithm for the problem. Monotonic correction factors are key to creating space-efficient  ...  Given positive and negative examples of sequences, the goal is to find a motif (position weight matrix) that discriminates between the two sets of examples.  ...  Today the development of biology and medicine depend to a tremendous extent on using computers for data handling and for sophisticated analysis.  ... 
doi:10.1016/j.dam.2006.09.001 fatcat:qjuijfdypvemhhvje7hagnxsru

Predicting subcellular localization of multisite proteins using differently weighted multi-label k-nearest neighbors sets

Zhongting Jiang, Dong Wang, Peng Wu, Yuehui Chen, Huijie Shang, Luyao Wang, Huichun Xie
2019 Technology and Health Care  
To improve the efficiency of predicting multiplex protein subcellular localization, we used the multi-label k-nearest neighbors algorithm and assigned different weights to various attributes.  ...  For a protein to execute its function, ensuring its correct subcellular localization is essential.  ...  Although there are many algorithms available for predicting PSL, most of these focus only on a single subcellular location of a protein sequence, and cannot handle proteins located at multiple sites.  ... 
doi:10.3233/thc-199018 pmid:31045538 pmcid:PMC6598103 fatcat:sgvguhkfrjan3bczgr7el55qqa