Function Merging by Sequence Alignment

Rodrigo C. O. Rocha, Pavlos Petoumenos, Zheng Wang, Murray Cole, Hugh Leather
2019 2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)  
We introduce a novel technique that can merge arbitrary functions through sequence alignment, a bioinformatics algorithm for identifying regions of similarity between sequences.  ...  However, production compilers only apply this optimization to identical functions, while research compilers improve on that by merging the few functions with identical control-flow graphs and signatures  ...  This work was supported by the Royal Academy of Engineering under the Research Fellowship scheme.  ... 
doi:10.1109/cgo.2019.8661174 dblp:conf/cgo/RochaP0CL19 fatcat:rpprdk7r4zesngerwcjisselty

Effective function merging in the SSA form

Rodrigo C. O. Rocha, Pavlos Petoumenos, Zheng Wang, Murray Cole, Hugh Leather
2020 Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation  
Function merging is an important optimization for reducing code size. This technique eliminates redundant code across functions by merging them into a single function.  ...  While a superficially minor workaround, this has a three-fold negative effect: by artificially lengthening the instruction sequences to be aligned, it hinders the identification of mergeable instruction;  ...  [14] advanced this simple merging strategy by exploiting the CFG isomorphism of two functions.  ... 
doi:10.1145/3385412.3386030 dblp:conf/pldi/RochaP0CL20 fatcat:am6isjg4lfbf5mrdou2xa3vxzi

Structural similarity to bridge sequence space: Finding new families on the bridges

Parantu K. Shah, Patrick Aloy, Peer Bork, Robert B. Russell
2005 Protein Science  
We explore this possibility here by preparing merged profiles for pairs of structurally similar, but not necessarily sequence-similar, domains within the SMART and Pfam database by way of the Structural  ...  New structures are often similar to those solved previously, and such similarities can give insights into function by linking poorly understood families to those that are better characterized.  ...  We then merge the separate alignments via a structure-based sequence alignment and use the new alignment to perform profile-based sequence database searches.  ... 
doi:10.1110/ps.041187405 pmid:15840833 pmcid:PMC2253280 fatcat:7u5k57i5kve77mr4zsuibwdfeq

A Sequence-Pair-Classification-Based Method For Detecting And Correcting Under-Clustered Gene Families [article]

Akshay Yadav, David Fernández-Baca, Steven B Cannon
2020 bioRxiv   pre-print
alignment score cutoffs.  ...  Further, using a simple merging strategy, we were able to merge 2,216 small families into 933 under-clustered families using the predicted missing sequences.  ...  alignment score cutoffs obtained in training. We also provide the containerized version of the tool which can be downloaded from  ... 
doi:10.1101/2020.02.22.942557 fatcat:g3opez4r3bhqnipqzo6pps6jqa

Generalized suffix trees for biological sequence data: applications and implementation

Bieganski, Riedl, Cartis, Retzel
1994 Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences HICSS-94  
Instead of laboriously searching sequences stored as arrays, we search by walking down the tree. We present a new GST-based sequence alignment algorithm, called GESTALT.  ...  GESTALT finds all exact matches in parallel, and uses best-first search to extend them to produce alignments.  ...  Most of them function by enumerating alignments of prefixes of the two sequences of increasing length.  ... 
doi:10.1109/hicss.1994.323593 fatcat:ocsqocmthjbndoawy7fpdhk6ua

Multiple alignment by sequence annealing

A. S. Schwartz, L. Pachter
2007 Bioinformatics  
This leads to a sequence annealing algorithm, which is an incremental method for building multiple sequence alignments one match at a time.  ...  Results: The sequence annealing algorithm performs well on benchmark test sets of protein sequences.  ...  The function from the set of sequence elements to the alignment poset that specifies the multiple alignment is not shown, but is fully specified by the diagram on the right.  ... 
doi:10.1093/bioinformatics/btl311 pmid:17237099 fatcat:kniyrq3kfnev5leopvndyjf74m

AlignStat: a web-tool and R package for statistical comparison of alternative multiple sequence alignments

Thomas Shafee, Ira Cooke
2016 BMC Bioinformatics  
Alternative sequence alignment algorithms yield different results. It is therefore useful to quantify the similarities and differences between alternative alignments of the same sequences.  ...  Results: Here we present a simple method for aligning two alternative multiple sequence alignments to one another and assessing their similarity.  ...  function. d Merges, splits and shifts in results matrix visualised by the plot_dissimilarity_proportions function  ... 
doi:10.1186/s12859-016-1300-6 pmid:27784265 pmcid:PMC5081975 fatcat:7t3qkigvxzahlmaxck33wsqqda

A MapReduce Framework for DNA Sequencing Data Processing

Samy Ghoneimy, Samir Abou El-Seoud
2016 International Journal of Recent Contributions from Engineering, Science & IT  
, merging, indexing, and generating alignments.  ...  In this paper MapReduce/Hadoop along with Burrows-Wheeler Aligner (BWA), Sequence Alignment/Map (SAM) tools, are fully utilized to provide various utilities for manipulating alignments, including sorting  ...  by the map() function of the third and final step of DNA sequencing, variant detection.  ... 
doi:10.3991/ijes.v4i4.6537 fatcat:ax3tuk5mc5dqdiuqw2bnkesvjy

A unique chromatin complex occupies young α-satellite arrays of human centromeres

J. G. Henikoff, J. Thakur, S. Kasinathan, S. Henikoff
2015 Science Advances  
We use high-resolution chromatin immunoprecipitation (ChIP) of centromere components followed by clustering of sequence data as an unbiased approach to identify functional centromere sequences.  ...  We find that specific dimeric α-satellite units shared by multiple individuals dominate functional human centromeres.  ...  For each base pair (i) in the reference sequence, the number of merged pairs aligned over it (n_i) was counted and normalized by dividing by the total number of merged pairs (N) and multiplying by the  ... 
doi:10.1126/sciadv.1400234 pmid:25927077 pmcid:PMC4410388 fatcat:n5sbl7l5cjenphm7ohuabwk4o4

An Integrated Physical, Genetic and Cytogenetic Map of Brachypodium distachyon, a Model System for Grass Research

Melanie Febrer, Jose Luis Goicoechea, Jonathan Wright, Neil McKenzie, Xiang Song, Jinke Lin, Kristi Collura, Marina Wissotski, Yeisoo Yu, Jetty S. S. Ammiraju, Elzbieta Wolny, Dominika Idziak (+6 others)
2010 PLoS ONE  
Mapped BACs were used in Fluorescence In Situ Hybridisation (FISH) experiments to align the integrated physical map and sequence assemblies to chromosomes with high resolution.  ...  sequences (BES).  ...  The remaining contigs were end-merged by ''End to End'' function and then singletons were added to the end of contigs by ''Singles to End'' function (cutoff of 1e-21).  ... 
doi:10.1371/journal.pone.0013461 pmid:20976139 pmcid:PMC2956642 fatcat:7pkx24v3fjd2zdtfxpymi37ykm

Fuse: multiple network alignment via data fusion

Vladimir Gligorijević, Noël Malod-Dognin, Nataša Pržulj
2015 Bioinformatics  
First, it computes our novel protein functional similarity scores by fusing information from wiring patterns of all aligned PPI networks and sequence similarities between their proteins.  ...  Our comprehensive new protein similarity scores are computed by Non-negative Matrix Tri-Factorization (NMTF) method that predicts associations between proteins whose homology (from sequences) and functioning  ...  We merge V_i with V_j into V_ij by identifying the mapped nodes u_k ↔ v_k and by creating a corresponding merged node u_k v_k ∈ V_ij.  ... 
doi:10.1093/bioinformatics/btv731 pmid:26668003 fatcat:5labegnmqbdzjnbtp6jpb746xu

FUSE: Multiple Network Alignment via Data Fusion [article]

Vladimir Gligorijević and Noël Malod-Dognin and Nataša Pržulj
2014 arXiv   pre-print
can be found by using protein sequence similarities alone.  ...  First, it computes novel similarity scores of proteins across the PPI networks by fusing from all aligned networks both the protein wiring patterns and their sequence similarities.  ...  conservation across all PPI networks being aligned, by using a triplet approach similar to the multiple sequence aligner, T-Coffee [30].  ... 
arXiv:1410.7585v2 fatcat:jfdt3a5ehffyznagy37rksm2si

An integrated environment for the prediction, annotation, and analysis of ncRNA

M.-J. Cros, A. de Monte, J. Mariette, P. Bardou, B. Grenier-Boley, D. Gautheret, H. Touzet, C. Gaspin
2011 RNA: A publication of the RNA Society  
Alignments are merged in a greedy fashion, starting from the first alignment on the query sequence, from 5′ to 3′.  ...  A functionality is already implemented to help users reduce redundancy by merging overlapping regions. This functionality will be improved in the future.  ... 
doi:10.1261/rna.2844911 pmid:21947200 pmcid:PMC3198588 fatcat:44e23cc6srdaxgphmfzw5xzwo4

Parallel and Scalable Precise Clustering for Homologous Protein Discovery [article]

Stuart Byma, Akash Dhasade, Adrian Altenhoff, Christophe Dessimoz, James R. Larus
2019 arXiv   pre-print
Clustering is a technique to reduce the number of comparisons needed to find similar pairs in a set of n elements such as protein sequences.  ...  Precise clustering ensures that each pair of similar elements appears together in at least one cluster, so that similarities can be identified by all-to-all comparison in each cluster rather than on the  ...  The primary compute bottleneck is the process of aligning representative sequences using Smith-Waterman, which processes data that fits in the L1 cache and is able to saturate functional units with a single  ... 
arXiv:1908.10574v1 fatcat:y7p357disvfflfrjqxbf55h46a

Internal organization of large protein families: Relationship between the sequence, structure, and function-based clustering

Xiao-Hui Cai, Lukasz Jaroszewski, John Wooley, Adam Godzik
2011 Proteins: Structure, Function, and Bioinformatics  
In contrast, the still commonly used sequence-based methods with fixed thresholds result in vast overestimates of structural and functional diversity in protein families.  ...  functions.  ...  ' between two sequence clusters and the number of distinct structural (or functional) clusters 'merged' into the same sequence clusters are comparable.  ... 
doi:10.1002/prot.23049 pmid:21671455 pmcid:PMC3132221 fatcat:lzhctud7rja7xf6quo3dbkrcqu