1,316 Hits in 3.2 sec

Cache-Friendly implementations of transitive closure

Michael Penner, Viktor K. Prasanna
2007 ACM Journal of Experimental Algorithmics  
In this paper we show cache-friendly implementations of the Floyd-Warshall algorithm for the All-Pairs Shortest-Path problem.  ...  We also develop a general representation, the Unidirectional Space Time Representation, which can be used to generate cache-friendly implementations for a large class of algorithms.  ...  Experimentally, the best tile size for the USTR optimization of transitive closure on our Pentium III was found to be β = 140. A Cache-Friendly Algorithm for Transitive Closure.  ... 
doi:10.1145/1187436.1210586 fatcat:a362wrijnfglloi5q4glj5gl7u
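
As a rough illustration of what a tiled (blocked) Floyd-Warshall transitive closure can look like, here is a minimal sketch. It follows the standard three-phase blocked formulation and is not the authors' USTR-based implementation; the function names and the default tile size are assumptions made for this example. The snippet above reports β = 140 as the best tile size on the authors' Pentium III, a value that is machine-specific.

```python
def update_tile(reach, i0, j0, k0, tile):
    """Relax the tile with top-left corner (i0, j0) using intermediates k0..k0+tile-1."""
    n = len(reach)
    for k in range(k0, min(k0 + tile, n)):
        row_k = reach[k]
        for i in range(i0, min(i0 + tile, n)):
            row_i = reach[i]
            if row_i[k]:  # i reaches k, so i reaches whatever k reaches
                for j in range(j0, min(j0 + tile, n)):
                    if row_k[j]:
                        row_i[j] = True


def tiled_transitive_closure(adj, tile=64):
    """Return the boolean reachability matrix (paths of length >= 1) of adj."""
    n = len(adj)
    reach = [[bool(x) for x in row] for row in adj]  # work on a copy
    starts = range(0, n, tile)
    for k0 in starts:
        # Phase 1: the diagonal tile depends only on itself.
        update_tile(reach, k0, k0, k0, tile)
        # Phase 2: tiles sharing a row or column with the diagonal tile.
        for b0 in starts:
            if b0 != k0:
                update_tile(reach, k0, b0, k0, tile)
                update_tile(reach, b0, k0, k0, tile)
        # Phase 3: all remaining tiles.
        for i0 in starts:
            if i0 == k0:
                continue
            for j0 in starts:
                if j0 != k0:
                    update_tile(reach, i0, j0, k0, tile)
    return reach
```

Each call to `update_tile` touches at most three tiles, so choosing `tile` such that three tiles fit in cache is the usual rule of thumb; the right value has to be measured on the target machine.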

Cache-friendly implementations of transitive closure

M. Penner, V.K. Prasanna
Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques  
In this paper we show cache-friendly implementations of the Floyd-Warshall algorithm for the All-Pairs Shortest-Path problem.  ...  We also develop a general representation, the Unidirectional Space Time Representation, which can be used to generate cache-friendly implementations for a large class of algorithms.  ...  Experimentally, the best tile size for the USTR optimization of transitive closure on our Pentium III was found to be β = 140. A Cache-Friendly Algorithm for Transitive Closure.  ... 
doi:10.1109/pact.2001.953299 dblp:conf/IEEEpact/PennerP01 fatcat:73kbritkxfgmxgkpfj4io5k5fi

Optimizing graph algorithms for improved cache performance

J.-S. Park, M. Penner, V.K. Prasanna
2004 IEEE Transactions on Parallel and Distributed Systems  
We develop new implementations by means of these two techniques for the fundamental graph problem of Transitive Closure, namely the Floyd-Warshall Algorithm, and prove their optimality with respect to  ...  For these algorithms, we demonstrate up to a 2x improvement by using a cache friendly graph representation.  ...  In Section 2.2 we discuss some of the challenges that are faced in making the transitive closure problem cache-friendly.  ... 
doi:10.1109/tpds.2004.44 fatcat:knegwttiirgzbon755jkezo3ni

Optimizing graph algorithms for improved cache performance

Joon-Sang Park, M. Penner, V. K. Prasanna
2002 Proceedings 16th International Parallel and Distributed Processing Symposium  
We develop new implementations by means of these two techniques for the fundamental graph problem of Transitive Closure, namely the Floyd-Warshall Algorithm, and prove their optimality with respect to  ...  For these algorithms, we demonstrate up to a 2x improvement by using a cache friendly graph representation.  ...  In Section 2.2 we discuss some of the challenges that are faced in making the transitive closure problem cache-friendly.  ... 
doi:10.1109/ipdps.2002.1015509 dblp:conf/ipps/ParkPP02 fatcat:kyyyfuqjfjeolepbasdy3grpie

Inferray

Julien Subercaze, Christophe Gravier, Jules Chevalier, Frederique Laforest
2016 Proceedings of the VLDB Endowment  
Our measurements on synthetic and real-world datasets show improvements over competitors on RDFS-Plus, and up to several orders of magnitude for transitivity closure.  ...  This paper presents Inferray, an implementation of RDFS, ρDF, and RDFS-Plus inference with improved performance over existing solutions.  ...  The authors would also like to thank Satish Nadathur for his help on sorting algorithms, Jacopo Urbani for his help with WebPIE and QueryPIE and the authors of RDFox for their help in configuring their  ... 
doi:10.14778/2904121.2904123 fatcat:6ncfo5nx2zbydihf4dbxlzgc6q

One-bit counts between unique and sticky

David J. Roth, David S. Wise
1999 SIGPLAN notices  
The first, suited to pure register transactions, is a cache of referents to two shared references.  ...  The analog of Deutsch's and Bobrow's multiple-reference table, this cache is sufficient to manage small counts across successive assignment statements.  ...  The friendly metaphor of ecological recycling implies a more local reuse of freed space, enhancing locality.  ... 
doi:10.1145/301589.286866 fatcat:62q7qtsgcvcxdagiurrueeeccm

One-bit counts between unique and sticky

David J. Roth, David S. Wise
1998 Proceedings of the first international symposium on Memory management - ISMM '98  
The first, suited to pure register transactions, is a cache of referents to two shared references.  ...  The analog of Deutsch's and Bobrow's multiple-reference table, this cache is sufficient to manage small counts across successive assignment statements.  ...  The friendly metaphor of ecological recycling implies a more local reuse of freed space, enhancing locality.  ... 
doi:10.1145/286860.286866 dblp:conf/iwmm/RothW98 fatcat:64kswjisbvf2pdlfyv3dlh6eca

Experimenting with ELK Reasoner on Android

Yevgeny Kazakov, Pavel Klinov
2013 International Workshop on OWL Reasoner Evaluation  
The paper emphasizes the engineering aspects of ELK's design and implementation which make this performance possible.  ...  The results show that economic and well-engineered ontology reasoners can demonstrate acceptable performance when classifying ontologies with thousands of axioms and take advantage of multi-core CPUs of  ...  ELK provides a custom array-based, cache-friendly hashtable implementation with linear probing which is fine-tuned for small sets and supports very fast lookup and iteration (as, consequently, intersection  ... 
dblp:conf/ore/KazakovK13 fatcat:ayizdz23brh6jcntivkyqmz7ge

A few bits are enough - ASIC friendly Regular Expression matching for high speed network security systems

Alex X. Liu, Eric Norige, Sailesh Kumar
2013 2013 21st IEEE International Conference on Network Protocols (ICNP)  
Compared with XFA, HASIC advances the state of the art because it can be fully automated and it is ASIC friendly. HASIC only uses three simple bit operations and they are easy to implement in ASIC.  ...  : (1) XFA construction is hard to automate as it requires manual annotation by human experts, and (2) XFA is hard to implement in ASIC as the program executed upon reaching a state requires much of the  ...  or more closures than the number of RegExes.  ... 
doi:10.1109/icnp.2013.6733572 dblp:conf/icnp/LiuNK13 fatcat:lyfrwvhwerf4rexoyqngxs4tse

Coping with Inconsistent Models of Requirements

Juha Tiihonen, Mikko Raatikainen, Lalli Myllyaho, Clara Marie Lüders, Tomi Männistö
2019 International Configuration Workshop  
The research methodology follows the principles of Design Science: we built a prototype implementation for the approach and tested it with relevant use cases.  ...  However, holistic support for managing the consistency of a set of requirements such as a release is largely missing.  ...  We thank Elina Kettunen, Miia Rämö and Tomi Laurinen for their contributions to implementation.  ... 
dblp:conf/confws/TiihonenRMLM19 fatcat:p5prik3z7jhrnaftazy74hjkz4

Semeru: A Memory-Disaggregated Managed Runtime

Chenxi Wang, Haoran Ma, Shi Liu, Yuanqi Li, Zhenyuan Ruan, Khanh Nguyen, Michael D. Bond, Ravi Netravali, Miryung Kim, Guoqing Harry Xu
2020 USENIX Symposium on Operating Systems Design and Implementation  
[Figure: (a) Universal Java Heap, with RDMA over InfiniBand; (b) State machine of a virtual page]  ...  An evaluation of Semeru on a set of widely-deployed systems shows very promising results.  ...  Thus the transitive closure may include dead objects (due to pointer changes the memory server is not aware of), but objects not in the closure are guaranteed to be dead (except for newly allocated objects  ... 
dblp:conf/osdi/WangMLLRNBNKX20 fatcat:mleexavtujcwjjovvu2ffesdtu

Solving path problems on the GPU

Aydın Buluç, John R. Gilbert, Ceren Budak
2010 Parallel Computing  
The blocked recursive elimination strategy we use is applicable to a class of algorithms (such as all-pairs shortest-paths, transitive closure, and LU decomposition without pivoting) having similar data  ...  We implemented a recursively partitioned all-pairs shortest-paths algorithm that harnesses the power of GPUs better than existing implementations.  ...  Sivan Toledo helped us improve the presentation of the paper with various comments. Thanks to Fenglin Liao and Arda Atali for their help during the initial implementation on Cuda.  ... 
doi:10.1016/j.parco.2009.12.002 fatcat:gpdffk6s4fa4tifrawtmq5x22a

Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines [article]

Michael Kuchnik and Ana Klimovic and Jiri Simsa and Virginia Smith and George Amvrosiadis
2022 arXiv   pre-print
By automating caching, Plumber obtains end-to-end speedups of over 50% compared to state-of-the-art tuners.  ...  However, it is challenging to implement efficient input pipelines, as it requires reasoning about parallelism, asynchrony, and variability in fine-grained profiling information.  ...  The transitive closure, $f \xrightarrow{+} s$, measures if any child functions of f touches a random seed. If $f \xrightarrow{+} s$ is true, then we cannot cache f or any operations following it.  ... 
arXiv:2111.04131v2 fatcat:p3ebaozf4ffzjgakckfgmzqp3e
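
The transitive-closure check mentioned in the Plumber snippet can be pictured as a reachability query on a call graph. The sketch below is only an illustration under assumptions: the graph encoding (a dict of adjacency lists), the `SEED` marker node, and all function names are hypothetical and are not taken from Plumber itself.

```python
SEED = "<random-seed>"  # hypothetical marker node for "touches a random seed"


def reaches(graph, start, target):
    """True if target is reachable from start via one or more edges."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue  # already explored; also guards against cycles
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False


def is_cacheable(graph, func):
    """A function is treated as cacheable only if it never reaches the seed."""
    return not reaches(graph, func, SEED)


# Hypothetical call graph: map_fn -> augment -> random_crop -> seed.
call_graph = {
    "map_fn": ["augment"],
    "augment": ["random_crop", "resize"],
    "random_crop": [SEED],
    "resize": [],
    "parse": ["decode"],
    "decode": ["parse"],  # a cycle, to show the traversal terminates
}
```

With this graph, `is_cacheable(call_graph, "parse")` holds, while `is_cacheable(call_graph, "map_fn")` does not, because `map_fn` transitively reaches the seed through `random_crop`.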

BioGateway: a semantic systems biology tool for the life sciences

Erick Antezana, Ward Blondé, Mikel Egaña, Alistair Rutherford, Robert Stevens, Bernard De Baets, Vladimir Mironov, Martin Kuiper
2009 BMC Bioinformatics  
We call for the creation of a forum that strives to implement a truly semantic life science foundation for Semantic Systems Biology.  ...  Results: We implemented a semantically integrated resource named BioGateway, comprising the entire set of the OBO foundry candidate ontologies, the GO annotation files, the SWISS-PROT protein set, the  ...  Transitive closures: To increase the utility of the RDF representation, transitive closures were added programmatically with the use of the Ontolome module from ONTO-PERL [79].  ... 
doi:10.1186/1471-2105-10-s10-s11 pmid:19796395 pmcid:PMC2755819 fatcat:r4mmbhi3vrenji3dh5kekdz2ge

Rules for mobile performance optimization

Tammy Everts
2013 Communications of the ACM  
An overview of techniques to speed page loading. Tammy Everts, Radware. Performance has always been crucial to the success of Web sites.  ...  Even delays of less than one second significantly affect revenues.  ...  The Closure Compiler from Google does an incredible job of understanding and minifying JavaScript.  ... 
doi:10.1145/2492007.2492024 fatcat:vjaqyr62kvabfa77kzgamfx3xm