

RICH: implementing reductions in the cache hierarchy

Vladimir Dimić, Miquel Moretó, Marc Casas, Jan Ciesko, Mateo Valero
2020 Proceedings of the 34th ACM International Conference on Supercomputing  
RICH updates the reduction variable directly in the cache hierarchy with the help of added in-cache functional units.  ...  RICH does not modify the ISA, which allows the use of algorithms with reductions from pre-compiled external libraries.  ...  We thank to Vicenç Beltran and Sergi Mateo for sharing their valuable experience with reductions in the context of programming models, to Lluc Alvarez, Luc Jaulmes and Francesc Martinez for numerous technical  ... 
doi:10.1145/3392717.3392736 fatcat:57ph5q6wobbehmbigehddigppa

Dark Memory and Accelerator-Rich System Optimization in the Dark Silicon Era

Ardavan Pedram, Stephen Richardson, Mark Horowitz, Sameh Galal, Shahar Kvatinsky
2017 IEEE Design & Test
the DRAM and memory hierarchy are mostly idle.  ...  While this might seem like the magic bullet we need, for most CPU applications more energy is dissipated in the memory system than in the processor: these large gains in efficiency are only possible if  ...  While the register files and first level caches are duplicated with the compute units, the lower levels in the memory hierarchy (last level cache, and sometimes even the L2) are shared and so their area  ... 
doi:10.1109/mdat.2016.2573586 fatcat:hpt3dqfk55fwtk7iuwpz6gsspy

Enhancing Programmability, Portability, and Performance with Rich Cross-Layer Abstractions [article]

Nandita Vijaykumar
2019 arXiv   pre-print
In doing so, they enable a rich space of hardware-software cooperative mechanisms to optimize for performance.  ...  This thesis makes the case for rich low-overhead cross-layer abstractions as a highly effective means to address the above challenges.  ...  in the swap space in the memory hierarchy.  ... 
arXiv:1911.05660v1 fatcat:w5f3g4isqbcphm2jjfzjtvrjnq

Accelerator-Rich Architectures

Jason Cong, Mohammad Ali Ghodrat, Michael Gill, Beayna Grigorian, Karthik Gururaj, Glenn Reinman
2014 Proceedings of the 51st Annual Design Automation Conference - DAC '14
With respect to these areas, we review the progress of our research in the Center for Domain-Specific Computing (supported by the NSF Expeditions-in-Computing Award), and discuss ongoing work and additional  ...  In particular, we believe future architectures will make extensive use of accelerators to significantly reduce energy consumption.  ...  impressive speedup and energy reduction.  ... 
doi:10.1145/2593069.2596667 dblp:conf/dac/CongGGGGR14 fatcat:orli5tsdhbh3jke4bd3xwyl6ta

PARADE: A cycle-accurate full-system simulation Platform for Accelerator-Rich Architectural Design and Exploration

Jason Cong, Zhenman Fang, Michael Gill, Glenn Reinman
2015 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)  
The emerging accelerator-rich architecture is still in its early stage, and many design issues, such as the efficient accelerator resource management and communication between accelerators and CPU cores  ...  Finally, a few case studies are conducted to confirm that PARADE can enable various system-level design space explorations in the accelerator-rich architecture.  ...  First, the cache hierarchy is simulated by the Ruby [3] component in gem5 that supports various cache coherence protocols.  ... 
doi:10.1109/iccad.2015.7372595 dblp:conf/iccad/CongFGR15 fatcat:zdlwygtwnfcaxmkgsckn42ccui

Probabilistic Modelling of Morphologically Rich Languages [article]

Jan A. Botha
2015 arXiv   pre-print
This thesis investigates how the sub-structure of words can be accounted for in probabilistic models of language.  ...  This assumption does not fit morphologically complex language well, where words can have rich internal structure and sub-word elements are shared across distinct word forms.  ...  The hierarchies induced by these n-grams in both models can be visualised as trees, as shown in Figure 3.3.  ... 
arXiv:1508.04271v1 fatcat:6qhsfdbvt5emfiaumtwh2pzs7m

Deploying Video-on-Demand Services on Cable Networks

Matthew S. Allen, Ben Y. Zhao, Rich Wolski
2007 27th International Conference on Distributed Computing Systems (ICDCS '07)  
Hardware requirements become more substantial as the service providers increase the catalog size or number of subscribers.  ...  VoD allows subscribers to view any item in a large media catalog nearly-instantaneously.  ...  at two simple caching strategies that are implemented by the index servers.  ... 
doi:10.1109/icdcs.2007.98 dblp:conf/icdcs/AllenZW07 fatcat:fekpcwbtpbhlxjcpareadavk5m

High Performance Quantum Modular Multipliers [article]

Rich Rines, Isaac Chuang
2018 arXiv   pre-print
We additionally conduct an empirical resource analysis of our designs in order to determine the total gate count and circuit depth of each fully constructed circuit, with inputs as large as 2048 bits.  ...  Our comparative analysis considers both circuit implementations which allow for arbitrary (controlled) rotation gates, as well as those restricted to a typical fault-tolerant gate set.  ...  Acknowledgments The authors would particularly like to acknowledge Kevin Obenland at MIT Lincoln Laboratory, whose invaluable discussions, insight, and expertise in both the design of high-performance  ... 
arXiv:1801.01081v1 fatcat:euefn5f7yjf4dhw7mgv5qypccu

Parallel bounded analysis in code with rich invariants by refinement of field bounds

Nicolás Rosner, Juan Galeotti, Santiago Bermúdez, Guido Marucci Blas, Santiago Perez De Rosso, Lucas Pizzagalli, Luciano Zemín, Marcelo F. Frias
2013 Proceedings of the 2013 International Symposium on Software Testing and Analysis - ISSTA 2013  
TACO is a tool based on SAT-solving for efficient bugfinding in Java code with rich class invariants.  ...  The bounds computed by TACO generally include a substantial amount of nondeterminism; its reduction allows us to split the original analysis into disjoint subproblems.  ...  In Sections 3.2-3.4 we present optimizations that allow for significant reductions in the number of generated subproblems. In Section 3.5 we discuss some of the most relevant implementation details.  ... 
doi:10.1145/2483760.2483770 dblp:conf/issta/RosnerGBBRPZF13 fatcat:u5trj4i3jbbzljpclfvy6ldr7m

Data-rich astronomy: mining synoptic sky surveys [article]

Stefano Cavuoti
2013 arXiv   pre-print
In the last decade a new generation of telescopes and sensors has allowed the production of a very large amount of data and astronomy has become a data-rich science; this transition is often labeled as  ...  traditional approaches to data storage, data reduction and data analysis.  ...  Acknowledgements "Good company in a journey makes the way to seem the shorter." Izaak Walton.  ... 
arXiv:1304.6615v1 fatcat:6qmfvpl3czcr3hlupbc3r3hhfa

Lattice QCD and the Computational Frontier [article]

Peter Boyle, Dennis Bollweg, Richard Brower, Norman Christ, Carleton DeTar, Robert Edwards, Steven Gottlieb, Taku Izubuchi, Balint Joo, Fabian Joswig, Chulwoo Jung, Christopher Kelly (+6 others)
2022 arXiv   pre-print
Overcoming them is key to supporting the community effort required to deliver the needed theoretical support for experiments in the coming decade.  ...  They include algorithmic and software-engineering challenges, challenges in computer technology and design, and challenges in maintaining the necessary human resources.  ...  For example, modern Intel chips typically use a 64 B cache line in all layers of the cache hierarchy, while the IBM BlueGene/Q system made use of a 64 B L1 line size and a 128 B L2 line size.  ... 
arXiv:2204.00039v1 fatcat:7aksjpcx65cnblq3aqzlpsx4ae

Selectivity estimation for hybrid queries over text-rich data graphs

Andreas Wagner, Veli Bicer, Thanh D. Tran
2013 Proceedings of the 16th International Conference on Extending Database Technology - EDBT '13  
In our experiments on real-world data, we show that capturing dependencies between structured and textual data in this way greatly improves the accuracy of selectivity estimates without compromising the  ...  Many databases today are text-rich, comprising not only structured, but also textual data.  ...  In fact, some authors pointed out that the number of nominal values can be limited via clustering or, if possible, using feature hierarchies [10].  ... 
doi:10.1145/2452376.2452421 dblp:conf/edbt/WagnerBT13 fatcat:e2g7j4e2wvfa7nqagc7wdem6wy

Efficient architectural design space exploration via predictive modeling

Engin Ipek, Sally A. McKee, Karan Singh, Rich Caruana, Bronis R. de Supinski, Martin Schulz
2008 ACM Transactions on Architecture and Code Optimization (TACO)  
In an experimental study using the approach, training on 1% of a 250-K-point CMP design space allows our models to predict performance with only 4-5% error.  ...  We simulate sampled points, using results to teach our models the function describing relationships among design parameters.  ...  Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF or LLNL. The U.S.  ... 
doi:10.1145/1328195.1328196 fatcat:rmvgzcniyzhfvpcaruzgsevvha

The Layered Architecture of a System for Reasoning about Programs

Charles Rich
1985 International Joint Conference on Artificial Intelligence  
We also argue that a hybrid system in general is characterized by the use of multiple representations in the sense of multiple data abstractions, which does not necessarily imply distinct implementation  ...  The operation of Cake is illustrated by a complete trace of the solution of an example reasoning problem.  ...  X and Y in the Plan Calculus are implemented in the Truth Maintenance layer by the constraint clauses resulting from reduction of the following formulae to disjunctive normal form: The "data theory" underlying  ... 
dblp:conf/ijcai/Rich85 fatcat:o7qi4ntv3nfetihpkffecfm5am

Compilation and Simulation Tool Chain for Memory Aware Energy Optimizations [chapter]

Manish Verma, Lars Wehmeyer, Robert Pyka, Peter Marwedel, Luca Benini
2006 Lecture Notes in Computer Science  
In this paper, we present such a framework for performing memory hierarchy aware energy optimization. Both the compiler and the simulator are configured from a single memory hierarchy description.  ...  Significant savings of up to 50% in the total energy dissipation are reported.  ...  All components of the simulated memory hierarchy are implemented as abstract components.  ... 
doi:10.1007/11796435_29 fatcat:szae4j2y5ndstdjq2fo6ra6rv4