Analysis and performance results of computing betweenness centrality on IBM Cyclops64

Guangming Tan, Vugranam C. Sreedhar, Guang R. Gao
2009 Journal of Supercomputing  
By identifying several key architectural features, we propose and evaluate efficient strategies for achieving scalability on a massive multi-threading many-core architecture.  ...  We demonstrate several optimization strategies including multi-grain parallelism, just-in-time locality with explicit memory hierarchy and nonpreemptive thread execution, and fine-grain data synchronization  ...  IBM Cyclops64 represents a new class of many-core architecture featuring a shared address space for on-chip memory between cores and explicit addressing without cache.  ... 
doi:10.1007/s11227-009-0339-9 fatcat:rtslfwssvzasleg55erlq6mpoa

Multi-log processor - towards scalable event-driven multiprocessors

V. Viswanath
2004 Euromicro Symposium on Digital System Design, 2004. DSD 2004.  
Given billion transistors on a single chip, the Multi-¥ ¦ I architecture would have ¥ ¦ s on chip, while the Multi-¥ ¦ II architecture would allow for ! ¥ ¦ s on chip.  ...  Both multiprocessors are implemented by a large collection of ALUs with controllers and on chip speculative L0 caches (together called ¥ ¦ s) connected together by a network of parallel-prefix tree circuits  ...  Each processor core has a full fledged ALU, an event handler, and a speculative local cache memory (L0).  ... 
doi:10.1109/dsd.2004.1333288 dblp:conf/dsd/Viswanath04 fatcat:x65exr73u5dprn3akdyct46agu

Runtime-Aware Architectures: A First Approach

2014 Supercomputing Frontiers and Innovations  
With the irruption of multi-cores and parallel applications, this simple interface started to leak.  ...  Current multi-cores are designed as simple symmetric multiprocessors (SMP) on a chip. However, we believe that this is not enough to overcome all the problems that multi-cores already have to face.  ...  This work has been partially supported by the Spanish Ministry of Science and Innovation under grant TIN2012-34557, the HiPEAC Network of Excellence, and by the European Research Council under the European  ... 
doi:10.14529/jsfi140102 fatcat:4bh33566cfbz7iylsf2ufppsfa

Characterizing Betweenness Centrality Algorithm on Multi-core Architectures

Dengbiao Tu, Guangming Tan
2009 2009 IEEE International Symposium on Parallel and Distributed Processing with Applications  
This paper presents an in-depth analysis of characterization for an irregular application, computing betweenness centrality (BC), on multi-core architectures.  ...  Finally, several implications on multi-core architecture and programming are proposed.  ...  On-chip memory hierarchy, limited on-chip memory per core, and other features in such architectures make the problem even more difficult.  ... 
doi:10.1109/ispa.2009.18 dblp:conf/ispa/TuT09 fatcat:bkmlcevsrnf7xoaipv3o4jsgzq

High-Performance Embedded Architecture and Compilation Roadmap [chapter]

Koen De Bosschere, Wayne Luk, Xavier Martorell, Nacho Navarro, Mike O'Boyle, Dionisios Pnevmatikatos, Alex Ramirez, Pascal Sainrat, André Seznec, Per Stenström, Olivier Temam
2007 Lecture Notes in Computer Science  
One of the key deliverables of the EU HiPEAC FP6 Network of Excellence is a roadmap on high-performance embedded architecture and compilation, the HiPEAC Roadmap for short.  ...  The HiPEAC roadmap is organized around 10 central themes: (i) single core architecture, (ii) multi-core architecture, (iii) interconnection networks, (iv) programming models and tools, (v) compilation,  ...  Challenge 2.2: On-Chip Interconnects and Memory Subsystem The critical infrastructure to host a large core count (say 100-1000 cores in ten years from now) consists of the on-chip memory subsystem and  ... 
doi:10.1007/978-3-540-71528-3_2 fatcat:ywmebvj7wrfb3ojghsjs4w3fy4

Memory Architecture and Management in an NoC Platform [chapter]

Axel Jantsch, Xiaowen Chen, Abdul Naeem, Yuang Zhang, Sando Penolazzi, Zhonghai Lu
2011 Scalable Multi-core Architectures  
on-chip and off-chip memory technology.  ...  The DME's main functions are virtual address translation, private and shared memory management, cache coherence protocol, support for memory consistency models, synchronization and protection mechanisms  ...  We propose a programmable hardware block, called a Data Management Engine, that supports this architectural adaptation in multiple ways.  ... 
doi:10.1007/978-1-4419-6778-7_1 fatcat:ja4wt52fnzb43okhyabajri33a

Enhancing Cache Coherent Architectures with access patterns for embedded manycore systems

Jussara Marandola, Stephane Louise, Loic Cudennec, Jean-Thomas Acquaviva, David A. Bader
2012 2012 International Symposium on System on Chip (SoC)  
The high performance hardware component in our context is aimed at CMP (Chip Multi-Processing) and MPSoC (Multiprocessor System-on-Chip).  ...  In this paper, we present a Cache Coherent Architecture that optimizes memory accesses to patterns using both a hardware component and specialized instructions.  ...  The base architecture is a multi-core system, each core fitted with its memory hierarchy (L1, L2), Directory and Pattern Table, and all cores have access to a Network on Chip (NoC) that permits each  ... 
doi:10.1109/issoc.2012.6376369 dblp:conf/issoc/MarandolaLCAB12 fatcat:3ovjidxtgfendpydlbes4uzqgu

Using a configurable processor generator for computer architecture prototyping

Alex Solomatnikov, Amin Firoozshahian, Ofer Shacham, Zain Asgar, Megan Wachs, Wajahat Qadeer, Stephen Richardson, Mark Horowitz
2009 Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture - Micro-42  
to successfully tape out an 8-core CMP chip with only a small group of designers.  ...  Building hardware prototypes for computer architecture research is challenging.  ...  The authors also would like to thank Han Chen, Kyle Kelley, Francois Labonte, Jacob Chang and Don Stark for their help and support.  ... 
doi:10.1145/1669112.1669159 dblp:conf/micro/SolomatnikovFSAWQRH09 fatcat:64ix2vqwsjaxhl5abuqvrpqlr4

Smart Memories

Ken Mai, Tim Paaske, Nuwan Jayasena, Ron Ho, William J. Dally, Mark Horowitz
2000 Proceedings of the 27th annual international symposium on Computer architecture - ISCA '00  
A Smart Memories chip is made up of many processing tiles, each containing local memory, local interconnect, and a processor core.  ...  Simulations of the mappings show that the Smart Memories architecture can successfully map these architectures with only modest performance degradation.  ...  The Imagine bandwidth hierarchy consists of off-chip DRAM, an on-chip stream register file (SRF), and local register files (LRFs) in the datapath.  ... 
doi:10.1145/339647.339673 fatcat:5h25fekfvbhxdj3tsox74djxl4

Sunway supercomputer architecture towards exascale computing: analysis and practice

Jiangang Gao, Fang Zheng, Fengbin Qi, Yajun Ding, Hongliang Li, Hongsheng Lu, Wangquan He, Hongmei Wei, Lifeng Jin, Xin Liu, Daoyong Gong, Fei Wang (+5 others)
2021 Science China Information Sciences  
In recent years, the improvements of system performance and energy efficiency for supercomputers have faced increasing challenges, which create more intensive demands on the architecture design for realizing  ...  system, system software, parallel algorithm and application support, promising great advances for exascale supercomputing.  ...  An architecture with global asynchronization and local synchronization is designed for SW many-core processor.  ... 
doi:10.1007/s11432-020-3104-7 fatcat:ocmhnpa2dng2lhqhldgbcdfw2a

Approximate 32-bit floating-point unit design with 53% power-area product reduction

Vincent Camus, Jeremy Schlachter, Christian Enz, Michael Gautschi, Frank K. Gurkaynak
2016 ESSCIRC Conference 2016: 42nd European Solid-State Circuits Conference  
different approximate FPUs and one reference IEEE-754 compliant FPU have been integrated in a 65 nm CMOS process within a low-power multi-core processor.  ...  By combining two state-of-the-art techniques of imprecise hardware, namely Gate-Level Pruning and Inexact Speculative Adder, and by introducing a novel Inexact Speculative Multiplier architecture, three  ...  ACKNOWLEDGMENT The authors would like to thank the Integrated Systems Laboratory at ETHZ for supporting the fabrication costs and for providing support and equipments for the design and measurement of  ... 
doi:10.1109/esscirc.2016.7598342 dblp:conf/esscirc/CamusSEGG16 fatcat:clvvx2d2zzgp7nim53xp3hiday

Architectural Support for Fault Tolerance in a Teradevice Dataflow System

Sebastian Weis, Arne Garbade, Bernhard Fechner, Avi Mendelson, Roberto Giorgi, Theo Ungerer
2014 International journal of parallel programming  
Therefore, future many-core systems will require fault-tolerance techniques, which are capable to scale with the number of cores and the increasing failure probability on a chip in conjunction with a reasonable  ...  will further raise [43], making faults in present multi-core and future many-core systems unavoidable.  ...  Popovic for their initial studies on the DTA-C architecture and P. Faraboschi of HP for his precious suggestions and support on the COTSon simulator.  ... 
doi:10.1007/s10766-014-0312-y fatcat:kygdzmqyvrbonia2cu7n4glnsu

An overview about Networks-on-Chip with multicast suppor [article]

Marcelo Daniel Berejuck
2016 arXiv   pre-print
Modern System-on-Chip (SoC) platforms typically consist of multiple processors and a communication interconnect between them.  ...  This paper presents an overview of research on NoC with support for multicast communication and delineates the major issues addressed so far by the scientific community in this investigation area.  ...  For MPSoC based on NoC, a core is typically a processor with some amount of local memory.  ... 
arXiv:1610.00751v1 fatcat:rysfjplkcndtho2d4wa7664hwq

Evaluating CMPs and Their Memory Architecture [chapter]

Chris Jesshope, Mike Lankamp, Li Zhang
2009 Lecture Notes in Computer Science  
Many-core processor architectures require scalable solutions that reflect the locality and power constraints of future generations of technology.  ...  This paper presents a CMP architecture that supports automatic mapping and dynamic scheduling of threads leaving the binary code devoid of any explicit communication.  ...  Acknowledgements We acknowledge support for this work from NWO in the project Microgrids and from the EU in the project Apple-CORE.  ... 
doi:10.1007/978-3-642-00454-4_24 fatcat:gnkfu73cpnca3mary6xbru6ie4

A survey of new research directions in microprocessors

J. Šilc, T. Ungerer, B. Robic
2000 Microprocessors and microsystems  
Technological advances will replace the gate delay by on-chip wire delay as the main obstacle to increase the chip complexity and cycle rate.  ...  Multiscalar and trace processors define several processing cores that speculatively execute different parts of a sequential program in parallel.  ...  Acknowledgements We thank the referees of this paper for many helpful comments.  ... 
doi:10.1016/s0141-9331(00)00072-7 fatcat:55y6n4wzijaeppl3l5qp6x2koa