Cache equalizer
2011
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers - HiPEAC '11
Temporal pressure at the on-chip last-level cache is continuously collected at a group (comprised of cache sets) granularity, and periodically recorded at the memory controller to guide the placement process ...
Simulation results using a full-system simulator demonstrate that CE achieves an average L2 miss rate reduction of 13.6% over a shared NUCA scheme and by as much as 46.7% for the benchmark programs we ...
., Tilera's Tile64 and Intel's Teraflops Research Chip) that co-locate distributed cores with distributed cache banks in tiles communicating via a network on-chip (NoC) [12] . ...
doi:10.1145/1944862.1944889
dblp:conf/hipeac/HammoudCM11
fatcat:gzndgemzqzabtn4jdec2dmq5hi
FELI: HW/SW Support for On-Chip Distributed Shared Memory in Multicores
[chapter]
2011
Lecture Notes in Computer Science
It relies on a set of TLB counters, and dynamic migration of pages from off-chip memory to on-chip memory. ...
FELI can automatically allocate on-chip memory to an average of 90% of the application's working set. ...
Special thanks to the members of the Heterogeneous Architecture group at BSC and the anonymous reviewers for their comments and suggestions. ...
doi:10.1007/978-3-642-23400-2_27
fatcat:tr4recb4m5g67etayk7bj7gnpa
Open-Scale: A Scalable, Open-Source NOC-based MPSoC for Design Space Exploration
2011
2011 International Conference on Reconfigurable Computing and FPGAs
The main objective of this platform is to provide a complete framework for research development on NoC-based distributed memory MPSoCs. ...
As a consequence, one of the most promising embedded architectures consists of the replication of Processing Elements (PEs) connected through a Network-on-Chip (NoC). ...
INTRODUCTION The increasing complexity of application and higher performance demand make Multiprocessors System-on-Chip (MPSoCs) one valuable alternative for dealing with nowadays embedded requirements ...
doi:10.1109/reconfig.2011.66
dblp:conf/reconfig/BusseuilBAOBSBRT11
fatcat:qcj7mvj43rhwniavjnlbcxrfye
Managing QoS flows at task level in NoC-based MPSoCs
2009
2009 17th IFIP International Conference on Very Large Scale Integration (VLSI-SoC)
This work bridges the hardware/software gap, exploring the integration of low-level NoC services into an application programming interface (API). ...
An important issue in MPSoC design is QoS, since applications running in such systems may have tight timing constraints, such as video processing or fast communication protocols. ...
INTRODUCTION Multiprocessor systems-on-chips (MPSoCs) provide a huge design space exploration for applications with high computational demands. ...
doi:10.1109/vlsisoc.2009.6041343
fatcat:axvfwvk2infydaw2hgheoniydm
A Dynamic Pressure-Aware Associative Placement Strategy for Large Scale Chip Multiprocessors
2010
IEEE computer architecture letters
Temporal pressure at the on-chip last-level cache is continuously collected at a group (comprised of cache sets) granularity, and periodically recorded at the memory controller to guide the placement ...
Simulation results using a full-system simulator demonstrate that CE outperforms shared NUCA caches by an average of 15.5% and by as much as 28.5% for the benchmark programs we examined. ...
., Tilera's Tile64 and Intel's Teraflops Research Chip) that co-locate distributed cores with distributed cache banks in tiles communicating via a network on-chip (NoC) [11] . ...
doi:10.1109/l-ca.2010.7
fatcat:5obf374lfnbnzhyuy2qr2r4r2i
Locality-oblivious cache organization leveraging single-cycle multi-hop NoCs
2014
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems - ASPLOS '14
However, this complicates the problem of data tracking and search/invalidation; tracking the state of a line at all on-chip caches at a directory or performing full-chip broadcasts are both non-scalable ...
In this paper, we make the case for Locality-Oblivious Cache Organization (LOCO), a CMP cache organization that leverages the on-chip network to create virtual single-cycle paths between distant caches ...
network (STARnet), under the Center for Future Architectures (C-FAR) research center. ...
doi:10.1145/2541940.2541976
dblp:conf/asplos/KwonKP14
fatcat:fefauexc45hghcbkglxtokx4rq
C-AMTE: A location mechanism for flexible cache management in chip multiprocessors
2011
Journal of Parallel and Distributed Computing
This paper describes Constrained Associative-Mapping-of-Tracking-Entries (C-AMTE), a scalable mechanism to facilitate flexible and efficient distributed cache management in large-scale chip multiprocessors ...
C-AMTE enables fast locating of cache blocks in CMP cache schemes that employ one-to-one or one-to-many associative mappings. ...
., Tilera's Tile64 and Intel's Teraflops Research Chip) that co-locate distributed cores with distributed cache banks in tiles communicating via a network on-chip (NoC) [13] . ...
doi:10.1016/j.jpdc.2010.11.009
fatcat:qhcpugsfwrhkhnwxy5zu43vwi4
Feedback-Driven Restructuring of Multi-threaded Applications for NUCA Cache Performance in CMPs
2010
2010 22nd International Symposium on Computer Architecture and High Performance Computing
We show techniques for altering the distribution of applications into the cache space as to achieve improved average memory access time. ...
We consider a number of Splash-2 and Parsec benchmarks on an 8 processor system and we show that a relatively simple remapping algorithm is able to improve the average Static-NUCA (SNUCA) cache access ...
ACKNOWLEDGMENT The authors would like to thank their colleague Manuel Comparetti for the insightful discussions on NUCA caches, and for the tests performed on the simulator. ...
doi:10.1109/sbac-pad.2010.20
dblp:conf/sbac-pad/BartoliniFSP10
fatcat:vuxo5p555zacxmqx2kru3zbsye
Physical-aware system-level design for tiled hierarchical chip multiprocessors
2013
Proceedings of the 2013 ACM International Symposium on Physical Design - ISPD '13
In this work, the importance of physical-aware system-level exploration is investigated, and a strategy for deriving chip floorplans is described. ...
The combination of architectural exploration and physical planning is studied with an example and the impact of the physical aspects on the selection of architectural parameters is evaluated. ...
The given formulation is an example of the architectural exploration problem with the objective of efficiently distributing the chip resources among the components of a multi-core system, e.g. cores, memories ...
doi:10.1145/2451916.2451920
dblp:conf/ispd/CortadellaPNP13
fatcat:ufzyr2nvlzg7zd4og7af4oe46u
Exploiting multicast messages in cache-coherence protocols for NoC-based MPSoCs
2011
6th International Workshop on Reconfigurable Communication-Centric Systems-on-Chip (ReCoSoC)
The shift in the communication infrastructure, from buses to networks-on-chip (NoCs), adds new design challenges. ...
The main functionality NoCs may provide for the protocols is the way messages are sent through the network. Most NoCs support multicast as a set of unicast messages. ...
ACKNOWLEDGMENTS The Authors acknowledge the support of CNPq, projects 301599/2009-2 and 133526/2010-0, and FAPERGS project 10/0814-9. ...
doi:10.1109/recosoc.2011.5981492
dblp:conf/recosoc/ChavesCM11
fatcat:zrxbt3upi5djri3apzas5p4m2y
Manycore network interfaces for in-memory rack-scale computing
2015
Proceedings of the 42nd Annual International Symposium on Computer Architecture - ISCA '15
Our best manycore NI architecture achieves latencies within 3% of an idealized hardware NUMA and efficiently uses the full bisection bandwidth of the NOC, without changing the on-chip coherence protocol ...
Our results indicate that a careful splitting of NI functionality per chip tile and at the chip's edge along a NOC dimension enables a rack-scale architecture to optimize for both latency and bandwidth ...
Mirzadeh and the rest of the PARSA group for their feedback and support. ...
doi:10.1145/2749469.2750415
dblp:conf/isca/DaglisNBFG15
fatcat:sh6qqz6rkvdc3eb3emwovgpk7y
Manycore network interfaces for in-memory rack-scale computing
2015
SIGARCH Computer Architecture News
Our best manycore NI architecture achieves latencies within 3% of an idealized hardware NUMA and efficiently uses the full bisection bandwidth of the NOC, without changing the on-chip coherence protocol ...
Our results indicate that a careful splitting of NI functionality per chip tile and at the chip's edge along a NOC dimension enables a rack-scale architecture to optimize for both latency and bandwidth ...
Mirzadeh and the rest of the PARSA group for their feedback and support. ...
doi:10.1145/2872887.2750415
fatcat:hflmsptnsjhfdn6qoicsvqwude
SpiNNaker: A multi-core System-on-Chip for massively-parallel neural net simulation
2012
Proceedings of the IEEE 2012 Custom Integrated Circuits Conference
The basic block of the machine is the SpiNNaker multicore System-on-Chip, a Globally Asynchronous Locally Synchronous (GALS) system with 18 ARM968 processor nodes residing in synchronous islands, surrounded ...
The modelling of large systems of spiking neurons is computationally very demanding in terms of processing power and communication. ...
The die photo in Fig. 4(b) is courtesy of Unisem Europe Ltd. ...
doi:10.1109/cicc.2012.6330636
dblp:conf/cicc/PainkrasPGTDPCPF12
fatcat:cm5i4u3wa5ghffa52nxeqrynwa
Runtime Detection of a Bandwidth Denial Attack from a Rogue Network-on-Chip
2015
Proceedings of the 9th International Symposium on Networks-on-Chip - NOCS '15
This work explores a covert threat model for multi-processor system on chips designed using 3rd party NoCs. ...
NoC is an interconnect network for the glueless integration of on-chip components in the modern complex communication centric designs. ...
For example, an application with a heavy memory footprint has a strong dependence on the on-chip memory controllers, as well as, on SoC nodes with cache slices housing pertinent data. ...
doi:10.1145/2786572.2786580
dblp:conf/nocs/SACR15
fatcat:xnd254j74zdwbnfq62osgmz4cm
Dynamic thread and data mapping for NoC based CMPs
2009
Proceedings of the 46th Annual Design Automation Conference - DAC '09
Thread mapping and data mapping are two important problems in the context of NoC (network-on-chip) based CMPs (chip multiprocessors). ...
In this work, we present dynamic (runtime) thread and data mappings for NoC based CMPs. ...
For this purpose, we first present an application-specific, dynamic thread assignment strategy for NoC based CMP systems. ...
doi:10.1145/1629911.1630129
dblp:conf/dac/KandemirOM09
fatcat:xti3qbimcrbvngdakbdar5nco4