IsoNet: Hardware-Based Job Queue Management for Many-Core Architectures
2013
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Thus, IsoNet is a network tasked with maintaining equal load between the processing cores of the CMP. ...
IsoNet is a lightweight job queue manager responsible for administering the list of jobs to be executed, and maintaining load balance among all CMP cores. ...
Section II serves as a concise preamble on the basics of parallel programming on modern CMPs and provides motivation on the need for hardware-based dynamic load balancing. ...
doi:10.1109/tvlsi.2012.2202699
fatcat:kcd2je7dave6dmyx6rjmze3g6q
Fast and Memory-Efficient Traffic Classification with Deep Packet Inspection in CMP Architecture
2010
2010 IEEE Fifth International Conference on Networking, Architecture, and Storage
Furthermore, we design a fast and memory-efficient system of a two-layer architecture for traffic classification with the help of regular expressions in multi-core architecture, which is different from ...
The classic way to identify flows, e.g., examining the port numbers in the packet headers, becomes ineffective. ...
We design a new load balancing algorithm based on static preallocation and adaptive adjustments, which could be used in network flow-based parallel processing system in CMP architecture. ...
doi:10.1109/nas.2010.43
dblp:conf/nas/LiuSG10
fatcat:4w3i67ejyfdmppejmi4ap5nn24
The Cell Broadband Engine: Exploiting Multiple Levels of Parallelism in a Chip Multiprocessor
2007
International Journal of Parallel Programming
in many designs fails to exploit all the levels of available parallelism in many workloads for CMP systems. ...
By taking advantage of opportunities at all levels of the system, this CMP revolutionizes parallel architectures to deliver previously unattained levels of single chip performance. ...
In the process, a range of new innovative solutions has been proposed and implemented, from CMPs based on homogeneous single ISA systems (Piranha, Cyclops, Xbox360), to heterogeneous multi-ISA systems ( ...
doi:10.1007/s10766-007-0035-4
fatcat:zcndew73k5bevgjnuc3h3lbtya
Scalability of Multimedia Applications on Next-Generation Processors
2006
2006 IEEE International Conference on Multimedia and Expo
In particular, the study discusses the decomposition method, load balancing, synchronization primitives, interaction with the operating system and hardware issues such as cache hierarchy and memory bandwidth ...
In the near future, the majority of personal computers are expected to have several processing units. This is referred to as Core Multiprocessing (CMP). ...
Yefet for their contributions to this research. ...
doi:10.1109/icme.2006.262503
dblp:conf/icmcs/AmitCVP06
fatcat:fk3dzpldjbfoxl6myovc4zzdkm
Evaluation Scheme for NoC-based CMP with Integrated Processor Management System
2010
International Journal of Electronics and Telecommunications
Analyzed results reveal that CMP with a PA controlled by IFF allocation algorithm for mesh systems and torus-based NoC driven by DORLB routing with express-virtual-channel flow control achieved the best ...
With the opportunities and benefits offered by Chip Multiprocessors (CMPs), there are many challenges that need to be addressed ...
INTRODUCTION: Tiled Chip Multiprocessor (CMP) architectures with many Processing Elements (PEs) integrated on one die attract a lot of attention and are a dominant trend in parallel processing systems ...
doi:10.2478/v10177-010-0021-4
fatcat:cqeqzba2vrgi7lovnnxw475svq
Event-driven configuration of a neural network CMP system over an homogeneous interconnect fabric
2011
Parallel Computing
SpiNNaker is a parallel Chip Multiprocessor (CMP) system for neural network (NN) simulation. ...
Configuring a million-core parallel system at boot time is a difficult process when the system has neither specialised hardware support for the configuration process nor a preconfigured default state that ...
A system for neural network simulation will be, correspondingly, architecturally different from parallel systems designed mostly for general-purpose computing. ...
doi:10.1016/j.parco.2011.02.003
fatcat:xznsapdhsne4xemiwgns2aop34
Energy characteristic of a processor allocator and a network-on-chip
2011
International Journal of Applied Mathematics and Computer Science
Simulation results show that a PA with an IFF allocation algorithm for mesh systems and a torus-based NoC with express-virtual-channel flow control are very energy efficient. ...
Besides efficient on-chip processing elements, a well-designed Processor Allocator (PA) and a Network-on-Chip (NoC) are also important factors in the energy budget of novel CMPs. ...
Acknowledgment This work has been supported by the European Union in the framework of the European Social Fund through the Warsaw University of Technology Development Programme, and by the Ministry of ...
doi:10.2478/v10006-011-0029-7
fatcat:cby3q2ucczczzpdknrm5lwk2ly
Location of Processor Allocator and Job Scheduler and Its Impact on CMP Performance
2012
International Journal of Electronics and Telecommunications
We present energy models for the researched CMP components, a mathematical model of the system, and an experimentation system. ...
Processors that are being developed and used as nodes in HPC systems are Chip Multiprocessors (CMPs) with a number of cores. ...
We investigate NoC architectures with: 1) 2D-Mesh and 2D-Torus topologies, 2) Virtual-Channel (VC) and Express Virtual-Channel (EVC) flow controls, 3) Dimensional Order Routing with Load Balance extension ...
doi:10.2478/v10177-012-0001-y
fatcat:pzfc77eyhzeg3g4jko4l643cka
Characterizing processor architectures for programmable network interfaces
2000
Proceedings of the 14th international conference on Supercomputing - ICS '00
To date, no performance data exist to aid in the decision of what processor architecture to use in next generation network processor. Our goal is to remedy this situation. ...
The network interface environment is simulated in detail, and our results indicate that SMT is the architecture best suited to this environment. ...
., flow management) and emerging applications (HTTP load balancing) to round out our network processor workloads. ...
doi:10.1145/335231.335237
dblp:conf/ics/CrowleyFBB00
fatcat:zf23lcuw6fflfpbkppcbkzxnqa
Characterizing processor architectures for programmable network interfaces
2014
25th Anniversary International Conference on Supercomputing Anniversary Volume
To date, no performance data exist to aid in the decision of what processor architecture to use in next generation network processor. Our goal is to remedy this situation. ...
The network interface environment is simulated in detail, and our results indicate that SMT is the architecture best suited to this environment. ...
(HTTP load balancing) to round out our network processor workloads. ...
doi:10.1145/2591635.2667178
fatcat:f5h26uvk5rf65oi3sgg5e4mjna
Author Index
2008
2008 IEEE International Symposium on Parallel and Distributed Processing
DiCo-CMP: Efficient Cache Coherency in Tiled CMP Architectures García, Miguel Angel Perimeter Quadrature-based Metric for Estimating FPGA Fragmentation in 2D HW Multitasking Garey, Larry A Linear Solver ...
for Component Based Hardware/Software Interaction in Embedded Real-Time Systems Foughali, Laidi A Parallel Insular Model for Location Areas Planning in Mobile Networks Fu, Song Random Choices for Churn ...
doi:10.1109/ipdps.2008.4536576
fatcat:7unikf5ywjhjtdd6xtrmcom3gq
High-Performance Energy-Efficient Multicore Embedded Computing
2012
IEEE Transactions on Parallel and Distributed Systems
Embedded systems differ from traditional high-performance supercomputers in that power is a first-order constraint for embedded systems; whereas, performance is the major benchmark for supercomputers. ...
Finally, we present design challenges and future research directions for HPEEC system development. ...
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSERC and the NSF. ...
doi:10.1109/tpds.2011.214
fatcat:vagqmojdsjevvc2u2ewqrcjjpq
Implementation and Evaluation of Fock Matrix Calculation Program on the Cell Processor
2007
AIP Conference Proceedings
Recently, Chip Multi-processors (CMPs), which integrate many processor cores on a single chip, have been proposed for further performance improvement. ...
Various processor architectures have been proposed until today, and the performance has improved remarkably. ...
ACKNOWLEDGMENTS This work has been supported in part by the Grant-in-Aid for Young Scientists (A) No.17680005 of the Ministry of Education, Science, Sports and Culture (MEXT). ...
doi:10.1063/1.2836167
fatcat:umbyy6vpifhefkspzadsngmmhm
Event-Driven Configuration of a Neural Network CMP System over a Homogeneous Interconnect Fabric
2009
2009 Eighth International Symposium on Parallel and Distributed Computing
The architecture of SpiNNaker, a parallel chip multiprocessor (CMP) system for neural network simulation, is in this class. ...
Configuring a million-core parallel system at boot time is a difficult process when the system has neither specialised hardware support for the configuration process nor a preconfigured default state that ...
doi:10.1109/ispdc.2009.25
dblp:conf/ispdc/KhanNRJPLWMF09
fatcat:bx2ce5yagfbntmzx47s5dq3fdu
Embracing heterogeneity with dynamic core boosting
2014
Proceedings of the 11th ACM Conference on Computing Frontiers - CF '14
Even for embarrassingly parallel programs in the form of SPMD (single program multiple data), the threads are not perfectly balanced due to control flow divergence, non-deterministic memory latencies, ...
We propose Dynamic Core Boosting (DCB), a software-hardware cooperative system that mitigates the workload imbalance problem in performance asymmetric CMPs. ...
Increasing core-to-core process variation also creates performance asymmetry in CMPs [29]. ...
doi:10.1145/2597917.2597932
dblp:conf/cf/ChoM14
fatcat:tx22yxw3mnbu7dpe5ccwealwua