MPARM: Exploring the Multi-Processor SoC Design Space with SystemC

Luca Benini, Davide Bertozzi, Alessandro Bogliolo, Francesco Menichelli, Mauro Olivieri
2005 Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology  
We developed a complete simulation platform for a MP-SoC called MP-ARM, based on SystemC as modelling and simulation environment, and including models for processors, the AMBA bus compliant communication  ...  As an example thereof, we use our tool to evaluate the impact on system performance of architectural parameters and of bus arbitration policies, showing that the effectiveness of a particular system configuration  ...  Main memory banks reside on the shared bus as slave devices. They consist of multiple instantiations of a basic SystemC memory module.  ... 
doi:10.1007/s11265-005-6648-1 fatcat:atufrbrdxjcb7pmiad7fuqwbky

A new minicomputer/multiprocessor for the ARPA network

F. E. Heart, S. M. Ornstein, W. R. Crowther, W. B. Barker
1973 Proceedings of the June 4-8, 1973, national computer conference and exposition on - AFIPS '73  
From the collection of the Computer History Museum (www.computerhistory.org) * The terms "I/O bus" and "memory bus" as used here and henceforth are not the same as conventional  ...  This step permits one to consider configurations embodying multiple processors and multiple memories as well as I/O on a single bus.  ...  Given a Bus Coupler connecting each processor bus to each shared-memory bus, all processors can access all shared memory.  ... 
doi:10.1145/1499586.1499721 dblp:conf/afips/HeartOCB73 fatcat:idr3sygml5c6dd7tixjbmdfwfy

Hierarchical cache/bus architecture for shared memory multiprocessors

A. W. Wilson
1987 Proceedings of the 14th annual international symposium on Computer architecture - ISCA '87  
Extended versions of shared bus multicache coherency protocols are used to maintain coherency among all caches in the system.  ...  A new, large scale multiprocessor architecture is presented in this paper. The architecture consists of hierarchies of shared buses and caches.  ...  " System Performance vs.  ... 
doi:10.1145/30350.30378 dblp:conf/isca/Wilson87 fatcat:yyaqljc45fe7ncgymfmmjikf2e

SMP-SoC is the answer if you ask the right questions

Philip Machanick
2006 Proceedings of the 2006 annual research conference of the South African institute of computer scientists and information technologists on IT research in developing countries - SAICSIT '06  
It makes a case for focusing design efforts on symmetric multiprocessor (SMP) SoC designs, which have the best chance of making an impact in a wide range of markets, rather than designs for very specific  ...  This paper explores design principles for SoC multiprocessors and relates them to more general system requirements.  ...  SMP-SoC vs.  ... 
doi:10.1145/1216262.1216264 fatcat:xwnrcp5ncjfj7bacnljoyokmji

Performance evaluation of the slotted ring multiprocessor

L.A. Barroso, M. Dubois
1995 IEEE transactions on computers  
We use memory traces from actual execution of parallel programs to drive detailed event-driven simulations of a variety of ring and bus multiprocessors.  ...  As microprocessor speeds continue to improve at a very fast rate the bandwidth requirements for system level interconnections in multiprocessors may eventually rule out the use of shared buses even for  ...  Figure 12 compares the performance of a 32-bit wide ring, clocked at 500 MHz, to a 64-bit wide bus, clocked at 50 MHz and 100 MHz.  ... 
doi:10.1109/12.392846 fatcat:6edhssskgfdjbabrfeqdclcqvu

Software environment for a multiprocessor DSP

Asawaree Kalavade, Joe Othmer, Bryan Ackland, K. J. Singh
1999 Proceedings of the 36th ACM/IEEE conference on Design automation conference - DAC '99  
In this paper, we describe the software environment for Daytona, a single-chip, bus-based, shared-memory, multiprocessor DSP. The software environment is designed around a layered architecture.  ...  The run-time kernel includes a low-overhead, preemptive, dynamic scheduler with multiprocessor support that guarantees real-time performance to admitted tasks.  ...  The architecture incorporates a 128-bit wide on-chip split transaction bus (ST bus).  ... 
doi:10.1145/309847.310078 dblp:conf/dac/KalavadeOAS99 fatcat:5o32coryxzeuroxi3x54zct7la

Effect of hot-spots on the performance of crossbar multiprocessor systems

M Atiquzzaman, M.M Banat
1993 Parallel Computing  
Hot-spots arising in multiprocessor systems due to the use of shared variables, synchronization primitives, etc. give rise to nonuniform memory reference pattern.  ...  Banat, Effect of hot-spots on the performance of crossbar multiprocessor systems, Parallel Computing 19 (1993) 455-461.  ...  Because of modularity and fault tolerance of multiple-bus systems, such systems using B busses have been widely investigated [2].  ... 
doi:10.1016/0167-8191(93)90057-r fatcat:vb4tw6wbxbastmut3ygo4kpc6y

Speeding-up multiprocessors running DBMS workloads through coherence protocols

Pierfrancesco Foglia, Roberto Giorgi, Cosimo Antonio Prete
2004 International Journal of High Performance Computing and Networking  
In this work, it is shown how a DBMS workload, running on a shared-bus shared-memory multiprocessor, can be accelerated by adding simple support to the MESI coherence protocol.  ...  Results show that, for a DSS workload, the use of a WU protocol with a selective invalidation strategy for private data improves performance because of the access pattern to shared data and the lower bus  ...  The simpler design for a multiprocessor system is a shared-bus shared-memory architecture (Tanenbaum, 2001). In shared-bus systems, processors access the shared memory through a shared bus.  ... 
doi:10.1504/ijhpcn.2004.007562 fatcat:qiiab5c5p5hrjgmouy7qmvj5tq

The performance of cache-coherent ring-based multiprocessors

Luis André Barroso, Michel Dubois
1993 Proceedings of the 20th annual international symposium on Computer architecture - ISCA '93  
We believe that the interconnection problem is not solved even for small scale shared memory multiprocessors, since the speed of shared buses is unlikely to keep up with the bandwidth requirements of new  ...  One of the main challenges presented by such developments is the effective use of powerful microprocessors in shared memory multiprocessor configurations.  ...  directories; 500 MHz 32-bit rings  ...  Figure 6 (32-bit wide slotted ring vs. 64-bit wide split transaction bus; axes: time (ns), miss latency (ns)) compares the distribution of remote misses  ... 
doi:10.1145/165123.165162 dblp:conf/isca/BarrosoD93 fatcat:rxg54hrm3fcrtilv2ottirsxxm

The performance of cache-coherent ring-based multiprocessors

Luis André Barroso, Michel Dubois
1993 SIGARCH Computer Architecture News  
We believe that the interconnection problem is not solved even for small scale shared memory multiprocessors, since the speed of shared buses is unlikely to keep up with the bandwidth requirements of new  ...  One of the main challenges presented by such developments is the effective use of powerful microprocessors in shared memory multiprocessor configurations.  ...  directories; 500 MHz 32-bit rings  ...  Figure 6 (32-bit wide slotted ring vs. 64-bit wide split transaction bus; axes: time (ns), miss latency (ns)) compares the distribution of remote misses  ... 
doi:10.1145/173682.165162 fatcat:64s6ip7jivedzorzdrjvsvm55u

Characterization of TCC on chip-multiprocessors

A. McDonald, JaeWoong Chung, H. Chafi, Chi Cao Minh, B.D. Carlstrom, L. Hammond, C. Kozyrakis, K. Olukotun
2005 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05)  
Transactional Coherence and Consistency (TCC) is a novel coherence scheme for shared memory multiprocessors that uses programmer-defined transactions as the fundamental unit of parallel work, synchronization  ...  In this paper, we study the implementation of TCC on chip-multiprocessors (CMPs).  ...  We always use an invalidation-based MESI protocol for SCC because it generates less inter-processor traffic and is more widely used than update-based protocols in bus-based multiprocessor systems [39]  ... 
doi:10.1109/pact.2005.11 dblp:conf/IEEEpact/McDonaldCCMCHKO05 fatcat:roc4b7suzvctrebk4chzdwjpti

Performances of multiprocessor multidisk architectures for continuous media storage

Benoit A. Gennart, Vincent Messerli, Roger D. Hersch, Ishwar K. Sethi, Ramesh C. Jain
1996 Storage and Retrieval for Still Image and Video Databases IV  
This contribution analyzes through bottleneck performance evaluation and simulation the behavior of two multi-processor multi-disk architectures: a point-to-point architecture and a shared-bus architecture  ...  The results suggest that the shared bus is a potential bottleneck despite its very high hardware throughput (400 Mbytes/s) and that an architecture with addressable local memories located closely to their  ...  We consider a fast-wide SCSI-2 bus, in synchronous mode.  ... 
doi:10.1117/12.234806 dblp:conf/spieSR/GennartMH96 fatcat:4k2pjsxt45dmdclhumhheco6sa

Environment for multiprocessor simulator development

Masaki Wakabayashi, Hideharu Amano
2003 Electronics and communications in Japan. Part 3, Fundamental electronic science  
Performance estimation is essential for designing and investigating new architectures, including multiprocessors.  ...  ISIS, an architecture-independent simulation kit for multiprocessors, is developed so as to reduce such designers' load.  ...  Shared vs.  ... 
doi:10.1002/ecjc.10122 fatcat:pkhr2iaxebcapiecxa6daabs4a

I2SEMS: Interconnects-Independent Security Enhanced Shared Memory Multiprocessor Systems

Manhee Lee, Minseon Ahn, Eun Jung Kim
2007 Parallel Architecture and Compilation Techniques (PACT), Proceedings of the International Conference on  
We tested our design with SPLASH-2 benchmarks on up to 16-processor shared memory multiprocessor systems.  ...  In this paper, we present a fast and efficient method for providing secure memory and cache-to-cache communications in shared memory multiprocessor systems that are becoming enormously popular in designing  ...  Next, we would like to expand our research to much larger multiprocessor systems with distributed shared memory (DSM) and to new multiprocessor architectures such as Chip Multiprocessor (CMP) systems.  ... 
doi:10.1109/pact.2007.4336203 fatcat:62pacrxg2feeheqlzn54enu2di

Accelerating Lattice Boltzmann Fluid Flow Simulations Using Graphics Processors

P. Bailey, J. Myre, S.D.C. Walsh, D.J. Lilja, M.O. Saar
2009 2009 International Conference on Parallel Processing  
With memory bandwidth of up to 141 GB/s and a theoretical maximum floating point performance of over 600 GFLOPS [8], CUDA-ready GPUs from NVIDIA provide an attractive platform for a wide range of scientific  ...  This paper improves upon prior single-precision GPU LBM results for the D3Q19 model [7] by increasing GPU multiprocessor occupancy, resulting in an increase in maximum performance by 20%, and by introducing  ...  Similarly, running multiple blocks per multiprocessor can hide block synchronization latency.  ... 
doi:10.1109/icpp.2009.38 dblp:conf/icpp/BaileyMWLS09 fatcat:enlvpj3yn5abfox2h5otaqahny