McSimA+: A manycore simulator with application-level+ simulation and detailed microarchitecture modeling

Jung Ho Ahn, Sheng Li, Seongil O, Norman P. Jouppi
2013 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)  
With their significant performance and energy advantages, emerging manycore processors have also brought new challenges to the architecture research community.  ...  Manycore processors are highly integrated complex system-on-chips with complicated core and uncore subsystems. The core subsystems can consist of a large number of traditional and asymmetric cores.  ...  Instructions with operands ready bid on these dispatch resources, and McSimA+ arbitrates and selects instructions based on their time stamps to execute on the proper units.  ... 
doi:10.1109/ispass.2013.6557148 dblp:conf/ispass/AhnLSJ13 fatcat:ywqys7o75ndkfjdqal5j4vvu4e

2018 Index IEEE Transactions on Computers Vol. 67

2019 IEEE transactions on computers  
., and Rodriguez-Henriquez, F., A Faster Software Implementation of the Supersingular Isogeny Diffie-Hellman Key Exchange Protocol; 1622-1636 Feng, D., see Fu, M., TC Sept. 2018 1259-1272 Analysis  ...  Han, L., +, TC Aug. 2018 1105-1120 Mapping and Scheduling Mixed-Criticality Systems with On-Demand Redundancy.  ...  Seol, H., +, TC Oct. 2018 1403-1415 Certification Mapping and Scheduling Mixed-Criticality Systems with On-Demand Redundancy.  ... 
doi:10.1109/tc.2018.2882120 fatcat:j2j7yw42hnghjoik2ghvqab6ti

ElCore: Dynamic elastic resource management and discovery for future large-scale manycore enabled distributed systems

Javad Zarrin, Rui L. Aguiar, João Paulo Barraca
2016 Microprocessors and microsystems  
In such large-scale computing environments, resource management is one of the most challenging, and complex issues for efficient resource sharing and utilization, particularly as we move toward Future  ...  This work proposes a novel resource management scheme for future peta-scale many-core-enabled computing systems, based on hybrid adaptive resource discovery, called ElCore.  ...  Manycore Systems Current approaches for resource management in manycore systems can be categorized to offline, mixed and on-line approaches.  ... 
doi:10.1016/j.micpro.2016.06.007 fatcat:tgqz6x3vzjddbgyab2xskei57e

Seismic wave propagation simulations on low-power and performance-centric manycores

Márcio Castro, Emilio Francesquini, Fabrice Dupros, Hideo Aochi, Philippe O.A. Navaux, Jean-François Méhaut
2016 Parallel Computing  
Each one of these clusters has 2 MB of work memory shared among the cores.  ...  On Xeon Phi each core has 512 kB of L2 shared by up to 4 threads. On the other hand, on MPPA-256 the processing cores have 8 kB of L1 cache each and cores are grouped into clusters of 16.  ...  The limited cache size on Xeon Phi prevents the storage of three velocity and six stress components in 512 kB of L2 cache memory shared by up to four threads.  ... 
doi:10.1016/j.parco.2016.01.011 fatcat:p4ibo5q7xzgvdfwb7ei4o4hxdi

S4oC: A Self-optimizing, Self-adapting Secure System-on-Chip Design Framework to Tackle Unknown Threats – A Network Theoretic, Learning Approach [article]

Shahin Nazarian, Paul Bogdan
2020 arXiv   pre-print
S4oC is a manycore system, modeled as a four-layer graph, representing the model of computation (MoCp), model of connection (MoCn), model of memory (MoM) and model of storage (MoS), with a large number  ...  Security driven community detection, and neural networks are utilized for application task clustering, and distributed reinforcement learning (RL) for task mapping.  ...  S4oC manages the resources and reconfigures the elements in real-time, using security driven RL based on distributed intelligent schedulers.  ... 
arXiv:2004.02109v1 fatcat:xaia56igjzghlgkgbz3kdwhj6e

The Italian research on HPC key technologies across EuroHPC

Marco Aldinucci, Giovanni Agosta, Antonio Andreini, Claudio A. Ardagna, Andrea Bartolini, Alessandro Cilardo, Biagio Cosenza, Marco Danelutto, Roberto Esposito, William Fornaciari, Roberto Giorgi, Davide Lengani (+5 others)
2021 Proceedings of the 18th ACM International Conference on Computing Frontiers  
High-Performance Computing (HPC) is one of the strategic priorities for research and innovation worldwide due to its relevance for industrial and scientific applications. We envision HPC as composed  ...  of computation and I/O, and the scheduling of storage resources along with all levels of the storage hierarchy.  ...  Reasoning on how to enforce privacy and security for critical data on shared infrastructures (as supercomputers) is crucial. Hence, HPC cloud services. HPC and cloud are convergent technologies.  ... 
doi:10.1145/3457388.3458508 fatcat:nbnzfa2frvbpflj6tcbsk4bwcq

McPAT

Sheng Li, Jung Ho Ahn, Richard D. Strong, Jay B. Brockman, Dean M. Tullsen, Norman P. Jouppi
2009 Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture - Micro-42  
At the microarchitectural level, McPAT includes models for the fundamental components of a chip multiprocessor, including in-order and out-of-order processor cores, networks-on-chip, shared caches, integrated  ...  die cost is not taken into account clustering 8 cores together gives the best energy-delay product, whereas when cost is taken into account configuring clusters with 4 cores gives the best EDA²P and  ...  ACKNOWLEDGMENTS The authors would like to thank Victor Zyuban and Shyamkumar Thoziyoor at IBM for answering our questions on circuit implementation and the anonymous reviewers for their constructive comments  ... 
doi:10.1145/1669112.1669172 dblp:conf/micro/LiASBTJ09 fatcat:grtv5brsxzgwxdiqjcdhkfkqwa

A Survey on Memory Subsystems for Deep Neural Network Accelerators

Arghavan Asad, Rupinder Kaur, Farah Mohammadi
2022 Future Internet  
trade-offs associated with memory organizations; and 4—become familiar with proposed new memory systems for modern DNN accelerators to solve the memory wall and other mentioned current issues.  ...  From self-driving cars to detecting cancer, the applications of modern artificial intelligence (AI) rely primarily on deep neural networks (DNNs).  ...  The communication and storage requirements needed by DNNs creates resistance on the path towards high power efficiency and performance.  ... 
doi:10.3390/fi14050146 fatcat:4mrod5zmibgxvp6ppevgnpwlqq

A Dedicated Micro-Kernel to Combine Real-Time and Stream Applications on Embedded Manycores

Paul Dubrulle, Emmanuel Ohayon
2013 Procedia Computer Science  
This opportunity however depends on the ability to provide safety guarantees, especially when it comes to embedded life- or mission-critical applications.  ...  This micro-kernel is able to run simultaneously and safely tasks written with very different programming paradigms, and very different execution requirements: hard real-time applications and stream applications  ...  We choose to target multi-core systems with shared on-chip memory. This memory can either be a specialized local storage shared among several cores, or be a shared L2 or L3 on-chip cache.  ... 
doi:10.1016/j.procs.2013.05.331 fatcat:wxfb6u5f65czvgsoswefqr4toq

MITTS

Yanqi Zhou, David Wentzlaff
2016 SIGARCH Computer Architecture News  
Having the ability to precisely provision, schedule, and isolate memory bandwidth and latency on a per-core basis is particularly important when different memory guarantees are needed on a per-customer  ...  MITTS shapes memory traffic based on memory request inter-arrival time, enabling fine-grain bandwidth allocation.  ...  INTRODUCTION Off-chip memory bandwidth is a critical resource in multicore and manycore processors.  ... 
doi:10.1145/3007787.3001193 fatcat:tkfzm3w7qbgtzf7kmfbgb57qoq

MITTS: Memory Inter-arrival Time Traffic Shaping

Yanqi Zhou, David Wentzlaff
2016 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA)  
Having the ability to precisely provision, schedule, and isolate memory bandwidth and latency on a per-core basis is particularly important when different memory guarantees are needed on a per-customer  ...  MITTS shapes memory traffic based on memory request inter-arrival time, enabling fine-grain bandwidth allocation.  ...  INTRODUCTION Off-chip memory bandwidth is a critical resource in multicore and manycore processors.  ... 
doi:10.1109/isca.2016.53 dblp:conf/isca/ZhouW16 fatcat:2zturrinjbg4hkufutauvqpgf4

Multicore enablement for Cyber Physical Systems

Andreas Herkersdorf
2012 2012 International Conference on Embedded Computer Systems (SAMOS)  
As of today, the ability to efficiently utilize the available resources depends to a large extent on the aptitude of experienced programmers and the inherent ability of being able to parallelize the computing  ...  The focus of the seminar was on the exchange of experiences and discussion of the challenges of reusable and transferable multicore technologies.  ...  Examples of such mixed-criticality integration are found in the avionics and automotive industry with their desire to integrate safety-critical, mission critical and noncritical subsystems on the same  ... 
doi:10.1109/samos.2012.6404198 dblp:conf/samos/Herkersdorf12 fatcat:73whij7ozbfgpimxz4md3f4jii

A Case for Coordinated Resource Management in Heterogeneous Multicore Platforms [chapter]

Priyanka Tembey, Ada Gavrilovska, Karsten Schwan
2011 Lecture Notes in Computer Science  
This paper first presents examples that demonstrate the need for coordination among multiple resource managers on heterogeneous multicore platforms.  ...  This independence, however, can cause performance degradation for an application that spans diverse cores and resource managers, unless managers coordinate with each other to better service application  ...  Profiles are based on two client workloads available with the standard RUBiS benchmark: browsing (read) mix and bid/browse/sell (read-write) mix.  ... 
doi:10.1007/978-3-642-24322-6_27 fatcat:pxdolvtb2rahvfpqbuysoetuby

A case for FAME

Zhangxi Tan, Andrew Waterman, Henry Cook, Sarah Bird, Krste Asanović, David Patterson
2010 SIGARCH Computer Architecture News  
system scheduling policy.  ...  To clear up misconceptions about FPGA-based simulation methodologies, we propose a FAME taxonomy to distinguish the cost-performance of variations on these ideas.  ...  Thanks to the ROS implementers for their assistance on the RAMP Gold port, and to Kevin Klues in particular for providing page coloring support to facilitate our case study.  ... 
doi:10.1145/1816038.1815999 fatcat:v3bebnwebzdmzdutz4d345rzdq

A case for FAME

Zhangxi Tan, Andrew Waterman, Henry Cook, Sarah Bird, Krste Asanović, David Patterson
2010 Proceedings of the 37th annual international symposium on Computer architecture - ISCA '10  
system scheduling policy.  ...  To clear up misconceptions about FPGA-based simulation methodologies, we propose a FAME taxonomy to distinguish the cost-performance of variations on these ideas.  ...  Thanks to the ROS implementers for their assistance on the RAMP Gold port, and to Kevin Klues in particular for providing page coloring support to facilitate our case study.  ... 
doi:10.1145/1815961.1815999 dblp:conf/isca/TanWCBAP10 fatcat:4u2ves3qn5ckdhgocyzfplx4z4