Challenges in Building a Flat-Bandwidth Memory Hierarchy for a Large-Scale Computer with Proximity Communication
13th Symposium on High Performance Interconnects (HOTI'05)
to relieve the memory bottleneck in a large-scale computer that we call "Hero." ...
Memory systems for conventional large-scale computers provide only limited bytes/s of data bandwidth when compared to their flop/s of instruction execution rate. ...
In addition we recognize the effective support and guidance from DARPA as part of its HPCS Phase II program. ...
doi:10.1109/conect.2005.12
dblp:conf/hoti/DrostFGHKCCTZCLS05
fatcat:jaytoz2qnzalzgz4qnl4eytohe
Scalable Multi-purpose Network Representation for Large Scale Distributed System Simulation
2012
2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
However, the simulation of large-scale computing systems raises several scalability issues, in terms of speed and memory. ...
Conducting experiments in large-scale distributed systems is usually time-consuming and labor-intensive. ...
(P2P) systems, raises new challenges in Computer Science. ...
doi:10.1109/ccgrid.2012.31
dblp:conf/ccgrid/BobelinLMNQST12
fatcat:3yqwvfgzkzejrbgxkh4purwkcu
Marrying Many-core Accelerators and InfiniBand for a New Commodity Processor
[article]
2013
arXiv
pre-print
The coming exa-scale era changes the landscape: exa-scale computers will contain components in quantities large enough to justify their custom development and production. ...
We propose a new heterogeneous processor, equipped with a network controller and designed specifically for HPC. ...
Here we have the following assumptions about this system: • Complicated hierarchy in memory architecture • Shortage of available memory per core. Therefore a multi-layered MPI communicator management is ...
arXiv:1307.0100v1
fatcat:yxvtz66fcvahvb3wtytdm6nlhe
Monitoring Large-Scale Cloud Systems with Layered Gossip Protocols
[article]
2013
arXiv
pre-print
In this paper we propose the development of a cloud monitoring suite to provide scalable and robust lookup, data collection and analysis services for large-scale cloud systems. ...
The need for robust monitoring tools has become more evident with the advent of cloud computing. ...
With the immense benefits of this paradigm comes a number of challenges, amongst these is the challenge of monitoring. Monitoring large-scale distributed systems is challenging. ...
arXiv:1305.7403v1
fatcat:k5izdtv3rfhvbl2tousan3yriq
Self managing monitoring for highly elastic large scale cloud deployments
2014
Proceedings of the sixth international workshop on Data intensive distributed computing - DIDC '14
Infrastructure as a Service computing exhibits a number of properties, which are not found in conventional server deployments. ...
This tool breaks with many of the conventions of previous monitoring systems and leverages a multi-tier P2P architecture in order to achieve in situ monitoring without the need for dedicated monitoring ...
In large scale cloud deployments individual VMs operate under a range of computation and communication constraints. ...
doi:10.1145/2608020.2608022
dblp:conf/hpdc/WardB14
fatcat:76cemc5sczgrdf6aeoiau5cc7i
Hardware Locality-Aware Partitioning and Dynamic Load-Balancing of Unstructured Meshes for Large-Scale Scientific Applications
2020
Proceedings of the Platform for Advanced Scientific Computing Conference
The tool was successfully integrated into our in-house code and we present results from a large-eddy simulation of a combustion problem. ...
It provides a range of partitioning methods by interfacing with existing shared and distributed memory parallel partitioning libraries. ...
This work was granted access to the HPC resources of IDRIS under an allocation by GENCI for the Grand Challenges Jean Zay (2019). ...
doi:10.1145/3394277.3401851
dblp:conf/pasc/MohanamuralyS20
fatcat:thhnjsb7ufbgbmvmf6abiyzlri
Beyond Processor-centric Operating Systems
2015
USENIX Workshop on Hot Topics in Operating Systems
At rack scale, we can expect a large pool of non-volatile memory (NVM) that will be accessed by heterogeneous and decentralized compute resources [3, 17]. ...
In this paper, we describe the characteristics and consequences of memory-centric architectures and propose a memory-centric OS design that moves traditional OS functionality outside of the compute node ...
We thank the anonymous reviewers and our colleagues John Sontag, Brad Morrey, Indrajit Roy, Terence Kelly, Joe Tucek, Keith Packard and Guilherme Magalhaes for their helpful comments. ...
dblp:conf/hotos/FaraboschiKMM15
fatcat:wc4ehrbiffazzmdke2m6bmmma4
Enabling Fog Computing using Self-Organizing Compute Nodes
2019
Zenodo
Publication in Conference proceedings/Workshop ...
For communication within the fog network, intelligent messaging based on bandwidth minimization is implemented along with the querying mechanism, as discussed in Section III-F. ...
In such hierarchical models, typically, computation requests originate from the bottom and travel upwards the hierarchy until they reach a compute node with enough resources to execute them [5]. ...
doi:10.5281/zenodo.5850217
fatcat:ypz6flpqmraxrkdaqry3pc5bxm
The Mondrian Data Engine
2017
SIGARCH Computer Architecture News
The increasing demand for extracting value out of ever-growing data poses an ongoing challenge to system designers, a task only made trickier by the end of Dennard scaling. ...
In the context of NMP, such random accesses result in wasteful DRAM row buffer activations that account for a significant fraction of the total memory access energy. ...
ACKNOWLEDGEMENTS The authors thank the anonymous reviewers and Arash Pourhabibi for their precious comments and feedback. ...
doi:10.1145/3140659.3080233
fatcat:zjfvuprhmnahhmqfnj6ems7fli
The Mondrian Data Engine
2017
Proceedings of the 44th Annual International Symposium on Computer Architecture
The increasing demand for extracting value out of ever-growing data poses an ongoing challenge to system designers, a task only made trickier by the end of Dennard scaling. ...
In the context of NMP, such random accesses result in wasteful DRAM row buffer activations that account for a significant fraction of the total memory access energy. ...
ACKNOWLEDGEMENTS The authors thank the anonymous reviewers and Arash Pourhabibi for their precious comments and feedback. ...
doi:10.1145/3079856.3080233
fatcat:lctwfx5ilbdazdhjvjtbvgo3fq
Software challenges in extreme scale systems
2009
Journal of Physics, Conference Series
Carlson is a member of the research staff at the IDA Center for Computing Sciences where, since 1990, his focus has been on applications and system tools for large-scale parallel and distributed computers ...
More recent work is investigating how PIM-like ideas may port into quantum cellular array (QCA) and other nanotechnology logic, where instead of "Processing-In-Memory" we have opportunities for "Processing-In-Wire ...
the memory hierarchy and minimizes overall runtime with effective trade-off of increased computation for reduced memory consumption. ...
doi:10.1088/1742-6596/180/1/012045
fatcat:iukutry2dvbitfdh6ng7kgz564
Memory leads the way to better computing
2015
Nature Nanotechnology
in advancing computing by a thousandfold by 2015. ...
computing. ...
For the single job speedup case, bandwidths probably scale linearly with the 1,000X in rate, but total memory capacity will stay roughly flat at what it will be in 2010. ...
doi:10.1038/nnano.2015.29
pmid:25740127
fatcat:d6iiuuwcozbxlgn4kxxzdzwd4m
Eurolab-4-HPC Long-Term Vision on High-Performance Computing
[article]
2018
arXiv
pre-print
Radical changes in computing are foreseen for the next decade. ...
The objective of the Eurolab-4-HPC vision is to provide a long-term roadmap from 2023 to 2030 for High-Performance Computing (HPC). ...
Simultaneously, a high communication bandwidth between layers connected with TSVs can be expected leading to particularly high processor-to-memory bandwidth. ...
arXiv:1807.04521v1
fatcat:5neetrgubjhnvcajcktpkohrzq
Scale-out NUMA
2014
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems - ASPLOS '14
We introduce Scale-Out NUMA (soNUMA), an architecture, programming model, and communication protocol for low-latency, distributed in-memory processing. soNUMA layers an RDMA-inspired programming model ...
The large number of servers needed to accommodate this massive memory footprint requires frequent server-to-server communication in applications such as key-value stores and graph-based applications that ...
Acknowledgments The authors thank Dushyanth Narayanan and Aleksandar Dragojevic from Microsoft Research for many inspiring conversations on rack-scale computing requirements and applications early in the ...
doi:10.1145/2541940.2541965
dblp:conf/asplos/NovakovicDBFG14
fatcat:buiufe62bvfohpvti5gsq3cyoa
Trends in Data Locality Abstractions for HPC Systems
2017
IEEE Transactions on Parallel and Distributed Systems
However, with the increasing complexity of the memory hierarchy and higher parallelism in emerging HPC systems, locality management has acquired a new urgency. ...
This paper examines these trends and identifies commonalities that can combine various locality concepts to develop a comprehensive approach to expressing and managing data locality on future large-scale ...
Bandwidth tapering has been a challenge since the dawn of cache hierarchies, and the remedies (loop blocking, strip-mining, tiling, domain decomposition, and communication optimizations/topology mapping ...
doi:10.1109/tpds.2017.2703149
fatcat:vjalwrujhrex7cibod3qerf3z4