NUMA policies and their relation to memory architecture

William J. Bolosky, Michael L. Scott, Robert P. Fitzgerald, Robert J. Fowler, Alan L. Cox
1991 SIGPLAN notices  
We have used this information to explore the relationship between kernel-based NUMA management policies and multiprocessor memory architecture.  ...  Our results indicate that a good NUMA policy must be chosen to match its machine, and confirm that such policies can be both simple and effective.  ...  We use multiprocessor memory reference traces to drive simulations of NUMA policies. We implement page placement policies and simulate architectures in our trace analysis program.  ... 
doi:10.1145/106973.106994 fatcat:gycerao7sjdc3o2jkn6pdudg2q

NUMA policies and their relation to memory architecture

William J. Bolosky, Michael L. Scott, Robert P. Fitzgerald, Robert J. Fowler, Alan L. Cox
1991 SIGARCH Computer Architecture News  
We have used this information to explore the relationship between kernel-based NUMA management policies and multiprocessor memory architecture.  ...  Our results indicate that a good NUMA policy must be chosen to match its machine, and confirm that such policies can be both simple and effective.  ...  We use multiprocessor memory reference traces to drive simulations of NUMA policies. We implement page placement policies and simulate architectures in our trace analysis program.  ... 
doi:10.1145/106975.106994 fatcat:h64q7qubqncobofb7zqq75f6eu

NUMA policies and their relation to memory architecture

William J. Bolosky, Michael L. Scott, Robert P. Fitzgerald, Robert J. Fowler, Alan L. Cox
1991 Proceedings of the fourth international conference on Architectural support for programming languages and operating systems - ASPLOS-IV  
We have used this information to explore the relationship between kernel-based NUMA management policies and multiprocessor memory architecture.  ...  Our results indicate that a good NUMA policy must be chosen to match its machine, and confirm that such policies can be both simple and effective.  ...  We use multiprocessor memory reference traces to drive simulations of NUMA policies. We implement page placement policies and simulate architectures in our trace analysis program.  ... 
doi:10.1145/106972.106994 dblp:conf/asplos/BoloskySFFC91 fatcat:cf3yu7o5cbczvnannrxmuryyya

NUMA policies and their relation to memory architecture

William J. Bolosky, Michael L. Scott, Robert P. Fitzgerald, Robert J. Fowler, Alan L. Cox
1991 ACM SIGOPS Operating Systems Review  
We have used this information to explore the relationship between kernel-based NUMA management policies and multiprocessor memory architecture.  ...  Our results indicate that a good NUMA policy must be chosen to match its machine, and confirm that such policies can be both simple and effective.  ...  We use multiprocessor memory reference traces to drive simulations of NUMA policies. We implement page placement policies and simulate architectures in our trace analysis program.  ... 
doi:10.1145/106974.106994 fatcat:54vlt7bddzd6nkcoit7qlgd6wm

Dynamic Task and Data Placement over NUMA Architectures: An OpenMP Runtime Perspective [chapter]

François Broquedis, Nathalie Furmento, Brice Goglin, Raymond Namyst, Pierre-André Wacrenier
2009 Lecture Notes in Computer Science  
Our runtime, which is based on a multi-level thread scheduler combined with a NUMA-aware memory manager, converts this information into "scheduling hints" to solve thread/memory affinity issues.  ...  Directive-based programming languages such as OpenMP provide programmers with an easy way to structure the parallelism of their application and to transmit this information to the runtime system.  ...  able to draw threads and bubbles to their "preferred" NUMA node.  ... 
doi:10.1007/978-3-642-02303-3_7 fatcat:e4cc3wengncrnfrvcnbtmvbmtq

Enabling high-performance memory migration for multithreaded applications on LINUX

Brice Goglin, Nathalie Furmento
2009 2009 IEEE International Symposium on Parallel & Distributed Processing  
As the number of cores per machine increases, memory architectures are being redesigned to avoid bus contention and sustain higher throughput needs.  ...  The emergence of Non-Uniform Memory Access (NUMA) constraints has caused affinities between threads and buffers to become an important decision criteria for schedulers.  ...  and placing memory buffers depending on their affinities.  ... 
doi:10.1109/ipdps.2009.5161101 dblp:conf/ipps/GoglinF09 fatcat:tifxpqrgvbbhngt2czn3jr7ake

Toward Efficient In-memory Data Analytics on NUMA Systems [article]

Puya Memarzia, Suprio Ray, Virendra C Bhavsar
2020 arXiv   pre-print
Modern computers increasingly rely on Non-Uniform Memory Access (NUMA) architectures in order to achieve scalability.  ...  A key drawback of NUMA architectures is that many existing software solutions are not aware of the underlying NUMA topology and thus do not take full advantage of the hardware.  ...  ACKNOWLEDGEMENTS We would like to thank Kenneth Kent and Aaron Graham from IBM CASA and Serguei Vassiliev and Kaizaad Bilimorya from Compute Canada, for providing access to Machine B and Machine C respectively  ... 
arXiv:1908.01860v3 fatcat:3ri4vadygzce5ao5dslmakn7zm

ForestGOMP: An Efficient OpenMP Environment for NUMA Architectures

François Broquedis, Nathalie Furmento, Brice Goglin, Pierre-André Wacrenier, Raymond Namyst
2010 International journal of parallel programming  
Directive-based programming languages such as OpenMP, can greatly help to perform such a distribution by providing programmers with an easy way to structure the parallelism of their application and to  ...  Our runtime, which is based on a multi-level thread scheduler combined with a NUMA-aware memory manager, converts this information into scheduling hints related to thread-memory affinity issues.  ...  able to draw threads and bubbles to their "preferred" NUMA node.  ... 
doi:10.1007/s10766-010-0136-3 fatcat:g2vajlq53ba4xeq2xjkfdriesy

Towards Efficient OpenMP Strategies for Non-Uniform Architectures [article]

Oussama Tahan
2014 arXiv   pre-print
Many of current computing systems rely on Non-Uniform Memory Access (NUMA) based processors architectures.  ...  Our technique applies a smart threads allocation method and an advanced tasks scheduling strategy for reducing remote memory accesses and consequently their extra time consumption.  ...  .), a first-touch policy for memory allocation is used when running on NUMA architectures.  ... 
arXiv:1411.7131v1 fatcat:wtbqvorzvrfwxprde23jbho3dm

NUMA-ICTM: A parallel version of ICTM exploiting memory placement strategies for NUMA machines

Marcio Castro, Luiz Gustavo Fernandes, Christiane Pousa, Jean-Francois Mehaut, Marilton Sanchotene de Aguiar
2009 2009 IEEE International Symposium on Parallel & Distributed Processing  
Recent advances in multiprocessor architectures lead to the emergence of NUMA (Non-Uniform Memory Access) machines. In this work, we present NUMA-ICTM: a parallel solution of ICTM for NUMA machines.  ...  The categorization of large regions is a computational intensive problem, what justifies the proposal and development of parallel solutions in order to improve its applicability.  ...  The NUMA API is an interface that defines a set of system calls to apply memory policies and processes/threads scheduling.  ... 
doi:10.1109/ipdps.2009.5161155 dblp:conf/ipps/CastroFPMA09 fatcat:2hbnjiascnaxlcdgj7crhuuvdm

A performance comparison of data and memory allocation strategies for sequence aligners on NUMA architectures

Josefina Lenis, Miquel Angel Senar
2017 Cluster Computing  
Therefore, these tools are very sensitive to performance problems related to the memory system.  ...  We have performed experiments with several popular sequence alignment tools on two widely available NUMA systems to assess the performance of different memory allocation policies and data partitioning  ...  Related work Challenges in memory access on NUMA systems have been addressed by some approaches that tried to optimize locality at the OS level.  ... 
doi:10.1007/s10586-017-1015-0 fatcat:kfulo77t7je5jezrb5by4hb4vu

NUMA-Awareness as a Plug-In for an Eventify-Based Fast Multipole Method [chapter]

Laura Morgenstern, David Haensel, Andreas Beckmann, Ivo Kabadshow
2020 Lecture Notes in Computer Science  
In this article, we aim to improve the performance and sustainability of FMSolvr, a parallel Fast Multipole Method for Molecular Dynamics, by adapting it to Non-Uniform Memory Access architectures in a  ...  By means of the NUMA module we introduce diverse NUMA-aware data distribution, thread pinning and work stealing policies for FMSolvr.  ...  A reusable NUMA module for Eventify that models hierarchical memory architectures in software and enables rapid development of algorithm-dependent NUMA policies. 3.  ... 
doi:10.1007/978-3-030-50436-6_31 fatcat:wfl66kght5dejp4jtqh4rew354

An interface to implement NUMA policies in the Xen hypervisor

Gauthier Voron, Gaël Thomas, Vivien Quéma, Pierre Sens
2017 Proceedings of the Twelfth European Conference on Computer Systems - EuroSys '17  
Most of the overhead on the latter machines is caused by the Non-Uniform Memory Access (NUMA) architecture they are using.  ...  In order to reduce this overhead, this paper shows how NUMA placement heuristics can be implemented inside Xen.  ...  This paper shows how we can implement the classical NUMA policies of operating systems inside an hypervisor, while hiding the NUMA topology to the virtual machines.  ... 
doi:10.1145/3064176.3064196 dblp:conf/eurosys/Voron0QS17 fatcat:uvccjgufsbbdtha45qeecy6ml4

Introducing kernel-level page reuse for high performance computing

Sébastien Valat, Marc Pérache, William Jalby
2013 Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness - MSPC '13  
Due to computer architecture evolution, more and more HPC applications have to include thread-based parallelism and take care of memory consumption.  ...  On modern architectures, we measured that up to 40% of the page fault time is spent in page zeroing.  ...  A French FSN (Fond pour la Société Numérique) cooperative project that associates academics and industrials partners in order to design then provide building blocks for a new generation of HPC datacenters  ... 
doi:10.1145/2492408.2492414 dblp:conf/pldi/ValatPJ13 fatcat:eg5su4jmk5axtksenidcawur7q

Using Data Dependencies to Improve Task-Based Scheduling Strategies on NUMA Architectures [chapter]

Philippe Virouleau, François Broquedis, Thierry Gautier, Fabrice Rastello
2016 Lecture Notes in Computer Science  
We also evaluate their performances on linear algebra applications executed on a 192-core NUMA machine, reporting noticeable performance improvement when considering both the architecture topology and  ...  Data placement and task scheduling strategies have a significant impact on performances when considering NUMA architectures.  ...  Acknowledgments This work is integrated and supported by the ELCI project, a French FSN ("Fond pour la Société Numérique") project that associates academic and industrial partners to design and provide  ... 
doi:10.1007/978-3-319-43659-3_39 fatcat:i6ou3fm2efcz5np6rafvfnuvkm