Composing high-performance memory allocators
Current general-purpose memory allocators do not provide sufficient speed or flexibility for modern high-performance applications. Highly tuned general-purpose allocators have per-operation costs of around one hundred cycles, while the cost of an operation in a custom memory allocator can be just a handful of cycles. To achieve high performance, programmers often write custom memory allocators from scratch, a difficult and error-prone process. In this paper, we present a flexible and efficient infrastructure for building memory allocators that is based on C++ templates and inheritance. This novel approach allows programmers to build custom and general-purpose allocators as "heap layers" that can be composed without incurring any additional runtime overhead or additional programming cost. We show that this infrastructure simplifies allocator construction and results in allocators that either match or improve on the performance of heavily tuned allocators written in C, including the Kingsley allocator and the GNU obstack library. We further show that this infrastructure can be used to rapidly build a general-purpose allocator whose performance is comparable to that of the Lea allocator, one of the best uniprocessor allocators available. We thus demonstrate a clean, easy-to-use allocator interface that seamlessly combines the power and efficiency of any number of general and custom allocators within a single application.

¹ We estimated this cost by measuring the time spent in allocation using 197.parser's custom allocator and computing a conservative estimate of allocation time with the system allocator (which cannot be substituted directly because of the semantics of the custom allocator). This and the other programs in this paper were compiled with Visual C++ 6.0 and run under Windows 2000 on a 366 MHz Pentium II system.
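
To make the "heap layers" idea from the abstract concrete, the following is a minimal sketch of how layers might be composed with C++ templates and inheritance. It is an illustration under stated assumptions, not the paper's actual code: the class names (`MallocHeap`, `FreelistHeap`, `CountingHeap`, `SmallObjectHeap`), the `ObjectSize` parameter, and the helper `freed_block_is_recycled` are all hypothetical. Each layer inherits from its template parameter and forwards to it through statically bound calls, which is what gives the compiler the opportunity to inline the whole stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Bottom layer (hypothetical name): obtains memory from the system allocator.
class MallocHeap {
public:
    void* malloc(std::size_t sz) { return std::malloc(sz); }
    void free(void* p) { std::free(p); }
};

// Layer that caches freed blocks of a single size class on a free list.
// Every accepted request is rounded up to one block size, so any cached
// block can satisfy any later request. Assumption: one size class per layer.
template <std::size_t ObjectSize, class SuperHeap>
class FreelistHeap : public SuperHeap {
    struct Node { Node* next; };
    static constexpr std::size_t kBlock =
        ObjectSize < sizeof(Node) ? sizeof(Node) : ObjectSize;
    Node* head_ = nullptr;

public:
    FreelistHeap() = default;
    FreelistHeap(const FreelistHeap&) = delete;
    FreelistHeap& operator=(const FreelistHeap&) = delete;

    // Return cached blocks to the parent layer on destruction.
    ~FreelistHeap() {
        while (head_ != nullptr) {
            Node* n = head_;
            head_ = n->next;
            SuperHeap::free(n);
        }
    }

    void* malloc(std::size_t sz) {
        if (sz > kBlock) return nullptr;  // outside this layer's size class
        if (head_ == nullptr) return SuperHeap::malloc(kBlock);
        Node* n = head_;                  // pop a recycled block
        head_ = n->next;
        return n;
    }

    void free(void* p) {
        if (p == nullptr) return;
        head_ = ::new (p) Node{head_};    // push the block onto the free list
    }
};

// Layer that counts live allocations, then forwards to its parent layer.
template <class SuperHeap>
class CountingHeap : public SuperHeap {
    std::size_t live_ = 0;

public:
    void* malloc(std::size_t sz) {
        void* p = SuperHeap::malloc(sz);
        if (p != nullptr) ++live_;
        return p;
    }
    void free(void* p) {
        if (p != nullptr) --live_;
        SuperHeap::free(p);
    }
    std::size_t live() const { return live_; }
};

// Composition is a single type expression; no virtual calls are involved.
using SmallObjectHeap = CountingHeap<FreelistHeap<64, MallocHeap>>;

// Returns true if a freed block is reused by the next allocation.
inline bool freed_block_is_recycled() {
    SmallObjectHeap heap;
    void* a = heap.malloc(48);
    heap.free(a);
    void* b = heap.malloc(16);
    const bool same = (a != nullptr) && (a == b);
    heap.free(b);
    return same && heap.live() == 0;
}
```

In this sketch, swapping or reordering layers only means editing the `using` declaration, for example dropping `CountingHeap` in a release build. Whether the composed allocator is actually free of overhead depends on the compiler inlining the forwarded calls, so that claim should be checked by measurement rather than assumed from the code's shape.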