Advanced prefetching and caching of models with PrefetchML

Gwendal Daniel, Gerson Sunyé, Jordi Cabot
Software and Systems Modeling (SoSyM), 2018
Caching and prefetching techniques have been used for decades in database engines and file systems to improve the performance of I/O-intensive applications. A prefetching algorithm typically takes advantage of the system's latencies by loading into main memory elements that will be needed in the future, speeding up data access. While these solutions can bring a significant improvement in terms of execution time, prefetching rules are often defined at the data level, making them hard to understand,
maintain, and optimize. In addition, low-level prefetching and caching components are difficult to align with scalable model persistence frameworks because they are unaware of potential optimizations relying on the analysis of metamodel-level information, and they are less present in NoSQL databases, a common solution for storing large models. To overcome this situation we propose PrefetchML, a framework that executes prefetching and caching strategies over models. Our solution embeds a DSL to precisely configure the prefetching rules to follow, and a monitoring component that provides insight into how the prefetching execution is working, helping designers optimize their prefetching plans. Our experiments show that PrefetchML is a suitable solution to improve query execution time on top of scalable model persistence frameworks. Tool support is fully available online as an open-source Eclipse plugin.

Prefetching and caching are two well-known approaches to improving the performance of applications that rely intensively on I/O accesses. Prefetching consists of bringing objects into memory before they are actually requested by the application, to reduce the performance issues caused by the latency of I/O accesses. Fetched objects are then stored in memory to speed up their (possible) later access. In contrast, caching aims to speed up access by keeping in memory objects that have already been loaded. Prefetching and caching have been part of database management systems and file systems for a long time and have proved their efficiency in several use cases [25, 28]. P. Cao et al. [6] showed that integrating prefetching and caching strategies dramatically improves the performance of I/O-intensive applications. In short, prefetching mechanisms work by adding load instructions (according to prefetching rules derived from static analysis [18] or execution trace analysis [8]) into an existing program. Global policies (e.g., LRU, least recently used; MRU, most recently used)
control the cache contents.

Given that model-driven engineering (MDE) is progressively being adopted in industry [17, 23], we believe that support for prefetching and caching techniques at the modeling level is required to raise the scalability of MDE tools dealing with large models, where storing, editing, transforming, and querying operations are major issues [21, 32]. Such large models typically appear in various engineering fields, such as civil engineering [1], the automotive industry [4], and product lines [26], and in software maintenance and evolution tasks such as reverse engineering [5]. Existing approaches have proposed scalable model persistence frameworks on top of SQL and NoSQL databases [13, 15, 19, 24]. These frameworks use lazy-loading techniques to load into main memory only the parts of the model that need to be accessed. This helps deal with large models that would otherwise not fit in memory, but it adds an execution time overhead due to the latency of the I/O accesses that load model excerpts from the database, especially when executed in a distributed environment. Existing frameworks typically rely on the prefetching and caching capabilities of the underlying database (when they exist) to speed up query computation in a generic way, i.e., regardless of the context of the performed model manipulation. This facilitates their use in a variety of scenarios but prevents them from providing model-specific optimizations, which would require understanding the type of the model (i.e., its metamodel) to come up with accurate loading strategies. In this sense, this paper proposes a new prefetching and caching framework for models. We present PrefetchML, a domain-specific language and execution engine, to specify prefetching and caching policies and execute them at run-time in order to optimize model access operations.
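The interplay between a cache policy and a prefetching rule described above can be illustrated with a minimal sketch. This is an illustrative example, not code from PrefetchML: the names `LRUCache`, `fetch_with_prefetch`, and `prefetch_rule` are ours, and a real engine would issue the prefetch loads asynchronously rather than inline.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the LRU entry

def fetch_with_prefetch(obj_id, cache, load, prefetch_rule):
    """Return the requested object, then eagerly load the objects the
    prefetching rule predicts will be accessed next."""
    value = cache.get(obj_id)
    if value is None:
        value = load(obj_id)  # the costly I/O access
        cache.put(obj_id, value)
    for related_id in prefetch_rule(obj_id):  # predicted future accesses
        if cache.get(related_id) is None:
            cache.put(related_id, load(related_id))
    return value
```

A subsequent request for an object that the rule already brought in is then served from the cache, avoiding the I/O latency entirely.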
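To make the metamodel-level idea concrete, the sketch below shows how a rule keyed on a metamodel type could drive prefetching: accessing a Package triggers the loading of its contained classes. All names here (`PREFETCH_RULES`, `on_access`, the feature names) are hypothetical illustrations, not the actual PrefetchML DSL syntax or the EMF API; models are simplified to plain dictionaries.

```python
# Hypothetical rule table: when an object of a given metamodel type is
# accessed, prefetch the objects referenced by the listed features.
PREFETCH_RULES = {
    "Package": ["ownedClasses"],
    "Class": ["attributes", "superTypes"],
}

def on_access(obj, cache, resolve):
    """Signal that `obj` was accessed and prefetch the targets of the
    features named by the rule matching its metamodel type."""
    for feature in PREFETCH_RULES.get(obj["type"], []):
        for target_id in obj.get(feature, []):
            if target_id not in cache:
                # A real engine would schedule this load asynchronously.
                cache[target_id] = resolve(target_id)
```

The key point the example tries to convey is that the rules are expressed in terms of types and features of the metamodel, information that a database-level prefetcher does not have.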
This DSL allows designers to tailor the prefetching rules to the specific needs of their model manipulation scenarios, even providing several execution plans for different use cases. The DSL itself is generic and could be part of any modeling stack, but our framework is built on top of the Eclipse Modeling Framework (EMF) infrastructure and is therefore compatible with existing scalable model persistence approaches, regardless of whether those
doi:10.1007/s10270-018-0671-8