D12.4: Performance Optimized Lustre

Ernest Artiaga, Alberto Miranda
2012 Zenodo  
This report documents the research and development carried out within Task 12.4. The main goal of the task is to identify and address open issues in file systems for multi-petascale and exascale facilities, with the aim of developing solutions that can be applied to the Lustre file system. The issues addressed fall into two main areas: metadata management and data management. Metadata handling involves dealing with huge numbers of files and their hierarchical organization according to the user's view (including directory management and file attributes). Data handling deals with the storage of file contents and management data; this includes, in particular, techniques for automatic (self-tuned) placement of data on a system with many heterogeneous devices, aiming at maximizing bandwidth and minimizing response time. The work carried out in the area of metadata management included: the observation, measurement and study of a large-scale system currently in production, in order to identify the key metadata-related issues; the development of a prototype aimed at improving metadata behaviour in such a system and at providing a framework to easily deploy novel metadata management techniques on top of other systems; the measurement and study of specially deployed Lustre and GPFS prototypes to validate the presence of the metadata issues observed in current production systems; and finally the porting of the framework prototype to test novel metadata management techniques on the Lustre prototype facility. Along this line, we have observed that both Lustre and GPFS have scalability issues that reduce the performance of metadata operations when applications use many files or when the number of accessing clients grows. The most important observation is that the problem appears with only a few hundred files and a few dozen clients. This clearly shows that the problem needs to be addressed. After our mechanism has been added to the [...]
doi:10.5281/zenodo.6572353 fatcat:nkqucavqavapzabifinzn6ucxq