PODIO: recent developments in the Plain Old Data EDM toolkit

Frank-Dieter Gaede, Graeme A Stewart, Benedikt Hegner
2019 Zenodo  
PODIO is a C++ toolkit for the creation of event data models (EDMs) with a fast and efficient I/O layer, developed in the AIDA2020 project. It employs plain-old-data (POD) data structures wherever possible, while avoiding deep object hierarchies and virtual inheritance. A lightweight layer of handle classes provides the necessary high-level interface for the physicist, such as support for inter-object relationships, convenient iteration through objects, and automatic memory management. PODIO creates all EDM code from simple, instructive YAML files that describe the actual EDM entities. Since its original development, PODIO has been very actively used for Future Circular Collider studies. In its original version, the underlying I/O was based entirely on the automatic streaming code generated with ROOT dictionaries. Recently, two additional I/O implementations have been added. One is based on HDF5 and the other uses SIO, a simple binary I/O library provided by LCIO (the Linear Collider I/O EDM). HDF5 is heavily used in many other science fields as well as by the machine learning community. Providing the option to persist the EDM in this way allows HEP data to be used with tools built around that ecosystem. The SIO implementation exploits the array-of-struct data layout with the goal of optimising the I/O performance. We briefly introduce the main features of PODIO and then report on recent developments, with a focus on performance comparisons between the three available I/O implementations. We conclude by presenting recent activities on porting the well-established LCIO EDM to PODIO, and discuss the possibility of defining a common HEP EDM that is shared by all future collider studies.
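The design described in the abstract (POD structs held in a contiguous array-of-struct buffer, accessed through lightweight handles) can be illustrated with a minimal C++ sketch. This is not PODIO's generated code or its API: the names `HitData`, `Hit`, `HitCollection`, and `totalEnergy` are hypothetical and chosen only for illustration, and the handle here is a bare index into the collection, without the relation handling or memory management the abstract mentions.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical POD payload (not an actual PODIO type): plain members only,
// no virtual functions, no inheritance.
struct HitData {
  double x;
  double y;
  double z;
  double energy;
};

// A POD in this sense can be copied byte-wise, so a whole array of them
// can in principle be written to disk as one block.
static_assert(std::is_trivially_copyable<HitData>::value,
              "HitData must be trivially copyable");
static_assert(std::is_standard_layout<HitData>::value,
              "HitData must have standard layout");

class HitCollection;

// Lightweight handle: refers to one element by collection pointer and index,
// so it stays valid when the underlying vector reallocates.
class Hit {
public:
  Hit(HitCollection* coll, std::size_t index) : coll_(coll), index_(index) {}
  double energy() const;
  void setEnergy(double e);

private:
  HitCollection* coll_;
  std::size_t index_;
};

// Collection that owns the data as a contiguous array of structs.
class HitCollection {
public:
  Hit create(double x, double y, double z, double energy) {
    data_.push_back(HitData{x, y, z, energy});
    return Hit(this, data_.size() - 1);
  }

  Hit at(std::size_t i) {
    assert(i < data_.size());
    return Hit(this, i);
  }

  std::size_t size() const { return data_.size(); }

  // Raw view of the buffer, as an I/O backend might consume it.
  const HitData* buffer() const { return data_.data(); }
  std::size_t bufferBytes() const { return data_.size() * sizeof(HitData); }

private:
  friend class Hit;
  std::vector<HitData> data_;
};

inline double Hit::energy() const { return coll_->data_[index_].energy; }
inline void Hit::setEnergy(double e) { coll_->data_[index_].energy = e; }

// Example of iterating through the collection via handles.
double totalEnergy(HitCollection& hits) {
  double sum = 0.0;
  for (std::size_t i = 0; i < hits.size(); ++i) {
    sum += hits.at(i).energy();
  }
  return sum;
}
```

Because every element is trivially copyable and stored contiguously, `buffer()` and `bufferBytes()` expose the whole collection as a single block of memory, which is the property an array-of-struct I/O backend would rely on in a sketch like this.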
doi:10.5281/zenodo.3599436 fatcat:kaxdk4ugivgejpaz5vqxx7kgx4