Building analytical platform with Big Data solutions for log files of PanDA infrastructure

A A Alekseev, F G Barreiro Megino, A A Klimentov, T A Korchuganova, T Maendo, S V Padolski
2018 Journal of Physics: Conference Series
The paper describes the implementation of a high-performance system for processing and analyzing log files for the PanDA infrastructure of the ATLAS experiment at the Large Hadron Collider (LHC). PanDA handles workload management for on the order of 2M jobs per day across the Worldwide LHC Computing Grid. The solution is based on the ELK technology stack, which includes several components: Filebeat, Logstash, Elasticsearch (ES), and Kibana. Filebeat collects data from the logs. Logstash processes the data and exports it to Elasticsearch. ES is responsible for centralized data storage. Data accumulated in ES can be viewed with Kibana. These components were integrated with the PanDA infrastructure and replaced the previous log processing systems to improve scalability and usability. The authors describe all the components and their configuration tuning for the current tasks and the scale of the actual system, and give several real-life examples of how this centralized log processing and storage service is used, showcasing its advantages for daily operations.
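The collect, process, and store stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' configuration: the log line layout, the field names, and the index name "panda-logs" are assumptions made for illustration only. The sketch parses one text log line into a structured document (the role Logstash plays) and serializes documents into the newline-delimited JSON body that the Elasticsearch Bulk API accepts.

```python
import json
import re
from datetime import datetime, timezone

# Assumed (hypothetical) log layout: "<date> <time> <LEVEL> <logger> <message>".
# Real PanDA log formats may differ.
LOG_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) (?P<message>.*)$"
)


def parse_line(line):
    """Turn one raw log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Merge date and time into a single ISO 8601 timestamp (UTC is assumed).
    stamp = datetime.strptime(
        fields.pop("date") + " " + fields.pop("time"), "%Y-%m-%d %H:%M:%S"
    ).replace(tzinfo=timezone.utc)
    fields["@timestamp"] = stamp.isoformat()
    return fields


def to_bulk_body(docs, index_name):
    """Serialize documents as an Elasticsearch Bulk API body (NDJSON)."""
    lines = []
    for doc in docs:
        # Each document is preceded by an action line naming the target index.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"
```

In a deployment like the one described, Filebeat would ship the raw lines and Logstash would do the parsing; the sketch only shows the shape of the data as it moves from text log to indexed document.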
doi:10.1088/1742-6596/1015/3/032003 fatcat:cjqsynui6nbe7h3iuxusgtviva