Extension of the INFN Tier-1 on a HPC system

Tommaso Boccali, Luca Dell'Agnello, Daniele Bonacorsi, Concezio Bozzi, Anna Lupato, Alessio Gianelle, Alessandro De Salvo, Stefano Dal Pra, Stefano Zani, Daniele Spiga, Diego Ciangottini, Andrea Valassi
2019 Zenodo  
The INFN Tier-1 located at CNAF in Bologna (Italy) is a major center of the WLCG e-Infrastructure, supporting the 4 major LHC collaborations and more than 30 other INFN-related experiments. After multiple tests towards elastic expansion of CNAF compute power via Cloud resources (provided by Azure, Aruba and in the framework of the HNSciCloud project), but also building on the experience gained with the production-quality extension of the Tier-1 farm on remote owned sites, the CNAF team, in collaboration with experts from the ATLAS, CMS, and LHCb experiments, has been working to put into production an integrated HTC+HPC solution with the PRACE CINECA center, located near Bologna. Such an extension will be implemented on the Marconi A2 partition, equipped with Intel Knights Landing (KNL) processors. A number of technical challenges were faced and solved in order to successfully run on low-RAM nodes, as well as to overcome the closed environment (network, access, software distribution, ...) that HPC systems deploy with respect to standard GRID sites. We show preliminary results from a large-scale integration effort, using resources secured via the successful PRACE grant N. 2018194658, for 30 million KNL core hours.
doi:10.5281/zenodo.3598876 fatcat:gcrxmhyrlng75pow2kzz4ozynu
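
The abstract above mentions the work needed to run on low-RAM KNL nodes. As a purely illustrative aid, the following minimal Python sketch shows the kind of memory-budget arithmetic involved in deciding how many single-core job slots a many-core, low-RAM node can host; the node and per-job figures are hypothetical and are not taken from the paper.

# Hypothetical memory-budget estimate for scheduling single-core jobs on
# low-RAM many-core nodes. All figures are illustrative, not from the paper.

def usable_slots(node_cores, node_ram_gb, job_ram_gb):
    """How many single-core job slots fit on one node when every job
    must be guaranteed job_ram_gb of memory."""
    ram_limited = int(node_ram_gb // job_ram_gb)  # slots allowed by memory
    return min(node_cores, ram_limited)           # cannot exceed core count

if __name__ == "__main__":
    cores, ram, per_job = 68, 96.0, 2.0           # example values only
    slots = usable_slots(cores, ram, per_job)
    print(f"{slots} of {cores} cores usable at {per_job} GB per job")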

Migration of CMSWEB cluster at CERN to Kubernetes: a comprehensive study

Muhammad Imran, Valentin Kuznetsov, Katarzyna Maria Dziedziniewicz-Wojcik, Andreas Pfeiffer, Panos Paparrigopoulos, Spyridon Trigazis, Tommaso Tedeschi, Diego Ciangottini
2021 Cluster Computing  
Diego Ciangottini (m) got his Ph.D. in Physics (2015) in Perugia. ... Panos Paparrigopoulos (panos.paparrigopoulos@cern.ch), Spyridon Trigazis (spyridon.trigazis@cern.ch), Tommaso Tedeschi (tommaso.tedeschi@pg.infn.it), Diego Ciangottini (diego.ciangottini@pg.infn.it) ...
doi:10.1007/s10586-021-03325-0 fatcat:jkeceooicndxpf5nscjr5thn6q

Improving efficiency of analysis jobs in CMS

Todor Trendafilov Ivanov, Stefano Belforte, Matthias Wolf, Marco Mascheroni, Antonio Pérez-Calero Yzquierdo, James Letts, José M. Hernández, Leonardo Cristella, Diego Ciangottini, Justas Balcas, Anna Elizabeth Woodard, Kenyi Hurtado Anampa (+7 others)
2019 EPJ Web of Conferences  
Hundreds of physicists analyze data collected by the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider using the CMS Remote Analysis Builder and the CMS global pool to exploit the resources of the Worldwide LHC Computing Grid. Efficient use of such an extensive and expensive resource is crucial. At the same time, the CMS collaboration is committed to minimizing time to insight for every scientist, by pushing for as few access restrictions as possible to the full data sample and supporting the free choice of applications to run on the computing resources. Supporting such a variety of workflows while preserving efficient resource usage poses special challenges. In this paper we report on three complementary approaches adopted in CMS to improve the scheduling efficiency of user analysis jobs: automatic job splitting, automated run-time estimates and automated site selection for jobs.
doi:10.1051/epjconf/201921403006 fatcat:xlhjhjdjjveedgbezd7fjffq7i
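
To make the splitting idea above concrete, here is a hedged Python sketch of run-time-driven job splitting: choose how many events each job processes so that its estimated wall-clock time stays near a target. The numbers and function names are illustrative only and do not reproduce the actual CRAB algorithm.

# Illustrative run-time-driven job splitting: pick how many events each job
# processes so that its estimated wall-clock time stays near a target.
import math

def events_per_job(seconds_per_event, target_job_seconds):
    return max(1, int(target_job_seconds // seconds_per_event))

def split(total_events, seconds_per_event, target_job_seconds=8 * 3600):
    """Split total_events into chunks that each run for ~target_job_seconds."""
    chunk = events_per_job(seconds_per_event, target_job_seconds)
    n_jobs = math.ceil(total_events / chunk)
    return [chunk] * (n_jobs - 1) + [total_events - chunk * (n_jobs - 1)]

if __name__ == "__main__":
    # e.g. 2 M events at an estimated 0.05 s/event, targeting 8 h per job
    jobs = split(2_000_000, 0.05)
    print(len(jobs), "jobs; first job processes", jobs[0], "events")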

Extension of the INFN Tier-1 on a HPC system [article]

Tommaso Boccali, Stefano Dal Pra, Daniele Spiga, Diego Ciangottini, Stefano Zani, Concezio Bozzi, Alessandro De Salvo, Andrea Valassi, Francesco Noferini, Luca dell Agnello, Federico Stagni, Alessandra Doria (+1 others)
2020 arXiv   pre-print
The INFN Tier-1 located at CNAF in Bologna (Italy) is a center of the WLCG e-Infrastructure, supporting the 4 major LHC collaborations and more than 30 other INFN-related experiments. After multiple tests towards elastic expansion of CNAF compute power via Cloud resources (provided by Azure, Aruba and in the framework of the HNSciCloud project), and building on the experience gained with the production-quality extension of the Tier-1 farm on remote owned sites, the CNAF team, in collaboration with experts from the ALICE, ATLAS, CMS, and LHCb experiments, has been working to put into production an integrated HTC+HPC solution with the PRACE CINECA center, located near Bologna. Such an extension will be implemented on the Marconi A2 partition, equipped with Intel Knights Landing (KNL) processors. A number of technical challenges were faced and solved in order to successfully run on low-RAM nodes, as well as to overcome the closed environment (network, access, software distribution, ...) that HPC systems deploy with respect to standard GRID sites. We show preliminary results from a large-scale integration effort, using resources secured via the successful PRACE grant N. 2018194658, for 30 million KNL core hours.
arXiv:2006.14603v1 fatcat:rupnyh47zffwzcpd2sbp745nyi

Progress in Double Parton Scattering Studies [article]

Sunil Bansal, Paolo Bartalini, Boris Blok, Diego Ciangottini, Markus Diehl, Fiorella M. Fionda, Jonathan R. Gaunt, Paolo Gunnellini, Tristan Du Pree, Tomas Kasemets, Daniel Ostermeier, Sergio Scopetta (+4 others)
2014 arXiv   pre-print
An overview of theoretical and experimental progress in double parton scattering (DPS) is presented. The theoretical topics cover factorization in DPS, models for double parton distributions and DPS in charm production and nuclear collisions. On the experimental side, CMS results for dijet and double J/ψ production, in light of DPS, as well as first results for the 4-jet channel are presented. ALICE reports on a study of open charm and J/ψ multiplicity dependence.
arXiv:1410.6664v1 fatcat:66yrvnzbhbanpjjhs5f5rtetgm

The DODAS Experience on the EGI Federated Cloud

Daniele Spiga, Enol Fernandez, Vincenzo Spinoso, Diego Ciangottini, Mirco Tracolli, Giacinto Donvito, Marica Antonacci, Davide Salomoni, Andrea Ceccanti, Doina Cristina Duma, Luciano Gaido, C. Doglioni (+5 others)
2020 EPJ Web of Conferences  
The EGI Cloud Compute service offers a multi-cloud IaaS federation that brings together research clouds as a scalable computing platform for research, accessible with OpenID Connect Federated Identity. The federation is not limited to single sign-on; it also introduces features to facilitate the portability of applications across providers: i) a common VM image catalogue with VM image replication, to ensure these images will be available at providers whenever needed; ii) a GraphQL information API to understand the capacities and capabilities available at each provider; and iii) integration with orchestration tools (such as Infrastructure Manager) to abstract the federation and facilitate using heterogeneous providers. EGI also monitors the correct function of every provider and collects usage information across all the infrastructure. DODAS (Dynamic On Demand Analysis Service) is an open-source Platform-as-a-Service tool which allows software applications to be deployed over heterogeneous and hybrid clouds. DODAS is one of the so-called Thematic Services of the EOSC-hub project; it instantiates on-demand container-based clusters, offering a high level of abstraction to users and allowing them to exploit distributed cloud infrastructures with very limited knowledge of the underlying technologies. This work presents a comprehensive overview of the DODAS integration with the EGI Cloud Federation, reporting the experience of the integration with the CMS experiment's submission infrastructure.
doi:10.1051/epjconf/202024507033 fatcat:wasfikiu65bkhmexfxo5oubpei
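
The abstract refers to a GraphQL information API for discovering provider capacities. The sketch below shows how a client could query such an endpoint over plain HTTP; the endpoint URL and the schema fields (sites, cloudComputingEndpoints, ...) are placeholders chosen for illustration, not the real EGI API.

# Minimal GraphQL query over HTTP. The endpoint URL and the schema fields
# are hypothetical placeholders, not the actual EGI information API.
import json
import urllib.request

ENDPOINT = "https://example.org/information-discovery/graphql"  # placeholder

QUERY = """
{
  sites {
    name
    cloudComputingEndpoints { endpointURL }
  }
}
"""

def run_query(endpoint, query):
    payload = json.dumps({"query": query}).encode()
    req = urllib.request.Request(endpoint, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    data = run_query(ENDPOINT, QUERY)
    for site in data.get("data", {}).get("sites", []):
        print(site["name"], len(site["cloudComputingEndpoints"]), "endpoints")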

Smart Caching at CMS: applying AI to XCache edge services

Daniele Spiga, Diego Ciangottini, Mirco Tracolli, Tommaso Tedeschi, Daniele Cesini, Tommaso Boccali, Valentina Poggioni, Marco Baioletti, Valentin Y. Kuznetsov, C. Doglioni, D. Kim, G.A. Stewart (+3 others)
2020 EPJ Web of Conferences  
The projected storage and compute needs for the HL-LHC will be a factor of up to 10 above what can be achieved by the evolution of current technology within a flat budget. The WLCG community is studying possible technical solutions to evolve the current computing in order to cope with the requirements; one of the main focuses is resource optimization, with the ultimate aim of improving performance and efficiency, as well as simplifying and reducing operation costs. As of today, the Data Lake model for storage is considered a good candidate for addressing HL-LHC data access challenges. The Data Lake model under evaluation can be seen as a logical system that hosts a distributed working set of analysis data. Compute power can be "close" to the lake, but also remote and thus completely external. In this context we expect data caching to play a central role as a technical solution to reduce the impact of latency and to reduce network load. A geographically distributed caching layer will be functional to the many satellite computing centers that might appear and disappear dynamically. In this talk we propose a system of caches distributed at the national level, describing both the deployment and the results of the studies made to measure the impact on CPU efficiency. In this contribution we also present early results on a novel caching strategy beyond the standard XRootD approach, whose results will be a baseline for an AI-based smart caching system.
doi:10.1051/epjconf/202024504024 fatcat:vb32hfev2fgdlone4sxbmicwqm
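
As a concrete reference point for what a "baseline" caching policy means in this context, here is a toy Python simulator of least-recently-used eviction over a stream of file requests, reporting the byte hit rate. It is purely illustrative and is neither the XCache/XRootD implementation nor the AI-based strategy studied in the paper.

# Toy cache simulator: least-recently-used eviction over a stream of file
# requests, reporting the byte hit rate.
from collections import OrderedDict

def simulate_lru(requests, file_sizes, capacity):
    """requests: iterable of file names; file_sizes: name -> size;
    capacity: total cache size in the same units as file_sizes."""
    cache = OrderedDict()          # name -> size, ordered by recency
    used = 0
    hit_bytes = total_bytes = 0
    for name in requests:
        size = file_sizes[name]
        total_bytes += size
        if name in cache:
            cache.move_to_end(name)                 # refresh recency on a hit
            hit_bytes += size
            continue
        while used + size > capacity and cache:
            _, evicted = cache.popitem(last=False)  # evict the LRU file
            used -= evicted
        if size <= capacity:
            cache[name] = size
            used += size
    return hit_bytes / total_bytes if total_bytes else 0.0

if __name__ == "__main__":
    sizes = {"a": 4, "b": 6, "c": 5}
    stream = ["a", "b", "a", "c", "a", "b"]
    print(f"byte hit rate: {simulate_lru(stream, sizes, capacity=10):.2f}")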

First experiences with a portable analysis infrastructure for LHC at INFN

Diego Ciangottini, Tommaso Boccali, Andrea Ceccanti, Daniele Spiga, Davide Salomoni, Tommaso Tedeschi, Mirco Tracolli, C. Biscarat, S. Campana, B. Hegner, S. Roiser, C.I. Rovelli (+1 others)
2021 EPJ Web of Conferences  
The challenges posed by the HL-LHC era are not limited to the sheer amount of data to be processed: the capability of optimizing the analyser's experience will also bring important benefits for the LHC communities, in terms of total resource needs, user satisfaction and reduction of time to publication. At the Italian National Institute for Nuclear Physics (INFN) a portable software stack for analysis has been proposed, based on cloud-native tools and capable of providing users with a fully integrated analysis environment for the CMS experiment. The main characterizing traits of the solution consist in the user-driven design and the portability to any cloud resource provider. All this is made possible via an evolution towards a "python-based" framework that enables the usage of a set of open-source technologies largely adopted in both cloud-native and data-science environments. In addition, a "single sign-on"-like experience is available thanks to the standards-based integration of INDIGO-IAM with all the tools. The integration of compute resources is done through the customization of a JupyterHub solution, able to spawn identity-aware user instances ready to access data with no further setup actions. The integration with GPU resources is also available, designed to sustain increasingly widespread ML-based workflows. Seamless connections between the user UI and batch/big data processing frameworks (Spark, HTCondor) are possible. Finally, the experiment data access latency is reduced thanks to the integrated deployment of a scalable set of caches, as developed in the context of the ESCAPE project, and as such compatible with future scenarios where a data lake will be available for the research community. The outcome of the evaluation of such a solution in action is presented, showing how a real CMS analysis workflow can make use of the infrastructure to achieve its results.
doi:10.1051/epjconf/202125102045 fatcat:ubtyoepjsveqthwq3eze5tysdm
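
To illustrate the kind of setup the abstract describes, the sketch below shows what a JupyterHub configuration combining an OpenID Connect (INDIGO-IAM) login with container spawning could look like. The option names follow the widely used oauthenticator and kubespawner packages as commonly documented, while the URLs, credentials and image are placeholders; this is a sketch of the approach, not the actual INFN configuration.

# jupyterhub_config.py sketch: spawn per-user containers and authenticate
# against an INDIGO-IAM (OpenID Connect) instance. All URLs, credentials and
# the container image below are placeholders.
c = get_config()  # provided by JupyterHub when it loads this file

# Authenticate users via a generic OIDC provider (INDIGO-IAM here).
c.JupyterHub.authenticator_class = "oauthenticator.generic.GenericOAuthenticator"
c.GenericOAuthenticator.client_id = "REPLACE_ME"
c.GenericOAuthenticator.client_secret = "REPLACE_ME"
c.GenericOAuthenticator.oauth_callback_url = "https://hub.example.org/hub/oauth_callback"
c.GenericOAuthenticator.authorize_url = "https://iam.example.org/authorize"
c.GenericOAuthenticator.token_url = "https://iam.example.org/token"
c.GenericOAuthenticator.userdata_url = "https://iam.example.org/userinfo"
c.GenericOAuthenticator.scope = ["openid", "profile", "email"]

# Spawn each user's session as a container on Kubernetes.
c.JupyterHub.spawner_class = "kubespawner.KubeSpawner"
c.KubeSpawner.image = "example.org/analysis-env:latest"      # placeholder image
# Expose the identity provider to the session so data-access tools can use it.
c.KubeSpawner.environment = {"IAM_ISSUER": "https://iam.example.org/"}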

Extension of the INFN Tier-1 on a HPC system

Tommaso Boccali, Stefano Dal Pra, Daniele Spiga, Diego Ciangottini, Stefano Zani, Concezio Bozzi, Alessandro De Salvo, Andrea Valassi, Francesco Noferini, Luca dell'Agnello, Federico Stagni, Alessandra Doria (+7 others)
2020 EPJ Web of Conferences  
The INFN Tier-1 located at CNAF in Bologna (Italy) is a center of the WLCG e-Infrastructure, supporting the 4 major LHC collaborations and more than 30 other INFN-related experiments. After multiple tests towards elastic expansion of CNAF compute power via Cloud resources (provided by Azure, Aruba and in the framework of the HNSciCloud project), and building on the experience gained with the production-quality extension of the Tier-1 farm on remote owned sites, the CNAF team, in collaboration with experts from the ALICE, ATLAS, CMS, and LHCb experiments, has been working to put into production an integrated HTC+HPC solution with the PRACE CINECA center, located near Bologna. Such an extension will be implemented on the Marconi A2 partition, equipped with Intel Knights Landing (KNL) processors. A number of technical challenges were faced and solved in order to successfully run on low-RAM nodes, as well as to overcome the closed environment (network, access, software distribution, ...) that HPC systems deploy with respect to standard GRID sites. We show preliminary results from a large-scale integration effort, using resources secured via the successful PRACE grant N. 2018194658, for 30 million KNL core hours.
doi:10.1051/epjconf/202024509009 fatcat:7c4zvg6jhbgitljy62xj6ela4m

Exploiting private and commercial clouds to generate on-demand CMS computing facilities with DODAS

Daniele Spiga, Marica Antonacci, Tommaso Boccali, Andrea Ceccanti, Diego Ciangottini, Riccardo Di Maria, Giacinto Donvito, Cristina Duma, Luciano Gaido, Álvaro López García, Aida Palacio Hoz, Davide Salomoni (+6 others)
2019 EPJ Web of Conferences  
Minimising time and cost is key to exploiting private or commercial clouds. This can be achieved by increasing setup and operational efficiencies. Success and sustainability are thus obtained by reducing the learning curve, as well as the operational cost of managing community-specific services running on distributed environments. The greatest beneficiaries of this approach are communities willing to exploit opportunistic cloud resources. DODAS builds on several EOSC-hub services developed by the INDIGO-DataCloud project and allows the instantiation of on-demand container-based clusters. These execute software applications to benefit from potentially "any cloud provider", generating sites on demand with almost zero effort. DODAS provides ready-to-use solutions to implement a "Batch System as a Service" as well as a Big Data platform for "Machine Learning as a Service", offering a high level of customization to integrate specific scenarios. A description of the DODAS architecture will be given, including the CMS integration strategy adopted to connect it with the experiment's HTCondor Global Pool. Performance and scalability results of DODAS-generated tiers processing real CMS analysis jobs will be presented. The Instituto de Física de Cantabria and Imperial College London use cases will be sketched. Finally, a high-level strategy overview for optimizing data ingestion in DODAS will be described.
doi:10.1051/epjconf/201921407027 fatcat:lfuyzuau75g4xj7cn53vzhesiq
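
As an illustration of the "Batch System as a Service" idea, the following sketch submits a single job to an HTCondor schedd through the official Python bindings (recent versions expose Schedd.submit()). The executable, arguments and resource requests are placeholders, and the snippet does not reproduce the DODAS or CMS Global Pool configuration.

# Submit one job to an HTCondor schedd via the htcondor Python bindings.
# The payload script and resource requests are placeholders for illustration.
import htcondor

job = htcondor.Submit({
    "executable": "run_analysis.sh",    # placeholder payload script
    "arguments": "dataset.txt",
    "request_cpus": "1",
    "request_memory": "2000MB",
    "output": "job.out",
    "error": "job.err",
    "log": "job.log",
})

schedd = htcondor.Schedd()              # talk to the local schedd
result = schedd.submit(job, count=1)    # queue a single job
print("submitted cluster", result.cluster())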

Rucio - Scientific Data Management [article]

Martin Barisits, Thomas Beermann, Frank Berghaus, Brian Bockelman, Joaquin Bogado, David Cameron, Dimitrios Christidis, Diego Ciangottini, Gancho Dimitrov, Markus Elsing, Vincent Garonne, Alessandro di Girolamo, Luc Goossens, Wen Guan (+15 others)
2019 arXiv   pre-print
Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their volumes of data. The data can be distributed across heterogeneous data centers at widely distributed locations. Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and is continuously extended to support the LHC experiments and other diverse scientific communities. In this article we detail the fundamental concepts of Rucio, describe the architecture along with implementation details, and give operational experience from production usage.
arXiv:1902.09857v1 fatcat:6rrkmqtlqrfq5koydykwv7vlnq

Rucio: Scientific Data Management

Martin Barisits, Thomas Beermann, Frank Berghaus, Brian Bockelman, Joaquin Bogado, David Cameron, Dimitrios Christidis, Diego Ciangottini, Gancho Dimitrov, Markus Elsing, Vincent Garonne, Alessandro di Girolamo (+18 others)
2019 Computing and Software for Big Science  
Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their data at scale. The data can be distributed across heterogeneous data centers at widely distributed locations. Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and is now continuously extended to support the LHC experiments and other diverse scientific communities. In this article, we detail the fundamental concepts of Rucio, describe the architecture along with implementation details, and report operational experience from production usage.

... for data organization, management, and access for scientific experiments which incorporates the existing tools and makes it easy to interact with them. One of the guiding principles of Rucio is dataflow autonomy and automation, and its design is geared towards that goal. Rucio is built on the experiences of its predecessor system DQ2 [5] and on modern technologies, and expands on the functionality, scalability, robustness, and efficiency required for data-intensive sciences. Within ATLAS, Rucio is responsible for detector data, simulation data, as well as user data, and provides a unified interface across heterogeneous storage and network infrastructures. Rucio also offers advanced features such as data recovery or adaptive replication, and is frequently extended to support LHC experiments and other diverse scientific communities.

In this article, we describe the Rucio data management system. We start by detailing the requirements of the ATLAS experiment and the motivation for Rucio. In Sect. "Concepts", we describe the core concepts and in Sect. "Architecture" the architecture of the system, including the implementation highlights. Section "Functionality Details" explains how the concepts and architecture together are used to provide data management functionality. We continue in Sect. "Operational Experience" with a view on the operational experience, with a focus on deployment, configuration, and system performance, and in Sect. "Advanced Features" with details on advanced workflows that can be enabled through Rucio. We close the article in Sect. "Summary" with a summary, an overview of the Rucio community, and an outlook on future work and challenges to prepare Rucio for the next generation of data-intensive experiments.

ATLAS Distributed Computing

ATLAS is one of the four major experiments at the Large Hadron Collider at CERN. It is a general-purpose particle physics experiment run by an international collaboration and designed to exploit the full discovery potential and the huge range of physics opportunities that the LHC provides. The experiment tracks and identifies particles to investigate a wide range of physics topics, from the study of the Higgs boson [6] to the search for supersymmetry [7], extra dimensions [8], or potential particles that make up dark matter [9]. The physics program of ATLAS is thus very diverse and requires a flexible data management system. ATLAS Distributed Computing (ADC) [10] covers all aspects of the computing systems for the experiment, across more than 130 computing centers worldwide. Within ADC, Rucio has been developed as the principal Distributed Data Management system, integrating with essentially every other component of the distributed computing infrastructure, most importantly the workflow management system PanDA [11] and the task definition and control system ProdSys [12]. ADC ...
doi:10.1007/s41781-019-0026-3 fatcat:3erfeeamhvg5ndqiiiibaunphm
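
One of the central Rucio concepts mentioned above is the replication rule. The sketch below asks Rucio to keep two copies of a dataset on Tier-1 storage via the Python client's add_replication_rule call; the scope, dataset name and RSE expression are placeholders, and a configured Rucio client environment (rucio.cfg plus valid credentials) is assumed.

# Create a replication rule through the Rucio Python client. The DID and the
# RSE expression are placeholders; a configured client environment is assumed.
from rucio.client import Client

client = Client()
rule_ids = client.add_replication_rule(
    dids=[{"scope": "user.jdoe", "name": "example.dataset"}],  # placeholder DID
    copies=2,                       # number of replicas to maintain
    rse_expression="tier=1",        # any storage element matching this qualifies
    lifetime=30 * 24 * 3600,        # optional: let the rule expire after 30 days
)
print("created rule(s):", rule_ids)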

QCD and Jets

Diego Ciangottini, Lucio Anderlini, Matteo Bauce
2016 Proceedings of VII Workshop Italiano sulla fisica pp a LHC — PoS(PP@LHC2016)   unpublished
In light of the successful restart of data-taking by the LHC experiments at the unprecedented centre-of-mass energy of √s = 13 TeV, we review the prospects for the second run of the LHC for measurements related to Quantum Chromodynamics (QCD) and jets in pp collisions. Recent results from the ATLAS, CMS, and LHCb collaborations lead the discussion on the open questions on soft production that the LHC experiments are called to address during the next few years of activities. The discussion is mainly focused on measurements related to the underlying event, to the production mechanism of jets, and to the associated production of jets and heavy flavours.
doi:10.22323/1.278.0017 fatcat:boys277opze4tipn3milllmfuu

Reinforcement Learning for Smart Caching at the CMS experiment

Tommaso Tedeschi, Mirco Tracolli, Diego Ciangottini, Daniele Spiga, Loriano Storchi, Marco Baioletti, Valentina Poggioni
2021 Proceedings of International Symposium on Grids & Clouds 2021 — PoS(ISGC2021)   unpublished
doi:10.22323/1.378.0009 fatcat:pjdvlpsv6vb5po5e4fnoyxu2wa

DODAS: How to effectively exploit heterogeneous clouds for scientific computations

Daniele Spiga, Marica Antonacci, Tommaso Boccali, Diego Ciangottini, Alessandro COSTANTINI, Giacinto DONVITO, Cristina Duma, Matteo Duranti, Valerio Formato, Luciano Gaido, Davide Salomoni, Mirco Tracolli (+1 others)
2018 Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD)   unpublished
Dynamic On Demand Analysis Service (DODAS) is a Platform as a Service tool built by combining several solutions and products developed by the INDIGO-DataCloud H2020 project. DODAS allows the instantiation of on-demand container-based clusters. Both an HTCondor batch system and a platform for Big Data analysis based on Spark, Hadoop, etc. can be deployed on any cloud-based infrastructure with almost zero effort. DODAS acts as a cloud enabler designed for scientists seeking to easily exploit distributed and heterogeneous clouds to process data. Aiming to reduce the learning curve as well as the operational cost of managing community-specific services running on a distributed cloud, DODAS completely automates the process of provisioning, creating, managing and accessing a pool of heterogeneous computing and storage resources. DODAS was selected as one of the Thematic Services that will provide multidisciplinary solutions in the EOSC-hub project, an integration and management system for the European Open Science Cloud starting in January 2018. The main goals of this contribution are to provide a comprehensive overview of the overall technical implementation of DODAS, as well as to illustrate two distinct real examples of usage: the integration within the CMS Workload Management System and the extension of the AMS computing model.
doi:10.22323/1.327.0024 fatcat:jco7oe4lzfd67kjdbytfmfqqsi
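
As an example of the Spark-based Big Data workloads such a DODAS-provisioned platform can host, here is a minimal, generic PySpark job; the file paths and column names are placeholders and nothing in it is DODAS-specific.

# Minimal PySpark job: read a CSV, aggregate, write the result as Parquet.
# Paths and column names are placeholders chosen for illustration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dodas-example").getOrCreate()

events = spark.read.csv("hdfs:///data/events.csv", header=True, inferSchema=True)
summary = (events
           .groupBy("run")                       # placeholder column name
           .agg(F.count("*").alias("n_events"),
                F.avg("pt").alias("mean_pt")))   # placeholder column name
summary.write.mode("overwrite").parquet("hdfs:///data/summary.parquet")

spark.stop()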