COFFEA - Columnar Object Framework For Effective Analysis

CMS Collaboration, Nick Smith
2019 Zenodo  
The COFFEA Framework provides a new approach to HEP analysis, via columnar operations, that improves time-to-insight, scalability, portability, and reproducibility of analysis.  ...  We will present published results from analysis of CMS data using the COFFEA framework along with a discussion of metrics and the user experience of arriving at those results with columnar analysis.  ...  analysis is effective for HEP analyst use cases -10 CMS analysis groups have implemented or are implementing their analysis in coffea Code samples I • Selects good candidates (per-entry selection)  ... 
doi:10.5281/zenodo.3598788 fatcat:5zgr5ogr35e4pggfbugqgocjc4

Coffea – Columnar Object Framework For Effective Analysis [article]

Nicholas Smith, Lindsey Gray, Matteo Cremonesi, Bo Jayatilaka, Oliver Gutsche, Allison Hall, Kevin Pedro, Maria Acosta, Andrew Melo, Stefano Belforte, Jim Pivarski
2020 arXiv   pre-print
The coffea framework provides a new approach to High-Energy Physics analysis, via columnar operations, that improves time-to-insight, scalability, portability, and reproducibility of analysis.  ...  We will discuss our experience in implementing analysis of CMS data using the coffea framework along with a discussion of the user experience and future directions.  ...  Conclusions Columnar analysis is an effective paradigm for HEP data analysis within the CMS collaboration, and we have implemented several maturing analyses in a columnar fashion.  ... 
arXiv:2008.12712v1 fatcat:o66txf5f55fvteua3may7pyz3a

Coffea Columnar Object Framework For Effective Analysis

Nicholas Smith, Lindsey Gray, Matteo Cremonesi, Bo Jayatilaka, Oliver Gutsche, Allison Hall, Kevin Pedro, Maria Acosta, Andrew Melo, Stefano Belforte, Jim Pivarski, C. Doglioni (+5 others)
2020 EPJ Web of Conferences  
The coffea framework provides a new approach to High-Energy Physics analysis, via columnar operations, that improves time-to-insight, scalability, portability, and reproducibility of analysis.  ...  We will discuss our experience in implementing analysis of CMS data using the coffea framework along with a discussion of the user experience and future directions.  ...  Conclusions Columnar analysis is an effective paradigm for HEP data analysis within the CMS collaboration, and we have implemented several maturing analyses in a columnar fashion.  ... 
doi:10.1051/epjconf/202024506012 fatcat:s6ag574gyvc5zfykkdbwp27xra

Coffea - Column Object Framework for Effective Analysis

Lindsey Gray
2019 Zenodo  
Presentation of the Coffea project and package for Column Object Framework for Effective Analysis.  ...  • Column Object Framework For Effective Analysis (Coffea) Coffea -Physicist friendly tools for column based analysis -Implements typical recipes needed to operate on NanoAOD-like ntuples • Currently  ...  to, but I can see why it is useful" • "Coffea is the easiest-to-use analysis framework I have worked with so far.  ... 
doi:10.5281/zenodo.3959304 fatcat:5aa6yq4sbzbjla45jpmpdappzy

Coffea-casa: an analysis facility prototype [article]

Matous Adamec
2021 arXiv   pre-print
The "Coffea-casa" prototype analysis facility is an effort to provide users with alternate mechanisms to access computing resources and enable new programming paradigms.  ...  Instead of writing event loops, the column-based Coffea library is used.  ...  Both analyses are implemented in the Coffea framework in Jupyter notebook format and executed from within Coffea-casa.  ... 
arXiv:2103.01871v2 fatcat:yswnbtvecjgxdkf5nj26hed57m

Coffea-casa: an analysis facility prototype

Matous Adamec, Garhan Attebury, Kenneth Bloom, Brian Bockelman, Carl Lundstedt, Oksana Shadura, John Thiltges, C. Biscarat, S. Campana, B. Hegner, S. Roiser, C.I. Rovelli (+1 others)
2021 EPJ Web of Conferences  
The "Coffea-casa" prototype analysis facility is an effort to provide users with alternate mechanisms to access computing resources and enable new programming paradigms.  ...  Instead of writing event loops, the column-based Coffea library is used.  ...  Both analyses are implemented in the Coffea framework in Jupyter notebook format and executed from within Coffea-casa.  ... 
doi:10.1051/epjconf/202125102061 fatcat:gv25j5um5jeovacx3b5shqxqbm

Real-time HEP analysis with funcX, a high-performance platform for function as a service

Yadu Babuji, Ben Blaiszik, Kyle Chard, Ryan Chard, Ian Foster, Zhuozhao Li, Tyler Skluzacek, Ana Trisovic, Anna Elizabeth Woodard, Daniel S. Katz
2019 Zenodo  
Low-latency, query-based analysis strategies are being developed to enable real-time analysis of primary datasets by replacing conventional nested loops over objects with native operations on hierarchically  ...  nested, columnar data.  ...  datasets, treenames, 'boostedHbbProcessor.coffea', funcx_executor, stageout_path, executor_args=executor_args, chunksize=chunksize ) Columnar Object Framework For Effective Analysis-check  ... 
doi:10.5281/zenodo.3599652 fatcat:ftehptauuncdpn2gkqf5rlblf4

Columnar data analysis with ATLAS analysis formats

Nikolai Hartmann, Johannes Elmsheuser, Günter Duckeck, on behalf of ATLAS Software and Computing, C. Biscarat, S. Campana, B. Hegner, S. Roiser, C.I. Rovelli, G.A. Stewart
2021 EPJ Web of Conferences  
This allows for application of the "columnar analysis" paradigm where operations are applied on a per-array instead of a per-event basis.  ...  The smallest of these, named DAOD_PHYSLITE, has calibrations already applied to allow fast downstream analysis and avoid the need for further analysis-specific intermediate formats.  ...  Currently, the Coffea framework [4], together with the Awkward Array package [5], provide the most complete set of tools for columnar data analysis in HEP.  ... 
doi:10.1051/epjconf/202125103001 fatcat:jjjqjqgxcrdx3blbjx2dlizn5i

Using Analysis Declarative Languages for the HL-LHC

Gordon Watts, Mason Proffitt, Emma Torro
2019 Zenodo  
The increase in luminosity by a factor of 100 for the HL-LHC with respect to Run 1 poses a big challenge from the data analysis point of view.  ...  the current efforts in the field, including efforts in traditional programming languages like C++, Python, and Go, and efforts that have invented their own languages like Root Data Frame, CutLang, ADL, coffea  ...  (yaml) - Monday, Nov 4th • COFFEA - Columnar Object Framework For Effective Analysis - This session • A Functional Declarative Analysis Language in Python - Tuesday Poster • HEP Data Query Challenges - Thursday  ... 
doi:10.5281/zenodo.3599533 fatcat:ea5f3ontnbehblzjx7yglmy7sa

HL-LHC Computing Review Stage 2, Common Software Projects: Data Science Tools for Analysis [article]

Jim Pivarski, Eduardo Rodrigues, Kevin Pedro, Oksana Shadura, Benjamin Krikler, Graeme A. Stewart
2022 arXiv   pre-print
It describes the adoption of Python and data science tools in HEP, discusses the likelihood of future scenarios, and recommendations for action by the HEP community.  ...  For example, Coffea needed histogram and Lorentz vector objects on a shorter timescale than IRIS-HEP developers could produce a future-proof foundation, so both teams agreed for Coffea to develop "quick  ...  A mature framework for the usage of databases for analysis has the potential to provide many benefits.  ... 
arXiv:2202.02194v1 fatcat:kal6u3ldhbd3zc7qhvjqql5hiy

Real-time HEP analysis with funcX, a high-performance platform for function as a service

Anna Elizabeth Woodard, Ana Trisovic, Zhuozhao Li, Yadu Babuji, Ryan Chard, Tyler Skluzacek, Ben Blaiszik, Daniel S. Katz, Ian Foster, Kyle Chard, C. Doglioni, D. Kim (+4 others)
2020 EPJ Web of Conferences  
operating on columnar data to aggregate histograms of analysis products of interest in real-time.  ...  study, we use funcX—a high-performance function as a service platform that enables intuitive, flexible, efficient, and scalable remote function execution on existing infrastructure—to parallelize an analysis  ...  site access and help with site testing; and to Lindsey Gray and the Coffea Team.  ... 
doi:10.1051/epjconf/202024507046 fatcat:oo6uqds4jnc5le6n4gsb27mski

Laurelin: Java-native ROOT I/O for Apache Spark

Andrew Melo, Oksana Shadura, for the CMS Collaboration, C. Biscarat, S. Campana, B. Hegner, S. Roiser, C.I. Rovelli, G.A. Stewart
2021 EPJ Web of Conferences  
One difficulty in adopting conventional big data frameworks to HEP workflows is the lack of support for the ROOT file format in these frameworks.  ...  Apache Spark [1] is one of the predominant frameworks in the big data space, providing a fully-functional query processing engine, vendor support for hardware accelerators, and performant integrations with  ...  Gray for his valued work testing and debugging Laurelin, as well as Oksana Shadura who provided valuable testing insight.  ... 
doi:10.1051/epjconf/202125102072 fatcat:vtrcbflihjbetaf2tewebsmsci

Analysis Description Languages for the LHC [article]

Sezen Sekmen, Philippe Gras, Lindsey Gray, Benjamin Krikler, Jim Pivarski, Harrison B. Prosper, Andrea Rizzi, Gokhan Unel, Gordon Watts
2020 arXiv   pre-print
An analysis description language is a domain specific language capable of describing the contents of an LHC analysis in a standard and unambiguous way, independent of any computing framework.  ...  Adopting analysis description languages would bring numerous benefits for the LHC experimental and phenomenological communities ranging from analysis preservation beyond the lifetimes of experiments or  ...  and analysis framework.  ... 
arXiv:2011.01950v1 fatcat:k2dicy2qgvapze5rrq6cth6cka

CutLang v2: Advances in a Runtime-Interpreted Analysis Description Language for HEP Data

G. Unel, S. Sekmen, A. M. Toon, B. Gokturk, B. Orgen, A. Paul, N. Ravel, J. Setpal
2021 Frontiers in Big Data  
We will present the latest developments in CutLang, the runtime interpreter of a recently-developed analysis description language (ADL) for collider data analysis.  ...  ADL is a domain-specific, declarative language that describes the contents of an analysis in a standard and unambiguous way, independent of any computing framework.  ...  Prosper for useful discussions on the language content and help with validation of analysis results. We also thank the SModelS team for a collaboration that is helping to gradually improve CutLang.  ... 
doi:10.3389/fdata.2021.659986 pmid:34169274 pmcid:PMC8218547 fatcat:3u4hiuwyh5hanlghfo3bg3p2jq

First experiences with a portable analysis infrastructure for LHC at INFN

Diego Ciangottini, Tommaso Boccali, Andrea Ceccanti, Daniele Spiga, Davide Salomoni, Tommaso Tedeschi, Mirco Tracolli, C. Biscarat, S. Campana, B. Hegner, S. Roiser, C.I. Rovelli (+1 others)
2021 EPJ Web of Conferences  
analysis environment for the CMS experiment.  ...  At the Italian National Institute for Nuclear Physics (INFN) a portable software stack for analysis has been proposed, based on cloud-native tools and capable of providing users with a fully integrated  ...  kube-batch [25] functionalities) as well as the exploitation of software alternatives as the distributed ROOT framework RDataFrame [26] developed at CERN and columnar analysis tools as coffea-casa  ... 
doi:10.1051/epjconf/202125102045 fatcat:ubtyoepjsveqthwq3eze5tysdm