Capturing Interactive Data Transformation Operations Using Provenance Workflows [chapter]

Tope Omitola, André Freitas, Edward Curry, Seán O'Riain, Nicholas Gibbins, Nigel Shadbolt
2015 Lecture Notes in Computer Science  
This paper describes a principled way to capture data lineage of interactive data transformation processes.  ...  Interactive data transformation (IDT) tools are becoming easily available to lower these barriers to data transformation efforts.  ...  said provenance data, and "what" type of method to use for the capture.  ... 
doi:10.1007/978-3-662-46641-4_3 fatcat:f5zzbssxzbbjrhwugelc7rxqju

Managing Rapidly-Evolving Scientific Workflows [chapter]

Juliana Freire, Cláudio T. Silva, Steven P. Callahan, Emanuele Santos, Carlos E. Scheidegger, Huy T. Vo
2006 Lecture Notes in Computer Science  
A key feature that sets VisTrails apart from previous visualization and scientific workflow systems is a novel action-based mechanism that uniformly captures provenance for data products and workflows  ...  We give an overview of VisTrails, a system that provides an infrastructure for systematically capturing detailed provenance and streamlining the data exploration process.  ...  George Chen (MGH/Harvard University) for providing us the lung datasets, and Erik Anderson for creating the lung visualizations.  ... 
doi:10.1007/11890850_2 fatcat:tje2p3h6zraori63effygps42e

Detailed Provenance Capture of Data Processing

Ben De Meester, Anastasia Dimou, Ruben Verborgh, Erik Mannens
2017 International Semantic Web Conference  
Using declarative mapping documents to describe the computational experiment allows automatic capturing of term-level provenance for both schema and data transformations, and for both the used software  ...  This paper proposes an automatic capturing mechanism for interchangeable and implementation independent metadata and provenance that includes data processing.  ...  Automatic Capture of Provenance A provenance capture mechanism falls into three main classes: workflow-, process-, and operating system-based (os) [7].  ... 
dblp:conf/semweb/MeesterDVM17 fatcat:zyagjwkyljeqrkykchny7mhz6u

Automatic capture and reconstruction of computational provenance

James Frew, Dominic Metzger, Peter Slaughter
2008 Concurrency and Computation  
Instead of specifying provenance explicitly with a workflow model, ES3 extracts provenance information automatically from arbitrary applications by monitoring their interactions with their execution environment  ...  By 'local,' we mean the infrastructure that a scientist uses to manage the creation and dissemination of her own data products, particularly those that are constantly incorporating corrections or improvements  ...  to assume the burdens of operational data publication.  ... 
doi:10.1002/cpe.1247 fatcat:437peuqipzeefksbf2f6ykntgu

Representing Interoperable Provenance Descriptions for ETL Workflows [chapter]

André Freitas, Benedikt Kämpgen, João Gabriel Oliveira, Seán O'Riain, Edward Curry
2015 Lecture Notes in Computer Science  
quality assessment, data semantics and facilitating the reproducibility of data transformation processes.  ...  In addition to ETL, provenance, the representation of source artifacts, processes and agents behind data, becomes another cornerstone element for Web data management, playing a fundamental role in data  ...  In practice it is not always possible to capture all data transformation operations into a fine-grained provenance representation.  ... 
doi:10.1007/978-3-662-46641-4_4 fatcat:o7d5txmqa5ai7iiohmxwezr7qm

Characterizing Provenance in Visualization and Data Analysis: An Organizational Framework of Provenance Types and Purposes

Eric D. Ragan, Alex Endert, Jibonananda Sanyal, Jian Chen
2016 IEEE Transactions on Visualization and Computer Graphics  
We also discuss the relationships between these factors and the methods used to capture provenance information.  ...  The term, provenance, has been used in a variety of ways to describe different types of records and histories related to visualization.  ...  User interaction data is one common form of data collected about analytic provenance, but others have included annotations, screenshots, and data transformations in related discussions.  ... 
doi:10.1109/tvcg.2015.2467551 pmid:26340779 fatcat:bujis3h24na63ldqizjqwv434a

Tackling the Provenance Challenge one layer at a time

Carlos Scheidegger, David Koop, Emanuele Santos, Huy Vo, Steven Callahan, Juliana Freire, Cláudio Silva
2008 Concurrency and Computation  
It uniformly and automatically captures provenance information for data products and for the evolution of the workflows used to generate these products.  ...  VisTrails uses a new change-based provenance mechanism which was designed to handle rapidly-evolving workflows.  ...  Although data provenance is necessary to allow for reproducibility, it fails to capture useful information about the relationship among the different workflows used in an exploratory task.  ... 
doi:10.1002/cpe.1237 fatcat:hnwrtpsv2nfrfltzzv4ax4s6um

Connecting Scientific Data to Scientific Experiments with Provenance

Simon Miles, Ewa Deelman, Paul Groth, Karan Vahi, Gaurang Mehta, Luc Moreau
2007 Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)  
As scientific workflows and the data they operate on, grow in size and complexity, the task of defining how those workflows should execute (which resources to use, where the resources must be in readiness  ...  In this paper, we describe preliminary work on adapting Pegasus to capture the process of workflow refinement in the PASOA provenance system.  ...  The provenance model previously used by Pegasus as part of the Virtual Data System (VDS) also captured provenance-related information from both the workflow enactment engine and executing applications  ... 
doi:10.1109/e-science.2007.22 dblp:conf/eScience/MilesDGVMM07 fatcat:dn7i4mxwoffzlpa65ks4dykis4

or2yw: Modeling and Visualizing OpenRefine Histories as YesWorkflow Diagrams [article]

Nikolaus Nova Parulian, Lan Li, Bertram Ludaescher
2021 arXiv   pre-print
The resulting YW models can be understood as a form of prospective provenance, i.e., knowledge artifacts that can be queried and visualized (i) to help authors document their own data cleaning workflows  ...  With or2yw the user can automatically generate YW models from OpenRefine operation histories, thus providing a 'workflow view' on a previously executed sequence of data cleaning operations.  ...  Thus, the or2yw tool (Fig. 2) helps uncover provenance information by visualizing the operations workflow with its dataflow dependencies and effects on the data schema. 2 OpenRefine Transformations  ... 
arXiv:2112.08259v1 fatcat:feckiehp7neh7byw4sv2wxsche

In Situ Data Provenance Capture in Spreadsheets

Hazeline U. Asuncion
2011 2011 IEEE Seventh International Conference on eScience  
While provenance can be captured using techniques such as scientific workflows, typically these techniques do not trace internal data manipulations that occur within off-the-shelf analysis tools.  ...  The capture of data provenance is a fundamentally important task in eScience.  ...  We thank Dan Jaffe and Jonathan Hee for useful feedback. This research was supported in part by the UWB Undergraduate Research Fund.  ... 
doi:10.1109/escience.2011.41 dblp:conf/eScience/Asuncion11 fatcat:l7567cm3xvehvhypxrm4ksq4gu

Capturing Provenance in the Wild [chapter]

M. David Allen, Adriane Chapman, Barbara Blaustein, Len Seligman
2010 Lecture Notes in Computer Science  
Our approach is implemented using the PLUS provenance system and the open source MULE Enterprise Service Bus. Our evaluations show that this approach is scalable and has minimal overhead.  ...  However, when users compose services from heterogeneous systems and organizations to form a new application, it is impossible to track the provenance in the new system using currently available work.  ...  Among our U.S. government customers though, it is common for data to flow across organizational boundaries and for each autonomous stakeholder to use and transform data with their own applications.  ... 
doi:10.1007/978-3-642-17819-1_12 fatcat:keknwm2etvbsdn6wsjkmaif3ya

Towards Provenance-Enabling ParaView [chapter]

Steven P. Callahan, Juliana Freire, Carlos E. Scheidegger, Cláudio T. Silva, Huy T. Vo
2008 Lecture Notes in Computer Science  
By consolidating provenance information for a variety of applications, we can provide a uniform environment for querying, sharing, and re-using provenance in large-scale, collaborative settings.  ...  Currently, there are no general provenance management systems or tools available for existing applications.  ...  Provenance is captured during user interactions with the main application using a custom solution for the application.  ... 
doi:10.1007/978-3-540-89965-5_13 fatcat:f5oc43qzbrao5fcog2npey2qge

A primer on provenance

Lucian Carata, Sherif Akoush, Nikilesh Balakrishnan, Thomas Bytheway, Ripduman Sohan, Margo Seltzer, Andy Hopper
2014 Communications of the ACM  
As the quantity of data that contributes to a particular result increases, keeping track of how different sources and transformations are related to each other becomes more difficult.  ...  This is not so straightforward when dealing with digital data, however: the result of a computation might have been derived from numerous sources and by applying complex successive transformations, possibly  ...  Provenance can also be used to obtain a better understanding of the actual process through which different pieces of input data are transformed into outputs.  ... 
doi:10.1145/2596628 fatcat:votcaprhhfe25laiohnm7tc7de

Provenance Data in the Machine Learning Lifecycle in Computational Science and Engineering [article]

Renan Souza, Leonardo Azevedo, Vítor Lourenço, Elton Soares, Raphael Thiago, Rafael Brandão, Daniel Civitarese, Emilio Vital Brazil, Marcio Moreno, Patrick Valduriez, Marta Mattoso, Renato Cerqueira, Marco A. S. Netto
2019 arXiv   pre-print
The main limitation of provenance tracking solutions is that they cannot cope with provenance capture and integration of domain and ML data processed in the multiple workflows in the lifecycle while keeping  ...  the provenance capture overhead low.  ...  Workflow provenance capture systems usually address scripts as workflows with chained functions, method, or library calls that execute data transformations, while capturing input arguments and output values  ... 
arXiv:1910.04223v2 fatcat:kqalpetlarecrlqbcsiatrcpoi

Provenance Data in the Machine Learning Lifecycle in Computational Science and Engineering

Renan Souza, Patrick Valduriez, Marta Mattoso, Renato Cerqueira, Marco Netto, Leonardo Azevedo, Vitor Lourenco, Elton Soares, Raphael Thiago, Rafael Brandao, Daniel Civitarese, Emilio Brazil (+1 others)
2019 2019 IEEE/ACM Workflows in Support of Large-Scale Science (WORKS)  
The main limitation of provenance tracking solutions is that they cannot cope with provenance capture and integration of domain and ML data processed in the multiple workflows in the lifecycle, while keeping  ...  the provenance capture overhead low.  ...  Workflow provenance capture systems usually address scripts as workflows with chained functions, method, or library calls that execute data transformations, while capturing input arguments and output values  ... 
doi:10.1109/works49585.2019.00006 dblp:conf/sc/SouzaVMCNALSTBC19 fatcat:w7wnh4arnnfb5ixnjap4iryb7a