
Deciding the physical implementation of ETL workflows

Vasiliki Tziovara, Panos Vassiliadis, Alkis Simitsis
2007 Proceedings of the ACM tenth international workshop on Data warehousing and OLAP - DOLAP '07  
In this paper, we deal with the problem of determining the best possible physical implementation of an ETL workflow, given its logical-level description and an appropriate cost model as inputs.  ...  We further extend this technique by intentionally introducing sorter activities in the workflow in order to search for alternative physical implementations with lower cost.  ...  The objective of this work is to identify the best possible physical implementation for a given logical ETL workflow.  ... 
doi:10.1145/1317331.1317341 dblp:conf/dolap/TziovaraVS07 fatcat:ljnyn233g5cdhptah5rqmmbhae

From conceptual design to performance optimization of ETL workflows: current state of research and open problems

Syed Muhammad Fawad Ali, Robert Wrembel
2017 The VLDB journal  
We explain the existing techniques for: (1) constructing a conceptual and a logical model of an ETL workflow, (2) its corresponding physical implementation, and (3) its optimization, illustrated by examples  ...  In this paper, we discuss the state of the art and current trends in designing and optimizing ETL workflows.  ...  Section 5 introduces techniques for the physical implementation of an ETL workflow. Section 6 focuses on techniques for optimizing an ETL workflow.  ... 
doi:10.1007/s00778-017-0477-2 fatcat:s5f7mzuzgfhzfkvl26yxixw2vy

Frequent patterns in ETL workflows: An empirical approach

Vasileios Theodorou, Alberto Abelló, Maik Thiele, Wolfgang Lehner
2017 Data & Knowledge Engineering  
We showcase our approach through a use case that is applied on implemented ETL processes from the TPC-DI specification and we present mined ETL patterns.  ...  We logically model the ETL workflows using labeled graphs and employ graph algorithms to identify candidate patterns and to recognize them on different workflows.  ...  This research has been funded by the European Commission through the Erasmus Mundus Joint Doctorate "Information Technologies for Business Intelligence - Doctoral College" (IT4BI-DC).  ... 
doi:10.1016/j.datak.2017.08.004 fatcat:nwlx3pjbz5g67fjpicktv2nnfm

Extraction, Transformation, and Loading [chapter]

Alkis Simitsis, Panos Vassiliadis
2017 Encyclopedia of Database Systems  
In later years, during the early days of data integration, the driving force behind data integration were wrapper-mediator schemes; the construction of the wrappers is a primitive form of ETL scripting  ...  SYNONYMS: ETL; ETL process; ETL tool; Back Stage of a Data Warehouse; Data warehouse refreshment. DEFINITION: Extraction, Transformation, and Loading (ETL) processes are responsible for the operations taking  ...  the designer who must decide the order and physical implementation for the individual activities.  ... 
doi:10.1007/978-1-4899-7993-3_158-3 fatcat:etto3enuuneind3s3ldeg4s5qy

Scheduling strategies for efficient ETL execution

Anastasios Karagiannis, Panos Vassiliadis, Alkis Simitsis
2013 Information Systems  
Extract-transform-load (ETL) workflows model the population of enterprise data warehouses with information gathered from a large variety of heterogeneous data sources.  ...  In this paper, we deal with the problem of scheduling the execution of ETL activities (a.k.a. transformations, tasks, operations), with the goal of minimizing ETL execution time and allocated memory.  ...  The workflow is an abstract design at the logical level, which has to be implemented physically, i.e., to be mapped to a combination of executable programs/scripts that perform the ETL workflow.  ... 
doi:10.1016/ fatcat:nj7muti2u5gwnhmc6zxv54rkty

Managing Continuous Data Integration Flows

Josef Schiefer, Jun-Jang Jeng, Robert M. Bruckner
2003 International Conference on Advanced Information Systems Engineering  
A particularly difficult aspect of measuring the performance of workflows is the dissemination of event data and its transformation into business metrics.  ...  The proposed architecture takes full advantage of existing J2EE (Java 2 Platform, Enterprise Edition) technology and uses an ETL container for the event data processing.  ...  The authors propose for the implementation of the ETL processes the usage of workflow management systems (WFMSs) and active rules which are executed under certain operational semantics.  ... 
dblp:conf/caise/SchieferJB03 fatcat:tuct6ygeejf3dhorcvysrsgtjy

Value-driven Approach for Designing Extended Data Warehouses

Nabila Berkani, Ladjel Bellatreche, Selma Khouri, Carlos Ordonez
2019 International Workshop on Data Warehousing and OLAP  
In this paper, first, we conceptualize the variety of internal and external sources and study its impact on the ETL phase to ease the value capturing.  ...  Decline was signaled by the appearance of Big Data. It is therefore essential to find other challenges that will contribute to the revival of DW while taking advantage of the V's of Big Data.  ...  (iv) Physical level: proposed scenario obliges designers to manage variety of LOD according the physical implementations of DW: at unification formalism level [2, 6, 10] or at querying level ([11, 13  ... 
dblp:conf/dolap/BerkaniBK019 fatcat:owebcnaaavhghponwdobbkhley

Implementation of business intelligence tools using open source approach

Carlos Gameiro
2011 Proceedings of the 2011 Workshop on Open Source and Design of Communication - OSDOC '11  
This repository is the source of all business intelligence and implementing it requires the right software tools, essential for the data warehouse.  ...  The two ETL solutions used were: • Pentaho Kettle Data Integration Community Editions (Open Source Software) • SQL Server 2005 Integrations Services (SSIS) Enterprise Edition (Proprietary Software) The  ...  It uses one or more file packages to implement the workflow process, but if too many workflows are present in the same package the design of the workflow will become heavier.  ... 
doi:10.1145/2016716.2016723 fatcat:efglv3acrrerrazrt62fe64c5y

Benchmarking ETL Workflows [chapter]

Alkis Simitsis, Panos Vassiliadis, Umeshwar Dayal, Anastasios Karagiannis, Vasiliki Tziovara
2009 Lecture Notes in Computer Science  
Each ETL tool uses its own technique for the design and implementation of an ETL workflow, making the task of assessing ETL tools extremely difficult.  ...  We also identify the main points of interest in designing, implementing, and maintaining ETL workflows.  ...  Then, each activity of the workflow is physically implemented using various algorithmic methods, each with different cost in terms of time requirements or system resources (e.g., CPU, memory, disk space  ... 
doi:10.1007/978-3-642-10424-4_15 fatcat:mzyiwt6iwbgghkpsbgxhp7yo2y

A Framework for User-Centered Declarative ETL

Vasileios Theodorou, Alberto Abelló, Maik Thiele, Wolfgang Lehner
2014 Proceedings of the 17th International Workshop on Data Warehousing and OLAP - DOLAP '14  
Based on existing work, we raise the level of abstraction for the conceptual representation of ETL operations and we show how process quality characteristics can generate specific patterns on the process  ...  Current approaches for the modeling and optimization of ETL processes provide platform-independent optimization solutions for the (semi-)automated transition among different abstraction levels, focusing  ...  In the same direction, the work in [14] provides a classification of ETL activities, through investigating the particular characteristics of ETL workflows and introducing a formal representation of workflows  ... 
doi:10.1145/2666158.2666178 dblp:conf/dolap/TheodorouATL14 fatcat:e3zo65p5gzcybbgzetonwldjci

Business Processes Meet Operational Business Intelligence

Umeshwar Dayal, Kevin Wilkinson, Alkis Simitsis, Malú Castellanos
2009 IEEE Data Engineering Bulletin  
through logical design to physical implementation.  ...  We describe the challenges in ETL design and implementation, and the approach we are taking to meet these challenges.  ...  The third challenge derives from our ultimate goal of producing an optimized physical ETL implementation.  ... 
dblp:journals/debu/DayalWSC09 fatcat:jdfytjvpobfd5mmhc3kj4qahsi

Optimizing ETL workflows for fault-tolerance

Alkis Simitsis, Kevin Wilkinson, Umeshwar Dayal, Malu Castellanos
2010 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)  
In addition, other metrics can affect the choice of a strategy; e.g., higher freshness reduces the time window for recovery. The design space is too large for informal, ad-hoc approaches.  ...  Typically, design work on ETL has focused on performance as the sole metric to make sure that the ETL process finishes within an allocated time window.  ...  Abstract part of an ETL workflow Fig. 3.  ... 
doi:10.1109/icde.2010.5447816 dblp:conf/icde/SimitsisWDC10 fatcat:vewa5zew75a2vaggpjygbxhgty

Applying the UML and the Unified Process to the Design of Data Warehouses

Sergio Lujan-Mora, Juan Trujillo
2006 Journal of Computer Information Systems  
the modeling of the data sources, ETL processes or the modeling of the DW itself) by using the same notation.  ...  This is mainly due to the different aspects taking part in a DW architecture such as data sources, processes responsible for Extracting, Transforming and Loading (ETL) data into the DW, the modeling of  ...  Government, and by the DADS (PBC-05-012-2) project from the Regional Science and Technology Ministry of Castilla-La Mancha (Spain).  ... 
doi:10.1080/08874417.2006.11645923 fatcat:ymdpofi2tjfhlpdt4mpktlsdua

A Survey of Extract–Transform–Load Technology

Panos Vassiliadis
2009 International Journal of Data Warehousing and Mining  
The intention of this survey is to present the research work in the field of ETL technology in a structured way.  ...  To this end, we organize the coverage of the field as follows: (a) first, we cover the conceptual and logical modeling of ETL processes, along with some design methods, (b) we visit each stage of the E-T-L  ...  Each logical-level template is physically implemented by a variety of implementations (much like a relational join is implemented by nested loops, merge-sort, or hash join physical-level operators).  ... 
doi:10.4018/jdwm.2009070101 fatcat:okcajnbvabhe5fx72svcdkwrzu

Towards a Low Cost ETL System

Vasco Santos, Rui Silva, Orlando Belo
2014 International Journal of Database Management Systems  
This article proposes a different approach to deal with the distribution of ETL processes in a grid environment, taking into account not only the processing performance of its nodes but also the existing  ...  bandwidth to estimate the grid availability in a near future and therefore optimize workflow distribution.  ...  Thus, it's possible to reduce significantly the costs of a traditional ETL system implementation.  ... 
doi:10.5121/ijdms.2014.6205 fatcat:typxa6xwdnf5lbfbwo2b7r2ufy