Integrating databases and workflow systems

Srinath Shankar, Ameet Kini, David J. DeWitt, Jeffrey Naughton
2005 SIGMOD Record
There has been an information explosion in fields of science such as high energy physics, astronomy, environmental sciences and biology. There is a critical need for automated systems to manage scientific applications and data. Database technology is well-suited to handle several aspects of workflow management. Contemporary workflow systems are built from multiple, separately developed components and do not exploit the full power of DBMSs in handling data of large magnitudes. We advocate a holistic view of a WFMS that includes not only workflow modeling but planning, scheduling, data management and cluster management. Thus, it is worthwhile to explore the ways in which databases can be augmented to manage workflows in addition to data. We present a language for modeling workflows that is tightly integrated with SQL. Each scientific program in a workflow is associated with an active table or view. The definition of data products is in relational format, and invocation of programs and querying is done in SQL. The tight coupling between workflow management and data manipulation is an advantage for data-intensive scientific programs.
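The idea of tying a program to a table can be illustrated with a minimal sketch. This is not the authors' workflow language, which the abstract does not show. It only imitates the idea using Python's standard sqlite3 module, under the assumption that a program can be stood in for by a SQL-callable function. The names calibrate, raw_events, calibrated and refresh_calibrated are hypothetical.

```python
import sqlite3


def calibrate(raw_energy):
    # Hypothetical stand-in for a scientific program. A real workflow step
    # would more likely be an external executable run on a cluster.
    return raw_energy * 2.0


def build_database():
    conn = sqlite3.connect(":memory:")
    # Expose the program to SQL so that it can be invoked from a statement.
    conn.create_function("calibrate", 1, calibrate)
    # Inputs and data products are both described relationally.
    conn.execute("CREATE TABLE raw_events (id INTEGER PRIMARY KEY, energy REAL)")
    conn.execute("CREATE TABLE calibrated (event_id INTEGER PRIMARY KEY, energy REAL)")
    conn.executemany("INSERT INTO raw_events VALUES (?, ?)", [(1, 1.5), (2, 4.0)])
    return conn


def refresh_calibrated(conn):
    # Invoke the program, through SQL, only for inputs that have no product yet.
    conn.execute(
        "INSERT INTO calibrated "
        "SELECT id, calibrate(energy) FROM raw_events "
        "WHERE id NOT IN (SELECT event_id FROM calibrated)"
    )


conn = build_database()
refresh_calibrated(conn)
# Data products are then queried like any other relation.
rows = conn.execute(
    "SELECT event_id, energy FROM calibrated ORDER BY event_id"
).fetchall()
print(rows)
```

In this sketch the refresh step is called by hand; an active table as described in the abstract would presumably trigger the associated program itself, together with the planning and scheduling that the sketch leaves out.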
doi:10.1145/1084805.1084808 fatcat:glg7ykanyzfppiuwlsp23teobu