
A First Study on Clustering Collections of Workflow Graphs [chapter]

Emanuele Santos, Lauro Lins, James P. Ahrens, Juliana Freire, Cláudio T. Silva
2008 Lecture Notes in Computer Science  
We propose two different representations for these graphs and present an experimental evaluation, using a collection of 1,700 workflow graphs, where we study the trade-offs of these representations and  ...  In this paper, we explore the use of clustering techniques to organize large collections of workflow and provenance graphs.  ...  Santos is partially supported by a CAPES/Fulbright fellowship.  ... 
doi:10.1007/978-3-540-89965-5_18 fatcat:qfmb5mwwg5ff5jwrqpwyekxa3a


Nicholas Kong, Tovi Grossman, Björn Hartmann, Maneesh Agrawala, George Fitzmaurice
2012 Proceedings of the 2012 ACM annual conference on Human Factors in Computing Systems - CHI '12  
Finally, we conducted an evaluation of Delta on a small corpus of 30 workflows and found that the intermediate list view provided the best information density.  ...  We conducted an initial study to identify the set of attributes users attend to when comparing workflows, finding that they consider result quality, their knowledge of commands, and the efficiency of the  ...  ACKNOWLEDGMENTS We thank our study participants. This work was partially supported by NSF grant CCF-0643552.  ... 
doi:10.1145/2207676.2208549 dblp:conf/chi/KongGHAF12 fatcat:7ejf3zrflrdcvhnq62wuptxjdm

Temporal representation for scientific data provenance

Peng Chen, Beth Plale, Mehmet S. Aktas
2012 2012 IEEE 8th International Conference on E-Science  
We propose a representation of the provenance data based on logical time that reduces the feature space.  ...  We evaluate the temporal representation using an existing 10 GB database of provenance captured from a range of scientific workflows.  ...  However, an OPM graph resulting from a typical experimental provenance collection procedure, which is the target of this study, does not contain such cycles.  ... 
doi:10.1109/escience.2012.6404477 dblp:conf/eScience/ChenPA12 fatcat:2kkanmri2fe67njqvelvhuhhzm

A first study on strategies for generating workflow snippets

Tommy Ellkvist, Lena Strömbäck, Lauro Didier Lins, Juliana Freire
2009 Proceedings of the First International Workshop on Keyword Search on Structured Data - KEYS '09  
In this paper, we take a first look at the requirements for workflow snippets and study alternative techniques for deriving concise, yet informative snippets.  ...  Recently, a number of public workflow repositories have become available, for example, myExperiment for scientific workflows, and Yahoo! Pipes.  ...  CONCLUSIONS This paper presents a first study on constructing workflow snippets using information from the workflow graph.  ... 
doi:10.1145/1557670.1557678 dblp:conf/sigmod/EllkvistSLF09 fatcat:dcv5mabqp5etzbzczena5kgt2a

Local clustering in provenance graphs

Peter Macko, Daniel Margo, Margo Seltzer
2013 Proceedings of the 22nd ACM international conference on Conference on information & knowledge management - CIKM '13  
We identify three key properties of provenance graphs and exploit them to justify two new centrality metrics we developed for use in performing local clustering on provenance graphs.  ...  Local clustering in these graphs, in which we start with a seed vertex and grow a cluster around it, is of paramount importance because it supports critical provenance applications such as identifying  ...  Clustering is one of the most well-studied problems in data mining, but there is little work directed at clustering in provenance graphs, and many existing techniques do not work well.  ... 
doi:10.1145/2505515.2505624 dblp:conf/cikm/MackoMS13 fatcat:aqpqt6ylvjbkffh6wznqmizuam

Workflow Scheduling to Minimize Data Movement Using Multi-constraint Graph Partitioning

Masahiro Tanaka, Osamu Tatebe
2012 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)  
Thus, we propose a new method of task assignment based on Multi-Constraint Graph Partitioning.  ...  Among scheduling algorithms of scientific workflows, the graph partitioning is a technique to minimize data transfer between nodes or clusters.  ...  ACKNOWLEDGMENT This work is supported by JST CREST research area, "Development of System Software Technologies for Post-Peta Scale High Performance Computing," and the MEXT Promotion of Research for Next  ... 
doi:10.1109/ccgrid.2012.134 dblp:conf/ccgrid/TanakaT12 fatcat:h7tfv5gwmrf45m3pfaie7a4kle

Optimizing Grid-Based Workflow Execution

Gurmeet Singh, Carl Kesselman, Ewa Deelman
2005 Journal of Grid Computing  
So far, the focus has been on developing easy to use interfaces for composing these workflows and finding an optimal mapping of tasks in the workflow to the Grid resources in order to minimize the completion  ...  Large-scale applications can be expressed as a set of tasks with data dependencies between them, also known as application workflows.  ...  Figure 10 shows the workflow execution graph when using 1 cluster per level and using 2 clusters per level. With one cluster per level, the workflow now completes in 46 minutes.  ... 
doi:10.1007/s10723-005-9011-7 fatcat:zxac2wohojb65ahb25xmvssh2a

Learning Bundled Care Opportunities from Electronic Medical Records [article]

You Chen, Abel N. Kho, David Liebovitz, Catherine Ivory, Sarah Osmundson, Jiang Bian, Bradley A. Malin
2017 arXiv pre-print
It is recognized that a bundled care approach to healthcare, one that manages a collection of health conditions together, may enable greater efficacy and cost savings.  ...  Study Design: Retrospective inference of clusters of health conditions from an electronic medical record (EMR) system.  ...  First, this study focused on the development of a methodology to infer general collections of phenotypic patterns that share similar workflow patterns according to EMR system utilization.  ... 
arXiv:1706.00487v1 fatcat:ptm26wb6r5hdhksuz5okqqzrf4

Learning bundled care opportunities from electronic medical records

You Chen, Abel N. Kho, David Liebovitz, Catherine Ivory, Sarah Osmundson, Jiang Bian, Bradley A. Malin
2018 Journal of Biomedical Informatics  
Methods: We designed a framework to infer health condition collections (HCCs) based on the similarity of their clinical workflows, according to electronic medical record (EMR) utilization.  ...  It has been recognized that a bundled approach to healthcare, one that manages a collection of health conditions together, may enable greater efficacy and cost savings.  ...  Funding This research was supported, in part, by the National Institutes of Health under grants R00LM011933 and R01LM010685.  ... 
doi:10.1016/j.jbi.2017.11.014 pmid:29174994 pmcid:PMC5771885 fatcat:q5zulc5u2fgdrk6slf3un3qmvq

Social Network Analysis as Knowledge Discovery Process: A Case Study on Digital Bibliography

Michele Coscia, Fosca Giannotti, Ruggero Pensa
2009 2009 International Conference on Advances in Social Network Analysis and Mining  
Today Digital Bibliographies are a powerful instrument that collects a great amount of data about scientific publications.  ...  Digital Bibliographies have been used as basis of many studies focused on the knowledge extraction in databases. Here we present a new methodology for mining knowledge in this field.  ...  We first list a number of classical social network analysis descriptors and then we give a short explanation about the data mining techniques of graph mining and co-clustering. A.  ... 
doi:10.1109/asonam.2009.65 dblp:conf/asunam/CosciaGP09 fatcat:heu5ijgqxnfytnoom4ntj6jzne

The Database of Cross-Linguistic Colexifications, reproducible analysis of cross-linguistic polysemies

Christoph Rzymski, Tiago Tresoldi, Simon J. Greenhill, Mei-Shin Wu, Nathanael E. Schweikhard, Maria Koptjevskaja-Tamm, Volker Gast, Timotheus A. Bodt, Abbie Hantgan, Gereon A. Kaiping, Sophie Chang, Yunfan Lai (+15 others)
2020 Scientific Data  
Such advances, however, are bringing high requirements in terms of rigorousness for preparing and curating datasets. Here we present CLICS, a Database of Cross-Linguistic Colexifications (CLICS).  ...  This is done by addressing shortcomings of an earlier version of the database, CLICS2, and by supplying an updated version with CLICS3, which massively increases the size and scope of the project.  ...  We are also very grateful for the help and data provided by many researchers, among them: Cathryn Yang for data on Lolo languages  ... 
doi:10.1038/s41597-019-0341-x pmid:31932593 pmcid:PMC6957499 fatcat:6tqxrmxj4zh3lmiaipahykov3a

Similarity-based workflow clustering

V. Silva
2011 Journal of Computational Interdisciplinary Sciences  
However, SWfMS expect a modeled workflow to be represented on its workflow language to be executed. The scientist does not have an assistance or guidance to obtain a modeled workflow.  ...  Experiment lines, which are a novel approach to deal with these limitations, allow for the abstract representation and systematic composition of experiments.  ...  A first study was conducted to evaluate the viability of the architecture in Silva et al. (2010).  ... 
doi:10.6062/jcis.2011.02.01.0029 fatcat:bqupvlmdofcthanf3vp5pc5z4u

A survey of simulation provenance systems: modeling, capturing, querying, visualization, and advanced utilization

Young-Kyoon Suh, Ki Yong Lee
2018 Human-Centric Computing and Information Sciences  
In particular, we present a taxonomy of scientific platforms regarding provenance support and holistically tabulate the major functionalities and supporting levels of the studied systems.  ...  Accordingly, there have been a lot of attentions paid to actively utilize provenance information regarding such computer simulations, particularly conducted on high-performance computing and storage resources  ...  Recently, there has been a clustered study of automatically collecting provenance without altering such a program while charging little overhead on the platform [9, 10].  ... 
doi:10.1186/s13673-018-0150-9 fatcat:zmdmunmguvfelnlwcvnhw5wmpi

Model-driven multisite workflow scheduling

Ketan Maheshwari, Eun-Sung Jung, Jiayuan Meng, Venkatram Vishwanath, Rajkumar Kettimuthu
2013 2013 IEEE International Conference on Cluster Computing (CLUSTER)  
A user may not be able to complete a complex workflow at a single site. It is thus beneficial to run different tasks of a workflow on different sites.  ...  Workflows continue to play an important role in expressing and deploying scientific applications. In recent years, a wide variety of computational sites have emerged with shared access to users.  ...  Department of Energy, Office of Science, ASCR, under Contract DE-AC02-06CH11357.  ... 
doi:10.1109/cluster.2013.6702647 dblp:conf/cluster/MaheshwariJMVK13 fatcat:vtoxjyfalja4vp4kpl5jbwsq4a

Designing machine learning workflows with an application to topological data analysis

Eric Cawi, Patricio S La Rosa, Arye Nehorai
2019 PLoS ONE  
Inspired by statistical learning, MLMs are morphisms whose parameters are minimized via a risk function. We explore operations such as composition of MLMs and when sets of MLMs form a vector space.  ...  These operations are used to build a machine learning workflow from data preprocessing to final task completion.  ...  Lenise Cummings-Vaughn, with Barnes Jewish Hospital for sharing data and collaboration on the Hospital Readmissions Prediction.  ... 
doi:10.1371/journal.pone.0225577 pmid:31790458 pmcid:PMC6886815 fatcat:jjlcn3zdffg7pbkytumttpet5q