GalaxyCloudRunner: enhancing scalable computing for Galaxy [article]

Nuwan Goonasekera, Alexandru Mahmoud, John Chilton, Enis Afgan
2020 bioRxiv   pre-print
federation (Afgan, Jalili, et al., 2018), offering free resources for scientific analyses. ... Ensuring these steps are properly configured imposes significant complexity on the system administrator when trying to leverage cloud resources (Afgan et al., 2015). ...
doi:10.1101/2020.05.28.121772 fatcat:7kbotybhl5eaxo7p3ub5vcx6zi

Resource planning on the Cloud: exploring the scalability spectrum [article]

Enis Afgan, Mohammad Heydarian
2017 Figshare  
Cloud computing resources have become the informatics backbone for scalable, accessible, customizable, and secure computing, with bioinformatics continuing to benefit from this computational model. What started as a handful of applications that were ported to the Cloud has geared up to the creation of Virtual Laboratories and Cloud Pilot projects funded by national funding agencies. Today, a typical end-user scenario for the cloud is to acquire a set of virtual machines from a cloud provider with ... -installed software and perform the needed data analysis. In the process, the user needs to make cost-effective decisions about what resources to acquire and how many. These decisions have a direct impact on the outcome of the analysis because with insufficient resources it may be impossible to complete the analysis or it may take extra time. Excessive resources waste project funds or merit allocation credits and can cause resource contention on academic clouds. To shine some light on this topic, we performed a number of experiments with the Galaxy CloudMan project to explore the tradeoffs among resource types and sizes across the Amazon Web Services infrastructure. Using published next-generation sequencing data, we identified resource requirements, limits on resource classes, and observed actual resource utilization for RNA-seq and ChIP-seq pipelines. These results can be used to help users gauge what resources to use when using cloud machinery. They can also be used by academic cloud infrastructure projects to determine what type of underlying infrastructure is needed by users. In this talk, we will detail our findings.
doi:10.6084/m9.figshare.5563036.v1 fatcat:jthmp56u5rbk7dw5ouj5diha64

Cloud Bursting Galaxy: Federated Identity and Access Management [article]

Vahid Jalili, Enis Afgan, James Taylor, Jeremy Goecks
2018 bioRxiv   pre-print
Motivation: Large biomedical datasets, such as those from genomics and imaging, are increasingly being stored on commercial and institutional cloud computing platforms. This is because cloud-scale computing resources, from robust backup to high-speed data transfer to scalable compute and storage, are needed to make these large datasets usable. However, one challenge for large-scale biomedical data on the cloud is providing secure access, especially when datasets are distributed across ... forms. While there are open Web protocols for secure authentication and authorization, these protocols are not in wide use in bioinformatics and are difficult to use for even technologically sophisticated users. Results: We have developed a generic and extensible approach for securely accessing biomedical datasets distributed across cloud computing platforms. Our approach combines OpenID Connect and OAuth2, best-practice Web protocols for authentication and authorization, together with Galaxy (https://galaxyproject.org), a web-based computational workbench used by thousands of scientists across the world. With our enhanced version of Galaxy, users can access and analyze data distributed across multiple cloud computing providers without any special knowledge of access/authorization protocols. Our approach does not require users to share permanent credentials (e.g., username, password, API key), instead relying on automatically-generated temporary tokens that refresh as needed. Our approach is generalizable to most identity providers and cloud computing platforms. To the best of our knowledge, Galaxy is the only computational workbench where users can access biomedical datasets across multiple cloud computing platforms using best-practice Web security approaches and thereby minimize risks of unauthorized data access and credential use. Availability and Implementation: Freely available for academic and commercial use under the open-source Academic Free License (https://opensource.org/licenses/AFL-3.0) from the following Github repositories: https://github.com/galaxyproject/galaxy and https://github.com/galaxyproject/cloudauthz. Contact: jalili@ohsu.edu, goecksj@ohsu.edu
doi:10.1101/506238 fatcat:mxq6ie4flfambferdv223757ty

CloudLaunch as a Gateway for Discovering and Launching Cloud Applications

Enis Afgan, Nuwan Goonasekera, Marcus Christie
2017 Figshare  
Global presence of cloud computing infrastructure and increasing demand by researchers to be able to easily gain access to research software on a cloud of choice are putting pressure on application deployers to make their applications easily accessible on multiple clouds. However, performing complex IaaS orchestration tasks on multiple clouds is often difficult, as it requires that each cloud be tested and supported individually. In this demo, we present CloudLaunch as a portal for discovering ... d launching cloud-enabled applications on a variety of cloud providers, while being able to write complex IaaS orchestration tasks in a cloud-agnostic manner. The portal makes it possible for deployers to readily integrate their application into the portal while end-users can easily browse and launch available applications. CloudLaunch provides necessary abstractions to make either task uniform and accessible across all supported cloud providers.
doi:10.6084/m9.figshare.5471800 fatcat:6slyg5brfbeevfc2s7blr2zlka

CloudLaunch: Discover and Deploy Cloud Applications [article]

Enis Afgan, Andrew Lonie, James Taylor, Nuwan Goonasekera
2018 arXiv   pre-print
Cloud computing is a common platform for delivering software to end users. However, the process of making complex-to-deploy applications available across different cloud providers requires isolated and uncoordinated application-specific solutions, often locking-in developers to a particular cloud provider. Here, we present the CloudLaunch application as a uniform platform for discovering and deploying applications for different cloud providers. CloudLaunch allows arbitrary applications to be ... ed to a catalog with each application having its own customizable user interface and control over the launch process, while preserving cloud-agnosticism so that authors can easily make their applications available on multiple clouds with minimal effort. It then provides a uniform interface for launching available applications by end users across different cloud providers. Architecture details are presented along with examples of different deployable applications that highlight architectural features.
arXiv:1805.04005v2 fatcat:ny2oquchaffexlmt4zuedb4sqm

Grid Resource Broker Using Application Benchmarking [chapter]

Enis Afgan, Vijay Velusamy, Purushotham V. Bangalore
2005 Lecture Notes in Computer Science  
While the Grid is becoming a common word in the context of distributed computing, users are still experiencing long phases of adaptability and increased complexity when using the system. Although users have access to multiple resources, selecting the optimal resource for their application and appropriately launching the job is a tedious process that not only proves difficult for the naïve user, but also leads to ineffective usage of the resources. A general-purpose resource broker that performs ... pplication specific resource selection on behalf of the user through a web interface is required. This paper describes the design and prototyping of such a resource broker that not only selects a matching resource based on user-specified criteria but also uses the application performance characteristics on the resources, enabling the user to execute applications transparently and efficiently, thereby providing true virtualization.
doi:10.1007/11508380_70 fatcat:cyv3h22r6vcq3hathju2j47voi

Galaxy CloudMan: delivering cloud compute clusters

Enis Afgan, Dannon Baker, Nate Coraor, Brad Chapman, Anton Nekrutenko, James Taylor
2010 BMC Bioinformatics  
Acknowledgements Galaxy is developed by the Galaxy Team: Enis Afgan, Guruprasad Ananda, Dannon Baker, Dan Blankenberg, Ramkrishna Chakrabarty, Nate Coraor, Jeremy Goecks, Greg Von Kuster, Ross Lazarus, ... © 2010 Afgan et al; licensee BioMed Central Ltd. ... A script used to automatically install all the tools available to a default instance of CloudMan cluster is available at https://bitbucket.org/afgane/mi-deployment/; using this script and customizing ...
doi:10.1186/1471-2105-11-s12-s4 pmid:21210983 pmcid:PMC3040530 fatcat:edyk2xhac5bv7imvw32arm5fpa

Application Information Services for distributed computing environments

Enis Afgan, Purushotham Bangalore, Karolj Skala
2011 Future generations computer systems  
Afgan E., Gray J., Bangalore P., "Using Domain-Specific Modeling to Generate User Interfaces for Wizards", Int. ... Conference (MDDAUI) 2007, Nashville, TN, September 30-October 5, 2007 Afgan E., Bangalore P., "Computation Cost in Grid Computing Environments", Int. ...
doi:10.1016/j.future.2010.08.004 fatcat:bpdh6xgdozdlpig4bx37s7ept4

Bio-Docklets: Virtualization Containers for Single-Step Execution of NGS Pipelines [article]

Baekdoo Kim, Thahmina A Ali, Carlos Lijeron, Enis Afgan, Konstantinos Krampis
2017 bioRxiv   pre-print
Processing of Next-Generation Sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized post-analysis visualization and data mining software. In order to address some of these challenges, developers have leveraged virtualization containers, towards seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform. Findings: We present an ... oach for abstracting the complex data operations of multi-step bioinformatics pipelines for NGS data analysis. As examples, we have deployed two pipelines for RNAseq and CHIPseq, pre-configured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint and, from a user perspective, running the pipelines is as simple as running a single bioinformatics tool. This is achieved through a 'meta-script' that automatically starts the Bio-Docklets, and controls the pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface (API). The pipeline output is postprocessed using the Visual Omics Explorer (VOE) framework, providing interactive data visualizations that users can access through a web browser. Conclusions: The goal of our approach is to enable easy access to NGS data analysis pipelines for nonbioinformatics experts, on any computing environment whether a laboratory workstation, university computer cluster, or a cloud service provider. Besides end-users, Bio-Docklets also enable developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets.
doi:10.1101/116962 fatcat:yqd4wm72wzcsdnkbap4rexz4au

CloudMan as a platform for tool, data, and analysis distribution

Enis Afgan, Brad Chapman, James Taylor
2012 BMC Bioinformatics  
Cloud computing provides an infrastructure that facilitates large scale computational analysis in a scalable, democratized fashion. However, in this context it is difficult to ensure sharing of an analysis environment and associated data in a scalable and precisely reproducible way. Results: CloudMan (usecloudman.org) enables individual researchers to easily deploy, customize, and share their entire cloud analysis environment, including data, tools, and configurations. Conclusions: With the ... led customization and sharing of instances, CloudMan can be used as a platform for collaboration. The presented solution improves accessibility of cloud resources, tools, and data to the level of an individual researcher and contributes toward reproducibility and transparency of research solutions.
doi:10.1186/1471-2105-13-315 pmid:23181507 pmcid:PMC3556322 fatcat:lma3lbbwyfg4nbcvmr7pye6uke

The elastic analysis with galaxy on the cloud

Enis Afgan, Dannon Baker, Anton Nekrutenko, James Taylor
2010 Genome Biology  
© 2010 Afgan et al; licensee BioMed Central Ltd.  ... 
doi:10.1186/gb-2010-11-s1-p2 fatcat:3br3pwj5wjg4npjtqab46p4tqm

Experiences with Integrating Custos Security Services [article]

Isuru Ranawaka, Samitha Liyanage, Dannon Baker, Alexandru Mahmoud, Juleen Graham, Terry Fleury, Dimuthu Wannipurage, Yu Ma, Enis Afgan, Jim Basney, Suresh Marru, Marlon Pierce
2021 arXiv   pre-print
Science gateways are user-facing cyberinfrastructure that provide researchers and educators with Web-based access to scientific software, computing, and data resources. Managing user identities, accounts, and permissions are essential tasks for science gateways, and gateways likewise must manage secure connections between their middleware and remote resources. The Custos project is an effort to build open source software that can be operated as a multi-tenanted service that provides reliable ... ntations of common science gateway cybersecurity needs, including federated authentication, identity management, group and authorization management, and resource credential management. Custos aims further to provide integrated solutions through these capabilities, delivering end-to-end support for several science gateway usage scenarios. This paper examines four deployment scenarios using Custos and associated extensions beyond previously described work. The first capability illustrated by these scenarios is the need for Custos to provide hierarchical tenant management that allows multiple gateway deployments to be federated together and also to support consolidated, hosted science gateway platform services. The second capability illustrated by these scenarios is the need to support service accounts that can support non-browser applications and agent applications that can act on behalf of users on edge resources. We illustrate how the latter can be built using Web security standards combined with Custos permission management mechanisms.
arXiv:2107.04172v1 fatcat:b3kddpoy4rgo3b7efta4iasszm

Bio-Docklets: virtualization containers for single-step execution of NGS pipelines

Baekdoo Kim, Thahmina Ali, Carlos Lijeron, Enis Afgan, Konstantinos Krampis
2017 GigaScience  
Processing of next-generation sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized postanalysis visualization and data mining software. In order to address some of these challenges, developers have leveraged virtualization containers toward seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform. We present an approach for ... racting the complex data operations of multistep bioinformatics pipelines for NGS data analysis. As examples, we have deployed 2 pipelines for RNA sequencing and chromatin immunoprecipitation sequencing, preconfigured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint and, from a user perspective, running the pipelines is as simple as running a single bioinformatics tool. This is achieved using a "meta-script" that automatically starts the Bio-Docklets and controls the pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface. The pipeline output is postprocessed by integration with the Visual Omics Explorer framework, providing interactive data visualizations that users can access through a web browser. Our goal is to enable easy access to NGS data analysis pipelines for nonbioinformatics experts on any computing environment, whether a laboratory workstation, university computer cluster, or a cloud service provider. Beyond end users, Bio-Docklets also enable developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets.
doi:10.1093/gigascience/gix048 pmid:28854616 pmcid:PMC5569920 fatcat:xhfuc2rnnvahrnd3nw5xq7jdde

Role of the resource broker in the Grid

Enis Afgan
2004 Proceedings of the 42nd annual Southeast regional conference on - ACM-SE 42  
Today, as Grid Computing is becoming a reality, there is a need for managing and monitoring the available resources worldwide, as well as the need for conveying these resources to the everyday user. This paper describes a resource broker whose main function is to match the available resources to the user's requests. The use of the resource broker provides a uniform interface to access any of the available and appropriate resources using the user's credentials. This paper discusses the process ... of creating the resource broker and provides insight into how it connects and relates to the underlying software. The resource broker runs on top of the Globus Toolkit. Therefore, it provides security and current information about the available resources and serves as a link to the diverse systems available in the Grid.
doi:10.1145/986537.986608 dblp:conf/ACMse/Afgan04 fatcat:o6dihljhxrdmdn7l3c5m2mikvy

Harnessing cloud computing with Galaxy Cloud

Enis Afgan, Dannon Baker, Nate Coraor, Hiroki Goto, Ian M Paul, Kateryna D Makova, Anton Nekrutenko, James Taylor
2011 Nature Biotechnology  
Efforts of the Galaxy Team (Enis Afgan, Dannon Baker, Dan Blankenberg, Nate Coraor, Jeremy Goecks, Greg Von Kuster, Ross Lazarus, Kanwei Li, Kelly Vincent) were instrumental for making this work happen  ... 
doi:10.1038/nbt.2028 pmid:22068528 pmcid:PMC3868438 fatcat:bi254ehyubhzxgdat6sbuztt2e