Cloud Bursting Galaxy: Federated Identity and Access Management

10.1101/506238 ◽

2018 ◽

Author(s):

Vahid Jalili ◽

Enis Afgan ◽

James Taylor ◽

Jeremy Goecks

Keyword(s):

Cloud Computing ◽

High Speed ◽

Large Scale ◽

Best Practice ◽

Data Access ◽

Web Security ◽

Biomedical Data ◽

Link Type ◽

Authentication And Authorization ◽

Computing Platforms

Abstract. Motivation: Large biomedical datasets, such as those from genomics and imaging, are increasingly being stored on commercial and institutional cloud computing platforms. This is because cloud-scale computing resources, from robust backup to high-speed data transfer to scalable compute and storage, are needed to make these large datasets usable. However, one challenge for large-scale biomedical data on the cloud is providing secure access, especially when datasets are distributed across platforms. While there are open Web protocols for secure authentication and authorization, these protocols are not in wide use in bioinformatics and are difficult to use even for technologically sophisticated users. Results: We have developed a generic and extensible approach for securely accessing biomedical datasets distributed across cloud computing platforms. Our approach combines OpenID Connect and OAuth2, best-practice Web protocols for authentication and authorization, together with Galaxy (https://galaxyproject.org), a web-based computational workbench used by thousands of scientists across the world. With our enhanced version of Galaxy, users can access and analyze data distributed across multiple cloud computing providers without any special knowledge of access/authorization protocols. Our approach does not require users to share permanent credentials (e.g., username, password, API key), instead relying on automatically generated temporary tokens that refresh as needed. Our approach is generalizable to most identity providers and cloud computing platforms. To the best of our knowledge, Galaxy is the only computational workbench where users can access biomedical datasets across multiple cloud computing platforms using best-practice Web security approaches and thereby minimize risks of unauthorized data access and credential use.
Availability and implementation: Freely available for academic and commercial use under the open-source Academic Free License (https://opensource.org/licenses/AFL-3.0) from the following GitHub repositories: https://github.com/galaxyproject/galaxy and https://github.com/galaxyproject/[email protected], [email protected]

Cloud bursting galaxy: federated identity and access management

Bioinformatics ◽

10.1093/bioinformatics/btz472 ◽

2019 ◽

Vol 36 (1) ◽

pp. 1-9 ◽

Cited By ~ 1

Author(s):

Vahid Jalili ◽

Enis Afgan ◽

James Taylor ◽

Jeremy Goecks

Keyword(s):

Cloud Computing ◽

High Speed ◽

Large Scale ◽

Best Practice ◽

Data Transfer ◽

Data Access ◽

Web Security ◽

Biomedical Data ◽

Authentication And Authorization ◽

Computing Platforms

Abstract. Motivation: Large biomedical datasets, such as those from genomics and imaging, are increasingly being stored on commercial and institutional cloud computing platforms. This is because cloud-scale computing resources, from robust backup to high-speed data transfer to scalable compute and storage, are needed to make these large datasets usable. However, one challenge for large-scale biomedical data on the cloud is providing secure access, especially when datasets are distributed across platforms. While there are open Web protocols for secure authentication and authorization, these protocols are not in wide use in bioinformatics and are difficult to use even for technologically sophisticated users. Results: We have developed a generic and extensible approach for securely accessing biomedical datasets distributed across cloud computing platforms. Our approach combines OpenID Connect and OAuth2, best-practice Web protocols for authentication and authorization, together with Galaxy (https://galaxyproject.org), a web-based computational workbench used by thousands of scientists across the world. With our enhanced version of Galaxy, users can access and analyze data distributed across multiple cloud computing providers without any special knowledge of access/authorization protocols. Our approach does not require users to share permanent credentials (e.g., username, password, API key), instead relying on automatically generated temporary tokens that refresh as needed. Our approach is generalizable to most identity providers and cloud computing platforms. To the best of our knowledge, Galaxy is the only computational workbench where users can access biomedical datasets across multiple cloud computing platforms using best-practice Web security approaches and thereby minimize risks of unauthorized data access and credential use.
Availability and implementation: Freely available for academic and commercial use under the open-source Academic Free License (https://opensource.org/licenses/AFL-3.0) from the following GitHub repositories: https://github.com/galaxyproject/galaxy and https://github.com/galaxyproject/cloudauthz.
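
The abstract's idea of "temporary tokens that refresh as needed" can be illustrated with a minimal Python sketch. This is not code from Galaxy or CloudAuthZ: the names `TemporaryToken`, `RefreshingTokenSource`, the `fetch` callback, and the 30-second `margin` are all hypothetical, and the actual exchange of an OpenID Connect identity token for cloud credentials is deliberately left to the caller.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TemporaryToken:
    """A short-lived credential (hypothetical shape, for illustration only)."""
    value: str
    expires_at: float  # absolute expiry time, in seconds since the epoch


class RefreshingTokenSource:
    """Hands out a temporary token and requests a new one only when needed."""

    def __init__(
        self,
        fetch: Callable[[], TemporaryToken],
        clock: Callable[[], float] = time.time,
        margin: float = 30.0,
    ) -> None:
        # `fetch` is an assumption: a caller-supplied function that obtains a
        # fresh temporary token, e.g. by presenting an identity token to a
        # cloud provider. No permanent credential is stored in this class.
        self._fetch = fetch
        self._clock = clock
        self._margin = margin  # refresh this many seconds before expiry
        self._token: Optional[TemporaryToken] = None

    def get(self) -> str:
        """Return a valid token value, refreshing it if it is about to expire."""
        now = self._clock()
        if self._token is None or now >= self._token.expires_at - self._margin:
            self._token = self._fetch()
        return self._token.value
```

A caller would construct one `RefreshingTokenSource` per user and provider and call `get()` before each data access; injecting `clock` keeps the refresh logic testable without waiting for a real token to expire.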

A Parallel Unmixing-Based Content Retrieval System for Distributed Hyperspectral Imagery Repository on Cloud Computing Platforms

Remote Sensing ◽

10.3390/rs13020176 ◽

2021 ◽

Vol 13 (2) ◽

pp. 176

Author(s):

Peng Zheng ◽

Zebin Wu ◽

Jin Sun ◽

Yi Zhang ◽

Yaoqin Zhu ◽

...

Keyword(s):

Cloud Computing ◽

Large Scale ◽

Retrieval System ◽

Hyperspectral Image ◽

Parallel Implementation ◽

Remotely Sensed Data ◽

Web Interfaces ◽

Content Retrieval ◽

Service Mode ◽

Computing Platforms

As the volume of remotely sensed data grows, content-based image retrieval (CBIR) becomes increasingly important, especially on cloud computing platforms that support storing and processing big data in a parallel and distributed way. This paper proposes a novel parallel CBIR system for hyperspectral image (HSI) repositories on cloud computing platforms that retrieves hyperspectral scenes guided by unmixed spectral information, i.e., endmembers and their associated fractional abundances. However, existing unmixing methods incur an extremely high computational burden when extracting metadata from large-scale HSI data. To address this limitation, we implement a distributed and parallel unmixing method that runs on cloud computing platforms to accelerate the unmixing processing flow. In addition, we implement a global standard distributed HSI repository equipped with a large spectral library in a software-as-a-service mode, providing users with HSI storage, management, and retrieval services through web interfaces. Furthermore, the parallel implementation of unmixing is incorporated into the CBIR system to establish the parallel unmixing-based content retrieval system. The performance of the proposed parallel CBIR system was verified in terms of both unmixing efficiency and accuracy.

Cloud Security

Handbook of Research on Securing Cloud-Based Databases with Biometric Applications - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-4666-6559-0.ch011 ◽

2015 ◽

pp. 236-250

Author(s):

Natasha Csicsmann ◽

Victoria McIntyre ◽

Patrick Shea ◽

Syed S. Rizvi

Keyword(s):

Cloud Computing ◽

Performance Analysis ◽

Large Scale ◽

Research Issues ◽

Biometric Technology ◽

Cloud Computing Security ◽

Security Technologies ◽

Encryption Schemes ◽

Computing Platforms ◽

Cloud Auditing

Strong authentication and encryption schemes help cloud stakeholders perform robust and accurate cloud auditing of a potential service provider. All security-related issues and challenges therefore need to be addressed before cloud computing can be ubiquitously adopted. In this chapter, the authors provide an overview of existing biometrics-based security technologies and discuss some of the open research issues that need to be addressed to make biometric technology an effective tool for cloud computing security. Finally, this chapter provides a performance analysis of large-scale biometrics-based authentication systems on different cloud computing platforms.

A High-Speed Railway Data Placement Strategy Based on Cloud Computing

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.135-136.43 ◽

2011 ◽

Vol 135-136 ◽

pp. 43-49

Author(s):

Han Ning Wang ◽

Wei Xiang Xu ◽

Chao Long Jia

Keyword(s):

Cloud Computing ◽

High Speed ◽

Data Access ◽

Interval Mapping ◽

Data Placement ◽

Programming Algorithm ◽

High Speed Railway ◽

Mapping Algorithm ◽

Data Intensive ◽

Study Results

The application of high-speed railway data, which is an important component of China's transportation science data sharing, embodies the typical characteristics of data-intensive computing. A reasonable and effective data placement strategy is needed to deploy and execute data-intensive applications in the cloud computing environment. This paper analyzes and compares the results of current data placement approaches. Combining the semi-definite programming algorithm with the dynamic interval mapping algorithm, a hierarchical data placement strategy is proposed. The semi-definite programming algorithm is suitable for placing files with multiple replicas, ensuring that different replicas of a file are placed on different storage devices. The dynamic interval mapping algorithm, in turn, gives the data storage system better self-adaptability. Both theoretical analysis and experimental demonstration show that a hierarchical data placement strategy can guarantee self-adaptability, data reliability and high-speed data access for large-scale networks.

Large-Scale Data Management Techniques in Cloud Computing Platforms

Data-Intensive Computing ◽

10.1017/cbo9780511844409.005 ◽

2012 ◽

pp. 85-123

Author(s):

Sherif Sakr ◽

Anna Liu

Keyword(s):

Cloud Computing ◽

Data Management ◽

Large Scale ◽

Large Scale Data ◽

Management Techniques ◽

Computing Platforms ◽

Scale Data

CROssBAR: Comprehensive Resource of Biomedical Relations with Deep Learning Applications and Knowledge Graph Representations

10.1101/2020.09.14.296889 ◽

2020 ◽

Author(s):

Tunca Doğan ◽

Heval Atas ◽

Vishal Joshi ◽

Ahmet Atakan ◽

Ahmet Sureyya Rifaioglu ◽

...

Keyword(s):

Deep Learning ◽

Protein Interactions ◽

Large Scale ◽

Host Protein ◽

Knowledge Graph ◽

Biomedical Data ◽

Data Resource ◽

Graph Representations ◽

Link Type ◽

Systemic Analysis

Abstract. Systemic analysis of available large-scale biological and biomedical data is critical for developing novel and effective treatment approaches against both complex and infectious diseases. Because different sections of the biomedical data are produced by different organizations/institutions using various types of technologies, the data are scattered across individual computational resources, without any explicit relations/connections to each other, which greatly hinders comprehensive multi-omics-based analysis of the data. We aimed to address this issue by constructing a new biological and biomedical data resource, CROssBAR, a comprehensive system that integrates large-scale biomedical data from various resources and stores them in a new NoSQL database, enriches these data with deep-learning-based prediction of relations between numerous biomedical entities, rigorously analyses the enriched data to obtain biologically meaningful modules, and displays them to users via easy-to-interpret, interactive and heterogeneous knowledge graph (KG) representations within an open-access, user-friendly online web service at https://crossbar.kansil.org. As a use-case study, we constructed CROssBAR COVID-19 KGs (available at: https://crossbar.kansil.org/covid_main.php) that incorporate relevant virus and host genes/proteins, interactions, pathways, phenotypes and other diseases, as well as known and completely new predicted drugs/compounds. Our COVID-19 graphs can be utilized for a systems-level evaluation of relevant virus-host protein interactions, mechanisms, phenotypic implications and potential interventions.

Contextual Contracts for Component-Based Resource Abstraction in a Cloud of HPC Services

10.5753/wscad.2019.8670 ◽

2019 ◽

Cited By ~ 1

Author(s):

Wagner Al Alam ◽

Francisco Carvalho Junior

Keyword(s):

Cloud Computing ◽

Parallel Computing ◽

Large Scale ◽

Matrix Multiplication ◽

Small Scale ◽

Computing Systems ◽

Computing Platform ◽

Computing Platforms

Efforts to make cloud computing suitable for the requirements of HPC applications have motivated us to design HPC Shelf, a cloud computing platform of services for building and deploying parallel computing systems for large-scale parallel processing. We introduce Alite, the system of contextual contracts of HPC Shelf, which selects component implementations according to the requirements of applications, the features of the target parallel computing platforms (e.g., clusters), QoS (Quality-of-Service) properties and cost restrictions. It is evaluated through a small-scale case study employing a component-based framework for matrix multiplication based on the BLAS library.

An Abstract Model for Integrated Intrusion Detection and Severity Analysis for Clouds

Cloud Computing Advancements in Design, Implementation, and Technologies ◽

10.4018/978-1-4666-1879-4.ch001 ◽

2013 ◽

pp. 1-17 ◽

Cited By ~ 1

Author(s):

Junaid Arshad ◽

Paul Townend ◽

Jie Xu

Keyword(s):

Cloud Computing ◽

Intrusion Detection ◽

Large Scale ◽

Best Practice ◽

Abstract Model ◽

Analysis Model ◽

Best Practice Guidelines ◽

Computing Paradigm ◽

Architectural Evaluation ◽

Cloud Infrastructures

Cloud computing is an emerging computing paradigm which introduces novel opportunities to establish large-scale, flexible computing infrastructures. However, security underpins extensive adoption of Cloud computing. This paper presents efforts to address one of the significant issues with respect to the security of Clouds, i.e., intrusion detection and severity analysis. An abstract model for integrated intrusion detection and severity analysis for Clouds is proposed to facilitate minimal intrusion response time while preserving the overall security of the Cloud infrastructures. To assess the effectiveness of the proposed model, a detailed architectural evaluation using the Architectural Trade-off Analysis Model (ATAM) is carried out. A set of recommendations that can serve as best practice guidelines when implementing the proposed architecture is discussed.

Information Evaporation: The Migration Of Information To Cloud Computing Platforms

International Journal of Management & Information Systems (IJMIS) ◽

10.19030/ijmis.v16i4.7305 ◽

2012 ◽

Vol 16 (4) ◽

pp. 291

Author(s):

David Reavis

Keyword(s):

Cloud Computing ◽

Data Storage ◽

Data Access ◽

Time Lapse ◽

Early Years ◽

Computing Environment ◽

Central System ◽

Cloud Computing Environment ◽

Transmission Distance ◽

Computing Platforms

The physical location of the data used in every organization ebbs and flows as technology improves. In the early years of computing, data were stored on the central system because that was the only choice. As communication technology advanced, a decentralized model became popular and data were stored nearer to the place they would be used. Another leap in telecommunications prompted a move back to centralized data storage, mostly because access speeds allowed the data to be used remotely with minimal time lapse due to transmission distance. The most recent transition for housing data is to move data from various databases, some centralized and some localized, into the cloud. The benefits of moving information to a cloud computing environment have recently made it attractive to organizations. Converting data from one platform to another is done regularly by IT professionals. In each of the transitions described above, data had to be converted in some way, and transitions to updated computing platforms are not uncommon. In this paper, the term information evaporation is used to distinguish the move of information to the cloud from other conversion activities, such as system upgrades or platform transitions. Converting data from a traditional database environment to an Internet-based cloud computing environment requires a different approach to security, attention to avoiding the creation of information silos, and the development of data tags, such as eXtensible Markup Language (XML), to facilitate cross-platform data access.

Data Management Challenges in Cloud Environments

Computer Engineering and Applications Journal ◽

10.18495/comengapp.v3i3.105 ◽

2014 ◽

Vol 3 (3) ◽

pp. 158-171

Author(s):

Mohamad Masood Javidi ◽

Najme Mansouri ◽

Asghar Asadi Karam

Keyword(s):

Cloud Computing ◽

Data Management ◽

Large Scale ◽

Web Applications ◽

Large Data ◽

Database Systems ◽

Advantages And Disadvantages ◽

Computing Paradigm ◽

Data Marts ◽

Computing Platforms

The cloud computing paradigm has recently attracted considerable excitement and attention in research. Cloud computing has the potential to change a large part of IT activity, making software even more attractive as a service and shaping the way IT hardware is designed and purchased. Developers with novel ideas for new Internet services no longer require large capital outlays in hardware to deploy their service or the human expense to operate it. These cloud applications use large data centers and powerful servers that host Web applications and Web services. This report presents an overview of what cloud computing means and its history, along with its advantages and disadvantages. In this paper we describe the problems and opportunities of deploying data management on these emerging cloud computing platforms. We argue that large-scale data analysis jobs, decision support systems, and application-specific data marts are more likely to benefit from cloud computing platforms than operational, transactional database systems.
