Data Management Challenges in Cloud Environments

Computer Engineering and Applications Journal ◽

10.18495/comengapp.v3i3.105 ◽

2014 ◽

Vol 3 (3) ◽

pp. 158-171

Author(s):

Mohamad Masood Javidi ◽

Najme Mansouri ◽

Asghar Asadi Karam

Keyword(s):

Cloud Computing ◽

Data Management ◽

Large Scale ◽

Web Applications ◽

Large Data ◽

Database Systems ◽

Advantages And Disadvantages ◽

Computing Paradigm ◽

Data Marts ◽

Computing Platforms

The cloud computing paradigm has recently attracted considerable excitement and attention in research. Cloud computing has the potential to change a large part of IT activity, making software even more attractive as a service and shaping the way IT hardware is offered and purchased. Developers with novel ideas for new Internet services no longer need large capital outlays in hardware to deploy their service, or the human expense to operate it. These cloud applications rely on large data centers and powerful servers that host Web applications and Web services. This paper presents an overview of what cloud computing means and its history, along with its advantages and disadvantages. We describe the problems and opportunities of data management on these emerging cloud computing platforms. We argue that large-scale data analysis jobs, decision support systems, and application-specific data marts are more likely to benefit from cloud computing platforms than operational, transactional database systems.

Large-Scale Data Management Techniques in Cloud Computing Platforms

Data-Intensive Computing ◽

10.1017/cbo9780511844409.005 ◽

2012 ◽

pp. 85-123

Author(s):

Sherif Sakr ◽

Anna Liu

Keyword(s):

Cloud Computing ◽

Data Management ◽

Large Scale ◽

Large Scale Data ◽

Management Techniques ◽

Computing Platforms ◽

Scale Data

A Parallel Unmixing-Based Content Retrieval System for Distributed Hyperspectral Imagery Repository on Cloud Computing Platforms

Remote Sensing ◽

10.3390/rs13020176 ◽

2021 ◽

Vol 13 (2) ◽

pp. 176

Author(s):

Peng Zheng ◽

Zebin Wu ◽

Jin Sun ◽

Yi Zhang ◽

Yaoqin Zhu ◽

...

Keyword(s):

Cloud Computing ◽

Large Scale ◽

Retrieval System ◽

Hyperspectral Image ◽

Parallel Implementation ◽

Remotely Sensed Data ◽

Web Interfaces ◽

Content Retrieval ◽

Service Mode ◽

Computing Platforms

As the volume of remotely sensed data grows significantly, content-based image retrieval (CBIR) becomes increasingly important, especially for cloud computing platforms that facilitate processing and storing big data in a parallel and distributed way. This paper proposes a novel parallel CBIR system for hyperspectral image (HSI) repositories on cloud computing platforms, guided by unmixed spectral information, i.e., endmembers and their associated fractional abundances, to retrieve hyperspectral scenes. However, existing unmixing methods suffer from an extremely high computational burden when extracting metadata from large-scale HSI data. To address this limitation, we implement a distributed and parallel unmixing method that runs on cloud computing platforms to accelerate the unmixing processing flow. In addition, we implement a global standard distributed HSI repository equipped with a large spectral library in a software-as-a-service mode, providing users with HSI storage, management, and retrieval services through web interfaces. Furthermore, the parallel implementation of unmixing processing is incorporated into the CBIR system to establish the parallel unmixing-based content retrieval system. The performance of the proposed parallel CBIR system was verified in terms of both unmixing efficiency and accuracy.

Cloud Security

Handbook of Research on Securing Cloud-Based Databases with Biometric Applications - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-4666-6559-0.ch011 ◽

2015 ◽

pp. 236-250

Author(s):

Natasha Csicsmann ◽

Victoria McIntyre ◽

Patrick Shea ◽

Syed S. Rizvi

Keyword(s):

Cloud Computing ◽

Performance Analysis ◽

Large Scale ◽

Research Issues ◽

Biometric Technology ◽

Cloud Computing Security ◽

Security Technologies ◽

Encryption Schemes ◽

Computing Platforms ◽

Cloud Auditing

Strong authentication and encryption schemes help cloud stakeholders in performing the robust and accurate cloud auditing of a potential service provider. All security-related issues and challenges, therefore, need to be addressed before a ubiquitous adoption of cloud computing. In this chapter, the authors provide an overview of existing biometrics-based security technologies and discuss some of the open research issues that need to be addressed for making biometric technology an effective tool for cloud computing security. Finally, this chapter provides a performance analysis on the use of large-scale biometrics-based authentication systems for different cloud computing platforms.

Cloud Computing for Global Software Development

Transportation Systems and Engineering ◽

10.4018/978-1-4666-8473-7.ch045 ◽

2015 ◽

pp. 897-908 ◽

Cited By ~ 3

Author(s):

Thamer Al-Rousan

Keyword(s):

Cloud Computing ◽

Software Development ◽

Process Model ◽

Risk Model ◽

Cost Effective ◽

Information And Communications Technology ◽

Global Software Development ◽

Computing Paradigm ◽

It Systems ◽

Computing Platforms

The cloud computing paradigm offers an innovative and promising vision concerning Information and Communications Technology. Actually, it provides the possibility of improving IT systems management and is changing the way in which hardware and software are designed and purchased. This paper introduces challenges in Global Software Development (GSD) and application of cloud computing platforms as a solution to some problems. Even though cloud computing provides compelling benefits and cost-effective options for GSD, new risks and difficulties must be taken into account. Thus, the paper presents a study about the risk issues involved in cloud computing. It highlights the different types of risks and how their existence can affect GSD. It also proposes a new risk management process model. The risk model employs new processes for risk analysis and assessment. Its aim is to analyse cloud risks quantitatively and, consequently, prioritise them according to their impact on different GSD objectives.
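
The abstract does not give the formulas of its risk model, so the following is only a minimal sketch of what quantitative prioritisation against several objectives could look like. It assumes a simple probability-times-weighted-impact score; the risk names, objective names, weights, and the `prioritise` function are all hypothetical illustrations, not the paper's model.

```python
def prioritise(risks, objective_weights):
    """Rank risks by probability * weighted impact, highest score first.

    risks: {name: (probability, {objective: impact})}  (hypothetical shape)
    objective_weights: {objective: weight}             (hypothetical values)
    """
    scored = []
    for name, (probability, impacts) in risks.items():
        # Impact on each objective is scaled by how much that objective matters.
        weighted_impact = sum(
            weight * impacts.get(objective, 0.0)
            for objective, weight in objective_weights.items()
        )
        scored.append((probability * weighted_impact, name))
    # Sort by descending score; ties are broken alphabetically by name.
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in ordered]


# Illustrative numbers only; they are not taken from the paper.
objective_weights = {"cost": 0.5, "schedule": 0.3, "quality": 0.2}
risks = {
    "vendor lock-in": (0.4, {"cost": 8, "schedule": 2}),
    "data breach": (0.1, {"cost": 9, "quality": 9}),
    "service outage": (0.3, {"schedule": 9, "quality": 5}),
}
ranking = prioritise(risks, objective_weights)
print(ranking)
```

With these made-up inputs, a likely but moderate risk can outrank a severe but rare one, which is the kind of trade-off a quantitative ranking makes explicit.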

A Review: Map Reduce Framework for Cloud Computing

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.6.20224 ◽

2018 ◽

Vol 7 (4.6) ◽

pp. 13

Author(s):

Mekala Sandhya ◽

Ashish Ladda ◽

Dr. Uma N Dulhare

Keyword(s):

Data Mining ◽

Cloud Computing ◽

Distributed Computing ◽

Data Storage ◽

High Performance ◽

Large Scale ◽

Distributed Storage ◽

Large Data ◽

Mass Data ◽

Internet Information

In the current generation of the Internet, information and data are growing continuously. Through various Internet services and applications, the amount of information is increasing rapidly, and hundreds of billions, even trillions, of web indexes exist. Such large data brings people a mass of information and, at the same time, makes it more difficult to discover useful knowledge in these huge amounts of data. Cloud computing can provide the infrastructure for large data. It has two significant characteristics of distributed computing: scalability and high availability. Scalability means that it can extend seamlessly to large-scale clusters. High availability means that it can tolerate node errors, so node failures do not prevent programs from running correctly. Combined with data mining, cloud computing performs significant data processing on high-performance machines. Mass data storage and distributed computing provide a new method for mass data mining and become an effective solution for distributed storage and efficient computing in data mining.

Automated Clustering of Virtual Machines based on Correlation of Resource Usage

Journal of Communications Software and Systems ◽

10.24138/jcomss.v8i4.164 ◽

2012 ◽

Vol 8 (4) ◽

pp. 102 ◽

Cited By ~ 7

Author(s):

Claudia Canali ◽

Riccardo Lancellotti

Keyword(s):

Cloud Computing ◽

High Performance ◽

Large Scale ◽

Virtual Machines ◽

Resource Usage ◽

Cloud Data ◽

Multiple Resources ◽

Computing Paradigm ◽

Innovative Methodology ◽

Cloud Data Centers

The recent growth in demand for modern applications combined with the shift to the Cloud computing paradigm have led to the establishment of large-scale cloud data centers. The increasing size of these infrastructures represents a major challenge in terms of monitoring and management of the system resources. Available solutions typically consider every Virtual Machine (VM) as a black box each with independent characteristics, and face scalability issues by reducing the number of monitored resource samples, considering in most cases only average CPU usage sampled at a coarse time granularity. We claim that scalability issues can be addressed by leveraging the similarity between VMs in terms of resource usage patterns. In this paper we propose an automated methodology to cluster VMs depending on the usage of multiple resources, both system- and network-related, assuming no knowledge of the services executed on them. This is an innovative methodology that exploits the correlation between the resource usage to cluster together similar VMs. We evaluate the methodology through a case study with data coming from an enterprise datacenter, and we show that high performance may be achieved in automatic VMs clustering. Furthermore, we estimate the reduction in the amount of data collected, thus showing that our proposal may simplify the monitoring requirements and help administrators to take decisions on the resource management of cloud computing datacenters.
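
To make the idea of correlation-based VM clustering concrete, here is a minimal sketch under stated assumptions. It is not the authors' algorithm: it assumes each VM is described by the pairwise Pearson correlations between its own metric series, and it swaps in a simple greedy distance-threshold grouping for whatever clustering step the paper uses. The metric names, sample values, and `max_distance` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def feature_vector(metrics):
    """Correlations between every pair of one VM's metrics, in sorted-name order."""
    names = sorted(metrics)
    return [pearson(metrics[a], metrics[b]) for a, b in combinations(names, 2)]


def cluster_vms(vms, max_distance=0.5):
    """Greedy grouping: a VM joins the first cluster whose representative
    (its first member) is within max_distance in feature space."""
    clusters = []  # list of (representative_vector, [vm_names])
    for name in sorted(vms):
        vec = feature_vector(vms[name])
        for rep, members in clusters:
            distance = sqrt(sum((a - b) ** 2 for a, b in zip(vec, rep)))
            if distance <= max_distance:
                members.append(name)
                break
        else:
            clusters.append((vec, [name]))
    return [members for _, members in clusters]


# Made-up samples: CPU and network move together on the web VMs,
# and in opposite directions on the database VM.
vms = {
    "web-1": {"cpu": [10, 20, 30, 40], "net": [1, 2, 3, 4]},
    "web-2": {"cpu": [5, 15, 25, 30], "net": [2, 4, 6, 7]},
    "db-1": {"cpu": [10, 20, 30, 40], "net": [8, 6, 4, 2]},
}
clusters = cluster_vms(vms)
print(clusters)
```

Note that the grouping uses only the shape of the resource usage, with no knowledge of which service runs on each VM, which matches the black-box setting the abstract describes.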

Efficiency of Semantic Web Implementation on Cloud Computing: A Review

Qubahan Academic Journal ◽

10.48161/qaj.v1n3a72 ◽

2021 ◽

Vol 1 (3) ◽

pp. 1-9

Author(s):

Kazheen Ismael Taher ◽

Rezgar Hasan Saeed ◽

Rowaida Kh. Ibrahim ◽

Zryan Najat Rashid ◽

Lailan M. Haji ◽

...

Keyword(s):

Cloud Computing ◽

Semantic Web ◽

Large Scale ◽

Web Applications ◽

New Technology ◽

Cloud Services ◽

Facilities Management ◽

Heterogeneous Services ◽

Critical Components ◽

Definition Of

Semantic web and cloud technology systems have been critical components in creating and deploying applications in various fields. Although they are self-contained, they can be combined in various ways to create solutions, a topic that has recently been discussed in depth. Recent years have seen a dramatic increase in new cloud providers, applications, facilities, management systems, data, and so on, reaching a level of complexity that indicates the need for new technology to address such tremendous, shared, and heterogeneous services and resources. As a result, issues with portability, interoperability, security, selection, negotiation, discovery, and definition of cloud services and resources may arise. Semantic technologies, which have enormous potential for cloud computing, offer a vital way of re-examining these issues. This paper explores and examines the role of Semantic Web technology in the Cloud, drawing on a variety of sources. In addition, a "cloud-driven" mode of interaction illustrates how we can construct the semantic web and provide automated semantic annotations to web applications on a large scale by leveraging Cloud computing properties and advantages.

Contextual Contracts for Component-Based Resource Abstraction in a Cloud of HPC Services

10.5753/wscad.2019.8670 ◽

2019 ◽

Cited By ~ 1

Author(s):

Wagner Al Alam ◽

Francisco Carvalho Junior

Keyword(s):

Cloud Computing ◽

Parallel Computing ◽

Large Scale ◽

Matrix Multiplication ◽

Small Scale ◽

Computing Systems ◽

Computing Platform ◽

Computing Platforms

The efforts to make cloud computing suitable for the requirements of HPC applications have motivated us to design HPC Shelf, a cloud computing platform of services for building and deploying parallel computing systems for large-scale parallel processing. We introduce Alite, the system of contextual contracts of HPC Shelf, aimed at selecting component implementations according to the requirements of applications, the features of target parallel computing platforms (e.g. clusters), QoS (Quality-of-Service) properties, and cost restrictions. It is evaluated through a small-scale case study employing a component-based framework for matrix multiplication based on the BLAS library.

Cloud bursting galaxy: federated identity and access management

Bioinformatics ◽

10.1093/bioinformatics/btz472 ◽

2019 ◽

Vol 36 (1) ◽

pp. 1-9 ◽

Cited By ~ 1

Author(s):

Vahid Jalili ◽

Enis Afgan ◽

James Taylor ◽

Jeremy Goecks

Keyword(s):

Cloud Computing ◽

High Speed ◽

Large Scale ◽

Best Practice ◽

Data Transfer ◽

Data Access ◽

Web Security ◽

Biomedical Data ◽

Authentication And Authorization ◽

Computing Platforms

Motivation: Large biomedical datasets, such as those from genomics and imaging, are increasingly being stored on commercial and institutional cloud computing platforms. This is because cloud-scale computing resources, from robust backup to high-speed data transfer to scalable compute and storage, are needed to make these large datasets usable. However, one challenge for large-scale biomedical data on the cloud is providing secure access, especially when datasets are distributed across platforms. While there are open Web protocols for secure authentication and authorization, these protocols are not in wide use in bioinformatics and are difficult to use for even technologically sophisticated users.
Results: We have developed a generic and extensible approach for securely accessing biomedical datasets distributed across cloud computing platforms. Our approach combines OpenID Connect and OAuth2, best-practice Web protocols for authentication and authorization, together with Galaxy (https://galaxyproject.org), a web-based computational workbench used by thousands of scientists across the world. With our enhanced version of Galaxy, users can access and analyze data distributed across multiple cloud computing providers without any special knowledge of access/authorization protocols. Our approach does not require users to share permanent credentials (e.g. username, password, API key), instead relying on automatically generated temporary tokens that refresh as needed. Our approach is generalizable to most identity providers and cloud computing platforms. To the best of our knowledge, Galaxy is the only computational workbench where users can access biomedical datasets across multiple cloud computing platforms using best-practice Web security approaches and thereby minimize risks of unauthorized data access and credential use.
Availability and implementation: Freely available for academic and commercial use under the open-source Academic Free License (https://opensource.org/licenses/AFL-3.0) from the following GitHub repositories: https://github.com/galaxyproject/galaxy and https://github.com/galaxyproject/cloudauthz.
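
The "temporary tokens that refresh as needed" pattern can be sketched as follows. This is a minimal illustration under stated assumptions, not Galaxy's or cloudauthz's actual code: the `RefreshingCredential` class, the `fetch` callable (a stand-in for a real token exchange with an identity provider), and the 30-second safety margin are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class TemporaryCredential:
    token: str
    expires_at: float  # measured on the same clock as `now`


class RefreshingCredential:
    """Hands out a short-lived token and replaces it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],  # hypothetical: returns (token, lifetime_seconds)
        now: Callable[[], float] = time.monotonic,
        margin: float = 30.0,  # refresh this many seconds before expiry
    ):
        self._fetch = fetch
        self._now = now
        self._margin = margin
        self._current: Optional[TemporaryCredential] = None

    def get(self) -> str:
        current = self._current
        if current is None or self._now() >= current.expires_at - self._margin:
            # No token yet, or it is about to expire: request a fresh one.
            token, lifetime = self._fetch()
            current = TemporaryCredential(token, self._now() + lifetime)
            self._current = current
        return current.token


# Demonstration with a fake clock and a fake issuer; no network calls are made.
clock = [0.0]
issued = []


def fake_fetch():
    issued.append(f"token-{len(issued) + 1}")
    return issued[-1], 300.0  # each token is valid for 300 seconds


cred = RefreshingCredential(fake_fetch, now=lambda: clock[0])
first = cred.get()   # nothing cached yet, so a token is fetched
clock[0] = 100.0
second = cred.get()  # still well within its lifetime, so it is reused
clock[0] = 280.0
third = cred.get()   # inside the 30-second margin before expiry, so it is replaced
print(first, second, third)
```

Because callers only ever see the value returned by `get()`, the permanent credential (if any) stays with the issuer, and a leaked token is useful only for its short lifetime.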

Tuning Heterogeneous Computing Platforms for Large-Scale Hydrology Data Management

IEEE Transactions on Parallel and Distributed Systems ◽

10.1109/tpds.2015.2499741 ◽

2016 ◽

Vol 27 (9) ◽

pp. 2753-2765 ◽

Cited By ~ 2

Author(s):

Lorne Leonard ◽

Kamesh Madduri ◽

Christopher J. Duffy

Keyword(s):

Data Management ◽

Large Scale ◽

Heterogeneous Computing ◽

Computing Platforms
