Phoenix: Making Data-Intensive Grid Applications Fault-Tolerant
Author(s): G. Kola, T. Kosar, M. Livny

2000 · Vol 16 (5) · pp. 473-481
Author(s): Brian Tierney, William Johnston, Jason Lee, Mary Thompson

2009
Author(s): Min Zhu, Shilin Xiao, Wei Guo, Anne Wei, Yaohui Jin, ...

Sensors · 2021 · Vol 21 (21) · pp. 7238
Author(s): Zulfiqar Ahmad, Ali Imran Jehangiri, Mohammed Alaa Ala’anzy, Mohamed Othman, Arif Iqbal Umar

Cloud computing is a mature, flexible computing paradigm that provides services to scientific and business applications in a subscription-based environment. Scientific applications such as Montage and CyberShake are organized as scientific workflows with data- and compute-intensive tasks, and they also have some special characteristics. In particular, the tasks of scientific workflows are executed in terms of integration, disintegration, pipeline, and parallelism, and thus require special attention to task management and data-oriented resource scheduling and management. Tasks executed in a pipeline are bottleneck executions: their failure renders the entire execution futile, which calls for fault-tolerance-aware execution. Tasks executed in parallel require similar instances of cloud resources, so cluster-based execution can improve system performance in terms of makespan and execution cost. Therefore, this research work presents a cluster-based, fault-tolerant and data-intensive (CFD) scheduling strategy for scientific applications in cloud environments. The CFD strategy addresses the data intensiveness of scientific-workflow tasks with cluster-based, fault-tolerant mechanisms. The Montage scientific workflow was used as the simulation scenario, and the results of the CFD strategy were compared with three well-known heuristic scheduling policies: (a) MCT, (b) Max-min, and (c) Min-min. The simulation results show that the CFD strategy reduces the makespan by 14.28%, 20.37%, and 11.77%, respectively, compared with these three policies. Similarly, CFD reduces the execution cost by 1.27%, 5.3%, and 2.21%, respectively. With the CFD strategy, the SLA is not violated with regard to time and cost constraints, whereas the existing policies violate it numerous times.
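The abstract does not include the CFD implementation itself. As a rough orientation only, the following Python sketch shows the Min-min baseline heuristic that CFD is compared against, plus a simple retry wrapper illustrating the kind of fault-tolerant re-execution described for pipeline (bottleneck) tasks. All function names, task times, and resource counts below are illustrative assumptions, not the authors' code.

def min_min_schedule(etc, ready=None):
    """Min-min heuristic: repeatedly pick the task whose best (minimum)
    completion time over all resources is smallest, and assign it there.
    etc[t][r] = estimated execution time of task t on resource r."""
    n_tasks, n_res = len(etc), len(etc[0])
    ready = list(ready or [0.0] * n_res)       # resource ready times
    unscheduled = set(range(n_tasks))
    assignment = {}                            # task -> (resource, finish time)
    while unscheduled:
        # best resource (minimum completion time) for each unscheduled task
        best = {t: min(range(n_res), key=lambda r: ready[r] + etc[t][r])
                for t in unscheduled}
        # task whose minimum completion time is smallest goes next
        t = min(unscheduled, key=lambda t: ready[best[t]] + etc[t][best[t]])
        r = best[t]
        ready[r] += etc[t][r]
        assignment[t] = (r, ready[r])
        unscheduled.remove(t)
    return assignment, max(ready)              # schedule and makespan

def run_with_retries(task, max_retries=3):
    """Illustrative fault-tolerant wrapper: re-execute a failed bottleneck
    task up to max_retries times instead of failing the whole workflow."""
    for attempt in range(1, max_retries + 1):
        try:
            return task()
        except RuntimeError:
            if attempt == max_retries:
                raise

# Toy example: 4 tasks on 2 resources (times are made up).
etc = [[3, 5], [2, 4], [6, 3], [4, 4]]
assignment, makespan = min_min_schedule(etc)
print(assignment, makespan)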


In recent decades, with the emergence of Internet-connected appliances, data usage has increased dramatically, which has had a strong impact on storage and mining technologies. Scientific and research fields also produce data of varied structure, viz. structured, semi-structured, and unstructured data, and processing such data has become correspondingly demanding. Sustainable technologies exist to address these challenges and to deliver scalable services through effective physical infrastructure (in terms of mining), smart networking solutions, and useful software approaches. Indeed, cloud computing targets data-intensive computing by facilitating scalable processing of huge data sets. Still, the problem remains unaddressed at the largest scales, since data continues to grow exponentially. At this juncture, the recommendable approach is the well-known MapReduce model for processing huge and voluminous data. The basic model on its own, however, offers limited fault tolerance and reliability, which can be overcome by the Hadoop architecture. Hadoop, in contrast, is fault tolerant and provides high throughput, making it suitable for applications with huge data sets and file systems that require streaming access. The paper examines what efficient architectural/design changes are necessary to bring together the benefits of the Everest model, HBase, and the existing MR algorithms.
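The paragraph discusses MapReduce at a conceptual level only. Below is a minimal, self-contained Python sketch of the map-shuffle-reduce pattern using the canonical word-count example; it illustrates the programming model, not Hadoop's actual Java API, and the function names and toy input are assumptions.

from collections import defaultdict

def map_phase(records):
    """Map: emit (key, value) pairs; here, (word, 1) for every word."""
    for record in records:
        for word in record.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Shuffle/sort: group all values by key, as the framework does
    between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate each key's values; here, sum the counts."""
    return {key: sum(values) for key, values in groups.items()}

# Toy input standing in for lines of a large file split across workers.
records = ["big data needs scalable processing",
           "hadoop processes big data with mapreduce"]
print(reduce_phase(shuffle(map_phase(records))))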

