Exploit failure prediction for adaptive fault-tolerance in cluster computing

A Comparative Analysis of Performance of Shared Memory Cluster Computing Interconnection Systems

Journal of Computer Networks and Communications ◽

10.1155/2014/128438 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9

Author(s):

Minakshi Tripathy ◽

C. R. Tripathy

Keyword(s):

Comparative Analysis ◽

Fault Tolerance ◽

Load Balancing ◽

Shared Memory ◽

Cluster Computing ◽

Distributed Shared Memory ◽

System Size ◽

Cluster Architecture ◽

Analysis Of Performance ◽

Made In

In the recent past, many types of shared memory cluster computing interconnection systems have been proposed, each with its own advantages and limitations. As the system size of cluster interconnection systems increases, a comparative analysis of their performance measures becomes essential. Cluster architecture, load balancing, and fault tolerance are among the important aspects that need to be addressed, and a comparison is needed in order to choose the best system for a particular application. In this paper, a detailed comparative study of four important classes of shared memory cluster architectures is presented. The systems studied are shared memory clusters, hierarchical shared memory clusters, distributed shared memory clusters, and virtual distributed shared memory clusters. These clusters are analyzed and compared on the basis of their architecture, load balancing, and fault tolerance aspects. The results of the comparison are reported.

Fault Tolerance in Cluster Computing System

2011 International Conference on P2P, Parallel, Grid, Cloud and Internet Computing ◽

10.1109/3pgcic.2011.77 ◽

2011 ◽

Cited By ~ 1

Author(s):

Ashwini Patil ◽

Ankit Shah ◽

Sheetal Gaikwad ◽

Akassh A. Mishra ◽

Simranjit Singh Kohli ◽

...

Keyword(s):

Fault Tolerance ◽

Cluster Computing ◽

Computing System

Effective fault tolerance for agent-based cluster computing

Journal of Systems and Software ◽

10.1016/s0164-1212(99)00057-6 ◽

1999 ◽

Vol 48 (3) ◽

pp. 189-196 ◽

Cited By ~ 2

Author(s):

Kam Hong Shum

Keyword(s):

Fault Tolerance ◽

Cluster Computing ◽

Agent Based

Failure Prediction Models for Proactive Fault Tolerance within Storage Systems

2008 IEEE International Symposium on Modeling, Analysis and Simulation of Computers and Telecommunication Systems ◽

10.1109/mascot.2008.4770560 ◽

2008 ◽

Cited By ~ 16

Author(s):

Ben Eckart ◽

Xin Chen ◽

Xubin He ◽

Stephen L. Scott

Keyword(s):

Fault Tolerance ◽

Prediction Models ◽

Storage Systems ◽

Failure Prediction ◽

Proactive Fault Tolerance

ER-TCP: an efficient TCP fault-tolerance scheme for cluster computing

The Journal of Supercomputing ◽

10.1007/s11227-007-0123-7 ◽

2007 ◽

Vol 43 (2) ◽

pp. 127-145 ◽

Cited By ~ 2

Author(s):

Zhiyuan Shao ◽

Hai Jin ◽

Bin Cheng ◽

Wenbin Jiang

Keyword(s):

Fault Tolerance ◽

Cluster Computing

Fault Tolerance for Cluster Computing Based on Functional Tasks

Euro-Par 2001 Parallel Processing - Lecture Notes in Computer Science ◽

10.1007/3-540-44681-8_102 ◽

2001 ◽

pp. 712-717

Author(s):

Wolfgang Schreiner ◽

Gabor Kusper ◽

Karoly Bosa

Keyword(s):

Fault Tolerance ◽

Cluster Computing

Adaptive Fault Tolerance for Scalable Cluster Computing in Space

The International Journal of High Performance Computing Applications ◽

10.1177/1094342009106190 ◽

2009 ◽

Vol 23 (3) ◽

pp. 227-241 ◽

Cited By ~ 4

Author(s):

Mark L. James ◽

Andrew A. Shapiro ◽

Paul L. Springer ◽

Hans P. Zima

Keyword(s):

Fault Tolerance ◽

Cluster Computing ◽

Traditional Approach ◽

Deep Space ◽

California Institute Of Technology ◽

Core Technology ◽

Commercial Off The Shelf ◽

Institute Of Technology ◽

Modular Redundancy ◽

Time Critical

Future missions of deep-space exploration face the challenge of building more capable autonomous spacecraft and planetary rovers. Given the communication latencies and bandwidth limitations of such missions, increased autonomy becomes mandatory, along with enhanced on-board computational capabilities while in deep space or in time-critical situations. This will result in dramatic changes in the way missions are conducted and supported by on-board computing systems. Specifically, the traditional approach of relying exclusively on radiation-hardened hardware and modular redundancy will not be able to deliver the required computational power. As a consequence, such systems are expected to include high-capability, low-power components based on emerging commercial-off-the-shelf (COTS) multi-core technology. In this paper we describe the design of a generic framework for introspection that supports runtime monitoring and analysis of program execution as well as feedback-oriented recovery from faults. Our focus is on providing flexible software fault tolerance matched to the requirements and properties of applications by exploiting knowledge that is either contained in an application knowledge base, provided by users, or automatically derived from specifications. A prototype implementation is currently in progress at the Jet Propulsion Laboratory, California Institute of Technology, targeting a cluster of Cell Broadband Engines.

Acoustic Emission Spectrum and Failure Prediction in Loaded Coal Specimens

Физико-технические проблемы разработки полезных ископаемых ◽

10.15372/ftprpi20170503 ◽

2017 ◽

Keyword(s):

Acoustic Emission ◽

Emission Spectrum ◽

Failure Prediction

Fabrication of a Slope Failure Prediction Sensor using Miniaturized EC Sensor

IEEJ Transactions on Sensors and Micromachines ◽

10.1541/ieejsmas.133.278 ◽

2013 ◽

Vol 133 (9) ◽

pp. 278-283 ◽

Cited By ~ 4

Author(s):

Masato Futagawa ◽

Mitsuru Komatsu ◽

Hikofumi Suzuki ◽

Yuji Takeshita ◽

Yasushi Fuwa ◽

...

Keyword(s):

Slope Failure ◽

Failure Prediction

A Review on Load Balancing Model Using Best Partition Technique

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v7i8.69 ◽

2017 ◽

Vol 7 (8) ◽

pp. 284

Author(s):

M. Chaitanya ◽

K. Durga Charan

Keyword(s):

Cloud Computing ◽

Fault Tolerance ◽

Load Balancing ◽

Load Balance ◽

Large Impact ◽

Cloud Environment ◽

Public Cloud ◽

The Public ◽

Partition Technique ◽

Textual Content

Load balancing makes cloud computing more efficient and can increase user satisfaction. At present, cloud computing is one of the leading technologies, offering data storage at very low cost with availability at all times over the internet. However, it still faces critical issues such as security, load management, and fault tolerance. Load balancing in the cloud computing environment has a large impact on performance. The algorithm applies game theory to the load balancing strategy to improve efficiency in the public cloud environment. This article presents an improved load balance model for the public cloud, based on the cloud partitioning concept with a switch mechanism that selects different strategies for different situations.
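
As a rough illustration of the idea in this abstract (a partitioned cloud in which the balancing strategy is chosen according to the situation), the following is a minimal Python sketch. It is not the authors' algorithm: the `Partition` class, the `dispatch` function, the 0.3 and 0.8 load thresholds, and the choice of round-robin and least-loaded selection (used here as a simple stand-in for the game-theoretic strategy) are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Partition:
    """Hypothetical cloud partition holding per-node load fractions (0.0 to 1.0)."""
    name: str
    node_loads: list[float]
    _rr: count = field(default_factory=count, repr=False)  # round-robin counter

    def status(self, idle_below=0.3, overloaded_above=0.8):
        # Thresholds are illustrative assumptions, not values from the paper.
        avg = sum(self.node_loads) / len(self.node_loads)
        if avg < idle_below:
            return "idle"
        if avg > overloaded_above:
            return "overloaded"
        return "normal"

    def pick_node(self):
        # The "switch": a cheap strategy when idle, a load-aware one otherwise.
        if self.status() == "idle":
            return next(self._rr) % len(self.node_loads)
        # Least-loaded node; a stand-in for the game-theoretic strategy.
        return min(range(len(self.node_loads)), key=self.node_loads.__getitem__)


def dispatch(partitions):
    """Return (partition name, node index) for a new job, or None if all are overloaded."""
    for p in partitions:
        if p.status() != "overloaded":
            return p.name, p.pick_node()
    return None
```

For example, `dispatch([Partition("a", [0.9, 0.95]), Partition("b", [0.5, 0.4, 0.7])])` skips the overloaded partition `a` and selects the least-loaded node of `b`.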
