Exploit failure prediction for adaptive fault-tolerance in cluster computing

Author(s):  
Yawei Li ◽  
Zhiling Lan
2014 ◽  
Vol 2014 ◽  
pp. 1-9
Author(s):  
Minakshi Tripathy ◽  
C. R. Tripathy

In the recent past, many types of shared memory cluster computing interconnection systems have been proposed. Each of these systems has its own advantages and limitations. As the size of cluster interconnection systems grows, a comparative analysis of their performance measures becomes necessary. Cluster architecture, load balancing, and fault tolerance are among the important aspects that need to be addressed. The comparison is needed in order to choose the best system for a particular application. In this paper, a detailed comparative study of four important and distinct classes of shared memory cluster architectures is presented. The systems studied are shared memory clusters, hierarchical shared memory clusters, distributed shared memory clusters, and virtual distributed shared memory clusters. These clusters are analyzed and compared on the basis of their architecture, load balancing, and fault tolerance. The results of the comparison are reported.


Author(s):  
Ashwini Patil ◽  
Ankit Shah ◽  
Sheetal Gaikwad ◽  
Akassh A. Mishra ◽  
Simranjit Singh Kohli ◽  
...  

2007 ◽  
Vol 43 (2) ◽  
pp. 127-145 ◽  
Author(s):  
Zhiyuan Shao ◽  
Hai Jin ◽  
Bin Cheng ◽  
Wenbin Jiang

Author(s):  
Mark L. James ◽  
Andrew A. Shapiro ◽  
Paul L. Springer ◽  
Hans P. Zima

Future missions of deep-space exploration face the challenge of building more capable autonomous spacecraft and planetary rovers. Given the communication latencies and bandwidth limitations for such missions, the need for increased autonomy becomes mandatory, along with the requirement for enhanced on-board computational capabilities while in deep space or in time-critical situations. This will result in dramatic changes in the way missions are conducted and supported by on-board computing systems. Specifically, the traditional approach of relying exclusively on radiation-hardened hardware and modular redundancy will not be able to deliver the required computational power. As a consequence, such systems are expected to include high-capability low-power components based on emerging commercial-off-the-shelf (COTS) multi-core technology. In this paper we describe the design of a generic framework for introspection that supports runtime monitoring and analysis of program execution as well as a feedback-oriented recovery from faults. Our focus is on providing flexible software fault tolerance matched to the requirements and properties of applications by exploiting knowledge that is either contained in an application knowledge base, provided by users, or automatically derived from specifications. A prototype implementation is currently in progress at the Jet Propulsion Laboratory, California Institute of Technology, targeting a cluster of Cell Broadband Engines.


2013 ◽  
Vol 133 (9) ◽  
pp. 278-283 ◽  
Author(s):  
Masato Futagawa ◽  
Mitsuru Komatsu ◽  
Hikofumi Suzuki ◽  
Yuji Takeshita ◽  
Yasushi Fuwa ◽  
...  

Author(s):  
M. Chaitanya ◽  
K. Durga Charan

Load balancing makes cloud computing more efficient and can improve user satisfaction. At present, cloud computing is among the technologies that offer data storage at very low cost and are available at all times over the internet. However, it has further critical issues such as security, load management, and fault tolerance. Load balancing in the cloud computing environment has a large impact on performance. The algorithm applies game theory to the load balancing strategy to improve efficiency in the public cloud environment. This article introduces an improved load balancing model for the public cloud, based on the cloud partitioning concept, with a switch mechanism to choose different strategies for different situations.

