Mobile Agent Based Fault-Tolerance Support for the Reliable Mobile Computing Systems

Author(s):  
Taesoon Park
Author(s):  
R. Raje ◽  
J. Gandhamaneni ◽  
A. Olson ◽  
B. Bryant

For reasons of economy and scalability, many of the current distributed computing systems (DCSs) are realized as an integration of prefabricated and deployed components offering specific services. A critical task that the assembler of such a system needs to address is to locate and select appropriate components scattered over a network. This requires solving many research challenges. These include: (a) deployment of components and their specifications, (b) efficient searching for and gathering of appropriate specifications, (c) representation of queries, and (d) semantics of matching between queries and specifications. UniFrame (Raje, Auguston, Bryant, Olson, & Burt, 2001) is a framework that allows the seamless discovery and integration of such distributed software components. It addresses three key research issues: (1) architecture-based interoperability, (2) distributed discovery of resources, and (3) quality validation. This article presents a mobile-agent-based discovery service, which is one of the alternatives developed under research issue (2).


CGS-accumulation (Consistent Global State Accumulation) is one of the commonly used methods for providing fault tolerance in distributed systems, so that the system can continue to operate even if one or more components have failed. However, mobile computing systems are constrained by low bandwidth, mobility, lack of stable storage, frequent disconnections, and limited battery life. Hence, CGS-accumulation etiquettes that take fewer reinstatement-points are favored in mobile environments. In this paper, we propose a minimum-method coordinated CGS-accumulation etiquette for deterministic distributed applications on mobile computing systems. We eliminate useless reinstatement-points, as well as the blocking of methods while reinstatement-points are taken, at the cost of logging anti-messages of very few messages during CGS-accumulation. We also try to minimize the loss of CGS-accumulation effort when any method fails to capture its reinstatement-point in an instigation. In this way, we handle excessive failures during CGS-accumulation.


2011 ◽  
Vol 21 (04) ◽  
pp. 379-396 ◽  
Author(s):  
BLESSON VARGHESE ◽  
GERARD MCKEE ◽  
VASSIL ALEXANDROV

The work reported in this paper aims to validate an alternative approach to fault tolerance, in place of traditional methods such as checkpointing, which constrain efficacious fault tolerance. Can agent intelligence be used to achieve fault-tolerant parallel computing systems? If so, three questions need to be addressed: "What agent capabilities are required for fault tolerance?", "What parallel computational tasks can benefit from such agent capabilities?" and "How can agent capabilities be implemented for fault tolerance?" Cognitive capabilities essential for achieving fault tolerance through agents are considered. Parallel reduction algorithms are identified as a class of algorithms that can benefit from cognitive agent capabilities. The Message Passing Interface is used to implement an intelligent agent-based approach. Preliminary experimental results validate the feasibility of an agent-based approach for achieving fault tolerance in parallel computing systems.

