A Similar Resource Auto-Discovery Based Adaptive Fault-tolerance Method for Embedded Distributed System

MobileRE: A replicas prioritized hybrid fault tolerance strategy for mobile distributed system

Journal of Systems Architecture ◽

10.1016/j.sysarc.2021.102217 ◽

2021 ◽

pp. 102217

Author(s):

Yu Wu ◽

Duo Liu ◽

Xianzhang Chen ◽

Jinting Ren ◽

Renping Liu ◽

...

Keyword(s):

Fault Tolerance ◽

Distributed System ◽

Tolerance Strategy

Enhancing Fault Tolerance of Cloud Nodes using Replication Techniques

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.e5607.018520 ◽

2020 ◽

Vol 8 (5) ◽

pp. 2040-2044

Keyword(s):

Fault Tolerance ◽

High Availability ◽

Cloud Infrastructure ◽

Virtual Node ◽

Adaptive Scheme ◽

Fine Grained ◽

Cloud Technologies ◽

Tolerance Method

Cloud technologies are booming in information technology, but cloud computing is also prone to failures. These failures demand more reliable frameworks with high availability of the computers acting as nodes. A user's request is replicated and sent to several VMs; if one VM fails, another can respond, which increases reliability. Much research has been, and is being, carried out to propose fault tolerance schemes that increase reliability. Earlier schemes focus on a single way of dealing with faults, whereas the scheme proposed in this paper is adaptive and addresses fault tolerance issues across various cloud infrastructures. The proposed scheme adaptively selects between replication and fine-grained checkpointing to attain a reliable cloud infrastructure that can handle different client requirements. In addition, the algorithm determines the best-suited fault tolerance method for every designated virtual node. Zheng, Zhou, Lyu and I. King (2012).
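The failover idea in this abstract (a request goes to several VMs, and another VM answers if one fails) can be illustrated with a minimal sketch. It is not the paper's algorithm: the names `replicated_call`, `AllReplicasFailed`, `failing_vm`, and `healthy_vm` are hypothetical, and the replicas are tried one after another rather than contacted in parallel.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no replica could serve the request (hypothetical name)."""


def replicated_call(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Send the request to each replica in turn and return the first answer."""
    failures = 0
    for vm in replicas:
        try:
            return vm(request)
        except Exception:
            # This replica failed; fall through to the next one.
            failures += 1
    raise AllReplicasFailed(f"all {failures} replicas failed")


# Two stand-in VMs, used only for illustration.
def failing_vm(request: str) -> str:
    raise ConnectionError("VM down")


def healthy_vm(request: str) -> str:
    return f"handled {request}"
```

Calling `replicated_call([failing_vm, healthy_vm], "r1")` returns the healthy VM's answer even though the first VM raised an error. An adaptive scheme such as the one described would additionally decide, per virtual node, whether replication or checkpointing is the better fit; that decision logic is not given in the abstract and is not sketched here.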

A Highly-Efficient Fault Tolerance Method for a Scalable Stream Processing System

Proceedings of the 2017 2nd International Conference on Control, Automation and Artificial Intelligence (CAAI 2017) ◽

10.2991/caai-17.2017.50 ◽

2017 ◽

Author(s):

Guanghui Chang ◽

Peizhen Li ◽

Guangxia Xu

Keyword(s):

Fault Tolerance ◽

Stream Processing ◽

Processing System ◽

Highly Efficient ◽

Tolerance Method

CORBA Replication Support for Fault-Tolerance in a Partitionable Distributed System

17th International Workshop on Database and Expert Systems Applications (DEXA'06) ◽

10.1109/dexa.2006.44 ◽

2006 ◽

Cited By ~ 4

Author(s):

S. Beyer ◽

F.D. Munoz-Escoi ◽

P. Galdamez

Keyword(s):

Fault Tolerance ◽

Distributed System

A Replication-Based Mechanism for Fault Tolerance in MapReduce Framework

Mathematical Problems in Engineering ◽

10.1155/2015/408921 ◽

2015 ◽

Vol 2015 ◽

pp. 1-7 ◽

Cited By ~ 3

Author(s):

Yang Liu ◽

Wei Wei

Keyword(s):

Fault Tolerance ◽

Large Scale ◽

Failure Time ◽

Programming Model ◽

Large Data ◽

Distributed Data ◽

Mapreduce Framework ◽

Single Node ◽

Node Failure ◽

Tolerance Method

MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster. In a cloud environment, node and task failures are no longer accidental but a common feature of large-scale systems. The current rescheduling-based fault tolerance method in the MapReduce framework fails to fully consider the location of distributed data and the computation and storage overhead of rescheduling failed tasks. As a result, a single node failure can increase the completion time dramatically. In this paper, a replication-based mechanism is proposed that takes both task and node failures into consideration. Experimental results show that, compared with the default mechanism in Hadoop, the proposed mechanism significantly improves performance at failure time, reducing execution time by more than 30%.

A reliable distributed system using dual level fault tolerance

Proceedings IEEE Southeastcon '92 ◽

10.1109/secon.1992.202268 ◽

2003 ◽

Cited By ~ 1

Author(s):

J.W. Hanna ◽

J.D. Johannes

Keyword(s):

Fault Tolerance ◽

Distributed System

CONSISTENCY OF DISTRIBUTED SYSTEM WITH ACTIVE INITIATOR PROCESS WITHOUT USELESS CHECKPOINTS

International Journal of Computing ◽

10.47839/ijc.5.1.387 ◽

2014 ◽

pp. 92-99

Author(s):

N. P. Gopalan ◽

K. Nagarajan

Keyword(s):

Fault Tolerance ◽

Distributed System ◽

Message Passing ◽

Domino Effect ◽

Exchange Of Information ◽

Global Consistency ◽

Software Fault Tolerance ◽

Software Fault ◽

Active Initiator

Checkpointing is one of the most attractive approaches for providing software fault tolerance in distributed message-passing systems. This paper implements a distributed checkpointing technique that eliminates drawbacks of the centralized approach such as the “domino effect”, “useless checkpoints” (checkpoints that do not contribute to global consistency), and “hidden and zigzag” dependencies. The proposed checkpointing protocol has a checkpoint initiator, but coordination among the local checkpoints is done in a distributed fashion. The guarantee that no message is lost when a failure occurs is maintained by exchanging information among the processes. There is no central checkpoint initiator, however; each process takes a turn acting as the initiator. Processes take local checkpoints only after being notified by the initiator. The processes synchronize their activities for the current checkpointing interval before finally committing their checkpoints. Thus, the checkpointing pattern described in this paper takes only those checkpoints that contribute to a consistent global snapshot, thereby eliminating useless checkpoints.
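The round structure described in this abstract (a rotating initiator, tentative local checkpoints taken on notification, then a joint commit) might look roughly like the single-threaded simulation below. This is a sketch under simplifying assumptions, not the authors' protocol: `Process` and `checkpoint_round` are hypothetical names, there is no real message passing, and round-robin rotation is assumed for how processes "take turns".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Process:
    pid: int
    state: int = 0                      # stand-in for application state
    tentative: Optional[int] = None     # checkpoint taken but not yet committed
    committed: List[int] = field(default_factory=list)

    def take_tentative(self) -> None:
        self.tentative = self.state

    def commit(self) -> None:
        self.committed.append(self.tentative)
        self.tentative = None


def checkpoint_round(processes: List[Process], round_no: int) -> int:
    """Run one checkpointing round and return the initiator's pid."""
    # Assumption: the initiator role rotates round-robin over the processes.
    initiator = processes[round_no % len(processes)]
    # Phase 1: on the initiator's notification, everyone takes a tentative checkpoint.
    for p in processes:
        p.take_tentative()
    # Phase 2: commit only once every process holds a tentative checkpoint,
    # so every committed checkpoint belongs to one consistent global snapshot.
    if all(p.tentative is not None for p in processes):
        for p in processes:
            p.commit()
    return initiator.pid
```

With three processes, running rounds 0 through 3 selects initiators 0, 1, 2, 0, and each process ends up with one committed checkpoint per round, so no checkpoint is taken that falls outside a global snapshot.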

Novelty circular neighboring technique using reactive fault tolerance method

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v9i6.pp5211-5217 ◽

2019 ◽

Vol 9 (6) ◽

pp. 5211-5217

Author(s):

Ahmad Shukri Mohd Noor ◽

Nur Farhah Mat Zian ◽

Noor Hafhizah Abd Rahim ◽

Rabiei Mamat ◽

Wan Nur Amira Wan Azman

Keyword(s):

Fault Tolerance ◽

Data Replication ◽

Data Availability ◽

Complex Environment ◽

Tolerance Mechanism ◽

System Availability ◽

Replication Technique ◽

Tolerance Method ◽

Reactive Method

The availability of data in a distributed system can be increased by implementing a fault tolerance mechanism. Reactive fault tolerance methods deal with restarting failed services, placing redundant copies of data on multiple nodes across the network (in other words, data replication), and migrating data for recovery. Even though the idea of data replication is sound, the challenge is to choose the right replication technique, one that provides better data availability as well as consistency across read and write operations on the redundant copies. The Circular Neighboring Replication (CNR) technique exploits a neighboring policy when replicating data items and performs well in that it needs fewer copies to keep system availability at its highest. In a performance analysis against existing techniques, results show that CNR improves system availability by 37% on average while needing only two replicas to maintain data availability and consistency. The study demonstrates the feasibility of the proposed technique and its potential for deployment in larger and more complex environments.
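A ring-based neighboring policy with two replicas could be sketched as follows. The placement rule here (a primary copy plus one copy on the next node in the ring) is an assumption made for illustration; the abstract does not state how CNR assigns neighbors. The function names are hypothetical, and `item_availability` further assumes that nodes fail independently.

```python
from typing import Set, Tuple


def replica_nodes(primary: int, n_nodes: int) -> Tuple[int, int]:
    """Return the two nodes holding an item: its primary and the next node in the ring."""
    return (primary, (primary + 1) % n_nodes)


def is_available(primary: int, n_nodes: int, failed: Set[int]) -> bool:
    """An item stays readable while at least one of its two replicas is up."""
    return any(node not in failed for node in replica_nodes(primary, n_nodes))


def item_availability(p: float) -> float:
    """Availability of a two-replica item when each node is up with probability p,
    assuming independent node failures."""
    return 1 - (1 - p) ** 2
```

In a five-node ring, an item whose primary is node 4 wraps around to node 0 for its second copy, and it stays available unless both of those nodes fail. Under the independence assumption, two replicas on nodes that are each up 90% of the time give an item availability of about 99%.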
