FRASystem: fault tolerant system using agents in distributed computing systems

Authors
Lee, HwaMin; Park, DooSoon; Yu, HeonChang; Lee, Giyeol
Issue Date
Mar-2011
Publisher
Baltzer Science Publishers B.V.
Keywords
Fault tolerance; Multi-agent system; Distributed computing system; Rollback-recovery; Garbage-collection
Citation
Cluster Computing, v.14, no.1, pp. 15-25
Pages
11
Journal Title
Cluster Computing
Volume
14
Number
1
Start Page
15
End Page
25
URI
https://scholarworks.bwise.kr/sch/handle/2021.sw.sch/16700
DOI
10.1007/s10586-009-0095-x
ISSN
1386-7857
1573-7543
Abstract
In this paper, we present a fault-tolerant recovery system called FRASystem (Fault Tolerant & Recovery Agent System) that uses multiple agents in distributed computing systems. Previous rollback-recovery protocols depended on the inherent communication mechanism and the underlying operating system, which degraded computing performance. We propose a rollback-recovery protocol that works independently of the operating system, which increases portability and extensibility. We define four types of agents: (1) a recovery agent performs the rollback-recovery protocol after a failure, (2) an information agent constructs domain knowledge, as rules of fault tolerance, and records fault tolerance information during failure-free operation, (3) a facilitator agent controls communication between agents, and (4) a garbage collection agent performs garbage collection of useless fault tolerance information. Since agent failures may lead to inconsistent system states and a domino effect, we propose an agent recovery algorithm. A garbage collection protocol addresses the performance degradation caused by the growth of fault tolerance information saved in stable storage. We implemented a prototype of FRASystem using Java and CORBA and evaluated the proposed rollback-recovery protocol experimentally. The simulation results indicate that our protocol performs better than previous rollback-recovery protocols that use independent checkpointing and pessimistic message logging without agents. Our contributions are as follows: (1) this is the first rollback-recovery protocol using agents, (2) FRASystem does not depend on an operating system, and (3) FRASystem provides portability and extensibility.
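The division of labour among the four agent types can be illustrated with a minimal sketch. This is not the authors' Java and CORBA prototype: it is written in Python for brevity, and every class, method, and field name below (`StableStorage`, `Process`, `deliver`, `collect`, and so on) is a hypothetical choice made for illustration. The sketch assumes a toy process whose state is the sum of the messages it has received, combined with independent checkpointing and pessimistic message logging, the two techniques the abstract names as the baseline.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class StableStorage:
    """Hypothetical stand-in for stable storage."""
    # pid -> (state, number of messages reflected in that state)
    checkpoints: Dict[str, Tuple[int, int]] = field(default_factory=dict)
    # pid -> list of (sequence number, message value)
    logs: Dict[str, List[Tuple[int, int]]] = field(default_factory=dict)


class Process:
    """Toy application process: its state is the sum of delivered messages."""
    def __init__(self, pid: str) -> None:
        self.pid = pid
        self.state = 0
        self.delivered = 0


class InformationAgent:
    """Records fault tolerance information during failure-free operation."""
    def __init__(self, storage: StableStorage) -> None:
        self.storage = storage

    def log_message(self, proc: Process, value: int) -> None:
        # Pessimistic logging: the message is logged before it is applied.
        self.storage.logs.setdefault(proc.pid, []).append((proc.delivered, value))

    def checkpoint(self, proc: Process) -> None:
        # Independent checkpoint: no coordination with other processes.
        self.storage.checkpoints[proc.pid] = (proc.state, proc.delivered)


class RecoveryAgent:
    """Restores the last checkpoint and replays the logged messages after it."""
    def __init__(self, storage: StableStorage) -> None:
        self.storage = storage

    def recover(self, proc: Process) -> None:
        state, delivered = self.storage.checkpoints.get(proc.pid, (0, 0))
        proc.state, proc.delivered = state, delivered
        for seq, value in self.storage.logs.get(proc.pid, []):
            if seq >= delivered:  # only messages newer than the checkpoint
                proc.state += value
                proc.delivered += 1


class GarbageCollectionAgent:
    """Discards log entries already covered by the latest checkpoint."""
    def __init__(self, storage: StableStorage) -> None:
        self.storage = storage

    def collect(self, pid: str) -> int:
        _, delivered = self.storage.checkpoints.get(pid, (0, 0))
        old = self.storage.logs.get(pid, [])
        kept = [(seq, value) for seq, value in old if seq >= delivered]
        self.storage.logs[pid] = kept
        return len(old) - len(kept)  # number of entries removed


class FacilitatorAgent:
    """Mediates between the application process and the other agents."""
    def __init__(self, storage: StableStorage) -> None:
        self.info = InformationAgent(storage)
        self.recovery = RecoveryAgent(storage)
        self.gc = GarbageCollectionAgent(storage)

    def deliver(self, proc: Process, value: int) -> None:
        self.info.log_message(proc, value)  # log first, then apply
        proc.state += value
        proc.delivered += 1


def demo() -> Tuple[int, int, int]:
    storage = StableStorage()
    facilitator = FacilitatorAgent(storage)
    proc = Process("p1")
    for value in (1, 2, 3):
        facilitator.deliver(proc, value)
    facilitator.info.checkpoint(proc)        # checkpoint covers 3 messages
    removed = facilitator.gc.collect("p1")   # those 3 log entries are now useless
    facilitator.deliver(proc, 4)
    facilitator.deliver(proc, 5)
    proc.state, proc.delivered = 0, 0        # simulate a crash losing volatile state
    facilitator.recovery.recover(proc)       # restore checkpoint, replay 4 and 5
    return removed, proc.state, proc.delivered
```

Calling `demo()` checkpoints after three messages, garbage-collects the log entries that checkpoint makes redundant, and then recovers the pre-crash state from the checkpoint plus the two remaining log entries. The agent recovery algorithm mentioned in the abstract (recovering the agents themselves) is not modelled here.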
Appears in Collections
College of Engineering > Department of Computer Software Engineering > 1. Journal Articles