Evaluation of Communication Induced Checkpointing Approaches for Reconfiguration-Based Fault-Tolerance in Embedded Systems

Belal H. Sababha ., Osamah A. Rawashdeh

Abstract


Reconfiguration-Based Fault-Tolerance is an approach to developing dependable safety-critical embedded applications, where redundant active or standby resources are used to cope with faults through a system reconfiguration at run-time. Compared to traditional hardware and software redundancy, it is a promising technique that may achieve dependability with a significant reduction in cost, size, weight, and power requirements. Reconfiguration necessitates using proper checkpointing protocols to support state reservation to ensure correct task restarts after a system reconfiguration. Communication Induced Checkpointing (CIC) protocols are well developed and understood for large parallel and information systems, but not much has been done for resource limited embedded systems. This paper implements four common CIC protocols in a resource constrained distributed embedded system with a Controller Area Network (CAN) backbone. An example feedback control system implementation is used for a case study. The four implemented protocols are described and performances are contrasted. The paper compares the protocols in terms of network bandwidth consumptions, CPU usages, checkpointing times, and checkpoint sizes in additional to the traditional measures of forced to local checkpoint rations and total number of checkpoints.

Keywords


Distributed Embedded Systems; Fault-Tolerance; Reconfiguration; Communication Induced Checkpointing; CAN.

Full Text:

PDF

Refbacks

  • There are currently no refbacks.