A New Concurrent Checkpoint Mechanism for Embedded Multi-Core Systems

Authors

  • Jianwei Liao, College of Computer and Information Science, Southwest University of China

Keywords:

Concurrent checkpoint, reduced downtime, incremental checkpoint, embedded multi-core systems

Abstract

This paper presents a new transparent, incremental, concurrent checkpoint mechanism for embedded multi-core systems. To a large extent, it allows the checkpointed process (also called the checkpointee) to keep running while a checkpoint is being set. The mechanism traces TLB misses in order to block the first access to each target memory page while memory pages are being dumped (the most time-consuming step in setting a checkpoint). When such an access is blocked, a kernel thread, called the checkpointer, copies the accessed page to a designated memory buffer so that a consistent state of the checkpointee can be constructed, and then resumes the memory access. Experimental results show that, compared with a traditional concurrent checkpoint system, the proposed mechanism reduces the downtime of the checkpointed process by more than 10.1%. Incremental checkpointing has also been implemented in the new mechanism. Compared with full checkpointing, it reduces the checkpoint time by more than 95.5% and 89.2% at checkpoint intervals of 10 seconds and 20 seconds, respectively, with matrix multiplication as the benchmark.
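
The copy-before-first-access idea described in the abstract can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the paper's kernel implementation: the class `ToyCheckpointee` and all of its method names are hypothetical, memory is modelled as a Python list of page values, and the intercepted access is a plain method call rather than a traced TLB miss. The model intercepts writes only, since a write is what could make the saved state inconsistent.

```python
class ToyCheckpointee:
    """Toy model of a process whose memory is a list of page values.

    Hypothetical illustration only: a real system would intercept
    accesses in the kernel (the paper traces TLB misses for this).
    """

    def __init__(self, num_pages):
        self.pages = [0] * num_pages
        # Pages modified since the previous checkpoint (all pages at start).
        self.dirty = set(range(num_pages))
        # Pages that the current checkpoint still has to save.
        self.pending = set()
        self.snapshot = {}

    def begin_checkpoint(self):
        # Incremental checkpoint: only pages dirtied since the last
        # checkpoint need to be saved this time.
        self.pending = set(self.dirty)
        self.dirty = set()
        self.snapshot = {}

    def _save(self, page):
        # Copy the page into the checkpoint buffer if not yet saved.
        if page in self.pending:
            self.snapshot[page] = self.pages[page]
            self.pending.discard(page)

    def write(self, page, value):
        # Stand-in for the blocked first access: the old content is
        # saved before the write is allowed to proceed.
        self._save(page)
        self.pages[page] = value
        self.dirty.add(page)

    def checkpointer_step(self):
        # Background work of the checkpointer: save one pending page.
        if self.pending:
            self._save(min(self.pending))
            return True
        return False

    def finish_checkpoint(self):
        # Drain the remaining pages and return the consistent snapshot.
        while self.checkpointer_step():
            pass
        return dict(self.snapshot)
```

In this model, a write issued between `begin_checkpoint()` and `finish_checkpoint()` never alters the snapshot, because the old page content is copied first. A second checkpoint taken afterwards contains only the pages written since the first one, which is the incremental behaviour the abstract refers to.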

Published

2012-08-10

How to Cite

Liao, J. (2012). A New Concurrent Checkpoint Mechanism for Embedded Multi-Core Systems. COMPUTING AND INFORMATICS, 31(3), 693–709. Retrieved from https://www.cai.sk/ojs/index.php/cai/article/view/1015