Fault Tolerance Interface
Post-checkpointing functions for the FTI library.
#include "interface.h"
Functions
int FTI_Local (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt)
It returns FTI_SCES.
int FTI_SendCkpt (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_checkpoint *FTI_Ckpt, int destination, int postFlag)
It sends the checkpoint file.
int FTI_RecvPtner (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_checkpoint *FTI_Ckpt, int source, int postFlag)
It receives the partner file.
int FTI_Ptner (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt)
It copies the checkpoint files to the partner node.
int FTI_RSenc (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt)
It performs RS encoding with the checkpoint files in the group.
int FTI_Flush (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt, int level)
It flushes the local checkpoint files to the PFS.
int FTI_ArchiveL4Ckpt (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_checkpoint *FTI_Ckpt, FTIT_topology *FTI_Topo)
It moves the level 4 checkpoint to the archive folder.
int FTI_FlushPosix (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt, int level)
It flushes the local checkpoint files to the PFS using POSIX.
int FTI_FlushMPI (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt, int level)
It flushes the local checkpoint files to the PFS using MPI-I/O.
int FTI_FlushSionlib (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt, int level)
It flushes the local checkpoint files to the PFS using SIONlib.
Copyright (c) 2017 Leonardo A. Bautista-Gomez All rights reserved
FTI - A multi-level checkpointing library for C/C++/Fortran applications
Revision 1.0 : Fault Tolerance Interface (FTI)
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
int FTI_ArchiveL4Ckpt (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_checkpoint *FTI_Ckpt, FTIT_topology *FTI_Topo)
It moves the level 4 checkpoint to the archive folder.
FTI_Conf | Configuration metadata. |
FTI_Exec | Execution metadata. |
FTI_Topo | Topology metadata. |
FTI_Ckpt | Checkpoint metadata. |
This function is called if keepL4Ckpt is enabled in the configuration file. It moves the old level 4 ckpt file to the archive folder before the l4 folder in the global directory is deleted.
int FTI_Flush (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt, int level)
It flushes the local checkpoint files to the PFS.
FTI_Conf | Configuration metadata. |
FTI_Exec | Execution metadata. |
FTI_Topo | Topology metadata. |
FTI_Ckpt | Checkpoint metadata. |
level | The level from which ckpt. files are flushed. |
This function flushes the local checkpoint files to the PFS.
FTI_Flush is either executed by application processes during FTI_Finalize or by the heads during FTI_PostCkpt.
int FTI_FlushMPI (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt, int level)
It flushes the local checkpoint files to the PFS using MPI-I/O.
FTI_Conf | Configuration metadata. |
FTI_Exec | Execution metadata. |
FTI_Topo | Topology metadata. |
FTI_Ckpt | Checkpoint metadata. |
level | The level from which ckpt. files are flushed. |
This function flushes the local checkpoint files to the PFS using MPI-I/O.
int FTI_FlushPosix (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt, int level)
It flushes the local checkpoint files to the PFS using POSIX.
FTI_Conf | Configuration metadata. |
FTI_Exec | Execution metadata. |
FTI_Topo | Topology metadata. |
FTI_Ckpt | Checkpoint metadata. |
level | The level from which ckpt. files are flushed. |
This function flushes the local checkpoint files to the PFS using POSIX.
int FTI_FlushSionlib (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt, int level)
It flushes the local checkpoint files to the PFS using SIONlib.
FTI_Conf | Configuration metadata. |
FTI_Exec | Execution metadata. |
FTI_Topo | Topology metadata. |
FTI_Ckpt | Checkpoint metadata. |
level | The level from which ckpt. files are flushed. |
This function flushes the local checkpoint files to the PFS using SIONlib.
int FTI_Local (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt)
It returns FTI_SCES.
FTI_Conf | Configuration metadata. |
FTI_Exec | Execution metadata. |
FTI_Topo | Topology metadata. |
FTI_Ckpt | Checkpoint metadata. |
This function just returns FTI_SCES to have homogeneous code.
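The benefit of a function that only returns FTI_SCES shows up when the post-processing routine is selected by checkpoint level: every level then has a routine with the same signature, and the caller needs no special case for level 1. The following is a minimal sketch of that idea, not FTI's actual code. The `post_fn` type, the `demo_*` functions, the `SKETCH_*` constants, and the mapping of levels 2 to 4 onto partner copy, RS encoding, and flush are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SCES 0    /* stand-in for FTI_SCES; the real value is not given here */
#define SKETCH_NSCS (-1) /* hypothetical failure code */

/* Hypothetical uniform signature for per-level post-processing. */
typedef int (*post_fn)(int *work_done);

/* Level 1: nothing to do after the local write, so just report success. */
static int demo_local(int *work_done) { (void)work_done; return SKETCH_SCES; }

/* Levels 2-4: placeholders that only record which step ran. */
static int demo_ptner(int *work_done) { *work_done = 2; return SKETCH_SCES; }
static int demo_rsenc(int *work_done) { *work_done = 3; return SKETCH_SCES; }
static int demo_flush(int *work_done) { *work_done = 4; return SKETCH_SCES; }

/* Runs the post-processing step for `level` (1..4) with no per-level branches. */
int demo_post_ckpt(int level, int *work_done)
{
    static const post_fn table[] = {
        demo_local, demo_ptner, demo_rsenc, demo_flush
    };

    assert(work_done != NULL);
    if (level < 1 || level > 4) {
        return SKETCH_NSCS;
    }
    return table[level - 1](work_done);
}
```

Calling `demo_post_ckpt(1, &w)` goes through the same table lookup as any other level and leaves `w` untouched, which is the "homogeneous code" effect described above.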
int FTI_Ptner (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt)
It copies the checkpoint files to the partner node.
FTI_Conf | Configuration metadata. |
FTI_Exec | Execution metadata. |
FTI_Topo | Topology metadata. |
FTI_Ckpt | Checkpoint metadata. |
This function copies the checkpoint files into the partner node. It follows a ring, where the ring size is the group size given in the FTI configuration file.
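To make the ring layout concrete, the sketch below computes, for a process with a given rank inside its group, the neighbour it would send its checkpoint to and the neighbour it would receive from. The ring direction (send to the right, receive from the left) and the helper names `ring_right` and `ring_left` are assumptions for illustration; the description above only states that the copy follows a ring whose size is the group size.

```c
#include <assert.h>

/* Hypothetical helper, not an FTI function: rank of the right-hand
 * neighbour in a ring of `groupSize` processes. */
int ring_right(int groupRank, int groupSize)
{
    assert(groupSize > 0 && groupRank >= 0 && groupRank < groupSize);
    return (groupRank + 1) % groupSize;
}

/* Hypothetical helper, not an FTI function: rank of the left-hand
 * neighbour. Adding groupSize before the modulo keeps the result
 * non-negative for rank 0. */
int ring_left(int groupRank, int groupSize)
{
    assert(groupSize > 0 && groupRank >= 0 && groupRank < groupSize);
    return (groupRank + groupSize - 1) % groupSize;
}
```

With a group size of 4, rank 3 wraps around to rank 0 on the right, and rank 0 wraps around to rank 3 on the left, so every process has exactly one sender and one receiver.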
int FTI_RecvPtner (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_checkpoint *FTI_Ckpt, int source, int postFlag)
It receives the partner file.
FTI_Conf | Configuration metadata. |
FTI_Exec | Execution metadata. |
FTI_Ckpt | Checkpoint metadata. |
source | Source group rank. |
postFlag | 0 if post-checkpointing is done by an application process, > 0 if done by a head. |
This function receives the checkpoint file from the partner process and saves it as the partner (Ptner) file. The partner should call FTI_SendCkpt to send the file.
int FTI_RSenc (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt)
It performs RS encoding with the checkpoint files in the group.
FTI_Conf | Configuration metadata. |
FTI_Exec | Execution metadata. |
FTI_Topo | Topology metadata. |
FTI_Ckpt | Checkpoint metadata. |
This function performs the Reed-Solomon encoding for a given group. The checkpoint files are padded to the size of the largest checkpoint file in the group, plus whatever extra space is needed to make that size a multiple of the block size.
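The padding rule can be worked through with a small calculation: take the largest file size in the group, then round it up to the next multiple of the block size. The sketch below does only that arithmetic. The helper names `max_ckpt_size` and `padded_size` are hypothetical, and the sketch does not show how FTI obtains the sizes or the block size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: largest value in sizes[0..n-1]. */
size_t max_ckpt_size(const size_t *sizes, size_t n)
{
    assert(sizes != NULL && n > 0);
    size_t max = sizes[0];
    for (size_t i = 1; i < n; i++) {
        if (sizes[i] > max) {
            max = sizes[i];
        }
    }
    return max;
}

/* Hypothetical helper: rounds maxSize up to the next multiple of
 * blockSize; a size that is already a multiple is left unchanged. */
size_t padded_size(size_t maxSize, size_t blockSize)
{
    assert(blockSize > 0);
    size_t rem = maxSize % blockSize;
    return rem == 0 ? maxSize : maxSize + (blockSize - rem);
}
```

For example, with file sizes of 1000, 4096 and 2500 bytes and a block size of 1024 bytes, every file would be padded to 4096 bytes; if the largest file were 4097 bytes, the padded size would become 5120 bytes.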
int FTI_SendCkpt (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_checkpoint *FTI_Ckpt, int destination, int postFlag)
It sends the checkpoint file.
FTI_Conf | Configuration metadata. |
FTI_Exec | Execution metadata. |
FTI_Ckpt | Checkpoint metadata. |
destination | Destination group rank. |
postFlag | 0 if post-checkpointing is done by an application process, > 0 if done by a head. |
This function sends the checkpoint file to the partner process. The partner should call FTI_RecvPtner to receive this file.