Fault Tolerance Interface
ftiff.c File Reference

Functions for the FTI File Format (FTI-FF).

#include "interface.h"

Macros

#define _GNU_SOURCE
 

Functions

int FTIFF_ReadDbFTIFF (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_checkpoint *FTI_Ckpt)
 Reads the datablock structure for the FTI File Format from the checkpoint file.

int FTIFF_GetFileChecksum (FTIFF_metaInfo *FTIFF_Meta, FTIT_checkpoint *FTI_Ckpt, int fd, unsigned char *hash)
 Determines the checksum of the checkpoint data.

int FTIFF_UpdateDatastructFTIFF (FTIT_execution *FTI_Exec, FTIT_dataset *FTI_Data, FTIT_configuration *FTI_Conf)
 Updates the datablock structure for the FTI File Format.

int FTIFF_WriteFTIFF (FTIT_configuration *FTI_Conf, FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt, FTIT_dataset *FTI_Data)
 Writes the checkpoint to local storage or the PFS using FTI-FF.

int FTIFF_CreateMetadata (FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_dataset *FTI_Data, FTIT_configuration *FTI_Conf)
 Assigns meta data to the runtime and file meta data types.

int FTIFF_Recover (FTIT_execution *FTI_Exec, FTIT_dataset *FTI_Data, FTIT_checkpoint *FTI_Ckpt)
 Recovers protected data to the variable pointers for FTI-FF.

int FTIFF_RecoverVar (int id, FTIT_execution *FTI_Exec, FTIT_dataset *FTI_Data, FTIT_checkpoint *FTI_Ckpt)
 Recovers protected data to the variable pointer with the given id.

int FTIFF_CheckL1RecoverInit (FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt)
 Init of FTI-FF L1 recovery.

int FTIFF_CheckL2RecoverInit (FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt, int *exists)
 Init of FTI-FF L2 recovery.

int FTIFF_CheckL3RecoverInit (FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt, int *erased)
 Init of FTI-FF L3 recovery.

int FTIFF_CheckL4RecoverInit (FTIT_execution *FTI_Exec, FTIT_topology *FTI_Topo, FTIT_checkpoint *FTI_Ckpt)
 Init of FTI-FF L4 recovery.

void FTIFF_GetHashMetaInfo (unsigned char *hash, FTIFF_metaInfo *FTIFFMeta)
 Computes the hash of the FTI-FF file meta data structure.

void FTIFF_GetHashdb (unsigned char *hash, FTIFF_db *db)
 Computes the hash of the FTI-FF file data block meta data structure.

void FTIFF_GetHashdbvar (unsigned char *hash, FTIFF_dbvar *dbvar)
 Computes the hash of the FTI-FF data chunk meta data structure.

void FTIFF_InitMpiTypes ()
 Initializes the derived MPI data types used for FTI-FF.

int FTIFF_DeserializeFileMeta (FTIFF_metaInfo *meta, char *buffer_ser)
 Deserializes FTI-FF file meta data.

int FTIFF_DeserializeDbMeta (FTIFF_db *db, char *buffer_ser)
 Deserializes FTI-FF file data block meta data.

int FTIFF_DeserializeDbVarMeta (FTIFF_dbvar *dbvar, char *buffer_ser)
 Deserializes FTI-FF data chunk meta data.

int FTIFF_SerializeFileMeta (FTIFF_metaInfo *meta, char *buffer_ser)
 Serializes FTI-FF file meta data.

int FTIFF_SerializeDbMeta (FTIFF_db *db, char *buffer_ser)
 Serializes FTI-FF file data block meta data.

int FTIFF_SerializeDbVarMeta (FTIFF_dbvar *dbvar, char *buffer_ser)
 Serializes FTI-FF data chunk meta data.

void FTIFF_FreeDbFTIFF (FTIFF_db *last)
 Frees the memory allocated for the FTI-FF meta data struct list.
 
void FTIFF_PrintDataStructure (int rank, FTIT_execution *FTI_Exec, FTIT_dataset *FTI_Data)
 

Variables

MPI_Datatype FTIFF_MpiTypes [FTIFF_NUM_MPI_TYPES]
 

Detailed Description

Functions for the FTI File Format (FTI-FF).

Copyright (c) 2017 Leonardo A. Bautista-Gomez All rights reserved

FTI - A multi-level checkpointing library for C/C++/Fortran applications

Revision 1.0 : Fault Tolerance Interface (FTI)

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Author
Kai Keller (kelle.nosp@m.kai@.nosp@m.gmx.d.nosp@m.e)
Date
October, 2017

Macro Definition Documentation

#define _GNU_SOURCE

Function Documentation

int FTIFF_CheckL1RecoverInit ( FTIT_execution *  FTI_Exec,
FTIT_topology *  FTI_Topo,
FTIT_checkpoint *  FTI_Ckpt
)

Init of FTI-FF L1 recovery.

Parameters
FTI_Exec    Execution metadata.
FTI_Topo    Topology metadata.
FTI_Ckpt    Checkpoint metadata.
Returns
integer FTI_SCES if successful.

This function initializes the L1 checkpoint recovery. It checks for erasures and loads the required meta data.

int FTIFF_CheckL2RecoverInit ( FTIT_execution *  FTI_Exec,
FTIT_topology *  FTI_Topo,
FTIT_checkpoint *  FTI_Ckpt,
int *  exists
)

Init of FTI-FF L2 recovery.

Parameters
FTI_Exec    Execution metadata.
FTI_Topo    Topology metadata.
FTI_Ckpt    Checkpoint metadata.
exists      Array with info of erased files.
Returns
integer FTI_SCES if successful.

This function initializes the L2 checkpoint recovery. It checks for erasures and loads the required meta data.

int FTIFF_CheckL3RecoverInit ( FTIT_execution *  FTI_Exec,
FTIT_topology *  FTI_Topo,
FTIT_checkpoint *  FTI_Ckpt,
int *  erased
)

Init of FTI-FF L3 recovery.

Parameters
FTI_Exec    Execution metadata.
FTI_Topo    Topology metadata.
FTI_Ckpt    Checkpoint metadata.
erased      Array with info of erased files.
Returns
integer FTI_SCES if successful.

This function initializes the L3 checkpoint recovery. It checks for erasures and loads the required meta data.

int FTIFF_CheckL4RecoverInit ( FTIT_execution *  FTI_Exec,
FTIT_topology *  FTI_Topo,
FTIT_checkpoint *  FTI_Ckpt
)

Init of FTI-FF L4 recovery.

Parameters
FTI_Exec    Execution metadata.
FTI_Topo    Topology metadata.
FTI_Ckpt    Checkpoint metadata.
Returns
integer FTI_SCES if successful.

This function initializes the L4 checkpoint recovery. It checks for erasures and loads the required meta data.

int FTIFF_CreateMetadata ( FTIT_execution *  FTI_Exec,
FTIT_topology *  FTI_Topo,
FTIT_dataset *  FTI_Data,
FTIT_configuration *  FTI_Conf
)

Assigns meta data to the runtime and file meta data types.

Parameters
FTI_Conf    Configuration metadata.
FTI_Exec    Execution metadata.
FTI_Topo    Topology metadata.
FTI_Data    Dataset metadata.
Returns
integer FTI_SCES if successful.

This function gathers information about the checkpoint files in the group and stores it in the respective runtime and checkpoint-file meta data types.

int FTIFF_DeserializeDbMeta ( FTIFF_db *  db,
char *  buffer_ser
)

Deserializes FTI-FF file data block meta data.

Parameters
db            FTI-FF file data block meta data.
buffer_ser    Serialized file data block meta data.

int FTIFF_DeserializeDbVarMeta ( FTIFF_dbvar *  dbvar,
char *  buffer_ser
)

Deserializes FTI-FF data chunk meta data.

Parameters
dbvar         FTI-FF data chunk meta data.
buffer_ser    Serialized data chunk meta data.

int FTIFF_DeserializeFileMeta ( FTIFF_metaInfo *  meta,
char *  buffer_ser
)

Deserializes FTI-FF file meta data.

Parameters
meta          FTI-FF file meta data.
buffer_ser    Serialized file meta data.

void FTIFF_FreeDbFTIFF ( FTIFF_db *  last )

Frees the memory allocated for the FTI-FF meta data struct list.

Parameters
last    Last element in the FTI-FF metadata list.
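
Because the function takes the last list element, one plausible way to release such a list is to walk it backwards. The following is only a minimal sketch under that assumption: the `demo_db` type, its `previous`/`next`/`chunks` fields, and both helper functions are hypothetical stand-ins and are not taken from ftiff.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a metadata list node (not the real FTIFF_db). */
typedef struct demo_db {
    struct demo_db *previous;  /* assumed back link */
    struct demo_db *next;      /* assumed forward link */
    void           *chunks;    /* per-block descriptors owned by the node */
} demo_db;

/* Append a node after `last` (which may be NULL); returns the new node or NULL. */
static demo_db *demo_append(demo_db *last, size_t nchunks) {
    demo_db *node = calloc(1, sizeof *node);
    if (node == NULL) return NULL;
    node->chunks = calloc(nchunks ? nchunks : 1, sizeof(long));
    if (node->chunks == NULL) { free(node); return NULL; }
    node->previous = last;
    if (last != NULL) last->next = node;
    return node;
}

/* Free the whole list, starting at its last element and following the
 * back links. Returns the number of nodes released. */
static int demo_free_from_last(demo_db *last) {
    assert(last == NULL || last->next == NULL);  /* must really be the tail */
    int freed = 0;
    while (last != NULL) {
        demo_db *prev = last->previous;  /* save the link before freeing */
        free(last->chunks);
        free(last);
        last = prev;
        ++freed;
    }
    return freed;
}
```

Saving the back link before calling `free` avoids reading from memory that has already been released.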

int FTIFF_GetFileChecksum ( FTIFF_metaInfo *  FTIFF_Meta,
FTIT_checkpoint *  FTI_Ckpt,
int  fd,
unsigned char *  hash
)

Determines the checksum of the checkpoint data.

Parameters
FTIFF_Meta    FTI-FF file meta data.
FTI_Ckpt      Checkpoint metadata.
fd            File descriptor.
hash          Pointer to the MD5 digest container.
Returns
integer FTI_SCES if successful.

This function computes the FTI-FF file checksum and places the MD5 digest into the 'hash' buffer. The buffer must be allocated with at least MD5_DIGEST_LENGTH bytes.
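
As an illustration of the buffer requirement, the sketch below shows how a caller might handle a raw digest of MD5_DIGEST_LENGTH bytes once it has been filled in. It does not call FTIFF_GetFileChecksum itself. The helpers `demo_digest_to_hex` and `demo_digest_equal` are hypothetical, and the fallback definition of MD5_DIGEST_LENGTH (16, the value OpenSSL uses) is only there so the sketch compiles without OpenSSL headers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#ifndef MD5_DIGEST_LENGTH
#define MD5_DIGEST_LENGTH 16  /* same value as in OpenSSL's <openssl/md5.h> */
#endif

/* Render a raw digest as lowercase hex.
 * `out` must hold at least 2 * MD5_DIGEST_LENGTH + 1 bytes. */
static void demo_digest_to_hex(const unsigned char *hash, char *out, size_t out_len) {
    assert(out_len >= 2 * MD5_DIGEST_LENGTH + 1);
    for (int i = 0; i < MD5_DIGEST_LENGTH; ++i) {
        /* Each call writes two hex characters plus a terminating NUL. */
        snprintf(out + 2 * i, 3, "%02x", hash[i]);
    }
}

/* Compare a freshly computed digest with a stored one. */
static int demo_digest_equal(const unsigned char *a, const unsigned char *b) {
    return memcmp(a, b, MD5_DIGEST_LENGTH) == 0;
}
```

A caller would typically declare `unsigned char hash[MD5_DIGEST_LENGTH];` on the stack, pass it as the `hash` argument, and then compare or print it with helpers like these.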

void FTIFF_GetHashdb ( unsigned char *  hash,
FTIFF_db *  db
)

Computes the hash of the FTI-FF file data block meta data structure.

Parameters
hash    Hash to compute.
db      File data block meta data.

void FTIFF_GetHashdbvar ( unsigned char *  hash,
FTIFF_dbvar *  dbvar
)

Computes the hash of the FTI-FF data chunk meta data structure.

Parameters
hash     Hash to compute.
dbvar    Data chunk meta data.

void FTIFF_GetHashMetaInfo ( unsigned char *  hash,
FTIFF_metaInfo *  FTIFFMeta
)

Computes the hash of the FTI-FF file meta data structure.

Parameters
hash         Hash to compute.
FTIFFMeta    Checkpoint file meta data.
void FTIFF_InitMpiTypes ( )

Initializes the derived MPI data types used for FTI-FF.

void FTIFF_PrintDataStructure ( int  rank,
FTIT_execution *  FTI_Exec,
FTIT_dataset *  FTI_Data
)

int FTIFF_ReadDbFTIFF ( FTIT_configuration *  FTI_Conf,
FTIT_execution *  FTI_Exec,
FTIT_checkpoint *  FTI_Ckpt
)

Reads the datablock structure for the FTI File Format from the checkpoint file.

Parameters
FTI_Conf    Configuration metadata.
FTI_Exec    Execution metadata.
FTI_Ckpt    Checkpoint metadata.
Returns
integer FTI_SCES if successful.

Builds the meta data list from the checkpoint file for the FTI File Format.

int FTIFF_Recover ( FTIT_execution *  FTI_Exec,
FTIT_dataset *  FTI_Data,
FTIT_checkpoint *  FTI_Ckpt
)

Recovers protected data to the variable pointers for FTI-FF.

Parameters
FTI_Exec    Execution metadata.
FTI_Ckpt    Checkpoint metadata.
FTI_Data    Dataset metadata.
Returns
integer FTI_SCES if successful.

This function restores the data of the protected variables to the state of the last checkpoint. The function is called by the API function 'FTI_Recover'.

int FTIFF_RecoverVar ( int  id,
FTIT_execution *  FTI_Exec,
FTIT_dataset *  FTI_Data,
FTIT_checkpoint *  FTI_Ckpt
)

Recovers protected data to the variable pointer with the given id.

Parameters
id          Id of the protected variable.
FTI_Exec    Execution metadata.
FTI_Data    Dataset metadata.
FTI_Ckpt    Checkpoint metadata.
Returns
integer FTI_SCES if successful.

This function restores the protected variable with the given id to the state stored in the last checkpoint. The function is called by the API function 'FTI_RecoverVar'.
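
To make the idea of restoring a single variable by id concrete, here is a minimal sketch that copies every chunk belonging to one id from a file into the variable's memory. It is not the FTI implementation: the `demo_chunk` descriptor, its `fptr`/`dptr`/`size` fields, and `demo_recover_var` are hypothetical names chosen for illustration, and plain stdio calls stand in for whatever I/O ftiff.c actually performs.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical chunk descriptor (not the real FTIFF_dbvar). */
typedef struct {
    int    id;    /* id of the protected variable this chunk belongs to */
    long   fptr;  /* offset of the chunk inside the checkpoint file */
    size_t dptr;  /* offset of the chunk inside the variable's memory */
    size_t size;  /* chunk size in bytes */
} demo_chunk;

/* Copy all chunks with the given id from `fp` into `dest`.
 * Returns 0 on success, -1 on an I/O error or if the id was not found. */
static int demo_recover_var(FILE *fp, const demo_chunk *chunks, size_t n,
                            int id, void *dest) {
    assert(fp != NULL && dest != NULL);
    int found = 0;
    for (size_t i = 0; i < n; ++i) {
        if (chunks[i].id != id) continue;  /* chunk of another variable */
        if (fseek(fp, chunks[i].fptr, SEEK_SET) != 0) return -1;
        if (fread((char *)dest + chunks[i].dptr, 1, chunks[i].size, fp)
                != chunks[i].size) return -1;
        found = 1;
    }
    return found ? 0 : -1;
}
```

Recovering all variables, as FTIFF_Recover is described to do, would amount to running the same loop without the id filter and with one destination pointer per variable.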

int FTIFF_SerializeDbMeta ( FTIFF_db *  db,
char *  buffer_ser
)

Serializes FTI-FF file data block meta data.

Parameters
db            FTI-FF file data block meta data.
buffer_ser    Serialized file data block meta data.

int FTIFF_SerializeDbVarMeta ( FTIFF_dbvar *  dbvar,
char *  buffer_ser
)

Serializes FTI-FF data chunk meta data.

Parameters
dbvar         FTI-FF data chunk meta data.
buffer_ser    Serialized data chunk meta data.

int FTIFF_SerializeFileMeta ( FTIFF_metaInfo *  meta,
char *  buffer_ser
)

Serializes FTI-FF file meta data.

Parameters
meta          FTI-FF file meta data.
buffer_ser    Serialized file meta data.
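
The serialize and deserialize functions come in matching pairs that share a (struct pointer, `buffer_ser`) signature. One common way to implement such a pair is to copy each field into the buffer at a running offset, which keeps compiler padding out of the stored bytes. The sketch below shows that pattern on a hypothetical two-field record; `demo_meta`, its fields, and both functions are assumptions for illustration and do not reflect the actual FTI-FF structs or their on-disk layout.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical metadata record (not the real FTIFF_metaInfo). */
typedef struct {
    int  numvars;
    long dbsize;
} demo_meta;

/* Size of the serialized form: the fields back to back, without padding. */
#define DEMO_META_SER_SIZE (sizeof(int) + sizeof(long))

/* Copy the fields one by one into buffer_ser. Returns 0 on success. */
static int demo_serialize(const demo_meta *m, char *buffer_ser) {
    if (m == NULL || buffer_ser == NULL) return -1;
    size_t pos = 0;
    memcpy(buffer_ser + pos, &m->numvars, sizeof m->numvars);
    pos += sizeof m->numvars;
    memcpy(buffer_ser + pos, &m->dbsize, sizeof m->dbsize);
    pos += sizeof m->dbsize;
    assert(pos == DEMO_META_SER_SIZE);
    return 0;
}

/* Read the fields back in the same order. Returns 0 on success. */
static int demo_deserialize(demo_meta *m, const char *buffer_ser) {
    if (m == NULL || buffer_ser == NULL) return -1;
    size_t pos = 0;
    memcpy(&m->numvars, buffer_ser + pos, sizeof m->numvars);
    pos += sizeof m->numvars;
    memcpy(&m->dbsize, buffer_ser + pos, sizeof m->dbsize);
    pos += sizeof m->dbsize;
    assert(pos == DEMO_META_SER_SIZE);
    return 0;
}
```

With this pattern the caller is responsible for providing a buffer of at least the serialized size, and both directions must agree on the field order.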

int FTIFF_UpdateDatastructFTIFF ( FTIT_execution *  FTI_Exec,
FTIT_dataset *  FTI_Data,
FTIT_configuration *  FTI_Conf
)

Updates the datablock structure for the FTI File Format.

Parameters
FTI_Exec    Execution metadata.
FTI_Data    Dataset metadata.
FTI_Conf    Configuration metadata.
Returns
integer FTI_SCES if successful.

Updates information about the checkpoint file. Updates file pointers in the dbvar structures and updates the db structure.

int FTIFF_WriteFTIFF ( FTIT_configuration *  FTI_Conf,
FTIT_execution *  FTI_Exec,
FTIT_topology *  FTI_Topo,
FTIT_checkpoint *  FTI_Ckpt,
FTIT_dataset *  FTI_Data
)

Writes the checkpoint to local storage or the PFS using FTI-FF.

Parameters
FTI_Conf    Configuration metadata.
FTI_Exec    Execution metadata.
FTI_Topo    Topology metadata.
FTI_Ckpt    Checkpoint metadata.
FTI_Data    Dataset metadata.
Returns
integer FTI_SCES if successful.

FTI-FF structure:

  +--------------+ +------------------------+
  |              | |                        |
  |     FB       | |           VB           |
  |              | |                        |
  +--------------+ +------------------------+

The FB (file block) holds meta data related to the file, whereas the VB (variable block) holds the meta data and the actual data of the variables protected by FTI.

  |<------------------------------------ VB ------------------------------------>|

  |<------------- VCB_1 ------------->|      |<------------- VCB_n ------------->|

  +-----------------------------------+      +-----------------------------------+
  | +-------++-------+      +-------+ |      | +-------++-------+      +-------+ |
  | |       ||       |      |       | |      | |       ||       |      |       | |
  | | VMB_1 || VC_11 | ---- | VC_1k | | ---- | | VMB_n || VC_n1 | ---- | VC_nl | |
  | |       ||       |      |       | |      | |       ||       |      |       | |
  | +-------++-------+      +-------+ |      | +-------++-------+      +-------+ |
  +-----------------------------------+      +-----------------------------------+

VMB_i (FTIFF_db + FTIFF_dbvar structures) keeps the data block metadata, and VC_ij are the data chunks.
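
The layout described above can be modeled as a list of blocks, each carrying its metadata followed by its data chunks. The sketch below walks such a list to compute the size of the VB and the number of bytes stored for one variable. All type and field names (`demo_block`, `demo_dbvar`, `numvars`, `chunksize`) and the metadata sizes passed in are hypothetical; the real FTIFF_db and FTIFF_dbvar structures contain more fields than shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor of one data chunk VC_ij. */
typedef struct {
    int    id;         /* id of the protected variable */
    size_t chunksize;  /* payload bytes stored in this chunk */
} demo_dbvar;

/* Hypothetical block VCB_i: metadata (VMB_i) plus its chunk descriptors. */
typedef struct demo_block {
    int                numvars;  /* number of chunks in this block */
    demo_dbvar        *dbvars;   /* one descriptor per chunk */
    struct demo_block *next;     /* next block, or NULL */
} demo_block;

/* Total size of the VB: for each block, the block metadata, one metadata
 * record per chunk, and the chunk payloads. */
static size_t demo_vb_size(const demo_block *first,
                           size_t db_meta_size, size_t dbvar_meta_size) {
    size_t total = 0;
    for (const demo_block *b = first; b != NULL; b = b->next) {
        assert(b->numvars >= 0);
        total += db_meta_size + (size_t)b->numvars * dbvar_meta_size;
        for (int i = 0; i < b->numvars; ++i) {
            total += b->dbvars[i].chunksize;
        }
    }
    return total;
}

/* Bytes stored for one variable, summed over all blocks that hold a
 * chunk of it. */
static size_t demo_var_bytes(const demo_block *first, int id) {
    size_t total = 0;
    for (const demo_block *b = first; b != NULL; b = b->next) {
        for (int i = 0; i < b->numvars; ++i) {
            if (b->dbvars[i].id == id) total += b->dbvars[i].chunksize;
        }
    }
    return total;
}
```

In this model a variable may own chunks in several blocks, which is why `demo_var_bytes` has to visit every block rather than stop at the first match.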

Variable Documentation

MPI_Datatype FTIFF_MpiTypes[FTIFF_NUM_MPI_TYPES]