Model for Anticipating Frank Failures in Computing Grids
Ramadane Adamou Yougouda, Vivient Corneille Kamla, Laurent Bitjoka
Computing grids are infrastructures that provide very large computing capacity. They are now
used in all fields, from the study of pandemics to the monitoring of rocket trajectories and the study of
meteorological and climatic phenomena. Their distributed and heterogeneous architecture
gives them very high computing performance. They are made up of many computing nodes that are
subject to failures, including frank failures. A frank failure in a computing grid is an abnormal and unexpected
interruption of a node. Many frank failure tolerance protocols have been proposed in the literature,
but none of them integrates the anticipation of frank failures. The objective of this article is
to propose a frank failure tolerance model for computing grids, based on the PDEVS formalism, that
allows frank failures to be anticipated. The proposed model relies on the temperature variation of
electronic components and on the state of the hard disk through the values provided by SMART data
to predict a probable frank failure of a node. The results of the simulations that we carried out
on different scenarios show that our model performs better than those proposed in the
literature when the number of nodes to tolerate is greater than 200.
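
To make the prediction idea concrete, the following minimal Python sketch flags a node as likely to fail when its component temperature rises sharply or when its disk reports degraded SMART values. It is only an illustration under stated assumptions, not the PDEVS model of the article: the names NodeReading and failure_suspected, the choice of SMART attributes 5 (Reallocated Sectors Count) and 197 (Current Pending Sector Count), and all thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Illustrative thresholds only; these are assumptions, not values from the article.
MAX_TEMP_RISE_C = 10.0   # allowed rise between oldest and newest sample (degrees C)
MAX_REALLOCATED = 10     # allowed raw value of SMART attribute 5
MAX_PENDING = 0          # allowed raw value of SMART attribute 197


@dataclass
class NodeReading:
    """Hypothetical snapshot of one grid node's health indicators."""
    temps_c: List[float]       # recent component temperatures, oldest first
    reallocated_sectors: int   # SMART attribute 5, raw value
    pending_sectors: int       # SMART attribute 197, raw value


def temperature_variation(temps_c: List[float]) -> float:
    """Return the change from the oldest to the newest temperature sample."""
    if len(temps_c) < 2:
        return 0.0
    return temps_c[-1] - temps_c[0]


def failure_suspected(reading: NodeReading) -> bool:
    """Flag the node if it heats up quickly or its disk looks degraded."""
    overheating = temperature_variation(reading.temps_c) > MAX_TEMP_RISE_C
    disk_degraded = (
        reading.reallocated_sectors > MAX_REALLOCATED
        or reading.pending_sectors > MAX_PENDING
    )
    return overheating or disk_degraded


if __name__ == "__main__":
    healthy = NodeReading([45.0, 46.0, 45.5], reallocated_sectors=0, pending_sectors=0)
    heating = NodeReading([45.0, 52.0, 61.0], reallocated_sectors=0, pending_sectors=0)
    print(failure_suspected(healthy), failure_suspected(heating))
```

In a grid setting, a scheduler could use such a flag to migrate or checkpoint the tasks of a suspected node before it actually stops; a simple threshold rule is used here only because it is easy to follow.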