Fault tolerance scheme for some mathematical models /
مخطط للتسامح مع الخطأ لبعض النماذج الرياضية
Omnia Ismail Mohammad Ismail ; Supervised by Laila F. Abdelal , Nasser H. Sweilam , Hatem M. Moharram
- Cairo : Omnia Ismail Mohammad Ismail , 2015
- 86 P. : charts ; 25cm
Thesis (M.Sc.) - Cairo University - Faculty of Science - Department of Mathematics
This thesis has two purposes. The first is to study the numerical solution of fractional-order differential equations on computer clusters and to measure the efficiency of the solution algorithm when it is run on such clusters under a parallel programming model. The second is to detect and handle faults that may occur while the solution algorithm is running. A parallel Crank-Nicolson finite difference method (P-CN-FDM) is presented for solving the time-fractional parabolic equation on distributed memory systems. The resulting large sparse system of equations is solved using a parallel preconditioned conjugate gradient algorithm (PPCG) implemented with a two-level parallel programming model. A series of tests was carried out on a Linux PC cluster using different problem sizes and different numbers of processes and nodes. The proposed algorithm shows a substantial improvement in total execution time and memory utilization compared with previously proposed techniques. An online algorithm-based fault tolerance technique (online ABFT) for detecting soft errors in Krylov subspace methods is presented. The proposed technique is explained using the preconditioned conjugate gradient method (PCG). Experimental results showed a good improvement in execution time compared with the disk-based checkpointing technique.
Algorithm based fault tolerance ; Diskless checkpointing ; Fault tolerance