An efficient replication technique for improving availability in Hadoop Distributed File System / Eyman Saleh Ali Abdanabi Abid ; supervised by Fatma A. Omara, Mohamed H. Khafagy

By: Eyman Saleh Ali Abdanabi Abid
Contributor(s): Fatma A. Omara ; Mohamed H. Khafagy
Material type: Text
Language: English
Publication details: Cairo : Eyman Saleh Ali Abdanabi Abid, 2016
Description: 87 leaves : charts, facsimiles ; 30 cm
Other title:
  • إيجاد اسلوب فعال للنسخ المتماثل لتحسين الاتاحة في نظام الملفات الموزعة (Hadoop) [Added title page title; English: Finding an efficient replication method to improve availability in the distributed file system (Hadoop)]
Available additional physical forms:
  • Issued also as CD
Dissertation note: Thesis (M.Sc.) - Cairo University - Faculty of Computers and Information - Department of Computer Science

Summary: The Hadoop Distributed File System (HDFS) is a core component of Apache Hadoop. In recent years, HDFS has become the most popular file system for Big Data computing because of its availability and fault tolerance. HDFS is designed to store, analyze, and transfer massive data sets reliably, to stream them at high bandwidth to user applications, and to provide high-throughput access to application data, making it well suited to applications with large data sets. HDFS is a variant of the Google File System (GFS). It handles fault tolerance through data replication: each data block is replicated and stored on multiple DataNodes, which gives HDFS both reliability and availability. The existing HDFS implementation in Hadoop performs replication in a pipelined manner, which takes considerable time; this pipelined replication scheme degrades the performance of the file write operation because of its time overhead. The work in this thesis concerns improving HDFS replication.
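The pipelined write described in the summary can be illustrated with a small client-side sketch. The Java snippet below is a minimal sketch only, assuming a reachable Hadoop cluster; the NameNode address hdfs://namenode:9000 and the path /data/sample.txt are hypothetical. It writes a file with a replication factor of three using the standard FileSystem API; with the stock HDFS client, each block is forwarded DataNode-to-DataNode through a pipeline, so the write completes only after the last replica has been acknowledged, which is the overhead the thesis sets out to reduce.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReplicatedWrite {
    public static void main(String[] args) throws Exception {
        // Hypothetical cluster address; replace with the real fs.defaultFS value.
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:9000");
        conf.set("dfs.replication", "3"); // HDFS's default replication factor

        try (FileSystem fs = FileSystem.get(conf);
             // Request three replicas explicitly for this file.
             FSDataOutputStream out = fs.create(new Path("/data/sample.txt"), (short) 3)) {
            // Each block of this stream is pushed to the first DataNode, which
            // forwards it to the second, and so on; acknowledgements travel back
            // along the same pipeline, so write latency grows with the number
            // of replicas.
            out.writeUTF("example payload");
        }
    }
}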
Holdings
Item type: Thesis
Current library: Theses Hall - First Floor (قاعة الرسائل الجامعية - الدور الاول)
Home library: New Central Library - Cairo University (المكتبة المركزية الجديدة - جامعة القاهرة)
Call number: Cai01.20.03.M.Sc.2016.Ey.E
Status: Not for loan
Barcode: 01010110070050000

Item type: CD-ROM
Current library: Theses Storeroom - Basement (مخزن الرسائل الجامعية - البدروم)
Home library: New Central Library - Cairo University (المكتبة المركزية الجديدة - جامعة القاهرة)
Call number: Cai01.20.03.M.Sc.2016.Ey.E
Copy number: 70050.CD
Status: Not for loan
Barcode: 01020110070050000

