
Initial data reordering in MapReduce technique for specific data categories / Ahmed Abdelrahim Ali Eldouh ; Supervised by Hatem Elkadi, Mohamed Helmy Khafagy

Material type: Text
Language: English
Publication details: Cairo : Ahmed Abdelrahim Ali Eldouh, 2018
Description: 87 leaves : charts, facsimiles ; 30 cm
Other title:
  • إعادة ترتيب البيانات الاولية فى تقنية تصغير الخريطة لفئات بيانات محددة [Added title page title]
Available additional physical forms:
  • Issued also as CD
Dissertation note: Thesis (M.Sc.) - Cairo University - Faculty of Computers and Information - Department of Information System
Summary: The rapid growth of big data sets creates an urgent need to address the difficulty of storing and processing them. MapReduce is a programming model introduced by Google's team for storing and processing big data sets. Hadoop is open source software from Apache that includes an implementation of MapReduce. MapReduce requires a shuffling phase to exchange globally the intermediate data generated by the mapping phase, but this shuffling phase adds performance overhead. In this thesis, we survey the literature on shuffling and discuss previous techniques adopted to enhance the performance of MapReduce. We also focus on an approach that improves the performance of MapReduce by reducing the overhead caused by the shuffling phase. Improving the locality of data leads to eliminating the network overhead in the shuffling phase of MapReduce. We achieve this by pre-partitioning data according to query-based similarity, computed with the TF-IDF and cosine similarity algorithms, and by grouping related queries with the K-means clustering algorithm. We then supply HDFS with the related data and control where data are stored, so that related data files are collocated on the same nodes.
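The grouping step described in the summary (TF-IDF weighting, cosine similarity, then K-means) can be illustrated with a minimal sketch. This is not the thesis implementation: the sample queries, the node names, the smoothed IDF formula, the farthest-first seeding, and the use of cosine similarity as the K-means closeness measure are all assumptions chosen to keep the example small and deterministic.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Build one sparse TF-IDF vector (term -> weight) per text."""
    tokenized = [text.lower().split() for text in texts]
    n = len(tokenized)
    # Document frequency: number of texts that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption): log((1 + n) / (1 + df)) + 1.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_vector(vectors):
    """Component-wise mean of sparse vectors."""
    total = Counter()
    for vector in vectors:
        for term, weight in vector.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def kmeans_cosine(vectors, k, iterations=10):
    """K-means variant that assigns each vector to its most similar centroid."""
    # Deterministic farthest-first seeding (an assumption, not from the thesis).
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_vector(members)
    return labels


# Hypothetical queries and node names, for illustration only.
queries = [
    "select customer orders by customer id",
    "select customer orders by order date",
    "count page views by url",
    "count page views by day",
]
nodes = ["node-a", "node-b"]

labels = kmeans_cosine(tfidf_vectors(queries), k=2)
# Map each cluster to one node so related queries' data would land together.
placement = {q: nodes[label % len(nodes)] for q, label in zip(queries, labels)}
print(labels)
print(placement)
```

In this sketch the two order-related queries end up in one cluster and the two page-view queries in the other, so each group maps to a single node. How the resulting placement is actually enforced in HDFS is outside the scope of the sketch.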
Holdings
Item type: Thesis
Current library: University Theses Hall - First Floor
Home library: New Central Library - Cairo University
Call number: Cai01.20.04.M.Sc.2018.Ah.I
Status: Not for loan
Barcode: 01010110078233000

Item type: CD-ROM
Current library: University Theses Store - Basement
Home library: New Central Library - Cairo University
Call number: Cai01.20.04.M.Sc.2018.Ah.I
Copy number: 78233.CD
Status: Not for loan
Barcode: 01020110078233000

Cairo University Libraries Portal Implemented & Customized by: Eng. M. Mohamady Contacts: new-lib@cl.cu.edu.eg | cnul@cl.cu.edu.eg
© All rights reserved — Cairo University Libraries