
Intermediate data management for map-reduce system / Marwah Nihad Abdullah Mohammed ; supervised by Fatma A. Omara, Mohamed H. Khafagy

Material type: Text
Language: English
Publication details: Cairo : Marwah Nihad Abdullah Mohammed, 2015
Description: 110 p. : charts ; 30 cm
Other title:
  • (MapReduce) إدارة البيانات الوسيطة لنظام خريطة الحد [Added title page title]
Available additional physical forms:
  • Issued also as CD
Dissertation note: Thesis (M.Sc.) - Cairo University - Faculty of Computers and Information - Department of Computer Science
Summary: Analyzing Big Data has become a significant activity for many organizations. This analysis is simplified by the MapReduce framework, by execution environments such as Hadoop, and by parallel systems such as Hive. On the other hand, most MapReduce users have complex analysis queries that are expressed as individual MapReduce jobs. With high-level query languages such as Pig, Hive, and Jaql, a complex user query is expressed as a workflow of MapReduce jobs. This thesis is concerned with how to reuse previous results stored in the Hive output file, in the same or different sessions, to improve Hive performance. This is done by introducing two algorithms. The first is called HOME (HiveQL Optimization in Multi-Session Environment). To evaluate the HOME algorithm, it was implemented and run with 19 different SQL statements to reduce I/O in MapReduce jobs. In developing the HOME algorithm, a new HiveQL execution architecture based on materialized previous results was proposed.
Holdings
Item type | Current library | Home library | Call number | Copy number | Status | Date due | Barcode
Thesis | University Theses Hall - First Floor | New Central Library - Cairo University | Cai01.20.03.M.Sc.2015.Ma.I | | Not for loan | | 01010110067766000
CD-ROM | University Theses Storage - Basement | New Central Library - Cairo University | Cai01.20.03.M.Sc.2015.Ma.I | 67766.CD | Not for loan | | 01020110067766000

