Intermediate data management for map-reduce system / Marwah Nihad Abdullah Mohammed ; Supervised by Fatma A. Omara, Mohamed H. Khafagy

By:

Marwah Nihad Abdullah Mohammed

Contributor(s):

Material type: Text

Language: English

Publication details: Cairo : Marwah Nihad Abdullah Mohammed, 2015

Description: 110 p. : charts ; 30 cm

Other title:

(MapReduce) إدارة البيانات الوسيطة لنظام خريطة الحد [Added title page title]

Subject(s):

Available additional physical forms:

Issued also as CD

Dissertation note: Thesis (M.Sc.) - Cairo University - Faculty of Computers and Information - Department of Computer Science

Summary: Analyzing Big Data has emerged as a significant activity for many organizations. This analysis is simplified by the MapReduce framework and by execution environments such as Hadoop and parallel systems such as Hive. On the other hand, most MapReduce users have complex analysis queries that are expressed as individual MapReduce jobs. With high-level query languages such as Pig, Hive, and Jaql, a user's complex query is expressed as workflows of MapReduce jobs. This thesis is concerned with how to reuse previous results stored in the Hive output file, in the same or different sessions, to improve Hive performance. This is done by introducing two algorithms. The first is called HOME (HiveQL Optimization in Multi-Session Environment). To evaluate the HOME algorithm, it was implemented using 19 different SQL statements to reduce I/O in MapReduce jobs. Building on the HOME algorithm, a new HiveQL execution architecture based on materialized previous results is proposed.


Holdings
Item type	Current library	Home library	Call number	Copy number	Status	Date due	Barcode
Thesis	قاعة الرسائل الجامعية - الدور الاول	المكتبة المركزبة الجديدة - جامعة القاهرة	Cai01.20.03.M.Sc.2015.Ma.I		Not for loan		01010110067766000
CD - Rom	مخـــزن الرســائل الجـــامعية - البدروم	المكتبة المركزبة الجديدة - جامعة القاهرة	Cai01.20.03.M.Sc.2015.Ma.I	67766.CD	Not for loan		01020110067766000

