Mohammed Ramadan,

Employing Machine Learning in Query Optimization / توظيف تعلم الآلة في أمثلية الاستعلام / by Mohammed Ramadan ; Supervision Prof. Dr. Ihab Ezzat, Prof. Dr. Hoda M. O. Mokhtar, Dr. Ayman Elkilany. - 71 leaves : illustrations ; 30 cm. + CD.

Thesis (M.Sc)-Cairo University, 2024.

Bibliography: pages 66-71.

With the current availability of massive datasets and scalability requirements,
different systems are required to provide their users with the best performance
possible in terms of speed. At the physical level, performance translates into
query execution time in database management systems (DBMS). Queries have to
execute efficiently (i.e., in minimum time) to meet users’ needs, which puts a
heavy burden on the DBMS. This thesis focuses mainly on enhancing the
query optimizer, the DBMS component that is responsible
for choosing the optimal query execution plan and consequently determines the query
execution time. Inspired by recent research in reinforcement learning in different
domains, this thesis proposes the Deep Reinforcement Learning Based Query Optimizer
(RL_QOptimizer), a new approach that finds the best join-order policy for the query
plan, relying solely on the reward system of reinforcement learning. The
experimental results show a notable advantage of the proposed approach over the
existing query optimization model of the PostgreSQL DBMS. However, changes in the
data distribution can make trained reinforcement learning models outdated, resulting
in longer execution times. To address this challenge, the thesis also proposes an
online training strategy that extends existing reinforcement learning models
and improves their adaptation when the data distribution changes.

مع الازدياد المستمر في حجم قواعد البيانات والحاجة إلى التوسع، أصبح من الضروري تطوير أنظمة توفر أفضل أداء ممكن من حيث السرعة. على المستوى المادي، يمكن ترجمة الأداء إلى وقت تنفيذ الاستعلامات في أنظمة إدارة قواعد البيانات (DBMS). يتعين تنفيذ الاستعلامات بأعلى درجات الكفاءة، أي في أقل وقت ممكن، لتلبية المتطلبات المتزايدة للمستخدمين، الأمر الذي يفرض ضغوطاً كبيرة على أنظمة إدارة قواعد البيانات. في هذه الرسالة، نولي اهتمامًا خاصًا لتطوير وتحسين مُحسِّن الاستعلام، العنصر الأساسي في أنظمة إدارة قواعد البيانات وهو المسؤول عن انتقاء أنسب خطة لتنفيذ الاستعلامات، مما يؤثر بشكل مباشر على مدة تنفيذ هذه الاستعلامات. مستوحاة من التطورات الأخيرة في مجال التعلم المعزز (Reinforcement Learning) في مجالات مختلفة، تقترح هذه الأطروحة مُحسِّن الاستعلام القائم على التعلم المعزز العميق (RL_QOptimizer)، وهو نهج جديد يعتمد على نظام المكافآت في التعلم المعزز لتحديد أفضل طريقة لترتيب ربط الجداول في خطة الاستعلام. تُظهر النتائج التجريبية تفوقًا واضحًا للنهج المقترح مقارنةً بنموذج تحسين الاستعلام في نظام PostgreSQL DBMS. ومع ذلك، فإن التغييرات في توزيع البيانات يمكن أن تجعل نماذج التعلم المعزز المدربة قديمة، مما ينتج عنه أوقات تنفيذ أطول. لمواجهة هذا التحدي، تقترح الأطروحة أيضًا استراتيجية تدريب مباشر لتعزيز قدرات نماذج التعلم المعزز القائمة وتحسين قدرتها على التكيف مع التغيرات في توزيع البيانات.

Text in English and abstract in Arabic & English.

Subjects--Topical Terms:
Machine Learning

Subjects--Index Terms: Join Ordering Problem; Query Execution Plan; Query Optimization

Dewey Class. No.: 005.31