Advanced Machine Learning Applications Based On Speech Recognition Technology / Hany Ahmed Sayed Mansour ; Supervisor: Prof. Dr. Mohsen A. Rashwan.

By:

Hany Ahmed Sayed Mansour [preparation.]

Contributor(s):

Mohsen A. Rashwan [thesis advisor.]

Material type: Text

Language: English

Summary language: English, Arabic

Producer: 2023

Description: 72 pages : illustrations ; 30 cm. + CD

Content type:

text

Media type:

Unmediated

Carrier type:

volume

Other title:

/تطبيقات تعلم الآلة المتقدمة بناءً على تقنية التعرف على الكلام [Added title page title]

Subject(s):

DDC classification:

621.382

Available additional physical forms:

Issued also as CD.

Dissertation note: Thesis (Ph.D)-Cairo University, 2023.

Summary: Speech recognition systems are built from components such as acoustic modeling and language modeling, and these components can be reused in other applications and fields. For example, in the Optical Character Recognition (OCR) problem the acoustic model can be replaced by a spatial model while the same language modeling techniques are kept. Another problem is improving the performance of most Error-Correction (EC) algorithms that operate on genomic reads in the medical field, where language modeling techniques can also be applied. This thesis presents several speech technology techniques and shows how they benefit different applications. First, we propose an OCR system that handles both handwritten and typewritten text. Second, we use language modeling techniques to automatically tune the performance-sensitive configuration parameters of EC algorithms; using N-gram and Recurrent Neural Network (RNN) language modeling, we validate the intuition that EC performance can be computed quantitatively and efficiently. Finally, we propose a system that uses semi-supervised techniques to improve the quality of speech recognition models. This system competed in an international competition (MGB5) and won first place with a word accuracy of 63%, while the second-place system reached 58%.

Summary: بناءً على طبيعة أنظمة التعرف على الكلام ومكوناتها مثل النمذجة الصوتية ونمذجة اللغة، يمكننا إعادة استخدام هذه المكونات في تطبيقات مختلفة ومجالات مختلفة. على سبيل المثال، يمكن استبدال النمذجة الصوتية بالنموذج المكاني في مشكلة التعرف الضوئي على الحروف ويمكن استخدام تقنيات نمذجة اللغة نفسها في هذه الحالة. هناك مشكلة أخرى تتمثل في تحسين أداء معظم خوارزميات تصحيح الخطأ EC التي تعمل على قراءة الجينوميات في المجال الطبي. يمكننا استخدام تقنيات النمذجة اللغوية لتحسين أداء هذه الأدوات.
في هذه الأطروحة ، سوف نقدم تقنيات مختلفة لتقنيات الكلام وكيف يمكننا الاستفادة منها في تطبيقات مختلفة. أولاً ، اقترحنا نظام التعرف الضوئي على الحروف الذي يمكنه التعامل مع الكتابة اليدوية / المكتوبة على الآلة الكاتبة. ثانيًا ، استخدمنا تقنيات نمذجة اللغة لضبط معلمات التكوين الحساسة للأداء لخوارزميات تلقائيًا. باستخدام نمذجة لغة N-Gram والشبكة العصبية المتكررة، فإننا نتحقق من صحة الحدس القائل بأنه يمكن حساب أداء EC كميًا وفعالًا. أخيرًا ، اقترحنا نظامًا يستخدم تقنيات شبه خاضعة للإشراف لتحسين جودة نماذج التعرف على الكلام. تنافس هذا النظام في مسابقة دولية (MGB5) وفاز بالمركز الأول بدقة كلمة 63٪ بينما كان المركز الثاني 58٪.
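
The summary's second contribution, using language models to tune EC parameters, can be illustrated with a minimal sketch. It assumes (this is not stated in the record) that candidate parameter settings are compared by the perplexity of their corrected reads under a language model, with lower perplexity taken as better. The character-bigram model, the function names, the `k=3`/`k=5` labels, and the toy reads are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumed nucleotide alphabet for the toy example


def train_bigram(reads):
    """Count character bigrams and their left contexts over a list of reads."""
    pair_counts, context_counts = Counter(), Counter()
    for read in reads:
        for a, b in zip(read, read[1:]):
            pair_counts[(a, b)] += 1
            context_counts[a] += 1
    return pair_counts, context_counts


def perplexity(reads, model):
    """Perplexity of reads under an add-one-smoothed bigram model."""
    pair_counts, context_counts = model
    log_prob, n = 0.0, 0
    for read in reads:
        for a, b in zip(read, read[1:]):
            p = (pair_counts[(a, b)] + 1) / (context_counts[a] + len(ALPHABET))
            log_prob += math.log(p)
            n += 1
    return math.exp(-log_prob / n)


def pick_setting(model, corrected_by_setting):
    """Return the setting whose corrected reads have the lowest perplexity."""
    return min(
        corrected_by_setting,
        key=lambda s: perplexity(corrected_by_setting[s], model),
    )


# Toy data, invented for illustration only.
train_reads = ["ACGTACGTACGT", "CGTACGTACGTA"]
candidates = {
    "k=3": ["ACGTACGTACGA"],  # hypothetical EC output for one setting
    "k=5": ["ACGTACGTACGT"],  # hypothetical EC output for another setting
}
model = train_bigram(train_reads)
best = pick_setting(model, candidates)
```

In this toy setup the selection needs no ground-truth genome: each setting is scored only by how well its output fits the language model, which is one way to read the claim that EC performance can be computed quantitatively and efficiently.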

Holdings
Item type	Current library	Home library	Call number	Status	Barcode
Thesis	قاعة الرسائل الجامعية - الدور الاول	المكتبة المركزبة الجديدة - جامعة القاهرة	Cai01.13.08.Ph.D.2023.Ha.A	Not for loan	01010110090053000

Bibliography: pages 65-72.

Text in English and abstract in Arabic & English.
