An enhanced deterministic error correction model for optical character recognition degraded Arabic text / Mariam Adel Abdelhady Muhammad ; Supervised by Mervat Gheith, Tarek Elghazaly, Mustafa Ezzat

By:

Mariam Adel Abdelhady Muhammad

Material type: Text

Language: English

Publication details: Cairo : Mariam Adel Abdelhady Muhammad, 2016

Description: 109 leaves ; 30 cm

Other title:

نموذج محسن لتصحيح الأخطاء المحددة فى النص العربى منخفض الدقة الناتج عـن نظم التعرف الضوئية [Added title page title]

Available additional physical forms:

Issued also as CD

Dissertation note: Thesis (M.Sc.) - Cairo University - Institute of Statistical Studies and Research - Department of Computer and Information Science

Summary: Recently, spelling correction of optical character recognition (OCR) output has been one of the main focuses of natural language processing research. The challenges of the Arabic language and the lack of resources have made it difficult to build Arabic OCR systems with high accuracy. Post-processing techniques are used to correct degraded Arabic OCR text. This research presents a new correction model for Arabic OCR errors. The proposed model is based mainly on character segmentation and on character alignment over single characters or multi-character sequences. The research investigates four factors that can affect the proposed model: (i) increasing the size of the training set, (ii) adding the words of the training and test sets to the dictionary used to find the correct word among the candidate words, (iii) using different versions of the OCR application at test time, and (iv) using different fonts at test time. The results show that the first and second factors have a positive effect on the performance of the model, whereas the third and fourth factors have a negative effect. The results also show that the proposed model contributes to enhancing performance.

Holdings
Item type	Current library	Home library	Call number	Copy number	Status	Date due	Barcode
Thesis	Theses Hall - First Floor	New Central Library - Cairo University	Cai01.18.02.M.Sc.2016.Ma.O		Not for loan		01010110071393000
CD-ROM	Theses Storage - Basement	New Central Library - Cairo University	Cai01.18.02.M.Sc.2016.Ma.O	71393.CD	Not for loan		01020110071393000
