An enhanced deterministic error correction model for optical character recognition degraded Arabic text / Mariam Adel Abdelhady Muhammad ; Supervised by Mervat Gheith, Tarek Elghazaly, Mustafa Ezzat

Material type: Text
Language: English
Publication details: Cairo : Mariam Adel Abdelhady Muhammad, 2016
Description: 109 leaves ; 30 cm
Other title:
  • نموذج محسن لتصحيح الأخطاء المحددة فى النص العربى منخفض الدقة الناتج عـن نظم التعرف الضوئية [Added title page title]
Available additional physical forms:
  • Issued also as CD
Dissertation note: Thesis (M.Sc.) - Cairo University - Institute of Statistical Studies and Research - Department of Computer and Information Science
Summary: Spelling correction of optical character recognition (OCR) output has recently been one of the main focuses of natural language processing research. The challenges of the Arabic language and the lack of resources have made it difficult to build Arabic OCR systems with high accuracy. Post-processing techniques are used to correct degraded Arabic OCR text. This research presents a new correction model for Arabic OCR errors. The proposed model is based mainly on character segmentation and character alignment over a single character or multiple characters. This research investigates four factors that can affect the proposed model: (i) the effect of increasing the size of the training set, (ii) the effect of adding the words of the training and test sets to the dictionary used to find the correct words among the candidate words, (iii) the effect of using different versions of the OCR application in testing, and (iv) the effect of using different fonts in testing. The results show that the first and second factors have a positive effect, while the third and fourth factors have a negative effect on the performance of the model. The results also show that the proposed model contributes to enhancing the performance of the model.
Holdings
Item type | Current library | Home library | Call number | Copy number | Status | Date due | Barcode
Thesis | قاعة الرسائل الجامعية - الدور الاول | المكتبة المركزبة الجديدة - جامعة القاهرة | Cai01.18.02.M.Sc.2016.Ma.O | | Not for loan | | 01010110071393000
CD - Rom | مخـــزن الرســائل الجـــامعية - البدروم | المكتبة المركزبة الجديدة - جامعة القاهرة | Cai01.18.02.M.Sc.2016.Ma.O | 71393.CD | Not for loan | | 01020110071393000
