Towards holistic technique for a completely Arabic word OCR system /

Farhan Mohammed Ali Nashwan

Towards holistic technique for a completely Arabic word OCR system / نحو تقنية شاملة لنظام متكامل للتعرف الضوئى على الكلمات العربية Farhan Mohammed Ali Nashwan ; Supervised Mohsen A. Rashwan - Cairo : Farhan Mohammed Ali Nashwan , 2014 - 105 P. ; 30cm

Thesis (Ph.D.) - Cairo University - Faculty of Engineering - Department of Electronics and Communication

First, a simple holistic approach to Arabic OCR is presented. It captures information about the whole Arabic word in order to reduce the vocabulary that the OCR classifier engine must consider. A clustering accuracy of 99.% is achieved by selecting a few word candidates (115 words per cluster on average) from a large lexicon of more than 356K words, a vocabulary size that gives good coverage of the Arabic language. The problem facing the OCR classifier is therefore greatly reduced, and much higher accuracy can be expected from OCR systems. Second, an Arabic OCR system using the holistic approach was implemented. This preliminary system, which is font-size independent, achieved good results.
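The lexicon-reduction idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the record lists "Discrete cosine transform" only as a keyword, so the use of low-frequency 2D DCT coefficients as the whole-word descriptor, the nearest-centroid cluster assignment, and all function names below are assumptions made for illustration.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(img):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct_1d(r) for r in img]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major order


def holistic_features(img, k=3):
    """Assumed descriptor: the k x k lowest-frequency DCT coefficients."""
    coeffs = dct_2d(img)
    return [coeffs[u][v] for u in range(k) for v in range(k)]


def nearest_cluster(feat, centroids):
    """Index of the centroid closest to feat (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(feat, centroids[c]))


def candidate_words(img, centroids, clusters, k=3):
    """Return only the words of the cluster that the word image falls into."""
    return clusters[nearest_cluster(holistic_features(img, k), centroids)]


def reduction_factor(lexicon_size, avg_cluster_size):
    """How many times smaller the classifier's vocabulary becomes."""
    return lexicon_size / avg_cluster_size
```

With the figures quoted in the abstract, `reduction_factor(356_000, 115)` is roughly 3,100, so under these assumptions the classifier would choose among about a hundred candidates instead of the full lexicon.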



Data reduction ; Discrete cosine transform ; Holistic Arabic word OCR system