Arabic document layout analysis using machine learning and connected components based features / (Record no. 71181)

MARC details
000 -LEADER
fixed length control field 02425cam a2200337 a 4500
003 - CONTROL NUMBER IDENTIFIER
control field EG-GiCUC
005 - DATE AND TIME OF LATEST TRANSACTION
control field 20250223032236.0
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 190402s2018 ua dh f m 000 0 eng d
040 ## - CATALOGING SOURCE
Original cataloging agency EG-GiCUC
Language of cataloging eng
Transcribing agency EG-GiCUC
041 0# - LANGUAGE CODE
Language code of text/sound track or separate title eng
049 ## - LOCAL HOLDINGS (OCLC)
Holding library Deposite
097 ## - Thesis Degree
Thesis Level M.Sc
099 ## - LOCAL FREE-TEXT CALL NUMBER (OCLC)
Classification number Cai01.13.08.M.Sc.2018.Ra.A
100 0# - MAIN ENTRY--PERSONAL NAME
Personal name Rana Sobhy Mostafa Saad
245 10 - TITLE STATEMENT
Title Arabic document layout analysis using machine learning and connected components based features /
Statement of responsibility, etc. Rana Sobhy Mostafa Saad ; Supervised by Neamt Sayed Abdelkader, Samia Abdelrazeq Mashaly
246 15 - VARYING FORM OF TITLE
Title proper/short title تحليل هيئة الوثائق العربية باستخدام تعلم الآلة و سمات المكونات المترابطة
260 ## - PUBLICATION, DISTRIBUTION, ETC.
Place of publication, distribution, etc. Cairo :
Name of publisher, distributor, etc. Rana Sobhy Mostafa Saad ,
Date of publication, distribution, etc. 2018
300 ## - PHYSICAL DESCRIPTION
Extent 122 P. :
Other physical details charts , facsimiles ;
Dimensions 30cm
502 ## - DISSERTATION NOTE
Dissertation note Thesis (M.Sc.) - Cairo University - Faculty of Engineering - Department of Electronics and Communications
520 ## - SUMMARY, ETC.
Summary, etc. Document Layout Analysis (DLA) is a key preprocessing stage for optical character recognition (OCR). It locates and delineates the text and non-text regions of a document image. Arabic DLA has received less attention than DLA for other languages because appropriate publicly available research datasets are lacking. A full DLA pipeline is composed of several stages: input document preprocessing, document Physical Layout Analysis (PLA), document Logical Layout Analysis (LLA), and representation of the analysis output. In this thesis, geometric features of connected components (CCs) are used to represent Arabic document images. These CC features are classified by Support Vector Machine (SVM) and Random Forest (RF) classifiers into text and non-text components to perform PLA on scanned Arabic book pages. Experiments on BCE-v1 and on other researchers' datasets showed remarkable performance for both the SVM-based and RF-based solutions. Comparison with other classical and state-of-the-art systems showed the strength of the proposed system and its promise for application to wider problem domains.
530 ## - ADDITIONAL PHYSICAL FORM AVAILABLE NOTE
Additional physical form available note Issued also as CD
653 #4 - INDEX TERM--UNCONTROLLED
Uncontrolled term Arabic dataset
653 #4 - INDEX TERM--UNCONTROLLED
Uncontrolled term Document Layout Analysis
653 #4 - INDEX TERM--UNCONTROLLED
Uncontrolled term Page segmentation
700 0# - ADDED ENTRY--PERSONAL NAME
Personal name Neamt Sayed Abdelkader ,
Relator term
700 0# - ADDED ENTRY--PERSONAL NAME
Personal name Samia Abdelrazeq Mashaly ,
Relator term
856 ## - ELECTRONIC LOCATION AND ACCESS
Uniform Resource Identifier http://172.23.153.220/th.pdf
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN)
Cataloger Nazla
Reviser Revisor
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN)
Cataloger Shimaa
Reviser Cataloger
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme Dewey Decimal Classification
Koha item type Thesis
Holdings
Source of classification or shelving scheme | Not for loan | Home library | Current library | Date acquired | Full call number | Barcode | Date last seen | Koha item type | Copy number
Dewey Decimal Classification | | المكتبة المركزبة الجديدة - جامعة القاهرة | قاعة الرسائل الجامعية - الدور الاول | 11.02.2024 | Cai01.13.08.M.Sc.2018.Ra.A | 01010110077523000 | 22.09.2023 | Thesis |
Dewey Decimal Classification | | المكتبة المركزبة الجديدة - جامعة القاهرة | مخـــزن الرســائل الجـــامعية - البدروم | 11.02.2024 | Cai01.13.08.M.Sc.2018.Ra.A | 01020110077523000 | 22.09.2023 | CD - Rom | 77523.CD