Arabic document layout analysis using machine learning and connected components based features / (Record no. 71181)
000 -LEADER | |
---|---|
fixed length control field | 02425cam a2200337 a 4500 |
003 - CONTROL NUMBER IDENTIFIER | |
control field | EG-GiCUC |
005 - DATE AND TIME OF LATEST TRANSACTION | |
control field | 20250223032236.0 |
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION | |
fixed length control field | 190402s2018 ua dh f m 000 0 eng d |
040 ## - CATALOGING SOURCE | |
Original cataloging agency | EG-GiCUC |
Language of cataloging | eng |
Transcribing agency | EG-GiCUC |
041 0# - LANGUAGE CODE | |
Language code of text/sound track or separate title | eng |
049 ## - LOCAL HOLDINGS (OCLC) | |
Holding library | Deposit |
097 ## - Thesis Degree | |
Thesis Level | M.Sc |
099 ## - LOCAL FREE-TEXT CALL NUMBER (OCLC) | |
Classification number | Cai01.13.08.M.Sc.2018.Ra.A |
100 0# - MAIN ENTRY--PERSONAL NAME | |
Personal name | Rana Sobhy Mostafa Saad |
245 10 - TITLE STATEMENT | |
Title | Arabic document layout analysis using machine learning and connected components based features / |
Statement of responsibility, etc. | Rana Sobhy Mostafa Saad ; Supervised by Neamt Sayed Abdelkader , Samia Abdelrazeq Mashaly |
246 15 - VARYING FORM OF TITLE | |
Title proper/short title | تحليل هيئة الوثائق العربية باستخدام تعلم الآلة و سمات المكونات المترابطة |
260 ## - PUBLICATION, DISTRIBUTION, ETC. | |
Place of publication, distribution, etc. | Cairo : |
Name of publisher, distributor, etc. | Rana Sobhy Mostafa Saad , |
Date of publication, distribution, etc. | 2018 |
300 ## - PHYSICAL DESCRIPTION | |
Extent | 122 P. : |
Other physical details | charts , facsimiles ; |
Dimensions | 30cm |
502 ## - DISSERTATION NOTE | |
Dissertation note | Thesis (M.Sc.) - Cairo University - Faculty of Engineering - Department of Electronics and Communications |
520 ## - SUMMARY, ETC. | |
Summary, etc. | Document Layout Analysis (DLA) is a key preprocessing stage for optical character recognition (OCR). It locates and defines the text and non-text regions of a document image. Arabic DLA has received less attention than DLA for other languages because of the lack of appropriate publicly available research datasets. A full DLA pipeline is composed of several stages: input document preprocessing, document Physical Layout Analysis (PLA), document Logical Layout Analysis (LLA), and document analysis output representation. In this thesis, geometric features of connected components (CCs) are used to represent Arabic document images. These CC features are classified by means of Support Vector Machine (SVM) and Random Forest (RF) classifiers into text and non-text components to perform PLA for scanned Arabic book pages. Experiments on BCE-v1 and other researchers' datasets showed remarkable performance for both the SVM-based and RF-based solutions. Comparison with other classical and state-of-the-art systems showed the strength of the proposed system and its promise for application to wider problem domains. |
530 ## - ADDITIONAL PHYSICAL FORM AVAILABLE NOTE | |
Additional physical form available note | Issued also as CD |
653 #4 - INDEX TERM--UNCONTROLLED | |
Uncontrolled term | Arabic dataset |
653 #4 - INDEX TERM--UNCONTROLLED | |
Uncontrolled term | Document Layout Analysis |
653 #4 - INDEX TERM--UNCONTROLLED | |
Uncontrolled term | Page segmentation |
700 0# - ADDED ENTRY--PERSONAL NAME | |
Personal name | Neamt Sayed Abdelkader , |
Relator term | |
700 0# - ADDED ENTRY--PERSONAL NAME | |
Personal name | Samia Abdelrazeq Mashaly , |
Relator term | |
856 ## - ELECTRONIC LOCATION AND ACCESS | |
Uniform Resource Identifier | http://172.23.153.220/th.pdf |
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN) | |
Cataloger | Nazla |
Reviser | Revisor |
905 ## - LOCAL DATA ELEMENT E, LDE (RLIN) | |
Cataloger | Shimaa |
Reviser | Cataloger |
942 ## - ADDED ENTRY ELEMENTS (KOHA) | |
Source of classification or shelving scheme | Dewey Decimal Classification |
Koha item type | Thesis |

Source of classification or shelving scheme | Not for loan | Home library | Current library | Date acquired | Full call number | Barcode | Date last seen | Koha item type | Copy number |
---|---|---|---|---|---|---|---|---|---|
Dewey Decimal Classification | | المكتبة المركزبة الجديدة - جامعة القاهرة | قاعة الرسائل الجامعية - الدور الاول | 11.02.2024 | Cai01.13.08.M.Sc.2018.Ra.A | 01010110077523000 | 22.09.2023 | Thesis | |
Dewey Decimal Classification | | المكتبة المركزبة الجديدة - جامعة القاهرة | مخـــزن الرســائل الجـــامعية - البدروم | 11.02.2024 | Cai01.13.08.M.Sc.2018.Ra.A | 01020110077523000 | 22.09.2023 | CD - Rom | 77523.CD |
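
The summary (field 520) describes representing a page by geometric features of its connected components before classifying them as text or non-text. The record does not list which features the thesis uses, so the following is only a minimal sketch under stated assumptions: the page is a small binary grid (1 = ink), components are 8-connected, and the features shown (bounding-box width and height, aspect ratio, pixel area, density) are illustrative choices, not the author's feature set. The function names are hypothetical.

```python
from collections import deque


def connected_components(image):
    """Return one list of (row, col) pixels per 8-connected foreground component."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def geometric_features(pixels):
    """Compute simple bounding-box features for one component (assumed feature set)."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {
        "width": width,
        "height": height,
        "aspect_ratio": width / height,
        "area": len(pixels),                        # number of ink pixels
        "density": len(pixels) / (width * height),  # ink share of the bounding box
    }


# Toy binary page: a 2x2 blob and a 1x3 vertical stroke.
page = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]
features = [geometric_features(p) for p in connected_components(page)]
```

In a pipeline like the one summarized, each feature dictionary would be turned into a vector and passed to a trained classifier such as an SVM or a Random Forest; that step is omitted here because it depends on labeled data and a library choice the record does not specify.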