
Standardization, enrichment, and computerization of Arabic-English Dictionaries / Diaa Eldin Mohamed Mohamed Elsayed Abofayed ; Supervised Aly Aly Fahmy , Mohsen Abdelrazek Rashwan , Wafaa Kamel Fayed

Material type: Text
Language: English
Publication details: Cairo : Diaa Eldin Mohamed Mohamed Elsayed Abofayed, 2016
Description: 135 leaves : charts ; 30 cm
Other title:
  • التوحيد القياسى و إثراء و ميكنة القاموس العربى - الإنجليزى [Added title page title]
Available additional physical forms:
  • Issued also as CD
Dissertation note: Thesis (Ph.D.) - Cairo University - Faculty of Computer and Information - Department of Computer Science
Summary: Building software systems or applications for natural language processing (NLP) often requires large quantities of rich information, which is stored in a lexicon, lexical database, or knowledge base. Constructing these lexicons or databases manually requires linguistic experts and takes considerable time, cost, and effort. Hence the need emerged for automatic methods of constructing lexicons or lexical databases from Machine Readable Dictionaries (MRDs). This research field takes two directions: computerizing traditional dictionaries directly, and extracting linguistic information to build lexicons or databases. In the early days of MRD research, the ultimate goal was to convert any traditional dictionary into a complete standalone lexicon or lexical knowledge base. However, MRD research concluded that MRDs are neither efficient nor sufficient, in terms of the quantity and quality of information, for building a complete standalone lexicon or lexical knowledge base. Consequently, a lexicon should be built from more than one lexical resource, such as dictionaries and corpora. Furthermore, the need for diverse resources to build a lexicon makes it necessary to share, integrate, and standardize these resources. Although MRD research has declined and research interest has shifted to corpora as better sources of lexical information, MRD research remains important for languages with few linguistic resources, such as Arabic. Arabic also has a particular reason to pursue MRD research: it has a rich and vast heritage of old and modern traditional dictionaries that need to be structured, computerized, and standardized.
This thesis proposes and implements a general methodology for the computerization, enrichment, and standardization of Arabic dictionaries in general and Arabic-English dictionaries in particular. The study comprises three tasks: (a) structuring the definitions of an Arabic-English dictionary and extracting lexical information from them; (b) enriching the linguistic information in the definitions already automated in the first task, by completing incomplete information or supplying new information; and (c) ISO standardization of all the information in the dictionary definitions as well as the enriched information.
Holdings
  • Item type: Thesis; Current library: University Theses Hall - First Floor; Home library: New Central Library - Cairo University; Call number: Cai01.20.03.Ph.D.2016.Di.S; Status: Not for loan; Barcode: 01010110069581000
  • Item type: CD-ROM; Current library: University Theses Storage - Basement; Home library: New Central Library - Cairo University; Call number: Cai01.20.03.Ph.D.2016.Di.S; Copy number: 69581.CD; Status: Not for loan; Barcode: 01020110069581000
