Acronyms expansion disambiguation and their effect on NLP tasks / Akram Gaballah Ahmed Almatarky ; Supervised Amr Ahmed Badr , Emad Nabil Hassan

Material type: Text
Language: English
Publication details: Cairo : Akram Gaballah Ahmed Almatarky, 2016
Description: 113 p. : facsimiles ; 30 cm
Other title:
  • معرفة معاني الاختصارات و تأثيرها على مهام معالجة اللغات الطبيعية (Identifying the meanings of abbreviations and their effect on natural language processing tasks) [Added title page title]
Available additional physical forms:
  • Issued also as CD
Dissertation note: Thesis (M.Sc.) - Cairo University - Faculty of Computers and Information - Department of Computer Science
Summary: Nonstandard words such as proper nouns, abbreviations, and acronyms are a major obstacle in natural language text processing and information retrieval. Acronyms, in particular, are difficult to read and process because they are often domain-specific and highly polysemous. In this work, we propose a language modeling approach for the automatic disambiguation of acronym senses using context information. First, a dictionary of all possible expansions of acronyms is generated automatically. The dictionary is used to look up all candidate expansions (senses) of a given acronym. The extracted dictionary consists of about 17 thousand acronym-expansion pairs covering 1,829 acronyms from different fields, with an average of 9.47 expansions per acronym. Training data is collected automatically from documents downloaded from the results of search engine queries. The collected data is used to build a language model of the context of each candidate expansion. The expansion contexts are filtered to retain only the terms that yield the highest information gain. Expansions from different acronyms are grouped together based on the similarity between their contexts. In the in-context expansion prediction phase, the relevance of each candidate expansion is calculated from the similarity between the context of the specific acronym occurrence and the language model of that candidate. Unlike other work in the literature, our approach can decline to expand an acronym when it is not confident in the disambiguation.
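Illustration: the summary describes scoring each candidate expansion by how well its language model matches the context of an acronym occurrence, with the option to decline when confidence is low. The following is only a minimal sketch of that general idea, not the thesis's implementation. It assumes add-one-smoothed unigram models, a uniform prior over candidates, and a hypothetical `min_confidence` threshold; the function names and the toy training contexts are invented for illustration, and the information-gain filtering and expansion grouping steps are left out.

```python
import math
from collections import Counter


def train_unigram_models(contexts_by_expansion):
    """Build one unigram count model per candidate expansion (toy stand-in)."""
    vocab = {tok for docs in contexts_by_expansion.values()
             for doc in docs for tok in doc.lower().split()}
    models = {}
    for expansion, docs in contexts_by_expansion.items():
        counts = Counter(tok for doc in docs for tok in doc.lower().split())
        models[expansion] = (counts, sum(counts.values()))
    return models, vocab


def disambiguate(context, models, vocab, min_confidence=0.7):
    """Return the best-scoring expansion, or None to decline expanding."""
    tokens = [t for t in context.lower().split() if t in vocab]
    if not tokens:
        return None  # no known context words: decline to expand
    log_scores = {}
    for expansion, (counts, total) in models.items():
        # Add-one smoothing so unseen words do not zero out a candidate.
        log_scores[expansion] = sum(
            math.log((counts[t] + 1) / (total + len(vocab))) for t in tokens
        )
    # Turn log-likelihoods into a posterior under a uniform prior (assumption).
    top = max(log_scores.values())
    weights = {e: math.exp(s - top) for e, s in log_scores.items()}
    norm = sum(weights.values())
    best = max(weights, key=weights.get)
    return best if weights[best] / norm >= min_confidence else None


# Hypothetical training contexts for two senses of "ATM".
training = {
    "automated teller machine": ["bank cash withdraw card",
                                 "cash machine bank account"],
    "asynchronous transfer mode": ["network cell switching packet",
                                   "network protocol cell bandwidth"],
}
models, vocab = train_unigram_models(training)
print(disambiguate("withdraw cash from the bank ATM", models, vocab))
print(disambiguate("the network bank", models, vocab))
```

With this toy data the first call selects "automated teller machine", while the second context is equally compatible with both senses, so the function returns `None` rather than guessing.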
Holdings
Item type | Current library | Home library | Call number | Copy number | Status | Barcode
Thesis | Theses Hall - First Floor | New Central Library - Cairo University | Cai01.20.03.M.Sc.2016.Ah.A | | Not for loan | 01010110070817000
CD-ROM | Theses Storage - Basement | New Central Library - Cairo University | Cai01.20.03.M.Sc.2016.Ah.A | 70817.CD | Not for loan | 01020110070817000

