Acronyms expansion disambiguation and their effect on NLP tasks / Akram Gaballah Ahmed Almatarky ; Supervised by Amr Ahmed Badr, Emad Nabil Hassan

By:

Akram Gaballah Ahmed Almatarky

Contributor(s):

Amr Ahmed Badr, Emad Nabil Hassan (supervisors)

Material type: Text

Language: English

Publication details: Cairo : Akram Gaballah Ahmed Almatarky, 2016

Description: 113 p. : facsimiles ; 30 cm

Other title: معرفة معاني الاختصارات و تأثيرها على مهام معالجة اللغات الطبيعية [Added title page title]

Available additional physical forms:

Issued also as CD

Dissertation note: Thesis (M.Sc.) - Cairo University - Faculty of Computers and Information - Department of Computer Science

Summary: Nonstandard words such as proper nouns, abbreviations, and acronyms are a major obstacle in natural language text processing and information retrieval. Acronyms, in particular, are difficult to read and process because they are often domain-specific with a high degree of polysemy. In this work, we propose a language modeling approach for the automatic disambiguation of acronym senses using context information. First, a dictionary of all possible expansions of acronyms is generated automatically. The dictionary is used to look up all possible expansions, or senses, of a given acronym. The extracted dictionary consists of about 17 thousand acronym-expansion pairs defining 1,829 expansions from different fields, where the average number of expansions per acronym was 9.47. Training data is automatically collected from downloaded documents identified from the results of search engine queries. The collected data is used to build a language model that models the context of each candidate expansion. The expansion contexts were filtered to retain only the terms that produce the highest information gain. Expansions from different acronyms were grouped together based on the similarity between their contexts. At the in-context expansion prediction phase, the relevance of each acronym expansion candidate is calculated based on the similarity between the context of the specific acronym occurrence and the language model of the candidate expansion. Unlike other work in the literature, our approach can decline to expand an acronym when it is not confident in the disambiguation.
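The prediction phase described in the summary can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a smoothed unigram language model per candidate expansion, scores an acronym's context by log-likelihood, and turns the scores into a confidence value so that low-confidence cases are rejected. The function names, the toy training sentences, and the `min_confidence` threshold are all hypothetical.

```python
import math
from collections import Counter


def train_unigram(docs):
    """Count terms over the training documents of one expansion (assumed whitespace tokens)."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def log_likelihood(tokens, counts, vocab_size, alpha=1.0):
    """Log-probability of the context under an add-alpha smoothed unigram model."""
    total = sum(counts.values())
    return sum(
        math.log((counts[t] + alpha) / (total + alpha * vocab_size))
        for t in tokens
    )


def disambiguate(context, models, min_confidence=0.7):
    """Return (expansion, confidence); expansion is None when confidence is too low."""
    tokens = context.lower().split()
    vocab = set(tokens)
    for counts in models.values():
        vocab.update(counts)
    scores = {
        exp: log_likelihood(tokens, counts, len(vocab))
        for exp, counts in models.items()
    }
    # Normalise the log-likelihoods into a distribution over candidates.
    top = max(scores.values())
    weights = {exp: math.exp(s - top) for exp, s in scores.items()}
    norm = sum(weights.values())
    best = max(weights, key=weights.get)
    confidence = weights[best] / norm
    if confidence < min_confidence:
        return None, confidence  # hypothetical reject option: decline to expand
    return best, confidence


# Toy training data, invented for illustration only.
models = {
    "automated teller machine": train_unigram([
        "the bank customer withdrew cash from the machine",
        "insert your card and pin to withdraw cash",
    ]),
    "asynchronous transfer mode": train_unigram([
        "network cells are switched over virtual circuits",
        "the protocol carries packets across the network",
    ]),
}

print(disambiguate("withdraw cash with your card at the ATM", models))
print(disambiguate("the ATM", models))
```

With informative context words the first call selects a banking expansion, while the second call has too little context to separate the candidates and returns `None`, mirroring the option to decline expansion mentioned in the summary.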


Holdings
Item type	Current library	Home library	Call number	Copy number	Status	Barcode
Thesis	Theses Hall - First Floor	New Central Library - Cairo University	Cai01.20.03.M.Sc.2016.Ah.A		Not for loan	01010110070817000
CD-ROM	Theses Store - Basement	New Central Library - Cairo University	Cai01.20.03.M.Sc.2016.Ah.A	70817.CD	Not for loan	01020110070817000


