Ahmed Ibrahim Moussa Hussein Ali

Improving the automatic summarization of Arabic text depending on rhetorical structure theory / تحسين التلخيص الآلي للنصوص العربية اعتمادًا على نظرية التركيب البياني Ahmed Ibrahim Moussa Hussein Ali ; Supervised Mervat Gheith , Laila Nassef , Tarek Elghazaly - Cairo : Ahmed Ibrahim Moussa Hussein Ali , 2014 - 114 Leaves : charts ; 25cm

Thesis (M.Sc.) - Cairo University - Institute of Statistical Studies and Research - Department of Computer and Information Sciences

Nowadays, numerous documents, reports and articles are available in digital form. Consequently, search engines retrieve an abundance of information, and users and agencies are flooded with an overwhelming number of emails and documents. Such retrieved documents therefore need to be summarized, and in this information explosion automatic text summarization proves to be an essential tool. Nevertheless, the key problem with automatic text summarization is that the resulting summary is incoherent and deviates from the context of the original text. This problem emerges when statistical techniques are used for summarization. This thesis instead uses a semantic technique by adopting Rhetorical Structure Theory (RST). RST is a descriptive theory of a major aspect of the organization of natural texts. It extracts the semantics behind a text by identifying its most significant parts. The thesis introduces an infrastructure for applying RST to Arabic by collecting the Arabic rhetorical relations from different resources to build the rhetorical structure theory. However, the quality of RST summarization suffers when dealing with large documents.



Arabic text ; Rhetorical structure theory ; RST