
Sentiment analysis of text incorporating emojis : (Record no. 169828)

MARC details
000 -LEADER
fixed length control field	10616namaa22004091i 4500
003 - CONTROL NUMBER IDENTIFIER
control field	OSt
005 - DATE AND TIME OF LATEST TRANSACTION
control field	20250108125459.0
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field	241229s2024 |||a|||f m||| 000 0 eng d
040 ## - CATALOGING SOURCE
Original cataloguing agency	EG-GICUC
Language of cataloging	eng
Transcribing agency	EG-GICUC
Modifying agency	EG-GICUC
Description conventions	rda
041 0# - LANGUAGE CODE
Language code of text/sound track or separate title	eng
Language code of summary or abstract	eng
--	ara
049 ## - Acquisition Source
Acquisition Source	Deposit
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number	005.31
092 ## - LOCALLY ASSIGNED DEWEY CALL NUMBER (OCLC)
Classification number	005.31
Edition number	21
097 ## - Degree
Degree	M.Sc
099 ## - LOCAL FREE-TEXT CALL NUMBER (OCLC)
Local Call Number	Cai01.18.02.M.Sc.2024.Mo.L
100 0# - MAIN ENTRY--PERSONAL NAME
Authority record control number or standard number	Mona Mohamed Abd ElSalam,
Preparation	preparation.
245 10 - TITLE STATEMENT
Title	Sentiment analysis of text incorporating emojis :
Remainder of title	Machine Learning Approach /
Statement of responsibility, etc.	by Mona Mohamed Abd ElSalam ; Supervised by Prof. Dr. Hesham Ahmed Hefny, Dr. Ahmed Mohammed Gadallah.
246 15 - VARYING FORM OF TITLE
Title proper/short title	: تحليل المشاعر للنص المحتوى على رموز تعبيرية
Remainder of title	/ أسلوب تعلم الآلة
264 #0 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
Date of production, publication, distribution, manufacture, or copyright notice	2024.
300 ## - PHYSICAL DESCRIPTION
Extent	123 leaves :
Other physical details	illustrations ;
Dimensions	30 cm. +
Accompanying material	CD.
336 ## - CONTENT TYPE
Content type term	text
Source	rdacontent
337 ## - MEDIA TYPE
Media type term	Unmediated
Source	rdamedia
338 ## - CARRIER TYPE
Carrier type term	volume
Source	rdacarrier
502 ## - DISSERTATION NOTE
Dissertation note	Thesis (M.Sc)-Cairo University, 2024.
504 ## - BIBLIOGRAPHY, ETC. NOTE
Bibliography, etc. note	Bibliography: pages 96-109.
520 ## - SUMMARY, ETC.
Summary, etc.	People now routinely use emojis in text to convey their sentiments or to summarize their words. Prior artificial intelligence (AI) approaches considered only the word order of the text; emoticons, pictures, and emoticons combined with text were disregarded, so many sentiments were overlooked. Sentiment analysis studies texts, such as posts and reviews, that users upload to microblogging services, social media platforms, forums, and e-commerce sites, to determine the opinions they express about a product, service, event, person, or idea. Most tools still struggle to judge precisely whether a statement is negative, neutral, or positive, especially given the heavy use of emojis in customer reviews. Accordingly, more flexible, context-sensitive sentiment analysis approaches are needed for texts that include emojis.
This thesis proposes a sentiment analysis approach for text whose main aim is to automate the rating of assessments expressed as unstructured information, an important issue today. The main goal is to demonstrate the effect of emojis on text classification. This is examined using two different data sets and several classifiers evaluated with different metrics, once for the tweet text only and once for the tweets combined with emojis. In addition, sarcasm is a common linguistic phenomenon in online writing that expresses personal thoughts. Sarcasm detection is crucial and advantageous for many NLP applications, including sentiment analysis, opinion mining, and advertising. Accordingly, the data are gathered, the text data are explored and preprocessed, different algorithms and features are applied to the text, and evaluation metrics are computed.
The proposed approach is implemented with several machine learning classifiers (Random Forest, Support Vector Machine, Gaussian Naïve Bayes, Logistic Regression, Gradient Boosting, and K-Nearest Neighbors), applied both to text with emojis and to text only. Two different data sets were used to evaluate the proposed approach.
The first data set consists of 1000 text tweets about Covid-19, combined with 1000 random emojis. The data set is divided into 70% for training and 30% for testing. The proposed approach classifies the combined tweets with an accuracy of 0.95, F-score of 0.95, precision of 0.99, and recall of 0.96. When the emojis are removed from the first data set, classification performance falls to 0.45, 0.47, 0.67, and 0.45 for accuracy, F-score, precision, and recall, respectively. This indicates the benefit of adding emojis to text tweets. A second experiment was performed to show the efficacy of the proposed approach.
The second data set concerns the evaluation of airline services and consists of 12000 text tweets. The proposed approach is tested on this data set both with text-only tweets and with the tweets combined with 12000 emojis. With the data set divided into 70% for training and 30% for testing, the combined tweets yield 0.96, 0.95, 0.96, and 0.96 for accuracy, F-score, precision, and recall, respectively. When the emoji patterns are removed, classification performance drops to 0.39, 0.53, 0.81, and 0.41 for accuracy, F-score, precision, and recall, respectively.
Therefore, the experimental evaluation shows that the proposed approach of adding emojis to text tweets is quite effective at improving sentiment classification.
520 ## - SUMMARY, ETC.
Summary, etc.	في الوقت الحاضر، يستخدم الناس الرموز التعبيرية في نصوصهم للتعبير عن مشاعرهم أو تلخيص كلماتهم. كانت استراتيجيات الذكاء الاصطناعي السابقة تتضمن فقط ترتيب النص، أو الرموز التعبيرية، أو الصور، أو الرموز التعبيرية مع النص، والتي تم تجاهلها دائمًا، مما أدى إلى التغاضي عن عدد كبير من المشاعر. يدرس تحليل المشاعر مشكلة دراسة النصوص، مثل المنشورات والمراجعات، التي يتم تحميلها من قبل المستخدمين على المدونات الصغيرة ومنصات التواصل الاجتماعي والمنتديات والشركات الإلكترونية، فيما يتعلق بآراءهم حول منتج أو خدمة أو حدث أو شخص أو فكرة. لا يزال من الصعب على الغالبية العظمى من الأدوات إجراء تقييم دقيق لما هو بيان سلبي ومحايد وإيجابي، خاصة مع الاستخدام المفرط لأشكال الرموز التعبيرية في مراجعات العملاء. وبناءً على ذلك، هناك حاجة إلى أساليب تحليل المشاعر الأكثر مرونة والتي تراعي السياق بالنسبة للنصوص بما في ذلك الرموز التعبيرية.<br/>تهدف هذه الأطروحة إلى منهج تحليل المشاعر للنص بشكل أساسي للاستفادة من ميكنة معدل التقييمات باعتبارها معلومات غير منظمة والتي أصبحت قضية مهمة اليوم. الهدف الرئيسي هو التحقق من فعالية الرموز التعبيرية على النص، والتي يتم فحصها باستخدام مجموعتي بيانات مختلفتين وتطوير عدة مصنفات بمصفوفات مختلفة لكل تغريدة مرة واحدة فقط وأخرى للتغريدات والرموز التعبيرية. من ناحية أخرى، يعد استخدام السخرية ظاهرة لغوية شائعة في الكتابة عبر الإنترنت والتي تعبر عن الأفكار الشخصية. يعد اكتشاف السخرية أمرًا بالغ الأهمية ومفيدًا للعديد من تطبيقات البرمجة اللغوية العصبية، بما في ذلك تحليل المشاعر واستخراج الآراء والإعلانات. لذلك، يتم جمع البيانات، ويتم استكشاف البيانات النصية ومعالجتها، ويتم تطبيق خوارزميات وميزات مختلفة على النص، ويتم تنفيذ مقاييس التقييم.<br/>يتم تنفيذ النهج المقترح من خلال مصنفات مختلفة للتعلم الآلي (Random Forest، Support Vector Machine، Gaussain Naïve Bayes، Logistic Regression، Gradient Boosting، K-Nearset Neighbors) للنص والرموز التعبيرية وللنص فقط. تم استخدام مجموعتين مختلفتين من البيانات لتقييم النهج المقترح.<br/>تتعلق مجموعة البيانات الأولى بتغريدات كوفيد-19. يحتوي على 1000 تغريدة نصية. تم دمج مجموعة البيانات هذه مع 1000 رمز تعبيري عشوائي. 
تنقسم مجموعة البيانات إلى 70% تدريب و30% اختبار. تم استخدام الطريقة المقترحة لتصنيف التغريدات المجمعة الجديدة الضبط accuracy 0.95 وf-score 0.95 و precision 0.99 و re-call 0.96. لقد وجد أنه عند حذف الرموز التعبيرية من مجموعة البيانات الأولى، أصبح أداء التصنيف 0.45 و0.47 و0.67 و0.45 Accuracy و F-score و precision و re-call على التوالي. وهذا يضمن فوائد إضافة الرموز التعبيرية إلى التغريدات النصية. وقد تم إجراء تجربة أخرى لإظهار فعالية النهج المقترح.<br/>وتتعلق مجموعة البيانات الثانية بتقييم خدمات شركات الطيران. وهي تتألف من 12000 تغريدة نصية. تم اختبار النهج المقترح لتصنيف مجموعة البيانات هذه في حالة التغريدات النصية فقط وعند دمجها مع 12000 رمز تعبيري. تظهر النتيجة أنه عند تقسيم مجموعة البيانات إلى 70% للتدريب و30% للاختبار نجد أن 0.96 و0.95 و0.96 و0.96 لكل من Accuracy و F-score و precision و re-call على التوالي. من ناحية أخرى، عند حذف أنماط الرموز التعبيرية، يتم تقليل أداء التصنيف إلى: 0.39 و0.53 و0.81 و0.41 لكل من Accuracy و F-score و precision و re-call على التوالي.<br/>لذلك، أظهر التقييم التجريبي أن النهج المقترح لإضافة الرموز التعبيرية إلى التغريدات النصية قوي جدًا في التأثير على تصنيف المشاعر
530 ## - ADDITIONAL PHYSICAL FORM AVAILABLE NOTE
Issues CD	Issued also as CD.
546 ## - LANGUAGE NOTE
Text Language	Text in English and abstract in Arabic & English.
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element	Machine Learning
Source of heading or term	qrmak
653 #0 - INDEX TERM--UNCONTROLLED
Uncontrolled term	sentiment analysis
--	artificial intelligence
--	text
--	emoticons
--	feelings
--	negative
--	neutral
--	positive
--	sarcasm
--	machine learning
--	evaluation metrics
--	classifiers
700 0# - ADDED ENTRY--PERSONAL NAME
Personal name	Hesham Ahmed Hefny
Relator term	thesis advisor.
700 0# - ADDED ENTRY--PERSONAL NAME
Personal name	Ahmed Mohammed Gadallah
Relator term	thesis advisor.
900 ## - Thesis Information
Grant date	01-01-2024
Supervisory body	Hesham Ahmed Hefny
--	Ahmed Mohammed Gadallah
Universities	Cairo University
Faculties	Faculty of Graduate Studies for Statistical Research
Department	Department of Computer Sciences
905 ## - Cataloger and Reviser Names
Cataloger Name	Shimaa
Reviser Names	Huda
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme	Dewey Decimal Classification
Koha item type	Thesis
Edition	21
Suppress in OPAC	No

Holdings
Source of classification or shelving scheme	Home library	Current library	Date acquired	Inventory number	Full call number	Barcode	Date last seen	Effective from	Koha item type
Dewey Decimal Classification	New Central Library - Cairo University	Theses Hall - First Floor	29.12.2024	90100	Cai01.18.02.M.Sc.2024.Mo.L	01010110090100000	29.12.2024	29.12.2024	Thesis