Haidi Emade El-Dean Hassan,

Enhanced deep domain adaptation techniques for text classification / نظام محسن لتهيئة وتكيف التعليم العميق في تصنيف النصوص / by Haidi Emade El-Dean Hassan ; Supervision Prof. Magda Fayek, Prof. Nayer Wanas. - 103 pages : illustrations ; 30 cm. + CD.

Thesis (Ph.D)-Cairo University, 2024.

Bibliography: pages 92-103.

Recent advancements in domain adaptation have demonstrated significant
success in transferring knowledge from a source domain to a target domain,
particularly in unsupervised domain adaptation (UDA), where labeled data
is only available in the source domain. A primary method in UDA involves
identifying shared features between domains. However, challenges such as
feature stability and concurrent handling of both domains remain.
This thesis tackles these challenges in text classification by proposing three
innovative models: WS-UDA, UDA-SP, and FlexAdapt. WS-UDA focuses on
discovering deeper and more stable features for the target domain through
sequential steps. It has demonstrated robust performance in adapting to
diverse and dissimilar domains, consistently achieving accuracy improvements
of 2.23%, 3.32%, and 1.9% on the Amazon reviews, FDU-MTL, and Spam
datasets, respectively. The sequential approach of WS-UDA enhances its
adaptability to varying levels of domain dissimilarity.
UDA-SP incorporates a source feature extractor to balance performance
between source and target domains. By leveraging the available source
samples, UDA-SP effectively adapts to closely related domains while
preserving the performance on the source domain. This model highlights the
importance of maintaining a trade-off between source and target performance
post-adaptation. UDA-SP shows improvements of 2.25% on Amazon reviews,
2.75% on FDU-MTL, and 1.08% on Spam datasets.
FlexAdapt integrates adversarial training with dynamic weighting based
on domain similarity. FlexAdapt excels in scenarios involving both similar
and dissimilar domains by capitalizing on the unique strengths of WS-UDA
and UDA-SP. The incorporation of similarity-based weighting ensures
balanced performance across varying domain dissimilarities. FlexAdapt shows
improvements of 4.09% on Amazon reviews, 4.14% on FDU-MTL, and 3.22%
on Spam datasets.
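
The abstract does not state how FlexAdapt computes its similarity-based weights, so the following is only a minimal sketch of what such dynamic weighting could look like, not the thesis's implementation. It assumes a hypothetical similarity proxy (cosine similarity between the mean feature vectors of a source batch and a target batch, rescaled to [0, 1]) and a hypothetical blending rule in which a loss suited to similar domains is weighted by that similarity and a loss suited to dissimilar domains by its complement. All function and parameter names are illustrative.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(batch: Sequence[Vector]) -> List[float]:
    """Average a non-empty batch of equal-length feature vectors."""
    n = len(batch)
    return [sum(column) / n for column in zip(*batch)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def domain_similarity(source_feats: Sequence[Vector],
                      target_feats: Sequence[Vector]) -> float:
    """Assumed proxy: cosine similarity of batch means, mapped to [0, 1]."""
    cos = cosine_similarity(mean_vector(source_feats), mean_vector(target_feats))
    return (cos + 1.0) / 2.0


def blended_loss(similar_domain_loss: float,
                 dissimilar_domain_loss: float,
                 similarity: float) -> float:
    """Assumed rule: weight the two loss terms by similarity and its complement."""
    w = min(max(similarity, 0.0), 1.0)  # clamp the weight to [0, 1]
    return w * similar_domain_loss + (1.0 - w) * dissimilar_domain_loss


if __name__ == "__main__":
    source = [[1.0, 0.0], [0.8, 0.2]]
    target = [[0.0, 1.0], [0.2, 0.8]]
    sim = domain_similarity(source, target)
    print(f"similarity={sim:.3f}, loss={blended_loss(2.0, 4.0, sim):.3f}")
```

Under these assumptions, a similarity near 1 lets the term suited to closely related domains dominate, while a similarity near 0 shifts the weight to the term suited to dissimilar domains.
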
The findings of this thesis suggest that domain similarity plays a significant
role in the adaptation process. Models such as WS-UDA perform exceptionally
well in scenarios with significant domain dissimilarities, while UDA-SP
excels in closely related domains. FlexAdapt demonstrates versatility by
maintaining high performance across a spectrum of domain similarities.
Overall, this thesis provides valuable insights into the effectiveness of these
models in enhancing domain adaptation. It lays a foundation for future
research to further improve UDA techniques and address complex domain
adaptation challenges.

This research proposes advanced models for unsupervised domain adaptation that aim to transfer knowledge from a domain with labeled data (the source domain) to a domain with unlabeled data (the target domain) in sentiment analysis. The thesis addresses challenges such as feature stability and the concurrent handling of both domains in text classification by proposing three models: WS-UDA, UDA-SP, and FlexAdapt.
WS-UDA strengthens adaptation to diverse and dissimilar domains through sequential steps. UDA-SP includes a source feature extractor to balance performance between the source and target domains. FlexAdapt combines WS-UDA and UDA-SP with dynamic weighting based on domain similarity, and excels in both similar and dissimilar domains. These models provide valuable insights and lay a foundation for future research on unsupervised domain adaptation. The proposed models were evaluated on three different datasets. WS-UDA showed an average improvement of 2.48%, UDA-SP achieved an average improvement of 2.02%, and FlexAdapt showed an average improvement of 3.82%.

Text in English and abstract in Arabic & English.

Subjects--Topical Terms:
Computer Engineering

Subjects--Index Terms: Unsupervised Domain Adaptation; Domain Adaptation; Text Classification; Sentiment Analysis cross domains; Spam Filtering cross domains; Feature Stability; Adversarial Training; Machine Learning

Dewey Class. No.: 621.39