
An intelligent data clustering model for a real application / Doaa Saleh Ali ; Supervised by Mohamed Saleh, Mohamed Rasmy, Ayman Ghoneim

Material type: Text
Language: English
Publication details: Cairo : Doaa Saleh Ali, 2017
Description: 195 leaves : charts ; 30 cm
Other title:
  • نموذج عنقودية البيانات الذكية مع التطبيق الواقعي [Added title page title]
Available additional physical forms:
  • Issued also as CD
Dissertation note: Thesis (Ph.D.) - Cairo University - Faculty of Computers and Information - Department of Operations Research and Decision Support
Summary: Data clustering, an important unsupervised technique in data mining, aims to identify interesting distributions and patterns in the underlying data. Cluster validity indices are used to evaluate the performance of clustering models. Some recent research has used cluster validity indices as the objective functions in a multiobjective framework in order to improve clustering performance. An interesting research question, therefore, is how to further improve clustering performance via cluster validity indices. We address this question with three main contributions. First, using new combinations of cluster validity indices, we introduce two new multiobjective data clustering models, for numerical and for categorical data. Based on our literature review, we select a combination of cluster validity indices (i.e. objective functions) for the proposed clustering models. The experimental results show that the proposed multiobjective data clustering models are effective at improving clustering performance. However, when forming a new combination of cluster validity indices for a given dataset, open research questions remain about which indices are best to use and what the best size of the combination is. The second contribution of the dissertation addresses these questions by proposing a hybrid meta-heuristic clustering (HMHC) methodology for computing the best combination of cluster validity indices for a given dataset. The HMHC methodology demonstrates its ability to compute a different and better-performing combination of indices for each benchmark dataset. To reduce the complexity of the HMHC methodology, we also introduce a way to filter the indices in the pool based on the data features of the dataset under consideration.
Finally, we offer recommendations for practitioners in the data clustering field, based on additional analyses of the experimental results using the concepts of the Shapley value and mutual information.
Holdings
Item type: Thesis
Current library: Theses Hall - First Floor
Home library: New Central Library - Cairo University
Call number: Cai01.20.02.Ph.D.2017.Do.I
Status: Not for loan
Barcode: 01010110074659000

Item type: CD-ROM
Current library: Theses Storage - Basement
Home library: New Central Library - Cairo University
Call number: Cai01.20.02.Ph.D.2017.Do.I
Copy number: 74659.CD
Status: Not for loan
Barcode: 01020110074659000

