
Fusion of inconsistent integrated data using source qualifications /

Reham Ibrahim Abdelmonem Ibrahim

Fusion of inconsistent integrated data using source qualifications / استخدام مؤهلات المصادر لدمج البيانات الغير متناسقة / Reham Ibrahim Abdelmonem Ibrahim ; Supervised by Ali Hamed Elbastawissy, Mohamed Medhat Elwakil - Cairo : Reham Ibrahim Abdelmonem Ibrahim, 2017 - 139 leaves ; 30 cm

Thesis (M.Sc.) - Cairo University - Faculty of Computer and Information - Department of Information Systems

The need to access information held in multiple, distributed, and heterogeneous data sources, and to obtain it in a uniform way and with high quality, is increasing, because high-quality data supports better decisions and more accurate and reliable processes. Data integration (DI) is a process in which the data needed to answer a query is collected from distributed and heterogeneous data sources and provided to users in a single form. Each source has a set of local constraints and quality properties (quality measures), such as accuracy, completeness, timeliness, validity, response time, and cost of access, which can be used to improve query processing. The query answers of a data integration system can be improved by evaluating the quality of the data sources according to their quality measures and constraints, and then retrieving results only from the most important sources. The answers can also be improved by ranking them according to the quality level the user requires in the query and presenting them within a reasonable time. The contribution of this thesis is a data integration framework that improves the outputs of data integration systems, especially when some of the data sources contain dirty data and when the number of data sources is large. The framework is built on "both as view (BAV)", a mapping technique that is considered the best mapping technique because it does not have any of the disadvantages of the "global as view (GAV)" and "local as view (LAV)" mapping techniques. The framework allows integrity constraints to be admitted in the global schema and the local schemas in order to provide consistent answers. The framework is based on calculating and storing a set of quality measures of the data sources.
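The idea of evaluating sources by their quality measures and querying only the important ones can be illustrated with a minimal sketch. It is not the thesis's framework: the weighted-average score, the normalization of every measure to [0, 1], the threshold, and all names (SourceQuality, quality_score, select_sources) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceQuality:
    """Hypothetical stored quality measures of one data source, each in [0, 1]."""
    name: str
    accuracy: float       # higher is better
    completeness: float   # higher is better
    timeliness: float     # higher is better
    validity: float       # higher is better
    response_time: float  # normalized, lower is better
    cost: float           # normalized cost of access, lower is better


# Measures where a smaller value means a better source (an assumption).
LOWER_IS_BETTER = ("response_time", "cost")


def quality_score(source, weights):
    """Weighted average of the measures, with cost-like measures inverted."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    score = 0.0
    for measure, weight in weights.items():
        value = getattr(source, measure)
        if measure in LOWER_IS_BETTER:
            value = 1.0 - value  # turn "lower is better" into "higher is better"
        score += weight * value
    return score / total_weight


def select_sources(sources, weights, min_score):
    """Return names of sources scoring at least min_score, best first."""
    scored = [(quality_score(s, weights), s.name) for s in sources]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in kept]


# Illustrative data only; these values do not come from the thesis.
sources = [
    SourceQuality("C", 0.2, 0.2, 0.2, 0.2, 0.8, 0.8),
    SourceQuality("A", 0.9, 0.9, 0.9, 0.9, 0.1, 0.1),
    SourceQuality("B", 0.5, 0.5, 0.5, 0.5, 0.5, 0.5),
]
equal_weights = {
    "accuracy": 1.0, "completeness": 1.0, "timeliness": 1.0,
    "validity": 1.0, "response_time": 1.0, "cost": 1.0,
}
selected = select_sources(sources, equal_weights, min_score=0.4)
```

In this sketch the weights stand in for the quality level a user might request in a query: raising the weight of accuracy, for example, favors accurate sources over cheap or fast ones, and the threshold controls how many sources are consulted.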



Data integration ; Data quality ; Data sources