GSMA: A Structural Matching Algorithm for Schema Matching in Data Warehousing

Author(s):  
Wei Cheng ◽  
Yufang Sun
Author(s):  
JIANGUO LU ◽  
JU WANG ◽  
SHENGRUI WANG

The XML Schema matching problem can be formulated as follows: given two XML Schemas, find the best mapping between the elements and attributes of the schemas, and the overall similarity between them. XML Schema matching is an important problem in data integration, schema evolution, and software reuse. This paper describes a matching system that finds accurate matches and scales to large XML Schemas with hundreds of nodes. In our system, XML Schemas are modeled as labeled, unordered trees, and the schema matching problem is turned into a tree matching problem. We propose the concept of Approximate Common Structures in trees and develop a tree matching algorithm based on it. Compared with the traditional tree edit-distance algorithm and other schema matching systems, our algorithm is faster and more suitable for matching large XML Schemas.
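
To make the tree formulation concrete, the following is a minimal sketch of schemas modeled as labeled, unordered trees with a simple similarity score. It is not the Approximate Common Structures algorithm from the paper: the `Node` class, the greedy child pairing, and the Dice-style score are all assumptions chosen for illustration, and labels are compared by exact equality, where a real matcher would use approximate name matching.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A schema element: a label plus an unordered collection of children."""
    label: str
    children: list = field(default_factory=list)


def size(node):
    """Number of nodes in the tree rooted at `node`."""
    return 1 + sum(size(child) for child in node.children)


def common_nodes(a, b):
    """Count nodes in a greedily built common structure rooted at a and b.

    Children are paired regardless of order. The pairing is greedy, so the
    result is a lower bound on the best possible common structure.
    """
    if a.label != b.label:
        return 0
    total = 1
    unused = list(range(len(b.children)))  # indices of b's unpaired children
    for child in a.children:
        best_index, best_score = None, 0
        for index in unused:
            score = common_nodes(child, b.children[index])
            if score > best_score:
                best_index, best_score = index, score
        if best_index is not None:
            unused.remove(best_index)
            total += best_score
    return total


def similarity(a, b):
    """Dice-style similarity in [0, 1] between two schema trees."""
    return 2 * common_nodes(a, b) / (size(a) + size(b))


# Two hypothetical purchase-order schemas with children in different order.
po1 = Node("order", [
    Node("customer", [Node("name"), Node("address")]),
    Node("item", [Node("price")]),
])
po2 = Node("order", [
    Node("item", [Node("price"), Node("quantity")]),
    Node("customer", [Node("name")]),
])
print(similarity(po1, po2))
```

Because children are paired without regard to position, reordering the elements of either schema does not change the score.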


2014 ◽  
Vol 70 (5) ◽  
Author(s):  
Thabit Sabbah ◽  
Ali Selamat

Thesauri are used in many Information Retrieval (IR) applications, such as data integration, data warehousing, semantic query processing, and classifiers. They have also been used to solve the schema matching problem. Because many thesauri exist for any given area of knowledge, the quality of schema matching results obtained with different thesauri in the same field is not predictable. In this paper, we propose a methodology for studying the performance of a thesaurus in solving schema matching. The paper also presents the results of experiments using different thesauri. Precision, recall, F-measure, and similarity average were calculated to show that the quality of matching changes according to the thesaurus used.
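
As an illustration of the evaluation measures named above, the sketch below computes precision, recall, and F-measure for a set of proposed matches against a reference mapping. The function name and the sample element pairs are hypothetical, and the abstract does not define "similarity average", so it is left out.

```python
def evaluate_matching(found, gold):
    """Return (precision, recall, f_measure) for proposed element pairs.

    `found` holds the pairs proposed by a matcher and `gold` holds the
    reference pairs. Both are iterables of (source, target) tuples.
    """
    found, gold = set(found), set(gold)
    true_positives = len(found & gold)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical matcher output and reference mapping.
found = {("a", "x"), ("b", "y"), ("c", "z"), ("d", "w")}
gold = {("a", "x"), ("b", "y"), ("e", "v")}
precision, recall, f_measure = evaluate_matching(found, gold)
print(precision, recall, f_measure)
```

Here two of the four proposed pairs are correct (precision 1/2) and two of the three reference pairs are recovered (recall 2/3), which gives an F-measure of 4/7.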


2011 ◽  
Vol 10 (03) ◽  
pp. 519-537 ◽  
Author(s):  
BEEN-CHIAN CHIEN ◽  
SHIANG-YI HE

Automatic schema matching plays a key role in manipulating the semantic web and integrating different data sources efficiently. A generic schema matching method generally includes two phases: the linguistic similarity matching phase and the structural similarity matching phase. Since linguistic matching is an essential step for effective schema matching, a highly accurate linguistic similarity matching scheme is required. In this paper, a schema matching approach called Similarity Yield Matcher (SYM) is proposed. In SYM, a lexical decision tree is presented to determine linguistic similarity in the first phase. A structural matching algorithm is then proposed to find the structural similarity between two tree schemas. The proposed approach was evaluated on several benchmarks of real schemas and compared with other methods. The experimental results show that the proposed lexical decision tree substantially improves both the effectiveness and the efficiency of linguistic similarity matching. The proposed SYM algorithm is also highly effective on 1–1 schema matching.
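
The linguistic phase and the 1–1 constraint can be sketched as follows. This is not SYM's lexical decision tree: as a stand-in, it scores element names with `difflib.SequenceMatcher` from the Python standard library and then selects a one-to-one mapping greedily. The function names, the threshold value, and the sample element names are assumptions for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] between two names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def one_to_one_matches(source, target, threshold=0.5):
    """Greedily select a 1-1 mapping from source names to target names.

    All pairs are scored, then taken from the highest score down. A pair
    is skipped if either name is already matched, and selection stops at
    the first pair whose score falls below `threshold`.
    """
    scored = sorted(
        ((name_similarity(s, t), s, t) for s in source for t in target),
        reverse=True,
    )
    used_source, used_target, mapping = set(), set(), {}
    for score, s, t in scored:
        if score < threshold:
            break
        if s not in used_source and t not in used_target:
            mapping[s] = t
            used_source.add(s)
            used_target.add(t)
    return mapping


# Hypothetical element names from two schemas.
source = ["CustomerName", "OrderDate", "Qty"]
target = ["customer_name", "order_date", "price"]
mapping = one_to_one_matches(source, target)
print(mapping)
```

In a two-phase matcher, scores like these would feed the structural phase instead of being used directly as the final mapping.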


2013 ◽  
Vol 2 (3) ◽  
pp. 173-186
Author(s):  
Jiyoon Lee ◽  
Sukhoon Lee ◽  
Jangwon Kim ◽  
Dongwon Jeong ◽  
Doo-Kwon Baik
