Understanding the Nature of Metadata – A Deep Insight in the Literature (Preprint)
BACKGROUND Metadata are created to describe the corresponding data in a detailed and unambiguous way and are used for various applications in different research areas, e.g. data identification and classification. A clear definition of metadata is therefore crucial for further use. However, experience with the processing and management of metadata has shown that the term "metadata" and its use are not always unambiguous.
OBJECTIVE The goal of this study was to understand the nature of metadata definitions and their resulting impact on information reuse.
METHODS A systematic literature search was conducted in accordance with the PRISMA guidelines for reporting systematic reviews. Five research questions were identified to streamline the review process, addressing the characteristics of metadata, metadata standards, use cases, and encountered problems. The review was preceded by a harmonization process to achieve a common understanding of the terms used.
RESULTS The harmonization process resulted in a clear set of definitions for metadata processing, with a focus on data integration. The subsequent literature review was conducted by ten reviewers with different backgrounds, using the harmonized definitions. After several filtering steps to identify the most relevant papers, the review included 81 peer-reviewed papers from the last decade. All five research questions could be answered, resulting in a broad overview of standards, use cases, problems, and corresponding solutions for the application of metadata in different research areas.
CONCLUSIONS Metadata can be a powerful tool for identifying, describing, and processing information, but their meaningful creation is costly and challenging. The review process identified many standards, use cases, problems, and solutions in dealing with metadata and gave a broad overview of the topic. The harmonized definitions and the new schema should improve the classification and creation of metadata by enabling a common understanding of metadata and their context.