Data quality evaluation: a comparative analysis of company registers’ open data in four European countries

Author(s):  
Janis Bicevskis ◽  
Zane Bicevska ◽  
Anastasija Nikiforova ◽  
Ivo Oditis
2020 ◽  
Vol 26 (1) ◽  
pp. 107-126
Author(s):  
Anastasija Nikiforova ◽  
Janis Bicevskis ◽  
Zane Bicevska ◽  
Ivo Oditis

The paper proposes a new data object-driven approach to data quality evaluation. It consists of three main components: (1) a data object, (2) data quality requirements, and (3) a data quality evaluation process. As data quality is relative in nature, the data object and quality requirements are (a) use-case dependent and (b) defined by the user in accordance with their needs. All three components of the presented data quality model are described using graphical Domain Specific Languages (DSLs). In accordance with Model-Driven Architecture (MDA), the data quality model is built in two steps: (1) creating a platform-independent model (PIM), and (2) converting the created PIM into a platform-specific model (PSM). The PIM comprises informal specifications of data quality. The PSM describes the implementation of the data quality model, thus making it executable and enabling data object scanning and the detection of data quality defects and anomalies. The proposed approach was applied to open data sets to analyse their quality. At least three advantages were highlighted: (1) a graphical data quality model allows data quality to be defined by non-IT and non-data-quality experts, as the presented diagrams are easy to read, create and modify; (2) the data quality model allows an analysis of "third-party" data without deeper knowledge of how the data were accrued and processed; (3) the quality of the data can be described at two or more levels of abstraction: informally using natural language, or formally by including executable artefacts such as SQL statements.
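As a rough illustration of the three components described above, the following Python sketch treats a data object as a record of named fields, the quality requirements as named predicates, and the evaluation process as a scan that reports every violated requirement. The field names (`reg_number`, `name`, `registered`), the sample records, the rules, and the `evaluate` helper are all hypothetical and are not taken from the paper, which specifies its models with graphical DSLs and SQL statements rather than Python.

```python
from datetime import date

# Hypothetical data object: one company-register record per dict.
records = [
    {"reg_number": "40003000001", "name": "Alpha SIA", "registered": date(2001, 5, 14)},
    {"reg_number": "", "name": "Beta SIA", "registered": date(2010, 1, 2)},
    {"reg_number": "40003000003", "name": "Gamma SIA", "registered": date(2999, 1, 1)},
]

# Hypothetical use-case-specific quality requirements: rule name -> predicate.
requirements = {
    "reg_number is present": lambda r: bool(r["reg_number"]),
    "reg_number has 11 digits": lambda r: r["reg_number"].isdigit()
    and len(r["reg_number"]) == 11,
    "registration date is not in the future": lambda r: r["registered"] <= date.today(),
}


def evaluate(records, requirements):
    """Scan every record and collect (record index, violated rule name) pairs."""
    defects = []
    for index, record in enumerate(records):
        for rule_name, predicate in requirements.items():
            if not predicate(record):
                defects.append((index, rule_name))
    return defects


defects = evaluate(records, requirements)
for index, rule_name in defects:
    print(f"record {index}: violates '{rule_name}'")
```

Because the requirements are kept separate from the data object, a different use case could supply a different rule set for the same records, which mirrors the relative, user-defined notion of quality in the abstract.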


2020 ◽  
Vol 11 (6) ◽  
Author(s):  
VALERII M. DRESHPAK ◽  
VIKTOR G. KOVALOV ◽  
NATALIIA V. BABACHENKO ◽  
EVGEN M. PAVLENKO

2021 ◽  
Vol 25 (4) ◽  
pp. 763-787
Author(s):  
Alladoumbaye Ngueilbaye ◽  
Hongzhi Wang ◽  
Daouda Ahmat Mahamat ◽  
Ibrahim A. Elgendy ◽  
Sahalu B. Junaidu

Knowledge extraction, data mining, e-learning and web application platforms use heterogeneous and distributed data. The proliferation of these multifaceted platforms brings many challenges, such as high scalability, the coexistence of complex similarity metrics, and the requirement of data quality evaluation. In this study, an extended complete formal taxonomy and several algorithms for detecting and correcting contextual data quality anomalies were developed and implemented on structured data. Our methods were effective in detecting and correcting more data anomalies than existing taxonomy techniques, and also highlighted the demerit of the Support Vector Machine (SVM). The proposed techniques will therefore be relevant to the detection and correction of errors in large contextual data (Big Data).


2014 ◽  
Author(s):  
Hongwei Wei ◽  
Qingjiu Tian ◽  
Yan Huang ◽  
Yan Wang

2017 ◽  
Vol 9 (1) ◽  
pp. 175-190 ◽  
Author(s):  
Miroslaw Moroz

The main purpose of the paper is to assess the degree of development of the digital economy in Poland in comparison with selected European countries. The research methodology is based on the analysis of secondary sources and the application of statistical methods. To make the comparison in a methodologically correct manner, synthetic measures of the development of the e-economy were used in the form of two indexes: NRI (Networked Readiness Index) and DESI (Digital Economy and Society Index). On the basis of available statistical data, four European countries were compared with Poland. The results of the analysis indicate a relatively unfavorable situation for Poland.


2021 ◽  
Vol 37 (37) ◽  
pp. 22-42
Author(s):  
Alicja Paluch ◽  
Henryk Spustek

The ever-increasing need for in-depth analysis and quantification of national power, in particular of ‘hard’ and ‘soft’ power-generating factors, together with the difficulty of identifying a comprehensive and effective method for its scientific determination, motivated the research presented in this article. The considerations presented aim to define the assumptions for a descriptive sub-model that would enable a comparison of Poland’s power in the economic sphere (a component of the non-military sphere) with the power of selected European countries. The research hypothesis is that, among the variety of descriptive variables in the economic sphere of national power, there is a subset of mutually independent variables that are also strongly correlated with national power, and that this subset makes it possible to define assumptions for the sub-model of national power. The steps of the research procedure were carried out using system analysis (multi-criteria comparative analysis) and statistical analysis. The research showed that the factors strongly correlated with national power in the economic area of the European countries adopted for the analysis are: the dynamics of industrial production, private sector credit flows and the economic freedom index. The comparative analysis demonstrates that the greatest increase in economic power in the analysed period took place in Germany (0.68). Slightly smaller growth was recorded in the Czech Republic (0.62) and Poland (0.60), while the lowest increase was in Romania (0.23). The qualitative comparative analysis of the economic power of selected European countries led to the conclusion that the independent variables identified are crucial for the formation of the economic power of the analysed countries. At the same time, a fairly strong position of the Czech Republic and Poland in relation to the economic power of Germany was found. The quantification of the economic power of the European countries provides a basis for correctly determining changes in the power distribution of political units and for assessing the power and resources held by the state.


2021 ◽  
Author(s):  
Huaqiang Zhong ◽  
Limin Sun ◽  
José Turmo ◽  
Ye Xia

In recent years, safety and comfort problems in bridges have not been uncommon, and the operating conditions of in-service bridges have received widespread attention. Many key long-span bridges have been fitted with structural health monitoring systems that have collected massive amounts of data. Monitoring data is the basis of structural damage identification and performance evaluation, so analyzing and evaluating its quality is of great significance. This paper takes the acceleration monitoring data of the main girder and arch rib of a long-span arch bridge as the research object, analyzes and summarizes the statistical characteristics of the data, identifies 6 abnormal data conditions, and proposes a data quality evaluation method based on a convolutional neural network. Frequency statistics of the acceleration vibration amplitude of the bridge in December 2018 are computed hour by hour. To highlight the end effect of the frequency statistics, the whole is amplified and used as network input for training and data quality evaluation. The results are good. The approach provides a new method for structural monitoring data quality evaluation and abnormal data elimination.
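The hourly frequency statistics mentioned above can be pictured as a per-hour histogram of acceleration amplitudes. The sketch below is a minimal illustration under assumed parameters: the bin edges, the sample values, and the `hourly_histograms` helper are hypothetical, and the amplification step and network architecture used in the paper are not reproduced here.

```python
from collections import defaultdict


def hourly_histograms(samples, bin_edges):
    """Count absolute acceleration amplitudes per hour into fixed bins.

    samples: iterable of (hour_index, acceleration) pairs.
    bin_edges: ascending edges; bin k covers [bin_edges[k], bin_edges[k + 1]).
    Amplitudes that fall outside the edges are ignored.
    """
    n_bins = len(bin_edges) - 1
    histograms = defaultdict(lambda: [0] * n_bins)
    for hour, acceleration in samples:
        amplitude = abs(acceleration)
        for k in range(n_bins):
            if bin_edges[k] <= amplitude < bin_edges[k + 1]:
                histograms[hour][k] += 1
                break
    return dict(histograms)


# Hypothetical readings: (hour index, acceleration value).
samples = [(0, 0.01), (0, -0.03), (0, 0.12), (1, 0.02), (1, 0.50)]
edges = [0.0, 0.05, 0.1, 0.2]

hist = hourly_histograms(samples, edges)
for hour in sorted(hist):
    print(hour, hist[hour])
```

Each hourly count vector could then serve as one fixed-length input to a classifier; how the paper scales or reshapes these statistics before feeding them to the network is not stated in the abstract.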

