Effects of undetected data quality issues on climatological analyses

Author(s):  
Stefan Hunziker ◽  
Stefan Brönnimann ◽  
Juan Marcos Calle ◽  
Isabel Moreno ◽  
Marcos Andrade ◽  
...  

Abstract. Systematic data quality issues may occur at various stages of the data generation process. They may affect large fractions of observational datasets and remain largely undetected with standard data quality control. This study investigates the effects of such undetected data quality issues on the results of climatological analyses. For this purpose, we quality controlled daily observations of manned weather stations from the Central Andean area with a standard and an enhanced approach. The climate variables analysed are minimum and maximum temperature, and precipitation. About 40 % of the observations are inappropriate for the calculation of monthly temperature means and precipitation sums due to data quality issues. These quality problems undetected with the standard quality control method strongly affect climatological analyses, since they reduce the correlation coefficients of station pairs, deteriorate the performance of data homogenization methods, increase the spread of individual station trends, and significantly bias regional temperature trends. Our findings indicate that undetected data quality issues are included in important and frequently used observational datasets, and hence may affect a high number of climatological studies. It is of utmost importance to apply comprehensive and adequate data quality control approaches on manned weather station records in order to avoid biased results and large uncertainties.

2018 ◽  
Vol 14 (1) ◽  
pp. 1-20 ◽  
Author(s):  
Stefan Hunziker ◽  
Stefan Brönnimann ◽  
Juan Calle ◽  
Isabel Moreno ◽  
Marcos Andrade ◽  
...  

Abstract. Systematic data quality issues may occur at various stages of the data generation process. They may affect large fractions of observational datasets and remain largely undetected with standard data quality control. This study investigates the effects of such undetected data quality issues on the results of climatological analyses. For this purpose, we quality controlled daily observations of manned weather stations from the Central Andean area with a standard and an enhanced approach. The climate variables analysed are minimum and maximum temperature and precipitation. About 40 % of the observations are inappropriate for the calculation of monthly temperature means and precipitation sums due to data quality issues. These quality problems undetected with the standard quality control approach strongly affect climatological analyses, since they reduce the correlation coefficients of station pairs, deteriorate the performance of data homogenization methods, increase the spread of individual station trends, and significantly bias regional temperature trends. Our findings indicate that undetected data quality issues are included in important and frequently used observational datasets and hence may affect a high number of climatological studies. It is of utmost importance to apply comprehensive and adequate data quality control approaches on manned weather station records in order to avoid biased results and large uncertainties.
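
The abstract's claim that undetected quality issues reduce the correlation coefficients of station pairs can be illustrated with a minimal sketch. The temperature values below are synthetic, and the one-day date shift is only an assumed example of a systematic issue; neither is taken from the study, and the helper `pearson` is a hypothetical name introduced here.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Synthetic daily maximum temperatures (deg C) at two neighbouring stations.
station_a = [10.0, 12.0, 9.0, 14.0, 11.0, 15.0, 8.0, 13.0, 10.0, 16.0]
station_b = [t + 1.0 for t in station_a]  # same weather, constant offset

# Assumed quality issue: every value at station B is logged one day late.
station_b_shifted = [station_b[0]] + station_b[:-1]

r_clean = pearson(station_a, station_b)
r_shifted = pearson(station_a, station_b_shifted)
print(f"clean pair:   r = {r_clean:.2f}")
print(f"shifted pair: r = {r_shifted:.2f}")
```

A range or outlier check would not flag any single value in the shifted series, since each value is plausible on its own; the problem only shows up in the relationship to the neighbouring station.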


2021 ◽  
Author(s):  
Francesco Battocchio ◽  
Jaijith Sreekantan ◽  
Arghad Arnaout ◽  
Abed Benaichouche ◽  
Juma Sulaiman Al Shamsi ◽  
...  

Abstract. Drilling data quality is a notorious challenge for any analytics application because of the complexity of the real-time data acquisition system, which routinely generates (i) time-related issues caused by irregular sampling; (ii) channel-related issues such as non-uniform names and units and missing or wrong values; and (iii) depth-related issues caused by block position resets and depth compensation (for floating rigs). Artificial intelligence drilling applications, on the other hand, typically require a consistent stream of high-quality data as input for their algorithms, as well as for visualization. In this work we present an automated workflow, enhanced by data-driven techniques, that resolves complex quality issues, harmonizes drilling sensor data, and reports the quality of the dataset to be used for advanced analytics. The approach proposes an automated data quality workflow that formalizes the characteristics, requirements and constraints of sensor data within the context of drilling operations. The workflow leverages machine learning algorithms, statistics, signal processing and rule-based engines to detect data quality issues including error values, outliers, bias, drifts, noise, and missing values. Once data quality issues are classified, they are scored and treated on a context-specific basis in order to recover the maximum volume of data while avoiding information loss. The result is a data quality and preparation engine that organizes drilling data for further advanced analytics and reports the quality of the dataset through key performance indicators. This novel data processing workflow made it possible to recover more than 90% of a drilling dataset from 18 offshore wells that otherwise could not be used for analytics.
This was achieved by resolving specific issues, including resampling time series with gaps and different sampling rates and smart imputation of wrong or missing data while preserving the consistency of the dataset across all channels. Additional improvements would include recovering data values that fell outside a meaningful range because of sensor drift or depth resets. The present work automates the end-to-end workflow for data quality control of drilling sensor data by leveraging advanced artificial intelligence (AI) algorithms. It makes it possible to detect and classify patterns of wrong or missing data and to recover them through a context-driven approach that prevents information loss. As a result, the maximum amount of data is recovered for artificial intelligence drilling applications. The workflow also enables optimal time synchronization of different sensors streaming data at different frequencies within discontinuous time intervals.
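
The resampling step mentioned in the abstract (time series with gaps and different sampling rates) can be sketched as follows. This is not the authors' implementation: the function name `resample_channel`, the `max_gap` parameter, and the choice of linear interpolation are all assumptions made for illustration.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence, Tuple


def resample_channel(
    samples: Sequence[Tuple[float, float]],
    grid: Sequence[float],
    max_gap: float,
) -> List[Optional[float]]:
    """Interpolate irregular (time, value) samples onto a common time grid.

    `samples` must be sorted by time. A grid point is left as None when it
    lies outside the observed range or inside a gap longer than `max_gap`,
    so that long outages are reported rather than silently filled.
    """
    times = [t for t, _ in samples]
    values = [v for _, v in samples]
    out: List[Optional[float]] = []
    for g in grid:
        i = bisect_left(times, g)
        if i < len(times) and times[i] == g:
            out.append(values[i])  # exact hit, no interpolation needed
        elif i == 0 or i == len(times):
            out.append(None)  # never extrapolate beyond the data
        else:
            t0, t1 = times[i - 1], times[i]
            if t1 - t0 > max_gap:
                out.append(None)  # gap too long to bridge
            else:
                w = (g - t0) / (t1 - t0)
                out.append(values[i - 1] + w * (values[i] - values[i - 1]))
    return out


# Hypothetical sensor channel sampled irregularly (seconds, value).
channel = [(0.0, 10.0), (2.0, 14.0), (3.0, 15.0), (10.0, 29.0)]
grid = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
print(resample_channel(channel, grid, max_gap=3.0))
# prints [10.0, 12.0, 14.0, 15.0, None, None]
```

Running every channel through the same grid gives time-aligned rows across sensors, and the None entries mark the spots where an imputation step, or a decision to discard the interval, would still be needed.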


2018 ◽  
Vol 47 (2) ◽  
pp. 230002
Author(s):  
王贵宁 Wang Guining ◽  
刘秉义 Liu Bingyi ◽  
冯长中 Feng Changzhong ◽  
吴松华 Wu Songhua ◽  
刘金涛 Liu Jintao ◽  
...  

Author(s):  
Antonella D. Pontoriero ◽  
Giovanna Nordio ◽  
Rubaida Easmin ◽  
Alessio Giacomel ◽  
Barbara Santangelo ◽  
...  
