Poor Quality of Data in Africa: What Are the Issues?

2018 ◽  
Vol 46 (6) ◽  
pp. 851-877 ◽  
Author(s):  
Abel Kinyondo ◽  
Riccardo Pelizzo
Author(s):  
Manjunath Ramachandra

The data gathered from the sources are often noisy. Poor quality of data results in business losses that compound down the supply chain, and the end customer finds the data useless and misleading. Cleansing of the data should therefore be performed immediately and automatically after data acquisition. This chapter presents techniques for data cleansing and processing to that end.
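As a minimal illustration of the kind of automated post-acquisition cleansing the chapter describes (the record layout, field names and thresholds below are assumptions for the sketch, not taken from the chapter), one pass might deduplicate records, coerce numeric fields and drop gross outliers:

```python
import math

def clean_records(records, value_key="value", max_z=3.0):
    """Illustrative post-acquisition cleansing: drop exact duplicates,
    coerce numeric strings, and filter gross outliers by z-score."""
    # 1. Drop exact duplicates while preserving order.
    seen, unique = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(rec))
    # 2. Coerce numeric strings; discard records that fail to parse.
    parsed = []
    for rec in unique:
        try:
            rec[value_key] = float(rec[value_key])
            parsed.append(rec)
        except (TypeError, ValueError):
            continue
    # 3. Remove gross outliers by z-score against the batch mean.
    values = [r[value_key] for r in parsed]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return parsed
    return [r for r in parsed if abs(r[value_key] - mean) / std <= max_z]
```

In practice each of the three steps would be tuned to the acquisition source; the point is only that all three run automatically, before the data moves further down the supply chain.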


2019 ◽  
Vol 64 (6) ◽  
pp. 5-15
Author(s):  
Iwona Markowicz ◽  
Paweł Baran

Official statistics on trade in goods between EU member states are collected at country level and then aggregated by Eurostat. The methodology of data collection differs slightly between member states (e.g. various statistical thresholds and coverage), including differences in exchange rates as well as undeclared or late-declared transactions, errors in the classification of goods and other mistakes. This often results in incomparability of mirror data (nominally concerning the same transactions recorded in the statistics of both the dispatcher and the receiver country). A large part of these differences can be explained by the variable quality of data resources in the Eurostat database. In this study, the quality of data on intra-EU trade in goods for 2017 was compared between Poland and neighbouring EU countries, i.e. Germany, the Czech Republic, Slovakia, Lithuania, and the other Baltic states, Latvia and Estonia. An additional aim was to indicate the directions having the greatest influence on the observed differences in mirror data. The results of the study indicate that the declarations made in Estonia affect the poor quality of data on trade in goods between the countries mentioned above to the greatest extent.
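A mirror-data comparison of the kind described can be sketched as a simple discrepancy index between the value a dispatching country reports and the value the receiving country records for nominally the same flow. The country pairs and trade values below are invented for illustration and are not Eurostat figures:

```python
def mirror_discrepancy(dispatch_value, arrival_value):
    """Relative divergence between the two mirror records of one flow:
    |dispatch - arrival| / mean(dispatch, arrival). 0 means perfect agreement."""
    if dispatch_value == arrival_value == 0:
        return 0.0
    return abs(dispatch_value - arrival_value) / ((dispatch_value + arrival_value) / 2)

# Illustrative (not actual) trade values in EUR million:
# (dispatches reported by the sender, arrivals reported by the receiver)
flows = {
    ("PL", "DE"): (1000.0, 980.0),
    ("PL", "EE"): (50.0, 30.0),
    ("PL", "CZ"): (400.0, 395.0),
}
indices = {pair: mirror_discrepancy(d, a) for pair, (d, a) in flows.items()}
worst = max(indices, key=indices.get)  # direction with the largest asymmetry
```

Ranking country pairs by such an index is one straightforward way to point at the directions contributing most to mirror-data differences.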


10.28945/2584 ◽  
2002 ◽  
Author(s):  
Herna L. Viktor ◽  
Wayne Motha

Increasingly, large organizations are engaging in data warehousing projects in order to achieve a competitive advantage by exploring the information contained therein. It is therefore paramount to ensure that the data warehouse includes high quality data. However, practitioners agree that the improvement of the quality of data in an organization is a daunting task. This is especially evident in data warehousing projects, which are often initiated “after the fact”. The slightest suspicion of poor quality data often hinders managers from reaching decisions, as they waste hours in discussions to determine what portion of the data should be trusted. Augmenting data warehousing with data mining methods offers a mechanism to explore these vast repositories, enabling decision makers to assess the quality of their data and to unlock a wealth of new knowledge. These methods can be effectively used with the inconsistent, noisy and incomplete data that are commonplace in data warehouses.


2009 ◽  
Vol 11 (2) ◽  
Author(s):  
L. Marshall ◽  
R. De la Harpe

Making decisions in a business intelligence (BI) environment can become extremely challenging, and sometimes even impossible, if the data on which the decisions are based are of poor quality. Data can only be utilised effectively when they are accurate, up to date, complete and available when needed. The BI decision makers and users are in the best position to determine the quality of the data available to them, so it is important to ask the right questions of them; the issues of information quality in the BI environment were therefore established through a literature study. Information-related problems may cause supplier relationships to deteriorate, and may reduce internal productivity and the business's confidence in IT. Ultimately they can affect an organisation's ability to perform and remain competitive. This article aims to identify the underlying factors that prevent information from being easily and effectively utilised, and to understand how these factors can influence the decision-making process, particularly within a BI environment. An exploratory investigation was conducted at a large retail organisation in South Africa to collect empirical data from BI users through unstructured interviews. Some of the main findings indicate specific causes that impact the decisions of BI users, including accuracy, inconsistency, understandability and availability of information. Key performance measures that are directly impacted by the quality of data used in decision-making include waste, availability, sales and supplier fulfilment. The time spent on investigating and resolving data quality issues has a major impact on productivity. Documentation was highlighted as an important issue that requires further investigation. The initial results indicate the value of


2021 ◽  
Vol 33 (1) ◽  
Author(s):  
Bastien Boussat ◽  
Hude Quan ◽  
Jose Labarere ◽  
Danielle Southern ◽  
Chantal M Couris ◽  
...  

Abstract Question Are there ways to mitigate the challenges associated with imperfect data validity in Patient Safety Indicator (PSI) report cards? Findings Applying a methodological framework to simulated PSI report card data, we compare the adjusted PSI rates of three hospitals with variable quality of data and coding. This framework combines (i) a measure of PSI rates using existing algorithms; (ii) a medical record review of a small random sample of charts to produce a measure of hospital-specific data validity; and (iii) a simple Bayesian calculation to derive estimated true PSI rates. For example, the estimated true PSI rate for a theoretical hospital with moderately good quality of coding could be three times as high as the measured rate (for example, 1.4% rather than 0.5%). For a theoretical hospital with relatively poor quality of coding, the difference could be 50-fold (for example, 5.0% rather than 0.1%). Meaning Combining routine PSI measurement with a medical record review of a limited number of charts at the hospital level offers an approach to producing health system report cards with estimates of true hospital-level adverse event rates.
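The Bayesian correction in step (iii) can be sketched as scaling the measured rate by chart-review estimates of positive predictive value (PPV) and sensitivity: observed true positives are recovered via the PPV, then scaled up for cases the coding missed. The formula and the validity parameters below are illustrative assumptions chosen so the example reproduces the abstract's "1.4% rather than 0.5%" hospital; they are not the study's actual figures:

```python
def estimated_true_rate(observed_rate, ppv, sensitivity):
    """Correct a measured PSI rate for imperfect coding:
    true positives ~= observed * PPV, then divide by sensitivity
    to account for true events the coding failed to flag."""
    return observed_rate * ppv / sensitivity

# Illustrative validity parameters for a hospital with moderately
# good coding (assumed values, not taken from the study):
rate = estimated_true_rate(observed_rate=0.005, ppv=0.70, sensitivity=0.25)
# 0.5% measured -> 1.4% estimated true rate
```

The same arithmetic with a much lower sensitivity produces the 50-fold gap cited for a poorly coded hospital, which is why even a small chart review that pins down PPV and sensitivity changes the report card substantially.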


1980 ◽  
Vol 70 (5) ◽  
pp. 1833-1847
Author(s):  
Harsh K. Gupta ◽  
C. V. Rama Krishna Rao ◽  
B. K. Rastogi ◽  
S. C. Bhatia

Abstract Twelve earthquakes of Ms ≧ 4.0, together with their foreshocks and aftershocks, which occurred during the period October 1973 through December 1976 in the vicinity of the Koyna Dam, Maharashtra, have been investigated using seismograms from the Koyna seismic network, the WWSSN seismic station at Poona (POO), and the NGRI seismic station (HYB) at Hyderabad. In all, 71 hypocenters are located. Owing to the paucity and poor quality of data, the locations are mainly fair to poor in quality. Inferred focal depths are less than 15 km. These hypocenter locations indicate the possible existence of a N-S trending fault at 73°45′E longitude. An empirical relation between signal duration (τ) and surface-wave magnitude (Ms), Ms = −2.44 + 2.61 log τ, is obtained for the region. This relation yields more reliable estimates of magnitudes. Composite focal mechanism solutions could be obtained for eight earthquakes with Ms ≧ 4. These solutions are mostly consistent with a N-S trending fault. Energy release patterns have been investigated for four sequences; a major portion of the energy is released through the main shock.
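The empirical duration-magnitude relation can be applied directly. The sketch below assumes the logarithm is base-10, as is conventional for duration magnitudes:

```python
import math

def surface_wave_magnitude(duration_seconds):
    """Empirical Koyna-region relation from the abstract:
    Ms = -2.44 + 2.61 * log10(tau), with tau the signal duration
    in seconds (base-10 logarithm assumed)."""
    return -2.44 + 2.61 * math.log10(duration_seconds)
```

A signal duration of 100 s, for instance, gives Ms = −2.44 + 2.61 × 2 = 2.78.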


1998 ◽  
Vol 11 (2) ◽  
pp. 231-253 ◽  
Author(s):  
Jennie Macdiarmid ◽  
John Blundell

Abstract Under-reporting of food intake is one of the fundamental obstacles preventing the collection of accurate habitual dietary intake data. The prevalence of under-reporting in large nutritional surveys ranges from 18 to 54% of the whole sample, but can be as high as 70% in particular subgroups. This wide variation between studies is partly due to different criteria used to identify under-reporters and also to non-uniformity of under-reporting across populations. The most consistent differences found are between men and women and between groups differing in body mass index. Women are more likely to under-report than men, and under-reporting is more common among overweight and obese individuals. Other associated characteristics, for which there is less consistent evidence, include age, smoking habits, level of education, social class, physical activity and dietary restraint. Determining whether under-reporting is specific to macronutrients or food is problematic, as most methods identify only low energy intakes. Studies that have attempted to measure under-reporting specific to macronutrients express nutrients as percentage of energy and have tended to find carbohydrate under-reported and protein over-reported. However, care must be taken when interpreting these results, especially when data are expressed as percentages. A logical conclusion is that food items with a negative health image (e.g. cakes, sweets, confectionery) are more likely to be under-reported, whereas those with a positive health image are more likely to be over-reported (e.g. fruits and vegetables). This also suggests that dietary fat is likely to be under-reported. However, it is necessary to distinguish between under-reporting and genuine under-eating for the duration of data collection. The key to understanding this problem, but one that has been widely neglected, concerns the processes that cause people to under-report their food intakes.
The little work that has been done has simply confirmed the complexity of this issue. The importance of obtaining accurate estimates of habitual dietary intakes so as to assess health correlates of food consumption can be contrasted with the poor quality of data collected. This phenomenon should be considered a priority research area. Moreover, misreporting is not simply a nutritionist's problem, but requires a multidisciplinary approach (including psychology, sociology and physiology) to advance the understanding of under-reporting in dietary intake studies.
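One widely used criterion for flagging likely under-reporters, not fixed by the abstract itself, compares reported energy intake with basal metabolic rate (the Goldberg cut-off). The 1.2 threshold below is a commonly cited value and is used here purely for illustration:

```python
def is_plausible_intake(reported_energy_mj, bmr_mj, cutoff=1.2):
    """Flag likely under-reporters by the ratio of reported energy
    intake to basal metabolic rate (EI:BMR). The 1.2 cut-off is one
    commonly cited threshold (the Goldberg cut-off); the abstract
    does not prescribe a specific criterion."""
    return reported_energy_mj / bmr_mj >= cutoff
```

Because such cut-offs flag only implausibly low total energy, they cannot by themselves distinguish under-reporting from genuine under-eating, nor identify which foods or macronutrients were misreported, which is exactly the limitation the abstract raises.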


2021 ◽  
Author(s):  
Andrew Hill ◽  
Manya Mirchandani ◽  
Leah Ellis ◽  
Victoria Pilkington

Abstract Background Ivermectin is an antiparasitic drug being investigated in clinical trials for the prevention of COVID-19. However, there are concerns about the quality of some of these trials. Objectives To conduct a meta-analysis with randomised controlled trials of ivermectin for the prevention of COVID-19, while controlling for the quality of data. Methods We conducted a sub-group analysis based on the quality of randomised controlled trials evaluating ivermectin for the prevention of COVID-19. Quality was assessed using the Cochrane Risk of Bias measures (RoB 2) and additional checks on raw data, where possible. Results Four studies were included in the meta-analysis. One was rated as being potentially fraudulent, two as having a high risk of bias and one as having some concerns for bias. Ivermectin did not have a significant effect on preventing RT-PCR confirmed COVID-19 infection. Ivermectin had a significant effect on preventing symptomatic COVID-19 infection in one trial with some concerns of bias, but this result was based on post-hoc analysis of a multi-arm study. Conclusions This meta-analysis demonstrates that the currently available randomised trials evaluating ivermectin for the prevention of COVID-19 are insufficient and of poor quality.
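A sub-group meta-analysis of this kind typically pools trial-level effect estimates with inverse-variance weights, restricted to the trials that pass the quality check. The sketch below implements standard fixed-effect pooling of log risk ratios; the event counts in the test are invented for illustration and are not the included trials' data:

```python
import math

def pooled_risk_ratio(trials):
    """Fixed-effect inverse-variance pooling of log risk ratios.
    Each trial is (events_treated, n_treated, events_control, n_control)."""
    num, den = 0.0, 0.0
    for a, n1, c, n2 in trials:
        log_rr = math.log((a / n1) / (c / n2))
        # Standard approximate variance of the log risk ratio.
        var = 1 / a - 1 / n1 + 1 / c - 1 / n2
        weight = 1 / var
        num += weight * log_rr
        den += weight
    return math.exp(num / den)
```

Running the pooling separately on the low-risk-of-bias subset and on all trials is what lets a quality-controlled analysis show whether an apparent effect survives once dubious studies are excluded.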


1982 ◽  
Vol 26 ◽  
pp. 99-104
Author(s):  
Satyam C. Cherukuri ◽  
Robert L. Snyder ◽  
Donald W. Beard

Over the past fifteen years two basic computer search/match strategies have evolved. The exhaustive search approach of Johnson and Vand (1) uses a sequential file structure, whereas Nichols (2) developed a strategy which uses an inverted file, examining only those patterns containing lines of interest. Frevel (3) was the first to attempt to relate the quality of the reference patterns to the search strategy, using a very restricted database. These “first generation” search/match algorithms were forced to use very wide d and I windows due to the poor quality of the unknown and reference patterns. Snyder (4) wrote the first “second generation” search/match procedure, which takes advantage of the high quality of data in the JCPDS database where it is present. Recently, a minicomputer-optimized version of the Johnson-Vand strategy has been incorporated into this search system, enabling the two strategies to be compared under similar conditions.


Author(s):  
Petr Svoboda

The goal of this article is to analyse the influence of market concentration in selected areas of public procurement on chosen parameters of public procurement in the years 2007 and 2011. Five concentration ratios and five contract parameters are calculated for each of five chosen areas of public tenders in 2007 and 2011. Correlation analysis between the concentration ratios and the contract parameters is then performed to determine the relationship between these two sets of variables. The correlation results are compared with four hypotheses about the relationship between market concentration and the parameters of public procurement. The results of the analysis are surprising: in most cases the stated hypotheses were rejected, meaning that the correlations between the parameters of public procurement and market concentration differed from what this study predicted based on economic theory. The possible reasons for this result, discussed in the article, are corruption and also the poor quality of data from the Information System of Public Procurement administered by the Ministry for Regional Development of the Czech Republic.
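The two computations the study chains together, a concentration ratio CR_n and a Pearson correlation between the ratios and the contract parameters, can be sketched as follows (the market shares and series in the test are invented for illustration):

```python
def concentration_ratio(firm_values, n=5):
    """CR_n: combined market share of the n largest suppliers,
    as a fraction of the total value won in the sector."""
    total = sum(firm_values)
    top = sorted(firm_values, reverse=True)[:n]
    return sum(top) / total

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two series,
    e.g. CR_5 per sector vs. a contract parameter per sector."""
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

With five sectors and two years, each hypothesis test reduces to checking the sign and strength of one such correlation against the direction economic theory predicts.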

