Quantifying the relationship between diseases and symptoms using big data (Preprint)

2018 ◽  
Author(s):  
ShihHan Lin ◽  
Hsin Hui Shao

BACKGROUND Crises in endemic transmitted diseases affect humans worldwide, and the symptoms these diseases cause may provide firsthand information about these disorders. OBJECTIVE We suggest that massive new data sources resulting from human interaction with the Internet may offer a unique perspective on the relationship between illness and symptoms. METHODS By analyzing changes in Google query volumes for disease-related search terms, we find a pattern that may define the relationship between symptoms and disorders. We first retrieved trend data from Google Trends using the common cold as the primary disease, and sore throat, stuffy nose, sneeze, fever, cough, and headache as symptoms. Pearson’s correlation coefficient was calculated using SPSS to determine the relationship between the symptoms and the disease. RESULTS Weekly data from 2013/1/13 onward were retrieved from Google Trends. A total of 261 data sets were analyzed, yielding a high correlation coefficient of 0.925 between the common cold and the stuffy nose symptom. The cough symptom has the second highest correlation coefficient of 0.925, sore throat has a correlation coefficient of 0.853, and fever has a correlation coefficient of 0.626, which was significant at the 0.01 level in a two-tailed test. CONCLUSIONS Data on the relationship between diseases and symptoms often come from facilities such as government agencies, hospitals, and clinics, where they are collected through the documentation of physicians and nurses. A conventional study can be limited by the region, the number of patients, and the interpretation of the specialist. With access to Google Trends’ big data, however, millions or even billions of data points are accumulated directly from patients. Another contribution of this study is that the quantified relationship between symptoms and diseases can be used to educate future physicians or even artificial intelligence.
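
The correlation step described in the abstract above can be illustrated with a minimal sketch. The authors used SPSS; the Python function below is only an equivalent textbook computation of Pearson's r, and the weekly values are hypothetical placeholders, not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly search-interest values (illustration only).
common_cold = [40, 55, 70, 90, 65, 45]
stuffy_nose = [35, 50, 72, 88, 60, 41]

r = pearson_r(common_cold, stuffy_nose)
print(f"Pearson r = {r:.3f}")
```

With 261 weekly observations per term, the same function would be applied once per symptom series against the disease series.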

2015 ◽  
Vol 53 (1) ◽  
pp. 81-88
Author(s):  
Theodore J. Witek ◽  
David L. Ramsey ◽  
Andrew N. Carr ◽  
Donald K. Riker

Background: The common cold is the most frequently experienced infection among humans, but limited data exist to characterize the onset, duration, severity and intersection of symptoms in community-acquired colds. A more complete understanding of the symptom frequency and burden in naturally occurring colds is needed. Methodology: We characterized common cold symptoms from 226 cold episodes experienced by 104 male or female subjects. Subjects were enrolled in the work environment in an attempt to start symptom evaluation (frequency and severity) at the earliest sign of their cold. We also assessed the symptom that had the greatest impact on the subject by asking them to identify their single most bothersome symptom. Results: Symptom reporting started within 24 hours of cold onset for most subjects. Sore throat was a harbinger of the illness but was accompanied by multiple symptoms, including nasal congestion, runny nose and headache. Cough was not usually the most frequent symptom, but was present throughout the cold, becoming most bothersome later in the cold. Nasal congestion, pain (eg, sore throat, headache, muscle pains) or feverishness and secretory symptoms (eg, runny nose, sneezing), and even cough, were simultaneously experienced with high incidence over the first 4 days of illness. The single most bothersome symptom was sore throat on day 1, followed by nasal congestion on days 2-5 and cough on days 6 and 7. Conclusion: There is substantial overlap in the appearance of common cold symptoms over the first several days of the common cold. Nasal congestion, secretory and pain symptoms frequently occur together, with cough being somewhat less prominent, but quite bothersome when present. These data establish the typical symptomatology of a common cold and provide a foundation for the rational treatment of cold symptoms typically experienced by cold sufferers.


2020 ◽  
Vol 8 (4) ◽  
pp. 466-474
Author(s):  
V.A. Malakhov ◽  
A.K. Tyagniryadko ◽  
Y.A. Isaeva

The problem of osteoporosis and sarcopenia is one of the leading problems in world medicine. There is a significant increase in the number of patients with these pathologies, which is associated with increased life expectancy. Osteoporosis and sarcopenia are among the most common diseases in old age. Moreover, whereas these pathologies, especially osteoporosis, were previously observed mainly in the elderly, they are now diagnosed at significantly younger ages. Thus, early diagnosis, methods of prevention, early treatment and rehabilitation of these diseases become relevant. Equally important is the relationship between these diseases and the commonality of their etiology and pathogenesis, and, accordingly, the identity of methods of prevention and treatment. In the context of medical and preventive care, the commonalities and differences of genetic, biochemical and age factors and nosological units that lead to the development of these pathologies are analyzed. Methods of prevention and non-drug treatment of osteoporosis and sarcopenia are considered in detail. The most effective methods of prevention and non-drug treatment of osteoporosis and sarcopenia have been identified. The common etiopathogenetic factors of sarcopenia and osteoporosis, disorders of fat metabolism and, ultimately, reduced physical activity suggest the presence of osteosarcopenia and osteosarcopenic obesity. The same commonality leads to almost identical approaches in the treatment and prevention of these diseases.


Author(s):  
Johannes Just ◽  
Marie-Therese Puth ◽  
Felix Regenold ◽  
Klaus Weckbecker ◽  
Markus Bleckwenn

Background: Combating the COVID-19 pandemic is a major challenge for health systems, citizens and policy makers worldwide. Early detection of affected patients within the huge population of patients with common cold symptoms is an important element of this effort but is often hindered by limited testing resources. We aimed to identify predictive risk profiles for a positive PCR result in primary care. Methods: Multi-center cross-sectional cohort study on predictive characteristics over a period of 4 weeks in primary care patients in Germany. We evaluated age, sex, reason for testing, risk factors, symptoms, and expected PCR result for their impact on the test result. Results: In total, 374 patients in 14 primary care centers received SARS-CoV-2 PCR swab testing and were included in this analysis. A fraction of 10.7% (n=40) tested positive for COVID-19. Patients who reported anosmia had a higher odds ratio (OR: 4.54; 95% CI: 1.51–13.67) for a positive test result, while patients with a sore throat had a lower OR (OR: 0.33; 95% CI: 0.11–0.97). Patients who had first-grade contact with an infected person and showed symptoms themselves also had an increased OR for positive testing (OR: 5.16; 95% CI: 1.72–15.51). This correlation was also present when they themselves were still asymptomatic (OR: 12.55; 95% CI: 3.97–39.67). Conclusion: Reported contact with an infected person is the most important factor for a positive PCR result, independent of any symptoms of illness in the tested patient. Persons who have had contact with an infected person should always get a PCR test. If no contact is reported and testing material is scarce, anosmia should increase the likelihood of performing a test, while a sore throat should decrease it.
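
As a reading aid for the odds ratios and 95% confidence intervals quoted above, the following is a minimal sketch of how an unadjusted odds ratio and its Wald interval are obtained from a 2x2 table. The abstract does not say how its estimates were computed (they may come from a regression model), and the counts below are hypothetical, not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed and test-positive      b: exposed and test-negative
    c: unexposed and test-positive    d: unexposed and test-negative
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (illustration only): symptom present vs. absent.
or_value, lower, upper = odds_ratio_ci(10, 20, 5, 40)
print(f"OR = {or_value:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1, as in the anosmia and sore throat results above, indicates an association at roughly the 5% significance level.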


2000 ◽  
Vol 4 (4) ◽  
pp. 212-216 ◽  
Author(s):  
Ming Xu ◽  
Takashi Muto ◽  
Tosio Yabe ◽  
Fumiko Nagao ◽  
Yasushi Fukuwatari ◽  
...  

Author(s):  
Dong Hyun Kim ◽  
Shin Goo Park ◽  
Hwan Cheol Kim ◽  
Eui Cheol Lee ◽  
Jeong Hoon Kim ◽  
...  

2019 ◽  
Vol 40 (Supplement_1) ◽  
Author(s):  
A Kaura ◽  
J Davies ◽  
V Panoulas ◽  
B Glampson ◽  
A Mulla ◽  
...  

Abstract Background: Many of the data points required to support translational research are collected as a matter of routine, and should be available within electronic patient records. Variations in clinical and data recording practice can mean that the extraction and standardisation of this data, with the aim of producing a large-scale, research-ready dataset, presents a number of challenges. Purpose: We set out to create a large-scale, research-ready dataset to support translational research in cardiovascular medicine, using routinely-collected data from five large university-hospital partnerships. As an initial focus, we selected those data points that would support an investigation of the relationship between test results and outcomes in acute coronary syndrome (ACS). Methods: The National Institute for Health Research (NIHR) Health Informatics Collaborative (HIC) is a programme of infrastructure development aimed at increasing the quality and availability of routinely-collected data for collaborative, translational research. Eighteen university-hospital partnerships signed the data sharing agreement, and are working to facilitate the sharing and re-use of data across centres, for approved research purposes. With support from the Directors of the NIHR Biomedical Research Centres (BRCs) within five of the largest partnerships, we established a clinical data collaboration, specifying a dataset and selecting an initial research question (Figure 1). The NIHR HIC team worked to extract data against this specification. With approval from an ethics committee, and from the information governance teams at each contributing centre, data was processed by one of the centres for standardisation and analysis. Results: The specified dataset represented a longitudinal record for patients presenting with a suspected ACS, characterised by a request for a troponin test (Figure 1). The dataset included 156 data points, grouped into demographics, cardiovascular risk factor profile, emergency department attendance and inpatient episodes, blood tests, echocardiography and mortality. Data was extracted from the records of patients for whom a troponin test was requested between 2010 and 2017. A total of 257,948 records were standardised and analysed. The collaboration has been successful, and an initial version of the combined dataset has been created. The size of the dataset has yielded new insights into the relationship between test results and outcomes, and publications are in preparation. An expanded dataset of over 800 data points has been agreed for the next phase of the collaboration, and three other centres have joined. Conclusion: It is perfectly feasible, in terms of governance and technology, to re-use routinely-collected data for collaborative, translational research in cardiovascular medicine. The resulting dataset will be large and complex enough to require big data tools and techniques, and will yield the kind of insights afforded only by big data in medicine. Acknowledgement/Funding: Funded by NIHR Imperial Biomedical Research Centre (BRC) using NIHR Health Informatics Collaborative data service, supported by OUH, GSTT & UCLH BRCs.
Figure 1. NIHR HIC dataset generation


Entropy ◽  
2020 ◽  
Vol 22 (7) ◽  
pp. 773
Author(s):  
Yan Yan ◽  
Boyao Wu ◽  
Tianhai Tian ◽  
Hu Zhang

Complex networks are a powerful tool for discovering important information in various types of big data. Although many studies have addressed the development of stock relation networks, the correlation coefficient is predominantly used to measure the relationship between stock pairs. Information theory is much less discussed for this important topic, though mutual information is able to measure nonlinear pairwise relationships. In this work we propose to use part mutual information for developing stock networks. The path-consistency algorithm is used to filter out redundant relationships. Using Australian stock market data, we develop four stock relation networks using different orders of part mutual information. Compared with the widely used planar maximally filtered graph (PMFG), our method can generate networks with cliques of large size. In addition, the large cliques show consistency with the structure of industrial sectors. We also analyze the connectivity and degree distributions of the generated networks. Analysis results suggest that the proposed method is an effective approach to developing stock relation networks using information theory.
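
To make concrete the claim that mutual information captures nonlinear pairwise relationships, here is a minimal sketch of plain (ordinary) mutual information for two discrete series. This is not the part mutual information or the path-consistency algorithm used in the paper; it only illustrates the basic quantity those methods build on, and the toy series are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) of two equal-length discrete series."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("series must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        # Each observed pair contributes p(x,y) * log2(p(x,y) / (p(x) p(y))).
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Hypothetical toy series: y = x**2 is a nonlinear dependence with zero
# linear correlation, yet its mutual information is positive.
x_vals = [-1, 0, 1]
y_vals = [1, 0, 1]
print(f"MI = {mutual_information(x_vals, y_vals):.3f} bits")
```

For continuous data such as stock returns, the series would first have to be discretized (for example into bins), a choice that affects the estimate.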


Thorax ◽  
2007 ◽  
Vol 63 (6) ◽  
pp. 493-499 ◽  
Author(s):  
E Sapey ◽  
D Bayley ◽  
A Ahmad ◽  
P Newbold ◽  
N Snell ◽  
...  

Background: Measurements of pulmonary biomarkers can be used to monitor airway inflammation in chronic obstructive pulmonary disease (COPD), but the variability of sampled biomarkers and their inter-relationships are poorly understood. A study was undertaken to examine the intra- and inter-patient variability in spontaneous sputum samples from patients in the stable state and to describe the relationship between biomarkers, cell counts and markers of disease. Methods: Sputum interleukin-1β, tumour necrosis factor α, interleukin 8, myeloperoxidase, leucotriene B4, growth-related oncogene α and differential cell counts were measured in patients with moderate to severe stable COPD (n = 14) on 11 occasions over a 1-month period. Results: There was significant variability in all inflammatory indices (median intra-patient coefficient of variation (CV) 35% (IQR 22–69), median inter-patient CV 102% (IQR 61–145)). Variability could be reduced by using a rolling mean of individual patient data points. Sample size calculations were undertaken to determine the number of patients required to detect a 50% reduction in neutrophil count. Using a crossover design of a putative effective treatment, the number needed using one data point per patient was 72, reducing to 23 when a mean of three data points was used. Significant correlations were demonstrated both between the inflammatory biomarkers themselves and between inflammatory biomarkers and markers of disease. Some relationships were not apparent when results from a single sample were used. The reliability of inter-relationships improved as more data points were used for each patient. Conclusions: Clear relationships exist between inflammatory biomarkers in patients with stable COPD. Sequential sampling reduced the variability of individual mediators and the potential number of patients needed to power proof of concept interventional studies.
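
The effect of a rolling mean on the coefficient of variation, as described in the abstract above, can be sketched as follows. The 11 values are hypothetical and only mimic the study's design of 11 samples per patient; the study's exact smoothing procedure is not given in the abstract, so the window of three is an assumption borrowed from its sample size example.

```python
import statistics


def coefficient_of_variation(values):
    """Coefficient of variation in percent: sample SD divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


def rolling_mean(values, window):
    """Means of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        statistics.mean(values[i:i + window])
        for i in range(len(values) - window + 1)
    ]


# Hypothetical biomarker readings for one patient on 11 occasions.
samples = [12.0, 30.0, 18.0, 25.0, 9.0, 28.0, 15.0, 22.0, 11.0, 27.0, 19.0]

raw_cv = coefficient_of_variation(samples)
smoothed_cv = coefficient_of_variation(rolling_mean(samples, 3))
print(f"CV of single samples: {raw_cv:.1f}%")
print(f"CV of 3-point rolling means: {smoothed_cv:.1f}%")
```

Averaging consecutive samples damps occasion-to-occasion noise, which is consistent with the smaller sample size the abstract reports when a mean of three data points is used.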

