Nowcasting Sexually Transmitted Infections in Chicago: Predictive Modeling and Evaluation Study Using Google Trends

10.2196/20588 ◽  
2020 ◽  
Vol 6 (4) ◽  
pp. e20588
Author(s):  
Amy Kristen Johnson ◽  
Runa Bhaumik ◽  
Irina Tabidze ◽  
Supriya D Mehta

Background Sexually transmitted infections (STIs) pose a significant public health challenge in the United States. Traditional surveillance systems are adversely affected by data quality issues, underreporting of cases, and reporting delays, resulting in missed opportunities to respond to trends in disease prevalence. Search engine data can potentially provide an efficient and economical enhancement to the surveillance reporting systems established for STIs. Objective We aimed to develop and train a predictive model using reported STI case data from Chicago, Illinois, and to investigate the model’s predictive capacity, timeliness, and ability to target interventions to subpopulations using Google Trends data. Methods Deidentified STI case data for chlamydia, gonorrhea, and primary and secondary syphilis from 2011-2017 were obtained from the Chicago Department of Public Health. The data set included race/ethnicity, age, and birth sex. Google Correlate was used to identify the 100 search terms most correlated with “STD symptoms,” and an autocrawler was established using the Google Health Application Programming Interface to collect the search volume for each term. Elastic net regression was used to evaluate prediction accuracy, and cross-correlation analysis was used to assess the timeliness of prediction. Subgroup elastic net regression analysis was performed for race, sex, and age. Results For gonorrhea and chlamydia, actual and predicted STI values correlated moderately in 2011 (chlamydia: r=0.65; gonorrhea: r=0.72) but correlated highly (chlamydia: r=0.90; gonorrhea: r=0.94) from 2012 to 2017. However, for primary and secondary syphilis, high correlation was observed only for 2012 (r=0.79), 2013 (r=0.77), 2016 (r=0.80), and 2017 (r=0.84), with 2011, 2014, and 2015 showing moderate correlations (r=0.55-0.70). Model performance was the most accurate (highest correlation and lowest mean absolute error) for gonorrhea. Subgroup analyses improved model fit across disease and year. Regression models using search terms selected from the cross-correlation analysis improved the prediction accuracy and timeliness across diseases and years. Conclusions Integrating nowcasting with Google Trends in surveillance activities can potentially enhance the prediction and timeliness of outbreak detection and response as well as target interventions to subpopulations. Future studies should prospectively examine the utility of Google Trends applied to STI surveillance and response.
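The timeliness step in the Methods above rests on cross-correlation analysis: correlating search volume with case counts at a range of time shifts and noting which shift gives the strongest correlation. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the weekly series, and the lag convention (a positive lag means searches precede cases) are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cross_correlation(searches, cases, max_lag):
    """Return {lag: r}, pairing searches[t] with cases[t + lag].

    Assumed convention: a positive lag means search volume leads cases.
    """
    n = len(searches)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            s, c = searches[:n - lag], cases[lag:]
        else:
            s, c = searches[-lag:], cases[:n + lag]
        result[lag] = pearson(s, c)
    return result

# Hypothetical weekly data: cases echo search volume two weeks later.
searches = [3, 5, 9, 4, 2, 6, 8, 3, 1, 7, 9, 5]
cases = [0, 0] + searches[:-2]

ccf = cross_correlation(searches, cases, max_lag=3)
best_lag = max(ccf, key=ccf.get)
print(best_lag, round(ccf[best_lag], 3))
```

In this toy series the strongest correlation falls at a lag of two weeks, which is the kind of lead time a nowcasting model would try to exploit. Real surveillance series are noisier and usually need detrending or prewhitening before the lags are interpreted.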


2021 ◽  
Author(s):  
Yuying Chu ◽  
Xue Wang ◽  
Hongliang Dai

BACKGROUND Since December 2019, an unexplained pneumonia has broken out in Wuhan, Hubei Province, China. To prevent the rapid spread of this disease, the Chinese government imposed quarantine or lockdown measures. These measures proved effective in containing the contagious disease. Quarantine itself, however, could pose certain health risks to the affected population, such as sleep disorders. OBJECTIVE The aims of this work were to analyze the volume of insomnia-related searches during the COVID-19 outbreak in China, to explore the potential use of the Baidu Index for monitoring social and psychological distress, and to help community health workers provide timely and effective interventions for the public. METHODS In the context of the pandemic, we conducted a descriptive analysis of the overall search situation. Spearman's correlation and cross-correlation analysis were used to explore the relationship between daily search index values for insomnia-related terms and daily newly confirmed cases. The mean search volumes for insomnia-related terms during the COVID-19 quarantine or lockdown period (January 23, 2020, to April 8, 2020) were compared with those during 2016 to 2019 using a Student's t test. Finally, by analyzing the overall daily mean of insomnia-related searches in each province, we evaluated whether there were regional differences in searching for insomnia during COVID-19 isolation. RESULTS During the COVID-19 lockdown, the number of insomnia-related searches increased significantly, with the average daily Baidu Index for “the best treatment for insomnia” reaching 5,923.86. Seventeen of the 24 insomnia-related search terms were associated with daily newly confirmed cases, of which “a simple cure for insomnia” had the closest correlation (r=0.676; P<.001). The cross-correlation analysis also verified the forward or backward time correlation between daily newly confirmed cases and the search terms. Compared with the same period in the past four years, a significant change in insomnia-related search volume was found during the COVID-19 quarantine period. We also found that all provinces suffered from insomnia during the quarantine period, with Guangdong province being the leading area for insomnia-related searches. CONCLUSIONS Quarantine measures have led to an increase in insomnia-related searches during the COVID-19 pandemic. Community medical staff should use big data-based tools to screen for insomnia and mental health problems. Early interventions for insomnia and associated mental health problems are also essential for preventing and reducing the long-term impact of major traumatic events.
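Spearman's correlation, used in the Methods above to relate daily search index values to daily newly confirmed cases, is the Pearson correlation computed on ranks rather than raw values. The sketch below illustrates that definition only. The function names and the five-day series are hypothetical and are not taken from the study.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)

# Hypothetical daily search index values and newly confirmed cases.
daily_index = [120, 150, 150, 300, 410]
new_cases = [10, 40, 35, 90, 200]

rho = spearman(daily_index, new_cases)
print(round(rho, 3))
```

Because it works on ranks, the statistic is insensitive to the scale of the search index and to any monotone transformation of either series, which suits index values that are not raw query counts.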


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Ming-Yang Wang ◽  
Nai-jun Tang

Abstract Background Salmonella infection (salmonellosis) is a common infectious disease leading to gastroenteritis, dehydration, uveitis, etc. Internet search data offer a new way to monitor outbreaks of infectious diseases. A surveillance system based on internet data is logistically advantageous and economical for tracking diseases through related search terms. In this study, we sought to determine the relationship between salmonellosis and Google Trends in the USA from January 2004 to December 2017. Methods We downloaded reported salmonellosis data for the USA from the National Outbreak Reporting System (NORS) from January 2004 to December 2017. Additionally, we downloaded the Google search terms related to salmonellosis from Google Trends for the same period. Cross-correlation analysis and multiple regression analysis were conducted. Results The results showed that 6 Google Trends search terms appeared earlier than reported salmonellosis, 26 Google Trends search terms coincided with salmonellosis, and 16 Google Trends search terms appeared after salmonellosis was reported. When the search terms preceded outbreaks, “foods” (t = 2.927, P = 0.004) was a predictor of salmonellosis. When the search terms coincided with outbreaks, “hotel” (t = 1.854, P = 0.066), “poor sanitation” (t = 2.895, P = 0.004), “blueberries” (t = 2.441, P = 0.016), and “hypovolemic shock” (t = 2.001, P = 0.047) were predictors of salmonellosis. When the search terms appeared after outbreaks, “ice cream” (t = 3.077, P = 0.002) was the predictor of salmonellosis. Finally, we identified the most important indicators among the Google Trends search terms, including “hotel” (t = 1.854, P = 0.066), “poor sanitation” (t = 2.895, P = 0.004), “blueberries” (t = 2.441, P = 0.016), and “hypovolemic shock” (t = 2.001, P = 0.047). In the future, increased search activity for these terms might indicate salmonellosis. Conclusion We evaluated the Google Trends search terms related to salmonellosis and identified the most important predictors of salmonellosis outbreaks.


2019 ◽  
Vol 11 (1) ◽  
pp. 01025-1-01025-5 ◽  
Author(s):  
N. A. Borodulya ◽  
R. O. Rezaev ◽  
S. G. Chistyakov ◽  
E. I. Smirnova ◽  
...  

Sensors ◽  
2018 ◽  
Vol 18 (5) ◽  
pp. 1571 ◽  
Author(s):  
Jhonatan Camacho Navarro ◽  
Magda Ruiz ◽  
Rodolfo Villamizar ◽  
Luis Mujica ◽  
Jabid Quiroga

2010 ◽  
Vol 09 (02) ◽  
pp. 203-217 ◽  
Author(s):  
XIAOJUN ZHAO ◽  
PENGJIAN SHANG ◽  
YULEI PANG

This paper reports the statistics of extreme values and positions of extreme events in Chinese stock markets. An extreme event is defined as an event exceeding a certain threshold of normalized logarithmic return. Extreme values follow a piecewise function or a power law distribution, determined by the threshold, due to a crossover. Extreme positions are studied through the return intervals of extreme events, and the return intervals are found to follow a stretched exponential function. According to correlation analysis, extreme values and return intervals are weakly correlated, and the correlation decreases with increasing threshold. No long-term cross-correlation is found using the detrended cross-correlation analysis (DCCA) method. We successfully introduce a modification specific to the correlation and derive the joint cumulative distribution of extreme values and return intervals at the 95% confidence level.
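The definitions in this abstract (normalized logarithmic return, threshold exceedance, return interval) can be made concrete with a short sketch. It is an illustration under stated assumptions rather than the paper's procedure: the function names are hypothetical, returns are normalized by subtracting the mean and dividing by the population standard deviation, and an event counts as extreme when the absolute normalized return exceeds the threshold. The abstract does not say whether the paper uses absolute or signed returns.

```python
from math import log, sqrt

def normalized_log_returns(prices):
    """Log returns rescaled to zero mean and unit (population) variance."""
    returns = [log(b / a) for a, b in zip(prices, prices[1:])]
    mean = sum(returns) / len(returns)
    std = sqrt(sum((r - mean) ** 2 for r in returns) / len(returns))
    return [(r - mean) / std for r in returns]

def return_intervals(norm_returns, threshold):
    """Positions of extreme events and the gaps between consecutive ones.

    Assumption: an event is extreme when |return| > threshold.
    """
    positions = [t for t, r in enumerate(norm_returns) if abs(r) > threshold]
    intervals = [b - a for a, b in zip(positions, positions[1:])]
    return positions, intervals

# Hypothetical normalized returns with three threshold exceedances.
sample = [0.1, 2.5, -0.3, 0.2, -3.0, 0.4, 2.1]
positions, intervals = return_intervals(sample, threshold=2.0)
print(positions, intervals)
```

Raising the threshold leaves fewer, more widely spaced events, which is why the distribution of return intervals is studied as a function of the threshold.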

