Machine Learning Models of COVID-19 Cases in the United States: A Study of Initial Lockdown and Reopen Regimes

Applied Sciences ◽

10.3390/app112311227 ◽

2021 ◽

Vol 11 (23) ◽

pp. 11227

Author(s):

Arnold Kamis ◽

Yudan Ding ◽

Zhenzhen Qu ◽

Chenchen Zhang

Keyword(s):

United States ◽

Machine Learning ◽

Additive Model ◽

Regression Tree ◽

Predictor Variable ◽

The United States ◽

Predictor Variables ◽

Future Research ◽

Machine Learning Methods ◽

Variance Explained

The purpose of this paper is to model the cases of COVID-19 in the United States from 13 March 2020 to 31 May 2020. Our novel contribution is that we have obtained highly accurate models focused on two different regimes, lockdown and reopen, modeling each regime separately. The predictor variables include aggregated individual movement as well as state population density, health rank, climate temperature, and political color. We apply a variety of machine learning methods to each regime: Multiple Regression, Ridge Regression, Elastic Net Regression, Generalized Additive Model, Gradient Boosted Machine, Regression Tree, Neural Network, and Random Forest. We discover that Gradient Boosted Machines are the most accurate in both regimes. The best models achieve a variance explained of 95.2% in the lockdown regime and 99.2% in the reopen regime. We describe the influence of the predictor variables as they change from regime to regime. Notably, we identify individual person movement, as tracked by GPS data, to be an important predictor variable. We conclude that government lockdowns are an extremely important de-densification strategy. Implications and questions for future research are discussed.
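The "variance explained" figures quoted above are usually computed as the coefficient of determination, R^2 = 1 - SS_res / SS_tot, and the abstract reports one value per regime. The following is a minimal sketch of that calculation, assuming the models are scored separately for each regime; the function names and the toy numbers are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def variance_explained(y_true, y_pred):
    """Return R^2 = 1 - SS_res / SS_tot for paired observations."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two or more paired observations")
    mean_y = fmean(y_true)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


def score_by_regime(rows):
    """Group (regime, observed, predicted) rows and score each regime alone."""
    grouped = {}
    for regime, observed, predicted in rows:
        ys, ps = grouped.setdefault(regime, ([], []))
        ys.append(observed)
        ps.append(predicted)
    return {regime: variance_explained(ys, ps) for regime, (ys, ps) in grouped.items()}


# Hypothetical toy data: (regime, observed cases, predicted cases).
toy_rows = [
    ("lockdown", 10.0, 11.0),
    ("lockdown", 20.0, 19.0),
    ("lockdown", 30.0, 31.0),
    ("reopen", 40.0, 40.0),
    ("reopen", 50.0, 51.0),
    ("reopen", 60.0, 60.0),
]
scores = score_by_regime(toy_rows)
for regime, r2 in scores.items():
    print(f"{regime}: variance explained = {r2:.1%}")
```

Scoring each regime on its own rows, rather than pooling them, mirrors the abstract's point that the two regimes are modeled separately.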

Machine Learning Methods to Identify Missed Cases of Bladder Cancer in Population-Based Registries

JCO Clinical Cancer Informatics ◽

10.1200/cci.20.00170 ◽

2021 ◽

pp. 641-653

Author(s):

Anne-Michelle Noone ◽

Clara J. K. Lam ◽

Angela B. Smith ◽

Matthew E. Nielsen ◽

Eric Boyd ◽

...

Keyword(s):

United States ◽

Machine Learning ◽

Bladder Cancer ◽

Cancer Incidence ◽

Cancer Registries ◽

The United States ◽

Population Based ◽

Learning Methods ◽

Machine Learning Methods ◽

Classification And Regression

PURPOSE Population-based cancer incidence rates of bladder cancer may be underestimated. Accurate estimates are needed for understanding the burden of bladder cancer in the United States. We developed and evaluated the feasibility of a machine learning–based classifier to identify bladder cancer cases missed by cancer registries, and estimated the rate of bladder cancer cases potentially missed. METHODS Data were from a population-based cohort of 37,940 bladder cancer cases 65 years of age and older in the SEER cancer registries linked with Medicare claims (2007-2013). Cases with other urologic cancers, abdominal cancers, and unrelated cancers were included as control groups. A cohort of cancer-free controls was also selected using the Medicare 5% random sample. We used five supervised machine learning methods: classification and regression trees, random forest, logic regression, support vector machines, and logistic regression, for predicting bladder cancer. RESULTS Registry linkages yielded 37,940 bladder cancer cases and 766,303 cancer-free controls. Using health insurance claims, classification and regression trees distinguished bladder cancer cases from noncancer controls with very high accuracy (95%). Bacille Calmette-Guerin, cystectomy, and mitomycin were the most important predictors for identifying bladder cancer. From 2007 to 2013, we estimated that up to 3,300 bladder cancer cases in the United States may have been missed by the SEER registries. This would result in an average of 3.5% increase in the reported incidence rate. CONCLUSION SEER cancer registries may potentially miss bladder cancer cases during routine reporting. These missed cases can be identified leveraging Medicare claims and data analytics, leading to more accurate estimates of bladder cancer incidence.

A Brief Analysis of Key Machine Learning Methods for Predicting Medicare Payments Related to Physical Therapy Practices in the United States

Information ◽

10.3390/info12020057 ◽

2021 ◽

Vol 12 (2) ◽

pp. 57

Author(s):

Shrirang A. Kulkarni ◽

Jodh S. Pannu ◽

Andriy V. Koval ◽

Gabriel J. Merrin ◽

Varadraj P. Gurupur ◽

...

Keyword(s):

United States ◽

Machine Learning ◽

Physical Therapy ◽

Random Forest ◽

Generalized Additive Model ◽

Additive Model ◽

The United States ◽

Random Forest Regression ◽

Key Variables ◽

Machine Learning Models

Background and objectives: Machine learning approaches using random forest have been effectively used to provide decision support in health and medical informatics. This is especially true when predicting variables associated with Medicare reimbursements. However, more work is needed to analyze and predict data associated with reimbursements through Medicare and Medicaid services for physical therapy practices in the United States. The key objective of this study is to analyze different machine learning models to predict key variables associated with Medicare standardized payments for physical therapy practices in the United States. Materials and Methods: This study employs five methods, namely, multiple linear regression, decision tree regression, random forest regression, K-nearest neighbors, and linear generalized additive model (GAM), to predict key variables associated with Medicare payments for physical therapy practices in the United States. Results: The study described in this article adds to the body of knowledge on the effective use of random forest regression and linear generalized additive model in predicting Medicare standardized payment. Random forest regression may have an edge over the other methods employed for this purpose. Conclusions: The study provides useful insight into the comparative performance of the aforementioned methods, identifies a few intricate details associated with predicting Medicare costs, and finds linear generalized additive model and random forest regression to be the most suitable machine learning models for predicting key variables associated with standardized Medicare payments.
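Of the five methods listed above, K-nearest neighbors is the simplest to show in a few lines. The sketch below is illustrative only: it assumes numeric feature tuples, Euclidean distance, and an unweighted mean of the neighbors' targets, and it scores predictions with root mean squared error. The features, payment values, and function names are hypothetical and do not come from the study.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict a target as the mean of the k nearest training targets."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training rows")
    # Rank training rows by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data: one feature per practice and a payment amount.
train_x = [(1.0,), (2.0,), (3.0,), (10.0,)]
train_y = [10.0, 20.0, 30.0, 100.0]

held_out_x = [(2.0,), (9.0,)]
held_out_y = [20.0, 90.0]
predictions = [knn_predict(train_x, train_y, x, k=3) for x in held_out_x]
print("hold-out RMSE:", rmse(held_out_y, predictions))
```

Comparing methods then amounts to computing the same hold-out error for each model and ranking the results.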

The #MeToo Movement in the United States: Text Analysis of Early Twitter Conversations

Journal of Medical Internet Research ◽

10.2196/13837 ◽

2019 ◽

Vol 21 (9) ◽

pp. e13837 ◽

Cited By ~ 2

Author(s):

Sepideh Modrek ◽

Bozhidar Chakalov

Keyword(s):

United States ◽

Machine Learning ◽

Sexual Assault ◽

Sexual Harassment ◽

Early Life ◽

English Language ◽

Life Experiences ◽

The United States ◽

Learning Methods ◽

Machine Learning Methods

Background The #MeToo movement sparked an international debate on sexual harassment, abuse, and assault and has taken many directions since its inception in October of 2017. Much of the early conversation took place on public social media sites such as Twitter, where the hashtag movement began. Objective The aim of this study is to document, characterize, and quantify early public discourse and conversation of the #MeToo movement from Twitter data in the United States. We focus on posts with public first-person revelations of sexual assault/abuse and early life experiences of such events. Methods We purchased full tweets and associated metadata from the Twitter Premium application programming interface between October 14 and 21, 2017 (ie, the first week of the movement). We examined the content of novel English language tweets with the phrase “MeToo” from within the United States (N=11,935). We used machine learning methods, namely least absolute shrinkage and selection operator regression and support vector machine models, to summarize and classify the content of individual tweets with revelations of sexual assault and abuse and early life experiences of sexual assault and abuse. Results We found that the most predictive words created a vivid archetype of the revelations of sexual assault and abuse. We then estimated that in the first week of the movement, 11% of novel English language tweets with the words “MeToo” revealed details about the poster’s experience of sexual assault or abuse and 5.8% revealed early life experiences of such events. We examined the demographic composition of posters of sexual assault and abuse and found that white women aged 25-50 years were overrepresented relative to their representation on Twitter. Furthermore, we found that the mass sharing of personal experiences of sexual assault and abuse had a large reach, where 6 to 34 million Twitter users may have seen such first-person revelations from someone they followed in the first week of the movement. Conclusions These data illustrate that revelations shared went beyond acknowledgement of having experienced sexual harassment and often included vivid and traumatic descriptions of early life experiences of assault and abuse. These findings and methods underscore the value of content analysis, supported by novel machine learning methods, to improve our understanding of how widespread the revelations were, which likely amplified the spread and saliency of the #MeToo movement.

The #MeToo Movement in the United States: Text Analysis of Early Twitter Conversations (Preprint)

10.2196/preprints.13837 ◽

2019 ◽

Author(s):

Sepideh Modrek ◽

Bozhidar Chakalov

Keyword(s):

United States ◽

Machine Learning ◽

Sexual Assault ◽

Sexual Harassment ◽

Early Life ◽

English Language ◽

Life Experiences ◽

The United States ◽

Learning Methods ◽

Machine Learning Methods

BACKGROUND The #MeToo movement sparked an international debate on sexual harassment, abuse, and assault and has taken many directions since its inception in October of 2017. Much of the early conversation took place on public social media sites such as Twitter, where the hashtag movement began. OBJECTIVE The aim of this study is to document, characterize, and quantify early public discourse and conversation of the #MeToo movement from Twitter data in the United States. We focus on posts with public first-person revelations of sexual assault/abuse and early life experiences of such events. METHODS We purchased full tweets and associated metadata from the Twitter Premium application programming interface between October 14 and 21, 2017 (ie, the first week of the movement). We examined the content of novel English language tweets with the phrase “MeToo” from within the United States (N=11,935). We used machine learning methods, namely least absolute shrinkage and selection operator regression and support vector machine models, to summarize and classify the content of individual tweets with revelations of sexual assault and abuse and early life experiences of sexual assault and abuse. RESULTS We found that the most predictive words created a vivid archetype of the revelations of sexual assault and abuse. We then estimated that in the first week of the movement, 11% of novel English language tweets with the words “MeToo” revealed details about the poster’s experience of sexual assault or abuse and 5.8% revealed early life experiences of such events. We examined the demographic composition of posters of sexual assault and abuse and found that white women aged 25-50 years were overrepresented relative to their representation on Twitter. Furthermore, we found that the mass sharing of personal experiences of sexual assault and abuse had a large reach, where 6 to 34 million Twitter users may have seen such first-person revelations from someone they followed in the first week of the movement. CONCLUSIONS These data illustrate that revelations shared went beyond acknowledgement of having experienced sexual harassment and often included vivid and traumatic descriptions of early life experiences of assault and abuse. These findings and methods underscore the value of content analysis, supported by novel machine learning methods, to improve our understanding of how widespread the revelations were, which likely amplified the spread and saliency of the #MeToo movement.

Exploring the Relationship Between Chlorophyll-a and Other Water Quality Parameters by Using Machine Learning Methods: A Case Study of Lake Erie

10.5194/egusphere-egu21-14933 ◽

2021 ◽

Author(s):

Xue Hu ◽

Jinhui Jeanne Huang ◽

Yu Li

Keyword(s):

Neural Network ◽

United States ◽

Machine Learning ◽

Water Quality ◽

Chlorophyll A ◽

Lake Erie ◽

The United States ◽

Learning Methods ◽

Machine Learning Methods ◽

Input Variables

Chlorophyll a (CHLA) is a key water quality indicator for the eutrophication of Lake Erie. To better predict the concentration of CHLA, this study divided Lake Erie into United States and Canadian parts along the national boundary and identified the input variables most relevant to CHLA in each part. The most relevant variable was total phosphorus (TP) for the United States part and total nitrogen (TN) for the Canadian part, and the analysis attributes the excessive TP and TN content to industrial and agricultural pollution around Lake Erie. The study used machine learning methods to model the water quality of the two parts separately. The data used in the modelling was obtained from the Canadian Environment and Climate Change Agency for Lake Erie between 2000 and 2018. Several neural network (NN) models and other machine learning methods are used for data analysis, including standard neural network (NN) models, simple recurrent neural network (SRN) models, backpropagation neural network (BPNN) models, jump connections neural network (JCNN) model, random forest (RF) and support vector machine (SVM). The most suitable combinations of input variables for CHLA prediction were also found: TP, TN, DO, and T for the United States part, and TP, TN, pH, and DO for the Canadian part. Combining this result with the environmental protection policies of the United States and Canada, recommendations for reducing the pollutant content of Lake Erie were proposed. This will help reduce the risk of eutrophication in Lake Erie.

The Dynamics of Political Incivility on Twitter

SAGE Open ◽

10.1177/2158244020919447 ◽

2020 ◽

Vol 10 (2) ◽

pp. 215824402091944 ◽

Cited By ~ 1

Author(s):

Yannis Theocharis ◽

Pablo Barberá ◽

Zoltán Fazekas ◽

Sebastian Adrian Popa

Keyword(s):

United States ◽

Machine Learning ◽

Political Communication ◽

The United States ◽

Time Span ◽

Supervised Machine Learning ◽

Machine Learning Methods ◽

Policy Debates ◽

Political Events ◽

Descriptive Account

Online incivility and harassment in political communication have become an important topic of concern among politicians, journalists, and academics. This study provides a descriptive account of uncivil interactions between citizens and politicians on Twitter. We develop a conceptual framework for understanding the dynamics of incivility at three distinct levels: macro (temporal), meso (contextual), and micro (individual). Using longitudinal data from the Twitter communication mentioning Members of Congress in the United States across a time span of over a year and relying on supervised machine learning methods and topic models, we offer new insights about the prevalence and dynamics of incivility toward legislators. We find that uncivil tweets represent consistently around 18% of all tweets mentioning legislators, but with spikes that correspond to controversial policy debates and political events. Although we find evidence of coordinated attacks, our analysis reveals that the use of uncivil language is common to a large number of users.
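The headline figure above, a roughly constant share of uncivil tweets with occasional spikes, can be illustrated with a small aggregation over classifier output. This is only a sketch under stated assumptions: each tweet is assumed to already carry a period label and a boolean "uncivil" flag from some upstream classifier, and the spike rule (share above a baseline plus a fixed margin) is a hypothetical choice, not the authors' method.

```python
from collections import Counter


def uncivil_share_by_period(labeled):
    """Return {period: share of tweets flagged uncivil} from (period, flag) pairs."""
    totals = Counter()
    uncivil = Counter()
    for period, is_uncivil in labeled:
        totals[period] += 1
        if is_uncivil:
            uncivil[period] += 1
    return {period: uncivil[period] / totals[period] for period in totals}


def spike_periods(shares, baseline, margin):
    """List periods whose uncivil share exceeds baseline + margin."""
    return sorted(p for p, share in shares.items() if share > baseline + margin)


# Hypothetical toy data: ten tweets per period, flags from an assumed classifier.
toy = [("week-1", i < 2) for i in range(10)] + [("week-2", i < 5) for i in range(10)]
shares = uncivil_share_by_period(toy)
print(shares)
print(spike_periods(shares, baseline=0.18, margin=0.10))
```

The baseline of 0.18 echoes the share reported in the abstract; the margin is arbitrary and would need to be chosen from the data.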

Forecast Accuracy Matters for Hurricane Damage

Econometrics ◽

10.3390/econometrics8020018 ◽

2020 ◽

Vol 8 (2) ◽

pp. 18

Author(s):

Andrew B. Martinez

Keyword(s):

United States ◽

Machine Learning ◽

Forecast Accuracy ◽

The United States ◽

Hurricane Damage ◽

Machine Learning Methods ◽

Wide Range ◽

Ex Ante ◽

Landfall Location ◽

The U.S

I analyze damage from hurricane strikes on the United States since 1955. Using machine learning methods to select the most important drivers for damage, I show that large errors in a hurricane’s predicted landfall location result in higher damage. This relationship holds across a wide range of model specifications and when controlling for ex-ante uncertainty and potential endogeneity. Using a counterfactual exercise I find that the cumulative reduction in damage from forecast improvements since 1970 is about $82 billion, which exceeds the U.S. government’s spending on the forecasts and private willingness to pay for them.

Fragments of the American Dream: Immigration, Race, and Medical Care in the Segregated South, 1929

Public Voices ◽

10.22140/pv.111 ◽

2016 ◽

Vol 13 (2) ◽

pp. 1

Author(s):

John R Phillips

Keyword(s):

United States ◽

American Dream ◽

Black Population ◽

The United States ◽

Single Woman ◽

Future Research ◽

Legal Doctrine ◽

Quarter Century ◽

History Of ◽

Sunflower County

The cover photograph for this issue of Public Voices was taken sometime in the summer of 1929 (probably June) somewhere in Sunflower County, Mississippi. Very probably the photo was taken in Indianola but, perhaps, it was Ruleville. It is one of three such photos, one of which does have the annotation on the reverse “Ruleville Midwives Club 1929.” The young woman wearing a tie in this and in one of the other photos was Ann Reid Brown, R.N., then a single woman having only arrived in the United States from Scotland a few years before, in 1923. Full disclosure: This commentary on the photo combines professional research interests in public administration and public policy with personal interests—family interests—for that young nurse later married and became the author’s mother. From the scholarly perspective, such photographs have been seen as “instrumental in establishing midwives’ credentials and cultural identity at a key transitional moment in the history of the midwife and of public health” (Keith, Brennan, & Reynolds 2012). There is also deep irony if we see these photographs as being a fragment of the American dream, of a recent immigrant’s hope for and success at achieving that dream; but that fragment of the vision is understood quite differently when we see that she began a hopeful career working with a Black population forcibly segregated by law under the incongruously named “separate but equal” legal doctrine. That doctrine, derived from the United States Supreme Court’s 1896 decision, Plessy v. Ferguson, would remain the foundation for legally enforced segregation throughout the South for another quarter century. The options open to the young, white, immigrant nurse were almost entirely closed off for the population with which she then worked. The remaining parts of this overview are meant to provide the following: (1) some biographical information on the nurse; (2) a description, in so far as we know it, of why she was in Mississippi; and (3) some indication of areas for future research on this and related topics.

Questions

10.1093/oso/9780190865214.003.0007 ◽

2018 ◽

Cited By ~ 1

Author(s):

James L. Gibson ◽

Michael J. Nelson

Keyword(s):

United States ◽

African Americans ◽

Supreme Court ◽

Legal System ◽

Public Support ◽

The United States ◽

Future Research ◽

White Americans ◽

Two Factors ◽

The U.S

We have investigated the differences in support for the U.S. Supreme Court among black, Hispanic, and white Americans, catalogued the variation in African Americans’ group attachments and experiences with legal authorities, and examined how those latter two factors shape individuals’ support for the U.S. Supreme Court, that Court’s decisions, and for their local legal system. We take this opportunity to weave our findings together, taking stock of what we have learned from our analyses and what seem like fruitful paths for future research. In the process, we revisit Positivity Theory. We present a modified version of the theory that we hope will guide future inquiry on public support for courts, both in the United States and abroad.

Introduction

10.1093/oxfordhb/9780190248178.013.1 ◽

2017 ◽

Author(s):

Travis D. Stimeling

Keyword(s):

United States ◽

The United States ◽

Country Music ◽

Future Research ◽

Music Culture ◽

Rich Variety ◽

Ethnographic Studies ◽

Folklore Studies ◽

History Of ◽

The Rich

This chapter offers a historiographic survey of country music scholarship from the publication of Bill C. Malone’s “A History of Commercial Country Music in the United States, 1920–1964” (1965) to the leading publications of today. Very little of substance has been written on country music recorded since the 1970s, especially when compared to the wealth of available literature on early country recording artists. Ethnographic studies of country music and country music culture are rare, and including ethnographic methods in country music studies offers new insights into the rich variety of ways in which people make, consume, and engage with country music as a genre. The chapter traces the influence of folklore studies, sociology, cultural studies, and musicology on the development of country music studies and proposes some directions for future research in the field.
