Machine Learning Predictions as Regression Covariates

2020 ◽  
pp. 1-18
Author(s):  
Christian Fong ◽  
Matthew Tyler

Abstract In text, images, merged surveys, voter files, and elsewhere, data sets are often missing important covariates, either because they are latent features of observations (such as sentiment in text) or because they are not collected (such as race in voter files). One promising approach for coping with this missing data is to find the true values of the missing covariates for a subset of the observations and then train a machine learning algorithm to predict the values of those covariates for the rest. However, plugging in these predictions without regard for prediction error renders regression analyses biased, inconsistent, and overconfident. We characterize the severity of the problem posed by prediction error, describe a procedure to avoid these inconsistencies under comparatively general assumptions, and demonstrate the performance of our estimators through simulations and a study of hostile political dialogue on the Internet. We provide software implementing our approach.
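The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure or software; it is a minimal illustration under assumed conditions: a balanced binary covariate, a classifier whose errors are independent label flips at a fixed rate, and a linear outcome model. All names (`ols_slope`, `simulate`, `error_rate`) are hypothetical. Under these assumptions the plug-in slope shrinks toward zero by a factor of roughly 1 - 2 * error_rate.

```python
import numpy as np


def ols_slope(x, y):
    """Return the OLS slope from regressing y on x with an intercept."""
    design = np.column_stack([np.ones_like(x, dtype=float), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def simulate(n=200_000, beta=2.0, error_rate=0.2, seed=0):
    """Compare the slope on the true covariate with the plug-in slope.

    Assumption: the "machine learning prediction" is the true label
    flipped independently with probability `error_rate`.
    """
    rng = np.random.default_rng(seed)
    x_true = rng.integers(0, 2, size=n)           # true binary covariate
    flip = rng.random(n) < error_rate             # which predictions are wrong
    x_pred = np.where(flip, 1 - x_true, x_true)   # noisy plug-in covariate
    y = 1.0 + beta * x_true + rng.normal(0.0, 1.0, size=n)
    return ols_slope(x_true, y), ols_slope(x_pred, y)


slope_true, slope_plugin = simulate()
print(f"slope using true covariate:      {slope_true:.3f}")
print(f"slope using predicted covariate: {slope_plugin:.3f}")
```

With the default settings, the slope on the true covariate should land close to the assumed coefficient of 2.0, while the plug-in slope should be noticeably smaller, even though the classifier is right 80% of the time.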

This project proposes a method for forecasting weather conditions and predicting rainfall using machine learning. There are two setups: one measures weather parameters such as temperature and humidity using sensors connected to an Arduino, and the other displays the current values (status) and the rainfall predicted from the trained machine learning data sets. Forecasting and prediction are based on previously collected datasets, which are compared with the current values. The user need not keep a backup of large amounts of data to predict rainfall; a machine learning algorithm suffices instead. Temperature and humidity sensor modules measure the weather parameters and are interfaced to an Arduino controller. The proposed setup compares the forecast value with real-time data and then predicts rainfall based on the dataset fed to the machine learning algorithm.


2021 ◽  
Vol 12 (26) ◽  
pp. 1-13
Author(s):  
Carlos Alberto Arango Pastrana ◽  
Carlos Fernando Osorio Andrade

To reduce the rate of contagion by Covid-19, the Colombian government adopted, among other measures, mandatory isolation. Opinions on the measure are divided because, despite helping to reduce the spread of the virus, it generates mental and economic problems that are difficult to overcome. The objective of this document was to analyze the underlying sentiments in Twitter comments related to isolation, identifying the topics and words most frequently used in this context. A machine learning algorithm was built to identify sentiments in 72,564 posts, and a social network analysis was applied to establish the most frequent topics in the data sets. The results suggest that the algorithm is highly accurate in classifying feelings. Also, as the isolation extends, comments related to the quarantine grow proportionally. Fear was identified as the predominant feeling throughout the period of confinement in Colombia.


2021 ◽  
Author(s):  
G.N. Balaji ◽  
S.V. Suryanarayana ◽  
P. Vijayaragavan

Wearing a mask during the coronavirus outbreak is necessary to deter the transmission of the COVID-19 virus efficiently. In these circumstances, traditional facial screening technologies are obsolete for monitoring group entry at airports, shopping malls, railway stations, etc. It is, therefore, vital to boost the efficiency of screening. This paper addresses a machine learning algorithm for contactless face screening systems in group participation, social interaction, school management, mall entry management, and market resumption scenarios in the case of COVID-19. A method to screen entry with masks is developed using machine learning, which depends on various face samples that are discussed here. The second part of the discussion concerns the earlier scarcity of freely accessible masked-face databases. To this end, various forms of masked face data sets are identified, namely MFDD, Real MFRD, and Simulated MFRD. These data sets have become widely accessible to businesses and academics, who can build specific applications for masked faces on them. The mathematical model is given together with the code. The availability and issues of the above data sets are discussed for the benefit of researchers.


2018 ◽  
Author(s):  
C.H.B. van Niftrik ◽  
F. van der Wouden ◽  
V. Staartjes ◽  
J. Fierstra ◽  
M. Stienen ◽  
...  
