Learning Latent Space Representations to Predict Patient Outcomes: Model Development and Validation (Preprint)

2019 ◽  
Author(s):  
Subendhu Rongali ◽  
Adam J Rose ◽  
David D McManus ◽  
Adarsha S Bajracharya ◽  
Alok Kapoor ◽  
...  

BACKGROUND Scalable and accurate health outcome prediction using electronic health record (EHR) data has gained much attention in research recently. Previous machine learning models have mostly ignored relations between different types of clinical data (ie, laboratory components, International Classification of Diseases codes, and medications). OBJECTIVE This study aimed to model such relations and build predictive models using EHR data from intensive care units. We developed innovative neural network models and compared them with the widely used logistic regression model and other state-of-the-art neural network models to predict patient mortality from longitudinal EHR data. METHODS We built a set of neural network models that we collectively call CLOUT: long short-term memory (LSTM) outcome prediction using comprehensive feature relations. Our CLOUT models use a correlational neural network to identify a latent space representation between different types of discrete clinical features during a patient's encounter and integrate this latent representation into an LSTM-based predictive model framework. In addition, we designed an ablation experiment to identify risk factors from our CLOUT models. Using physicians' input as the gold standard, we compared the risk factors identified by the CLOUT and logistic regression models. RESULTS Experiments on the Medical Information Mart for Intensive Care-III dataset (selected patient population: 7537) show that CLOUT (area under the receiver operating characteristic curve=0.89) surpassed logistic regression (0.82) and other baseline neural network models (<0.86). In addition, physicians' agreement with the CLOUT-derived risk factor rankings was statistically significantly higher than their agreement with the logistic regression model. CONCLUSIONS Our results support the applicability of CLOUT for real-world clinical use in identifying patients at high risk of mortality.
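The ablation experiment described above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy risk model, weights, and feature names below are hypothetical, and a real CLOUT ablation would zero features out of the trained LSTM model rather than a linear one. The idea carries over: remove each feature in turn and rank features by how much the predicted risk drops.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_risk(weights, features):
    # Toy stand-in for a trained outcome model (a real system would use
    # the trained LSTM-based predictor here).
    return sigmoid(sum(w * f for w, f in zip(weights, features)))

def ablation_risk_factors(weights, features, names):
    """Rank features by how much removing each one lowers predicted risk."""
    baseline = predict_risk(weights, features)
    drops = []
    for i, name in enumerate(names):
        ablated = list(features)
        ablated[i] = 0.0  # "remove" the feature for this ablation round
        drops.append((name, baseline - predict_risk(weights, ablated)))
    # Largest drop first: the features the model leans on most
    return sorted(drops, key=lambda t: t[1], reverse=True)

# Hypothetical feature names and learned weights, for illustration only
names = ["lactate", "creatinine", "age"]
weights = [1.5, 0.7, 0.2]
features = [1.0, 1.0, 1.0]
ranking = ablation_risk_factors(weights, features, names)
```

With these toy weights the ranking follows the weight magnitudes; with a nonlinear model the ablation drop also captures interactions that coefficients alone would miss.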

10.2196/16374 ◽  
2020 ◽  
Vol 22 (3) ◽  
pp. e16374


2018 ◽  
Vol 22 (5) ◽  
pp. 141-153
Author(s):  
N. A.  Bilev

In modern electronic stock exchanges it is possible to analyze event-driven market microstructure data. This data is highly informative and describes physical price formation, which makes it possible to find complex patterns in price dynamics. Finding such patterns with handcrafted rules is hard and time consuming; however, modern machine learning models can solve this automatically by learning price behavior, which is always changing. The present study presents a profitable trading system based on a machine learning model and market microstructure data. Data for the research was collected from the Moscow stock exchange MICEX and represents the limit order book change log and all market trades of a liquid security for a certain period. A logistic regression model was used and compared with neural network models of different configurations. According to the study results, the logistic regression model has almost the same prediction quality as the neural network models but also has a high speed of response, which is very important for stock market trading. The developed trading system submits deals at medium frequency, which lets it avoid the expensive infrastructure usually needed in high-frequency trading systems. At the same time, the system uses the potential of high-quality market microstructure data to the full extent. This paper describes the entire process of trading system development, including feature engineering, comparison of model behavior, and creation of a trading strategy with testing on historical data.
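The core modeling step can be sketched as follows. This is an illustrative toy, not the paper's pipeline: the "order-book imbalance" feature, the synthetic labels, and the single-feature model are assumptions. It shows why logistic regression is attractive here: a plain gradient-descent fit is tiny and the scoring step is a single dot product, so inference latency is negligible.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(samples, labels, lr=0.5, epochs=200):
    """Plain stochastic-gradient logistic regression: one weight + bias."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(w * x + b)
            w += lr * (y - p) * x  # gradient ascent on the log-likelihood
            b += lr * (y - p)
    return w, b

random.seed(0)
# Synthetic stand-in for an order-book imbalance feature: positive
# imbalance tends to precede an upward mid-price move, plus noise.
imbalance = [random.uniform(-1, 1) for _ in range(500)]
up_move = [1 if x + random.gauss(0, 0.3) > 0 else 0 for x in imbalance]

w, b = train_logreg(imbalance, up_move)
preds = [1 if sigmoid(w * x + b) > 0.5 else 0 for x in imbalance]
accuracy = sum(p == y for p, y in zip(preds, up_move)) / len(up_move)
```

Scoring a new tick is just `sigmoid(w * x + b)`, which is why a logistic model can match a small neural network on quality while being far faster to evaluate in a trading loop.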


Author(s):  
Byunghyun Kang ◽  
Cheol Choi ◽  
Daeun Sung ◽  
Seongho Yoon ◽  
Byoung-Ho Choi

In this study, friction tests are performed, via a custom-built friction tester, on specimens of natural rubber used in automotive suspension bushings. By analyzing the problematic suspension bushings, eleven candidate factors that influence squeak noise are selected: surface lubrication, hardness, vulcanization condition, surface texture, additive content, sample thickness, thermal aging, temperature, surface moisture, friction speed, and normal force. Through friction tests, changes in frictional force and squeak noise occurrence are investigated at various levels of the influencing factors. The degree of correlation of frictional force and squeak noise occurrence with these factors is determined through statistical tests, and the relationship between frictional force and squeak noise occurrence is discussed based on the test results. Squeak noise prediction models are constructed by considering the interactions among the influencing factors through both multiple logistic regression and neural network analysis. The accuracies of the two prediction models are evaluated by comparing predicted and measured results. The accuracies of the multiple logistic regression and neural network models in predicting the occurrence of squeak noise are 88.2% and 87.2%, respectively.
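In a multiple logistic regression, "considering the interactions among the influencing factors" typically means augmenting the design matrix with pairwise products of factors. The sketch below is an illustration of that feature-construction step only, under assumed factor names and values (three of the eleven factors), not the paper's actual model.

```python
from itertools import combinations

def add_interactions(row, names):
    """Augment one feature row with all pairwise interaction terms."""
    feats = dict(zip(names, row))
    for a, b in combinations(names, 2):
        # Interaction term: the product of the two factor values
        feats[f"{a}*{b}"] = feats[a] * feats[b]
    return feats

# Hypothetical values for three of the eleven candidate factors
names = ["temperature", "friction_speed", "normal_force"]
row = [40.0, 2.0, 10.0]
feats = add_interactions(row, names)
```

The augmented row (3 main effects + 3 interactions) would then be fed to the logistic regression, letting it capture, for example, a squeak-noise effect of friction speed that only appears at high normal force.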


2021 ◽  
Author(s):  
Li Lu Wei ◽  
Yu jian

Abstract Background Hypertension is a common chronic disease worldwide and a common underlying condition for cardiovascular and cerebral complications. Overweight and obesity are high-risk factors for hypertension. In this study, three statistical methods (a classification tree model, a logistic regression model, and a BP neural network) were used to screen risk factors for hypertension in an overweight and obese population, and the interaction of risk factors was analyzed. Early detection, diagnosis, and treatment of hypertension reduce the risk of its complications, so this analysis has clinical significance. Methods The classification tree model, logistic regression model, and BP neural network model were used to screen risk factors for hypertension in overweight and obese people. The specificity, sensitivity, and accuracy of the three models were evaluated by the receiver operating characteristic (ROC) curve. Finally, the classification tree CRT model was used to screen risk factors related to hypertension in overweight and obese people, and the unconditional logistic regression multiplication model was used to quantitatively analyze the interaction. Results The Youden indices of the ROC curves of the classification tree model, logistic regression model, and BP neural network model were 39.20%, 37.02%, and 34.85%; the sensitivities were 61.63%, 76.59%, and 82.85%; the specificities were 77.58%, 60.44%, and 52.00%; and the areas under the curve (AUC) were 0.721, 0.734, and 0.733, respectively. There was no significant difference in AUC between the three models (P>0.05). The classification tree CRT model and the logistic regression multiplication model suggested that the interaction between NAFLD and FPG was closely related to the prevalence of hypertension in overweight and obese people. Conclusion NAFLD, FPG, age, TG, UA, and LDL-C were risk factors for hypertension in overweight and obese people. The interaction between NAFLD and FPG increased the risk of hypertension.
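The Youden index used to compare the three models is simply sensitivity + specificity − 1 at the chosen operating point. The sketch below recomputes it from the sensitivities and specificities reported in the abstract (values agree with the reported indices up to rounding).

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0

# Sensitivity / specificity pairs as reported in the abstract
models = {
    "classification tree": (0.6163, 0.7758),
    "logistic regression": (0.7659, 0.6044),
    "BP neural network":   (0.8285, 0.5200),
}
j = {name: youden_index(se, sp) for name, (se, sp) in models.items()}
```

The index balances the two error types in one number: the tree trades sensitivity for specificity, the BP network the reverse, yet all three land within about 4 percentage points of each other, consistent with the nonsignificant AUC difference.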


Information ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 207
Author(s):  
Asma Baccouche ◽  
Begonya Garcia-Zapirain ◽  
Cristian Castillo Olea ◽  
Adel Elmaghraby

Heart diseases are highly ranked among the leading causes of mortality in the world. They have various types, including vascular, ischemic, and hypertensive heart disease. A large number of medical features are reported for patients in Electronic Health Records (EHR) that allow physicians to diagnose and monitor heart disease. We collected a dataset from Medica Norte Hospital in Mexico that includes 800 records and 141 indicators such as age, weight, glucose, blood pressure rate, and clinical symptoms. The distribution of the collected records across the different types of heart disease is very unbalanced: 17% of records have hypertensive heart disease, 16% have ischemic heart disease, 7% have mixed heart disease, and 8% have valvular heart disease. Herein, we propose an ensemble-learning framework of different neural network models and a method of aggregating random under-sampling. To improve the performance of the classification algorithms, we implement a data preprocessing step with feature selection. Experiments were conducted with unidirectional and bidirectional neural network models, and results showed that an ensemble classifier combining a BiLSTM or BiGRU model with a CNN model had the best classification performance, with accuracy and F1-score between 91% and 96% for the different types of heart disease. These results are competitive and promising for the heart disease dataset. We showed that an ensemble-learning framework based on deep models could overcome the problem of classifying an unbalanced heart disease dataset. Our proposed framework can lead to highly accurate models that are adapted for real clinical data and diagnosis use.
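Aggregated random under-sampling can be sketched as follows. This is an illustrative interpretation, not the authors' code: each round keeps the full minority class and pairs it with an equally sized random draw from the majority class, and one ensemble member is trained per balanced subset; the class sizes below are hypothetical.

```python
import random

def undersample_rounds(majority, minority, rounds, seed=0):
    """Aggregated random under-sampling: each round pairs the full minority
    class with an equally sized random draw from the majority class."""
    rng = random.Random(seed)
    subsets = []
    for _ in range(rounds):
        draw = rng.sample(majority, len(minority))  # balanced majority draw
        subsets.append(draw + minority)
    return subsets

# Hypothetical record IDs: 100 records without a given heart-disease type
# and 16 with it (roughly the 16% ischemic share mentioned above).
majority = list(range(100))
minority = list(range(100, 116))
balanced_sets = undersample_rounds(majority, minority, rounds=5)
```

Training one classifier per balanced subset and aggregating their votes lets the ensemble see every majority record across rounds while each member trains on balanced data, which is how under-sampling avoids biasing the classifier toward the dominant class.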

