Machine Learning for the Diagnosis of Orthodontic Extractions: A Computational Analysis Using Ensemble Learning

2020 ◽  
Vol 7 (2) ◽  
pp. 55
Author(s):  
Yasir Suhail ◽  
Madhur Upadhyay ◽  
Aditya Chhibber ◽  
Kshitiz

Extraction of teeth is an important treatment decision in orthodontic practice. An expert system that is able to arrive at suitable treatment decisions can be valuable to clinicians for verifying treatment plans, minimizing human error, training orthodontists, and improving reliability. In this work, we train a number of machine learning models for this prediction task using data for 287 patients, evaluated independently by five different orthodontists. We demonstrate why ensemble methods are particularly suited for this task. We evaluate the performance of the machine learning models and interpret the training behavior. We show that the results for our model are close to the level of agreement between different orthodontists.
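
As a rough illustration of the ensemble idea mentioned in this abstract, the sketch below combines label predictions from several models by hard majority voting. The abstract does not say which ensemble scheme the authors used; the function name, the labels, and the toy predictions are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one list by hard majority voting.

    predictions: list of lists, one inner list of labels per model,
    all of the same length (one label per patient).
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    for labels in zip(*predictions):
        # most_common(1) returns [(label, count)] for the top label;
        # with an odd number of binary voters there are no ties.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three models for four patients.
model_a = ["extract", "keep", "keep", "extract"]
model_b = ["extract", "extract", "keep", "keep"]
model_c = ["keep", "extract", "keep", "extract"]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

Using an odd number of voters avoids ties for a two-class decision; with an even number, a tie-breaking rule would have to be chosen explicitly.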

2020 ◽  
Vol 214 ◽  
pp. 01023
Author(s):  
Linan (Frank) Zhao

Long-term unemployment has significant societal impact and is of particular concern for policymakers with regard to economic growth and public finances. This paper constructs advanced ensemble machine learning models to predict citizens’ risks of becoming long-term unemployed using data collected from European public authorities for employment services. The proposed model achieves 81.2% accuracy in identifying citizens at high risk of long-term unemployment. This paper also examines how to dissect black-box machine learning models by offering explanations at both a local and global level using SHAP, a state-of-the-art model-agnostic approach, to explain factors that contribute to long-term unemployment. Lastly, this paper addresses an under-explored question when applying machine learning in the public domain, that is, the inherent bias in model predictions. The results show that popular models such as gradient boosted trees may produce unfair predictions against senior age groups and immigrants. Overall, this paper sheds light on the growing shift by governments toward adopting machine learning models to profile and prioritize employment resources in order to reduce the detrimental effects of long-term unemployment and improve public welfare.



Author(s):  
Khajamoinuddin Syed ◽  
William Sleeman ◽  
Payal Soni ◽  
Michael Hagan ◽  
Jatinder Palta ◽  
...  

2019 ◽  
Vol 2 (1) ◽  
Author(s):  
Antonin Dauvin ◽  
Carolina Donado ◽  
Patrik Bachtiger ◽  
Ke-Chun Huang ◽  
Christopher Martin Sauer ◽  
...  

Patients admitted to the intensive care unit frequently have anemia and impaired renal function, but often lack historical blood results to contextualize the acuteness of these findings. Using data available within two hours of ICU admission, we developed machine learning models that accurately (AUC 0.86–0.89) classify an individual patient’s baseline hemoglobin and creatinine levels. Compared to assuming the baseline to be the same as the admission lab value, machine learning performed significantly better at classifying acute kidney injury regardless of initial creatinine value, and significantly better at predicting baseline hemoglobin value in patients with admission hemoglobin of <10 g/dl.


Informatics ◽  
2021 ◽  
Vol 8 (2) ◽  
pp. 27
Author(s):  
Maicon Herverton Lino Ferreira da Silva Barros ◽  
Geovanne Oliveira Alves ◽  
Lubnnia Morais Florêncio Souza ◽  
Elisson da Silva Rocha ◽  
João Fausto Lorenzato de Oliveira ◽  
...  

Tuberculosis (TB) is an airborne infectious disease caused by organisms in the Mycobacterium tuberculosis (Mtb) complex. In many low- and middle-income countries, TB remains a major cause of morbidity and mortality. Once a patient has been diagnosed with TB, it is critical that healthcare workers make the most appropriate treatment decision given the individual conditions of the patient and the likely course of the disease based on medical experience. Depending on the prognosis, delayed or inappropriate treatment can lead to unsatisfactory outcomes, including the exacerbation of clinical symptoms, poor quality of life, and increased risk of death. This work benchmarks machine learning models to aid TB prognosis using a Brazilian health database of confirmed cases and deaths related to TB in the State of Amazonas. The goal is to predict the probability of death by TB, thus aiding the prognosis of TB and the associated treatment decision-making process. In its original form, the data set comprised 36,228 records and 130 fields but suffered from missing, incomplete, or incorrect data. Following data cleaning and preprocessing, a revised data set was generated comprising 24,015 records and 38 fields, including 22,876 reported cured TB patients and 1,139 deaths by TB. To explore how the data imbalance impacts model performance, two controlled experiments were designed using (1) imbalanced and (2) balanced data sets. The best result is achieved by the Gradient Boosting (GB) model using the balanced data set to predict TB mortality, and the ensemble model composed of the Random Forest (RF), GB, and Multi-Layer Perceptron (MLP) models is the best model to predict the cure class.
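
To make the class imbalance reported in this abstract concrete, the sketch below works through the arithmetic on the stated counts and shows one simple way a balanced set could be formed. Random undersampling is an assumption made for illustration only; the abstract does not say how the authors balanced their data, and the function and variable names are hypothetical.

```python
import random

# Class counts as reported in the abstract.
CURED = 22_876
DEATHS = 1_139

total = CURED + DEATHS              # size of the revised data set
death_rate = DEATHS / total         # share of the minority class
# Accuracy of a trivial model that always predicts "cured".
baseline_accuracy = CURED / total


def undersample(majority, minority, seed=0):
    """Randomly drop majority-class records until both classes match in size.

    This is one common balancing strategy, used here as an assumption.
    """
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)) + list(minority)


# Placeholder records standing in for the real patient rows.
cured_records = [("cured", i) for i in range(CURED)]
death_records = [("death", i) for i in range(DEATHS)]

balanced = undersample(cured_records, death_records)

print(f"total records:     {total}")
print(f"death rate:        {death_rate:.2%}")
print(f"baseline accuracy: {baseline_accuracy:.2%}")
print(f"balanced set size: {len(balanced)}")
```

Because deaths make up under 5% of the records, a model that always predicts the cure class already scores about 95% accuracy, which is why accuracy alone says little about mortality prediction on the imbalanced set.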


2021 ◽  
Vol 11 (2) ◽  
pp. 110-114
Author(s):  
Aseel Qutub ◽  
Asmaa Al-Mehmadi ◽  
Munirah Al-Hssan ◽  
Ruyan Aljohani ◽  
...  

Employees are the most valuable resource of any organization. The cost of professional training, the loyalty developed over the years, and the sensitivity of some organizational positions all make it essential to identify who might leave the organization. Many reasons can lead to employee attrition. In this paper, several machine learning models are developed to automatically and accurately predict employee attrition. The IBM attrition dataset is used to train and evaluate the models, namely Decision Tree, Random Forest Regressor, Logistic Regressor, Adaboost Model, and Gradient Boosting Classifier. The ultimate goal is to accurately detect attrition so that a company can improve its retention strategies for crucial employees and boost their satisfaction.


2020 ◽  
Vol 14 (5) ◽  
pp. 1097-1109
Author(s):  
Zohreh Sheikh Khozani ◽  
Khabat Khosravi ◽  
Mohammadamin Torabi ◽  
Amir Mosavi ◽  
Bahram Rezaei ◽  
...  

2021 ◽  
Vol 9 (4) ◽  
pp. 769-788
Author(s):  
Shan Zhong ◽  
David Hitchcock

We summarized both common and novel predictive models used for stock price prediction and combined them with technical indices, fundamental characteristics, and text-based sentiment data to predict S&P stock prices. An accuracy of 66.18% in S&P 500 index directional prediction and 62.09% in individual stock directional prediction was achieved by combining different machine learning models, such as Random Forest and LSTM, into state-of-the-art ensemble models. The data we use contain weekly historical prices, finance reports, and text information from news items associated with 518 different common stocks issued by current and former S&P 500 large-cap companies, from January 1, 2000 to December 31, 2019. Our study's innovations include utilizing deep language models to categorize and infer financial news item sentiment; fusing different models containing different combinations of variables and stocks to jointly make predictions; and overcoming the insufficient data problem for machine learning models in time series by using data across different stocks.


2021 ◽  
Author(s):  
Ada Ng ◽  
Boyang Wei ◽  
Jaya Jain ◽  
Erin Ward ◽  
Darius Tandon ◽  
...  

BACKGROUND Cognitive behavioral therapy (CBT)-based interventions are effective in reducing prenatal stress, which can have severe adverse health effects on mother and newborn if unaddressed. Predicting next-day physiologic or perceived stress can help to inform and enable preemptive interventions for a likely physiologically and/or perceptibly stressful day. Machine learning models are useful tools that can be developed to predict next-day physiologic and perceived stress using data collected the previous day. Such models can improve our understanding of the specific factors that predict physiologic and perceived stress and will also allow researchers to develop systems that collect selected features for assessment for clinical trials in order to minimize the burden of data collection. OBJECTIVE To build and evaluate a machine-learned model that predicts next-day physiologic and perceived stress using sensor-based, ecological momentary assessment (EMA)-based, and intervention-based features and to explain the prediction results. METHODS We enrolled pregnant women into a prospective proof-of-concept study and collected electrocardiography, EMA, and CBT intervention data over 12 weeks. We used the data to train and evaluate six machine learning models to predict next-day physiologic and perceived stress. After selecting the best performing model, SHapley Additive exPlanations (SHAP) were used to identify feature importance and explainability of each feature. RESULTS A total of 16 pregnant women enrolled in the study. Overall, 4157.18 hours of data were collected, and participants answered 2838 EMAs. After applying feature selection, 8 and 10 features were found to positively predict next-day physiologic and perceived stress, respectively. A random forest classifier performed the best in predicting next-day physiologic (F1-score 0.84) and next-day perceived stress (F1-score 0.74) using all features. 
While any subset of sensor-based, EMA-based, and/or intervention-based features could reliably predict next-day physiologic stress, EMA-based features were necessary to predict next-day perceived stress. Analysis of explainability metrics showed that prolonged duration of physiologic stress was highly predictive of next-day physiologic stress and that physiologic stress and perceived stress were temporally divergent. CONCLUSIONS In this study we were able to build interpretable machine learning models to predict next-day physiologic and perceived stress, and we identify unique features that were highly predictive of next-day stress that can help reduce the burden of data collection.


2019 ◽  
Author(s):  
Weina Zhang ◽  
Han Liu ◽  
Vincent Michael Bernard Silenzio ◽  
Peiyuan Qiu ◽  
Wenjie Gong

BACKGROUND Postpartum depression (PPD) is a serious public health problem. Building a predictive model for PPD using data collected during pregnancy can facilitate earlier identification and intervention. OBJECTIVE The aims of this study are to compare the performance of four different machine learning models that use data collected during pregnancy to predict PPD and to explore which factors in the model are the most important for PPD prediction. METHODS Information on the pregnancy period from a cohort of 508 women, including demographics, social environmental factors, and mental health, was used as predictors in the models. The Edinburgh Postnatal Depression Scale score within 42 days after delivery was used as the outcome indicator. Using two feature selection methods (expert consultation and random forest-based filter feature selection [FFS-RF]) and two algorithms (support vector machine [SVM] and random forest [RF]), we developed four different machine learning PPD prediction models and compared their predictive performance. RESULTS There was no significant difference between the two feature selection methods in terms of model prediction performance, but 10 fewer factors were selected with FFS-RF than with the expert consultation method. The model based on SVM and FFS-RF had the best predictive performance (sensitivity=0.69, area under the curve=0.78). In the feature importance ranking output by the RF algorithm, psychological resilience, depression during the third trimester, and income level were the most important predictors. CONCLUSIONS In contrast to the expert consultation method, FFS-RF was important in dimension reduction. When the sample size is small, the SVM algorithm is suitable for predicting PPD. In the prevention of PPD, more attention should be paid to the psychological resilience of mothers.

