Prediction of the Temperature of Liquid Aluminum and the Dissolved Hydrogen Content in Liquid Aluminum with a Machine Learning Approach

Moon-Jo Kim; Jong Pil Yun; Ji-Ba-Reum Yang; Seung-Jun Choi; DongEung Kim

doi:10.3390/met10030330

Metals ◽

10.3390/met10030330 ◽

2020 ◽

Vol 10 (3) ◽

pp. 330 ◽

Cited By ~ 1

Author(s):

Moon-Jo Kim ◽

Jong Pil Yun ◽

Ji-Ba-Reum Yang ◽

Seung-Jun Choi ◽

DongEung Kim

Keyword(s):

Machine Learning ◽

Linear Regression ◽

Hydrogen Content ◽

Liquid Aluminum ◽

Support Vector ◽

Learning Models ◽

Data Set ◽

Window Method ◽

Dissolved Hydrogen ◽

Machine Learning Models

In aluminum casting, the temperature of liquid aluminum and the dissolved hydrogen density are crucial factors to control, both for the quality of the molten metal and for cost efficiency. However, the empirical and numerical approaches used to predict these parameters are complex and time-consuming, so an alternative method is needed for rapid prediction from a small number of experiments. In this study, machine learning models were developed to predict the temperature of liquid aluminum and the dissolved hydrogen content in liquid aluminum. The experimental data were preprocessed with the sliding time window method before being used to construct the models. Linear regression, regression tree, Gaussian process regression (GPR), support vector machine (SVM), and ensembles of regression trees were compared to find the model with the highest performance for each target property. Linear regression was selected for predicting the temperature of liquid aluminum and GPR for predicting the dissolved hydrogen content, each on the basis of its high prediction accuracy. Compared with numerical modeling, the machine learning models performed better and predicted the target properties more effectively, even with a limited data set, provided the characteristics of the data were properly considered during preprocessing.
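The sliding time window method mentioned above can be sketched as follows. This is a minimal illustration and not the authors' preprocessing code: the function name `sliding_windows`, the window width of 3, and the sample temperature values are all assumptions made for the example. Each window of consecutive readings becomes one feature row, and the reading that follows the window becomes its target.

```python
from typing import List, Sequence, Tuple


def sliding_windows(
    series: Sequence[float], width: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a time series into (features, targets) pairs.

    Each feature row holds `width` consecutive readings; the target is the
    reading immediately after that window. Hypothetical helper for
    illustration only.
    """
    if width < 1 or width >= len(series):
        raise ValueError("width must be between 1 and len(series) - 1")
    features: List[List[float]] = []
    targets: List[float] = []
    for start in range(len(series) - width):
        features.append(list(series[start:start + width]))
        targets.append(series[start + width])
    return features, targets


# Made-up melt temperatures (degrees Celsius), not data from the paper.
temps = [700.0, 702.5, 705.0, 707.0, 708.5]
X, y = sliding_windows(temps, width=3)
```

The resulting `X` and `y` could then be passed to any regression model; the choice of window width would normally be tuned against validation error.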


QUBO formulations for training machine learning models

Scientific Reports ◽

10.1038/s41598-021-89461-4 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Prasanna Date ◽

Davis Arthur ◽

Lauren Pusey-Nazzaro

Keyword(s):

Machine Learning ◽

Linear Regression ◽

Large Scale ◽

Support Vector ◽

Quantum Computers ◽

Np Hard ◽

Learning Models ◽

Moore's Law ◽

Machine Learning Models

Training machine learning models on classical computers is usually a time- and compute-intensive process. With Moore's law nearing its inevitable end and an ever-increasing demand for large-scale data analysis using machine learning, we must leverage non-conventional computing paradigms like quantum computing to train machine learning models efficiently. Adiabatic quantum computers can approximately solve NP-hard problems, such as quadratic unconstrained binary optimization (QUBO), faster than classical computers. Since many machine learning problems are also NP-hard, we believe adiabatic quantum computers might be instrumental in training machine learning models efficiently in the post-Moore's-law era. To be solved on adiabatic quantum computers, problems must be formulated as QUBO problems, which is very challenging. In this paper, we formulate the training problems of three machine learning models (linear regression, support vector machine (SVM) and balanced k-means clustering) as QUBO problems, making them amenable to training on adiabatic quantum computers. We also analyze the computational complexities of our formulations and compare them to the corresponding state-of-the-art classical approaches. We show that the time and space complexities of our formulations are better than (for SVM and balanced k-means clustering) or equivalent to (for linear regression) those of their classical counterparts.
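For readers unfamiliar with the problem class, a QUBO asks for a binary vector z that minimizes the quadratic form z^T Q z. The sketch below only illustrates that generic definition with an exhaustive classical search over a made-up 2x2 matrix; it is not the paper's formulation of linear regression, SVM, or k-means, and the names `qubo_energy` and `solve_qubo_brute_force` are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def qubo_energy(Q: Sequence[Sequence[float]], z: Sequence[int]) -> float:
    """Evaluate z^T Q z for a binary vector z."""
    n = len(z)
    return sum(Q[i][j] * z[i] * z[j] for i in range(n) for j in range(n))


def solve_qubo_brute_force(Q: Sequence[Sequence[float]]) -> Tuple[List[int], float]:
    """Try all 2^n binary vectors; only feasible for very small n."""
    n = len(Q)
    best = min(product((0, 1), repeat=n), key=lambda z: qubo_energy(Q, z))
    return list(best), qubo_energy(Q, best)


# Toy matrix (an assumption for illustration): the diagonal rewards setting a
# bit, the off-diagonal term penalizes setting both bits at once.
Q = [[-1, 2],
     [0, -1]]
best_z, best_e = solve_qubo_brute_force(Q)
```

Because z_i is binary, z_i^2 = z_i, so the diagonal of Q plays the role of linear coefficients. The exhaustive search scales as 2^n, which is why approximate solvers are of interest for larger instances.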


Learning to Identify At-Risk Students in Distance Education Using Interaction Counts

Revista de Informática Teórica e Aplicada ◽

10.22456/2175-2745.62211 ◽

2016 ◽

Vol 23 (2) ◽

pp. 124 ◽

Cited By ~ 2

Author(s):

Douglas Detoni ◽

Cristian Cechinel ◽

Ricardo Araujo Matsumura ◽

Daniela Francisco Brauner

Keyword(s):

Machine Learning ◽

At Risk ◽

At Risk Students ◽

Drop Out ◽

Support Vector ◽

Learning Models ◽

Data Set ◽

Student Dropout ◽

Vector Machines ◽

Machine Learning Models

Student dropout is one of the main problems faced by distance learning courses. One of the major challenges for researchers is to develop methods to predict the behavior of students so that teachers and tutors are able to identify at-risk students as early as possible and provide assistance before they drop out or fail their courses. Machine learning models have been used to predict or classify students in these settings. However, while these models have shown promising results in several settings, they usually attain these results using attributes that are not immediately transferable to other courses or platforms. In this paper, we provide a methodology to classify students using only interaction counts from each student. We evaluate this methodology on a data set from two majors based on the Moodle platform. We run experiments consisting of training and evaluating three machine learning models (Support Vector Machines, Naive Bayes and AdaBoost decision trees) under different scenarios. We provide evidence that patterns in interaction counts can provide useful information for classifying at-risk students. This classification allows the activities presented to at-risk students to be customized (automatically or through tutors) in an attempt to prevent student dropout.


Machine learning predictive models of LDL-C in the population of eastern India and its comparison with directly measured and calculated LDL-C

Annals of Clinical Biochemistry International Journal of Laboratory Medicine ◽

10.1177/00045632211046805 ◽

2021 ◽

pp. 000456322110468

Author(s):

Anudeep P P ◽

Suchitra Kumari ◽

Aishvarya S Rajasimman ◽

Saurav Nayak ◽

Pooja Priyadarsini

Keyword(s):

Machine Learning ◽

Linear Regression ◽

Random Forests ◽

Predictive Performance ◽

Support Vector ◽

Learning Models ◽

Complex Interactions ◽

Clinical Biochemistry Laboratory ◽

Study Laboratory ◽

Machine Learning Models

Background: LDL-C is a strong risk factor for cardiovascular disorders. The formulas used to calculate LDL-C have shown varying performance in different populations. Machine learning models can capture complex interactions between variables and can be used to predict outcomes more accurately. The current study evaluated the performance of three machine learning models (random forests, XGBoost, and support vector regression (SVR)) in predicting LDL-C from total cholesterol, triglyceride, and HDL-C, in comparison with a linear regression model and some existing formulas for LDL-C calculation, in an eastern Indian population. Methods: A total of 13,391 samples from lipid profiles performed in the clinical biochemistry laboratory of AIIMS Bhubaneswar during 2019–2021 were included in the study. Laboratory results were collected from the laboratory database. 70% of the data were assigned to the training set and used to develop the three machine learning models and the linear regression formula. These models were validated on the remaining 30% of the data (test set). Model performance was evaluated against the six best existing LDL-C calculation formulas. Results: LDL-C predicted by the XGBoost and random forests models showed a strong correlation with directly estimated LDL-C (r = 0.98). These two machine learning models outperformed the six existing and commonly used LDL-C calculation formulas, such as Friedewald, in the study population. When compared across different triglyceride strata, these two models also outperformed the other methods. Conclusion: Machine learning models such as XGBoost and random forests can predict LDL-C more accurately than conventional linear regression LDL-C formulas.
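As a point of reference for the baseline the abstract names, the widely cited Friedewald formula estimates LDL-C in mg/dL as total cholesterol minus HDL-C minus triglycerides divided by 5. The sketch below assumes values in mg/dL and the commonly quoted validity limit of triglycerides below 400 mg/dL; the function name and sample values are hypothetical and do not come from the study.

```python
def friedewald_ldl(total_chol: float, hdl: float, triglycerides: float) -> float:
    """Estimate LDL-C (mg/dL) with the Friedewald formula.

    LDL-C = TC - HDL-C - TG / 5, all in mg/dL. The formula is generally
    considered unreliable at triglyceride levels of 400 mg/dL or above,
    so this sketch refuses such inputs.
    """
    if triglycerides >= 400:
        raise ValueError("Friedewald formula is not valid for TG >= 400 mg/dL")
    return total_chol - hdl - triglycerides / 5.0


# Illustrative lipid profile in mg/dL (made-up values).
ldl = friedewald_ldl(total_chol=200.0, hdl=50.0, triglycerides=150.0)
```

A fixed formula like this uses one coefficient for every patient, which is the kind of rigidity the machine learning models in the study are compared against.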


Classification Models for Bank Marketing Campaign: Towards Smart Bank Marketing

American Journal of Business and Operations Research ◽

10.54216/ajbor.050102 ◽

2021 ◽

pp. 21-30

Author(s):

Ahmad Freij ◽

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Linear Regression ◽

Support Vector ◽

Classification Models ◽

Learning Models ◽

Marketing Campaign ◽

Bank Marketing ◽

Machine Learning Models

In this paper, we propose two models for marketing classification, Support Vector Machine (SVM) and linear regression, which are among the most popular and useful classification models. We show how these two models are applied in a case study of a bank marketing campaign, using a data set from that campaign. The RapidMiner software was used to apply the machine learning classification models.


Prediction of cancer incidence rates for the European continent using machine learning models

Health Informatics Journal ◽

10.1177/1460458220983878 ◽

2021 ◽

Vol 27 (1) ◽

pp. 146045822098387

Author(s):

Boran Sekeroglu ◽

Kubra Tuncal

Keyword(s):

Neural Network ◽

Colorectal Cancer ◽

Machine Learning ◽

Linear Regression ◽

Support Vector Regression ◽

Incidence Rates ◽

Support Vector ◽

Learning Models ◽

European Continent ◽

Machine Learning Models

Cancer is one of the most important and common public health problems worldwide and occurs in many different types. Treatments and precautions are aimed at minimizing the deaths caused by cancer; however, incidence rates continue to rise. Thus, it is important to analyze and estimate incidence rates to support the determination of more effective precautions. In this research, the 2018 Cancer Datasheet of the World Health Organization (WHO) is used, and all countries on the European continent are considered, to analyze and predict incidence rates until 2020 for lung cancer, breast cancer, colorectal cancer, prostate cancer and all types of cancer, which have the highest incidence and mortality rates. For each cancer type, six machine learning models (Linear Regression, Support Vector Regression, Decision Tree, Long Short-Term Memory neural network, Backpropagation neural network, and Radial Basis Function neural network) are trained separately for each gender. Linear regression and support vector regression outperformed the other models, with [Formula: see text] scores of 0.99 and 0.98, respectively, in initial experiments, and were then used to predict the incidence rates of the considered cancer types. The ML models estimated that the largest rise in incidence rates, 6%, would be in colorectal cancer among females.


A comparison of machine learning models for predicting rehospitalisation and death after a first hospitalisation with heart failure

European Heart Journal ◽

10.1093/ehjci/ehaa946.0984 ◽

2020 ◽

Vol 41 (Supplement_2) ◽

Author(s):

Y Jones ◽

N Hillen ◽

J Friday ◽

P Pellicori ◽

S Kean ◽

...

Keyword(s):

Machine Learning ◽

Heart Failure ◽

Haemoglobin Concentration ◽

Hospital Length Of Stay ◽

Support Vector ◽

Health Board ◽

Funding Source ◽

Learning Models ◽

Data Set ◽

Machine Learning Models

Background: Many machine learning models exist, including Multilayer Perceptron (MLP), Random Forest algorithm (RF), Support Vector Machine (SVM), and Gradient Boosted Machine (GBM), but their value for predicting outcome in patients with heart failure has not been compared. Aim: To predict rehospitalisation (all-cause) and death (all-cause) at 1-, 3- and 12 months after discharge from a first hospitalisation for heart failure using four machine learning models. Methods: The National Health Service Greater Glasgow and Clyde Health Board serves a population of ∼1.1 million. We obtained de-identified administrative data, including investigations, diagnosis and prescriptions, linked to hospital admissions and deaths for anyone with a diagnosis of vascular disease or heart failure or prescribed loop diuretics, statins or neuro-endocrine antagonists at any time between 1st January 2010 and 1st June 2018. Patients who were under 18 or had no prior hospitalisation for heart failure were excluded. Four ML algorithms using 46 variables were applied. Results: Of 360,000 people who met the above criteria between 2010–2018, 6,372 had a hospitalisation for heart failure prior to 1st January 2010 and 8,304 had a first hospitalisation for heart failure thereafter. Between 2010 and 2018 there were 3,086 re-hospitalisations over 24 hours and 3,706 patients died, with 5,070 patients experiencing the composite outcome. GBM and RF consistently outperformed MLP and SVM when comparing AUC, sensitivity and specificity combined, with GBM performing best in all scenarios. Since GBM and RF are both tree-based models, and with SVM and MLP regularly reporting very poor sensitivity or specificity despite a similar AUC to the others, this suggests that SVM and MLP may be suffering from overfitting and might perform better in larger data-sets. Both GBM and RF work by ordering variables, so the final model can be used to determine the most important prediction variables. Age, number of times a blood sample was taken out of hospital, length of stay, social deprivation index and haemoglobin concentration consistently ranked amongst the most important variables. Models predicted all 1-month events better than later events. Conclusions: Some, but not all, ML models applied to this data-set predicted rehospitalisation and death with great accuracy for up to 3 months after a first hospitalisation for heart failure. The models identified several important prognostic variables that are currently seldom collected in clinical research registries but perhaps should be. Funding Acknowledgement: Type of funding source: Public grant(s) – National budget only. Main funding source(s): Medical Research Council


Predicting S&P 500 Market Price by Deep Neural Network and Enemble Model

E3S Web of Conferences ◽

10.1051/e3sconf/202021402040 ◽

2020 ◽

Vol 214 ◽

pp. 02040

Author(s):

Feiyu Wang

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Linear Regression ◽

Deep Neural Network ◽

Market Price ◽

Support Vector ◽

Learning Models ◽

Conventional Machine ◽

Machine Learning Models

Predicting the movement of the stock market has attracted scientists for decades. In this article, we use three different models to tackle that problem. In particular, we propose a Deep Neural Network (DNN) to predict the intraday direction of the SP500 index and compare the DNN with two conventional machine learning models, namely linear regression and support vector machine. We demonstrate that the DNN predicts the SP500 index with the highest accuracy of the three models.


Short-Term Electricity Generation Forecasting Using Machine Learning Algorithms: A Case Study of the Benin Electricity Community (C.E.B)

TH Wildau Engineering and Natural Sciences Proceedings ◽

10.52825/thwildauensp.v1i.25 ◽

2021 ◽

Vol 1 ◽

Author(s):

Agbassou Guenoupkati ◽

Adekunlé Akim Salami ◽

Mawugno Koffi Kodjo ◽

Kossi Napo

Keyword(s):

Machine Learning ◽

Time Series ◽

Linear Regression ◽

Performance Metrics ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Models ◽

Short Term ◽

Machine Learning Models

Time series forecasting in the energy sector is important to power utilities for decision making to ensure the sustainability and quality of electricity supply and the stability of the power grid. Unfortunately, exogenous factors such as weather conditions and electricity prices complicate the task and make linear regression models increasingly unsuitable. A robust predictor would be an invaluable asset for electricity companies. To overcome this difficulty, Machine Learning algorithms, which have performed well over the last decades in predicting time series at several levels, offer an alternative to these prediction methods. This work proposes the deployment of three univariate Machine Learning models (Support Vector Regression, Multi-Layer Perceptron, and the Long Short-Term Memory Recurrent Neural Network) to predict the electricity production of the Benin Electricity Community. Performance metrics were used to validate these methods against the Autoregressive Integrated Moving Average and Multiple Regression models. Overall, the results show that the Machine Learning models outperform the linear regression methods. Consequently, Machine Learning methods offer a promising approach to short-term electric power generation forecasting for the Benin Electricity Community sources.


Important citations identification by exploiting generative model into discriminative model

Journal of Information Science ◽

10.1177/0165551521991034 ◽

2021 ◽

pp. 016555152199103

Author(s):

Xin An ◽

Xin Sun ◽

Shuo Xu ◽

Liyuan Hao ◽

Jinghong Li

Keyword(s):

Machine Learning ◽

Topic Model ◽

Kernel Functions ◽

Support Vector ◽

Svm Classifier ◽

Learning Models ◽

Data Set ◽

Discriminative Models ◽

Influence Model ◽

Machine Learning Models

Although citations between scientific documents are deemed a vehicle for the dissemination, inheritance and development of scientific knowledge, not all citations are equally important. A plethora of taxonomies and machine-learning models have been implemented to tackle the task of citation function and importance classification from a qualitative perspective. Inspired by the success of kernel functions from resulting general models to promote the performance of the support vector machine (SVM) model, this work exploits the potential of combining generative and discriminative models for the task of citation importance classification. In more detail, generative features are derived from a topic model, the citation influence model (CIM), and then fed, together with 13 other traditional features, to two traditional discriminative machine-learning models, SVM and RF (random forest), and a deep learning model, a convolutional neural network (CNN), to identify important citations. Extensive experiments are performed on two data sets with different characteristics. All three models perform better on the data set drawn from a single discipline. It is quite possible that the patterns of important citations vary by field, which prevents machine-learning models from effectively learning the discriminative patterns from publications across multiple domains. The RF classifier outperforms the SVM classifier, which accords with many prior studies. However, the CNN model does not achieve the desired performance because of the small size of the data set. Furthermore, our CIM-based features further improve the performance of identifying important citations.


Monitoring the Foliar Nutrients Status of Mango Using Spectroscopy-Based Spectral Indices and PLSR-Combined Machine Learning Models

Remote Sensing ◽

10.3390/rs13040641 ◽

2021 ◽

Vol 13 (4) ◽

pp. 641

Author(s):

Gopal Ramdas Mahajan ◽

Bappa Das ◽

Dayesh Murgaokar ◽

Ittai Herrmann ◽

Katja Berger ◽

...

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Partial Least Square ◽

Least Square ◽

Partial Least Square Regression ◽

Support Vector ◽

Spectral Indices ◽

Learning Models ◽

Leaf Nutrients ◽

Machine Learning Models

Conventional methods of plant nutrient estimation for nutrient management need a huge number of leaf or tissue samples and extensive chemical analysis, which is time-consuming and expensive. Remote sensing is a viable tool to estimate the plant's nutritional status to determine the appropriate amounts of fertilizer inputs. The aim of the study was to use remote sensing to characterize the foliar nutrient status of mango through the development of spectral indices, multivariate analysis, chemometrics, and machine learning modeling of the spectral data. A spectral database of the leaf samples within the 350–1050 nm wavelength range and the leaf nutrients were analyzed to develop spectral indices and multivariate models. The normalized difference and ratio spectral indices and the multivariate models (partial least square regression (PLSR), principal component regression, and support vector regression (SVR)) were ineffective in predicting any of the leaf nutrients. An approach using PLSR-combined machine learning models was found to be the best for predicting most of the nutrients. Based on the independent validation performance and summed ranks, the best performing models were cubist (R² ≥ 0.91, ratio of performance to deviation (RPD) ≥ 3.3, and ratio of performance to interquartile distance (RPIQ) ≥ 3.71) for nitrogen, phosphorus, potassium, and zinc; SVR (R² ≥ 0.88, RPD ≥ 2.73, RPIQ ≥ 3.31) for calcium, iron, copper, and boron; and elastic net (R² ≥ 0.95, RPD ≥ 4.47, RPIQ ≥ 6.11) for magnesium and sulfur. The results of the study revealed the potential of using hyperspectral remote sensing data for non-destructive estimation of mango leaf macro- and micro-nutrients. The developed approach is suggested for use within operational retrieval workflows for precision management of mango orchard nutrients.
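The RPD and RPIQ figures quoted above are ratios of the spread of the observed values to the prediction error. A minimal sketch follows, assuming the common chemometrics definitions RPD = SD / RMSE and RPIQ = IQR / RMSE; the study's exact conventions (for example, sample versus population standard deviation, or the quartile method) are not stated in the abstract, and the function names and data are made up for illustration.

```python
import math
import statistics
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Ratio of performance to deviation: sample SD of observations / RMSE."""
    return statistics.stdev(observed) / rmse(observed, predicted)


def rpiq(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Ratio of performance to interquartile distance: IQR / RMSE."""
    q1, _, q3 = statistics.quantiles(observed, n=4)
    return (q3 - q1) / rmse(observed, predicted)


# Made-up nutrient measurements and model predictions.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.1, 1.9, 3.2, 3.8, 5.0]
rpd_value = rpd(obs, pred)
rpiq_value = rpiq(obs, pred)
```

Both ratios grow as the prediction error shrinks relative to the natural variability of the data, so higher values indicate a more useful model. Note that both functions divide by the RMSE and would fail on a perfect prediction.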
