Estimating Design Floods at Ungauged Watersheds in South Korea Using Machine Learning Models

Jin-Young Lee; Changhyun Choi; Doosun Kang; Byung Sik Kim; Tae-Woong Kim

doi:10.3390/w12113022

Water ◽

10.3390/w12113022 ◽

2020 ◽

Vol 12 (11) ◽

pp. 3022

Author(s):

Jin-Young Lee ◽

Changhyun Choi ◽

Doosun Kang ◽

Byung Sik Kim ◽

Tae-Woong Kim

Keyword(s):

Neural Network ◽

Machine Learning ◽

Random Forest ◽

South Korea ◽

Recurrent Neural Network ◽

Flood Damage ◽

Flood Frequency Analysis ◽

Support Vector ◽

Design Floods ◽

Ungauged Watersheds

With recent increases in heavy rainfall during the summer season, South Korea suffers substantial flood damage every year. To reduce such damage and cope with flood disasters, design floods must be estimated reliably. Despite ongoing efforts to develop practical design procedures, it has been difficult to establish a standardized guideline because of the lack of hydrologic data, especially flood data. In fact, flood frequency analysis (FFA) is impractical for ungauged watersheds, and design rainfall–runoff analysis (DRRA) overestimates design floods. This study estimated appropriate design floods at ungauged watersheds by combining the DRRA and watershed characteristics using machine learning methods, including decision tree, random forest, support vector machine, deep neural network, the Elman recurrent neural network, and the Jordan recurrent neural network. The proposed models were validated using K-fold cross-validation to reduce overfitting and were evaluated based on various error measures. Even though the DRRA overestimated the design floods by 160%, on average, for our study areas, the proposed model using random forest reduced the errors and estimated design floods at 99% of the FFA, on average.
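The K-fold cross-validation mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' procedure: the abstract does not give the value of K or the number of watersheds, so `n_samples=10` and `k=5` are hypothetical values, as are the function names.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Split indices 0..n_samples-1 into k shuffled, near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        # Training set: every index that is not in the held-out fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Hypothetical sizes, chosen only for illustration.
splits = list(k_fold_splits(n_samples=10, k=5))
for train, test in splits:
    print("train:", sorted(train), "test:", sorted(test))
```

Each sample is held out exactly once, so every model is scored on data it did not see during fitting, which is how the procedure helps expose overfitting.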

Possibility of Autonomous Estimation of Shiba Goat’s Estrus and Non-Estrus Behavior by Machine Learning Methods

Animals ◽

10.3390/ani10050771 ◽

2020 ◽

Vol 10 (5) ◽

pp. 771

Author(s):

Toshiya Arakawa

Keyword(s):

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Markov Models ◽

Tracking System ◽

Video Tracking ◽

Training Data ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods

Mammalian behavior is typically monitored by observation. However, direct observation requires substantial effort and time when many animals must be observed or when observation continues for a prolonged period. In this study, machine learning methods, namely hidden Markov models (HMMs), random forests, support vector machines (SVMs), and neural networks, were applied to estimate whether a goat is in estrus based on its behavior, and the adequacy of each method was verified. Tracking data for the goats were obtained using a video tracking system and used to estimate whether goats in “estrus” or “non-estrus” were in either of two states: “approaching the male” or “standing near the male”. Overall, the percentage concordance (PC) of random forest appears to be the highest. However, its PC for goats whose data were not used in the training data sets is relatively low, which suggests that random forest tends to overfit the training data. Apart from random forest, the PC of HMMs and SVMs is high. Considering the calculation time and the HMM’s advantage as a time series model, the HMM is the better method. The PC of the neural network is low overall; however, if data from more goats were acquired, the neural network could be an adequate method for estimation.

Cooperative Recurrent Neural Network for Multiclass Support Vector Machine Learning

Advances in Neural Networks – ISNN 2009 - Lecture Notes in Computer Science ◽

10.1007/978-3-642-01510-6_32 ◽

2009 ◽

pp. 276-286 ◽

Cited By ~ 2

Author(s):

Ying Yu ◽

Youshen Xia ◽

Mohamed Kamel

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Recurrent Neural Network ◽

Support Vector ◽

Multiclass Support Vector Machine

EEG-Based Emotion Classification for Alzheimer’s Disease Patients Using Conventional Machine Learning and Recurrent Neural Network Models

Sensors ◽

10.3390/s20247212 ◽

2020 ◽

Vol 20 (24) ◽

pp. 7212

Author(s):

Jungryul Seo ◽

Teemu H. Laine ◽

Gyuhwan Oh ◽

Kyung-Ah Sohn

Keyword(s):

Neural Network ◽

Machine Learning ◽

Alzheimer’s Disease ◽

Recurrent Neural Network ◽

Classification Model ◽

Support Vector ◽

Number Of Patients ◽

Study Results ◽

Conventional Machine

As the number of patients with Alzheimer’s disease (AD) increases, the effort needed to care for these patients increases as well. At the same time, advances in information and sensor technologies have reduced caring costs, providing a potential pathway for developing healthcare services for AD patients. For instance, if a virtual reality (VR) system can provide emotion-adaptive content, the time that AD patients spend interacting with VR content is expected to be extended, allowing caregivers to focus on other tasks. As the first step towards this goal, in this study, we develop a classification model that detects AD patients’ emotions (e.g., happy, peaceful, or bored). We first collected electroencephalography (EEG) data from 30 Korean female AD patients who watched emotion-evoking videos at a medical rehabilitation center. We applied conventional machine learning algorithms, such as a multilayer perceptron (MLP) and support vector machine, along with deep learning models of recurrent neural network (RNN) architectures. The best performance was obtained from MLP, which achieved an average accuracy of 70.97%; the RNN model’s accuracy reached only 48.18%. Our study results open a new stream of research in the field of EEG-based emotion detection for patients with neurological disorders.

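Using Machine Learning and Deep Learning Techniques to Predict Hospitalizations Caused by Dengue in Municipalities of Paraíba [Utilização de técnicas de Machine Learning e de Deep Learning para a predição de casos de internações causadas por dengue em municípios da Paraíba]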

10.5753/ercemapi.2021.17914 ◽

2021 ◽

Author(s):

Ewerthon Dyego de Araújo Batista ◽

Wellington Candeia de Araújo ◽

Romeryto Vieira Lira ◽

Laryssa Izabel de Araújo Batista

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Random Forest ◽

Convolutional Neural Network ◽

Support Vector Regression ◽

Multilayer Perceptron ◽

Support Vector

Dengue is a public health problem in Brazil, and cases of the disease have risen again in Paraíba. The Paraíba epidemiological bulletin, released in August 2021, reports a 53% increase in cases relative to the previous year. Machine Learning (ML) and Deep Learning techniques are being used as tools to predict the disease and support efforts to combat it. Using Random Forest (RF), Support Vector Regression (SVR), Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), and Convolutional Neural Network (CNN) techniques, this article presents a system capable of forecasting dengue-related hospitalizations for the cities of Bayeux, Cabedelo, João Pessoa, and Santa Rita. The system produced forecasts for Bayeux with an error rate of 0.5290, while the error was 0.92742 for Cabedelo, 9.55288 for João Pessoa, and 0.74551 for Santa Rita.

Performance Comparison of Oil Spill and Ship Classification from X-Band Dual- and Single-Polarized SAR Image Using Support Vector Machine, Random Forest, and Deep Neural Network

Remote Sensing ◽

10.3390/rs13163203 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3203

Author(s):

Won-Kyung Baek ◽

Hyung-Sup Jung

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Performance Improvement ◽

Oil Spill ◽

Deep Neural Network ◽

Support Vector ◽

SAR Image ◽

X Band

It is well known that the polarization characteristics in X-band synthetic aperture radar (SAR) image analysis can provide additional information for marine target classification and detection. Normally, dual- and single-polarized SAR images are acquired by SAR satellites, so we must determine how accurate the marine mapping performance from dual-polarized (pol) images is versus that from single-pol images in a given machine learning model. The purpose of this study is to compare the performance of single- and dual-pol SAR image classification achieved by the support vector machine (SVM), random forest (RF), and deep neural network (DNN) models. The test image is a TerraSAR-X dual-pol image acquired from the 2007 Kerch Strait oil spill event. For this, 824,026 pixels and 1,648,051 pixels were extracted from the image for training and testing, respectively, and sea, ship, oil, and land objects were classified from the image using the three machine learning methods. The mean F1-scores of the SVM, RF, and DNN models resulting from the single-pol image were approximately 0.822, 0.882, and 0.889, respectively, and those from the dual-pol image were about 0.852, 0.908, and 0.898, respectively. The performance improvement achieved by dual-pol was about 3.6%, 2.9%, and 1% in SVM, RF, and DNN, respectively. The DNN model had the best performance (0.889) in the single-pol test, while the RF model was best (0.908) in the dual-pol test. The improvement from the best single-pol model to the best dual-pol model was approximately 2.1%, which is not noticeable. Considering that dual-pol images have two times lower spatial resolution than single-pol images in the azimuth direction, such a small improvement may not be valuable. Therefore, the results show that the performance improvement from X-band dual-pol images may not be remarkable when classifying the sea, ships, oil spills, and land surfaces.
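The improvement percentages quoted in this abstract can be checked with a few lines of arithmetic. The abstract does not state how the improvement was computed; the sketch below assumes a relative gain over the single-pol F1-score, which reproduces the reported figures to within rounding. The function and variable names are illustrative.

```python
def relative_gain_percent(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0

# Mean F1-scores as reported in the abstract (already rounded there).
single_pol = {"SVM": 0.822, "RF": 0.882, "DNN": 0.889}
dual_pol = {"SVM": 0.852, "RF": 0.908, "DNN": 0.898}

# Per-model gain from single-pol to dual-pol input.
gains = {
    model: relative_gain_percent(single_pol[model], dual_pol[model])
    for model in single_pol
}

# Gain of the best dual-pol model (RF) over the best single-pol model (DNN).
best_gain = relative_gain_percent(max(single_pol.values()), max(dual_pol.values()))

for model, gain in gains.items():
    print(f"{model}: {gain:.2f}%")
print(f"best vs. best: {best_gain:.2f}%")
```

Under this assumption the per-model gains come out near 3.6%, 2.9%, and 1%, and the best-versus-best gain near 2.1%, consistent with the values in the abstract.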

Machine learning model for predicting the optimal depth of tracheal tube insertion in pediatric patients: A retrospective cohort study

PLoS ONE ◽

10.1371/journal.pone.0257069 ◽

2021 ◽

Vol 16 (9) ◽

pp. e0257069

Author(s):

Jae-Geum Shim ◽

Kyoung-Ho Ryu ◽

Sung Hyun Lee ◽

Eun-Ah Cho ◽

Sungho Lee ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Artificial Neural Network ◽

Support Vector Machine ◽

Random Forest ◽

Tracheal Tube ◽

Pediatric Patients ◽

Support Vector ◽

Learning Models ◽

Machine Learning Models

Objective: To construct a prediction model for optimal tracheal tube depth in pediatric patients using machine learning. Methods: Pediatric patients aged <7 years who received post-operative ventilation after undergoing surgery between January 2015 and December 2018 were investigated in this retrospective study. The optimal location of the tracheal tube was defined as the median of the distance between the upper margin of the first thoracic (T1) vertebral body and the lower margin of the third thoracic (T3) vertebral body. We applied four machine learning models (random forest, elastic net, support vector machine, and artificial neural network) and compared their prediction accuracy to three formula-based methods, which were based on age, height, and tracheal tube internal diameter (ID). Results: For each method, the percentage with optimal tracheal tube depth predictions in the test set was calculated as follows: 79.0 (95% confidence interval [CI], 73.5 to 83.6) for random forest, 77.4 (95% CI, 71.8 to 82.2; P = 0.719) for elastic net, 77.0 (95% CI, 71.4 to 81.8; P = 0.486) for support vector machine, 76.6 (95% CI, 71.0 to 81.5; P = 1.0) for artificial neural network, 66.9 (95% CI, 60.9 to 72.5; P < 0.001) for the age-based formula, 58.5 (95% CI, 52.3 to 64.4; P < 0.001) for the tube ID-based formula, and 44.4 (95% CI, 38.3 to 50.6; P < 0.001) for the height-based formula. Conclusions: In this study, the machine learning models predicted the optimal tracheal tube tip location for pediatric patients more accurately than the formula-based methods. Machine learning models using biometric variables may help clinicians make decisions regarding optimal tracheal tube depth in pediatric patients.

Improvement of Short-Term BIPV Power Predictions Using Feature Engineering and a Recurrent Neural Network

Energies ◽

10.3390/en12173247 ◽

2019 ◽

Vol 12 (17) ◽

pp. 3247 ◽

Cited By ~ 1

Author(s):

Dongkyu Lee ◽

Jinhwa Jeong ◽

Sung Hoon Yoon ◽

Young Tae Chae

Keyword(s):

Neural Network ◽

Machine Learning ◽

Recurrent Neural Network ◽

Power Output ◽

Prediction Accuracy ◽

Support Vector ◽

Feature Engineering ◽

Short Term ◽

Interaction Detection ◽

Photovoltaic Power

The time resolution and prediction accuracy of the power generated by building-integrated photovoltaics are important for managing electricity demand and formulating a strategy to trade power with the grid. This study presents a novel approach to improve short-term hourly photovoltaic power output predictions using feature engineering and machine learning. Feature selection measured the importance score of input features by using a model-based variable importance. It verified that the normative sky index in the weather forecasted data had the least importance as a predictor for hourly prediction of photovoltaic power output. Six different machine-learning algorithms were assessed to select an appropriate model for the hourly power output prediction with onsite weather forecast data. The recurrent neural network outperformed five other models, including artificial neural networks, support vector machines, classification and regression trees, chi-square automatic interaction detection, and random forests, in terms of its ability to predict photovoltaic power output at an hourly and daily resolution for 64 tested days. Feature engineering was then used to apply dropout observation to the normative sky index from the training and prediction process, which improved the hourly prediction performance. In particular, the prediction accuracy for overcast days improved by 20% compared to the original weather dataset used without dropout observation. The results show that feature engineering effectively improves the short-term predictions of photovoltaic power output in buildings with a simple weather forecasting service.

A Comparative Study on Supervised Machine Learning Algorithms for Copper Recovery Quality Prediction in a Leaching Process

Sensors ◽

10.3390/s21062119 ◽

2021 ◽

Vol 21 (6) ◽

pp. 2119

Author(s):

Victor Flores ◽

Claudio Leiva

Keyword(s):

Neural Network ◽

Machine Learning ◽

Artificial Neural Network ◽

Support Vector Machine ◽

Random Forest ◽

Mining Industry ◽

Machine Learning Algorithms ◽

Copper Recovery ◽

Support Vector ◽

Copper Mining

The copper mining industry is increasingly using artificial intelligence methods to improve copper production processes. Recent studies reveal the use of algorithms, such as Artificial Neural Network, Support Vector Machine, and Random Forest, among others, to develop models for predicting product quality. Other studies compare the predictive models developed with these machine learning algorithms in the mining industry as a whole. However, few published copper mining studies compare the results of machine learning techniques for copper recovery prediction. This study makes a detailed comparison between three models for predicting copper recovery by leaching, using four datasets resulting from mining operations in Northern Chile. The algorithms used for developing the models were Random Forest, Support Vector Machine, and Artificial Neural Network. To validate these models, four indicators or values of merit were used: accuracy (acc), precision (p), recall (r), and Matthews correlation coefficient (mcc). This paper describes the dataset preparation and the refinement of the threshold values used for the predictive variable most influential on the class (the copper recovery). Results show both a precision over 98.50% and also the model with the best behavior between the predicted and the real values. Finally, the obtained models have the following mean values: acc = 0.943, p = 88.47, r = 0.995, and mcc = 0.232. These values are highly competitive when compared with those obtained in similar studies using other approaches in the context.
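The four indicators named in this abstract have standard definitions in terms of a binary confusion matrix, sketched below. The counts are hypothetical and chosen only for illustration; they are not taken from the study, and the function name is likewise an assumption.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and Matthews correlation coefficient
    computed from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    acc = (tp + tn) / total
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"acc": acc, "p": precision, "r": recall, "mcc": mcc}

# Hypothetical confusion-matrix counts, not data from the paper.
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells symmetrically, so it can stay low on imbalanced classes even when accuracy and recall are high.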

Study on the Estimation of Forest Volume Based on Multi-Source Data

Sensors ◽

10.3390/s21237796 ◽

2021 ◽

Vol 21 (23) ◽

pp. 7796

Author(s):

Tao Hu ◽

Yuman Sun ◽

Weiwei Jia ◽

Dandan Li ◽

Maosheng Zou ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Remote Sensing ◽

Artificial Neural Network ◽

Random Forest ◽

Hybrid Model ◽

Prediction Accuracy ◽

Volume Estimation ◽

Support Vector ◽

Estimation Models

We performed a comparative analysis of the prediction accuracy of machine learning methods and ordinary Kriging (OK) hybrid methods for forest volume models based on multi-source remote sensing data combined with ground survey data. Taking Larix olgensis, Pinus koraiensis, and Pinus sylvestris plantations in Mengjiagang forest farms as the research object, and using the Chinese Academy of Forestry LiDAR, charge-coupled device, and hyperspectral (CAF-LiTCHy) integrated system, we extracted the visible vegetation index, texture features, terrain factors, and point cloud feature variables. Random forest (RF), support vector regression (SVR), and an artificial neural network (ANN) were used to estimate forest volume. At small spatial scales, the estimation of sample plot volume is influenced by the surrounding environment as well as the neighboring observed data. Based on the residuals of these three machine learning models, OK interpolation was applied to construct new hybrid forest volume estimation models called random forest Kriging (RFK), support vector machines for regression Kriging (SVRK), and artificial neural network Kriging (ANNK). The six estimation models of forest volume were tested using the leave-one-out (Loo) cross-validation method. The prediction accuracies of all six models are good, with leave-one-out R² (R²_Loo) values above 0.6, and the prediction accuracy values of the hybrid models are all improved to different extents. Among the six models, the RFK hybrid model had the best prediction effect, with an R²_Loo reaching 0.915. Therefore, the machine learning method based on multi-source remote sensing factors is useful for forest volume estimation; in particular, the hybrid model constructed by combining machine learning and the OK method greatly improved the accuracy of forest volume estimation, which thus provides a fast and effective method for the remote sensing inversion estimation of forest volume and facilitates the management of forest resources.
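The leave-one-out R² reported in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: a one-variable least-squares line stands in for the RF, SVR, and ANN models, the Kriging step on the residuals is omitted, and the data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def leave_one_out_r2(xs, ys):
    """R^2 computed from predictions made for each sample
    by a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        # Hold out sample i and fit on the remaining samples.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        preds.append(slope * xs[i] + intercept)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Invented plot-level data: one predictor and an observed volume.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
r2_loo = leave_one_out_r2(x, y)
print(f"leave-one-out R^2 = {r2_loo:.3f}")
```

Because every prediction comes from a model that never saw the held-out plot, this score is usually lower than the R² of a fit to the full data set, which makes it a stricter basis for comparing models.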

Statistical and machine learning models for classification of human wear and delivery days in accelerometry data

10.1101/2020.12.31.424867 ◽

2021 ◽

Author(s):

Ryan Moore ◽

Kristin R. Archer ◽

Leena Choi

Keyword(s):

Neural Network ◽

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Human Activity ◽

Recurrent Neural Network ◽

Learning Models ◽

Learning Context ◽

Machine Learning Models

Purpose: Accelerometers are increasingly utilized in healthcare research to assess human activity. Accelerometry data are often collected by mailing accelerometers to participants, who wear the accelerometers to collect data on their activity. The devices are then mailed back to the laboratory for analysis. We develop models to classify days in accelerometry data as activity from actual human wear or the delivery process. These models can be used to automate the cleaning of accelerometry datasets that are adulterated with activity from delivery. Methods: For the classification of delivery days in accelerometry data, we developed statistical and machine learning models in a supervised learning context using a large human activity and delivery labeled accelerometry dataset. We extracted several features, which were included to develop random forest, logistic regression, mixed effects regression, and multilayer perceptron models, while convolutional neural network, recurrent neural network, and hybrid convolutional recurrent neural network models were developed without feature extraction. Model performances were assessed using Monte Carlo cross-validation. Results: We found that a hybrid convolutional recurrent neural network performed best in the classification task with an F1 score of 0.960, but simpler models such as logistic regression and random forest also had excellent performance with F1 scores of 0.951 and 0.957, respectively. Conclusion: The models developed in this study can be used to classify days in accelerometry data as either human or delivery activity. An analyst can weigh the larger computational cost and greater performance of the convolutional recurrent neural network against the faster but slightly less powerful random forest or logistic regression. The best performing models for classification of delivery data are publicly available on the open source R package, PhysicalActivity.
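Monte Carlo cross-validation, the assessment method named in this abstract, draws a fresh random train/test split on every repetition instead of partitioning the data once. A minimal sketch follows; the abstract does not report the number of repetitions or the split ratio, so the values used here are hypothetical, as is the function name.

```python
import random

def monte_carlo_splits(n_samples, n_repeats, test_fraction=0.2, seed=0):
    """Yield (train_indices, test_indices) from repeated random splits.

    Test sets from different repetitions may overlap, and a given sample
    may never be held out; this is the difference from K-fold splitting.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_repeats):
        # A new random permutation of all indices for each repetition.
        shuffled = rng.sample(range(n_samples), n_samples)
        yield sorted(shuffled[n_test:]), sorted(shuffled[:n_test])

# Hypothetical sizes, chosen only for illustration.
mc_splits = list(monte_carlo_splits(n_samples=20, n_repeats=3))
for train, test in mc_splits:
    print("held out:", test)
```

Averaging a score such as F1 over the repetitions gives the performance estimate, and the number of repetitions can be chosen independently of the test-set size.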