Machine Learning Modelling of the Relationship between Weather and Paddy Yield in Sri Lanka

Journal of Mathematics ◽

10.1155/2021/9941899 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Piyal Ekanayake ◽

Windhya Rankothge ◽

Rukmal Weliwatta ◽

Jeevani W. Jayasinghe

Keyword(s):

Machine Learning ◽

Sri Lanka ◽

Relative Humidity ◽

Mean Squared Error ◽

Absolute Error ◽

Maximum Temperature ◽

Percentage Error ◽

Pairwise Correlation ◽

Geographical Regions ◽

Paddy Yield

This paper presents the development of crop-weather models for the paddy yield in Sri Lanka based on nine weather indices, namely, rainfall, relative humidity (minimum and maximum), temperature (minimum and maximum), wind speed (morning and evening), evaporation, and sunshine hours. The statistics of seven geographical regions, which contribute to about two-thirds of the country’s total paddy production, were used for this study. The significance of the weather indices on the paddy yield was explored by employing Random Forest (RF) and the variable importance of each of them was determined. Pearson’s correlation and Spearman’s correlation were used to identify the behavior of correlation in a positive or negative direction. Further, the pairwise correlation among the weather indices was examined. The results indicate that the minimum relative humidity and the maximum temperature during the paddy cultivation period are the most influential weather indices. Moreover, RF was used to develop a paddy yield prediction model and four more techniques, namely, Power Regression (PR), Multiple Linear Regression (MLR) with stepwise selection, forward (step-up) selection, and backward (step-down) elimination, were used to benchmark the performance of the machine learning technique. Their performances were compared in terms of the Root Mean Squared Error (RMSE), Correlation Coefficient (R), Mean Absolute Error (MAE), and the Mean Absolute Percentage Error (MAPE). As per the results, RF is a reliable and accurate model for the prediction of paddy yield in Sri Lanka, demonstrating a very high R of 0.99 and the least MAPE of 1.4%.
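The four comparison measures named in this abstract (RMSE, R, MAE and MAPE) have standard textbook definitions. The following is a minimal sketch of those definitions only; the function names and the sample values are hypothetical and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared residual."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (undefined if any actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(actual, predicted):
    """Pearson correlation coefficient R between observed and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    spread_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return cov / (spread_a * spread_p)


# Hypothetical observed and predicted values, for illustration only.
observed = [4.0, 5.0, 6.0, 5.0]
forecast = [4.2, 4.9, 5.8, 5.1]
print(rmse(observed, forecast), mae(observed, forecast),
      mape(observed, forecast), pearson_r(observed, forecast))
```

RMSE and MAE are in the units of the target variable, while MAPE and R are unitless, which is why abstracts of this kind usually report them together.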


Machine learning and Grad-Cam based vascular aging assessment using photoplethysmogram (Preprint)

10.2196/preprints.31709 ◽

2021 ◽

Author(s):

Hangsik Shin

Keyword(s):

Machine Learning ◽

Correlation Coefficient ◽

Age Estimation ◽

Mean Squared Error ◽

Mean Absolute Error ◽

Absolute Error ◽

Coefficient Of Determination ◽

Vascular Aging ◽

Squared Error ◽

Vascular Age

BACKGROUND Arterial stiffness due to vascular aging is a major indicator for evaluating cardiovascular risk. OBJECTIVE In this study, we propose a method of estimating age by applying machine learning to the photoplethysmogram for non-invasive vascular age assessment. METHODS A machine learning-based age estimation model consisting of three convolutional layers and two fully connected layers was developed using photoplethysmograms segmented by pulse from a total of 752 adults aged 19–87 years. The performance of the developed model was quantitatively evaluated using mean absolute error, root mean squared error, Pearson’s correlation coefficient, and coefficient of determination. Grad-Cam was used to explain the contribution of photoplethysmogram waveform characteristics to vascular age estimation. RESULTS Ten-fold cross-validation yielded a mean absolute error of 8.03, a root mean squared error of 9.96, a correlation coefficient of 0.62, and a coefficient of determination of 0.38. Grad-Cam, used to determine the weight that the input signal contributes to the result, confirmed that the contribution of the photoplethysmogram segment to the age estimation was high around the systolic peak. CONCLUSIONS The machine learning-based vascular aging analysis method using the photoplethysmogram (PPG) waveform showed comparable or superior performance compared to previous studies, without complex feature detection, in evaluating vascular aging. CLINICALTRIAL 2015-0104


Prediction of tensile strength of polymer carbon nanotube composites using practical machine learning method

Journal of Composite Materials ◽

10.1177/0021998320953540 ◽

2020 ◽

pp. 002199832095354 ◽

Cited By ~ 5

Author(s):

Tien-Thinh Le

Keyword(s):

Machine Learning ◽

Mechanical Properties ◽

Tensile Strength ◽

Carbon Nanotube ◽

Polymer Matrix ◽

Mean Squared Error ◽

Gaussian Process Regression ◽

Weight Fraction ◽

Percentage Error ◽

Input Variables

This paper is devoted to the development of a practical Machine Learning (ML)-based model for predicting the tensile strength of polymer carbon nanotube (CNT) composites. To this end, a database composed of 11 input variables was compiled from the available literature. The input variables for predicting the tensile strength of nanocomposites were selected to cover the following main aspects: (i) type of polymer matrix, (ii) mechanical properties of polymer matrix, (iii) physical characteristics of CNTs, (iv) mechanical properties of CNTs and (v) incorporation parameters such as CNT weight fraction, CNT surface modification method and processing method. As the prediction problem is high-dimensional (with 11 dimensions), the Gaussian Process Regression (GPR) model was selected and optimized by means of a parametric study. The correlation coefficient (R), Willmott’s index of agreement (IA), slope of regression, Mean Absolute Percentage Error (MAPE), Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) were employed as error measurement criteria when training the GPR model. The GPR model exhibited good performance for both training and testing parts (RMSE = 5.982 and 5.327 MPa, MAE = 3.447 and 3.539 MPa, respectively). In addition, uncertainty analysis was applied to estimate the prediction confidence intervals. Finally, the prediction capability of the GPR model with different ranges of values of input variables was investigated and discussed. For practical application, a Graphical User Interface (GUI) was developed in Matlab for predicting the tensile strength of nanocomposites.


Machine Learning-Based Gully Erosion Susceptibility Mapping: A Case Study of Eastern India

Sensors ◽

10.3390/s20051313 ◽

2020 ◽

Vol 20 (5) ◽

pp. 1313 ◽

Cited By ~ 15

Author(s):

Sunil Saha ◽

Jagabandhu Roy ◽

Alireza Arabameri ◽

Thomas Blaschke ◽

Dieu Tien Bui

Keyword(s):

Machine Learning ◽

Mean Squared Error ◽

Absolute Error ◽

Gully Erosion ◽

Machine Learning Techniques ◽

Weight Of Evidence ◽

Validation Dataset ◽

Boosted Regression Tree ◽

Area Index ◽

Statistical Measures

Gully erosion is a form of natural disaster and one of the land loss mechanisms causing severe problems worldwide. This study aims to delineate the areas with the most severe gully erosion susceptibility (GES) using the machine learning techniques Random Forest (RF), Gradient Boosted Regression Tree (GBRT), Naïve Bayes Tree (NBT), and Tree Ensemble (TE). The gully inventory map (GIM) consists of 120 gullies. Of the 120 gullies, 84 gullies (70%) were used for training and 36 gullies (30%) were used to validate the models. Fourteen gully conditioning factors (GCFs) were used for GES modeling, and the relationships between the GCFs and gully erosion were assessed using the weight-of-evidence (WofE) model. The GES maps were prepared using RF, GBRT, NBT, and TE and were validated using the area under the receiver operating characteristic (AUROC) curve, the seed cell area index (SCAI) and five statistical measures: precision (PPV), false discovery rate (FDR), accuracy, mean absolute error (MAE), and root mean squared error (RMSE). Nearly 7% of the basin has high to very high susceptibility for gully erosion. Validation results proved the excellent ability of these models to predict the GES. Of the analyzed models, the RF (AUROC = 0.96, PPV = 1.00, FDR = 0.00, accuracy = 0.87, MAE = 0.11, RMSE = 0.19 for validation dataset) is accurate enough for modeling and better suited for GES modeling than the other models. Therefore, the RF model can be used to model the GES areas not only in this river basin but also in other areas with the same geo-environmental conditions.
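Three of the validation measures listed in this abstract (PPV, FDR and accuracy) follow directly from confusion-matrix counts. The sketch below uses the standard definitions; the function name and the counts are hypothetical and do not reproduce the study's validation data.

```python
def classification_measures(tp, fp, tn, fn):
    """Return precision (PPV), false discovery rate (FDR) and accuracy.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives and false negatives. Assumes tp + fp > 0.
    """
    predicted_positive = tp + fp
    ppv = tp / predicted_positive          # share of predicted positives that are correct
    fdr = fp / predicted_positive          # share of predicted positives that are wrong
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"ppv": ppv, "fdr": fdr, "accuracy": accuracy}


# Hypothetical counts, for illustration only.
print(classification_measures(tp=8, fp=2, tn=7, fn=3))
```

Because FDR = 1 - PPV by definition, the reported pair PPV = 1.00 and FDR = 0.00 is internally consistent, while accuracy can still fall below 1 when there are false negatives.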


International arrivals forecasting for Australian airports and the impact of tourism marketing expenditure

Tourism Economics ◽

10.5367/te.2015.0507 ◽

2016 ◽

Vol 23 (2) ◽

pp. 403-428 ◽

Cited By ~ 9

Author(s):

Wai Hong Kan Tsui ◽

Faruk Balli

Keyword(s):

Mean Squared Error ◽

Absolute Error ◽

Percentage Error ◽

Endogenous Factors ◽

Tourism Marketing ◽

Passenger Demand ◽

Volatility Models ◽

International Tourist ◽

The Impact ◽

The Empirical Analysis

An airport’s international passenger arrivals are susceptible to exogenous and endogenous factors (such as economic conditions, flight services, fluctuations and shocks). Accurate and reliable airport passenger demand forecasts are imperative for policymaking and planning by airport and airline management as well as by tourism authorities and operators. This article employs the Box–Jenkins SARIMA, SARIMAX and SARIMAX/EGARCH volatility models to forecast international passenger arrivals for the eight key Australian airports (Adelaide, Brisbane, Cairns, Darwin, Gold Coast, Melbourne, Perth and Sydney). Monthly international tourist arrivals between January 2006 and September 2012 are used for the empirical analysis. All the forecasting models are highly accurate, with low values of mean absolute percentage error, mean absolute error and root mean squared error. The findings suggest that the international passenger arrivals of Australian airports are affected by positive and negative shocks, and that tourism marketing expenditure is also a significant factor influencing the majority of Australian airports’ international passenger arrivals.


Influence of Data Splitting on Performance of Machine Learning Models in Prediction of Shear Strength of Soil

Mathematical Problems in Engineering ◽

10.1155/2021/4832864 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15 ◽

Cited By ~ 1

Author(s):

Quang Hung Nguyen ◽

Hai-Bang Ly ◽

Lanh Si Ho ◽

Nadhir Al-Ansari ◽

Hiep Van Le ◽

...

Keyword(s):

Machine Learning ◽

Monte Carlo ◽

Shear Strength ◽

Mean Squared Error ◽

Absolute Error ◽

Engineering Properties ◽

Stable Model ◽

Predictive Capability ◽

Effective Manner ◽

Soil Shear Strength

The main objective of this study is to evaluate and compare the performance of different machine learning (ML) algorithms, namely, Artificial Neural Network (ANN), Extreme Learning Machine (ELM), and Boosting Trees (Boosted) algorithms, considering the influence of various training to testing ratios in predicting the soil shear strength, one of the most critical geotechnical engineering properties in civil engineering design and construction. For this aim, a database of 538 soil samples collected from the Long Phu 1 power plant project, Vietnam, was utilized to generate the datasets for the modeling process. Different ratios (i.e., 10/90, 20/80, 30/70, 40/60, 50/50, 60/40, 70/30, 80/20, and 90/10) were used to divide the datasets into the training and testing datasets for the performance assessment of models. Popular statistical indicators, such as Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Correlation Coefficient (R), were employed to evaluate the predictive capability of the models under different training and testing ratios. In addition, Monte Carlo simulation was carried out to evaluate the performance of the proposed models, taking into account the random sampling effect. The results showed that although all three ML models performed well, the ANN was the most accurate and statistically stable model after 1000 Monte Carlo simulations (Mean R = 0.9348) compared with other models such as Boosted (Mean R = 0.9192) and ELM (Mean R = 0.8703). Investigation of the performance of the models showed that the predictive capability of the ML models was greatly affected by the training/testing ratios, with the 70/30 ratio giving the best model performance. In short, the results presented herein offer an effective way of selecting the appropriate dataset ratios and the best ML model to predict the soil shear strength accurately, which would be helpful in the design and engineering phases of construction projects.
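The evaluation procedure described in this abstract (repeated random splits at several training/testing ratios, with the mean R taken over the runs) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: a one-variable least-squares line stands in for the ANN, ELM and Boosted models, the data are synthetic, and all function names are hypothetical.

```python
import math
import random


def split(samples, train_fraction, rng):
    """Shuffle a copy of the samples and cut it into training and testing parts."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_line(pairs):
    """Ordinary least-squares fit y = slope * x + intercept (stand-in model)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    norm_a = math.sqrt(sum((u - mean_a) ** 2 for u in a))
    norm_b = math.sqrt(sum((v - mean_b) ** 2 for v in b))
    return cov / (norm_a * norm_b)


def mean_test_r(samples, train_fraction, runs, seed=0):
    """Average the test-set R over `runs` random splits (Monte Carlo estimate)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        train, test = split(samples, train_fraction, rng)
        slope, intercept = fit_line(train)
        predicted = [slope * x + intercept for x, _ in test]
        observed = [y for _, y in test]
        total += pearson_r(observed, predicted)
    return total / runs


# Synthetic data (assumption): y = 2x plus Gaussian noise, 100 samples.
data_rng = random.Random(42)
data = [(x, 2.0 * x + data_rng.gauss(0.0, 1.0))
        for x in (data_rng.uniform(0.0, 10.0) for _ in range(100))]

for percent in (10, 30, 50, 70, 90):
    print(f"{percent}/{100 - percent}:", round(mean_test_r(data, percent / 100, runs=200), 4))
```

Averaging over many random splits separates the effect of the ratio itself from the luck of a single split, which is the role the abstract assigns to the Monte Carlo simulation.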


Comparison of Machine Learning Algorithms for Discharge Prediction of Multipurpose Dam

Water ◽

10.3390/w13233369 ◽

2021 ◽

Vol 13 (23) ◽

pp. 3369

Author(s):

Jiyeong Hong ◽

Seoro Lee ◽

Gwanjae Lee ◽

Dongseok Yang ◽

Joo Hyun Bae ◽

...

Keyword(s):

Machine Learning ◽

Mean Squared Error ◽

Learning Algorithms ◽

Absolute Error ◽

Machine Learning Algorithms ◽

Physical Models ◽

Gradient Boosting ◽

Activity Schedules ◽

Discharge Data ◽

Dam Inflow

For effective water management in the downstream area of a dam, it is necessary to estimate the amount of discharge from the dam to quantify the flow downstream of the dam. In this study, a machine learning model was constructed to predict the amount of discharge from Soyang River Dam using precipitation and dam inflow/discharge data from 1980 to 2020. Decision tree, multilayer perceptron, random forest, gradient boosting, RNN-LSTM, and CNN-LSTM were used as algorithms. The RNN-LSTM model achieved a Nash–Sutcliffe efficiency (NSE) of 0.796, root-mean-squared error (RMSE) of 48.996 m³/s, mean absolute error (MAE) of 10.024 m³/s, R of 0.898, and R² of 0.807, showing the best results in dam discharge prediction. The prediction of dam discharge using machine learning algorithms showed that it is possible to predict the amount of discharge, addressing limitations of physical models, such as the difficulty in applying human activity schedules and the need for various input data.
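The Nash–Sutcliffe efficiency reported in this abstract compares the model's squared errors with those of a baseline that always predicts the observed mean. A minimal sketch of the standard definition follows; the function name and the discharge values are hypothetical.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect match; 0.0 means the model is no better than
    predicting the observed mean; negative values are worse than that.
    """
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Hypothetical discharge series, for illustration only.
observed_flow = [10.0, 12.0, 40.0, 25.0, 11.0]
simulated_flow = [11.0, 13.0, 35.0, 27.0, 10.0]
print(nash_sutcliffe_efficiency(observed_flow, simulated_flow))
```

Unlike R, the NSE penalizes systematic bias as well as scatter, so the two can differ for the same model.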


ACCRUED FORECASTING ON TOURIST’S ARRIVAL IN BANGLADESH FOR SUSTAINABLE DEVELOPMENT

GeoJournal of Tourism and Geosites ◽

10.30892/gtg.362spl19-701 ◽

2021 ◽

Vol 36 (2spl) ◽

pp. 708-714

Author(s):

Sayed Mohibul HOSSEN ◽

Mohd Tahir ISMAIL ◽

Mosab I. TABASH ◽

Ahmed ABOUSAMAK ◽

...

Keyword(s):

Sustainable Development ◽

Mean Squared Error ◽

Moving Average ◽

Critical Role ◽

Absolute Error ◽

Tourism Industry ◽

Percentage Error ◽

Absolute Deviation ◽

Autoregressive Integrated Moving Average ◽

Private And Public

Forecasting potential tourist arrivals could play a critical role in tourism industry planning at all levels in both the private and public sectors. In this study, our aim is to build an econometric model to forecast international visitor flows to Bangladesh. For this purpose, the present investigation focuses on univariate Seasonal Autoregressive Integrated Moving Average (SARIMA) modeling. The model selection criteria were Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), and Mean Squared Error (RMSE). As per descriptive statistics, the mean arrivals were 207012 and will be 656522 (application) every year. Mean Absolute Deviation and Mean Squared Deviation likewise concurred with MAPE, MAE, and MSE. The results reveal that, for sustainable development, the SARIMA model is a reasonable model for forecasting international visitor arrivals in Bangladesh.


Visualization & Prediction of COVID-19 Future Outbreak by Using Machine Learning

International Journal of Information Technology and Computer Science ◽

10.5815/ijitcs.2021.03.02 ◽

2021 ◽

Vol 13 (3) ◽

pp. 16-32

Author(s):

Ahmed Hassan Mohammed Hassan ◽

Arfan Ali Mohammed Qasem ◽

Walaa Faisal Mohammed Abdalla ◽

Omer H. Elhassan

Keyword(s):

Machine Learning ◽

Polynomial Regression ◽

Mean Squared Error ◽

Absolute Error ◽

Future Perspective ◽

Support Vector ◽

Squared Error ◽

Vector Machines ◽

The World ◽

Negative Factors

Day by day, the cumulative incidence of COVID-19 is rapidly increasing. After the spread of the Corona epidemic and the death of more than a million people around the world, scientists and researchers have turned to research that takes advantage of modern technologies such as machine learning to help the world get rid of the Coronavirus (COVID-19) epidemic. Machine Learning (ML) can be deployed very effectively to track and predict the disease. ML techniques have been applied in areas that need to identify dangerous negative factors and define their priorities. The significance of the proposed system is that it predicts the number of people infected with COVID-19 using ML. Four standard models are used for COVID-19 prediction: Neural Network (NN), Support Vector Machines (SVM), Bayesian Network (BN) and Polynomial Regression (PR). The data used to test these models consist of the numbers of deaths, newly infected cases, and recoveries in the next 20 days. Five measures were used to evaluate the performance of each model, namely root mean squared error (RMSE), mean squared error (MSE), mean absolute error (MAE), explained variance score and r2 score (R2). The proposed system offers a promising mechanism for applying these models to the current scenario of the COVID-19 epidemic. The results showed that NN outperformed the other models, while SVM performed poorly in all predictions on the available dataset. Our results indicate that cases will increase slightly in the coming days. We also find that the results give rise to hope due to the low death rate. For future perspective, case explanation and data amalgamation must be kept up persistently.


Use of Empirical Mode Decomposition in Improving Neural Network Forecasting of Paddy Price

MATEMATIKA ◽

10.11113/matematika.v35.n4.1263 ◽

2019 ◽

Vol 35 (4) ◽

pp. 53-64

Author(s):

Siti Nabilah Syuhada Abdullah ◽

Ani Shabri ◽

Ruhaidah Samsudin

Keyword(s):

Neural Network ◽

Empirical Mode Decomposition ◽

Mean Squared Error ◽

Absolute Error ◽

Percentage Error ◽

Forecast Errors ◽

Ann Model ◽

Mode Decomposition ◽

The Neural Network ◽

Artificial Neural Network Ann

Since rice is a staple food in Malaysia, its price fluctuations pose risks to producers, suppliers and consumers. Hence, an accurate prediction of paddy price is essential to aid planning and decision-making in related organizations. The artificial neural network (ANN) has been widely used as a promising method for time series forecasting. In this paper, the effectiveness of integrating empirical mode decomposition (EMD) into an ANN model to forecast paddy price is investigated. The hybrid method is applied to a series of monthly paddy prices from February 1999 to May 2018, recorded in Malaysian Ringgit (MYR) per metric ton. The performance of the simple ANN model and the EMD-ANN model was measured and compared based on their root mean squared error (RMSE), mean absolute error (MAE) and mean percentage error (MPE). This study finds that the integration of EMD into the neural network model improves the forecasting capabilities. The use of EMD in the ANN model reduced the forecast errors significantly: the RMSE was reduced by 0.012, MAE by 0.0002 and MPE by 0.0448.


RNN-based Dimensional Speech Emotion Recognition

10.31227/osf.io/wa3vp ◽

2020 ◽

Author(s):

Bagus Tris Atmaja

Keyword(s):

Emotion Recognition ◽

Short Term Memory ◽

Mean Squared Error ◽

Absolute Error ◽

Recognition System ◽

Speech Emotion Recognition ◽

Percentage Error ◽

Concordance Correlation ◽

Acoustic Feature ◽

Dense System

◆ A speech emotion recognition system based on recurrent neural networks is developed using long short-term memory (LSTM) networks.
◆ Two acoustic feature sets are evaluated: a 31-feature set (3 time-domain features, 5 frequency-domain features, 13 MFCCs, 5 F0s, and 5 harmonics) and the eGeMaps feature set (23 features).
◆ To evaluate the performance, several metrics are used, i.e., mean squared error (MSE), mean absolute percentage error (MAPE), mean absolute error (MAE) and concordance correlation coefficient (CCC). Among those metrics, CCC is the main focus, as it is used by other researchers.
◆ The developed system uses multi-task learning to maximize the CCC scores of arousal, valence, and dominance at the same time using a CCC loss (1 - CCC). The results show that using LSTM networks improves the CCC score compared to a baseline dense system. The best CCC score is obtained on arousal, followed by dominance and valence.
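The CCC loss (1 - CCC) mentioned above can be written out from the standard definition of the concordance correlation coefficient. The sketch below covers a single emotion dimension only; how the per-dimension losses are combined in the multi-task setting is not stated here, so that step is left out. The function names and sample values are hypothetical.

```python
def ccc(x, y):
    """Concordance correlation coefficient between two equal-length sequences.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2),
    using population (divide-by-n) variance and covariance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


def ccc_loss(x, y):
    """Loss that is 0 for perfect agreement and grows as agreement drops."""
    return 1.0 - ccc(x, y)


# Hypothetical labels and predictions for one dimension (e.g. arousal).
labels = [0.1, 0.4, 0.5, 0.9]
predictions = [0.2, 0.35, 0.6, 0.8]
print(ccc(labels, predictions), ccc_loss(labels, predictions))
```

Unlike Pearson's R, the CCC also penalizes a shift in mean or scale between predictions and labels, so minimizing 1 - CCC rewards agreement in value as well as in trend.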
