Estimation of Rice Height and Biomass Using Multitemporal SAR Sentinel-1 for Camargue, Southern France

Emile Ndikumana; Dinh Ho Tong Minh; Hai Dang Nguyen; Nicolas Baghdadi; Dominique Courault; Laure Hossard; Ibrahim El Moussawi

doi:10.3390/rs10091394

Estimation of Rice Height and Biomass Using Multitemporal SAR Sentinel-1 for Camargue, Southern France

Remote Sensing ◽

10.3390/rs10091394 ◽

2018 ◽

Vol 10 (9) ◽

pp. 1394 ◽

Cited By ~ 15

Author(s):

Emile Ndikumana ◽

Dinh Ho Tong Minh ◽

Hai Dang Nguyen ◽

Nicolas Baghdadi ◽

Dominique Courault ◽

...

Keyword(s):

Random Forest ◽

Speckle Noise ◽

Machine Learning Techniques ◽

Support Vector ◽

Dual Polarization ◽

Southern France ◽

Biophysical Parameter ◽

Radar Images ◽

Dry Biomass

The research and improvement of methods for crop monitoring are currently major challenges, especially for radar images, because of their inherent speckle noise. The European Space Agency’s (ESA) Sentinel-1 constellation provides synthetic aperture radar (SAR) image coverage with a 6-day revisit period at a high spatial resolution, with a pixel spacing of 20 m. Sentinel-1 data are considerably useful, as they provide valuable information on the vegetation cover. The objective of this work is to study the capabilities of multitemporal radar images for rice height and dry biomass retrievals using Sentinel-1 data. To do this, we train classical machine learning techniques (Multiple Linear Regression (MLR), Support Vector Regression (SVR) and Random Forest (RF)) on Sentinel-1 data against ground measurements to estimate rice height and dry biomass. The study is carried out on a multitemporal Sentinel-1 dataset acquired from May 2017 to September 2017 over the Camargue region, southern France. The in-situ ground measurements were made in the same period to collect rice height and dry biomass over 11 rice fields. The images were processed in order to produce a radar stack in C-band including dual-polarization VV (Vertical receive and Vertical transmit) and VH (Vertical receive and Horizontal transmit) data. We found that the non-parametric methods (SVR and RF) performed better than the parametric MLR method for rice biophysical parameter retrievals. The estimated rice height was strongly correlated with the in-situ rice height when using dual-polarization data, for which Random Forest yielded the best performance, with a correlation coefficient R² = 0.92 and a root mean square error (RMSE) of 16% (7.9 cm). In addition, we demonstrated that the correlation of the Sentinel-1 signal with the biomass was also very high in VH polarization, with R² = 0.9 and RMSE = 18% (162 g·m⁻²) (with the Random Forest method). Such results indicate that the high-quality Sentinel-1 radar data could be well exploited for rice biomass and height retrieval and could be used for operational tasks.
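The two scores quoted in the abstract can be computed from paired predictions and in-situ measurements. The sketch below is a minimal illustration, not the authors' code: it assumes that R² is the coefficient of determination and that the percentage RMSE is the RMSE divided by the mean observed value, neither of which the abstract states. The function names and the sample heights are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def relative_rmse_percent(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed normalisation)."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_obs

# Made-up rice heights in cm, for illustration only (not data from the study).
observed_cm = [10.0, 20.0, 30.0, 40.0]
predicted_cm = [12.0, 18.0, 33.0, 39.0]

print(f"RMSE  = {rmse(observed_cm, predicted_cm):.2f} cm")
print(f"R^2   = {r_squared(observed_cm, predicted_cm):.3f}")
print(f"rRMSE = {relative_rmse_percent(observed_cm, predicted_cm):.1f} %")
```

Reporting both an absolute RMSE (in cm or g·m⁻²) and a relative one, as the abstract does, makes results comparable across variables with different units.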

Preliminary Screening of COVID-19 Infection Employing Machine Learning Techniques From Simple Blood Profile

International Journal of Quantitative Structure-Property Relationships ◽

10.4018/ijqspr.2021070103 ◽

2021 ◽

Vol 6 (3) ◽

pp. 35-47

Author(s):

Anirudh Reddy Cingireddy ◽

Robin Ghosh ◽

Supratik Kar ◽

Venkata Melapu ◽

Sravanthi Joginipeli ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Naive Bayes ◽

Albert Einstein ◽

Naïve Bayes ◽

Machine Learning Techniques ◽

Support Vector ◽

Blood Profile ◽

Molecular Tests ◽

Large Populations

Frequent testing of the entire population would help to identify individuals with active COVID-19 and allow us to identify concealed carriers. Molecular tests, antigen tests, and antibody tests are being widely used to confirm COVID-19 in the population. Molecular tests such as the real-time reverse transcription-polymerase chain reaction (rRT-PCR) test take from a minimum of 3 hours to a maximum of 4 days to return results. The authors suggest using machine learning and data mining tools to filter large populations at a preliminary level to overcome this issue. The ML tools could reduce the testing population size by 20 to 30%. In this study, they used a subset of features from the full blood profiles of patients at the Israelita Albert Einstein hospital in Brazil. They evaluated classification models, namely KNN, logistic regression, XGBoost, naive Bayes, decision tree, random forest, support vector machine, and multilayer perceptron, and validated them with k-fold cross-validation. Naive Bayes, KNN, and random forest stand out as the most predictive ones, with 88% accuracy each.
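The k-fold cross-validation mentioned in the abstract can be sketched without any external library. The example below is an illustration under stated assumptions, not the authors' pipeline: it uses contiguous, unshuffled folds and a trivial majority-class baseline in place of the real classifiers, and every name and label in it is hypothetical.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

def majority_class_predict(train_labels, n_test):
    """Baseline stand-in for a real model: always predict the most common training label."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return [most_common_label] * n_test

def cross_validated_accuracy(labels, k):
    """Fraction of samples predicted correctly when each fold is held out once."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        train_labels = [labels[i] for i in train_idx]
        predictions = majority_class_predict(train_labels, len(test_idx))
        correct += sum(p == labels[i] for p, i in zip(predictions, test_idx))
    return correct / len(labels)

# Made-up binary labels (1 = positive test), for illustration only.
labels = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
print(cross_validated_accuracy(labels, k=5))
```

A majority-class baseline like this one is a useful reference point: on imbalanced screening data, a classifier's accuracy is only informative if it clearly exceeds what always predicting the dominant class would achieve.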

Classification study of solvation free energies of organic molecules using machine learning techniques

RSC Advances ◽

10.1039/c4ra07961b ◽

2014 ◽

Vol 4 (106) ◽

pp. 61624-61630 ◽

Cited By ~ 8

Author(s):

N. S. Hari Narayana Moorthy ◽

Silvia A. Martins ◽

Sergio F. Sousa ◽

Maria J. Ramos ◽

Pedro A. Fernandes

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Organic Molecules ◽

Machine Learning Techniques ◽

Support Vector ◽

Classification Models ◽

Free Energies ◽

Learning Techniques ◽

Solvation Free Energies

Classification models to predict the solvation free energies of organic molecules were developed using decision tree, random forest and support vector machine approaches and with MACCS fingerprints, MOE and PaDEL descriptors.

CPT Data Interpretation Employing Different Machine Learning Techniques

Geosciences ◽

10.3390/geosciences11070265 ◽

2021 ◽

Vol 11 (7) ◽

pp. 265

Author(s):

Stefan Rauter ◽

Franz Tschuchnigg

Keyword(s):

Machine Learning ◽

Grain Size ◽

Random Forest ◽

Classification Model ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Models ◽

Cone Penetration ◽

Tip Resistance ◽

Machine Learning Models

The classification of soils into categories with a similar range of properties is a fundamental geotechnical engineering procedure. At present, this classification is based on various types of cost- and time-intensive laboratory and/or in situ tests. These soil investigations are essential for each individual construction site and have to be performed prior to the design of a project. Since Machine Learning could play a key role in reducing the costs and time needed for a suitable site investigation program, the basic ability of Machine Learning models to classify soils from Cone Penetration Tests (CPT) is evaluated. To find an appropriate classification model, 24 different Machine Learning models, based on three different algorithms, are built and trained on a dataset consisting of 1339 CPT. The applied algorithms are a Support Vector Machine, an Artificial Neural Network and a Random Forest. As input features, different combinations of direct cone penetration test data (tip resistance qc, sleeve friction fs, friction ratio Rf, depth d), combined with “defined”, thus, not directly measured data (total vertical stresses σv, effective vertical stresses σ’v and hydrostatic pore pressure u0), are used. Standard soil classes based on grain size distributions and soil classes based on soil behavior types according to Robertson are applied as targets. The different models are compared with respect to their prediction performance and the required learning time. The best results for all targets were obtained with models using a Random Forest classifier. For the soil classes based on grain size distribution, an accuracy of about 75%, and for soil classes according to Robertson, an accuracy of about 97–99%, was reached.

Credit Risk Assessment using Machine Learning Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a4936.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 3482-3486

Keyword(s):

Machine Learning ◽

Risk Assessment ◽

Random Forest ◽

Credit Risk ◽

Banking Sector ◽

Machine Learning Techniques ◽

Support Vector ◽

Credit Risk Assessment ◽

Learning Techniques ◽

Cart Algorithm

Analysis of credit scoring is an effective credit risk assessment technique, which is one of the major research fields in the banking sector. Machine learning has a variety of applications in the banking sector and it has been widely used for data analysis. Modern techniques such as machine learning have provided a self-regulating process to analyze the data using classification techniques. The classification method is a supervised learning process in which the computer learns from the input data provided and makes use of this information to classify the new dataset. This research paper presents a comparison of various machine learning techniques used to evaluate credit risk. Models that decide whether a credit transaction should be accepted or rejected are trained and applied to the dataset using different machine learning algorithms. The techniques are implemented on the German credit dataset taken from the UCI repository, which has 1000 instances and 21 attributes, depending on which the transactions are either accepted or rejected. This paper compares algorithms such as Support Vector Network, Neural Network, Logistic Regression, Naive Bayes, Random Forest, and the Classification and Regression Trees (CART) algorithm, and the results obtained show that the Random Forest algorithm was able to predict credit risk with higher accuracy.

Prediction of Bipolar Disorder with Voice Analysis using Machine Learning Techniques

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d9069.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 11437-11440

Keyword(s):

Bipolar Disorder ◽

Random Forest ◽

Monitoring Data ◽

Machine Learning Techniques ◽

Phone Call ◽

Support Vector ◽

Self Monitoring ◽

Phone Calls ◽

Behavioral Activities ◽

The Voice

Changes in speech are a responsive and well-founded measure of the depression and obsession of bipolar disorder. This analysis mainly focuses on perceiving the voice attributes; phone call data are also collected, as they act as a main search-space maker for bipolar disorders. By combining the voice features with the phone call data based on behavioral activities, self-monitoring data control and illness activities, the accuracy would increase to effective states. The voice attributes and smartphones collect the activities of sample phone data and self-monitoring data. These activities are the root cause of the expansion of two symptoms: depression and obsession. These symptoms were introduced by a researcher who was rendered with smartphones. The phone call data were examined through a statistical random forest algorithm. The states were extracted from daily phone calls and are classified using voice attributes. These attributes are more determined and accurate for classifying the manic states. The main point in comparing the voice attributes and self-observed data with the behavioral activities of phone call data is that these attributes increase the efficiency, vulnerability, and definiteness of classifying the affective states. The techniques used to detect the voice features are support vector machine (SVM) and random forest. The proposed system will enhance the prediction performance of all the techniques. By comparing the accuracy of each technique, we can determine which technique predicts more accurately.

Compression Strength Prediction Using Machine Learning Techniques

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/431012021 ◽

2021 ◽

Vol 10 (1) ◽

pp. 301-307

Keyword(s):

Random Forest ◽

Compression Strength ◽

Machine Learning Techniques ◽

Support Vector ◽

Mathematical Relationship ◽

Random Forest Regression ◽

Learning Techniques ◽

Regression Techniques ◽

Compressive Strength Of Concrete ◽

Advanced Computing

Advanced computing techniques and their applications in other engineering disciplines have accelerated different aspects and phases of the engineering process. Nowadays, many computer-aided methods are widely used in the civil engineering domain. The mathematical relationship between the ratios of different concrete components, other influencing factors and the compression strength needs to be analyzed for different engineering needs. This paper aims to develop a mathematical relationship after analyzing the above factors and to predict the compressive strength of concrete by applying various regression techniques, such as linear regression, support vector regression, decision tree regression and random forest regression, on an assumed data set. It was found that the accuracy of the random forest regression was considerable as per the result after applying the various regression techniques.

Interpolation of Instantaneous Air Temperature Using Geographical and MODIS Derived Variables with Machine Learning Techniques

10.20944/preprints201906.0008.v1 ◽

2019 ◽

Author(s):

Marcos Ruiz-Álvarez ◽

Francisco Alonso-Sarría ◽

Francisco Gomariz-Castillo

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Linear Regression ◽

Air Temperature ◽

Satellite Data ◽

Multivariate Linear Regression ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector

Several methods have been tried to estimate air temperature using satellite imagery. In this paper, the results of two machine learning algorithms, Support Vector Machine and Random Forest, are compared with Multivariate Linear Regression, TVX and ordinary kriging. Several geographic, remote sensing and time variables are used as predictors. The validation is carried out using four different statistics on a daily basis, allowing the use of ANOVA to compare the results. The main conclusion is that Random Forest with residual kriging produces the best results (R² = 0.612 ± 0.019, NSE = 0.578 ± 0.025, RMSE = 1.068 ± 0.027, PBIAS = −0.172 ± 0.046), whereas TVX produces the least accurate results. The environmental conditions in the study area are not really suited to TVX; moreover, this method only takes into account satellite data. On the other hand, regression methods (Support Vector Machine, Random Forest and Multivariate Linear Regression) use several parameters that are easily calculated from a Digital Elevation Model, adding very little difficulty to the use of satellite data alone. The most important variables in the Random Forest model were satellite temperature, potential irradiation and cdayt, a cosine transformation of the Julian day.
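A cosine transformation of the day of year, such as the cdayt variable named in the abstract, turns the calendar date into a feature that is continuous across the year boundary. The abstract does not give the formula, so the sketch below assumes the common form cos(2π · day / 365.25) with no phase shift; the function name and the period constant are hypothetical choices.

```python
import math
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed period; the abstract does not specify it

def cosine_day(d):
    """Map a calendar date to cos(2*pi*day_of_year / period), a cyclic seasonal feature."""
    day_of_year = d.timetuple().tm_yday
    return math.cos(2.0 * math.pi * day_of_year / DAYS_PER_YEAR)

# 31 December and 1 January are adjacent in time, and the feature reflects that,
# whereas the raw day-of-year jumps from 365 to 1.
for d in (date(2019, 1, 1), date(2019, 7, 2), date(2019, 12, 31)):
    print(d.isoformat(), round(cosine_day(d), 4))
```

A single cosine term cannot distinguish spring from autumn, since both give the same value; a paired sine term is a common complement when that distinction matters.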

Machine Learning Algorithms For Understanding The Determinants of Under-Five Mortality

10.21203/rs.3.rs-1021040/v1 ◽

2021 ◽

Author(s):

Rakesh Kumar Saroj ◽

Pawan Kumar Yadav ◽

Rajneesh Singh ◽

Obvious Nchimunya Chilyabanyama

Keyword(s):

Machine Learning ◽

Random Forest ◽

Information Gain ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Mortality Data ◽

Mortality Factors ◽

Under Five ◽

Learning Techniques

Background: The death rate of under-five children in India has declined over the last few decades, but a few of the larger states still perform poorly. This is a matter of serious concern for child health as well as social development. Nowadays, machine learning techniques play a crucial role in smart health care systems by capturing the hidden factors and patterns of outcomes. In this paper, we used machine learning techniques to predict the important factors of under-five mortality. This study aims to explore the importance of machine learning techniques for predicting under-five mortality and to find the important factors that cause it. The data were taken from the National Family Health Survey-IV of Uttar Pradesh. We used four machine learning techniques (decision tree, support vector machine, random forest, and logistic regression) to predict under-five mortality factors and assessed the accuracy of each model. We also used information gain to rank the variables that are important for accurate predictions in under-five mortality data. Result: Random Forest (RF) predicts the child mortality factors with the highest accuracy of 97.5%, and the number of living children, births in the last five years, educational level, birth order, total children ever born, currently breastfeeding, and size of child at birth are identified as essential factors for under-five mortality. Conclusion: The study focuses on machine learning techniques to predict and identify important factors for under-five mortality. The random forest model provides an excellent predictive result for estimating the risk factors of under-five mortality. Based on the resulting outcome, policymakers can make policies and plans to reduce under-five mortality.
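Ranking variables by information gain, as the abstract describes, amounts to measuring how much the entropy of the outcome drops once a variable's value is known. The sketch below is a minimal illustration for categorical variables, not the authors' implementation; the feature names and the toy records are hypothetical and are not taken from the survey.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def rank_features(features, labels):
    """Return (feature name, information gain) pairs, highest gain first."""
    gains = {name: information_gain(values, labels) for name, values in features.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)

# Toy data: four records with a binary outcome and two made-up categorical features.
outcome = [1, 1, 0, 0]
features = {
    "uninformative": ["a", "b", "a", "b"],  # splits the outcome evenly: gain 0
    "informative": ["a", "a", "b", "b"],    # separates the outcome perfectly: gain 1 bit
}
ranked = rank_features(features, outcome)
print(ranked)
```

Information gain tends to favour variables with many distinct values, so continuous survey variables are usually binned before ranking.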

A Novel Approach for Detecting DGA-Based Botnets in DNS Queries Using Machine Learning Techniques

Journal of Computer Networks and Communications ◽

10.1155/2021/4767388 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Ali Soleymani ◽

Fatemeh Arabgol

Keyword(s):

Machine Learning ◽

Random Forest ◽

Text Mining ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Detection Accuracy ◽

Domain Name ◽

Botnet Detection ◽

Learning Techniques

In today’s security landscape, advanced threats are becoming increasingly difficult to detect as the pattern of attacks expands. Classical approaches that rely heavily on static matching, such as blacklisting or regular expression patterns, may be limited in flexibility or uncertainty in detecting malicious data in system data. This is where machine learning techniques can show their value and provide new insights and higher detection rates. The behavior of botnets that use domain-flux techniques to hide command and control channels was investigated in this research. The machine learning algorithm and text mining used to analyze the network DNS protocol and identify botnets were also described. For this purpose, extracted and labeled domain name datasets containing healthy and infected DGA botnet data were used. Data preprocessing techniques based on a text-mining approach were applied to explore domain name strings with n-gram analysis and PCA. Its performance is improved by extracting statistical features by principal component analysis. The performance of the proposed model has been evaluated using different classifiers of machine learning algorithms such as decision tree, support vector machine, random forest, and logistic regression. Experimental results show that the random forest algorithm can be used effectively in botnet detection and has the best botnet detection accuracy.
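The n-gram analysis of domain name strings mentioned in the abstract can be illustrated with character bigrams. The sketch below shows only the feature-extraction step under simple assumptions (lower-cased input, first label of a bare domain such as `example.com`); it does not reproduce the PCA or the classifiers, and the function names and domains are hypothetical.

```python
from collections import Counter

def char_ngrams(text, n):
    """Return all overlapping character n-grams of a string, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def ngram_counts(domain, n=2):
    """Count character n-grams in the first label of a domain name."""
    # Assumption: the input is a bare domain, so the first label is the part of interest.
    first_label = domain.lower().split(".")[0]
    return Counter(char_ngrams(first_label, n))

# A dictionary-like name repeats a few bigrams; a random-looking name rarely does.
print(ngram_counts("banana.com"))
print(ngram_counts("xkqzvjwp.com"))
```

Count vectors like these are high-dimensional and sparse, which is one reason a dimensionality reduction step such as PCA is typically applied before classification.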

A Supervised Machine Learning Approach to Detect the On/Off State in Parkinson’s Disease Using Wearable Based Gait Signals

Diagnostics ◽

10.3390/diagnostics10060421 ◽

2020 ◽

Vol 10 (6) ◽

pp. 421

Author(s):

Satyabrata Aich ◽

Jinyoung Youn ◽

Sabyasachi Chakraborty ◽

Pyari Mohan Pradhan ◽

Jin-han Park ◽

...

Keyword(s):

Machine Learning ◽

Parkinson's Disease ◽

Random Forest ◽

Wearable Devices ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

Healthcare Applications ◽

Reported Data

Fluctuations in motor symptoms are mostly observed in Parkinson’s disease (PD) patients. This characteristic is inevitable, and can affect the quality of life of the patients. However, it is difficult to collect precise data on the fluctuation characteristics using self-reported data from PD patients. Therefore, it is necessary to develop a suitable technology that can detect the medication state, also termed the “On”/“Off” state, automatically using wearable devices; at the same time, this could be used in the home environment. Recently, wearable devices, in combination with powerful machine learning techniques, have shown the potential to be effectively used in critical healthcare applications. In this study, an algorithm is proposed that can detect the medication state automatically using wearable gait signals. A combination of statistical features and spatiotemporal gait features is used as input to four different classifiers: random forest, support vector machine, K nearest neighbour, and Naïve Bayes. In total, 20 PD subjects with definite motor fluctuations have been evaluated by comparing the performance of the proposed algorithm in association with the four aforementioned classifiers. It was found that random forest outperformed the other classifiers with an accuracy of 96.72%, a recall of 97.35%, and a precision of 96.92%.
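The three scores reported in the abstract follow directly from the confusion counts of a binary classifier. The sketch below is a minimal illustration, assuming the “On” state is treated as the positive class (the abstract does not say which state is positive); the function name and the labels are hypothetical.

```python
def classification_metrics(true_labels, predicted_labels, positive=1):
    """Compute accuracy, precision and recall for one positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted or never present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Made-up labels: 1 = "On" state, 0 = "Off" state (illustration only).
true_states = [1, 1, 1, 0, 0, 0, 1, 0]
predicted_states = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(true_states, predicted_states)
print(metrics)
```

Recall answers how many true “On” windows were detected, while precision answers how many predicted “On” windows were correct, which is why the abstract reports both alongside accuracy.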
