Corn Nitrogen Nutrition Index Prediction Improved by Integrating Genetic, Environmental, and Management Factors with Active Canopy Sensing Using Machine Learning

Dan Li; Yuxin Miao; Curtis J. Ransom; G. Mac Bean; Newell R. Kitchen; Fabián G. Fernández; John E. Sawyer; James J. Camberato; Paul R. Carter; Richard B. Ferguson; David W. Franzen; Carrie A. M. Laboski; Emerson D. Nafziger; John F. Shanahan

doi:10.3390/rs14020394


Remote Sensing ◽

10.3390/rs14020394 ◽

2022 ◽

Vol 14 (2) ◽

pp. 394

Author(s):

Dan Li ◽

Yuxin Miao ◽

Curtis J. Ransom ◽

G. Mac Bean ◽

Newell R. Kitchen ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Vegetation Index ◽

Sensor Data ◽

Support Vector ◽

Random Forest Regression ◽

Machine Learning Methods ◽

Developmental Growth ◽

N Status ◽

Better Than

Accurate nitrogen (N) diagnosis early in the growing season across diverse soil, weather, and management conditions is challenging. Strategies using multi-source data are hypothesized to perform significantly better than approaches using crop sensing information alone. The objective of this study was to evaluate, across diverse environments, the potential for integrating genetic (e.g., comparative relative maturity and growing degree units to key developmental growth stages), environmental (e.g., soil and weather), and management (e.g., seeding rate, irrigation, previous crop, and preplant N rate) information with active canopy sensor data for improved corn N nutrition index (NNI) prediction using machine learning methods. Thirteen site-year corn (Zea mays L.) N rate experiments involving eight N treatments conducted in four US Midwest states in 2015 and 2016 were used for this study. A proximal RapidSCAN CS-45 active canopy sensor was used to collect corn canopy reflectance data around the V9 developmental growth stage. The utility of vegetation indices and ancillary data for predicting corn aboveground biomass, plant N concentration, plant N uptake, and NNI was evaluated using singular variable regression and machine learning methods. The results indicated that when the genetic, environmental, and management data were used together with the active canopy sensor data, corn N status indicators could be more reliably predicted either using support vector regression (R2 = 0.74–0.90 for prediction) or random forest regression models (R2 = 0.84–0.93 for prediction), as compared with using the best-performing single vegetation index or using a normalized difference vegetation index (NDVI) and normalized difference red edge (NDRE) together (R2 < 0.30). 
The N diagnostic accuracy based on the NNI was 87% using the data fusion approach with random forest regression (kappa statistic = 0.75), which was better than the result of a support vector regression model using the same inputs. The NDRE index was consistently ranked as the most important variable for predicting all the four corn N status indicators, followed by the preplant N rate. It is concluded that incorporating genetic, environmental, and management information with canopy sensing data can significantly improve in-season corn N status prediction and diagnosis across diverse soil and weather conditions.
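For illustration, the two vegetation indices named in the abstract and the kappa statistic used to report diagnostic agreement can be computed as sketched below. This is not the authors' code: the function names, reflectance values, and N status labels are hypothetical, and the standard definitions NDVI = (NIR - Red) / (NIR + Red) and NDRE = (NIR - RedEdge) / (NIR + RedEdge) are assumed.

```python
from collections import Counter


def normalized_difference(band_a, band_b):
    """Return (band_a - band_b) / (band_a + band_b)."""
    return (band_a - band_b) / (band_a + band_b)


def ndvi(nir, red):
    """Normalized difference vegetation index (standard definition)."""
    return normalized_difference(nir, red)


def ndre(nir, red_edge):
    """Normalized difference red edge index (standard definition)."""
    return normalized_difference(nir, red_edge)


def cohen_kappa(observed, predicted):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    n = len(observed)
    p_observed = sum(o == p for o, p in zip(observed, predicted)) / n
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    # Chance agreement from the marginal class frequencies.
    p_chance = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical reflectance values for one plot.
print(ndvi(nir=0.50, red=0.10))
print(ndre(nir=0.50, red_edge=0.30))

# Hypothetical observed vs. predicted N status classes.
observed = ["deficient", "deficient", "optimal", "optimal"]
predicted = ["deficient", "optimal", "optimal", "optimal"]
print(cohen_kappa(observed, predicted))
```

Kappa is reported alongside accuracy because accuracy alone can look high when one N status class dominates the data.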


Possibility of Autonomous Estimation of Shiba Goat’s Estrus and Non-Estrus Behavior by Machine Learning Methods

Animals ◽

10.3390/ani10050771 ◽

2020 ◽

Vol 10 (5) ◽

pp. 771

Author(s):

Toshiya Arakawa

Keyword(s):

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Markov Models ◽

Tracking System ◽

Video Tracking ◽

Training Data ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods

Mammalian behavior is typically monitored by observation. However, direct observation requires substantial effort and time when the number of animals is large or the observation period is long. In this study, machine learning methods, namely hidden Markov models (HMMs), random forests, support vector machines (SVMs), and neural networks, were applied to estimate whether a goat is in estrus based on its behavior, and the adequacy of each method was verified. Goat tracking data were obtained using a video tracking system and used to estimate whether the goats, each either in “estrus” or “non-estrus”, were in one of two states: “approaching the male” or “standing near the male”. Overall, the percentage concordance (PC) of random forest appears to be the highest. However, its PC for goats whose data were not used in the training data sets is relatively low, which suggests that random forest tends to overfit the training data. Apart from random forest, the PCs of HMMs and SVMs are high. Considering the calculation time and the HMM's advantage as a time-series model, the HMM is the better method. The PC of the neural network is low overall; however, if more goat data were acquired, the neural network could become an adequate method for estimation.


Detecting Face Touching Using Smartwatches to Mitigate the Spread of COVID-19: Pilot Study (Preprint)

10.2196/preprints.28799 ◽

2021 ◽

Author(s):

Chen Bai ◽

Yu-Peng Chen ◽

Adam Wolach ◽

Lisa Anthony ◽

Mamoun Mardini

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Respiratory Diseases ◽

Window Size ◽

Support Vector ◽

Accelerometer Data ◽

Respiratory Illnesses ◽

Motion Data ◽

Machine Learning Methods

BACKGROUND Frequent spontaneous facial self-touches, predominantly during outbreaks, have the theoretical potential to be a mechanism of contracting and transmitting diseases. Despite the recent advent of vaccines, behavioral approaches remain an integral part of reducing the spread of COVID-19 and other respiratory illnesses. Real-time biofeedback of face touching can potentially mitigate the spread of respiratory diseases. The gap addressed in this study is the lack of an on-demand platform that utilizes motion data from smartwatches to accurately detect face touching. OBJECTIVE The aim of this study was to utilize the functionality and the spread of smartwatches to develop a smartwatch application that identifies motion signatures mapped accurately to face touching. METHODS Participants (n=10, 50% women, aged 20-83) performed 10 physical activities classified into face touching (FT) and non-face touching (NFT) categories in a standardized laboratory setting. We developed a smartwatch application on a Samsung Galaxy Watch to collect raw accelerometer data from participants. Then, data features were extracted from consecutive non-overlapping windows varying from 2 to 16 seconds. We examined the performance of state-of-the-art machine learning methods on face-touching movement recognition (FT vs NFT) and individual activity recognition (IAR): logistic regression, support vector machine, decision trees and random forest. RESULTS Machine learning models were accurate in recognizing face touching categories; logistic regression achieved the best performance across all metrics (Accuracy: 0.93 +/- 0.08, Recall: 0.89 +/- 0.16, Precision: 0.93 +/- 0.08, F1-score: 0.90 +/- 0.11, AUC: 0.95 +/- 0.07) at the window size of 5 seconds. 
IAR models resulted in lower performance; the random forest classifier achieved the best performance across all metrics (Accuracy: 0.70 +/- 0.14, Recall: 0.70 +/- 0.14, Precision: 0.70 +/- 0.16, F1-score: 0.67 +/- 0.15) at the window size of 9 seconds. CONCLUSIONS Wearable devices, powered with machine learning, are effective in detecting facial touches. This is highly significant during respiratory infection outbreaks, as it has great potential to deter people from touching their faces and potentially mitigate the possibility of transmitting COVID-19 and future respiratory diseases.
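The windowing step described in the methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the sampling rate, the single-axis signal, and the choice of summary features (mean, standard deviation, minimum, maximum) are all hypothetical, since the abstract does not specify them.

```python
from statistics import mean, pstdev


def window_features(samples, sampling_rate_hz, window_seconds):
    """Split a 1-D signal into consecutive non-overlapping windows
    and compute simple summary features for each window.

    A trailing partial window is discarded.
    """
    size = int(sampling_rate_hz * window_seconds)
    features = []
    for start in range(0, len(samples) - size + 1, size):
        window = samples[start:start + size]
        features.append({
            "mean": mean(window),
            "std": pstdev(window),
            "min": min(window),
            "max": max(window),
        })
    return features


# Hypothetical accelerometer trace: 105 samples at an assumed 10 Hz.
signal = [float(i % 7) for i in range(105)]
rows = window_features(signal, sampling_rate_hz=10, window_seconds=5)
print(len(rows))  # two full 5-second windows; the last 5 samples are dropped
```

Each feature row would then be paired with its activity label and passed to a classifier; longer windows yield fewer but more stable feature rows, which is one reason window size is treated as a tuning parameter.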


Classification models using circulating neutrophil transcripts can detect unruptured intracranial aneurysm

Journal of Translational Medicine ◽

10.1186/s12967-020-02550-2 ◽

2020 ◽

Vol 18 (1) ◽

Author(s):

Kerry E. Poppenberg ◽

Vincent M. Tutino ◽

Lu Li ◽

Muhammad Waqas ◽

Armond June ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Prediction Models ◽

Model Performance ◽

Supervised Machine Learning ◽

Support Vector ◽

Learning Methods ◽

Training Cohort ◽

Network Analyses ◽

Machine Learning Methods

Abstract Background Intracranial aneurysms (IAs) are dangerous because of their potential to rupture. We previously found significant RNA expression differences in circulating neutrophils between patients with and without unruptured IAs and trained machine learning models to predict presence of IA using 40 neutrophil transcriptomes. Here, we aim to develop a predictive model for unruptured IA using neutrophil transcriptomes from a larger population and more robust machine learning methods. Methods Neutrophil RNA extracted from the blood of 134 patients (55 with IA, 79 IA-free controls) was subjected to next-generation RNA sequencing. In a randomly-selected training cohort (n = 94), the Least Absolute Shrinkage and Selection Operator (LASSO) selected transcripts, from which we constructed prediction models via 4 well-established supervised machine-learning algorithms (K-Nearest Neighbors, Random Forest, and Support Vector Machines with Gaussian and cubic kernels). We tested the models in the remaining samples (n = 40) and assessed model performance by receiver-operating-characteristic (ROC) curves. Real-time quantitative polymerase chain reaction (RT-qPCR) of 9 IA-associated genes was used to verify gene expression in a subset of 49 neutrophil RNA samples. We also examined the potential influence of demographics and comorbidities on model prediction. Results Feature selection using LASSO in the training cohort identified 37 IA-associated transcripts. Models trained using these transcripts had a maximum accuracy of 90% in the testing cohort. The testing performance across all methods had an average area under ROC curve (AUC) = 0.97, an improvement over our previous models. The Random Forest model performed best across both training and testing cohorts. RT-qPCR confirmed expression differences in 7 of 9 genes tested. Gene ontology and IPA network analyses performed on the 37 model genes reflected dysregulated inflammation, cell signaling, and apoptosis processes. 
In our data, demographics and comorbidities did not affect model performance. Conclusions We improved upon our previous IA prediction models based on circulating neutrophil transcriptomes by increasing sample size and by implementing LASSO and more robust machine learning methods. Future studies are needed to validate these models in larger cohorts and to further investigate the effect of covariates.
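The area under the ROC curve used to assess the models can be computed directly from predicted scores, as in the sketch below. It uses the pairwise (Mann-Whitney) formulation of AUC, which is a standard equivalent of integrating the ROC curve; the labels and scores are hypothetical and do not come from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical test cohort: 1 = aneurysm present, 0 = control.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))  # 3 of 4 positive/negative pairs are ordered correctly
```

Because AUC depends only on how cases are ranked, it can be high even when the accuracy at a fixed decision threshold is lower, which is consistent with reporting both figures.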


Machine learning predictive models for mineral prospectivity: An evaluation of neural networks, random forest, regression trees and support vector machines

Ore Geology Reviews ◽

10.1016/j.oregeorev.2015.01.001 ◽

2015 ◽

Vol 71 ◽

pp. 804-818 ◽

Cited By ~ 226

Author(s):

V. Rodriguez-Galiano ◽

M. Sanchez-Castillo ◽

M. Chica-Olmo ◽

M. Chica-Rivas

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Support Vector Machines ◽

Random Forest ◽

Predictive Models ◽

Regression Trees ◽

Support Vector ◽

Random Forest Regression ◽

Vector Machines ◽

Mineral Prospectivity


Prediction of Liver Weight Recovery by an Integrated Metabolomics and Machine Learning Approach After 2/3 Partial Hepatectomy

Frontiers in Pharmacology ◽

10.3389/fphar.2021.760474 ◽

2021 ◽

Vol 12 ◽

Author(s):

Runbin Sun ◽

Haokai Zhao ◽

Shuzhen Huang ◽

Ran Zhang ◽

Zhenyao Lu ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Liver Regeneration ◽

Partial Hepatectomy ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods ◽

Liver Index ◽

Extreme Gradient Boosting

The mammalian liver can regenerate itself, but the mechanism has not been fully explained. Here we used a GC/MS-based metabolomic method to profile the dynamic endogenous metabolic change in the serum of C57BL/6J mice at different times after 2/3 partial hepatectomy (PHx), and nine machine learning methods including Least Absolute Shrinkage and Selection Operator Regression (LASSO), Partial Least Squares Regression (PLS), Principal Components Regression (PCR), k-Nearest Neighbors (KNN), Support Vector Machines (SVM), Random Forest (RF), eXtreme Gradient Boosting (xgbDART), Neural Network (NNET) and Bayesian Regularized Neural Network (BRNN) were used for regression between the liver index and metabolomic data at different stages of liver regeneration. We found that the tree-based random forest method had the minimum average Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) and the maximum R square (R2), and was also time-saving. Furthermore, variable of importance in the project (VIP) analysis of the RF method was performed, and the metabolites with the top 20 VIP ranks were selected as the most critical metabolites contributing to the model. Ornithine, phenylalanine, 2-hydroxybutyric acid, lysine, etc. were chosen as the most important metabolites, which had strong correlations with the liver index. Further pathway analysis found that Arginine biosynthesis, Pantothenate and CoA biosynthesis, Galactose metabolism, and Valine, leucine and isoleucine degradation were the most influenced pathways. In summary, several amino acid metabolic pathways and the glucose metabolism pathway changed dynamically during liver regeneration. The RF method showed advantages over the other machine learning methods used for predicting the liver index after PHx, and a metabolic clock containing four metabolites was established to predict the liver index during liver regeneration.


Comparative Analysis of Intellectual Methods for Muscular Contraction Interpretation for Gesture Interface Implementation

Journal of Physics Conference Series ◽

10.1088/1742-6596/2096/1/012190 ◽

2021 ◽

Vol 2096 (1) ◽

pp. 012190

Author(s):

E V Bunyaeva ◽

I V Kuznetsov ◽

Y V Ponomarchuk ◽

P S Timosh

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Logistic Regression ◽

Comparative Analysis ◽

Random Forest ◽

Decision Tree ◽

Single Channel ◽

Muscular Contraction ◽

Support Vector ◽

Machine Learning Methods

Abstract The paper presents a comparative analysis of machine learning methods for gesture recognition based on single-channel surface electromyography (sEMG) data. The data were processed using a multilayer perceptron, a support vector machine, a decision tree ensemble (Random Forest), and logistic regression for the four chosen gesture types. Conclusions about the effectiveness of these methods were drawn using commonly recommended accuracy metrics.


Estimation of Leaf Nitrogen Content in Wheat Using New Hyperspectral Indices and a Random Forest Regression Algorithm

Remote Sensing ◽

10.3390/rs10121940 ◽

2018 ◽

Vol 10 (12) ◽

pp. 1940 ◽

Cited By ~ 23

Author(s):

Liang Liang ◽

Liping Di ◽

Ting Huang ◽

Jiahui Wang ◽

Li Lin ◽

...

Keyword(s):

Remote Sensing ◽

Random Forest ◽

Nitrogen Content ◽

Vegetation Index ◽

Leaf Nitrogen ◽

Support Vector ◽

Leaf Nitrogen Content ◽

Random Forest Regression ◽

First Derivative ◽

Remote Sensing Mapping

Novel hyperspectral indices, which are the first derivative normalized difference nitrogen index (FD-NDNI) and the first derivative ratio nitrogen vegetation index (FD-SRNI), were developed to estimate the leaf nitrogen content (LNC) of wheat. The field stress experiments were conducted with different nitrogen and water application rates across the growing season of wheat and 190 measurements were collected on canopy spectra and LNC under various treatments. The inversion models were constructed based on the dataset to evaluate the ability of various spectral indices to estimate LNC. A comparative analysis showed that the model accuracies of FD-NDNI and FD-SRNI were higher than those of other commonly used hyperspectral indices including mNDVI705, mSR, and NDVI705, which was indicated by higher R2 and lower root mean square error (RMSE) values. The least squares support vector regression (LS-SVR) and random forest regression (RFR) algorithms were then used to optimize the models constructed by FD-NDNI and FD-SRNI. The p-R2 values of the FD-NDNI_RFR and FD-SRNI_RFR models reached 0.874 and 0.872, respectively, which were higher than those of the exponential and SVR model and indicated that the RFR model was accurate. Using the RFR inversion model, remote sensing mapping for the Operative Modular Imaging Spectrometer (OMIS) image was accomplished. The remote sensing mapping of the OMIS image yielded an accuracy of R2 = 0.721 and RMSE = 0.540 for FD-NDNI and R2 = 0.720 and RMSE = 0.495 for FD-SRNI, which indicates that the similarity between the inversion value and the measured value was high. The results show that the new hyperspectral indices, i.e., FD-NDNI and FD-SRNI, are the optimal hyperspectral indices for estimating LNC and that the RFR algorithm is the preferred modeling method.
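The general idea behind first-derivative indices can be sketched as follows. The abstract does not give the band positions or exact formulas for FD-NDNI and FD-SRNI, so everything specific here is an assumption: the wavelengths and reflectance values are hypothetical, and the normalized-difference and ratio forms are the generic ones, (a - b) / (a + b) and a / b, applied to derivative values.

```python
def first_derivative(wavelengths, reflectance):
    """Approximate dR/d(lambda) with central finite differences;
    the two endpoints fall back to one-sided differences."""
    n = len(wavelengths)
    derivative = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        derivative.append(
            (reflectance[hi] - reflectance[lo]) / (wavelengths[hi] - wavelengths[lo])
        )
    return derivative


def derivative_indices(wavelengths, reflectance, band_a, band_b):
    """Return (normalized difference, simple ratio) of the first-derivative
    values at two wavelengths. Band choices are hypothetical."""
    by_band = dict(zip(wavelengths, first_derivative(wavelengths, reflectance)))
    a, b = by_band[band_a], by_band[band_b]
    return (a - b) / (a + b), a / b


# Hypothetical four-band canopy spectrum (wavelengths in nm).
wavelengths = [500, 510, 520, 530]
reflectance = [0.10, 0.12, 0.18, 0.20]
nd, ratio = derivative_indices(wavelengths, reflectance, band_a=510, band_b=500)
print(nd, ratio)
```

Working on the derivative rather than the raw reflectance emphasizes the slope of the spectrum, which tends to reduce the influence of constant offsets such as background brightness.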


Modeling Pan Evaporation Using Gaussian Process Regression K-Nearest Neighbors Random Forest and Support Vector Machines; Comparative Analysis

Atmosphere ◽

10.3390/atmos11010066 ◽

2020 ◽

Vol 11 (1) ◽

pp. 66 ◽

Cited By ~ 9

Author(s):

Sevda Shabani ◽

Saeed Samadianfard ◽

Mohammad Taghi Sattari ◽

Amir Mosavi ◽

Shahaboddin Shamshirband ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Gaussian Process ◽

Gaussian Process Regression ◽

Nearest Neighbors ◽

Support Vector ◽

Pan Evaporation ◽

Learning Methods ◽

K Nearest Neighbors ◽

Machine Learning Methods

Evaporation is a very important process; it is one of the most critical factors in agricultural, hydrological, and meteorological studies. Due to the interactions of multiple climatic factors, evaporation is considered a complex and nonlinear phenomenon to model. Thus, machine learning methods have gained popularity in this realm. In the present study, four machine learning methods, Gaussian Process Regression (GPR), K-Nearest Neighbors (KNN), Random Forest (RF) and Support Vector Regression (SVR), were used to predict the pan evaporation (PE). Meteorological data including PE, temperature (T), relative humidity (RH), wind speed (W), and sunny hours (S) were collected from 2011 through 2017. The accuracy of the studied methods was determined using the statistical indices of Root Mean Squared Error (RMSE), correlation coefficient (R) and Mean Absolute Error (MAE). Furthermore, Taylor charts were utilized for evaluating the accuracy of the mentioned models. The results of this study showed that at Gonbad-e Kavus, Gorgan and Bandar Torkman stations, GPR with RMSE of 1.521 mm/day, 1.244 mm/day, and 1.254 mm/day, KNN with RMSE of 1.991 mm/day, 1.775 mm/day, and 1.577 mm/day, RF with RMSE of 1.614 mm/day, 1.337 mm/day, and 1.316 mm/day, and SVR with RMSE of 1.55 mm/day, 1.262 mm/day, and 1.275 mm/day had more appropriate performances in estimating PE values. It was found that GPR for Gonbad-e Kavus Station with input parameters of T, W and S and GPR for Gorgan and Bandar Torkman stations with input parameters of T, RH, W and S had the most accurate predictions and were proposed for precise estimation of PE. The findings of the current study indicated that the PE values may be accurately estimated with few easily measured meteorological parameters.
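The three accuracy indices named in the abstract have standard definitions, sketched below in plain Python. The observed and predicted values are hypothetical and serve only to show the calculation.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def pearson_r(observed, predicted):
    """Pearson correlation coefficient."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


# Hypothetical daily pan evaporation values in mm/day.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 2.5, 4.5]
print(rmse(observed, predicted), mae(observed, predicted), pearson_r(observed, predicted))
```

RMSE penalizes large errors more heavily than MAE, while R measures only how well predictions track the observations linearly and is insensitive to a constant bias, so the three indices are usually read together.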


Forest attribute imputation using machine-learning methods and ASTER data: comparison of k-NN, SVR and random forest regression algorithms

International Journal of Remote Sensing ◽

10.1080/01431161.2012.682661 ◽

2012 ◽

Vol 33 (19) ◽

pp. 6254-6280 ◽

Cited By ~ 41

Author(s):

Shaban Shataee ◽

Syavash Kalbi ◽

Asghar Fallah ◽

Dieter Pelz

Keyword(s):

Machine Learning ◽

Random Forest ◽

Random Forest Regression ◽

Learning Methods ◽

Aster Data ◽

Machine Learning Methods ◽

Data Comparison ◽

Regression Algorithms


Generating Artificial Sensor Data for the Comparison of Unsupervised Machine Learning Methods

Sensors ◽

10.3390/s21072397 ◽

2021 ◽

Vol 21 (7) ◽

pp. 2397

Author(s):

Bernd Zimmering ◽

Oliver Niggemann ◽

Constanze Hasterok ◽

Erik Pfannstiel ◽

Dario Ramming ◽

...

Keyword(s):

Machine Learning ◽

Short Term Memory ◽

Sensor Data ◽

Support Vector ◽

Neural Net ◽

Generation Process ◽

Self Organizing Map ◽

Data Generation ◽

Learning Methods ◽

Machine Learning Methods

In the field of Cyber-Physical Systems (CPS), there is a large number of machine learning methods, and their intrinsic hyper-parameters are hugely varied. Since no agreed-on datasets for CPS exist, developers of new algorithms are forced to define their own benchmarks. This leads to a large number of algorithms each claiming benefits over other approaches but lacking a fair comparison. To tackle this problem, this paper defines a novel model for a generation process of data, similar to that found in CPS. The model is based on well-understood system theory and allows many datasets with different characteristics in terms of complexity to be generated. The data will pave the way for a comparison of selected machine learning methods in the exemplary field of unsupervised learning. Based on the synthetic CPS data, the data generation process is evaluated by analyzing the performance of the methods of the Self-Organizing Map, One-Class Support Vector Machine and Long Short-Term Memory Neural Net in anomaly detection.
