Evaluation of Light Gradient Boosted Machine Learning Technique in Large Scale Land Use and Land Cover Classification

Environments ◽  
2020 ◽  
Vol 7 (10) ◽  
pp. 84
Author(s):  
Dakota Aaron McCarty ◽  
Hyun Woo Kim ◽  
Hye Kyung Lee

The ability to rapidly produce accurate land use and land cover maps regularly and consistently has been a growing initiative, as these maps have increasingly become an important tool in the efforts to evaluate, monitor, and conserve Earth’s natural resources. Algorithms for supervised classification of satellite images constitute a necessary tool for building these maps, and they have made it possible to establish remote sensing as the most reliable means of map generation. In this paper, we compare three machine learning techniques: Random Forest, Support Vector Machines, and Light Gradient Boosted Machine, using a 70/30 training/testing evaluation model. Our research evaluates the accuracy of Light Gradient Boosted Machine models against the more classic and trusted Random Forest and Support Vector Machines when classifying land use and land cover over large geographic areas. We found that the Light Gradient Boosted model is marginally more accurate, with a 0.01 and 0.059 increase in overall accuracy compared to Support Vector Machines and Random Forest, respectively, and that it also ran around 25% faster on average.
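
As an illustration of the 70/30 training/testing evaluation described in the abstract above, the following is a minimal sketch using only the Python standard library. It is not the authors' code: the function names, the fixed random seed, and the toy labels are assumptions made for the example, and no classifier is trained here.

```python
import random


def train_test_split_70_30(samples, labels, seed=0):
    """Shuffle the data and split it into 70% training and 30% testing.

    The seed is an assumption added here so the split is reproducible.
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(round(0.7 * len(indices)))  # size of the training portion
    train_idx, test_idx = indices[:cut], indices[cut:]
    x_train = [samples[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_test = [samples[i] for i in test_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, y_train, x_test, y_test


def overall_accuracy(y_true, y_pred):
    """Fraction of test samples whose predicted class equals the true class."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy data (hypothetical): ten pixels with made-up class labels.
pixels = list(range(10))
classes = ["water", "forest", "urban", "forest", "water",
           "urban", "forest", "water", "urban", "forest"]
x_train, y_train, x_test, y_test = train_test_split_70_30(pixels, classes)
```

Each of the compared models would be fitted on the training portion and scored with `overall_accuracy` on the held-out portion, so that differences such as the reported 0.01 and 0.059 are measured on the same unseen samples.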

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Tom Elliot ◽  
Robert Morse ◽  
Duane Smythe ◽  
Ashley Norris

It is 50 years since Sieveking et al. published their pioneering research in Nature on the geochemical analysis of artefacts from Neolithic flint mines in southern Britain. In the decades since, geochemical techniques to source stone artefacts have flourished globally, with a renaissance in recent years from new instrumentation, data analysis, and machine learning techniques. Despite the interest in these latter approaches, there has been variation in the quality with which these methods have been applied. Using the case study of flint artefacts and geological samples from England, we present a robust and objective evaluation of three popular techniques, Random Forest, K-Nearest-Neighbour, and Support Vector Machines, and present a pipeline for their appropriate use. When evaluated correctly, the results establish high model classification performance, with Random Forest leading with an average accuracy of 85% (measured through F1 Scores), and with Support Vector Machines following closely. The methodology developed in this paper demonstrates the potential to significantly improve on previous approaches, particularly in removing bias, and providing greater means of evaluation than previously utilised.


2020 ◽  
Vol 24 (5) ◽  
pp. 1141-1160
Author(s):  
Tomás Alegre Sepúlveda ◽  
Brian Keith Norambuena

In this paper, we apply sentiment analysis methods in the context of the first round of the 2017 Chilean elections. The purpose of this work is to estimate the voting intention associated with each candidate in order to contrast this with the results from classical methods (e.g., polls and surveys). The data are collected from Twitter, because of its high usage in Chile and in the sentiment analysis literature. We obtained tweets associated with the three main candidates: Sebastián Piñera (SP), Alejandro Guillier (AG) and Beatriz Sánchez (BS). For each candidate, we estimated the voting intention and compared it to the traditional methods. To do this, we first acquired the data and labeled the tweets as positive or negative. Afterward, we built a model using machine learning techniques. The classification model had an accuracy of 76.45% using support vector machines, which yielded the best model for our case. Finally, we use a formula to estimate the voting intention from the number of positive and negative tweets for each candidate. For the last period, we obtained a voting intention of 35.84% for SP, compared to a range of 34–44% according to traditional polls and 36% in the actual elections. For AG we obtained an estimate of 37%, compared with a range of 15.40% to 30.00% for traditional polls and 20.27% in the elections. For BS we obtained an estimate of 27.77%, compared with the range of 8.50% to 11.00% given by traditional polls and an actual result of 22.70% in the elections. These results are promising, in some cases providing an estimate closer to reality than traditional polls. Some differences can be explained by the fact that some candidates were omitted, even though they received a significant number of votes.


2020 ◽  
Vol 13 (1-2) ◽  
pp. 43-52
Author(s):  
Boudewijn van Leeuwen ◽  
Zalán Tobak ◽  
Ferenc Kovács

Classification of multispectral optical satellite data using machine learning techniques to derive land use/land cover thematic data is important for many applications. Comparing the latest algorithms, our research aims to determine the best option to classify land use/land cover, with special focus on temporarily inundated land in a flat area in the south of Hungary. These inundations disrupt agricultural practices and can cause large financial losses. Sentinel-2 data with a high temporal and medium spatial resolution are classified using open source implementations of a random forest, a support vector machine and an artificial neural network. Each classification model is applied to the same data set and the results are compared qualitatively and quantitatively. The accuracy of the results is high for all methods and does not show large overall differences. A quantitative spatial comparison demonstrates that the neural network gives the best results, but that all models are strongly influenced by atmospheric disturbances in the image.


Stock market prediction is the attempt to determine the future value of a company's stock or of another financial instrument traded on an exchange. Accurately predicting the future price of a stock can increase an investor's returns. This article proposes a machine learning model to forecast stock market prices. While reviewing the various techniques and factors that should be considered, we found that methods such as random forest and support vector machines were not fully exploited in previous work. In this article, we present and review a more suitable method for predicting stock movements with greater accuracy. The first thing we considered was the stock market price data set from Yahoo stocks. We then review the use of random forest and the support vector machine on the data set after pre-processing, and the results they produce. An effective stock forecast would be a valuable asset for stock market institutions and would offer practical solutions to the difficulties facing stock investors.


2021 ◽  
pp. 1-29
Author(s):  
Ahmed Alsaihati ◽  
Mahmoud Abughaban ◽  
Salaheldin Elkatatny ◽  
Abdulazeez Abdulraheem

Fluid loss into formations is a common operational issue that is frequently encountered when drilling across naturally or induced fractured formations. This could pose significant operational risks, such as well-control, stuck pipe, and wellbore instability, which, in turn, lead to an increase in well time and cost. This research aims to use and evaluate different machine learning techniques, namely support vector machines, random forests, and K-nearest neighbors, in detecting loss circulation occurrences while drilling using solely drilling surface parameters. Actual field data from seven wells, which had suffered partial or severe loss circulation, were used to build predictive models, while Well-8 was used to compare the performance of the developed models. Recall, precision, and F1-score measures were used to evaluate the ability of the developed models to detect loss circulation occurrences. The results showed that the K-nearest neighbors classifier achieved a high F1-score of 0.912 in detecting loss circulation occurrences in the testing set, while the random forests classifier was the second-best with almost the same F1-score of 0.910. The support vector machines achieved an F1-score of 0.83 in predicting loss circulation occurrences in the testing set. The K-nearest neighbors classifier outperformed the other models in detecting loss circulation occurrences in Well-8, with an F1-score of 0.80. The main contribution of this research compared to previous studies is that it identifies loss events based on real-time measurements of the active pit volume.
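
The recall, precision, and F1-score measures named in the abstract above follow the standard definitions for a binary detection task. The sketch below shows how they are computed from true and predicted labels; the function name and the toy label vectors are assumptions for illustration and do not reproduce the paper's data or results.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the chosen positive class.

    precision = TP / (TP + FP), recall = TP / (TP + FN),
    F1 = harmonic mean of precision and recall.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels: 1 marks a time step with a loss event, 0 without.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Because F1 balances missed events (recall) against false alarms (precision), it is a natural choice when loss events are much rarer than normal drilling conditions and plain accuracy would be dominated by the majority class.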


2019 ◽  
Vol 11 (13) ◽  
pp. 1600 ◽  
Author(s):  
Flávio F. Camargo ◽  
Edson E. Sano ◽  
Cláudia M. Almeida ◽  
José C. Mura ◽  
Tati Almeida

This study proposes a workflow for land use and land cover (LULC) classification of Advanced Land Observing Satellite-2 (ALOS-2) Phased Array type L-band Synthetic Aperture Radar-2 (PALSAR-2) images of the Brazilian tropical savanna (Cerrado) biome. The following LULC classes were considered: forestlands; shrublands; grasslands; reforestations; croplands; pasturelands; bare soils/straws; urban areas; and water reservoirs. The proposed approach combines polarimetric attributes, image segmentation, and machine-learning procedures. A set of 125 attributes was generated using polarimetric ALOS-2/PALSAR-2 images, including the van Zyl, Freeman–Durden, Yamaguchi, and Cloude–Pottier target decomposition components, incoherent polarimetric parameters (biomass indices and polarization ratios), and HH-, HV-, VH-, and VV-polarized amplitude images. These attributes were classified using the Naive Bayes (NB), DT J48 (DT = decision tree), Random Forest (RF), Multilayer Perceptron (MLP), and Support Vector Machine (SVM) algorithms. The RF, MLP, and SVM classifiers presented the most accurate performances. NB and DT J48 classifiers showed a lower performance in relation to the RF, MLP, and SVM. The DT J48 classifier was the most suitable algorithm for discriminating urban areas and natural vegetation cover. The proposed workflow can be replicated for other SAR images with different acquisition modes or for other types of vegetation domains.


Author(s):  
A.L. Kulikov ◽  
D.I. Bezdushnyi

The development of present-day power systems is associated with the wide use of digital technologies and intelligent algorithms in control and protection systems. It opens up new opportunities to improve relay protection and automation hardware and develop its design principles. Simulation modeling becomes a new tool not only for studying power systems operation but also for designing new relay protection methods. The use of simulation modeling in combination with machine learning algorithms makes it possible to create fundamentally new types of digital relay protections adaptable to a specific protected facility and able to use all the available current and voltage measurements to the fullest extent possible. Machine learning also allows developing auxiliary selective elements for improving the basic characteristics of existing relay protection algorithms such as selectivity, sensitivity, and speed of operation. The paper considers an example of designing an auxiliary element to provide selectivity in the backup zone of distance protection. The problem is solved using one of the most widely known machine learning techniques, i.e., the method of support vector machines (SVM).

