A Comprehensive Evaluation of Approaches for Built-Up Area Extraction from Landsat OLI Images Using Massive Samples

Tao Zhang; Hong Tang

doi:10.3390/rs11010002

A Comprehensive Evaluation of Approaches for Built-Up Area Extraction from Landsat OLI Images Using Massive Samples

Remote Sensing ◽

10.3390/rs11010002 ◽

2018 ◽

Vol 11 (1) ◽

pp. 2 ◽

Cited By ~ 11

Author(s):

Tao Zhang ◽

Hong Tang

Keyword(s):

Learning Strategies ◽

Classification Accuracy ◽

Feature Learning ◽

Automatic Generation ◽

Experimental Results ◽

Support Vector ◽

Feature Engineering ◽

Classification Algorithms ◽

Sample Points ◽

Better Than

Detailed information about built-up areas is valuable for mapping complex urban environments. Although a large number of classification algorithms for such areas have been developed, they are rarely tested from the perspective of feature engineering and feature learning. Therefore, we launched an investigation to provide a full test of Operational Land Imager (OLI) imagery for 15-m resolution built-up area classification in 2015, in Beijing, China. Training a classifier requires many sample points, and we proposed a method based on the European Space Agency’s (ESA) 38-m global built-up area data of 2014, OpenStreetMap, and MOD13Q1-NDVI to achieve the rapid and automatic generation of a large number of sample points. Our aim was to examine the influence of a single pixel and an image patch under traditional feature engineering and modern feature learning strategies. In feature engineering, we consider spectra, shape, and texture as the input features, and support vector machine (SVM), random forest (RF), and AdaBoost as the classification algorithms. In feature learning, the convolutional neural network (CNN) is used as the classification algorithm. In total, 26 built-up land cover maps were produced. The experimental results show the following: (1) The approaches based on feature learning are generally better than those based on feature engineering in terms of classification accuracy, and the performance of ensemble classifiers (e.g., RF) is comparable to that of CNN. The two-dimensional CNN and the 7-neighborhood RF have the highest classification accuracies, at nearly 91%; (2) Overall, the classification effect and accuracy based on image patches are better than those based on single pixels. The features that can highlight the information of the target category (e.g., PanTex (texture-derived built-up presence index) and enhanced morphological building index (EMBI)) can help improve classification accuracy.
The code and experimental results are available at https://github.com/zhangtao151820/CompareMethod.
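
The contrast between single-pixel and image-patch inputs described above can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function names are hypothetical, the "7-neighborhood" is assumed here to mean a 7 x 7 window centered on the pixel, and edge clamping is an arbitrary choice for pixels near the image border.

```python
def extract_patch(image, row, col, size=7):
    """Return the size x size neighborhood centered on (row, col).

    `image` is a list of rows (a 2-D grid of pixel values). Positions that
    fall outside the image are filled by clamping to the nearest edge pixel
    (an assumption made for this sketch).
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the patch has a center pixel")
    half = size // 2
    n_rows, n_cols = len(image), len(image[0])
    patch = []
    for r in range(row - half, row + half + 1):
        rr = min(max(r, 0), n_rows - 1)
        patch.append(
            [image[rr][min(max(c, 0), n_cols - 1)]
             for c in range(col - half, col + half + 1)]
        )
    return patch


def pixel_feature(image, row, col):
    """Single-pixel input: only the value at (row, col)."""
    return [image[row][col]]


def patch_feature(image, row, col, size=7):
    """Image-patch input: the flattened neighborhood around (row, col)."""
    return [v for line in extract_patch(image, row, col, size) for v in line]


if __name__ == "__main__":
    band = [[r * 10 + c for c in range(10)] for r in range(10)]
    print(len(pixel_feature(band, 5, 5)), len(patch_feature(band, 5, 5)))
```

A classifier trained on `patch_feature` vectors sees spatial context that `pixel_feature` cannot provide, which is the comparison the abstract reports on.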


A Comprehensive Evaluation of Approaches for Built-up Area Extraction from Landsat OLI Images Using Massive Samples

10.20944/preprints201812.0067.v1 ◽

2018 ◽

Author(s):

Tao Zhang ◽

Hong Tang

Keyword(s):

Learning Strategies ◽

Classification Accuracy ◽

Comprehensive Evaluation ◽

Feature Learning ◽

Automatic Generation ◽

Feature Engineering ◽

Classification Algorithms ◽

Ensemble Classifiers ◽

Sample Points ◽

Better Than

Detailed built-up area information is valuable for mapping complex urban environments. Although a large number of classification algorithms for built-up areas have been developed, they are rarely tested from the perspective of feature engineering and feature learning. Therefore, we launched an investigation to provide a full test of OLI imagery for 15-m resolution built-up area classification in 2015, in Beijing, China. Training a classifier requires many sample points, and we propose a method based on the ESA's 38-meter global built-up area data of 2014, OpenStreetMap, and MOD13Q1-NDVI to achieve rapid and automatic generation of a large number of sample points. Our aim is to examine the influence of a single pixel and an image patch under traditional feature engineering and modern feature learning strategies. In feature engineering, we consider spectra, shape, and texture as the input features, and SVM, random forest (RF), and AdaBoost as the classification algorithms. In feature learning, the convolutional neural network (CNN) is used as the classification algorithm. In total, 26 built-up land cover maps were produced. Experimental results show that: (1) the approaches based on feature learning are generally better than those based on feature engineering in terms of classification accuracy, and the performance of ensemble classifiers (e.g., RF) is comparable to that of CNN. The two-dimensional CNN and the 7-neighborhood RF have the highest classification accuracy, at nearly 91%. (2) Overall, the classification effect and accuracy based on image patches are better than those based on single pixels. The features that can highlight the information of the target category (for example, PanTex and EMBI) can help improve classification accuracy.


A Computational Method for the Identification of Endolysins and Autolysins

Protein and Peptide Letters ◽

10.2174/0929866526666191002104735 ◽

2020 ◽

Vol 27 (4) ◽

pp. 329-336 ◽

Cited By ~ 1

Author(s):

Lei Xu ◽

Guangmin Liang ◽

Baowen Chen ◽

Xu Tan ◽

Huaikun Xiang ◽

...

Keyword(s):

Support Vector Machine ◽

Cell Wall ◽

Experimental Results ◽

Computational Method ◽

Lytic Enzyme ◽

Support Vector ◽

Lytic Enzymes ◽

Data Set ◽

Optimal Feature ◽

Better Than

Background: Cell lytic enzymes are highly evolved proteins that can destroy the cell structure and kill bacteria. Unlike antibiotics, cell lytic enzymes do not cause a serious problem of drug resistance in pathogenic bacteria. The study of cell wall lytic enzymes therefore aims at finding an efficient way to cure bacterial infections. With the use of antibiotics, the problem of drug resistance has become more serious, so using cell lytic enzymes is a good choice for curing bacterial infections. Cell lytic enzymes include endolysins and autolysins, and the difference between them is the purpose of breaking the cell wall. Identifying the type of a cell lytic enzyme is meaningful for the study of cell wall enzymes. Objective: In this article, our motivation is to predict the type of cell lytic enzyme. Cell lytic enzymes are helpful for killing bacteria, so it is meaningful to study their type. However, it is time-consuming to determine the type of a cell lytic enzyme by experimental methods. Thus, an efficient computational method for predicting the type of cell lytic enzyme is proposed in our work. Method: We propose a computational method for the prediction of endolysins and autolysins. First, a data set containing 27 endolysins and 41 autolysins is built. Then each protein is represented by its tripeptide composition. The features with a larger confidence degree are selected. Finally, the classifier is trained on the labeled vectors using a support vector machine. The learned classifier is used to predict the type of cell lytic enzyme. Results: Following the proposed method, the experimental results show that the overall accuracy can attain 97.06% when 44 features are selected. Compared with Ding's method, our method improves the overall accuracy by nearly 4.5% ((97.06-92.9)/92.9). The performance of our proposed method is stable when the number of selected features is between 40 and 70. The overall accuracy of the tripeptide optimal feature set is 94.12%, and the overall accuracy of Chou's amphiphilic PseAAC method is 76.2%. The experimental results also demonstrate that the overall accuracy is improved by nearly 18% when using the tripeptide optimal feature set. Conclusion: The paper proposes an efficient method for identifying endolysins and autolysins. In this paper, a support vector machine is used to predict the type of cell lytic enzyme. The experimental results show that the overall accuracy of the proposed method is 94.12%, which is better than some existing methods. In conclusion, the selected 44 features can improve the overall accuracy for identifying the type of cell lytic enzyme. The support vector machine performs better than other classifiers when using the selected feature set on the benchmark data set.
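
The tripeptide composition representation mentioned in the Method section can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes the 20 standard amino acids (giving 20^3 = 8000 features), normalizes counts by the number of valid overlapping windows, and skips windows containing non-standard residues. The confidence-based feature selection and the SVM training step are not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
INDEX = {tri: i for i, tri in enumerate(TRIPEPTIDES)}


def tripeptide_composition(sequence):
    """Return an 8000-dimensional vector of tripeptide frequencies.

    Each entry is the fraction of overlapping length-3 windows in
    `sequence` equal to that tripeptide. Windows with characters outside
    the 20 standard residues are ignored (an assumption of this sketch).
    """
    vector = [0.0] * len(TRIPEPTIDES)
    windows = [sequence[i:i + 3] for i in range(len(sequence) - 2)]
    valid = [w for w in windows if w in INDEX]
    for w in valid:
        vector[INDEX[w]] += 1.0
    if valid:
        vector = [v / len(valid) for v in vector]
    return vector


if __name__ == "__main__":
    features = tripeptide_composition("ACDACD")
    print(len(features), features[INDEX["ACD"]])
```

With only 68 proteins and 8000 candidate features, some form of feature selection is needed before training, which is consistent with the abstract's reduction to 44 features.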


Using the Relevance Vector Machine Model Combined with Local Phase Quantization to Predict Protein-Protein Interactions from Protein Sequences

BioMed Research International ◽

10.1155/2016/4783801 ◽

2016 ◽

Vol 2016 ◽

pp. 1-9 ◽

Cited By ~ 13

Author(s):

Ji-Yong An ◽

Fan-Rong Meng ◽

Zhu-Hong You ◽

Yu-Hong Fang ◽

Yu-Jun Zhao ◽

...

Keyword(s):

Protein Sequences ◽

Relevance Vector Machine ◽

Experimental Results ◽

Computational Method ◽

Support Vector ◽

Svm Classifier ◽

Local Phase ◽

Local Phase Quantization ◽

Phase Quantization ◽

Better Than

We propose a novel computational method known as RVM-LPQ that combines the Relevance Vector Machine (RVM) model and Local Phase Quantization (LPQ) to predict PPIs from protein sequences. The main improvements are the results of representing protein sequences using the LPQ feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using a Principal Component Analysis (PCA), and using a Relevance Vector Machine (RVM) based classifier. We perform 5-fold cross-validation experiments on Yeast and Human datasets, and we achieve very high accuracies of 92.65% and 97.62%, respectively, which is significantly better than previous works. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the Yeast dataset. The experimental results demonstrate that our RVM-LPQ method is obviously better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool for future proteomics research.
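
The 5-fold cross-validation protocol mentioned above can be sketched in a few lines. The function name, the fixed seed, and the round-robin fold assignment are assumptions of this sketch; the abstract does not say how the folds were drawn.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Samples are shuffled once, then dealt into k folds of near-equal size.
    Each fold serves as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(10, k=5):
        print(sorted(test), len(train))
```

The reported accuracy would then be the average over the five held-out folds.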


Analysing Temporal Effects on Classification of SAR and Optical Images

10.5194/egusphere-egu21-14386 ◽

2021 ◽

Author(s):

Ahmet Batuhan Polat ◽

Ozgun Akcay ◽

Fusun Balik Sanli

Keyword(s):

Rural Areas ◽

Classification Accuracy ◽

Winter Season ◽

Support Vector ◽

Classification Algorithms ◽

Sar Images ◽

Optical Images ◽

Object Based ◽

Training Samples

Obtaining high accuracy in land cover classification is a non-trivial problem in geosciences for monitoring urban and rural areas. In this study, different classification algorithms were tested with different types of data, and the effects of seasonal changes on these algorithms and on the data used were investigated. In addition, the effect of increasing the number of training samples on classification accuracy was examined. Sentinel-1 Synthetic Aperture Radar (SAR) images and Sentinel-2 multispectral optical images were used as datasets. An object-based approach was used for the classification of various fused image combinations. The Support Vector Machines (SVM), Random Forest (RF), and k-Nearest Neighbors (kNN) classification algorithms were used for this process. In addition, the Normalized Difference Vegetation Index (NDVI) was examined separately to determine its exact contribution to the classification accuracy. The overall accuracies were then compared by classifying the fused data generated by combining optical and SAR images. It was determined that increasing the number of training samples improves the classification accuracy. Moreover, the object-based classification obtained from single SAR imagery produced the lowest classification accuracy among the dataset combinations used in this study. In addition, it was shown that NDVI data does not increase the accuracy of the classification in the winter season, as the trees shed their leaves due to climate conditions.
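
For reference, NDVI is computed per pixel from the near-infrared and red reflectances as (NIR - Red) / (NIR + Red). The sketch below is illustrative only; returning 0.0 when both reflectances are zero is an assumption made here to avoid division by zero, and for Sentinel-2 the inputs would typically be bands B8 (NIR) and B4 (red).

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    `nir` and `red` are reflectance values. Returns a value in [-1, 1]
    for non-negative inputs; a pixel with zero total reflectance is
    mapped to 0.0 (a convention chosen for this sketch).
    """
    denominator = nir + red
    if denominator == 0:
        return 0.0
    return (nir - red) / denominator


if __name__ == "__main__":
    print(ndvi(0.5, 0.1))   # dense green vegetation gives a high value
    print(ndvi(0.2, 0.18))  # leafless canopy gives a value near zero
```

Leaf-off trees reflect little extra near-infrared light, so their NDVI approaches that of bare surfaces, which fits the winter result reported in the abstract.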


Combining Block DCV and Support Vector Machine for Ear Recognition

International Journal of Interdisciplinary Telecommunications and Networking ◽

10.4018/ijitn.2016040104 ◽

2016 ◽

Vol 8 (2) ◽

pp. 36-44 ◽

Cited By ~ 1

Author(s):

Zhao Hailong ◽

Yi Junyan

Keyword(s):

Support Vector Machine ◽

Feature Extraction ◽

Image Retrieval ◽

Construction Method ◽

Experimental Results ◽

Support Vector ◽

Recognition Method ◽

Ear Recognition ◽

Human Ear ◽

Better Than

In recent years, automatic ear recognition has become a popular research topic. Effective feature extraction is one of the most important steps in content-based ear image retrieval applications. In this paper, the authors propose a new vector construction method for ear retrieval based on the Block Discriminative Common Vector (DCV). In this method, the ear image is first divided into 16 blocks, and the features are extracted by applying DCV to the sub-images. A Support Vector Machine is then used as the classifier to make the decision. The experimental results show that the proposed method performs better than classical PCA+LDA, so it is an effective human ear recognition method.
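
The block division step can be sketched as below. The abstract only states that the image is divided into 16 blocks; a 4 x 4 grid of equal, non-overlapping blocks is assumed here, and the function name is hypothetical. The DCV feature extraction applied to each block is not shown.

```python
def split_into_blocks(image, grid=4):
    """Split a 2-D image (list of rows) into grid x grid equal blocks.

    Blocks are returned in row-major order. The image height and width
    must be divisible by `grid`.
    """
    n_rows, n_cols = len(image), len(image[0])
    if n_rows % grid or n_cols % grid:
        raise ValueError("image dimensions must be divisible by grid")
    block_h, block_w = n_rows // grid, n_cols // grid
    blocks = []
    for br in range(grid):
        for bc in range(grid):
            rows = image[br * block_h:(br + 1) * block_h]
            blocks.append([row[bc * block_w:(bc + 1) * block_w] for row in rows])
    return blocks


if __name__ == "__main__":
    img = [[r * 8 + c for c in range(8)] for r in range(8)]
    print(len(split_into_blocks(img)))
```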


AN EVALUATION ON PERFORMANCE OF PCA IN FACE RECOGNITION WITH EXPRESSION VARIATIONS

Scientific Journal of Tra Vinh University ◽

10.35382/18594816.1.30.2018.19 ◽

2018 ◽

Vol 1 (30) ◽

pp. 61-66

Author(s):

Khanh Ngoc Van Duong ◽

An Bao Nguyen

Keyword(s):

Face Recognition ◽

Facial Expression ◽

Mouth Opening ◽

Experimental Results ◽

Classification Algorithms ◽

Test Cases ◽

Wide Mouth ◽

Better Than

Appearance-based recognition methods often encounter difficulties when the input images contain facial expression variations such as laughing, crying or wide mouth opening. In these cases, holistic methods give better performance than appearance-based methods. This paper presents an evaluation of face recognition under variation of facial expression using the combination of PCA and classification algorithms. The experimental results showed that the best accuracy can be obtained with very few eigenvectors and that the KNN algorithm (with k=1) performs better than SVM in most test cases.
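
The KNN classifier with k=1 reduces to returning the label of the single closest training vector. The sketch below assumes Euclidean distance and that the inputs are already PCA-projected feature vectors; neither detail is stated in the abstract, and the function name is hypothetical.

```python
import math


def nearest_neighbor_predict(train_vectors, train_labels, query):
    """Return the label of the training vector closest to `query`.

    Distance is Euclidean (an assumption of this sketch). Ties are
    resolved in favor of the earliest training vector.
    """
    best_label, best_distance = None, math.inf
    for vector, label in zip(train_vectors, train_labels):
        distance = math.dist(vector, query)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


if __name__ == "__main__":
    faces = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    people = ["a", "a", "b"]
    print(nearest_neighbor_predict(faces, people, [4.0, 4.5]))
```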


Earth remote sensing imagery classification using a multi-sensor super-resolution fusion algorithm

Computer Optics ◽

10.18287/2412-6179-co-735 ◽

2020 ◽

Vol 44 (4) ◽

pp. 627-635

Author(s):

A.M. Belov ◽

A.Y. Denisova

Keyword(s):

Remote Sensing ◽

Support Vector Machines ◽

Random Forest ◽

Classification Accuracy ◽

Support Vector ◽

Random Forest Algorithm ◽

Earth Remote Sensing ◽

Vector Machines ◽

Fused Image ◽

Better Than

Earth remote sensing data fusion is intended to produce images of higher quality than the original ones. However, the fusion impact on further thematic processing remains an open question because fusion methods are mostly used to improve the visual data representation. This article addresses the effect of fusion that increases the spatial and spectral resolution of data on thematic classification of images using various state-of-the-art classifiers and feature extraction methods. In this paper, we use our own algorithm to perform multi-frame image fusion over optical remote sensing images with different spatial and spectral resolutions. For classification, we applied support vector machines and Random Forest algorithms. For features, we used spectral channels, extended attribute profiles and local feature attribute profiles. An experimental study was carried out using model images of four imaging systems. The resulting image had a spatial resolution 2, 3, 4 and 5 times better than the original images of each imaging system, respectively. Our studies revealed that for the support vector machines method, fusion was inexpedient since excessive spatial details had a negative effect on the classification. For the Random Forest algorithm, the classification results of a fused image were more accurate than for the original low-resolution images in 90% of cases. For example, for images with the smallest difference in spatial resolution (2 times) from the fusion result, the classification accuracy of the fused image was on average 4% higher. In addition, the results obtained for the Random Forest algorithm with fusion were better than the results for the support vector machines method without fusion. Additionally, it was shown that the classification accuracy of a fused image using the Random Forest method could be increased by an average of 9% due to the use of extended attribute profiles as features. Thus, when using data fusion, it is better to use the Random Forest classifier, whereas using fusion with the support vector machines method is not recommended.


Application of ACO-SVM in Chinese Text Feature Selection

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.155-156.770 ◽

2012 ◽

Vol 155-156 ◽

pp. 770-775

Author(s):

Jin Xiu Cui ◽

Xiao Xia Huang

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Chinese Text ◽

Classification Accuracy ◽

Considerable Increase ◽

Experimental Results ◽

Support Vector ◽

Text Feature

The algorithm proposed in this paper applies ant colony optimization (ACO) in combination with a support vector machine (SVM) to Chinese text feature selection. Finally, it obtains a classifier model for each category. The experimental results show that the proposed method is feasible and leads to a considerable increase in classification accuracy.


Personal Identification Based on Vectorcardiogram Derived from Limb Leads Electrocardiogram

Journal of Applied Mathematics ◽

10.1155/2012/904905 ◽

2012 ◽

Vol 2012 ◽

pp. 1-12 ◽

Cited By ~ 10

Author(s):

Jongshill Lee ◽

Youngjoon Chee ◽

Inyoung Kim

Keyword(s):

Support Vector Machine ◽

Classification Accuracy ◽

Personal Identification ◽

Experimental Results ◽

New Method ◽

Support Vector ◽

Inverse Transform ◽

Transform Matrix ◽

Electrocardiogram Ecg

We propose a new method for personal identification using the derived vectorcardiogram (dVCG), which is derived from the limb leads electrocardiogram (ECG). The dVCG was calculated from the standard limb leads ECG using the precalculated inverse transform matrix. Twenty-one features were extracted from the dVCG, and some or all of these 21 features were used in support vector machine (SVM) learning and in tests. The classification accuracy was 99.53%, which is similar to the previous dVCG analysis using the standard 12-lead ECG. Our experimental results show that it is possible to identify a person by features extracted from a dVCG derived from limb leads only. Hence, only three electrodes have to be attached to the person to be identified, which can reduce the effort required to connect electrodes and calculate the dVCG.


Identifying e-Commerce in Enterprises by means of Text Mining and Classification Algorithms

Mathematical Problems in Engineering ◽

10.1155/2018/7231920 ◽

2018 ◽

Vol 2018 ◽

pp. 1-8 ◽

Cited By ~ 3

Author(s):

Gianpiero Bianchi ◽

Renato Bruni ◽

Francesco Scalfati

Keyword(s):

Text Mining ◽

Classification Problem ◽

Automatic Generation ◽

Support Vector ◽

Classification Algorithms ◽

Practical Case ◽

Difficult Case ◽

Similar Data ◽

Basic Task ◽

Corporate Websites

Monitoring specific features of the enterprises, for example, the adoption of e-commerce, is an important and basic task for several economic activities. This type of information is usually obtained by means of surveys, which are costly due to the amount of personnel involved in the task. An automatic detection of this information would allow consistent savings. This can actually be performed by relying on computer engineering, since in general this information is publicly available on-line through the corporate websites. This work describes how to convert the detection of e-commerce into a supervised classification problem, where each record is obtained from the automatic analysis of one corporate website, and the class is the presence or the absence of e-commerce facilities. The automatic generation of similar data records requires the use of several Text Mining phases; in particular we compare six strategies based on the selection of best words and best n-grams. After this, we classify the obtained dataset by means of four classification algorithms: Support Vector Machines; Random Forest; Statistical and Logical Analysis of Data; Logistic Classifier. This turns out to be a difficult case of classification problem. However, after a careful design and set-up of the whole procedure, the results on a practical case of Italian enterprises are encouraging.
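
The n-gram step of such a text mining pipeline can be sketched as follows. The abstract does not say how the "best" words and n-grams are chosen, so this sketch simply ranks n-grams by document frequency; the whitespace tokenization and the function names are likewise assumptions.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the list of word n-grams in `text` (lowercased, split on whitespace)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(documents, n, k):
    """Return the k n-grams that occur in the most documents.

    Each n-gram is counted at most once per document (document frequency).
    The ordering among n-grams with equal counts is not specified.
    """
    counts = Counter()
    for doc in documents:
        for gram in set(word_ngrams(doc, n)):
            counts[gram] += 1
    return [gram for gram, _ in counts.most_common(k)]


if __name__ == "__main__":
    pages = ["Add to cart", "Add to cart today", "Contact us"]
    print(top_ngrams(pages, n=2, k=2))
```

Each website would then be encoded as a record indicating which of the selected n-grams it contains, with the presence or absence of e-commerce as the class label.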
