A Search Method for Optimal Band Combination of Hyperspectral Imagery Based on Two Layers Selection Strategy

Computational Intelligence and Neuroscience ◽

10.1155/2021/5592323 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Nian Chen ◽

Kezhong Lu ◽

Hao Zhou

Keyword(s):

Local Density ◽

Hyperspectral Imagery ◽

Selection Strategy ◽

Data Sets ◽

Optimal Subset ◽

Density Peaks ◽

Average Accuracy ◽

Target Set ◽

Two Phases ◽

Candidate Set

This paper proposes a band selection method based on a two layers selection (TLS) strategy, which forms an optimal subset from the set of all bands to reconstitute the original hyperspectral imagery (HSI) and aims to achieve better performance with fewer bands. As its name implies, TLS selects bands with low correlation and a large amount of information into the target set, achieving dimensionality reduction for HSI in two phases. First, the fast density peaks clustering (FDPC) algorithm is used to select the most representative node in each cluster to build a candidate set. During the implementation, we normalize the local density and relative distance and use a dynamic cutoff distance to weaken the influence of density, so that the selection is more likely to be carried out in scattered clusters than in high-density ones. After that, we conduct a further selection in the candidate set using the mRMR strategy and comprehensive measurement of information (CMI), and the eventual winners are selected into the target set. Compared with six other state-of-the-art unsupervised algorithms on three real-world HSI data sets, the results show that TLS can group bands with lower correlation and richer information and has clear advantages in overall accuracy (OA), average accuracy (AA), and Kappa coefficient.
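
The two quantities named in the abstract, local density and relative distance, can be illustrated with a minimal sketch of the commonly described density-peaks computation. This is not the authors' FDPC variant: the fixed `cutoff`, the min-max `normalize` helper, and the function names are assumptions made for illustration, and the dynamic cutoff distance from the abstract is not reproduced.

```python
import math

def density_peak_scores(points, cutoff):
    """Return (rho, delta) per point for a density-peaks style ranking."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        # Relative distance: distance to the nearest point of higher density.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser neighbour gets its largest distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

def normalize(values):
    """Min-max scale a list to [0, 1] (assumed normalization, for illustration)."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]
```

Points that score high on both normalized quantities are the usual candidates for cluster representatives; how the two are weighted against each other is a design choice this sketch leaves open.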


SICEM: A Generation Approach of Band Combination for Hyperspectral Imagery Reconstitution Based on Space and Information Analyses

Computational Intelligence and Neuroscience ◽

10.1155/2021/8178495 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Nian Chen ◽

Kezhong Lu ◽

Hao Zhou

Keyword(s):

Comprehensive Evaluation ◽

Hyperspectral Imagery ◽

Evaluation Model ◽

Optimal Subset ◽

Density Peaks ◽

Discrete Band ◽

Information Score ◽

Geometric Space ◽

Two Phases ◽

Unsupervised Algorithms

This paper proposes a band selection algorithm named the space and information comprehensive evaluation model (SICEM), which reconstitutes the hyperspectral imagery by building an optimal subset to replace the original spectrum. SICEM reduces the dimensions while keeping the vital information of an image, and it does so in two phases. First, the improved fast density peaks clustering (I-FDPC) algorithm is employed to pick out the scattered bands in geometric space to generate a candidate set U. Then, we prune U through iterative information analysis until the target set Ω is built. In this phase, we calculate a comprehensive information score (CIS) for every member of U after assigning weights to the amount of information (AoI) and correlation. In each iteration, the band with the highest score is selected into Ω, and the bands highly related to it are removed from U via a threshold. Compared with four state-of-the-art unsupervised algorithms on real-world HSI datasets (IndianP and PaviaU), we find that SICEM has a strong ability to form an optimal reduced-dimension combination with low correlation and rich information, and it performs well in discrete band distribution, accuracy, consistency, and stability.
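
The iterative pruning step described above (pick the highest-scoring band, then drop the bands too strongly related to it) can be sketched as a simple greedy loop. This is an illustrative sketch only: the `scores` dictionary stands in for the CIS values, whose actual formula and weights are not given in the abstract, and `corr`, `threshold`, and `k` are hypothetical inputs.

```python
def greedy_select(scores, corr, threshold, k):
    """Pick up to k bands by score, pruning bands too correlated with each pick.

    scores: dict mapping band index -> precomputed score (assumed given).
    corr:   square matrix (list of lists) of pairwise correlations.
    """
    candidates = set(scores)
    target = []
    while candidates and len(target) < k:
        # Highest score wins; sorting first makes ties deterministic.
        best = max(sorted(candidates), key=scores.get)
        target.append(best)
        candidates.discard(best)
        # Remove every remaining band whose correlation with the pick is too high.
        candidates = {b for b in candidates if abs(corr[best][b]) <= threshold}
    return target
```

With a lower threshold the loop removes more bands per iteration, so it may finish with fewer than `k` bands once the candidate set is exhausted.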


Do galactic bars depend on environment?: an information theoretic analysis of Galaxy Zoo 2

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/staa3665 ◽

2020 ◽

Vol 501 (1) ◽

pp. 994-1001

Author(s):

Suman Sarkar ◽

Biswajit Pandey ◽

Snehasish Bhattacharjee

Keyword(s):

Spatial Distribution ◽

Mutual Information ◽

Local Density ◽

Statistical Significance ◽

Distribution Functions ◽

Cumulative Distribution ◽

Host Galaxy ◽

Data Sets ◽

Data Set ◽

Information Theoretic

ABSTRACT We use an information theoretic framework to analyse data from the Galaxy Zoo 2 project and study if there are any statistically significant correlations between the presence of bars in spiral galaxies and their environment. We measure the mutual information between the barredness of galaxies and their environments in a volume limited sample (Mr ≤ −21) and compare it with the same in data sets where (i) the bar/unbar classifications are randomized and (ii) the spatial distribution of galaxies is shuffled on different length scales. We assess the statistical significance of the differences in the mutual information using a t-test and find that both randomization of morphological classifications and shuffling of spatial distribution do not alter the mutual information in a statistically significant way. The non-zero mutual information between the barredness and environment arises due to the finite and discrete nature of the data set and can be entirely explained by mock Poisson distributions. We also separately compare the cumulative distribution functions of the barred and unbarred galaxies as a function of their local density. Using a Kolmogorov–Smirnov test, we find that the null hypothesis cannot be rejected even at the 75 per cent confidence level. Our analysis indicates that environments do not play a significant role in the formation of a bar, which is largely determined by the internal processes of the host galaxy.
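
The central quantity here, the mutual information between two discrete variables, can be computed from simple counts. The sketch below is a generic plug-in estimator, not the authors' pipeline; the function name and the use of plain label sequences as inputs are assumptions.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) p(y))), written in terms of raw counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi
```

On a finite sample this estimator is generally non-zero even for unrelated variables, which is why a comparison against randomized or shuffled versions of the data, as the abstract describes, is needed before reading meaning into the value.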


Entropy Based Features Distribution for Anti-DDoS Model in SDN

Sustainability ◽

10.3390/su13031522 ◽

2021 ◽

Vol 13 (3) ◽

pp. 1522

Author(s):

Raja Majid Ali Ujjan ◽

Zeeshan Pervez ◽

Keshav Dahal ◽

Wajahat Ali Khan ◽

Asad Masood Khattak ◽

...

Keyword(s):

Network Security ◽

False Positive ◽

Denial Of Service ◽

Network Services ◽

Detection Accuracy ◽

Data Sets ◽

Traffic Patterns ◽

Average Accuracy ◽

Ddos Detection ◽

Quantitative Results

In modern network infrastructure, Distributed Denial of Service (DDoS) attacks are considered severe network security threats. For conventional network security tools, it is extremely difficult to distinguish between the high traffic volume of a DDoS attack and a large number of legitimate users accessing a targeted network service or resource. Although these attacks have been widely studied, few works collect and analyse truly representative characteristics of DDoS traffic. Current research mostly focuses on DDoS detection and mitigation with predefined DDoS data sets, which are often hard to generalise to various network services and legitimate users’ traffic patterns. To deal with considerably large DDoS traffic flows in Software Defined Networking (SDN), in this work we propose a fast and effective entropy-based DDoS detection approach. We deployed a generalised entropy calculation combining Shannon and Renyi entropy to identify distributed features of DDoS traffic; it also helped the SDN controller to deal effectively with heavy malicious traffic. To lower the network traffic overhead, we collected data-plane traffic with signature-based Snort detection. We then analysed the collected traffic for entropy-based features to improve the detection accuracy of deep learning models: Stacked Auto Encoder (SAE) and Convolutional Neural Network (CNN). This work also investigated the trade-off between SAE and CNN classifiers using accuracy and false-positive results. Quantitative results demonstrated that SAE achieved a relatively higher detection accuracy of 94% with only 6% false-positive alerts, whereas the CNN classifier achieved an average accuracy of 93%.
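
The two entropy measures named in the abstract can be computed over any observed traffic feature, such as the destination addresses in a time window. The sketch below shows only the standard Shannon and Renyi definitions; how the paper combines them is not stated in the abstract, and the function names and input format are assumptions.

```python
import math
from collections import Counter

def shannon_entropy(items):
    """Shannon entropy in bits of the empirical distribution of items."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

def renyi_entropy(items, alpha):
    """Renyi entropy of order alpha in bits; alpha == 1 falls back to Shannon."""
    if alpha == 1:
        return shannon_entropy(items)
    n = len(items)
    power_sum = sum((c / n) ** alpha for c in Counter(items).values())
    return math.log2(power_sum) / (1 - alpha)
```

A window in which nearly all packets target one address yields entropy close to zero, while evenly spread traffic yields a high value, which is the kind of contrast an entropy-based feature is meant to expose.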


Self-Adaptive K-Means Based on a Covering Algorithm

Complexity ◽

10.1155/2018/7698274 ◽

2018 ◽

Vol 2018 ◽

pp. 1-16 ◽

Cited By ~ 1

Author(s):

Yiwen Zhang ◽

Yuanyuan Zhou ◽

Xing Guo ◽

Jintao Wu ◽

Qiang He ◽

...

Keyword(s):

Large Scale ◽

Clustering Algorithm ◽

Real Data ◽

Second Phase ◽

Data Sets ◽

Number Of Clusters ◽

Large Scale Data ◽

Long Time ◽

Two Phases ◽

Selection Of

The K-means algorithm is one of the ten classic algorithms in the area of data mining and has been studied by researchers in numerous fields for a long time. However, the value of the clustering number k in the K-means algorithm is not always easy to determine, and the selection of the initial centers is vulnerable to outliers. This paper proposes an improved K-means clustering algorithm called the covering K-means algorithm (C-K-means). The C-K-means algorithm can not only produce efficient and accurate clustering results but also self-adaptively provide a reasonable number of clusters based on the data features. It includes two phases: the initialization of the covering algorithm (CA) and the Lloyd iteration of the K-means. The first phase executes the CA. CA self-organizes and recognizes the number of clusters k based on the similarities in the data, and it requires neither the number of clusters to be prespecified nor the initial centers to be manually selected. Therefore, it has a “blind” feature, that is, k is not preselected. The second phase performs the Lloyd iteration based on the results of the first phase. The C-K-means algorithm combines the advantages of CA and K-means. Experiments are carried out on the Spark platform, and the results verify the good scalability of the C-K-means algorithm. This algorithm can effectively solve the problem of large-scale data clustering. Extensive experiments on real data sets show that the accuracy and efficiency of the C-K-means algorithm outperform those of the existing algorithms under both sequential and parallel conditions.
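
The second phase, the Lloyd iteration, can be sketched on its own once initial centers are available. In the sketch below the initial centers are simply passed in; the covering algorithm that produces them in the paper is not reproduced, and the function name and `max_iter` parameter are assumptions.

```python
import math

def lloyd(points, centers, max_iter=100):
    """Run Lloyd iterations from given initial centers; return (centers, clusters)."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        # Update step: each center moves to the mean of its cluster.
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # Converged: assignments can no longer change.
        centers = new_centers
    return centers, clusters
```

Because the number of centers passed in fixes k, any method that proposes both k and the starting positions can be plugged in ahead of this loop.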


Structure and Electrochemical Potential Simulation for the Cathode Material Li1+xV3O8

MRS Proceedings ◽

10.1557/proc-496-115 ◽

1997 ◽

Vol 496 ◽

Cited By ~ 1

Author(s):

R. Benedek ◽

M. M. Thackeray ◽

L. H. Yang

Keyword(s):

Density Functional ◽

Structural Information ◽

Local Density ◽

Tetrahedral Site ◽

Low Energy ◽

Electrochemical Potential ◽

X Ray Diffraction ◽

X Ray ◽

Local Density Functional Theory ◽

Two Phases

ABSTRACT The structure and electrochemical potential of monoclinic Li1+xV3O8 were calculated within the local-density-functional-theory framework by use of plane-wave-pseudopotential methods. Special attention was given to the compositions 1+x=1.2 and 1+x=4, for which x-ray diffraction structure refinements are available. The calculated low-energy configuration for 1+x=4 is consistent with the three Li sites identified in x-ray diffraction measurements and predicts the position of the unobserved Li. The location of the tetrahedrally coordinated Li in the calculated low-energy configuration for 1+x=1.5 is consistent with the structure measured by x-ray diffraction for Li1.2V3O8. Calculations were also performed for the two monoclinic phases at intermediate Li compositions, for which no structural information is available. Calculations at these compositions are based on hypothetical Li configurations suggested by the ordering of vacancy energies for Li4V3O8 and tetrahedral site energies in Li1.5V3O8. The internal energy curves for the two phases cross near 1+x=3. Predicted electrochemical potential curves agree well with experiment.


Interpreting the Data: Parallel Analysis with Sawzall

Scientific Programming ◽

10.1155/2005/962135 ◽

2005 ◽

Vol 13 (4) ◽

pp. 277-298 ◽

Cited By ~ 217

Author(s):

Rob Pike ◽

Sean Dorward ◽

Robert Griesemer ◽

Sean Quinlan

Keyword(s):

Programming Language ◽

Regular Structure ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Data Parallel ◽

Web Document ◽

Distributed Computations ◽

Procedural Programming ◽

Two Phases

Very large data sets often have a flat but regular structure and span multiple disks and machines. Examples include telephone call records, network logs, and web document repositories. These large data sets are not amenable to study using traditional database techniques, if only because they can be too large to fit in a single relational database. On the other hand, many of the analyses done on them can be expressed using simple, easily distributed computations: filtering, aggregation, extraction of statistics, and so on. We present a system for automating such analyses. A filtering phase, in which a query is expressed using a new procedural programming language, emits data to an aggregation phase. Both phases are distributed over hundreds or even thousands of computers. The results are then collated and saved to a file. The design – including the separation into two phases, the form of the programming language, and the properties of the aggregators – exploits the parallelism inherent in having data and computation distributed across many machines.
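
The two-phase structure described above can be illustrated with a small sketch. It is written in Python rather than in the system's own language, and the record layout (host, byte count), the key names, and the function names are all hypothetical; the point is only that the filter sees one record at a time and the aggregator combines whatever was emitted.

```python
from collections import defaultdict

def filter_phase(record):
    """Examine one record in isolation and emit (key, value) pairs."""
    host, nbytes = record
    yield ("requests", 1)
    yield ("bytes:" + host, nbytes)

def aggregate_phase(emitted):
    """Combine emitted values per key; here every aggregator is a sum."""
    totals = defaultdict(int)
    for key, value in emitted:
        totals[key] += value
    return dict(totals)

def run(shards):
    # Each shard could be filtered on a different machine; here they run in sequence.
    emitted = (pair
               for shard in shards
               for record in shard
               for pair in filter_phase(record))
    return aggregate_phase(emitted)
```

Because the filter keeps no state between records and summation does not depend on order, the shards can be processed independently and merged afterwards, which is the property the abstract attributes to the two-phase design.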


A novel density peaks clustering with sensitivity of local density and density-adaptive metric

Knowledge and Information Systems ◽

10.1007/s10115-018-1189-7 ◽

2018 ◽

Vol 59 (2) ◽

pp. 285-309 ◽

Cited By ~ 12

Author(s):

Mingjing Du ◽

Shifei Ding ◽

Yu Xue ◽

Zhongzhi Shi

Keyword(s):

Local Density ◽

Density Peaks ◽

Density Peaks Clustering ◽

Adaptive Metric


A Fast Method for Estimating the Number of Clusters Based on Score and the Minimum Distance of the Center Point

Information ◽

10.3390/info11010016 ◽

2019 ◽

Vol 11 (1) ◽

pp. 16

Author(s):

Zhenzhen He ◽

Zongpu Jia ◽

Xiaohong Zhang

Keyword(s):

Minimum Distance ◽

Learning Algorithm ◽

Kernel Density ◽

Gaussian Kernel ◽

Data Sets ◽

Fast Method ◽

Number Of Clusters ◽

Center Point ◽

Estimation Function ◽

Candidate Set

Clustering is widely used as an unsupervised learning algorithm. However, it is often necessary to manually enter the number of clusters, and the number of clusters has a great impact on the clustering effect. Researchers have proposed some algorithms to determine the number of clusters, but the results are not very good for data sets with complex and scattered shapes. To solve these problems, this paper proposes using the Gaussian kernel density estimation function to determine the maximum number of clusters, using the change of the center point score to get the candidate set of center points, and further using the change of the minimum distance between center points to get the number of clusters. The experiment shows the validity and practicability of the proposed algorithm.


A summary of observational records on periodicities above the rotational period in the Jovian magnetosphere

Annales Geophysicae ◽

10.5194/angeo-27-2565-2009 ◽

2009 ◽

Vol 27 (6) ◽

pp. 2565-2573 ◽

Cited By ~ 19

Author(s):

E. A. Kronberg ◽

J. Woch ◽

N. Krupp ◽

A. Lagg

Keyword(s):

Data Sets ◽

Mass Loading ◽

Triggering Mechanism ◽

Release Process ◽

Jovian Magnetosphere ◽

Planetary Rotation ◽

Mass Release ◽

Two Phases ◽

Associated Mass

Abstract. The Jovian magnetosphere is a very dynamic system. The plasma mass-loading from the moon Io and the fast planetary rotation lead to regular release of mass from the Jovian magnetosphere and to a change of the magnetic topology. These regular variations, most commonly on a time scale of several (2.5–4) days, were derived from various data sets obtained by different spacecraft missions and instruments, ranging from auroral images to in situ measurements of magnetospheric particles. Specifically, ion measurements from the Galileo spacecraft show the periodicities very distinctly, namely the periodic thinning of the plasma sheet and subsequent dipolarization, with explosive mass release occurring mainly during the transition between these two phases. We present a review of these periodicities, particularly concentrating on those observed in energetic particle data. The most distinct periodicities are observed for ions of sulfur and oxygen. The periodic topological change of the Jovian magnetosphere, the associated mass-release process and auroral signatures can be interpreted as a global magnetospheric instability with analogies to the two-step concept of terrestrial substorms. Different views on the triggering mechanism of this magnetospheric instability are discussed.


An Efficient Clustering Method for Hyperspectral Optimal Band Selection via Shared Nearest Neighbor

Remote Sensing ◽

10.3390/rs11030350 ◽

2019 ◽

Vol 11 (3) ◽

pp. 350 ◽

Cited By ~ 7

Author(s):

Qiang Li ◽

Qi Wang ◽

Xuelong Li

Keyword(s):

Nearest Neighbor ◽

Hyperspectral Image ◽

Local Density ◽

Computational Time ◽

Band Selection ◽

Data Sets ◽

Selection Methods ◽

Clustering Method ◽

Slope Change ◽

Shared Nearest Neighbor

A hyperspectral image (HSI) has many bands, which leads to high correlation between adjacent bands, so it is necessary to find representative subsets before further analysis. To address this issue, band selection is considered an effective approach that removes redundant bands from the HSI. Recently, many band selection methods have been proposed, but the majority of them have extremely poor accuracy with a small number of bands and require multiple iterations, which does not meet the purpose of band selection. Therefore, we propose an efficient clustering method based on shared nearest neighbor (SNNC) for hyperspectral optimal band selection, claiming the following contributions: (1) the local density of each band is obtained by shared nearest neighbor, which can more accurately reflect the local distribution characteristics; (2) in order to acquire a band subset containing a large amount of information, the information entropy is taken as one of the weight factors; (3) a method for automatically selecting the optimal band subset is designed based on the slope change. The experimental results reveal that, compared with other methods, the proposed method has competitive computational time and the selected bands achieve higher overall classification accuracy on different data sets, especially when the number of bands is small.
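
The shared nearest neighbor idea behind contribution (1) can be sketched as counting how many of their k nearest neighbors two items have in common. This shows only that basic count; the paper's actual local density formula, its entropy weighting, and the slope-based selection are not reproduced, and the function names and the use of Euclidean distance are assumptions.

```python
import math

def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return set(others[:k])

def shared_neighbor_counts(points, k):
    """Matrix whose entry [i][j] is the number of k-nearest neighbours i and j share."""
    n = len(points)
    neighbors = [knn(points, i, k) for i in range(n)]
    return [
        [len(neighbors[i] & neighbors[j]) if i != j else 0 for j in range(n)]
        for i in range(n)
    ]
```

Two items in the same tight group tend to share many neighbors, while items in different groups share few or none, so the count reflects local structure rather than raw distance alone.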
