Identifying strong lenses with unsupervised machine learning using convolutional autoencoder

2020 ◽  
Vol 494 (3) ◽  
pp. 3750-3765 ◽  
Author(s):  
Ting-Yun Cheng ◽  
Nan Li ◽  
Christopher J Conselice ◽  
Alfonso Aragón-Salamanca ◽  
Simon Dye ◽  
...  

ABSTRACT In this paper, we develop a new unsupervised machine learning technique comprising a feature extractor, a convolutional autoencoder, and a clustering algorithm based on a Bayesian Gaussian mixture model. We apply this technique to simulated visual-band, space-based imaging data for the Euclid Space Telescope, taken from the strong gravitational lenses finding challenge. Our technique shows promise in capturing a variety of lensing features, such as Einstein rings with different radii and distorted arc structures, without using predefined labels. After the clustering process, we obtain several classification clusters separated by the different visual features seen in the images. Our method successfully picks up ∼63 per cent of lensing images from all lenses in the training set. With the assumed probability proposed in this study, this technique reaches an accuracy of 77.25 ± 0.48 per cent in binary classification using the training set. Additionally, our unsupervised clustering process can be used as a preliminary classification for future lens surveys, to select targets efficiently and to speed up the labelling process. As a starting point for astronomical applications of this technique, we not only explore its application to gravitationally lensed systems, but also discuss its limitations and potential future uses.

Algorithms ◽  
2021 ◽  
Vol 14 (9) ◽  
pp. 258
Author(s):  
Tran Dinh Khang ◽  
Manh-Kien Tran ◽  
Michael Fowler

Clustering is an unsupervised machine learning method with many practical applications that has attracted extensive research interest. It divides data elements into clusters such that elements in the same cluster are similar. Because it is unsupervised, no information about the labels of the elements is available. However, when some information about the data points is known in advance, it is beneficial to use a semi-supervised algorithm. Among the many clustering techniques available, fuzzy C-means clustering (FCM) is a common one. To make the FCM algorithm semi-supervised, the literature has proposed using an auxiliary matrix that adjusts the membership grades of the elements to force them into certain clusters during the computation. In this study, instead of using the auxiliary matrix, we propose using multiple fuzzification coefficients to implement the semi-supervision component. After deriving the proposed semi-supervised fuzzy C-means clustering algorithm with multiple fuzzification coefficients (sSMC-FCM), we demonstrate the convergence of the algorithm and validate the efficiency of the method through a numerical example.
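
For orientation, the baseline that this abstract builds on can be sketched in a few lines. The code below is a minimal sketch of standard, unsupervised FCM with a single fuzzification coefficient `m`; it is not the sSMC-FCM algorithm proposed in the paper, which replaces this single coefficient with multiple coefficients. The function name, parameters, and toy data are assumptions made for illustration.

```python
import random

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard FCM on a list of equal-length tuples.

    Returns (centers, memberships); memberships[i][k] is the grade of
    point i in cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / w_sum
                for d in range(dim)
            ))
        # Membership update from relative distances to the centres.
        max_change = 0.0
        for i in range(n):
            dists = [
                max(sum((points[i][d] - c[d]) ** 2 for d in range(dim)) ** 0.5, 1e-12)
                for c in centers
            ]
            new_row = [
                1.0 / sum((dists[k] / dists[j]) ** (2.0 / (m - 1.0))
                          for j in range(n_clusters))
                for k in range(n_clusters)
            ]
            max_change = max(max_change,
                             max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if max_change < tol:
            break
    return centers, u

# Hypothetical toy data: two well-separated groups of three points.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
toy_centers, toy_u = fuzzy_c_means(toy_points, n_clusters=2)
```

A larger `m` makes the memberships softer, while values close to 1 approach a hard assignment; that sensitivity is the lever a multi-coefficient variant can tune per element.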


2019 ◽  
Vol 9 (2) ◽  
pp. 238 ◽  
Author(s):  
Yanqing Yang ◽  
Kangfeng Zheng ◽  
Chunhua Wu ◽  
Xinxin Niu ◽  
Yixian Yang

Machine learning plays an important role in building intrusion detection systems. However, as data volume and dimensionality increase, shallow machine learning methods become increasingly limited. In this paper, we propose a fuzzy aggregation approach using the modified density peak clustering algorithm (MDPCA) and deep belief networks (DBNs). To reduce the size of the training set and the imbalance of the samples, MDPCA is used to divide the training set into several subsets with similar sets of attributes. Each subset is used to train its own sub-DBN classifier. These sub-DBN classifiers can learn and explore high-level abstract features, automatically reduce data dimensions, and perform classification well. According to the nearest neighbor criterion, the fuzzy membership weights of each test sample in each sub-DBN classifier are calculated. The outputs of all sub-DBN classifiers are aggregated based on these fuzzy membership weights. Experimental results on the NSL-KDD and UNSW-NB15 datasets show that our proposed model has higher overall accuracy, recall, precision and F1-score than other well-known classification methods. Furthermore, the proposed model achieves better accuracy, detection rate and false positive rate than state-of-the-art intrusion detection methods.


2020 ◽  
Vol 222 (3) ◽  
pp. 1750-1764 ◽  
Author(s):  
Yangkang Chen

SUMMARY Effective and efficient arrival picking plays an important role in microseismic and earthquake data processing and imaging. Widely used arrival picking algorithms based on the short-term-average to long-term-average ratio (STA/LTA) are sensitive to moderate-to-strong random ambient noise. To make state-of-the-art arrival picking approaches effective, microseismic data first need to be pre-processed, for example by removing a sufficient amount of noise, and then analysed by arrival pickers. To overcome the noise issue in arrival picking for weak microseismic or earthquake events, I leverage machine learning techniques to help recognize seismic waveforms in microseismic or earthquake data. Because supervised machine learning algorithms depend on a large volume of well-designed training data, I utilize an unsupervised machine learning algorithm to cluster the time samples into two groups, that is, waveform points and non-waveform points. The fuzzy clustering algorithm has been demonstrated to be effective for this purpose. A group of synthetic, real microseismic and earthquake data sets with different levels of complexity show that the proposed method is much more robust than the state-of-the-art STA/LTA method in picking microseismic events, even in the case of moderately strong background noise.
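
The STA/LTA baseline mentioned in this summary can be illustrated with a short sketch. The version below is one common formulation, assuming trailing windows over the squared amplitude and a fixed trigger threshold; the window lengths, the threshold, and the function names are assumptions for illustration and are not taken from the paper.

```python
def sta_lta(trace, n_sta, n_lta):
    """Ratio of short-term to long-term average energy, using trailing windows.

    Assumes n_sta <= n_lta. Samples before the first full long window are
    left at 0.0.
    """
    energy = [x * x for x in trace]
    # Prefix sums give each window sum in constant time.
    csum = [0.0]
    for e in energy:
        csum.append(csum[-1] + e)
    ratio = [0.0] * len(trace)
    for i in range(n_lta - 1, len(trace)):
        sta = (csum[i + 1] - csum[i + 1 - n_sta]) / n_sta
        lta = (csum[i + 1] - csum[i + 1 - n_lta]) / n_lta
        ratio[i] = sta / lta if lta > 0 else 0.0
    return ratio

def first_trigger(ratio, threshold):
    """Index of the first sample whose ratio exceeds the threshold, or None."""
    for i, r in enumerate(ratio):
        if r > threshold:
            return i
    return None

# Hypothetical trace: 50 samples of weak background, then a stronger event.
toy_trace = [0.1, -0.1] * 25 + [1.0, -1.0] * 10
toy_ratio = sta_lta(toy_trace, n_sta=5, n_lta=20)
toy_pick = first_trigger(toy_ratio, threshold=3.0)
```

When the background energy approaches the event energy, the ratio at the onset drops towards 1 and a fixed threshold stops separating the two, which is the noise sensitivity the summary describes.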


2018 ◽  
Vol 11 (3) ◽  
pp. 274-300
Author(s):  
Waqas Khalid ◽  
Zaza Nadja Lee Herbert-Hansen

Purpose: This paper aims to investigate the application of unsupervised machine learning in the international location decision (ILD). It addresses the need for a fast, quantitative and dynamic location decision framework.
Design/methodology/approach: An unsupervised machine learning technique, k-means clustering, is used to carry out the analysis. In total, 24 different indicators of 94 countries, categorized into five groups, have been used in the analysis. After the clustering, the clusters have been compared and scored to select the feasible countries.
Findings: A new framework based on k-means clustering is developed that can be used in ILD. This method provides a quantitative output without personal subjectivity. Indicators can easily be added or removed based on the preferences of the decision-makers. Unsupervised machine learning, in the form of k-means clustering, was thus found to be a fast and flexible decision support framework that can be used in ILD.
Research limitations/implications: Limitations include the generality of the selected indicators and of the clustering algorithm used. The use of other methods and parameters may lead to different results.
Originality/value: The framework developed through the research is intended to assist decision-makers in deciding on facility locations. The framework can be used in international and national domains. It provides a quantitative, fast and flexible way to shortlist potential locations. Other methods can also be used to further decide on the specific location.
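
As a rough illustration of the clustering step described above, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The toy indicator values, the random initialisation, and the function name are assumptions for illustration; the abstract does not specify the paper's indicators, preprocessing, or scoring scheme.

```python
import random

def k_means(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm on a list of equal-length tuples.

    Returns (centers, labels), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centres drawn from the data

    def nearest(pt):
        # Index of the centre with the smallest squared Euclidean distance.
        return min(range(k),
                   key=lambda j: sum((a - b) ** 2 for a, b in zip(pt, centers[j])))

    labels = [nearest(pt) for pt in points]
    for _ in range(n_iter):
        # Move each centre to the mean of its members (skip empty clusters).
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
        new_labels = [nearest(pt) for pt in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return centers, labels

# Hypothetical data: each tuple holds two indicator scores for one country.
toy_countries = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
                 (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
toy_centers, toy_labels = k_means(toy_countries, k=2)
```

Adding or dropping an indicator only changes the length of each tuple, which fits the flexibility noted under Findings. In practice, indicators on different scales would usually be standardised before clustering.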


2021 ◽  
Vol 13 (5) ◽  
pp. 845
Author(s):  
Yuchao Chen ◽  
Qian Huang ◽  
Jiannan Zhao ◽  
Xiangyun Hu

Lunar volcanic domes are essential windows into the local magmatic activities on the Moon. Classification of domes is a useful way to figure out the relationship between dome appearances and formation processes. Previous dome classification studies were carried out manually or semi-automatically, either qualitatively or quantitatively. We applied an unsupervised machine-learning method to domes that are annularly or radially distributed around Gardner, a unique central-vent volcano located in the northern part of the Mare Tranquillitatis. High-resolution lunar imaging and spectral data were used to extract morphometric and spectral properties of domes in both the Gardner volcano and its surrounding region in the Mare Tranquillitatis. An integrated robust Fuzzy C-Means clustering algorithm was performed on 120 combinations of five morphometric features (diameter, area, height, surface volume, and slope) and two elemental features (FeO and TiO2 contents) to find the optimum combination. Rheological features of domes and their dike formation parameters were calculated to characterize the dome-forming lavas. Results show that diameter, area, surface volume, and slope are the selected optimum features for dome clustering. The 54 studied domes can be grouped into four dome clusters (DC1 to DC4). DC1 domes are relatively small, steep, and close to the Gardner volcano, with forming lavas of high viscosities and low effusion rates, representing the latest Eratosthenian dome formation stage of the Gardner volcano. Domes of DC2 to DC4 are relatively large, smooth, and widely distributed, with forming lavas of low viscosities and high effusion rates, representing magmatic activities varying from Imbrian to Eratosthenian in the northern Mare Tranquillitatis. The integrated algorithm provides a new and independent way to figure out the representative properties of lunar domes and helps us further clarify the relationship between dome clusters and local magma activities of the Moon.
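
The abstract does not say how the 120 feature combinations were formed. One reading that matches the number is every subset of at least two of the seven listed features; the short sketch below enumerates those subsets under that assumption.

```python
from itertools import combinations

# The five morphometric and two elemental features named in the abstract.
features = ["diameter", "area", "height", "surface volume", "slope",
            "FeO", "TiO2"]

# Assumption: a "combination" is any subset containing two or more features.
subsets = [combo
           for size in range(2, len(features) + 1)
           for combo in combinations(features, size)]

print(len(subsets))  # 120 = 21 + 35 + 35 + 21 + 7 + 1
```

Under this reading, the optimum set reported in the abstract (diameter, area, surface volume, and slope) is one of the 35 four-feature subsets.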


Economies ◽  
2019 ◽  
Vol 7 (3) ◽  
pp. 74
Author(s):  
Carlos Mendez

The cross-country convergence hypothesis is one of the central topics of long-run macroeconomics. This paper revisits this hypothesis in a context beyond GDP. It uses a novel welfare index that incorporates measures of consumption, leisure, life expectancy, and inequality. Based on a sample of 128 countries over the 1980–2007 period, the lack of global sigma and beta convergence is first documented. Next, the paper incorporates some recent developments from the unsupervised machine learning literature to evaluate the existence of local convergence. In particular, the application of a distribution-based clustering algorithm suggests the formation of three local convergence clubs. Under this classification, beta convergence is recovered for each club. However, only the core members of the richest club appear to be reducing their welfare differences in a way that is consistent with the strong notion of sigma convergence. Overall, these results re-emphasize the finding that beta convergence is necessary, but not sufficient for sigma convergence, even within convergence clubs and in a context beyond GDP.
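
The two convergence notions in this abstract can be made concrete with a small sketch. It uses the usual textbook definitions as assumptions: beta convergence as a negative slope when regressing average growth on the initial log level, and sigma convergence as a decline in the cross-sectional dispersion of log levels. The welfare values are hypothetical and chosen only to show that the first can hold without the second.

```python
import math
from statistics import pstdev

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx

def beta_coefficient(initial, final, years):
    """Slope of average log growth on the initial log level (negative: beta convergence)."""
    log_initial = [math.log(v) for v in initial]
    growth = [(math.log(f) - math.log(i)) / years for i, f in zip(initial, final)]
    return ols_slope(log_initial, growth)

def sigma_change(initial, final):
    """Change in the dispersion of log levels (negative: sigma convergence)."""
    return (pstdev([math.log(v) for v in final])
            - pstdev([math.log(v) for v in initial]))

# Hypothetical welfare levels for three economies that swap rank order.
welfare_start = [1.0, 2.0, 4.0]
welfare_end = [16.0, 4.0, 2.0]

beta = beta_coefficient(welfare_start, welfare_end, years=27)
sigma = sigma_change(welfare_start, welfare_end)
print(beta < 0, sigma > 0)  # True True
```

Here the initially poorest economy grows fastest, so the beta coefficient is negative, yet it overshoots the others and dispersion widens. This is the sense in which beta convergence does not guarantee sigma convergence.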

