Nonuniform Granularity-Based Classification in Social Interest Detection

2017 ◽  
Vol 2017 ◽  
pp. 1-10
Author(s):  
Wenjuan Shao ◽  
Qingguo Shen ◽  
Xianli Jin ◽  
Liaoruo Huang ◽  
Jingjing Chen

Social interest detection is a new computing paradigm that processes a great variety of large-scale resources. Effective classification of these resources is necessary for social interest detection. In this paper, we describe some concepts and principles of classification and present a novel classification algorithm based on nonuniform granularity. A clustering algorithm is used to generate a clustering pedigree chart. By cutting the chart at suitable classification cutting values, we obtain different branches, which are used as categories. The size of the cutting value is vital to performance and can be dynamically adapted in the proposed algorithm. Experimental results on blog posts illustrate the effectiveness of the proposed algorithm. Furthermore, comparisons with Naive Bayes, k-nearest neighbor, and other methods validate the better classification performance of the proposed algorithm for large-scale resources.
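To make the idea of cutting a clustering pedigree chart concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: it assumes 1-D points, single-linkage agglomerative clustering, and one fixed cutting value, whereas the abstract describes a cutting value that is adapted dynamically. The names `single_linkage_merges` and `cut` are hypothetical.

```python
from itertools import combinations


def linkage(points, a, b):
    """Single-linkage distance: the closest pair of members across two clusters."""
    return min(abs(points[i] - points[j]) for i in a for j in b)


def single_linkage_merges(points):
    """Agglomerative clustering of 1-D points.

    Returns the merge history as (distance, members_a, members_b) tuples,
    which is the information a dendrogram (pedigree chart) encodes.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Pick the two clusters that are currently closest to each other.
        a, b = min(combinations(clusters, 2), key=lambda p: linkage(points, p[0], p[1]))
        merges.append((linkage(points, a, b), a, b))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


def cut(points, merges, cutting_value):
    """Keep only merges below the cutting value; the surviving branches are the categories."""
    label = {i: frozenset([i]) for i in range(len(points))}
    for dist, a, b in merges:
        if dist < cutting_value:
            merged = a | b
            for i in merged:
                label[i] = merged
    return sorted({tuple(sorted(c)) for c in label.values()})
```

A small cutting value yields many fine-grained categories and a large one yields a few coarse categories, which is why the choice of cutting value matters so much for performance.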

2018 ◽  
Vol 5 (1) ◽  
pp. 8 ◽  
Author(s):  
Ajib Susanto ◽  
Daurat Sinaga ◽  
Christy Atika Sari ◽  
Eko Hari Rachmawanto ◽  
De Rosal Ignatius Moses Setiadi

The classification of Javanese character images is done with the aim of recognizing each character. The selected classification algorithm is K-Nearest Neighbor (KNN) at K = 1, 3, 5, 7, and 9. The study aims to improve KNN performance on Javanese characters written by the author and to show that feature extraction is needed in the image classification of Javanese characters. Local Binary Pattern (LBP) was selected for feature extraction because some research objects have a certain level of slope. The LBP parameters used are [16 16], [32 32], [64 64], [128 128], and [256 256]. Experiments were performed on 80 training images and 40 test images. KNN combined with LBP feature extraction achieved 82.5% at K = 3 and LBP parameters [64 64].
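The LBP-plus-KNN pipeline can be sketched as follows. This is an illustrative sketch only: it assumes the basic 8-neighbour LBP on a grayscale image stored as a list of rows, a single whole-image histogram (the abstract's [64 64]-style parameters are not modelled), and squared Euclidean distance. The function names are hypothetical.

```python
from collections import Counter

# Neighbours visited clockwise, starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c): one bit per neighbour that is >= the centre."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over the interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist]


def knn_predict(train, query, k):
    """train is a list of (feature_vector, label); majority vote among the k nearest."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda item: sq_dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

In use, each training image would be reduced to `lbp_histogram(img)` and stored with its character label, and a test image would be classified with `knn_predict` for each K under study.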


Author(s):  
Bingming Wang ◽  
Shi Ying ◽  
Guoli Cheng ◽  
Rui Wang ◽  
Zhe Yang ◽  
...  

Logs play an important role in the maintenance of large-scale systems. The number of logs that indicate normal operation (normal logs) differs greatly from the number of logs that indicate anomalies (abnormal logs), and the two types of logs have certain differences. Automatically finding faults with the K-Nearest Neighbor (KNN) algorithm, an outlier detection method with high accuracy, is an effective way to detect anomalies from logs. However, logs are large in scale and their samples are very unevenly distributed, which affects the results of the KNN algorithm in log-based anomaly detection. Thus, we propose an improved KNN-based method that uses the existing mean-shift clustering algorithm to efficiently select the training set from massive logs. We then assign different weights to samples at different distances, which reduces the negative effect of the unbalanced distribution of log samples on the accuracy of the KNN algorithm. Comparative experiments on log sets from five supercomputers show that the proposed method can be effectively applied to log-based anomaly detection, and that its accuracy, recall rate, and F-measure are higher than those of the traditional keyword search method.
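The effect of distance weighting on an unbalanced sample can be illustrated with a short sketch. The abstract does not give the weighting formula, so inverse-distance weights are assumed here purely for illustration, and the mean-shift training-set selection step is not shown. The name `weighted_knn` and the `eps` parameter are hypothetical.

```python
import math
from collections import defaultdict


def weighted_knn(train, query, k, eps=1e-9):
    """Distance-weighted KNN vote.

    train is a list of (feature_vector, label). Each of the k nearest
    samples votes with weight 1 / (distance + eps), so a close sample of a
    rare class can outvote several distant samples of a common class.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    nearest = sorted(((dist(x, query), y) for x, y in train), key=lambda t: t[0])[:k]
    votes = defaultdict(float)
    for d, label in nearest:
        votes[label] += 1.0 / (d + eps)
    return max(votes, key=votes.get)
```

For example, with one abnormal sample at distance 0.1 and two normal samples at distances of about 2, a plain majority vote with k = 3 would answer "normal", while the weighted vote answers "abnormal".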


Proceedings ◽  
2018 ◽  
Vol 2 (7) ◽  
pp. 328 ◽  
Author(s):  
Eleftheria Mylona ◽  
Vassiliki Daskalopoulou ◽  
Olga Sykioti ◽  
Konstantinos Koutroumbas ◽  
Athanasios Rontogiannis

This paper deals with (both supervised and unsupervised) classification of multispectral Sentinel-2 images, utilizing the abundance representation of the pixels of interest. The latter pixel representation uncovers the hidden structured regions that are not often available in the reference maps. Additionally, it encourages class distinctions and bolsters accuracy. The adopted methodology, which has been successfully applied to hyperspectral data, involves two main stages: (I) the determination of the pixel’s abundance representation; and (II) the employment of a classification algorithm applied to the abundance representations. More specifically, stage (I) incorporates two key processes, namely (a) endmember extraction, utilizing spectrally homogeneous regions of interest (ROIs); and (b) spectral unmixing, which hinges upon the endmember selection. The adopted spectral unmixing process assumes the linear mixing model (LMM), where each pixel is expressed as a linear combination of the endmembers. The pixel’s abundance vector is estimated via a variational Bayes algorithm that is based on a suitably defined hierarchical Bayesian model. The resulting abundance vectors are then fed to stage (II), where two off-the-shelf supervised classification approaches (namely nearest neighbor (NN) classification and support vector machines (SVM)), as well as an unsupervised classification process (namely the online adaptive possibilistic c-means (OAPCM) clustering algorithm), are adopted. Experiments are performed on a Sentinel-2 image acquired for a specific region of the Northern Pindos National Park in north-western Greece containing water, vegetation and bare soil areas. The experimental results demonstrate that the ad-hoc classification approaches utilizing abundance representations of the pixels outperform those utilizing the spectral signatures of the pixels in terms of accuracy.
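For reference, the linear mixing model is commonly written as below. The notation is the usual textbook form and is an assumption here, not taken from the abstract: x is the pixel spectrum, e_i are the p endmember spectra (the columns of E), a is the abundance vector, and n is additive noise. The nonnegativity and sum-to-one constraints on the abundances are the ones usually imposed; the abstract does not state which constraints the authors use.

```latex
\mathbf{x} \;=\; \sum_{i=1}^{p} a_i\,\mathbf{e}_i + \mathbf{n}
\;=\; \mathbf{E}\,\mathbf{a} + \mathbf{n},
\qquad a_i \ge 0, \quad \sum_{i=1}^{p} a_i = 1 .
```

Under this model, the abundance vector a, rather than the raw spectrum x, is what stage (II) receives as the feature vector for classification.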


2020 ◽  
Vol 12 (10) ◽  
pp. 1640 ◽  
Author(s):  
Zhi He ◽  
Dan He

Deep learning methods have been successfully applied to multispectral and hyperspectral image classification due to their ability to extract hierarchical abstract features. However, the performance of these methods relies heavily on large-scale training samples. In this paper, we propose a three-dimensional spatial-adaptive Siamese residual network (3D-SaSiResNet) that requires fewer samples and still enhances the performance. The proposed method consists of two main steps: construction of 3D spatial-adaptive patches and a Siamese residual network for multiband image classification. In the first step, the spectral dimension of the original multiband images is reduced by a stacked autoencoder and superpixels of each band are obtained by the simple linear iterative clustering (SLIC) method. Superpixels of the original multiband image are then generated by majority voting. Subsequently, the 3D spatial-adaptive patch of each pixel is extracted from the original multiband image by reference to the previously generated superpixels. In the second step, a Siamese network composed of two 3D residual networks is designed to extract discriminative features for classification, and we train the 3D-SaSiResNet by feeding the training samples into the networks in pairs. The testing samples are then fed into the trained 3D-SaSiResNet and the learned features of the testing samples are classified by the nearest neighbor classifier. Experimental results on three multiband image datasets show the feasibility of the proposed method in enhancing classification performance even with limited training samples.


Hacquetia ◽  
2020 ◽  
Vol 19 (1) ◽  
pp. 99-126
Author(s):  
Igor V. Goncharenko ◽  
Halina M. Yatsenko

The study presents a floristic-sociological classification of the forest vegetation of the Kyiv urban area. We identified 18 syntaxa within 7 classes, 7 orders, and 8 alliances, and distinguished 3 new associations (Aristolochio clematitis-Populetum nigrae, Galio aparines-Aceretum negundi, Dryopterido carthusianae-Pinetum sylvestris). We analyzed vegetation data using quantitative approaches of ordination and phytoindication. Because the collected data on urban forests contain many relevés of a transitional nature, the DRSA (Distance-Ranked Sorting Algorithm) clustering algorithm was applied to classify the vegetation matrix. A large-scale comparative floristic analysis of syntaxa from different regions and countries has been conducted and summarized in differentiating tables.


2017 ◽  
Vol 6 (2) ◽  
pp. 83-96 ◽  
Author(s):  
A. Sultanova ◽  
I.A. Ivanova

The article raises the question of how current the available normative data are. In the tradition of Russian psychology, such data are needed as a baseline against which the results of experimental studies are compared. It can be assumed that the social changes of recent decades are reflected in the formation of thinking and other mental functions. A pilot study was conducted to identify how healthy subjects perform classical pathopsychological techniques. The study involved mentally healthy and socially adapted people aged 20-39, graduates or undergraduates. We used the following techniques: "Classification of objects", "Pictogram", filling in words missing from a text (Ebbinghaus test), and "Interpretation of proverbs". The results of the experiment made it possible to identify two areas in which the changes were most significant: the emotional-motivational (personality) sphere and thinking. Many subjects were characterized by a wary, anxious attitude to the experiment, an increased emotional and personal attitude to the stimulus material, decreased criticality toward the results of their activities, neurodynamic disorders, inconsistency of thinking, versatility of thinking, a tendency to resonate, and self-centered thinking (according to the authors, these features manifest as a "pathopsychology of everyday life"). Special large-scale scientific research devoted to this problem is needed.


2017 ◽  
Author(s):  
Debajyoti Sinha ◽  
Akhilesh Kumar ◽  
Himanshu Kumar ◽  
Sanghamitra Bandyopadhyay ◽  
Debarka Sengupta

Droplet-based single-cell transcriptomics has recently enabled parallel screening of tens of thousands of single cells. Clustering methods that scale to such high-dimensional data without compromising accuracy are scarce. We exploit Locality Sensitive Hashing, an approximate nearest neighbor search technique, to develop a de novo clustering algorithm for large-scale single-cell data. On a number of real datasets, dropClust outperformed the existing best-practice methods in terms of execution time, clustering accuracy, and detectability of minor cell sub-types.
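The following sketch shows how Locality Sensitive Hashing narrows a nearest neighbor search to a small candidate set. It uses random-hyperplane hashing, one common LSH family, chosen here for illustration; the abstract does not say which LSH scheme dropClust uses, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (as normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


def build_index(vectors, hyperplanes):
    """Group vector indices into buckets keyed by their signature."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[signature(vec, hyperplanes)].append(idx)
    return buckets


def candidates(query, buckets, hyperplanes):
    """Approximate neighbours: only vectors sharing the query's bucket are returned."""
    return buckets.get(signature(query, hyperplanes), [])
```

Vectors pointing in similar directions tend to share a signature, so an exact distance computation is needed only within a bucket rather than across all cells. In practice several independent hash tables are usually combined to reduce missed neighbours.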


Author(s):  
Clive Holes

This chapter outlines the scholarly background of the study of Arabic historical dialectology, and addresses the following issues: the early history of Arabic: myth and reality; the definition and exemplification of ‘Middle Arabic’ and ‘Mixed Arabic through history’; evidence for the early occurrence of certain Arabic dialectal features; examples of substrates and borrowing in Arabic dialects; the dialect geography of Arabic and its typology, especially the ‘sedentary’ and ‘bedouin’ divide; how and why dialects have undergone change, large-scale and small-scale, and the causative social factors; a classification of the typology of internal linguistic change in Arabic; causes of the social indexicalization of dialectal features of Arabic; examples of the pidginization and creolization of Arabic, and the reasons for the apparent rarity of this phenomenon.


Proceedings ◽  
2018 ◽  
Vol 2 (19) ◽  
pp. 1242 ◽  
Author(s):  
Macarena Espinilla ◽  
Javier Medina ◽  
Alberto Salguero ◽  
Naomi Irvine ◽  
Mark Donnelly ◽  
...  

Data-driven approaches for human activity recognition learn from pre-existing large-scale datasets to generate a classification algorithm that can recognize target activities. Typically, several activities are represented within such datasets, characterized by multiple features that are computed from sensor devices. Often, some features are found to be more relevant to particular activities, which can make the classification algorithm less accurate in detecting activities for which such features are less relevant. This work presents an experiment on human activity recognition with features derived from the acceleration data of a wearable device. Specifically, this work analyzes which features are most relevant for each activity and furthermore investigates which classifier provides the best accuracy with those features. The results obtained indicate that the best classifier is the k-nearest neighbor and, furthermore, confirm that redundant features exist that generally introduce noise into the classification, leading to decreased accuracy.

