Smartwatch User Authentication by Sensing Tapping Rhythms and Using One-Class DBSCAN

Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2456
Author(s):  
Hanqi Zhang ◽  
Xi Xiao ◽  
Shiguang Ni ◽  
Changsheng Dou ◽  
Wei Zhou ◽  
...  

As important sensors in smart sensing systems, smartwatches are becoming more and more popular. Authentication can help protect the security and privacy of users. In addition to the classic authentication methods, behavioral factors can be used as robust measures for this purpose. This study proposes a lightweight authentication method for smartwatches based on edge computing, which identifies users by their tapping rhythms. Based on the DBSCAN clustering algorithm, a new classification method called One-Class DBSCAN is presented. It first seeks core objects and then leverages them to perform user authentication. We conducted extensive experiments on 6110 real data samples collected from more than 600 users. The results show that our method achieved the lowest Equal Error Rate (EER) of only 0.92%, which was lower than those of other state-of-the-art methods. In addition, a statistical method for detecting the security level of a tapping rhythm is proposed. It can prevent users from setting a simple tapping rhythm password, and thus improve the security of smartwatches.
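The paper's One-Class DBSCAN is not reproduced here, but its core-object idea can be sketched as follows (a minimal illustration; the tap-interval features, eps and min_pts values are all assumptions, not the paper's settings):

```python
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def core_objects(samples, eps, min_pts):
    # A sample is a core object if at least min_pts samples
    # (including itself) lie within eps of it.
    return [s for s in samples
            if sum(euclid(s, t) <= eps for t in samples) >= min_pts]

def authenticate(sample, cores, eps):
    # Accept the attempt if it lies within eps of any core object.
    return any(euclid(sample, c) <= eps for c in cores)

# Enrollment: hypothetical tap-interval vectors (ms) from the legitimate user.
enroll = [[200, 310, 150], [205, 300, 155], [198, 305, 148], [210, 315, 152]]
cores = core_objects(enroll, eps=15.0, min_pts=3)
print(authenticate([202, 308, 151], cores, eps=15.0))  # genuine: True
print(authenticate([400, 120, 500], cores, eps=15.0))  # impostor: False
```

Only the legitimate user's samples are needed for training, which is what makes the method one-class.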

IEEE Access ◽  
2021 ◽  
Vol 9 ◽  
pp. 43364-43377
Author(s):  
Xirui Xue ◽  
Shucai Huang ◽  
Jiahao Xie ◽  
Jiashun Ma ◽  
Ning Li

2016 ◽  
Vol 2016 ◽  
pp. 1-15
Author(s):  
N. Vanello ◽  
E. Ricciardi ◽  
L. Landini

Independent component analysis (ICA) of functional magnetic resonance imaging (fMRI) data can be employed as an exploratory method. The lack of strong a priori assumptions about the signal or the noise in the ICA model makes the results difficult to interpret. Moreover, the statistical independence of the components is only approximated. Residual dependencies among the components can reveal informative structure in the data. A major problem is related to model order selection, that is, the number of components to be extracted. Specifically, overestimation may lead to component splitting. In this work, a method based on hierarchical clustering of ICA applied to fMRI datasets is investigated. The clustering algorithm uses a metric based on the mutual information between the ICs. To estimate the similarity measure, a histogram-based technique and one based on kernel density estimation are tested on simulated datasets. Simulation results indicate that the method could be used to cluster components related to the same task and resulting from a splitting process occurring at different model orders. Different performances of the similarity measures were found and discussed. Preliminary results on real data are reported and show that the method can group task-related and transiently task-related components.
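The histogram-based similarity estimate mentioned above can be sketched roughly like this (a simplified illustration with an assumed bin count and synthetic signals; the paper's clustering metric is derived from, not identical to, the raw mutual information):

```python
import math, random

def mutual_info(x, y, bins=8):
    # Histogram-based estimate of the mutual information between
    # two equally long signals, in nats.
    n = len(x)
    def digitize(v):
        lo, hi = min(v), max(v)
        return [min(bins - 1, int((s - lo) / (hi - lo) * bins)) for s in v]
    bx, by = digitize(x), digitize(y)
    pxy = {}
    for a, b in zip(bx, by):
        pxy[(a, b)] = pxy.get((a, b), 0) + 1.0 / n
    px = [bx.count(i) / n for i in range(bins)]
    py = [by.count(i) / n for i in range(bins)]
    return sum(p * math.log(p / (px[a] * py[b])) for (a, b), p in pxy.items())

random.seed(0)
src = [random.gauss(0, 1) for _ in range(2000)]
split = [v + 0.1 * random.gauss(0, 1) for v in src]   # a "split" copy of src
noise = [random.gauss(0, 1) for _ in range(2000)]
# Components arising from the same split share far more information:
print(mutual_info(src, split) > mutual_info(src, noise))  # True
```

Hierarchical clustering would then merge components whose pairwise MI-based distance is smallest.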


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-16 ◽  
Author(s):  
Yiwen Zhang ◽  
Yuanyuan Zhou ◽  
Xing Guo ◽  
Jintao Wu ◽  
Qiang He ◽  
...  

The K-means algorithm is one of the ten classic algorithms in the area of data mining and has been studied by researchers in numerous fields for a long time. However, the value of the clustering number k in the K-means algorithm is not always easy to determine, and the selection of the initial centers is vulnerable to outliers. This paper proposes an improved K-means clustering algorithm called the covering K-means algorithm (C-K-means). The C-K-means algorithm can not only acquire efficient and accurate clustering results but also self-adaptively provide a reasonable number of clusters based on the data features. It includes two phases: the initialization of the covering algorithm (CA) and the Lloyd iteration of K-means. The first phase executes the CA, which self-organizes and recognizes the number of clusters k based on the similarities in the data; it requires neither the number of clusters to be prespecified nor the initial centers to be manually selected. Therefore, it has a "blind" feature, that is, k is not preselected. The second phase performs the Lloyd iteration based on the results of the first phase. The C-K-means algorithm combines the advantages of CA and K-means. Experiments are carried out on the Spark platform, and the results verify the good scalability of the C-K-means algorithm, which can effectively solve the problem of large-scale data clustering. Extensive experiments on real data sets show that the accuracy and efficiency of the C-K-means algorithm outperform those of existing algorithms under both sequential and parallel conditions.
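A minimal sketch of the two phases on toy 2-D data (the covering radius and the exact CA formulation here are assumptions; the paper's CA may differ in detail):

```python
import math, random

def covering_init(points, radius):
    # Covering algorithm (CA) phase: repeatedly take an uncovered point
    # as a centre and cover everything within `radius`; the number of
    # clusters k emerges from the data instead of being prespecified.
    centers, uncovered = [], list(points)
    while uncovered:
        c = uncovered[0]
        centers.append(c)
        uncovered = [p for p in uncovered if math.dist(p, c) > radius]
    return centers

def lloyd(points, centers, iters=20):
    # Standard Lloyd iteration of K-means, seeded by the CA centres.
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            groups[i].append(p)
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c0
                   for g, c0 in zip(groups, centers)]
    return centers

random.seed(1)
pts = ([(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(50)] +
       [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(50)])
centers = covering_init(pts, radius=2.0)
print(len(centers))  # k discovered by the CA phase: 2
centers = lloyd(pts, centers)
```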


Author(s):  
J. W. Li ◽  
X. Q. Han ◽  
J. W. Jiang ◽  
Y. Hu ◽  
L. Liu

Abstract. How to establish an effective method for analyzing large geographic spatio-temporal data and quickly and accurately uncover the hidden value behind geographic information has become a current research focus. Researchers have found that clustering analysis methods from the data mining field can mine knowledge and information hidden in complex and massive spatio-temporal data, and density-based clustering is one of the most important clustering methods. However, the traditional DBSCAN clustering algorithm has some drawbacks in parameter selection that are difficult to overcome. For example, the two important parameters, the Eps neighborhood and the MinPts density threshold, need to be set manually, and the guiding principles for parameter setting in the traditional DBSCAN algorithm do not make it possible to select the most suitable parameters, so accurate clustering results cannot be guaranteed. To solve the problems of misclassification and density sparsity caused by unreasonable parameter selection in the DBSCAN clustering algorithm, this paper proposes an efficient DBSCAN-based density clustering method with improved parameter optimization. Its evaluation index function (Optimal Distance) is obtained by cycling through k-clusterings in turn, and the optimal solution is selected. The optimal k-value in k-clustering is used to cluster the samples. Through mathematical and physical analysis, the appropriate values of the Eps and MinPts parameters can be determined, and the final clustering results are obtained by DBSCAN clustering. Experiments show that this method selects reasonable parameters for DBSCAN clustering, which proves the superiority of the method described in this paper.
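The paper's Optimal Distance index is not fully specified in this abstract; as a rough stand-in for the same goal (choosing Eps and MinPts from the data rather than by hand), the widely used sorted k-distance heuristic can be sketched as:

```python
import math

def sorted_k_distances(points, k):
    # Distance from each point to its k-th nearest neighbour, sorted.
    out = []
    for i, p in enumerate(points):
        ds = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(ds[k - 1])
    return sorted(out)

def suggest_eps(points, k):
    # Crude knee detection: take the value just before the largest
    # jump in the sorted k-distance curve as the Eps candidate.
    kd = sorted_k_distances(points, k)
    jump, idx = max((kd[i + 1] - kd[i], i) for i in range(len(kd) - 1))
    return kd[idx]

pts = [(i * 0.1, 0.0) for i in range(20)] + [(10.0, 10.0)]  # dense line + outlier
eps = suggest_eps(pts, k=3)   # MinPts would then be set to k + 1
print(0.0 < eps < 1.0)        # True: Eps small enough to flag the outlier
```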


2020 ◽  
Author(s):  
Lucía Prieto Santamaría ◽  
Eduardo P. García del Valle ◽  
Gerardo Lagunes García ◽  
Massimiliano Zanin ◽  
Alejandro Rodríguez González ◽  
...  

Abstract. While classical disease nosology is based on phenotypical characteristics, the increasing availability of biological and molecular data is providing a new understanding of diseases and their underlying relationships, which could lead to a more comprehensive paradigm for modern medicine. In the present work, similarities between diseases are used to study the generation of new possible disease nosologic models that include both phenotypical and biological information. To this aim, disease similarity is measured in terms of disease feature vectors, which stood for genes, proteins, metabolic pathways and PPIs in the case of biological similarity, and for symptoms in the case of phenotypical similarity. An improvement in similarity computation is proposed, considering weighted instead of Boolean feature vectors. Unsupervised learning methods were applied to these data, specifically the density-based DBSCAN clustering algorithm. The silhouette coefficient was chosen as the evaluation metric, although the number of clusters and the number of outliers were also considered. To validate the results, a comparison with randomly distributed data was performed. Results suggest that weighted biological similarities based on proteins, computed according to the cosine index, may provide a good starting point for rearranging disease taxonomy and nosology.
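The weighted cosine index the results point to can be sketched as follows (the disease names and protein weights are invented purely for illustration):

```python
import math

def cosine(u, v):
    # Cosine index between two weighted feature vectors stored as
    # sparse dicts (feature -> weight), e.g. protein associations.
    shared = set(u) & set(v)
    num = sum(u[k] * v[k] for k in shared)
    den = (math.sqrt(sum(w * w for w in u.values())) *
           math.sqrt(sum(w * w for w in v.values())))
    return num / den if den else 0.0

# Hypothetical diseases with assumed protein-association weights.
d1 = {"TP53": 0.9, "BRCA1": 0.4, "EGFR": 0.1}
d2 = {"TP53": 0.8, "BRCA1": 0.5}
d3 = {"INS": 1.0, "LEP": 0.7}
print(cosine(d1, d2) > cosine(d1, d3))  # True: d1 and d2 share proteins
```

DBSCAN would then cluster diseases using 1 − cosine as the distance.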


2020 ◽  
Vol 5 ◽  
Author(s):  
Luca Crociani ◽  
Giuseppe Vizzari ◽  
Andrea Gorrini ◽  
Stefania Bandini

Pedestrian behavioural dynamics have been increasingly investigated by means of (semi-)automated computing techniques for almost two decades, exploiting advancements in computing power, sensor accuracy and availability, and computer vision algorithms. This has led to a consensus on the existence of significant differences between unidirectional and bidirectional flows of pedestrians, in which the phenomenon of lane formation seems to play a major role. The collective behaviour of lane formation emerges under conditions of variable density and through a self-organisation dynamic in which pedestrians are induced to follow preceding persons in order to avoid and minimise conflictual situations. Although the formation of lanes is a well-known phenomenon in this field of study, there is still a lack of methods offering the possibility of an (even semi-)automatic identification and a quantitative characterization. In this context, the paper proposes an unsupervised learning approach for the automatic detection of lanes in multi-directional pedestrian flows, based on the DBSCAN clustering algorithm. The reliability of the approach is evaluated through an inter-rater agreement test between the results achieved by a human coder and by the algorithm.
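A minimal sketch of lane detection with textbook DBSCAN, assuming each pedestrian is described by lateral position and walking-direction sign (the paper's actual features and parameters are not given in this abstract):

```python
import math

def dbscan(points, eps, min_pts):
    # Textbook DBSCAN; returns a cluster label per point (-1 = noise).
    def region(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]
    labels, cid = [None] * len(points), 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neigh = region(i)
        if len(neigh) < min_pts:
            labels[i] = -1            # noise (may later become a border point)
            continue
        labels[i], seeds = cid, list(neigh)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cid       # noise upgraded to border point
            if labels[j] is not None:
                continue
            labels[j] = cid
            jn = region(j)
            if len(jn) >= min_pts:    # expand only from core points
                seeds.extend(jn)
        cid += 1
    return labels

# Each pedestrian: (lateral position in m, walking direction sign).
peds = [(0.50, 1), (0.60, 1), (0.55, 1),      # lane moving "up"
        (1.50, -1), (1.60, -1), (1.45, -1)]   # counter-flow lane
labels = dbscan(peds, eps=0.3, min_pts=2)
lanes = len(set(l for l in labels if l >= 0))
print(lanes)  # 2
```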


2021 ◽  
Vol 18 (1) ◽  
pp. 34-57
Author(s):  
Weifeng Pan ◽  
Xinxin Xu ◽  
Hua Ming ◽  
Carl K. Chang

Mashup technology has become a promising way to develop and deliver applications on the web. Automatically organizing Mashups into functionally similar clusters helps improve the performance of Mashup discovery. Although there are many approaches aiming to cluster Mashups, they focus solely on utilizing semantic similarities to guide the Mashup clustering process and are unable to exploit both the structural and semantic information in Mashup profiles. In this paper, a novel approach to clustering Mashups into groups is proposed, which integrates structural similarity and semantic similarity using fuzzy AHP (fuzzy analytic hierarchy process). The structural similarity is computed from the usage histories between Mashups and Web APIs using the SimRank algorithm. The semantic similarity is computed from the descriptions and tags of Mashups using LDA (latent Dirichlet allocation). A clustering algorithm based on a genetic algorithm is employed to cluster Mashups. Comprehensive experiments are performed on a real data set collected from ProgrammableWeb. The results show the effectiveness of the approach when compared with two kinds of conventional approaches.
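A rough sketch of the similarity fusion, under strong simplifying assumptions: structural similarity is reduced to a single SimRank iteration with identity API–API similarities, the semantic similarities are given constants rather than LDA outputs, and the fuzzy-AHP weight is fixed by hand:

```python
def structural_sim(apis_a, apis_b, decay=0.8):
    # One-iteration SimRank on the Mashup-to-Web-API bipartite graph,
    # with API-API similarities initialised to the identity (a
    # simplification; the paper runs full SimRank on usage histories).
    if not apis_a or not apis_b:
        return 0.0
    return decay * len(apis_a & apis_b) / (len(apis_a) * len(apis_b))

def combined_sim(structural, semantic, w_struct=0.4):
    # w_struct would come from fuzzy AHP in the paper; 0.4 is assumed.
    return w_struct * structural + (1 - w_struct) * semantic

# Hypothetical Mashups described by the Web APIs they call.
a = {"maps", "weather"}
b = {"maps", "weather", "sms"}
c = {"payments"}
# Semantic similarities (0.9, 0.2) stand in for LDA topic similarities.
print(combined_sim(structural_sim(a, b), 0.9) >
      combined_sim(structural_sim(a, c), 0.2))  # True
```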


Journal of University of Shanghai for Science and Technology ◽  
2021 ◽  
Vol 23 (09) ◽  
pp. 1105-1121
Author(s):  
Dr. Ashish Kumar Tamrakar ◽  
Dr. Abhishek Verma ◽  
Dr. Vishnu Kumar Mishra ◽  
Dr. Megha Mishra ◽  
...  

Cloud computing is a new model for providing diverse software and hardware services. This paradigm refers to a model for enabling on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal service provider interaction. It helps organizations and individuals deploy IT resources at a reduced total cost. However, the new approaches introduced by clouds, related to computation outsourcing, distributed resources and the multi-tenancy concept, increase security and privacy concerns and challenges. Cloud computing allows users to store their data remotely and then access them at any time from any place. Cloud storage services are used to store data in ways that are considered cost-saving and easy to use. In cloud storage, data are stored on remote servers that are not physically known by the consumer. Thus, users fear uploading their private and confidential files to cloud storage due to security concerns. The usual solution for securing data is encryption, which makes cloud users more comfortable using cloud storage for their data. Motivated by the above facts, we have proposed a solution to address the problem of cloud storage security. In cloud storage, there are public data that do not need any security measures, and there are sensitive data that require security mechanisms to keep them safe. In that context, data classification appears as the solution to this problem. Classifying data into classes with different security requirements for each class is the best way to avoid both under-security and over-security situations. Existing cloud storage systems use the same key size to encrypt all data without taking its confidentiality level into consideration.
Treating low- and high-confidentiality data in the same way and at the same security level adds unnecessary overhead and increases processing time. In our proposal, we have combined the K-NN (K Nearest Neighbors) machine learning method and the goal programming decision-making method to provide an efficient method for data classification. This method allows data to be classified according to the data owner's security needs. Then, we direct the user data to the suitable security mechanisms for each class. The use of our solution in cloud storage systems makes the data security process more flexible; besides, it increases cloud storage system performance and decreases the resources needed to store the data.
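The classification step can be sketched with a plain K-NN majority vote (the features, training data and key-size mapping are illustrative assumptions; the paper additionally combines K-NN with goal programming, which is omitted here):

```python
import math
from collections import Counter

def knn_classify(sample, labelled, k=3):
    # Majority vote among the k nearest labelled examples.
    nearest = sorted(labelled, key=lambda item: math.dist(sample, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical features: (confidentiality score, sharing frequency).
train = [((0.90, 0.10), "high"), ((0.80, 0.20), "high"), ((0.85, 0.15), "high"),
         ((0.10, 0.90), "low"), ((0.20, 0.80), "low"), ((0.15, 0.70), "low")]
cls = knn_classify((0.82, 0.18), train)
key_bits = {"high": 256, "low": 128}[cls]  # stronger key only where needed
print(cls, key_bits)  # high 256
```

Routing only the "high" class to the heavier encryption is what avoids the over-security overhead described above.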


2021 ◽  
pp. 1-18
Author(s):  
Angeliki Koutsimpela ◽  
Konstantinos D. Koutroumbas

Several well known clustering algorithms have their own online counterparts, in order to deal effectively with the big data issue, as well as with the case where the data become available in a streaming fashion. However, very few of them follow the stochastic gradient descent philosophy, despite the fact that the latter enjoys certain practical advantages (such as the possibility of (a) running faster than their batch processing counterparts and (b) escaping from local minima of the associated cost function), while, in addition, strong theoretical convergence results have been established for it. In this paper a novel stochastic gradient descent possibilistic clustering algorithm, called O-PCM2, is introduced. The algorithm is presented in detail and it is rigorously proved that the gradient of the associated cost function tends to zero in the L2 sense, based on general convergence results established for the family of the stochastic gradient descent algorithms. Furthermore, an additional discussion is provided on the nature of the points where the algorithm may converge. Finally, the performance of the proposed algorithm is tested against other related algorithms, on the basis of both synthetic and real data sets.
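O-PCM2's exact cost function is not given in this abstract; a generic stochastic gradient step for a single possibilistic prototype, with assumed fuzzifier m, scale gamma and step-size schedule, might look like:

```python
import random

def pcm_membership(d2, gamma, m=2.0):
    # Possibilistic (typicality) degree: close points get u near 1,
    # distant points near 0; gamma sets the cluster's distance scale.
    return 1.0 / (1.0 + (d2 / gamma) ** (1.0 / (m - 1.0)))

def sgd_step(theta, x, eta, gamma, m=2.0):
    # One stochastic gradient step on u^m * d^2 with u held fixed:
    # the prototype moves towards x, weighted by u^m.
    d2 = sum((t - xi) ** 2 for t, xi in zip(theta, x))
    u = pcm_membership(d2, gamma, m)
    return [t + eta * (u ** m) * (xi - t) for t, xi in zip(theta, x)]

random.seed(0)
stream = [(random.gauss(3, 0.2), random.gauss(3, 0.2)) for _ in range(500)]
theta = list(stream[0])  # possibilistic methods need a sensible start
for t, x in enumerate(stream):
    theta = sgd_step(theta, x, eta=(t + 1) ** -0.6, gamma=0.1)
print(all(abs(c - 3.0) < 0.5 for c in theta))  # prototype near the mode
```

The decaying step size eta is what the stochastic-approximation convergence results mentioned above rely on.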

