Machine learning reveals multiple classes of diamond nanoparticles

2020 ◽  
Vol 5 (10) ◽  
pp. 1394-1399
Author(s):  
Amanda J. Parker ◽  
Amanda S. Barnard

Unsupervised clustering and supervised classification of a diverse set of reconstructed, twinned and passivated diamond nanoparticles predict nine classes that have distinctly different characteristics and electronic properties.

Author(s):  
Hyeuk Kim

Unsupervised learning divides data into several groups such that observations in the same group have similar characteristics and observations in different groups have different characteristics. In this paper, we cluster data by partitioning around medoids, which has some advantages over k-means clustering. We apply it to baseball players in the Korea Baseball League. We also apply principal component analysis to the data and plot the result using two components as the axes. This procedure lets us interpret the clustering graphically. The combination of partitioning around medoids and principal component analysis can be applied to any other data, and the approach makes the characteristics of each group easy to identify.
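As a rough illustration of partitioning around medoids, the sketch below runs a greedy swap search on a handful of 2-D points. It is a simplified variant, not the paper's procedure: the starting medoids are simply the first k points (the full algorithm uses a more careful build step), and the function names and the `players` data are hypothetical.

```python
import math
from itertools import product


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def pam_swap(points, k):
    """Greedy swap search over medoid indices (simplified PAM)."""
    # Assumption: start from the first k points; PAM's BUILD step picks better seeds.
    medoids = list(range(k))
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for slot, candidate in product(range(k), range(len(points))):
            if candidate in medoids:
                continue
            trial = medoids[:slot] + [candidate] + medoids[slot + 1:]
            cost = total_cost(points, trial)
            if cost < best - 1e-12:  # accept any swap that lowers the total cost
                medoids, best, improved = trial, cost, True
    # Assign each point to the index of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: math.dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


# Hypothetical 2-D data, e.g. two standardized statistics per player.
players = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = pam_swap(players, k=2)
print(sorted(medoids), labels)
```

Unlike k-means centroids, each medoid is an actual observation, so a cluster can be summarized by a real, representative data point.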


2021 ◽  
Vol 9 (5) ◽  
pp. 1034
Author(s):  
Carlos Sabater ◽  
Lorena Ruiz ◽  
Abelardo Margolles

This study aimed to recover metagenome-assembled genomes (MAGs) from human fecal samples to characterize the glycosidase profiles of Bifidobacterium species exposed to different prebiotic oligosaccharides (galacto-oligosaccharides, fructo-oligosaccharides and human milk oligosaccharides, HMOs) as well as high-fiber diets. A total of 1806 MAGs were recovered from 487 infant and adult metagenomes. Unsupervised and supervised classification of glycosidases encoded in MAGs using machine-learning algorithms allowed characteristic hydrolytic profiles to be established for B. adolescentis, B. bifidum, B. breve, B. longum and B. pseudocatenulatum, yielding classification rates above 90%. Glycosidase families GH5_44, GH32, and GH110 were characteristic of B. bifidum. The presence or absence of GH1, GH2, GH5 and GH20 was characteristic of B. adolescentis, B. breve and B. pseudocatenulatum, while families GH1 and GH30 were relevant in MAGs from B. longum. These characteristic profiles allowed bifidobacteria to be discriminated regardless of prebiotic exposure. Correlation analysis of glycosidase activities suggests strong associations between glycosidase families comprising HMO-degrading enzymes, which are often found in MAGs from the same species. The mathematical models proposed here may contribute to a better understanding of the carbohydrate metabolism of some common bifidobacteria species and could be extrapolated to other microorganisms of interest in future studies.


2020 ◽  
Vol 2020 ◽  
pp. 1-7
Author(s):  
Alejandro-Israel Barranco-Gutiérrez

Image analysis of the brain with machine learning remains relevant for detecting different characteristics of this complex organ. Recent research has observed differences in the structure of the brain, specifically in white matter, associated with learning and using a second language. This work studies the brain through the classification of magnetic resonance images (MRIs) of bilingual and monolingual people who have English as their common language. Different artificial neural networks with one hidden layer were tested until reaching two neurons in that layer. The number of inputs used was nine hundred, and the classifier achieved a high percentage of effectiveness. The training was supervised, which could be improved in a future investigation. This task is usually carried out by a human expert using Tract-Based Spatial Statistics analysis and fractional anisotropy displayed in different colors on a screen. This proposal therefore presents another option for quantitatively analysing this type of phenomenon, contributing to neuroscience by automatically distinguishing bilingual from monolingual people using machine learning on MRIs. This reinforces what is reported in manual detections and shows how a machine can do it.


2021 ◽  
Vol 10 (5) ◽  
pp. e13110514732
Author(s):  
Paulo César Ossani ◽  
Diogo Francisco Rossoni ◽  
Marcelo Ângelo Cirillo ◽  
Flávio Meira Borém

Specialty coffees are of great economic importance, and their sensory quality is valued by the productive sector and by the market. Research is constantly carried out in search of better blends in order to add value and differentiate prices according to product quality. To accomplish that, new methodologies must be explored, taking into consideration factors that might differentiate the particularities of each consumer and/or product. Thus, this article suggests the use of machine learning techniques in the construction of supervised classification and identification models. In a sensory evaluation test for consumer acceptance using four classes of specialty coffees, applied to four groups of trained and untrained consumers, features such as flavor, body, sweetness and general grade were evaluated. The use of machine learning is viable because it allows the classification and identification of specialty coffees produced at different altitudes and with different processing methods.


Energies ◽  
2018 ◽  
Vol 11 (9) ◽  
pp. 2235 ◽  
Author(s):  
Zigui Jiang ◽  
Rongheng Lin ◽  
Fangchun Yang

Time-series smart meter data can precisely record the electricity consumption behavior of every consumer in the smart grid system. A better understanding of consumption behaviors and an effective consumer categorization based on the similarity of these behaviors can be helpful for flexible demand management and effective energy control. In this paper, we propose a hybrid machine learning model including both unsupervised clustering and supervised classification for categorizing consumers based on the similarity of their typical electricity consumption behaviors. An unsupervised clustering algorithm is used to extract the typical electricity consumption behaviors and perform fuzzy consumer categorization, followed by a proposed novel algorithm to identify distinct consumer categories and their consumption characteristics. A supervised classification algorithm is used to classify new consumers and evaluate the validity of the identified categories. The proposed model is applied to a real dataset of U.S. non-residential consumers collected by smart meters over one year. The results indicate that large or special institutions usually have their own distinct consumption characteristics, while others, such as some medium and small institutions or similar building types, may share the same characteristics. Moreover, comparison with other methods shows the improved performance of the proposed model in terms of category identification and classification accuracy.
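To make the idea of fuzzy categorization concrete, the sketch below computes soft memberships of one load profile in several typical consumption patterns. It assumes the fuzzy c-means membership formula, which the abstract does not confirm as the authors' choice; the function name, the fuzzifier `m`, and the load profiles are all hypothetical.

```python
import math


def fuzzy_memberships(profile, centroids, m=2.0):
    """Fuzzy c-means style membership of one profile in each centroid.

    Memberships are non-negative and sum to 1; m > 1 controls fuzziness.
    """
    dists = [math.dist(profile, c) for c in centroids]
    # A profile that coincides with a centroid belongs fully to it.
    if any(d == 0 for d in dists):
        hits = [1.0 if d == 0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


# Hypothetical typical patterns (normalized load at two times of day).
typical_patterns = [(0.0, 0.0), (3.0, 0.0)]
new_consumer = (1.0, 0.0)

memberships = fuzzy_memberships(new_consumer, typical_patterns)
hard_category = max(range(len(memberships)), key=memberships.__getitem__)
print(memberships, hard_category)
```

A hard category can be read off as the pattern with the largest membership, while the remaining memberships indicate how strongly the consumer also resembles other patterns.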


2021 ◽  
Author(s):  
Shenjun Zhong ◽  
Zhaolin Chen ◽  
Gary Egan

Parcellation of the whole-brain tractogram is a critical step in studying brain white matter structures and connectivity patterns. Existing methods based on supervised classification of streamlines into predefined streamline bundle types are not designed to explore sub-bundle structures, and methods with manually designed features make streamline-wise similarities expensive to compute. To resolve these issues, we propose a novel atlas-free method that learns a latent space using a deep recurrent autoencoder, which efficiently embeds streamlines of any length into fixed-size feature vectors, namely streamline embeddings, and enables tractogram parcellation via unsupervised clustering in the latent space. The method is evaluated on the ISMRM 2015 tractography challenge dataset and shows the ability to discriminate major bundles with unsupervised clustering and to query streamlines based on similarity. The learnt latent representations of streamlines and bundles also open the possibility of quantitatively studying sub-bundle structures at any granularity with generic data mining techniques.


2021 ◽  
Vol 10 (3) ◽  
pp. 187
Author(s):  
Muhammed Enes Atik ◽  
Zaide Duran ◽  
Dursun Zafer Seker

3D scene classification has become an important research field in photogrammetry, remote sensing, computer vision and robotics with the widespread usage of 3D point clouds. Point cloud classification, also called semantic labeling, semantic segmentation, or semantic classification of point clouds, is a challenging topic. Machine learning, on the other hand, is a powerful mathematical tool used to classify 3D point clouds whose content can be significantly complex. In this study, the classification performance of different machine learning algorithms at multiple scales was evaluated. The feature spaces of the points in the point cloud were created using geometric features generated from the eigenvalues of the covariance matrix. Eight supervised classification algorithms were tested in four different areas from three datasets (the Dublin City dataset, Vaihingen dataset and Oakland3D dataset). The algorithms were evaluated in terms of overall accuracy, precision, recall, F1 score and process time. The best overall results for the four test areas were obtained with different algorithms: 93.12% for Dublin City Area 1 with Random Forest, 92.78% for Dublin City Area 2 with a Multilayer Perceptron, 79.71% for Vaihingen with Support Vector Machines, and 97.30% for Oakland3D with Linear Discriminant Analysis.
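The eigenvalue-based features mentioned above can be sketched as follows. The abstract does not list which features were used, so linearity, planarity and sphericity are assumed here as commonly used examples; the function names and sample neighbourhoods are hypothetical. The eigenvalues of the 3x3 covariance matrix are computed with the closed-form trigonometric method for symmetric matrices.

```python
import math


def covariance3(points):
    """Population covariance matrix of a list of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    return [
        [sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
         for j in range(3)]
        for i in range(3)
    ]


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def eigenvalues_sym3(a):
    """Eigenvalues of a symmetric 3x3 matrix, largest first."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0:  # matrix is already diagonal
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2 * p1
    p = math.sqrt(p2 / 6)
    b = [[(a[i][j] - (q if i == j else 0)) / p for j in range(3)]
         for i in range(3)]
    r = max(-1.0, min(1.0, det3(b) / 2))  # clamp against rounding error
    phi = math.acos(r) / 3
    e1 = q + 2 * p * math.cos(phi)
    e3 = q + 2 * p * math.cos(phi + 2 * math.pi / 3)
    return [e1, 3 * q - e1 - e3, e3]


def geometric_features(points):
    """Assumed feature set; requires a neighbourhood with non-zero spread."""
    l1, l2, l3 = eigenvalues_sym3(covariance3(points))
    return {
        "linearity": (l1 - l2) / l1,
        "planarity": (l2 - l3) / l1,
        "sphericity": l3 / l1,
    }


# Hypothetical neighbourhoods: points along a line and points on a flat patch.
line = [(0, 0, 0), (1, 1, 0), (2, 2, 0)]
patch = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
line_features = geometric_features(line)
patch_features = geometric_features(patch)
print(line_features, patch_features)
```

Computing such features for neighbourhoods of several sizes around each point gives one feature vector per scale, which can then be passed to any supervised classifier.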


2021 ◽  
pp. 1-109
Author(s):  
Thang N. Ha ◽  
David Lubo-Robles ◽  
Kurt J. Marfurt ◽  
Bradley C. Wallet

In a machine learning workflow, data normalization is a crucial step that compensates for the large variation in data ranges and averages associated with different types of input measured with different units. However, most machine learning implementations do not provide data normalization beyond the z-score algorithm which subtracts the mean from the distribution and then scales the result by dividing by the standard deviation. Although z-score converts data with Gaussian behavior to have the same shape and size, many of our seismic attribute volumes exhibit log-normal, or even more complicated distributions. Because many machine learning applications are based on Gaussian statistics, we wish to evaluate the impact of more sophisticated data normalization techniques on the resulting classification. To do so, we provide an in-depth analysis of data normalization in machine-learning classifications by formulating and applying a logarithmic data transformation scheme to the unsupervised classifications (including PCA, ICA, SOM, and GTM) of a turbidite channel system in the Canterbury Basin, New Zealand, as well as implementing a per-class normalization scheme to the supervised probabilistic neural network (PNN) classification of salt in the Eugene Island mini-basin, Gulf of Mexico. Compared to the simple z-score normalization, a single logarithmic transformation applied to each input attribute significantly increases the spread of the resulting clusters (and corresponding color contrast), thereby enhancing subtle details in projection and unsupervised classification. However, this same uniform transformation produces less-confident results in supervised classification using probabilistic neural networks. We find that more accurate supervised classifications can be found by applying class-dependent normalization for each input attribute.
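The difference between plain z-score normalization and a logarithmic transformation can be sketched on a single attribute. This is a minimal illustration, not the authors' scheme: it assumes a strictly positive attribute so that a plain natural logarithm is defined, and the function names and sample values are hypothetical.

```python
import math
import statistics


def z_score(values):
    """Subtract the mean, then divide by the (population) standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


def log_then_z_score(values):
    """Assumption: all values are strictly positive, so log() is defined."""
    return z_score([math.log(v) for v in values])


# Hypothetical attribute with a heavy right tail (roughly log-normal).
attribute = [1.0, 10.0, 100.0, 1000.0, 10000.0]

z_raw = z_score(attribute)
z_log = log_then_z_score(attribute)
print([round(z, 2) for z in z_raw])
print([round(z, 2) for z in z_log])
```

On skewed input, the plain z-score leaves most samples bunched near one end of the range, whereas the log-transformed values are spread evenly around zero.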

