cluster purity Latest Research Papers

Building the mega single-cell transcriptome ocular meta-atlas

GigaScience ◽

10.1093/gigascience/giab061 ◽

2021 ◽

Vol 10 (10) ◽

Author(s):

Vinay S Swamy ◽

Temesgen D Fufa ◽

Robert B Hufnagel ◽

David M McGaughey

Keyword(s):

Single Cell ◽

Biological Effects ◽

R Package ◽

Technical Noise ◽

Batch Correction ◽

Workflow System ◽

Cell Transcriptome ◽

Single Cell Transcriptome ◽

Compare And Contrast ◽

Cluster Purity

Abstract Background: The development of highly scalable single-cell transcriptome technology has resulted in the creation of thousands of datasets, more than 30 in the retina alone. Analyzing the transcriptomes between different projects is highly desirable because this would allow for better assessment of which biological effects are consistent across independent studies. However, it is difficult to compare and contrast data across different projects because there are substantial batch effects from computational processing, the single-cell technology utilized, and natural biological variation. While many single-cell transcriptome-specific batch correction methods purport to remove the technical noise, it is difficult to ascertain which method functions best. Results: We developed a lightweight R package (scPOP, single-cell Pick Optimal Parameters) that brings in batch integration methods and uses a simple heuristic to balance batch merging and cell type/cluster purity. We use this package along with a Snakefile-based workflow system to demonstrate how to optimally merge 766,615 cells from 33 retina datasets and 3 species to create a massive ocular single-cell transcriptome meta-atlas. Conclusions: This provides a model for how to efficiently create meta-atlases for tissues and cells of interest.


Identification of 20 polymer types by means of laser-induced breakdown spectroscopy (LIBS) and chemometrics

Analytical and Bioanalytical Chemistry ◽

10.1007/s00216-021-03622-y ◽

2021 ◽

Author(s):

Zuzana Gajarska ◽

Lukas Brunnbauer ◽

Hans Lohninger ◽

Andreas Limbeck

Keyword(s):

Near Infrared ◽

Laser Energy ◽

Real Life ◽

Discrimination Performance ◽

Laser Induced Breakdown Spectroscopy ◽

Forward Selection ◽

Experimental Conditions ◽

Breakdown Spectroscopy ◽

Laser Induced Breakdown ◽

Cluster Purity

Abstract Over the past few years, laser-induced breakdown spectroscopy (LIBS) has earned a lot of attention in the field of online polymer identification. Unlike the well-established near-infrared spectroscopy (NIR), LIBS analysis is not limited by the sample thickness or color and therefore seems to be a promising candidate for this task. Nevertheless, the similar elemental composition of most polymers results in high similarity of their LIBS spectra, which makes their discrimination challenging. To address this problem, we developed a novel chemometric strategy based on a systematic optimization of two factors influencing the discrimination ability: the set of experimental conditions (laser energy, gate delay, and atmosphere) employed for the LIBS analysis and the set of spectral variables used as a basis for the polymer discrimination. In the process, a novel concept of spectral descriptors was used to extract chemically relevant information from the polymer spectra; cluster purity based on the k-nearest neighbors (k-NN) was established as a suitable tool for monitoring the extent of cluster overlaps; and an in-house designed random forest (RDF) experiment combined with a cluster purity–governed forward selection algorithm was employed to identify spectral variables with the greatest relevance for polymer identification. Using this approach, it was possible to discriminate among 20 virgin polymer types, which is the highest number reported in the literature so far. Additionally, using the optimized experimental conditions and data evaluation, robust discrimination performance could be achieved even with polymer samples containing carbon black or other common additives, which hints at the applicability of the developed approach to real-life samples.
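The abstract does not spell out how the k-NN cluster purity is computed. As a minimal sketch, assuming one common formulation (for each sample, the fraction of its k nearest neighbours that carry the same class label, averaged over all samples), the idea can be illustrated as follows. The function name `knn_purity`, the toy coordinates, and the labels are hypothetical and are not taken from the paper.

```python
import math


def knn_purity(points, labels, k=3):
    """Mean fraction of each point's k nearest neighbours sharing its label.

    This is an assumed, generic formulation of k-NN cluster purity; the
    paper's exact definition may differ.
    """
    n = len(points)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    scores = []
    for i in range(n):
        # Euclidean distances from point i to every other point (self excluded).
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        same = sum(labels[j] == labels[i] for j in neighbours)
        scores.append(same / k)
    return sum(scores) / n


# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
clean_labels = ["PE", "PE", "PE", "PP", "PP", "PP"]
# Same geometry, but the third point carries the other group's label,
# which mimics one class overlapping the other.
mixed_labels = ["PE", "PE", "PP", "PP", "PP", "PP"]

separated = knn_purity(points, clean_labels, k=2)
overlapping = knn_purity(points, mixed_labels, k=2)
print(separated, overlapping)
```

A value of 1 means every sample is surrounded only by samples of its own class; lower values indicate overlapping clusters, which is the quantity a purity-governed variable selection would try to raise.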


Building the Mega Single Cell Transcriptome Ocular Meta-Atlas

10.1101/2021.03.26.437190 ◽

2021 ◽

Author(s):

Vinay S Swamy ◽

Temesgen D Fufa ◽

Robert B Hufnagel ◽

David M McGaughey

Keyword(s):

Single Cell ◽

Biological Effects ◽

R Package ◽

Technical Noise ◽

Batch Correction ◽

Workflow System ◽

Cell Transcriptome ◽

Single Cell Transcriptome ◽

Compare And Contrast ◽

Cluster Purity

The development of highly scalable single cell transcriptome technology has resulted in the creation of thousands of datasets, over 30 in the retina alone. Analyzing the transcriptomes between different projects is highly desirable as this would allow for better assessment of which biological effects are consistent across independent studies. However, it is difficult to compare and contrast data across different projects as there are substantial batch effects from computational processing, the single cell technology utilized, and natural biological variation. While many single cell transcriptome-specific batch correction methods purport to remove the technical noise, it is difficult to ascertain which method works best. We developed a lightweight R package (scPOP) that brings in batch integration methods and uses a simple heuristic to balance batch merging and cell type/cluster purity. We use this package along with a Snakefile-based workflow system to demonstrate how to optimally merge 766,615 cells from 34 retina datasets and three species to create a massive ocular single cell transcriptome meta-atlas. This provides a model for how to efficiently create meta-atlases for tissues and cells of interest.


OGRE: Overlap Graph-based metagenomic Read clustEring

Bioinformatics ◽

10.1093/bioinformatics/btaa760 ◽

2020 ◽

Author(s):

Marleen Balvert ◽

Xiao Luo ◽

Ernestina Hauptfeld ◽

Alexander Schönhuth ◽

Bas E Dutilh

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Computation Time ◽

Supplementary Information ◽

Sequencing Data ◽

High Throughput Sequencing Data ◽

Computationally Intensive ◽

Metagenome Sequencing ◽

Species Specific ◽

Cluster Purity

Abstract Motivation: The microbes that live in an environment can be identified from the combined genomic material, also referred to as the metagenome. Sequencing a metagenome can result in large volumes of sequencing reads. A promising approach to reduce the size of metagenomic datasets is by clustering reads into groups based on their overlaps. Clustering reads is valuable for facilitating downstream analyses, including computationally intensive strain-aware assembly. As current read clustering approaches cannot handle the large datasets arising from high-throughput metagenome sequencing, a novel read clustering approach is needed. In this article, we propose OGRE, an Overlap Graph-based Read clustEring procedure for high-throughput sequencing data, with a focus on shotgun metagenomes. Results: We show that for small datasets OGRE outperforms other read binners in terms of the number of species included in a cluster, also referred to as cluster purity, and the fraction of all reads that is placed in one of the clusters. Furthermore, OGRE is able to process metagenomic datasets that are too large for other read binners into clusters with high cluster purity. Conclusion: OGRE is the only method that can successfully cluster reads in species-specific clusters for large metagenomic datasets without running into computation time or memory issues. Availability and implementation: Code is made available on GitHub (https://github.com/Marleen1/OGRE). Supplementary information: Supplementary data are available at Bioinformatics online.
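The two evaluation quantities named in the Results (the number of species per cluster, and the fraction of reads placed in any cluster) can be illustrated with a small sketch. It assumes that a ground-truth species label is known for every read, as it would be in a simulated benchmark; the function `cluster_summary` and the read, species, and cluster names are hypothetical and are not part of OGRE.

```python
from collections import defaultdict


def cluster_summary(read_species, read_cluster):
    """Summarise a read clustering against known species labels.

    read_species: read id -> species name (assumed ground truth)
    read_cluster: read id -> cluster id, or None if the read was left unclustered
    Returns (species count per cluster, fraction of reads that were clustered).
    """
    species_in_cluster = defaultdict(set)
    clustered = 0
    for read, cluster in read_cluster.items():
        if cluster is None:
            continue  # unclustered reads only affect the clustered fraction
        clustered += 1
        species_in_cluster[cluster].add(read_species[read])
    species_counts = {c: len(s) for c, s in species_in_cluster.items()}
    fraction_clustered = clustered / len(read_cluster) if read_cluster else 0.0
    return species_counts, fraction_clustered


# Toy example: five reads, two clusters, one read left unclustered.
read_species = {
    "r1": "E. coli", "r2": "E. coli", "r3": "B. subtilis",
    "r4": "E. coli", "r5": "B. subtilis",
}
read_cluster = {"r1": "c1", "r2": "c1", "r3": "c2", "r4": "c2", "r5": None}

species_counts, fraction_clustered = cluster_summary(read_species, read_cluster)
print(species_counts, fraction_clustered)
```

Under this reading, a cluster with a species count of 1 is species-specific (pure), and the two numbers trade off: a method can obtain very pure clusters by leaving many reads unclustered, which is why both are reported together.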


ROGUE: an entropy-based universal metric for assessing the purity of single cell population

10.1101/819581 ◽

2019 ◽

Cited By ~ 2

Author(s):

Baolin Liu ◽

Chenwei Li ◽

Ziyi Li ◽

Xianwen Ren ◽

Zemin Zhang

Keyword(s):

B Cell ◽

Single Cell ◽

Rna Sequencing ◽

Cell Types ◽

Wide Range ◽

Single Cell Rna Sequencing ◽

Pure Cell ◽

Versatile Tool ◽

Cluster Purity

Abstract Single-cell RNA sequencing (scRNA-seq) is a versatile tool for discovering and annotating cell types and states, but the determination and annotation of cell subtypes is often subjective and arbitrary. Often, it is not even clear whether a given cluster is uniform. Here we present an entropy-based statistic, ROGUE, to accurately quantify the purity of identified cell clusters. We demonstrated that our ROGUE metric is generalizable across datasets and enables accurate, sensitive and robust assessment of cluster purity on a wide range of simulated and real datasets. Applying this metric to fibroblast and B cell datasets, we identified additional subtypes and demonstrated the application of ROGUE-guided analyses to detect true signals in specific subpopulations. ROGUE can be applied to all tested scRNA-seq datasets and has important implications for evaluating the quality of putative clusters, discovering pure cell subtypes and constructing comprehensive, detailed and standardized single cell atlases.


Exploring the Impact of Optimal Clusters on Cluster Purity

2018 3rd International Conference on Communication and Electronics Systems (ICCES) ◽

10.1109/cesys.2018.8724114 ◽

2018 ◽

Author(s):

K.V.S.N. Rama Rao ◽

B. Manjula Josephine

Keyword(s):

The Impact ◽

Cluster Purity


Understanding convolutional neural networks via discriminant feature analysis

APSIPA Transactions on Signal and Information Processing ◽

10.1017/atsip.2018.24 ◽

2018 ◽

Vol 7 ◽

Cited By ~ 2

Author(s):

Hao Xu ◽

Yueru Chen ◽

Ruiyuan Lin ◽

C.-C. Jay Kuo

Keyword(s):

Feature Representation ◽

Feature Analysis ◽

Individual Feature ◽

Discriminative Ability ◽

Multiple Features ◽

Operational Mechanism ◽

The Past ◽

Quantitative Metrics ◽

Cluster Purity ◽

Good Detection

Trained features of a convolutional neural network (CNN) at different convolution layers are analyzed using two quantitative metrics in this work. We first show mathematically that the Gaussian confusion measure (GCM) can be used to identify the discriminative ability of an individual feature. Next, we generalize this idea, introduce another measure called the cluster purity measure (CPM), and use it to analyze the discriminative ability of multiple features jointly. The discriminative ability of trained CNN features is validated by experimental results. Research on CNNs utilizing the GCM and CPM tools offers important insights into their operational mechanism, including the behavior of trained CNN features and the good detection performance of some object classes that were considered difficult in the past. Finally, the trained feature representation is compared between different CNN structures to explain the superiority of deeper networks.


Data labeling method based on cluster purity using relative rough entropy for categorical data clustering

2013 International Conference on Advances in Computing, Communications and Informatics (ICACCI) ◽

10.1109/icacci.2013.6637222 ◽

2013 ◽

Cited By ~ 2

Author(s):

H. Venkateswara Reddy ◽

S. Viswanadha Raju ◽

Pratibha Agrawal

Keyword(s):

Categorical Data ◽

Data Clustering ◽

Labeling Method ◽

Categorical Data Clustering ◽

Cluster Purity


Handling Datasets in a Multi-Relational Environment: Cluster Dispersion vs Cluster Purity

2007 4th IEEE Workshop on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications ◽

10.1109/idaacs.2007.4488404 ◽

2007 ◽

Author(s):

Rayner Alfred ◽

Dimitar Kazakov

Keyword(s):

Cluster Purity

