cluster resolution Latest Research Papers

Assessment of genetic diversity of Musa species accessions with variable genomes using ISSR and SCoT markers

Genetic Resources and Crop Evolution, DOI 10.1007/s10722-021-01202-8, 2021

Author(s): David Okeh Igwe, Onyinye Constance Ihearahu, Anne Adhiambo Osano, George Acquaah, George Nkem Ude

Keyword(s): Genetic Diversity, Gene Flow, Allelic Richness, Crop Improvement, Inter Simple Sequence Repeat, Start Codon, Information Index, Cluster Resolution, Musa Species, Simple Sequence

Abstract: Assessing the effectiveness of different molecular markers is essential for identifying appropriate ones for crop improvement and conservation. Inter-simple sequence repeat (ISSR) and start codon targeted (SCoT) markers were therefore used in this study. Sixty-six accessions with different genomes, obtained from the International Transit Center, Belgium, were used for DNA extraction, amplification with ISSR and SCoT markers, and agarose gel electrophoresis. The reproducible bands were scored for analyses. We identified high allelic richness of 299 (ISSR) and 326 (SCoT). Polymorphic information contents (ISSR: 0.9225; SCoT: 0.9421) were high, with SCoT exhibiting a higher level of informativeness. The two markers demonstrated high percentages of polymorphic loci (ISSR: 91.21–100%; SCoT: 96.97–100%). Other genetic indicators, including effective number of alleles, Nei’s genetic diversity, and Shannon information index, were higher in SCoT and further elucidated the usefulness of the markers. Intraspecific genetic diversity, interspecific genetic diversity, coefficient of gene differentiation and level of gene flow revealed extensive gene flow and larger variability within the accessions. Both ISSR and SCoT grouped the accessions via dendrogram, biplot and structure analyses. Though the two marker systems varied in their informativeness, they demonstrated high effectiveness in resolving genetic diversity (GD) of the different accessions, with higher efficiency in SCoT markers. Due to higher GD indices exhibited by SCoT, AS is the most genetically endowed one. Our study showed that SCoT markers are more informative than ISSR for GD exploration, assessment and cluster resolution of Musa species, thereby revealing the potential of SCoT markers for improved breeding and conservation.
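The abstract names several per-locus indicators (effective number of alleles, Nei's genetic diversity, Shannon information index, percentage of polymorphic loci) without stating how they were computed. The following is a minimal sketch of one common convention for dominant markers scored as band presence/absence, assuming Hardy-Weinberg equilibrium so that the null-allele frequency is the square root of the band-absent fraction. The function names and the input layout are hypothetical and are not taken from the paper.

```python
import math

def diversity_indices(bands):
    """Per-locus indices from a 0/1 band matrix (rows: accessions, columns: loci).

    Assumption (not from the abstract): dominant markers under Hardy-Weinberg
    equilibrium, so the null-allele frequency q = sqrt(fraction with band absent).
    """
    n = len(bands)
    n_loci = len(bands[0])
    results = []
    for j in range(n_loci):
        absent = sum(1 for row in bands if row[j] == 0)
        q = math.sqrt(absent / n)      # estimated null-allele frequency
        p = 1.0 - q                    # estimated band-allele frequency
        homozygosity = p * p + q * q
        results.append({
            "ne": 1.0 / homozygosity,  # effective number of alleles
            "h": 1.0 - homozygosity,   # Nei's gene diversity
            # Shannon information index; zero-frequency terms contribute 0
            "shannon": -sum(f * math.log(f) for f in (p, q) if f > 0),
            # a locus is polymorphic if the band is neither fixed nor absent
            "polymorphic": 0 < absent < n,
        })
    return results

def percent_polymorphic(results):
    """Percentage of loci flagged as polymorphic."""
    return 100.0 * sum(r["polymorphic"] for r in results) / len(results)
```

Calling `diversity_indices` on a small matrix such as four accessions by two loci returns one dictionary per locus; averaging the `h` or `shannon` values over loci gives marker-level summaries of the kind the abstract compares between ISSR and SCoT.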

An efficient and accurate numerical determination of the cluster resolution metric in two dimensions

Journal of Chemometrics, DOI 10.1002/cem.3346, 2021

Author(s): Michael Sorochan Armstrong, A. Paulina Mata, James J. Harynuk

Keyword(s): Two Dimensions, Numerical Determination, Cluster Resolution

ABACUS: A flexible UMI counter that leverages intronic reads for single-nucleus RNAseq analysis

DOI 10.1101/2020.11.13.381624, 2020

Author(s): Simon Xi, Lauren Gibilisco, Markus Kummer, Knut Biber, Astrid Wachter, ...

Keyword(s): Cell Types, Droplet Microfluidics, Rnaseq Data, Total Data, Single Nucleus, Gene Expression Quantification, Cluster Resolution, Different Cell Types, Expression Quantification, Generation Sequencing

Abstract: Single-nucleus RNA sequencing (sNuc-RNAseq) is an emerging powerful genomics technology that combines droplet microfluidics with next-generation sequencing to interrogate transcriptome changes at single nucleus resolution. Here we developed Abacus, a flexible UMI counter software for sNuc-RNAseq analysis. Abacus draws extra information from sequencing reads mapped to introns of pre-mRNAs (~60% of total data) that are ignored by many single-cell RNAseq analysis pipelines. When applied to our pilot human brain sNuc-RNAseq data, ABACUS nearly doubled the number of nuclei identified by the CellRanger workflow, recovering a large number of nuclei from non-neuronal cells. By incorporating intronic reads into gene expression quantification, we showed that they encoded additional and valid transcription features of individual cells and could be used to improve cluster resolution of different cell types. By separately counting UMIs derived from forward and reverse intronic reads and from exonic reads, Abacus gives users flexibility in representing genes expressed at different abundance levels. In summary, Abacus represents a flexible, improved workflow for sNuc-RNAseq data processing and analysis.

Information-theory-based benchmarking and feature selection algorithm improve cell type annotation and reproducibility of single cell RNA-seq data analysis pipelines

DOI 10.1101/2020.11.02.365510, 2020

Author(s): Ziyou Ren, Martin Gerlach, Hanyu Shi, GR Scott Budinger, Luís A. Nunes Amaral

Keyword(s): Information Theory, Feature Selection, Data Analysis, Single Cell, Clustering Algorithms, Rna Seq, Cell Type, Cluster Resolution, Parameter Values, The Impact

Abstract: Single cell RNA sequencing (scRNA-seq) data are now routinely generated in experimental practice because of their promise to enable the quantitative study of biological processes at the single cell level. However, cell type and cell state annotations remain an important computational challenge in analyzing scRNA-seq data. Here, we report on the development of a benchmark dataset where reference annotations are generated independently from transcriptomic measurements. We used this benchmark to systematically investigate the impact on labelling accuracy of different approaches to feature selection, of different clustering algorithms, and of different sets of parameter values. We show that an approach grounded in information theory can provide a general, reliable, and accurate process for discarding uninformative features and optimizing cluster resolution in single cell RNA-seq data analysis.

Differential responses of transplanted stem cells to the diseased environment unveiled by a single molecular NIR II cell tracker

DOI 10.1101/2020.03.12.988295, 2020

Author(s): Hao Chen, Huaxiao Yang, Chen Zhang, Si Chen, Xin Zhao, ...

Keyword(s): Stem Cells, Stem Cell, Cell Therapy, Single Cell, Real Time, Cell Cluster, Real Time Tracking, Cluster Resolution, First Time

Abstract: Stem cell therapy holds great promise in regenerative medicine. The major challenge of clinical translation is to precisely and quantitatively evaluate the in vivo cell distribution, migration, and engraftment, which cannot be easily achieved by current techniques. To address this issue, for the first time, we have developed a single molecular cell tracker with a strong fluorescence signal in the second near-infrared (NIR-II) window (1000-1700 nm) for real-time monitoring of in vivo cell behaviors in both healthy and diseased animal models. The NIR-II tracker (CelTrac1000) has shown complete cell labeling with low cytotoxicity and profound long-term tracking ability for 30 days at high temporospatial resolution for semi-quantification of the biodistribution of primary mesenchymal stem cells and induced pluripotent stem cell-derived endothelial cells. Taking advantage of the unique merits of CelTrac1000, the responses of transplanted stem cells to different diseased environments have been discriminated and unveiled. Furthermore, we also demonstrate CelTrac1000 as a universal and effective technique for ultrafast real-time tracking of cellular migration and distribution at single cell cluster resolution, along with lung contraction and heart beating. As such, this single molecular NIR-II tracker will shift optical cell tracking to single cell cluster and millisecond temporospatial resolution for better evaluating and understanding stem cell therapy, affording optimal doses and efficacy.

Significance Statement: For the first time, we synthesized a NIR-II tracker (CelTrac1000) for ultrafast real-time tracking of the migration trajectory of transplanted mesenchymal stem cells in the circulatory system at single cell cluster resolution. Taking advantage of the merits of CelTrac1000, the responses of transplanted stem cells to different diseased environments, including acute lung injury, myocardial infarction, and middle cerebral artery occlusion, have been discriminated and unveiled in mouse models. As such, our approach can help correlate critical biomedical information in stem cell therapies, such as stem cell dosing and engraftment and their relationships with efficacy, providing more accurate therapeutic treatment and outcomes in certain diseases during a long evaluation period (>30 days) in comparison with the commercial Qtracker (7-10 days).

Improving replicability in single-cell RNA-Seq cell type discovery with Dune

DOI 10.1101/2020.03.03.974220, 2020

Author(s): Hector Roux de Bézieux, Kelly Street, Stephan Fischer, Koen Van den Berge, Rebecca Chance, ...

Keyword(s): Single Cell, Ad Hoc, Clustering Algorithms, Cell Types, Optimal Choice, Number Of Clusters, Trade Off, Tuning Parameters, Original Dataset, Cluster Resolution

Abstract: Single-cell transcriptome sequencing (scRNA-Seq) has allowed many new types of investigations at unprecedented and unique levels of resolution. Among the primary goals of scRNA-Seq is the classification of cells into potentially novel cell types. Many approaches build on the existing clustering literature to develop tools specific to single-cell applications. However, almost all of these methods rely on heuristics or user-supplied parameters to control the number of clusters identified. This affects both the resolution of the clusters within the original dataset and their replicability across datasets. While many recommendations exist to select these tuning parameters, most of them are quite ad hoc. In general, there is little assurance that any given set of parameters will represent an optimal choice in the ever-present trade-off between cluster resolution and replicability. For instance, it may be the case that another set of parameters will result in more clusters that are also more replicable, or in fewer clusters that are also less replicable.

Here, we propose a new method called Dune for optimizing the trade-off between the resolution of the clusters and their replicability across datasets. Our method takes as input a set of clustering results on a single dataset, derived from any set of clustering algorithms and associated tuning parameters, and iteratively merges clusters within partitions in order to maximize their concordance between partitions. As demonstrated on a variety of scRNA-Seq datasets from different platforms, Dune outperforms existing techniques that rely on hierarchical merging for reducing the number of clusters, in terms of replicability of the resultant merged clusters. It provides an objective approach for identifying replicable consensus clusters most likely to represent common biological features across multiple datasets.
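The merging idea described in this abstract can be illustrated with a short sketch. It is not the Dune implementation: it greedily merges the pair of clusters, within any one partition, that most increases the mean pairwise concordance, and it uses the adjusted Rand index (ARI) as the concordance measure, which is an assumption here because the abstract only says "concordance". The function names are hypothetical, and labels are assumed to be sortable values such as integers.

```python
from collections import Counter
from itertools import combinations
from math import comb

def ari(x, y):
    """Adjusted Rand index between two label lists of equal length."""
    n = len(x)
    pairs = sum(comb(c, 2) for c in Counter(zip(x, y)).values())
    a = sum(comb(c, 2) for c in Counter(x).values())
    b = sum(comb(c, 2) for c in Counter(y).values())
    expected = a * b / comb(n, 2)
    max_index = (a + b) / 2
    if max_index == expected:  # degenerate case, e.g. both have one cluster
        return 1.0
    return (pairs - expected) / (max_index - expected)

def mean_ari(partitions):
    """Mean ARI over all pairs of partitions."""
    scores = [ari(p, q) for p, q in combinations(partitions, 2)]
    return sum(scores) / len(scores)

def greedy_merge(partitions):
    """Merge cluster pairs within partitions while the mean ARI improves."""
    partitions = [list(p) for p in partitions]
    current = mean_ari(partitions)
    while True:
        best = None
        for k, part in enumerate(partitions):
            for c1, c2 in combinations(sorted(set(part)), 2):
                # Relabel cluster c2 as c1 in partition k only.
                trial = [c1 if lab == c2 else lab for lab in part]
                candidate = partitions[:k] + [trial] + partitions[k + 1:]
                score = mean_ari(candidate)
                if score > current + 1e-12 and (best is None or score > best[0]):
                    best = (score, k, trial)
        if best is None:  # no merge improves concordance: stop
            return partitions, current
        current, k, trial = best
        partitions[k] = trial
```

For example, given one partition that splits six cells into two clusters and another that over-splits the first of those clusters, `greedy_merge` would merge the over-split pair and stop once no further merge raises the mean ARI.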

Genetic diversity and delineation of Salmonella Agona outbreak strains by next generation sequencing, Bavaria, Germany, 1993 to 2018

Eurosurveillance, DOI 10.2807/1560-7917.es.2019.24.18.1800303, 2019, Vol 24 (18), Cited By ~ 4

Author(s): Alexandra Dangel, Anja Berger, Ute Messelhäußer, Regina Konrad, Stefan Hörmansdorfer, ...

Keyword(s): Genetic Diversity, Next Generation Sequencing, Animal Feed, Next Generation, Homogeneous Sample, Outbreak Investigations, Cluster Resolution, Species Specific, Ngs Data, Generation Sequencing

Background: In 2017, a food-borne Salmonella Agona outbreak caused by infant milk products from a French supplier occurred in Europe. Simultaneously, S. Agona was detected in animal feed samples in Bavaria.

Aim: Using next generation sequencing (NGS) and three data analysis methods, this study’s objectives were to verify clonality of the Bavarian feed strains, rule out their connection to the outbreak, explore the genetic diversity of Bavarian S. Agona isolates from 1993 to 2018 and compare the analysis approaches employed for practicality and ability to delineate outbreaks caused by the genetically monomorphic Agona serovar.

Methods: In this observational retrospective study, three 2017 Bavarian feed isolates were compared to a French outbreak isolate and 48 S. Agona isolates from our strain collections. The latter included human, food, feed, veterinary and environmental isolates, of which 28 were epidemiologically outbreak related. All isolates were subjected to NGS and analysed by: (i) a publicly available species-specific core genome multilocus sequence typing (cgMLST) scheme, (ii) single nucleotide polymorphism phylogeny and (iii) an in-house serovar-specific cgMLST scheme. Using additional international S. Agona outbreak NGS data, the cluster resolution capacity of the two cgMLST schemes was assessed.

Results: We could prove clonality of the feed isolates and exclude their relation to the French outbreak. All approaches confirmed former Bavarian epidemiological clusters.

Conclusion: Even for S. Agona, species-level cgMLST can produce reasonable resolution, being standardisable by public health laboratories. For single samples or homogeneous sample sets, higher resolution by serovar-specific cgMLST or SNP genotyping can facilitate outbreak investigations.

Finding Maternal Siblings in Birth Registration Data to form a Pregnancy Spine – Data Linkage & Graph Based Methods for Unknown Cluster Sizes

International Journal for Population Data Science, DOI 10.23889/ijpds.v3i4.894, 2018, Vol 3 (4)

Author(s): Shelley Gammon, Charles Morris

Keyword(s): Community Detection, Data Linkage, Error Rates, Detection Methods, Birth Registration, Sibling Pairs, Detection Techniques, Registration Data, Cluster Resolution, Sibling Groups

Introduction: We have developed an innovative methodology to link maternal siblings within 2000–2005 England and Wales Birth Registration data, to form a Pregnancy Spine, a unification of all births to each unique mother. Key challenges in this many-to-many linkage scenario were blocking (reduction of record pair comparisons) and cluster resolution.

Objectives and Approach: Probabilistic data linkage (Python) was followed by generation of clusters (using igraph in R) and graph theory community detection techniques. To optimise geographical blocking and increase accuracy, we incorporated Internal Migration data to map the likely geographic movement of mothers between births. Maternal sibling clusters were modelled as a graph and the structure of clusters was optimised using community detection methods to link, split and evaluate sibling groups. We also incorporated childhood statistics data relating to child date of birth to evaluate the likely accuracy of sibling pairs and remove false edges (links).

Results: Our development has resulted in a new blocking method and cluster resolution method. In addition, we developed new ways to assess and measure the accuracy of sibling groups, beyond traditional classifier metrics, and infer error rates. We applied our method to Registration Data used in earlier studies for QA of our methods. Using this, and by comparing against other statistics on maternal sibling composition, we will present results which show that a high degree of accuracy was obtained for precision, recall, and the new evaluation checks.

Conclusion/Implications: These methods will improve other linkage projects with unknown cluster sizes: de-duplicating datasets, linkage of multiple datasets, or incorporation of data from a longer time period through longitudinal linkage. To this Spine, researchers can now append and link other data sources to answer questions about maternal and child health outcomes.
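The step of turning pairwise links into sibling groups can be sketched as follows. This is only an illustration of the general idea, not the authors' method: it drops candidate links whose score falls below a cut-off and takes the connected components of what remains, using a union-find structure. The record identifiers, the link scores, and the `threshold` parameter are hypothetical, and the community detection and date-of-birth checks described in the abstract are not reproduced.

```python
def sibling_clusters(links, threshold):
    """Group records into clusters from scored pairwise links.

    links: iterable of (record_a, record_b, score) tuples (hypothetical layout).
    threshold: links scoring below this value are treated as false edges.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b, score in links:
        root_a, root_b = find(a), find(b)  # registers both records
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b        # union the two groups

    groups = {}
    for record in list(parent):
        groups.setdefault(find(record), set()).add(record)
    # Return clusters in a deterministic order for easier inspection.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])
```

With a chain of links where one middle link has a low score, the chain splits into two groups at that link, which is the simplest form of resolving one over-merged cluster into separate sibling groups.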

Finding Maternal Siblings in Birth Registration Data to form a Pregnancy Spine – Data Linkage & Graph Based Methods for Unknown Cluster Sizes

International Journal for Population Data Science, DOI 10.23889/ijpds.v3i2.543, 2018, Vol 3 (2)

Author(s): Charles Tomlin, Shelley Gammon, Charles Morris, Charlotte O'Brien

Keyword(s): Detection Methods, Birth Registration, Child Health Outcomes, Registration Data, Time Period, Innovative Methodology, Probabilistic Linkage, Sibling Composition, Cluster Resolution, High Degree

We have developed an innovative methodology to link maternal siblings within 2000-2005 England and Wales Birth Registration data, to form a Pregnancy Spine, a unification of all births to each unique mother. Key challenges were blocking and cluster resolution. To optimise geographic blocking, Internal Migration data was incorporated to map likely geographic movement of mothers between births. Following probabilistic linkage, sibling clusters were modelled as a graph and their structure optimised using community detection methods. Childhood statistics data relating to child DOB were incorporated to evaluate accuracy and remove false links. Our development has resulted in a new blocking and cluster resolution method. We developed new ways to assess sibling group accuracy, beyond traditional classifier metrics, and infer error rates. We applied our method to Registration Data used in earlier studies for QA of our methods. Using this, and other maternal sibling composition statistics, we present results showing that a high degree of accuracy was obtained for standard and new evaluation metrics. These methods will improve other linkage projects involving unknown cluster sizes, multiple datasets, or longitudinal linkage over longer time periods. To this Spine, researchers can append and link other data sources to answer questions about maternal and child health outcomes.

Estimation of start and stop numbers for cluster resolution feature selection algorithm: an empirical approach using null distribution analysis of Fisher ratios

Analytical and Bioanalytical Chemistry, DOI 10.1007/s00216-017-0628-8, 2017, Vol 409 (28), pp. 6699-6708, Cited By ~ 3

Author(s): Lawrence A. Adutwum, A. Paulina de la Mata, Heather D. Bean, Jane E. Hill, James J. Harynuk

Keyword(s): Feature Selection, Null Distribution, Empirical Approach, Distribution Analysis, Selection Algorithm, Feature Selection Algorithm, Cluster Resolution
