FlowFP: A Bioconductor Package for Fingerprinting Flow Cytometric Data

2009 ◽  
Vol 2009 ◽  
pp. 1-11 ◽  
Author(s):  
Wade T. Rogers ◽  
Herbert A. Holyst

A new software package called flowFP for the analysis of flow cytometry data is introduced. The package, which is tightly integrated with other Bioconductor software for analysis of flow cytometry, provides tools to transform raw flow cytometry data into a form suitable for direct input into conventional statistical analysis and empirical modeling software tools. The approach of flowFP is to generate a description of the multivariate probability distribution function of flow cytometry data in the form of a “fingerprint.” As such, it is independent of a presumptive functional form for the distribution, in contrast with model-based methods such as Gaussian Mixture Modeling. FlowFP is computationally efficient and able to handle extremely large flow cytometry data sets of arbitrary dimensionality. Algorithms and software implementation of the package are described. Use of the software is exemplified with applications to data quality control and to the automated classification of Acute Myeloid Leukemia.
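
The fingerprinting idea described above can be sketched in a few lines. This is a minimal illustration, not the flowFP API: it assumes that the "fingerprint" is built by recursively splitting a training sample at the median of its highest-variance channel, and that each new sample is then described by the fraction of its events falling in each resulting bin. The function names `fit_bins`, `bin_index`, and `fingerprint` are hypothetical.

```python
from statistics import median, pvariance


def fit_bins(events, levels):
    """Build a binary split tree: at each level, cut the events at the
    median of the channel with the largest variance (assumed strategy)."""
    if levels == 0 or len(events) < 2:
        return None  # leaf: no further subdivision
    n_dims = len(events[0])
    dim = max(range(n_dims), key=lambda d: pvariance([e[d] for e in events]))
    cut = median(e[dim] for e in events)
    low = [e for e in events if e[dim] <= cut]
    high = [e for e in events if e[dim] > cut]
    return (dim, cut, fit_bins(low, levels - 1), fit_bins(high, levels - 1))


def bin_index(model, event, levels):
    """Walk the split tree and return the index of the bin holding `event`."""
    idx, node, depth = 0, model, 0
    while node is not None:
        dim, cut, low, high = node
        go_high = event[dim] > cut
        idx = 2 * idx + int(go_high)
        node = high if go_high else low
        depth += 1
    # Pad the index if the tree stopped early, so bins stay comparable.
    return idx << (levels - depth)


def fingerprint(model, events, levels):
    """Fraction of events per bin: a fixed-length vector for any sample."""
    counts = [0] * (2 ** levels)
    for e in events:
        counts[bin_index(model, e, levels)] += 1
    return [c / len(events) for c in counts]
```

Because every sample is mapped to a vector of the same length, the output can be passed directly to conventional statistical or classification tools, which is the property the abstract emphasizes. No distributional form is assumed at any step.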

mSphere ◽  
2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Peter Rubbens ◽  
Ruben Props ◽  
Frederiek-Maarten Kerckhof ◽  
Nico Boon ◽  
Willem Waegeman

ABSTRACT Microbial flow cytometry can rapidly characterize the status of microbial communities. Upon measurement, large amounts of quantitative single-cell data are generated, which need to be analyzed appropriately. Cytometric fingerprinting approaches are often used for this purpose. Traditional approaches either require a manual annotation of regions of interest, do not fully consider the multivariate characteristics of the data, or result in many community-describing variables. To address these shortcomings, we propose an automated model-based fingerprinting approach based on Gaussian mixture models, which we call PhenoGMM. The method successfully quantifies changes in microbial community structure based on flow cytometry data, which can be expressed in terms of cytometric diversity. We evaluate the performance of PhenoGMM using data sets from both synthetic and natural ecosystems and compare the method with a generic binning fingerprinting approach. PhenoGMM supports the rapid and quantitative screening of microbial community structure and dynamics.
IMPORTANCE Microorganisms are vital components in various ecosystems on Earth. In order to investigate the microbial diversity, researchers have largely relied on the analysis of 16S rRNA gene sequences from DNA. Flow cytometry has been proposed as an alternative technology to characterize microbial community diversity and dynamics. The technology enables a fast measurement of optical properties of individual cells. So-called fingerprinting techniques are needed in order to describe microbial community diversity and dynamics based on flow cytometry data. In this work, we propose a more advanced fingerprinting strategy based on Gaussian mixture models. We evaluated our workflow on data sets from both synthetic and natural ecosystems, illustrating its general applicability for the analysis of microbial flow cytometry data. PhenoGMM supports a rapid and quantitative analysis of microbial community structure using flow cytometry.
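
To make "cytometric diversity" concrete, the sketch below turns a fingerprint into diversity values. It rests on two assumptions that the abstract does not state: that the fingerprint is a vector of cell counts per mixture component, and that diversity is summarized with Hill numbers of order 0, 1, and 2. The function name `hill_numbers` is hypothetical.

```python
import math


def hill_numbers(counts):
    """Return Hill diversity numbers (D0, D1, D2) for per-cluster cell counts.

    D0 is the number of non-empty clusters, D1 is the exponential of the
    Shannon entropy, and D2 is the inverse Simpson index.
    """
    total = sum(counts)
    p = [c / total for c in counts if c > 0]
    d0 = len(p)
    d1 = math.exp(-sum(x * math.log(x) for x in p))
    d2 = 1.0 / sum(x * x for x in p)
    return d0, d1, d2
```

A perfectly even fingerprint over four clusters gives a value of 4 at every order, while a sample concentrated in one cluster gives 1, so lower values indicate a community dominated by fewer phenotypes.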


2019 ◽  
Vol 153 (3) ◽  
pp. 322-327 ◽  
Author(s):  
Gaurav K Gupta ◽  
Xiaoping Sun ◽  
Constance M Yuan ◽  
Maryalice Stetler-Stevenson ◽  
Robert J Kreitman ◽  
...  

Abstract Objectives We evaluated efficacy of two dual immunohistochemistry (IHC) staining assays in assessing hairy cell leukemia (HCL) involvement in core biopsies and compared the results with concurrently collected flow cytometric data. Methods Overall, 148 patients with HCL (123 male, 25 female; mean age: 59.8 years; range: 25-81 years) had multiparameter flow cytometry performed using CD19, CD20, CD22, CD11c, CD25, CD103, CD123, surface light chains, CD5, and CD23. In parallel, bone marrow IHC was done using PAX5/CD103 and PAX5/tartrate-resistant alkaline phosphatase (TRAP) dual IHC stains. Results Overall sensitivity of dual IHC stains was 81.4%, positive predictive value was 100%, and negative predictive value was 81.7%. All IHC-positive cases concurred with flow cytometry data, even when HCL burden was extremely low in the flow cytometry specimens (as low as 0.02% of all lymphoid cells). Conclusions Dual IHC stain is a sensitive tool in detecting HCL, even in cases with minimal disease involvement.
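
For readers less familiar with the reported metrics, the sketch below shows how sensitivity, positive predictive value, and negative predictive value follow from a 2x2 table with flow cytometry as the reference. The abstract does not give the underlying counts, so the numbers in the usage note are purely hypothetical and are not an attempt to reconstruct the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute sensitivity, PPV and NPV from confusion-matrix counts.

    tp/fn: reference-positive cases called positive/negative by the test.
    fp/tn: reference-negative cases called positive/negative by the test.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
```

With hypothetical counts `tp=40, fp=0, tn=50, fn=10`, the function returns a sensitivity of 0.8, a PPV of 1.0, and an NPV of 50/60. A PPV of 100%, as reported above, simply means that no IHC-positive case was negative by flow cytometry.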


2021 ◽  
Author(s):  
Nanditha Mallesh ◽  
Max Zhao ◽  
Lisa Meintker ◽  
Alexander Höllein ◽  
Franz Elsner ◽  
...  

Abstract Multi-parameter flow cytometry (MFC) is a cornerstone in clinical decision making for hematological disorders such as leukemia or lymphoma. MFC data analysis requires trained experts to manually gate cell populations of interest, which is time-consuming and subjective. Manual gating is often limited to a two-dimensional space. In recent years, deep learning models have been developed to analyze the data in high-dimensional space and are highly accurate. Such models have been used successfully in histology, cytopathology, image flow cytometry, and conventional MFC analysis. However, current AI models used for subtype classification based on MFC data are limited to the antibody (flow cytometry) panel they were trained on. Thus, a key challenge in deploying AI models into routine diagnostics is the robustness and adaptability of such models. In this study, we present a workflow to extend our previous model to four additional MFC panels. We employ knowledge transfer to adapt the model to smaller data sets. We trained models for each of the data sets by transferring the features learned from our base model. With our workflow, we could increase the model’s overall performance and, more prominently, increase the learning rate for very small training sizes.


2009 ◽  
Vol 2009 ◽  
pp. 1-10 ◽  
Author(s):  
Errol Strain ◽  
Florian Hahne ◽  
Ryan R. Brinkman ◽  
Perry Haaland

Flow cytometry (FCM) software packages from R/Bioconductor, such as flowCore and flowViz, serve as an open platform for development of new analysis tools and methods. We created plateCore, a new package that extends the functionality in these core packages to enable automated negative control-based gating and make the processing and analysis of plate-based data sets from high-throughput FCM screening experiments easier. plateCore was used to analyze data from a BD FACS CAP screening experiment where five Peripheral Blood Mononucleocyte Cell (PBMC) samples were assayed for 189 different human cell surface markers. This same data set was also manually analyzed by a cytometry expert using the FlowJo data analysis software package (TreeStar, USA). We show that the expression values for markers characterized using the automated approach in plateCore are in good agreement with those from FlowJo, and that using plateCore allows for more reproducible analyses of FCM screening data.
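
Negative control-based gating can be illustrated with a short sketch. The abstract does not say how plateCore derives its thresholds, so the code assumes one common convention: the gate is set at a high percentile of the negative control's intensities, and the stained well is scored by the fraction of events above that gate. The names `control_threshold` and `fraction_positive` are hypothetical and are not plateCore functions.

```python
import math


def control_threshold(control_values, percentile=99):
    """Nearest-rank percentile of the negative-control intensities."""
    ordered = sorted(control_values)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def fraction_positive(stained_values, threshold):
    """Fraction of stained events with intensity strictly above the gate."""
    above = sum(1 for v in stained_values if v > threshold)
    return above / len(stained_values)
```

Applying the same rule to every well of a plate removes the analyst's judgment from the gate position, which is one plausible reason an automated approach would be more reproducible than manual gating.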


2019 ◽  
Author(s):  
Kodai Minoura ◽  
Ko Abe ◽  
Yuka Maeda ◽  
Hiroyoshi Nishikawa ◽  
Teppei Shimamura

Abstract Motivation Modern flow cytometry technology has enabled the simultaneous analysis of multiple cell markers at the single-cell level, and it is widely used in a broad field of research. The detection of cell populations in flow cytometry data has long been dependent on “manual gating” by visual inspection. Recently, numerous software tools have been developed for automatic, computationally guided detection of cell populations; however, they are not designed for time-series flow cytometry data. Time-series flow cytometry data are indispensable for investigating the dynamics of cell populations that could not be elucidated by static time-point analysis. Therefore, there is a great need for tools to systematically analyze time-series flow cytometry data. Results We propose a simple and efficient statistical framework, named CYBERTRACK (CYtometry-Based Estimation and Reasoning for TRACKing cell populations), to perform clustering and cell population tracking for time-series flow cytometry data. CYBERTRACK assumes that flow cytometry data are generated from a multivariate Gaussian mixture distribution with its mixture proportion at the current time dependent on that at a previous timepoint. Using simulation data, we evaluate the performance of CYBERTRACK when estimating parameters for a multivariate Gaussian mixture distribution, tracking time-dependent transitions of mixture proportions, and detecting change-points in the overall mixture proportion. The CYBERTRACK performance is validated using two real flow cytometry datasets, which demonstrate that the population dynamics detected by CYBERTRACK are consistent with our prior knowledge of lymphocyte behavior. Conclusions Our results indicate that CYBERTRACK offers better understandings of time-dependent cell population dynamics to cytometry users by systematically analyzing time-series flow cytometry data.


2020 ◽  
Author(s):  
Paul D. Simonson ◽  
Yue Wu ◽  
David Wu ◽  
Jonathan R. Fromm ◽  
Aaron Y. Lee

Abstract Objectives Automated classification of flow cytometry data has the potential to reduce errors and accelerate flow cytometry interpretation. We desired a machine learning approach that is accurate, intuitively easy to understand, and highlights the cells that are most important in the algorithm’s prediction for a given case. Methods We developed an ensemble of convolutional neural networks (CNNs) for classification and visualization of impactful cell populations in detecting classic Hodgkin lymphoma, using two-dimensional (2D) histograms. Data from 977 and 245 clinical flow cytometry cases were used for training and testing, respectively. 78 non-gated 2D histograms were created per flow cytometry file. SHAP values were calculated to determine the most impactful 2D histograms and regions within the histograms. The SHAP values from all 78 histograms were then projected back to the original cell data for gating and visualization using standard flow cytometry software. Results The algorithm achieved 67.7% recall (sensitivity), 82.4% precision, and 0.92 AUROC. Visualization of the important cell populations in making individual predictions demonstrated correlations with known biology. Conclusions The method presented enables model explainability while highlighting important cell populations in individual flow cytometry specimens, with potential applications in both diagnosis and discovery of previously overlooked key cell populations.
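
The histogram-generation step can be sketched as follows. The abstract does not state the number of measured parameters; 78 happens to equal the number of unordered pairs of 13 channels (13 × 12 / 2), so the sketch assumes one histogram per channel pair. That channel count is an inference, and the function name, bin count, and value range are hypothetical.

```python
from itertools import combinations


def pairwise_histograms(events, n_channels, n_bins, lo, hi):
    """Build one non-gated 2D count histogram per unordered channel pair.

    events: sequence of per-cell tuples with n_channels values each.
    Values outside [lo, hi) are clamped into the first or last bin.
    """
    width = (hi - lo) / n_bins

    def bin_of(value):
        return min(n_bins - 1, max(0, int((value - lo) / width)))

    hists = {}
    for a, b in combinations(range(n_channels), 2):
        grid = [[0] * n_bins for _ in range(n_bins)]
        for e in events:
            grid[bin_of(e[a])][bin_of(e[b])] += 1
        hists[(a, b)] = grid
    return hists
```

Each grid can be treated as a single-channel image and passed to a CNN. Keeping the `(a, b)` key alongside each grid is what would allow per-pixel attributions to be mapped back to the cells that fell into that pixel.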


Author(s):  
Brad M. Hopkins ◽  
Dan Maraini ◽  
Andrew Seidel ◽  
Parham Shahidi

Freight rail cars may experience high input forces during a coupling event, which could potentially cause damage to the car body and/or lading. The AAR recommended practice states that cars should not be coupled at speeds greater than 4 mph. However, this recommendation is not always followed and cars are often coupled at much higher speeds. As a result, accelerometers on the car body are sometimes used to monitor impact events. Threshold levels may be set to determine if an over-speed or high-force impact event has occurred. However, a single acceleration value can be difficult to interpret because its relationship to impact force is dependent on many factors, including car type, end-of-car device type, lading type, and loading condition. Dynamic modeling and parametric studies may be used to determine these relationships which can be applied in practice. This paper presents a study on the relationship between struck coupler force and car body acceleration for a series of impacts on a tank car in both loaded and unloaded states. For the loaded condition, the tank was filled with water. The simplest change from an unloaded tank to a loaded tank is the decrease in acceleration for a given force due to the added mass. However, there is additional complexity added to the system due to the sloshing liquid inside the tank. When attempting to model this dynamic system there is added uncertainty in struck coupler force estimation because of the non-linearity in low frequency car body oscillations. Several example data sets are presented in the time and frequency domains to illustrate this point. The data is then used to generate an empirical model using system identification techniques. The results show that the proposed model offers improved characterization of the system as compared to conventional techniques by accounting for the uncertainties introduced by the sloshing liquid in the tank. 
The proposed technique is computationally efficient and can potentially be implemented in real time. The model is used to estimate struck coupler force and is validated with real data.


2010 ◽  
Vol 298 (2) ◽  
pp. L127-L130 ◽  
Author(s):  
D. F. Alvarez ◽  
K. Helm ◽  
J. DeGregori ◽  
M. Roederer ◽  
S. Majka

Cellular measurements by flow cytometric analysis constitute an important step toward understanding individual attributes within a population of cells. Assessing individual cells within a population by protein expression using fluorescently labeled antibodies and other fluorescent probes can identify cellular patterns. The technology for accurately identifying subtle changes in protein expression within a population of cells using a vast array of technology has resulted in controversy and questions regarding reproducibility, which can be explained at least in part by the absence of standard methods to facilitate comparison of flow cytometric data. The complexity of technological advancements and the need for improvements in biological resolution result in the generation of complex data that demand the use of minimum standards for their publication. Herein we present a summarized view for the inclusion of consistent flow cytometric experimental information as supplemental data. Four major points are emphasized: experimental and sample information, data acquisition, analysis, and presentation. Together, these guidelines will facilitate the review and publication of flow cytometry data that provide an accurate foundation for ongoing studies with this evolving technology.


2019 ◽  
Author(s):  
Peter Rubbens ◽  
Ruben Props ◽  
Frederiek-Maarten Kerckhof ◽  
Nico Boon ◽  
Willem Waegeman

Abstract Microbial flow cytometry allows rapid characterization of microbial communities. Recent research has demonstrated a moderate to strong connection between the cytometric diversity and taxonomic diversity based on 16S rRNA gene amplicon sequencing data. This creates the opportunity to integrate both types of data to study and predict the microbial community diversity in an automated and efficient way. However, microbial flow cytometry data present a number of unique challenges that need to be addressed. The results of our work are threefold: i) We expand current microbial cytometry fingerprinting approaches by proposing and validating a model-based fingerprinting approach based upon Gaussian Mixture Models, which we called PhenoGMM. ii) We show that microbial diversity can be rapidly estimated by PhenoGMM. In combination with a supervised machine learning model, diversity estimations based on 16S rRNA gene amplicon sequencing data can be predicted. iii) We evaluate our method extensively by using multiple datasets from different ecosystems and compare its predictive power with a generic binning fingerprinting approach that is commonly used in microbial flow cytometry. These results demonstrate the strong connection between the genetic make-up of a microbial community and its phenotypic properties as measured by flow cytometry. Our workflow facilitates the study of microbial diversity and community dynamics using flow cytometry in a fast and quantitative way.
Importance Microorganisms are vital components in various ecosystems on Earth. In order to investigate the microbial diversity, researchers have largely relied on the analysis of 16S rRNA gene sequences from DNA. Flow cytometry has been proposed as an alternative technique to characterize microbial community diversity and dynamics. It is an optical technique, able to rapidly characterize a number of phenotypic properties of individual cells. So-called fingerprinting techniques are needed in order to describe microbial community diversity and dynamics based on flow cytometry data. In this work, we propose a more advanced fingerprinting strategy based on Gaussian Mixture Models. When samples have been analyzed by both flow cytometry and 16S rRNA gene amplicon sequencing, we show that supervised machine learning models can be used to find the relationship between the two types of data. We evaluate our workflow on datasets from different ecosystems, illustrating its general applicability for the analysis of microbial flow cytometry data. PhenoGMM facilitates the rapid characterization and predictive modelling of microbial diversity using flow cytometry.


2019 ◽  
Vol 20 (S23) ◽  
Author(s):  
Kodai Minoura ◽  
Ko Abe ◽  
Yuka Maeda ◽  
Hiroyoshi Nishikawa ◽  
Teppei Shimamura

Abstract Background Modern flow cytometry technology has enabled the simultaneous analysis of multiple cell markers at the single-cell level, and it is widely used in a broad field of research. The detection of cell populations in flow cytometry data has long been dependent on “manual gating” by visual inspection. Recently, numerous software tools have been developed for automatic, computationally guided detection of cell populations; however, they are not designed for time-series flow cytometry data. Time-series flow cytometry data are indispensable for investigating the dynamics of cell populations that could not be elucidated by static time-point analysis. Therefore, there is a great need for tools to systematically analyze time-series flow cytometry data. Results We propose a simple and efficient statistical framework, named CYBERTRACK (CYtometry-Based Estimation and Reasoning for TRACKing cell populations), to perform clustering and cell population tracking for time-series flow cytometry data. CYBERTRACK assumes that flow cytometry data are generated from a multivariate Gaussian mixture distribution with its mixture proportion at the current time dependent on that at a previous timepoint. Using simulation data, we evaluate the performance of CYBERTRACK when estimating parameters for a multivariate Gaussian mixture distribution, tracking time-dependent transitions of mixture proportions, and detecting change-points in the overall mixture proportion. The CYBERTRACK performance is validated using two real flow cytometry datasets, which demonstrate that the population dynamics detected by CYBERTRACK are consistent with our prior knowledge of lymphocyte behavior. Conclusions Our results indicate that CYBERTRACK offers better understandings of time-dependent cell population dynamics to cytometry users by systematically analyzing time-series flow cytometry data.
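
The idea of a mixture proportion that depends on the previous time point can be illustrated with a deliberately simplified sketch. It is not the CYBERTRACK algorithm: the data are one-dimensional, the Gaussian components are held fixed rather than estimated, and the dependence is assumed to take the form of a Dirichlet-style prior centred on the previous proportions with a hypothetical strength parameter `kappa`. All function names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate Gaussian."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def update_proportions(xs, components, prev_pi, kappa, n_iter=50):
    """EM over mixing proportions only, with fixed (mean, sd) components.

    The M-step is a MAP update under the assumed prior: kappa acts as a
    pseudo-count that pulls the estimate toward prev_pi.
    """
    pi = list(prev_pi)
    n = len(xs)
    for _ in range(n_iter):
        # E-step: accumulate the expected number of events per component.
        nk = [0.0] * len(components)
        for x in xs:
            w = [p * normal_pdf(x, m, s) for p, (m, s) in zip(pi, components)]
            total = sum(w)
            for k, wk in enumerate(w):
                nk[k] += wk / total
        # M-step: blend the data counts with the previous time point.
        pi = [(nk[k] + kappa * prev_pi[k]) / (n + kappa) for k in range(len(pi))]
    return pi


def track(series, components, init_pi, kappa):
    """Estimate proportions at each time point, chaining each to the last."""
    trajectory, prev = [], init_pi
    for xs in series:
        prev = update_proportions(xs, components, prev, kappa)
        trajectory.append(prev)
    return trajectory
```

With `kappa = 0` each time point is fitted independently. Larger values smooth the trajectory, so an abrupt jump that survives the smoothing is a candidate change-point of the kind the abstract mentions.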

