Managing Multi-center Flow Cytometry Data for Immune Monitoring

2014 ◽  
Vol 13s7 ◽  
pp. CIN.S16346 ◽  
Author(s):  
Scott White ◽  
Karoline Laske ◽  
Marij J.P. Welters ◽  
Nicole Bidmon ◽  
Sjoerd H. Van Der Burg ◽  
...  

With the recent results of promising cancer vaccines and immunotherapy 1–5, immune monitoring has become increasingly relevant for measuring treatment-induced effects on T cells, and an essential tool for shedding light on the mechanisms responsible for a successful treatment. Flow cytometry is the canonical multi-parameter assay for the fine characterization of single cells in solution, and is ubiquitously used in pre-clinical tumor immunology and in cancer immunotherapy trials. Current state-of-the-art polychromatic flow cytometry involves multi-step, multi-reagent assays followed by sample acquisition on sophisticated instruments capable of capturing up to 20 parameters per cell at a rate of tens of thousands of cells per second. Given the complexity of flow cytometry assays, reproducibility is a major concern, especially for multi-center studies. A promising approach for improving reproducibility is the use of automated analysis borrowing from statistics, machine learning and information visualization 21–23, as these methods directly address the subjectivity, operator dependence, labor intensity and low fidelity of manual analysis. However, investigating and testing new automated analysis techniques on large data sets is time-consuming without a centralized information management system. For large-scale automated analysis to be practical, the presence of consistent and high-quality data linked to the raw FCS files is indispensable. In particular, the use of machine-readable standard vocabularies to characterize channel metadata is essential when constructing analytic pipelines, to avoid errors in processing, analysis and interpretation of results. For automation, this high-quality metadata needs to be programmatically accessible, implying the need for a consistent Application Programming Interface (API).
In this manuscript, we propose that upfront time spent normalizing flow cytometry data to conform to carefully designed data models enables automated analysis, potentially saving time in the long run. The ReFlow informatics framework was developed to address these data management challenges.
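To make the point about machine-readable channel vocabularies concrete, the following is a minimal sketch of normalizing free-text channel labels to a controlled vocabulary before analysis. The vocabulary, the alias table and the function name are hypothetical and chosen only for illustration; they are not taken from ReFlow or its API.

```python
import re

# Hypothetical controlled vocabulary and alias table (illustrative only).
CANONICAL_MARKERS = {"CD3", "CD4", "CD8", "IFNG"}
ALIASES = {"IFN-G": "IFNG", "IFNGAMMA": "IFNG", "IFN-GAMMA": "IFNG"}


def normalize_marker(label):
    """Map a free-text channel label to a canonical marker name.

    Raises ValueError for labels outside the vocabulary, so that a pipeline
    fails early instead of silently analyzing a mislabeled channel.
    """
    key = re.sub(r"\s+", "", label).upper()  # drop whitespace, ignore case
    key = ALIASES.get(key, key)              # resolve known spelling variants
    if key not in CANONICAL_MARKERS:
        raise ValueError(f"unknown channel label: {label!r}")
    return key


# Labels as they might be typed at three different centers.
raw_labels = [" cd 4 ", "IFN-gamma", "Cd8"]
normalized = [normalize_marker(label) for label in raw_labels]
```

Rejecting unknown labels outright, rather than passing them through, is one possible design choice; a real system might instead queue them for manual curation.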

Processes ◽  
2020 ◽  
Vol 8 (6) ◽  
pp. 649
Author(s):  
Yifeng Liu ◽  
Wei Zhang ◽  
Wenhao Du

Deep learning based on large amounts of high-quality data plays an important role in many industries. However, deep learning is hard to embed directly in a real-time system: the system accumulates data only through real-time acquisition, yet its analysis tasks must also be carried out in real time, so they cannot wait for data to accumulate over a long period. To address the problems of high-quality data accumulation, timely data analysis, and the difficulty of embedding deep-learning algorithms directly in real-time systems, this paper proposes a new progressive deep-learning framework and conducts experiments on image recognition. The experimental results show that the proposed framework is effective, performs well, and reaches conclusions similar to those of a deep-learning framework based on large-scale data.


2010 ◽  
Vol 59 (9) ◽  
pp. 1435-1441 ◽  
Author(s):  
Jacob Frelinger ◽  
Janet Ottinger ◽  
Cécile Gouttefangeas ◽  
Cliburn Chan

2021 ◽  
pp. 1-12
Author(s):  
Bilal Tahir ◽  
Muhammad Amir Mehmood

The confluence of high-performance computing algorithms and large-scale, high-quality data has led to the availability of cutting-edge tools in computational linguistics. However, these state-of-the-art tools are available only for the major languages of the world. The preparation of large-scale, high-quality corpora for low-resource languages such as Urdu is a challenging task, as it requires huge computational and human resources. In this paper, we build and analyze a large-scale Urdu-language Twitter corpus, Anbar. For this purpose, we collect 106.9 million Urdu tweets posted by 1.69 million users during one year (September 2018-August 2019). Our corpus consists of tweets with a rich vocabulary of 3.8 million unique tokens along with 58K hashtags and 62K URLs. Moreover, it contains 75.9 million (71.0%) retweets and 847K geotagged tweets. Furthermore, we examine Anbar using a variety of metrics such as temporal frequency of tweets, vocabulary size, geo-location, user characteristics, and entity distribution. To the best of our knowledge, this is the largest repository of Urdu-language tweets for the NLP research community, and it can be used for Natural Language Understanding (NLU), social analytics, and fake news detection.


Blood ◽  
2013 ◽  
Vol 122 (21) ◽  
pp. 2864-2864
Author(s):  
Jens Rueter ◽  
Vivek Philip ◽  
Krishna Karuturi ◽  
Zaher Oueida ◽  
Margaret Chavaree ◽  
...  

Introduction: Recent developments of novel immunotherapeutic drugs have shown promising results for patients with hematologic malignancies; however, an unmet need for accurate and specific biomarkers persists. To address this need, we developed a novel integrative analysis procedure for the automated analysis of multidimensional flow cytometry data obtained from the peripheral blood of patients with chronic lymphocytic leukemia (CLL). State-of-the-art flow cytometry analysis is accomplished by manual sequential segmentation, or gating, of cell populations based on similarities in fluorescence and light scatter characteristics through visualization of the data in one- or two-dimensional plots. This approach has a number of limitations, including the subjective nature of the gating and the inability to fully utilize the high-dimensional data. Recent efforts have produced sophisticated computational methods that overcome many of these limitations; however, these newer computational methods have not been rigorously tested in a clinical context and have focused on the rigorous and automated analysis of samples from individual patients, with substantially less effort towards the analysis of patient populations. The ultimate goal of our analysis is to develop computational approaches that will enable the identification of subsets of patients with distinct immunological markers.
Methods: We developed a novel analysis framework that facilitates automated identification of both common cell types and patient population subgroups, based on post-processing of individual sample analysis with the FLOCK program. FLOCK identifies clusters of putatively similar cells in an individual sample by multidimensional clustering of the fluorescence marker and light-scattering measurements. We developed a rigorous hierarchical clustering approach to identify common “cell signatures” across multiple patients. The cell signatures were then mapped back onto the individual patient samples and used in a second clustering that identified patient subgroups based on similar abundances of specific cell types.
Results: We used our analytic framework to analyze multidimensional flow cytometry data (26 cell surface markers in 4 different antibody cocktails) from peripheral blood specimens of a heterogeneous group of 55 CLL patients and 13 healthy controls. Our analysis revealed distinct differences between controls and CLL patients. Analyzing the non-malignant peripheral blood cell types, we were furthermore able to differentiate between distinct clinical subpopulations of patients (e.g. distinguish treatment-naïve patients from those who had previously undergone chemotherapy).
Conclusion/Discussion: Using a novel integrative analysis procedure to analyze complex flow cytometry data of the peripheral blood from CLL patients, we are able to identify distinct cell type distributions. We propose that this information is a marker for the overall health/disease status of the corresponding patient, and could ultimately be used for diagnosis, prognosis, and selection of optimal treatment. In the context of multiple novel treatment options for CLL patients, such a tool will be crucial for defining individual patient prognosis and an accurately matched treatment plan.
Disclosures: No relevant conflicts of interest to declare.
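The two-stage procedure described in this abstract can be sketched roughly as follows. This is not the authors' code and does not run FLOCK: it assumes per-sample cluster centroids and cell counts are already available (for example, from a per-sample clustering step), and it uses average-linkage hierarchical clustering from SciPy as a stand-in for the hierarchical approach mentioned. All data, names and the choice of two groups are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cut_tree(points, n_groups):
    """Average-linkage hierarchical clustering, cut into at most n_groups."""
    tree = linkage(points, method="average")
    return fcluster(tree, t=n_groups, criterion="maxclust")


def abundance_matrix(patient_ids, signature_ids, cell_counts, n_signatures):
    """Fraction of each patient's cells that falls into each signature."""
    patients = sorted(set(patient_ids))
    row = {p: i for i, p in enumerate(patients)}
    counts = np.zeros((len(patients), n_signatures))
    for p, s, c in zip(patient_ids, signature_ids, cell_counts):
        counts[row[p], s - 1] += c  # fcluster labels start at 1
    return patients, counts / counts.sum(axis=1, keepdims=True)


# Synthetic per-sample clusters: two clusters for each of four patients.
patient_ids = ["P1", "P1", "P2", "P2", "P3", "P3", "P4", "P4"]
centroids = np.array([
    [0.0, 0.0], [10.0, 10.0],
    [0.5, 0.2], [10.2, 9.8],
    [0.1, 0.4], [9.9, 10.1],
    [0.3, 0.1], [10.1, 10.3],
])
cell_counts = [900, 100, 850, 150, 100, 900, 150, 850]

# Stage 1: group per-sample clusters into shared "cell signatures".
signatures = cut_tree(centroids, 2)
# Stage 2: cluster patients by the abundance of each signature.
patients, abundances = abundance_matrix(patient_ids, signatures, cell_counts, 2)
patient_groups = cut_tree(abundances, 2)
```

In this toy data, P1 and P2 are dominated by one signature and P3 and P4 by the other, so the second clustering separates the patients into those two pairs.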


2019 ◽  
Vol 13 ◽  
pp. 117793221983885 ◽  
Author(s):  
Hunjoong Lee ◽  
Yongliang Sun ◽  
Lisa Patti-Diaz ◽  
Michael Hedrick ◽  
Anka G Ehrhardt

Advancements in flow cytometers with the capability to measure 15 or more parameters have enabled us to characterize cell populations at unprecedented levels of detail. Beyond discovery research, there is now a growing demand to dive deeper into evaluating the immune response in clinical trials for immune-modulating compounds. However, for high-volume, complex flow cytometry data generated in clinical trials, conventional manual gating remains the standard of practice. Traditional manual gating is resource-intensive and becomes a bottleneck, making it impractical for analyzing high volumes of flow cytometry data. Current efforts to automate “manual gating” have shown that computational algorithms can facilitate the analysis of daunting multi-parameter data; however, a greater degree of precision in comparison with traditional manual gating is needed for wide-scale adoption of automated gating methods. In an effort to more closely follow the manual gating process, our automated gating pipeline was created to include negative controls (Fluorescence Minus One [FMO]) to enhance the reliability of gate placement. We demonstrate that an automated pipeline relying heavily on FMO controls for population discrimination can analyze multi-parameter, large-scale clinical datasets with precision and accuracy comparable to traditional manual gating.
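As a rough illustration of how an FMO control can inform gate placement, the sketch below sets a positivity cutoff at a high percentile of the FMO control's intensity in the channel of interest and applies it to the fully stained sample. The 99.5th percentile, the function names and the synthetic data are assumptions made for this example; the abstract does not describe the pipeline's actual rule.

```python
import numpy as np


def fmo_threshold(fmo_values, percentile=99.5):
    """Positivity cutoff: a high percentile of the FMO control's signal."""
    return float(np.percentile(np.asarray(fmo_values, dtype=float), percentile))


def fraction_positive(sample_values, threshold):
    """Fraction of events in the stained sample above the cutoff."""
    sample_values = np.asarray(sample_values, dtype=float)
    return float(np.mean(sample_values > threshold))


# Synthetic intensities: the FMO control holds only background signal;
# the stained sample is 70% background and 30% truly positive events.
rng = np.random.default_rng(0)
fmo = rng.normal(100, 20, 10_000)
stained = np.concatenate([rng.normal(100, 20, 7_000),
                          rng.normal(400, 50, 3_000)])

cutoff = fmo_threshold(fmo)
positive_fraction = fraction_positive(stained, cutoff)
```

Because the cutoff is derived from the control rather than from the stained sample, it does not shift when the positive population is small or absent, which is the property that makes FMO controls useful for discriminating dim populations.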


2020 ◽  
Vol 93 (1) ◽  
Author(s):  
Amy Fox ◽  
Taru S. Dutt ◽  
Burton Karger ◽  
Andrés Obregón‐Henao ◽  
G. Brooke Anderson ◽  
...  

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Rita Folcarelli ◽  
Gerjen H. Tinnevelt ◽  
Bart Hilvering ◽  
Kristiaan Wouters ◽  
Selma van Staveren ◽  
...  

Flow cytometry is an analytical technology for simultaneously measuring multiple markers per single cell. Tens of thousands to millions of single cells can be measured per sample, and each sample may contain a different number of cells. All samples may be bundled together, leading to a ‘multi-set’ structure. Many multivariate methods have been developed for flow cytometry data, but none of them considers this structure in its quantitative handling of the data. The standard pre-processing used by existing multivariate methods yields models influenced mainly by the samples with more cells, whereas such a model should provide a balanced view of the biomedical information within all measurements. We propose an alternative ‘multi-set’ pre-processing that corrects for the difference in the number of cells measured, balancing the relative importance of each multi-cell sample in the data while using all data collected from these expensive analyses. Moreover, one case example shows how multi-set pre-processing may help remove undesired measurement-to-measurement variability, and another shows how class-based multi-set pre-processing enhances the studied response in comparison to the control reference samples. Our results show that adjusting data analysis algorithms to consider this multi-set structure may greatly benefit immunological insight and classification performance of flow cytometry data.
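One way to read the balancing idea is sketched below: instead of centering and scaling with statistics of the pooled cells, which large samples dominate, the center is taken as the mean of the per-sample means and the scale as the root of the averaged per-sample mean squares, so every sample carries equal weight. This is an interpretation under stated assumptions, not the authors' exact procedure, and the function name and data are illustrative.

```python
import numpy as np


def multiset_center_scale(samples):
    """Center and scale cells so that every sample has equal weight.

    samples: list of arrays, each of shape (n_cells_i, n_markers).
    Returns the scaled samples, the shared center and the shared scale.
    """
    # Center: average of per-sample means, not the mean of pooled cells.
    center = np.mean([s.mean(axis=0) for s in samples], axis=0)
    # Scale: root of the averaged per-sample mean squares around the center.
    mean_squares = np.mean(
        [((s - center) ** 2).mean(axis=0) for s in samples], axis=0
    )
    scale = np.sqrt(mean_squares)
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant markers
    return [(s - center) / scale for s in samples], center, scale


# A large sample and a small one with very different marker levels.
big = np.zeros((1000, 2))
small = np.full((10, 2), 10.0)

scaled, center, scale = multiset_center_scale([big, small])
pooled_mean = np.vstack([big, small]).mean(axis=0)
```

Here the pooled mean sits close to the large sample (about 0.1 per marker), while the multi-set center lies midway between the two samples at 5.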


Methods ◽  
2018 ◽  
Vol 134-135 ◽  
pp. 164-176 ◽  
Author(s):  
Albina Rahim ◽  
Justin Meskas ◽  
Sibyl Drissler ◽  
Alice Yue ◽  
Anna Lorenc ◽  
...  
