Analysis of High-Throughput Flow Cytometry Data Using plateCore

2009 ◽  
Vol 2009 ◽  
pp. 1-10 ◽  
Author(s):  
Errol Strain ◽  
Florian Hahne ◽  
Ryan R. Brinkman ◽  
Perry Haaland

Flow cytometry (FCM) software packages from R/Bioconductor, such as flowCore and flowViz, serve as an open platform for the development of new analysis tools and methods. We created plateCore, a new package that extends the functionality of these core packages to enable automated negative-control-based gating and to simplify the processing and analysis of plate-based data sets from high-throughput FCM screening experiments. plateCore was used to analyze data from a BD FACS CAP screening experiment in which five peripheral blood mononuclear cell (PBMC) samples were assayed for 189 different human cell surface markers. The same data set was also manually analyzed by a cytometry expert using the FlowJo data analysis software package (TreeStar, USA). We show that the expression values for markers characterized using the automated approach in plateCore are in good agreement with those from FlowJo, and that plateCore allows for more reproducible analyses of FCM screening data.
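
plateCore itself is an R/Bioconductor package, so the fragment below is only a language-agnostic sketch of the negative-control-based gating idea in Python: derive a positivity cutoff from a negative-control well and count events above it. The well data and the 99.5th-percentile cutoff are invented for illustration.

```python
import numpy as np

def negative_control_threshold(neg_control, quantile=0.995):
    """Derive a positivity cutoff from a negative-control well: events
    brighter than this intensity are called marker-positive."""
    return np.quantile(neg_control, quantile)

def percent_positive(test_well, threshold):
    """Fraction of events in a test well exceeding the cutoff."""
    return float(np.mean(test_well > threshold))

# Invented log-scale fluorescence intensities for one marker
rng = np.random.default_rng(0)
neg = rng.normal(loc=1.0, scale=0.3, size=10_000)       # negative-control well
test = np.concatenate([rng.normal(1.0, 0.3, 7_000),     # marker-negative events
                       rng.normal(2.5, 0.4, 3_000)])    # marker-positive events

thr = negative_control_threshold(neg)
print(f"threshold = {thr:.2f}, positive = {100 * percent_positive(test, thr):.1f}%")
```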

2002 ◽  
Vol 7 (4) ◽  
pp. 341-351 ◽  
Author(s):  
Michael F.M. Engels ◽  
Luc Wouters ◽  
Rudi Verbeeck ◽  
Greet Vanhoof

A data mining procedure for the rapid scoring of high-throughput screening (HTS) compounds is presented. The method is particularly useful for monitoring the quality of HTS data and tracking outliers in automated pharmaceutical or agrochemical screening, thus providing more complete and thorough structure-activity relationship (SAR) information. The method is based on the assumed relationship between the structure of the screened compounds and their biological activity on a given screen, expressed on a binary scale. By means of a data mining method, a SAR description of the data is developed that assigns to each compound of the screen a probability of being a hit. Then, an inconsistency score expressing the degree of deviation between the SAR description and the actual biological activity is computed. The inconsistency score enables the identification of potential outliers that can be primed for validation experiments. The approach is particularly useful for detecting false-negative outliers and for identifying SAR-compliant hit/nonhit borderline compounds, both classes of compounds that can contribute substantially to the development and understanding of robust SARs. In a first implementation of the method, one- and two-dimensional descriptors are used to encode molecular structure information and logistic regression to calculate hit/nonhit probability scores. The approach was validated on three data sets: the first from a publicly available screening campaign, and the second and third from in-house HTS screening campaigns. Because of its simplicity, robustness, and accuracy, the procedure is suitable for automation.
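
The abstract describes the pipeline (binary structural descriptors, logistic regression, hit probabilities, an inconsistency score) without giving formulas; the Python sketch below fills the gaps with assumptions: random binary fingerprints stand in for the 1D/2D descriptors, and the simple score |activity − probability| stands in for the paper's inconsistency measure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented binary structural descriptors and hit/nonhit labels
rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(500, 64))        # 500 compounds x 64 descriptor bits
signal = X @ rng.normal(size=64)
y = (signal + rng.normal(scale=2.0, size=500) > np.median(signal)).astype(int)

# SAR description: probability of being a hit, given structure alone
model = LogisticRegression(max_iter=1000).fit(X, y)
p_hit = model.predict_proba(X)[:, 1]

# Inconsistency score: deviation between the SAR-predicted probability and
# the observed binary activity; large scores flag candidate outliers
# (e.g. potential false negatives worth re-testing).
inconsistency = np.abs(y - p_hit)
print("compounds to re-test:", np.argsort(inconsistency)[::-1][:10])
```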


2009 ◽  
Vol 2009 ◽  
pp. 1-2
Author(s):  
Raphael Gottardo ◽  
Ryan R. Brinkman ◽  
George Luta ◽  
Matt P. Wand

2017 ◽  
Vol 71 (2) ◽  
pp. 174-179 ◽  
Author(s):  
Gregory David Scott ◽  
Susan K Atwater ◽  
Dita A Gratzinger

Aims: To create clinically relevant normative flow cytometry data for understudied benign lymph nodes and characterise outliers. Methods: Clinical, histological and flow cytometry data were collected and distributions summarised for 380 benign lymph node excisional biopsies. Outliers for kappa:lambda light chain ratio, CD10:CD19 coexpression, CD5:CD19 coexpression, CD4:CD8 ratios and CD7 loss were summarised by histological pattern, concomitant diseases and follow-up course. Results: We generated the largest data set of benign lymph node immunophenotypes by an order of magnitude. B and T cell antigen outliers often had background immunosuppression or inflammatory disease but did not subsequently develop lymphoma. Conclusions: Diagnostic immunophenotyping data from benign lymph nodes provide normative ranges for clinical use. Outliers raising suspicion for B or T cell lymphoma are not infrequent (26% of benign lymph nodes). Caution is indicated when interpreting outliers in the absence of excisional biopsy or clinical history, particularly in patients with concomitant immunosuppression or inflammatory disease.
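
As a sketch of how such normative data could be used downstream, the snippet below derives a central 95% reference interval from a benign-node cohort and flags a new case falling outside it; the lognormal kappa:lambda ratios and the percentile cutoffs are illustrative assumptions, not the paper's published ranges.

```python
import numpy as np

def reference_interval(values, lo=2.5, hi=97.5):
    """Central 95% normative range estimated from a benign-node cohort."""
    return np.percentile(values, [lo, hi])

# Invented kappa:lambda light chain ratios for 380 benign lymph nodes
rng = np.random.default_rng(2)
kl_ratios = rng.lognormal(mean=np.log(1.4), sigma=0.25, size=380)

low, high = reference_interval(kl_ratios)
new_case = 3.1
print(f"normative range {low:.2f}-{high:.2f}; outlier: {not low <= new_case <= high}")
```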


Author(s):  
Soumya Raychaudhuri

The genomics era has presented many new high-throughput experimental modalities capable of producing large amounts of data on comprehensive sets of genes. In time there will certainly be many more new techniques that explore new avenues in biology. In any case, textual analysis will remain an important aspect of interpreting the results. The body of peer-reviewed scientific text represents all of our accomplishments in biology, and it plays a critical role in hypothesizing about and interpreting any data set. To ignore it altogether is tantamount to reinventing the wheel with each analysis. The volume of relevant literature has reached proportions where it is all but impossible to search through all of it manually. Instead we must often rely on automated text mining methods to access the literature efficiently and effectively. The methods we present in this book provide an introduction to the avenues one can employ to include text in a meaningful way in the analysis of these functional genomics data sets. They complement the statistical methods, such as classification and clustering, that are commonly employed to analyze such data sets. We hope that this book will encourage the reader to utilize and further develop text mining in their own analyses.


2013 ◽  
Vol 19 (3) ◽  
pp. 344-353 ◽  
Author(s):  
Keith R. Shockley

Quantitative high-throughput screening (qHTS) experiments can simultaneously produce concentration-response profiles for thousands of chemicals. In a typical qHTS study, a large chemical library is subjected to a primary screen to identify candidate hits for secondary screening, validation studies, or prediction modeling. Different algorithms, usually based on the Hill equation logistic model, have been used to classify compounds as active or inactive (or inconclusive). However, observed concentration-response activity relationships may not adequately fit a sigmoidal curve. Furthermore, it is unclear how to prioritize chemicals for follow-up studies given the large uncertainties that often accompany parameter estimates from nonlinear models. Weighted Shannon entropy can address these concerns by ranking compounds according to profile-specific statistics derived from estimates of the probability mass distribution of response at the tested concentration levels. This strategy can be used to rank all tested chemicals in the absence of a prespecified model structure, or the approach can complement existing activity call algorithms by ranking the returned candidate hits. The weighted entropy approach was evaluated here using data simulated from the Hill equation model. The procedure was then applied to a chemical genomics profiling data set interrogating compounds for androgen receptor agonist activity.
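
One plausible reading of that strategy (not necessarily the paper's exact statistic) is sketched below: normalise the absolute responses across tested concentrations into a probability mass function and compute a magnitude-weighted Shannon entropy, so that noise-only profiles score high and concentration-dependent profiles score low.

```python
import numpy as np

def weighted_entropy(responses, eps=1e-12):
    """Magnitude-weighted Shannon entropy of a concentration-response profile.

    Absolute responses are normalised into a probability mass function over
    the tested concentrations; flat or noise-only profiles spread mass evenly
    (high entropy), while genuine concentration-dependent activity
    concentrates mass at the responsive concentrations (low entropy)."""
    r = np.abs(np.asarray(responses, dtype=float))
    p = r / (r.sum() + eps)          # probability mass per concentration
    w = r / (r.max() + eps)          # weights emphasising large responses
    return -np.sum(w * p * np.log2(p + eps))

conc = np.logspace(-3, 2, 8)                         # eight test concentrations
active = 100 / (1 + (1.0 / conc) ** 1.2)             # Hill-like active profile
inactive = np.random.default_rng(3).normal(0, 5, 8)  # noise-only profile

print(f"active: {weighted_entropy(active):.3f}  "
      f"inactive: {weighted_entropy(inactive):.3f}")
```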


Geophysics ◽  
1993 ◽  
Vol 58 (3) ◽  
pp. 408-418 ◽  
Author(s):  
L. R. Jannaud ◽  
P. M. Adler ◽  
C. G. Jacquin

A method for determining the characteristic lengths of a heterogeneous medium from the spectral analysis of codas is based on an extension of Aki's theory to anisotropic elastic media. An equivalent Gaussian model is obtained and appears to be in good agreement with the two experimental data sets that illustrate the method. The first set was obtained in a laboratory experiment on an isotropic marble sample. This sample is characterized by a submillimetric length scale that can be observed directly on a thin section. Spectral analysis of the codas and their inversion yields an equivalent correlation length in good agreement with the observed one. The second data set was obtained in a crosshole experiment at the usual scale of a seismic survey. The codas were recorded, analysed, and inverted. The analysis yields a vertical characteristic length for the studied subsurface that compares well with the characteristic length measured from seismic and stratigraphic logs.


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Yipu Zhang ◽  
Ping Wang

The new high-throughput technique ChIP-seq, which couples chromatin immunoprecipitation with high-throughput sequencing, has extended the identification of transcription factor binding locations to genome-wide regions. However, most existing motif discovery algorithms are time-consuming and poorly suited to identifying binding motifs in ChIP-seq data, which is typically large in scale. To improve efficiency, we propose a fast cluster motif finding algorithm, named FCmotif, to identify (l, d) motifs in large-scale ChIP-seq data sets. Inspired by the emerging-substrings mining strategy, it finds enriched substrings and then searches their neighborhood instances to construct position weight matrices (PWMs) and cluster motifs of different lengths. FCmotif is not bound by the OOPS model constraint and can find long motifs. The effectiveness of the proposed algorithm is demonstrated by experiments on ChIP-seq data sets from mouse ES cells: detection of the real binding motifs and processing of the full data set of several megabytes finished in a few minutes. The experimental results show that FCmotif is well suited to (l, d) motif finding in ChIP-seq data and outperforms widely used algorithms such as MEME, Weeder, ChIPMunk, and DREME.
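
FCmotif's optimised implementation is not reproduced here, but the basic (l, d) test it builds on, scoring each l-mer present in the reads by how many sequences contain it within Hamming distance d, can be sketched in brute-force Python:

```python
def hamming(a, b):
    """Number of mismatches between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def occurs_within_d(seq, motif, d):
    """True if seq contains motif with at most d mismatches."""
    l = len(motif)
    return any(hamming(seq[i:i + l], motif) <= d for i in range(len(seq) - l + 1))

def enriched_lmers(sequences, l, d, min_support):
    """Score every l-mer actually present in the reads (the 'emerging
    substrings') by how many sequences contain it within distance d."""
    seen = {seq[i:i + l] for seq in sequences for i in range(len(seq) - l + 1)}
    return {m: n for m in seen
            if (n := sum(occurs_within_d(s, m, d) for s in sequences)) >= min_support}

reads = ["ACGTACGTGA", "TTACGAACGT", "GGACGTTCAA"]
print(enriched_lmers(reads, l=5, d=1, min_support=3))  # candidate motif seeds
```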


2012 ◽  
Vol 7 (8) ◽  
pp. 679-693 ◽  
Author(s):  
J Paul Robinson ◽  
Bartek Rajwa ◽  
Valery Patsekin ◽  
Vincent Jo Davisson

2015 ◽  
Vol 8 (5) ◽  
pp. 4817-4858
Author(s):  
J. Jia ◽  
A. Rozanov ◽  
A. Ladstätter-Weißenmayer ◽  
J. P. Burrows

Abstract. In this manuscript, the latest SCIAMACHY limb ozone scientific vertical profiles, namely the current V2.9 and the upcoming V3.0, are extensively compared with ozone sonde data from the WOUDC database. The comparisons are made on a global scale from 2003 to 2011, involving 61 sonde stations. The retrieval processors used to generate the V2.9 and V3.0 data sets are briefly introduced. The comparisons are discussed in terms of vertical profiles and stratospheric partial columns. Our results indicate that the V2.9 ozone profile data between 20 and 30 km agree with the ground-based measurements to within 5% in the latitude range 90° S–40° N (with the exception of the tropical Pacific region, where an overestimation of more than 10% is observed), corresponding to partial column differences of less than 5 DU. In the tropics the differences are within 3%. However, this data set shows a significant underestimation northwards of 40° N (up to ~15%). The newly developed V3.0 data set reduces this bias to below 10% while maintaining good agreement southwards of 40° N, with slightly increased relative differences of up to 5% in the tropics.
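
The stratospheric partial-column comparison amounts to integrating each ozone profile over an altitude range and converting to Dobson units. A minimal sketch, assuming number-density profiles on a common altitude grid; the toy Gaussian ozone layer and the 4% bias are invented for illustration:

```python
import numpy as np

DU = 2.6867e20  # molecules per m^2 in one Dobson unit

def partial_column_du(z_m, n_m3, z_lo, z_hi):
    """Integrate an ozone number-density profile (molecules m^-3) over an
    altitude range (m) to obtain a stratospheric partial column in DU."""
    mask = (z_m >= z_lo) & (z_m <= z_hi)
    return np.trapz(n_m3[mask], z_m[mask]) / DU

def relative_difference(sat, sonde):
    """Per-level relative difference (%) of satellite vs. sonde profile."""
    return 100.0 * (sat - sonde) / sonde

z = np.linspace(15e3, 35e3, 41)                        # common altitude grid
sonde = 5e18 * np.exp(-0.5 * ((z - 25e3) / 5e3) ** 2)  # toy ozone layer
sat = 1.04 * sonde                                     # profile with a 4% bias

print(f"20-30 km column: {partial_column_du(z, sat, 20e3, 30e3):.1f} DU, "
      f"mean difference: {relative_difference(sat, sonde).mean():.1f}%")
```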


2020 ◽  
Author(s):  
Robert L. Peach ◽  
Alexis Arnaudon ◽  
Julia A. Schmidt ◽  
Henry A. Palasciano ◽  
Nathan R. Bernier ◽  
...  

Abstract. Networks are widely used as mathematical models of complex systems across many scientific disciplines, not only in biology and medicine but also in the social sciences, physics, computing and engineering. Decades of work have produced a vast corpus of research characterising the topological, combinatorial, statistical and spectral properties of graphs. Each graph property can be thought of as a feature that captures important (and sometimes overlapping) characteristics of a network. In the analysis of real-world graphs, it is crucial to integrate systematically a large number of diverse graph features in order to characterise and classify networks, as well as to aid network-based scientific discovery. In this paper, we introduce HCGA, a framework for highly comparative analysis of graph data sets that computes several thousand graph features from any given network. HCGA also offers a suite of statistical learning and data analysis tools for automated identification and selection of important and interpretable features underpinning the characterisation of graph data sets. We show that HCGA outperforms other methodologies on supervised classification tasks on benchmark data sets whilst retaining the interpretability of network features. We also illustrate how HCGA can be used for network-based discovery through two examples where data are naturally represented as graphs: the clustering of a data set of images of neuronal morphologies, and a regression problem to predict charge transfer in organic semiconductors based on their structure. HCGA is an open platform that can be expanded to include further graph properties and statistical learning tools, allowing researchers to leverage the wide breadth of graph-theoretical research to quantitatively analyse and draw insights from network data.
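
HCGA computes thousands of features; the toy Python sketch below only illustrates the underlying idea of feature-based graph learning on two invented graph classes, using a handful of interpretable features and a random forest:

```python
import numpy as np
import networkx as nx
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["n_nodes", "n_edges", "density", "clustering",
            "mean_degree", "max_degree", "components"]

def graph_features(G):
    """A tiny hand-picked feature vector per graph (HCGA computes thousands)."""
    degs = [d for _, d in G.degree()]
    return np.array([G.number_of_nodes(), G.number_of_edges(), nx.density(G),
                     nx.average_clustering(G), np.mean(degs), max(degs),
                     nx.number_connected_components(G)])

# Toy two-class data set: Erdos-Renyi graphs vs. small-world graphs
graphs = [nx.gnp_random_graph(30, 0.15, seed=i) for i in range(50)]
graphs += [nx.watts_strogatz_graph(30, 4, 0.1, seed=i) for i in range(50)]
y = np.array([0] * 50 + [1] * 50)

X = np.vstack([graph_features(G) for G in graphs])
clf = RandomForestClassifier(random_state=0).fit(X, y)

# Feature importances expose which interpretable properties drive the split
print(dict(zip(FEATURES, clf.feature_importances_.round(3))))
```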

