Transcriptome Deconvolution of Heterogeneous Tumor Samples with Immune Infiltration

2017 ◽  
Author(s):  
Zeya Wang ◽  
Shaolong Cao ◽  
Jeffrey S. Morris ◽  
Jaeil Ahn ◽  
Rongjie Liu ◽  
...  

Abstract Transcriptomic deconvolution in cancer and other heterogeneous tissues remains challenging. Available methods lack the ability to estimate both component-specific proportions and expression profiles for individual samples. We present DeMixT, a new tool to deconvolve high-dimensional data from mixtures of more than two components. DeMixT implements an iterated conditional mode algorithm and a novel gene-set-based component merging approach to improve accuracy. In a series of experimental validation studies and in an application to TCGA data, DeMixT showed high accuracy. Improved deconvolution is an important step towards linking tumor transcriptomic data with clinical outcomes. An R package, scripts and data are available at https://github.com/wwylab/DeMixT/.

2018 ◽  
Author(s):  
Guangsheng Pei ◽  
Yulin Dai ◽  
Zhongming Zhao ◽  
Peilin Jia

Abstract Motivation Diseases and traits are under dynamic tissue-specific regulation. However, heterogeneous tissues are often collected in biomedical studies, which reduces the power to identify disease-associated variants and gene expression profiles. Results We present TSEA, an R package to conduct Tissue-Specific Enrichment Analysis (TSEA) with two built-in reference panels. Statistical methods are developed and implemented for detecting tissue-specific genes and for enrichment testing of different forms of query data. Our applications using multi-trait genome-wide association data and cancer expression data showed that TSEA could effectively identify the most relevant tissues for each query trait or sample, providing insights for future studies. Availability https://github.com/bsml320/[email protected] or [email protected]


2020 ◽  
Vol 36 (11) ◽  
pp. 3431-3438
Author(s):  
Ziyi Li ◽  
Zhenxing Guo ◽  
Ying Cheng ◽  
Peng Jin ◽  
Hao Wu

Abstract Motivation In the analysis of high-throughput omics data from tissue samples, estimating and accounting for cell composition have been recognized as important steps. High cost, intensive labor requirements and technical limitations hinder the cell composition quantification using cell-sorting or single-cell technologies. Computational methods for cell composition estimation are available, but they are either limited by the availability of a reference panel or suffer from low accuracy. Results We introduce TOols for the Analysis of heterogeneouS Tissues TOAST/-P and TOAST/+P, two partial reference-free algorithms for estimating cell composition of heterogeneous tissues based on their gene expression profiles. TOAST/-P and TOAST/+P incorporate additional biological information, including cell-type-specific markers and prior knowledge of compositions, in the estimation procedure. Extensive simulation studies and real data analyses demonstrate that the proposed methods provide more accurate and robust cell composition estimation than existing methods. Availability and implementation The proposed methods TOAST/-P and TOAST/+P are implemented as part of the R/Bioconductor package TOAST at https://bioconductor.org/packages/TOAST. Contact [email protected] or [email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Wikum Dinalankara ◽  
Qian Ke ◽  
Donald Geman ◽  
Luigi Marchionni

Abstract Given the ever-increasing amount of high-dimensional and complex omics data becoming available, it is increasingly important to discover simple but effective methods of analysis. Divergence analysis transforms each entry of a high-dimensional omics profile into a digitized (binary or ternary) code based on the deviation of the entry from a given baseline population. This is a novel framework that is significantly different from existing omics data analysis methods: it allows digitization of continuous omics data at the univariate or multivariate level, facilitates sample level analysis, and is applicable on many different omics platforms. The divergence package, available on the R platform through the Bioconductor repository collection, provides easy-to-use functions for carrying out this transformation. Here we demonstrate how to use the package with sample high-throughput sequencing data from the Cancer Genome Atlas.


2014 ◽  
Author(s):  
Karl W Broman

Every data visualization can be improved with some level of interactivity. Interactive graphics hold particular promise for the exploration of high-dimensional data. R/qtlcharts is an R package to create interactive graphics for experiments to map quantitative trait loci (QTL; genetic loci that influence quantitative traits). R/qtlcharts serves as a companion to the R/qtl package, providing interactive versions of R/qtl's static graphs, as well as additional interactive graphs for the exploration of high-dimensional genotype and phenotype data.


PLoS ONE ◽  
2021 ◽  
Vol 16 (4) ◽  
pp. e0249002
Author(s):  
Wikum Dinalankara ◽  
Qian Ke ◽  
Donald Geman ◽  
Luigi Marchionni

Given the ever-increasing amount of high-dimensional and complex omics data becoming available, it is increasingly important to discover simple but effective methods of analysis. Divergence analysis transforms each entry of a high-dimensional omics profile into a digitized (binary or ternary) code based on the deviation of the entry from a given baseline population. This is a novel framework that is significantly different from existing omics data analysis methods: it allows digitization of continuous omics data at the univariate or multivariate level, facilitates sample level analysis, and is applicable on many different omics platforms. The divergence package, available on the R platform through the Bioconductor repository collection, provides easy-to-use functions for carrying out this transformation. Here we demonstrate how to use the package with data from the Cancer Genome Atlas.
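The digitization step described above can be illustrated with a minimal Python sketch. It is not the divergence package's API or its estimation procedure: the function name `ternary_divergence` and the use of simple empirical quantiles of the baseline as the "normal" range are assumptions made only for illustration. Each feature value is coded -1, 0, or +1 depending on whether it falls below, inside, or above the baseline range.

```python
import numpy as np

def ternary_divergence(baseline, samples, lower_q=0.025, upper_q=0.975):
    """Code each feature of each sample as -1, 0, or +1 relative to a baseline range.

    baseline: array of shape (n_baseline, n_features)
    samples:  array of shape (n_samples, n_features)
    The quantile-based range is an assumption of this sketch, not the
    package's own baseline estimator.
    """
    baseline = np.asarray(baseline, dtype=float)
    samples = np.asarray(samples, dtype=float)
    # Per-feature lower and upper bounds estimated from the baseline population.
    lo = np.quantile(baseline, lower_q, axis=0)
    hi = np.quantile(baseline, upper_q, axis=0)
    codes = np.zeros(samples.shape, dtype=int)
    codes[samples < lo] = -1   # below the baseline range
    codes[samples > hi] = 1    # above the baseline range
    return codes

# Hypothetical data: 101 baseline profiles with values 0..100 in each of 2 features.
baseline = np.tile(np.arange(101.0)[:, None], (1, 2))
samples = np.array([[1.0, 50.0], [99.0, 98.0]])
ternary = ternary_divergence(baseline, samples)
binary = np.abs(ternary)  # binary code: divergent (1) or not (0)
```

Taking the absolute value of the ternary code gives a binary "divergent or not" code, which is one simple way to relate the two digitizations mentioned in the abstract.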


Author(s):  
R. F. Mudde ◽  
C. Van Pijpen ◽  
R. Beugels

The PRIMIX helical static mixer has been investigated using numerical simulations. The flow is in the laminar regime (Re = 1 to 1000). The simulations concentrate on the pressure drop and on the use of particle tracking for mixing studies. For the pressure drop, experimental validation is provided. It is found that the pressure drop can be simulated with high accuracy for Re < 350. For higher Re values no grid-independent solution could be obtained, and the experimental results no longer agree with those of the simulations. The simulated pressure drop, scaled to the empty-pipe pressure drop, is well summarized by K = 4.99 + Re/31.4. Using particle tracking, it has been possible to reproduce literature data. However, the results obtained are rather sensitive to the choice of the time step. This limits the direct use of particle tracking techniques for studying the mixing of static mixers in the laminar regime.
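As a small worked example of the correlation quoted above, the following Python sketch evaluates K = 4.99 + Re/31.4. The function names are hypothetical, and reading K as the ratio of mixer pressure drop to empty-pipe pressure drop is an interpretation of the abstract's wording. The range check reflects only the statement that the simulations were accurate for Re < 350.

```python
def scaled_pressure_drop(reynolds):
    """Return K = 4.99 + Re/31.4, the pressure drop scaled to the empty pipe."""
    if reynolds <= 0:
        raise ValueError("Reynolds number must be positive")
    return 4.99 + reynolds / 31.4

def in_accurate_range(reynolds):
    """True where the abstract reports high simulation accuracy (Re < 350)."""
    return 0 < reynolds < 350

# Example: at Re = 314 the correlation gives K of about 14.99.
k_example = scaled_pressure_drop(314.0)
```

Outside Re < 350 the correlation can still be evaluated, but the abstract reports that simulation and experiment no longer agree there, so such values should be treated with caution.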


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Jan Klosa ◽  
Noah Simon ◽  
Pål Olof Westermark ◽  
Volkmar Liebscher ◽  
Dörte Wittenburg

Abstract Background Statistical analyses of biological problems in life sciences often lead to high-dimensional linear models. To solve the corresponding system of equations, penalization approaches are often the methods of choice. They are especially useful in cases of multicollinearity, which appears if the number of explanatory variables exceeds the number of observations or for some biological reason. Then, the model goodness of fit is penalized by some suitable function of interest. Prominent examples are the lasso, group lasso and sparse-group lasso. Here, we offer a fast and numerically cheap implementation of these operators via proximal gradient descent. The grid search for the penalty parameter is realized by warm starts. The step size between consecutive iterations is determined with backtracking line search. Finally, seagull, the R package presented here, produces complete regularization paths. Results Publicly available high-dimensional methylation data are used to compare seagull to the established R package SGL. The results of both packages enabled a precise prediction of biological age from DNA methylation status. But even though the results of seagull and SGL were very similar (R² > 0.99), seagull computed the solution in a fraction of the time needed by SGL. Additionally, seagull enables the incorporation of weights for each penalized feature. Conclusions The following operators for linear regression models are available in seagull: lasso, group lasso, sparse-group lasso and Integrative LASSO with Penalty Factors (IPF-lasso). Thus, seagull is a convenient envelope of lasso variants.
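The three ingredients named in the abstract (proximal gradient descent, backtracking line search, and warm starts along a penalty grid) can be sketched in Python for the plain lasso. This is not seagull's implementation or interface: the function names, the objective scaling (1/(2n))·||y − Xβ||² + λ·||β||₁, and all default parameters are assumptions of this sketch, and the group and sparse-group penalties are not covered.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_proximal_gradient(X, y, lam, beta0=None, step=1.0, shrink=0.5,
                            max_iter=1000, tol=1e-8):
    """Minimize (1/(2n))||y - X b||^2 + lam * ||b||_1 by proximal gradient descent."""
    n, p = X.shape
    beta = np.zeros(p) if beta0 is None else np.array(beta0, dtype=float)

    def smooth_loss(b):
        r = y - X @ b
        return 0.5 * (r @ r) / n

    for _ in range(max_iter):
        grad = -X.T @ (y - X @ beta) / n
        loss = smooth_loss(beta)
        t = step
        while True:
            # Gradient step on the smooth part, then the l1 proximal step.
            candidate = soft_threshold(beta - t * grad, t * lam)
            diff = candidate - beta
            # Backtracking: accept t once the quadratic upper bound holds.
            bound = loss + grad @ diff + (diff @ diff) / (2.0 * t)
            if smooth_loss(candidate) <= bound or t < 1e-12:
                break
            t *= shrink
        beta = candidate
        if np.max(np.abs(diff)) < tol:
            break
    return beta

def lasso_path(X, y, lambdas):
    """Solve along a decreasing penalty grid, warm-starting each fit."""
    beta = np.zeros(X.shape[1])
    path = []
    for lam in sorted(lambdas, reverse=True):
        beta = lasso_proximal_gradient(X, y, lam, beta0=beta)
        path.append((lam, beta.copy()))
    return path

# Hypothetical orthogonal design, for which the lasso solution is known in
# closed form: soft_threshold(X.T @ y / n, lam).
n = 3
X = np.sqrt(n) * np.eye(n)
y = X @ np.array([3.0, -2.0, 0.1])
path = lasso_path(X, y, [0.5, 5.0])
```

Starting each fit from the previous solution is what makes the grid search cheap: neighbouring penalty values usually have similar solutions, so few iterations are needed per grid point.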


2004 ◽  
Vol 3 (1) ◽  
pp. 1-24 ◽  
Author(s):  
Markus Ruschhaupt ◽  
Wolfgang Huber ◽  
Annemarie Poustka ◽  
Ulrich Mansmann

We demonstrate a concept and implementation of a compendium for the classification of high-dimensional data from microarray gene expression profiles. A compendium is an interactive document that bundles primary data, statistical processing methods, figures, and derived data together with the textual documentation and conclusions. Interactivity allows the reader to modify and extend these components. We address the following questions: how much does the discriminatory power of a classifier depend on the choice of the algorithm that was used to identify it; what alternative classifiers could be used just as well; and how robust is the result. The answers to these questions are essential prerequisites for validation and biological interpretation of the classifiers. We show how to use this approach by looking at these questions for a specific breast cancer microarray data set that was first studied by Huang et al. (2003).


Author(s):  
Chastine Fatichah ◽  
Martin Leonard Tangel ◽  
Muhammad Rahmat Widyanto ◽  
Fangyan Dong ◽  
...  

An Interest-based Ordering Scheme (IOS) for fuzzy morphology on White-Blood-Cell (WBC) image segmentation is proposed to improve the accuracy of segmentation. The proposed method shows high accuracy in segmenting both high- and low-density nuclei. Further, its running time is low, so it can be used for real applications. To evaluate the performance of the proposed method, 100 WBC images and 10 leukemia images are used, and the experimental results show that the proposed IOS segments a nucleus in WBC images 3.99% more accurately on average than the Lexicographical Ordering Scheme (LOS) does and 5.29% more accurately on average than the combined Fuzzy Clustering and Binary Morphology (FCBM) method does. The proposed method segments a cytoplasm 20.72% more accurately on average than the FCBM method. The WBC image segmentation is part of WBC classification in an automatic cancer-diagnosis application that is being developed. In addition, the proposed method can be used to segment any images that focus on the important color of an object of interest.

