Which normalization method is best? A platform-independent biologically inspired quantitative comparison of normalization methods

Author(s):  
E.P. van Someren ◽  
M.J.T. Reinders
2005 ◽  
Vol 44 (03) ◽  
pp. 414-417 ◽  
Author(s):  
M. Neuhäuser ◽  
T. Boes

Summary Objectives: The high-density oligonucleotide microarrays from Affymetrix (Affymetrix GeneChips) are very popular in biomedical research. They make it possible to study the expression of thousands of genes simultaneously. In experiments with multiple arrays, normalization techniques are used to reduce the so-called obscuring variation, i.e. the technical variation that is of non-biological origin. Several different normalization methods have been proposed in recent years. Methods: We review published results on the comparison of normalization methods proposed for Affymetrix GeneChips. Results: Quantile normalization seems to perform favorably regarding precision (low variance), accuracy (low bias), and practicability (low computing time). However, according to very recent results [1], this normalization method can have an impact on the biological variability and, therefore, appears to be less than optimal from this point of view. Conclusion: Although quantile normalization may be recommendable, more investigations based on more data sets are needed so that the different normalization methods can be evaluated on widely differing data.
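To make the quantile normalization discussed above concrete, the following is a minimal sketch in Python with NumPy. It is not the implementation reviewed in the abstract; the function name `quantile_normalize` and the genes-by-arrays layout are assumptions made for illustration, and ties are broken by sort order rather than averaged.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize the columns of a genes-by-arrays matrix.

    Hypothetical helper: every column (array) ends up with the same
    empirical distribution, namely the mean of the sorted columns.
    Tied values within a column are NOT averaged in this sketch.
    """
    x = np.asarray(x, dtype=float)
    # Sort each column independently and remember the sort order.
    order = np.argsort(x, axis=0, kind="stable")
    sorted_x = np.take_along_axis(x, order, axis=0)
    # Reference distribution: mean across arrays at each rank.
    rank_means = sorted_x.mean(axis=1)
    # Write the reference values back in each column's original order.
    out = np.empty_like(x)
    np.put_along_axis(out, order, rank_means[:, None], axis=0)
    return out

# Toy example: 4 genes measured on 3 arrays.
demo = np.array([[5.0, 4.0, 3.0],
                 [2.0, 1.0, 4.0],
                 [3.0, 4.0, 6.0],
                 [4.0, 2.0, 8.0]])
normalized = quantile_normalize(demo)
```

After the call, each column of `normalized` contains the same set of values, only arranged according to that column's original ranking.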


2018 ◽  
Vol 24 (4) ◽  
pp. 1362-1387 ◽  
Author(s):  
Ali Jahan

Recently, considerable attention has been devoted to the application of multi-attribute decision-making (MADM) methods in materials selection. Normalization can be considered a foundation for rational MADM methods, which should deal with target-based criteria in addition to cost and benefit criteria. Although a good number of applications have been reported for point target criteria in MADM problems, in selection problems related to engineering design it might be better to let the material and design criteria vary over a range in order to increase flexibility in subsequent design stages. This flexibility supports a design that adapts readily to changing customer requirements, which is also significant in offering a robust design. In this research, the performance of three promising target-based normalization methods was investigated using simulation experiments to examine the effect of simulation parameters. The effect of parameters and normalization methods was examined using analysis of variance (ANOVA). Moreover, the best structure formula was identified to propose an inclusive range target-based normalization method. The suggested normalization method was used to enhance the capability of the Weighted Aggregated Sum Product Assessment (WASPAS) method and applied to a real-world problem dealing with benefit-, cost-, and point target-based criteria as well as the range criterion.
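For readers unfamiliar with WASPAS, the sketch below shows the commonly described form of the score: a weighted sum and a weighted product of the normalized decision matrix, blended by a parameter. It uses only the usual linear benefit/cost normalization and does not reproduce the range target-based normalization proposed in the abstract, whose formula is not given here. The function name, argument names, and the default blend of 0.5 are assumptions.

```python
import numpy as np

def waspas_scores(matrix, weights, is_benefit, lam=0.5):
    """Hypothetical WASPAS sketch for benefit and cost criteria only.

    matrix     : alternatives x criteria, strictly positive values
    weights    : criterion weights (assumed to sum to 1)
    is_benefit : True for benefit criteria, False for cost criteria
    lam        : blend between weighted sum (1.0) and weighted product (0.0)
    """
    x = np.asarray(matrix, dtype=float)
    w = np.asarray(weights, dtype=float)
    benefit = np.asarray(is_benefit, dtype=bool)
    # Linear normalization: x / max for benefit, min / x for cost criteria.
    norm = np.where(benefit, x / x.max(axis=0), x.min(axis=0) / x)
    weighted_sum = (norm * w).sum(axis=1)
    weighted_product = np.prod(norm ** w, axis=1)
    return lam * weighted_sum + (1.0 - lam) * weighted_product

# Two alternatives, one benefit criterion and one cost criterion.
scores = waspas_scores([[2.0, 10.0], [4.0, 5.0]], [0.5, 0.5], [True, False])
```

A target-based variant would replace only the normalization step, leaving the aggregation unchanged.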


2017 ◽  
Author(s):  
Li Chen ◽  
Jun Chen

Abstract Summary: Normalization is the first and a critical step in microbiome sequencing (microbiome-Seq) data analysis to account for variable library sizes. Though RNA-Seq based normalization methods have been adapted for microbiome-Seq data, they fail to consider the unique characteristics of microbiome-Seq data, which contain a vast number of zeros due to the physical absence or undersampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR - a simple but effective normalization method - for zero-inflated sequencing data such as microbiome-Seq data. Simulation studies and analyses of 38 real gut microbiome datasets from 16S rRNA gene amplicon sequencing demonstrated the superior performance of the proposed method. Availability and Implementation: ‘GMPR’ is implemented in R and available at https://github.com/jchen1981/GMPR. Supplementary Information: Supplementary data are available at Bioinformatics.


2017 ◽  
Author(s):  
Li Chen ◽  
James Reeve ◽  
Lujun Zhang ◽  
Shengbing Huang ◽  
Jun Chen

Normalization is the first critical step in microbiome sequencing data analysis used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR - a simple but effective normalization method - for zero-inflated sequencing data such as microbiome data. Simulation studies and real datasets analyses demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.
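The abstract does not spell out the computation, so the following is only a sketch of one way size factors based on a geometric mean of pairwise ratios could be computed, assuming that each pairwise ratio is the median count ratio over taxa that are nonzero in both samples. The function name `gmpr_size_factors`, the `min_shared` threshold, and the exclusion of the self-comparison are assumptions and may differ from the authors' R package.

```python
import numpy as np

def gmpr_size_factors(counts, min_shared=1):
    """Hypothetical sketch of pairwise-ratio size factors.

    counts     : samples x taxa matrix of non-negative counts
    min_shared : minimum number of taxa nonzero in both samples (assumed)
    Returns one size factor per sample (NaN if no usable partner sample).
    """
    counts = np.asarray(counts, dtype=float)
    n_samples = counts.shape[0]
    factors = np.full(n_samples, np.nan)
    for i in range(n_samples):
        log_medians = []
        for j in range(n_samples):
            if i == j:
                continue  # self-comparison skipped in this sketch
            # Only taxa observed in both samples, so zeros never enter a ratio.
            shared = (counts[i] > 0) & (counts[j] > 0)
            if shared.sum() < min_shared:
                continue
            ratios = counts[i, shared] / counts[j, shared]
            log_medians.append(np.log(np.median(ratios)))
        if log_medians:
            # Geometric mean of the median pairwise ratios.
            factors[i] = np.exp(np.mean(log_medians))
    return factors

# Three samples, four taxa; zeros mark absent or undersampled taxa.
toy_counts = np.array([[10, 0, 20, 4],
                       [20, 0, 40, 8],
                       [5, 3, 0, 2]])
size_factors = gmpr_size_factors(toy_counts)
```

Dividing each sample's counts by its size factor would then give normalized abundances; because ratios are taken only over shared nonzero taxa, the zero inflation does not distort the estimate.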


2021 ◽  
Vol 12 ◽  
Author(s):  
Shen Yin ◽  
Xiaowei Zhan ◽  
Bo Yao ◽  
Guanghua Xiao ◽  
Xinlei Wang ◽  
...  

RNA-sequencing (RNA-seq) provides a comprehensive quantification of transcriptomic activities in biological samples. Formalin-Fixed Paraffin-Embedded (FFPE) samples are collected as part of routine clinical procedure, and are the most widely available biological sample format in medical research and patient care. Normalization is an essential step in RNA-seq data analysis. A number of normalization methods, though developed for RNA-seq data from fresh frozen (FF) samples, can be used with FFPE samples as well. The only extant normalization method specifically designed for FFPE RNA-seq data, MIXnorm, has been shown to outperform these normalization methods, but at the cost of a complex mixture model and a high computational burden. It is therefore important to adapt MIXnorm for simplicity and computational efficiency while maintaining superior performance. Furthermore, it is critical to develop an integrated tool that performs commonly used normalization methods for both FF and FFPE RNA-seq data. We developed a new normalization method for FFPE RNA-seq data, named SMIXnorm, based on a two-component mixture model that is simpler than that of MIXnorm, to facilitate computation. The expression levels of expressed genes are modeled by normal distributions without truncation, and those of non-expressed genes are modeled by zero-inflated Poisson distributions. The maximum likelihood estimates of the model parameters are obtained by a nested Expectation-Maximization algorithm with a less complicated latent variable structure, and closed-form updates are available within each iteration. Real data applications and simulation studies show that SMIXnorm greatly reduces computing time compared to MIXnorm, without sacrificing performance. More importantly, we developed a web-based tool, RNA-seq Normalization (RSeqNorm), that offers a simple workflow to compute normalized RNA-seq data for both FFPE and FF samples. It includes SMIXnorm and MIXnorm for FFPE RNA-seq data, together with five commonly used normalization methods for FF RNA-seq data. Users can easily upload a raw RNA-seq count matrix and select one of the seven normalization methods to produce a downloadable normalized expression matrix for any downstream analysis. The R package is available at https://github.com/S-YIN/RSEQNORM. The web-based tool, RSeqNorm, is available at http://lce.biohpc.swmed.edu/rseqnorm with no restriction to use or redistribute.


Author(s):  
I Nyoman Gede Arya Astawa ◽  
I Ketut Gede Darma Putra ◽  
I Made Sudarma ◽  
Rukmi Sari Hartati

One of the factors that affects face detection and recognition systems is lighting. Image color processing can help a face recognition system in poor lighting conditions. In this study, homomorphic filtering and intensity normalization methods were used to help improve the accuracy of face image detection. The experimental results show that non-uniform illumination of the face image can be made uniform using the intensity normalization method: the average Peak Signal to Noise Ratio (PSNR) obtained over the whole experiment is 22.05314 and the average Absolute Mean Brightness Error (AMBE) is 6.147787. The results show that homomorphic filtering and intensity normalization methods can be used to improve the detection accuracy of a face image.
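The two quality measures reported above can be sketched as follows, assuming their standard definitions and 8-bit grayscale images (peak value 255). The abstract does not state which definitions or peak value the authors used, so the function names and the `max_value` default are assumptions.

```python
import numpy as np

def psnr(reference, processed, max_value=255.0):
    """Peak Signal to Noise Ratio in dB (standard definition, assumed)."""
    ref = np.asarray(reference, dtype=float)
    proc = np.asarray(processed, dtype=float)
    mse = np.mean((ref - proc) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

def ambe(reference, processed):
    """Absolute Mean Brightness Error: |mean(reference) - mean(processed)|."""
    ref = np.asarray(reference, dtype=float)
    proc = np.asarray(processed, dtype=float)
    return abs(ref.mean() - proc.mean())

# Toy 2x2 images: the processed image is uniformly 10 levels brighter.
original = np.full((2, 2), 100.0)
brightened = np.full((2, 2), 110.0)
toy_psnr = psnr(original, brightened)
toy_ambe = ambe(original, brightened)
```

Higher PSNR indicates less distortion relative to the reference image, while lower AMBE indicates that the mean brightness is better preserved.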


2016 ◽  
Vol 63 (1) ◽  
pp. 7-18 ◽  
Author(s):  
Marek Walesiak

In multidimensional scaling carried out on the basis of a metric data matrix (interval, ratio), one of the stages is the choice of the variable normalization method. The R package clusterSim, with its data.Normalization function, has been developed for that purpose. It provides 18 data normalization methods. This paper presents a proposed procedure that isolates groups of normalization methods leading to similar multidimensional scaling results. The proposal can reduce the problem of choosing the normalization method in multidimensional scaling. The results are illustrated with an empirical example.


2012 ◽  
Vol 29 (4) ◽  
pp. 589-596 ◽  
Author(s):  
Xiao-yong Zhuge ◽  
Fan Yu ◽  
Ye Wang

Abstract A new visible (VIS; 0.55–0.9 μm) albedo normalization method, the quasi-Lambertian surface adjustment (QLSA), is developed herein using geostationary meteorological satellite data and a radiative transfer model. Taking the variation of relative locations between the sun, satellite, and clouds into account, the QLSA effectively reduces the inconsistencies in the VIS image brightness caused by the Lambertian surface approximation to cloud tops (i.e., the assumption that reflection is isotropic). The evaluation, using Chinese and Japanese geostationary satellite data, shows that the QLSA is more effective and accurate than three other albedo normalization methods currently in use. The new algorithm is applicable in regions with solar zenith angle and satellite zenith angle less than 60°, which, in the summertime, approximately corresponds to the time range from 0800 to 1600 local time (LT).


2018 ◽  
Author(s):  
Li Chen ◽  
James Reeve ◽  
Lujun Zhang ◽  
Shengbing Huang ◽  
Xuefeng Wang ◽  
...  

Normalization is the first critical step in microbiome sequencing data analysis used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR - a simple but effective normalization method - for zero-inflated sequencing data such as microbiome data. Simulation studies and real datasets analyses demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.


2020 ◽  
Vol 48 (W1) ◽  
pp. W436-W448 ◽  
Author(s):  
Qingxia Yang ◽  
Yunxia Wang ◽  
Ying Zhang ◽  
Fengcheng Li ◽  
Weiqi Xia ◽  
...  

Abstract Biological processes (such as microbial growth and physiological response) are usually dynamic and require the monitoring of metabolic variation at different time points. Moreover, there is a clear shift from case-control (N=2) studies to multi-class (N>2) problems in current metabolomics, which is crucial for revealing the mechanisms underlying certain physiological processes, disease metastasis, etc. These time-course and multi-class metabolomics have attracted great attention, and data normalization is essential for removing unwanted biological/experimental variations in these studies. However, no tool (including NOREVA 1.0, which focuses only on case-control studies) is available for effectively assessing the performance of normalization methods on time-course/multi-class metabolomic data. Thus, NOREVA was updated to version 2.0 by (i) realizing normalization and evaluation of both time-course and multi-class metabolomic data, (ii) integrating 144 normalization methods of a recently proposed combination strategy and (iii) identifying the well-performing methods by comprehensively assessing the largest set of normalizations (168 in total, considerably more than the 24 in NOREVA 1.0). The significance of this update was extensively validated by case studies on benchmark datasets. All in all, NOREVA 2.0 is distinguished by its capability to identify well-performing normalization method(s) for time-course and multi-class metabolomics, which makes it an indispensable complement to other available tools. NOREVA can be accessed at https://idrblab.org/noreva/.

