Which normalization method is best? A platform-independent biologically inspired quantitative comparison of normalization methods

Normalization for Affymetrix GeneChips

Methods of Information in Medicine ◽

10.1055/s-0038-1633986 ◽

2005 ◽

Vol 44 (03) ◽

pp. 414-417 ◽

Cited By ~ 11

Author(s):

M. Neuhäuser ◽

T. Boes

Keyword(s):

Biomedical Research ◽

Computing Time ◽

Normalization Method ◽

Point Of View ◽

Data Sets ◽

Quantile Normalization ◽

Biological Variability ◽

Biological Origin ◽

Affymetrix Genechips ◽

Normalization Methods

Summary Objectives: High-density oligonucleotide microarrays from Affymetrix (Affymetrix GeneChips) are very popular in biomedical research. They make it possible to study the expression of thousands of genes simultaneously. In experiments with multiple arrays, normalization techniques are used to reduce the so-called obscuring variation, i.e. the technical variation that is of non-biological origin. Several different normalization methods have been proposed in recent years. Methods: We review published results on the comparison of normalization methods proposed for Affymetrix GeneChips. Results: Quantile normalization seems to perform favorably regarding precision (low variance), accuracy (low bias), and practicability (low computing time). However, according to very recent results [1], this normalization method can have an impact on the biological variability and, therefore, appears to be less than optimal from this point of view. Conclusion: Although quantile normalization may be recommendable, more investigations based on more data sets are needed so that the different normalization methods can be evaluated on widely differing data.
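
For readers unfamiliar with the quantile normalization discussed in this abstract, the following is a minimal sketch of the textbook procedure, not code from the reviewed paper. The function name `quantile_normalize` and the list-of-lists input layout are assumptions made for illustration, and ties are broken by position rather than averaged.

```python
def quantile_normalize(arrays):
    """Force every array (chip) to share the same intensity distribution.

    arrays: list of equal-length lists, one list of probe intensities per array.
    Returns a new list of lists in the same layout.
    """
    n = len(arrays[0])
    # Sort each array's intensities independently.
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across arrays at each rank.
    reference = [
        sum(s[rank] for s in sorted_arrays) / len(arrays) for rank in range(n)
    ]
    normalized = []
    for a in arrays:
        # Probe indices ordered by intensity (ties broken by position).
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            # Replace each intensity with the reference value at its rank.
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    chips = [[1, 3, 2], [10, 30, 20]]
    print(quantile_normalize(chips))
```

After this step every array has an identical set of values, which is why the method reduces technical variation but, as the abstract cautions, may also affect biological variability.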


DEVELOPING WASPAS-RTB METHOD FOR RANGE TARGET-BASED CRITERIA: TOWARD SELECTION FOR ROBUST DESIGN

Technological and Economic Development of Economy ◽

10.3846/20294913.2017.1295288 ◽

2018 ◽

Vol 24 (4) ◽

pp. 1362-1387 ◽

Cited By ~ 3

Author(s):

Ali Jahan

Keyword(s):

Robust Design ◽

Research Performance ◽

Normalization Method ◽

Point Target ◽

Normalization Methods ◽

Promising Target ◽

Multi Attribute Decision Making ◽

Product Assessment ◽

Benefit Cost ◽

Simulation Parameters

Recently, considerable attention has been devoted to the application of multi-attribute decision-making (MADM) methods in materials selection. Normalization can be considered a foundation for rational MADM methods, which should deal with target-based criteria in addition to cost and benefit criteria. Although a good number of applications have been reported for point target criteria in MADM problems, in selection problems related to engineering design it might be better to let the material and design criteria vary over a range in order to increase flexibility in subsequent design stages. This flexibility supports a design that adapts readily to changing customer requirements, which is also significant for offering a robust design. In this research, the performance of three promising target-based normalization methods was investigated using simulation experiments to examine the effect of simulation parameters. The effect of parameters and normalization methods was examined using analysis of variance (ANOVA). Moreover, the best structure formula was identified to propose an inclusive range target-based normalization method. The suggested normalization method was used to enhance the capability of the Weighted Aggregated Sum Product Assessment (WASPAS) method and applied to a real-world problem dealing with benefit-, cost-, and point target-based criteria as well as the range criterion.


GMPR: A novel normalization method for microbiome sequencing data

10.1101/112565 ◽

2017 ◽

Cited By ~ 1

Author(s):

Li Chen ◽

Jun Chen

Keyword(s):

Amplicon Sequencing ◽

Normalization Method ◽

Superior Performance ◽

Supplementary Information ◽

Rrna Gene ◽

Sequencing Data ◽

Data Simulation ◽

Vast Number ◽

Number Of Zeros ◽

Normalization Methods

Abstract. Summary: Normalization is the first and a critical step in microbiome sequencing (microbiome-Seq) data analysis to account for variable library sizes. Though RNA-Seq based normalization methods have been adapted for microbiome-Seq data, they fail to consider the unique characteristics of microbiome-Seq data, which contain a vast number of zeros due to the physical absence or undersampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR - a simple but effective normalization method - for zero-inflated sequencing data such as microbiome-Seq data. Simulation studies and analyses of 38 real gut microbiome datasets from 16S rRNA gene amplicon sequencing demonstrated the superior performance of the proposed method. Availability and Implementation: 'GMPR' is implemented in R and available at https://github.com/jchen1981/GMPR. Supplementary Information: Supplementary data are available at Bioinformatics.
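
The abstract does not spell out the computation, so the following is only a rough sketch of the geometric-mean-of-pairwise-ratios idea behind the method's name, not the authors' R implementation. The function name `gmpr_size_factors`, the exclusion of the self-comparison, and the absence of a minimum number of shared taxa are all assumptions made for illustration.

```python
import math
from statistics import median


def gmpr_size_factors(counts):
    """Estimate one size factor per sample from zero-inflated counts.

    counts: list of samples, each a list of taxon counts in the same taxon order.
    For each pair of samples, the median count ratio is taken over taxa that
    are non-zero in BOTH samples; a sample's size factor is the geometric mean
    of its pairwise medians (assumption: the self-comparison is skipped).
    """
    factors = []
    for i, sample_i in enumerate(counts):
        log_medians = []
        for j, sample_j in enumerate(counts):
            if i == j:
                continue
            # Zeros are skipped, so shared absences do not distort the ratio.
            ratios = [a / b for a, b in zip(sample_i, sample_j) if a > 0 and b > 0]
            if ratios:
                log_medians.append(math.log(median(ratios)))
        if log_medians:
            factors.append(math.exp(sum(log_medians) / len(log_medians)))
        else:
            # No taxa shared with any other sample: factor is undefined.
            factors.append(float("nan"))
    return factors


if __name__ == "__main__":
    print(gmpr_size_factors([[2, 4, 0, 8], [1, 2, 3, 4]]))
```

Dividing each sample's counts by its size factor would then give normalized abundances; the point of working pairwise is that two samples usually share more non-zero taxa than the whole dataset does.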


GMPR: A robust normalization method for zero-inflated count data with application to microbiome sequencing data

10.7287/peerj.preprints.3417v2 ◽

2017 ◽

Author(s):

Li Chen ◽

James Reeve ◽

Lujun Zhang ◽

Shengbing Huang ◽

Jun Chen

Keyword(s):

Normalization Method ◽

Rna Seq ◽

Sequencing Data ◽

Data Simulation ◽

Vast Number ◽

Number Of Zeros ◽

Normalization Methods ◽

Under Sampling ◽

Microbiome Data ◽

Sequencing Data Analysis

Normalization is the first critical step in microbiome sequencing data analysis, used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR - a simple but effective normalization method - for zero-inflated sequencing data such as microbiome data. Simulation studies and analyses of real datasets demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.


SMIXnorm: Fast and Accurate RNA-Seq Data Normalization for Formalin-Fixed Paraffin-Embedded Samples

Frontiers in Genetics ◽

10.3389/fgene.2021.650795 ◽

2021 ◽

Vol 12 ◽

Author(s):

Shen Yin ◽

Xiaowei Zhan ◽

Bo Yao ◽

Guanghua Xiao ◽

Xinlei Wang ◽

...

Keyword(s):

Mixture Model ◽

Normalization Method ◽

Superior Performance ◽

Rna Seq ◽

Web Based ◽

Normalization Methods ◽

Formalin Fixed Paraffin ◽

Ffpe Samples ◽

Formalin Fixed Paraffin Embedded ◽

Formalin Fixed

RNA-sequencing (RNA-seq) provides a comprehensive quantification of transcriptomic activities in biological samples. Formalin-Fixed Paraffin-Embedded (FFPE) samples are collected as part of routine clinical procedure, and are the most widely available biological sample format in medical research and patient care. Normalization is an essential step in RNA-seq data analysis. A number of normalization methods, though developed for RNA-seq data from fresh frozen (FF) samples, can be used with FFPE samples as well. The only extant normalization method specifically designed for FFPE RNA-seq data, MIXnorm, has been shown to outperform these normalization methods, but at the cost of a complex mixture model and a high computational burden. It is therefore important to adapt MIXnorm for simplicity and computational efficiency while maintaining superior performance. Furthermore, it is critical to develop an integrated tool that performs commonly used normalization methods for both FF and FFPE RNA-seq data. We developed a new normalization method for FFPE RNA-seq data, named SMIXnorm, based on a two-component mixture model that is simplified relative to MIXnorm to facilitate computation. The expression levels of expressed genes are modeled by normal distributions without truncation, and those of non-expressed genes are modeled by zero-inflated Poisson distributions. The maximum likelihood estimates of the model parameters are obtained by a nested Expectation-Maximization algorithm with a less complicated latent variable structure, and closed-form updates are available within each iteration. Real data applications and simulation studies show that SMIXnorm greatly reduces computing time compared to MIXnorm, without sacrificing performance. More importantly, we developed a web-based tool, RNA-seq Normalization (RSeqNorm), that offers a simple workflow to compute normalized RNA-seq data for both FFPE and FF samples. It includes SMIXnorm and MIXnorm for FFPE RNA-seq data, together with five commonly used normalization methods for FF RNA-seq data. Users can easily upload a raw RNA-seq count matrix and select one of the seven normalization methods to produce a downloadable normalized expression matrix for any downstream analysis. The R package is available at https://github.com/S-YIN/RSEQNORM. The web-based tool, RSeqNorm, is available at http://lce.biohpc.swmed.edu/rseqnorm with no restriction on use or redistribution.


Measurement of Face Detection Accuracy Using Intensity Normalization Method and Homomorphic Filtering

International Journal of Engineering and Emerging Technology ◽

10.24843/ijeet.2017.v02.i01.p22 ◽

2017 ◽

Vol 2 (1) ◽

pp. 107

Author(s):

I Nyoman Gede Arya Astawa ◽

I Ketut Gede Darma Putra ◽

I Made Sudarma ◽

Rukmi Sari Hartati

Keyword(s):

Face Recognition ◽

Detection System ◽

Recognition System ◽

Face Image ◽

Normalization Method ◽

Detection Accuracy ◽

Normalization Methods ◽

Homomorphic Filtering ◽

The Face ◽

Intensity Normalization

Lighting is one of the factors that affect face detection and recognition systems. Image color processing can help a face recognition system in poor lighting conditions. In this study, homomorphic filtering and intensity normalization methods are used to help improve the accuracy of face image detection. The experimental results show that non-uniform illumination of the face image can be made uniform using the intensity normalization method; the average Peak Signal to Noise Ratio (PSNR) obtained over the whole experiment is 22.05314 and the average Absolute Mean Brightness Error (AMBE) is 6.147787. The results show that homomorphic filtering and intensity normalization methods can be used to improve the detection accuracy of a face image.
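
The abstract reports PSNR and AMBE without giving their formulas. The sketch below uses the standard textbook definitions, which may differ in detail from what the authors computed. The function names, the nested-list grayscale image layout, and the 8-bit peak value of 255 are assumptions for illustration.

```python
import math


def _flatten(image):
    """Flatten a 2-D grayscale image (list of rows) into one list of pixels."""
    return [pixel for row in image for pixel in row]


def psnr(original, processed, max_value=255):
    """Peak Signal to Noise Ratio in dB: 10 * log10(max_value^2 / MSE)."""
    a, b = _flatten(original), _flatten(processed)
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def ambe(original, processed):
    """Absolute Mean Brightness Error: |mean(original) - mean(processed)|."""
    a, b = _flatten(original), _flatten(processed)
    return abs(sum(a) / len(a) - sum(b) / len(b))


if __name__ == "__main__":
    before = [[100, 100], [100, 100]]
    after = [[110, 110], [110, 110]]
    print(psnr(before, after), ambe(before, after))
```

Under these definitions a higher PSNR means the processed image is closer to the original, while a lower AMBE means its average brightness has shifted less.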


The Choice of Groups of Variable Normalization Methods in Multidimensional Scaling

Przegląd Statystyczny ◽

10.5604/01.3001.0014.1145 ◽

2016 ◽

Vol 63 (1) ◽

pp. 7-18 ◽

Cited By ~ 2

Author(s):

Marek Walesiak

Keyword(s):

Multidimensional Scaling ◽

R Package ◽

Normalization Method ◽

Data Matrix ◽

Data Normalization ◽

Normalization Methods ◽

Interval Ratio ◽

Normalization Function

In multidimensional scaling carried out on the basis of a metric data matrix (interval, ratio), one of the stages is the choice of the variable normalization method. The R package clusterSim, with its data.Normalization function, has been developed for that purpose. It provides 18 data normalization methods. This paper presents a proposed procedure that makes it possible to isolate groups of normalization methods that lead to similar multidimensional scaling results. The proposal can reduce the problem of choosing the normalization method in multidimensional scaling. The results are illustrated with an empirical example.


A New Visible Albedo Normalization Method: Quasi-Lambertian Surface Adjustment

Journal of Atmospheric and Oceanic Technology ◽

10.1175/jtech-d-11-00191.1 ◽

2012 ◽

Vol 29 (4) ◽

pp. 589-596 ◽

Cited By ~ 5

Author(s):

Xiao-yong Zhuge ◽

Fan Yu ◽

Ye Wang

Keyword(s):

Satellite Data ◽

Zenith Angle ◽

Normalization Method ◽

Radiative Transfer Model ◽

Time Range ◽

Transfer Model ◽

Image Brightness ◽

Normalization Methods ◽

Geostationary Meteorological Satellite ◽

Cloud Tops

Abstract A new visible (VIS; 0.55–0.9 μm) albedo normalization method, that is, the quasi-Lambertian surface adjustment (QLSA), is developed herein by using the geostationary meteorological satellite data and radiative transfer model. Taking the variation of relative locations between the sun, satellite, and clouds into account, the QLSA effectively reduces the inconsistencies in the VIS image brightness caused by the Lambertian surface approximation to cloud tops (i.e., the reflection characteristic is isotropic). The evaluation, using Chinese and Japanese geostationary satellite data, shows that the QLSA is more effective and accurate than three other albedo normalization methods currently in use. The new algorithm is applicable in regions with solar zenith angle and satellite zenith angle less than 60°, which, in the summertime, approximately corresponds to the time range from 0800 to 1600 local time (LT).


GMPR: A robust normalization method for zero-inflated count data with application to microbiome sequencing data

10.7287/peerj.preprints.3417 ◽

2018 ◽

Author(s):

Li Chen ◽

James Reeve ◽

Lujun Zhang ◽

Shengbing Huang ◽

Xuefeng Wang ◽

...

Keyword(s):

Normalization Method ◽

Rna Seq ◽

Sequencing Data ◽

Data Simulation ◽

Vast Number ◽

Number Of Zeros ◽

Normalization Methods ◽

Under Sampling ◽

Microbiome Data ◽

Sequencing Data Analysis

Normalization is the first critical step in microbiome sequencing data analysis, used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR - a simple but effective normalization method - for zero-inflated sequencing data such as microbiome data. Simulation studies and analyses of real datasets demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.


NOREVA: enhanced normalization and evaluation of time-course and multi-class metabolomic data

Nucleic Acids Research ◽

10.1093/nar/gkaa258 ◽

2020 ◽

Vol 48 (W1) ◽

pp. W436-W448 ◽

Cited By ~ 12

Author(s):

Qingxia Yang ◽

Yunxia Wang ◽

Ying Zhang ◽

Fengcheng Li ◽

Weiqi Xia ◽

...

Keyword(s):

Time Course ◽

Case Control ◽

Normalization Method ◽

Case Control Studies ◽

Combination Strategy ◽

Normalization Methods ◽

Metabolomic Data ◽

Benchmark Datasets ◽

Version 2.0 ◽

Clear Shift

Abstract Biological processes (like microbial growth and physiological response) are usually dynamic and require the monitoring of metabolic variation at different time-points. Moreover, there is a clear shift from case-control (N=2) studies to multi-class (N>2) problems in current metabolomics, which is crucial for revealing the mechanisms underlying certain physiological processes, disease metastasis, etc. These time-course and multi-class metabolomics have attracted great attention, and data normalization is essential for removing unwanted biological/experimental variations in these studies. However, no tool (including NOREVA 1.0, which focuses only on case-control studies) is available for effectively assessing the performance of normalization methods on time-course/multi-class metabolomic data. Thus, NOREVA was updated to version 2.0 by (i) realizing normalization and evaluation of both time-course and multi-class metabolomic data, (ii) integrating 144 normalization methods of a recently proposed combination strategy and (iii) identifying the well-performing methods by comprehensively assessing the largest set of normalizations (168 in total, significantly more than the 24 in NOREVA 1.0). The significance of this update was extensively validated by case studies on benchmark datasets. All in all, NOREVA 2.0 is distinguished by its capability to identify well-performing normalization method(s) for time-course and multi-class metabolomics, which makes it an indispensable complement to other available tools. NOREVA can be accessed at https://idrblab.org/noreva/.
