When to use Quantile Normalization?

2014 ◽  
Author(s):  
Stephanie C. Hicks ◽  
Rafael A. Irizarry

Normalization and preprocessing are essential steps in the analysis of high-throughput data, including next-generation sequencing and microarrays. Multi-sample global normalization methods, such as quantile normalization, have been successfully used to remove technical variation from noisy data. These methods rely on the assumption that observed global changes across samples are due to unwanted technical variability. Transforming the data to remove these differences has the potential to remove interesting biologically driven global variation and therefore may not be appropriate, depending on the type and source of variation. Currently, it is up to subject matter experts, for example biologists, to determine whether the stated assumptions are appropriate. Here, we propose a data-driven method to test the assumptions of global normalization methods. We demonstrate the utility of our method (quantro) by applying it to multiple gene expression and DNA methylation datasets and show examples of when global normalization methods are not appropriate. We also perform a Monte Carlo simulation study to illustrate how our method generally outperforms the current approach. An R package implementing our method is available on Bioconductor (http://www.bioconductor.org/packages/release/bioc/html/quantro.html).
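
As a rough illustration of how such a test might be run, the sketch below applies the quantro package to a toy matrix. The quantro()/quantroPlot() interface and the argument names (object, groupFactor, B) are recalled from the package vignette and should be checked against the current Bioconductor documentation; the data here are simulated.

library(quantro)

set.seed(1)
expr  <- matrix(rnorm(1000 * 6, mean = 8), nrow = 1000)   # toy intensities, features x samples
group <- factor(rep(c("normal", "tumour"), each = 3))     # biological groups

qtest <- quantro(object = expr, groupFactor = group, B = 1000)  # test with B permutations
qtest                   # prints the quantro statistic and permutation p-value
# quantroPlot(qtest)    # histogram of the permuted null distribution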

2016 ◽  
Author(s):  
Stephanie C Hicks ◽  
Kwame Okrah ◽  
Joseph N Paulson ◽  
John Quackenbush ◽  
Rafael A Irizarry ◽  
...  

Abstract Between-sample normalization is a critical step in genomic data analysis to remove systematic bias and unwanted technical variation in high-throughput data. Global normalization methods are based on the assumption that observed variability in global properties is due to technical reasons and is unrelated to the biology of interest. For example, some methods correct for differences in sequencing read counts by scaling features to have similar median values across samples, but these fail to reduce other forms of unwanted technical variation. Methods such as quantile normalization transform the statistical distributions across samples to be the same and assume that global differences in the distribution are induced only by technical variation. However, it remains unclear how to proceed with normalization if these assumptions are violated, for example if there are global differences in the statistical distributions between biological conditions or groups and external information, such as negative control features, is not available. Here we introduce a generalization of quantile normalization, referred to as smooth quantile normalization (qsmooth), which is based on the assumption that the statistical distribution of each sample should be the same (or have the same distributional shape) within biological groups or conditions, but may differ between groups. We illustrate the advantages of our method on several high-throughput datasets with global differences in distributions corresponding to different biological conditions. We also perform a Monte Carlo simulation study to illustrate the bias-variance tradeoff of qsmooth compared to other global normalization methods. A software implementation is available from https://github.com/stephaniehicks/qsmooth.
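
A minimal usage sketch, assuming the package exposes qsmooth() taking a count matrix and a group factor, and qsmoothData() returning the normalized matrix (function and argument names recalled from the package documentation and may differ between the GitHub and Bioconductor versions); the data are simulated.

library(qsmooth)

counts <- matrix(rpois(2000 * 6, lambda = 20), nrow = 2000)  # toy counts, features x samples
group  <- factor(rep(c("brain", "liver"), each = 3))         # biological conditions

qs       <- qsmooth(object = counts, group_factor = group)
norm_mat <- qsmoothData(qs)  # distributions forced to agree within, but not between, groups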


2020 ◽  
Vol 65 (1) ◽  
pp. 17-26
Author(s):  
Gergely Olt ◽  
Adrienne Csizmady

Abstract The growth of the tourism and hospitality industry played an important role in the gentrification of the post-socialist city of Budapest. Although disinvestment was present, reinvestment was moderate for decades after 1989. Privatisation of individual tenancies and the consequent fragmented ownership structure of heritage buildings made refurbishment and reinvestment less profitable. Because of local contextual factors and global changes in consumption habits, the function of the dilapidated 19th-century housing stock transformed in the 2000s, and the residential neighbourhood that was the subject of the research turned into the so-called 'party district'. The process was followed in our ongoing field research. The functional change made speculative investment in inner-city housing possible and played a major role in the commodification of the disinvested housing stock.


2020 ◽  
Vol 7 (1) ◽  
pp. 1-11
Author(s):  
Izzah Tiari ◽  
Zulkardi Zulkardi ◽  
Sardianto Markos Siahaan

Chamilo-based e-learning has been developed for the digital simulation and communication subject at SMK Negeri 5 Palembang. The research aims to determine the validity, practicality, and effectiveness of the e-learning for the subject matter of online collaboration features. This development research consists of planning, design, and development stages, following the Alessi and Trollip model. The e-learning that was developed was then tested for validity by experts, for practicality by students, and for effectiveness by implementing it at SMK Negeri 5 Palembang. The study found that: 1) the e-learning was valid, with ratings of 89.47% from media experts, 84.62% from subject matter experts, and 81.82% from learning design experts; 2) it was practical, with an average rating of 86.10% from three students of low, medium, and high ability; and 3) it was effective in improving student learning outcomes, as shown by a gain of 0.70, which falls into the high category.
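
For readers unfamiliar with the reported gain score: assuming it is the normalized (Hake) gain commonly used in education research, it is computed as g = (post-test % − pre-test %) / (100% − pre-test %); for example, a pre-test score of 50% and a post-test score of 85% give g = 0.70, and values of 0.7 or above are conventionally classed as high.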


2021 ◽  
Vol 11 (3-4) ◽  
pp. 181-195
Author(s):  
Anetta Jedličková

Abstract The current coronavirus disease 2019 (COVID-19) pandemic has led to essential adjustments in clinical research involving human subjects. The pandemic is substantially affecting most procedures of ongoing, as well as new, clinical trials related to diseases other than COVID-19. Procedural changes and study protocol modifications may significantly impact ethically salient fundamentals, such as the risk-benefit profile and safety of clinical trial participants, raising key ethical challenges that subject-matter experts must face. This article aims to acquaint a wide audience of clinical research professionals, ethicists, and members of the general public interested in this topic with the legal, ethical, and practical considerations in the field of clinical trials during the COVID-19 pandemic, and to support clinical researchers and study sponsors in fulfilling their responsibilities to conduct clinical trials in a professional way that does not conflict with any legal or ethical obligations.


2019 ◽  
Vol 36 (8) ◽  
pp. 2587-2588 ◽  
Author(s):  
Christopher M Ward ◽  
Thu-Hien To ◽  
Stephen M Pederson

Abstract Motivation: High-throughput next-generation sequencing (NGS) has become exceedingly cheap, making it feasible to undertake studies with large sample numbers. Quality control (QC) is an essential stage in analytic pipelines, and the outputs of popular bioinformatics tools such as FastQC and Picard can provide information on individual samples. Although these tools provide considerable power when carrying out QC, large sample numbers can make inspection of all samples and identification of systemic bias a challenge. Results: We present ngsReports, an R package designed for the management and visualization of NGS reports from within an R environment. The available methods allow direct import into R of FastQC reports along with outputs from other tools. Visualization can be carried out across many samples using default, highly customizable plots, with options to perform hierarchical clustering to quickly identify outlier libraries. Moreover, these can be displayed in an interactive shiny app or HTML report for ease of analysis. Availability and implementation: The ngsReports package is available on Bioconductor and the GUI shiny app is available at https://github.com/UofABioinformaticsHub/shinyNgsreports. Supplementary information: Supplementary data are available at Bioinformatics online.
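
A minimal sketch of the intended workflow, assuming FastqcDataList(), plotReadTotals() and plotBaseQuals() are the import and plotting entry points (names recalled from the package vignette; verify against the Bioconductor reference for the release you use); the directory path is hypothetical.

library(ngsReports)

fastqc_zips <- list.files("fastqc_output", pattern = "_fastqc\\.zip$", full.names = TRUE)
fdl <- FastqcDataList(fastqc_zips)   # parse every FastQC report into one object

plotReadTotals(fdl)                  # library sizes across all samples
plotBaseQuals(fdl)                   # per-base quality summary across samples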


2005 ◽  
Vol 44 (03) ◽  
pp. 414-417 ◽  
Author(s):  
M. Neuhäuser ◽  
T. Boes

Summary Objectives: The high-density oligonucleotide microarrays from Affymetrix (Affymetrix GeneChips) are very popular in biomedical research. They make it possible to study the expression of thousands of genes simultaneously. In experiments with multiple arrays, normalization techniques are used to reduce the so-called obscuring variation, i.e. the technical variation that is of non-biological origin. Several different normalization methods have been proposed in recent years. Methods: We review published results on the comparison of normalization methods proposed for Affymetrix GeneChips. Results: Quantile normalization seems to perform favorably regarding precision (low variance), accuracy (low bias), and practicability (low computing time). However, according to very recent results [1], this normalization method can have an impact on the biological variability and, therefore, appears to be less than optimal from this point of view. Conclusion: Although quantile normalization may be recommendable, more investigations based on more data sets are needed so that the different normalization methods can be evaluated on widely differing data.
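
The core idea of quantile normalization is simple enough to sketch in a few lines of base R. This toy implementation (not taken from any of the reviewed papers) forces every array onto the same reference distribution, taken as the row-wise mean of the sorted columns.

quantile_normalize <- function(mat) {
  ranks    <- apply(mat, 2, rank, ties.method = "average")  # rank of each value within its array
  ref_dist <- rowMeans(apply(mat, 2, sort))                 # reference: mean of the sorted arrays
  # map each value's rank back onto the reference distribution
  apply(ranks, 2, function(r) approx(seq_along(ref_dist), ref_dist, xout = r)$y)
}

toy <- matrix(rexp(500 * 4, rate = 0.01), nrow = 500)       # four toy arrays
qn  <- quantile_normalize(toy)
apply(qn, 2, summary)                                       # arrays now share the same distribution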


2010 ◽  
Vol 76 (12) ◽  
pp. 3863-3868 ◽  
Author(s):  
J. Kirk Harris ◽  
Jason W. Sahl ◽  
Todd A. Castoe ◽  
Brandie D. Wagner ◽  
David D. Pollock ◽  
...  

ABSTRACT Constructing mixtures of tagged or bar-coded DNAs for sequencing is an important requirement for the efficient use of next-generation sequencers in applications where limited sequence data are required per sample. There are many applications in which next-generation sequencing can be used effectively to sequence large mixed samples; an example is the characterization of microbial communities, where ≤1,000 sequences per sample are adequate to address research questions. Thus, it is possible to examine hundreds to thousands of samples per run on massively parallel next-generation sequencers. However, the cost savings from efficient utilization of sequence capacity are realized only if the production and management costs associated with construction of multiplex pools are also scalable. One critical step in multiplex pool construction is the normalization process, whereby equimolar amounts of each amplicon are mixed. Here we compare three approaches (spectroscopy, size-restricted spectroscopy, and quantitative binding) for normalization of large, multiplex amplicon pools in terms of performance and efficiency. We found that the quantitative binding approach was superior and represents an efficient, scalable process for construction of very large multiplex pools with hundreds, and perhaps thousands, of individual amplicons included. We demonstrate the increased sequence diversity identified with higher throughput. Massively parallel sequencing can dramatically accelerate microbial ecology studies by allowing appropriate replication of sequence acquisition to account for temporal and spatial variations. Further, population studies to examine genetic variation, which require even lower levels of sequencing, should be possible where thousands of individual bar-coded amplicons are examined in parallel.
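
To make the "equimolar" requirement concrete, the toy calculation below (not from the paper; all concentrations and amounts are hypothetical) converts measured DNA concentrations to molarity and derives the volume of each amplicon to pipette into the pool.

conc_ng_ul <- c(A = 40, B = 10, C = 25)         # hypothetical concentrations, ng/µl
len_bp     <- c(A = 450, B = 450, C = 450)      # amplicon lengths, bp
conc_nM    <- conc_ng_ul * 1e6 / (660 * len_bp) # ~660 g/mol per bp of double-stranded DNA
target_fmol <- 100                              # amount of each amplicon wanted in the pool
vol_ul <- target_fmol / conc_nM                 # 1 nM = 1 fmol/µl, so volume = amount / molarity
round(vol_ul, 2)                                # dilute amplicons contribute larger volumes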


Author(s):  
Anthony R. Mundy ◽  
Daniela E. Andrich

Urethral strictures are common and almost all urologists will deal with them on a regular if not daily basis. They have always been common and the history of the subject stretches back to 3,000 BC. Urethral dilators have been found in the tombs of the pharaohs so that they might be able to catheterize themselves or dilate their own strictures in the afterlife. Urethrotomy and dilatation are two of the most frequently performed procedures in urology. But these are usually only palliative, and curative treatment by urethroplasty is performed by very few urologists. In part this is because most strictures are bulbar strictures and most non-bulbar strictures are seen only by reconstructive urologists; but in part this represents a somewhat ambivalent attitude of most urologists to urethral stricture disease. In this chapter, we will attempt to clarify the current approach to this problem.


Nutrients ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 407 ◽  
Author(s):  
Aaron Mehus ◽  
Aaron Dickey ◽  
Timothy Smith ◽  
Kathleen Yeater ◽  
Matthew Picklo

Dietary n-3 polyunsaturated fatty acids (PUFA) influence postnatal brain growth and development. However, little data exist regarding the impacts of dietary n-3 PUFA in juvenile animals post-weaning, which is a time of rapid growth. We tested the hypothesis that depleting dietary n-3 PUFA would result in modifications to the cerebellar transcriptome of juvenile rats. To test this hypothesis, three-week-old male rats (an age that roughly corresponds to an 11-month-old child in brain development) were fed diets containing either soybean oil (SO), providing 1.1% of energy from α-linolenic acid (ALA; 18:3n-3; ALA-sufficient), or corn oil (CO), providing 0.13% of energy from ALA (ALA-deficient), for four weeks. Fatty acids (FAs) in the cerebellum were analyzed and revealed a 4-fold increase in n-6 docosapentaenoic acid (DPA; 22:5n-6) and increases in arachidonic acid (AA; 20:4n-6) and docosatetraenoic acid (DTA; 22:4n-6), but no decrease in docosahexaenoic acid (DHA; 22:6n-3), in animals fed CO versus SO. Transcript abundance was then characterized to identify differentially expressed genes (DEGs) between the two diets. Upper-quartile (UQ) scaling and transcripts-per-million (TPM) data normalization identified 100 and 107 DEGs, respectively. Comparison of DEGs from the two normalization methods identified 70 genes that overlapped, with 90% having abundance differences of less than 2-fold. Nr4a3, a transcriptional activator that plays roles in neuroprotection and learning, was elevated over 2-fold on the CO diet. These data indicate that expression of Nr4a3 in the juvenile rat cerebellum is responsive to dietary n-3 PUFA, but additional studies are needed to clarify the neurodevelopmental relationships between n-3 PUFA and Nr4a3 and the resulting impacts.
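
As a rough sketch of the two count normalizations named above (not the authors' pipeline; counts and gene lengths are simulated):

counts   <- matrix(rpois(5000 * 6, lambda = 30), nrow = 5000)  # toy count matrix, genes x samples
gene_len <- sample(500:5000, 5000, replace = TRUE)             # toy gene lengths in bp

# Transcripts per million: length-normalize, then rescale each sample to sum to 1e6
rpk <- counts / (gene_len / 1000)
tpm <- t(t(rpk) / colSums(rpk)) * 1e6

# Upper-quartile scaling: divide each sample by its 75th percentile of non-zero counts
uq      <- apply(counts, 2, function(x) quantile(x[x > 0], 0.75))
uq_norm <- t(t(counts) / uq) * mean(uq)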

