Simultaneous Inference for High-Dimensional Approximate Factor Model

Yong Wang; Xiao Guo

doi:10.3390/e22111258

Entropy ◽

10.3390/e22111258 ◽

2020 ◽

Vol 22 (11) ◽

pp. 1258

Author(s):

Yong Wang ◽

Xiao Guo

Keyword(s):

Multiple Testing ◽

Factor Model ◽

Real Data ◽

Independent Random Variables ◽

Simultaneous Inference ◽

Test Statistic ◽

Asymptotic Size ◽

Power Of The Test ◽

Discrepancy Measure ◽

Approximate Factor Model

This paper studies simultaneous inference for factor loadings in the approximate factor model. We propose a test statistic based on the maximum discrepancy measure. Taking advantage of the fact that the test statistic can be approximated by a sum of independent random variables, we develop a multiplier bootstrap procedure to calculate the critical value and demonstrate the asymptotic size and power of the test. Finally, we apply our results to multiple testing problems by controlling the family-wise error rate (FWER). The conclusions are confirmed by simulations and real data analysis.
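
The multiplier bootstrap idea in this abstract can be illustrated with a minimal sketch. It is not the authors' construction: the input `scores` (one row per observation, one column per hypothesis) is a hypothetical stand-in for whatever independent summands approximate the test statistic, the Gaussian multipliers and column-wise centering are assumptions, and the single-step rejection rule is only one common way to control the FWER with a max-type statistic.

```python
import math
import random


def coordinate_stats(scores):
    """Per-coordinate statistics |n^{-1/2} * sum_t scores[t][j]|."""
    n = len(scores)
    p = len(scores[0])
    return [abs(sum(row[j] for row in scores)) / math.sqrt(n) for j in range(p)]


def max_stat(scores):
    """Max-type statistic: the largest per-coordinate statistic."""
    return max(coordinate_stats(scores))


def multiplier_bootstrap_critical_value(scores, alpha=0.05, n_boot=500, seed=0):
    """Approximate the (1 - alpha) quantile of the max statistic.

    Each bootstrap draw reweights the centered summands with i.i.d.
    N(0, 1) multipliers (an assumed choice) and recomputes the maximum.
    """
    rng = random.Random(seed)
    n = len(scores)
    p = len(scores[0])
    means = [sum(row[j] for row in scores) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in scores]
    draws = []
    for _ in range(n_boot):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        draws.append(
            max(
                abs(sum(e[t] * centered[t][j] for t in range(n))) / math.sqrt(n)
                for j in range(p)
            )
        )
    draws.sort()
    k = min(n_boot - 1, max(0, math.ceil((1 - alpha) * n_boot) - 1))
    return draws[k]


def single_step_rejections(scores, critical_value):
    """Indices whose statistic exceeds the common bootstrap critical value."""
    return [j for j, s in enumerate(coordinate_stats(scores)) if s > critical_value]
```

Because every coordinate is compared with the same bootstrap quantile of the maximum, the probability of any false rejection is approximately bounded by `alpha` when the approximation holds.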

Properties and Evaluation of the MOBIT – a novel Linkage-based Test Statistic and Quantification Method for Imprinting

Statistical Applications in Genetics and Molecular Biology ◽

10.1515/sagmb-2018-0025 ◽

2019 ◽

Vol 18 (4) ◽

Author(s):

Markus Brugger ◽

Michael Knapp ◽

Konstantin Strauch

Keyword(s):

Type I Error ◽

Real Data ◽

Specific Marker ◽

Type I ◽

Test Statistic ◽

Multipoint Analysis ◽

P Values ◽

House Dust Mite Allergy ◽

Power Of The Test ◽

Mite Allergy

Genomic imprinting is a parent-of-origin effect apparent in an appreciable number of human diseases. We have proposed the new imprinting test statistic MOBIT, which is based on MOD score analysis. We were interested in the properties of the MOBIT concerning its distribution under three hypotheses: (1) H0, a: no linkage, no imprinting; (2) H0, b: linkage, no imprinting; (3) H1: linkage and imprinting. More specifically, we assessed the confounding between imprinting and sex-specific recombination frequencies, which presents a major difficulty in linkage-based testing for imprinting, and evaluated the power of the test. To this end, we have performed a linkage simulation study of affected sib-pairs and a three-generation pedigree with two trait models, many two- and multipoint marker scenarios, three genetic map ratios, two sample sizes, and five imprinting degrees. We also investigated the ability of the MOBIT to quantify the degree of imprinting and applied the MOBIT using a real data example on house dust mite allergy. We further proposed and evaluated two approaches to obtain empiric p values for the MOBIT. Our results showed that twopoint analyses assuming a sex-averaged marker map led to an inflated type I error due to confounding, especially for a larger marker-trait locus distance. When the correct sex-specific marker map was assumed, twopoint analyses had reduced power to detect imprinting, compared to sex-averaged analyses with an appropriate correction for the inflation of the test statistic. However, confounding was not an issue in multipoint analysis unless the map ratio was extreme and marker spacing was sparse. With multipoint analysis, power as well as the ability to quantify the imprinting degree were almost equally high when a sex-averaged or the correct sex-specific map was used in the analysis.
We recommend obtaining empiric p values for the MOBIT using genotype simulations based on the best-fitting nonimprinting model of the real dataset analysis. In addition, an implementation of a method based on the permutation of parental sexes is also available. In summary, we propose to perform multipoint analyses using densely spaced markers to efficiently discover new imprinted loci and to reliably quantify the degree of imprinting.
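
The simulation-based route to an empiric p value can be sketched generically. The code below does not reproduce the MOBIT or any genotype simulation: `draw_null_stat` is a hypothetical callable standing in for "simulate a dataset under the null model and recompute the statistic", and the add-one estimator is a common convention assumed here, not one taken from the abstract.

```python
import random


def empirical_p_value(observed, simulated):
    """Add-one Monte Carlo p value: (1 + #{simulated >= observed}) / (1 + B)."""
    exceed = sum(1 for s in simulated if s >= observed)
    return (1 + exceed) / (1 + len(simulated))


def simulate_p_value(observed, draw_null_stat, n_sim=1000, seed=0):
    """Draw n_sim null statistics with a seeded generator and compare.

    draw_null_stat(rng) is a placeholder: it must return one statistic
    computed from data simulated under the null hypothesis.
    """
    rng = random.Random(seed)
    simulated = [draw_null_stat(rng) for _ in range(n_sim)]
    return empirical_p_value(observed, simulated)
```

The add-one form never returns exactly zero, so the smallest attainable p value is 1 / (1 + B) for B simulations.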

Monitoring Persistence Change in Heavy-Tailed Observations

Symmetry ◽

10.3390/sym13060936 ◽

2021 ◽

Vol 13 (6) ◽

pp. 936

Author(s):

Dan Wang

Keyword(s):

Kernel Method ◽

Alternative Hypothesis ◽

Null Distribution ◽

Real Data ◽

Ratio Test ◽

Finite Sample ◽

Test Statistic ◽

Bootstrap Approximation ◽

Heavy Tailed ◽

Better Than

In this paper, a ratio test based on bootstrap approximation is proposed to detect persistence change in heavy-tailed observations. This paper focuses on the symmetry testing problems of I(1)-to-I(0) and I(0)-to-I(1). On the basis of residual CUSUM, the test statistic is constructed in a ratio form. I derive the null distribution of the test statistic and discuss its consistency under the alternative hypothesis. However, the null distribution contains an unknown tail index. To address this challenge, I present a bootstrap approximation method for determining the rejection region of the test. Simulation studies on artificial data are conducted to assess the finite-sample performance; they show that the proposed method is better than the kernel method in all listed cases. The analysis of real data also demonstrates the excellent performance of this method.

A test for fuzzy exponentiality based on Kullback-Leibler information

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202555 ◽

2021 ◽

pp. 1-8

Author(s):

Lingtao Kong

Keyword(s):

Biological Sciences ◽

Monte Carlo ◽

Goodness Of Fit ◽

Experimental Studies ◽

Real Data ◽

Test Statistics ◽

Test Statistic ◽

Goodness Of Fit Test ◽

Higher Power ◽

Leibler Information

The exponential distribution has been widely used in engineering, social and biological sciences. In this paper, we propose a new goodness-of-fit test for fuzzy exponentiality using the α-pessimistic value. The test statistic is constructed based on Kullback-Leibler information. Using the Monte Carlo method, we obtain the empirical critical points of the test statistic at four different significance levels. To evaluate the performance of the proposed test, we compare it with four commonly used tests through simulations. Experimental studies show that the proposed test has higher power than the other tests in most cases. In particular, for the uniform and linear failure rate alternatives, our method has the best performance. A real data example is investigated to show the application of our test.
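
How Monte Carlo critical points arise for a Kullback-Leibler type exponentiality test can be sketched for ordinary (crisp) data. This is an illustration under stated assumptions, not the paper's fuzzy procedure: the α-pessimistic treatment of fuzzy observations is not reproduced, the entropy is estimated with a Vasicek-style spacing estimator with window `m` (an assumed choice), and the statistic is taken as minus the entropy estimate plus log(sample mean) plus 1, with large values counting against exponentiality.

```python
import math
import random


def vasicek_entropy(sample, m):
    """Spacing-based entropy estimate with window m (order statistics clamped at the ends)."""
    x = sorted(sample)
    n = len(x)
    total = 0.0
    for i in range(n):
        lo = x[max(i - m, 0)]
        hi = x[min(i + m, n - 1)]
        total += math.log(n / (2 * m) * (hi - lo))
    return total / n


def kl_exponential_stat(sample, m):
    """KL-type discrepancy from the exponential family; scale invariant."""
    mean = sum(sample) / len(sample)
    return -vasicek_entropy(sample, m) + math.log(mean) + 1.0


def monte_carlo_critical_point(n, m, alpha, n_sim=2000, seed=0):
    """Empirical (1 - alpha) quantile of the statistic under Exp(1) samples of size n."""
    rng = random.Random(seed)
    stats = sorted(
        kl_exponential_stat([rng.expovariate(1.0) for _ in range(n)], m)
        for _ in range(n_sim)
    )
    k = min(n_sim - 1, max(0, math.ceil((1 - alpha) * n_sim) - 1))
    return stats[k]
```

Since the statistic does not change when the data are rescaled, simulating from the unit-rate exponential is enough to tabulate critical points for any rate.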

Testing in Nonparametric Accelerated Life Time Models

Austrian Journal of Statistics ◽

10.17713/ajs.v37i1.289 ◽

2016 ◽

Vol 37 (1) ◽

Cited By ~ 1

Author(s):

Hannelore Liero

Keyword(s):

Limit Distribution ◽

Goodness Of Fit ◽

Bootstrap Method ◽

Life Time ◽

Test Statistic ◽

Goodness Of Fit Test ◽

Time Model ◽

Power Of The Test ◽

Accelerated Life ◽

Type Test

A goodness-of-fit test for the acceleration function in a nonparametric life time model is proposed. To this end, the limit distribution of an L2-type test statistic is derived. Furthermore, a bootstrap method is considered and the power of the test is studied.

An approach to gene-based testing accounting for dependence of tests among nearby genes

10.1101/2021.05.24.445494 ◽

2021 ◽

Author(s):

Ronald J Yurko ◽

Kathryn Roeder ◽

Bernie Devlin ◽

Max G'Sell

Keyword(s):

Multiple Testing ◽

Association Studies ◽

Autism Spectrum ◽

P Value ◽

Genome Wide Association Studies ◽

Strongly Correlated ◽

Test Statistics ◽

Test Statistic ◽

Genome Wide ◽

Insight Into

In genome-wide association studies (GWAS), it has become commonplace to test millions of SNPs for phenotypic association. Gene-based testing can improve power to detect weak signal by reducing multiple testing and pooling signal strength. While such tests account for the linkage disequilibrium (LD) structure of SNP alleles within each gene, current approaches do not capture LD of SNPs falling in different nearby genes, which can induce correlation of gene-based test statistics. We introduce an algorithm to account for this correlation. When a gene's test statistic is independent of others, it is assessed separately; when test statistics for nearby genes are strongly correlated, their SNPs are agglomerated and tested as a locus. To provide insight into SNPs and genes driving association within loci, we develop an interactive visualization tool to explore localized signal. We demonstrate our approach in the context of weakly powered GWAS for autism spectrum disorder, which is contrasted to more highly powered GWAS for schizophrenia and educational attainment. To increase power for these analyses, especially those for autism, we use adaptive p-value thresholding (AdaPT), guided by high-dimensional metadata modeled with gradient boosted trees, highlighting when and how it can be most useful. Notably, our workflow is based on summary statistics.
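
The agglomeration rule described here (assess independent genes separately, merge strongly correlated neighbours into a locus) can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: it assumes the genes are already ordered by genomic position, that a correlation between each pair of adjacent gene-level statistics has been computed elsewhere, and that a fixed cutoff (the hypothetical `threshold`) defines "strongly correlated".

```python
def agglomerate_genes(genes, adjacent_corr, threshold=0.75):
    """Group position-ordered genes into loci.

    adjacent_corr[i] is the correlation between the test statistics of
    genes[i] and genes[i + 1]; neighbours are merged into the same locus
    when its absolute value reaches the (assumed) threshold.
    """
    if not genes:
        return []
    if len(adjacent_corr) != len(genes) - 1:
        raise ValueError("need exactly one correlation per adjacent gene pair")
    loci = [[genes[0]]]
    for gene, r in zip(genes[1:], adjacent_corr):
        if abs(r) >= threshold:
            loci[-1].append(gene)  # strongly correlated: extend the current locus
        else:
            loci.append([gene])  # effectively independent: start a new unit
    return loci
```

With hypothetical gene labels, `agglomerate_genes(["A", "B", "C", "D"], [0.9, 0.1, 0.8])` groups A with B and C with D, and each resulting unit would then be tested once using the SNPs of all its member genes.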

Reproducibility-optimized detection of differential DNA methylation

Epigenomics ◽

10.2217/epi-2019-0289 ◽

2020 ◽

Vol 12 (9) ◽

pp. 747-755

Author(s):

Veronika Suni ◽

Fatemeh Seyednasrollah ◽

Bishwa Ghimire ◽

Sini Junttila ◽

Asta Laiho ◽

...

Keyword(s):

Dna Methylation ◽

High Throughput Sequencing ◽

State Of The Art ◽

Methylation Status ◽

Real Data ◽

Epigenetic Mechanism ◽

Methylation Analysis ◽

Test Statistic ◽

Differentially Methylated Regions ◽

Dna Methylation Analysis

Aim: DNA methylation is a key epigenetic mechanism regulating gene expression. Identifying differentially methylated regions is integral to DNA methylation analysis, and there is a need for robust tools that reliably detect regions with significant differences in their methylation status. Materials & methods: We present here a reproducibility-optimized test statistic (ROTS) for detection of differential DNA methylation from high-throughput sequencing or array-based data. Results: Using both simulated and real data, we demonstrate the ability of ROTS to identify differential methylation between sample groups. Conclusion: Compared with state-of-the-art methods, ROTS shows competitive sensitivity and specificity in detecting consistently differentially methylated regions.

Joint variable selection and network modeling for detecting eQTLs

Statistical Applications in Genetics and Molecular Biology ◽

10.1515/sagmb-2019-0032 ◽

2020 ◽

Vol 19 (1) ◽

Author(s):

Xuan Cao ◽

Lili Ding ◽

Tesfaye B. Mersha

Keyword(s):

Variable Selection ◽

Multiple Testing ◽

Bayes Factor ◽

Graph Model ◽

Real Data ◽

Bayesian Regression ◽

Joint Estimation ◽

Eqtl Analysis ◽

Multiple Testing Correction ◽

Joint Variable

In this study, we compare three recent statistical methods for joint variable selection and covariance estimation, with application to detecting expression quantitative trait loci (eQTL) and gene network estimation, and introduce a new hierarchical Bayesian method to be included in the comparison. Unlike the traditional univariate regression approach in eQTL, all four methods correlate phenotypes and genotypes by multivariate regression models that incorporate the dependence information among phenotypes, and use Bayesian multiplicity adjustment to avoid multiple testing burdens raised by traditional multiple testing correction methods. We present the performance of three methods (MSSL – Multivariate Spike and Slab Lasso, SSUR – Sparse Seemingly Unrelated Bayesian Regression, and OBFBF – Objective Bayes Fractional Bayes Factor), along with the proposed JDAG (Joint estimation via a Gaussian Directed Acyclic Graph model) method, through simulation experiments and publicly available HapMap real data, taking asthma as an example. Compared with existing methods, JDAG identified networks with higher sensitivity and specificity under row-wise sparse settings. JDAG requires less execution time in small-to-moderate dimensions, but is not currently applicable to high-dimensional data. The eQTL analysis in asthma data showed a number of known gene regulations such as STARD3, IKZF3 and PGAP3, all reported in asthma studies. The code of the proposed method is freely available at GitHub (https://github.com/xuan-cao/Joint-estimation-for-eQTL).

An Omnibus Test for Systematic Changes in Judges’ Rankings

Journal of Educational Statistics ◽

10.3102/10769986017001001 ◽

1992 ◽

Vol 17 (1) ◽

pp. 1-26

Author(s):

Douglas E. Critchlow ◽

Joseph S. Verducci

Keyword(s):

Literary Criticism ◽

Null Hypothesis ◽

Statistical Test ◽

Post Treatment ◽

Graphical Methods ◽

Test Results ◽

Test Statistic ◽

Omnibus Test ◽

Null Distributions ◽

Power Of The Test

Paired rankings arise when each subject in a study independently ranks a set of items, undergoes a treatment, and afterwards ranks the same set of items. For such data, a statistical test is proposed to detect whether the subjects’ posttreatment rankings have moved systematically toward some unknown ranking or set of rankings. The null hypothesis for this test is that each subject’s posttreatment ranking is symmetrically distributed about his pretreatment ranking. The exact and asymptotic null distributions of the test statistic are simulated and compared, and the power of the test is studied. Using paired rankings from an experimental course in literary criticism, we also offer some graphical methods for representing such data that help us to interpret the test results.

Common risk difference test and interval estimation of risk difference for stratified bilateral correlated data

Statistical Methods in Medical Research ◽

10.1177/0962280218781988 ◽

2018 ◽

Vol 28 (8) ◽

pp. 2418-2438

Author(s):

Xi Shen ◽

Chang-Xing Ma ◽

Kam C Yuen ◽

Guo-Liang Tian

Keyword(s):

Data Analysis ◽

Confidence Interval ◽

Score Test ◽

Real Data ◽

Risk Difference ◽

Error Rates ◽

Correlated Data ◽

Test Statistic ◽

Data Set ◽

Intra Class Correlation

Bilateral correlated data are often encountered in medical research such as ophthalmologic (or otolaryngologic) studies, in which each unit contributes information from paired organs to the data analysis, and the measurements from such paired organs are generally highly correlated. Various statistical methods have been developed to tackle intra-class correlation in bilateral correlated data analysis. In practice, it is very important to adjust for the effect of confounders on statistical inferences, since ignoring either the intra-class correlation or the confounding effect may lead to biased results. In this article, we propose three approaches for testing common risk difference for stratified bilateral correlated data under the assumption of equal correlation. Five confidence intervals of common difference of two proportions are derived. The performance of the proposed test methods and confidence interval estimations is evaluated by Monte Carlo simulations. The simulation results show that the score test statistic outperforms other statistics in the sense that the former has robust type I error rates with high powers. The score confidence interval induced from the score test statistic performs satisfactorily in terms of coverage probabilities with reasonable interval widths. A real data set from an otolaryngologic study is used to illustrate the proposed methodologies.

A novel gene-set association test based on variance-gamma distribution

Statistical Methods in Medical Research ◽

10.1177/0962280218791205 ◽

2018 ◽

Vol 28 (9) ◽

pp. 2868-2875

Author(s):

Zhongxue Chen ◽

Qingzhong Liu ◽

Kai Wang

Keyword(s):

Gamma Distribution ◽

Type I Error ◽

Null Distribution ◽

Real Data ◽

Association Test ◽

P Value ◽

Type I ◽

Test Statistic ◽

Data Set ◽

Variance Gamma

Several gene- or set-based association tests have been proposed recently in the literature. Powerful statistical approaches are still highly desirable in this area. In this paper we propose a novel statistical association test, which uses information of the burden component and its complement from the genotypes. This new test statistic has a simple null distribution, which is a special and simplified variance-gamma distribution, and its p-value can be easily calculated. Through a comprehensive simulation study, we show that the new test can control the type I error rate and has superior detection power compared with some popular existing methods. We also apply the new approach to a real data set; the results demonstrate that this test is promising.
