Reporting correct p-values in VEGAS analyses

2017 ◽  
Vol 20 (3) ◽  
pp. 257-259 ◽  
Author(s):  
Julian Hecker ◽  
Anna Maaser ◽  
Dmitry Prokopenko ◽  
Heide Loehlein Fier ◽  
Christoph Lange

VEGAS (versatile gene-based association study) is a popular methodological framework to perform gene-based tests based on summary statistics from single-variant analyses. The approach incorporates linkage disequilibrium information from reference panels to account for the correlation of test statistics. The gene-based test can utilize three different types of tests. In 2015, the improved framework VEGAS2, using more detailed reference panels, was published. Both versions provide user-friendly web- and offline-based tools for the analysis. However, the implementation of the popular top-percentage test is erroneous in both versions. The p-values provided by VEGAS2 are deflated, i.e., anti-conservative. Based on real data examples, we demonstrate that this can substantially increase the rate of false-positive findings and can lead to inconsistencies between different test options. We also provide code that allows the user of VEGAS to compute correct p-values.
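A correct computation amounts to evaluating the null distribution of the top-percentage statistic by Monte Carlo simulation from the LD structure. The following is a minimal sketch of this idea, assuming single-variant Z-scores and a reference-panel LD matrix as inputs; it is an illustration of the principle, not the authors' released code:

```python
import numpy as np

def top_percentage_pvalue(z_obs, ld, top_frac=0.1, n_sim=20000, seed=0):
    """Monte Carlo p-value for the top-percentage gene-based statistic.

    z_obs : observed single-variant Z-scores for the gene
    ld    : variant-by-variant LD (correlation) matrix from a reference panel
    The statistic is the sum of the largest ceil(top_frac * m) chi-square
    statistics; its null distribution is obtained by drawing Z-vectors from
    N(0, ld), which preserves the correlation of the test statistics.
    """
    rng = np.random.default_rng(seed)
    m = len(z_obs)
    k = max(1, int(np.ceil(top_frac * m)))
    observed = np.sort(np.asarray(z_obs) ** 2)[-k:].sum()
    # Cholesky factor of a slightly regularised LD matrix for correlated draws
    chol = np.linalg.cholesky(ld + 1e-8 * np.eye(m))
    sims = (chol @ rng.standard_normal((m, n_sim))) ** 2
    null_stats = np.sort(sims, axis=0)[-k:, :].sum(axis=0)
    # add-one correction keeps the Monte Carlo p-value strictly above zero
    return (1 + np.sum(null_stats >= observed)) / (1 + n_sim)
```

The add-one correction in the last line prevents reporting an exact zero, an anti-conservative artifact of plain Monte Carlo counting.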


2020 ◽  
Vol 175 (2) ◽  
pp. 156-167 ◽  
Author(s):  
Kenny Crump ◽  
Edmund Crouch ◽  
Daniel Zelterman ◽  
Casey Crump ◽  
Joseph Haseman

Abstract Glyphosate is a widely used herbicide worldwide. In 2015, the International Agency for Research on Cancer (IARC) reviewed glyphosate cancer bioassays and human studies and declared that the evidence for carcinogenicity of glyphosate is sufficient in experimental animals. We analyzed 10 glyphosate rodent bioassays, including those in which IARC found evidence of carcinogenicity, using a multiresponse permutation procedure that adjusts for the large number of tumors eligible for statistical testing and provides valid false-positive probabilities. The test statistics for these permutation tests are functions of p values from a standard test for dose-response trend applied to each specific type of tumor. We evaluated 3 permutation tests, using as test statistics the smallest p value from a standard statistical test for dose-response trend and the number of such tests for which the p value is less than or equal to .05 or .01. The false-positive probabilities obtained from 2 implementations of these 3 permutation tests are: smallest p value: .26, .17; p values ≤ .05: .08, .12; and p values ≤ .01: .06, .08. In addition, we found more evidence for negative dose-response trends than positive. Thus, we found no strong evidence that glyphosate is an animal carcinogen. The main cause for the discrepancy between IARC’s finding and ours appears to be that IARC did not account for the large number of tumor responses analyzed and the increased likelihood that several of these would show statistical significance simply by chance. This work provides a more comprehensive analysis of the animal carcinogenicity data for this important herbicide than previously available.
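The permutation logic described above, using the smallest trend-test p-value across all tumor types as the test statistic, can be sketched as follows. This is a simplified illustration with a normal-approximation Cochran-Armitage-style trend test; the paper's exact procedure differs in detail:

```python
import math
import numpy as np

def trend_pvalues(dose, tumors):
    """One-sided trend p-values (normal approximation), one per tumor type;
    columns of `tumors` are per-animal 0/1 incidence indicators."""
    dose = np.asarray(dose, float)
    tumors = np.asarray(tumors, float)
    d = dose - dose.mean()
    pbar = tumors.mean(axis=0)
    den = np.sqrt(pbar * (1 - pbar) * (d @ d))
    z = np.divide(d @ tumors, den, out=np.zeros(tumors.shape[1]), where=den > 0)
    # small p-value when incidence rises with dose
    return np.array([0.5 * math.erfc(zi / math.sqrt(2)) for zi in z])

def min_p_false_positive_prob(dose, tumors, n_perm=2000, seed=0):
    """Permutation false-positive probability for the smallest trend p-value,
    adjusting for the number of tumor types tested simultaneously."""
    rng = np.random.default_rng(seed)
    observed = trend_pvalues(dose, tumors).min()
    hits = sum(
        trend_pvalues(rng.permutation(dose), tumors).min() <= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)
```

Permuting the dose labels across animals preserves the correlation between tumor types while breaking any dose-response association, which is what makes the resulting probability valid despite the many endpoints tested.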


2019 ◽  
Author(s):  
Rumen Manolov

The lack of consensus regarding the most appropriate analytical techniques for single-case experimental design data requires justifying the choice of any specific analytical option. The current text mentions some of the arguments, provided by methodologists and statisticians, in favor of several analytical techniques. Additionally, a small-scale literature review is performed in order to explore whether and how applied researchers justify the analytical choices that they make. The review suggests that certain practices are not sufficiently explained. In order to improve the reporting of data-analytical decisions, it is proposed to choose and justify the data-analytical approach prior to gathering the data. As a possible justification for the data analysis plan, we propose using as a basis the expected data pattern (specifically, the expectation about an improving baseline trend and about the immediate or progressive nature of the intervention effect). Although there are multiple alternatives for single-case data analysis, the current text focuses on visual analysis and multilevel models and illustrates an application of these analytical options with real data. User-friendly software is also developed.


Econometrics ◽  
2021 ◽  
Vol 9 (1) ◽  
pp. 10
Author(s):  
Šárka Hudecová ◽  
Marie Hušková ◽  
Simos G. Meintanis

This article considers goodness-of-fit tests for bivariate INAR and bivariate Poisson autoregression models. The test statistics are based on an L2-type distance between two estimators of the probability generating function of the observations: one entirely nonparametric and the other semiparametric, computed under the corresponding null hypothesis. The asymptotic distribution of the proposed test statistics is derived both under the null hypotheses and under alternatives, and consistency is proved. The case of testing bivariate generalized Poisson autoregression and the extension of the methods to dimensions higher than two are also discussed. The finite-sample performance of a parametric bootstrap version of the tests is illustrated via a series of Monte Carlo experiments. The article concludes with applications to real data sets and discussion.
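For intuition, the L2-type distance between a nonparametric and a semiparametric estimator of the probability generating function can be illustrated in a univariate Poisson setting. This is a simplified sketch of the construction only; the article treats the bivariate time-series case:

```python
import numpy as np

def pgf_l2_statistic(x, grid=200):
    """n times the L2 distance on [0, 1] between the empirical probability
    generating function (1/n) * sum_i u**x_i and the PGF of a fitted Poisson
    distribution (a univariate sketch of the construction)."""
    x = np.asarray(x)
    lam = x.mean()
    u = np.linspace(0.0, 1.0, grid)
    g_emp = (u[:, None] ** x[None, :]).mean(axis=1)   # nonparametric estimator
    g_fit = np.exp(lam * (u - 1.0))                   # Poisson PGF, estimated mean
    d2 = (g_emp - g_fit) ** 2
    du = u[1] - u[0]
    integral = du * (d2[0] / 2 + d2[1:-1].sum() + d2[-1] / 2)  # trapezoid rule
    return len(x) * integral

def bootstrap_pvalue(x, n_boot=300, seed=0):
    """Parametric bootstrap p-value under the Poisson null hypothesis."""
    rng = np.random.default_rng(seed)
    t_obs = pgf_l2_statistic(x)
    lam = np.mean(x)
    boots = [pgf_l2_statistic(rng.poisson(lam, len(x))) for _ in range(n_boot)]
    return (1 + sum(t >= t_obs for t in boots)) / (1 + n_boot)
```

The parametric bootstrap mirrors the approach evaluated in the article's Monte Carlo experiments: the null distribution of the statistic is approximated by resampling from the fitted null model.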


2020 ◽  
Vol 36 (12) ◽  
pp. 3913-3915
Author(s):  
Hemi Luan ◽  
Xingen Jiang ◽  
Fenfen Ji ◽  
Zhangzhang Lan ◽  
Zongwei Cai ◽  
...  

Abstract Motivation Liquid chromatography–mass spectrometry-based non-targeted metabolomics is routinely performed to qualitatively and quantitatively analyze a tremendous number of metabolite signals in complex biological samples. However, false-positive peaks in the datasets are commonly detected as metabolite signals by many popular software packages, resulting in unreliable measurements. Results To reduce false-positive calling, we developed an interactive web tool, termed CPVA, for visualization and accurate annotation of the detected peaks in non-targeted metabolomics data. We used a chromatogram-centric strategy to unfold the characteristics of chromatographic peaks through visualization of peak morphology metrics, with additional functions to annotate adducts, isotopes and contaminants. CPVA is a free, user-friendly tool that helps users identify peak background noise and contaminants, reducing false-positive and redundant peak calls and thereby improving the data quality of non-targeted metabolomics studies. Availability and implementation CPVA is freely available at http://cpva.eastus.cloudapp.azure.com. Source code and installation instructions are available on GitHub: https://github.com/13479776/cpva. Supplementary information Supplementary data are available at Bioinformatics online.
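As an illustration of a peak morphology metric, one simple score is the correlation between an observed intensity trace and a moment-matched Gaussian shape; a well-formed chromatographic peak scores near 1, while noise scores lower. This is a hypothetical metric for illustration, not CPVA's actual algorithm:

```python
import numpy as np

def gaussian_similarity(rt, intensity):
    """Score peak morphology as the Pearson correlation between the observed
    intensity trace and a Gaussian whose mean and width are matched to the
    trace's intensity-weighted moments (illustrative metric only)."""
    rt = np.asarray(rt, float)
    y = np.asarray(intensity, float)
    w = y / y.sum()                       # intensity weights
    mu = (w * rt).sum()                   # weighted retention-time centroid
    sigma = np.sqrt((w * (rt - mu) ** 2).sum())
    if sigma == 0:
        return 0.0
    model = np.exp(-0.5 * ((rt - mu) / sigma) ** 2)
    return float(np.corrcoef(y, model)[0, 1])
```

A threshold on such a score is one way a tool can flag background noise or contaminant signals for manual review rather than silently reporting them as metabolites.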


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Bing Song ◽  
August E. Woerner ◽  
John Planz

Abstract Background Multi-locus genotype data are widely used in population genetics and disease studies. In evaluating the utility of multi-locus data, the independence of markers is commonly considered in many genomic assessments. Generally, pairwise non-random associations are tested by linkage disequilibrium; however, the dependence within a panel might be triplet, quartet, or higher-order. Therefore, a compatible and user-friendly software package is needed for testing and assessing the global linkage disequilibrium among mixed genetic data. Results This study describes a software package for testing the mutual independence of mixed genetic datasets. Mutual independence is defined as no non-random associations among all subsets of the tested panel. The new R package “mixIndependR” calculates basic genetic parameters such as allele frequency, genotype frequency, heterozygosity, Hardy–Weinberg equilibrium, and linkage disequilibrium (LD) from population data, regardless of the type of markers, such as single nucleotide polymorphisms, short tandem repeats, insertions and deletions, and any other genetic markers. A novel method of assessing the dependence of mixed genetic panels is developed in this study and functionally analyzed in the software package. The overall independence is tested by comparing the observed distributions of two common summary statistics (the number of heterozygous loci [K] and the number of shared alleles [X]) with their expected distributions under the assumption of mutual independence. Conclusion The package “mixIndependR” is compatible with all categories of genetic markers and detects overall non-random associations. Compared to pairwise disequilibrium, the approach described herein tends to have higher power, especially when the number of markers is large. With this package, more multi-functional or stronger genetic panels can be developed, such as mixed panels with different kinds of markers.
In population genetics, the package “mixIndependR” makes it possible to discover more about population admixture, natural selection, genetic drift, and population demographics, as a more powerful method of detecting LD. Moreover, this new approach can optimize variant selection in disease studies and contribute to panel combination for treatments in multimorbidity. Application of this approach to real data is expected in the future and might bring a leap in the field of genetic technology. Availability The R package mixIndependR is available on the Comprehensive R Archive Network (CRAN) at: https://cran.r-project.org/web/packages/mixIndependR/index.html.
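The K-statistic idea can be illustrated as follows: under mutual independence, the number of heterozygous loci per individual follows a Poisson-binomial distribution, and the observed distribution of K can be compared against it. This is a simplified Python sketch of the approach; the package itself is implemented in R and differs in detail:

```python
import numpy as np

def k_pmf_independent(h):
    """Poisson-binomial PMF of K, the number of heterozygous loci, expected
    under mutual independence; h holds per-locus heterozygosity frequencies.
    Built by convolving one Bernoulli distribution per locus."""
    pmf = np.array([1.0])
    for hj in h:
        pmf = np.convolve(pmf, [1.0 - hj, hj])
    return pmf

def k_independence_pvalue(het, n_sim=500, seed=0):
    """Monte Carlo test comparing the observed distribution of K with its
    expectation under mutual independence. `het` is a 0/1 matrix of
    individuals by loci (1 = heterozygous). A sketch of the idea behind
    mixIndependR, not its exact procedure."""
    rng = np.random.default_rng(seed)
    het = np.asarray(het)
    n, m = het.shape
    h = het.mean(axis=0)
    pmf = k_pmf_independent(h)

    def stat(mat):
        k = mat.sum(axis=1)
        obs = np.bincount(k, minlength=m + 1) / n
        return ((obs - pmf) ** 2).sum()   # squared distance to expectation

    t_obs = stat(het)
    sims = [stat((rng.random((n, m)) < h).astype(int)) for _ in range(n_sim)]
    return (1 + sum(t >= t_obs for t in sims)) / (1 + n_sim)
```

Because K pools information across all loci at once, this kind of test can pick up higher-order dependence that pairwise LD tests miss.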


Author(s):  
Liangli Yang ◽  
Yongmei Su ◽  
Xinjian Zhuo

The outbreak of COVID-19 has had a great impact on the world. Considering that there are different infection delays among different populations, which can be expressed as a distributed delay, and that distributed time-delays are rarely used in fractional-order models to fit real data, we establish two different types of fractional-order (Caputo and Caputo–Fabrizio) COVID-19 models with distributed time-delay. Parameters are estimated by the least-squares method from the reported data of China and 12 other countries. We compare the Caputo and Caputo–Fabrizio models with distributed time-delay and without delay, as well as the integer-order model with distributed delay. The results show that the fractional-order models fit the real data better; moreover, the Caputo model is better for short-term fitting, while the Caputo–Fabrizio model is better for long-term fitting and prediction. Finally, the influence of several parameters is simulated in the Caputo model, which further verifies the importance of taking strict quarantine measures and paying close attention to the incubation-period population.
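To illustrate what a Caputo fractional order adds, here is a scalar toy example: the fractional relaxation equation solved with an implicit Grünwald–Letnikov scheme, where setting α = 1 recovers ordinary exponential decay. This is an illustration of fractional dynamics with memory, not the authors' COVID-19 model:

```python
import numpy as np

def caputo_relaxation(alpha, lam, t_end, h):
    """Solve the Caputo fractional relaxation equation D^alpha y = -lam * y,
    y(0) = 1, on [0, t_end] with step h, using an implicit
    Grunwald-Letnikov discretization (scalar toy example)."""
    n_steps = int(round(t_end / h))
    # Grunwald-Letnikov weights g_k = (-1)^k * C(alpha, k), via recurrence
    g = np.empty(n_steps + 1)
    g[0] = 1.0
    for k in range(1, n_steps + 1):
        g[k] = g[k - 1] * (k - 1 - alpha) / k
    y = np.empty(n_steps + 1)
    y[0] = 1.0
    ha = h ** alpha
    for n in range(1, n_steps + 1):
        # memory term: the fractional derivative weighs the entire history,
        # which is what distinguishes it from an integer-order model
        hist = np.dot(g[1:n + 1], y[n - 1::-1] - y[0])
        y[n] = (y[0] - hist) / (1.0 + lam * ha)
    return y
```

For α = 1 the weights collapse to the implicit Euler scheme for y' = -λy, so the fractional model is a strict generalization of the integer-order one, with α acting as an extra fitting parameter.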


Author(s):  
Lingtao Kong

The exponential distribution has been widely used in engineering, social and biological sciences. In this paper, we propose a new goodness-of-fit test for fuzzy exponentiality using the α-pessimistic value. The test statistic is constructed from Kullback–Leibler information. Using the Monte Carlo method, we obtain the empirical critical points of the test statistic at four different significance levels. To evaluate the performance of the proposed test, we compare it with four commonly used tests through simulations. Experimental studies show that the proposed test has higher power than the other tests in most cases. In particular, for the uniform and linear failure rate alternatives, our method performs best. A real data example is investigated to show the application of our test.
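The Kullback–Leibler construction can be illustrated for crisp (non-fuzzy) data: estimate entropy by Vasicek's spacing method, form the estimated KL divergence to the fitted exponential, and obtain critical points by Monte Carlo simulation. This is an illustrative analogue; the paper's statistic additionally works through α-pessimistic values of fuzzy observations:

```python
import math
import numpy as np

def vasicek_entropy(x, m):
    """Vasicek spacing estimator of differential entropy with window m."""
    x = np.sort(np.asarray(x, float))
    n = len(x)
    lo = np.clip(np.arange(n) - m, 0, n - 1)
    hi = np.clip(np.arange(n) + m, 0, n - 1)
    return np.mean(np.log(n * (x[hi] - x[lo]) / (2 * m)))

def kl_exponentiality_statistic(x, m=3):
    """Estimated KL divergence between the data density and the fitted
    exponential: for Exp(rate = 1/mean), the cross-entropy term reduces to
    log(mean) + 1. Large values indicate departure from exponentiality."""
    x = np.asarray(x, float)
    return math.log(x.mean()) + 1.0 - vasicek_entropy(x, m)

def critical_value(n, m=3, alpha=0.05, n_sim=2000, seed=0):
    """Monte Carlo critical point under the exponential null; the null is
    scale-free, so simulating at rate 1 suffices."""
    rng = np.random.default_rng(seed)
    stats = [kl_exponentiality_statistic(rng.exponential(1.0, n), m)
             for _ in range(n_sim)]
    return float(np.quantile(stats, 1 - alpha))
```

Tabulating such critical values over several sample sizes and significance levels is exactly the kind of Monte Carlo step the abstract refers to.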


Materials ◽  
2019 ◽  
Vol 12 (23) ◽  
pp. 4005 ◽  
Author(s):  
Angelats Lobo ◽  
Ginestra

Classic cell culture involves the use of a two-dimensional support, such as a well plate or a Petri dish, that allows the culture of different types of cells. However, this technique does not mimic the natural microenvironment to which the cells are exposed. To solve that, three-dimensional bioprinting techniques were developed, which involve the use of biopolymers and/or synthetic materials together with cells. Because of a lack of consolidated information across data sources, the objective of this review is to sum up the available information on bioprinting and to help researchers with the practical challenges of 3D bioprinters such as the 3D-Bioplotter™. The 3D-Bioplotter™ has been used in the pre-clinical field since 2000; it can print more than one material at a time, thereby increasing the complexity of the manufactured 3D structure. It is also very precise, with maximum flexibility and user-friendly, stable software that allows optimization of the bioprinting process from a technological point of view. Different applications have resulted from research in this field, mainly focused on regenerative medicine, but the lack of information and possible misunderstandings between papers make the reproducibility of the tests difficult. Nowadays, 3D bioprinting is evolving into another technology called 4D bioprinting, which promises to be the next step in the bioprinting field and might enable great applications in the future.

