Evaluation of the statistical power for multiple tests: a case study

Adeline Yeo; Yongming Qu

doi:10.1002/pst.319

Transects, quadrats, or points? What is the best combination to get a precise estimation of a coral community?

10.1101/832790 ◽

2019 ◽

Author(s):

Luis M. Montilla ◽

Emy Miyazawa ◽

Alfredo Ascanio ◽

María López-Hernández ◽

Gloria Mariño-Briceño ◽

...

Keyword(s):

Community Structure ◽

Coral Reef ◽

Effect Size ◽

Standard Error ◽

Scleractinian Coral ◽

Statistical Power ◽

Coral Community ◽

Sampling Effort ◽

Precise Estimation

ABSTRACTThe characteristics of coral reef sampling and monitoring are highly variable, with numbers of units and sampling effort varying from one study to another. Numerous works have been carried out to determine an appropriate effect size through statistical power, however, always from a univariate perspective. In this work, we used the pseudo multivariate dissimilarity-based standard error (MultSE) approach to assess the precision of sampling scleractinian coral assemblages in reefs of Venezuela between 2017 and 2018 when using different combinations of number of transects, quadrats and points. For this, the MultSE of 36 sites previously sampled was estimated, using four 30m-transects with 15 photo-quadrats each and 25 random points per quadrat. We obtained that the MultSE was highly variable between sites and is not correlated with the univariate standard error nor with the richness of species. Then, a subset of sites was re-annotated using 100 uniformly distributed points, which allowed the simulation of different numbers of transects per site, quadrats per transect and points per quadrat using resampling techniques. The magnitude of the MultSE stabilized by adding more transects, however, adding more quadrats or points does not improve the estimate. For this case study, the error was reduced by half when using 10 transects, 10 quadrats per transect and 25 points per quadrat. We recommend the use of MultSE in reef monitoring programs, in particular when conducting pilot surveys to optimize the estimation of the community structure.

Download Full-text

Assessing the impact of introduced infrastructure at sea with cameras: A case study for spatial scale, time and statistical power

Marine Environmental Research ◽

10.1016/j.marenvres.2019.04.007 ◽

2019 ◽

Vol 147 ◽

pp. 126-137 ◽

Cited By ~ 5

Author(s):

Anthony W.J. Bicknell ◽

Emma V. Sheehan ◽

Brendan J. Godley ◽

Philip D. Doherty ◽

Matthew J. Witt

Keyword(s):

Spatial Scale ◽

Statistical Power ◽

The Impact

Download Full-text

The use of islands and cluster-randomized trials to investigate vector control interventions: a case study on the Bijagós archipelago, Guinea-Bissau

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2019.0807 ◽

2020 ◽

Vol 376 (1818) ◽

pp. 20190807 ◽

Cited By ~ 1

Author(s):

Robert T. Jones ◽

Elizabeth Pretorius ◽

Thomas H. Ant ◽

John Bradley ◽

Anna Last ◽

...

Keyword(s):

Vector Control ◽

Statistical Power ◽

Randomized Trials ◽

Control Strategies ◽

Cluster Randomized Trials ◽

Vector Borne Diseases ◽

Guinea Bissau ◽

Vector Borne ◽

Cluster Randomized

Vector-borne diseases threaten the health of populations around the world. While key interventions continue to provide protection from vectors, there remains a need to develop and test new vector control tools. Cluster-randomized trials, in which the intervention or control is randomly allocated to clusters, are commonly selected for such evaluations, but their design must carefully consider cluster size and cluster separation, as well as the movement of people and vectors, to ensure sufficient statistical power and avoid contamination of results. Island settings present an opportunity to conduct these studies. Here, we explore the benefits and challenges of conducting intervention studies on islands and introduce the Bijagós archipelago of Guinea-Bissau as a potential study site for interventions intended to control vector-borne diseases. This article is part of the theme issue ‘Novel control strategies for mosquito-borne diseases'.

Download Full-text

Impact of genotyping errors on statistical power of association tests in genomic analyses: A case study

Genetic Epidemiology ◽

10.1002/gepi.22027 ◽

2016 ◽

Vol 41 (2) ◽

pp. 152-162 ◽

Cited By ~ 6

Author(s):

Lin Hou ◽

Ning Sun ◽

Shrikant Mane ◽

Fred Sayward ◽

Nallakkandi Rajeevan ◽

...

Keyword(s):

Statistical Power ◽

Genotyping Errors ◽

Association Tests ◽

Genomic Analyses

Download Full-text

Effectiveness and cost-effectiveness of TeleStroke consultations to support the care of patients who had a stroke presenting to regional emergency departments in Western Australia: an economic evaluation case study protocol

BMJ Open ◽

10.1136/bmjopen-2020-043836 ◽

2021 ◽

Vol 11 (1) ◽

pp. e043836

Author(s):

Christina Tsou ◽

Suzanne Robinson ◽

James Boyd ◽

Shruthi Kamath ◽

Justin Yeung ◽

...

Keyword(s):

Cost Effectiveness ◽

Western Australia ◽

Statistical Power ◽

Stroke Care ◽

Service Development ◽

Tree Model ◽

Human Research ◽

Human Research Ethics ◽

The Impact

IntroductionThe Western Australia (WA) Acute TeleStroke Programme commenced incrementally across regional WA during 2016–2017. Since the introduction of the TeleStroke Programme, there has been monitoring of service outputs, including regional patient access to tertiary stroke specialist advice and reperfusion treatment; however, the impact of consultation with a stroke specialist via telehealth (videoconferencing or telephone) on the effectiveness and cost-effectiveness of stroke care and the drivers of cost-effectiveness has not been systematically evaluated.Methods and analysisThe aim of the case study was to examine the impact of consultation with a stroke specialist via telehealth on the effectiveness and cost-effectiveness of stroke and transient ischaemic attack care using a mixed methods approach. A categorical decision tree model will be constructed in collaboration with clinicians and programme managers. A before and after comparison using state-wide administrative datasets will be used to run the base model. If sample size and statistical power permits, the cases and comparators will be matched by stroke type and presence of CT scan at the initial site of presentation, age category and presenting hospital. The drivers of cost-effectiveness will be explored through stakeholder interviews. Data from the qualitative analysis will be cross-referenced with trends emerging from the quantitative dataset and used to guide the factors to be involved in subgroup and sensitivity analysis.Ethics and disseminationEthics approval for this case study has been granted from the Western Australian Country Health Service Human Research and Ethics Committee (RGS3076). Reciprocal approval has been granted from Curtin University Human Research Ethics Office (HRE2019-0740). Findings will be disseminated publicly through conference presentation and peer-review publications. Interim findings will be released as internal reports to inform the service development.

Download Full-text

Pisum sativum has no competitive responses to neighbours: a case study in reproducible ecology

10.1101/2020.09.29.318550 ◽

2020 ◽

Author(s):

Mariah L Mobley ◽

Audrey S Kruse ◽

Gordon G McNickle

Keyword(s):

Pisum Sativum ◽

Statistical Power ◽

Meta Analysis ◽

Statistical Significance ◽

List Type ◽

Growth Responses ◽

Belowground Competition ◽

New Findings ◽

Replication Crisis

ABSTRACTMany fields of science have experienced a replication crisis, where results from experiments with low statistical power published in the literature cannot be replicated. Ecology so far has not been drawn into this crisis, but there is no reason to think that this problem is absent in our field. Here, we originally attempted to replicate findings that showed pea (Pisum sativum L.) roots had strong differences in growth in the presence or absence of neighbours. Our original goal was just to develop a simple model system for studying how plant roots respond to competition from neighbours.In an attempt to replicate previous findings, we performed four separate experiments with 480 individual plants, across three years. Each time plants were grown in the full factorial combination of above and belowground competition. In addition, pea has been studied in similar experiments across six additional studies. Thus, we used meta-analysis to combine previous findings with our new findings.We were unable to replicate previous findings, and in all four experiments plants grew the same whether there were neighbours or not. Despite variability in individual studies, meta-analysis revealed that pea has no growth responses to neighbours and grows the same whether there is or is not below ground competition.Synthesis: Many other fields have gradually been drawn into a growing replication crisis, that is thought to be the result of low statistical power. Even though this is just one case study where a somewhat controversial result could not be reproduced, there is no reason to think ecology is immune from the replication crisis. We suggest that solutions developed in other fields might pre-emptively ward off similar problems. These include stricter cut-offs for statistical significance, a growing use of large replicated studies, and considering avenues for pre-registration of methods.

Download Full-text

Enhancing the biological relevance of Gene Co-expression Networks: A plant mitochondrial case study

10.1101/682492 ◽

2019 ◽

Cited By ~ 1

Author(s):

Simon R. Law ◽

Therese G. Kellgren ◽

Rafael Björk ◽

Patrik Ryden ◽

Olivier Keech

Keyword(s):

Statistical Power ◽

Primary Treatment ◽

Core Gene ◽

Core Network ◽

Gene Set ◽

Mitochondrial Proteome ◽

Biological Relevance ◽

Novel Approach ◽

Complex Datasets

AbstractGene Co-expression Networks (GCNs) are obtained by a variety of mathematical of models commonly derived on data sampled from diverse developmental processes, tissue types, pathologies, mutant backgrounds, and stress conditions. These networks aim to identify genes with similar expression dynamics, but are prone to introduce false-positive and -negative relations, especially in the instance of large and highly complex datasets. With the aim of optimizing the relevance of edges in GCNs and enhancing global biological insight, we propose a novel approach that involves a data-centering step performed simultaneously per gene and per sub-experiment, called centralisation within sub-experiments (CSE).Using a gene set encoding for the plant mitochondrial proteome as a case study, our results show that CSE-based GCNs had significantly more edges within the majority of the considered functional sub-networks, such as the mitochondrial electron transport chain and its sub-complexes, than GCNs not using CSE; thus demonstrating that the CSE-based GCNs are efficient at predicting those canonical functions and associated pathways, also referred to as the “core network”. Furthermore, we show that CSE, in conjunction with conventional correlation analyses can be used to fine-tune the prediction of the function for uncharacterised genes; while in combination with analyses based on non-centralised data can augment those conventional stress analyses with the innate connections underpinning the dynamic system examined.Therefore, CSE appears as an alternative method to conventional batch correction approaches. The method is easy to implement into a pre-existing GCN analysis pipeline and can provide accentuated biological relevance to conventional GCNs by allowing users to delineate a “core” gene network.Author SummaryGene Co-expression networks (GCNs) are the product of a variety of mathematical models that identify causal relationships in gene expression dynamics, but are prone to the misdiagnoses of false-positives and -negatives, especially in the instance of large and highly complex datasets. In light of the burgeoning output of next generation sequencing projects performed on any species, under different developmental or clinical conditions, the statistical power and complexity of these networks will undoubtedly increase, while their biological relevance will be fiercely challenged. Here, we propose a novel approach to primarily generate a “core” GCN with augmented biological relevance. Our method, which involves data-centering steps and thus effectively removes all primary treatment / tissue /patient effects, is simple to employ and can be easily implemented into pre-existing GCN analysis pipelines. The gained biological relevance of such an approach was validated using a subcellular gene set encoding for the plant mitochondrial proteome, and by applying numerous steps to challenge its application.

Download Full-text

A cross continental scale comparison of Australian offshore recreational fisheries research and its applications to Marine Park and fisheries management

ICES Journal of Marine Science ◽

10.1093/icesjms/fsz092 ◽

2019 ◽

Vol 77 (3) ◽

pp. 1190-1205 ◽

Cited By ~ 1

Author(s):

T P Lynch ◽

C B Smallwood ◽

F A Ochwada-Doyle ◽

J Lyle ◽

J Williams ◽

...

Keyword(s):

Spatial Data ◽

Statistical Power ◽

New South ◽

Survey Methods ◽

Recreational Fishing ◽

Continental Scale ◽

Marine Park ◽

South Wales ◽

Survey Designs

Abstract Recreational fishing is popular in Australia and is managed by individual states in consultation with the Commonwealth for those fisheries that they regulate and also for Australian Marine Parks (AMPs). Fishers regularly access both state and offshore Commonwealth waters but this offshore component of the recreational fishery is poorly understood. Our study tested the functionality of existing state-based surveys in Western Australia (WA) and New South Wales (NSW) to better inform Commonwealth fisheries and AMP managers about recreational fishing in their jurisdictions. Catch estimates for nine species of interest to the Commonwealth were developed and two case study AMPs [Ningaloo (WA) and The Hunter (NSW)] were also chosen to test the ability of the state survey data to be disaggregated to the park scale. As each state’s fishery survey designs were contextual to their own management needs, the application of the data to Commonwealth jurisdictions were limited by their statistical power, however aspects of each states surveys still provided useful information. Continued evolution of state-wide survey methods, including collection of precise spatial data, and regional over-sampling would be beneficial, particularly where there are multiple stakeholder and jurisdictional interests. National coordination, to temporally align state surveys, would also add value to the existing approaches.

Download Full-text

Inventory time-cost and statistical power: a case study of a Lao rattan

Forest Ecology and Management ◽

10.1016/s0378-1127(00)00589-2 ◽

2001 ◽

Vol 150 (3) ◽

pp. 313-322 ◽

Cited By ~ 16

Author(s):

Tom D Evans ◽

Oulathong V Viengkham

Keyword(s):

Statistical Power ◽

Time Cost

Download Full-text

Sample size calculations for the experimental comparison of multiple algorithms on multiple problem instances

Journal of Heuristics ◽

10.1007/s10732-020-09454-w ◽

2020 ◽

Vol 26 (6) ◽

pp. 851-883

Author(s):

Felipe Campelo ◽

Elizabeth F. Wanner

Keyword(s):

Sample Size ◽

Statistical Power ◽

Experimental Comparison ◽

Scheduling Problems ◽

Significance Level ◽

Problem Class ◽

Sample Size Calculations ◽

Problem Instances ◽

Multiple Algorithms

Abstract This work presents a statistically principled method for estimating the required number of instances in the experimental comparison of multiple algorithms on a given problem class of interest. This approach generalises earlier results by allowing researchers to design experiments based on the desired best, worst, mean or median-case statistical power to detect differences between algorithms larger than a certain threshold. Holm’s step-down procedure is used to maintain the overall significance level controlled at desired levels, without resulting in overly conservative experiments. This paper also presents an approach for sampling each algorithm on each instance, based on optimal sample size ratios that minimise the total required number of runs subject to a desired accuracy in the estimation of paired differences. A case study investigating the effect of 21 variants of a custom-tailored Simulated Annealing for a class of scheduling problems is used to illustrate the application of the proposed methods for sample size calculations in the experimental comparison of algorithms.

Download Full-text