Measuring statistical evidence and multiple testing

Michael Evans; Jabed Tomal

doi:10.1139/facets-2017-0121

Measuring statistical evidence and multiple testing

FACETS ◽

10.1139/facets-2017-0121 ◽

2018 ◽

Vol 3 (1) ◽

pp. 563-583 ◽

Cited By ~ 4

Author(s):

Michael Evans ◽

Jabed Tomal

Keyword(s):

Multiple Testing ◽

False Positives ◽

Statistical Evidence ◽

Current Interest ◽

False Negatives ◽

P Values ◽

Optimal Property ◽

Testing Algorithm ◽

Statistical Criteria

The measurement of statistical evidence is of considerable current interest in fields where statistical criteria are used to determine knowledge. The most commonly used approach to measuring such evidence is through the use of p-values, even though these are known to possess a number of properties that lead to doubts concerning their validity as measures of evidence. It is less well known that there are alternatives with the desired properties of a measure of statistical evidence. The measure of evidence given by the relative belief ratio is employed in this paper. A relative belief multiple testing algorithm was developed to control for false positives and false negatives through bounds on the evidence determined by measures of bias. The relative belief multiple testing algorithm was shown to be consistent and to possess an optimal property when considering the testing of a hypothesis randomly chosen from the collection of considered hypotheses. The relative belief multiple testing algorithm was applied to the problem of inducing sparsity. Priors were chosen via elicitation, and sparsity was induced only when justified by the evidence and there was no dependence on any particular form of a prior for this purpose.
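
As a rough illustration of the relative belief ratio mentioned in the abstract, the sketch below computes the ratio of posterior to prior density at a hypothesized value, with values above 1 read as evidence in favor and values below 1 as evidence against. The Beta-binomial model, the uniform prior, and the function names are assumptions made for this example; they are not taken from the paper, and the paper's multiple testing algorithm and bias bounds are not reproduced here.

```python
import math


def beta_pdf(theta, a, b):
    """Density of a Beta(a, b) distribution at theta, for 0 < theta < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    log_kernel = (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)
    return math.exp(log_norm + log_kernel)


def relative_belief_ratio(theta0, successes, trials, a=1.0, b=1.0):
    """Posterior density over prior density at theta0.

    Assumes (for illustration only) a binomial likelihood with a
    Beta(a, b) prior, so the posterior is Beta(a + successes,
    b + trials - successes).
    """
    prior = beta_pdf(theta0, a, b)
    posterior = beta_pdf(theta0, a + successes, b + trials - successes)
    return posterior / prior


# Hypothetical data: 5 successes in 10 trials, uniform prior.
rb_half = relative_belief_ratio(0.5, successes=5, trials=10)
rb_high = relative_belief_ratio(0.9, successes=5, trials=10)
print(f"RB(0.5) = {rb_half:.3f}, RB(0.9) = {rb_high:.3f}")
```

With these hypothetical data the ratio at 0.5 exceeds 1 and the ratio at 0.9 falls below 1, so the data raise belief in the first value and lower it in the second.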

Estimating the occurrence of false positives and false negatives in microarray studies by approximating and partitioning the empirical distribution of p-values

Bioinformatics ◽

10.1093/bioinformatics/btg148 ◽

2003 ◽

Vol 19 (10) ◽

pp. 1236-1242 ◽

Cited By ~ 252

Author(s):

S. Pounds ◽

S. W. Morris

Keyword(s):

Empirical Distribution ◽

False Positives ◽

False Negatives ◽

P Values ◽

Microarray Studies

Evaluation of Second-Level Inference in fMRI Analysis

Computational Intelligence and Neuroscience ◽

10.1155/2016/1068434 ◽

2016 ◽

Vol 2016 ◽

pp. 1-22 ◽

Cited By ~ 4

Author(s):

Sanne P. Roels ◽

Tom Loeys ◽

Beatrijs Moerkerke

Keyword(s):

Error Rate ◽

Cluster Size ◽

Multiple Testing ◽

Real Data ◽

False Positives ◽

Familywise Error Rate ◽

False Negatives ◽

Step Procedure ◽

Fmri Analysis ◽

The Impact

We investigate the impact of decisions in the second-level (i.e., over subjects) inferential process in functional magnetic resonance imaging on (1) the balance between false positives and false negatives and (2) the data-analytical stability, both proxies for the reproducibility of results. Second-level analysis based on a mass univariate approach typically consists of 3 phases. First, one proceeds via a general linear model for a test image that consists of pooled information from different subjects. We evaluate models that take into account first-level (within-subjects) variability and models that do not. Second, one proceeds via inference based on parametric assumptions or via permutation-based inference. Third, we evaluate 3 commonly used procedures to address the multiple testing problem: familywise error rate correction, False Discovery Rate (FDR) correction, and a two-step procedure with minimal cluster size. Based on a simulation study and real data, we find that the two-step procedure with minimal cluster size yields the most stable results, followed by the familywise error rate correction. FDR correction yields the most variable results, for both permutation-based and parametric inference. Modeling the subject-specific variability yields a better balance between false positives and false negatives when using parametric inference.
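
To make the difference between two of the corrections named in the abstract concrete, the sketch below contrasts a familywise error rate correction with an FDR correction on a small list of p-values. Bonferroni and Benjamini-Hochberg are used here as common representatives of each family; the abstract does not say which specific procedures the authors used, the p-values are invented for illustration, and the two-step cluster-size procedure is not shown.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Familywise error rate control: reject when p <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """FDR control: reject the k smallest p-values, where k is the
    largest rank with p_(k) <= k * alpha / m."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            max_rank = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            reject[idx] = True
    return reject


# Hypothetical p-values, not taken from the study.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.060, 0.074, 0.205, 0.212, 0.216]
n_bonferroni = sum(bonferroni_reject(p_values))
n_bh = sum(benjamini_hochberg_reject(p_values))
print(n_bonferroni, n_bh)
```

On this toy input the FDR procedure rejects more hypotheses than the familywise correction, which is the usual trade: fewer false negatives in exchange for a weaker guarantee on false positives.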

Sky Segmentation for Enhanced Depth Reconstruction and Bokeh Rendering with Efficient Architectures

Electronic Imaging ◽

10.2352/issn.2470-1173.2020.14.coimg-378 ◽

2020 ◽

Vol 2020 (14) ◽

pp. 378-1-378-7

Author(s):

Tyler Nuanes ◽

Matt Elsey ◽

Radek Grzeszczuk ◽

John Paul Shen

Keyword(s):

Real Time ◽

Mobile Device ◽

Computational Cost ◽

False Positives ◽

Compact Model ◽

High Quality ◽

False Negatives ◽

Trade Off ◽

Depth Reconstruction ◽

Binary Classifiers

We present a high-quality sky segmentation model for depth refinement and investigate residual architecture performance to inform optimally shrinking the network. We describe a model that runs in near real-time on a mobile device, present a new, high-quality dataset, and detail a unique weighting to trade off false positives and false negatives in binary classifiers. We show how the optimizations improve bokeh rendering by correcting stereo depth misprediction in sky regions. We detail techniques used to preserve edges, reject false positives, and ensure generalization to the diversity of sky scenes. Finally, we present a compact model and compare performance of four popular residual architectures (ShuffleNet, MobileNetV2, Resnet-101, and Resnet-34-like) at constant computational cost.
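
The abstract does not describe its weighting scheme, so the following is only a generic sketch of one common way to trade off false positives against false negatives in a binary classifier: a class-weighted binary cross-entropy. The function name and the `fn_weight` and `fp_weight` parameters are hypothetical and should not be read as the authors' method.

```python
import math


def weighted_bce(y_true, y_prob, fn_weight=1.0, fp_weight=1.0, eps=1e-12):
    """Mean binary cross-entropy with separate penalties per error type.

    fn_weight scales the loss on positive labels (penalizing missed
    positives); fp_weight scales the loss on negative labels
    (penalizing false alarms). Both names are illustrative.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= fn_weight * y * math.log(p)
        total -= fp_weight * (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


# Hypothetical labels and predicted probabilities.
labels = [1, 0]
probs = [0.5, 0.5]
plain = weighted_bce(labels, probs)
fp_heavy = weighted_bce(labels, probs, fp_weight=3.0)
print(plain, fp_heavy)
```

Raising `fp_weight` pushes a model trained on this loss toward fewer false positives at the cost of more false negatives, and raising `fn_weight` does the reverse.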

Automatic Extraction of Acronyms from Text

10.26686/wgtn.12922298 ◽

2020 ◽

Author(s):

Stuart Yeates

Keyword(s):

Digital Library ◽

False Positives ◽

Automatic Extraction ◽

False Negatives ◽

Library Research ◽

Communications Theory ◽

Textual Content

A brief introduction to acronyms is given and motivation for extracting them in a digital library environment is discussed. A technique for extracting acronyms is given with an analysis of the results. The technique is found to have a low number of false negatives and a high number of false positives. Introduction: Digital library research seeks to build tools to enable access to content, while making as few assumptions as possible about the content, since assumptions limit the range of applicability of the tools. Generally, the broader the assumptions the more widely applicable the tools. For example, keyword-based indexing [5] is based on communications theory and applies to all natural human textual languages (allowances for differences in character sets and similar localisation issues notwithstanding). The algorithm described in this paper makes much stronger assumptions about the content. It assumes textual content that contains acronyms, an assumption which is known to hold for...

Avoiding Interest-Based Revenues While Constructing Shariah-Compliant Portfolios: False Negatives and False Positives

SSRN Electronic Journal ◽

10.2139/ssrn.2975790 ◽

2017 ◽

Author(s):

Özgür Arslan-Ayaydin ◽

Kris Boudt ◽

Muhammad Wajid Raza

Keyword(s):

False Positives ◽

False Negatives

Evaluation of Positive T- and B-Cell Gene Rearrangement Studies Among Patients Without a Definitive Diagnosis by Other Assays

American Journal of Clinical Pathology ◽

10.1093/ajcp/aqz112.067 ◽

2019 ◽

Vol 152 (Supplement_1) ◽

pp. S35-S36

Author(s):

Hadrian Mendoza ◽

Christopher Tormey ◽

Alexa Siddon

Keyword(s):

T Cell ◽

False Positive ◽

Gene Rearrangement ◽

Hematologic Malignancy ◽

False Negative ◽

False Positives ◽

False Negatives ◽

True Negative ◽

Flow Cytometric ◽

Pathology Reports

In the evaluation of bone marrow (BM) and peripheral blood (PB) for hematologic malignancy, positive immunoglobulin heavy chain (IG) or T-cell receptor (TCR) gene rearrangement results may be detected despite unrevealing results from morphologic, flow cytometric, immunohistochemical (IHC), and/or cytogenetic studies. The significance of positive rearrangement studies in the context of otherwise normal ancillary findings is unknown, and as such, we hypothesized that gene rearrangement studies may be predictive of an emerging B- or T-cell clone in the absence of other abnormal laboratory tests. Data from all patients who underwent IG or TCR gene rearrangement testing at the authors’ affiliated VA hospital between January 1, 2013, and July 6, 2018, were extracted from the electronic medical record. Date of testing; specimen source; and morphologic, flow cytometric, IHC, and cytogenetic characterization of the tissue source were recorded from pathology reports. Gene rearrangement results were categorized as true positive, false positive, false negative, or true negative. Lastly, patient records were reviewed for subsequent diagnosis of hematologic malignancy in patients with positive gene rearrangement results with negative ancillary testing. A total of 136 patients, who had 203 gene rearrangement studies (50 PB and 153 BM), were analyzed. In TCR studies, there were 2 false positives and 1 false negative in 47 PB assays, as well as 7 false positives and 1 false negative in 54 BM assays. Regarding IG studies, 3 false positives and 12 false negatives in 99 BM studies were identified. Sensitivity and specificity, respectively, were calculated for PB TCR studies (94% and 93%), BM IG studies (71% and 95%), and BM TCR studies (92% and 83%). Analysis of PB IG gene rearrangement studies was not performed due to the small number of tests (3; all true negative).
None of the 12 patients with false-positive IG/TCR gene rearrangement studies later developed a lymphoproliferative disorder, although 2 patients were later diagnosed with acute myeloid leukemia. Of the 14 false negatives, 10 (71%) were related to a diagnosis of plasma cell neoplasms. Results from the present study suggest that positive IG/TCR gene rearrangement studies are not predictive of lymphoproliferative disorders in the context of otherwise negative BM or PB findings. As such, when faced with equivocal pathology reports, clinicians can be practically advised that isolated positive IG/TCR gene rearrangement results may not indicate the need for closer surveillance.
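
The sensitivity and specificity figures in the abstract follow from the standard confusion-matrix formulas, sketched below. The abstract reports only the false-positive and false-negative counts and the assay totals; the true-positive and true-negative counts used here for the PB TCR assays are hypothetical values chosen to be consistent with the reported total (47) and percentages (94% and 93%), not numbers stated by the authors.

```python
def sensitivity(tp, fn):
    """Proportion of truly positive cases that the assay flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of truly negative cases that the assay clears."""
    return tn / (tn + fp)


# PB TCR assays: fp and fn come from the abstract; tp and tn are
# ASSUMED counts that reproduce the reported total and percentages.
pb_tcr = {"tp": 16, "fn": 1, "tn": 28, "fp": 2}

pb_sens = sensitivity(pb_tcr["tp"], pb_tcr["fn"])
pb_spec = specificity(pb_tcr["tn"], pb_tcr["fp"])
print(f"sensitivity = {pb_sens:.0%}, specificity = {pb_spec:.0%}")
```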

Tabu search for DNA sequencing with false negatives and false positives

European Journal of Operational Research ◽

10.1016/s0377-2217(99)00456-7 ◽

2000 ◽

Vol 125 (2) ◽

pp. 257-265 ◽

Cited By ~ 31

Author(s):

J Błażewicz ◽

P Formanowicz ◽

M Kasprzak ◽

W.T Markiewicz ◽

J Wȩglarz

Keyword(s):

Tabu Search ◽

Dna Sequencing ◽

False Positives ◽

False Negatives

Abstract MP11: Circulating Plasma Biomarkers Associated With Brain Arteriovenous Malformations

Stroke ◽

10.1161/str.52.suppl_1.mp11 ◽

2021 ◽

Vol 52 (Suppl_1) ◽

Author(s):

Sarah E Wetzel-Strong ◽

Shantel M Weinsheimer ◽

Jeffrey Nelson ◽

Ludmila Pawlikowska ◽

Dewi Clark ◽

...

Keyword(s):

Multiple Testing ◽

Statistical Significance ◽

Protein Profiling ◽

P Value ◽

P Values ◽

Plasma Biomarkers ◽

Standard Curve ◽

Disease States ◽

Heparin Plasma ◽

Circulating Levels

Objective: Circulating plasma protein profiling may aid in the identification of cerebrovascular disease signatures. This study aimed to identify circulating angiogenic and inflammatory biomarkers that may serve as biomarkers to differentiate sporadic brain arteriovenous malformation (bAVM) patients from other conditions with brain AVMs, including hereditary hemorrhagic telangiectasia (HHT) patients. Methods: The Quantibody Human Angiogenesis Array 1000 (Raybiotech) is an ELISA multiplex panel that was used to assess the levels of 60 proteins related to angiogenesis and inflammation in heparin plasma samples from 13 sporadic unruptured bAVM patients (69% male, mean age 51 years) and 37 patients with HHT (40% male, mean age 47 years, n=19 (51%) with bAVM). The Quantibody Q-Analyzer tool was used to calculate biomarker concentrations based on the standard curve for each marker, and log-transformed marker levels were evaluated for associations between disease states using a multivariable interval regression model adjusted for age, sex, ethnicity and collection site. Statistical significance was based on Bonferroni correction for multiple testing of 60 biomarkers (P < 8.3 × 10⁻⁴). Results: Circulating levels of two plasma proteins differed significantly between sporadic bAVM and HHT patients: PDGF-BB (P = 2.6 × 10⁻⁴, PI = 3.37, 95% CI: 1.76-6.46) and CCL5 (P = 6.0 × 10⁻⁶, PI = 3.50, 95% CI: 2.04-6.03). When considering markers with a nominal p-value of less than 0.01, MMP1 and angiostatin levels also differed between patients with sporadic bAVM and HHT. Markers with nominal p-values less than 0.05 when comparing sporadic brain AVM and HHT patients also included angiostatin, IL2, VEGF, GRO, CXCL16, ITAC, and TGFB3. Among HHT patients, the circulating levels of UPAR and IL6 were elevated in patients with documented bAVMs when considering markers with nominal p-values less than 0.05. 
Conclusions: This study identified differential expression of two promising plasma biomarkers that differentiate sporadic bAVMs from patients with HHT. Furthermore, this study allowed us to evaluate markers that are associated with the presence of bAVMs in HHT patients, which may offer insight into mechanisms underlying bAVM pathophysiology.
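
The significance threshold quoted in the abstract can be checked with a one-line Bonferroni calculation, sketched below. The overall level of 0.05 is an assumption: the abstract states only the resulting threshold, which matches 0.05 divided by the 60 biomarkers tested. The two p-values are the ones reported for PDGF-BB and CCL5.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# alpha = 0.05 is assumed; 60 is the number of biomarkers tested.
threshold = bonferroni_threshold(0.05, 60)

# P-values as reported in the abstract.
reported = {"PDGF-BB": 2.6e-4, "CCL5": 6.0e-6}
significant = {name: p < threshold for name, p in reported.items()}

print(f"threshold = {threshold:.2e}")
for name, passed in significant.items():
    print(name, "passes" if passed else "fails")
```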

The Impact of Increasing Prevalence, False Omissions, and Diagnostic Uncertainty on Coronavirus Disease 2019 (COVID-19) Test Performance

Archives of Pathology & Laboratory Medicine ◽

10.5858/arpa.2020-0716-sa ◽

2021 ◽

Author(s):

Gerald J. Kost

Keyword(s):

Test Performance ◽

Influenza A ◽

Geometric Mean ◽

False Positives ◽

Test Quality ◽

False Negatives ◽

Predictive Values ◽

Sensitivity Specificity ◽

Tier 3 ◽

The Impact

Context. Coronavirus disease 2019 (COVID-19) test performance depends on predictive values in settings of increasing disease prevalence. Geospatially distributed diagnostics with minimal uncertainty facilitate efficient point-of-need strategies. Objectives. To use original mathematics to interpret COVID-19 test metrics; assess Food and Drug Administration Emergency Use Authorizations and Health Canada targets; compare predictive values for multiplex, antigen, polymerase chain reaction kit, point-of-care antibody, and home tests; enhance test performance; and improve decision-making. Design. PubMed/newsprint generated articles documenting prevalence. Mathematica and open access software helped perform recursive calculations, graph multivariate relationships, and visualize performance by comparing predictive value geometric mean-squared patterns. Results. Tiered sensitivity/specificity comprise: T1) 90%, 95%; T2) 95%, 97.5%; and T3) 100%, ≥99%. Tier 1 false negatives exceed true negatives at >90.5% prevalence; false positives exceed true positives at <5.3% prevalence. High sensitivity/specificity tests reduce false negatives and false positives, yielding superior predictive values. Recursive testing improves predictive values. Visual logistics facilitate test comparisons. Antigen test quality falls off as prevalence increases. Multiplex severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)*Influenza A/B*Respiratory-Syncytial Virus (RSV) testing performs reasonably well compared to Tier 3. Tier 3 performance with a Tier 2 confidence band lower limit will generate excellent performance and reliability. Conclusions. The overriding principle is to select the best combined performance and reliability pattern for the prevalence bracket. Some public health professionals recommend repetitive testing to compensate for low sensitivity. More logically, improved COVID-19 assays with less uncertainty conserve resources. 
Multiplex differentiation of COVID-19 from Influenza A/B-RSV represents an effective strategy if seasonal flu surges next year.
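
The Tier 1 crossover prevalences quoted in the abstract can be reproduced from sensitivity and specificity alone. The sketch below is a minimal derivation using expected proportions in a tested population; the function names are illustrative and this is not the author's Mathematica workflow.

```python
def fn_exceeds_tn_above(sens, spec):
    """Prevalence above which false negatives outnumber true negatives.

    FN > TN  <=>  p * (1 - sens) > (1 - p) * spec
             <=>  p > spec / (spec + 1 - sens)
    """
    return spec / (spec + (1.0 - sens))


def fp_exceeds_tp_below(sens, spec):
    """Prevalence below which false positives outnumber true positives.

    FP > TP  <=>  (1 - p) * (1 - spec) > p * sens
             <=>  p < (1 - spec) / (1 - spec + sens)
    """
    return (1.0 - spec) / ((1.0 - spec) + sens)


def predictive_values(sens, spec, prevalence):
    """Return (PPV, NPV); assumes the test yields both outcomes."""
    tp = sens * prevalence
    fn = (1.0 - sens) * prevalence
    tn = spec * (1.0 - prevalence)
    fp = (1.0 - spec) * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Tier 1 from the abstract: sensitivity 90%, specificity 95%.
upper = fn_exceeds_tn_above(0.90, 0.95)
lower = fp_exceeds_tp_below(0.90, 0.95)
print(f"FN > TN above {upper:.1%}; FP > TP below {lower:.1%}")
```

At the lower crossover the positive predictive value is exactly 50%, since a positive result is then as likely to be false as true.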

A Short Mental Status Questionnaire

Canadian Journal on Aging / La Revue canadienne du vieillissement ◽

10.1017/s0714980800013465 ◽

1982 ◽

Vol 1 (1-2) ◽

pp. 16-20 ◽

Cited By ~ 29

Author(s):

Duncan Robertson ◽

Kenneth Rockwood ◽

Paul Stolee

Keyword(s):

Cognitive Impairment ◽

Mental Status ◽

Clinical Assessment ◽

Elderly Population ◽

The Elderly ◽

False Positives ◽

False Negatives ◽

Institutionalized Elderly ◽

Severe Cognitive Impairment ◽

Mild Impairment

A mental status questionnaire (MSQ) developed for use in surveys of the non-institutionalized elderly has been validated against clinical assessment. The MSQ identifies moderate and severe cognitive impairment in the elderly. However, using the suggested scoring, subjects with mild impairment cannot be separated from normals. The test is short, acceptable, and reproducible, and the rates of false positives and false negatives fall well within acceptable limits for use in estimating the prevalence of dementia in the non-institutionalized elderly population.