Use of Whole Genome Sequencing Data for a First in Silico Specificity Evaluation of the RT-qPCR Assays Used for SARS-CoV-2 Detection

Mathieu Gand; Kevin Vanneste; Isabelle Thomas; Steven Van Gucht; Arnaud Capron; Philippe Herman; Nancy H. C. Roosens; Sigrid C. J. De Keersmaecker

doi:10.3390/ijms21155585

Use of Whole Genome Sequencing Data for a First in Silico Specificity Evaluation of the RT-qPCR Assays Used for SARS-CoV-2 Detection

International Journal of Molecular Sciences ◽

10.3390/ijms21155585 ◽

2020 ◽

Vol 21 (15) ◽

pp. 5585

Author(s):

Mathieu Gand ◽

Kevin Vanneste ◽

Isabelle Thomas ◽

Steven Van Gucht ◽

Arnaud Capron ◽

...

Keyword(s):

In Silico ◽

Protein S ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data ◽

Viral Genomes ◽

Genome Data ◽

Gene Coding ◽

Control And Prevention ◽

The Impact

The current COronaVIrus Disease 2019 (COVID-19) pandemic started in December 2019. COVID-19 cases are confirmed by the detection of SARS-CoV-2 RNA in biological samples by RT-qPCR. However, when the first RT-qPCR methods were developed in January 2020, only a limited number of SARS-CoV-2 genomes were available for an initial in silico specificity evaluation and for verifying that the targeted loci are highly conserved. Now that more whole genome data have become available, we used the bioinformatics tool SCREENED and a total of 4755 publicly available SARS-CoV-2 genomes, downloaded at two different time points, to evaluate the specificity of 12 RT-qPCR tests (consisting of a total of 30 primer and probe sets) used for SARS-CoV-2 detection and the impact of the virus' genetic evolution on four of them. The exclusivity of these methods was also assessed using the human reference genome and 2624 closely related other respiratory viral genomes. The specificity of the assays was generally good and stable over time. An exception is the first method developed by the China Center for Disease Control and Prevention (CDC), which exhibits three primer mismatches present in 358 SARS-CoV-2 genomes sequenced mainly in Europe from February 2020 onwards. The best results were obtained for the assay of Chan et al. (2020) targeting the gene coding for the spike protein (S). This demonstrates that our user-friendly strategy can be used for a first in silico specificity evaluation of future RT-qPCR tests, as well as for verifying that the former methods are still capable of detecting circulating SARS-CoV-2 variants.
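The kind of in silico inclusivity check described in this abstract can be illustrated with a minimal sketch. It is not the SCREENED tool's actual method: the function names, the sliding-window search, and the toy sequences below are all assumptions made for illustration. It counts mismatches between an oligo (primer or probe, with IUPAC degenerate bases allowed) and its best-matching window in each genome, then reports the fraction of genomes matched within a mismatch tolerance.

```python
# Hypothetical illustration only; names and data are not taken from the paper.
# IUPAC nucleotide codes mapped to the bases each code permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(oligo: str, window: str) -> int:
    """Count positions where the genome base is not permitted by the oligo base.

    Ambiguous genome bases (e.g. N) are counted as mismatches in this sketch.
    """
    return sum(1 for o, g in zip(oligo, window) if g not in IUPAC.get(o, ""))


def best_hit(oligo: str, genome: str):
    """Return (start, mismatches) for the window with the fewest mismatches.

    Returns None when the genome is shorter than the oligo.
    """
    oligo, genome = oligo.upper(), genome.upper()
    best = None
    for start in range(len(genome) - len(oligo) + 1):
        mm = count_mismatches(oligo, genome[start:start + len(oligo)])
        if best is None or mm < best[1]:
            best = (start, mm)
            if mm == 0:  # a perfect match cannot be improved upon
                break
    return best


def inclusivity(oligo: str, genomes: dict, max_mismatches: int = 0) -> float:
    """Fraction of genomes whose best hit has at most max_mismatches."""
    hits = 0
    for sequence in genomes.values():
        hit = best_hit(oligo, sequence)
        if hit is not None and hit[1] <= max_mismatches:
            hits += 1
    return hits / len(genomes)


# Toy sequences (invented, not real SARS-CoV-2 data).
toy_genomes = {
    "g1": "TTTACGTACTTT",
    "g2": "TTTACGAACTTT",
    "g3": "TTTACCAACTTT",
}
for tolerance in (0, 1):
    print(tolerance, inclusivity("ACGTAC", toy_genomes, tolerance))
```

A reverse primer would need to be reverse-complemented before the search, and a real evaluation would use an alignment tool rather than an exhaustive window scan; both are left out to keep the sketch short.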


Common Treatment, Common Variant: Evolutionary Prediction of Functional Pharmacogenomic Variants

Journal of Personalized Medicine ◽

10.3390/jpm11020131 ◽

2021 ◽

Vol 11 (2) ◽

pp. 131

Author(s):

Laura B. Scheinfeldt ◽

Andrew Brangan ◽

Dara M. Kusic ◽

Sudhir Kumar ◽

Neda Gharani

Keyword(s):

In Silico ◽

Drug Efficacy ◽

Common Variant ◽

Genomic Research ◽

Whole Genome Sequencing Data ◽

Mendelian Disease ◽

Allele Frequency Distribution ◽

Sequencing Data ◽

Patient Race ◽

The Impact

Pharmacogenomics holds the promise of personalized drug efficacy optimization and drug toxicity minimization. Much of the research conducted to date, however, suffers from an ascertainment bias towards European participants. Here, we leverage publicly available, whole genome sequencing data collected from global populations, evolutionary characteristics, and annotated protein features to construct a new in silico machine learning pharmacogenetic identification method called XGB-PGX. When applied to pharmacogenetic data, XGB-PGX outperformed all existing prediction methods and identified over 2000 new pharmacogenetic variants. While there are modest pharmacogenetic allele frequency distribution differences across global population samples, the most striking distinction is between the relatively rare putatively neutral pharmacogene variants and the relatively common established and newly predicted functional pharmacogenetic variants. Our findings therefore support a focus on individual patient pharmacogenetic testing rather than on clinical presumptions about patient race, ethnicity, or ancestral geographic residence. We further encourage that more attention be given to the impact of common variation on drug response and propose a new ‘common treatment, common variant’ perspective for pharmacogenetic prediction that is distinct from the types of variation that underlie complex and Mendelian disease. XGB-PGX has identified many new pharmacovariants that are present across all global communities; however, communities that have been underrepresented in genomic research are likely to benefit the most from XGB-PGX’s in silico predictions.


Peer Review #1 of "Challenging a bioinformatic tool’s ability to detect microbial contaminants using in silico whole genome sequencing data (v0.2)"

10.7287/peerj.3729v0.2/reviews/1 ◽

2017 ◽

Author(s):

S Kumar

Keyword(s):

Whole Genome Sequencing ◽

Peer Review ◽

Genome Sequencing ◽

In Silico ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data ◽

Microbial Contaminants


PyClone-VI: scalable inference of clonal population structures using whole genome data

BMC Bioinformatics ◽

10.1186/s12859-020-03919-2 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Sierra Gillis ◽

Andrew Roth

Keyword(s):

Malignant Cell ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Cancer Evolution ◽

Sequencing Data ◽

Computationally Efficient ◽

Clonal Population ◽

Genome Data ◽

Population Structures ◽

Clonal Population Structure

Background: At diagnosis, tumours are typically composed of a mixture of genomically distinct malignant cell populations. Bulk sequencing of tumour samples coupled with computational deconvolution can be used to identify these populations and study cancer evolution. Existing computational methods for population deconvolution are slow and/or potentially inaccurate when applied to large datasets generated by whole genome sequencing. Results: We describe PyClone-VI, a computationally efficient Bayesian statistical method for inferring the clonal population structure of cancers. We demonstrate the utility of the method by analyzing data from 1717 patients from the PCAWG study and 100 patients from the TRACERx study. Conclusions: Our proposed method is 10–100× faster than existing methods, while providing results which are as accurate. Software implementing our method is freely available at https://github.com/Roth-Lab/pyclone-vi.


Strategy to Develop and Evaluate a Multiplex RT-ddPCR in Response to SARS-CoV-2 Genomic Evolution

Current Issues in Molecular Biology ◽

10.3390/cimb43030134 ◽

2021 ◽

Vol 43 (3) ◽

pp. 1937-1949

Author(s):

Laura A. E. Van Poelvoorde ◽

Mathieu Gand ◽

Marie-Alice Fraiture ◽

Sigrid C. J. De Keersmaecker ◽

Bavo Verhaegen ◽

...

Keyword(s):

In Silico ◽

Virus Evolution ◽

Clinical Samples ◽

Whole Genome Sequencing Data ◽

Preliminary Evaluation ◽

Whole Genome ◽

Sequencing Data ◽

Method Performance ◽

The World ◽

Pcr Assays

The worldwide emergence and spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) since 2019 has highlighted the importance of rapid and reliable diagnostic testing to prevent and control viral transmission. However, false negatives (FN) caused by polymorphisms or point mutations arising from virus evolution can compromise the accuracy of diagnostic tests. Therefore, PCR-based SARS-CoV-2 diagnostics should be evaluated and should evolve together with the rapidly increasing number of new variants appearing around the world. However, even with a large collection of samples, laboratories cannot test a set that represents the diversity continuously evolving worldwide. In the present study, we proposed a methodology based on in silico and in vitro analysis. First, we used all information offered by available whole-genome sequencing data for SARS-CoV-2 to select two PCR assays targeting two different regions of the genome, and to monitor the possible impact of virus evolution on the specificity of the primers and probes of these assays during and after their development. Besides this first essential in silico evaluation, a minimal set of tests was proposed to generate experimental evidence on method performance, such as specificity, sensitivity and applicability. Accordingly, a duplex reverse-transcription droplet digital PCR (RT-ddPCR) method was evaluated in silico using 154,489 whole-genome sequences of SARS-CoV-2 strains that were representative of the strains circulating around the world. The RT-ddPCR platform was selected because it presents several advantages for detecting and quantifying SARS-CoV-2 RNA in clinical samples and wastewater. Next, the sensitivity and specificity of the assays were successfully evaluated experimentally. A preliminary evaluation of the applicability of the developed method was performed using both clinical and wastewater samples.


Peer Review #1 of "Challenging a bioinformatic tool’s ability to detect microbial contaminants using in silico whole genome sequencing data (v0.1)"

10.7287/peerj.3729v0.1/reviews/1 ◽

2017 ◽

Author(s):

S Kumar

Keyword(s):

Whole Genome Sequencing ◽

Peer Review ◽

Genome Sequencing ◽

In Silico ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data ◽

Microbial Contaminants


Salmonella Serotyping; Comparison of the Traditional Method to a Microarray-Based Method and an in silico Platform Using Whole Genome Sequencing Data

Frontiers in Microbiology ◽

10.3389/fmicb.2019.02554 ◽

2019 ◽

Vol 10 ◽

Cited By ~ 7

Author(s):

Benjamin Diep ◽

Caroline Barretto ◽

Anne-Catherine Portmann ◽

Coralie Fournier ◽

Aneta Karczmarek ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

In Silico ◽

Traditional Method ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data


In silico detection of phylogenetic informative Y-chromosomal single nucleotide polymorphisms from whole genome sequencing data

Electrophoresis ◽

10.1002/elps.201300459 ◽

2014 ◽

Vol 35 (21-22) ◽

pp. 3102-3110 ◽

Cited By ~ 5

Author(s):

Anneleen Van Geystelen ◽

Tom Wenseleers ◽

Ronny Decorte ◽

Maarten J. L. Caspers ◽

Maarten H. D. Larmuseau

Keyword(s):

Single Nucleotide Polymorphisms ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

In Silico ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Nucleotide Polymorphisms ◽

Sequencing Data ◽

Single Nucleotide


Blood group typing from whole-genome sequencing data

PLoS ONE ◽

10.1371/journal.pone.0242168 ◽

2020 ◽

Vol 15 (11) ◽

pp. e0242168

Author(s):

Julien Paganini ◽

Peter L. Nagy ◽

Nicholas Rouse ◽

Philippe Gouret ◽

Jacques Chiaroni ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Blood Group ◽

Genome Sequencing ◽

Hla Typing ◽

Next Generation Sequencing Data ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data ◽

Personalized Care ◽

Genome Data

Many questions can be explored thanks to whole-genome data. The aim of this study was to overcome their main limits, software availability and database accuracy, and to estimate the feasibility of red blood cell (RBC) antigen typing from whole-genome sequencing (WGS) data. We analyzed whole-genome data from 79 individuals for HLA-DRB1 and 9 RBC antigens. Whole-genome sequencing data were analyzed with software that allows phasing of variable positions to define alleles or haplotypes and that has been validated for HLA typing from next-generation sequencing data. A dedicated database was set up with 1648 variable positions analyzed in KEL (KEL), ACKR1 (FY), SLC14A1 (JK), ACHE (YT), ART4 (DO), AQP1 (CO), CD44 (IN), SLC4A1 (DI) and ICAM4 (LW). Whole-genome sequencing typing was compared to that previously obtained by amplicon-based monoallelic sequencing and by SNaPshot analysis. Whole-genome sequencing data were also explored for other alleles. Our results showed 93% concordance for blood group polymorphisms and 91% for HLA-DRB1. Incorrect typing and unresolved results confirm that WGS should be considered reliable only at read depths strictly above 15x. Our results support the conclusion that RBC antigen typing from WGS is feasible but requires improvements in read depth for accurate SNV typing. We also showed the potential for WGS in screening donors with rare blood antigens, such as weak JK alleles. The development of WGS analysis in immunogenetics laboratories would offer personalized care in the management of RBC disorders.


Peer Review #2 of "Challenging a bioinformatic tool’s ability to detect microbial contaminants using in silico whole genome sequencing data (v0.1)"

10.7287/peerj.3729v0.1/reviews/2 ◽

2017 ◽

Keyword(s):

Whole Genome Sequencing ◽

Peer Review ◽

Genome Sequencing ◽

In Silico ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data ◽

Microbial Contaminants


Population Stratification at the Phenotypic Variance level and Implication for the Analysis of Whole Genome Sequencing Data from Multiple Studies

10.1101/2020.03.03.973420 ◽

2020 ◽

Author(s):

Tamar Sofer ◽

Xiuwen Zheng ◽

Cecelia A. Laurie ◽

Stephanie M. Gogarten ◽

Jennifer A. Brody ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Statistical Power ◽

Epidemiological Studies ◽

Whole Genome Sequencing Data ◽

Phenotypic Variance ◽

Whole Genome ◽

Sequencing Data ◽

Level Data ◽

The Impact

Summary: In modern Whole Genome Sequencing (WGS) epidemiological studies, participant-level data from multiple studies are often pooled and results are obtained from a single analysis. We consider the impact of differential phenotype variances by study, which we term ‘variance stratification’. Unaccounted for, variance stratification can lead to both decreased statistical power and increased false positive rates, depending on how allele frequencies, sample sizes, and phenotypic variances vary across the studies that are pooled. We describe a WGS-appropriate analysis approach, implemented in freely available software, which allows study-specific variances and thereby improves performance in practice. We also illustrate the variance stratification problem, its solutions, and a corresponding diagnostic procedure in data from the Trans-Omics for Precision Medicine Whole Genome Sequencing Program (TOPMed), used in association tests for hemoglobin concentrations and BMI.
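The idea of allowing study-specific variances in a pooled analysis can be sketched in a few lines. This is a deliberately simplified illustration, not the authors' software or model: it swaps in plain weighted least squares, estimates each study's phenotype variance separately, and weights every participant by the inverse of their study's variance. The function names and the toy numbers are assumptions.

```python
# Hypothetical illustration only; not the method implemented by the authors.
from statistics import pvariance


def study_variances(pheno_by_study: dict) -> dict:
    """Phenotype variance estimated separately within each study."""
    return {study: pvariance(values) for study, values in pheno_by_study.items()}


def weighted_slope(records: list, variances: dict) -> float:
    """Weighted least-squares slope of phenotype on genotype dosage.

    records is a list of (study, genotype, phenotype) tuples; each record is
    weighted by 1 / variance of the study it came from, so noisier studies
    contribute less to the pooled estimate.
    """
    weights = [1.0 / variances[study] for study, _, _ in records]
    genos = [g for _, g, _ in records]
    phenos = [y for _, _, y in records]
    total = sum(weights)
    g_bar = sum(w * g for w, g in zip(weights, genos)) / total
    y_bar = sum(w * y for w, y in zip(weights, phenos)) / total
    num = sum(w * (g - g_bar) * (y - y_bar)
              for w, g, y in zip(weights, genos, phenos))
    den = sum(w * (g - g_bar) ** 2 for w, g in zip(weights, genos))
    return num / den


# Toy data (invented): study B has a larger phenotype spread than study A.
toy_phenotypes = {"A": [1.0, 2.0, 3.0], "B": [0.0, 2.0, 4.0]}
toy_variances = study_variances(toy_phenotypes)
toy_records = [
    ("A", 0, 1.0), ("A", 1, 2.0), ("A", 2, 3.0),
    ("B", 0, 0.0), ("B", 1, 2.0), ("B", 2, 4.0),
]
print(toy_variances)
print(weighted_slope(toy_records, toy_variances))
```

Setting every variance to the same value recovers the ordinary pooled regression, which is the analysis the abstract warns can lose power or inflate false positives when variances differ by study.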
