The evolutionary dynamics and fitness landscape of clonal haematopoiesis

2019 ◽  
Author(s):  
Caroline J. Watson ◽  
Alana Papula ◽  
Yeuk P. G. Poon ◽  
Wing H. Wong ◽  
Andrew L. Young ◽  
...  

Somatic mutations acquired in healthy tissues as we age are major determinants of cancer risk. Whether variants confer a fitness advantage or rise to detectable frequencies by chance, however, remains largely unknown. Here, by combining blood sequencing data from ∼50,000 individuals, we reveal how mutation, genetic drift and fitness differences combine to shape the genetic diversity of healthy blood (‘clonal haematopoiesis’). By analysing the spectrum of variant allele frequencies we quantify fitness advantages for key pathogenic variants and genes and provide bounds on the number of haematopoietic stem cells. Positive selection, not drift, is the major force shaping clonal haematopoiesis. The remarkably wide variation in variant allele frequencies observed across individuals is driven by chance differences in the timing of mutation acquisition combined with differences in the cell-intrinsic fitness effect of variants. Contrary to the widely held view that clonal haematopoiesis is driven by ageing-related alterations in the stem cell niche, the data are consistent with the age dependence being driven simply by continuing risk of mutations and subsequent clonal expansions that lead to increased detectability at older ages.

Science ◽  
2020 ◽  
Vol 367 (6485) ◽  
pp. 1449-1454 ◽  
Author(s):  
Caroline J. Watson ◽  
A. L. Papula ◽  
Gladys Y. P. Poon ◽  
Wing H. Wong ◽  
Andrew L. Young ◽  
...  

Somatic mutations acquired in healthy tissues as we age are major determinants of cancer risk. Whether variants confer a fitness advantage or rise to detectable frequencies by chance remains largely unknown. Blood sequencing data from ~50,000 individuals reveal how mutation, genetic drift, and fitness shape the genetic diversity of healthy blood (clonal hematopoiesis). We show that positive selection, not drift, is the major force shaping clonal hematopoiesis, provide bounds on the number of hematopoietic stem cells, and quantify the fitness advantages of key pathogenic variants, at single-nucleotide resolution, as well as the distribution of fitness effects (fitness landscape) within commonly mutated driver genes. These data are consistent with clonal hematopoiesis being driven by a continuing risk of mutations and clonal expansions that become increasingly detectable with age.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Fenglin Liu ◽  
Yuanyuan Zhang ◽  
Lei Zhang ◽  
Ziyi Li ◽  
Qiao Fang ◽  
...  

Abstract
Background: Systematic interrogation of single-nucleotide variants (SNVs) is one of the most promising approaches to delineate the cellular heterogeneity and phylogenetic relationships at the single-cell level. While SNV detection from abundant single-cell RNA sequencing (scRNA-seq) data is applicable and cost-effective in identifying expressed variants, inferring sub-clones, and deciphering genotype-phenotype linkages, there is a lack of computational methods specifically developed for SNV calling in scRNA-seq. Although variant callers for bulk RNA-seq have been sporadically used in scRNA-seq, the performances of different tools have not been assessed.
Results: Here, we perform a systematic comparison of seven tools including SAMtools, the GATK pipeline, CTAT, FreeBayes, MuTect2, Strelka2, and VarScan2, using both simulation and scRNA-seq datasets, and identify multiple elements influencing their performance. While the specificities are generally high, with sensitivities exceeding 90% for most tools when calling homozygous SNVs in high-confidence coding regions with sufficient read depths, such sensitivities dramatically decrease when calling SNVs with low read depths, low variant allele frequencies, or in specific genomic contexts. SAMtools shows the highest sensitivity in most cases, especially with few supporting reads, despite its relatively low specificity in introns or high-identity regions. Strelka2 shows consistently good performance when sufficient supporting reads are provided, while FreeBayes shows good performance in the cases of high variant allele frequencies.
Conclusions: We recommend SAMtools, Strelka2, FreeBayes, or CTAT, depending on the specific conditions of usage. Our study provides the first benchmarking to evaluate the performances of different SNV detection tools for scRNA-seq data.


Author(s):  
Etthel M Windels ◽  
Richard Fox ◽  
Krishna Yerramsetty ◽  
Katherine Krouse ◽  
Tom Wenseleers ◽  
...  

Abstract Bacterial persistence is a potential cause of antibiotic therapy failure. Antibiotic-tolerant persisters originate from phenotypic differentiation within a susceptible population, occurring with a frequency that can be altered by mutations. Recent studies have proven that persistence is a highly evolvable trait and, consequently, an important evolutionary strategy of bacterial populations to adapt to high-dose antibiotic therapy. Yet, the factors that govern the evolutionary dynamics of persistence are currently poorly understood. Theoretical studies predict far-reaching effects of bottlenecking on the evolutionary adaptation of bacterial populations, but these effects have never been investigated in the context of persistence. Bottlenecking events are frequently encountered by infecting pathogens during host-to-host transmission and antibiotic treatment. In this study, we used a combination of experimental evolution and barcoded knockout libraries to examine how population bottlenecking affects the evolutionary dynamics of persistence. In accordance with existing hypotheses, small bottlenecks were found to restrict the adaptive potential of populations and result in more heterogeneous evolutionary outcomes. Evolutionary trajectories followed in small-bottlenecking regimes additionally suggest that the fitness landscape associated with persistence has a rugged topography, with distinct trajectories towards increased persistence that are accessible to evolving populations. Furthermore, sequencing data of evolved populations and knockout libraries after selection reveal various genes that are potentially involved in persistence, including previously known as well as novel targets. Together, our results not only provide experimental evidence for evolutionary theories, but also contribute to a better understanding of the environmental and genetic factors that guide bacterial adaptation to antibiotic treatment.


2021 ◽  
Author(s):  
Elżbieta Kaja ◽  
Adrian Lejman ◽  
Dawid Sielski ◽  
Mateusz Sypniewski ◽  
Tomasz Lech Gambin ◽  
...  

Although Slavic populations account for over 3.5% of world inhabitants, no centralized, open source reference database of genetic variation of any Slavic population exists to date. Such data are crucial for both biomedical research and genetic counseling and are essential for archeological and historical studies. The Polish population, homogeneous and sedentary in nature but influenced by many past migrations, is unique and could serve as a good genetic reference for Central European Slavic nations. The aim of the present study was to describe the first results of analyses of a newly created national database of Polish genomic variant allele frequencies. Never before has any study on the whole genomes of the Polish population been conducted on such a large number of individuals (1,079). A wide spectrum of genomic variation was identified and genotyped, such as small and structural variants, runs of homozygosity, mitochondrial haplogroups and Mendelian inconsistencies. The allele frequencies were calculated for 943 unrelated individuals and released publicly as The Thousand Polish Genomes database. Precise detection and characterisation of rare variants enriched in the Polish population allowed us to confirm the allele frequencies for known pathogenic variants in diseases such as Smith-Lemli-Opitz syndrome (SLOS) or Nijmegen breakage syndrome (NBS). Additionally, the analysis of OMIM autosomal recessive (AR) genes led to the identification of 22 genes with significantly different cumulative allele frequencies in the Polish (POL) vs the non-Finnish European (NFE) population. We hope that The Thousand Polish Genomes database will contribute to the worldwide genomic data resources for researchers and clinicians.
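As a rough illustration of how a per-variant allele frequency in such a database can be derived from genotype counts, the sketch below assumes a diploid autosomal site with complete genotype calls. The function name and the example counts are hypothetical and are not taken from the study; only the cohort size of 943 comes from the abstract.

```python
def allele_frequency(n_het, n_hom_alt, n_samples):
    """Alternate-allele frequency at a diploid autosomal site.

    Each heterozygote carries one alternate allele and each
    alternate homozygote carries two; every sample contributes
    two alleles to the denominator.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if n_het + n_hom_alt > n_samples:
        raise ValueError("genotype counts exceed the number of samples")
    return (n_het + 2 * n_hom_alt) / (2 * n_samples)


# Hypothetical counts for one variant in a cohort of 943 individuals.
af = allele_frequency(n_het=10, n_hom_alt=5, n_samples=943)
print(f"alternate allele frequency: {af:.4f}")
```

Sites on sex chromosomes or with missing genotypes would need a different denominator, which this sketch does not handle.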


2019 ◽  
Vol 35 (14) ◽  
pp. i398-i407 ◽  
Author(s):  
Pavel Skums ◽  
Viachaslau Tsyvina ◽  
Alex Zelikovsky

Abstract
Summary: Intra-tumor heterogeneity is one of the major factors influencing cancer progression and treatment outcome. However, evolutionary dynamics of cancer clone populations remain poorly understood. Quantification of clonal selection and inference of fitness landscapes of tumors is a key step to understanding evolutionary mechanisms driving cancer. These problems could be addressed using single-cell sequencing (scSeq), which provides unprecedented insight into intra-tumor heterogeneity and makes it possible to study and quantify the selective advantages of individual clones. Here, we present Single Cell Inference of FItness Landscape (SCIFIL), a computational tool for inference of fitness landscapes of heterogeneous cancer clone populations from scSeq data. SCIFIL estimates maximum likelihood fitnesses of clone variants and measures their selective advantages and order of appearance by fitting an evolutionary model to the tumor phylogeny. We demonstrate the accuracy of our approach, and show how it could be applied to experimental tumor data to study clonal selection and infer evolutionary history. SCIFIL can be used to provide new insight into the evolutionary dynamics of cancer.
Availability and implementation: The source code is available at https://github.com/compbel/SCIFIL.


BMJ ◽  
2021 ◽  
pp. n214
Author(s):  
Weedon MN ◽  
Jackson L ◽  
Harrison JW ◽  
Ruth KS ◽  
Tyrrell J ◽  
...  

Abstract
Objective: To determine whether the sensitivity and specificity of SNP chips are adequate for detecting rare pathogenic variants in a clinically unselected population.
Design: Retrospective, population based diagnostic evaluation.
Participants: 49 908 people recruited to the UK Biobank with SNP chip and next generation sequencing data, and an additional 21 people who purchased consumer genetic tests and shared their data online via the Personal Genome Project.
Main outcome measures: Genotyping (that is, identification of the correct DNA base at a specific genomic location) using SNP chips versus sequencing, with results split by frequency of that genotype in the population. Rare pathogenic variants in the BRCA1 and BRCA2 genes were selected as an exemplar for detailed analysis of clinically actionable variants in the UK Biobank, and BRCA related cancers (breast, ovarian, prostate, and pancreatic) were assessed in participants through use of cancer registry data.
Results: Overall, genotyping using SNP chips performed well compared with sequencing; sensitivity, specificity, positive predictive value, and negative predictive value were all above 99% for 108 574 common variants directly genotyped on the SNP chips and sequenced in the UK Biobank. However, the likelihood of a true positive result decreased dramatically with decreasing variant frequency; for variants that are very rare in the population, with a frequency below 0.001% in UK Biobank, the positive predictive value was very low and only 16% of 4757 heterozygous genotypes from the SNP chips were confirmed with sequencing data. Results were similar for SNP chip data from the Personal Genome Project, and 20/21 individuals analysed had at least one false positive rare pathogenic variant that had been incorrectly genotyped. For pathogenic variants in the BRCA1 and BRCA2 genes, which are individually very rare, the overall performance metrics for the SNP chips versus sequencing in the UK Biobank were: sensitivity 34.6%, specificity 98.3%, positive predictive value 4.2%, and negative predictive value 99.9%. Rates of BRCA related cancers in UK Biobank participants with a positive SNP chip result were similar to those for age matched controls (odds ratio 1.31, 95% confidence interval 0.99 to 1.71) because the vast majority of variants were false positives, whereas sequence positive participants had a significantly increased risk (odds ratio 4.05, 2.72 to 6.03).
Conclusions: SNP chips are extremely unreliable for genotyping very rare pathogenic variants and should not be used to guide health decisions without validation.
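The four metrics reported in this abstract follow from a 2x2 table of SNP chip calls against sequencing, and the drop in positive predictive value for rare variants is what Bayes' rule predicts when prevalence is low. The sketch below is illustrative only: the function names and the two prevalence values are assumptions and do not come from the study; only the sensitivity (34.6%) and specificity (98.3%) are taken from the abstract.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all carriers
        "specificity": tn / (tn + fp),  # true negatives among all non-carriers
        "ppv": tp / (tp + fp),          # true positives among positive calls
        "npv": tn / (tn + fn),          # true negatives among negative calls
    }


def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value implied by Bayes' rule at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same test performance, two hypothetical carrier prevalences.
for prevalence in (0.002, 0.5):
    ppv = ppv_from_prevalence(0.346, 0.983, prevalence)
    print(f"prevalence {prevalence}: PPV = {ppv:.3f}")
```

With sensitivity and specificity held fixed, the PPV falls as prevalence falls, because false positives from the large non-carrier group outnumber true positives from the small carrier group.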


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Takumi Miura ◽  
Satoshi Yasuda ◽  
Yoji Sato

Abstract
Background: Next-generation sequencing (NGS) has profoundly changed the approach to genetic/genomic research. Particularly, the clinical utility of NGS in detecting mutations associated with disease risk has contributed to the development of effective therapeutic strategies. Recently, comprehensive analysis of somatic genetic mutations by NGS has also been used as a new approach for controlling the quality of cell substrates for manufacturing biopharmaceuticals. However, the quality evaluation of cell substrates by NGS largely depends on the limit of detection (LOD) for rare somatic mutations. The purpose of this study was to develop a simple method for evaluating the ability of whole-exome sequencing (WES) by NGS to detect mutations with low allele frequency. To estimate the LOD of WES for low-frequency somatic mutations, we repeatedly and independently performed WES of a reference genomic DNA using the same NGS platform and assay design. LOD was defined as the allele frequency with a relative standard deviation (RSD) value of 30% and was estimated by a moving average curve of the relation between RSD and allele frequency.
Results: Allele frequencies of 20 mutations in the reference material that had been pre-validated by droplet digital PCR (ddPCR) were obtained from 5, 15, 30, or 40 G base pair (Gbp) sequencing data per run. There was a significant association between the allele frequencies measured by WES and those pre-validated by ddPCR, whose p-value decreased as the sequencing data size increased. By this method, the LOD of allele frequency in WES with the sequencing data of 15 Gbp or more was estimated to be between 5 and 10%.
Conclusions: For properly interpreting the WES data of somatic genetic mutations, it is necessary to have a cutoff threshold of low allele frequencies. The in-house LOD estimated by the simple method shown in this study provides a rationale for setting the cutoff.
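The LOD definition in this abstract (the allele frequency at which the RSD across repeated runs reaches 30%, read off a moving average curve) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not give the window size, the interpolation, or whether the x-axis uses the measured or the ddPCR-validated allele frequency, so the window, the linear interpolation, the use of the replicate mean, and all function names here are hypothetical.

```python
from statistics import mean, stdev


def relative_sd(values):
    """Relative standard deviation (%) of replicate measurements."""
    return 100.0 * stdev(values) / mean(values)


def moving_average(xs, window):
    """Simple unweighted moving average over a sliding window."""
    return [mean(xs[i:i + window]) for i in range(len(xs) - window + 1)]


def estimate_lod(replicates_by_mutation, window=3, threshold=30.0):
    """Allele frequency at which the smoothed RSD curve falls to `threshold`.

    `replicates_by_mutation` holds one list per mutation, containing the
    allele frequency (%) measured in each independent run. Returns None
    when the smoothed curve never crosses the threshold from above.
    """
    # One (mean allele frequency, RSD) point per mutation, sorted by frequency.
    points = sorted((mean(r), relative_sd(r)) for r in replicates_by_mutation)
    af = moving_average([p[0] for p in points], window)
    rsd = moving_average([p[1] for p in points], window)
    for i in range(1, len(rsd)):
        if rsd[i - 1] > threshold >= rsd[i]:
            # Linear interpolation between the two bracketing points.
            frac = (rsd[i - 1] - threshold) / (rsd[i - 1] - rsd[i])
            return af[i - 1] + frac * (af[i] - af[i - 1])
    return None


# Hypothetical replicate measurements (%), three runs per mutation.
runs = [[1.0, 2.0, 3.0], [9.0, 10.0, 11.0]]
print(estimate_lod(runs, window=1))
```

With real data the window would be chosen to smooth run-to-run noise across the 20 validated mutations; `window=1` is used here only to keep the toy example easy to follow.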


2020 ◽  
Vol 28 (12) ◽  
pp. 1763-1768
Author(s):  
Thomas Bourinaris ◽  
Damian Smedley ◽  
Valentina Cipriani ◽  
Isabella Sheikh ◽  
...  

Abstract Hereditary spastic paraplegia (HSP) is a group of heterogeneous inherited degenerative disorders characterized by lower limb spasticity. Fifty percent of HSP patients remain genetically undiagnosed. The 100,000 Genomes Project (100KGP) is a large UK-wide initiative to provide genetic diagnosis to previously undiagnosed patients and families with rare conditions. Over 400 HSP families were recruited to the 100KGP. In order to obtain genetic diagnoses, gene-based burden testing was carried out for rare, predicted pathogenic variants using candidate variants from the Exomiser analysis of the genome sequencing data. A significant gene-disease association was identified for UBAP1 and HSP. Three protein truncating variants were identified in 13 patients from 7 families. All patients presented with a juvenile form of pure HSP, with a median age at onset of 10 years, showing autosomal dominant inheritance or de novo occurrence. Additional clinical features included parkinsonism and learning difficulties, but their association with UBAP1 needs to be established.


2017 ◽  
Vol 256 ◽  
pp. S80
Author(s):  
Ruslan Bayramov ◽  
Muhammet Ensar Dogan ◽  
Meltem Cerrah Gunes ◽  
Muge Gulcihan Unal ◽  
Mehmet Boz ◽  
...  
