Rapid construction of a whole-genome transposon insertion collection for Shewanella oneidensis by Knockout Sudoku

2016 ◽  
Vol 7 (1) ◽  
Author(s):  
Michael Baym ◽  
Lev Shaket ◽  
Isao A. Anzai ◽  
Oluwakemi Adesina ◽  
Buz Barstow

Abstract Whole-genome knockout collections are invaluable for connecting gene sequence to function, yet traditionally, their construction has required an extraordinary technical effort. Here we report a method for the construction and purification of a curated whole-genome collection of single-gene transposon disruption mutants termed Knockout Sudoku. Using simple combinatorial pooling, a highly oversampled collection of mutants is condensed into a next-generation sequencing library in a single day, a 30- to 100-fold improvement over prior methods. The identities of the mutants in the collection are then solved by a probabilistic algorithm that uses internal self-consistency within the sequencing data set, followed by rapid algorithmically guided condensation to a minimal representative set of mutants, validation, and curation. Starting from a progenitor collection of 39,918 mutants, we compile a quality-controlled knockout collection of the electroactive microbe Shewanella oneidensis MR-1 containing representatives for 3,667 genes that is functionally validated by high-throughput kinetic measurements of quinone reduction.
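The abstract does not spell out the pooling layout, so the following is only a minimal sketch of how combinatorial pooling can map a sequenced insertion site back to a well. It assumes a hypothetical four-axis scheme (plate row, plate column, well row, well column); the function names and the example collection are illustrative and are not taken from the paper.

```python
from collections import defaultdict
from itertools import product


def pool_membership(plate_row, plate_col, well_row, well_col):
    """Return the four pools a well contributes to (assumed 4-axis scheme)."""
    return (("PR", plate_row), ("PC", plate_col), ("R", well_row), ("C", well_col))


def build_pools(collection):
    """Map each pool to the set of insertion sites it contains.

    `collection` maps an address (plate_row, plate_col, well_row, well_col)
    to the insertion site of the mutant stored in that well.
    """
    pools = defaultdict(set)
    for address, site in collection.items():
        for pool in pool_membership(*address):
            pools[pool].add(site)
    return pools


def candidate_addresses(site, pools):
    """List every address consistent with the pools in which `site` was seen."""
    axes = {"PR": [], "PC": [], "R": [], "C": []}
    for (axis, index), sites in pools.items():
        if site in sites:
            axes[axis].append(index)
    return sorted(product(axes["PR"], axes["PC"], axes["R"], axes["C"]))


# Hypothetical collection: site 100 occurs once, site 200 occurs in two wells.
pools = build_pools({(0, 0, 1, 2): 100, (0, 1, 3, 4): 200, (1, 0, 5, 6): 200})
unique = candidate_addresses(100, pools)
ambiguous = candidate_addresses(200, pools)
```

A site that occurs in a single well decodes to exactly one address, while a site present in several wells yields many candidate addresses. Resolving that ambiguity is the role the abstract assigns to the probabilistic algorithm, which is not reproduced here.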

2016 ◽  
Author(s):  
Michael Baym ◽  
Lev Shaket ◽  
Isao A. Anzai ◽  
Oluwakemi Adesina ◽  
Buz Barstow

Abstract Whole-genome knockout collections are invaluable for connecting gene sequence to function, yet traditionally they have needed an extraordinary technical effort to construct. Knockout Sudoku is a new method for directing the construction and purification of a curated whole-genome collection of single-gene disruption mutants generated by transposon mutagenesis. Using a simple combinatorial pooling scheme, a highly oversampled collection of transposon mutants can be condensed into a next-generation sequencing library in a single day. The identities of the mutants in the collection are then solved by a predictive algorithm based on Bayesian inference, allowing for rapid curation and validation. Starting from a progenitor collection of 39,918 transposon mutants, we compiled a quality-controlled knockout collection of the electroactive microbe Shewanella oneidensis MR-1 containing representatives for 3,667 genes. High-throughput kinetic measurements on this collection provide a comprehensive view of multiple extracellular electron transfer pathways operating in parallel.


Blood ◽  
2021 ◽  
Vol 138 (Supplement 1) ◽  
pp. 58-58
Author(s):  
Anna E. Marneth ◽  
Jonas S. Jutzi ◽  
Angel Guerra-Moreno ◽  
Michele Ciboddo ◽  
María José Jiménez Santos ◽  
...  

Abstract Somatic mutations in the ER chaperone calreticulin (CALR) are frequent and disease-initiating in myeloproliferative neoplasms (MPN). Although the mechanism of mutant CALR-induced MPN is known to involve pathogenic binding between mutant CALR and MPL, this insight has not yet been exploited therapeutically. Consequently, a major deficiency is the lack of clonally selective therapeutic agents with curative potential. Hence, we set out to discover and validate unique genetic dependencies for mutant CALR-driven oncogenesis. We first performed a whole-genome CRISPR knockout screen in CALR Δ52 MPL-expressing hematopoietic cells to identify genes that were differentially required for the growth of cytokine-independent, transformed CALR Δ52 cells as compared to control cells. Using gene-set enrichment analyses, we identified the N-glycan biosynthesis, unfolded protein response, and the protein secretion pathways to be amongst the most significantly differentially depleted pathways (FDR q values <0.001, 0.014, and 0.025, respectively) in CALR Δ52 cells. We performed a secondary CRISPR pooled screen focused on significant pathways from the primary screen and confirmed these findings. Strikingly, seven of the top ten hits in both screens were linked to protein N-glycosylation. Four of those genes encode proteins involved in the enzymatic activity of dolichol-phosphate mannose synthase (DPM1, DPM2, DPM3, and MPDU1). This enzyme synthesizes dolichol D-mannosyl phosphate, an essential substrate for protein N-glycosylation. Importantly, these findings from an unbiased whole-genome screen align with prior mechanistic studies demonstrating that both the N-glycosylation sites on MPL and the lectin-binding sites on CALR Δ52 are required for mutant CALR-driven oncogenesis. 
We next performed single-gene CRISPR-Cas9 validation studies and found that DPM2 is required for CALR Δ52-mediated transformation, as demonstrated by increased cell death, reduced p-STAT5 and decreased MPL cell-surface levels, when Dpm2 is knocked out. Importantly, cells cultured in cytokine-rich medium were unaffected by DPM2 loss. Upon cytokine withdrawal, a sub-clone of non-edited Dpm2WT CALR Δ52 cells grew out, further demonstrating the requirement for DPM2 for the survival of CALR Δ52 cells. Additionally, we observed a >50% reduction in ex vivo myeloid colony formation of murine CalrΔ52 Dpm2 ko bone marrow (BM) compared with CRISPR-Cas9 non-targeting controls, with non-significant effects on CalrWT BM cells. To enable clinical translation, we performed a pharmacological screen targeting pathways significantly depleted in our CRISPR screens. Screening 70 drugs, we found that the N-glycosylation pathway was the only pathway in which all tested compounds preferentially killed CALR Δ52 transformed cells. We then treated primary Calr Δ52/+ mice with a clinical-grade N-glycosylation inhibitor (N-Gi) and found platelet counts (Sysmex) to be significantly reduced (vehicle 3x10^6/mL, N-Gi 1x10^6/mL after 18 days, p<0.0001). Concordantly, the proportion of megakaryocyte erythrocyte progenitors (MEPs) was significantly reduced in CalrΔ52 BM (p=0.03). We next performed competitive BM transplantation assays using CD45.2 UBC-GFP MxCre CalrΔ52 knockin and CD45.1 mice. We found that mice treated with N-Gi had significantly reduced platelet counts (vehicle 1440x10^6/mL, N-Gi 845x10^6/mL, p=0.005) as well as significantly reduced platelet chimerism (vehicle 55%, N-Gi 27%, p<0.001), indicating a distinct vulnerability of CalrΔ52 over WT cells. Finally, we interrogated RNA-sequencing data from primary human MPN platelets. 
We found N-glycosylation-related pathways to be significantly upregulated in CALR-mutated platelets (n = 13) compared to healthy control platelets (n = 21), highlighting the relevance of our findings to human MPN. In summary, using unbiased genetic and focused pharmacological screens, we identified the N-glycan biosynthesis pathway as essential for mutant CALR-driven oncogenesis. Using a pre-clinical MPN model, we found that in vivo inhibition of N-glycosylation normalizes key features of MPN and preferentially targets CalrΔ52 over WT cells. These findings have therapeutic implications through inhibiting N-glycosylation alone or in combination with other agents to advance the development of clonally selective therapeutic approaches in CALR-mutant MPN. AEM and JSJ contributed equally. Disclosures Mullally: Janssen, PharmaEssentia, Constellation and Relay Therapeutics: Consultancy.


2015 ◽  
Author(s):  
Rudy Arthur ◽  
Jared O'Connell ◽  
Ole Schulz-Trieglaff ◽  
Anthony J Cox

Whole-genome low-coverage sequencing has been combined with linkage-disequilibrium (LD) based genotype refinement to accurately and cost-effectively infer genotypes in large cohorts of individuals. Most genotype refinement methods are based on hidden Markov models, which are accurate but computationally expensive. We introduce an algorithm that models LD using a simple multivariate Gaussian distribution. The key feature of our algorithm is its speed: it is hundreds of times faster than other methods on the same data set, and its scaling behaviour is linear in the number of samples. We demonstrate the performance of the method on both low-coverage and high-coverage samples.
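To illustrate the general idea of modelling LD with a Gaussian, the sketch below uses the standard conditional-mean formula for a bivariate Gaussian to refine the dosage at one site from a correlated neighbouring site. This is not the authors' algorithm: the two-site restriction, the function name, and the toy reference panel are all assumptions made for illustration.

```python
def conditional_dosage(panel_target, panel_neighbor, observed_neighbor):
    """Expected dosage at a target site given the dosage at one neighbouring site.

    Uses E[t | x] = mu_t + cov(t, x) / var(x) * (x - mu_x), with the moments
    estimated from a reference panel of genotype dosages (0, 1 or 2 per sample).
    """
    n = len(panel_target)
    mu_t = sum(panel_target) / n
    mu_x = sum(panel_neighbor) / n
    cov = sum((t - mu_t) * (x - mu_x)
              for t, x in zip(panel_target, panel_neighbor)) / n
    var = sum((x - mu_x) ** 2 for x in panel_neighbor) / n
    if var == 0:
        # A monomorphic neighbour carries no LD information; fall back to the mean.
        return mu_t
    return mu_t + cov / var * (observed_neighbor - mu_x)


# Toy panel in which the two sites are in perfect LD.
panel = [0, 1, 2, 0, 1, 2]
refined = conditional_dosage(panel, panel, 2)
# Toy panel in which the neighbouring site never varies.
fallback = conditional_dosage(panel, [1, 1, 1, 1, 1, 1], 1)
```

The multivariate case replaces the scalar ratio with a covariance-matrix solve over all observed neighbouring sites. The abstract gives no further detail, so the sketch stops at two sites.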


2019 ◽  
Vol 123 (7) ◽  
pp. 1231-1251 ◽  
Author(s):  
Dalel Ahmed ◽  
Aurore Comte ◽  
Franck Curk ◽  
Gilles Costantino ◽  
François Luro ◽  
...  

Abstract Background and Aims Reticulate evolution, coupled with reproductive features limiting further interspecific recombinations, results in admixed mosaics of large genomic fragments from the ancestral taxa. Whole-genome sequencing (WGS) data are powerful tools to decipher such complex genomes but still too costly to be used for large populations. The aim of this work was to develop an approach to infer phylogenomic structures in diploid, triploid and tetraploid individuals from sequencing data in reduced genome complexity libraries. The approach was applied to the cultivated Citrus gene pool resulting from reticulate evolution involving four ancestral taxa, C. maxima, C. medica, C. micrantha and C. reticulata. Methods A genotyping by sequencing library was established with the restriction enzyme ApeKI applying one base (A) selection. Diagnostic single nucleotide polymorphisms (DSNPs) for the four ancestral taxa were mined in 29 representative varieties. A generic pipeline based on a maximum likelihood analysis of the number of read data was established to infer ancestral contributions along the genome of diploid, triploid and tetraploid individuals. The pipeline was applied to 48 diploid, four triploid and one tetraploid citrus accessions. Key Results Among 43 598 mined SNPs, we identified a set of 15 946 DSNPs covering the whole genome with a distribution similar to that of gene sequences. The set efficiently inferred the phylogenomic karyotype of the 53 analysed accessions, providing patterns for common accessions very close to that previously established using WGS data. The complex phylogenomic karyotypes of 21 cultivated citrus, including bergamot, triploid and tetraploid limes, were revealed for the first time. Conclusions The pipeline, available online, efficiently inferred the phylogenomic structures of diploid, triploid and tetraploid citrus. 
It will be useful for any species whose reproductive behaviour resulted in an interspecific mosaic of large genomic fragments. It can also be used for the first generations of interspecific breeding schemes.


2019 ◽  
Vol 57 (6) ◽  
Author(s):  
R. C. Jones ◽  
L. G. Harris ◽  
S. Morgan ◽  
M. C. Ruddy ◽  
M. Perry ◽  
...  

ABSTRACT An inability to standardize the bioinformatic data produced by whole-genome sequencing (WGS) has been a barrier to its widespread use in tuberculosis phylogenetics. The aim of this study was to carry out a phylogenetic analysis of tuberculosis in Wales, United Kingdom, using Ridom SeqSphere software for core genome multilocus sequence typing (cgMLST) analysis of whole-genome sequencing data. The phylogenetics of tuberculosis in Wales have not previously been studied. Sixty-six Mycobacterium tuberculosis isolates (including 42 outbreak-associated isolates) from south Wales were sequenced using an Illumina platform. Isolates were assigned to principal genetic groups, single nucleotide polymorphism (SNP) cluster groups, lineages, and sublineages using SNP-calling protocols. WGS data were submitted to the Ridom SeqSphere software for cgMLST analysis and analyzed alongside 179 previously lineage-defined isolates. The data set was dominated by the Euro-American lineage, with the sublineage composition being dominated by T, X, and Haarlem family strains. The cgMLST analysis successfully assigned 58 isolates to major lineages, and the results were consistent with those obtained by traditional SNP mapping methods. In addition, the cgMLST scheme was used to resolve an outbreak of tuberculosis occurring in the region. This study supports the use of a cgMLST method for standardized phylogenetic assignment of tuberculosis isolates and for outbreak resolution and provides the first insight into Welsh tuberculosis phylogenetics, identifying the presence of the Haarlem sublineage commonly associated with virulent traits.


F1000Research ◽  
2015 ◽  
Vol 4 ◽  
pp. 1407 ◽  
Author(s):  
Andy G. Lynch

It is now commonplace to investigate tumour samples using whole-genome sequencing, and some commonly performed tasks are the estimation of cellularity (or sample purity), the genome-wide profiling of copy numbers, and the assessment of sub-clonal behaviours. Several tools are available to undertake these tasks, but they often give conflicting results – not least because there is often genuine uncertainty due to a lack of model identifiability. Presented here is a tool, "Crambled", that allows for an intuitive visual comparison of the conflicting solutions. Crambled is implemented as a Shiny application within R, and is accompanied by example images from two use cases (one tumour sample with matched normal sequencing, and one standalone cell line example) as well as functions to generate the necessary images from any sequencing data set. Through the use of Crambled, a user may gain insight into why each tool has offered its given solution and, combined with knowledge of the disease being studied, can choose between the competing solutions in an informed manner.


2018 ◽  
Author(s):  
Anna Supernat ◽  
Oskar Valdimar Vidarsson ◽  
Vidar M. Steen ◽  
Tomasz Stokowy

ABSTRACT Testing of patients with genetics-related disorders is in the process of shifting from single-gene assays to gene panel sequencing, whole-exome sequencing (WES) and whole-genome sequencing (WGS). Since WGS is unquestionably becoming a new foundation for molecular analyses, we decided to compare three currently used tools for variant calling of human whole-genome sequencing data. We tested DeepVariant, a new TensorFlow machine learning-based variant caller, and compared this tool to GATK 4.0 and SpeedSeq, using 30×, 15× and 10× WGS data of the well-known NA12878 DNA reference sample. According to our comparison, the performance on SNV calling was similar in 30× data, with all three variant callers reaching F-Scores (i.e. harmonic mean of recall and precision) equal to 0.98. In contrast, DeepVariant was more precise in indel calling than GATK and SpeedSeq, as demonstrated by F-Scores of 0.94, 0.90 and 0.84, respectively. We conclude that the DeepVariant tool has great potential and usefulness for analysis of WGS data in medical genetics.
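The F-Score quoted above is the harmonic mean of precision and recall. A minimal sketch of that calculation follows; the function name and the example counts are hypothetical and are not taken from the study.

```python
def f_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall computed from variant-call counts."""
    if true_pos == 0:
        # No correct calls: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 correct calls, 10 spurious calls, 10 missed variants.
example = f_score(90, 10, 10)
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a caller cannot reach a high F-Score by trading away recall for precision or the reverse.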


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Jacob Morrison ◽  
Julie M. Koeman ◽  
Benjamin K. Johnson ◽  
Kelly K. Foy ◽  
Ian Beddows ◽  
...  

Abstract Background With rapidly dropping sequencing cost, the popularity of whole-genome DNA methylation sequencing has been on the rise. Multiple library preparation protocols currently exist. We have performed 22 whole-genome DNA methylation sequencing experiments on snap frozen human samples, and extensively benchmarked common library preparation protocols for whole-genome DNA methylation sequencing, including three traditional bisulfite-based protocols and a new enzyme-based protocol. In addition, different input DNA quantities were compared for two kits compatible with a reduced starting quantity. We also present bioinformatic analysis pipelines for sequencing data from each of these library types. Results An assortment of metrics were collected for each kit, including raw read statistics, library quality and uniformity metrics, cytosine retention, and CpG beta value consistency between technical replicates. Overall, the NEBNext Enzymatic Methyl-seq and Swift Accel-NGS Methyl-Seq kits performed quantitatively better than the other two protocols. In addition, the NEB and Swift kits performed well at low-input amounts, validating their utility in applications where DNA is the limiting factor. Conclusions The NEBNext Enzymatic Methyl-seq kit appeared to be the best option for whole-genome DNA methylation sequencing of high-quality DNA, closely followed by the Swift kit, which potentially works better for degraded samples. Further, a general bioinformatic pipeline is applicable across the four protocols, with the exception of extra trimming needed for the Swift Biosciences Accel-NGS Methyl-Seq protocol to remove the Adaptase sequence.


2021 ◽  
Author(s):  
Jacob Morrison ◽  
Julie M. Koeman ◽  
Benjamin K. Johnson ◽  
Kelly K. Foy ◽  
Wanding Zhou ◽  
...  

Abstract Background: With rapidly dropping sequencing cost, the popularity of whole-genome DNA methylation sequencing has been on the rise. Multiple library preparation protocols exist, but a systematic evaluation and benchmarking of their performance against each other is currently lacking. We have performed 22 whole-genome DNA methylation sequencing experiments on fresh frozen human samples, and extensively benchmarked common library preparation protocols for whole-genome DNA methylation sequencing, including three traditional bisulfite-based protocols and a new enzyme-based protocol. Additionally, different input DNA quantities were compared for two kits compatible with a reduced starting quantity. In addition, we also present bioinformatic analysis pipelines for sequencing data from each of these library types. Results: An assortment of metrics were collected for each kit, including raw read statistics, library quality and uniformity metrics, cytosine retention, and CpG beta value consistency between technical replicates. Overall, the NEBNext Enzymatic Methyl-seq kit performed quantitatively better than the other three protocols at two different DNA input amounts. Additionally, the results for the different input amounts were generally consistent across all metrics. Conclusions: Based on these results, we recommend use of the NEBNext Enzymatic Methyl-seq kit for whole-genome DNA methylation sequencing. Further, a general bioinformatic pipeline is applicable across the four protocols, with the exception of extra trimming needed for the Swift Bioscience's Accel-NGS Methyl-Seq protocol to remove the Adaptase sequence.

