Whole genome sequencing identifies a novel factor required for secretory granule maturation in Tetrahymena thermophila

10.1101/042085 ◽

2016 ◽

Author(s):

Cassandra Kontur ◽

Santosh Kumar ◽

Xun Lan ◽

Jonathan K Pritchard ◽

Aaron P Turkewitz

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Tetrahymena Thermophila ◽

Secretory Granules ◽

Protein A ◽

Model Organisms ◽

Whole Genome ◽

Lysosomal Sorting ◽

Granule Maturation ◽

Multiple Strains

Unbiased genetic approaches have a unique ability to identify novel genes associated with specific biological pathways. Thanks to next generation sequencing, forward genetic strategies can be expanded into a wider range of model organisms. The formation of secretory granules, called mucocysts, in the ciliate Tetrahymena thermophila relies in part on ancestral lysosomal sorting machinery but is also likely to involve novel factors. In prior work, multiple strains with defects in mucocyst biogenesis were generated by nitrosoguanidine mutagenesis and characterized using genetic and cell biological approaches, but the genetic lesions themselves were unknown. Here, we show that analyzing one such mutant by whole genome sequencing reveals a novel factor in mucocyst formation. Strain UC620 has both morphological and biochemical defects in mucocyst maturation, a process analogous to dense core granule maturation in animals. Illumina sequencing of a pool of UC620 F2 clones identified a missense mutation in a novel gene called MMA1 (Mucocyst maturation). The defects in UC620 were rescued by expression of a wildtype copy of MMA1, and disruption of MMA1 in an otherwise wildtype strain generated a phenocopy of UC620. The product of MMA1, characterized as a CFP-tagged copy, is a large soluble cytosolic protein. A small fraction of Mma1p-CFP is pelletable, which may reflect association with endosomes. The gene has no identifiable homologs except in other Tetrahymena species, and therefore represents an evolutionarily recent innovation that is required for granule maturation.

Whole Genome Sequencing Identifies a Novel Factor Required for Secretory Granule Maturation in Tetrahymena thermophila

G3 Genes|Genome|Genetics ◽

10.1534/g3.116.028878 ◽

2016 ◽

Vol 6 (8) ◽

pp. 2505-2516 ◽

Cited By ~ 6

Author(s):

Cassandra Kontur ◽

Santosh Kumar ◽

Xun Lan ◽

Jonathan K. Pritchard ◽

Aaron P. Turkewitz

Keyword(s):

Whole Genome Sequencing ◽

Secretory Granule ◽

Genome Sequencing ◽

Tetrahymena Thermophila ◽

Whole Genome ◽

Granule Maturation

Whole-genome sequencing of the endemic Antarctic fungus Antarctomyces pellizariae reveals an ice-binding protein, a scarce set of secondary metabolites gene clusters and provides insights on Thelebolales phylogeny

Genomics ◽

10.1016/j.ygeno.2020.05.004 ◽

2020 ◽

Vol 112 (5) ◽

pp. 2915-2921 ◽

Cited By ~ 2

Author(s):

Thiago Mafra Batista ◽

Heron Oliveira Hilario ◽

Gabriel Antônio Mendes de Brito ◽

Rennan Garcias Moreira ◽

Carolina Furtado ◽

...

Keyword(s):

Secondary Metabolites ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Binding Protein ◽

Protein A ◽

Gene Clusters ◽

Whole Genome ◽

Ice Binding ◽

Ice Binding Protein

SplitStrains, a tool to identify and separate mixed Mycobacterium tuberculosis infections from WGS data

10.1101/2021.02.07.21250981 ◽

2021 ◽

Author(s):

Einar Gabbasov ◽

Miguel Moreno-Molina ◽

Iñaki Comas ◽

Maxwell Libbrecht ◽

Leonid Chindelevitch

Keyword(s):

Public Health ◽

Mycobacterium Tuberculosis ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Bacterial Pathogens ◽

Superior Performance ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data ◽

Multiple Strains

Abstract: The occurrence of multiple strains of a bacterial pathogen such as M. tuberculosis or C. difficile within a single human host, referred to as a mixed infection, has important implications for both healthcare and public health. However, methods for detecting it, and especially determining the proportion and identities of the underlying strains, from WGS (whole-genome sequencing) data, have been limited. In this paper we introduce SplitStrains, a novel method for addressing these challenges. Grounded in a rigorous statistical model, SplitStrains not only demonstrates superior performance in proportion estimation to other existing methods on both simulated as well as real M. tuberculosis data, but also successfully determines the identity of the underlying strains. We conclude that SplitStrains is a powerful addition to the existing toolkit of analytical methods for data coming from bacterial pathogens, and holds the promise of enabling previously inaccessible conclusions to be drawn in the realm of public health microbiology.

Author summary: When multiple strains of a pathogenic organism are present in a patient, it may be necessary to not only detect this, but also to identify the individual strains. However, this problem has not yet been solved for bacterial pathogens processed via whole-genome sequencing. In this paper, we propose the SplitStrains algorithm for detecting multiple strains in a sample, identifying their proportions, and inferring their sequences, in the case of Mycobacterium tuberculosis. We test it on both simulated and real data, with encouraging results. We believe that our work opens new horizons in public health microbiology by allowing a more precise detection, identification and quantification of multiple infecting strains within a sample.
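As a rough illustration of the proportion-estimation problem described in this abstract, the sketch below takes the median minor-allele fraction across sites that look mixed. It is not the SplitStrains statistical model, which the abstract does not detail. The function name `estimate_minor_proportion`, the thresholds `min_depth` and `min_minor_fraction`, and the read counts are all hypothetical, chosen only for illustration.

```python
from statistics import median


def estimate_minor_proportion(site_counts, min_depth=20, min_minor_fraction=0.05):
    """Return the median minor-allele fraction across informative sites, or None.

    site_counts is a list of (ref_reads, alt_reads) pairs, one per variable site.
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    fractions = []
    for ref_reads, alt_reads in site_counts:
        depth = ref_reads + alt_reads
        if depth < min_depth:
            continue  # too few reads to trust the fraction
        minor_fraction = min(ref_reads, alt_reads) / depth
        if minor_fraction >= min_minor_fraction:
            fractions.append(minor_fraction)  # site looks mixed rather than noisy
    return median(fractions) if fractions else None


# Hypothetical (ref, alt) read counts at five variable sites.
example_sites = [(80, 20), (21, 79), (100, 0), (78, 22), (5, 3)]
minor_proportion = estimate_minor_proportion(example_sites)
```

A simple summary like this cannot say which alleles belong to which strain; assigning identities is the part the abstract attributes to a full statistical model.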

Challenges and Approaches to Genotyping Repetitive DNA

G3 Genes|Genome|Genetics ◽

10.1534/g3.119.400771 ◽

2019 ◽

Vol 10 (1) ◽

pp. 417-430 ◽

Cited By ~ 2

Author(s):

Elizabeth A. Morton ◽

Ashley N. Hall ◽

Elizabeth Kwan ◽

Calvin Mok ◽

Konstantin Queitsch ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Repetitive Dna ◽

Complex Traits ◽

Copy Number ◽

Model Organisms ◽

Whole Genome ◽

Copy Number Estimation ◽

Rdna Copy ◽

Rdna Copy Number

Individuals within a species can exhibit vast variation in copy number of repetitive DNA elements. This variation may contribute to complex traits such as lifespan and disease, yet it is only infrequently considered in genotype-phenotype associations. Although the possible importance of copy number variation is widely recognized, accurate copy number quantification remains challenging. Here, we assess the technical reproducibility of several major methods for copy number estimation as they apply to the large repetitive ribosomal DNA array (rDNA). rDNA encodes the ribosomal RNAs and exists as a tandem gene array in all eukaryotes. Repeat units of rDNA are kilobases in size, often with several hundred units comprising the array, making rDNA particularly intractable to common quantification techniques. We evaluate pulsed-field gel electrophoresis, droplet digital PCR, and Nextera-based whole genome sequencing as approaches to copy number estimation, comparing techniques across model organisms and spanning wide ranges of copy numbers. Nextera-based whole genome sequencing, though commonly used in recent literature, produced high error. We explore possible causes for this error and provide recommendations for best practices in rDNA copy number estimation. We present a resource of high-confidence rDNA copy number estimates for a set of S. cerevisiae and C. elegans strains for future use. We furthermore explore the possibility for FISH-based copy number estimation, an alternative that could potentially characterize copy number on a cellular level.
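The sequencing-based approach evaluated in this abstract generally rests on a depth ratio: reads covering the repeat unit are compared with reads covering single-copy regions. The sketch below shows only that arithmetic, assuming per-position read depths are already available. The function name `estimate_copy_number` and the depth values are hypothetical, and the sketch does nothing to correct the library-preparation error the authors report.

```python
from statistics import median


def estimate_copy_number(repeat_depths, single_copy_depths):
    """Estimate repeat copy number as the ratio of median read depths."""
    if not repeat_depths or not single_copy_depths:
        raise ValueError("both depth lists must be non-empty")
    baseline = median(single_copy_depths)
    if baseline <= 0:
        raise ValueError("single-copy baseline depth must be positive")
    # Copies per genome copy: depth over the repeat relative to unique sequence.
    return median(repeat_depths) / baseline


# Hypothetical per-position depths over one rDNA repeat unit and over unique regions.
rdna_depths = [4500, 4400, 4600, 4550, 4450]
single_copy_depths = [30, 28, 32, 31, 29]
copies = estimate_copy_number(rdna_depths, single_copy_depths)
```

Medians are used instead of means so that a few extreme positions do not dominate the estimate; this is a design choice for the sketch, not a recommendation taken from the paper.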

Identifying TCDD-resistance genes via murine and rat comparative genomics and transcriptomics

10.1101/602698 ◽

2019 ◽

Cited By ~ 1

Author(s):

Stephenie D. Prokopec ◽

Aileen Lu ◽

Sandy Che-Eun S. Lee ◽

Cindy Q. Yao ◽

Ren X. Sun ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Transgenic Mouse ◽

Genome Sequencing ◽

Mrna Abundance ◽

Model Organisms ◽

Whole Genome ◽

Contributing Factors ◽

Rat Models ◽

Toxic Responses ◽

Mouse Lines

Abstract: The aryl hydrocarbon receptor (AHR) mediates many of the toxic effects of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). However, the AHR alone is insufficient to explain the widely different outcomes among organisms. Attempts to identify unknown factor(s) have been confounded by genetic variability of model organisms. Here, we evaluated three transgenic mouse lines, each expressing a different rat AHR isoform (rWT, DEL, and INS), as well as C57BL/6 and DBA/2 mice. We supplement these with whole-genome sequencing and transcriptomic analyses of the corresponding rat models: Long-Evans (L-E) and Han/Wistar (H/W) rats. These integrated multi-species genomic and transcriptomic data were used to identify genes associated with TCDD-response phenotypes. We identified several genes that show consistent transcriptional changes in both transgenic mice and rats. Hepatic Pxdc1 was significantly repressed by TCDD in C57BL/6, rWT mice, and in L-E rat. Three genes demonstrated different AHRE-1 (full) motif occurrences within their promoter regions: Cxxc5 had fewer occurrences in H/W, as compared with L-E; Sugp1 and Hgfac (in either L-E or H/W respectively). These genes also showed different patterns of mRNA abundance across strains. The AHR isoform explains much of the transcriptional variability: up to 50% of genes with altered mRNA abundance following TCDD exposure are associated with a single AHR isoform (30% and 10% unique to DEL and rWT respectively following 500 μg/kg TCDD). Genomic and transcriptomic evidence allowed identification of genes potentially involved in phenotypic outcomes: Pxdc1 had differential mRNA abundance by phenotype; Cxxc5 had altered AHR binding sites and differential mRNA abundance.

Author Summary: Environmental contaminants such as dioxins cause many toxic responses, anything from chloracne (common in humans) to death. These toxic responses are mostly regulated by the Ahr, a ligand-activated transcription factor with roles in drug metabolism and immune responses; however, other contributing factors remain unclear. Studies are complicated by the underlying genetic heterogeneity of model organisms. Our team evaluated a number of mouse and rat models, including two strains of mouse, two strains of rat and three transgenic mouse lines which differ only at the Ahr locus, that present widely different sensitivities to the most potent dioxin: 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). We identified a number of changes to gene expression that were associated with different toxic responses. We then contrasted these findings with results from whole-genome sequencing of the H/W and L-E rats and found some key genes, such as Cxxc5 and Mafb, which might contribute to TCDD toxicity. These transcriptomic and genomic datasets will provide a valuable resource for future studies into the mechanisms of dioxin toxicities.

Mapping challenging mutations by whole-genome sequencing

10.1101/036046 ◽

2016 ◽

Author(s):

Harold E. Smith ◽

Amy S. Fabritius ◽

Aimee Jaramillo-Lambert ◽

Andy Golden

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Single Gene ◽

Global Scale ◽

Model Organisms ◽

Whole Genome ◽

Genetic Screens ◽

Alternative Approach ◽

Mutation Identification

Abstract: Whole-genome sequencing provides a rapid and powerful method for identifying mutations on a global scale, and has spurred a renewed enthusiasm for classical genetic screens in model organisms. The most commonly characterized category of mutation consists of monogenic, recessive traits, due to their genetic tractability. Therefore, most of the mapping methods for mutation identification by whole-genome sequencing are directed toward alleles that fulfill those criteria (i.e., single-gene, homozygous variants). However, such approaches are not entirely suitable for the characterization of a variety of more challenging mutations, such as dominant and semi-dominant alleles or multigenic traits. Therefore, we have developed strategies for the identification of those classes of mutations, using polymorphism mapping in Caenorhabditis elegans as our model for validation. We also report an alternative approach for mutation identification from traditional recombinant crosses, and a solution to the technical challenge of sequencing sterile or terminally arrested strains where population size is limiting. The methods described herein extend the applicability of whole-genome sequencing to a broader spectrum of mutations, including classes that are difficult to map by traditional means.

Genomic Sequencing To Identify Potential Causative Mutation(s) of Neurospora crassa col-4

Microbiology Resource Announcements ◽

10.1128/mra.01009-19 ◽

2020 ◽

Vol 9 (2) ◽

Author(s):

Thomas A. Randall

Keyword(s):

Neurospora Crassa ◽

Whole Genome Sequencing ◽

Genetic Markers ◽

Genome Sequencing ◽

Model Organisms ◽

Causative Mutation ◽

Genomic Sequencing ◽

Whole Genome ◽

Content Type

In many cases, genes for commonly used genetic markers in model organisms have not been identified; therefore, it is of interest to identify the causative genes. Whole-genome sequencing was used to identify potential causative mutations for a col-4 allele of Neurospora crassa.

Outbreak of invasive wound mucormycosis in a burn unit due to multiple strains of Mucor circinelloides f. circinelloides resolved by whole genome sequencing

10.1101/233049 ◽

2017 ◽

Author(s):

Dea Garcia-Hermoso ◽

Alexis Criscuolo ◽

Soo Chan Lee ◽

Matthieu Legrand ◽

Marc Chaouat ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Reference Genome ◽

Mucor Circinelloides ◽

Clinical Isolates ◽

Whole Genome ◽

Direct Transmission ◽

Burn Unit ◽

Environmental Reservoir ◽

Multiple Strains

Abstract: Mucorales are ubiquitous environmental molds responsible for mucormycosis in diabetic, immunocompromised, and severely burned patients. Small outbreaks of invasive wound mucormycosis (IWM) have already been reported in burn units without extensive microbiological investigations. We faced an outbreak of IWM in our center and investigated the clinical isolates with whole genome sequencing (WGS) analysis. We analyzed M. circinelloides isolates from patients in our burn unit (BU1) together with non-outbreak isolates from burn unit 2 (BU2, Paris area) and from France over a two-year period (2013-2015). For each isolate, WGS and a de novo genome assembly were performed from read data extracted from the aligned contig sequences of the reference genome (1006PhL). A total of 21 isolates were sequenced including 14 isolates from six BU1 patients. Phylogenetic classification showed that the clinical isolates clustered in four highly divergent clades. Clade1 contained at least one of the strains from the six epidemiologically-linked BU1 patients. The clinical isolates seemed specific to each patient. Two patients were infected with more than two strains from different clades suggesting that an environmental reservoir of clonally unrelated isolates was the source of contamination. Only two patients shared one strain in BU1, suggesting direct transmission or contamination with the same environmental source. WGS coupled with precise epidemiological data and analysis of several isolates per patient revealed in our study a complex situation with both potential cross-transmission and multiple contaminations with a heterogeneous pool of strains from a cryptic environmental reservoir.

Importance: Invasive wound mucormycosis (IWM) is a severe infection due to the environmental molds belonging to the order Mucorales. Severely burned patients are particularly at risk for IWM. Here, we used Whole Genome Sequencing (WGS) analysis to resolve an outbreak of IWM due to Mucor circinelloides that occurred in our hospital (BU1). We sequenced 21 clinical isolates, including 14 from BU1 and 7 unrelated isolates, and compared them to the reference genome (1006PhL). This analysis revealed that the outbreak was mainly due to multiple strains that seemed patient-specific, suggesting that the patients were more likely infected from a pool of diverse strains from the environment rather than from direct transmission between the patients. This study revealed the complexity of a Mucorales outbreak in the settings of IWM in burn patients, which has been highlighted based on whole genome sequencing and careful sampling.

First de-novo transcriptome assembly of a South American frog, Oreobates cruralis, enables population genomic studies of Neotropical amphibians

10.7287/peerj.preprints.2980v1 ◽

2017 ◽

Author(s):

Santiago Montero-Mendieta ◽

Manfred Grabherr ◽

Henrik Lantz ◽

Ignacio De la Riva ◽

Jennifer A Leonard ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

De Novo ◽

Transcriptome Assembly ◽

Cost Effective ◽

Model Organisms ◽

South American ◽

Whole Genome ◽

Rna Seq ◽

De Novo Transcriptome

Whole genome sequencing is opening the door to novel insights into the population structure and evolutionary history of poorly known species. In organisms with large genomes, which include most amphibians, whole-genome sequencing is excessively challenging and transcriptome sequencing (RNA-seq) represents a cost-effective tool to explore genome-wide variability. Non-model organisms do not usually have a reference genome to facilitate assembly and the transcriptome sequence must be assembled de-novo. We used RNA-seq to obtain the transcriptome profile for Oreobates cruralis, a poorly known South American direct-developing frog. In total, 550,871 transcripts were assembled, corresponding to 422,999 putative genes. Of those, we identified 23,500, 37,349, 38,120 and 45,885 genes present in the Pfam, EggNOG, KEGG and GO databases, respectively. Interestingly, our results suggested that genes related to the immune system and defense mechanisms are abundant in the transcriptome of O. cruralis. We also present a workflow to assist with pre-processing, assembling, evaluating and functionally annotating a de-novo transcriptome from RNA-seq data of non-model organisms. Our workflow guides the inexperienced user in an intuitive way through all the necessary steps to build de-novo transcriptome assemblies using readily available software and is freely available at: https://github.com/biomendi/PRACTICAL-GUIDE-TO-BUILD-DE-NOVO-TRANSCRIPTOME-ASSEMBLIES-FOR-NON-MODEL-ORGANISMS/wiki

Whole Genome Sequencing Provides an Added Value to the Investigation of Staphylococcal Food Poisoning Outbreaks

Frontiers in Microbiology ◽

10.3389/fmicb.2021.750278 ◽

2021 ◽

Vol 12 ◽

Author(s):

Stéphanie Nouws ◽

Bert Bogaerts ◽

Bavo Verhaegen ◽

Sarah Denayer ◽

Lasse Laeremans ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Protein A ◽

Food Poisoning ◽

Gene Profiling ◽

Added Value ◽

Outbreak Investigation ◽

Whole Genome ◽

Foodborne Outbreak ◽

Chain Reactions

Through staphylococcal enterotoxin (SE) production, Staphylococcus aureus is a common cause of food poisoning. Detection of staphylococcal food poisoning (SFP) is mostly performed using immunoassays, which, however, only detect five of 27 SEs described to date. Polymerase chain reactions are, therefore, frequently used in complement to identify a bigger arsenal of SE at the gene level (se) but are labor-intensive. Complete se profiling of isolates from different sources, i.e., food and human cases, is, however, important to provide an indication of their potential link within foodborne outbreak investigation. In addition to complete se gene profiling, relatedness between isolates is determined with more certainty using pulsed-field gel electrophoresis, Staphylococcus protein A gene typing and other methods, but these are shown to lack resolution. We evaluated how whole genome sequencing (WGS) can offer a solution to these shortcomings. By WGS analysis of a selection of S. aureus isolates, including some belonging to a confirmed foodborne outbreak, its added value as the ultimate multiplexing method was demonstrated. In contrast to PCR-based se gene detection for which primers are sometimes shown to be non-specific, WGS enabled complete se gene profiling with high performance, provided that a database containing reference sequences for all se genes was constructed and employed. The custom compiled database and applied parameters were made publicly available in an online user-friendly interface. As an all-in-one approach with high resolution, WGS additionally allowed inferring correct isolate relationships. The different DNA extraction kits that were tested affected neither se gene profiling nor relatedness determination, which is interesting for data sharing during SFP outbreak investigation. Although confirming the production of enterotoxins remains important for SFP investigation, we delivered a proof-of-concept that WGS is a valid alternative and/or complementary tool for outbreak investigation.
