High-Resolution Maps of Mouse Reference Populations

Genome sequence of the model rice variety KitaakeX

BMC Genomics ◽ 10.1186/s12864-019-6262-4 ◽ 2019 ◽ Vol 20 (1) ◽ Cited By ~ 6

Author(s): Rashmi Jain ◽ Jerry Jenkins ◽ Shengqiang Shu ◽ Mawsheng Chern ◽ Joel A. Martin ◽ ...

Keyword(s): Functional Genomics ◽ De Novo ◽ Rice Variety ◽ Rice Genome ◽ Life Cycles ◽ Model Organisms ◽ High Quality ◽ Rice Varieties ◽ Protein Coding ◽ High Quality Genome

Abstract Background: The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles, and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to transform and propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species. Results: Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics. Conclusions: The high quality, de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.
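For readers unfamiliar with the contig N50 statistic quoted in the abstract, the following is a minimal sketch of how it is commonly computed: the largest length L such that contigs of length at least L together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions, not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy contig lengths (hypothetical, not KitaakeX data).
    toy_contigs = [80, 70, 50, 40, 30, 20]
    print(n50(toy_contigs))
```

Applied to the 476 KitaakeX contig lengths, a calculation of this kind would be expected to yield the reported 1.4 Mb, although the authors' exact tooling is not stated in the abstract.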

Reference Genome for the Highly Transformable Setaria viridis ME034V

G3 Genes|Genome|Genetics ◽ 10.1534/g3.120.401345 ◽ 2020 ◽ Vol 10 (10) ◽ pp. 3467-3478 ◽ Cited By ~ 2

Author(s): Peter M. Thielen ◽ Amanda L. Pendleton ◽ Robert A. Player ◽ Kenneth V. Bowden ◽ Thomas J. Lawton ◽ ...

Keyword(s): De Novo ◽ Gene Families ◽ Model Organisms ◽ Phylogenomic Analysis ◽ Setaria Viridis ◽ Sequencing Technology ◽ Protein Coding ◽ Genotype Frequencies ◽ Green Foxtail ◽ Genome Assemblies

Setaria viridis (green foxtail) is an important model system for improving cereal crops due to its diploid genome, ease of cultivation, and use of C4 photosynthesis. The S. viridis accession ME034V is exceptionally transformable, but the lack of a sequenced genome for this accession has limited its utility. We present a 397 Mb highly contiguous de novo assembly of ME034V using ultra-long nanopore sequencing technology (read N50 = 41kb). We estimate that this genome is largely complete based on our updated k-mer based genome size estimate of 401 Mb for S. viridis. Genome annotation identified 37,908 protein-coding genes and >300k repetitive elements comprising 46% of the genome. We compared the ME034V assembly with two other previously sequenced Setaria genomes as well as to a diversity panel of 235 S. viridis accessions. We found the genome assemblies to be largely syntenic, but numerous unique polymorphic structural variants were discovered. Several ME034V deletions may be associated with recent retrotransposition of copia and gypsy LTR repeat families, as evidenced by their low genotype frequencies in the sampled population. Lastly, we performed a phylogenomic analysis to identify gene families that have expanded in Setaria, including those involved in specialized metabolism and plant defense response. The high continuity of the ME034V genome assembly validates the utility of ultra-long DNA sequencing to improve genetic resources for emerging model organisms. Structural variation present in Setaria illustrates the importance of obtaining the proper genome reference for genetic experiments. Thus, we anticipate that the ME034V genome will be of significant utility for the Setaria research community.
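The completeness claim above rests on simple arithmetic with the figures quoted in the abstract. The sketch below makes that arithmetic explicit; it assumes that the ratio of assembly size to the k-mer based genome size estimate is a fair proxy for completeness, which is not necessarily how the authors assessed it, and the function names are hypothetical.

```python
def assembly_fraction(assembly_mb, estimated_genome_mb):
    """Fraction of the estimated genome size represented in the assembly."""
    return assembly_mb / estimated_genome_mb


def repeat_span_mb(assembly_mb, repeat_fraction):
    """Approximate number of megabases annotated as repetitive elements."""
    return assembly_mb * repeat_fraction


if __name__ == "__main__":
    # Figures quoted in the abstract: 397 Mb assembly, 401 Mb estimate, 46% repeats.
    print(f"assembly / estimate: {assembly_fraction(397, 401):.1%}")
    print(f"repetitive span: {repeat_span_mb(397, 0.46):.1f} Mb")
```

On these numbers the assembly covers roughly 99% of the estimated genome size, which is consistent with the authors' description of it as largely complete.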

Genome sequence of the model rice variety KitaakeX

10.21203/rs.2.10528/v1 ◽ 2019

Author(s): Rashmi Jain ◽ Jerry Jenkins ◽ Shengqiang Shu ◽ Mawsheng Chern ◽ Joel A. Martin ◽ ...

Keyword(s): Functional Genomics ◽ De Novo ◽ Rice Variety ◽ Rice Genome ◽ Life Cycles ◽ Model Organisms ◽ High Quality ◽ Japonica Variety ◽ Rice Varieties ◽ Protein Coding

Abstract Background: The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles, and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species. Results: Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics. Conclusions: The high quality, de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.

Reference genome for the highly transformable Setaria viridis cultivar ME034V

10.1101/2020.05.02.073684 ◽ 2020

Author(s): Peter M. Thielen ◽ Amanda L. Pendleton ◽ Robert A. Player ◽ Kenneth V. Bowden ◽ Thomas J. Lawton ◽ ...

Keyword(s): De Novo ◽ Gene Families ◽ Model Organisms ◽ Phylogenomic Analysis ◽ Setaria Viridis ◽ Sequencing Technology ◽ Protein Coding ◽ Genotype Frequencies ◽ Green Foxtail ◽ Genome Assemblies

Abstract: Setaria viridis (green foxtail) is an important model system for improving cereal crops due to its diploid genome, ease of cultivation, and use of C4 photosynthesis. The S. viridis cultivar ME034V is exceptionally transformable, but the lack of a sequenced genome for this cultivar has limited its utility. We present a 397 Mb highly contiguous de novo assembly of ME034V using ultra-long nanopore sequencing technology (read N50 = 41 kb). We estimate that this genome is largely complete based on our updated k-mer based genome size estimate of 401 Mb for S. viridis. Genome annotation identified 37,908 protein-coding genes and >300k repetitive elements comprising 46% of the genome. We compared the ME034V assembly with two other previously sequenced Setaria genomes as well as to a diversity panel of 235 S. viridis cultivars. We found the genome assemblies to be largely syntenic, but numerous unique polymorphic structural variants were discovered. Several ME034V deletions may be associated with recent retrotransposition of copia and gypsy LTR repeat families, as evidenced by their low genotype frequencies in the sampled population. Lastly, we performed a phylogenomic analysis to identify gene families that have expanded in Setaria, including those involved in specialized metabolism and plant defense response. The high continuity of the ME034V genome assembly validates the utility of ultra-long DNA sequencing to improve genetic resources for emerging model organisms. Structural variation present in Setaria illustrates the importance of obtaining the proper genome reference for genetic experiments. Thus, we anticipate that the ME034V genome will be of significant utility for the Setaria research community.

Genome sequence of the model rice variety KitaakeX

10.21203/rs.2.10528/v3 ◽ 2019

Author(s): Rashmi Jain ◽ Jerry Jenkins ◽ Shengqiang Shu ◽ Mawsheng Chern ◽ Joel A. Martin ◽ ...

Keyword(s): Functional Genomics ◽ De Novo ◽ Rice Variety ◽ Rice Genome ◽ Life Cycles ◽ Model Organisms ◽ High Quality ◽ Japonica Variety ◽ Rice Varieties ◽ Protein Coding

Abstract Background: The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles, and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species. Results: Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics. Conclusions: The high quality, de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.

Genome sequence of the model rice variety KitaakeX

10.21203/rs.2.10528/v2 ◽ 2019

Author(s): Rashmi Jain ◽ Jerry Jenkins ◽ Shengqiang Shu ◽ Mawsheng Chern ◽ Joel A. Martin ◽ ...

Keyword(s): Functional Genomics ◽ De Novo ◽ Rice Variety ◽ Rice Genome ◽ Life Cycles ◽ Model Organisms ◽ High Quality ◽ Japonica Variety ◽ Rice Varieties ◽ Protein Coding

Abstract Background: The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles, and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species. Results: Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics. Conclusions: The high quality, de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.

The neurotranscriptome of the Aedes aegypti mosquito

10.1101/026823 ◽ 2015

Author(s): Benjamin J Matthews ◽ Carolyn S McBride ◽ Matthew DeGennaro ◽ Orion Despo ◽ Leslie B Vosshall

Keyword(s): Gene Expression ◽ Aedes Aegypti ◽ Vector Control ◽ De Novo ◽ Transcriptome Assembly ◽ Molecular Genetic ◽ Blood Feeding ◽ Model Organisms ◽ Protein Coding ◽ Male And Female

Background: A complete genome sequence and the advent of genome editing open up non-traditional model organisms to mechanistic genetic studies. The mosquito Aedes aegypti is an important vector of infectious diseases such as dengue, chikungunya, and yellow fever, and has a large and complex genome, which has slowed annotation efforts. We used comprehensive transcriptomic analysis of adult gene expression to improve the genome annotation and to provide a detailed tissue-specific catalogue of neural gene expression at different adult behavioral states. Results: We carried out deep RNA sequencing across all major peripheral male and female sensory tissues, the brain, and (female) ovary. Furthermore, we examined gene expression across three important phases of the female reproductive cycle, a remarkable example of behavioral switching in which a female mosquito alternates between obtaining blood-meals from humans and laying eggs. Using genome-guided alignments and de novo transcriptome assembly, our re-annotation includes 572 new putative protein-coding genes and updates to 13.5% and 50.3% of existing transcripts within coding sequences and untranslated regions, respectively. Using this updated annotation, we detail gene expression in each tissue, identifying large numbers of transcripts regulated by blood-feeding and sexually dimorphic transcripts that may provide clues to the biology of male- and female-specific behaviors, such as mating and blood-feeding, which are areas of intensive study for those interested in vector control. Conclusions: This neurotranscriptome forms a strong foundation for the study of genes in the mosquito nervous system and investigation of sensory-driven behaviors and their regulation. Furthermore, understanding the molecular genetic basis of mosquito chemosensory behavior has important implications for vector control.

De Novo Profiling of Long Non-Coding RNAs Involved in MC-LR-Induced Liver Injury in Whitefish: Discovery and Perspectives

International Journal of Molecular Sciences ◽ 10.3390/ijms22020941 ◽ 2021 ◽ Vol 22 (2) ◽ pp. 941

Author(s): Maciej Florczyk ◽ Paweł Brzuzan ◽ Maciej Woźny

Keyword(s): Liver Injury ◽ Molecular Mechanisms ◽ Liver Toxicity ◽ De Novo ◽ Minimum Free Energy ◽ Model Organisms ◽ Coregonus Lavaretus ◽ Protein Coding ◽ Liver Transcriptome ◽ Non Coding RNAs

Microcystin-LR (MC-LR) is a potent hepatotoxin for which a substantial gap in knowledge persists regarding the underlying molecular mechanisms of liver toxicity and injury. Although long non-coding RNAs (lncRNAs) have been extensively studied in model organisms, our knowledge concerning the role of lncRNAs in liver injury is limited. Given that lncRNAs show low levels of sequence conservation, their role becomes even more unclear in non-model organisms without an annotated genome, like whitefish (Coregonus lavaretus). The objective of this study was to discover and profile aberrantly expressed polyadenylated lncRNAs that are involved in MC-LR-induced liver injury in whitefish. Using RNA sequencing (RNA-Seq) data, we de novo assembled a high-quality whitefish liver transcriptome. This enabled us to find 94 differentially expressed (DE) putative evolutionarily conserved lncRNAs, such as MALAT1, HOTTIP, HOTAIR or HULC, and 4429 DE putative novel whitefish lncRNAs, which differed from annotated protein-coding transcripts (PCTs) in terms of minimum free energy, guanine-cytosine (GC) base-pair content and length. Additionally, we identified DE non-coding transcripts that might be 3′ autonomous untranslated regions (3′UTRs) of mRNAs. We found both evolutionarily conserved lncRNAs and novel whitefish lncRNAs that could serve as biomarkers of liver injury.

Error, noise and bias in de novo transcriptome assemblies

10.1101/585745 ◽ 2019 ◽ Cited By ~ 3

Author(s): Adam H. Freedman ◽ Michele Clamp ◽ Timothy B. Sackton

Keyword(s): De Novo ◽ Transcriptome Assembly ◽ Error Rates ◽ Computational Method ◽ Model Organisms ◽ Data Sets ◽ Protein Coding ◽ Gene Level ◽ Assembly Algorithms ◽ The Impact

Abstract: De novo transcriptome assembly is a powerful tool, widely used over the last decade for making evolutionary inferences. However, it relies on two implicit assumptions: that the assembled transcriptome is an unbiased representation of the underlying expressed transcriptome, and that expression estimates from the assembly are good, if noisy, approximations of the relative abundance of expressed transcripts. Using publicly available data for model organisms, we demonstrate that, across assembly algorithms and data sets, these assumptions are consistently violated. Bias exists at the nucleotide level, with genotyping error rates ranging from 30-83%. As a result, diversity is underestimated in transcriptome assemblies, with consistent under-estimation of heterozygosity in all but the most inbred samples. Even at the gene level, expression estimates show wide deviations from map-to-reference estimates, and positive bias at lower expression levels. Standard filtering of transcriptome assemblies improves the robustness of gene expression estimates but leads to the loss of a meaningful number of protein-coding genes, including many that are highly expressed. We demonstrate a computational method, length-rescaled CPM, to partly alleviate noise and bias in expression estimates. Researchers should consider ways to minimize the impact of bias in transcriptome assemblies.

Analysis of Inter-Chromosomal Distribution of Disease-Related Genes in Human Genome

Current Protein and Peptide Science ◽ 10.2174/1389203721666200426233158 ◽ 2020 ◽ Vol 21 (11) ◽ pp. 1068-1077

Author(s): Xiaochao Sun ◽ Bin Yang ◽ Qunye Zhang

Keyword(s): Spatial Distribution ◽ Model Organisms ◽ Nucleotide Polymorphisms ◽ Chromosomal Distribution ◽ Single Nucleotide ◽ Protein Coding ◽ Single Chromosome ◽ Deletion Mutations ◽ Protein Coding Genes ◽ Disease Related Genes

Many studies have shown that the spatial distribution of genes within a single chromosome exhibits distinct patterns. However, little is known about the characteristics of inter-chromosomal distribution of genes (including protein-coding genes, processed transcripts and pseudogenes) in different genomes. In this study, we explored these issues using the available genomic data of both human and model organisms. We also analyzed the distribution pattern of protein-coding genes that have been associated with 14 common diseases, and of the insertion/deletion mutations and single nucleotide polymorphisms detected by whole genome sequencing in an acute promyelocytic leukemia patient. We obtained the following novel findings. Firstly, inter-chromosomal distribution of genes displays a nonstochastic pattern and the gene densities in different chromosomes are heterogeneous. This kind of heterogeneity is observed in genomes of both lower and higher species. Secondly, protein-coding genes involved in certain biological processes tend to be enriched in one or a few chromosomes. Our findings add new insights into the spatial distribution of genes and disease-related genes across chromosomes. These results could be useful in improving the efficiency of disease-associated gene screening studies by targeting specific chromosomes.
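One common way to quantify whether gene counts are spread across chromosomes in proportion to chromosome length is a chi-square goodness-of-fit statistic. The sketch below is only an illustration of that general idea: the abstract does not state which test the authors used, the function name is hypothetical, and the counts and lengths are made-up toy values rather than real genome data.

```python
def chi_square_gene_density(gene_counts, chrom_lengths):
    """Chi-square statistic comparing observed gene counts per chromosome
    with counts expected if genes were distributed in proportion to
    chromosome length (i.e. uniform gene density)."""
    total_genes = sum(gene_counts.values())
    total_length = sum(chrom_lengths[c] for c in gene_counts)
    statistic = 0.0
    for chrom, observed in gene_counts.items():
        expected = total_genes * chrom_lengths[chrom] / total_length
        statistic += (observed - expected) ** 2 / expected
    return statistic


if __name__ == "__main__":
    # Toy values (hypothetical): two equally long chromosomes, unequal gene counts.
    counts = {"chrA": 300, "chrB": 100}
    lengths = {"chrA": 100, "chrB": 100}
    print(chi_square_gene_density(counts, lengths))
```

A statistic near zero is what uniform density would produce; a large value, judged against a chi-square distribution with one fewer degree of freedom than the number of chromosomes, would point to the kind of heterogeneity the abstract describes.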
