De Novo Sequencing, Assembly, and Annotation of Four Threespine Stickleback Genomes Based on Microfluidic Partitioned DNA Libraries

Genes ◽  
2019 ◽  
Vol 10 (6) ◽  
pp. 426 ◽  
Author(s):  
Daniel Berner ◽  
Marius Roesti ◽  
Steven Bilobram ◽  
Simon K. Chan ◽  
Heather Kirk ◽  
...  

The threespine stickleback is a geographically widespread and ecologically highly diverse fish that has emerged as a powerful model system for evolutionary genomics and developmental biology. Investigations in this species currently rely on a single high-quality reference genome, but would benefit from the availability of additional, independently sequenced and assembled genomes. We present here the assembly of four new stickleback genomes, based on the sequencing of microfluidic partitioned DNA libraries. The base pair lengths of the four genomes reach 92–101% of the standard reference genome length. Together with their de novo gene annotation, these assemblies offer a resource enhancing genomic investigations in stickleback. The genomes and their annotations are available from the Dryad Digital Repository (https://doi.org/10.5061/dryad.113j3h7).

2021 ◽  
Author(s):  
Xinxin Yi ◽  
Jing Liu ◽  
Shengcai Chen ◽  
Hao Wu ◽  
Min Liu ◽  
...  

Cultivated soybean (Glycine max) is an important source of protein and oil. Many elite cultivars with different traits have been developed for different conditions. Each soybean strain has its own genetic diversity, and the availability of more high-quality soybean genomes can enhance comparative genomic analysis for identifying the genetic underpinnings of its unique traits. In this study, we constructed a high-quality de novo assembly of the elite soybean cultivar Jidou 17 (JD17) with chromosome contiguity and high accuracy. We annotated 52,840 gene models and reconstructed 74,054 high-quality full-length transcripts. We performed a genome-wide comparative analysis of the JD17 reference genome against three published soybean genomes (WM82, ZH13 and W05), which identified five large inversions and two large translocations specific to JD17, 20,984–46,912 PAVs spanning 13.1–46.9 Mb in size, and 5–53 PAV clusters larger than 500 kb. Between JD17 and these three genomes, 1,695,741–3,664,629 SNPs and 446,689–800,489 indels were identified and annotated. Symbiotic nitrogen fixation (SNF) genes were identified and the effects of these variants were further evaluated. The coding sequences of 9 nitrogen fixation-related genes were found to be greatly affected. The high-quality genome assembly of JD17 can serve as a valuable reference for soybean functional genomics research.


2021 ◽  
Vol 11 (1) ◽  
pp. 1-9
Author(s):  
Ariel Gershman ◽  
Tatiana G Romer ◽  
Yunfan Fan ◽  
Roham Razaghi ◽  
Wendy A Smith ◽  
...  

Abstract The tobacco hornworm, Manduca sexta, is a lepidopteran insect that is used extensively as a model system for studying insect biology, development, neuroscience, and immunity. However, current studies rely on the highly fragmented reference genome Msex_1.0, which was created using now-outdated technologies and is hindered by a variety of deficiencies and inaccuracies. We present a new reference genome for M. sexta, JHU_Msex_v1.0, applying a combination of modern technologies in a de novo assembly to increase continuity, accuracy, and completeness. The assembly is 470 Mb and is ∼20× more continuous than the original assembly, with scaffold N50 > 14 Mb. We annotated the assembly by lifting over existing annotations and supplementing with additional supporting RNA-based data for a total of 25,256 genes. The new reference assembly is accessible in annotated form for public use. We demonstrate that improved continuity of the M. sexta genome improves resequencing studies and benefits future research on M. sexta as a model organism.


2017 ◽  
Author(s):  
Zhipeng Li ◽  
Zeshan Lin ◽  
Lei Chen ◽  
Hengxing Ba ◽  
Yongzhi Yang ◽  
...  

Abstract Background Reindeer (Rangifer tarandus) is the only fully domesticated species in the Cervidae family, and is the only cervid with a circumpolar distribution. Unlike in all other cervids, female reindeer, as well as males, regularly grow cranial appendages (antlers, the defining characteristic of cervids). Moreover, reindeer milk contains more protein and less lactose than bovids’ milk. A high-quality reference genome of this species will assist efforts to elucidate these and other important features of the reindeer. Findings We obtained 723.2 gigabases (Gb) of raw reads on an Illumina HiSeq 4000 platform, and a 2.64 Gb final assembly, representing 95.7% of the estimated genome size (2.76 Gb according to k-mer analysis) and including 92.6% of expected genes according to BUSCO analysis. The contig N50 and scaffold N50 sizes were 89.7 kilobases (kb) and 0.94 megabases (Mb), respectively. We annotated 21,555 protein-coding genes and 1.07 Gb of repetitive sequences by de novo and homology-based prediction. Homology-based searches detected 159 rRNA, 547 miRNA, 1,339 snRNA and 863 tRNA sequences in the genome of R. tarandus. The divergence time between R. tarandus and the ancestors of Bos taurus and Capra hircus is estimated to be 29.55 million years ago (Mya). Conclusions Our results provide the first high-quality reference genome for the reindeer, and a valuable resource for studying evolution, domestication and other unusual characteristics of the reindeer.
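The contiguity and completeness figures quoted in the abstract above follow standard definitions, which can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `n50` and `assembled_percent` and the toy contig lengths are assumptions made here for illustration, not taken from the reindeer study, and the authors' own tooling is not described in the abstract.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop as soon as the longest sequences cover half of the assembly.
        if 2 * running >= total:
            return length


def assembled_percent(assembly_size, estimated_genome_size):
    """Percentage of the estimated genome size covered by the assembly."""
    return 100.0 * assembly_size / estimated_genome_size


# Hypothetical toy contig lengths (NOT data from the reindeer assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70

# Figures quoted in the abstract: 2.64 Gb assembly, 2.76 Gb k-mer estimate.
print(round(assembled_percent(2.64, 2.76), 1))  # 95.7
```

The same `n50` function applies to contigs or scaffolds; only the input list of lengths changes.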


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9114 ◽  
Author(s):  
Jiawei Wang ◽  
Weizhen Liu ◽  
Dongzi Zhu ◽  
Xiang Zhou ◽  
Po Hong ◽  
...  

The sweet cherry (Prunus avium) is one of the most economically important fruit species in the world. However, there is a limited amount of genetic information available for this species, which hinders breeding efforts at a molecular level. We describe a high-quality reference genome assembly and annotation of the diploid sweet cherry (2n = 2x = 16) cv. Tieton using linked-read sequencing technology. We generated over 750 million clean reads, representing 112.63 Gb of raw sequencing data. The Supernova assembler produced a more highly ordered and continuous genome sequence than the current P. avium draft genome, with a contig N50 of 63.65 kb and a scaffold N50 of 2.48 Mb. The final scaffold assembly was 280.33 Mb in length, representing 82.12% of the estimated Tieton genome. Eight chromosome-scale pseudomolecules were constructed, comprising 214 Mb of sequence from the final scaffold assembly. De novo, homology-based, and RNA-seq methods were used together to predict 30,975 protein-coding loci. 98.39% of core eukaryotic genes and 97.43% of single copy orthologues were identified in the embryo plant, indicating the completeness of the assembly. Linked-read sequencing technology was effective in constructing a high-quality reference genome of the sweet cherry, which will benefit molecular breeding and cultivar identification in this species.
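The abstract above gives the scaffold assembly length (280.33 megabases) and the share of the estimated genome it represents (82.12%), but not the estimated genome size itself. A rough back-calculation, assuming the percentage is simply assembly length divided by estimated genome size, might look like the following. The variable and function names are illustrative assumptions, and the derived values are approximations implied by the quoted figures rather than numbers reported by the authors.

```python
def implied_genome_size(assembly_mb, percent_of_estimate):
    """Estimated genome size (Mb) implied by an assembly length and the
    percentage of the estimated genome that the assembly represents."""
    return assembly_mb / (percent_of_estimate / 100.0)


def anchored_percent(anchored_mb, assembly_mb):
    """Percentage of the scaffold assembly placed in pseudomolecules."""
    return 100.0 * anchored_mb / assembly_mb


# Figures quoted in the abstract, in megabases.
assembly_mb = 280.33
percent_of_estimate = 82.12
anchored_mb = 214.0

estimate = implied_genome_size(assembly_mb, percent_of_estimate)
print(f"implied estimated genome size: {estimate:.1f} Mb")
print(f"anchored in pseudomolecules: {anchored_percent(anchored_mb, assembly_mb):.1f}%")
```

Under that assumption, the quoted figures imply an estimated genome of roughly 341 Mb, with about three quarters of the scaffold assembly anchored to the eight pseudomolecules.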


2021 ◽  
Author(s):  
Milyausha Kaskinova ◽  
Bayazit Yunusbayev ◽  
Radick Altinbaev ◽  
Rika Raffiudin ◽  
Madeline H. Carpenter ◽  
...  

ABSTRACT Apis mellifera L., the western honey bee, is a major crop pollinator that plays a key role in beekeeping and serves as an important model organism in social behavior studies. Recent efforts have improved the quality of the honey bee reference genome and developed a chromosome-level assembly of sixteen chromosomes, two of which are gapless. However, the rest suffer from 51 gaps, 160 unplaced/unlocalized scaffolds, and the lack of 2 distal telomeres. The gaps are located in hard-to-assemble, extended, highly repetitive chromosomal regions that may contain functional genomic elements. Here, we use de novo re-assemblies of the raw reads from the most recent reference genome, Amel_HAv_3.1, together with other long-read-based assemblies of the honey bee genome (INRA_AMelMel_1.0, ASM1384120v1, and ASM1384124v1), to resolve 13 gaps, five unplaced/unlocalized scaffolds, and the missing telomeres of Amel_HAv_3.1. The total length of the resolved gaps is 848,747 bp. The accuracy of the corrected assembly was validated by mapping PacBio reads and performing gene annotation assessment. Comparative analysis suggests that the PacBio-read-based assemblies of the honey bee genome failed in the same highly repetitive extended regions of the chromosomes, especially on chromosome 10. To fully resolve these extended repetitive regions, further work using ultra-long Nanopore sequencing would be needed. Our updated assembly facilitates more accurate reference-guided scaffolding and marker/sequence mapping in honey bee genomics studies.


Author(s):  
Kerry Reid ◽  
Michael A. Bell ◽  
Krishna R. Veeramah

The repeated adaptation of oceanic threespine sticklebacks to fresh water has made the species a premier organism for studying parallel evolution. These small fish have multiple distinct ecotypes that display a wide range of phenotypic traits. Ecotypes are easily crossed in the laboratory, and families are large and develop quickly enough for quantitative trait locus analyses, positioning the threespine stickleback as a versatile model organism for addressing a wide range of biological questions. Extensive genomic resources, including linkage maps, a high-quality reference genome, and developmental genetics tools, have led to insights into the genomic basis of adaptation and the identification of genomic changes controlling traits in vertebrates. Recently, threespine sticklebacks have been used as a model system to identify the genomic basis of highly complex traits, such as behavior and host–microbiome and host–parasite interactions. We review the latest findings and new avenues of research that have led the threespine stickleback to be considered a supermodel of evolutionary genomics. Expected final online publication date for the Annual Review of Genomics and Human Genetics Volume 22 is August 2021.


GigaScience ◽  
2019 ◽  
Vol 8 (11) ◽  
Author(s):  
Sihan Lu ◽  
Jie Yang ◽  
Xuelei Dai ◽  
Feiang Xie ◽  
Jinwu He ◽  
...  

Abstract Background Papilio bianor Cramer, 1777 (commonly known as the Chinese peacock butterfly) (Insecta, Lepidoptera, Papilionidae) is a widely distributed swallowtail butterfly with a large number of geographic populations ranging from the southeast of Russia to China, Japan, India, Vietnam, Myanmar, and Thailand. Its wing color consists of both pigmentary colored scales (black, reddish) and structural colored scales (iridescent blue or green dust). A high-quality reference genome of P. bianor is an important foundation for investigating iridescent color evolution, phylogeography, and the evolution of swallowtail butterflies. Findings We obtained a chromosome-level de novo genome assembly of the highly heterozygous P. bianor using long Pacific Biosciences sequencing reads and high-throughput chromosome conformation capture technology. The final assembly is 421.52 Mb on 30 chromosomes (29 autosomes and 1 Z sex chromosome) with a scaffold N50 of 13.12 Mb. In total, 15,375 protein-coding genes and 233.09 Mb of repetitive sequences were identified. Phylogenetic analyses indicated that P. bianor separated from a common ancestor of swallowtails ∼23.69–36.04 million years ago. Demographic history suggested that the population expansion of this species from the last interglacial period to the last glacial maximum possibly resulted from a decrease in its natural enemies and its adaptation to climate change during the glacial period. Conclusions We present a high-quality chromosome-level reference genome of P. bianor using long-read single-molecule sequencing and Hi-C–based chromatin interaction maps. Our results lay the foundation for exploring the genetic basis of special biological features of P. bianor and also provide a useful data source for comparative genomics and phylogenomics among butterflies and moths.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Haolin Wu ◽  
Tao Ma ◽  
Minghui Kang ◽  
Fandi Ai ◽  
Junlin Zhang ◽  
...  

Abstract Actinidia chinensis (kiwifruit) is a perennial horticultural crop species of the Actinidiaceae family with high nutritional and economic value. Two versions of the A. chinensis genome have been previously assembled, based mainly on relatively short reads. Here, we report an improved chromosome-level reference genome of A. chinensis (v3.0), based mainly on PacBio long reads and Hi-C data. The high-quality assembled genome is 653 Mb long, with 0.76% heterozygosity. At least 43% of the genome consists of repetitive sequences; the most abundant class, long terminal repeats, was further characterized and accounts for 23.38% of the new genome. The assembly shows clear improvements in contiguity, accuracy, and gene annotation over the two previous versions and contains 40,464 annotated protein-coding genes, of which 94.41% are functionally annotated. Moreover, further analyses of genetic collinearity revealed that the kiwifruit genome has undergone two whole-genome duplications: one affecting all Ericales families near the K-T extinction event and a recent genus-specific duplication. The reference genome presented here will be highly useful for further molecular elucidation of diverse traits and for the breeding of this horticultural crop, as well as for evolutionary studies with related taxa.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Richard J. Edwards ◽  
Matt A. Field ◽  
James M. Ferguson ◽  
Olga Dudchenko ◽  
Jens Keilwagen ◽  
...  

Abstract Background Basenjis are considered an ancient dog breed of central African origins that still live and hunt with tribesmen in the African Congo. Nicknamed the barkless dog, Basenjis possess unique phylogeny, geographical origins and traits, making their genome structure of great interest. The increasing number of available canid reference genomes allows us to examine the impact of the choice of reference genome with regard to reference genome quality and breed relatedness. Results Here, we report two high-quality de novo Basenji genome assemblies: a female, China (CanFam_Bas), and a male, Wags. We conduct pairwise comparisons and report structural variations between assembled genomes of three dog breeds: Basenji (CanFam_Bas), Boxer (CanFam3.1) and German Shepherd Dog (GSD) (CanFam_GSD). CanFam_Bas is superior to CanFam3.1 in terms of genome contiguity and comparable overall to the high-quality CanFam_GSD assembly. By aligning short read data from 58 representative dog breeds to three reference genomes, we demonstrate how the choice of reference genome significantly impacts both read mapping and variant detection. Conclusions The growing number of high-quality canid reference genomes means the choice of reference genome is an increasingly critical decision in subsequent canid variant analyses. The basal position of the Basenji makes it suitable for variant analysis for targeted applications of specific dog breeds. However, we believe more comprehensive analyses across the entire family of canids are more suited to a pangenome approach. Collectively, this work highlights the importance of the choice of reference genome in all variation studies.


Author(s):  
Ariel Gershman ◽  
Tatiana Gelaf Romer ◽  
Yunfan Fan ◽  
Roham Razaghi ◽  
Wendy A. Smith ◽  
...  

Abstract The tobacco hornworm, Manduca sexta, is a lepidopteran insect that is used extensively as a model system for studying insect biology, development, neuroscience and immunity. However, current studies rely on the highly fragmented reference genome Msex_1.0, which was created using now-outdated technologies and is hindered by a variety of deficiencies and inaccuracies. We present the new reference genome for M. sexta, JHU_Msex_v1.0, applying a combination of modern technologies in a de novo assembly to increase continuity, accuracy, and completeness. The assembly is 470 Mb and is ~20x more continuous than the original assembly, with scaffold N50 >14 Mb. We annotated the assembly by lifting over existing annotations and supplementing with additional supporting RNA-based data for a total of 25,256 genes. The new reference assembly is accessible in annotated form for public use. We demonstrate that improved continuity of the M. sexta genome improves resequencing studies and benefits future research on M. sexta as a model organism.

