de novo genome assembly Latest Research Papers

Chromosome‐level de novo genome assembly of Telopea speciosissima (New South Wales waratah) using long‐reads, linked‐reads and Hi‐C

Molecular Ecology Resources ◽

10.1111/1755-0998.13574 ◽

2022 ◽

Author(s):

Stephanie H. Chen ◽

Maurizio Rossetto ◽

Marlien Merwe ◽

Patricia Lu‐Irving ◽

Jia‐Yee S. Yap ◽

...

Keyword(s):

Genome Assembly ◽

De Novo ◽

New South ◽

New South Wales ◽

De Novo Genome Assembly ◽

South Wales ◽

Long Reads ◽

Chromosome Level

Genome sequencing and population resequencing provide insights into the genetic basis of domestication and diversity of vegetable soybean

Horticulture Research ◽

10.1093/hr/uhab052 ◽

2022 ◽

Vol 9 ◽

Author(s):

Na Liu ◽

Yongchao Niu ◽

Guwen Zhang ◽

Zhijuan Feng ◽

Yuanpeng Bo ◽

...

Keyword(s):

De Novo ◽

Phylogenetic Analyses ◽

Repetitive Sequences ◽

Wild Soybean ◽

Sugar Transport ◽

Sucrose Phosphate Synthase ◽

Comparative Genomic ◽

Vegetable Soybean ◽

De Novo Genome Assembly ◽

The Difference

Vegetable soybean is one of the most important vegetables in China, and demand for it has increased markedly worldwide over the past two decades. Here, we present a high-quality de novo genome assembly of the vegetable soybean cultivar Zhenong 6 (ZN6), one of the most popular cultivars in China. The 20 pseudochromosomes cover 94.57% of the total 1.01 Gb assembly size, with a contig N50 of 3.84 Mb and a scaffold N50 of 48.41 Mb. A total of 55,517 protein-coding genes were annotated. Approximately 54.85% of the assembled genome was annotated as repetitive sequences, of which long terminal repeat transposable elements were the most abundant. Comparative genomic and phylogenetic analyses with grain soybean Williams 82, six other Fabaceae species and Arabidopsis thaliana genomes highlight the differences between ZN6 and other species. Furthermore, we resequenced 60 vegetable soybean accessions. Together with 103 previously resequenced wild soybean and 155 previously resequenced grain soybean accessions, we analyzed the population structure and selective sweeps of vegetable, grain, and wild soybean. The accessions were clearly divided into three clades. We found 1112 and 1047 genes under selection in the vegetable soybean and grain soybean populations, respectively, compared with the wild soybean population. Among them, we identified 134 selected genes shared between the vegetable soybean and grain soybean populations. Additionally, we report four sucrose synthase genes, one sucrose-phosphate synthase gene, and four sugar transport genes as candidate genes related to important traits such as seed sweetness and seed size in vegetable soybean. This study provides essential genomic resources to promote evolutionary and functional genomics studies and genomically informed breeding for vegetable soybean.
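
The contig and scaffold N50 figures quoted in this abstract follow the usual definition: the length L such that sequences of length L or longer together account for at least half of the total assembly size. The sketch below illustrates that definition only. The function name `n50` and the toy lengths are assumptions made for illustration and are not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half the total size.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), not taken from any assembly.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Applied to real data, the same function would take the contig lengths for a contig N50 and the scaffold lengths for a scaffold N50, which is why the two values reported for one assembly can differ by an order of magnitude.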

Third-Generation Sequencing: The Spearhead towards the Radical Transformation of Modern Genomics

Life ◽

10.3390/life12010030 ◽

2021 ◽

Vol 12 (1) ◽

pp. 30

Author(s):

Konstantina Athanasopoulou ◽

Michaela A. Boti ◽

Panagiotis G. Adamopoulos ◽

Paraskevi C. Skourou ◽

Andreas Scorilas

Keyword(s):

De Novo ◽

Direct Detection ◽

Transcriptional Profiling ◽

Third Generation ◽

De Novo Genome Assembly ◽

Rna Molecules ◽

Third Generation Sequencing ◽

Long Reads ◽

Long Read ◽

Generation Sequencing

Although next-generation sequencing (NGS) technology revolutionized sequencing, offering a tremendous sequencing capacity with groundbreaking depth and accuracy, it continues to show serious limitations. In the early 2010s, the introduction of a novel set of sequencing methodologies, presented by two platforms, Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), gave birth to third-generation sequencing (TGS). These long-read technologies make genome sequencing an easy-to-handle procedure by greatly reducing the average time of library construction workflows and by simplifying de novo genome assembly through the generation of long reads. Long sequencing reads produced by both TGS methodologies have already facilitated transcriptional profiling, since they enable the identification of full-length transcripts without the need for assembly or sophisticated bioinformatics tools. Long-read technologies have also provided new insights into the field of epitranscriptomics by allowing the direct detection of RNA modifications on native RNA molecules. This review highlights the advantageous features of the newly introduced TGS technologies, discusses their limitations and provides an in-depth comparison of their scientific background and available protocols, as well as their potential utility in research and clinical applications.

Reference-Guided De Novo Genome Assembly to Dissect a QTL Region for Submergence Tolerance Derived from Ciherang-Sub1

Plants ◽

10.3390/plants10122740 ◽

2021 ◽

Vol 10 (12) ◽

pp. 2740

Author(s):

Yuya Liang ◽

Shichen Wang ◽

Chersty L. Harper ◽

Nithya K. Subramanian ◽

Rodante E. Tabien ◽

...

Keyword(s):

Genome Assembly ◽

De Novo ◽

Global Climate ◽

Major Effect ◽

Sequence Information ◽

Whole Genome ◽

Submergence Tolerance ◽

De Novo Genome Assembly ◽

Rice Varieties ◽

Genome Profile

Global climate change has increased the number of severe flooding events that affect agriculture, including rice production in the U.S. and internationally. Heavy rainfall can cause rice plants to be completely submerged, which can significantly reduce grain yield or completely destroy the plants. Recently, a major-effect QTL for submergence tolerance during the vegetative stage, qSub8.1, which originated from Ciherang-Sub1, was identified in a mapping population derived from a cross between Ciherang-Sub1 and IR10F365. Ciherang-Sub1 was, in turn, derived from a cross between Ciherang and IR64-Sub1. Here, we characterize the qSub8.1 region by analyzing the sequence information of Ciherang-Sub1 and its two parents (Ciherang and IR64-Sub1) and compare the whole-genome profile of these varieties with the Nipponbare and Minghui 63 (MH63) reference genomes. The three rice varieties were sequenced with 150 bp paired-end whole-genome shotgun sequencing (Illumina HiSeq4000), followed by genome assembly with the Trimmomatic-SOAPdenovo2-MUMmer3 pipeline, resulting in approximate genome sizes of 354.4, 343.7, and 344.7 Mb, with N50 values of 25.1, 25.4, and 26.1 kb, respectively. The results showed that the Ciherang-Sub1 genome is composed of 59–63% Ciherang, 22–24% IR64-Sub1, and 15–17% unknown sources. The genome profile revealed a more detailed genomic composition than previous marker-assisted breeding and showed that the qSub8.1 region is mostly from Ciherang, with some introgressed segments from IR64-Sub1 and currently unknown source(s).

Chromosome‐level de novo genome assembly and whole‐genome resequencing of threatened species Acanthochlamys bracteata (Velloziaceae) provide insights into alpine plant divergence in a biodiversity hotspot

Molecular Ecology Resources ◽

10.1111/1755-0998.13562 ◽

2021 ◽

Author(s):

Bo Xu ◽

Min Liao ◽

Heng‐ning Deng ◽

Chao‐chao Yan ◽

Lv Yun‐yun ◽

...

Keyword(s):

Threatened Species ◽

Genome Assembly ◽

De Novo ◽

Biodiversity Hotspot ◽

Whole Genome ◽

Alpine Plant ◽

Genome Resequencing ◽

De Novo Genome Assembly ◽

Whole Genome Resequencing ◽

Chromosome Level

Accurate long-read de novo assembly evaluation with Inspector

Genome Biology ◽

10.1186/s13059-021-02527-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Yu Chen ◽

Yixin Zhang ◽

Amy Y. Wang ◽

Min Gao ◽

Zechen Chong

Keyword(s):

Genome Assembly ◽

De Novo Assembly ◽

In Silico ◽

Large Scale ◽

De Novo ◽

Small Scale ◽

De Novo Genome Assembly ◽

Consensus Sequences ◽

Assembly Evaluation ◽

Long Read

Long-read de novo genome assembly continues to advance rapidly. However, there is a lack of effective tools to accurately evaluate the assembly results, especially for structural errors. We present Inspector, a reference-free long-read de novo assembly evaluator which faithfully reports types of errors and their precise locations. Notably, Inspector can correct the assembly errors based on consensus sequences derived from raw reads covering erroneous regions. Based on in silico and long-read assembly results from multiple long-read data and assemblers, we demonstrate that in addition to providing generic metrics, Inspector can accurately identify both large-scale and small-scale assembly errors.

Assembly of complete diploid phased chromosomes from draft genome sequences

10.1101/2021.11.11.468134 ◽

2021 ◽

Author(s):

Andrea Minio ◽

Noe Cochetel ◽

Amanda M Vondras ◽

Melanie Massonnet ◽

Dario Cantu

Keyword(s):

De Novo ◽

Draft Genome ◽

Biological Information ◽

Genomic Research ◽

Genome Sequences ◽

Animal Kingdom ◽

De Novo Genome Assembly ◽

Closely Related Species ◽

Assembly Strategy ◽

Genome Assemblies

De novo genome assembly is essential for genomic research. High-quality genomes assembled into phased pseudomolecules are challenging to produce and often contain assembly errors caused by repeats, heterozygosity, or the chosen assembly strategy. Although algorithms exist that produce partially phased assemblies, haploid draft assemblies that may lack biological information remain favored because they are easier to generate and use. We developed HaploSync, a suite of tools that produces fully phased, chromosome-scale diploid genome assemblies and performs extensive quality control to limit assembly artifacts. HaploSync uses a genetic map and/or the genome of a closely related species to guide the scaffolding of a diploid assembly into phased pseudomolecules for each chromosome. It compares alternative haplotypes to identify and correct misassemblies independent of a reference, fills assembly gaps with unplaced sequences, and resolves collapsed homozygous regions. In a series of plant, fungal, and animal kingdom case studies, we demonstrate that HaploSync increases the assembly contiguity of phased chromosomes, improves completeness by filling gaps, corrects scaffolding, and correctly phases highly heterozygous, complex regions.

Functional DNA annotation from a preliminary de novo genome assembly of Brycon orbignyanus, an endangered Neotropical migratory fish

Latin American Data in Science ◽

10.53805/lads.v1i2.12 ◽

2021 ◽

Vol 1 (2) ◽

pp. 42-48

Author(s):

Raissa Graciano ◽

Rafael Sachetto Oliveira ◽

Isllas Miguel dos Santos ◽

Gabriel de Menezes Yazbeck

Keyword(s):

Genome Assembly ◽

De Novo ◽

Basic Research ◽

Migratory Fish ◽

De Novo Genome Assembly ◽

Fish Stocks ◽

Fish Spawning ◽

High Throughput Dna Sequencing ◽

Functional Dna ◽

Low Coverage

The predicted sequences of thousands of genes, revealed by a preliminary low-coverage genome assembly, are presented for Brycon orbignyanus, an endangered migratory fish. Neotropical migratory fish stocks have been drastically reduced due to accumulated environmental pressure. Brycon orbignyanus, once one of the main fisheries species in the Platine Basin, is now very rare in nature and relies on spawning programs and a few well-preserved or still untouched sites. High-throughput DNA sequencing has not yet been exploited to obtain functional genome information for B. orbignyanus. To help bridge this gap, we present a dataset resulting from the first functional annotation of a de novo genome assembly for B. orbignyanus, built from short reads (90 bp) obtained on the HiSeq 2000 platform (Illumina). The annotation was performed for scaffolds over 10 kb using the Maker pipeline, with reference sequences taken from the NCBI for the order Characiformes. This annotation resulted in the prediction of 12,734 genes, classified with the aid of PANTHER. The data presented here can facilitate the development of basic research in this threatened species, along with practical biotechnological tools for different areas, such as commercial and environmental fish spawning operations (e.g. hormonal induction, growth) and human health.

Hybrid de novo Genome Assembly of Erwinia sp. E602 and Bioinformatic Analysis Characterized a New Plasmid-Borne lac Operon Under Positive Selection

Frontiers in Microbiology ◽

10.3389/fmicb.2021.783195 ◽

2021 ◽

Vol 12 ◽

Author(s):

Yu Xia ◽

Zhi-Yuan Wei ◽

Rui He ◽

Jia-Huan Li ◽

Zhi-Xin Wang ◽

...

Keyword(s):

Positive Selection ◽

Genome Assembly ◽

De Novo ◽

Bioinformatic Analysis ◽

Lac Operon ◽

Pacbio Sequencing ◽

Metabolic Pathway Analysis ◽

De Novo Genome Assembly ◽

Sequencing Technologies ◽

Lactose Metabolism

Our previous study identified a new β-galactosidase in Erwinia sp. E602. To further understand lactose metabolism in this strain, de novo genome assembly was conducted using a strategy combining Illumina and PacBio sequencing technologies. The whole genome of Erwinia sp. E602 includes a 4.8 Mb chromosome and a large 326 kb plasmid. A total of 4,739 genes, including 4,543 protein-coding genes, 25 rRNAs, 82 tRNAs and 7 other ncRNA genes, were annotated. The plasmid is the largest characterized in the genus Erwinia so far, and it contains a number of genes and pathways responsible for lactose metabolism and regulation. Moreover, a new plasmid-borne lac operon that lacks a typical β-galactoside transacetylase (lacA) gene was identified in the strain. Phylogenetic analysis showed that the genes lacY and lacZ in the operon were under positive selection, indicating the adaptation of lactose metabolism to the environment in Erwinia sp. E602. Our current study demonstrates that hybrid de novo genome assembly using Illumina and PacBio sequencing technologies, together with metabolic pathway analysis, provides a useful strategy for better understanding the evolution of undiscovered microbial species or strains.

LongStitch: high-quality genome assembly correction and scaffolding using long reads

BMC Bioinformatics ◽

10.1186/s12859-021-04451-7 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Lauren Coombe ◽

Janet X. Li ◽

Theodora Lo ◽

Johnathan Wong ◽

Vladimir Nikolic ◽

...

Keyword(s):

Genome Assembly ◽

De Novo ◽

Draft Genome ◽

Model Organisms ◽

High Quality ◽

De Novo Genome Assembly ◽

Long Reads ◽

Long Read ◽

Genomic Regions ◽

Genome Assemblies

Background: Generating high-quality de novo genome assemblies is foundational to the genomics study of model and non-model organisms. In recent years, long-read sequencing has greatly benefited genome assembly and scaffolding, a process by which assembled sequences are ordered and oriented through the use of long-range information. Long reads are better able to span repetitive genomic regions compared to short reads, and thus have tremendous utility for resolving problematic regions and helping generate more complete draft assemblies. Here, we present LongStitch, a scalable pipeline that corrects and scaffolds draft genome assemblies exclusively using long reads.

Results: LongStitch incorporates multiple tools developed by our group and runs in up to three stages, which include initial assembly correction (Tigmint-long), followed by two incremental scaffolding stages (ntLink and ARKS-long). Tigmint-long and ARKS-long are misassembly correction and scaffolding utilities, respectively, previously developed for linked reads, that we adapted for long reads. Here, we describe the LongStitch pipeline and introduce our new long-read scaffolder, ntLink, which utilizes lightweight minimizer mappings to join contigs. LongStitch was tested on short and long-read assemblies of Caenorhabditis elegans, Oryza sativa, and three different human individuals using corresponding nanopore long-read data, and improves the contiguity of each assembly from 1.2-fold up to 304.6-fold (as measured by NGA50 length). Furthermore, LongStitch generates more contiguous and correct assemblies compared to state-of-the-art long-read scaffolder LRScaf in most tests, and consistently improves upon human assemblies in under five hours using less than 23 GB of RAM.

Conclusions: Due to its effectiveness and efficiency in improving draft assemblies using long reads, we expect LongStitch to benefit a wide variety of de novo genome assembly projects. The LongStitch pipeline is freely available at https://github.com/bcgsc/longstitch.
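
The "minimizer" idea mentioned in this abstract can be illustrated with a generic (w, k)-minimizer sketch: in every window of w consecutive k-mers, keep only the smallest one, so two sequences that share a long enough stretch tend to share minimizers. The code below is an assumption-laden illustration, not ntLink's implementation. The function names, the lexicographic ordering of k-mers, and the parameter values are all chosen here for clarity, and production tools typically order k-mers by a hash instead.

```python
def minimizers(seq, k, w):
    """Return sorted (position, k-mer) pairs for the (w, k)-minimizers of seq.

    For each window of w consecutive k-mers, the lexicographically smallest
    k-mer is selected (leftmost on ties). Duplicate selections are merged.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        best = min(range(w), key=lambda j: (window[j], j))
        picked.add((start + best, window[best]))
    return sorted(picked)


def shared_minimizers(a, b, k=3, w=2):
    """Return the minimizer k-mers that two sequences have in common."""
    kmers_a = {kmer for _, kmer in minimizers(a, k, w)}
    kmers_b = {kmer for _, kmer in minimizers(b, k, w)}
    return kmers_a & kmers_b


# Hypothetical toy sequences; real inputs would be contigs and long reads.
print(minimizers("ACGTACGT", k=3, w=2))
print(shared_minimizers("ACGTACGT", "TTACGTAA"))
```

Because only a subset of k-mers is kept, comparing minimizer sets is much cheaper than comparing all k-mers, which is the general reason such mappings are described as lightweight.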
