FSG: Fast String Graph Construction for De Novo Assembly

Paola Bonizzoni; Gianluca Della Vedova; Yuri Pirola; Marco Previtali; Raffaella Rizzi

doi:10.1089/cmb.2017.0089

Clover: a clustering-oriented de novo assembler for Illumina sequences

BMC Bioinformatics ◽

10.1186/s12859-020-03788-9 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Ming-Feng Hsieh ◽

Chin Lung Lu ◽

Chuan Yi Tang

Keyword(s):

De Novo Assembly ◽

De Novo ◽

Low Cost ◽

De Bruijn Graph ◽

Illumina Platform ◽

Sequencing Errors ◽

Sequencing Technologies ◽

String Graph ◽

Clustering Approach ◽

De Bruijn

Abstract Background: Next-generation sequencing technologies revolutionized genomics by producing high-throughput reads at low cost, and this progress has prompted the recent development of de novo assemblers. Multiple assembly methods based on the de Bruijn graph have been shown to be efficient for Illumina reads. However, sequencing errors introduced by the sequencer complicate de novo assembly and affect the quality of downstream genomic research. Results: In this paper, we develop a de Bruijn assembler, called Clover (clustering-oriented de novo assembler), that uses a novel k-mer clustering approach derived from the overlap-layout-consensus concept to deal with the sequencing errors generated by the Illumina platform. We further evaluate Clover’s performance against several de Bruijn graph assemblers (ABySS, SOAPdenovo, SPAdes and Velvet), overlap-layout-consensus assemblers (Bambus2, CABOG and MSR-CA) and a string graph assembler (SGA) on three datasets (Staphylococcus aureus, Rhodobacter sphaeroides and human chromosome 14). The results show that Clover achieves superior assembly quality in terms of corrected N50 and E-size while remaining competitive in run time with every assembler except SOAPdenovo. In addition, Clover was used in the sequencing projects of the bacterial genomes Acinetobacter baumannii TYTH-1 and Morganella morganii KT. Conclusions: The novel clustering-based approach of Clover, which integrates the flexibility of the overlap-layout-consensus approach with the efficiency of the de Bruijn graph method, has high potential for de novo assembly. Clover is freely available as open source software from https://oz.nthu.edu.tw/~d9562563/src.html.

GraphSeq: Accelerating String Graph Construction for De Novo Assembly on Spark

10.1101/321729 ◽

2018 ◽

Author(s):

Chung-Tsai Su ◽

Ming-Tai Chang ◽

Yun-Chian Cheng ◽

Yun-Lung Li ◽

Yao-Ting Wang

Keyword(s):

Genome Assembly ◽

De Novo Assembly ◽

De Novo ◽

Data Representation ◽

Important Application ◽

Supplementary Information ◽

De Novo Genome Assembly ◽

String Graph ◽

Computing Framework ◽

Variant Identification

Abstract Summary: De novo genome assembly is important both for assembling uncharacterized genomes and for identifying variants in a reference-unbiased way. Compared with the de Bruijn graph, the string graph is a lossless data representation for de novo assembly. However, string graph construction is computationally intensive. We propose GraphSeq to accelerate string graph construction by leveraging a distributed computing framework. Availability and Implementation: GraphSeq is implemented in Scala on Spark and is freely available at https://www.atgenomix.com/blog/graphseq. Supplementary information: Supplementary data are available at Bioinformatics online.
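To give a sense of why string graph construction is costly, the following is a minimal sketch of the naive all-pairs suffix-prefix overlap step on a few toy reads. It is not GraphSeq's method: the function names, the `min_len` threshold, and the reads are assumptions made for illustration, and the removal of contained reads and transitive edges that turns an overlap graph into a string graph is omitted.

```python
from itertools import permutations


def longest_overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` characters exists.
    """
    max_len = min(len(a), len(b))
    for length in range(max_len, min_len - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def overlap_graph(reads, min_len=3):
    """Map each ordered read pair (a, b) to its overlap length.

    Every ordered pair is compared, so the work grows quadratically
    with the number of reads.
    """
    graph = {}
    for a, b in permutations(reads, 2):
        length = longest_overlap(a, b, min_len)
        if length:
            graph[(a, b)] = length
    return graph


# Hypothetical toy reads, not taken from any dataset in the listing.
toy_reads = ["ACGTAC", "GTACGG", "ACGGTT"]
toy_graph = overlap_graph(toy_reads, min_len=3)
```

The quadratic pairwise comparison is the part that index-based or distributed approaches aim to avoid.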

FSG: Fast String Graph Construction for De Novo Assembly of Reads Data

Bioinformatics Research and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-319-38782-6_3 ◽

2016 ◽

pp. 27-39 ◽

Cited By ~ 1

Author(s):

Paola Bonizzoni ◽

Gianluca Della Vedova ◽

Yuri Pirola ◽

Marco Previtali ◽

Raffaella Rizzi

Keyword(s):

De Novo Assembly ◽

De Novo ◽

String Graph

Faculty Opinions recommendation of Efficient de novo assembly of single-cell bacterial genomes from short-read data sets.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.13296960.14657061 ◽

2011 ◽

Author(s):

Steven Salzberg

Keyword(s):

Single Cell ◽

De Novo Assembly ◽

De Novo ◽

Data Sets ◽

Bacterial Genomes ◽

Short Read

Faculty Opinions recommendation of The sequence and de novo assembly of the giant panda genome.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.2367956.1997054 ◽

2010 ◽

Author(s):

Victoria Prince

Keyword(s):

De Novo Assembly ◽

Giant Panda ◽

De Novo

Initial Steps of Photosystem II de Novo Assembly and Preloading with Manganese Take Place in Biogenesis Centers in Synechocystis

The Plant Cell ◽

10.1105/tpc.111.093914 ◽

2012 ◽

Vol 24 (2) ◽

pp. 660-675 ◽

Cited By ~ 54

Author(s):

Anna Stengel ◽

Irene L. Gügel ◽

Daniel Hilger ◽

Birgit Rengstl ◽

Heinrich Jung ◽

...

Keyword(s):

Photosystem Ii ◽

De Novo Assembly ◽

De Novo

Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm

Nature Methods ◽

10.1038/s41592-020-01056-5 ◽

2021 ◽

Vol 18 (2) ◽

pp. 170-175 ◽

Cited By ~ 2

Author(s):

Haoyu Cheng ◽

Gregory T. Concepcion ◽

Xiaowen Feng ◽

Haowen Zhang ◽

Heng Li

Keyword(s):

De Novo Assembly ◽

De Novo

Corrigendum to “Transcriptome de novo assembly and analysis of differentially expressed genes related to cytoplasmic male sterility in onion” [Plant Physiol. Biochem. 125 (2018) 35–44]

Plant Physiology and Biochemistry ◽

10.1016/j.plaphy.2018.06.038 ◽

2018 ◽

Vol 129 ◽

pp. 437

Author(s):

Qiaoling Yuan ◽

Ce Song ◽

Luyao Gao ◽

Huihui Zhang ◽

Cuicui Yang ◽

...

Keyword(s):

Cytoplasmic Male Sterility ◽

Male Sterility ◽

Differentially Expressed Genes ◽

De Novo Assembly ◽

De Novo ◽

Differentially Expressed ◽

Onion Plant

A long reads-based de-novo assembly of the genome of the Arlee homozygous line reveals chromosomal rearrangements in rainbow trout

G3 Genes|Genome|Genetics ◽

10.1093/g3journal/jkab052 ◽

2021 ◽

Author(s):

Guangtu Gao ◽

Susana Magadan ◽

Geoffrey C Waldbieser ◽

Ramey C Youngblood ◽

Paul A Wheeler ◽

...

Keyword(s):

Rainbow Trout ◽

Chromosome Number ◽

Genome Assembly ◽

De Novo Assembly ◽

De Novo ◽

Sequence Data ◽

Structural Variations ◽

High Coverage ◽

Haploid Chromosome Number ◽

Long Reads

Abstract Currently, there is still a need to improve the contiguity of the rainbow trout reference genome and to use multiple genetic backgrounds that will represent the genetic diversity of this species. The Arlee doubled haploid line originated from a domesticated hatchery strain that was originally collected from the northern California coast. The Canu pipeline was used to generate the de novo assembly of the Arlee line genome from high-coverage PacBio long-read sequence data. The assembly was further improved with Bionano optical maps and Hi-C proximity ligation sequence data to generate 32 major scaffolds corresponding to the karyotype of the Arlee line (2N = 64). The assembly is composed of 938 scaffolds with an N50 of 39.16 Mb and a total length of 2.33 Gb, of which ∼95% was in 32 chromosome sequences with only 438 gaps between contigs and scaffolds. In rainbow trout the haploid chromosome number can vary from 29 to 32. In the Arlee karyotype the haploid chromosome number is 32 because chromosomes Omy04, 14 and 25 are divided into six acrocentric chromosomes. Additional structural variations that were identified in the Arlee genome included the major inversions on chromosomes Omy05 and Omy20 and 15 additional smaller inversions that will require further validation. This is also the first rainbow trout genome assembly that includes a scaffold with the sex-determination gene (sdY) in the chromosome Y sequence. The utility of this genome assembly is demonstrated through the improved annotation of the duplicated genome loci that harbor the IGH genes on chromosomes Omy12 and Omy13.
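For readers unfamiliar with the N50 statistic quoted above, here is a minimal sketch of how it is commonly computed from a list of scaffold lengths. The function name and the toy lengths are assumptions for illustration and are not taken from the Arlee assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    together cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (arbitrary units).
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_lengths)
```

With the toy lengths, the total is 20 and the two longest scaffolds (6 and 5) already cover more than half of it, so the N50 is 5.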

De Novo Assembly and Characterization of the Xenocatantops brachycerus Transcriptome

International Journal of Molecular Sciences ◽

10.3390/ijms19020520 ◽

2018 ◽

Vol 19 (2) ◽

pp. 520 ◽

Cited By ~ 5

Author(s):

Le Zhao ◽

Xinmei Zhang ◽

Zhongying Qiu ◽

Yuan Huang

Keyword(s):

De Novo Assembly ◽

De Novo
