Characterization of genetic variation and yield heterosis in Cucumis melo

Mapping Intimacies ◽

10.32747/2016.7600047.bard ◽

2016 ◽

Author(s):

Amit Gur ◽

Edward Buckler ◽

Joseph Burger ◽

Yaakov Tadmor ◽

Iftach Klapp

Keyword(s):

Single Molecule ◽

Genomic Research ◽

F1 Hybrids ◽

Whole Genome ◽

Single Molecule Sequencing ◽

Short Read ◽

Yield Variation ◽

Genome Wide ◽

Project Objectives

Project objectives: 1) Characterization of variation for yield heterosis in melon using a half-diallel (HDA) design. 2) Development and implementation of image-based yield phenotyping in melon. 3) Characterization of genetic, epigenetic and transcriptional variation across 25 founder lines and selected hybrids. The epigenetic part of this objective was modified during the course of the project: instead of characterizing chromatin structure in a single melon line through genome-wide mapping of nucleosomes using the MNase-seq approach, we took advantage of rapid advancements in single-molecule sequencing and shifted the focus to Nanopore long-read sequencing of all 25 founder lines. This analysis provides invaluable information on genome-wide structural variation across our diversity panel. 4) Integrated analyses and development of prediction models. Agricultural heterosis refers to hybrids that outperform their inbred parents for yield. First-generation (F1) hybrids are produced in many crop species, and it is estimated that heterosis increases yield by 15-30% globally. Melon (Cucumis melo) is an economically important species of the Cucurbitaceae family and is among the most important fleshy fruits for fresh consumption worldwide. The major goal of this project was to explore the patterns and magnitude of yield heterosis in melon and link them to whole-genome sequence variation. A core subset of 25 diverse lines was selected from the Newe-Yaar melon diversity panel for whole-genome re-sequencing (WGS) and test-crosses, to produce a structured half-diallel design of 300 F1 hybrids (MelHDA25). Yield variation was measured in replicated yield trials at the whole-plant and rootstock levels (through common-scion grafted experiments), across the F1s and parental lines.
As part of this project we also developed an algorithmic pipeline for detection and yield estimation of melons from aerial images, towards future implementation of such a high-throughput, cost-effective method for remote yield evaluation in open-field melons. We found extensive, highly heritable root-derived yield variation across the diallel population that was characterized by prominent best-parent heterosis (BPH), where hybrid rootstocks outperformed their parents by 38% and 56% under optimal irrigation and drought stress, respectively. Through integration of the genotypic data (~4,000,000 SNPs) and yield analyses we show that root-derived hybrid yield is independent of parental genetic distance. However, we mapped novel root-derived yield QTLs through genome-wide association (GWA) analysis, and a multi-QTL model explained more than 45% of the hybrid yield variation, providing a potential route for marker-assisted hybrid rootstock breeding. Four selected hybrid rootstocks are being further studied under multiple scion varieties, and their validated positive effect on yield performance is now leading to ongoing evaluation of their commercial potential. On the genomic level, this project resulted in 3 layers of data: 1) Whole-genome short-read Illumina sequencing (30X) of the 25 founder lines provided us with 25 genome alignments and a high-density melon HapMap that has already been shown to be an effective resource for QTL annotation and candidate gene analysis in melon. 2) Rapid advances in long-read single-molecule sequencing allowed us to shift focus towards this technology and generate ~50X Nanopore sequencing of the 25 founders, which in combination with the short-read data now enables de novo assembly of the 25 genomes that will soon lead to construction of the first melon pan-genome.
3) Transcriptomic (3' RNA-Seq) analysis of several selected hybrids and their parents provides preliminary information on differentially expressed genes that can be further used to explain the root-derived yield variation. Taken together, this project expanded our view of yield heterosis in melon with novel, specific insights into root-derived yield heterosis. To our knowledge, this is thus far the largest systematic genetic analysis of rootstock effects on yield heterosis in cucurbits or any other crop plant, and our results are now being translated into potential breeding applications. The genomic resources that were developed as part of this project are putting melon at the forefront of genomic research and will continue to be a useful tool for the cucurbits community in years to come.
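
The half-diallel size and the best-parent heterosis figures quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the project's analysis code: the function names, the `line_XX` labels and the yield values are hypothetical, and BPH is taken in its usual sense as the percentage by which a hybrid exceeds its better parent.

```python
from itertools import combinations


def half_diallel_crosses(parents):
    """Return all unordered parent pairs (no selfs, no reciprocal crosses)."""
    return list(combinations(parents, 2))


def best_parent_heterosis(f1_yield, parent_a_yield, parent_b_yield):
    """Percent by which the hybrid yield exceeds the better of its two parents."""
    best = max(parent_a_yield, parent_b_yield)
    return 100.0 * (f1_yield - best) / best


# 25 founder lines (hypothetical labels) give 25 * 24 / 2 crosses.
founders = [f"line_{i:02d}" for i in range(1, 26)]
crosses = half_diallel_crosses(founders)
print(len(crosses))

# Hypothetical yields (kg per plant), chosen only to illustrate the formula.
print(best_parent_heterosis(f1_yield=6.9, parent_a_yield=5.0, parent_b_yield=4.2))
```

With 25 founders the design contains 300 crosses, matching the MelHDA25 population size stated in the abstract.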


Assembly of Long Error-Prone Reads Using Repeat Graphs

10.1101/247148 ◽

2018 ◽

Cited By ~ 25

Author(s):

Mikhail Kolmogorov ◽

Jeffrey Yuan ◽

Yu Lin ◽

Pavel A. Pevzner

Keyword(s):

Single Molecule ◽

State Of The Art ◽

De Bruijn Graph ◽

Single Molecule Sequencing ◽

Short Read ◽

Initial Stage ◽

A Genome ◽

De Bruijn ◽

Initial Assembly

The problem of genome assembly is ultimately linked to the problem of the characterization of all repeat families in a genome as a repeat graph. The key reason the de Bruijn graph emerged as a popular short-read assembly approach is that it offered an elegant representation of all repeats in a genome that reveals their mosaic structure. However, most algorithms for assembling long error-prone reads use an alternative overlap-layout-consensus (OLC) approach that does not provide a repeat characterization. We present the Flye algorithm for constructing the A-Bruijn (assembly) graph from long error-prone reads, which, in contrast to the k-mer-based de Bruijn graph, assembles genomes using an alignment-based A-Bruijn graph. Unlike existing assemblers, Flye does not attempt to construct accurate contigs (at least at the initial assembly stage) but instead simply generates arbitrary paths in the (unknown) assembly graph and then constructs an assembly graph from these paths. Counter-intuitively, this fast but seemingly reckless approach results in the same graph as the assembly graph constructed from accurate contigs. Flye constructs (overlapping) contigs with possible assembly errors at the initial stage, combines them into an accurate assembly graph, resolves repeats in the assembly graph using small variations between various repeat instances that were left unresolved during the initial assembly stage, constructs a new, less tangled assembly graph based on resolved repeats, and finally outputs accurate contigs as paths in this graph. We benchmark Flye against several state-of-the-art Single Molecule Sequencing assemblers and demonstrate that it generates better or comparable assemblies for all analyzed datasets.
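
For readers unfamiliar with the k-mer-based de Bruijn graph that the abstract contrasts with Flye's alignment-based A-Bruijn graph, a toy construction may help. This is a sketch of the classic short-read structure only, not of Flye itself; it assumes error-free reads, and the function name and the two reads are invented for illustration.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer contributes one edge: prefix (k-1)-mer -> suffix (k-1)-mer.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Toy reads (hypothetical); "ACG" occurs twice and acts as a short repeat.
reads = ["ACGTACGA", "GTACGATT"]
graph = de_bruijn_graph(reads, k=4)

# Nodes with more than one successor mark places where a repeat forks the graph.
branching = {node for node, successors in graph.items() if len(successors) > 1}
print(branching)
```

Both copies of the repeated `ACG` collapse into a single node with two outgoing edges, which is the sense in which the graph exposes repeat structure. Exact k-mer matching of this kind breaks down on long error-prone reads, which is the motivation the abstract gives for an alignment-based graph.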


Mapping and phasing of structural variation in patient genomes using nanopore sequencing

10.1101/129379 ◽

2017 ◽

Cited By ~ 4

Author(s):

Mircea Cretu Stancu ◽

Markus J. van Roosmalen ◽

Ivo Renkens ◽

Marleen Nieboer ◽

Sjors Middelkamp ◽

...

Keyword(s):

Single Molecule ◽

De Novo ◽

Structural Variants ◽

Human Genetic Disease ◽

Structural Genomic ◽

Short Read ◽

Sequencing Technologies ◽

Genome Wide ◽

Long Read ◽

Complex Structural

Structural genomic variants form a common type of genetic alteration underlying human genetic disease and phenotypic variation. Despite major improvements in genome sequencing technology and data analysis, the detection of structural variants still poses challenges, particularly when variants are of high complexity. Emerging long-read single-molecule sequencing technologies provide new opportunities for detection of structural variants. Here, we demonstrate sequencing of the genomes of two patients with congenital abnormalities using the ONT MinION at 11x and 16x mean coverage, respectively. We developed a bioinformatic pipeline - NanoSV - to efficiently map genomic structural variants (SVs) from the long-read data. We demonstrate that the nanopore data are superior to corresponding short-read data with regard to detection of de novo rearrangements originating from complex chromothripsis events in the patients. Additionally, genome-wide surveillance of SVs revealed 3,253 (33%) novel variants that were missed in short-read data of the same sample, the majority of which are duplications < 200 bp in size. Long sequencing reads enabled efficient phasing of genetic variations, allowing the construction of genome-wide maps of phased SVs and SNVs. We employed read-based phasing to show that all de novo chromothripsis breakpoints occurred on paternal chromosomes, and we resolved the long-range structure of the chromothripsis. This work demonstrates the value of long-read sequencing for screening whole genomes of patients for complex structural variants.


Genome-Wide Systematic Characterization of the NPF Family Genes and Their Transcriptional Responses to Multiple Nutrient Stresses in Allotetraploid Rapeseed

International Journal of Molecular Sciences ◽

10.3390/ijms21175947 ◽

2020 ◽

Vol 21 (17) ◽

pp. 5947 ◽

Cited By ~ 1

Author(s):

Hao Zhang ◽

Shuang Li ◽

Mengyao Shi ◽

Sheliang Wang ◽

Lei Shi ◽

...

Keyword(s):

N Uptake ◽

Expression Patterns ◽

Regulatory Elements ◽

Peptide Transporter ◽

Future Research ◽

Whole Genome ◽

Transcriptional Responses ◽

Ammonium Toxicity ◽

Genome Wide

NITRATE TRANSPORTER 1 (NRT1)/PEPTIDE TRANSPORTER (PTR) family (NPF) proteins can transport various substrates and play crucial roles in governing plant nitrogen (N) uptake and distribution. However, little is known about the NPF genes in Brassica napus. Here, a comprehensive genome-wide systematic characterization of the NPF family led to the identification of 193 NPF genes in the whole genome of B. napus. The BnaNPF family exhibited high levels of genetic diversity among subfamilies but was conserved within each subfamily. Whole-genome duplication and segmental duplication played a major role in BnaNPF evolution. The expression analysis indicated that individual genes showed a broad range of expression patterns in response to multiple nutrient stresses, including N, phosphorus (P) and potassium (K) deficiencies, as well as ammonium toxicity. Furthermore, 10 core BnaNPF genes responsive to N stress were identified. These genes contained 6–13 transmembrane domains, were located in the plasma membrane, and responded differently to N deficiency in different tissues. Robust cis-regulatory elements were identified within the promoter regions of the core genes. Taken together, our results suggest that BnaNPFs are versatile transporters that might evolve new functions in B. napus. Our findings benefit future research on this gene family.


Abstract 4897: Genome-wide structural variation analysis and characterization of Hepatitis B Virus integration sites in hepatocellular carcinomas using Bionano Genomics whole genome imaging

10.1158/1538-7445.am2020-4897 ◽

2020 ◽

Author(s):

Karl Hong ◽

Camille Péneau ◽

Quentin Bayard ◽

Sandrine Imbeaud ◽

Andy Pang ◽

...

Keyword(s):

Hepatitis B Virus ◽

Structural Variation ◽

Whole Genome ◽

Variation Analysis ◽

Hepatocellular Carcinomas ◽

Genome Wide ◽

B Virus ◽

Integration Sites ◽

Virus Integration


DNA Analysis by Restriction Enzyme (DARE) enables concurrent genomic and epigenomic characterization of single cells

Nucleic Acids Research ◽

10.1093/nar/gkz717 ◽

2019 ◽

Vol 47 (19) ◽

pp. e122-e122

Author(s):

Ramya Viswanathan ◽

Elsie Cheruba ◽

Lih Feng Cheow

Keyword(s):

Dna Methylation ◽

Restriction Enzyme ◽

Copy Number ◽

Dna Analysis ◽

Whole Genome Amplification ◽

Single Cells ◽

Whole Genome ◽

Copy Number Alterations ◽

Genome Wide

Genome-wide profiling of copy number alterations and DNA methylation in single cells could enable detailed investigation into the genomic and epigenomic heterogeneity of complex cell populations. However, current methods to do this require complex sample processing and cleanup steps, lack consistency, or are biased in their genomic representation. Here, we describe a novel single-tube enzymatic method, DNA Analysis by Restriction Enzyme (DARE), to perform deterministic whole genome amplification while preserving DNA methylation information. This method was evaluated on low amounts of DNA and single cells, and provides accurate copy number aberration calling and representative DNA methylation measurement across the whole genome. Single-cell DARE is an attractive and scalable approach for concurrent genomic and epigenomic characterization of cells in a heterogeneous population.


298. Characterization of an AAV Capsid Library Using PacBio CCS Single Molecule Sequencing

Molecular Therapy ◽

10.1016/s1525-0016(16)35311-4 ◽

2014 ◽

Vol 22 ◽

pp. S115

Keyword(s):

Single Molecule ◽

Single Molecule Sequencing


NanoBLASTer: Fast alignment and characterization of Oxford Nanopore single molecule sequencing reads

2016 IEEE 6th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS) ◽

10.1109/iccabs.2016.7802776 ◽

2016 ◽

Cited By ~ 3

Author(s):

Mohammad Ruhul Amin ◽

Steven Skiena ◽

Michael C. Schatz

Keyword(s):

Single Molecule ◽

Single Molecule Sequencing ◽

Oxford Nanopore


Contig annotation tool CAT robustly classifies assembled metagenomic contigs and long sequences

10.1101/072868 ◽

2016 ◽

Cited By ~ 13

Author(s):

Diego D. Cambuy ◽

Felipe H. Coutinho ◽

Bas E. Dutilh

Keyword(s):

Single Molecule ◽

Dna Sequences ◽

Taxonomic Classification ◽

Annotation Tool ◽

Single Molecule Sequencing ◽

Short Read ◽

Long Read ◽

Micro Organisms ◽

Taxonomic Annotation

In modern-day metagenomics, there is an increasing need for robust taxonomic annotation of long DNA sequences from unknown micro-organisms. Long metagenomic sequences may be derived from assembly of short-read metagenomes, or from long-read single molecule sequencing. Here we introduce CAT, a pipeline for robust taxonomic classification of long DNA sequences. We show that CAT correctly classifies contigs at different taxonomic levels, even in simulated metagenomic datasets that are very distantly related to the sequences in the database. CAT is implemented in Python and the required scripts can be freely downloaded from GitHub.


Massively multiplex single-molecule oligonucleosome footprinting

eLife ◽

10.7554/elife.59404 ◽

2020 ◽

Vol 9 ◽

Author(s):

Nour J Abdulhay ◽

Colin P McNally ◽

Laura J Hsieh ◽

Sivakanthan Kasinathan ◽

Aidan Keith ◽

...

Keyword(s):

Single Molecule ◽

Nucleosome Occupancy ◽

Proof Of Concept ◽

Binding Motifs ◽

Single Molecule Sequencing ◽

Nucleosome Organization ◽

Genome Wide ◽

Sequencing Method ◽

Transcription Factor Binding Motifs

Our understanding of the beads-on-a-string arrangement of nucleosomes has been built largely on high-resolution sequence-agnostic imaging methods and sequence-resolved bulk biochemical techniques. To bridge the divide between these approaches, we present the single-molecule adenine methylated oligonucleosome sequencing assay (SAMOSA). SAMOSA is a high-throughput single-molecule sequencing method that combines adenine methyltransferase footprinting and single-molecule real-time DNA sequencing to natively and nondestructively measure nucleosome positions on individual chromatin fibres. SAMOSA data allow unbiased classification of single-molecular 'states' of nucleosome occupancy on individual chromatin fibres. We leverage this to estimate nucleosome regularity and spacing on single chromatin fibres genome-wide, at predicted transcription factor binding motifs, and across human epigenomic domains. Our analyses suggest that chromatin is composed of both regular and irregular single-molecular oligonucleosome patterns that differ subtly in their relative abundance across epigenomic domains. This irregularity is particularly striking in constitutive heterochromatin, which has typically been viewed as a conformationally static entity. Our proof-of-concept study provides a powerful new methodology for studying nucleosome organization at a previously intractable resolution and offers up new avenues for modeling and visualizing higher order chromatin structure.


Critical assessment of bioinformatics methods for the characterization of pathological repeat expansions with single-molecule sequencing data

Briefings in Bioinformatics ◽

10.1093/bib/bbz099 ◽

2019 ◽

Vol 21 (6) ◽

pp. 1971-1986 ◽

Cited By ~ 1

Author(s):

Matteo Chiara ◽

Federico Zambelli ◽

Ernesto Picardi ◽

David S Horner ◽

Graziano Pesole

Keyword(s):

Single Molecule ◽

Tandem Repeats ◽

Simulated Data ◽

Detailed Comparison ◽

Sequencing Data ◽

Single Molecule Sequencing ◽

Sequencing Technologies ◽

Repeat Expansions

A number of studies have reported the successful application of single-molecule sequencing technologies to the determination of the size and sequence of pathological expanded microsatellite repeats over the last 5 years. However, different custom bioinformatics pipelines were employed in each study, preventing meaningful comparisons and somewhat limiting the reproducibility of the results. In this review, we provide a brief summary of state-of-the-art methods for the characterization of expanded repeat alleles, along with a detailed comparison of bioinformatics tools for the determination of repeat length and sequence, using both real and simulated data. Our reanalysis of publicly available human genome sequencing data suggests a modest, but statistically significant, increase in the error rate of single-molecule sequencing technologies at genomic regions containing short tandem repeats. However, we observe that all the methods tested here, irrespective of the strategy used for the analysis of the data (based on either the alignment or the assembly of the reads), show high levels of sensitivity in both the detection of expanded tandem repeats and the estimation of the expansion size, suggesting that approaches based on single-molecule sequencing technologies are highly effective for the detection and quantification of tandem repeat expansions and contractions.
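
The core quantity these tools estimate, the number of consecutive copies of a repeat unit in a read, can be illustrated with a deliberately naive sketch. This is not one of the methods compared in the review: it uses exact string matching, so a single sequencing error splits a run, whereas the reviewed tools rely on alignment or assembly. The function name and the example read are hypothetical.

```python
import re


def longest_repeat_run(read, unit):
    """Return the largest number of back-to-back exact copies of `unit` in `read`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(match.group(0)) // len(unit) for match in pattern.finditer(read)]
    return max(runs, default=0)


# Hypothetical read: 7 CAG copies, one interrupting CCG, then 3 more CAG copies.
read = "TTGA" + "CAG" * 7 + "CCG" + "CAG" * 3 + "AATC"
print(longest_repeat_run(read, "CAG"))
```

The interrupting `CCG` shows why exact matching underestimates expansion size on noisy reads: the two runs are counted separately rather than as one interrupted allele.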
