Target Capture and Massive Sequencing of Genes Transcribed in Mytilus galloprovincialis

BioMed Research International ◽

10.1155/2014/538549 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9 ◽

Cited By ~ 6

Author(s):

Umberto Rosani ◽

Stefania Domeneghetti ◽

Alberto Pallavicini ◽

Paola Venier

Keyword(s):

De Novo ◽

454 Sequencing ◽

PCR Amplification ◽

Gene Sequences ◽

Target Capture ◽

Reference Transcript ◽

Mediterranean Mussel ◽

Massive Sequencing ◽

Genomic Regions ◽

Next Generation Sequencing NGS

Next generation sequencing (NGS) allows fast and massive production of both genome and transcriptome sequence datasets. As the genome of the Mediterranean mussel Mytilus galloprovincialis is not available at present, we have explored the possibility of reducing the whole genome sequencing efforts by using capture probes coupled with PCR amplification and high-throughput 454-sequencing to enrich selected genomic regions. The enrichment of DNA target sequences was validated by real-time PCR, whereas the efficacy of the applied strategy was evaluated by mapping the 454-output reads against reference transcript data already available for M. galloprovincialis and by measuring coverage, SNPs, number of de novo sequenced introns, and complete gene sequences. Focusing on a target size of nearly 1.5 Mbp, we obtained a target coverage which allowed the identification of more than 250 complete introns, 10,741 SNPs, and also complete gene sequences. This study confirms the transcriptome-based enrichment of gDNA regions as a good strategy to expand knowledge on specific subsets of genes also in nonmodel organisms.

HAPHPIPE: Haplotype Reconstruction and Phylodynamics for Deep Sequencing of Intra-Host Viral Populations

Molecular Biology and Evolution ◽

10.1093/molbev/msaa315 ◽

2020 ◽

Author(s):

Matthew L Bendall ◽

Keylie M Gibson ◽

Margaret C Steiner ◽

Uzma Rentia ◽

Marcos Pérez-Losada ◽

...

Keyword(s):

Deep Sequencing ◽

De Novo ◽

Consensus Sequence ◽

Haplotype Reconstruction ◽

Consensus Sequences ◽

Genome Wide ◽

Genomic Regions ◽

Next Generation Sequencing NGS ◽

NGS Data ◽

Generation Sequencing

Deep sequencing of viral populations using next generation sequencing (NGS) offers opportunities to understand and investigate evolution, transmission dynamics, and population genetics. Currently, the standard practice for processing NGS data to study viral populations is to summarize all the observed sequences from a sample as a single consensus sequence, thus discarding valuable information about the intra-host viral molecular epidemiology. Furthermore, existing analytical pipelines may only analyze genomic regions involved in drug resistance and thus are not suited for full viral genome analysis. Here we present HAPHPIPE, a HAplotype and PHylodynamics PIPEline for genome-wide assembly of viral consensus sequences and haplotypes. The HAPHPIPE protocol includes modules for quality trimming, error correction, de novo assembly, alignment, and haplotype reconstruction. The resulting consensus sequences, haplotypes, and alignments can be further analyzed using a variety of phylogenetic and population genetic software. HAPHPIPE is designed to provide users with a single pipeline to rapidly analyze sequences from viral populations generated from NGS platforms and provide quality output properly formatted for downstream evolutionary analyses.

Rapid, multiplexed, whole genome and plasmid sequencing of foodborne pathogens using long-read nanopore technology

10.1101/558718 ◽

2019 ◽

Cited By ~ 2

Author(s):

Tonya L. Taylor ◽

Jeremy D. Volkening ◽

Eric DeJesus ◽

Mustafa Simmons ◽

Kiril M. Dimitrov ◽

...

Keyword(s):

Foodborne Pathogens ◽

De Novo ◽

United States Public Health ◽

Turnaround Time ◽

Public Health Agencies ◽

Antimicrobial Resistance Genes ◽

Analysis Workflow ◽

Long Read ◽

Genomic Regions ◽

Next Generation Sequencing NGS

United States public health agencies are focusing on next-generation sequencing (NGS) to quickly identify and characterize foodborne pathogens. Here, the MinION nanopore long-read sequencer was used to simultaneously sequence the entire chromosome and plasmids of Salmonella enterica subsp. enterica serovar Bareilly and Escherichia coli O157:H7. A rapid, random sequencing approach was developed, coupled with de novo genome assembly within a customized data analysis workflow that can resolve highly repetitive genomic regions. In sequencing runs as short as four hours, using nanopore data alone, full-length genomes were obtained with an average identity of 99.87% for Salmonella Bareilly and 99.89% for E. coli in comparison to the respective MiSeq references. These long-read assemblies provided information on serotype, virulence factors, and antimicrobial resistance genes. Using a custom-developed SNP-selection workflow, the potential of the nanopore-only assemblies (after only 30 minutes of sequencing) for rapid phylogenetic inference, with identical topology compared to the published dataset, was demonstrated. To achieve maximum quality assemblies, the developed bioinformatics workflow employed additional polishing steps to correct the systematic errors produced by the nanopore-only assemblies. Nanopore sequencing provided a shorter (10 hours library preparation and sequencing) turnaround time compared to other NGS technologies.

Publisher Correction: Germline de novo mutation clusters arise during oocyte aging in genomic regions with high double-strand-break incidence

Nature Genetics ◽

10.1038/s41588-021-00905-z ◽

2021 ◽

Author(s):

Jakob M. Goldmann ◽

Vladimir B. Seplyarskiy ◽

Wendy S. W. Wong ◽

Thierry Vilboux ◽

Pieter B. Neerincx ◽

...

Keyword(s):

De Novo ◽

Strand Break ◽

Double Strand Break ◽

De Novo Mutation ◽

Oocyte Aging ◽

Genomic Regions

A study of transposable element-associated structural variations (TASVs) using a de novo-assembled Korean genome

Experimental & Molecular Medicine ◽

10.1038/s12276-021-00586-y ◽

2021 ◽

Author(s):

Seyoung Mun ◽

Songmi Kim ◽

Wooseok Lee ◽

Keunsoo Kang ◽

Thomas J. Meyer ◽

...

Keyword(s):

Genome Sequencing ◽

Genome Assembly ◽

De Novo ◽

Personal Genome ◽

Human Populations ◽

Whole Genome ◽

Structural Variations ◽

Insert Size ◽

Human Genomes ◽

Next Generation Sequencing NGS

Advances in next-generation sequencing (NGS) technology have made personal genome sequencing possible, and indeed, many individual human genomes have now been sequenced. Comparisons of these individual genomes have revealed substantial genomic differences between human populations as well as between individuals from closely related ethnic groups. Transposable elements (TEs) are known to be one of the major sources of these variations and act through various mechanisms, including de novo insertion, insertion-mediated deletion, and TE–TE recombination-mediated deletion. In this study, we carried out de novo whole-genome sequencing of one Korean individual (KPGP9) via multiple insert-size libraries. The de novo whole-genome assembly resulted in 31,305 scaffolds with a scaffold N50 size of 13.23 Mb. Furthermore, through computational data analysis and experimental verification, we revealed that 182 TE-associated structural variation (TASV) insertions and 89 TASV deletions contributed 64,232 bp in sequence gain and 82,772 bp in sequence loss, respectively, in the KPGP9 genome relative to the hg19 reference genome. We also verified structural differences associated with TASVs by comparative analysis with TASVs in recent genomes (AK1 and TCGA genomes) and reported their details. Here, we constructed a new Korean de novo whole-genome assembly and provide the first study, to our knowledge, focused on the identification of TASVs in an individual Korean genome. Our findings again highlight the role of TEs as a major driver of structural variations in human individual genomes.
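
The scaffold N50 reported in this abstract is a length-weighted summary statistic: the largest length L such that scaffolds of length L or longer hold at least half of the assembled bases. The following is a minimal Python sketch of that definition only. The function name `n50` and the toy lengths are assumptions made for illustration; they are not the KPGP9 data and not the authors' code.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half of the
        # total defines the N50.
        if running >= half_total:
            return length


# Hypothetical scaffold lengths in bp (illustrative only, not KPGP9 data).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

Because the statistic depends only on the sorted lengths, a few very long scaffolds can raise N50 sharply even when most scaffolds are short, which is why it is usually reported together with the scaffold count.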

Long-Range PCR-Based NGS Applications to Diagnose Mendelian Retinal Diseases

International Journal of Molecular Sciences ◽

10.3390/ijms22041508 ◽

2021 ◽

Vol 22 (4) ◽

pp. 1508

Author(s):

Jordi Maggi ◽

Samuel Koller ◽

Luzy Bähr ◽

Silke Feil ◽

Fatma Kivrak Pfiffner ◽

...

Keyword(s):

Genetic Testing ◽

Long Range ◽

Copy Number Variants ◽

Retinal Disease ◽

Promoter Regions ◽

Long Range PCR ◽

Cost Efficient ◽

Nucleotide Resolution ◽

Genomic Regions ◽

Next Generation Sequencing NGS

The purpose of this study was to develop a flexible, cost-efficient, next-generation sequencing (NGS) protocol for genetic testing. Long-range polymerase chain reaction (PCR) amplicons of up to 20 kb in size were designed to amplify entire genomic regions for a panel (n = 35) of inherited retinal disease (IRD)-associated loci. Amplicons were pooled and sequenced by NGS. The analysis was applied to 227 probands diagnosed with IRD: (A) 108 previously molecularly diagnosed, (B) 94 without previous genetic testing, and (C) 25 undiagnosed after whole-exome sequencing (WES). The method was validated with 100% sensitivity on cohort A. Long-range PCR-based sequencing revealed likely causative variant(s) in 51% and 24% of probands from cohorts B and C, respectively. Breakpoints of 3 copy number variants (CNVs) could be characterized. Spiking in long-range PCR libraries extended the coverage of WES. Read phasing confirmed compound heterozygosity in 5 probands. The proposed sequencing protocol provided deep coverage of the entire gene, including intronic and promoter regions. Our method can be used (i) as a first-tier assay to reduce genetic testing costs, (ii) to elucidate missing heritability cases, (iii) to characterize breakpoints of CNVs at nucleotide resolution, (iv) to extend WES data to non-coding regions by spiking in long-range PCR libraries, and (v) to help with phasing of candidate variants.

Construction of individual ddRAD libraries v2

10.17504/protocols.io.bv4tn8wn ◽

2021 ◽

Author(s):

Claire Daguin Thiebaut ◽

Stephanie Ruault ◽

Charlotte Roby ◽

Thomas Broquet ◽

Frédérique Viard ◽

...

Keyword(s):

Magnetic Beads ◽

Sex Chromosome ◽

De Novo ◽

PCR Amplification ◽

Restriction Enzymes ◽

Tree Frog ◽

Size Selection ◽

Hyla Arborea ◽

Sequencing Method ◽

Sample Representation

This protocol describes a double-digest restriction-site associated DNA (ddRADseq) procedure, a variation on the original RAD sequencing method (Davey & Blaxter 2011) that is used for de novo SNP discovery and genotyping. This protocol differs from the original ddRADseq protocol (Peterson et al 2012), in which the samples are pooled just after the ligation to adaptors (i.e. before size selection and PCR). The present ddRAD protocol has been slightly adapted from Alan Brelsford's protocol published in the supplementary material of this paper: Brelsford, A., Dufresnes, C. & Perrin, N. 2016. High-density sex-specific linkage maps of a European tree frog (Hyla arborea) identify the sex chromosome without information on offspring sex. Heredity 116, 177–181 (2016). https://doi.org/10.1038/hdy.2015.83 In the present protocol, all samples are treated separately, in a microplate, until the final PCR amplification, which is performed before pooling. Although slightly more costly and time-consuming in the lab, this allows for fine adjustment of each sample's representation in the final library pool, ensuring a similar number of sequencing reads per sample in the final dataset. Briefly, genomic DNA from each sample is individually digested with 2 restriction enzymes (one rare cutter and one more frequent cutter), then ligated to a barcoded adaptor (among 24 available) at one side and a single adaptor at the other side, purified with magnetic beads, and PCR-amplified, allowing the addition of an Illumina index (among 12 available) for multiplexing a maximum of 288 samples per library. Samples are then pooled in equimolar conditions after visualisation on an agarose gel. Purification and size selection are then performed before final quality control of the library and sequencing.
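
The multiplexing limit stated in the protocol follows from combining the two tags: 24 barcoded adaptors times 12 Illumina indexes gives 288 unique pairs per library. The Python sketch below only illustrates that arithmetic. The labels (`BC01`, `IDX01`, and so on) and the function `assign_tags` are hypothetical names invented here; the protocol text does not name its adaptors or indexes, and this is not the authors' sample sheet.

```python
from itertools import product

N_BARCODED_ADAPTORS = 24  # stated in the protocol
N_ILLUMINA_INDEXES = 12   # stated in the protocol

# Hypothetical labels for illustration only.
ADAPTORS = [f"BC{i:02d}" for i in range(1, N_BARCODED_ADAPTORS + 1)]
INDEXES = [f"IDX{j:02d}" for j in range(1, N_ILLUMINA_INDEXES + 1)]


def assign_tags(sample_names):
    """Give each sample a unique (Illumina index, barcoded adaptor) pair."""
    if len(set(sample_names)) != len(sample_names):
        raise ValueError("sample names must be unique")
    combos = list(product(INDEXES, ADAPTORS))  # 12 * 24 = 288 pairs
    if len(sample_names) > len(combos):
        raise ValueError(
            f"at most {len(combos)} samples fit in one library, "
            f"got {len(sample_names)}"
        )
    return dict(zip(sample_names, combos))


# Toy sample names (assumed); a full library would hold up to 288.
tags = assign_tags([f"sample_{k:03d}" for k in range(1, 31)])
print(len(tags), tags["sample_025"])
```

Iterating indexes in the outer position means the first 24 samples share one Illumina index and differ only by adaptor, which mirrors filling one group of barcoded adaptors before moving to the next index; the opposite ordering would work equally well.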

Low-Bias RNA Sequencing of the HIV-2 Genome from Blood Plasma

Journal of Virology ◽

10.1128/jvi.00677-18 ◽

2018 ◽

Vol 93 (1) ◽

Cited By ~ 3

Author(s):

Katherine L. James ◽

Thushan I. de Silva ◽

Katherine Brown ◽

Hilton Whittle ◽

Stephen Taylor ◽

...

Keyword(s):

Genetic Diversity ◽

Blood Plasma ◽

De Novo ◽

A Priori ◽

PCR Amplification ◽

HIV Vaccine ◽

Whole Genome ◽

Plasma Samples ◽

Target Enrichment ◽

RNA Seq

Accurate determination of the genetic diversity present in the HIV quasispecies is critical for the development of a preventative vaccine: in particular, little is known about viral genetic diversity for the second type of HIV, HIV-2. A better understanding of HIV-2 biology is relevant to the HIV vaccine field because a substantial proportion of infected people experience long-term viral control, and prior HIV-2 infection has been associated with slower HIV-1 disease progression in coinfected subjects. The majority of traditional and next-generation sequencing methods have relied on target amplification prior to sequencing, introducing biases that may obscure the true signals of diversity in the viral population. Additionally, target enrichment through PCR requires a priori sequence knowledge, which is lacking for HIV-2. Therefore, a target-enrichment-free method of library preparation would be valuable for the field. We applied an RNA shotgun sequencing (RNA-Seq) method without PCR amplification to cultured viral stocks and patient plasma samples from HIV-2-infected individuals. Libraries generated from total plasma RNA were analyzed with a two-step pipeline: (i) de novo genome assembly, followed by (ii) read remapping. By this approach, whole-genome sequences were generated with a 28× to 67× mean depth of coverage. Assembled reads showed a low level of GC bias, and comparison of the genome diversities at the intrahost level showed low diversity in the accessory gene vpx in all patients. Our study demonstrates that RNA-Seq is a feasible full-genome de novo sequencing method for blood plasma samples collected from HIV-2-infected individuals. IMPORTANCE An accurate picture of viral genetic diversity is critical for the development of a globally effective HIV vaccine. However, sequencing strategies are often complicated by target enrichment prior to sequencing, introducing biases that can distort variant frequencies, which are not easily corrected for in downstream analyses. Additionally, detailed a priori sequence knowledge is needed to inform robust primer design when employing PCR amplification, a factor that is often lacking when working with tropical diseases localized in developing countries. Previous work has demonstrated that direct RNA shotgun sequencing (RNA-Seq) can be used to circumvent these issues for hepatitis C virus (HCV) and norovirus. We applied RNA-Seq to total RNA extracted from HIV-2 blood plasma samples, demonstrating the applicability of this technique to HIV-2 and allowing us to generate a dynamic picture of genetic diversity over the whole genome of HIV-2 in the context of low-bias sequencing.
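
The "mean depth of coverage" figures quoted in this abstract (28× to 67×) are, in the usual sense of the term, the number of aligned bases divided by the length of the genome. A minimal Python sketch of that ratio follows. The function name `mean_depth`, the read lengths, and the 10,000 nt genome length are all assumptions chosen for round numbers; none of them come from the study.

```python
def mean_depth(aligned_read_lengths, genome_length):
    """Mean depth of coverage = total aligned bases / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(aligned_read_lengths) / genome_length


# Hypothetical numbers: 2,800 aligned reads of 100 nt each on an
# assumed 10,000 nt genome.
toy_reads = [100] * 2800
print(f"{mean_depth(toy_reads, 10_000):.1f}x")
```

A mean says nothing about evenness: two libraries with the same mean depth can differ greatly in how uniformly reads cover the genome, which is why the abstract reports GC bias separately.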

De novo Genome Assembly from Next-Generation Sequencing (NGS) Reads

Next-Generation Sequencing Data Analysis ◽

10.1201/b19532-11 ◽

2016 ◽

pp. 144-155

Keyword(s):

Next Generation Sequencing ◽

Genome Assembly ◽

De Novo ◽

Next Generation ◽

De Novo Genome Assembly ◽

Next Generation Sequencing NGS ◽

Generation Sequencing

Optimizing de novo genome assembly from PCR-amplified metagenomes

PeerJ ◽

10.7717/peerj.6902 ◽

2019 ◽

Vol 7 ◽

pp. e6902 ◽

Cited By ~ 9

Author(s):

Simon Roux ◽

Gareth Trubl ◽

Danielle Goudeau ◽

Nandita Nath ◽

Estelle Couradeau ◽

...

Keyword(s):

Genome Assembly ◽

De Novo ◽

PCR Amplification ◽

Error Rates ◽

De Novo Genome Assembly ◽

Low Input ◽

Assembly Algorithm ◽

Coverage Bias ◽

Size Number ◽

Assembly Pipeline

Background. Metagenomics has transformed our understanding of microbial diversity across ecosystems, with recent advances enabling de novo assembly of genomes from metagenomes. These metagenome-assembled genomes are critical to provide ecological, evolutionary, and metabolic context for all the microbes and viruses yet to be cultivated. Metagenomes can now be generated from nanogram to subnanogram amounts of DNA. However, these libraries require several rounds of PCR amplification before sequencing, and recent data suggest these typically yield smaller and more fragmented assemblies than regular metagenomes. Methods. Here we evaluate de novo assembly methods of 169 PCR-amplified metagenomes, including 25 for which an unamplified counterpart is available, to optimize specific assembly approaches for PCR-amplified libraries. We first evaluated coverage bias by mapping reads from PCR-amplified metagenomes onto reference contigs obtained from unamplified metagenomes of the same samples. Then, we compared different assembly pipelines in terms of assembly size (number of bp in contigs ≥ 10 kb) and error rates to evaluate which are the best suited for PCR-amplified metagenomes. Results. Read mapping analyses revealed that the depth of coverage within individual genomes is significantly more uneven in PCR-amplified datasets versus unamplified metagenomes, with regions of high depth of coverage enriched in short inserts. This enrichment scales with the number of PCR cycles performed, and is presumably due to preferential amplification of short inserts. Standard assembly pipelines are confounded by this type of coverage unevenness, so we evaluated other assembly options to mitigate these issues. We found that a pipeline combining read deduplication and an assembly algorithm originally designed to recover genomes from libraries generated after whole genome amplification (single-cell SPAdes) frequently improved assembly of contigs ≥ 10 kb by 10 to 100-fold for low input metagenomes. Conclusions. PCR-amplified metagenomes have enabled scientists to explore communities traditionally challenging to describe, including some with extremely low biomass or from which DNA is particularly difficult to extract. Here we show that a modified assembly pipeline can lead to an improved de novo genome assembly from PCR-amplified datasets, and enables a better genome recovery from low input metagenomes.
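
Two quantities in this abstract lend themselves to a short illustration: read deduplication and the assembly-size metric (number of bp in contigs of at least 10 kb). The Python sketch below is a simplified stand-in, not the pipeline evaluated in the paper. It assumes exact-sequence matching as the duplicate criterion, which the abstract does not specify, and the function names `deduplicate_reads` and `assembly_size` are hypothetical.

```python
def deduplicate_reads(reads):
    """Drop exact duplicate read sequences, keeping first occurrences.

    Assumption: two reads are duplicates only if their sequences are
    identical. Real tools may use other criteria.
    """
    seen = set()
    unique = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique.append(read)
    return unique


def assembly_size(contig_lengths, min_length=10_000):
    """Total bp in contigs of at least `min_length` (default 10 kb)."""
    return sum(n for n in contig_lengths if n >= min_length)


# Toy inputs (illustrative only).
reads = ["ACGTAC", "ACGTAC", "TTGACA", "ACGTAC", "GGCATT"]
print(deduplicate_reads(reads))

contigs = [2_500, 9_999, 10_000, 48_200]
print(assembly_size(contigs))
```

Deduplication matters here because, as the abstract notes, PCR preferentially amplifies short inserts; collapsing identical reads removes some of that artificial depth before the assembler sees it.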

Optimizing de novo genome assembly from PCR-amplified metagenomes

10.7287/peerj.preprints.27453 ◽

2018 ◽

Author(s):

Simon Roux ◽

Gareth Trubl ◽

Danielle Goudeau ◽

Nandita Nath ◽

Estelle Couradeau ◽

...

Keyword(s):

Genome Assembly ◽

De Novo Assembly ◽

De Novo ◽

PCR Amplification ◽

Error Rates ◽

De Novo Genome Assembly ◽

Low Input ◽

Assembly Algorithm ◽

Coverage Bias ◽

Assembly Pipeline

Background. Metagenomics has transformed our understanding of microbial diversity across ecosystems, with recent advances enabling de novo assembly of genomes from metagenomes. These metagenome-assembled genomes are critical to provide ecological, evolutionary, and metabolic context for all the microbes and viruses yet to be cultivated. Metagenomes can now be generated from nanogram to subnanogram amounts of DNA. However, these libraries require several rounds of PCR amplification before sequencing, and recent data suggest these typically yield smaller and more fragmented assemblies than regular metagenomes. Methods. Here we evaluate de novo assembly methods of 169 PCR-amplified metagenomes, including 25 for which an unamplified counterpart is available, to optimize specific assembly approaches for PCR-amplified libraries. We first evaluated coverage bias by mapping reads from PCR-amplified metagenomes onto reference contigs obtained from unamplified metagenomes of the same samples. Then, we compared different assembly pipelines in terms of assembly size (number of bp in contigs ≥ 10 kb) and error rates to evaluate which are the best suited for PCR-amplified metagenomes. Results. Read mapping analyses revealed that the depth of coverage within individual genomes is significantly more uneven in PCR-amplified datasets versus unamplified metagenomes, with regions of high depth of coverage enriched in short inserts. This enrichment scales with the number of PCR cycles performed, and is presumably due to preferential amplification of short inserts. Standard assembly pipelines are confounded by this type of coverage unevenness, so we evaluated other assembly options to mitigate these issues. We found that a pipeline combining read deduplication and an assembly algorithm originally designed to recover genomes from libraries generated after whole genome amplification (single-cell SPAdes) frequently improved assembly of contigs ≥ 10 kb by 10 to 100-fold for low input metagenomes. Conclusions. PCR-amplified metagenomes have enabled scientists to explore communities traditionally challenging to describe, including some with extremely low biomass or from which DNA is particularly difficult to extract. Here we show that a modified assembly pipeline can lead to an improved de novo genome assembly from PCR-amplified datasets, and enables a better genome recovery from low input metagenomes.
