Defining Loci in Restriction-Based Reduced Representation Genomic Data from Nonmodel Species: Sources of Bias and Diagnostics for Optimal Clustering

BioMed Research International ◽

10.1155/2014/675158 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9 ◽

Cited By ~ 46

Author(s):

Daniel C. Ilut ◽

Marie L. Nydam ◽

Matthew P. Hare

Keyword(s):

Population Genomics ◽

De Novo ◽

Genomic Data ◽

Null Alleles ◽

Great Promise ◽

Sequence Difference ◽

Reduced Representation ◽

Ciona Savignyi ◽

Optimum Threshold ◽

Generation Sequencing

Next generation sequencing holds great promise for applications of phylogeography, landscape genetics, and population genomics in wild populations of nonmodel species, but the robustness of inferences hinges on careful experimental design and effective bioinformatic removal of predictable artifacts. Addressing this issue, we use published genomes from a tunicate, stickleback, and soybean to illustrate the potential for bioinformatic artifacts and introduce a protocol to minimize two sources of error expected from similarity-based de novo clustering of stacked reads: the splitting of alleles into different clusters, which creates false homozygosity, and the grouping of paralogs into the same cluster, which creates false heterozygosity. We present an empirical application focused on Ciona savignyi, a tunicate with very high SNP heterozygosity (~0.05), because high diversity challenges the computational efficiency of most existing nonmodel pipelines while also potentially exacerbating paralog artifacts. The simulated and empirical data illustrate the advantages of using higher sequence difference clustering thresholds than is typical and demonstrate the utility of our protocol for efficiently identifying an optimum threshold from data without prior knowledge of heterozygosity. The empirical Ciona savignyi data also highlight null alleles as a potentially large source of false homozygosity in restriction-based reduced representation genomic data.
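The two clustering errors named in this abstract can be illustrated with a toy sketch. The code below is not the authors' protocol: it assumes a simple greedy, seed-based clustering on Hamming distance, and the three 20-bp sequences (two alleles of one locus and one paralog) are hypothetical. It only shows how the number of clusters changes as the allowed sequence difference grows.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def cluster(seqs, max_diff):
    """Greedy clustering: each sequence joins the first cluster whose
    seed (first member) differs from it at no more than max_diff sites."""
    clusters = []
    for s in seqs:
        for c in clusters:
            if hamming(s, c[0]) <= max_diff:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters


# Hypothetical toy data, invented for illustration only.
allele_a = "ACGTACGTACGTACGTACGT"
allele_b = "ACGTACCTACGTACGAACGT"  # 2 differences from allele_a
paralog = "ATGTGCGTAAGTTCGTACCT"   # 5 differences from allele_a
reads = [allele_a, allele_b, paralog]

for max_diff in (1, 3, 6):
    print(max_diff, len(cluster(reads, max_diff)))
```

With these toy inputs, a threshold that is too strict separates the two alleles (which would look like two homozygous loci), an intermediate threshold groups the alleles while keeping the paralog apart, and a threshold that is too permissive merges all three (which would look like extra heterozygosity at one locus).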


Comparative Analysis of SNP Discovery and Genotyping in Fagus sylvatica L. and Quercus robur L. Using RADseq, GBS, and ddRAD Methods

Forests ◽

10.3390/f12020222 ◽

2021 ◽

Vol 12 (2) ◽

pp. 222

Author(s):

Bartosz Ulaszewski ◽

Joanna Meger ◽

Jaroslaw Burczyk

Keyword(s):

Population Genomics ◽

De Novo ◽

Genetic Studies ◽

Genomic Libraries ◽

Reduced Representation ◽

Large Numbers ◽

Broadleaved Tree Species ◽

Fagus Sylvatica L ◽

Reference Genomes ◽

Future Population

Next-generation sequencing of reduced representation genomic libraries (RRL) is capable of providing large numbers of genetic markers for population genetic studies at relatively low cost. However, one major concern with these types of markers is the precision of genotyping, which is related to the common problem of missing data and appears to be particularly important in association and genomic selection studies. We evaluated three RRL approaches (GBS, RADseq, ddRAD) and different SNP identification methods (de novo or based on a reference genome) to find the best solutions for future population genomics studies in two economically and ecologically important broadleaved tree species, namely F. sylvatica and Q. robur. We found that the use of the ddRAD method coupled with SNP calling based on reference genomes provided the largest numbers of markers (28 k and 36 k for beech and oak, respectively), given standard filtering criteria. Using technical replicates of samples, we demonstrated that more than 80% of SNP loci should be considered reliable markers in GBS and ddRAD, but not in RADseq data. According to the reference genomes’ annotations, more than 30% of the identified ddRAD loci appeared to be related to genes. Our findings provide solid support for using ddRAD-based SNPs for future population genomics studies in beech and oak.
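One common way to use technical replicates, sketched below, is to measure how often the same locus receives the same genotype call in both replicates. The abstract does not state the exact reliability criterion the authors applied, so the function name, the genotype encoding, and the toy calls here are all assumptions made for illustration.

```python
def replicate_concordance(rep1, rep2):
    """Fraction of loci genotyped in both replicates whose calls agree.

    rep1 and rep2 map locus name -> genotype string, or None when the
    genotype is missing. Loci missing in either replicate are skipped.
    """
    shared = [
        locus for locus in rep1
        if rep1[locus] is not None and rep2.get(locus) is not None
    ]
    if not shared:
        return float("nan")
    matches = sum(rep1[locus] == rep2[locus] for locus in shared)
    return matches / len(shared)


# Hypothetical genotype calls for one sample sequenced twice.
rep1 = {"locus1": "A/A", "locus2": "A/G", "locus3": "G/G",
        "locus4": None, "locus5": "C/T"}
rep2 = {"locus1": "A/A", "locus2": "A/A", "locus3": "G/G",
        "locus4": "T/T", "locus5": "C/T"}

print(replicate_concordance(rep1, rep2))
```

In this toy case, four loci are called in both replicates and three of them agree; the locus with a missing call is excluded rather than counted as a mismatch, which is a design choice and not something the abstract specifies.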


Genome-wide methylation sequencing identifies progression-related epigenetic drivers in myelodysplastic syndromes

Cell Death and Disease ◽

10.1038/s41419-020-03213-2 ◽

2020 ◽

Vol 11 (11) ◽

Author(s):

Jing-dong Zhou ◽

Ting-juan Zhang ◽

Zi-jun Xu ◽

Zhao-qun Deng ◽

Yu Gu ◽

...

Keyword(s):

Cancer Progression ◽

Myelodysplastic Syndromes ◽

Bisulfite Sequencing ◽

De Novo ◽

Dna Hypermethylation ◽

Reduced Representation ◽

Targeted Bisulfite Sequencing ◽

Specific Pcr ◽

Genome Wide ◽

Potential Biomarker

The potential mechanism of myelodysplastic syndromes (MDS) progressing to acute myeloid leukemia (AML) remains poorly elucidated. It has been shown that epigenetic alterations play crucial roles in the pathogenesis of cancer progression, including MDS. However, few studies have explored whole-genome methylation alterations during MDS progression. Reduced representation bisulfite sequencing was conducted in four paired MDS/secondary AML (MDS/sAML) patients to explore the underlying methylation-associated epigenetic drivers in MDS progression. In the four paired MDS/sAML patients, cases at the sAML stage exhibited significantly increased methylation levels compared with the matched MDS stage. A total of 1090 differentially methylated fragments (DMFs) (441 hypermethylated and 649 hypomethylated) were identified as involved in MDS pathogenesis, whereas 103 DMFs (96 hypermethylated and 7 hypomethylated) were involved in MDS progression. Targeted bisulfite sequencing further identified that aberrant GFRA1, IRX1, NPY, and ZNF300 methylation were frequent events in an additional group of de novo MDS and AML patients, of which only ZNF300 methylation was associated with ZNF300 expression. Subsequently, ZNF300 hypermethylation in larger cohorts of de novo MDS and AML patients was confirmed by real-time quantitative methylation-specific PCR. It was shown that ZNF300 methylation could act as a potential biomarker for diagnosis and prognosis in MDS and AML patients. Functional experiments demonstrated the anti-proliferative and pro-apoptotic role of ZNF300 overexpression in the MDS-derived AML cell line SKM-1. Collectively, genome-wide DNA hypermethylation was a frequent event during MDS progression. Among these changes, ZNF300 methylation, a regulator of ZNF300 expression, acted as an epigenetic driver in MDS progression. These findings provide a theoretical basis for the use of demethylation drugs in MDS patients against disease progression.


Next‐generation sequencing in two cases of de novo acute basophilic leukaemia

Journal of Cellular and Molecular Medicine ◽

10.1111/jcmm.16591 ◽

2021 ◽

Author(s):

Takuya Shimizu ◽

Tadakazu Kondo ◽

Yasuhito Nannya ◽

Mizuki Watanabe ◽

Toshio Kitawaki ◽

...

Keyword(s):

Next Generation Sequencing ◽

De Novo ◽

Next Generation ◽

Generation Sequencing


Prospective enzymes for omega-3 PUFA biosynthesis found in endoparasitic classes within the phylum Platyhelminthes

Journal of Helminthology ◽

10.1017/s0022149x20000954 ◽

2020 ◽

Vol 94 ◽

Author(s):

D. Babaran ◽

M.T. Arts ◽

R.J. Botelho ◽

S.A. Locke ◽

J. Koprivnikar

Keyword(s):

Parasite Species ◽

De Novo ◽

Genomic Data ◽

Wide Distribution ◽

Omega 3 ◽

Free Living ◽

Chain Pufa ◽

Pufa Biosynthesis ◽

Aquatic Food ◽

Long Chain

The free-living infectious stages of macroparasites, specifically, the cercariae of trematodes (flatworms), are likely to be significant (albeit underappreciated) vectors of nutritionally important polyunsaturated fatty acids (PUFA) to consumers within aquatic food webs, and other macroparasites could serve similar roles. In the context of de novo omega-3 (n-3) PUFA biosynthesis, it was thought that most animals lack the fatty acid (FA) desaturase enzymes that convert stearic acid (18:0) into α-linolenic acid (ALA; 18:3n-3), the main FA precursor for n-3 long-chain PUFA. Recently, novel sequences of these enzymes were recovered from 80 species from six invertebrate phyla, with experimental confirmation of gene function in five phyla. Given this wide distribution, and the unusual attributes of flatworm genomes, we conducted an additional search for genes for de novo n-3 PUFA in the phylum Platyhelminthes. Searches with experimentally confirmed sequences from Rotifera recovered nine relevant FA desaturase sequences from eight species in four genera in the two exclusively endoparasite classes (Trematoda and Cestoda). These results could indicate adaptations of these particular parasite species, or may reflect the uneven taxonomic coverage of sequence databases. Although additional genomic data and, particularly, experimental study of gene functionality are important future validation steps, our results indicate endoparasitic platyhelminths may have enzymes for de novo n-3 PUFA biosynthesis, thereby contributing to global PUFA production, but also representing a potential target for clinical antihelmintic applications.


Molecular marker information from de novo assembled transcriptomes of chilli pepper (Capsicum annuum L.) varieties based on next-generation sequencing technology

Plant Genetic Resources ◽

10.1017/s147926211400032x ◽

2014 ◽

Vol 12 (S1) ◽

pp. S83-S86 ◽

Cited By ~ 1

Author(s):

Yul-Kyun Ahn ◽

Swati Tripathi ◽

Young-Il Cho ◽

Jeong-Ho Kim ◽

Hye-Eun Lee ◽

...

Keyword(s):

Molecular Markers ◽

Next Generation Sequencing ◽

De Novo ◽

Transcriptome Assembly ◽

Sequence Variant ◽

Nucleotide Polymorphisms ◽

Next Generation ◽

Chilli Pepper ◽

Next Generation Sequencing Technology ◽

Generation Sequencing

Next-generation sequencing is known to be a useful tool for de novo transcriptome assembly, functional annotation of genes and identification of molecular markers. This study was carried out to mine molecular markers from de novo assembled transcriptomes of four chilli pepper varieties, the highly pungent ‘Saengryeg 211’ and non-pungent ‘Saengryeg 213’ and variably pigmented ‘Mandarin’ and ‘Blackcluster’. Pyrosequencing of the complementary DNA library resulted in 361,671, 274,269, 279,221, and 316,357 raw reads, which were assembled into 23,607, 19,894, 18,340 and 20,357 contigs, for the four varieties, respectively. Detailed sequence variant analysis identified numerous potential single-nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) for all the varieties, for which primers were designed. The transcriptome information and SNP/SSR markers generated in this study provide valuable resources for high-density molecular genetic mapping in chilli pepper and quantitative trait loci analysis related to fruit qualities. These markers for pepper will be highly valuable for marker-assisted breeding and other genetic studies.


Bridging the Gap between Vertebrate Cytogenetics and Genomics with Single-Chromosome Sequencing (ChromSeq)

Genes ◽

10.3390/genes12010124 ◽

2021 ◽

Vol 12 (1) ◽

pp. 124

Author(s):

Alessio Iannucci ◽

Alexey I. Makunin ◽

Artem P. Lisachov ◽

Claudio Ciofi ◽

Roscoe Stanyon ◽

...

Keyword(s):

Genome Evolution ◽

Karyotype Evolution ◽

Genomic Data ◽

Anolis Carolinensis ◽

Vertebrate Genome ◽

Single Chromosome ◽

Sequencing Technologies ◽

Novel Approaches ◽

Genome Assemblies ◽

Generation Sequencing

The study of vertebrate genome evolution is currently facing a revolution, brought about by next generation sequencing technologies that allow researchers to produce nearly complete and error-free genome assemblies. Novel approaches, however, do not always provide a direct link with information on vertebrate genome evolution gained from cytogenetic approaches. It is useful to preserve and link cytogenetic data with novel genomic discoveries. Sequencing of DNA from single isolated chromosomes (ChromSeq) is an elegant approach to determine the chromosome content and assign genome assemblies to chromosomes, thus bridging the gap between cytogenetics and genomics. The aim of this paper is to describe how ChromSeq can support the study of vertebrate genome evolution and how it can help link cytogenetic and genomic data. We show key examples of ChromSeq application in the refinement of vertebrate genome assemblies and in the study of vertebrate chromosome and karyotype evolution. We also provide a general overview of the approach and a concrete example of genome refinement using this method in the species Anolis carolinensis.


A gene-by-gene population genomics platform: de novo assembly, annotation and genealogical analysis of 108 representative Neisseria meningitidis genomes

BMC Genomics ◽

10.1186/1471-2164-15-1138 ◽

2014 ◽

Vol 15 (1) ◽

pp. 1138 ◽

Cited By ~ 112

Author(s):

Holly B Bratcher ◽

Craig Corton ◽

Keith A Jolley ◽

Julian Parkhill ◽

Martin CJ Maiden

Keyword(s):

Neisseria Meningitidis ◽

De Novo Assembly ◽

Population Genomics ◽

De Novo ◽

Genealogical Analysis


De novo transcriptome based on next-generation sequencing reveals candidate genes with sex-specific expression in Arapaima gigas (Schinz, 1822), an ancient Amazonian freshwater fish

PLoS ONE ◽

10.1371/journal.pone.0206379 ◽

2018 ◽

Vol 13 (10) ◽

pp. e0206379 ◽

Cited By ~ 1

Author(s):

Luciana Watanabe ◽

Fátima Gomes ◽

João Vianez ◽

Márcio Nunes ◽

Jedson Cardoso ◽

...

Keyword(s):

Next Generation Sequencing ◽

Freshwater Fish ◽

Candidate Genes ◽

De Novo ◽

Next Generation ◽

De Novo Transcriptome ◽

Specific Expression ◽

Arapaima Gigas ◽

Generation Sequencing


De novo Genome Assembly from Next-Generation Sequencing (NGS) Reads

Next-Generation Sequencing Data Analysis ◽

10.1201/b19532-11 ◽

2016 ◽

pp. 144-155

Keyword(s):

Next Generation Sequencing ◽

Genome Assembly ◽

De Novo ◽

Next Generation ◽

De Novo Genome Assembly ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing


dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms

10.7287/peerj.preprints.314 ◽

2014 ◽

Author(s):

Jonathan Puritz ◽

Christopher M. Hollenbeck ◽

John R. Gold

Keyword(s):

Population Genomics ◽

De Novo ◽

Variant Calling ◽

Population Level ◽

Model Organisms ◽

Effective Population ◽

Reduction Techniques ◽

Indel Polymorphisms ◽

Indel Calling ◽

Population Sizes

Restriction-site associated DNA sequencing (RADseq) has become a powerful and useful approach for population genomics. Currently, no software exists that utilizes both paired-end reads from RADseq data to efficiently produce population-informative variant calls, especially for organisms with large effective population sizes and high levels of genetic polymorphism but for which no genomic resources exist. dDocent is an analysis pipeline with a user-friendly, command-line interface designed to process individually barcoded RADseq data (with double cut sites) into informative SNPs/Indels for population-level analyses. The pipeline, written in BASH, uses data reduction techniques and other stand-alone software packages to perform quality trimming and adapter removal, de novo assembly of RAD loci, read mapping, SNP and Indel calling, and baseline data filtering. Double-digest RAD data from population pairings of three different marine fishes were used to compare dDocent with Stacks, the first generally available, widely used pipeline for analysis of RADseq data. dDocent consistently identified more SNPs shared across greater numbers of individuals and with higher levels of coverage. This is most likely due to the fact that dDocent quality trims instead of filtering and incorporates both forward and reverse reads in assembly, mapping, and SNP calling, thus enabling use of reads with Indel polymorphisms. The pipeline and a comprehensive user guide can be found at (http://dDocent.wordpress.com).
