Genetic structure of the American ginseng (Panax quinquefolius L.) in Eastern Canada using reduced-representation high-throughput sequencing

Simon Joly; Annie Archambault; Stéphanie Pellerin; Andrée Nault

doi:10.1139/cjb-2016-0144

Botany ◽

10.1139/cjb-2016-0144 ◽

2017 ◽

Vol 95 (4) ◽

pp. 429-434 ◽

Cited By ~ 3

Author(s):

Simon Joly ◽

Annie Archambault ◽

Stéphanie Pellerin ◽

Andrée Nault

Keyword(s):

Genetic Variation ◽

Genetic Structure ◽

High Throughput ◽

High Throughput Sequencing ◽

Natural Populations ◽

American Ginseng ◽

Panax Quinquefolius ◽

Eastern Canada ◽

Reduced Representation ◽

Wide Range

The American ginseng (Panax quinquefolius L.) has been used for a wide range of medicinal purposes for more than 300 years, and is at risk in most of its range because of harvesting in natural populations, herbivory, and habitat loss. Its genetic structure is largely unknown in the previously glaciated areas of Eastern Canada, although such information could usefully inform restoration strategies. We generated and analysed data from a reduced-representation high-throughput sequencing approach with a BAMOVA population model to partition the genetic variation within and among six natural populations of American ginseng in Eastern Canada. We found that an important and significant fraction of the genetic variation was structured among populations ([Formula: see text] = 42%; FST = 34%) at the geographical scale of the study (<250 km). No clear evidence of isolation-by-distance was observed. This important genetic structure observed among American ginseng populations from a region that was covered by ice during the last glaciations is similar to what had been found in previous studies on southern populations or throughout the species range.
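
For readers unfamiliar with the FST statistic quoted in this abstract, the following is a minimal sketch of the textbook Wright definition, FST = (HT - HS) / HT, for a single biallelic locus. It is not the BAMOVA model used in the study; the function name `fst_biallelic`, the equal weighting of populations, and the allele frequencies in the usage note are all hypothetical choices made for illustration.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic locus.

    freqs: per-population frequencies of one allele (hypothetical input),
    with every population weighted equally (an assumption of this sketch).
    """
    if not freqs:
        raise ValueError("need at least one population")
    if any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")

    n = len(freqs)
    p_bar = sum(freqs) / n

    # Expected heterozygosity of the pooled (total) population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within populations.
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / n

    if h_t == 0.0:
        # The locus is monomorphic everywhere: no variation to partition.
        return 0.0
    return (h_t - h_s) / h_t
```

With these made-up inputs, `fst_biallelic([0.5, 0.5])` gives 0 (no differentiation) and `fst_biallelic([0.0, 1.0])` gives 1 (populations fixed for different alleles); values in between, such as the 34% reported above, indicate that a share of the variation lies among rather than within populations.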

Impact of Genetic Variation in Gene Regulatory Sequences: A Population Genomics Perspective

Frontiers in Genetics ◽

10.3389/fgene.2021.660899 ◽

2021 ◽

Vol 12 ◽

Author(s):

Manas Joshi ◽

Adamandia Kapopoulou ◽

Stefan Laurent

Keyword(s):

Gene Expression ◽

Genetic Variation ◽

High Throughput ◽

High Throughput Sequencing ◽

Population Genomics ◽

Natural Populations ◽

Regulatory Elements ◽

Regulatory Sequences ◽

Coding Sequences ◽

The Impact

The unprecedented rise of high-throughput sequencing and assay technologies has provided a detailed insight into the non-coding sequences and their potential role as gene expression regulators. These regulatory non-coding sequences are also referred to as cis-regulatory elements (CREs). Genetic variants occurring within CREs have been shown to be associated with altered gene expression and phenotypic changes. Such variants are known to occur spontaneously and ultimately get fixed, due to selection and genetic drift, in natural populations and, in some cases, pave the way for speciation. Hence, the study of genetic variation at CREs has improved our overall understanding of the processes of local adaptation and evolution. Recent advances in high-throughput sequencing and better annotations of CREs have enabled the evaluation of the impact of such variation on gene expression, phenotypic alteration and fitness. Here, we review recent research on the evolution of CREs and concentrate on studies that have investigated genetic variation occurring in these regulatory sequences within the context of population genetics.

seqCAT: a Bioconductor R-package for variant analysis of high throughput sequencing data

F1000Research ◽

10.12688/f1000research.16083.1 ◽

2018 ◽

Vol 7 ◽

pp. 1466 ◽

Cited By ~ 2

Author(s):

Erik Fasterius ◽

Cristina Al-Khalili Szigyarto

Keyword(s):

Genetic Variation ◽

Liver Cancer ◽

High Throughput ◽

High Throughput Sequencing ◽

R Package ◽

Ease Of Use ◽

Sequencing Data ◽

Dna And Rna ◽

High Throughput Sequencing Data ◽

Wide Range

High throughput sequencing technologies are flourishing in the biological sciences, enabling unprecedented insights into e.g. genetic variation, but require extensive bioinformatic expertise for the analysis. There is thus a need for simple yet effective software that can analyse both existing and novel data, providing interpretable biological results with little bioinformatic prowess. We present seqCAT, a Bioconductor toolkit for analysing genetic variation in high throughput sequencing data. It is a highly accessible, easy-to-use and well-documented R-package that enables a wide range of researchers to analyse their own and publicly available data, providing biologically relevant conclusions and publication-ready figures. SeqCAT can provide information regarding genetic similarities between an arbitrary number of samples, validate specific variants as well as define functionally similar variant groups for further downstream analyses. Its ease of use, installation, complete data-to-conclusions functionality and the inherent flexibility of the R programming language make seqCAT a powerful tool for variant analyses compared to already existing solutions. A publicly available dataset of liver cancer-derived organoids is analysed herein using the seqCAT package, demonstrating that the organoids are genetically stable. A previously known liver cancer-related mutation is additionally shown to be present in a sample though it was not listed in the original publication. Differences between DNA- and RNA-based variant calls in this dataset are also analysed revealing a high median concordance of 97.5%.

seqCAT: a Bioconductor R-package for variant analysis of high throughput sequencing data

F1000Research ◽

10.12688/f1000research.16083.2 ◽

2019 ◽

Vol 7 ◽

pp. 1466 ◽

Cited By ~ 1

Author(s):

Erik Fasterius ◽

Cristina Al-Khalili Szigyarto

Keyword(s):

Genetic Variation ◽

Liver Cancer ◽

High Throughput ◽

High Throughput Sequencing ◽

R Package ◽

Ease Of Use ◽

Sequencing Data ◽

Dna And Rna ◽

High Throughput Sequencing Data ◽

Wide Range

High throughput sequencing technologies are flourishing in the biological sciences, enabling unprecedented insights into e.g. genetic variation, but require extensive bioinformatic expertise for the analysis. There is thus a need for simple yet effective software that can analyse both existing and novel data, providing interpretable biological results with little bioinformatic prowess. We present seqCAT, a Bioconductor toolkit for analysing genetic variation in high throughput sequencing data. It is a highly accessible, easy-to-use and well-documented R-package that enables a wide range of researchers to analyse their own and publicly available data, providing biologically relevant conclusions and publication-ready figures. SeqCAT can provide information regarding genetic similarities between an arbitrary number of samples, validate specific variants as well as define functionally similar variant groups for further downstream analyses. Its ease of use, installation, complete data-to-conclusions functionality and the inherent flexibility of the R programming language make seqCAT a powerful tool for variant analyses compared to already existing solutions. A publicly available dataset of liver cancer-derived organoids is analysed herein using the seqCAT package, corroborating the original authors' conclusions that the organoids are genetically stable. A previously known liver cancer-related mutation is additionally shown to be present in a sample though it was not listed in the original publication. Differences between DNA- and RNA-based variant calls in this dataset are also analysed, revealing a high median concordance of 97.5%. SeqCAT is open-source software released under an MIT licence and is available at https://bioconductor.org/packages/release/bioc/html/seqCAT.html.

A high-throughput sequencing determination method for upstream genetic structure (UGS) of ISEcp1-blaCTX-M transposition unit and application of the UGS to classification of bacterial isolates possessing blaCTX-M

Journal of Infection and Chemotherapy ◽

10.1016/j.jiac.2021.04.001 ◽

2021 ◽

Author(s):

Nobuyoshi Yagi ◽

Kouta Hamamoto ◽

Kim Ngan Thi Bui ◽

Shuhei Ueda ◽

Saki Tawata ◽

...

Keyword(s):

Genetic Structure ◽

High Throughput ◽

High Throughput Sequencing ◽

Determination Method ◽

Bacterial Isolates

Chinese Fir Breeding in the High-Throughput Sequencing Era: Insights from SNPs

Forests ◽

10.3390/f10080681 ◽

2019 ◽

Vol 10 (8) ◽

pp. 681 ◽

Cited By ~ 1

Author(s):

Huiquan Zheng ◽

Dehuo Hu ◽

Ruping Wei ◽

Shu Yan ◽

Runhui Wang

Keyword(s):

Population Structure ◽

Genetic Structure ◽

High Throughput ◽

High Throughput Sequencing ◽

Breeding Population ◽

Population Diversity ◽

Chinese Fir ◽

High Density ◽

Population Structure Analysis ◽

Snp Panel

Knowledge of population diversity and structure is of fundamental importance for conifer breeding programs. In this study, we concentrated on the development and application of high-density single nucleotide polymorphism (SNP) markers through a high-throughput sequencing technique termed specific-locus amplified fragment sequencing (SLAF-seq) for the economically important conifer tree species, Chinese fir (Cunninghamia lanceolata). Based on SLAF-seq, we successfully established a high-density SNP panel consisting of 108,753 genomic SNPs from Chinese fir. This SNP panel helped us gain insight into the genetic base of the Chinese fir advanced breeding population of 221 genotypes, including its genetic variation, relationships, diversity, and population structure. Overall, the present population appears to have considerable genetic variability. Most (94.15%) of the variability was attributed to the genetic differentiation of genotypes; very limited (5.85%) variation occurred at the population (sub-origin set) level. Correspondingly, low FST (0.0285–0.0990) values were seen for the sub-origin sets. When viewing the genetic structure of the population regardless of its sub-origin set feature, the present SNP data opened a new population picture where the advanced Chinese fir breeding population could be divided into four genetic sets, as evidenced by phylogenetic tree and population structure analysis results, albeit with some differences in membership of the corresponding sets (cluster vs. group). It also suggested that all the genetic sets were admixed clades, revealing a complex relationship among the genotypes of this population. With a stepwise pruning procedure, we captured a core collection (core 0.650) harboring 143 genotypes that maintains all the alleles, diversity, and specific genetic structure of the whole population. This generalist core is valuable for the Chinese fir advanced breeding program and further genetic/genomic studies.

Accurate detection for a wide range of mutation and editing sites of microRNAs from small RNA high-throughput sequencing profiles

Nucleic Acids Research ◽

10.1093/nar/gkw471 ◽

2016 ◽

Vol 44 (14) ◽

pp. e123-e123 ◽

Cited By ~ 24

Author(s):

Yun Zheng ◽

Bo Ji ◽

Renhua Song ◽

Shengpeng Wang ◽

Ting Li ◽

...

Keyword(s):

High Throughput ◽

Small Rna ◽

High Throughput Sequencing ◽

Accurate Detection ◽

Wide Range

Identification of Novel Viruses in Amblyomma americanum, Dermacentor variabilis, and Ixodes scapularis Ticks

mSphere ◽

10.1128/msphere.00614-17 ◽

2018 ◽

Vol 3 (2) ◽

Cited By ~ 35

Author(s):

Rafal Tokarz ◽

Stephen Sameroff ◽

Teresa Tagliafierro ◽

Komal Jain ◽

Simon H. Williams ◽

...

Keyword(s):

New York ◽

High Throughput ◽

High Throughput Sequencing ◽

Ixodes Scapularis ◽

The United States ◽

Dermacentor Variabilis ◽

Amblyomma Americanum ◽

Animal Pathogens ◽

Wide Range ◽

Sequencing Platforms

ABSTRACT Ticks carry a wide range of known human and animal pathogens and are postulated to carry others with the potential to cause disease. Here we report a discovery effort wherein unbiased high-throughput sequencing was used to characterize the virome of 2,021 ticks, including Ixodes scapularis (n = 1,138), Amblyomma americanum (n = 720), and Dermacentor variabilis (n = 163), collected in New York, Connecticut, and Virginia in 2015 and 2016. We identified 33 viruses, including 24 putative novel viral species. The most frequently detected viruses were phylogenetically related to members of the Bunyaviridae and Rhabdoviridae families, as well as the recently proposed Chuviridae. Our work expands our understanding of tick viromes and underscores the high viral diversity that is present in ticks.

IMPORTANCE The incidence of tick-borne disease is increasing, driven by rapid geographical expansion of ticks and the discovery of new tick-associated pathogens. The examination of the tick microbiome is essential in order to understand the relationship between microbes and their tick hosts and to facilitate the identification of new tick-borne pathogens. Genomic analyses using unbiased high-throughput sequencing platforms have proven valuable for investigations of tick bacterial diversity, but the examination of tick viromes has historically not been well explored. By performing a comprehensive virome analysis of the three primary tick species associated with human disease in the United States, we gained substantial insight into tick virome diversity and can begin to assess a potential role of these viruses in the tick life cycle.

jackalope: a swift, versatile phylogenomic and high-throughput sequencing simulator

10.1101/650747 ◽

2019 ◽

Author(s):

Lucas A. Nell

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Population Genomics ◽

R Package ◽

Gene Trees ◽

Sequencing Platform ◽

Genomic Variants ◽

Pacific Biosciences ◽

Wide Range ◽

Reference Genomes

High-throughput sequencing (HTS) is central to the study of population genomics and has an increasingly important role in constructing phylogenies. Choices in research design for sequencing projects can include a wide range of factors, such as sequencing platform, depth of coverage, and bioinformatic tools. Simulating HTS data better informs these decisions. However, current standalone HTS simulators cannot generate genomic variants under even somewhat complex evolutionary scenarios, which greatly reduces their usefulness for fields such as population genomics and phylogenomics. Here I present the R package jackalope that simply and efficiently simulates (i) variants from reference genomes and (ii) reads from both Illumina and Pacific Biosciences (PacBio) platforms. Genomic variants can be simulated using phylogenies, gene trees, coalescent-simulation output, population-genomic summary statistics, and Variant Call Format (VCF) files. jackalope can simulate single, paired-end, or mate-pair Illumina reads, as well as reads from Pacific Biosciences. These simulations include sequencing errors, mapping qualities, multiplexing, and optical/PCR duplicates. It can read reference genomes from FASTA files and can simulate new ones, and all outputs can be written to standard file formats. jackalope is available for Mac, Windows, and Linux systems.

Learning Sparse Log-Ratios for High-Throughput Sequencing Data

10.1101/2021.02.11.430695 ◽

2021 ◽

Author(s):

Elliott Gordon-Rodriguez ◽

Thomas P. Quinn ◽

John P. Cunningham

Keyword(s):

High Throughput ◽

Latent Variables ◽

High Throughput Sequencing ◽

Compositional Data ◽

Predictive Accuracy ◽

Learning Algorithm ◽

Sequencing Data ◽

Genetic Sequencing ◽

Wide Range ◽

Benchmark Datasets

The automatic discovery of interpretable features that are associated to an outcome of interest is a central goal of bioinformatics. In the context of high-throughput genetic sequencing data, and Compositional Data more generally, an important class of features are the log-ratios between subsets of the input variables. However, the space of these log-ratios grows combinatorially with the dimension of the input, and as a result, existing learning algorithms do not scale to increasingly common high-dimensional datasets. Building on recent literature on continuous relaxations of discrete latent variables, we design a novel learning algorithm that identifies sparse log-ratios several orders of magnitude faster than competing methods. As well as dramatically reducing runtime, our method outperforms its competitors in terms of sparsity and predictive accuracy, as measured across a wide range of benchmark datasets.
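
To make the notion of a log-ratio between subsets of input variables concrete, here is a minimal sketch of one such feature. It is not the learning algorithm described in this abstract: the function name `subset_log_ratio` is hypothetical, and aggregating each subset by its geometric mean is an assumption of this sketch (other aggregations, such as summing the parts, are also possible).

```python
import math


def subset_log_ratio(x, numerator, denominator):
    """Log-ratio between two subsets of a strictly positive composition.

    x: sequence of positive values (e.g. counts with zeros already replaced).
    numerator, denominator: index lists selecting the two subsets.
    Each subset is summarised by its geometric mean (an assumption here).
    """
    if not numerator or not denominator:
        raise ValueError("both subsets must be non-empty")
    if any(x[i] <= 0 for i in list(numerator) + list(denominator)):
        raise ValueError("log-ratios require strictly positive values")

    # The mean of logs equals the log of the geometric mean.
    mean_log_num = sum(math.log(x[i]) for i in numerator) / len(numerator)
    mean_log_den = sum(math.log(x[i]) for i in denominator) / len(denominator)
    return mean_log_num - mean_log_den
```

Because only ratios enter the feature, rescaling every component of `x` by the same constant (for example, a different sequencing depth) leaves the value unchanged, which is what makes such features suitable for compositional data. The number of candidate subset pairs grows combinatorially with the length of `x`, which is the scaling problem the abstract refers to.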

Dung beetles as vertebrate samplers – a test of high throughput analysis of dung beetle iDNA

10.1101/2021.02.10.430568 ◽

2021 ◽

Author(s):

Rosie Drinkwater ◽

Elizabeth L. Clare ◽

Arthur Y. C. Chung ◽

Stephen J. Rossiter ◽

Eleanor M. Slade

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Dung Beetle ◽

Blood Feeding ◽

Preliminary Evidence ◽

Dung Beetles ◽

High Throughput Analysis ◽

Humid Forest ◽

Terrestrial Vertebrates ◽

Wide Range

The application of environmental DNA (eDNA) sampling in biodiversity surveys has gained widespread acceptance, especially in aquatic systems where free eDNA can be readily collected by filtering water. In terrestrial systems, eDNA-based approaches for assaying vertebrate biodiversity have tended to rely on blood-feeding invertebrates, including leeches and mosquitoes (termed invertebrate-derived DNA or iDNA). However, a key limitation of using blood-feeding taxa as samplers is that they are difficult to trap, and, in the case of leeches, are highly restricted to humid forest ecosystems. Dung beetles (superfamily Scarabaeoidea) feed on the faecal matter of terrestrial vertebrates and offer several potential benefits over blood-feeding invertebrates as samplers of vertebrate DNA. Importantly, these beetles can be easily captured in large numbers using simple, inexpensive baited traps; are globally distributed; and also occur in a wide range of biomes, allowing mammal diversity to be compared across habitats. In this exploratory study, we test the potential utility of dung beetles as vertebrate samplers by sequencing the mammal DNA contained within their guts. First, using a controlled feeding experiment, we show that mammalian DNA can be retrieved from the guts of large dung beetles (Catharsius renaudpauliani) for up to 10 hours after feeding. Second, by combining high-throughput sequencing of a multi-species assemblage of dung beetles with PCR replicates, we show that multiple mammal taxa can be identified with high confidence. By providing preliminary evidence that dung beetles can be used as a source of mammal DNA, our study highlights the potential for this widespread group to be used in future biodiversity monitoring surveys.