Identification of SNPs Associated with Stress Response Traits within High Stress and Low Stress Lines of Japanese Quail

Steven Shumaker; Bhuwan Khatri; Stephanie Shouse; Dongwon Seo; Seong Kang; Wayne Kuenzel; Byungwhi Kong

doi:10.3390/genes12030405

Genes ◽

10.3390/genes12030405 ◽

2021 ◽

Vol 12 (3) ◽

pp. 405

Author(s):

Steven Shumaker ◽

Bhuwan Khatri ◽

Stephanie Shouse ◽

Dongwon Seo ◽

Seong Kang ◽

...

Keyword(s):

Japanese Quail ◽

Stress Responses ◽

High Stress ◽

Read Depth ◽

Poultry Production ◽

Nucleotide Polymorphisms ◽

Production Traits ◽

Sequencing Data ◽

Illumina Hiseq ◽

Genetic Mechanisms

Mitigation of stress is of great importance in poultry production, as chronic stress can affect the efficiency of production traits. Selective breeding with a focus on stress responses can be used to combat the effects of stress. To better understand the genetic mechanisms driving differences in stress responses of a selectively bred population of Japanese quail, we performed genomic resequencing on 24 birds from High Stress (HS) and Low Stress (LS) lines of Japanese quail using Illumina HiSeq 2 × 150 bp paired-end read technology in order to analyze Single Nucleotide Polymorphisms (SNPs) within the genome of each line. SNPs are common mutations that can lead to genotypic and phenotypic variations in animals. Following alignment of the sequencing data to the quail genome, 6,364,907 SNPs were found across both lines of quail. Of these, 10,364 SNPs occurred in coding regions, from which 2,886 unique, non-synonymous SNPs with a SNP% ≥ 0.90 and a read depth ≥ 10 were identified. Using Ingenuity Pathway Analysis, we identified genes affected by SNPs in pathways tied to immune responses, DNA repair, and neurological signaling. Our findings support the idea that the SNPs found within HS and LS lines of quail could direct the observed changes in phenotype.
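
The filtering step described in this abstract (non-synonymous SNPs with SNP% ≥ 0.90 and read depth ≥ 10) can be illustrated with a minimal Python sketch. The abstract does not define SNP%, so the sketch assumes it is the fraction of reads at a site that carry the alternate allele. The `SnpCall` record, its field names, and the example values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpCall:
    """Hypothetical per-site record; field names are illustrative only."""
    chrom: str
    pos: int
    alt_reads: int       # reads supporting the alternate allele
    depth: int           # total reads covering the site
    nonsynonymous: bool  # True if the SNP changes the encoded amino acid


def snp_fraction(call: SnpCall) -> float:
    """Assumed definition of SNP%: alternate-allele reads / total reads."""
    return call.alt_reads / call.depth if call.depth else 0.0


def passes_filter(call: SnpCall, min_fraction: float = 0.90, min_depth: int = 10) -> bool:
    """Keep non-synonymous SNPs that meet both thresholds from the abstract."""
    return (
        call.nonsynonymous
        and call.depth >= min_depth
        and snp_fraction(call) >= min_fraction
    )


# Made-up example calls, one per failure mode plus one that passes.
calls = [
    SnpCall("chr1", 100, 19, 20, True),   # 0.95 at depth 20: kept
    SnpCall("chr1", 200, 9, 9, True),     # depth below 10: dropped
    SnpCall("chr2", 300, 12, 20, True),   # fraction 0.60: dropped
    SnpCall("chr2", 400, 30, 30, False),  # synonymous: dropped
]

kept = [c for c in calls if passes_filter(c)]
kept_positions = [c.pos for c in kept]
```

In a real pipeline these fields would be parsed from a variant call file and an annotation step; the thresholds are passed as parameters so they can be varied.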

Reliability of genomic variants across different next-generation sequencing platforms and bioinformatic processing pipelines

10.21203/rs.3.rs-50691/v2 ◽

2020 ◽

Author(s):

Stephan Weißbach ◽

Stanislav Jur'evic Sys ◽

Charlotte Hewel ◽

Hristo Todorov ◽

Susann Schweiger ◽

...

Keyword(s):

Next Generation Sequencing ◽

Gc Content ◽

Nucleotide Polymorphisms ◽

Next Generation ◽

Sequencing Data ◽

Illumina Hiseq ◽

Cross Sectional ◽

Single Nucleotide ◽

Alu Elements ◽

Generation Sequencing

Abstract Background Next Generation Sequencing (NGS) is the fundament of various studies, providing insights into questions from biology and medicine. Nevertheless, integrating data from different experimental backgrounds can introduce strong biases. In order to methodically investigate the magnitude of systematic errors in single nucleotide variant calls, we performed a cross-sectional observational study on a genomic cohort of 99 subjects each sequenced via (i) Illumina HiSeq X, (ii) Illumina HiSeq, and (iii) Complete Genomics and processed with the respective bioinformatic pipeline. We also repeated variant calling for the Illumina cohorts with GATK, which allowed us to investigate the effect of the bioinformatics analysis strategy separately from the sequencing platform's impact. Results The number of detected variants/variant classes per individual was highly dependent on the experimental setup. We observed a statistically significant overrepresentation of variants uniquely called by a single setup, indicating potential systematic biases. Insertion/deletion polymorphisms (InDels) were associated with decreased concordance compared to single nucleotide polymorphisms (SNPs). The discrepancies in InDel absolute numbers were particularly prominent in introns, Alu elements, simple repeats, and regions with medium GC content. Notably, reprocessing sequencing data following the best practice recommendations of GATK considerably improved concordance between the respective setups. Conclusion We provide empirical evidence of systematic heterogeneity in variant calls between alternative experimental and data analysis setups. Furthermore, our results demonstrate the benefit of reprocessing genomic data with harmonized pipelines when integrating data from different studies.

Comprehensive Transcriptome Study to Develop Molecular Resources of the Copepod Calanus sinicus for Their Potential Ecological Applications

BioMed Research International ◽

10.1155/2014/493825 ◽

2014 ◽

Vol 2014 ◽

pp. 1-12 ◽

Cited By ~ 12

Author(s):

Qing Yang ◽

Fanyue Sun ◽

Zhi Yang ◽

Hongjun Li

Keyword(s):

Gene Annotation ◽

Average Length ◽

Rapid Development ◽

Northwest Pacific ◽

Nucleotide Polymorphisms ◽

Rna Seq ◽

Sequencing Data ◽

Illumina Hiseq ◽

Northwest Pacific Ocean ◽

Snp Validation

Calanus sinicus Brodsky (Copepoda, Crustacea) is a dominant zooplanktonic species widely distributed in the margin seas of the Northwest Pacific Ocean. In this study, we utilized an RNA-Seq-based approach to develop molecular resources for C. sinicus. Adult samples were sequenced using the Illumina HiSeq 2000 platform. The sequencing data generated 69,751 contigs from 58.9 million filtered reads. The assembled contigs had an average length of 928.8 bp. Gene annotation allowed the identification of 43,417 unigene hits against the NCBI database. Gene ontology (GO) and KEGG pathway mapping analysis revealed various functional genes related to diverse biological functions and processes. Transcripts potentially involved in stress response and lipid metabolism were identified among these genes. Furthermore, 4,871 microsatellites and 110,137 single nucleotide polymorphisms (SNPs) were identified in the C. sinicus transcriptome sequences. SNP validation by the melting temperature (Tm)-shift method suggested that 16 primer pairs amplified target products and showed biallelic polymorphism among 30 individuals. The present work demonstrates the power of Illumina-based RNA-Seq for the rapid development of molecular resources in nonmodel species. The validated SNP set from our study is currently being utilized in an ongoing ecological analysis to support a future study of C. sinicus population genetics.

Reliability of genomic variants across different next-generation sequencing platforms and bioinformatic processing pipelines

BMC Genomics ◽

10.1186/s12864-020-07362-8 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Stephan Weißbach ◽

Stanislav Sys ◽

Charlotte Hewel ◽

Hristo Todorov ◽

Susann Schweiger ◽

...

Keyword(s):

Next Generation Sequencing ◽

Gc Content ◽

Nucleotide Polymorphisms ◽

Next Generation ◽

Sequencing Data ◽

Illumina Hiseq ◽

Cross Sectional ◽

Single Nucleotide ◽

Alu Elements ◽

Generation Sequencing

Abstract Background Next Generation Sequencing (NGS) is the fundament of various studies, providing insights into questions from biology and medicine. Nevertheless, integrating data from different experimental backgrounds can introduce strong biases. In order to methodically investigate the magnitude of systematic errors in single nucleotide variant calls, we performed a cross-sectional observational study on a genomic cohort of 99 subjects each sequenced via (i) Illumina HiSeq X, (ii) Illumina HiSeq, and (iii) Complete Genomics and processed with the respective bioinformatic pipeline. We also repeated variant calling for the Illumina cohorts with GATK, which allowed us to investigate the effect of the bioinformatics analysis strategy separately from the sequencing platform’s impact. Results The number of detected variants/variant classes per individual was highly dependent on the experimental setup. We observed a statistically significant overrepresentation of variants uniquely called by a single setup, indicating potential systematic biases. Insertion/deletion polymorphisms (indels) were associated with decreased concordance compared to single nucleotide polymorphisms (SNPs). The discrepancies in indel absolute numbers were particularly prominent in introns, Alu elements, simple repeats, and regions with medium GC content. Notably, reprocessing sequencing data following the best practice recommendations of GATK considerably improved concordance between the respective setups. Conclusion We provide empirical evidence of systematic heterogeneity in variant calls between alternative experimental and data analysis setups. Furthermore, our results demonstrate the benefit of reprocessing genomic data with harmonized pipelines when integrating data from different studies.
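
The notions of concordance and of variants "uniquely called by a single setup" can be illustrated with a small Python sketch over sets of variant keys. The abstract does not say how concordance was computed, so the Jaccard index used here is an assumption chosen for illustration. The setup names and variant tuples are made up.

```python
from typing import Dict, Set, Tuple

# A variant is keyed by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def jaccard_concordance(a: Set[Variant], b: Set[Variant]) -> float:
    """Shared calls divided by all calls made by either setup (assumed metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def unique_calls(call_sets: Dict[str, Set[Variant]]) -> Dict[str, Set[Variant]]:
    """For each setup, the variants that no other setup reported."""
    result = {}
    for name, calls in call_sets.items():
        others = set()
        for other_name, other_calls in call_sets.items():
            if other_name != name:
                others |= other_calls
        result[name] = calls - others
    return result


# Hypothetical call sets for one individual under three setups.
call_sets = {
    "setup_a": {("chr1", 10, "A", "G"), ("chr1", 20, "C", "T"), ("chr2", 5, "G", "A")},
    "setup_b": {("chr1", 20, "C", "T"), ("chr2", 5, "G", "A"), ("chr3", 7, "T", "C")},
    "setup_c": {("chr1", 20, "C", "T"), ("chr2", 5, "G", "A")},
}

concordance_ab = jaccard_concordance(call_sets["setup_a"], call_sets["setup_b"])
unique = unique_calls(call_sets)
```

Indels are harder to compare this way than SNPs because the same event can be written at different positions, so a real comparison would normalize variant representations first.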

The complete chloroplast DNA sequence of eleven grape cultivars. Simultaneous resequencing methodology

OENO One ◽

10.20870/oeno-one.2014.48.2.1568 ◽

2014 ◽

Vol 48 (2) ◽

pp. 99 ◽

Cited By ~ 2

Author(s):

Vazha Tabidze ◽

Grigol Baramidze ◽

Ia Pipia ◽

Mari Gogniashvili ◽

Levan Ujmajuridze ◽

...

Keyword(s):

Chloroplast Dna ◽

Dna Sequence ◽

Genomic Dna ◽

Read Depth ◽

Nucleotide Polymorphisms ◽

Sequencing Data ◽

Chromosomal Dna ◽

Single Nucleotide ◽

The Difference ◽

Grape Cultivars

Aims: The chloroplast DNA sequence of eight Georgian grape cultivars (Rkatsiteli, Saperavi, Meskhuri Mtsvane, Chkhaveri, Aladasturi, Krakhuna, Tsitska, Tsolikouri) and three French cultivars (Chardonnay, Gouais Blanc, Chasselas), belonging to four different haplogroups (AAA, ATT, ATA, GTA), was determined by Illumina resequencing of genomic DNA. The chloroplast DNA sequence of the Maxxa cultivar was used as reference. Methods and results: The comparison of sequenced chloroplast DNA gave 100% identity between Chardonnay and Gouais Blanc, which differ from Meskhuri Mtsvane by two insertions/deletions (indels) (all ATA haplogroup). The difference between Chasselas and Saperavi was a single insertion (both ATT haplogroup), while Maxxa, Chkhaveri, Aladasturi, Krakhuna, Tsitska and Tsolikouri were all identical (all members of the GTA haplogroup). Forty-seven identical single nucleotide polymorphisms (SNPs) were detected in the AAA, ATA and ATT haplogroups in comparison to the reference DNA. Additionally, 18 SNPs were detected for the ATT haplogroup, 4 for AAA, 6 for ATA and 11 for both AAA and ATA. The phylogenetic results show that the ATT, AAA and ATA haplogroups are more closely related to each other than to the GTA haplogroup. Conclusion: In the sequencing data of grape genomic DNA at a chromosomal DNA coverage (read depth) of 30-40, the coverage of chloroplast DNA reaches several thousand reads per bp due to the high number of chloroplast DNA copies in genomic DNA, much higher than necessary for resequencing. Based on these data, a new methodology for simultaneous resequencing of a large number of chloroplast DNAs was developed without preliminary chloroplast isolation or chloroplast enrichment. Significance and impact of the study: This method has great potential for expanding both phylogenetic and population genetic information on the evolution of domesticated crops.

Reliability of genomic variants across different next-generation sequencing platforms and bioinformatic processing pipelines

10.21203/rs.3.rs-50691/v3 ◽

2021 ◽

Author(s):

Stephan Weißbach ◽

Stanislav Jur'evic Sys ◽

Charlotte Hewel ◽

Hristo Todorov ◽

Susann Schweiger ◽

...

Keyword(s):

Next Generation Sequencing ◽

Gc Content ◽

Nucleotide Polymorphisms ◽

Next Generation ◽

Sequencing Data ◽

Illumina Hiseq ◽

Cross Sectional ◽

Single Nucleotide ◽

Alu Elements ◽

Generation Sequencing

Abstract Background Next Generation Sequencing (NGS) is the fundament of various studies, providing insights into questions from biology and medicine. Nevertheless, integrating data from different experimental backgrounds can introduce strong biases. In order to methodically investigate the magnitude of systematic errors in single nucleotide variant calls, we performed a cross-sectional observational study on a genomic cohort of 99 subjects each sequenced via (i) Illumina HiSeq X, (ii) Illumina HiSeq, and (iii) Complete Genomics and processed with the respective bioinformatic pipeline. We also repeated variant calling for the Illumina cohorts with GATK, which allowed us to investigate the effect of the bioinformatics analysis strategy separately from the sequencing platform's impact. Results The number of detected variants/variant classes per individual was highly dependent on the experimental setup. We observed a statistically significant overrepresentation of variants uniquely called by a single setup, indicating potential systematic biases. Insertion/deletion polymorphisms (indels) were associated with decreased concordance compared to single nucleotide polymorphisms (SNPs). The discrepancies in indel absolute numbers were particularly prominent in introns, Alu elements, simple repeats, and regions with medium GC content. Notably, reprocessing sequencing data following the best practice recommendations of GATK considerably improved concordance between the respective setups. Conclusion We provide empirical evidence of systematic heterogeneity in variant calls between alternative experimental and data analysis setups. Furthermore, our results demonstrate the benefit of reprocessing genomic data with harmonized pipelines when integrating data from different studies.

CNVpytor: a tool for copy number variation detection and analysis from read depth and allele imbalance in whole-genome sequencing

GigaScience ◽

10.1093/gigascience/giab074 ◽

2021 ◽

Vol 10 (11) ◽

Cited By ~ 1

Author(s):

Milovan Suvakov ◽

Arijit Panda ◽

Colin Diesh ◽

Ian Holmes ◽

Alexej Abyzov

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Read Depth ◽

Copy Number Variations ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Nucleotide Polymorphisms ◽

Sequencing Data ◽

Modular Architecture

Abstract Background Detecting copy number variations (CNVs) and copy number alterations (CNAs) based on whole-genome sequencing data is important for personalized genomics and treatment. CNVnator is one of the most popular tools for CNV/CNA discovery and analysis based on read depth. Findings Herein, we present an extension of CNVnator developed in Python—CNVpytor. CNVpytor inherits the reimplemented core engine of its predecessor and extends visualization, modularization, performance, and functionality. Additionally, CNVpytor uses B-allele frequency likelihood information from single-nucleotide polymorphisms and small indels data as additional evidence for CNVs/CNAs and as primary information for copy number–neutral losses of heterozygosity. Conclusions CNVpytor is significantly faster than CNVnator—particularly for parsing alignment files (2–20 times faster)—and has (20–50 times) smaller intermediate files. CNV calls can be filtered using several criteria, annotated, and merged over multiple samples. Modular architecture allows it to be used in shared and cloud environments such as Google Colab and Jupyter notebook. Data can be exported into JBrowse, while a lightweight plugin version of CNVpytor for JBrowse enables nearly instant and GUI-assisted analysis of CNVs by any user. CNVpytor release and the source code are available on GitHub at https://github.com/abyzovlab/CNVpytor under the MIT license.

Genome-Wide Analysis of Sex Disparities in the Genetic Architecture of Lung and Colorectal Cancers

Genes ◽

10.3390/genes12050686 ◽

2021 ◽

Vol 12 (5) ◽

pp. 686

Author(s):

Alireza Nazarian ◽

Alexander M. Kulminski

Keyword(s):

Nucleotide Polymorphisms ◽

Genetic Associations ◽

Significance Level ◽

Association Analyses ◽

Complex Disorders ◽

Specific Effects ◽

Genome Wide ◽

Genetic Mechanisms ◽

Sex Disparities ◽

Almost All

Almost all complex disorders have manifested epidemiological and clinical sex disparities which might partially arise from sex-specific genetic mechanisms. Addressing such differences can be important from a precision medicine perspective which aims to make medical interventions more personalized and effective. We investigated sex-specific genetic associations with colorectal (CRCa) and lung (LCa) cancers using genome-wide single-nucleotide polymorphism (SNP) data from three independent datasets. The genome-wide association analyses revealed that 33 SNPs were associated with CRCa/LCa at P < 5.0 × 10⁻⁶ in either males or females. Of these, 26 SNPs had sex-specific effects as their effect sizes were statistically different between the two sexes at a Bonferroni-adjusted significance level of 0.0015. None had proxy SNPs within their ±1 Mb regions and the closest genes to 32 SNPs were not previously associated with the corresponding cancers. The pathway enrichment analyses demonstrated the associations of 35 pathways with CRCa or LCa which were mostly implicated in immune system responses, cell cycle, and chromosome stability. The significant pathways were mostly enriched in either males or females. Our findings provided novel insights into the potential sex-specific genetic heterogeneity of CRCa and LCa at SNP and pathway levels.
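
The Bonferroni-adjusted level of 0.0015 quoted in this abstract is consistent with dividing a family-wise error rate of 0.05 across the 33 SNPs tested for sex differences. The abstract does not state the base level or the number of tests used, so both values in the sketch below are assumptions. The p-values in the usage example are made up.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumed inputs: alpha = 0.05 and one test per associated SNP (33 SNPs).
threshold = bonferroni_threshold(0.05, 33)
rounded_threshold = round(threshold, 4)

# Made-up p-values for the difference in effect size between the sexes.
example_p_values = {"snp_1": 0.0004, "snp_2": 0.0020, "snp_3": 0.0014}
sex_specific = sorted(name for name, p in example_p_values.items() if p < threshold)
```

Under these assumptions `rounded_threshold` equals 0.0015, matching the level reported above.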

Risk prediction and marker selection in nonsynonymous single nucleotide polymorphisms using whole genome sequencing data

Animal Cells and Systems ◽

10.1080/19768354.2020.1860125 ◽

2020 ◽

Vol 24 (6) ◽

pp. 321-328

Author(s):

Young-Sup Lee ◽

KyeongHye Won ◽

Donghyun Shin ◽

Jae-Don Oh

Keyword(s):

Single Nucleotide Polymorphisms ◽

Whole Genome Sequencing ◽

Risk Prediction ◽

Genome Sequencing ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Nucleotide Polymorphisms ◽

Sequencing Data ◽

Single Nucleotide ◽

Marker Selection

Transcriptomic Analysis of Rice Plants Overexpressing PsGAPDH in Response to Salinity Stress

Genes ◽

10.3390/genes12050641 ◽

2021 ◽

Vol 12 (5) ◽

pp. 641

Author(s):

Hyemin Lim ◽

Hyunju Hwang ◽

Taelim Kim ◽

Soyoung Kim ◽

Hoyong Chung ◽

...

Keyword(s):

Salt Stress ◽

Abiotic Stress ◽

Salinity Stress ◽

Functional Enrichment ◽

Differentially Expressed ◽

Pathway Enrichment Analysis ◽

Sequencing Data ◽

Illumina Hiseq ◽

Rice Plants ◽

Significant Difference

In plants, glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is a main enzyme in the glycolytic pathway. It plays an essential role in glycerolipid metabolism and response to various stresses. To examine the function of PsGAPDH (Pleurotus sajor-caju GAPDH) in response to abiotic stress, we generated transgenic rice plants with single-copy/intergenic/homozygous overexpression of PsGAPDH (PsGAPDH-OX) and investigated their responses to salinity stress. Seedling growth and germination rates of PsGAPDH-OX were significantly increased under salt stress conditions compared to those of the wild type. To elucidate the role of PsGAPDH-OX in salt stress tolerance of rice, an Illumina HiSeq 2000 platform was used to analyze transcriptome profiles of leaves under salt stress. Analysis of the sequencing data showed that 1124 transcripts were differentially expressed. Using the list of differentially expressed genes (DEGs), functional enrichment analyses of DEGs such as Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were performed. KEGG pathway enrichment analysis revealed that unigenes exhibiting differential expression were involved in starch and sucrose metabolism. Interestingly, trehalose-6-phosphate synthase (TPS) genes, whose expression was enhanced by abiotic stress, showed a significant difference in PsGAPDH-OX. Findings of this study suggest that PsGAPDH plays a role in the adaptation of rice plants to salt stress.

Parentage Analysis in Giant Grouper (Epinephelus lanceolatus) Using Microsatellite and SNP Markers from Genotyping-by-Sequencing Data

Genes ◽

10.3390/genes12071042 ◽

2021 ◽

Vol 12 (7) ◽

pp. 1042

Author(s):

Zhuoying Weng ◽

Yang Yang ◽

Xi Wang ◽

Lina Wu ◽

Sijie Hua ◽

...

Keyword(s):

Fishery Management ◽

Genotyping By Sequencing ◽

Parentage Analysis ◽

Snp Markers ◽

Individual Identification ◽

Pedigree Information ◽

Nucleotide Polymorphisms ◽

Sequencing Data ◽

Polymorphic Snps ◽

Mixed Family

Pedigree information is necessary for the maintenance of diversity in wild and captive populations. An accurate pedigree is determined by molecular marker-based parentage analysis, which may be influenced by the polymorphism and number of markers, integrity of samples, relatedness of parents, or different analysis programs. Here, we describe the first development of 208 single nucleotide polymorphisms (SNPs) and 11 microsatellites for giant grouper (Epinephelus lanceolatus) using genotyping-by-sequencing (GBS), and compare the power of SNPs and microsatellites for parentage and relatedness analysis, based on a mixed family composed of 4 candidate females, 4 candidate males and 289 offspring. CERVUS, PAPA and COLONY were used for mutual verification. We found that SNPs had a better potential for relatedness estimation, exclusion of non-parentage and individual identification than microsatellites, and > 98% accuracy of parentage assignment could be achieved by 100 polymorphic SNPs (MAF cut-off < 0.4) or 10 polymorphic microsatellites (mean Ho = 0.821, mean PIC = 0.651). This study provides a reference for the development of molecular markers for parentage analysis using next-generation sequencing, and contributes to molecular breeding, fishery management and population conservation.