Mining SNPs From EST Databases

1999 ◽  
Vol 9 (2) ◽  
pp. 167-174 ◽  
Author(s):  
Leslie Picoult-Newberg ◽  
Trey E. Ideker ◽  
Mark G. Pohl ◽  
Scott L. Taylor ◽  
Miriam A. Donaldson ◽  
...  

There is considerable interest in the discovery and characterization of single nucleotide polymorphisms (SNPs) to enable the analysis of potential relationships between human genotype and phenotype. Here we present a strategy that permits the rapid discovery of SNPs from publicly available expressed sequence tag (EST) databases. From a set of ESTs derived from 19 different cDNA libraries, we assembled 300,000 distinct sequences and identified 850 mismatches from contiguous EST data sets (candidate SNP sites), without de novo sequencing. Using a polymerase-mediated, single-base primer extension technique, Genetic Bit Analysis (GBA), we confirmed the presence of a subset of these candidate SNP sites and estimated the allele frequencies in three human populations of different ethnic origins. Altogether, our approach provides a basis for rapid and efficient regional and genome-wide SNP discovery using data assembled from sequences from different cDNA libraries. [The SNPs identified in this study can be found in the National Center for Biotechnology Information (NCBI) SNP database under submitter handles ORCHID (SNPS-981210-A) and debnick (SNPS-981209-A and SNPS-981209-B).]
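The idea of flagging mismatches within contiguous EST data sets can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `candidate_snp_sites`, the `min_minor_count` threshold, and the toy reads are all hypothetical, and the input is assumed to be reads from one contig that are already aligned and gap-padded to equal length.

```python
from collections import Counter


def candidate_snp_sites(aligned_reads, min_minor_count=2):
    """Return (column, base_counts) for alignment columns where a second
    base is observed at least `min_minor_count` times.

    Assumption: `aligned_reads` are gap-padded strings of equal length
    taken from one assembled contig. The threshold is illustrative only.
    """
    if not aligned_reads:
        return []
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    sites = []
    for col in range(width):
        # Count only unambiguous bases; gaps ('-') and 'N' are ignored.
        counts = Counter(
            read[col].upper()
            for read in aligned_reads
            if read[col].upper() in "ACGT"
        )
        ranked = counts.most_common(2)
        # Requiring the minor base more than once helps separate a
        # candidate polymorphism from a single-read sequencing error.
        if len(ranked) == 2 and ranked[1][1] >= min_minor_count:
            sites.append((col, dict(counts)))
    return sites


# Toy contig of five aligned reads (made-up data).
reads = [
    "ACGTACGT",
    "ACGTACGT",
    "ACGAACGT",
    "ACGAACG-",
    "ACGTACNT",
]
sites = candidate_snp_sites(reads)
```

Here only column 3 (0-based) is reported, because both T and A are each seen in at least two reads. A candidate found this way would still need laboratory confirmation, which the study did with GBA.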

2014 ◽  
Vol 17 (4) ◽  
Author(s):  
Raymond K. Walters ◽  
Charles Laurin ◽  
Gitta H. Lubke

Epistasis is a growing area of research in genome-wide studies, but the differences between alternative definitions of epistasis remain a source of confusion for many researchers. One problem is that models for epistasis are presented in a number of formats, some of which have difficult-to-interpret parameters. In addition, the relation between the different models is rarely explained. Existing software for testing epistatic interactions between single-nucleotide polymorphisms (SNPs) does not provide the flexibility to compare the available model parameterizations. For that reason we have developed an R package for investigating epistatic and penetrance models, EpiPen, to aid users who wish to easily compare, interpret, and utilize models for two-locus epistatic interactions. EpiPen facilitates research on SNP-SNP interactions by allowing the R user to easily convert between common parametric forms for two-locus interactions, generate data for simulation studies, and perform power analyses for the selected model with a continuous or dichotomous phenotype. The usefulness of the package for model interpretation and power analysis is illustrated using data on rheumatoid arthritis.


2015 ◽  
Author(s):  
Sanaa Afroz Ahmed ◽  
Chien-Chi Lo ◽  
Po-E Li ◽  
Karen W Davenport ◽  
Patrick S.G. Chain

Next-generation sequencing is increasingly being used to examine closely related organisms. However, while genome-wide single nucleotide polymorphisms (SNPs) provide an excellent resource for phylogenetic reconstruction, to date evolutionary analyses have been performed using different ad hoc methods that are often not widely applicable across different projects. To facilitate the construction of robust phylogenies, we have developed a method for genome-wide identification/characterization of SNPs from sequencing reads and genome assemblies. Our phylogenetic and molecular evolutionary (PhaME) analysis software is unique in its ability to take reads and draft/complete genome(s) as input, derive core genome alignments, identify SNPs, construct phylogenies and perform evolutionary analyses. Several examples using genomes and read datasets for bacterial, eukaryotic and viral lineages demonstrate the broad and robust functionality of PhaME. Furthermore, the ability to incorporate raw metagenomic reads from clinical samples with suspected infectious agents shows promise for the rapid phylogenetic characterization of pathogens within complex samples.


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Weizhuo Zhu ◽  
Yiyi Guo ◽  
Yeke Chen ◽  
Dezhi Wu ◽  
Lixi Jiang

Background: GATA transcription factors are involved in plant developmental processes and respond to environmental stresses by binding DNA regulatory regions to regulate their downstream genes. However, little information on the GATA genes in Brassica napus is available. The release of the reference genome of B. napus provides a good opportunity to perform a genome-wide characterization of GATA family genes in rapeseed. Results: In this study, 96 GATA genes randomly distributed on 19 chromosomes were identified in B. napus and classified into four subfamilies based on phylogenetic analysis and their domain structures. The amino acid sequences of BnGATAs diverged markedly among the four subfamilies in terms of their GATA domains, structures and motif compositions. Gene duplication and synteny between the genomes of B. napus and A. thaliana were also analyzed to provide insights into evolutionary characteristics. Moreover, BnGATAs showed different expression patterns in various tissues and under diverse abiotic stresses. The distributions of single nucleotide polymorphisms (SNPs) in BnGATAs in a core germplasm collection are probably associated with functional disparity under environmental stress conditions in different genotypes of B. napus. Conclusion: The present study investigated the genomic structures, evolutionary features, expression patterns and SNP distributions of 96 BnGATAs. The results enrich our understanding of the GATA genes in rapeseed.


BMB Reports ◽  
2006 ◽  
Vol 39 (2) ◽  
pp. 183-188 ◽  
Author(s):  
Seung-Hwan Lee ◽  
Eung-Woo Park ◽  
Yong-Min Cho ◽  
Ji-Woong Lee ◽  
Hyoung-Yong Kim ◽  
...  

2021 ◽  
Vol 7 (12) ◽  
pp. 1076
Author(s):  
Wenbing Gong ◽  
Nan Shen ◽  
Lin Zhang ◽  
Yinbing Bian ◽  
Yang Xiao

Meiotic crossover plays a critical role in generating genetic variation and is a central component of breeding. However, our understanding of crossover in mushroom-forming fungi is limited. Here, in Lentinula edodes, we characterized chromosome-wide intragenic crossovers by utilizing the single-nucleotide polymorphism (SNP) datasets of an F1 haploid progeny. A total of 884 intragenic crossovers were identified in 110 single-spore isolates, the majority of which were closer to transcript start sites. About 71.5% of the intragenic crossovers were clustered into 65 crossover hotspots. A 10 bp motif (GCTCTCGAAA) was significantly enriched in the hotspot regions. Crossover frequencies around mating-type A (MAT-A) loci were enhanced and formed a hotspot in L. edodes. Genome-wide quantitative trait loci (QTLs) mapping identified sixteen crossover-QTLs, contributing 8.5–29.1% of variations. Most of the detected crossover-QTLs were co-located with crossover hotspots. Both cis- and trans-QTLs contributed to the nonuniformity of crossover along chromosomes. On chr2, we identified a QTL hotspot that regulated local and global crossover variation as well as crossover hotspots in L. edodes. These findings and observations provide a comprehensive view of the crossover landscape in L. edodes, and advance our understanding of the conservation and diversity of meiotic recombination in mushroom-forming fungi.
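As a rough illustration of what comparing motif density between regions might look like, the Python sketch below counts occurrences of the reported 10 bp motif on both strands and normalizes per kilobase. The abstract does not say how enrichment was tested, so this computes a descriptive density only, not a significance test; the helper names and the toy sequences are hypothetical.

```python
MOTIF = "GCTCTCGAAA"  # 10 bp motif reported in the abstract
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motif(seq, motif=MOTIF):
    """Count windows of `seq` that match the motif on either strand."""
    seq = seq.upper()
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return sum(seq[i:i + k] in targets for i in range(len(seq) - k + 1))


def hits_per_kb(seqs, motif=MOTIF):
    """Motif occurrences per 1000 bp across a collection of sequences."""
    total_len = sum(len(s) for s in seqs)
    total_hits = sum(count_motif(s, motif) for s in seqs)
    return 1000 * total_hits / total_len if total_len else 0.0


# Made-up sequences standing in for hotspot and background regions.
hotspot_regions = ["TTGCTCTCGAAATT", "AATTTCGAGAGCAA"]
background_regions = ["ACGTACGTACGTACGT"]

hotspot_density = hits_per_kb(hotspot_regions)
background_density = hits_per_kb(background_regions)
```

A higher density in hotspot regions than in background regions would be consistent with enrichment, but a real analysis would need an appropriate statistical test and matched background sequences.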


2020 ◽  
Author(s):  
Weizhuo Zhu ◽  
Yiyi Guo ◽  
Yeke Chen ◽  
Dezhi Wu ◽  
Lixi Jiang

Background: GATA transcription factors are involved in plant developmental processes and respond to environmental stresses by binding DNA regulatory regions to regulate their downstream genes. However, little information on the GATA genes in Brassica napus is available. The release of the reference genome of B. napus provides a good opportunity to perform a genome-wide characterization of GATA family genes in rapeseed. Results: In this study, 96 GATA genes randomly distributed on 19 chromosomes were identified in B. napus and classified into four subfamilies based on phylogenetic analysis and their domain structures. The amino acid sequences of BnGATAs diverged markedly among the four subfamilies in terms of their GATA domains, structures and motif compositions. Gene duplication and synteny between the genomes of B. napus and A. thaliana were also analyzed to provide insights into evolutionary characteristics. Moreover, BnGATAs showed different expression patterns in various tissues and under diverse abiotic stresses. The distributions of single nucleotide polymorphisms (SNPs) in BnGATAs in a core germplasm collection are probably associated with functional disparity under environmental stress conditions in different genotypes of B. napus. Conclusion: The present study investigated the genomic structures, evolutionary features, expression patterns and SNP distributions of 96 BnGATAs. The results enrich our understanding of the GATA genes in rapeseed.


2014 ◽  
Vol 17 (4) ◽  
pp. 272-278 ◽  
Author(s):  
Raymond K. Walters ◽  
Charles Laurin ◽  
Gitta H. Lubke

Epistasis is a growing area of research in genome-wide studies, but the differences between alternative definitions of epistasis remain a source of confusion for many researchers. One problem is that models for epistasis are presented in a number of formats, some of which have difficult-to-interpret parameters. In addition, the relation between the different models is rarely explained. Existing software for testing epistatic interactions between single-nucleotide polymorphisms (SNPs) does not provide the flexibility to compare the available model parameterizations. For that reason we have developed an R package for investigating epistatic and penetrance models, EpiPen, to aid users who wish to easily compare, interpret, and utilize models for two-locus epistatic interactions. EpiPen facilitates research on SNP–SNP interactions by allowing the R user to easily convert between common parametric forms for two-locus interactions, generate data for simulation studies, and perform power analyses for the selected model with a continuous or dichotomous phenotype. The usefulness of the package for model interpretation and power analysis is illustrated using data on rheumatoid arthritis.


Cephalalgia ◽  
2014 ◽  
Vol 35 (6) ◽  
pp. 500-507 ◽  
Author(s):  
MA Louter ◽  
J Fernandez-Morales ◽  
B de Vries ◽  
B Winsvold ◽  
V Anttila ◽  
...  

Introduction: Chronic migraine (CM) is at the severe end of the clinical migraine spectrum, but its genetic background is unknown. Our study searched for evidence that genetic factors are involved in the chronification process. Methods: We initially selected 144 single-nucleotide polymorphisms (SNPs) from 48 candidate genes, which we tested for association in two stages: the first stage encompassed 262 CM patients, the second investigated 226 patients with high-frequency migraine (HFM). Subsequently, SNPs with p values < 0.05 were forwarded to the replication stage containing 531 patients with CM or HFM. Results: Eight SNPs were significantly associated with CM and HFM in the two-stage phase. None survived replication in the third stage. Discussion: We present the first comprehensive genetic association study for migraine chronification. There were no significant findings. Future studies may benefit from larger, genome-wide data sets or should use other genetic approaches to identify genetic factors involved in migraine chronification.


Genome ◽  
2005 ◽  
Vol 48 (3) ◽  
pp. 562-570 ◽  
Author(s):  
Maeli Melotto ◽  
Claudia B Monteiro-Vitorello ◽  
Adriano G Bruschi ◽  
Luis E.A Camargo

To rapidly and cost-effectively generate gene expression data, we developed an annotated unigene database of common bean (Phaseolus vulgaris L.). In this study, 3 cDNA libraries were constructed from the bean breeding line SEL1308, 1 from young leaf and 2 from seedlings inoculated or not inoculated with the fungal pathogen Colletotrichum lindemuthianum (Sacc. & Magnus) Briosi & Cavara, which causes anthracnose in common bean. To date, 5255 single-pass sequences have been included in the database after selection based on sequence quality. These ESTs were trimmed and clustered using the computer programs Phred and CAP3 to form a unigene collection of 3126 unique sequences. Within clusters, 318 single nucleotide polymorphisms (SNPs) and 68 insertions–deletions (indels) were found, indicating the presence of paralogous gene families in our database. Each unigene sequence was analyzed for possible function using its similarity to known genes represented in the GenBank database and classified into 14 categories. Only 314 unigenes showed significant similarities to Phaseolus genomic sequences and P. vulgaris ESTs, which indicates that 90% (2818 unigenes) of our database represent newly discovered common bean genes. In addition, 12% (387 unigenes) were shown to be specific to common bean. This study represents a first step towards the discovery of novel genes in beans and a valuable source of molecular markers for expressed gene tagging and mapping. Key words: expressed sequence tag (EST), Colletotrichum lindemuthianum, Phaseolus vulgaris, simple sequence repeat (SSR), single nucleotide polymorphism (SNP).

