Finding and extending ancient simple sequence repeat-derived regions in the human genome

Mapping Intimacies ◽

10.1101/697813 ◽

2019 ◽

Author(s):

Jonathan A. Shortt ◽

Robert P. Ruggiero ◽

Corey Cox ◽

Aaron C. Wacholder ◽

David D. Pollock

Keyword(s):

Human Genome ◽

Simple Sequence Repeat ◽

Genome Structure ◽

Length Distribution ◽

High Sensitivity ◽

Training Set ◽

Protein Coding ◽

Ssr Loci ◽

Simple Sequence ◽

Repeated Motif

Background: Previously, 3% of the human genome has been annotated as simple sequence repeats (SSRs), similar to the proportion annotated as protein coding. The origin of much of the genome is not well annotated, however, and some of the unidentified regions are likely to be ancient SSR-derived regions not identified by current methods. The identification of these regions is complicated because SSRs appear to evolve through complex cycles of expansion and contraction, often interrupted by mutations that alter both the repeated motif and mutation rate. We applied an empirical, kmer-based approach to identify genome regions that are likely derived from SSRs. Results: The sequences flanking annotated SSRs are enriched for similar sequences and for SSRs with similar motifs, suggesting that the evolutionary remains of SSR activity abound in regions near obvious SSRs. Using our previously described P-clouds approach, we identified ‘SSR-clouds’, groups of similar kmers (or ‘oligos’) that are enriched near a training set of unbroken SSR loci, and then used the SSR-clouds to detect likely SSR-derived regions throughout the genome. Conclusions: Our analysis indicates that the amount of likely SSR-derived sequence in the human genome is 6.77%, over twice as much as previous estimates, including millions of newly identified ancient SSR-derived loci. SSR-clouds identified poly-A sequences adjacent to transposable element termini in over 74% of the oldest class of Alu (roughly, AluJ), validating the sensitivity of the approach. Poly-A’s annotated by SSR-clouds also had a length distribution that was more consistent with their poly-A origins, with mean about 35 bp even in older Alus. This work demonstrates that the high sensitivity provided by SSR-clouds improves the detection of SSR-derived regions and will enable deeper analysis of how decaying repeats contribute to genome structure.
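The idea of kmers being "enriched near a training set" can be illustrated with a minimal sketch. This is not the authors' P-clouds method, which the abstract does not describe in detail; it is a simple frequency-ratio comparison under assumed settings. The function names, the pseudocount, and the `min_ratio` threshold are all hypothetical choices made for illustration.

```python
from collections import Counter

def kmer_counts(seqs, k):
    """Count every k-mer made only of A, C, G, T across a list of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            kmer = s[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts

def enriched_kmers(flanks, background, k=4, min_ratio=2.0, pseudocount=1.0):
    """Return {kmer: ratio} for k-mers over-represented in flanks vs. background.

    Hypothetical scoring: smoothed relative frequency in the flanking
    sequences divided by smoothed relative frequency in the background.
    """
    fc = kmer_counts(flanks, k)
    bc = kmer_counts(background, k)
    n_kmers = 4 ** k  # number of possible k-mers, used for smoothing
    f_total = sum(fc.values()) + pseudocount * n_kmers
    b_total = sum(bc.values()) + pseudocount * n_kmers
    enriched = {}
    for kmer, count in fc.items():
        f = (count + pseudocount) / f_total
        b = (bc[kmer] + pseudocount) / b_total
        if f / b >= min_ratio:
            enriched[kmer] = f / b
    return enriched
```

Calling `enriched_kmers(flanks, background, k=2)` with sequences flanking known SSRs as `flanks` would return the k-mers that are candidates for seeding a cloud; a real analysis would need longer k-mers, genome-scale counts, and a principled significance threshold.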

Simple Sequence Repeat and S-Locus Genotyping to Assist the Genetic Characterization and Breeding of Polyploid Prunus Species, P. spinosa and P. domestica subsp. insititia

Biochemical Genetics ◽

10.1007/s10528-021-10090-7 ◽

2021 ◽

Author(s):

Júlia Halász ◽

Noémi Makovics-Zsohár ◽

Ferenc Szőke ◽

Sezai Ercisli ◽

Attila Hegedűs

Keyword(s):

Simple Sequence Repeat ◽

Principal Component ◽

Sequence Repeat ◽

Breeding Programs ◽

Allele Number ◽

Genetic Potential ◽

Ssr Loci ◽

High Level ◽

Diversity Studies ◽

Simple Sequence

Polyploid Prunus spinosa (2n = 4×) and P. domestica subsp. insititia (2n = 6×) represent enormous genetic potential in Central Europe, which can be exploited in breeding programs. In Hungary, 16 cultivar candidates and a recognized cultivar ‘Zempléni’ were selected from wild-growing populations, including ten P. spinosa, four P. domestica subsp. insititia and three P. spinosa × P. domestica hybrids (2n = 5×). Genotyping at eleven simple sequence repeat (SSR) loci and the multiallelic S-locus was used to characterize genetic variability and achieve a reliable identification of tested accessions. Nine SSR loci proved to be polymorphic and eight of those were highly informative (PIC values > 0.7). A total of 129 SSR alleles were identified, corresponding to an average of 14.3 alleles per locus, and all accessions but two clones could be discriminated based on unique SSR fingerprints. A total of 23 S-RNase alleles were identified and the complete and partial S-genotype was determined for 10 and 7 accessions, respectively. The DNA sequence was determined for a total of 17 fragments representing 11 S-RNase alleles. ‘Zempléni’ was confirmed to be self-compatible carrying at least one non-functional S-RNase allele (SJ). Our results indicate that the S-allele pools of wild-growing P. spinosa and P. domestica subsp. insititia are overlapping in Hungary. Phylogenetic and principal component analyses confirmed the high level of diversity and genetic differentiation present within the analysed accessions and indicated putative ancestor–descendant relationships. Our data confirm that S-locus genotyping is suitable for diversity studies in polyploid Prunus species but non-related accessions sharing common S-alleles may distort phylogenetic inferences.
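For readers unfamiliar with the PIC threshold, the following sketch computes the polymorphism information content of one locus using the standard Botstein et al. (1980) formula, PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ². The abstract does not say how PIC was calculated for these polyploid accessions, where allele dosage is hard to score, so the function name `pic` and the assumption that allele frequencies are known are illustrative only.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphism information content for one locus.

    freqs: allele frequencies at the locus, which must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term: summed over every unordered pair of distinct alleles
    cross_term = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term

# Average allele number reported in the abstract: 129 alleles over 9 polymorphic loci
average_alleles_per_locus = round(129 / 9, 1)
```

With two equally frequent alleles `pic([0.5, 0.5])` is 0.375, while four equally frequent alleles give `pic([0.25] * 4)` = 0.703125, so a PIC above 0.7 implies at least four alleles at appreciable frequency. The reported average of 14.3 alleles per locus matches 129 / 9.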

Evaluation of Genetic Diversity and Pedigree within Crapemyrtle Cultivars Using Simple Sequence Repeat Markers

Journal of the American Society for Horticultural Science ◽

10.21273/jashs.136.2.116 ◽

2011 ◽

Vol 136 (2) ◽

pp. 116-128 ◽

Cited By ~ 14

Author(s):

Xinwang Wang ◽

Phillip A. Wadl ◽

Cecil Pounders ◽

Robert N. Trigiano ◽

Raul I. Cabrera ◽

...

Keyword(s):

Genetic Diversity ◽

Simple Sequence Repeat ◽

Interspecific Hybrids ◽

Size Variation ◽

Sequence Repeat ◽

Allele Size ◽

Low Genetic Diversity ◽

Ssr Loci ◽

Diversity Estimates ◽

Simple Sequence

Genetic diversity was estimated for 51 Lagerstroemia indica L. cultivars, five Lagerstroemia fauriei Koehne cultivars, and 37 interspecific hybrids using 78 simple sequence repeat (SSR) markers. SSR loci were highly variable among the cultivars, detecting an average of 6.6 alleles (amplicons) per locus. Each locus detected 13.6 genotypes on average. Cluster analysis identified three main groups that consisted of individual cultivars from L. indica, L. fauriei, and their interspecific hybrids. However, only 18.1% of the overall variation was the result of differences between these groups, which may be attributable to pedigree-based breeding strategies that use current cultivars as parents for future selections. Clustering within each group generally reflected breeding pedigrees but was not supported by bootstrap replicates. Low statistical support was likely the result of low genetic diversity estimates, which indicated that only 25.5% of the total allele size variation was attributable to differences between the species L. indica and L. fauriei. Most allele size variation, or 74.5%, was common to L. indica and L. fauriei. Thus, introgression of other Lagerstroemia species such as Lagerstroemia limii Merr. (L. chekiangensis Cheng), Lagerstroemia speciosa (L.) Pers., and Lagerstroemia subcostata Koehne may significantly expand crapemyrtle breeding programs. This study verified relationships between existing cultivars and identified potentially untapped sources of germplasm.

Phylogeny and Comparative Analysis of Chinese Chamaesium Species Revealed by the Complete Plastid Genome

Plants ◽

10.3390/plants9080965 ◽

2020 ◽

Vol 9 (8) ◽

pp. 965 ◽

Cited By ~ 1

Author(s):

Xian-Lin Guo ◽

Hong-Yi Zheng ◽

Megan Price ◽

Song-Dong Zhou ◽

Xing-Jin He

Keyword(s):

Plastid Genome ◽

Molecular Phylogenetics ◽

Genome Structure ◽

Gc Content ◽

Plastid Genomes ◽

Second Generation Sequencing ◽

Ssr Loci ◽

Small Genus ◽

Generation Sequencing ◽

Simple Sequence

Chamaesium H. Wolff (Apiaceae, Apioideae) is a small genus mainly distributed in the Hengduan Mountains and the Himalayas. Ten species of Chamaesium have been described, nine of which occur in China. Recent advances in molecular phylogenetics have revolutionized our understanding of Chinese Chamaesium taxonomy and evolution. However, phylogenetic relationships in Chamaesium inferred from second-generation sequencing data remain poorly understood. Here, we newly assembled nine plastid genomes from the nine Chinese Chamaesium species and combined these genomes with those of eight other species from five genera to perform a phylogenetic analysis by maximum likelihood (ML) using the complete plastid genome. We also analyzed genome structure, GC content, species pairwise Ka/Ks ratios and the simple sequence repeat (SSR) component. We found that the nine species’ plastid genomes ranged from 152,703 bp (C. thalictrifolium) to 155,712 bp (C. mallaeanum), and contained 133 genes, 34 SSR types and 585 SSR loci. We also found 20,953–21,115 codons from 53 coding sequence (CDS) regions, 38.4–38.7% GC content of the total genome and low Ka/Ks (0.27–0.43) ratios of 53 aligned CDS. These results will facilitate our further understanding of the evolution of the genus Chamaesium.

SSR Locator: Tool for Simple Sequence Repeat Discovery Integrated with Primer Design and PCR Simulation

International Journal of Plant Genomics ◽

10.1155/2008/412696 ◽

2008 ◽

Vol 2008 ◽

pp. 1-9 ◽

Cited By ~ 94

Author(s):

Luciano Carlos da Maia ◽

Dario Abel Palmieri ◽

Velci Queiroz de Souza ◽

Mauricio Marini Kopp ◽

Fernando Irajá Félix de Carvalho ◽

...

Keyword(s):

Simple Sequence Repeat ◽

Primer Design ◽

Cdna Sequences ◽

Additional Information ◽

Ssr Loci ◽

Eukaryotic Organisms ◽

Simple Sequence ◽

Rice Cdna ◽

Tandem Duplications ◽

Short Tandem

Microsatellites or SSRs (simple sequence repeats) are ubiquitous short tandem duplications occurring in eukaryotic organisms. These sequences are among the best marker technologies applied in plant genetics and breeding. The abundant genomic, BAC, and EST sequences available in databases allow surveys of the presence and location of SSR loci. Plant geneticists and breeders also seek additional information on primer sequences. In this paper, we describe a utility that integrates SSR searches, frequency of occurrence of motifs and arrangements, primer design, and PCR simulation against other databases. This simulation allows the performance of global alignments and identity and homology searches between different amplified sequences, that is, amplicons. In order to validate the tool functions, SSR discovery searches were performed in a database containing 28 469 nonredundant rice cDNA sequences.
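The SSR search step can be sketched with a regular expression that looks for a short motif repeated in tandem. This is a minimal illustration, not SSR Locator's algorithm: the minimum-repeat thresholds in `MIN_REPEATS` and the function names are hypothetical, only perfect (uninterrupted) repeats are reported, and primer design and PCR simulation are not covered.

```python
import re
from collections import Counter

# Hypothetical thresholds: motif length -> minimum number of tandem copies
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def is_primitive(motif):
    """True if the motif is not itself a shorter unit repeated (e.g. 'CACA')."""
    n = len(motif)
    return not any(n % d == 0 and motif[:d] * (n // d) == motif
                   for d in range(1, n))

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, copies) tuples for perfect SSRs.

    Coordinates are 0-based and end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n - 1 further copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. 'AA' runs found as dinucleotides
                hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)

def motif_frequencies(hits):
    """Count how often each motif occurs among the detected SSRs."""
    return Counter(motif for _, _, motif, _ in hits)
```

For example, `find_ssrs("GGCT" + "CA" * 8 + "TTGC" + "A" * 12 + "CGT")` returns `[(4, 20, 'CA', 8), (24, 36, 'A', 12)]`. Because matches for a given motif length do not overlap, a non-primitive match can occasionally mask the start of an adjacent repeat, which a production tool would need to handle.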

Genome-Wide Characterization of Simple Sequence Repeat (SSR) Loci in Chinese Jujube and Jujube SSR Primer Transferability

PLoS ONE ◽

10.1371/journal.pone.0127812 ◽

2015 ◽

Vol 10 (5) ◽

pp. e0127812 ◽

Cited By ~ 24

Author(s):

Jing Xiao ◽

Jin Zhao ◽

Mengjun Liu ◽

Ping Liu ◽

Li Dai ◽

...

Keyword(s):

Simple Sequence Repeat ◽

Sequence Repeat ◽

Chinese Jujube ◽

Genome Wide ◽

Ssr Loci ◽

Simple Sequence

Characterization and analysis of simple sequence repeat (SSR) loci in seashore paspalum (Paspalum vaginatum Swartz)

Theoretical and Applied Genetics ◽

10.1007/bf00220857 ◽

1995 ◽

Vol 91 (1) ◽

pp. 47-52 ◽

Cited By ~ 22

Author(s):

Z.-W. Liu ◽

R. L. Jarret ◽

S. Kresovich ◽

R. R. Duncan

Keyword(s):

Simple Sequence Repeat ◽

Sequence Repeat ◽

Seashore Paspalum ◽

Paspalum Vaginatum ◽

Ssr Loci ◽

Simple Sequence

Isolation and characterization of new polymorphic simple sequence repeat loci in grape (Vitis vinifera L.)

Genome ◽

10.1139/g96-080 ◽

1996 ◽

Vol 39 (4) ◽

pp. 628-633 ◽

Cited By ~ 268

Author(s):

J. E. Bowers ◽

G. S. Dangl ◽

R. Vignani ◽

C. P. Meredith

Keyword(s):

Vitis Vinifera ◽

Silver Staining ◽

Simple Sequence Repeat ◽

Pinot Noir ◽

Sequence Repeat ◽

Wine Grapes ◽

Table Grapes ◽

Isolation And Characterization ◽

Ssr Loci ◽

Simple Sequence

Four new simple sequence repeat (SSR) loci (designated VVMD5, VVMD6, VVMD7, and VVMD8) were characterized in grape and analyzed by silver staining in 77 cultivars of Vitis vinifera. Amplification products ranged in size from 141 to 263 base pairs (bp). The number of alleles observed per locus ranged from 5 to 11 and the number of diploid genotypes per locus ranged from 13 to 27. At each locus at least 75% of the cultivars were heterozygous. Alleles differing in length by only 1 bp could be distinguished by silver staining, and size estimates were within 1 or 2 bp, depending on the locus, of those obtained by fluorescence detection at previously reported loci. Allele frequencies were generally similar in wine grapes and table grapes, with some exceptions. Some alleles were found only in one of the two groups of cultivars. All 77 cultivars were distinguished by the four loci with the exception of four wine grapes considered to be somatic variants of the same cultivar, 'Pinot noir', 'Pinot gris', 'Pinot blanc', and 'Meunier'; two table grapes that are known to be synonymous, 'Keshmesh' and 'Thompson Seedless'; and three table grapes, 'Dattier', 'Rhazaki Arhanon', and 'Markandi', the first two of which have been suggested to be synonymous. Although the high polymorphism at grape SSR loci suggests that very few loci would theoretically be needed to separate all cultivars, the economic and legal significance of grape variety identification requires the increased resolution that can be provided by a larger number of loci. The ease with which SSR markers and data can be shared internationally should encourage their broad use, which will in turn increase the power of these markers for both identification and genetic analysis of grape. Key words: grape, Vitis, microsatellite, simple sequence repeat, DNA typing, identification.

Microsatellite DNA markers in Populus tremuloides

Genome ◽

10.1139/g99-134 ◽

2000 ◽

Vol 43 (2) ◽

pp. 293-297 ◽

Cited By ~ 28

Author(s):

Muhammad H Rahman ◽

S Dayanandan ◽

Om P Rajora

Keyword(s):

Populus Tremuloides ◽

Dna Markers ◽

Microsatellite Dna ◽

Simple Sequence Repeat ◽

Mendelian Inheritance ◽

Inheritance Pattern ◽

Sequence Repeat ◽

Microsatellite Dna Markers ◽

Ssr Loci ◽

Simple Sequence

Markers for eight new microsatellite DNA or simple sequence repeat (SSR) loci were developed and characterized in trembling aspen (Populus tremuloides) from a partial genomic library. Informativeness of these microsatellite DNA markers was examined by determining polymorphisms in 38 P. tremuloides individuals. Inheritance of selected markers was tested in progenies of controlled crosses. Six characterized SSR loci were of dinucleotide repeats (two perfect and four imperfect), and one each of trinucleotide and tetranucleotide repeats. The monomorphic SSR locus (PTR15) was of a compound imperfect dinucleotide repeat. The primers of one highly polymorphic SSR locus (PTR7) amplified two loci, and alleles could not be assigned to a specific locus. At the other six polymorphic loci, 25 alleles were detected in 38 P. tremuloides individuals; the number of alleles ranged from 2 to 7, with an average of 4.2 alleles per locus, and the observed heterozygosity ranged from 0.05 to 0.61, with an average of 0.36 per locus. The two perfect dinucleotide and one trinucleotide microsatellite DNA loci were the most informative. Microsatellite DNA variants of four SSR loci characterized previously followed a single-locus Mendelian inheritance pattern, whereas those of PTR7 from the present study showed a two-locus Mendelian inheritance pattern in controlled crosses. The microsatellite DNA markers developed and reported here could be used for assisting various genetic, breeding, biotechnology, genome mapping, conservation, and sustainable forest management programs in poplars. Key words: poplar, microsatellites, genetic mapping, simple sequence repeat (SSR) markers, DNA fingerprinting.
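The per-locus summary statistics quoted above (number of alleles and observed heterozygosity) can be computed from diploid genotype calls as in the sketch below. The function name `locus_summary` and the toy genotypes are hypothetical and are not data from the study; alleles are written as amplicon sizes in bp purely for illustration.

```python
def locus_summary(genotypes):
    """Return (number of distinct alleles, observed heterozygosity) for one locus.

    genotypes: list of (allele_1, allele_2) tuples, one per diploid individual.
    """
    alleles = {allele for genotype in genotypes for allele in genotype}
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return len(alleles), heterozygotes / len(genotypes)

# Toy example: four individuals scored at one locus
toy_genotypes = [(150, 150), (150, 152), (152, 156), (150, 150)]
n_alleles, h_obs = locus_summary(toy_genotypes)

# Average reported in the abstract: 25 alleles over six polymorphic loci
average_alleles = round(25 / 6, 1)
```

Here `locus_summary(toy_genotypes)` gives 3 alleles and an observed heterozygosity of 0.5, and 25 / 6 rounds to the 4.2 alleles per locus stated in the abstract.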

Multilocus Simple Sequence Repeat Markers for Differentiating Strains and Evaluating Genetic Diversity of Xylella fastidiosa

Applied and Environmental Microbiology ◽

10.1128/aem.71.8.4888-4892.2005 ◽

2005 ◽

Vol 71 (8) ◽

pp. 4888-4892 ◽

Cited By ~ 19

Author(s):

Hong Lin ◽

Edwin L. Civerolo ◽

Rong Hu ◽

Samuel Barros ◽

Marta Francis ◽

...

Keyword(s):

Simple Sequence Repeat ◽

Xylella Fastidiosa ◽

Sequence Repeat ◽

Leaf Scorch ◽

Genome Wide ◽

A Genome ◽

Genome Wide Search ◽

Ssr Loci ◽

Simple Sequence ◽

Almond Leaf Scorch

A genome-wide search was performed to identify simple sequence repeat (SSR) loci among the available sequence databases from four strains of Xylella fastidiosa (strains causing Pierce's disease, citrus variegated chlorosis, almond leaf scorch, and oleander leaf scorch). Thirty-four SSR loci were selected for SSR primer design and were validated in PCR experiments. These multilocus SSR primers, distributed across the X. fastidiosa genome, clearly differentiated and clustered X. fastidiosa strains collected from grape, almond, citrus, and oleander. They are well suited for differentiating strains and studying X. fastidiosa epidemiology and population genetics.

A comparative study of Inter Simple Sequence Repeat (ISSR), Random Amplified Polymorphic DNA (RAPD) and Simple Sequence Repeat (SSR) loci in assessing genetic diversity in Amaranthus

Indian Journal of Genetics and Plant Breeding (The) ◽

10.5958/j.0975-6906.73.4.062 ◽

2013 ◽

Vol 73 (4) ◽

pp. 411 ◽

Cited By ~ 4

Author(s):

Balwant Singh ◽

Shailesh Pandey ◽

J. Kumar

Keyword(s):

Genetic Diversity ◽

Comparative Study ◽

Simple Sequence Repeat ◽

Random Amplified Polymorphic Dna ◽

Inter Simple Sequence Repeat ◽

Sequence Repeat ◽

Ssr Loci ◽

Simple Sequence
