Microhomologies Are Associated with Tandem Duplications and Structural Variation in Plant Mitochondrial Genomes

Hanhan Xia; Wei Zhao; Yong Shi; Xiao-Ru Wang; Baosheng Wang

doi:10.1093/gbe/evaa172

Genome Biology and Evolution ◽

10.1093/gbe/evaa172 ◽

2020 ◽

Vol 12 (11) ◽

pp. 1965-1974

Author(s):

Hanhan Xia ◽

Wei Zhao ◽

Yong Shi ◽

Xiao-Ru Wang ◽

Baosheng Wang

Keyword(s):

Tandem Repeat ◽

Structural Variation ◽

Tandem Repeats ◽

Repeat Unit ◽

Mitochondrial Genomes ◽

Repeat Array ◽

Mitochondrial Genome Evolution ◽

Tandem Duplications ◽

Short Tandem

Abstract Short tandem repeats (STRs) contribute to structural variation in plant mitochondrial genomes, but the mechanisms underlying their formation and expansion are unclear. In this study, we detected high polymorphism in the nad7-1 region of the Pinus tabuliformis mitogenome, caused by the rapid accumulation of STRs and rearrangements over the past few million years. The STRs in nad7-1 have a 7-bp microhomology (TAG7) flanking the repeat array. We then scanned the mitogenomes of 136 seed plants to understand the role of microhomology in STR formation and mitogenome evolution. A total of 13,170 STRs were identified, and almost half of them were associated with microhomologies. A substantial number (1,197) of these microhomologies were long enough to mediate structural variation, and microhomology length was positively correlated with the length of the tandem repeat unit. These results suggest that microhomology may be involved in the formation of tandem repeats via a microhomology-mediated pathway, and that the formation of longer duplicates requires longer microhomologies. We examined the abundance of these 1,197 microhomologies and found that 75% of them were enriched in plant mitogenomes. Further analyses of the 400 prevalent microhomologies revealed that 175 of them showed differential enrichment between angiosperms and gymnosperms and 186 differed between angiosperms and conifers, indicating lineage-specific usage and expansion of microhomologies. Our study sheds light on the sources of structural variation in plant mitochondrial genomes and highlights the importance of microhomology in mitochondrial genome evolution.

The Vsa Shield of Mycoplasma pulmonis Is Antiphagocytic

Infection and Immunity ◽

10.1128/iai.06009-11 ◽

2011 ◽

Vol 80 (2) ◽

pp. 704-709 ◽

Cited By ~ 13

Author(s):

Brandon M. Shaw ◽

Warren L. Simmons ◽

Kevin Dybvig

Keyword(s):

Tandem Repeat ◽

Tandem Repeats ◽

Repeat Unit ◽

Mycoplasma Pulmonis ◽

Content Type ◽

Tandem Repeat Region ◽

Chronic Respiratory Infection

ABSTRACT The infection of mice with Mycoplasma pulmonis is a model for studying chronic mycoplasmal respiratory disease. Many in vivo and in vitro studies have used the organism to gain a better understanding of host-pathogen interactions in chronic respiratory infection. The organism's Vsa proteins contain an extensive tandem repeat region. The length of the tandem repeat unit varies from as few as 11 amino acids to as many as 19. The number of tandem repeats can be as high as 60. The number of repeats varies at a high frequency due to slipped-strand mispairing events that occur during DNA replication. When the number of repeats is high, e.g., 40, the mycoplasma is resistant to lysis by complement but does not form a robust biofilm. When the number of repeats is low, e.g., 5, the mycoplasma is killed by complement when the cells are dispersed but has the capacity to form a biofilm that resists complement. Here, we examine the role of the Vsa proteins in the avoidance of phagocytosis and find that cells producing a protein with many tandem repeats are relatively resistant to killing by macrophages. These results may be pertinent to understanding the functions of similar proteins that have extensive repeat regions in other microbes.

Structural variation of novel alleles at the Hum vWA and Hum FES/FPS short tandem repeat loci

International Journal of Legal Medicine ◽

10.1007/bf01845614 ◽

1995 ◽

Vol 108 (1) ◽

pp. 31-35 ◽

Cited By ~ 16

Author(s):

M. D. Barber ◽

R. C. Piercy ◽

J. F. Andersen ◽

B. H. Parkin

Keyword(s):

Tandem Repeat ◽

Short Tandem Repeat ◽

Structural Variation ◽

Short Tandem Repeat Loci ◽

Novel Alleles ◽

Short Tandem

Comprehensive genetic diagnosis of tandem repeat expansion disorders with programmable targeted nanopore sequencing

10.1101/2021.09.27.21263187 ◽

2021 ◽

Author(s):

Igor Stevanovski ◽

Sanjog R. Chintalaphani ◽

Hasindu Gamaarachchi ◽

James M. Ferguson ◽

Sandy S. Pineda ◽

...

Keyword(s):

Tandem Repeat ◽

Tandem Repeats ◽

Fragile X ◽

Genetic Diagnosis ◽

Neuromuscular Diseases ◽

Nanopore Sequencing ◽

Molecular Tests ◽

Genetic Landscape ◽

Long Read ◽

Short Tandem

ABSTRACT Short-tandem repeat (STR) expansions are an important class of pathogenic genetic variants. Over forty neurological and neuromuscular diseases are caused by STR expansions, with 37 different genes implicated to date. Here we describe the use of programmable targeted long-read sequencing with Oxford Nanopore’s ReadUntil function for parallel genotyping of all known neuropathogenic STRs in a single, simple assay. Our approach enables accurate, haplotype-resolved assembly and DNA methylation profiling of expanded and non-expanded STR sites. In doing so, the assay correctly diagnoses all individuals in a cohort of patients (n = 27) with various neurogenetic diseases, including Huntington’s disease, fragile X syndrome, cerebellar ataxia (CANVAS) and others. Targeted long-read sequencing solves large and complex STR expansions that confound established molecular tests and short-read sequencing, and identifies non-canonical STR motif conformations and internal sequence interruptions. Even in our relatively small cohort, we observe a wide diversity of STR alleles of known and unknown pathogenicity, suggesting that long-read sequencing will redefine the genetic landscape of STR expansion disorders. Finally, we show how the flexible inclusion of pharmacogenomics (PGx) genes as secondary ReadUntil targets can identify clinically actionable PGx genotypes to further inform patient care, at no extra cost. Our study addresses the need for improved techniques for genetic diagnosis of STR expansion disorders and illustrates the broad utility of programmable long-read sequencing for clinical genomics. One sentence summary: This study describes the development and validation of a programmable targeted nanopore sequencing assay for parallel genetic diagnosis of all known pathogenic short-tandem repeats (STRs) in a single, simple test.

Contribution of ABO-Rhesus/Electrophoresis of hemoglobin methods and Short Tandem Repeats analysis in the determination of paternity in Burkina Faso, West Africa

10.21203/rs.3.rs-41454/v1 ◽

2020 ◽

Author(s):

Missa Millogo ◽

Serge Theophile Soubeiga ◽

Bapio Valerie Jean Telesphore Bazie ◽

Theodora Mahoukede Zohoncon ◽

Albert Theophane Yonli ◽

...

Keyword(s):

Tandem Repeat ◽

Short Tandem Repeat ◽

High Performance ◽

Short Tandem Repeats ◽

Tandem Repeats ◽

Repeat Analysis ◽

Paternity Index ◽

Genetic Analyzer ◽

Short Tandem

Abstract Background: Establishing filiation with the current ABO, HLA, MNS, Kell and serum tests poses a real reliability problem. It is therefore necessary to combine these methods with, or replace them by, high-performance methods such as microsatellite (short tandem repeat) genetic analysis. This study aimed to compare the short tandem repeat technique with the ABO/Rhesus system in combination with hemoglobin electrophoresis. Methods: Fourteen (14) contested paternity trios were investigated. Blood samples were collected to determine blood groups using the Beth-Vincent method and the type of hemoglobin by electrophoresis. Blood spots on FTA paper were used for the analysis of 16 STR loci (D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, vWA, TPOX, D18S51, D5S818, FGA, Amel) by capillary electrophoresis on the ABI 31310 Genetic Analyzer. The generated sequences were analyzed with GeneMapper® software version 3.2.1. The data were analyzed to determine the paternity index and the probability of paternity. Results: Of the fourteen (14) trios studied with the ABO-Rhesus/hemoglobin electrophoresis system, ten (10) cases were probable inclusions, three (03) were exclusions and one (01) was an undetermined paternity outcome, whereas the analysis of genetic polymorphisms in DNA gave five (05) inclusions versus nine (09) exclusions of paternity. Of the 10 probable inclusion cases given by the ABO-Rhesus/hemoglobin electrophoresis system, only 05 cases (50%) were confirmed as inclusions by short tandem repeat analysis. Conclusion: Short tandem repeat analysis with sixteen genetic markers is more reliable in determining paternity than ABO-Rhesus/hemoglobin electrophoresis techniques.
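
The abstract above mentions a paternity index and a probability of paternity without giving formulas. As a rough illustration only, and not the study's own computation, the sketch below uses the standard textbook relations: the combined paternity index is the product of per-locus indices, and the probability of paternity follows from Bayes' rule with a prior (0.5 here, by assumption). The function names and the per-locus values are hypothetical.

```python
from math import prod


def combined_paternity_index(locus_pis):
    """Multiply per-locus paternity indices (assumes independent loci)."""
    return prod(locus_pis)


def probability_of_paternity(cpi, prior=0.5):
    """Posterior probability of paternity from the combined index via Bayes' rule."""
    return cpi * prior / (cpi * prior + (1.0 - prior))


# Hypothetical per-locus indices, for illustration only.
example_pis = [2.0, 5.0, 10.0]
cpi = combined_paternity_index(example_pis)
print(cpi, probability_of_paternity(cpi))
```

A per-locus index of zero drives the combined index, and hence the probability, to zero in this simplified form. Real casework software typically applies mutation models so that a single mismatch is not treated as an absolute exclusion.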

Species DNA-based identification for detection of processed meat adulteration: is there a role of human short tandem repeats (STRs)?

Egyptian Journal of Forensic Sciences ◽

10.1186/s41935-019-0121-y ◽

2019 ◽

Vol 9 (1) ◽

Author(s):

Ghada Ali Omran ◽

Asmaa Osama Tolba ◽

Eman Ezz El-Dawela El-Sharkawy ◽

Doaa Mohammed Abdel-Aziz ◽

Hussein Youssef Ahmed

Keyword(s):

Short Tandem Repeats ◽

Tandem Repeats ◽

Processed Meat ◽

Short Tandem

The Role of Short Tandem Repeat Profiling in Disputed Paternity Cases

المجلة العربية للدراسات الأمنية و التدريب ◽

10.12816/0006260 ◽

2014 ◽

Vol 30 (60) ◽

pp. 247-264

Author(s):

Ahmad Muhammad Refaat

Keyword(s):

Tandem Repeat ◽

Short Tandem Repeat ◽

Short Tandem Repeat Profiling ◽

Short Tandem ◽

Paternity Cases

Human mucin gene MUC4: organization of its 5′-region and polymorphism of its central tandem repeat array

Biochemical Journal ◽

10.1042/bj3320739 ◽

1998 ◽

Vol 332 (3) ◽

pp. 739-748 ◽

Cited By ~ 68

Author(s):

Séverine NOLLET ◽

Nicolas MONIAUX ◽

Jacques MAURY ◽

Danièle PETITPREZ ◽

Pierre DEGAND ◽

...

Keyword(s):

Signal Peptide ◽

Tandem Repeat ◽

Tandem Repeats ◽

Cysteine Residue ◽

Coding Region ◽

Exon 2 ◽

Repeat Array ◽

Mucin Gene ◽

Repeat Domain ◽

Tandem Repeat Array

In a previous study we isolated a partial cDNA with a tandem repeat of 48 bp, which allowed us to map a novel human mucin gene named MUC4 to chromosome 3q29. Here we report the organization and sequence of the 5´-region and its junction with the tandem repeat array of MUC4. Analysis of three overlapping genomic clones allowed us to obtain a partial restriction map of MUC4 and to locate the complete 48 bp tandem repeat domain on a PstI/EcoRI genomic fragment that exhibits a very large variation in number of tandem repeats (7–19 kb). cDNA clonal extension allowed us to obtain the entire 5´ coding region of MUC4. Exon 1 consists of a 5´ untranslated region and an 82 bp fragment encoding the signal peptide. This latter shows a high degree of similarity to the signal peptide of another apomucin, ASGP-1. Exon 2 is extremely large and contains a unique sequence that is followed by the whole tandem repeat domain. It encodes only one cysteine residue, making MUC4 different from mucin genes belonging to the 11p15.5 family. Moreover, an intron downstream from the tandem repeat array consists mainly of a 15 bp tandem repeat that exhibits a polymorphism in having a variable number of tandem repeats.

Determining the Genetic Similarities and Variability of Javanese and Arab Ethnic Families with DNA Fingerprint in Malang East Java Indonesia

JURNAL ILMIAH SAINS ◽

10.35799/jis.17.1.2017.15292 ◽

2017 ◽

Vol 17 (1) ◽

pp. 51

Author(s):

Nila Kartika Sari

Keyword(s):

Tandem Repeat ◽

Short Tandem Repeat ◽

Tandem Repeats ◽

Pcr Amplification ◽

Dna Fingerprint ◽

Salting Out ◽

Breeding Populations ◽

Human White Blood Cell ◽

And Migration ◽

Short Tandem

ABSTRACT More than one-third of the human genome consists of repetitive sequence regions (repeat areas), comprising minisatellites, or Variable Number of Tandem Repeats (VNTR), and microsatellites, or Short Tandem Repeats (STR). Because of their short allele range, STRs are often used for paternity testing, research on genetic diseases, molecular archaeology and forensic criminal cases. The aim of this study was to identify the DNA fingerprint of Javanese-Arab ethnic families by determining their genetic similarity and variability. The material was human white blood cells from three generations in three families: (1) grandmother - mother, father - daughter; (2) grandfather - mother, father - daughter; (3) grandfather, grandmother - mother, father - son. DNA was isolated from each sample by salting out, followed by PCR amplification with the 13 CODIS loci (TPOX, D3S1358, FGA, D5S818, CSF1PO, D7S820, D8S1179, TH01, VWA, D13S317, D16S539, D18S51, D21S11) and amelogenin. The products were visualized on an 8% polyacrylamide gel imaged with ChemiDoc Gel Imaging, and the band profile of each individual was analyzed with Quantity One software to determine genetic similarity, variability and allele patterns. Variation in the DNA band patterns was analyzed with the GENEPOP software package version 4.2 to obtain allele frequencies, heterozygosity and allele migration. Heterozygosity was higher in population III (93.8461%) than in populations I (88.4615%) and II (76.9230%), and allele migration was 0.341373%. This allele migration, although small, indicates that interbreeding has occurred between the Javanese and Arab populations, raising the average heterozygosity of each population. Based on heterozygosity values and the number of alleles, the heterozygous allele patterns at D21S11, VWA and TH01 can be recommended as molecular markers for identifying genetic variation. Keywords: Javanese-Arab ethnic group, DNA fingerprint, 13 CODIS
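
The abstract above reports allele frequencies and heterozygosity per population. As a minimal sketch of what these quantities mean at a single locus, the code below computes allele frequencies, observed heterozygosity (share of heterozygous individuals) and the simple expected heterozygosity 1 - Σ p_i². This is an illustration under stated assumptions and not necessarily how GENEPOP computes its estimates. The function names and the genotype values are hypothetical.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Frequency of each allele among diploid genotypes given as (a, b) pairs."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    total = len(alleles)
    return {allele: count / total for allele, count in Counter(alleles).items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes):
    """Simple (uncorrected) expected heterozygosity: 1 minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in allele_frequencies(genotypes).values())


# Hypothetical repeat-count genotypes at one STR locus, for illustration only.
example = [(6, 9), (9, 9), (6, 7), (7, 9)]
print(allele_frequencies(example))
print(observed_heterozygosity(example), expected_heterozygosity(example))
```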

Finding Long Tandem Repeats In Long Noisy Reads

Bioinformatics ◽

10.1093/bioinformatics/btaa865 ◽

2020 ◽

Author(s):

Shinichi Morishita ◽

Kazuki Ichikawa ◽

Gene Myers

Keyword(s):

Tandem Repeat ◽

Error Rate ◽

Tandem Repeats ◽

Repeat Unit ◽

Error Rates ◽

De Bruijn Graph ◽

Frequency Distributions ◽

Sequencing Technologies ◽

Long Reads ◽

Repeat Expansions

Abstract Motivation: Long tandem repeat expansions of more than 1000 nt have been suggested to be associated with diseases, but remain largely unexplored in individual human genomes because read lengths have been too short. However, new long-read sequencing technologies can produce single reads of 10,000 nt or more that can span such repeat expansions, although these long reads have high error rates of 10%-20%, which complicates the detection of repetitive elements. Moreover, most traditional algorithms for finding tandem repeats are designed to find short tandem repeats (< 1000 nt) and cannot effectively handle the high error rate of long reads in a reasonable amount of time. Results: Here, we report an efficient algorithm for solving this problem that takes advantage of the length of the repeat. Namely, a long tandem repeat has hundreds or thousands of approximate copies of the repeated unit, so despite the error rate, many short k-mers will be error-free in many copies of the unit. We exploited this characteristic to develop a method for first estimating regions that could contain a tandem repeat, by analyzing the k-mer frequency distributions of fixed-size windows across the target read, followed by an algorithm that assembles the k-mers of a putative region into the consensus repeat unit by greedily traversing a de Bruijn graph. Experimental results indicated that the proposed algorithm largely outperformed Tandem Repeats Finder (TRF), a widely used program for finding tandem repeats, in terms of sensitivity. Software availability: https://github.com/morisUtokyo/mTR
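
The two steps named in the abstract above (window-wise k-mer frequency analysis, then a greedy walk over a de Bruijn graph) can be illustrated with a toy sketch. This is not the mTR implementation. The function names, the default parameters (k, window, step, min_top_count) and the stopping rule are all assumptions chosen for illustration, and the example uses an error-free repeat.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def repeat_like_windows(read, k=5, window=200, step=100, min_top_count=5):
    """Return start offsets of fixed-size windows whose most frequent k-mer
    occurs at least min_top_count times (a crude repeat signal)."""
    hits = []
    last_start = max(len(read) - window, 0)
    for start in range(0, last_start + 1, step):
        counts = kmer_counts(read[start:start + window], k)
        if counts and max(counts.values()) >= min_top_count:
            hits.append(start)
    return hits


def greedy_unit(region, k=5):
    """Greedy walk on a de Bruijn graph whose nodes are (k-1)-mers and whose
    edges are k-mers weighted by count. Starting from the most frequent
    k-mer, always follow the heaviest unused edge until the walk returns to
    its start node. Returns one rotation of the unit, or "" if no cycle closes."""
    counts = kmer_counts(region, k)
    if not counts:
        return ""
    seed = max(counts, key=counts.get)
    start, node = seed[:-1], seed[1:]
    unit, used = seed[-1], {seed}
    while node != start:
        options = [node + base for base in "ACGT"
                   if counts[node + base] > 0 and node + base not in used]
        if not options:
            return ""  # the walk did not close into a cycle
        edge = max(options, key=counts.get)
        used.add(edge)
        unit += edge[-1]
        node = edge[1:]
    return unit


# Toy example: 30 error-free copies of a hypothetical 8-nt unit.
region = "ACGTTGCA" * 30
print(repeat_like_windows(region), greedy_unit(region))
```

On noisy reads, error-containing k-mers add low-weight side branches to the graph, which is why the walk prefers the heaviest edge at each node. A practical tool would need more careful handling of ties, branches and units longer than the window.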

OSTRFPD: Multifunctional Tool for Genome-Wide Short Tandem Repeat Analysis for DNA, Transcripts, and Amino Acid Sequences with Integrated Primer Designer

Evolutionary Bioinformatics ◽

10.1177/1176934319843130 ◽

2019 ◽

Vol 15 ◽

pp. 117693431984313

Author(s):

Vivek Bhakta Mathema ◽

Arjen M Dondorp ◽

Mallika Imwong

Keyword(s):

Tandem Repeat ◽

Short Tandem Repeat ◽

Tandem Repeats ◽

Gc Content ◽

Plasmodium Species ◽

Single Step ◽

Amino Acid Sequences ◽

Practical Implementation ◽

Genome Wide ◽

Short Tandem

Microsatellite mining is a common outcome of the in silico approach to genomic studies. The resulting short tandemly repeated DNA could be used as molecular markers for studying polymorphism, genotyping and forensics. The omni short tandem repeat finder and primer designer (OSTRFPD) is among the few versatile, platform-independent open-source tools written in Python that enables researchers to identify and analyse genome-wide short tandem repeats in both nucleic acids and protein sequences. OSTRFPD is designed to run either in a user-friendly fully featured graphical interface or in a command line interface mode for advanced users. OSTRFPD can detect both perfect and imperfect repeats of low complexity with customisable scores. Moreover, the software has built-in architecture to simultaneously filter selection of flanking regions in DNA and generate microsatellite-targeted primers implementing the Primer3 platform. The software has built-in motif-sequence generator engines and an additional option to use the dictionary mode for custom motif searches. The software generates search results including general statistics containing motif categorisation, repeat frequencies, densities, coverage, guanine–cytosine (GC) content, and simple text-based imperfect alignment visualisation. Thus, OSTRFPD presents users with a quick single-step solution package to assist development of microsatellite markers and categorise tandemly repeated amino acids in proteome databases. Practical implementation of OSTRFPD was demonstrated using publicly available whole-genome sequences of selected Plasmodium species. OSTRFPD is freely available and open-sourced for improvement and user-specific adaptation.
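
To make the notion of a perfect short tandem repeat concrete, here is a minimal regular-expression sketch that reports the position, unit and copy number of perfect repeats in a DNA string. It does not reproduce OSTRFPD's algorithm, scoring or options. The function name and the thresholds (min_unit, max_unit, min_copies) are hypothetical.

```python
import re


def find_perfect_strs(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat found.

    The unit group is matched lazily so the shortest repeating unit is
    preferred, and the backreference requires min_copies - 1 further copies.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Hypothetical sequence containing (AT)4 and (T)6.
print(find_perfect_strs("CGATATATATCCGTTTTTTAC"))
```

Imperfect repeats, which the abstract says OSTRFPD also detects, need an alignment-style or scoring approach and are outside the scope of this sketch.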
