Comparison of protein repeat classifications based on structure and sequence families

2015 ◽  
Vol 43 (5) ◽  
pp. 832-837 ◽  
Author(s):  
Lisanna Paladin ◽  
Silvio C.E. Tosatto

Tandem repeats (TR) in proteins are common in nature and have several unique functions. They come in various forms that are frequently difficult to recognize from a sequence. A previously proposed structural classification has been recently implemented in the RepeatsDB database. This defines five main classes, mainly based on repeat unit length, with subclasses representing specific folds. Sequence-based classifications, such as Pfam, provide an alternative classification based on evolutionarily conserved repeat families. Here, we discuss a detailed comparison between the structural classes in RepeatsDB and the corresponding Pfam repeat families and clans. Most instances are found to map one-to-one between structure and sequence. Some notable exceptions such as leucine-rich repeats (LRRs) and α-solenoids are discussed.

2002 ◽  
Vol 22 (3) ◽  
pp. 953-964 ◽  
Author(s):  
Peter A. Jauert ◽  
Sharon N. Edmiston ◽  
Kathleen Conway ◽  
David T. Kirkpatrick

ABSTRACT Minisatellite DNA is repetitive DNA with a repeat unit length from 15 to 100 bp. While stable during mitosis, it destabilizes during meiosis, altering both in length and in sequence composition. The basis for this instability is unknown. To investigate the factors controlling minisatellite stability, a minisatellite sequence 3′ of the human HRAS1 gene was introduced into the Saccharomyces cerevisiae genome, replacing the wild-type HIS4 promoter. The minisatellite tract exhibited the same phenotypes in yeast that it exhibited in mammalian systems. The insertion stimulated transcription of the HIS4 gene; mRNA production was detected at levels above those seen with the wild-type promoter. The insertion stimulated meiotic recombination and created a hot spot for initiation of double-strand breaks during meiosis in the regions immediately flanking the repetitive DNA. The tract length altered at a high frequency during meiosis, and both expansions and contractions in length were detected. Tract expansion, but not contraction, was controlled by the product of the RAD1 gene. RAD1 is the first gene identified that controls specifically the expansion of minisatellite tracts. A model for tract length alteration based on these results is presented.


1989 ◽  
Vol 9 (9) ◽  
pp. 3621-3629
Author(s):  
J T Joseph ◽  
S M Aldritt ◽  
T Unnasch ◽  
O Puijalon ◽  
D F Wirth

We have identified a conserved, repeated, and highly transcribed DNA element from the avian malarial parasite Plasmodium gallinaceum. The element produced multiple transcripts in both zygotes and asexual blood stages of this parasite. It was found to be highly conserved in all of five malarial species tested and hybridized at reduced stringency to other members of the phylum Apicomplexa, including the genera Babesia, Eimeria, Toxoplasma, and Theileria. The copy number of the element was about 15, and it had a circularly permuted restriction map with a repeat unit length of about 6.2 kilobases. It could be separated from the main genomic DNA by using sucrose gradients and agarose gels, and it migrated separately from the recognized Plasmodium chromosomes on pulsed-field gels. In the accompanying paper (S. M. Aldritt, J. T. Joseph, and D. F. Wirth, Mol. Cell. Biol. 9:3614-3620, 1989), evidence is presented that the element contains the mitochondrial genes for the protein cytochrome b and a fragment of the large rRNA. We postulate that this element is an episome in the mitochondria of the obligate parasites belonging to the phylum Apicomplexa.


Parasitology ◽  
2016 ◽  
Vol 144 (1) ◽  
pp. 37-47 ◽  
Author(s):  
RACHEL M. CHALMERS ◽  
GUY ROBINSON ◽  
EMILY HOTCHKISS ◽  
CLAIRE ALEXANDER ◽  
SOPHIE MAY ◽  
...  

SUMMARY Cryptosporidium parvum is the major cause of livestock and zoonotically-acquired human cryptosporidiosis. The ability to track sources of contamination and routes of transmission by further differentiation of isolates would assist risk assessment and outbreak investigations. Multiple-locus variable-number of tandem-repeats (VNTR) analysis provides a means for rapid characterization by fragment sizing and estimation of copy numbers, but structured, harmonized development has been lacking for Cryptosporidium spp. To investigate potential for application in C. parvum surveillance and outbreak investigations, we studied nine commonly used VNTR loci (MSA, MSD, MSF, MM5, MM18, MM19, MS9-Mallon, GP60 and TP14) for chromosome distribution, repeat unit length and heterogeneity, and flanking region proximity and conservation. To investigate performance in vitro, we compared these loci in 14 C. parvum samples by capillary electrophoresis in three laboratories. We found that many loci did not contain simple repeat units but were more complex, hindering calculations of repeat unit copy number for standardized reporting nomenclature. However, sequenced reference DNA enabled reproducible fragment sizing and inter-laboratory allele assignation based on size normalized to that of the sequenced fragments by both single round and nested polymerase chain reactions. Additional Cryptosporidium loci need to be identified and validated for robust inter-laboratory surveillance and outbreak investigations.
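
The copy-number estimate mentioned in this abstract can be illustrated with a minimal sketch. It assumes a locus made of a single simple repeat unit between flanking regions of known length, which the abstract notes is not true of many loci. The function name and the example sizes are hypothetical and are not taken from the paper.

```python
def estimate_copy_number(fragment_bp: float, flank_bp: int, unit_bp: int) -> float:
    """Estimate the number of repeat copies in a sized PCR fragment.

    Assumes a simple tandem repeat: the fragment is the flanking
    sequence plus a whole or partial number of identical repeat units.
    """
    if unit_bp <= 0:
        raise ValueError("repeat unit length must be positive")
    repeat_region_bp = fragment_bp - flank_bp
    if repeat_region_bp < 0:
        raise ValueError("fragment is shorter than its flanking regions")
    return repeat_region_bp / unit_bp


# Hypothetical locus: 120 bp of flanking sequence and a 6 bp repeat unit.
copies = estimate_copy_number(fragment_bp=186, flank_bp=120, unit_bp=6)
print(copies)  # 11.0
```

A non-integer result from such a calculation is one sign that the locus is more complex than a simple repeat, which is the difficulty the abstract describes for standardized reporting.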


2015 ◽  
Author(s):  
Tugce Bilgin Sonay ◽  
Tiago Carvalho ◽  
Mark Robinson ◽  
Maja Greminger ◽  
Michael Krützen ◽  
...  

Tandem repeats (TR) are stretches of DNA that are highly variable in length and mutate rapidly, and are thus an important source of genetic variation. This variation is highly informative for population and conservation genetics, and has also been associated with several pathological conditions and with gene expression regulation. However, genome-wide surveys of TR variation have been scarce due to the technical difficulties derived from short-read technology. Here, we explored the genome-wide diversity of TRs in a panel of 83 human and nonhuman great ape genomes, and their impact on gene expression evolution. We found that population and species diversity patterns can be efficiently captured with short TRs (repeat unit length 1-5 base pairs) with potential applications in conservation genetics. We also examined the potential evolutionary role of TRs in gene expression differences between humans and primates by using 30,275 larger TRs (repeat unit length 2-50 base pairs). About one third of the 13,035 one-to-one orthologous genes contained TRs within 5 kilobase pairs of their transcription start site, and had higher expression divergence than genes without such TRs. The same observation held for genes with repeats in their 3′ untranslated region, in introns, and in exons. Using our polymorphism data for the shortest TRs, we found that genes with polymorphic repeats in their promoters showed higher expression divergence in humans and chimpanzees compared to genes with fixed or no TRs in the promoters. Our findings highlight the potential contribution of TRs to recent human evolution through gene regulation.


Parasitology ◽  
2006 ◽  
Vol 134 (5) ◽  
pp. 637-650 ◽  
Author(s):  
M. C. BRUCE ◽  
A. MACHESO ◽  
M. R. GALINSKI ◽  
J. W. BARNWELL

SUMMARY Plasmodium malariae, a protozoan parasite that causes malaria in humans, has a global distribution in tropical and subtropical regions and is commonly found in sympatry with other Plasmodium species of humans. Little is known about the genetics or population structure of P. malariae. In the present study, we describe polymorphic genetic markers for P. malariae and present the first molecular epidemiological data for this parasite. Six microsatellite or minisatellite markers were validated using 76 P. malariae samples from a diverse geographical range. The repeat unit length varied from 2 to 17 bp, and up to 10 different alleles per locus were detected. Multiple genotypes of P. malariae were detected in 33 of 70 samples from humans with naturally acquired infection. Heterozygosity was calculated to be between 0·236 and 0·811. Allelic diversity was reduced for samples from South America and, at some loci, in samples from Thailand compared with those from Malawi. The number of unique multilocus genotypes defined using the 6 markers was significantly greater in Malawi than in Thailand, even when data from single genotype infections were used. There was a significant reduction in the multiplicity of infection in symptomatic infections compared with asymptomatic ones, suggesting that clinical episodes are usually caused by the expansion of a single genotype.
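
The abstract does not say which heterozygosity estimator was used. As an illustration only, the sketch below applies a widely used sample-size-corrected estimate of expected heterozygosity, H = n/(n − 1) × (1 − Σ pᵢ²), where pᵢ is the frequency of allele i at a locus and n is the number of alleles sampled. The function name and the example allele calls are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Sample-size-corrected expected heterozygosity at one locus.

    `alleles` holds one allele call per sampled parasite, e.g. fragment
    sizes in bp. The choice of estimator is an assumption; the abstract
    does not specify one.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele calls")
    counts = Counter(alleles)
    sum_sq_freq = sum((count / n) ** 2 for count in counts.values())
    return (n / (n - 1)) * (1 - sum_sq_freq)


# Hypothetical locus with two equally common alleles among four samples.
h = expected_heterozygosity([150, 150, 152, 152])
print(round(h, 3))  # 0.667
```

A locus where every sample carries the same allele gives 0, and the value approaches 1 as more alleles occur at similar frequencies.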


2000 ◽  
Vol 20 (23) ◽  
pp. 8996-9008 ◽  
Author(s):  
Andrea Herold ◽  
Mikita Suyama ◽  
João P. Rodrigues ◽  
Isabelle C. Braun ◽  
Ulrike Kutay ◽  
...  

ABSTRACT Vertebrate TAP (also called NXF1) and its yeast orthologue, Mex67p, have been implicated in the export of mRNAs from the nucleus. The TAP protein includes a noncanonical RNP-type RNA binding domain, four leucine-rich repeats, an NTF2-like domain that allows heterodimerization with p15 (also called NXT1), and a ubiquitin-associated domain that mediates the interaction with nucleoporins. Here we show that TAP belongs to an evolutionarily conserved family of proteins that has more than one member in higher eukaryotes. Not only the overall domain organization but also residues important for p15 and nucleoporin interaction are conserved in most family members. We characterize two of four human TAP homologues and show that one of them, NXF2, binds RNA, localizes to the nuclear envelope, and exhibits RNA export activity. NXF3, which does not bind RNA or localize to the nuclear rim, has no RNA export activity. Database searches revealed that although only one p15(nxt) gene is present in the Drosophila melanogaster and Caenorhabditis elegans genomes, there is at least one additional p15 homologue (p15-2 [also called NXT2]) encoded by the human genome. Both human p15 homologues bind TAP, NXF2, and NXF3. Together, our results indicate that the TAP-p15 mRNA export pathway has diversified in higher eukaryotes compared to yeast, perhaps reflecting a greater substrate complexity.


Gene ◽  
2002 ◽  
Vol 286 (1) ◽  
pp. 121-126 ◽  
Author(s):  
Massimiliano Bonafè ◽  
Cristiana Barbi ◽  
Fabiola Olivieri ◽  
Anatoli Yashin ◽  
Kirill F Andreev ◽  
...  

Development ◽  
1992 ◽  
Vol 114 (1) ◽  
pp. 221-232 ◽  
Author(s):  
P.M. Macdonald

Specification of the posterior body plan in Drosophila requires the action of a determinant prelocalized to the posterior pole of the embryo. During embryogenesis this determinant appears to move anteriorly in a process dependent on the pumilio (pum) gene. This report describes the cloning and molecular characterization of a cDNA derived from the pum gene, and the analysis of pum mRNA and protein expression during early Drosophila development. The pum gene is unusually large; comparison of genomic and cDNA sequences reveals that the pum transcription unit is at least 160 kb in length. The pum cDNA encodes a 157 × 10³ M_r protein which consists mainly of regions enriched in a single amino acid, usually glycine, alanine, glutamine or serine/threonine. Six tandem repeats of a 36 amino acid repeat unit are also present. Pum protein is cytoplasmic and is concentrated in a subcortical region of the embryo. The distribution of pum protein exhibits no asymmetry along the anteroposterior axis of the embryo.

