A Linear Algebra Approach to Fast DNA Mixture Analysis Using GPUs

Genetic Reconstruction and Forensic Analysis of Chinese Shandong and Yunnan Han Populations by Co-Analyzing Y Chromosomal STRs and SNPs

Genes ◽

10.3390/genes11070743 ◽

2020 ◽

Vol 11 (7) ◽

pp. 743

Author(s):

Caiyong Yin ◽

Kaiyuan Su ◽

Ziwei He ◽

Dian Zhai ◽

Kejian Guo ◽

...

Keyword(s):

Tandem Repeats ◽

Forensic Analysis ◽

Copy Number Variations ◽

Null Alleles ◽

Nucleotide Polymorphisms ◽

Evolutionary Pathway ◽

Individual Level ◽

Population Comparisons ◽

Heated Debate ◽

The Individual

Y chromosomal short tandem repeats (Y-STRs) have been widely harnessed for forensic applications, such as pedigree source searching from public security databases and male identification from male–female mixed samples. For various populations, databases composed of Y-STR haplotypes have been built to provide investigating leads for solving difficult or cold cases. Recently, the supplementary application of Y chromosomal haplogroup-determining single-nucleotide polymorphisms (SNPs) for forensic purposes has been under heated debate. This study provides Y-STR haplotypes for 27 markers typed by the Yfiler™ Plus kit and Y-SNP haplogroups defined by 24 loci within the Y-SNP Pedigree Tagging System for Shandong Han (n = 305) and Yunnan Han (n = 565) populations. The genetic backgrounds of these two populations were explicitly characterized by the analysis of molecular variance (AMOVA) and multi-dimensional scaling (MDS) plots based on 27 Y-STRs. Then, population comparisons were conducted by observing Y-SNP allelic frequencies and Y-SNP haplogroup distribution, estimating forensic parameters, and depicting distribution spectrums of Y-STR alleles in sub-haplogroups. The Y-STR variants, including null alleles, intermediate alleles, and copy number variations (CNVs), were co-listed, and a strong correlation between Y-STR allele variants (“DYS518~.2” alleles) and the Y-SNP haplogroup QR-M45 was observed. A network was reconstructed to illustrate the evolutionary pathway and to figure out the ancestral mutation event. Also, a phylogenetic tree on the individual level was constructed to observe the relevance of the Y-STR haplotypes to the Y-SNP haplogroups. This study provides evidence that basic genetic backgrounds, which were revealed by both Y-STR and Y-SNP loci, would be useful for uncovering detailed population differences and, more importantly, demonstrates the contributing role of Y-SNPs in population differentiation and male pedigree discrimination.

Isolated clusters of paired tandemly repeated sequences in the Xenopus laevis genome.

Molecular and Cellular Biology ◽

10.1128/mcb.4.2.254 ◽

1984 ◽

Vol 4 (2) ◽

pp. 254-259 ◽

Cited By ~ 5

Author(s):

D Carroll ◽

J E Garrett ◽

B S Lam

Keyword(s):

Xenopus Laevis ◽

Dna Sequences ◽

Tandem Repeats ◽

Genomic Organization ◽

Evolutionary Conservation ◽

Repeated Dna ◽

Flanking Sequence ◽

Base Pairs ◽

Selection For ◽

Tandemly Repeated Dna

There exist in the Xenopus laevis genome clusters of tandemly repeated DNA sequences, consisting of two types of 393-base-pair repeating unit. Each such cluster contains several units of one of these paired tandem repeats (PTR-1), followed by several units of the other repeat (PTR-2). The number of repeats of each type is variable from cluster to cluster and averages about seven of each type per cluster. Every cluster has ca. 1,000 base pairs of common left flanking sequence (adjacent to the PTR-1 repeats) and 1,000 base pairs of common right flanking sequence (adjacent to the PTR-2 repeats). Beyond these common flanks, the DNA sequences are different in the eight cloned genomic fragments we have studied. Thus, the hundreds of PTR clusters in the genome are dispersed at apparently unrelated sites. Nucleotide sequences of representative PTR-1 and PTR-2 repeats are 64% homologous. These sequences do not reveal an obvious function. However, the related species X. mulleri and X. borealis have sequences homologous to PTR-1 and PTR-2, which show the same repeat lengths and genomic organization. This evolutionary conservation suggests positive selection for the clusters. Maintenance of these sequences at dispersed sites imposes constraints on possible mechanisms of concerted evolution.

Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

The Scientific World JOURNAL ◽

10.1100/2012/365104 ◽

2012 ◽

Vol 2012 ◽

pp. 1-10 ◽

Cited By ~ 19

Author(s):

Chun-Tien Chang ◽

Chi-Neu Tsai ◽

Chuan Yi Tang ◽

Chun-Houh Chen ◽

Jang-Hau Lian ◽

...

Keyword(s):

Dna Sequences ◽

Copy Number ◽

Tandem Repeats ◽

Direct Sequencing ◽

Nucleotide Polymorphisms ◽

Dna Index ◽

Paralogous Genes ◽

Base Calling ◽

Mixed Sequence ◽

Reference Sequences

The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; and (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

INTERVERTEBRAL DISC DEGENERATION LINKED TO STRUCTURAL GENE VARIATIONS

Pakistan Journal of Medicine and Dentistry ◽

10.36283/pjmd8-4/012 ◽

2019 ◽

Author(s):

Saeeda Baig

Keyword(s):

Intervertebral Disc ◽

Disc Degeneration ◽

Intervertebral Disc Degeneration ◽

English Language ◽

Tandem Repeats ◽

Lumbar Disc ◽

Disease Process ◽

Nucleotide Polymorphisms ◽

Structural Protein ◽

Lumbar Disc Degeneration

In recent years, the focus has shifted from attributing intervertebral disc degeneration to physical exposure and strain toward linking it with a variety of genetic variations. The objective of this review is to provide an up-to-date summary of the existing research on the relation of intervertebral disc degeneration to structural protein genes and their polymorphisms, and thus to help establish further avenues where research into causation and treatment is needed. A comprehensive search of PubMed and Google Scholar was conducted using the keywords “Collagen”, “COL”, “Aggrecan”, “AGC”, “IVDD”, “intervertebral disc degeneration”, and “lumbar disc degeneration”, and literature in the English language published from 1991 to 2019 was selected. There are many genes involved in the production of structural components of an intervertebral disc. The issues in production of these components involve the over-expression or under-expression of their genes, as well as single nucleotide polymorphisms and variable numbers of tandem repeats affecting their structures. These structural genes include primarily the collagen and the aggrecan genes. While genetic and environmental factors all come into play with a disease process like disc degeneration, the bulk of research now shows a significantly larger impact of heredity than of exposure. While further research is needed into some of the lesser studied genes linked to IVDD and also the racial variations in genetic makeup, the focus in the near future should be on establishment of genetic testing to identify individuals at greater risk of disease and deliberation regarding the use of gene therapy to prevent disc degeneration.

Nucleotide polymorphisms in three genes support host and geographic speciation in tree pathogens belonging to Gremmeniella spp.

Canadian Journal of Botany ◽

10.1139/b02-103 ◽

2002 ◽

Vol 80 (11) ◽

pp. 1151-1159 ◽

Cited By ~ 8

Author(s):

M Dusabenyagasani ◽

G Laflamme ◽

R C Hamelin

Keyword(s):

North America ◽

Dna Sequences ◽

North American ◽

Abies Balsamea ◽

Host Specialization ◽

Rrna Genes ◽

Nucleotide Polymorphisms ◽

Group I ◽

Geographic Separation ◽

Pinus Spp

We detected nucleotide polymorphisms within the genus Gremmeniella in DNA sequences of β-tubulin, glyceraldehyde phosphate dehydrogenase, and mitochondrial small subunit rRNA (mtSSU rRNA) genes. A group-I intron was present in strains originating from fir (Abies spp.) in the mtSSU rRNA locus. This intron in the mtSSU rRNA locus of strains isolated from Abies sachalinensis (Fridr. Schmidt) M.T. Mast in Asia was also found in strains isolated from Abies balsamea (L.) Mill. in North America. Phylogenetic analyses yielded trees that grouped strains by host of origin with strong branch support. Asian strains of Gremmeniella abietina (Lagerberg) Morelet var. abietina isolated from fir (A. sachalinensis) were more closely related to G. abietina var. balsamea from North America, which is found on spruce (Picea spp.) and balsam fir, and European and North American races of G. abietina var. abietina from pines (Pinus spp.) were distantly related. Likewise, North American isolates of Gremmeniella laricina (Ettinger) O. Petrini, L.E. Petrini, G. Laflamme, & G.B. Ouellette, a pathogen of larch, were more closely related to G. laricina from Europe than to G. abietina var. abietina from North America. These data suggest that host specialization might have been the leading evolutionary force shaping Gremmeniella spp., with geographic separation acting as a secondary factor. Key words: Gremmeniella, geographic separation, host specialization, mitochondrial rRNA, nuclear genes.

Developing a Genetic System in Deinococcus radiodurans for Analyzing Mutations

Genetics ◽

10.1093/genetics/166.2.661 ◽

2004 ◽

Vol 166 (2) ◽

pp. 661-668

Author(s):

Mandy Kim ◽

Erika Wolff ◽

Tiffany Huang ◽

Lilit Garibyan ◽

Ashlee M Earl ◽

...

Keyword(s):

Dna Sequences ◽

Deinococcus Radiodurans ◽

Genetic System ◽

Base Change ◽

Base Substitution ◽

Wild Type ◽

Base Pairs ◽

E Coli ◽

Radiation Induced ◽

Induced Mutagenesis

We have applied a genetic system for analyzing mutations in Escherichia coli to Deinococcus radiodurans, an extremophile with an astonishingly high resistance to UV- and ionizing-radiation-induced mutagenesis. Taking advantage of the conservation of the β-subunit of RNA polymerase among most prokaryotes, we derived again in D. radiodurans the rpoB/Rif r system that we developed in E. coli to monitor base substitutions, defining 33 base change substitutions at 22 different base pairs. We sequenced >250 mutations leading to Rif r in D. radiodurans derived spontaneously in wild-type and uvrD (mismatch-repair-deficient) backgrounds and after treatment with N-methyl-N′-nitro-N-nitrosoguanidine (NTG) and 5-azacytidine (5AZ). The specificities of NTG and 5AZ in D. radiodurans are the same as those found for E. coli and other organisms. There are prominent base substitution hotspots in rpoB in both D. radiodurans and E. coli. In several cases these are at different points in each organism, even though the DNA sequences surrounding the hotspots and their corresponding sites are very similar in both D. radiodurans and E. coli. In one case the hotspots occur at the same site in both organisms.

Heteroalleles in Common Wheat: Multiple Differences between Allelic Variants of the Gli-B1 Locus

International Journal of Molecular Sciences ◽

10.3390/ijms22041832 ◽

2021 ◽

Vol 22 (4) ◽

pp. 1832

Author(s):

Eugene Metakovsky ◽

Laura Pascual ◽

Patrizia Vaccino ◽

Viktor Melnik ◽

Marta Rodriguez-Quijano ◽

...

Keyword(s):

Common Wheat ◽

Dna Sequences ◽

Fragment Length Polymorphism ◽

Snp Markers ◽

Group Iv ◽

Nucleotide Polymorphisms ◽

High Genetic Diversity ◽

Single Nucleotide ◽

Allelic Variants ◽

B Genome

The Gli-B1-encoded γ-gliadins and non-coding γ-gliadin DNA sequences for 15 different alleles of common wheat have been compared using seven tests: electrophoretic mobility (EM) and molecular weight (MW) of the encoded major γ-gliadin, restriction fragment length polymorphism patterns (RFLPs) (three different markers), known single nucleotide polymorphism (SNP) markers of the Gli-B1 γ-gliadin pseudogene, and sequencing of the pseudogene GAG56B. It was discovered that encoded γ-gliadins with contrasting EM had similar MWs. However, seven allelic variants (designated from I to VII) differed from one another in the other six tests: I (alleles Gli-B1i, k, m, o), II (Gli-B1n, q, s), III (Gli-B1b), IV (Gli-B1e, f, g), V (Gli-B1h), VI (Gli-B1d) and VII (Gli-B1a). Allele Gli-B1c (variant VIII) was identical to the alleles from group IV in four of the tests. Some tests might show a fine difference between alleles belonging to the same variant. Our results support the independent origin of at least seven variants at the Gli-B1 locus that might originate from deeply diverged genotypes of the donor(s) of the B genome in hexaploid wheat and therefore might be called “heteroallelic”. The donor’s particularities at the Gli-B1 locus might have been conserved since that time and decisively contribute to the current high genetic diversity of common wheat.

THE DNA OF CAENORHABDITIS ELEGANS

Genetics ◽

10.1093/genetics/77.1.95 ◽

1974 ◽

Vol 77 (1) ◽

pp. 95-104

Author(s):

J E Sulston ◽

S Brenner

Keyword(s):

Caenorhabditis Elegans ◽

Chemical Analysis ◽

Dna Sequences ◽

Base Composition ◽

Nematode Caenorhabditis Elegans ◽

5S Rna ◽

Base Pairs ◽

Small Component ◽

E Coli ◽

The Mean

Chemical analysis and a study of renaturation kinetics show that the nematode, Caenorhabditis elegans, has a haploid DNA content of 8 × 10^7 base pairs (20 times the genome of E. coli). Eighty-three percent of the DNA sequences are unique. The mean base composition is 36% GC; a small component, containing the rRNA cistrons, has a base composition of 51% GC. The haploid genome contains about 300 genes for 4S RNA, 110 for 5S RNA, and 55 for (18 + 28)S RNA.

A geometrical numerical linear algebra approach to residual-enhanced parametric model reduction

PAMM ◽

10.1002/pamm.201410396 ◽

2014 ◽

Vol 14 (1) ◽

pp. 831-832

Author(s):

Ralf Zimmermann

Keyword(s):

Model Reduction ◽

Linear Algebra ◽

Parametric Model ◽

Numerical Linear Algebra ◽

Algebra Approach
