Hierarchical modeling of genome-wide Short Tandem Repeat (STR) markers infers native American prehistory

Author(s):  
Cecil M. Lewis
2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Indhu-Shree Rajan-Babu ◽  
Junran J. Peng ◽  
Readman Chiu ◽  
Patricia Birch ◽  
Madeline Couse ◽  
...  

PLoS ONE ◽  
2014 ◽  
Vol 9 (8) ◽  
pp. e104182 ◽  
Author(s):  
Manosh Kumar Biswas ◽  
Qiang Xu ◽  
Christoph Mayer ◽  
Xiuxin Deng

2019 ◽  
Vol 15 ◽  
pp. 117693431984313
Author(s):  
Vivek Bhakta Mathema ◽  
Arjen M Dondorp ◽  
Mallika Imwong

Microsatellite mining is a common outcome of the in silico approach to genomic studies. The resulting short tandemly repeated DNA could be used as molecular markers for studying polymorphism, genotyping and forensics. The omni short tandem repeat finder and primer designer (OSTRFPD) is among the few versatile, platform-independent open-source tools written in Python that enables researchers to identify and analyse genome-wide short tandem repeats in both nucleic acids and protein sequences. OSTRFPD is designed to run either in a user-friendly fully featured graphical interface or in a command line interface mode for advanced users. OSTRFPD can detect both perfect and imperfect repeats of low complexity with customisable scores. Moreover, the software has built-in architecture to simultaneously filter selection of flanking regions in DNA and generate microsatellite-targeted primers implementing the Primer3 platform. The software has built-in motif-sequence generator engines and an additional option to use the dictionary mode for custom motif searches. The software generates search results including general statistics containing motif categorisation, repeat frequencies, densities, coverage, guanine–cytosine (GC) content, and simple text-based imperfect alignment visualisation. Thus, OSTRFPD presents users with a quick single-step solution package to assist development of microsatellite markers and categorise tandemly repeated amino acids in proteome databases. Practical implementation of OSTRFPD was demonstrated using publicly available whole-genome sequences of selected Plasmodium species. OSTRFPD is freely available and open-sourced for improvement and user-specific adaptation.
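As a rough illustration of what detecting perfect repeats involves, the sketch below scans a DNA string for perfect STRs with a backreference regular expression and reports the GC content. It is not OSTRFPD's algorithm or API: the function names (`find_perfect_strs`, `gc_content`) and the default thresholds are assumptions made for this example only.

```python
import re


def find_perfect_strs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return (start, end, motif, repeat_count) for each perfect tandem repeat.

    Hypothetical helper for illustration; thresholds are arbitrary defaults.
    Coordinates are 0-based, with an exclusive end.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A motif of unit_len bases followed by at least (min_repeats - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves built from a shorter unit
            # (for example "CACA" is reported as "CA" instead).
            redundant = any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if redundant:
                continue
            repeats = (match.end() - match.start()) // unit_len
            hits.append((match.start(), match.end(), motif, repeats))
    return sorted(hits)


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    example = "GGATCACACACACATGCAGCAGCAGTT"
    for start, end, motif, repeats in find_perfect_strs(example):
        print(start, end, motif, repeats)
    print(round(gc_content(example), 3))
```

Imperfect repeats, scoring, flanking-region filtering and primer design, which the abstract lists as features of the tool, are outside the scope of this sketch.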


Author(s):  
Merlijn H.I. van Haren ◽  
Theun de Groot ◽  
Bram Spruijtenburg ◽  
Kusum Jain ◽  
Anuradha Chowdhary ◽  
...  

Candida krusei is a human pathogenic yeast that can cause candidemia with the lowest 90-day survival rate in comparison to other Candida species. Infections occur frequently in immunocompromised patients and several C. krusei outbreaks in health care facilities have been described. Here, we developed a short tandem repeat (STR) typing scheme for C. krusei to allow for fast and cost-effective genotyping of an outbreak and compared identified relatedness of ten isolates to SNP calling from whole-genome sequencing (WGS). From a selection of 14 novel STR markers, six were used to develop two multiplex PCRs. Additionally, three previously reported markers were selected for a third multiplex PCR. In total, 119 C. krusei isolates were typed using these nine markers and 79 different genotypes were found. STR typing correlated well with WGS SNP typing, as isolates with the same STR genotype varied by 8 and 19 SNPs, while isolates that differed in all STR markers varied at least tens of thousands of SNPs. The STR typing assay was found to be specific for C. krusei, stable in 100 subcloned generations, and comparable to SNP calling by WGS. In summary, this newly developed C. krusei STR typing scheme is a fast, reliable, easy-to-interpret and cost-effective method compared to other typing methods. Moreover, the two newly developed multiplexes showed the same discriminatory power as all nine markers combined, indicating that multiplexes M3-1 and M9 are sufficient to type C. krusei.


2018 ◽  
Vol 64 (09/2018) ◽  
Author(s):  
Raluca Dumache ◽  
Alexandra Enache ◽  
Ligia Barbarii ◽  
Carmen Constantinescu ◽  
Andreea Pascalau ◽  
...  

Blood ◽  
2011 ◽  
Vol 118 (21) ◽  
pp. 4004-4004
Author(s):  
Hye Ran Kim ◽  
Eun-Jeong Won ◽  
Hyun-Jung Choi ◽  
Hwan-Young Kim ◽  
James Moon ◽  
...  

Abstract 4004. Background: Mitochondrial DNA (mtDNA) is widely used in forensic identification and anthropologic studies on account of its abundance, which results in preferential amplification and sequencing, and its inherent variability. We developed mtDNA markers to monitor donor cell engraftment after allogeneic stem cell transplantation (SCT) and compared them with nuclear short tandem repeat (STR) markers. Patients and methods: The mtDNA control regions and six mtDNA minisatellites (mtMS) (303 poly C, 16184 poly C, 514 (CA) repeat, 3566 poly C, 12385 poly C and 12418 poly A) from the total DNA samples of 215 cases (donor, recipient and after allogeneic SCT) were amplified by PCR using the designated specific primers. The results were compared with those from six STR markers (D12S391, D18S51, F13A1, HUM RENA-4, HUM FABP2 and Amelogenin). Results: Polymorphisms in the mtDNA control region identified an informative marker in 88% (189 cases) of all cases. Among the six mtMS markers, the informativeness of 303 poly C and 16184 poly C mtMS was 63% and 67%, respectively. A combination of direct sequencing through the mtDNA control region, 303 poly C and 16184 poly C mtMS could completely distinguish the donor cells from the recipient cells. A typical mixing experiment to determine sensitivity showed that the detection limit (DL) of gene scan analysis in an mtDNA mixture was 1% heteroplasmy for the 303 poly C mtMS marker. However, the DL from D12S391 in the same mixing experiment was 5–10% heteroplasmy. Conclusions: mtMS markers, especially the 303 poly C and 16184 poly C markers, can provide a sensitive, accurate and quantitative determination of mixed chimerism after SCT. Disclosures: No relevant conflicts of interest to declare.


2020 ◽  
Author(s):  
Indhu-Shree Rajan-Babu ◽  
Junran Peng ◽  
Readman Chiu ◽  
Arezoo Mohajeri ◽  
Egor Dolzhenko ◽  
...  

ABSTRACT Short tandem repeat (STR) expansions cause several neurological and neuromuscular disorders. Screening for STR expansions in genome-wide (exome and genome) sequencing data can enable diagnosis, optimal clinical management/treatment, and accurate genetic counselling of patients with repeat expansion disorders. We assessed the performance of lobSTR, HipSTR, RepeatSeq, ExpansionHunter, TREDPARSE, GangSTR, STRetch, and exSTRa – bioinformatics tools that have been developed to detect and/or genotype STR expansions – on experimental and simulated genome sequence data with known STR expansions aligned using two different aligners, Isaac and BWA. We then adjusted the parameter settings to optimize the sensitivity and specificity of the STR tools and fed the optimized results into a machine-learning decision tree classifier to determine the best combination of tools to detect full mutation expansions with high diagnostic sensitivity and specificity. The decision tree model supported using ExpansionHunter’s full mutation calls with those of either STRetch or exSTRa for detection of full mutations with precision, recall, and F1-score of 90%, 100%, and 95%, respectively. We used this pipeline to screen the BWA-aligned exome or genome sequence data of 306 families of children with suspected genetic disorders for pathogenic expansions of known disease STR loci. We identified 27 samples, 17 with an apparent full-mutation expansion of the AR, ATXN1, ATXN2, ATXN8, DMPK, FXN, HTT, or TBP locus, nine with an intermediate or premutation allele in the FMR1 locus, and one with a borderline allele in the ATXN2 locus. We report the concordance between our bioinformatics findings and the clinical PCR results in a subset of these samples. Implementation of our bioinformatics workflow can improve the detection of disease STR expansions in exome and genome sequence diagnostics and enhance clinical outcomes for patients with repeat expansion disorders.
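The three reported metrics are consistent with one another, since the F1-score is the harmonic mean of precision and recall. The short check below is only an arithmetic sketch; the helper name `f1_score` is introduced here for illustration and is not part of the authors' pipeline.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision 90% and recall 100%, as reported in the abstract.
    print(round(f1_score(0.90, 1.00), 3))  # 0.947, which rounds to the reported 95%
```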

