Longitudinal samples of bacterial genomes potentially bias evolutionary analyses

Mapping Intimacies ◽ 10.1101/103465 ◽ 2017

Author(s): B.J. Arnold ◽ W.P. Hanage

Keyword(s): Genomic Data ◽ Real Data ◽ Sampling Bias ◽ Bacterial Genomes ◽ Clock Rate ◽ Rare Mutations ◽ Constant Size ◽ Population Genomic ◽ Genomic Studies ◽ Longitudinal Sampling

Samples of bacteria collected over a period of time are attractive for several reasons, including the ability to estimate the molecular clock rate and to detect fluctuations in allele frequencies over time. However, longitudinal datasets are occasionally used in analyses that assume samples were collected contemporaneously. Using both simulations and genomic data from Neisseria gonorrhoeae, Streptococcus mutans, Campylobacter jejuni, and Helicobacter pylori, we show that longitudinal samples (spanning more than a decade in real data) may suffer from considerable bias that inflates estimates of recombination and the number of rare mutations in a sample of genomic sequences. While longitudinal data are frequently accounted for using the serial coalescent, many studies use other programs or metrics, such as Tajima’s D, that are sensitive to these sampling biases and contain genomic data collected across many years. Notably, longitudinal samples from a population of constant size may exhibit evidence of exponential growth. We suggest that population genomic studies of bacteria should routinely account for temporal diversity in samples or provide evidence that longitudinal sampling bias does not affect conclusions.
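As an illustration of the kind of metric the abstract refers to, the following is a minimal sketch of Tajima's D computed with the standard Tajima (1989) formula. It is not the authors' pipeline: the function name `tajimas_d` and the toy alignments are hypothetical, and the sketch assumes equal-length aligned sequences with no gaps or missing data, all treated as sampled at the same time point. That last assumption is exactly the one the abstract says longitudinal samples violate.

```python
import math
from collections import Counter


def tajimas_d(sequences):
    """Tajima's D for aligned, equal-length sequences (hypothetical helper)."""
    n = len(sequences)
    if n < 4:
        # The variance term of the statistic is zero for n < 4.
        raise ValueError("need at least 4 sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    seg_sites = 0          # S: number of segregating sites
    pairwise_diffs = 0.0   # differences summed over all pairs and sites
    for column in zip(*sequences):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # Number of sequence pairs that differ at this site.
            pairwise_diffs += (n * n - sum(c * c for c in counts.values())) / 2
    if seg_sites == 0:
        return math.nan  # D is undefined without variation

    pi = pairwise_diffs / (n * (n - 1) / 2)  # mean pairwise differences

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Toy alignment in which every variant is a singleton (rare mutations only).
rare = ["TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAT"]
# Toy alignment in which every variant sits at intermediate frequency.
balanced = ["AA", "AA", "TT", "TT"]
print(tajimas_d(rare), tajimas_d(balanced))
```

In this sketch an excess of rare mutations pushes D below zero, which is the signal usually read as population growth. This is why inflated counts of rare mutations in longitudinal samples could be mistaken for expansion.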


Origin and distribution of different retrotransposons in different taxa

Genetics & Applications ◽ 10.31383/ga.vol2iss2pp13-19 ◽ 2018 ◽ Vol 2 (2) ◽ pp. 13

Author(s): Buket Cakmak Guner ◽ Nermin Gozukirmizi

Keyword(s): Human Genome ◽ Population Studies ◽ Genomic Data ◽ Data Sets ◽ Human Genetic Variation ◽ Chain Reaction ◽ Population Genomic ◽ Genomic Studies ◽ Polymerase Chain ◽ Genome Analyses

Novel genome analysis technologies enable genomic studies of transposable elements (TEs) in different organisms. Population studies of the human genome reveal thousands of individual TE insertions. These insertions are an important source of natural human genetic variation. Researchers are beginning to develop population genomic data sets for evaluating the phenotypic impact of human TE polymorphisms. Because there is evidence of horizontal transfer of retrotransposons between the genomes of different species, in this study we aimed to detect barley retrotransposons (Nikita and BAGY2) in the human genome. Inter-retrotransposon amplified polymorphism polymerase chain reaction (IRAP-PCR) was used to measure the distribution of Nikita and BAGY2 retroelements in the human genome. The analyses reveal that Nikita and BAGY2 are present in the human genome and show different distributions in the genome. The polymorphism ratios of the retroelements suggest that Nikita and BAGY2 have been active retrotransposons in the human genome.


Population genomic data in spider mites point to a role for local adaptation in shaping range shifts

Evolutionary Applications ◽ 10.1111/eva.13086 ◽ 2020 ◽ Vol 13 (10) ◽ pp. 2821-2835

Author(s): Lei Chen ◽ Jing‐Tao Sun ◽ Peng‐Yu Jin ◽ Ary A. Hoffmann ◽ Xiao‐Li Bing ◽ ...

Keyword(s): Local Adaptation ◽ Genomic Data ◽ Spider Mites ◽ Range Shifts ◽ Population Genomic


Tandem Repeats in Bacillus: Unique Features and Taxonomic Distribution

International Journal of Molecular Sciences ◽ 10.3390/ijms22105373 ◽ 2021 ◽ Vol 22 (10) ◽ pp. 5373

Author(s): Juan A. Subirana ◽ Xavier Messeguer

Keyword(s): Tandem Repeats ◽ Bacterial Species ◽ Individual Species ◽ Large Set ◽ Bacterial Genomes ◽ Rna Molecules ◽ Variable Sequence ◽ Repeat Size ◽ Genomic Studies ◽ Dna Tandem Repeats

Little is known about DNA tandem repeats across prokaryotes. We have recently described an enigmatic group of tandem repeats in bacterial genomes with a constant repeat size but variable sequence. These findings strongly suggest that tandem repeat size in some bacteria is under strong selective constraints. Here, we extend these studies and describe tandem repeats in a large set of Bacillus. Some species have very few repeats, while other species have a large number. Most tandem repeats have a constant repeat size (either 52 or 20–21 nt) but a variable sequence. We characterize these intriguing tandem repeats in detail. Individual species have several families of tandem repeats with the same repeat length and different sequences. This result is in strong contrast with eukaryotes, where tandem repeats of many sizes are found in any species. We discuss the possibility that they are transcribed as small RNA molecules. They may also be involved in the stabilization of the nucleoid through interaction with proteins. We also show that the distribution of tandem repeats in different species has taxonomic significance. The data we present for all tandem repeats and their families in these bacterial species will be useful for further genomic studies.


Inferring Adaptive Introgression Using Hidden Markov Models

10.1101/2020.08.02.232934 ◽ 2020 ◽ Cited By ~ 1

Author(s): Jesper Svedberg ◽ Vladimir Shchur ◽ Solomon Reinman ◽ Rasmus Nielsen ◽ Russell Corbett-Detig

Keyword(s): Markov Models ◽ Hidden Markov ◽ Genomic Data ◽ Diverse Populations ◽ Powerful Method ◽ Adaptive Genetic Variation ◽ Adaptive Introgression ◽ Population Genomic ◽ Significant Interest ◽ Selection Parameters

Adaptive introgression - the flow of adaptive genetic variation between species or populations - has attracted significant interest in recent years and it has been implicated in a number of cases of adaptation, from pesticide resistance and immunity, to local adaptation. Despite this, methods for identification of adaptive introgression from population genomic data are lacking. Here, we present Ancestry_HMM-S, a Hidden Markov Model based method for identifying genes undergoing adaptive introgression and quantifying the strength of selection acting on them. Through extensive validation, we show that this method performs well on moderately sized datasets for realistic population and selection parameters. We apply Ancestry_HMM-S to a dataset of an admixed Drosophila melanogaster population from South Africa and we identify 17 loci which show signatures of adaptive introgression, four of which have previously been shown to confer resistance to insecticides. Ancestry_HMM-S provides a powerful method for inferring adaptive introgression in datasets that are typically collected when studying admixed populations. This method will enable powerful insights into the genetic consequences of admixture across diverse populations. Ancestry_HMM-S can be downloaded from https://github.com/jesvedberg/Ancestry_HMM-S/.


Batch effects in population genomic studies with low‐coverage whole genome sequencing data: causes, detection, and mitigation

Molecular Ecology Resources ◽ 10.1111/1755-0998.13559 ◽ 2021

Author(s): Runyang Nicolas Lou ◽ Nina Overgaard Therkildsen

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Whole Genome Sequencing Data ◽ Whole Genome ◽ Batch Effects ◽ Sequencing Data ◽ Population Genomic ◽ Genomic Studies ◽ Low Coverage


Clinical Value of NGS Genomic Studies for Clinical Management of Pediatric and Young Adult Bone Sarcomas

Cancers ◽ 10.3390/cancers13215436 ◽ 2021 ◽ Vol 13 (21) ◽ pp. 5436

Author(s): Miriam Gutiérrez-Jimeno ◽ Piedad Alba-Pavón ◽ Itziar Astigarraga ◽ Teresa Imízcoz ◽ Elena Panizo-Morgado ◽ ...

Keyword(s): Targeted Therapy ◽ Young Adult ◽ Clinical Management ◽ Genomic Data ◽ Myxoid Liposarcoma ◽ Genetic Alteration ◽ Clinical Value ◽ Genomic Studies ◽ Risk Patients ◽ Genomic Characterization

Genomic techniques enable diagnosis and management of children and young adults with sarcomas by identifying high-risk patients and those who may benefit from targeted therapy or participation in clinical trials. Objective: to analyze the performance of an NGS gene panel for the clinical management of pediatric sarcoma patients. We studied 53 pediatric and young adult patients diagnosed with sarcoma, from two Spanish centers. Genomic data were obtained using the Oncomine Childhood Cancer Research Assay, and categorized according to their diagnostic, predictive, or prognostic value. In 44 (83%) of the 53 patients, at least one genetic alteration was identified. In 80% of these patients, the diagnosis was obtained (n = 11) or changed (n = 9), and thus genomic data affected therapy. The most frequent initial misdiagnosis was Ewing’s sarcoma, instead of myxoid liposarcoma (FUS-DDIT3), rhabdoid soft tissue tumor (SMARCB1), or angiomatoid fibrous histiocytoma (EWSR1-CREB1). In our series, two patients had a genetic alteration with an FDA-approved targeted therapy, and 30% had at least one potentially actionable alteration. NGS-based genomic studies are useful and feasible in diagnosis and clinical management of pediatric sarcomas. Genomic characterization of these rare and heterogeneous tumors also helps in the search for prognostic biomarkers and therapeutic opportunities.


Estimating Seven Coefficients of Pairwise Relatedness Using Population-Genomic Data

Genetics ◽ 10.1534/genetics.116.190660 ◽ 2017 ◽ Vol 206 (1) ◽ pp. 105-118 ◽ Cited By ~ 12

Author(s): Matthew S. Ackerman ◽ Parul Johri ◽ Ken Spitze ◽ Sen Xu ◽ Thomas G. Doak ◽ ...

Keyword(s): Genomic Data ◽ Population Genomic


Evolution and Adaptation of Forest and Crop Pathogens in the Anthropocene

Phytopathology ◽ 10.1094/phyto-08-20-0358-fi ◽ 2020 ◽ pp. PHYTO-08-20-035

Author(s): Pauline Hessenauer ◽ Nicolas Feau ◽ Upinder Gill ◽ Benjamin Schwessinger ◽ Gurcharn S. Brar ◽ ...

Keyword(s): Population Genomics ◽ Production Systems ◽ Management Strategies ◽ Forest Trees ◽ Forest Production ◽ Population Genomic ◽ Genomic Studies ◽ Pathogen Populations ◽ Different Characteristics ◽ Global Food

The Anthropocene marks the era in which human activity is having a significant impact on Earth and its ecological and biogeographical systems. The domestication and intensification of agricultural and forest production systems have had a large impact on plant and tree health. Some pathogens have benefitted from these human activities and have evolved and adapted in response to the expansion of crop and forest systems, resulting in global outbreaks. Global pathogen genomics data, including population genomics and high-quality reference assemblies, are crucial for understanding the evolution and adaptation of pathogens. Crops and forest trees have remarkably different characteristics, such as reproductive time and the level of domestication. They also have different production systems for disease management, with more intensive management in crops than in forest trees. By comparing and contrasting results from pathogen population genomic studies done on widely different agricultural and forest production systems, we can improve our understanding of pathogen evolution and adaptation to different selection pressures. We find that in spite of these differences, similar processes such as hybridization, host jumps, selection, specialization, and clonal expansion are shaping the pathogen populations in both crops and forest trees. We propose some solutions to reduce these impacts and lower the probability of global pathogen outbreaks so that we can envision better management strategies to sustain global food production as well as ecosystem services.


Measures of the Degree of Departure from Ignorable Sample Selection

Journal of Survey Statistics and Methodology ◽ 10.1093/jssam/smz023 ◽ 2019 ◽ Vol 8 (5) ◽ pp. 932-964 ◽ Cited By ~ 1

Author(s): Roderick J A Little ◽ Brady T West ◽ Philip S Boonstra ◽ Jingwei Hu

Keyword(s): Sample Selection ◽ Real Data ◽ Sampling Bias ◽ Inclusion Probability ◽ Probability Sampling ◽ Simple Index ◽ Central Value ◽ Credible Intervals ◽ Standardized Measure ◽ Fully Bayesian

With the current focus of survey researchers on “big data” that are not selected by probability sampling, measures of the degree of potential sampling bias arising from this nonrandom selection are sorely needed. Existing indices of this degree of departure from probability sampling, like the R-indicator, are based on functions of the propensity of inclusion in the sample, estimated by modeling the inclusion probability as a function of auxiliary variables. These methods are agnostic about the relationship between the inclusion probability and survey outcomes, which is a crucial feature of the problem. We propose a simple index of degree of departure from ignorable sample selection that corrects this deficiency, which we call the standardized measure of unadjusted bias (SMUB). The index is based on normal pattern-mixture models for nonresponse applied to this sample selection problem and is grounded in the model-based framework of nonignorable selection first proposed in the context of nonresponse by Don Rubin in 1976. The index depends on an inestimable parameter that measures the deviation from selection at random, which ranges between the values zero and one. We propose the use of a central value of this parameter, 0.5, for computing a point index, and computing the values of SMUB at zero and one to provide a range of the index in a sensitivity analysis. We also provide a fully Bayesian approach for computing credible intervals for the SMUB, reflecting uncertainty in the values of all of the input parameters. The proposed methods have been implemented in R and are illustrated using real data from the National Survey of Family Growth.


Equitable Expanded Carrier Screening Needs Indigenous Clinical and Population Genomic Data

The American Journal of Human Genetics ◽ 10.1016/j.ajhg.2020.06.005 ◽ 2020 ◽ Vol 107 (2) ◽ pp. 175-182

Author(s): Simon Easteal ◽ Ruth M. Arkell ◽ Renzo F. Balboa ◽ Shayne A. Bellingham ◽ Alex D. Brown ◽ ...

Keyword(s): Genomic Data ◽ Carrier Screening ◽ Expanded Carrier Screening ◽ Population Genomic
