Removing reference bias and improving indel calling in ancient DNA data analysis by mapping to a sequence variation graph

2019 ◽  
Author(s):  
Rui Martiniano ◽  
Erik Garrison ◽  
Eppie R. Jones ◽  
Andrea Manica ◽  
Richard Durbin

Abstract Background During the last decade, the analysis of ancient DNA (aDNA) sequence has become a powerful tool for the study of past human populations. However, the degraded nature of aDNA means that aDNA molecules are short and frequently mutated by post-mortem chemical modifications. These features decrease read mapping accuracy and increase reference bias, in which reads containing non-reference alleles are less likely to be mapped than those containing reference alleles. Recently, alternative approaches for read mapping and genetic variation analysis have been developed that replace the linear reference by a variation graph which includes known alternative variants at each genetic locus. Here, we evaluate the use of variation graph software vg to avoid reference bias for ancient DNA and compare our approach to existing methods. Results We used vg to align simulated and real aDNA samples to a variation graph containing 1000 Genome Project variants, and compared these with the same data aligned with bwa to the human linear reference genome. We show that use of vg leads to a balanced allelic representation at polymorphic sites, effectively removing reference bias, and more sensitive variant detection in comparison with bwa, especially for insertions and deletions (indels). Alternative approaches that use relaxed bwa parameter settings or filter bwa alignments can also reduce bias, but can have lower sensitivity than vg, particularly for indels. Conclusions Our findings demonstrate that aligning aDNA sequences to variation graphs effectively mitigates the impact of reference bias when analysing aDNA, while retaining mapping sensitivity and allowing detection of variation, in particular indel variation, that was previously missed.

2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Rui Martiniano ◽  
Erik Garrison ◽  
Eppie R. Jones ◽  
Andrea Manica ◽  
Richard Durbin

Abstract Background During the last decade, the analysis of ancient DNA (aDNA) sequence has become a powerful tool for the study of past human populations. However, the degraded nature of aDNA means that aDNA molecules are short and frequently mutated by post-mortem chemical modifications. These features decrease read mapping accuracy and increase reference bias, in which reads containing non-reference alleles are less likely to be mapped than those containing reference alleles. Alternative approaches have been developed to replace the linear reference with a variation graph which includes known alternative variants at each genetic locus. Here, we evaluate the use of variation graph software vg to avoid reference bias for aDNA and compare with existing methods. Results We use vg to align simulated and real aDNA samples to a variation graph containing 1000 Genome Project variants and compare with the same data aligned with bwa to the human linear reference genome. Using vg leads to a balanced allelic representation at polymorphic sites, effectively removing reference bias, and more sensitive variant detection in comparison with bwa, especially for insertions and deletions (indels). Alternative approaches that use relaxed bwa parameter settings or filter bwa alignments can also reduce bias but can have lower sensitivity than vg, particularly for indels. Conclusions Our findings demonstrate that aligning aDNA sequences to variation graphs effectively mitigates the impact of reference bias when analyzing aDNA, while retaining mapping sensitivity and allowing detection of variation, in particular indel variation, that was previously missed.
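
The "balanced allelic representation" described in the abstract above can be made concrete with a small calculation. The sketch below is illustrative only and is not the authors' pipeline: it assumes that per-site read counts for the reference and alternate allele are already available at sites known to be heterozygous, and it pools them into a single reference-allele fraction. Without reference bias this fraction is expected to be close to 0.5; values above 0.5 indicate that reads carrying the reference allele are over-represented. The function name and the read counts are hypothetical toy values, not results from any study.

```python
from typing import Iterable, Tuple


def reference_allele_fraction(counts: Iterable[Tuple[int, int]]) -> float:
    """Return the pooled fraction of reads carrying the reference allele.

    `counts` holds one (ref_reads, alt_reads) pair per heterozygous site.
    """
    ref_total = 0
    alt_total = 0
    for ref_reads, alt_reads in counts:
        ref_total += ref_reads
        alt_total += alt_reads
    total = ref_total + alt_total
    if total == 0:
        raise ValueError("no reads observed at the supplied sites")
    return ref_total / total


# Toy (made-up) read counts at three heterozygous sites.
# `biased_counts` mimics an alignment that loses some alternate-allele reads;
# `balanced_counts` mimics an alignment with no preference for either allele.
biased_counts = [(12, 8), (9, 7), (11, 9)]
balanced_counts = [(10, 10), (8, 8), (10, 10)]

biased_fraction = reference_allele_fraction(biased_counts)
balanced_fraction = reference_allele_fraction(balanced_counts)

print(f"biased alignment:   {biased_fraction:.3f}")
print(f"balanced alignment: {balanced_fraction:.3f}")
```

Pooling counts across sites, rather than averaging per-site ratios, keeps low-coverage sites from dominating the estimate, which matters for the low sequencing depths typical of aDNA.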


Author(s):  
Adrien Oliva ◽  
Raymond Tobler ◽  
Alan Cooper ◽  
Bastien Llamas ◽  
Yassine Souilmi

Abstract The current standard practice for assembling individual genomes involves mapping millions of short DNA sequences (also known as DNA ‘reads’) against a pre-constructed reference genome. Mapping vast amounts of short reads in a timely manner is a computationally challenging task that inevitably produces artefacts, including biases against alleles not found in the reference genome. This reference bias and other mapping artefacts are expected to be exacerbated in ancient DNA (aDNA) studies, which rely on the analysis of low quantities of damaged and very short DNA fragments (~30–80 bp). Nevertheless, the current gold-standard mapping strategies for aDNA studies have effectively remained unchanged for nearly a decade, during which time new software has emerged. In this study, we used simulated aDNA reads from three different human populations to benchmark the performance of 30 distinct mapping strategies implemented across four different read mapping software—BWA-aln, BWA-mem, NovoAlign and Bowtie2—and quantified the impact of reference bias in downstream population genetic analyses. We show that specific NovoAlign, BWA-aln and BWA-mem parameterizations achieve high mapping precision with low levels of reference bias, particularly after filtering out reads with low mapping qualities. However, unbiased NovoAlign results required the use of an IUPAC reference genome. While relevant only to aDNA projects where reference population data are available, the benefit of using an IUPAC reference demonstrates the value of incorporating population genetic information into the aDNA mapping process, echoing recent results based on graph genome representations.


2018 ◽  
Author(s):  
Torsten Günther ◽  
Carl Nettelblad

Abstract High quality reference genomes are an important resource in genomic research projects. A consequence is that DNA fragments carrying the reference allele will be more likely to map successfully, or receive higher quality scores. This reference bias can have effects on downstream population genomic analysis when heterozygous sites are falsely considered homozygous for the reference allele. In palaeogenomic studies of human populations, mapping against the human reference genome is used to identify endogenous human sequences. Ancient DNA studies usually operate with low sequencing coverages and fragmentation of DNA molecules causes a large proportion of the sequenced fragments to be shorter than 50 bp – reducing the amount of accepted mismatches, and increasing the probability of multiple matching sites in the genome. These ancient DNA specific properties are potentially exacerbating the impact of reference bias on downstream analyses, especially since most studies of ancient human populations use pseudohaploid data, i.e. they randomly sample only one sequencing read per site. We show that reference bias is pervasive in published ancient DNA sequence data of pre-historic humans with some differences between individual genomic regions. We illustrate that the strength of reference bias is negatively correlated with fragment length. Reference bias can cause differences in the results of downstream analyses such as population affinities, heterozygosity estimates and estimates of archaic ancestry. These spurious results highlight how important it is to be aware of these technical artifacts and that we need strategies to mitigate the effect. Therefore, we suggest some post-mapping filtering strategies to resolve reference bias which help to reduce its impact substantially.


2004 ◽  
Vol 34 (1) ◽  
pp. 113-124 ◽  
Author(s):  
K. R. BRUCE ◽  
H. STEIGER ◽  
N. M. KOERNER ◽  
M. ISRAEL ◽  
S. N. YOUNG

Background. Separate lines of research link lowered serotonin tone to interpersonal submissiveness and bulimia nervosa (BN). We explored the impact of co-morbid avoidant personality disorder (APD), as a proxy for submissiveness, on behavioural inhibition and serotonin function in women with BN. Method. Participants included women with BN with co-morbid APD (BNA+, N=13); women with BN but without APD (BNA−, N=23), and control women with neither BN nor APD (N=23). The women were assessed for psychopathological tendencies and eating disorder symptoms, and participated in a computerized laboratory task that measured behavioural inhibition and disinhibition. Participants also provided blood samples for measurement of serial prolactin responses following oral administration of the partial 5-HT agonist meta-chlorophenylpiperazine (m-CPP). Results. The BNA+ group had higher scores than the other groups on self-report measures of submissiveness, social avoidance, restricted emotional expression, affective instability and self-harming behaviours. Compared with the other groups, the BNA+ group tended to be more inhibited under cues for punishment on the computerized task and to have blunted prolactin response following m-CPP. The bulimic groups did not differ from each other on current eating symptoms or on frequencies of other mental disorders. Conclusions. Findings indicate that women with BN and co-morbid APD may be characterized by interpersonal submissiveness and avoidance, affective instability, self-harm, behavioural inhibition in response to threat and lower sensitivity to serotonergic activation. These findings may indicate common, serotonergic factors, associated with social submissiveness, behavioural inhibition to threat and BN.


Vaccines ◽  
2021 ◽  
Vol 9 (6) ◽  
pp. 629
Author(s):  
Megan M. Dunagan ◽  
Kala Hardy ◽  
Toru Takimoto

Influenza A virus (IAV) is a significant human pathogen that causes seasonal epidemics. Although various types of vaccines are available, IAVs still circulate among human populations, possibly due to their ability to circumvent host immune responses. IAV expresses two host shutoff proteins, PA-X and NS1, which antagonize the host innate immune response. By transcriptomic analysis, we previously showed that PA-X is a major contributor for general shutoff, while shutoff active NS1 specifically inhibits the expression of host cytokines, MHC molecules, and genes involved in innate immunity in cultured human cells. So far, the impact of these shutoff proteins in the acquired immune response in vivo has not been determined in detail. In this study, we analyzed the effects of PA-X and NS1 shutoff activities on immune response using recombinant influenza A/California/04/2009 viruses containing mutations affecting the expression of shutoff active PA-X and NS1 in a mouse model. Our data indicate that the virus without shutoff activities induced the strongest T and B cell responses. Both PA-X and NS1 reduced host immune responses, but shutoff active NS1 most effectively suppressed lymphocyte migration to the lungs, antibody production, and the generation of IAV specific CD4+ and CD8+ T cells. NS1 also prevented the generation of protective immunity against a heterologous virus challenge. These data indicate that shutoff active NS1 plays a major role in suppressing host immune responses against IAV infection.


Genes ◽  
2014 ◽  
Vol 5 (3) ◽  
pp. 518-535 ◽  
Author(s):  
Jessica Bailey ◽  
Margaret Pericak-Vance ◽  
Jonathan Haines

1990 ◽  
Vol 117 (2) ◽  
pp. 173-277 ◽  
Author(s):  
C. D. Daykin ◽  
G. B. Hey

Abstract A cash flow model is proposed as a way of analysing uncertainty in the future development of a general insurance company. The company is modelled alongside the market in aggregate so that the impact of changes in premium rates relative to the market can be assessed. An extensive computer model is developed along these lines, intended for use in practical applications by actuaries advising the management of general insurance companies. Simulation methods are used to explore the consequences of uncertainty, particularly in regard to inflation and investments. Some comments are made on the role of actuaries in general insurance. Alternative approaches to describing the behaviour of an insurance firm in the market are considered.


2018 ◽  
Vol 19 (1) ◽  
Author(s):  
Farzaneh Salari ◽  
Fatemeh Zare-Mirakabad ◽  
Mehdi Sadeghi ◽  
Hassan Rokni-Zadeh