Identifying and Tracking Low-Frequency Virus-Specific TCR Clonotypes Using High-Throughput Sequencing

Cell Reports ◽  
2018 ◽  
Vol 25 (9) ◽  
pp. 2369-2378.e4 ◽  
Author(s):  
Kyle Wolf ◽  
Tyler Hether ◽  
Pavlo Gilchuk ◽  
Amrendra Kumar ◽  
Ahmad Rajeh ◽  
...  
2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Jingwen Wang ◽  
Tiina Skoog ◽  
Elisabet Einarsdottir ◽  
Tea Kaartokallio ◽  
Hannele Laivuori ◽  
...  

Viruses ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1187
Author(s):  
David J. King ◽  
Graham Freimanis ◽  
Lidia Lasecka-Dykes ◽  
Amin Asfor ◽  
Paolo Ribeca ◽  
...  

High-throughput sequencing technologies, such as those provided by Illumina, are an efficient way to understand sequence variation within viral populations. However, it is challenging to distinguish process-introduced error from biological variance, which significantly limits our ability to identify sub-consensus single-nucleotide variants (SNVs). Here we have taken a systematic approach to evaluating laboratory and bioinformatic pipelines for the accurate identification of low-frequency SNVs in viral populations. Artificial DNA and RNA “populations” were created by introducing known SNVs at predetermined frequencies into template nucleic acid, which was then sequenced on an Illumina MiSeq platform. These populations were used to assess the effects of the abundance and type of starting input material, technical replicates, read length and quality, short-read aligner, and percentage frequency thresholds on the ability to call variants accurately. Analyses revealed that the abundance and type of input nucleic acid had the greatest impact on the accuracy of SNV calling, as measured by a micro-averaged Matthews correlation coefficient score, with DNA and high RNA inputs (10⁷ copies) allowing variants to be called at a 0.2% frequency. Reduced input RNA (10⁵ copies) required more technical replicates to maintain accuracy, while low RNA inputs (10³ copies) suffered from consensus-level errors. Base errors at specific motifs were also identified in all technical replicates; these can be excluded to further increase SNV calling accuracy. These findings indicate that samples with low RNA inputs should be excluded from SNV calling and reinforce the importance of optimising the technical and bioinformatic steps in pipelines that are used to identify sequence variants accurately.
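The abstract does not spell out how the score was computed, so the following is only a minimal sketch of one common reading of "micro-averaged Matthews correlation coefficient": confusion counts are pooled across technical replicates before a single MCC is computed. The function names, the representation of variants as sets of site positions, and the way the 0.2% threshold is applied are all illustrative assumptions, not the authors' pipeline.

```python
from math import sqrt


def call_variants(site_frequencies, threshold=0.002):
    """Return the sites whose variant frequency meets the threshold.

    `site_frequencies` is a hypothetical mapping of site -> frequency;
    0.002 corresponds to the 0.2% frequency mentioned in the abstract.
    """
    return {site for site, freq in site_frequencies.items() if freq >= threshold}


def confusion_counts(called, truth, all_sites):
    """Count (tp, fp, fn, tn) for one replicate, treating each site as one decision."""
    tp = len(called & truth)
    fp = len(called - truth)
    fn = len(truth - called)
    tn = len(all_sites) - tp - fp - fn
    return tp, fp, fn, tn


def micro_averaged_mcc(per_replicate_counts):
    """Pool the confusion counts over replicates, then compute one MCC."""
    tp = sum(c[0] for c in per_replicate_counts)
    fp = sum(c[1] for c in per_replicate_counts)
    fn = sum(c[2] for c in per_replicate_counts)
    tn = sum(c[3] for c in per_replicate_counts)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when the MCC is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data: 100 sites, three known SNVs, two technical replicates.
all_sites = set(range(100))
truth = {10, 20, 30}
replicate_a = {10, 20, 30}   # every known SNV recovered, no false calls
replicate_b = {10, 20, 40}   # one SNV missed, one false call
counts = [
    confusion_counts(replicate_a, truth, all_sites),
    confusion_counts(replicate_b, truth, all_sites),
]
score = micro_averaged_mcc(counts)
```

Pooling counts first weights every site decision equally, so a replicate with many errors pulls the score down more than it would if a separate MCC were computed per replicate and then averaged.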


2014 ◽  
Author(s):  
Richard W Lusk

Background: Trace quantities of contaminating DNA are widespread in the laboratory environment, but their presence has received little attention in the context of high-throughput sequencing. This issue is highlighted by recent works that have rested controversial claims upon sequencing data that appear to support the presence of unexpected exogenous species. Results: I used reads that preferentially aligned to alternate genomes to infer the distribution of potential contaminant species in a set of independent sequencing experiments. I confirmed that dilute samples are more exposed to contaminating DNA, and, focusing on four single-cell sequencing experiments, found that these contaminants appear to originate from a wide diversity of clades. Although negative control libraries prepared from "blank" samples recovered the highest-frequency contaminants, low-frequency contaminants, which appeared to make heterogeneous contributions to samples prepared in parallel within a single experiment, were not well controlled for. I used these results to show that, despite heavy replication and plausible controls, contamination can explain all of the observations used to support a recent claim that complete genes pass from food to human blood. Conclusions: Contamination must be considered a potential source of signals of exogenous species in sequencing data, even if these signals are replicated in independent experiments, vary across conditions, or indicate a species which seems a priori unlikely to contaminate. Negative control libraries processed in parallel are essential to control for contaminant DNAs, but their limited ability to recover low-frequency contaminants must be recognized.

