Haplotype assignment of longitudinal viral deep-sequencing data using co-variation of variant frequencies
ABSTRACT
Longitudinal deep sequencing of viruses can provide detailed information about intra-host evolutionary dynamics, including how viruses interact with and transmit between hosts. Many analyses require haplotype reconstruction, that is, identifying which variants are co-located on the same genomic element. Most current methods rely on a high density of variants and therefore cannot reconstruct haplotypes for slowly evolving viruses. We present a new approach, HaROLD (HAplotype Reconstruction Of Longitudinal Deep sequencing data), which reconstructs haplotypes by identifying co-varying variant frequencies within a probabilistic framework. We test this method on synthetic data sets of mixed cytomegalovirus and norovirus genomes, demonstrating high accuracy when longitudinal samples are available.
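The co-variation idea can be illustrated with a toy sketch. This is not the HaROLD algorithm, which uses a probabilistic framework; it is a simplified, deterministic stand-in that assumes each variant sits on exactly one haplotype, so that variants on the same haplotype show near-identical frequencies at every time point. The function name `group_covarying_variants`, the tolerance `tol`, and the frequency values are all hypothetical and chosen for illustration only.

```python
import numpy as np


def group_covarying_variants(freqs, tol=0.05):
    """Greedily group variants whose frequency trajectories agree within tol.

    freqs: 2-D array-like, one row per variant, one column per time point.
    Returns a list of groups, each a list of variant (row) indices.
    """
    freqs = np.asarray(freqs, dtype=float)
    groups = []
    for i, trajectory in enumerate(freqs):
        for group in groups:
            # Compare against the first member of the group (its representative).
            representative = freqs[group[0]]
            if np.max(np.abs(trajectory - representative)) <= tol:
                group.append(i)
                break
        else:
            # No existing group matched, so this variant starts a new one.
            groups.append([i])
    return groups


# Invented frequencies for four variants across three longitudinal samples.
freqs = [
    [0.10, 0.40, 0.80],  # variant 0
    [0.12, 0.38, 0.81],  # variant 1, tracks variant 0
    [0.90, 0.60, 0.20],  # variant 2
    [0.88, 0.61, 0.19],  # variant 3, tracks variant 2
]
groups = group_covarying_variants(freqs)
print(groups)  # [[0, 1], [2, 3]]
```

With a single sample the two groups could not be told apart from noise as reliably; it is the shared rise or fall across time points that links variants. In real data a variant may be carried by more than one haplotype and read counts are noisy, which is where a probabilistic treatment becomes necessary.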