SMAP: exploiting high-throughput sequencing data of patient derived xenografts
Abstract

Background
Patient-derived xenografts are the reference model in oncology for drug response analyses. Xenograft samples are distinctive in that they contain cells from both the graft and the host species. Sequencing analysis of xenograft samples therefore requires specific processing methods to properly reconstruct the genomic profiles of both the host and graft compartments.

Results
We propose a novel xenograft sequencing processing pipeline termed SMAP, for Simultaneous mapping. SMAP integrates the distinction between host and graft sequencing reads into the mapping process by simultaneously aligning to both genome references. We show that SMAP increases the accuracy of species assignment while reducing the number of discarded ambiguous reads compared to other existing methods. Moreover, SMAP includes a module called SMAP-fuz to improve the detection of chimeric fusion transcripts in xenograft RNA-seq data. Finally, we apply SMAP to a real dataset and show the relevance of pathway and cell population analysis of the tumoral and stromal compartments.

Conclusions
In high-throughput sequencing analysis of xenografts, our results show that: i. the use of ad hoc sequence processing methods is essential, ii. high sequence homology does not introduce a significant bias when proper methods are used, and iii. the detection of fusion transcripts can be improved using our approach. SMAP is available on GitHub: cit-bioinfo.github.io/SMAP.
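The idea of aligning to both genome references at once can be illustrated with a minimal sketch. It assumes one common way of doing this: concatenating the two references into a single FASTA whose contig names carry a species prefix, so that each alignment can be assigned to a species from the contig it maps to. This is not taken from the SMAP code base; the function names, the `graft`/`host` tags, and the toy sequences are all hypothetical.

```python
from collections import Counter

def tag_contigs(fasta_lines, species):
    """Prefix every FASTA header with a species tag (hypothetical naming scheme)."""
    return [
        f">{species}_{line[1:]}" if line.startswith(">") else line
        for line in fasta_lines
    ]

def combine_references(graft_fasta, host_fasta, graft_tag="graft", host_tag="host"):
    """Build one combined reference so an aligner sees both genomes at once."""
    return tag_contigs(graft_fasta, graft_tag) + tag_contigs(host_fasta, host_tag)

def assign_species(contig_name, tags=("graft", "host")):
    """Recover the species of an alignment from its contig prefix."""
    for tag in tags:
        if contig_name.startswith(tag + "_"):
            return tag
    return "unknown"

def count_by_species(alignments):
    """Count alignments per species; alignments are (read_id, contig_name) pairs."""
    return Counter(assign_species(contig) for _, contig in alignments)

# Toy data standing in for real reference files and aligner output.
combined = combine_references([">chr1", "ACGT"], [">chr1", "ACGA"])
counts = count_by_species([
    ("read1", "graft_chr1"),
    ("read2", "host_chr1"),
    ("read3", "graft_chr1"),
])
```

In this sketch the aligner itself decides, for each read, which genome gives the best alignment, so species assignment reduces to reading the contig prefix. How reads that align equally well to both genomes are handled is left out here.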