Chromosomal scale length variation of germline DNA can predict individual cancer risk
Abstract
Inherited factors are thought to be responsible for a substantial fraction of many different forms of cancer. However, individual cancer risk cannot currently be well quantified by analyzing germline DNA. Most analyses of germline DNA focus on the additive effects of single nucleotide polymorphisms (SNPs). Here we show that chromosomal-scale length variation of germline DNA can be used to predict whether a person will develop cancer. In two independent datasets, The Cancer Genome Atlas (TCGA) project and the UK Biobank, we could classify whether or not a patient had a certain cancer based solely on chromosomal-scale length variation. In the TCGA data, we found that all 32 different types of cancer could be predicted better than chance using chromosomal-scale length variation data. We found a model that could predict ovarian cancer in women with an area under the receiver operating characteristic curve (AUC) of 0.89. In the UK Biobank data, we could predict breast cancer in women with an AUC of 0.83. This method could be used to develop genetic risk scores for other conditions known to have a substantial genetic component, and it complements genetic risk scores derived from SNPs.
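As a reading aid for the AUC values quoted above: the AUC can be interpreted as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The following is a minimal sketch of that pairwise definition, not the authors' pipeline. The function name `auc` and the toy labels and scores are hypothetical and invented for illustration only.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for a case (had the cancer), 0 for a control.
    scores: model-predicted risk; higher means more likely to be a case.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


# Hypothetical toy data, invented for illustration only (not study data).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

Under this definition, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.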