Prioritizing variants in complete Hereditary Breast and Ovarian Cancer (HBOC) genes in patients lacking known BRCA mutations
BRCA1 and BRCA2 testing for HBOC does not identify all pathogenic variants. Sequencing of 20 complete genes in HBOC patients with uninformative test results (N=287), including non-coding and flanking sequences of ATM, BARD1, BRCA1, BRCA2, CDH1, CHEK2, EPCAM, MLH1, MRE11A, MSH2, MSH6, MUTYH, NBN, PALB2, PMS2, PTEN, RAD51B, STK11, TP53, and XRCC2, identified 38,372 unique variants. We apply information theory (IT) to predict novel functions for, and to prioritize, non-coding variants of uncertain significance (VUS) throughout regulatory, coding, and intronic regions of these genes, based on changes in binding sites. Besides mRNA splicing, IT provides a common framework to evaluate potential affinity changes at mutated transcription factor (TFBSs), splicing regulatory (SRBSs), and RNA-binding protein (RBBSs) binding sites. We prioritized variants affecting the strengths of 10 splice sites (4 natural, 6 cryptic), 148 SRBSs, 36 TFBSs, and 31 RBBSs. Three variants were also prioritized based on their predicted effects on mRNA secondary (2°) structure, and 17 for pseudoexon activation. Additionally, 4 frameshift mutations, 2 in-frame deletions, and 5 stop-gain mutations were identified. When combined with pedigree information, complete gene sequence analysis can focus attention on a limited set of variants in a wide spectrum of functional mutation types for downstream functional and co-segregation analysis.
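As a rough illustration of how an IT-based comparison of a reference and a mutated binding site can work, the sketch below computes the individual information content (R_i, in bits) of a site under a position frequency model and the change (ΔR_i) caused by a single-nucleotide variant. It is not the pipeline used in this study. The 4-nucleotide model `FREQS`, the example sequences, and the function names are hypothetical, and the small-sample correction normally applied to information weight matrices is omitted. The fold-change helper assumes the interpretation commonly used in this literature, in which a change of 1 bit corresponds to at least a 2-fold change in binding affinity.

```python
import math

# Hypothetical base frequencies at each position of a toy 4-nt binding site
# model. Real models are derived from aligned, validated binding sites.
FREQS = [
    {"A": 0.5, "C": 0.25, "G": 0.125, "T": 0.125},
    {"G": 0.5, "A": 0.25, "C": 0.125, "T": 0.125},
    {"T": 0.5, "A": 0.25, "C": 0.125, "G": 0.125},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]


def individual_information(site):
    """Return R_i in bits: the sum over positions of 2 + log2 f(base, position).

    The small-sample correction term is omitted in this sketch.
    """
    if len(site) != len(FREQS):
        raise ValueError("site length must match the model length")
    return sum(2.0 + math.log2(col[base]) for col, base in zip(FREQS, site))


def delta_ri(ref_site, alt_site):
    """Return the change in R_i (bits) going from the reference to the variant."""
    return individual_information(alt_site) - individual_information(ref_site)


def min_fold_change(delta_bits):
    """Minimum fold change in affinity, assuming 1 bit ~ at least 2-fold."""
    return 2.0 ** abs(delta_bits)


if __name__ == "__main__":
    ref, alt = "AGTA", "AGGA"  # hypothetical T>G substitution at position 3
    change = delta_ri(ref, alt)
    print(f"R_i(ref) = {individual_information(ref):.1f} bits")
    print(f"R_i(alt) = {individual_information(alt):.1f} bits")
    print(f"delta R_i = {change:.1f} bits, >= {min_fold_change(change):.0f}-fold")
```

Under these assumptions, a negative ΔR_i indicates a predicted weakening of the site and a positive ΔR_i a strengthening, so the same calculation can in principle be applied to splice sites, SRBSs, TFBSs, and RBBSs once a model for each site type is available.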