Incomplete annotation of disease-associated genes is limiting our understanding of Mendelian and complex neurogenetic disorders
Abstract
There is growing evidence to suggest that human gene annotation remains incomplete, with a disproportionate impact on the brain transcriptome. We used RNA-sequencing data from GTEx to detect novel transcription in an annotation-agnostic manner across 13 human brain regions and 28 human tissues. We found that genes highly expressed in brain are significantly more likely to be re-annotated, as are genes associated with Mendelian and complex neurodegenerative disorders. We improved the annotation of 63% of known OMIM-morbid genes and 65% of those with a neurological phenotype. We determined that novel transcribed regions, particularly those identified in brain, tend to be poorly conserved across mammals but are significantly depleted for genetic variation within humans. As exemplified by SNCA, we explored the implications of re-annotation for Mendelian and complex Parkinson’s disease. We validated in silico and experimentally a novel, brain-specific, potentially protein-coding exon of SNCA. We release our findings as tissue-specific transcriptomes in BED format and via vizER: http://rytenlab.com/browser/app/vizER. Together these resources will facilitate basic genomics research with the greatest impact on neurogenetics.