Pervasive Translation in Mycobacterium tuberculosis
ABSTRACT
ORF boundaries in bacterial genomes have largely been drawn by gene prediction algorithms. These algorithms often fail to predict ORFs with non-canonical features. Recent developments in genome-scale mapping of translation have facilitated the empirical identification of ORFs. Here, we use ribosome profiling approaches to map initiating and elongating ribosomes in Mycobacterium tuberculosis. Thus, we identify over 1,000 novel ORFs, revealing that much of the genome encodes proteins in overlapping reading frames, and/or on both strands. Most of the novel ORFs are short (sORFs), impeding their identification by traditional methods. The strong codon bias that characterizes annotated mycobacterial ORFs is not evident in the aggregate novel sORFs; hence most are unlikely to encode functional proteins. Our data suggest that bacterial transcriptomes are subject to pervasive translation. We speculate that the inefficiency of expressing spurious sORFs may be offset by positive contributions to M. tuberculosis biology through activities of a small subset.