The segregation of vocal circuits solves a credit assignment problem associated with multi-objective reinforcement learning
Abstract
Motor circuits vary in topographic organization, ranging from a coarse relationship between neuron location and function to highly localized regions controlling specific behaviors. For unclear reasons, vocal learning circuits lie at this second extreme: they repeatedly evolved to be spatially segregated from other parts of the motor system. Here we show that spatially segregated motor circuits can solve a specific problem that arises when an animal tries to learn two things at once. We trained songbirds in vocal and place learning paradigms with brief strobe light flashes and noise bursts. Strobe light negatively reinforced place learning but did not affect song syllable learning. Noise bursts positively reinforced place preference but negatively reinforced syllable learning. These double dissociations indicate that vocalization-related reinforcement signals specifically target the vocal motor system, while place-related reinforcement signals specifically target the navigation system. Non-global, target-specific reinforcement signals have established utility in machine implementations of multi-objective learning. In vocal learners, such signals could enable an animal to practice vocalizing while it does other things, such as foraging for food or learning to walk.
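The credit assignment problem referred to above can be illustrated with a toy calculation. The sketch below is not the model or analysis used in this study; it is a minimal, hypothetical two-module learner (a "vocal" module and a "place" module, each choosing between two actions), with made-up rewards and a standard REINFORCE-style update. It computes exactly, by enumerating all joint outcomes, the mean and variance of the update to the vocal parameter when the module receives either a global reward (the sum of both objectives) or a target-specific reward (the vocal objective only). All names (`vocal_update_stats`, `theta_vocal`, `theta_place`) and the reward rule are assumptions introduced for illustration.

```python
import math
from itertools import product


def sigmoid(x):
    """Logistic function mapping a policy parameter to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def vocal_update_stats(theta_vocal, theta_place, global_signal):
    """Exact mean and variance of the REINFORCE update to the vocal parameter.

    Hypothetical setup: each module independently picks action 0 or 1.
    Action 1 earns a reward of 1 for its own objective, action 0 earns 0.
    If global_signal is True, the vocal module is trained on the summed
    reward; otherwise it is trained on the vocal reward alone.
    """
    p_v = sigmoid(theta_vocal)  # probability the vocal module picks action 1
    p_p = sigmoid(theta_place)  # probability the place module picks action 1
    mean = 0.0
    second_moment = 0.0
    # Enumerate the four joint outcomes instead of sampling them.
    for a_v, a_p in product((0, 1), repeat=2):
        prob = (p_v if a_v else 1.0 - p_v) * (p_p if a_p else 1.0 - p_p)
        r_vocal = float(a_v)  # assumed vocal reward
        r_place = float(a_p)  # assumed place reward
        reward = r_vocal + r_place if global_signal else r_vocal
        # Score function of a Bernoulli-logit policy times the reward.
        update = (a_v - p_v) * reward
        mean += prob * update
        second_moment += prob * update ** 2
    return mean, second_moment - mean ** 2


mean_global, var_global = vocal_update_stats(0.0, 0.0, global_signal=True)
mean_specific, var_specific = vocal_update_stats(0.0, 0.0, global_signal=False)
print("global:   mean update", mean_global, "variance", var_global)
print("specific: mean update", mean_specific, "variance", var_specific)
```

In this toy setting the two schemes give the same expected update to the vocal parameter, but the global signal yields a larger variance, because reward earned by the place module is indistinguishable from reward earned by the vocal module. A target-specific signal removes that contamination, which is one way to read the machine-learning utility mentioned in the abstract.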