A new framework for the study of apicomplexan diversity across environments
Apicomplexans are a group of microbial eukaryotes that includes some of the best-studied parasites, among them widespread intracellular pathogens of mammals such as Toxoplasma and Plasmodium (the agent of malaria), and emerging pathogens such as Cryptosporidium and Babesia. Decades of research have illuminated the pathogenic mechanisms, molecular biology, and genomics of model apicomplexans, but we know surprisingly little about their diversity and distribution in natural environments. In this study we analyze the distribution of apicomplexans across a range of host-associated and free-living environments, covering animal hosts from cnidarians to mammals and ecosystems from soils to fresh and marine waters. Using publicly available small subunit (SSU) rRNA gene databases, high-throughput environmental sequencing (HTES) surveys such as Tara Oceans and VAMPS, and HTES data that we generated ourselves, we developed an apicomplexan reference database. It includes the largest apicomplexan SSU rRNA tree available to date and comprehensively samples this group and its closest relatives. This tree allowed us to identify and correct incongruences in the molecular identification of sequences, particularly within the hematozoans and the gregarines. Analyzing HTES studies with this curated reference database also showed a widespread and quantitatively important presence of apicomplexans across a variety of free-living environments. These data reveal a molecular diversity in this group that far exceeds current knowledge, particularly the diversity represented by described apicomplexan species. This revision is most striking in marine environments, which appear to harbor potentially the most diverse apicomplexans, though these have not yet been formally recognized.
The new database will be useful for both microbial ecology and epidemiological studies, and will provide a valuable reference for medical and veterinary diagnosis, especially in cases of emerging, zoonotic, and cryptic infections.