Developing informative microsatellite markers for non-model species using reference mapping against a model species’ genome

Chih-Ming Hung; Ai-Yun Yu; Yu-Ting Lai; Pei-Jen L. Shaner

doi:10.1038/srep23087

Scientific Reports ◽

10.1038/srep23087 ◽

2016 ◽

Vol 6 (1) ◽

Cited By ~ 5

Author(s):

Chih-Ming Hung ◽

Ai-Yun Yu ◽

Yu-Ting Lai ◽

Pei-Jen L. Shaner

Keyword(s):

Microsatellite Markers ◽

Rodent Species ◽

Background Information ◽

Model Species ◽

Breeding Programs ◽

Sequencing Technologies ◽

A Genome ◽

Wide Range ◽

Genomic Locations ◽

Traditional Approaches

Abstract. Microsatellites have a wide range of applications, from behavioral biology and evolution to agriculture-based breeding programs. Recent progress in next-generation sequencing technologies and the rapidly increasing number of published genomes may greatly enhance the current applications of microsatellites by turning them from anonymous into informative markers. Here we developed an approach to anchor microsatellite markers of any target species in the genome of a related model species, through which the genomic locations of the markers, along with any functional genes potentially linked to them, can be revealed. We mapped the shotgun sequence reads of a non-model rodent species, Apodemus semotus, against the genome of a model species, Mus musculus, and present 24 polymorphic microsatellite markers with detailed background information for A. semotus. The developed markers can be used in other rodent species, especially those that are closely related to A. semotus or M. musculus. Compared to the traditional approaches based on DNA cloning, our approach is likely to yield more loci for the same cost. This study is a timely demonstration of how a research team can efficiently generate informative (neutral or function-associated) microsatellite markers for their study species and unique biological questions.
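
As a rough illustration of the marker-discovery step, the sketch below scans a sequence for perfect tandem repeats of the kind used as microsatellite loci. It is not the authors' pipeline: the function name `find_microsatellites`, the 2 to 6 bp motif range, and the five-repeat threshold are assumptions chosen for illustration, and the mapping of reads against the Mus musculus genome is not shown.

```python
import re


def find_microsatellites(seq, min_repeats=5, min_motif=2, max_motif=6):
    """Return (start, end, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper: thresholds are illustrative, not taken from the paper.
    """
    # A lazily matched motif of min_motif..max_motif bases, followed by at
    # least (min_repeats - 1) further copies of that same motif.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # Skip homopolymer runs such as "AAAA...", which are not useful loci here.
        if len(set(motif)) == 1:
            continue
        repeat_count = (match.end() - match.start()) // len(motif)
        hits.append((match.start(), match.end(), motif, repeat_count))
    return hits


# Toy read containing a (CA)8 repeat flanked by unique sequence.
read = "GGTT" + "CA" * 8 + "TTGACC"
print(find_microsatellites(read))  # [(4, 20, 'CA', 8)]
```

In a workflow like the one described, the flanking sequence on either side of each hit would be what gets mapped to the reference genome and used for primer design; that part is left out of this sketch.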

Towards population genomics in non-model species with large genomes: a case study of the marine zooplankton Calanus finmarchicus

Royal Society Open Science ◽

10.1098/rsos.180608 ◽

2019 ◽

Vol 6 (2) ◽

pp. 180608 ◽

Cited By ~ 11

Author(s):

Marvin Choquet ◽

Irina Smolina ◽

Anusha K. S. Dhanasiri ◽

Leocadio Blanco-Bercial ◽

Martina Kopp ◽

...

Keyword(s):

Population Genomics ◽

Single Copy ◽

Calanus Finmarchicus ◽

Model Species ◽

Reduced Representation ◽

Marine Copepod ◽

Marine Zooplankton ◽

Sequencing Technologies ◽

A Genome ◽

Large Genomes

Advances in next-generation sequencing technologies and the development of genome-reduced representation protocols have opened the way to genome-wide population studies in non-model species. However, species with large genomes remain challenging, hampering the development of genomic resources for a number of taxa including marine arthropods. Here, we developed a genome-reduced representation method for the ecologically important marine copepod Calanus finmarchicus (haploid genome size of 6.34 Gbp). We optimized a capture enrichment-based protocol based on 2656 single-copy genes, yielding a total of 154 087 high-quality SNPs in C. finmarchicus including 62 372 in common among the three locations tested. The set of capture probes was also successfully applied to the congeneric C. glacialis. Preliminary analyses of these markers revealed similar levels of genetic diversity between the two Calanus species, while populations of C. glacialis showed stronger genetic structure compared to C. finmarchicus. Using this powerful set of markers, we did not detect any evidence of hybridization between C. finmarchicus and C. glacialis. Finally, we propose a shortened version of our protocol, offering a promising solution for population genomics studies in non-model species with large genomes.

Improved contiguity of the threespine stickleback genome using long-read sequencing

10.1101/2020.06.30.170787 ◽

2020 ◽

Cited By ~ 1

Author(s):

Shivangi Nath ◽

Daniel E. Shaw ◽

Michael A. White

Keyword(s):

Gasterosteus Aculeatus ◽

Genetic Model ◽

Threespine Stickleback ◽

Model Species ◽

Long Distance ◽

Sequencing Technologies ◽

A Genome ◽

Long Read ◽

The Cost ◽

Stickleback Genome

Abstract. While the cost and time for assembling a genome have been drastically reduced, it remains a challenge to assemble a highly contiguous genome. These challenges are rapidly being overcome by the integration of long-read sequencing technologies. Here, we use long sequencing reads to improve the contiguity of the threespine stickleback fish (Gasterosteus aculeatus) genome, a prominent genetic model species. Using Pacific Biosciences sequencing, we were able to fill over 76% of the gaps in the genome, improving contiguity over five-fold. Our approach was highly accurate, validated by 10X Genomics long-distance linked-reads. In addition to closing a majority of gaps, we were able to assemble segments of telomeres and centromeres throughout the genome. This highlights the power of using long sequencing reads to assemble highly repetitive and difficult-to-assemble regions of genomes. This latest genome build has been released through a newly designed community genome browser that aims to consolidate the growing number of genomics datasets available for the threespine stickleback fish.

MOSGA: Modular Open-Source Genome Annotator

Bioinformatics ◽

10.1093/bioinformatics/btaa1003 ◽

2020 ◽

Author(s):

Roman Martin ◽

Thomas Hackl ◽

Georges Hattab ◽

Matthias G Fischer ◽

Dominik Heider

Keyword(s):

Open Source ◽

Source Code ◽

Supplementary Information ◽

Web Interface ◽

Fully Integrated ◽

Sequencing Technologies ◽

A Genome ◽

Wide Range ◽

User Friendly ◽

Eukaryotic Genomes

Abstract. Motivation: The generation of high-quality assemblies, even for large eukaryotic genomes, has become a routine task for many biologists thanks to recent advances in sequencing technologies. However, the annotation of these assemblies, a crucial step toward unlocking the biology of the organism of interest, has remained a complex challenge that often requires advanced bioinformatics expertise. Results: Here, we present MOSGA (Modular Open-Source Genome Annotator), a genome annotation framework for eukaryotic genomes with a user-friendly web interface that generates and integrates annotations from various tools. The aggregated results can be analyzed with a fully integrated genome browser and are provided in a format ready for submission to NCBI. MOSGA is built on a portable, customizable and easily extendible Snakemake backend, and thus can be tailored to a wide range of users and projects. Availability and implementation: We provide MOSGA as a web service at https://mosga.mathematik.uni-marburg.de and as a docker container at registry.gitlab.com/mosga/mosga:latest. Source code can be found at https://gitlab.com/mosga/mosga. Contact: [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.

URMAP, an ultra-fast read mapper

PeerJ ◽

10.7717/peerj.9338 ◽

2020 ◽

Vol 8 ◽

pp. e9338

Author(s):

Robert Edgar

Keyword(s):

Variant Calling ◽

Mapping Algorithm ◽

Sequencing Technologies ◽

Mapping Software ◽

A Genome ◽

Biological Studies ◽

Wide Range ◽

Order Of Magnitude ◽

Comparable Accuracy ◽

Validation Tests

Mapping of reads to reference sequences is an essential step in a wide range of biological studies. The large size of datasets generated with next-generation sequencing technologies motivates the development of fast mapping software. Here, I describe URMAP, a new read mapping algorithm. URMAP is an order of magnitude faster than BWA with comparable accuracy on several validation tests. On a Genome in a Bottle (GIAB) variant calling test with 30× coverage 2×150 reads, URMAP achieves high accuracy (precision 0.998, sensitivity 0.982 and F-measure 0.990) with the strelka2 caller. However, GIAB reference variants are shown to be biased against repetitive regions which are difficult to map and may therefore pose an unrealistically easy challenge to read mappers and variant callers.
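
The three reported figures can be checked against each other, assuming the abstract's "F-measure" is the balanced F1 score (the harmonic mean of precision and sensitivity). That reading is an assumption, and the function name `f_measure` is illustrative rather than anything from URMAP.

```python
def f_measure(precision, sensitivity):
    """Balanced F-measure (F1): harmonic mean of precision and sensitivity."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Values quoted in the abstract for the GIAB test with the strelka2 caller.
f1 = f_measure(0.998, 0.982)
print(f"{f1:.3f}")  # 0.990
```

Under that assumption the computed value rounds to the reported 0.990, so the three numbers are mutually consistent.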

Characterization of Annur and Bedakam Ecotypes of Coconut from Kerala State, India, Using Microsatellite Markers

International Journal of Biodiversity ◽

10.1155/2014/260895 ◽

2014 ◽

Vol 2014 ◽

pp. 1-7 ◽

Cited By ~ 3

Author(s):

M. K. Rajesh ◽

K. Samsudeen ◽

P. Rejusha ◽

C. Manjula ◽

Shafeeq Rahman ◽

...

Keyword(s):

Genetic Diversity ◽

Microsatellite Markers ◽

West Coast ◽

Clustering Analysis ◽

Climatic Conditions ◽

Breeding Programs ◽

Wide Range ◽

History Of ◽

Kerala State

The coconut palm is versatile in its adaptability to a wide range of soil and climatic conditions. A long history of cultivation has resulted in the development of many ecotypes, which are adapted to the agro-ecological factors prevalent in a particular region. These ecotypes are usually known by the location where they are grown. It is important to explore such adaptation in the coconut population for better utilization of these ecotypes in coconut breeding programs. The aim of the present study was to assess the genetic diversity of the Bedakam and Annur ecotypes of coconut and to compare these ecotypes, using microsatellite markers, with the predominant West Coast Tall (WCT) populations from which they are presumed to have been derived. All 17 microsatellite markers used in the study revealed 100% polymorphism. The clustering analysis showed that the Annur and Bedakam ecotypes were two separate and distinct populations compared to WCT. It was also evident from the clustering that the Annur ecotype was closer to WCT than the Bedakam ecotype.

The way of value of Correlation of genomic DNA microsatellite loci and live weight of Chukchi reindeer

Genetika i razvedenie zhivotnyh ◽

10.31043/2410-2733-2020-3-27-32 ◽

2020 ◽

pp. 27-32

Author(s):

V. Dodokhov ◽

N. Pavlova ◽

T. Rumyantseva ◽

L. Kalashnikova

Keyword(s):

Microsatellite Markers ◽

Agricultural Production ◽

Live Weight ◽

Biodiversity Loss ◽

Tribal Community ◽

High Germination ◽

Wide Range ◽

The Mean ◽

High Level ◽

Dna Microsatellite

The article presents the genetic characteristics of the Chukchi reindeer breed, which was the object of the study. In recent years, the number of reindeer of the Chukchi breed has declined sharply, and reduced reindeer numbers could lead to biodiversity loss. The Chukchi breed has good meat qualities and high germination viability and is adapted to the adverse tundra conditions of Yakutia. In Yakutia, the Chukchi breed is herded only in the Nizhnekolymsky district. There are four tribal communities there, the largest of which, the agricultural production cooperative of the nomadic tribal community «Turvaurgin», was chosen to assess the genetic processes of the breed using the microsatellite markers Rt6, BMS1788, Rt 30, Rt1, Rt9, FCB193, Rt7, BMS745, C 143, Rt24, OheQ, C217, C32, NVHRT16, T40 and C276. The microsatellite markers were found to have a wide range of alleles and, in general, a high informative value for identifying genetic differences between animals and groups of animals. The number of identified alleles is one of the indicators of the genetic diversity of a population; the total number of detected alleles was 127. The Chukchi breed is characterized by a high level of heterozygosity, and random mating prevails over inbreeding in the population. On average, there were 7.9 alleles (Na) per locus, and the mean number of effective alleles (Ne) was 4.1. The fixation index averaged 0.001. The polymorphism information content (PIC) ranged from 0.217 to 0.946, with an average of 0.695.
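
The per-locus statistics named above can be sketched as follows, assuming the standard textbook definitions: effective alleles Ne = 1/Σp², expected heterozygosity He = 1 − Σp², and PIC as 1 − Σp² − ΣΣ 2p_i²p_j². The abstract does not say which formulas or software the authors used, so these definitions, the function names, and the allele frequencies in the example are all assumptions for illustration.

```python
from itertools import combinations


def effective_alleles(freqs):
    """Effective number of alleles at one locus: Ne = 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)


def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content for one locus (standard definition)."""
    homozygous = sum(p * p for p in freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


# Hypothetical allele frequencies for a single three-allele locus.
locus = [0.5, 0.3, 0.2]
print(f"Ne={effective_alleles(locus):.2f}, "
      f"He={expected_heterozygosity(locus):.2f}, PIC={pic(locus):.4f}")

# Consistency check on reported totals: 127 alleles over the 16 listed markers.
print(f"{127 / 16:.1f}")  # 7.9
```

The last line reproduces the reported mean of 7.9 alleles per locus from the total of 127 alleles and the 16 markers listed.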

Genetic, lifestyle, and health-related characteristics of adults without celiac disease who follow a gluten-free diet: a population-based study of 124,447 participants

American Journal of Clinical Nutrition ◽

10.1093/ajcn/nqaa291 ◽

2020 ◽

Author(s):

Thomas J Littlejohns ◽

Amanda Y Chong ◽

Naomi E Allen ◽

Matthew Arnold ◽

Kathryn E Bradbury ◽

...

Keyword(s):

Celiac Disease ◽

Genome Wide Association Study ◽

Population Based ◽

Gluten Free Diet ◽

Hospital Inpatient ◽

Gluten Free ◽

A Genome ◽

Wide Range ◽

Health Related ◽

Reported Health

Abstract. Background: The number of gluten-free diet followers without celiac disease (CD) is increasing. However, little is known about the characteristics of these individuals. Objectives: We address this issue by investigating a wide range of genetic and phenotypic characteristics in association with following a gluten-free diet. Methods: The cross-sectional association between lifestyle and health-related characteristics and following a gluten-free diet was investigated in 124,447 women and men aged 40–69 y from the population-based UK Biobank study. A genome-wide association study (GWAS) of following a gluten-free diet was performed. Results: A total of 1776 (1.4%) participants reported following a gluten-free diet. Gluten-free diet followers were more likely to be women, nonwhite, highly educated, living in more socioeconomically deprived areas, former smokers, have lost weight in the past year, have poorer self-reported health, and have made dietary changes as a result of illness. Conversely, these individuals were less likely to consume alcohol daily, be overweight or obese, have hypertension, or use cholesterol-lowering medication. Participants with hospital inpatient diagnosed blood and immune mechanism disorders (OR: 1.62; 95% CI: 1.18, 2.21) and non-CD digestive system diseases (OR: 1.58; 95% CI: 1.42, 1.77) were more likely to follow a gluten-free diet. The GWAS demonstrated that no genetic variants were associated with being a gluten-free diet follower. Conclusions: Gluten-free diet followers have a better cardiovascular risk profile than non-gluten-free diet followers but poorer self-reported health and a higher prevalence of blood and immune disorders and digestive conditions. Reasons for following a gluten-free diet warrant further investigation.

A genome-scale metabolic model of Saccharomyces cerevisiae that integrates expression constraints and reaction thermodynamics

Nature Communications ◽

10.1038/s41467-021-25158-6 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Omid Oftadeh ◽

Pierre Salvy ◽

Maria Masid ◽

Maxime Curvat ◽

Ljubisa Miskovic ◽

...

Keyword(s):

Saccharomyces Cerevisiae ◽

Expression System ◽

Biomass Composition ◽

Industrial Biotechnology ◽

Overflow Metabolism ◽

Cellular Expression ◽

A Genome ◽

Wide Range ◽

Eukaryotic Organisms ◽

Genome Scale

Abstract. Eukaryotic organisms play an important role in industrial biotechnology, from the production of fuels and commodity chemicals to therapeutic proteins. To optimize these industrial systems, a mathematical approach can be used to integrate the description of multiple biological networks into a single model for cell analysis and engineering. One of the most accurate models of biological systems is Expression and Thermodynamics FLux (ETFL), which efficiently integrates RNA and protein synthesis with traditional genome-scale metabolic models. However, ETFL is so far only applicable to E. coli. To adapt this model for Saccharomyces cerevisiae, we developed yETFL, in which we augmented the original formulation with additional considerations for biomass composition, the compartmentalized cellular expression system, and the energetic costs of biological processes. We demonstrated the ability of yETFL to predict maximum growth rate, essential genes, and the phenotype of overflow metabolism. We envision that the presented formulation can be extended to a wide range of eukaryotic organisms to the benefit of academic and industrial research.

Determinants of vitamin D status in Kenyan calves

Scientific Reports ◽

10.1038/s41598-020-77209-5 ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Rebecca Callaby ◽

Emma Hurst ◽

Ian Handel ◽

Phil Toye ◽

Barend M. de C. Bronsvoort ◽

...

Keyword(s):

Infectious Disease ◽

Vitamin D ◽

Health Outcomes ◽

Critical Role ◽

Subsequent Development ◽

Vitamin D Metabolites ◽

Vitamin D Status ◽

Skeletal Health ◽

A Genome ◽

Wide Range

Abstract. Vitamin D plays a critical role in calcium homeostasis and in the maintenance and development of skeletal health. Vitamin D status has increasingly been linked to non-skeletal health outcomes such as all-cause mortality, infectious diseases and reproductive outcomes in both humans and veterinary species. We have previously demonstrated a relationship between vitamin D status, assessed by the measurement of serum concentrations of the major vitamin D metabolite 25 hydroxyvitamin D (25(OH)D), and a wide range of non-skeletal health outcomes in companion and wild animals. The aims of this study were to define the host and environmental factors associated with vitamin D status in a cohort of 527 calves from Western Kenya which were part of the Infectious Disease of East African Livestock (IDEAL) cohort. A secondary aim was to explore the relationship between serum 25(OH)D concentrations measured in 7-day-old calves and subsequent health outcomes over the following 12 months. A genome-wide association study demonstrated that both dietary and endogenously produced vitamin D metabolites were under polygenic control in African calves. In addition, we found that neonatal vitamin D status was not predictive of the subsequent development of an infectious disease event or mortality over the 12-month follow-up period.

Visualising intrinsic disorder and conformational variation in protein ensembles

Faraday Discussions ◽

10.1039/c3fd00138e ◽

2014 ◽

Vol 169 ◽

pp. 179-193 ◽

Cited By ~ 6

Author(s):

Julian Heinrich ◽

Michael Krone ◽

Seán I. O'Donoghue ◽

Daniel Weiskopf

Keyword(s):

Intrinsic Disorder ◽

Molecular Graphics ◽

Post Translational Modifications ◽

3D Structures ◽

Intrinsically Disordered ◽

Intrinsically Disordered Regions ◽

Wide Range ◽

Conformational Variations ◽

Protein Ensembles ◽

Traditional Approaches

Intrinsically disordered regions (IDRs) in proteins are still not well understood, but are increasingly recognised as important in key biological functions, as well as in diseases. IDRs often confound experimental structure determination—however, they are present in many of the available 3D structures, where they exhibit a wide range of conformations, from ill-defined and highly flexible to well-defined upon binding to partner molecules, or upon post-translational modifications. Analysing such large conformational variations across ensembles of 3D structures can be complex and difficult; our goal in this paper is to improve this situation by augmenting traditional approaches (molecular graphics and principal components) with methods from human–computer interaction and information visualisation, especially parallel coordinates. We present a new tool integrating these approaches, and demonstrate how it can dissect ensembles to reveal functional insights into conformational variation and intrinsic disorder.
