Effect of SNP origin on analyses of genetic diversity in cattle

Laercio R. Porto Neto; William Barendse

doi:10.1071/an10073

Animal Production Science ◽

10.1071/an10073 ◽

2010 ◽

Vol 50 (8) ◽

pp. 792 ◽

Cited By ~ 11

Author(s):

Laercio R. Porto Neto ◽

William Barendse

Keyword(s):

Principal Component ◽

Ascertainment Bias ◽

Population Substructure ◽

Allele Frequency Distribution ◽

Population Bottlenecks ◽

Effective Population ◽

Extended Haplotype ◽

Single Nucleotide ◽

Study Selection ◽

Selection Gene

The methods of single nucleotide polymorphism (SNP) identification can lead to ascertainment bias, which will affect population genetic analyses based on those data. In livestock species, the methods of SNP identification through genome sequencing are likely to suffer from this ascertainment bias. In the present study, a subset of data from the Bovine HapMap Project was re-analysed to quantify the effects of ascertainment bias on a range of common analyses and statistics. Data from 189 animals of the zebu breeds Brahman, Nelore and Gir, taurine beef Angus, Limousin and Hereford and taurine dairy Holstein, Jersey and Brown Swiss were analysed. There were 141 SNPs each of Angus, Brahman and Holstein origin, giving a total of 423 SNPs organised in 141 triplets. Each triplet consisted of one SNP of each breed, separated on average by 0.75 Mb within each triplet and where triplets were separated by 14.96 Mb to ensure that each triplet was unaffected by linkage disequilibrium. The minor allele frequency distribution, estimates of the F-statistic, FST, the partitioning of variance and population substructure were relatively unaffected by breed of origin of the SNPs. Estimates of heterozygosity were significantly affected by breed of origin of the SNPs. The clustering of animals of closely related breeds varied in the principal component analyses (PCA). However, in the PCA the effect of breed of origin of 141 SNPs was similar to the effect of using different panels of 141 SNPs of all three breeds, so the differences found in the PCA may not be all due to bias by the origin of the SNPs. Based on these results, analyses that depend on FST, including signatures of selection, gene flow and effective population size are unlikely to be strongly affected by SNP origin. Analyses that partition genetic variance and some analyses of population substructure will also be largely unaffected. 
However, analyses that are dependent on locus heterozygosity, which can be used for studying population bottlenecks, or those that study selection using extended haplotype homozygosity may be significantly affected by breed of origin of the SNPs.
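
The abstract above contrasts statistics based on FST with those based on locus heterozygosity. As a rough illustration of how the two quantities relate, the following minimal sketch computes expected heterozygosity and a heterozygosity-based FST for a single biallelic SNP. It assumes equally weighted subpopulations and known allele frequencies; the function names are hypothetical, and this is not necessarily the estimator used in the study.

```python
from statistics import mean


def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(freqs):
    """Heterozygosity-based FST = (H_T - H_S) / H_T for one biallelic SNP.

    freqs: allele frequency of the same allele in each subpopulation.
    Assumption: subpopulations are weighted equally (no sample-size correction).
    """
    if not freqs:
        raise ValueError("at least one subpopulation frequency is required")
    # H_S: mean within-subpopulation expected heterozygosity
    h_s = mean(expected_heterozygosity(p) for p in freqs)
    # H_T: expected heterozygosity of the pooled population
    h_t = expected_heterozygosity(mean(freqs))
    if h_t == 0.0:
        # Locus is monomorphic overall; FST is undefined, 0.0 is returned by convention here
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical frequencies of one SNP allele in three breeds
    print(fst_biallelic([0.2, 0.5, 0.8]))
```

Because FST is a ratio of heterozygosities, a bias that scales H_S and H_T by a similar factor can largely cancel out, which is one intuitive reading of why heterozygosity itself could be more sensitive to SNP origin than FST.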

Estimation of effective population size using single-nucleotide polymorphism (SNP) data in Jeju horse

Journal of Animal Science and Technology ◽

10.1186/2055-0391-56-28 ◽

2014 ◽

Vol 56 (1) ◽

pp. 28 ◽

Cited By ~ 7

Author(s):

Kyoung-Tag Do ◽

Joon-Ho Lee ◽

Hak-Kyo Lee ◽

Jun Kim ◽

Kyung-Do Park

Keyword(s):

Single Nucleotide Polymorphism ◽

Population Size ◽

Effective Population Size ◽

Nucleotide Polymorphism ◽

Effective Population ◽

Single Nucleotide ◽

Snp Data ◽

Jeju Horse

Single-Nucleotide-Polymorphism-Panel Population-Genetics Approach Based on the 1000 Genomes Database and Elite Soccer Players

International Journal of Sports Physiology and Performance ◽

10.1123/ijspp.2018-0715 ◽

2019 ◽

Vol 14 (6) ◽

pp. 711-717 ◽

Cited By ~ 2

Author(s):

Gustavo Monnerat ◽

Alex S. Maior ◽

Marcio Tannure ◽

Lia K.F.C. Back ◽

Caleb G.M. Santos

Keyword(s):

Population Genetics ◽

Association Studies ◽

Principal Component ◽

Sport Performance ◽

Brazilian Population ◽

Soccer Players ◽

Single Nucleotide ◽

1000 Genomes ◽

Professional Soccer ◽

Early Results

Purpose: Soccer is one of the most popular sports worldwide, a physical activity of great physiological demand and complexity. Currently, numerous trials involving physiological responses such as hypertrophy, energy expenditure, vasodilation, cardiac output, VO2max, and recovery have supported the possibility that genomic predictors affect performance. To complement association studies with single nucleotide polymorphisms (SNPs), the objective was to evaluate whether population genetics data from human-genomics databases can provide information for a better understanding of the relationship between heritability and sport performance. Methods: The study included 25 healthy male professional soccer players (25.5 [4.3] y, 177.4 [6.4] cm, 76.4 [6.4] kg, body fat 10.5% [4.3%]) from a Brazilian first-division soccer club. Anthropometric measurements and field and isokinetic tests were performed to evaluate performance and physiologic parameters of subjects. Moreover, 10 genetic polymorphisms previously related to performance were genotyped. The genotypes of the same polymorphisms were obtained for 2504 individuals from the populations deposited in the 1000 Genomes database. A principal-component analysis and a genetic-distance matrix approach (Fst) were applied. Results: As expected, the admixed Brazilian population has numerous genetic similarities with the European and American populations from genomic databases. Although the African component is well recognized in genomes from the Brazilian population, using the specific performance-related SNPs, surprisingly the African population was one of the most genetically distant from the players (P < .00001). Conclusions: The early results suggest a selective pressure on genes of elite soccer players, possibly related simultaneously to physical-performance, environmental, cognitive, and sociocultural aspects.

Characterization of a haplotype reference panel for genotyping by low-pass sequencing in Swiss Large White pigs

10.21203/rs.3.rs-318745/v1 ◽

2021 ◽

Author(s):

Adéla Nosková ◽

Meenu Bhati ◽

Naveen Kumar Kadri ◽

Danang Crysnanto ◽

Stefan Neuenschwander ◽

...

Keyword(s):

Genetic Diversity ◽

Principal Component ◽

Cost Effective ◽

Reference Panel ◽

Effective Population ◽

Large White ◽

Breeding Populations ◽

Low Pass ◽

Signatures Of Selection ◽

Genomic Inbreeding

Background: The key-ancestor approach has been frequently applied to prioritize individuals for whole-genome sequencing based on their marginal genetic contribution to current populations. Using this approach, we selected 70 key ancestors from two lines of the Swiss Large White breed that have been selected divergently for fertility and fattening traits and sequenced their genomes with short paired-end reads. Results: Using pedigree records, we estimated the effective population size of the dam and sire line to be 72 and 44, respectively. In order to assess sequence variation in both lines, we sequenced the genomes of 70 boars at an average coverage of 16.69-fold. The boars explained 87.95 and 95.35% of the genetic diversity of the breeding populations of the dam and sire line, respectively. Reference-guided variant discovery using the GATK revealed 26,862,369 polymorphic sites. Principal component, admixture and FST analyses indicated considerable genetic differentiation between the lines. Genomic inbreeding quantified using runs of homozygosity was higher in the sire than in the dam line (0.28 vs 0.26). Using two complementary approaches (CLR and iHS), we detected 51 signatures of selection. However, only six signatures of selection overlapped between both lines. We used the sequenced haplotypes of the 70 key ancestors as a reference panel to call 22,618,811 genotypes in 175 pigs that had been sequenced at very low coverage (1.11-fold) using GLIMPSE. The genotype concordance, non-reference sensitivity and non-reference discrepancy between the genotypes thus inferred and those called with the Illumina PorcineSNP60 BeadChip were 97.60, 98.73 and 3.24%, respectively. The low-pass sequencing-derived genomic relationship coefficients were highly correlated (r > 0.99) with those obtained from microarray genotyping. Conclusions: We assessed genetic diversity within and between two lines of the Swiss Large White pig breed. Our analyses revealed considerable differentiation, even though the split into two populations occurred only a few generations ago. The sequenced haplotypes of the key ancestor animals enabled us to implement genotyping by low-pass sequencing, which offers an intriguing cost-effective approach to increase the variant density over current array-based genotyping by more than 350-fold.
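
The abstract above reports genomic inbreeding quantified from runs of homozygosity (ROH). A common way to express this is F_ROH, the fraction of the autosomal genome covered by ROH. The following is a minimal sketch under that assumption; the function name, the optional length threshold and the example coordinates are hypothetical, and the study's actual ROH-calling settings are not given in the abstract.

```python
def f_roh(roh_segments, autosome_length_bp, min_length_bp=0):
    """Fraction of the autosomal genome covered by runs of homozygosity.

    roh_segments: iterable of (start_bp, end_bp) tuples, assumed non-overlapping.
    autosome_length_bp: total autosomal length considered (caller-supplied assumption).
    min_length_bp: ignore segments shorter than this (hypothetical filter).
    """
    if autosome_length_bp <= 0:
        raise ValueError("autosome_length_bp must be positive")
    total_roh = 0
    for start, end in roh_segments:
        length = end - start
        if length < 0:
            raise ValueError("segment end must not precede start")
        if length >= min_length_bp:
            total_roh += length
    return total_roh / autosome_length_bp


if __name__ == "__main__":
    # Hypothetical ROH calls for one animal on a toy 10 Mb genome
    segments = [(1_000_000, 2_500_000), (6_000_000, 7_000_000)]
    print(f_roh(segments, autosome_length_bp=10_000_000))
```

Under this definition, a value of 0.28 would mean that roughly 28% of the autosomal genome lies within detected ROH.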

Unraveling Admixture, Inbreeding, and Recent Selection Signatures in West African Indigenous Cattle Populations in Benin

Frontiers in Genetics ◽

10.3389/fgene.2021.657282 ◽

2021 ◽

Vol 12 ◽

Author(s):

Sèyi Fridaïus Ulrich Vanvanhossou ◽

Tong Yin ◽

Carsten Scheper ◽

Ruedi Fries ◽

Luc Hippolyte Dossa ◽

...

Keyword(s):

Production Systems ◽

Principal Component ◽

West African ◽

Economic Traits ◽

Selection Signatures ◽

Extended Haplotype ◽

Indigenous Cattle ◽

Environmental Pressures ◽

Inbreeding Coefficients ◽

Recent Selection

The Dwarf Lagune and the Savannah Somba cattle in Benin are typical representatives of the endangered West African indigenous Shorthorn taurine. The Lagune was previously exported to African and European countries and bred as Dahomey cattle, whereas the Somba contributed to the formation of two indigenous hybrids known as Borgou and Pabli cattle. These breeds are affected by demographic, economic, and environmental pressures in local production systems. Considering current and historical genomic data, we applied a formal test of admixture, estimated admixture proportions, and computed genomic inbreeding coefficients to characterize the five breeds. Subsequently, we unraveled the most recent selection signatures using the cross-population extended haplotype homozygosity approach, based on the current and historical genotypes. Results from principal component analyses and high proportion of Lagune ancestry confirm the Lagune origin of the European Dahomey cattle. Moreover, the Dahomey cattle displayed neither indicine nor European taurine (EUT) background, but they shared on average 40% of autozygosity from common ancestors, dated approximately eight generations ago. The Lagune cattle presented inbreeding coefficients larger than 0.13; however, the Somba and the hybrids (Borgou and Pabli) were less inbred (≤0.08). We detected evidence of admixture in the Somba and Lagune cattle, but they exhibited a similar African taurine (AFT) ancestral proportion (≥96%) to historical populations, respectively. A moderate and stable AFT ancestral proportion (62%) was also inferred for less admixed hybrid cattle including the Pabli. In contrast, the current Borgou samples displayed a lower AFT ancestral proportion (47%) than historical samples (63%). Irrespective of the admixture proportions, the hybrid populations displayed more selection signatures related to economic traits (reproduction, growth, and milk) than the taurine. 
In contrast, the taurine, especially the Somba, presented several regions known to be associated with adaptive traits (immunity and feed efficiency). The identified subregion of bovine leukocyte antigen (BoLA) class IIb (including DSB and BOLA-DYA) in Somba cattle is interestingly uncommon in other African breeds, suggesting further investigations to understand its association with specific adaptation to endemic diseases in Benin. Overall, our study provides deeper insights into recent evolutionary processes in the Beninese indigenous cattle and their aptitude for conservation and genetic improvement.

Simulation of Statistical Methods for Single Nucleotide Polymorphism Selection in Plasmodium Populations

Life Science ◽

10.15294/lifesci.v8i1.29990 ◽

2019 ◽

Vol 8 (1) ◽

pp. 54-64

Author(s):

Mohamad Ikhsan Nurulloh ◽

Yustinus Ulung Anggraito ◽

Hidayat Trimarsanto ◽

Endah Peniati ◽

R. Susanti

Keyword(s):

Principal Component Analysis ◽

Single Nucleotide Polymorphism ◽

Population Structure ◽

Principal Component ◽

Selection Method ◽

Principal Coordinate Analysis ◽

Neighbor Joining ◽

Nucleotide Polymorphism ◽

Single Nucleotide ◽

Informative Snps

Plasmodium is the pathogen that causes malaria; it has high genetic diversity and resistance to antimalarial drugs. Molecular markers such as single nucleotide polymorphisms (SNPs) can provide information on the population structure of Plasmodium. SNP markers are numerous, but not all of them are informative. Existing methods have not been effective in producing informative SNPs, so an effective SNP selection method needs to be developed. The SNP selection method was developed using FST as the main filter, combined with linkage disequilibrium (LD). The population structure inferred from the SNPs was assessed using principal component analysis (PCA), principal coordinate analysis (PCoA), pairwise FST, and neighbor-joining population trees. The criteria for informative SNPs were determined by calculating FST and minor allele frequency (MAF). The statistical methods were tested to determine their effectiveness in producing informative SNPs. Testing was carried out using simulated genetic data for Plasmodium populations. The results show that the statistical method is effective in producing informative SNPs. Informative SNPs are those with MAF 0.2-0.4 and FST in the range 0.1-0.4 or 0.8-1.0.
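
The MAF and FST criteria reported in the abstract above can be expressed as a simple threshold filter. The sketch below is illustrative only: the function names and the example SNP records are hypothetical, treating the range bounds as inclusive is an assumption, and the linkage disequilibrium step mentioned in the abstract is not included.

```python
def is_informative(maf, fst):
    """Apply the reported criteria: MAF 0.2-0.4 and FST 0.1-0.4 or 0.8-1.0.

    Assumption: all range bounds are inclusive.
    """
    maf_ok = 0.2 <= maf <= 0.4
    fst_ok = (0.1 <= fst <= 0.4) or (0.8 <= fst <= 1.0)
    return maf_ok and fst_ok


def select_informative(snps):
    """Return the names of SNPs passing the filter.

    snps: iterable of (name, maf, fst) tuples with precomputed statistics.
    """
    return [name for name, maf, fst in snps if is_informative(maf, fst)]


if __name__ == "__main__":
    # Hypothetical SNP records: (name, MAF, FST)
    candidates = [
        ("snp_a", 0.30, 0.25),  # passes both criteria
        ("snp_b", 0.10, 0.25),  # MAF too low
        ("snp_c", 0.35, 0.60),  # FST falls between the two accepted ranges
        ("snp_d", 0.25, 0.90),  # passes via the high FST range
    ]
    print(select_informative(candidates))
```

In a fuller pipeline, SNPs passing this filter would typically be pruned further so that markers in strong LD with one another are not all retained.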

Genetic diversity, population structure, and effective population size in two yellow bat species in south Texas

PeerJ ◽

10.7717/peerj.10348 ◽

2020 ◽

Vol 8 ◽

pp. e10348

Author(s):

Austin S. Chipps ◽

Amanda M. Hale ◽

Sara P. Weaver ◽

Dean A. Williams

Keyword(s):

Genetic Diversity ◽

Population Structure ◽

North America ◽

Wind Energy ◽

Population Size ◽

Effective Population Size ◽

Microsatellite Loci ◽

South Texas ◽

Population Substructure ◽

Effective Population

There are increasing concerns regarding bat mortality at wind energy facilities, especially as installed capacity continues to grow. In North America, wind energy development has recently expanded into the Lower Rio Grande Valley in south Texas where bat species had not previously been exposed to wind turbines. Our study sought to characterize genetic diversity, population structure, and effective population size in Dasypterus ega and D. intermedius, two tree-roosting yellow bats native to this region and for which little is known about their population biology and seasonal movements. There was no evidence of population substructure in either species. Genetic diversity at mitochondrial and microsatellite loci was lower in these yellow bat taxa than in previously studied migratory tree bat species in North America, which may be due to the non-migratory nature of these species at our study site, the fact that our study site is located at a geographic range end for both taxa, and possibly weak ascertainment bias at microsatellite loci. Historical effective population size (NEF) was large for both species, while current estimates of Ne had upper 95% confidence limits that encompassed infinity. We found evidence of strong mitochondrial differentiation between the two putative subspecies of D. intermedius (D. i. floridanus and D. i. intermedius) which are sympatric in this region of Texas, yet little differentiation using microsatellite loci. We suggest this pattern is due to secondary contact and hybridization and possibly incomplete lineage sorting at microsatellite loci. We also found evidence of some hybridization between D. ega and D. intermedius in this region of Texas. We recommend that our data serve as a starting point for the long-term genetic monitoring of these species in order to better understand the impacts of wind-related mortality on these populations over time.

Cryptic Lineages and a Population Dammed to Incipient Extinction? Insights into the Genetic Structure of a Mekong River Catfish

Journal of Heredity ◽

10.1093/jhered/esz016 ◽

2019 ◽

Vol 110 (5) ◽

pp. 535-547 ◽

Cited By ~ 2

Author(s):

Amanda S Ackiss ◽

Binh T Dang ◽

Christopher E Bird ◽

Ellen E Biesack ◽

Phen Chheng ◽

...

Keyword(s):

Genetic Structure ◽

Natural Populations ◽

Principal Component ◽

Flood Pulse ◽

Mekong River ◽

Effective Population ◽

Long Distance ◽

Genetic Lineages ◽

Low Genetic Diversity ◽

Population Size Estimates

An understanding of the genetic composition of populations across management boundaries is vital to developing successful strategies for sustaining biodiversity and food resources. This is especially important in ecosystems where habitat fragmentation has altered baseline patterns of gene flow, dividing natural populations into smaller subpopulations and increasing potential loss of genetic variation through genetic drift. River systems can be highly fragmented by dams built for flow regulation and hydropower. We used reduced-representation sequencing to examine genomic patterns in an exploited catfish, Hemibagrus spilopterus, in a hotspot of biodiversity and hydropower development: the Mekong River basin. Our results revealed the presence of 2 highly divergent coexisting genetic lineages which may be cryptic species. Within the lineage with the greatest sample sizes, pairwise FST values, principal component analysis, and a STRUCTURE analysis all suggest that long-distance migration is not common across the Lower Mekong Basin, even in areas where flood-pulse hydrology has limited genetic divergence. In tributaries, effective population size estimates were at least an order of magnitude lower than in the Mekong mainstream, indicating these populations may be more vulnerable to perturbations such as human-induced fragmentation. Fish isolated upstream of several dams in one tributary exhibited particularly low genetic diversity, high amounts of relatedness, and a level of inbreeding (GIS = 0.51) that has been associated with inbreeding depression in other outcrossing species. Our results highlight the importance of assessing genetic structure and diversity in riverine fisheries populations across proposed dam development sites for the preservation of these critically important resources.

Genetic Analysis of Patients With Sickle Cell Anemia and Stroke Before 4 Years of Age Suggest an Important Role for Apolipoprotein E

Circulation Genomic and Precision Medicine ◽

10.1161/circgen.120.003025 ◽

2020 ◽

Vol 13 (5) ◽

pp. 531-540

Author(s):

John N. Brewin ◽

Alexander E. Smith ◽

Riley Cook ◽

Sanjay Tewari ◽

Julie Brent ◽

...

Keyword(s):

Ischemic Stroke ◽

Sickle Cell ◽

Sickle Cell Anemia ◽

Odds Ratio ◽

Multiple Testing ◽

Statistical Significance ◽

Principal Component ◽

Population Substructure ◽

Reference Allele ◽

Primary Finding

Background: Ischemic stroke is a devastating complication affecting children with sickle cell anemia. Genetic factors are likely to be important in determining the risk of stroke but are poorly defined. Methods: We studied a cohort of 19 children who had an overt ischemic stroke before 4 years of age, predicting that genetic determinants of stroke would be more prominent in this group. We performed whole exome sequencing on this cohort and applied 2 hypotheses to our variant filtering. First, we looked for strong, potentially mono- or oligogenic variants for ischemic stroke, and second, we considered that more common polygenic variants would be enriched in our cohort. Candidate variants emerging from both strategies were validated in a cohort of 283 patients with sickle cell anemia and known pediatric cerebrovascular outcomes. We used principal component analysis in this cohort to control for relatedness and population substructure. Results: Our primary finding was that the Apolipoprotein E genotypes ε2/ε4 and ε4/ε4, defined by the interplay of rs7412 and rs429358, were associated with increased stroke risk, with an odds ratio of 4.35 ([95% CI, 1.85–10.0] P = 0.0011) for ischemic stroke in the validation cohort. We also found that rs2297518 in NOS (NO synthase) 2 (odds ratio, 2.25 [95% CI, 1.21–4.19]; P = 0.014) and rs2230123 in signal transducer and activator of transcription (odds ratio, 2.60 [95% CI, 1.30–5.20]; P = 0.009) both had increased odds ratios for ischemic stroke, although these two variants were below the threshold for statistical significance after correction for multiple testing. Conclusions: These data identify new loci for future functional investigations into cerebrovascular disease in sickle cell anemia. Based on African population reference allele frequencies, the Apolipoprotein E genotypes would be present in about 10% of children with sickle cell anemia and represent a genetic risk factor that is potentially modifiable by both dietary and pharmaceutical manipulation of its dyslipidemic effects.

Population Structure and Genetic Diversity in Korean Cowpea Germplasm Based on SNP Markers

Plants ◽

10.3390/plants9091190 ◽

2020 ◽

Vol 9 (9) ◽

pp. 1190 ◽

Cited By ~ 1

Author(s):

Eunju Seo ◽

Kipoong Kim ◽

Tae-Hwan Jun ◽

Jinsil Choi ◽

Seong-Hoon Kim ◽

...

Keyword(s):

Genetic Diversity ◽

Population Structure ◽

Principal Component ◽

Snp Markers ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Useful Knowledge ◽

Population Structure Analysis ◽

Group 3 ◽

Genetic Diversity Study

Cowpea is one of the most essential legume crops providing inexpensive dietary protein and nutrients. The aim of this study was to understand the genetic diversity and population structure of global and Korean cowpea germplasms. A total of 384 cowpea accessions from 21 countries were genotyped with the Cowpea iSelect Consortium Array containing 51,128 single-nucleotide polymorphisms (SNPs). After SNP filtering, a genetic diversity study was carried out using 35,116 SNPs within 376 cowpea accessions, including 229 Korean accessions. Based on structure and principal component analysis, a total of 376 global accessions were divided into four major populations. Accessions in group 1 were from Asia and Europe, those in groups 2 and 4 were from Korea, and those in group 3 were from West Africa. In addition, 229 Korean accessions were divided into three major populations (Q1, Jeonra province; Q2, Gangwon province; Q3, a mixture of provinces). Additionally, the neighbor-joining tree indicated similar results. Further genetic diversity analysis within the global and Korean population groups indicated low heterozygosity, a low polymorphism information content, and a high inbreeding coefficient in the Korean cowpea accessions. The population structure analysis will provide useful knowledge to support the genetic potential of the cowpea breeding program, especially in Korea.

Swedish Population Substructure Revealed by Genome-Wide Single Nucleotide Polymorphism Data

PLoS ONE ◽

10.1371/journal.pone.0016747 ◽

2011 ◽

Vol 6 (2) ◽

pp. e16747 ◽

Cited By ~ 24

Author(s):

Elina Salmela ◽

Tuuli Lappalainen ◽

Jianjun Liu ◽

Pertti Sistonen ◽

Peter M. Andersen ◽

...

Keyword(s):

Single Nucleotide Polymorphism ◽

Population Substructure ◽

Single Nucleotide Polymorphism Data ◽

Nucleotide Polymorphism ◽

Swedish Population ◽

Single Nucleotide ◽

Genome Wide ◽

Polymorphism Data
