kinship coefficients

2021 ◽  
Author(s):  
Bruno Marchetti de Souza ◽  
Lucas Moura de Abreu ◽  
Marília de Castro Rodrigues Pappas ◽  
Vânia Cristina Rennó Azevedo ◽  
Paulo Eduardo Telles dos Santos ◽  
...  

Abstract The study investigates the genetic diversity of an E. benthamii trial and the ability of genome-wide selection to predict genomic breeding values. All 115 individuals of the breeding population were genotyped with 13 microsatellite loci. Diameter at breast height and total height were measured. The data were analysed using Structure, Popgene, GDA, SPAGeDi 1.5 and R. Predictive ability, heritability and marker standard errors were estimated using the RRblup method. The average number of alleles per locus was nine, and the polymorphism level of each locus ranged from 3 to 17. The average expected heterozygosity (He = 0.655) was very similar to the observed heterozygosity, and the estimated inbreeding (F = 0.02) was very low. These results indicate that the population is in Hardy-Weinberg equilibrium for most loci. The genetic diversity of the trial is considered high, since the trial sample showed values similar to those of natural populations. The group coancestry (0.085) indicates that the trees in this population are, in general, related at the half-sib level. Evanno's method suggests that the individuals came from two original populations. The genetic distance calculated between the two groups was low (0.21). The heritability estimated from genomic selection for the phenotypic traits was very low; however, the heritability estimated using the kinship coefficients was higher. The marker-based heritability using kinship coefficients is probably more accurate than the one estimated using genomic selection, indicating that the sampled population can be used to establish breeding populations, produce hybrids and enrich the species' germplasm bank.


PLoS Genetics ◽  
2021 ◽  
Vol 17 (1) ◽  
pp. e1009315
Author(s):  
Ardalan Naseri ◽  
Junjie Shi ◽  
Xihong Lin ◽  
Shaojie Zhang ◽  
Degui Zhi

Inference of relationships from whole-genome genetic data of a cohort is a crucial prerequisite for genome-wide association studies. Typically, relationships are inferred by computing the kinship coefficient (ϕ) and the genome-wide probability of zero IBD sharing (π0) for all pairs of individuals. Current leading methods are based on pairwise comparisons, which may not scale up to very large cohorts (e.g., sample size >1 million). Here, we propose an efficient relationship inference method, RAFFI. RAFFI leverages the efficient RaPID method to call IBD segments first, then estimates ϕ and π0 from the detected IBD segments. This inference is achieved by a data-driven approach that adjusts the estimation based on phasing quality and genotyping quality. Using simulations, we showed that RAFFI is robust against phasing/genotyping errors, admixture events, and varying marker densities, and achieves higher accuracy than KING, the current leading method, especially for more distant relatives. When applied to the phased UK Biobank data with ~500K individuals, RAFFI is approximately 18 times faster than KING. We expect RAFFI will offer fast and accurate relatedness inference for even larger cohorts.
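
As a rough illustration of how ϕ and π0 relate to detected IBD segments, the sketch below uses the textbook identities π0 = 1 − π1 − π2 and ϕ = π1/4 + π2/2, where π1 and π2 are the genome fractions shared IBD1 and IBD2. It is not RAFFI's data-driven estimator: the function name, the inputs and the 3400 cM genome length are assumptions made for the example, and the IBD1 lengths are assumed to exclude regions already counted as IBD2.

```python
from typing import Iterable, Tuple


def kinship_from_ibd(
    ibd1_cm: Iterable[float],
    ibd2_cm: Iterable[float],
    genome_cm: float,
) -> Tuple[float, float]:
    """Return (phi, pi0) from IBD segment lengths in centimorgans.

    Hypothetical helper, not part of RAFFI, RaPID or KING.
    ibd1_cm: lengths of segments where exactly one haplotype is shared.
    ibd2_cm: lengths of segments where both haplotypes are shared.
    genome_cm: total genetic length of the genome covered by the data.
    """
    if genome_cm <= 0:
        raise ValueError("genome_cm must be positive")
    pi1 = sum(ibd1_cm) / genome_cm  # fraction of genome shared IBD1
    pi2 = sum(ibd2_cm) / genome_cm  # fraction of genome shared IBD2
    pi0 = 1.0 - pi1 - pi2           # fraction with no IBD sharing
    phi = pi1 / 4.0 + pi2 / 2.0     # kinship coefficient
    return phi, pi0


# Idealised full siblings: half the genome IBD1, a quarter IBD2.
phi, pi0 = kinship_from_ibd([1700.0], [850.0], genome_cm=3400.0)
```

In this idealised case both ϕ and π0 come out at 0.25. A parent-child pair has the same ϕ but π0 = 0, which is why the two quantities are reported together.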


2021 ◽  
Author(s):  
Wei Jiang ◽  
Xiangyu Zhang ◽  
Siting Li ◽  
Shuang Song ◽  
Hongyu Zhao

Accurate estimation of relatedness is important for genetic data analyses, such as association mapping and heritability estimation based on data collected from genome-wide association studies. Inaccurate relatedness estimates may lead to spurious associations and biased heritability estimates. Individual-level genotype data are often used to estimate the kinship coefficient between individuals. The commonly used sample correlation-based genomic relationship matrix (scGRM) method estimates the kinship coefficient by calculating the average sample correlation coefficient across all single nucleotide polymorphisms (SNPs), where the observed allele frequencies are used to calculate both the expectations and the variances of the genotypes. Although this method is widely used, a substantial proportion of the estimated kinship coefficients are negative, which makes them difficult to interpret. In this paper, through mathematical derivation, we show that the kinship coefficient estimated by the scGRM method is indeed biased when the observed allele frequencies are treated as the true frequencies. This leads to a negative bias in the average kinship estimate across all individuals, which explains the negative estimated kinship coefficients. Based on this observation, we propose an unbiased estimation method, UKin, which can reduce the bias. We justify our improved method with rigorous mathematical proof. We have conducted simulations as well as two real data analyses to demonstrate that both bias and root mean square error in kinship coefficient estimation can be reduced by using UKin. Further simulations indicate that the power in association mapping can also be improved by using our unbiased kinship estimates to adjust for cryptic relatedness.
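
The negative average described above can be reproduced with a small NumPy sketch of a sample correlation-based GRM. This is a generic implementation written for illustration, not the UKin code. The function name `sc_grm` and the simulated data are assumptions, and the entries are left on the correlation scale (under a common convention, roughly twice the kinship coefficient). Because each SNP is centred with its observed allele frequency, the standardised genotypes sum to zero across individuals, so the off-diagonal mean is forced below zero even for unrelated samples.

```python
import numpy as np


def sc_grm(genotypes: np.ndarray) -> np.ndarray:
    """Sample correlation-based GRM from an (individuals x SNPs) 0/1/2 matrix.

    Hypothetical helper: observed allele frequencies are used for both
    the centring and the scaling, as in the scGRM method described above.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0          # observed allele frequency per SNP
    keep = (p > 0.0) & (p < 1.0)      # drop monomorphic SNPs (zero variance)
    g, p = g[:, keep], p[keep]
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    return z @ z.T / z.shape[1]       # average over SNPs


# Simulated unrelated individuals (assumed setup for the example).
rng = np.random.default_rng(0)
freqs = rng.uniform(0.1, 0.9, size=2000)
geno = rng.binomial(2, freqs, size=(50, 2000))

grm = sc_grm(geno)
n = grm.shape[0]
off_diag_mean = (grm.sum() - np.trace(grm)) / (n * (n - 1))
```

Here `grm.sum()` is zero up to rounding, so `off_diag_mean` equals minus the mean diagonal divided by n − 1: a small negative value that shrinks as the sample grows but never reaches zero.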


2020 ◽  
Vol 36 (16) ◽  
pp. 4519-4520
Author(s):  
Ying Zhou ◽  
Sharon R Browning ◽  
Brian L Browning

Abstract Motivation: Estimation of pairwise kinship coefficients in large datasets is computationally challenging because the number of related individuals increases quadratically with sample size. Results: We present IBDkin, a software package written in C for estimating kinship coefficients from identity by descent (IBD) segments. We use IBDkin to estimate kinship coefficients for 7.95 billion pairs of individuals in the UK Biobank who share at least one detected IBD segment with length ≥ 4 cM. Availability and implementation: https://github.com/YingZhou001/IBDkin. Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Author(s):  
Sweklej Edyta ◽  
Horoszewicz Elżbieta ◽  
Niedziółka Roman

Abstract The aim of the study was to analyse the population structure, kinship coefficients and inbreeding trend of Tatra Shepherd dogs, taking into account sex, breeding system (champions, CH, and non-champions, nCH), breeding country (Poland, PL, and foreign countries, Z) and the degree of inbreeding. Of the 587 currently registered Tatra Shepherd dogs, 41.9% have been qualified for breeding. In the past decade, 1961 puppies were born, which corresponds to an average litter size of 5.8 puppies. The breed's inbreeding rate was 6.34%, and 6.68% for a 4-generation population. The highest inbreeding rate was found in the nCH and PL groups, in both male and female dogs. The inbreeding rate was significantly higher in 2005-2014 than in 1994-2004. The limit value FX was exceeded for 25.65% of the Tatra Shepherd dogs, and the critical value was exceeded for 11.52%. An increasing ancestor loss coefficient (AVK) was found, which may result in an increased number of inbred animals. In particular, this applied to female dogs in the nCH, PL, and F groups, whereas a significant increase in AVK was observed in the group of male dogs from foreign kennels. The resulting COR values, 55.58% for males and 55.44% for females, testify to insignificant inbreeding and suggest that breeders look for male inbreds. The study shows that there is no risk of inbreeding depression yet; however, the gene pool of the Tatra Shepherd dog breed has become noticeably restricted. In addition, keeping the stud book for the breed open must be considered, given the increasing popularity of the breed and the resulting increase in matings.


2018 ◽  
Vol 35 (6) ◽  
pp. 1002-1008 ◽  
Author(s):  
Brent Kirkpatrick ◽  
Shufei Ge ◽  
Liangliang Wang

F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 281
Author(s):  
Matthew Frampton ◽  
Elena R. Schiff ◽  
Nikolas Pontikos ◽  
Anthony W. Segal ◽  
Adam P. Levine

This article introduces seqfam, a python package which is primarily designed for analysing next generation sequencing (NGS) DNA data from families with known pedigree information in order to identify rare variants that are potentially causal of a disease/trait of interest. It uses the popular and versatile Pandas library, and can be straightforwardly integrated into existing analysis code/pipelines. Seqfam can be used to verify pedigree information, to perform Monte Carlo gene dropping, to undertake regression-based gene burden testing, and to identify variants which segregate by affection status in families via user-defined pattern of occurrence rules. Additionally, it can generate scripts for running analyses in a “MapReduce pattern” on a computer cluster, something which is usually desirable in NGS data analysis and indeed “big data” analysis in general. This article summarises how seqfam’s main user functions work and motivates their use. It also provides explanatory context for example scripts and data included in the package which demonstrate use cases. With respect to verifying pedigree information, software exists for efficiently calculating kinship coefficients, so seqfam performs the necessary extra steps of mapping pedigrees and kinship coefficients to expected and observed degrees of relationship respectively. Gene dropping and the application of variant pattern of occurrence rules in families can provide evidence for a variant being causal. The authors are unaware of other software which performs these tasks in familial cohorts, so seqfam fulfils this need. Gene burden rather than single marker tests are often used to detect rare causal variants due to greater power. Seqfam may be an attractive alternative to existing gene burden testing software due to its flexibility, particularly in grouping and aggregating variants.
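
The step of mapping a kinship coefficient to an observed degree of relationship can be sketched as below. This is not seqfam's API: the function name is hypothetical, and the cut-offs are the widely used KING-style inference thresholds, where a pair is assigned degree d when its kinship exceeds 2^-(d + 1.5).

```python
from typing import Optional


def degree_from_kinship(phi: float, max_degree: int = 3) -> Optional[int]:
    """Map a kinship coefficient to a degree of relationship.

    Hypothetical helper using KING-style thresholds:
      phi > 2**-1.5 (about 0.354)  -> 0 (duplicate sample or MZ twin)
      phi > 2**-2.5 (about 0.177)  -> 1 (first degree)
      phi > 2**-3.5 (about 0.0884) -> 2 (second degree)
      phi > 2**-4.5 (about 0.0442) -> 3 (third degree)
    Returns None when the pair looks unrelated up to max_degree.
    """
    if phi > 2 ** -1.5:
        return 0
    for degree in range(1, max_degree + 1):
        if phi > 2 ** -(degree + 1.5):
            return degree
    return None


# Expected kinship: 0.25 (parent-child, full sibs), 0.125 (half sibs),
# 0.0625 (first cousins).
observed = [degree_from_kinship(phi) for phi in (0.25, 0.125, 0.0625, 0.01)]
```

Comparing such observed degrees with the degrees expected from the recorded pedigree is one way to flag sample swaps or misreported relationships.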


2016 ◽  
Author(s):  
Alejandro Ochoa ◽  
John D. Storey

Abstract FST is a fundamental measure of genetic differentiation and population structure, currently defined for subdivided populations. FST in practice typically assumes independent, non-overlapping subpopulations, which all split simultaneously from their last common ancestral population so that genetic drift in each subpopulation is probabilistically independent of the other subpopulations. We introduce a generalized FST definition for arbitrary population structures, where individuals may be related in arbitrary ways, allowing for arbitrary probabilistic dependence among individuals. Our definitions are built on identity-by-descent (IBD) probabilities that relate individuals through inbreeding and kinship coefficients. We generalize FST as the mean inbreeding coefficient of the individuals’ local populations relative to their last common ancestral population. We show that the generalized definition agrees with Wright’s original and the independent subpopulation definitions as special cases. We define a novel coancestry model based on “individual-specific allele frequencies” and prove that its parameters correspond to probabilistic kinship coefficients. Lastly, we extend the Pritchard-Stephens-Donnelly admixture model in the context of our coancestry model and calculate its FST. To motivate this work, we include a summary of analyses we have carried out in follow-up papers, where our new approach has been applied to simulations and global human data, showcasing the complexity of human population structure, demonstrating our success in estimating kinship and FST, and the shortcomings of existing approaches. The probabilistic framework we introduce here provides a theoretical foundation that extends FST in terms of inbreeding and kinship coefficients to arbitrary population structures, paving the way for new estimators and novel analyses.
Note: This article is Part I of a two-part manuscript. We refer to the two parts in the text as Part I and Part II, respectively.
Part I: Alejandro Ochoa and John D. Storey. “FST and kinship for arbitrary population structures I: Generalized definitions”. bioRxiv (10.1101/083915) (2019). https://doi.org/10.1101/083915. First published 2016-10-27.
Part II: Alejandro Ochoa and John D. Storey. “FST and kinship for arbitrary population structures II: Method of moments estimators”. bioRxiv (10.1101/083923) (2019). https://doi.org/10.1101/083923. First published 2016-10-27.
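
The verbal definition in the abstract above (FST as a mean inbreeding coefficient of local populations relative to the ancestral population) can be written compactly as follows. The notation is an assumption made for this sketch and may differ from the manuscripts': n individuals, weights w_j that sum to one, L_j for the local population of individual j, and T for the last common ancestral population. The second line is the standard identity linking an individual's self-kinship to its inbreeding coefficient.

```latex
% Generalized F_ST as a weighted mean of local inbreeding coefficients
F_{ST} = \sum_{j=1}^{n} w_j \, f_{L_j}^{T},
\qquad w_j \ge 0, \quad \sum_{j=1}^{n} w_j = 1

% Self-kinship of individual j in terms of its inbreeding coefficient
\varphi_{jj}^{T} = \frac{1 + f_{j}^{T}}{2}
```

With uniform weights w_j = 1/n this is simply the average of the local inbreeding coefficients.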

