kinship coefficients Latest Research Papers

Genetic Diversity and Genomic Selection in Eucalyptus Benthamii

10.21203/rs.3.rs-416984/v1 ◽

2021 ◽

Author(s):

Bruno Marchetti de Souza ◽

Lucas Moura de Abreu ◽

Marília de Castro Rodrigues Pappas ◽

Vânia Cristina Rennó Azevedo ◽

Paulo Eduardo Telles dos Santos ◽

...

Keyword(s):

Genetic Diversity ◽

Genomic Selection ◽

Natural Populations ◽

Predictive Ability ◽

Breeding Population ◽

Phenotypic Traits ◽

Breeding Populations ◽

Kinship Coefficients ◽

Hardy Weinberg Equilibrium ◽

The One

Abstract: The study investigates the genetic diversity of an E. benthamii trial and the ability of genome-wide selection to predict genomic breeding values. All 115 individuals of the breeding population were genotyped at 13 microsatellite loci. Diameter at breast height and total height were measured. The data analysis was carried out using the software Structure, Popgene, GDA, SPAGeDi1.5 and R. Predictive ability, heritability and the standard errors of the markers were estimated using the RRblup method. The average number of alleles per locus was nine, and the number of alleles per locus varied from 3 to 17. The average expected heterozygosity (He = 0.655) was very similar to the observed heterozygosity, and the estimated inbreeding (F = 0.02) was very low. These results indicate that this population is in Hardy-Weinberg equilibrium for most loci. The genetic diversity of the trial is considered high, since the trial sample showed values similar to those of the natural populations. The group coancestry (0.085) indicates that the trees in this population are, in general, related at the half-sib level. Using Evanno's method, it is inferred that the individuals came from two original populations. The genetic distance calculated between the two groups was low ( =0.21). The heritability estimated from genomic selection for the phenotypic traits was very low; however, the heritability estimated using the kinship coefficients was higher. The marker-based heritability using kinship coefficients is probably more accurate than the one estimated using genomic selection, showing that the population samples can be used to establish breeding populations and hybrids and to enrich the species germplasm bank.
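
The heterozygosity and inbreeding figures quoted in this abstract can be illustrated with a minimal single-locus sketch. It assumes the textbook definitions He = 1 - Σ p_i² and F = 1 - Ho/He; the genotypes and function names below are hypothetical, and the software named in the abstract may average over loci or apply corrections that are not shown here.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """He = 1 - sum(p_i^2) over the allele frequencies at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def observed_heterozygosity(genotypes):
    """Fraction of individuals that carry two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def inbreeding_coefficient(genotypes):
    """F = 1 - Ho / He for a single locus (no sample-size correction)."""
    alleles = [allele for pair in genotypes for allele in pair]
    he = expected_heterozygosity(alleles)
    ho = observed_heterozygosity(genotypes)
    return 1.0 - ho / he


# Hypothetical microsatellite genotypes (allele sizes in bp) for six trees.
toy_locus = [(150, 154), (150, 150), (154, 158),
             (150, 158), (154, 154), (150, 154)]
he_toy = expected_heterozygosity([a for pair in toy_locus for a in pair])
f_toy = inbreeding_coefficient(toy_locus)
```

Under this simple definition, He = 0.655 with F = 0.02 would correspond to an observed heterozygosity of roughly 0.64, which is consistent with the abstract's remark that the two values are very similar.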

RAFFI: Accurate and fast familial relationship inference in large scale biobank studies using RaPID

PLoS Genetics ◽

10.1371/journal.pgen.1009315 ◽

2021 ◽

Vol 17 (1) ◽

pp. e1009315

Author(s):

Ardalan Naseri ◽

Junjie Shi ◽

Xihong Lin ◽

Shaojie Zhang ◽

Degui Zhi

Keyword(s):

Large Scale ◽

Association Studies ◽

Scale Up ◽

Data Driven ◽

Genome Wide Association Studies ◽

Inference Method ◽

Genome Wide ◽

Familial Relationship ◽

Kinship Coefficients ◽

Data Driven Approach

Inference of relationships from whole-genome genetic data of a cohort is a crucial prerequisite for genome-wide association studies. Typically, relationships are inferred by computing the kinship coefficients (ϕ) and the genome-wide probability of zero IBD sharing (π0) among all pairs of individuals. Current leading methods are based on pairwise comparisons, which may not scale up to very large cohorts (e.g., sample size >1 million). Here, we propose an efficient relationship inference method, RAFFI. RAFFI leverages the efficient RaPID method to call IBD segments first, then estimate the ϕ and π0 from detected IBD segments. This inference is achieved by a data-driven approach that adjusts the estimation based on phasing quality and genotyping quality. Using simulations, we showed that RAFFI is robust against phasing/genotyping errors, admix events, and varying marker densities, and achieves higher accuracy compared to KING, the current leading method, especially for more distant relatives. When applied to the phased UK Biobank data with ~500K individuals, RAFFI is approximately 18 times faster than KING. We expect RAFFI will offer fast and accurate relatedness inference for even larger cohorts.
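
To make the roles of ϕ and π0 concrete, here is a minimal sketch that derives both from the genome-wide proportions shared IBD1 and IBD2, using the standard relation ϕ = k1/4 + k2/2 and π0 = 1 - k1 - k2. The function name, the input layout and the 3500 cM genome length are assumptions for illustration; this is not RAFFI's data-driven estimator, which also adjusts for phasing and genotyping quality.

```python
def kinship_and_pi0(ibd1_cm, ibd2_cm, genome_cm):
    """Return (phi, pi0) from total IBD1 and IBD2 lengths in centimorgans.

    k1 and k2 are the genome fractions shared IBD1 and IBD2, so
    phi = k1 / 4 + k2 / 2 and pi0 = 1 - k1 - k2.
    """
    k1 = ibd1_cm / genome_cm
    k2 = ibd2_cm / genome_cm
    phi = k1 / 4 + k2 / 2
    pi0 = 1 - k1 - k2
    return phi, pi0


GENOME_CM = 3500.0  # hypothetical genome length, for illustration only

# Expected sharing for full siblings: half the genome IBD1, a quarter IBD2.
full_sibs = kinship_and_pi0(1750.0, 875.0, GENOME_CM)

# Expected sharing for parent and offspring: the whole genome IBD1.
parent_offspring = kinship_and_pi0(3500.0, 0.0, GENOME_CM)
```

Both toy pairs have ϕ = 0.25, but π0 is 0.25 for the full siblings and 0 for the parent and offspring, which is why the two quantities are reported together.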

Correcting statistical bias in correlation-based kinship estimators

10.1101/2021.01.13.426515 ◽

2021 ◽

Author(s):

Wei Jiang ◽

Xiangyu Zhang ◽

Siting Li ◽

Shuang Song ◽

Hongyu Zhao

Keyword(s):

Association Mapping ◽

Allele Frequencies ◽

Kinship Coefficient ◽

Unbiased Estimation ◽

Relationship Matrix ◽

Genome Wide Association Studies ◽

Data Analyses ◽

Heritability Estimation ◽

Kinship Coefficients ◽

Sample Correlation

Accurate estimate of relatedness is important for genetic data analyses, such as association mapping and heritability estimation based on data collected from genome-wide association studies. Inaccurate relatedness estimates may lead to spurious associations and biased heritability estimations. Individual-level genotype data are often used to estimate kinship coefficient between individuals. The commonly used sample correlation-based genomic relationship matrix (scGRM) method estimates kinship coefficient by calculating the average sample correlation coefficient among all single nucleotide polymorphisms (SNPs), where the observed allele frequencies are used to calculate both the expectations and variances of genotypes. Although this method is widely used, a substantial proportion of estimated kinship coefficients are negative, which are difficult to interpret. In this paper, through mathematical derivation, we show that there indeed exists bias in the estimated kinship coefficient using the scGRM method when the observed allele frequencies are regarded as true frequencies. This leads to negative bias for the average estimate of kinship among all individuals, which explains the estimated negative kinship coefficients. Based on this observation, we propose an unbiased estimation method, UKin, which can reduce the bias. We justify our improved method with rigorous mathematical proof. We have conducted simulations as well as two real data analyses to demonstrate that both bias and root mean square error in kinship coefficient estimation can be reduced by using UKin. Further simulations indicate that the power in association mapping can also be improved by using our unbiased kinship estimates to adjust for cryptic relatedness.
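
The negative bias described above can be reproduced with a small NumPy sketch of a correlation-based estimator. The simulated genotypes are hypothetical, and the result is halved under the common convention that GRM entries estimate twice the kinship; the paper's exact scaling may differ, and UKin itself is not implemented here.

```python
import numpy as np


def scgrm_kinship(genotypes):
    """Correlation-based kinship from an (individuals x SNPs) 0/1/2 matrix.

    Observed allele frequencies are plugged in for both the mean (2p)
    and the variance (2p(1-p)) of each SNP.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0                 # observed allele frequencies
    keep = (p > 0) & (p < 1)                 # drop monomorphic SNPs
    g, p = g[:, keep], p[keep]
    z = (g - 2 * p) / np.sqrt(2 * p * (1 - p))
    grm = z @ z.T / g.shape[1]               # average over SNPs
    return grm / 2.0                         # assumed convention: kinship = GRM / 2


# Simulate 50 unrelated individuals at 2000 independent SNPs.
rng = np.random.default_rng(0)
n_ind, n_snp = 50, 2000
freqs = rng.uniform(0.1, 0.9, size=n_snp)
geno = rng.binomial(2, freqs, size=(n_ind, n_snp))

phi = scgrm_kinship(geno)
off_diag_mean = (phi.sum() - np.trace(phi)) / (n_ind * (n_ind - 1))
```

Because each SNP is centred on its own sample mean, the entries of the matrix sum to zero, so the positive diagonal forces the average off-diagonal estimate below zero even though the simulated individuals are unrelated.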

IBDkin: fast estimation of kinship coefficients from identity by descent segments

Bioinformatics ◽

10.1093/bioinformatics/btaa569 ◽

2020 ◽

Vol 36 (16) ◽

pp. 4519-4520

Author(s):

Ying Zhou ◽

Sharon R Browning ◽

Brian L Browning

Keyword(s):

Software Package ◽

Large Datasets ◽

Supplementary Information ◽

Supplementary Data ◽

UK Biobank ◽

Identity By Descent ◽

Fast Estimation ◽

Kinship Coefficients ◽

Related Individuals ◽

The Uk

Abstract
Motivation: Estimation of pairwise kinship coefficients in large datasets is computationally challenging because the number of related individuals increases quadratically with sample size.
Results: We present IBDkin, a software package written in C for estimating kinship coefficients from identity by descent (IBD) segments. We use IBDkin to estimate kinship coefficients for 7.95 billion pairs of individuals in the UK Biobank who share at least one detected IBD segment with length ≥ 4 cM.
Availability and implementation: https://github.com/YingZhou001/IBDkin.
Supplementary information: Supplementary data are available at Bioinformatics online.
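
As a rough illustration of segment-based estimation, the sketch below sums haplotype-level IBD segment lengths for each pair of individuals and divides by four times the genome length, since each pair of individuals has four haplotype pairs. The record layout, the function name and the 3500 cM genome length are assumptions; IBDkin is written in C, and its actual input format and any corrections it applies are not reproduced here.

```python
from collections import defaultdict


def kinship_from_segments(segments, genome_cm, min_cm=4.0):
    """Estimate kinship per pair from haplotype-level IBD segments.

    segments: iterable of (id1, id2, length_cm), one record per segment
    shared between one haplotype of id1 and one haplotype of id2.
    Only pairs with at least one segment of min_cm or longer appear
    in the result, so the full set of pairs is never enumerated.
    """
    totals = defaultdict(float)
    for id1, id2, length_cm in segments:
        if length_cm >= min_cm:
            pair = tuple(sorted((id1, id2)))  # treat (A, B) and (B, A) alike
            totals[pair] += length_cm
    return {pair: total / (4.0 * genome_cm) for pair, total in totals.items()}


# Hypothetical segment records; the A-C segment is below the 4 cM cutoff.
toy_segments = [
    ("A", "B", 100.0),
    ("B", "A", 50.0),
    ("A", "C", 3.0),
    ("C", "D", 8.0),
]
toy_kinship = kinship_from_segments(toy_segments, genome_cm=3500.0)
```

Keying the accumulator by pair keeps memory proportional to the number of pairs that actually share a segment rather than to the square of the sample size.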

Analysis of structure of the population, kinship coefficients and inbreeding trend depending on sex, type of breeding of Tatra Shepherd dogs

10.1101/2020.02.19.956045 ◽

2020 ◽

Author(s):

Sweklej Edyta ◽

Horoszewicz Elżbieta ◽

Niedziółka Roman

Keyword(s):

Breeding System ◽

Gene Pool ◽

Critical Value ◽

Male And Female ◽

The Past ◽

Limit Value ◽

Sex Type ◽

Kinship Coefficients ◽

Inbreeding Rate ◽

Dog Breed

Abstract: The aim of the study was to analyse the structure of the population, kinship coefficients and inbreeding trend taking into account the sex, breeding system: champions (CH) and non-champions (nCH), breeding country: Poland (PL) and foreign country (Z) and the inbreeding degree of Tatra Shepherd dogs. Out of the currently registered 587 Tatra Shepherd dogs, 41.9% have been qualified for breeding. In the past decade, 1961 puppies were born, which corresponds to an average litter of 5.8 puppies. The breed’s inbreeding rate amounted to 6.34%, and for a 4-generation population was 6.68%. The highest inbreeding rate was found in nCH and PL groups consisting of both male and female dogs. The inbreeding rate was significantly higher in 2005-2014 compared to the years 1994-2004. The limit value FX was exceeded for 25.65% of Shepherd dogs, and the critical value was exceeded for 11.52%. An increasing ancestor loss coefficient (AVK) was found, which may result in an increased number of inbred animals. In particular, it referred to female dogs in the nCH, PL, and F group, whereas a significant increase of AVK was observed in the group of male dogs from foreign kennels. The resulting COR values, respectively 55.58% for males and 55.44% for females, testify to insignificant inbreeding and suggest that breeders look for male inbreds. Studies have shown that there is no risk of inbreeding depression yet; however, the gene pool of the Tatra Shepherd dog breed has become noticeably restricted. In addition, leaving the stud book for the breed open must be considered due to an increase in the popularity of the breed, and thus an increase in mating.

Concept for gene conservation strategy for the endangered Chinese yellowhorn, Xanthoceras sorbifolium, based on simulation of pairwise kinship coefficients

Forest Ecology and Management ◽

10.1016/j.foreco.2018.10.045 ◽

2019 ◽

Vol 432 ◽

pp. 976-982 ◽

Cited By ~ 2

Author(s):

Yousry A. El-Kassaby ◽

Qing Wang ◽

Tongli Wang ◽

Blaise Ratcliffe ◽

Quan-Xin Bi ◽

...

Keyword(s):

Conservation Strategy ◽

Gene Conservation ◽

Xanthoceras Sorbifolium ◽

Kinship Coefficients

Efficient computation of the kinship coefficients

Bioinformatics ◽

10.1093/bioinformatics/bty725 ◽

2018 ◽

Vol 35 (6) ◽

pp. 1002-1008 ◽

Cited By ~ 2

Author(s):

Brent Kirkpatrick ◽

Shufei Ge ◽

Liangliang Wang

Keyword(s):

Efficient Computation ◽

Kinship Coefficients

Seqfam: A python package for analysis of Next Generation Sequencing DNA data in families

F1000Research ◽

10.12688/f1000research.13930.1 ◽

2018 ◽

Vol 7 ◽

pp. 281

Author(s):

Matthew Frampton ◽

Elena R. Schiff ◽

Nikolas Pontikos ◽

Anthony W. Segal ◽

Adam P. Levine

Keyword(s):

Data Analysis ◽

Next Generation Sequencing ◽

Rare Variants ◽

Pedigree Information ◽

Attractive Alternative ◽

Next Generation ◽

Kinship Coefficients ◽

Single Marker ◽

Python Package ◽

Generation Sequencing

This article introduces seqfam, a python package which is primarily designed for analysing next generation sequencing (NGS) DNA data from families with known pedigree information in order to identify rare variants that are potentially causal of a disease/trait of interest. It uses the popular and versatile Pandas library, and can be straightforwardly integrated into existing analysis code/pipelines. Seqfam can be used to verify pedigree information, to perform Monte Carlo gene dropping, to undertake regression-based gene burden testing, and to identify variants which segregate by affection status in families via user-defined pattern of occurrence rules. Additionally, it can generate scripts for running analyses in a “MapReduce pattern” on a computer cluster, something which is usually desirable in NGS data analysis and indeed “big data” analysis in general. This article summarises how seqfam’s main user functions work and motivates their use. It also provides explanatory context for example scripts and data included in the package which demonstrate use cases. With respect to verifying pedigree information, software exists for efficiently calculating kinship coefficients, so seqfam performs the necessary extra steps of mapping pedigrees and kinship coefficients to expected and observed degrees of relationship respectively. Gene dropping and the application of variant pattern of occurrence rules in families can provide evidence for a variant being causal. The authors are unaware of other software which performs these tasks in familial cohorts, so seqfam fulfils this need. Gene burden rather than single marker tests are often used to detect rare causal variants due to greater power. Seqfam may be an attractive alternative to existing gene burden testing software due to its flexibility, particularly in grouping and aggregating variants.
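
The pedigree check described above, which compares an expected degree of relationship with the degree implied by an estimated kinship coefficient, might be sketched as follows. The cutoffs are the widely used KING-style geometric midpoints 2^-(d + 1.5); the function names are hypothetical and are not seqfam's API.

```python
def expected_kinship(degree):
    """Expected kinship for a degree-d relationship: 2 ** -(d + 1)."""
    return 2.0 ** -(degree + 1)


def inferred_degree(phi, max_degree=3):
    """Map an estimated kinship coefficient to a degree of relationship.

    Returns 0 for duplicates or monozygotic twins, 1..max_degree for
    relatives, and None when phi is below the last cutoff.
    """
    if phi > 2.0 ** -1.5:                     # about 0.354
        return 0
    for degree in range(1, max_degree + 1):
        if phi > 2.0 ** -(degree + 1.5):      # 0.177, 0.0884, 0.0442
            return degree
    return None


def pedigree_mismatch(pedigree_degree, phi):
    """True when the observed degree disagrees with the pedigree."""
    return inferred_degree(phi) != pedigree_degree


# Hypothetical pair recorded as first-degree relatives in the pedigree,
# but with an estimated kinship close to the second-degree expectation.
flagged = pedigree_mismatch(1, 0.12)
```

Placing each cutoff halfway between adjacent expected values on the log scale gives every degree a symmetric tolerance band around its expected kinship.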

Repurposing kinship coefficients as a sample integrity method for next generation sequencing data in a clinical setting

Model Assisted Statistics and Applications ◽

10.3233/mas-170401 ◽

2017 ◽

Vol 12 (3) ◽

pp. 265-273 ◽

Cited By ~ 1

Author(s):

Yoonha Choi ◽

Joshua Babiarz ◽

Ed Tom ◽

Giulia C. Kennedy ◽

Jing Huang

Keyword(s):

Next Generation Sequencing ◽

Clinical Setting ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Kinship Coefficients ◽

Generation Sequencing

FST and kinship for arbitrary population structures I: Generalized definitions

10.1101/083915 ◽

2016 ◽

Cited By ~ 11

Author(s):

Alejandro Ochoa ◽

John D. Storey

Keyword(s):

Population Structure ◽

Ancestral Population ◽

Link Type ◽

Special Cases ◽

Kinship Coefficients ◽

Population Structures ◽

Subdivided Populations ◽

Specific Allele ◽

Moments Estimators

Abstract: FST is a fundamental measure of genetic differentiation and population structure, currently defined for subdivided populations. FST in practice typically assumes independent, non-overlapping subpopulations, which all split simultaneously from their last common ancestral population so that genetic drift in each subpopulation is probabilistically independent of the other subpopulations. We introduce a generalized FST definition for arbitrary population structures, where individuals may be related in arbitrary ways, allowing for arbitrary probabilistic dependence among individuals. Our definitions are built on identity-by-descent (IBD) probabilities that relate individuals through inbreeding and kinship coefficients. We generalize FST as the mean inbreeding coefficient of the individuals’ local populations relative to their last common ancestral population. We show that the generalized definition agrees with Wright’s original and the independent subpopulation definitions as special cases. We define a novel coancestry model based on “individual-specific allele frequencies” and prove that its parameters correspond to probabilistic kinship coefficients. Lastly, we extend the Pritchard-Stephens-Donnelly admixture model in the context of our coancestry model and calculate its FST. To motivate this work, we include a summary of analyses we have carried out in follow-up papers, where our new approach has been applied to simulations and global human data, showcasing the complexity of human population structure, demonstrating our success in estimating kinship and FST, and the shortcomings of existing approaches. The probabilistic framework we introduce here provides a theoretical foundation that extends FST in terms of inbreeding and kinship coefficients to arbitrary population structures, paving the way for new estimators and novel analyses.

Note: This article is Part I of two-part manuscripts. We refer to these in the text as Part I and Part II, respectively.

Part I: Alejandro Ochoa and John D. Storey. “FST and kinship for arbitrary population structures I: Generalized definitions”. bioRxiv (10.1101/083915) (2019). https://doi.org/10.1101/083915. First published 2016-10-27.

Part II: Alejandro Ochoa and John D. Storey. “FST and kinship for arbitrary population structures II: Method of moments estimators”. bioRxiv (10.1101/083923) (2019). https://doi.org/10.1101/083923. First published 2016-10-27.
