Inferring identical by descent sharing of sample ancestors promotes high resolution relative detection

2018
Author(s):  
Monica D. Ramstetter ◽  
Sushila A. Shenoy ◽  
Thomas D. Dyer ◽  
Donna M. Lehman ◽  
Joanne E. Curran ◽  
...  

Abstract
As genetic datasets increase in size, the fraction of samples with one or more close relatives grows rapidly, resulting in sets of mutually related individuals. We present DRUID—Deep Relatedness Utilizing Identity by Descent—a method that infers the identical-by-descent (IBD) sharing profile of an ungenotyped ancestor of a set of close relatives. Using this IBD profile, DRUID infers relatedness between unobserved ancestors and more distant relatives, thereby combining information from multiple samples to remove one or more generations between the deep relationships to be identified. DRUID constructs sets of close relatives by detecting full siblings and also uses a novel approach to identify the aunts/uncles of two or more siblings, recovering 92.2% of real aunts/uncles with zero false positives. In real and simulated data, DRUID correctly infers up to 10.5% more relatives than PADRE when using data from two sets of distantly related siblings, and 10.7–31.3% more relatives given two sets of siblings and their aunts/uncles. DRUID frequently infers relationships either correctly or within one degree of the truth, with PADRE classifying 43.3–58.3% of tenth-degree relatives in this way compared to 79.6–96.7% using DRUID.
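The core step described above, pooling the IBD segments that each sibling shares with a distant relative and treating their union as an estimate of the ungenotyped parent's sharing profile, can be sketched as follows. This is a simplified single-chromosome illustration, not the published DRUID implementation; the coordinates (in cM) and function names are assumptions for the example.

```python
def merge_segments(segments):
    """Merge a list of (start, end) IBD segments (positions in cM)
    into their union of non-overlapping intervals."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # overlaps or abuts the previous interval: extend it
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def ancestor_profile(per_sibling_segments):
    """Approximate the IBD profile of the siblings' ungenotyped parent
    with a distant relative by pooling the segments each sibling shares
    with that relative and taking their union."""
    pooled = [seg for sib in per_sibling_segments for seg in sib]
    return merge_segments(pooled)

# two siblings, each sharing different (overlapping) segments with the relative
profile = ancestor_profile([[(10.0, 35.0)], [(30.0, 52.0), (70.0, 80.0)]])
```

Because each sibling inherits only part of the parent's genome, the union over several siblings recovers more of the parent's sharing than any one sibling alone, which is what lets DRUID "remove a generation" from the inference.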

2019
Author(s):  
Ying Qiao ◽  
Jens Sannerud ◽  
Sayantani Basu-Roy ◽  
Caroline Hayward ◽  
Amy L. Williams

Abstract
The proportion of samples with one or more close relatives in a genetic dataset increases rapidly with sample size, necessitating relatedness modeling and enabling pedigree-based analyses. Despite this, relatives are generally unreported, and current inference methods typically detect only the degree of relatedness of sample pairs and not their pedigree relationships. We developed CREST, an accurate and fast method that identifies the pedigree relationships of close relatives. CREST utilizes identical-by-descent (IBD) segments shared between a pair of samples and their mutual relatives, leveraging the fact that sharing rates among these individuals differ across pedigree configurations. Furthermore, CREST exploits the profound differences between the sex-specific genetic maps to classify pairs as maternally or paternally related—e.g., paternal half-siblings—using the locations of autosomal IBD segments shared between the pair. In simulated data, CREST correctly classifies 91.5–99.5% of grandparent–grandchild (GP) pairs, 70.5–97.0% of avuncular (AV) pairs, and 79.0–98.0% of half-sibling (HS) pairs, compared to PADRE's rates of 38.5–76.0% of GP, 60.5–92.0% of AV, and 73.0–95.0% of HS pairs. Turning to the real 20,032-sample Generation Scotland (GS) dataset, CREST correctly determines the relationship of 99.0% of GP, 85.7% of AV, and 95.0% of HS pairs that have sufficient mutual relative data, completing this analysis in 10.1 CPU hours including IBD detection. CREST's maternal and paternal relationship inference is also accurate: it flagged five pairs as incorrectly labeled in the GS pedigrees, three of which we confirmed as mistakes and two of which have an uncertain relationship, yielding 99.7% of HS and 93.5% of GP pairs correctly classified.
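The idea that sharing rates with mutual relatives differ across pedigree configurations can be illustrated with a toy nearest-profile classifier. The expected-rate numbers below are hypothetical placeholders, not CREST's actual model (which is likelihood-based); only the shape of the idea, comparing observed sharing against per-configuration expectations, is taken from the abstract.

```python
def classify(observed, expected_profiles):
    """Return the pedigree configuration whose expected sharing profile
    is closest (squared Euclidean distance) to the observed rates."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(expected_profiles, key=lambda cfg: dist(observed, expected_profiles[cfg]))

# Hypothetical expected IBD-sharing rates between a mutual relative and
# each member of the pair, one tuple per configuration (GP = grandparent-
# grandchild, AV = avuncular, HS = half-sibling). Illustrative values only.
profiles = {"GP": (0.50, 0.25), "AV": (0.50, 0.50), "HS": (0.25, 0.25)}

label = classify((0.48, 0.27), profiles)
```

Because the profiles separate in this rate space, even noisy observed rates tend to land nearest the correct configuration, which is the intuition behind classifying pairs via their mutual relatives.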


2000
Vol 37 (3)
pp. 850-864
Author(s):  
Sharon Browning

Two related individuals are identical by descent at a genetic locus if they share the same gene copy at that locus due to inheritance from a recent common ancestor. We consider idealized continuous identity by descent (IBD) data in which IBD status is known continuously along chromosomes. IBD data contains information about the relationship between the two individuals, and about the underlying crossover processes. We present a Monte Carlo method for calculating probabilities for IBD data. The method is not restricted to Haldane's Poisson process model of crossing-over but may be used with other models including the chi-square, Kosambi renewal and Sturt models. Results of a simulation study demonstrate that IBD data can be used to distinguish between alternative models for the crossover process.
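Under Haldane's model, the crossover locations along a chromosome form a Poisson process with rate 1 per Morgan. The following minimal Monte Carlo sketch (an illustration of that crossover model, not the paper's method for full continuous IBD data) simulates crossovers between two loci and checks the estimated recombination fraction against Haldane's map function:

```python
import math
import random

def crossovers(length_morgans, rng):
    """Haldane model: crossover locations are a Poisson process,
    rate 1 per Morgan, along an interval of the given genetic length."""
    points, x = [], 0.0
    while True:
        x += rng.expovariate(1.0)
        if x >= length_morgans:
            return points
        points.append(x)

def recombinant(d, rng):
    """Two loci d Morgans apart recombine iff an odd number of
    crossovers falls between them."""
    return len(crossovers(d, rng)) % 2 == 1

rng = random.Random(0)
d = 0.2  # 20 cM between the two loci
est = sum(recombinant(d, rng) for _ in range(20000)) / 20000
theory = 0.5 * (1.0 - math.exp(-2.0 * d))  # Haldane map function
```

Swapping `crossovers` for a renewal process with non-exponential inter-arrival times (chi-square, Kosambi, Sturt) is exactly the kind of model variation the paper's Monte Carlo method is designed to accommodate.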


2015
Vol 31 (1)
pp. 20-30
Author(s):  
William S. Helton ◽  
Katharina Näswall

Conscious appraisals of stress, or stress states, are an important aspect of human performance. This article presents evidence supporting the validity and measurement characteristics of a short multidimensional self-report measure of stress state, the Short Stress State Questionnaire (SSSQ; Helton, 2004). The SSSQ measures task engagement, distress, and worry. A confirmatory factor analysis of the SSSQ using data pooled from multiple samples suggests that the SSSQ has a three-factor structure and that post-task changes are due not to changes in factor structure but to mean-level changes (state changes). In addition, the SSSQ demonstrates sensitivity to task stressors in line with hypotheses: different task conditions elicited unique patterns of stress state on the three factors of the SSSQ, in line with prior predictions. The 24-item SSSQ is a valid measure of stress state that may be useful to researchers interested in conscious appraisals of task-related stress.


2021
Vol 11 (1)
Author(s):  
Zekun Xu ◽  
Eric Laber ◽  
Ana-Maria Staicu ◽  
B. Duncan X. Lascelles

Abstract
Osteoarthritis (OA) is a chronic condition often associated with pain, affecting approximately fourteen percent of the population and increasing in prevalence. A globally aging population has made treating OA-associated pain, as well as maintaining mobility and activity, a public health priority. OA affects all mammals, and the use of spontaneous animal models is one promising approach for improving translational pain research and the development of effective treatment strategies. Accelerometers are a common tool for collecting high-frequency activity data on animals to study the effects of treatment on pain-related activity patterns. There has recently been increasing interest in their use to understand treatment effects in human pain conditions. However, activity patterns vary widely across subjects; furthermore, the effects of treatment may manifest in higher or lower activity counts or in subtler ways, such as changes in the frequency of certain types of activities. We use a zero-inflated Poisson hidden semi-Markov model to characterize activity patterns and subsequently derive estimators of the treatment effect in terms of changes in activity levels or frequency of activity type. We demonstrate the application of our model, and its advantages over traditional analysis methods, using data from a naturally occurring feline OA-associated pain model.
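The zero-inflated Poisson emission at the heart of such a model mixes a structural-zero component (no movement recorded) with a Poisson count component. A minimal sketch of that emission density and a fixed-path log-likelihood follows; the state names and parameter values are hypothetical, and the full hidden semi-Markov machinery (duration distributions, forward-backward) is omitted.

```python
import math

def zip_pmf(k, lam, pi):
    """P(X = k) under a zero-inflated Poisson: a structural zero with
    probability pi, otherwise a draw from Poisson(lam)."""
    pois = math.exp(-lam) * lam ** k / math.factorial(k)
    return (pi if k == 0 else 0.0) + (1.0 - pi) * pois

# Hypothetical state-dependent emission parameters (lam, pi): a 'rest'
# state dominated by structural zeros, an 'active' state with high counts.
PARAMS = {"rest": (0.5, 0.8), "active": (12.0, 0.05)}

def loglik(counts, states):
    """Log-likelihood of an accelerometer count sequence for a fixed,
    known hidden-state path."""
    total = 0.0
    for k, s in zip(counts, states):
        lam, pi = PARAMS[s]
        total += math.log(zip_pmf(k, lam, pi))
    return total

ll = loglik([0, 0, 15, 9], ["rest", "rest", "active", "active"])
```

A treatment effect can then show up either in the emission parameters (higher counts while active) or in the state/duration dynamics (more frequent activity bouts), which is the distinction the abstract highlights.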


2006
Vol 04 (03)
pp. 639-647
Author(s):  
ELEAZAR ESKIN ◽  
RODED SHARAN ◽  
ERAN HALPERIN

The common approaches to haplotype inference from genotype data are targeted toward phasing short genomic regions. Longer regions are often tackled in a heuristic manner due to the high computational cost. Here, we describe a novel approach for phasing genotypes over long regions, which is based on combining information from local predictions on short, overlapping regions. The phasing is done in a way that maximizes a natural maximum likelihood criterion. Among other things, this criterion takes into account the physical distance between neighboring single nucleotide polymorphisms. The approach is very efficient, has been applied to several large-scale datasets, and was shown to be successful in two recent benchmarking studies (Zaitlen et al., in press; Marchini et al., in preparation). Our method is publicly available via a webserver.
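The stitching of local predictions can be illustrated with a greedy toy version: each window yields a pair of complementary haplotypes, and consecutive windows are joined in the orientation that agrees best on their overlap. This is an assumption-laden sketch of the general idea (the paper's method instead maximizes a likelihood criterion over the combination), with haplotypes as 0/1 strings and a fixed overlap length.

```python
def agree(a, b):
    """Number of positions at which two haplotype strings match."""
    return sum(x == y for x, y in zip(a, b))

def stitch(windows, overlap):
    """windows: list of (hap0, hap1) pairs for consecutive regions that
    overlap by `overlap` sites; returns one phased pair for the whole
    region, flipping each window if the swapped orientation fits better."""
    h0, h1 = windows[0]
    for w0, w1 in windows[1:]:
        tail0, tail1 = h0[-overlap:], h1[-overlap:]
        same = agree(tail0, w0[:overlap]) + agree(tail1, w1[:overlap])
        flip = agree(tail0, w1[:overlap]) + agree(tail1, w0[:overlap])
        if flip > same:
            w0, w1 = w1, w0  # the window's two haplotypes were swapped
        h0 += w0[overlap:]
        h1 += w1[overlap:]
    return h0, h1

phased = stitch([("0101", "1010"), ("0110", "1001")], overlap=2)
```

Note that the greedy join commits to one orientation per window, whereas a likelihood-based combination can weigh all windows jointly, which matters when an overlap is ambiguous.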


2021
Vol 13 (11)
pp. 2069
Author(s):  
M. V. Alba-Fernández ◽  
F. J. Ariza-López ◽  
M. D. Jiménez-Gamero

The usefulness of the parameters (e.g., slope, aspect) derived from a Digital Elevation Model (DEM) is limited by its accuracy. In this paper, a thematic-like (class-based) quality control of aspect and slope classes is proposed. A product can be compared against a reference dataset, which embodies the quality requirements to be achieved, by comparing the product proportions of each class with those of the reference set. If the distance between the product proportions and the reference proportions is smaller than a positive tolerance fixed by the user, the degree of similarity between the product and the reference set is considered acceptable, and hence the product's quality meets the requirements. A formal statistical procedure, based on a hypothesis test that uses the Hellinger distance between the proportions, is developed, and its performance is analyzed using simulated data. The application to slope and aspect is illustrated using data derived from a 2×2 m DEM (the reference) and a 5×5 m DEM in Allo (province of Navarra, Spain).
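The distance-versus-tolerance comparison can be sketched directly. This shows only the Hellinger distance and the acceptance rule, not the paper's formal test statistic or its null distribution, and the class proportions below are invented for illustration.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions
    (lies in [0, 1]; 0 iff identical)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))

def meets_requirement(product_props, reference_props, tol):
    """Class-based check: the product passes when its class proportions
    are within Hellinger distance tol of the reference proportions."""
    return hellinger(product_props, reference_props) <= tol

ref = [0.25, 0.40, 0.35]    # slope-class proportions from the reference DEM
prod = [0.27, 0.38, 0.35]   # proportions from the coarser DEM (hypothetical)
ok = meets_requirement(prod, ref, tol=0.05)
```

The formal procedure in the paper additionally accounts for sampling variability in the estimated proportions, so a raw distance comparison like this is only the deterministic core of the test.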


Author(s):  
Nicole Belinda Dillen ◽  
Aruna Chakraborty

One of the most important aspects of social network analysis is community detection, which is used to categorize related individuals in a social network into groups or communities. The approach is quite similar to graph partitioning, and in fact, most detection algorithms rely on concepts from graph theory and sociology. The aim of this chapter is to aid a novice in the field of community detection by providing a wider perspective on some of the different detection algorithms available, including the more recent developments in this field. Five popular algorithms have been studied and explained, and a recent novel approach that was proposed by the authors has also been included. The chapter concludes by highlighting areas suitable for further research, specifically targeting overlapping community detection algorithms.
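Among the classic algorithms such surveys cover, label propagation is perhaps the simplest to state: every node repeatedly adopts the label most common among its neighbours until labels stabilise. The dependency-free sketch below is a generic illustration, not one of the five algorithms the chapter studies in particular.

```python
import random

def label_propagation(adj, seed=0, max_iter=100):
    """Asynchronous label propagation on an adjacency-list graph:
    each node starts with its own label and repeatedly adopts the most
    common label among its neighbours until no label changes."""
    rng = random.Random(seed)
    labels = {v: v for v in adj}
    nodes = list(adj)
    for _ in range(max_iter):
        rng.shuffle(nodes)  # random update order breaks ties differently
        changed = False
        for v in nodes:
            counts = {}
            for u in adj[v]:
                counts[labels[u]] = counts.get(labels[u], 0) + 1
            if counts:
                best = max(counts, key=counts.get)
                if best != labels[v]:
                    labels[v] = best
                    changed = True
        if not changed:
            break
    return labels

# two triangles joined by a single bridge edge (2-3)
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
labels = label_propagation(adj)
```

Its near-linear running time is why it scales to large social networks, though the result depends on update order, a non-determinism that more principled methods (e.g., modularity optimization) avoid.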


2019
Vol 23 (6)
pp. 1331-1347
Author(s):  
Miguel Alfonzo ◽  
Dean S. Oliver

Abstract
It is common in ensemble-based methods of history matching to evaluate the adequacy of the initial ensemble of models through visual comparison between actual observations and data predictions prior to data assimilation. If the model is appropriate, then the observed data should look plausible when compared to the distribution of realizations of simulated data. The principle of data coverage alone is, however, not an effective method for model criticism, as coverage can often be obtained by increasing the variability in a single model parameter. In this paper, we propose a methodology for determining the suitability of a model before data assimilation, aimed particularly at real cases with large numbers of model parameters, large amounts of data, and correlated observation errors. This model diagnostic is based on an approximation of the Mahalanobis distance between the observations and the ensemble of predictions in high-dimensional spaces. We applied our methodology to two examples: a Gaussian example, which shows that our shrinkage estimate of the covariance matrix is a better discriminator of outliers than the pseudo-inverse and a diagonal approximation of this matrix; and an example using data from the Norne field. In this second test, we used actual production, repeat formation tester, and inverted seismic data to evaluate the suitability of the initial reservoir simulation model and seismic model. Despite the good data coverage, our model diagnostic suggested that model improvement was necessary. After modifying the model, it was validated against the observations and is now ready for history matching to production and seismic data. This shows that the proposed methodology for evaluating the adequacy of the model is suitable for large realistic problems.
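The combination of a shrinkage covariance estimate with a Mahalanobis distance can be sketched in a few lines. The fixed shrinkage weight below is a placeholder (practical estimators such as Ledoit-Wolf choose it from the data), and this is a generic illustration rather than the authors' high-dimensional approximation.

```python
import numpy as np

def shrinkage_cov(X, alpha=0.2):
    """Shrink the sample covariance of the rows of X toward a
    scaled-identity target; alpha is a hypothetical fixed weight."""
    S = np.cov(X, rowvar=False)
    target = (np.trace(S) / S.shape[0]) * np.eye(S.shape[0])
    return (1.0 - alpha) * S + alpha * target

def mahalanobis(obs, X, alpha=0.2):
    """Mahalanobis distance from an observation vector to the ensemble
    of simulated-data realizations stored in the rows of X."""
    mu = X.mean(axis=0)
    C = shrinkage_cov(X, alpha)
    d = obs - mu
    return float(np.sqrt(d @ np.linalg.solve(C, d)))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))               # 50 realizations of 5 data values
center = mahalanobis(X.mean(axis=0), X)    # the ensemble mean itself: distance 0
outlier = mahalanobis(X.mean(axis=0) + 10.0, X)
```

Shrinkage keeps the covariance well conditioned when the number of data points exceeds the ensemble size, which is exactly the regime where the raw sample covariance (and hence the pseudo-inverse) becomes unreliable as an outlier discriminator.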

