Inference of Multiple-wave Admixtures by Length Distribution of Ancestral Tracks

2016 ◽  
Author(s):  
Xumin Ni ◽  
Xiong Yang ◽  
Kai Yuan ◽  
Qidi Feng ◽  
Wei Guo ◽  
...  

Abstract The ancestral tracks in admixed genomes carry valuable information for inferring population history. A few methods have been developed to infer admixture history from ancestral tracks. Nonetheless, these methods share the same flaw: admixture history can be inferred only under certain specific models. In addition, the inference might be biased or even unreliable if the specified model deviates from the real situation. To address this problem, we first proposed a general discrete admixture model to describe admixture history with multiple ancestral populations and multiple-wave admixtures. We next deduced the length distribution of ancestral tracks under the general discrete admixture model. We further developed a new method, MultiWaver, to explore multiple-wave admixture histories. Our method can automatically determine an optimal admixture model based on the length distribution of ancestral tracks, and estimate the corresponding parameters under this optimal model. Specifically, we used a likelihood ratio test (LRT) to determine the number of admixture waves, and implemented an expectation–maximization (EM) algorithm to estimate parameters. We used simulation studies to validate the reliability and effectiveness of our method. Finally, good performance was observed when our method was applied to real datasets of African Americans, Mexicans, Uyghurs, and Hazaras.
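The combination of EM estimation and a likelihood ratio comparison described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that track lengths follow a mixture of exponential distributions with one component per admixture wave; the function names (`em_exponential_mixture`, `log_likelihood`, `lrt_statistic`) are hypothetical and this is not the MultiWaver implementation, whose length distribution and test procedure are defined in the paper itself.

```python
import math


def em_exponential_mixture(lengths, k, iters=200):
    """Fit a k-component exponential mixture to track lengths by EM.

    Returns (weights, rates). Illustrative assumption: one exponential
    component per admixture wave.
    """
    n = len(lengths)
    mean = sum(lengths) / n
    # Start the rates at distinct values so the components can separate.
    rates = [(j + 1) / mean for j in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        resp_sum = [0.0] * k  # sum of responsibilities per component
        resp_len = [0.0] * k  # responsibility-weighted sum of lengths
        # E-step: posterior probability that each track came from each component.
        for x in lengths:
            dens = [w * r * math.exp(-r * x) for w, r in zip(weights, rates)]
            total = sum(dens)
            for j in range(k):
                g = dens[j] / total
                resp_sum[j] += g
                resp_len[j] += g * x
        # M-step: closed-form updates for mixing weights and rates.
        weights = [s / n for s in resp_sum]
        rates = [s / l for s, l in zip(resp_sum, resp_len)]
    return weights, rates


def log_likelihood(lengths, weights, rates):
    """Log-likelihood of the lengths under an exponential mixture."""
    return sum(
        math.log(sum(w * r * math.exp(-r * x) for w, r in zip(weights, rates)))
        for x in lengths
    )


def lrt_statistic(lengths, fit_small, fit_large):
    """Twice the log-likelihood gain of the larger model over the smaller one."""
    return 2.0 * (
        log_likelihood(lengths, *fit_large) - log_likelihood(lengths, *fit_small)
    )
```

A possible use is to fit models with k = 1, 2, ... components and stop adding components once the statistic no longer exceeds a chosen threshold. The appropriate threshold is left open here: for mixture models the usual chi-square reference distribution need not apply, so any cutoff should be treated as an assumption to be checked, for example by simulation.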

2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Xumin Ni ◽  
Xiong Yang ◽  
Wei Guo ◽  
Kai Yuan ◽  
Ying Zhou ◽  
...  

Abstract The length of ancestral tracks decays with the passing of generations, which can be used to infer population admixture histories. Previous studies have shown the power in recovering the histories of admixed populations via the length distributions of ancestral tracks even under simple models. We believe that the deduction of length distributions under a general model will greatly elevate the power. Here we first deduced the length distributions under a general model and proposed general principles in parameter estimation and model selection with the deduced length distributions. Next, we focused on studying the length distributions and their applications under three typical special cases. Extensive simulations showed that the length distributions of ancestral tracks were well predicted by our theoretical framework. We further developed a new method, AdmixInfer, based on the length distributions, and good performance was observed when it was applied to infer population histories under the three typical models. Notably, our method was insensitive to demographic history, sample size and threshold to discard short tracks. Finally, good performance was also observed when the method was applied to real datasets of African Americans, Mexicans and South Asian populations from the HapMap project and the Human Genome Diversity Project.


2021 ◽  
Vol 9 (1) ◽  
pp. 157-175
Author(s):  
Walaa EL-Sharkawy ◽  
Moshira A. Ismail

This paper deals with testing the number of components in a Birnbaum-Saunders mixture model under randomly right censored data. We focus on two methods, one based on the modified likelihood ratio test and the other based on the shortcut of bootstrap test. Based on extensive Monte Carlo simulation studies, we evaluate and compare the performance of the proposed tests through their size and power. A power analysis provides guidance for researchers to examine the factors that affect the power of the proposed tests used in detecting the correct number of components in a Birnbaum-Saunders mixture model. Finally, an example using aircraft windshield data illustrates the testing procedure.


2015 ◽  
Author(s):  
Xumin Ni ◽  
Xiong Yang ◽  
Wei Guo ◽  
Kai Yuan ◽  
Ying Zhou ◽  
...  

As a chromosome is sliced into pieces by recombination after entering an admixed population, ancestral tracks of chromosomes are shortened with the passing of generations. The length distribution of ancestral tracks reflects information of recombination and thus can be used to infer the histories of admixed populations. Previous studies have shown that inference based on ancestral tracks is powerful in recovering the histories of admixed populations. However, population histories are always complex, and previous studies only deduced the length distribution of ancestral tracks under very simple admixture models. The deduction of length distribution of ancestral tracks under a more general model will greatly elevate the power in inferring population histories. Here we first deduced the length distribution of ancestral tracks under a general model in an admixed population, and proposed general principles in parameter estimation and model selection with the length distribution. Next, we focused on studying the length distribution of ancestral tracks and its applications under three typical admixture models, which were all special cases of our general model. Extensive simulations showed that the length distribution of ancestral tracks was well predicted by our theoretical models. We further developed a new method based on the length distribution of ancestral tracks, and good performance was observed when it was applied to infer population histories under the three typical models. Notably, our method was insensitive to demographic history, sample size and threshold to discard short tracks. Finally, we applied our method to African Americans and Mexicans from the HapMap dataset, and to several South Asian populations from the Human Genome Diversity Project dataset. The results showed that the histories of African Americans and Mexicans matched the historical records well, and the population admixture history of South Asians was very complex and could be traced back to around 100 generations ago.
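The link between track length and admixture time described in this abstract can be made concrete with a small sketch. It assumes the simplest possible case, which is not the paper's general model: a single admixture pulse T generations ago, with track lengths (in Morgans) of one ancestry exponentially distributed at rate T multiplied by the proportion of the other ancestry. The function name `estimate_generations` and its parameters are hypothetical.

```python
def estimate_generations(track_lengths_morgans, other_ancestry_proportion):
    """Estimate the number of generations since a single admixture pulse.

    Assumption (illustrative only): track lengths are exponential with
    rate = other_ancestry_proportion * T, so the maximum likelihood
    estimate of the rate is 1 / mean length.
    """
    mean_length = sum(track_lengths_morgans) / len(track_lengths_morgans)
    rate = 1.0 / mean_length
    return rate / other_ancestry_proportion
```

Under this assumption, shorter average tracks imply an older admixture event. Multi-wave or continuous admixture would need the more general distributions that the abstract refers to.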


Genetics ◽  
2005 ◽  
Vol 169 (2) ◽  
pp. 1021-1031 ◽  
Author(s):  
Tianhua Niu ◽  
Adam A. Ding ◽  
Reinhold Kreutz ◽  
Klaus Lindpaintner

2012 ◽  
Vol 2012 ◽  
pp. 1-19 ◽  
Author(s):  
Qihong Duan ◽  
Xiang Chen ◽  
Dengfu Zhao ◽  
Zheng Zhao

We study a multistate model for an aging piece of equipment under condition-based maintenance and apply an expectation maximization algorithm to obtain maximum likelihood estimates of the model parameters. Because of the monitoring discontinuity, we cannot observe any state's duration. The observation consists of the equipment's state at an inspection or right after a repair. Based on a proper construction of stochastic processes involved in the model, calculation of some probabilities and expectations becomes tractable. Using these probabilities and expectations, we can apply an expectation maximization algorithm to estimate the parameters in the model. We carry out simulation studies to test the accuracy and the efficiency of the algorithm.


2019 ◽  
Vol 4 (1) ◽  
pp. 66
Author(s):  
Wirna Arifitriana ◽  
Danardono Danardono

Survival analysis is a statistical technique for analyzing time-to-event data; it aims to determine the variables that affect the time from a starting point until an event occurs. One survival model, the cure model, is useful for estimating the proportion of patients who recover and the survival probability of patients who have not recovered by the given deadline. The analysis uses a Cox proportional hazards cure model, with maximum likelihood estimation carried out by the expectation maximization (EM) algorithm. Keywords: Cox Proportional Hazard Cure Model, MLE, EM algorithm, likelihood ratio test, Wald test.


Genetics ◽  
2021 ◽  
Author(s):  
Éadaoin Harney ◽  
Nick Patterson ◽  
David Reich ◽  
John Wakeley

Abstract qpAdm is a statistical tool for studying the ancestry of populations with histories that involve admixture between two or more source populations. Using qpAdm, it is possible to identify plausible models of admixture that fit the population history of a group of interest and to calculate the relative proportion of ancestry that can be ascribed to each source population in the model. Although qpAdm is widely used in studies of population history of human (and nonhuman) groups, relatively little has been done to assess its performance. We performed a simulation study to assess the behavior of qpAdm under various scenarios in order to identify areas of potential weakness and establish recommended best practices for use. We find that qpAdm is a robust tool that yields accurate results in many cases, including when data coverage is low, there are high rates of missing data or ancient DNA damage, or when diploid calls cannot be made. However, we caution against co-analyzing ancient and present-day data, the inclusion of an extremely large number of reference populations in a single model, and analyzing population histories involving extended periods of gene flow. We provide a user guide suggesting best practices for the use of qpAdm.


2020 ◽  
Vol 29 (9) ◽  
pp. 2733-2748
Author(s):  
Markus Pauly ◽  
Łukasz Smaga

Coefficients of variations are unit-free measures that can, for example, be used to compare the variability of different samples. To this end, we study inference methods for them as well as their reciprocal given by standardised means in general heterogeneous one-way ANOVA designs. As no specific model assumptions are made, a permutation method is proposed to guarantee good finite sample performance. Building on recent limit theorems for randomisation techniques, we prove that the permutation procedure is asymptotically correct in general and finitely exact when data is exchangeable. These results are fostered in extensive simulation studies and two illustrative data analyses.
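The idea of a permutation test for coefficients of variation can be sketched for the two-sample case. The sketch assumes a plain pooled reshuffle with the absolute difference of sample coefficients of variation as the statistic; the paper's own test statistic and its treatment of heterogeneous designs are not reproduced here, and the function names are hypothetical. A pooled reshuffle of this kind is exact only when the data are exchangeable under the null hypothesis.

```python
import random
import statistics


def coefficient_of_variation(sample):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(sample) / statistics.mean(sample)


def permutation_test_cv(a, b, n_perm=2000, seed=0):
    """Two-sample permutation p-value for a difference in coefficients of variation."""
    rng = random.Random(seed)
    observed = abs(coefficient_of_variation(a) - coefficient_of_variation(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Reassign observations to the two groups at random.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[: len(a)], pooled[len(a):]
        diff = abs(coefficient_of_variation(perm_a) - coefficient_of_variation(perm_b))
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)
```

The add-one correction counts the observed arrangement as one of the permutations, which keeps the test valid at any number of reshuffles.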


2017 ◽  
Author(s):  
Mila Lankarany

Abstract Inference of excitatory and inhibitory synaptic conductances (SCs) from spike trains is poorly addressed in the literature due to the complexity of the problem. As recent technological advancements make recording spikes from multiple (neighboring) neurons of a behaving animal (in some rare cases from humans) possible, this paper tackles the problem of estimating SCs solely from the recorded spike trains. Given an ensemble of spikes corresponding to a population of neighboring neurons, we aim to infer the average excitatory and inhibitory SCs underlying the shared neural activity. In this paper, we extended our previously established Kalman filtering (KF)–based algorithm to incorporate the voltage-to-spike nonlinearity (mapping from membrane potential to spike rate). Having estimated the instantaneous spike rate using optimal linear filtering (Gaussian kernel), our proposed algorithm uses KF followed by the expectation maximization (EM) algorithm in a recursive fashion to infer the average SCs. As the dynamics of SCs and membrane potential are included in our model, the proposed algorithm, unlike other related works, considers different sources of stochasticity, i.e., the variabilities of SCs, membrane potential, and spikes. Moreover, it is worth mentioning that our algorithm is blind to the external stimulus and operates only on the observed spikes. We validate the accuracy and practicality of our technique through simulation studies where a leaky integrate-and-fire (LIF) model is used to generate spikes. We show that the estimated SCs can precisely track the original ones. Moreover, we show that the performance of our algorithm can be further improved given a sufficient number of trials (spikes). As a rule of thumb, 50 trials of neurons with an average firing rate of 5 Hz can guarantee the accuracy of our proposed algorithm.


2021 ◽  
Vol 10 (3) ◽  
pp. 1
Author(s):  
Chuanhua Wei ◽  
Xiaoxiao Ma

This paper considers the problem of testing independence of equations in a seemingly unrelated regression model. A novel empirical likelihood test approach is proposed, and under the null hypothesis it is shown to follow asymptotically a chi-square distribution. Finally, simulation studies and a real data example are conducted to illustrate the performance of the proposed method.

