Multi-rate Poisson Tree Processes for single-locus species delimitation under Maximum Likelihood and Markov Chain Monte Carlo.

2017 ◽  
pp. btx025 ◽  
Author(s):  
P. Kapli ◽  
S. Lutteropp ◽  
J. Zhang ◽  
K. Kobert ◽  
P. Pavlidis ◽  
...  
2019 ◽  
Vol 4 (2) ◽  
pp. 100
Author(s):  
Catrin Muharisa ◽  
Ferra Yanuar ◽  
Hazmira Yozza

Regression analysis is a method for examining the relationship between independent variables and a dependent variable, expressed as a regression model. Several methods can be used to estimate the parameters of a regression model, among them the classical approach and the Bayesian approach. One classical method is maximum likelihood. This study compares the maximum likelihood method and the Bayesian method for estimating the parameters of a multiple linear regression model for normally distributed data. The maximum likelihood estimators are $\hat{\beta} = (X^{T}X)^{-1}X^{T}Y$ and $\hat{\sigma}^{2} = \frac{1}{n}\sum_{i=1}^{n} e_i^{2}$. The Bayesian method instead estimates the parameters by combining a prior distribution with the likelihood function. The prior chosen in this study is $f(\beta, \sigma^{2}) = \prod_{j=1}^{n} f(\beta_j \mid \sigma^{2})\, f(\sigma^{2})$ with $\beta_j \sim N(\mu_{\beta_j}, \sigma^{2})$ and $\sigma^{2} \sim IG(a, b)$. This conjugate prior is multiplied by the likelihood function $L(\beta, \sigma^{2})$ to form the posterior distribution $f(\beta \mid \sigma^{2})$. The posterior is then used to estimate the model parameters through Markov Chain Monte Carlo (MCMC), with the Gibbs sampler as the MCMC algorithm. The multiple linear regression model obtained by maximum likelihood is $\hat{y} = -27.8210000 + 0.0307430X_1 + 0.0039211X_2 + 0.0034631X_3 + 0.6537000X_4$, with a model fit of 95.7%. The model obtained by the Bayesian method is $\hat{y} = -26.620000 + 0.029380X_1 + 0.004204X_2 + 0.003321X_3 + 0.656200X_4$, with a model fit of 99.99%. The study therefore concludes that the Bayesian method performs better than maximum likelihood.
Keywords: multiple linear regression model, maximum likelihood method, Bayes method
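
The maximum likelihood estimators quoted in this abstract can be illustrated with a minimal Python sketch. It assumes the standard normal-error linear model, in which the coefficient estimate solves the normal equations and the variance estimate is the mean squared residual. The helper names and the toy data are invented for illustration and are not taken from the paper.

```python
# Sketch only: ML estimates for y = X beta + e with normal errors.
# Function names and data are hypothetical, chosen for illustration.

def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] as a copy, so inputs stay untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ml_linear_regression(x_rows, y):
    """Return (beta_hat, sigma2_hat); x_rows must include the intercept column."""
    n, p = len(x_rows), len(x_rows[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(row[i] * row[j] for row in x_rows) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x_rows, y)) for i in range(p)]
    beta_hat = solve_linear_system(xtx, xty)
    residuals = [yi - sum(b * v for b, v in zip(beta_hat, row))
                 for row, yi in zip(x_rows, y)]
    # The ML variance estimate divides by n, not by n - p.
    sigma2_hat = sum(e * e for e in residuals) / n
    return beta_hat, sigma2_hat


# Toy data generated exactly from y = 1 + 2x, so the residuals vanish.
x_rows = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
beta_hat, sigma2_hat = ml_linear_regression(x_rows, y)
```

Solving the normal equations directly avoids forming an explicit matrix inverse. A Gibbs sampler for the Bayesian fit would need the full conditional distributions, which the abstract does not spell out, so it is not sketched here.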


2020 ◽  
Vol 36 (4) ◽  
pp. 1253-1259
Author(s):  
Autcha Araveeporn ◽  
Yuwadee Klomwises

Markov Chain Monte Carlo (MCMC) is a popular approach for obtaining information about a probability distribution, here by estimating the posterior distribution with Gibbs sampling. Standard methods such as maximum likelihood and logistic ridge regression serve as the comparison for MCMC. Maximum likelihood is the classical method for estimating the parameters of the logistic regression model, obtained by differentiating the log-likelihood function with respect to the parameters. Logistic ridge regression depends on the choice of the ridge parameter, which is selected by cross-validation when computing the estimator under the penalty function. This paper applies maximum likelihood, logistic ridge regression, and MCMC to estimate the parameters of the logit function, which is then transformed into a probability. The logistic regression model predicts the probability of observing a phenomenon. Prediction accuracy is evaluated as the percentage of correct predictions of a binary event. A simulation study generates a binary response variable from 2, 4, and 6 explanatory variables drawn from a multivariate normal distribution with positive and negative correlation coefficients, that is, under multicollinearity. The methods are compared by their maximum predictive accuracy. The results indicate that MCMC performs satisfactorily in all situations.
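
As a rough illustration of the quantities this abstract compares, the Python sketch below fits a logistic model by gradient ascent on the log-likelihood, optionally with a ridge penalty, and scores it by the percentage of correct binary predictions. The optimiser, the fixed ridge value (the paper selects it by cross-validation), the function names, and the toy data are all assumptions made for illustration. The MCMC estimator is not shown.

```python
import math

# Sketch only: logistic regression by gradient ascent, with an optional
# ridge penalty. Names, step size, and data are hypothetical.

def sigmoid(z):
    """Map a logit to a probability, avoiding overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(x_rows, y, ridge=0.0, lr=0.1, steps=2000):
    """Maximise the log-likelihood minus (ridge / 2) * sum of squared slopes."""
    n, p = len(y), len(x_rows[0])
    beta = [0.0] * p
    for _ in range(steps):
        grad = [0.0] * p
        for row, yi in zip(x_rows, y):
            err = yi - sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j in range(p):
                grad[j] += err * row[j]
        # Penalise slopes only; the intercept (index 0) is left unpenalised.
        for j in range(1, p):
            grad[j] -= ridge * beta[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def accuracy(beta, x_rows, y, threshold=0.5):
    """Percentage of observations whose predicted class matches the label."""
    correct = 0
    for row, yi in zip(x_rows, y):
        prob = sigmoid(sum(b * v for b, v in zip(beta, row)))
        correct += int((1 if prob >= threshold else 0) == yi)
    return 100.0 * correct / len(y)


# Toy, non-separable data: an intercept column plus one explanatory variable.
x_rows = [[1.0, -2.0], [1.0, -1.0], [1.0, -0.5],
          [1.0, 0.5], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 0, 1, 1]
beta_ml = fit_logistic(x_rows, y)
beta_ridge = fit_logistic(x_rows, y, ridge=1.0)
acc_ml = accuracy(beta_ml, x_rows, y)
```

With a positive ridge value the fitted slope is pulled toward zero relative to the unpenalised fit, which is the shrinkage that ridge regression relies on when explanatory variables are correlated.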


2021 ◽  
Vol 12 ◽  
Author(s):  
Oliver Lüdtke ◽  
Esther Ulitzsch ◽  
Alexander Robitzsch

With small to modest sample sizes and complex models, maximum likelihood (ML) estimation of confirmatory factor analysis (CFA) models can show serious estimation problems such as non-convergence or parameter estimates outside the admissible parameter space. In this article, we distinguish different Bayesian estimators that can be used to stabilize the parameter estimates of a CFA: the mode of the joint posterior distribution that is obtained from penalized maximum likelihood (PML) estimation, and the mean (EAP), median (Med), or mode (MAP) of the marginal posterior distribution that are calculated by using Markov Chain Monte Carlo (MCMC) methods. In two simulation studies, we evaluated the performance of the Bayesian estimators from a frequentist point of view. The results show that the EAP produced more accurate estimates of the latent correlation in many conditions and outperformed the other Bayesian estimators in terms of root mean squared error (RMSE). We also argue that it is often advantageous to choose a parameterization in which the main parameters of interest are bounded, and we suggest the four-parameter beta distribution as a prior distribution for loadings and correlations. Using simulated data, we show that selecting weakly informative four-parameter beta priors can further stabilize parameter estimates, even in cases when the priors were mildly misspecified. Finally, we derive recommendations and propose directions for further research.
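
The distinction between the EAP, Med, and MAP point estimates can be illustrated with a small Python sketch. It uses simulated draws from a four-parameter beta distribution on (-1, 1) as a stand-in for MCMC output on a correlation parameter. The shape values, the histogram-based mode, and all function names are assumptions made for illustration and do not reproduce the authors' simulation design.

```python
import random
import statistics

# Sketch only: summarising posterior draws by mean (EAP), median (Med),
# and an approximate mode (MAP). All names and settings are hypothetical.

def four_param_beta_draw(rng, a, b, lower, upper):
    """Draw from a beta(a, b) distribution rescaled to (lower, upper)."""
    return lower + (upper - lower) * rng.betavariate(a, b)


def histogram_mode(draws, bins=25):
    """Crude marginal mode: the centre of the fullest histogram bin."""
    lo, hi = min(draws), max(draws)
    width = (hi - lo) / bins
    counts = [0] * bins
    for d in draws:
        # Clamp the maximum draw into the last bin.
        idx = min(int((d - lo) / width), bins - 1)
        counts[idx] += 1
    best = counts.index(max(counts))
    return lo + (best + 0.5) * width


rng = random.Random(0)
# Simulated stand-in for MCMC draws of a correlation bounded to (-1, 1).
draws = [four_param_beta_draw(rng, 8.0, 2.0, -1.0, 1.0) for _ in range(50000)]

eap = statistics.fmean(draws)    # posterior mean
med = statistics.median(draws)   # posterior median
map_est = histogram_mode(draws)  # approximate posterior mode
```

For a skewed marginal posterior like this one the three summaries differ, so the choice of point estimate can matter as much as the choice between PML and MCMC.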


2016 ◽  
Author(s):  
P. Kapli ◽  
S. Lutteropp ◽  
J. Zhang ◽  
K. Kobert ◽  
P. Pavlidis ◽  
...  

Motivation: In recent years, molecular species delimitation has become a routine approach for quantifying and classifying biodiversity. Barcoding methods are of particular importance in large-scale surveys as they promote fast species discovery and biodiversity estimates. Among those, distance-based methods are the most common choice as they scale well with large datasets; however, they are sensitive to similarity threshold parameters and they ignore evolutionary relationships. The recently introduced “Poisson Tree Processes” (PTP) method is a phylogeny-aware approach that does not rely on such thresholds. Yet, two weaknesses of PTP impact its accuracy and practicality when applied to large datasets: it does not account for divergent intraspecific variation and is slow for a large number of sequences.
Results: We introduce the multi-rate PTP (mPTP), an improved method that alleviates the theoretical and technical shortcomings of PTP. It incorporates different levels of intraspecific genetic diversity deriving from differences in either the evolutionary history or sampling of each species. Results on empirical data suggest that mPTP is superior to PTP and popular distance-based methods as it consistently yields more accurate delimitations with respect to the taxonomy (i.e., identifies more taxonomic species, infers species numbers closer to the taxonomy). Moreover, mPTP does not require any similarity threshold as input. The novel dynamic programming algorithm attains a speedup of at least five orders of magnitude compared to PTP, allowing it to delimit species in large (meta-)barcoding data. In addition, Markov Chain Monte Carlo sampling provides a comprehensive evaluation of the inferred delimitation in just a few seconds for millions of steps, independently of tree size.
Availability: mPTP is implemented in C and is available for download at http://github.com/Pas-Kapli/mptp under the GNU Affero 3 license.
A web-service is available at http://[email protected], [email protected], [email protected]
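
The threshold sensitivity of distance-based delimitation mentioned in the Motivation can be illustrated with a toy Python sketch. It groups sequences by single-linkage clustering at a fixed distance cutoff, one simple distance-based scheme assumed here for illustration. It is not the mPTP algorithm, and the distances and function name are invented.

```python
# Sketch only: single-linkage clustering at a distance cutoff, to show how
# the inferred number of groups depends on the threshold. Hypothetical data.

def count_clusters(distances, n, threshold):
    """Count groups after merging every pair closer than the threshold."""
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), d in distances.items():
        if d <= threshold:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})


# Pairwise distances between four hypothetical barcode sequences.
distances = {
    (0, 1): 0.01, (2, 3): 0.04,
    (0, 2): 0.10, (0, 3): 0.11,
    (1, 2): 0.10, (1, 3): 0.12,
}

# The same data give a different group count at each cutoff.
counts = {t: count_clusters(distances, 4, t) for t in (0.02, 0.05, 0.15)}
```

The same four sequences fall into a different number of groups at each cutoff, which is the dependence on a user-chosen threshold that the phylogeny-aware PTP and mPTP methods are described as avoiding.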


2020 ◽  
Author(s):  
Oliver Lüdtke ◽  
Esther Ulitzsch ◽  
Alexander Robitzsch

With small to modest sample sizes and complex models, maximum likelihood (ML) estimation of confirmatory factor analysis (CFA) models can show serious estimation problems such as nonconvergence or parameter estimates that are outside the admissible parameter space. In the present article, we discuss two Bayesian estimation methods for stabilizing parameter estimates of a CFA: Penalized maximum likelihood (PML) estimation and Markov Chain Monte Carlo (MCMC) methods. We clarify that these use different Bayesian point estimates from the joint posterior distribution—the mode (PML) of the joint posterior distribution, and the mean (EAP) or mode (MAP) of the marginal posterior distribution—and discuss under which conditions the two methods produce different results. In a simulation study, we show that the MCMC method clearly outperforms PML and that these performance gains can be explained by the fact that MCMC uses the EAP as a point estimate. We also argue that it is often advantageous to choose a parameterization in which the main parameters of interest are bounded and suggest the four-parameter beta distribution as a prior distribution for loadings and correlations. Using simulated data, we show that selecting weakly informative four-parameter beta priors can further stabilize parameter estimates, even in cases when the priors were mildly misspecified. Finally, we derive recommendations and propose directions for further research.


2019 ◽  
Author(s):  
Lena Collienne ◽  
Kieran Elmes ◽  
Mareike Fischer ◽  
David Bryant ◽  
Alex Gavryushkin

In this paper we study the graph of ranked phylogenetic trees where the adjacency relation is given by a local rearrangement of the tree structure. Our work is motivated by tree inference algorithms, such as maximum likelihood and Markov Chain Monte Carlo methods, where the geometry of the search space plays a central role for efficiency and practicality of optimisation and sampling. We hence focus on understanding the geometry of the space (graph) of ranked trees, the so-called ranked nearest neighbour interchange (RNNI) graph. We find the radius and diameter of the space exactly, improving the best previously known estimates. Since the RNNI graph is a generalisation of the classical nearest neighbour interchange (NNI) graph to ranked phylogenetic trees, we compare geometric and algorithmic properties of the two graphs. Surprisingly, we discover that both geometric and algorithmic properties of RNNI and NNI are quite different. For example, we establish convexity of certain natural subspaces in RNNI which are not convex in NNI. Our results suggest that the complexity of computing distances in the two graphs is different.

