Use of the Multinomial Dirichlet Model for Analysis of Subdivided Genetic Populations

C J Jiang; C Clark Cockerham

doi:10.1093/genetics/115.2.363

Genetics ◽

10.1093/genetics/115.2.363 ◽

1987 ◽

Vol 115 (2) ◽

pp. 363-366

Author(s):

C J Jiang ◽

C Clark Cockerham

Keyword(s):

Dirichlet Distribution ◽

Multinomial Distribution ◽

Estimation Procedure ◽

Estimation Of Parameters ◽

Drift Model ◽

Distribution Estimation ◽

Compound Distribution ◽

Subdivided Populations ◽

Dirichlet Model

ABSTRACT The distribution found by compounding the multinomial distribution with the Dirichlet distribution has been suggested as a basis for the estimation of parameters in subdivided populations, in particular of the "correlation between genotypes" within subpopulations. It is shown that the estimators deriving from these procedures perform poorly when the data are generated by the classical Wright drift model of subdivided populations. This conclusion suggests that the compound distribution estimation approach does not provide a good estimation procedure for real populations which are reasonably described by the Wright model.
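The compound distribution the abstract refers to can be written down directly. The following is a minimal sketch, not the authors' estimation procedure: it evaluates the Dirichlet-multinomial log-probability of a vector of counts and the overdispersion parameter 1 / (1 + sum of alpha), which is often read as an intraclass correlation. The function names and the parameterization by a vector `alpha` are assumptions made for illustration; the paper's own parameterization may differ.

```python
import math


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log-probability of `counts` under a Dirichlet-multinomial.

    The multinomial probabilities are integrated out against a
    Dirichlet(alpha) distribution. `alpha` is a hypothetical
    parameter vector with positive entries.
    """
    n = sum(counts)
    a0 = sum(alpha)
    # Normalizing term: n! * Gamma(a0) / Gamma(n + a0)
    log_p = math.lgamma(n + 1) + math.lgamma(a0) - math.lgamma(n + a0)
    # Per-category term: Gamma(x + a) / (x! * Gamma(a))
    for x, a in zip(counts, alpha):
        log_p += math.lgamma(x + a) - math.lgamma(x + 1) - math.lgamma(a)
    return log_p


def overdispersion(alpha):
    """Overdispersion (intraclass correlation) implied by alpha."""
    return 1.0 / (1.0 + sum(alpha))
```

Small values of the sum of `alpha` give strong clustering of counts within a subpopulation; large values approach the plain multinomial.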

Variaciones espaciales y ontogenéticas en la dieta de un plecóptero de amplia distribución Claudioperla tigrina Klapálek (Plecoptera: Gripopterygidae)

Revista de Biología Tropical ◽

10.15517/rbt.v65i3.23865 ◽

2017 ◽

Vol 65 (3) ◽

pp. 1174

Author(s):

María Celina Reynaga ◽

Natalia Dávalos ◽

Carlos Molineri

Keyword(s):

Organic Matter ◽

Food Item ◽

Dietary Habits ◽

Developmental Stages ◽

Multinomial Distribution ◽

Estimation Of Parameters ◽

Ecological Processes ◽

Size Classes ◽

Positive Side ◽

Definition Of

Dietary information gives insight into several ecological processes acting in lotic ecosystems. This work aimed: 1) to identify the dietary habits of Claudioperla tigrina immature stages along a wide altitudinal and latitudinal gradient in northern Argentina; 2) to define the functional feeding group (FFG) of C. tigrina; and 3) to evaluate differences in diet among the studied sites. Nymphs were collected from localities widely scattered across Northwestern Argentina and fell into different developmental stages (four size classes). The ingested material was extracted from the foregut and midgut by ventral dissection of the thorax. Dietary profiles were analyzed by estimating the parameters of a Dirichlet-multinomial distribution. ANOVAs were performed for each food item using site as the factor. Multidimensional scaling (MDS) was used to identify sites with similar dietary profiles. An analysis of food-niche breadth was also performed to evaluate the degree of dietary diversification for the resources consumed at each site. Mouthparts are similar across the size classes, except for the increasing sclerotization recorded with age. They retain most of the typical chewing groundplan, with relatively short labial and maxillary palps and strong, sclerotized, denticulate mandibles and maxillae. The nymphs of C. tigrina always ingested two or more food items (CPOM, FPOM, invertebrates and algae), which suggests a flexible diet. The diet changed with body size: finer particles were consumed in the early stages and larger particles in the final stages. Coarse particulate organic matter (CPOM) was the dominant food item, with signs of shredding during ingestion. Differences between sites were detected for FPOM, invertebrates, algae and sediment, but not for CPOM. Correlations were obtained for the first two axes of the MDS analysis. Sites AP, LT, LI, C and M (Yungas Rainforest and Humid Grassland) were negatively correlated with axis 1, which was associated with increased consumption of FPOM. On the positive side of that axis, site P (High Andes) was associated with a greater proportion of invertebrates and sediment. Sites IN (Humid Grassland) and LR (Argentine Northwest Monte and Thistle of the Prepuna) were located in the positive domain of axis 2, which in turn was associated with a greater count of algae in the dietary contents. We found significant differences in the quantity of secondary items, which is likely related to the environmental availability of resources. The FFG of Claudioperla tigrina is primarily shredder/collector-gatherer in Yungas Rainforest and Humid Grassland and shredder/predator in the High Andes. Classifying the FFG of C. tigrina and defining its role in organic matter processing is an important step for future studies based on functional groups, such as analyses of food webs.

A compound Dirichlet-Multinomial model for provincial level Covid-19 predictions in South Africa

10.1101/2020.06.15.20131433 ◽

2020 ◽

Author(s):

Alta de Waal ◽

Daan de Waal

Keyword(s):

South Africa ◽

Exponential Growth ◽

Dirichlet Distribution ◽

Multinomial Distribution ◽

Resource Planning ◽

Accurate Prediction ◽

Prediction Problem ◽

Multinomial Model ◽

Provincial Level ◽

Over Time

ABSTRACT Accurate prediction of COVID-19 related indicators such as confirmed cases, deaths and recoveries plays an important role in understanding the spread and impact of the virus, as well as in resource planning and allocation. In this study, we approach the prediction problem from a statistical perspective and predict confirmed cases and deaths at the provincial level. We propose the compound Dirichlet-multinomial distribution to estimate the proportion parameter of each province, treating the provinces as mutually exclusive outcomes. Furthermore, we assume exponential growth of the total cumulative counts in order to predict future total counts. The outcome of this approach is not only prediction: the variation of the proportion parameter is characterised by the Dirichlet distribution, which provides insight into the movement of the pandemic across provinces over time.
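The two ingredients named in the abstract, provincial proportions under a Dirichlet-multinomial and exponential growth of the total, can be combined as in the rough sketch below. It is an illustration under stated assumptions, not the authors' fitted model: proportions are taken as the Dirichlet posterior mean given the latest provincial counts, the growth rate comes from a least-squares line through the log totals, and every function and parameter name (`prior_alpha`, `horizon`, and so on) is hypothetical.

```python
import math


def posterior_mean_proportions(counts, prior_alpha):
    """Posterior mean of the proportions under a Dirichlet prior.

    With a Dirichlet(prior_alpha) prior and multinomial counts, the
    posterior is Dirichlet(prior_alpha + counts).
    """
    total = sum(counts) + sum(prior_alpha)
    return [(x + a) / total for x, a in zip(counts, prior_alpha)]


def fit_exponential_growth(totals):
    """Least-squares fit of log(total) = log(n0) + r * t, t = 0, 1, ..."""
    t = list(range(len(totals)))
    y = [math.log(v) for v in totals]
    t_bar = sum(t) / len(t)
    y_bar = sum(y) / len(y)
    slope_num = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    slope_den = sum((ti - t_bar) ** 2 for ti in t)
    r = slope_num / slope_den
    n0 = math.exp(y_bar - r * t_bar)
    return n0, r


def predict_counts(totals, latest_counts, prior_alpha, horizon):
    """Split a projected national total across provinces."""
    n0, r = fit_exponential_growth(totals)
    t_future = len(totals) - 1 + horizon  # steps after the first observation
    future_total = n0 * math.exp(r * t_future)
    proportions = posterior_mean_proportions(latest_counts, prior_alpha)
    return [future_total * p for p in proportions]
```

The spread of the Dirichlet posterior, not only its mean, is what the abstract uses to describe movement of the pandemic between provinces; this sketch reports the mean only.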

A MODEL FOR ANALYSIS OF POPULATION STRUCTURE

Genetics ◽

10.1093/genetics/78.3.943 ◽

1974 ◽

Vol 78 (3) ◽

pp. 943-960

Author(s):

Edward D Rothman ◽

Charles F Sing ◽

Alan R Templeton

Keyword(s):

Population Structure ◽

Genetic Parameters ◽

Likelihood Function ◽

Dirichlet Distribution ◽

Inbreeding Coefficient ◽

Mutation Rates ◽

Alternative Procedure ◽

Testing Hypotheses ◽

Subdivided Population ◽

Dirichlet Model

ABSTRACT Arguments have been presented for the appropriateness of a multinomial Dirichlet distribution for describing single-locus genotypic frequencies in a subdivided population. This distribution is defined as a function of allele frequency, the average (over the entire population) inbreeding coefficient and the correlation between genotypes within a subdivision. Alternative parameterizations and their genetic interpretations are given. We then show how information from a sample drawn from this subdivided population, in the absence of pedigrees, can be combined with the multinomial Dirichlet model to form a likelihood function. This likelihood function is then used as the basis for estimation and for testing hypotheses concerning the genetic parameters of the model. Comparisons of this approach with the alternative procedure of COCKERHAM (1969, 1973) are made using human data obtained from Tecumseh, Michigan and Monte Carlo simulations. Finally, implications of these results for statistical inference and for mutation rates are presented.

Estimation of parameters in the extended growth curve model with a linearly structured covariance matrix

Acta et Commentationes Universitatis Tartuensis de Mathematica ◽

10.12697/acutm.2012.16.02 ◽

2012 ◽

Vol 16 (1) ◽

pp. 13-32

Author(s):

Joseph Nzabanita ◽

Dietrich von Rosen ◽

Martin Singull

Keyword(s):

Covariance Matrix ◽

Growth Curve ◽

Estimation Procedure ◽

Inner Product ◽

Dispersion Matrix ◽

Growth Curve Model ◽

Estimation Of Parameters ◽

Curve Model ◽

Extended Growth Curve Model ◽

Residual Space

In this paper the extended growth curve model with two terms and a linearly structured covariance matrix is considered. We propose an estimation procedure that handles linearly structured covariance matrices. The idea is first to estimate the covariance matrix when finding the inner product in a regression space and thereafter re-estimate it when it should be interpreted as a dispersion matrix. This idea is exploited by decomposing the residual space, the orthogonal complement to the design space, into three orthogonal subspaces. Studying residuals obtained from projections of observations on these subspaces yields explicit consistent estimators of the covariance matrix. An explicit consistent estimator of the mean is also proposed and numerical examples are given.

Estimation of Parameters of Impulse Responses of Mechanical Systems by Modified Prony Method

Solid State Phenomena ◽

10.4028/www.scientific.net/ssp.113.190 ◽

2006 ◽

Vol 113 ◽

pp. 190-194 ◽

Cited By ~ 2

Author(s):

Vytautas Slivinskas ◽

Virginija Šimonytė

Keyword(s):

Root Mean Square Error ◽

Optimization Procedure ◽

Estimation Procedure ◽

Mean Square ◽

Impulse Responses ◽

Estimation Of Parameters ◽

Unknown Parameters ◽

Response Data ◽

Prony Method ◽

Multiple Poles

In this paper the problem of estimating the parameters of a mechanical system from sampled impulse response data is considered. The modified Prony method is used to estimate the unknown parameters. It is shown that, for the particular data, the use of multiple poles improves the accuracy of the model. The initial Prony method estimates are further optimized by an iterative Levenberg optimization procedure. The root mean square error (RMSE) is the criterion for the estimation procedure. Additional modes are used in order to get better results.
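For orientation, the basic idea behind Prony-type estimation can be sketched for the simplest useful case: a single damped oscillation, modelled with two complex-conjugate poles. This is the classical linear-prediction step only, not the modified method with multiple poles or the Levenberg refinement described in the abstract. The function name `prony_two_poles` and the sampling interval `dt` are assumptions for illustration.

```python
import cmath


def prony_two_poles(samples, dt=1.0):
    """Estimate two continuous-time exponents from an impulse response.

    Step 1: least-squares fit of the linear prediction
            x[n] = c1 * x[n-1] + c2 * x[n-2].
    Step 2: the discrete poles are the roots of z^2 - c1*z - c2 = 0.
    Step 3: each pole z maps to an exponent s = ln(z) / dt, whose real
            part is the damping and imaginary part the angular frequency.
    """
    s11 = s12 = s22 = b1 = b2 = 0.0
    for n in range(2, len(samples)):
        u, v, y = samples[n - 1], samples[n - 2], samples[n]
        s11 += u * u
        s12 += u * v
        s22 += v * v
        b1 += u * y
        b2 += v * y
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    c1 = (b1 * s22 - b2 * s12) / det
    c2 = (s11 * b2 - s12 * b1) / det
    disc = cmath.sqrt(c1 * c1 + 4.0 * c2)
    poles = ((c1 + disc) / 2.0, (c1 - disc) / 2.0)
    return [cmath.log(z) / dt for z in poles]
```

Amplitudes and phases would follow from a second linear least-squares fit once the poles are known; that step is omitted here.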

A model for categorical length data from groundfish surveys

Canadian Journal of Fisheries and Aquatic Sciences ◽

10.1139/f04-049 ◽

2004 ◽

Vol 61 (7) ◽

pp. 1135-1142 ◽

Cited By ~ 21

Author(s):

Birgir Hrafnkelsson ◽

Gunnar Stefánsson

Keyword(s):

Gadus Morhua ◽

Atlantic Cod ◽

Length Distribution ◽

Multinomial Distribution ◽

Covariance Structure ◽

Correlation Coefficients ◽

Estimation Procedure ◽

Multinomial Model ◽

Multivariate Gaussian Distribution ◽

Length Data

An extension of the multinomial model of counts is presented to account for overdispersion and different correlation structure. Such models are needed in biological applications such as the analysis of length measurements from surveys of heterogeneous populations used for assessments of marine resources. One of the goals of such a survey is to estimate the length distribution of each species within a particular area. Using data on Atlantic cod (Gadus morhua) in Icelandic waters, it is demonstrated that the assumptions used in practice for categorical length data are seriously violated. The length data on cod exhibit variances that are larger than those of the standard multinomial model and correlation coefficients that are greater than those of the Dirichlet-multinomial model. To alleviate these problems, a hierarchical model based on the multinomial distribution and the logistically transformed multivariate Gaussian distribution is proposed. It is illustrated that this model captures the complex covariance structure of the data. The parameters in the models are estimated using a Bayesian estimation procedure based on Markov chain Monte Carlo.
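The hierarchical structure described here, a Gaussian vector pushed through a logistic transform to give multinomial probabilities, can be simulated in a few lines. The sketch below is a simplification under stated assumptions and not the authors' model or their MCMC fitting procedure: it applies a softmax to a Gaussian vector with one component per length class, whereas logistic-normal models often use one fewer component with a reference class. The names `mean`, `chol` (a lower-triangular Cholesky factor of the covariance) and `rng` are hypothetical.

```python
import math
import random


def sample_logistic_normal_multinomial(n, mean, chol, rng):
    """Draw class probabilities and counts for one survey station.

    mean: Gaussian mean, one entry per length class.
    chol: lower-triangular Cholesky factor of the Gaussian covariance.
    rng:  a random.Random instance.
    """
    k = len(mean)
    eps = [rng.gauss(0.0, 1.0) for _ in range(k)]
    # Correlated Gaussian vector z = mean + chol @ eps.
    z = [mean[i] + sum(chol[i][j] * eps[j] for j in range(i + 1))
         for i in range(k)]
    # Softmax (logistic transform), shifted by max(z) for stability.
    z_max = max(z)
    weights = [math.exp(v - z_max) for v in z]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Multinomial draw of n fish across the k length classes.
    counts = [0] * k
    for idx in rng.choices(range(k), weights=probs, k=n):
        counts[idx] += 1
    return probs, counts
```

Off-diagonal entries in `chol` let neighbouring length classes be positively correlated, which is the kind of structure the abstract says the Dirichlet-multinomial cannot reproduce.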

Some Properties and Estimation of Parameters for the Five Parameter Type I Generalized Half Logistic Distribution under Complete Observation

Asian Research Journal of Mathematics ◽

10.9734/arjom/2020/v16i430185 ◽

2020 ◽

pp. 39-46

Author(s):

O. A. Bello ◽

P. O. Awodutire ◽

I. Sule ◽

H. O. Lawal

Keyword(s):

Maximum Likelihood ◽

Maximum Likelihood Method ◽

Likelihood Method ◽

Logistic Distribution ◽

Type I ◽

Lifetime Data ◽

Estimation Of Parameters ◽

Distribution Estimation ◽

Complete Observation

This paper is a further study of the five-parameter type I generalized half logistic distribution. We derive some properties of the distribution. Estimation of its parameters under complete observation is studied using the maximum likelihood method. To assess the flexibility of the distribution, it was applied to a real lifetime data set; compared with its sub-models, the five-parameter type I generalized half logistic distribution performed best.

Dirichlet-multinomial modelling outperforms alternatives for analysis of microbiome and other ecological count data

10.1101/711317 ◽

2019 ◽

Cited By ~ 1

Author(s):

Joshua G. Harrison ◽

W. John Calder ◽

Vivaswat Shastry ◽

C. Alex Buerkle

Keyword(s):

Monte Carlo ◽

Count Data ◽

Dirichlet Distribution ◽

Molecular Ecology ◽

Multinomial Distribution ◽

Simulated Data ◽

Published Data ◽

Computationally Efficient ◽

Relative Abundances ◽

Analytical Tools

ABSTRACT Molecular ecology regularly requires the analysis of count data that reflect the relative abundance of features of a composition (e.g., taxa in a community, gene transcripts in a tissue). The sampling process that generates these data can be modeled using the multinomial distribution. Replicate multinomial samples inform the relative abundances of features in an underlying Dirichlet distribution. These distributions together form a hierarchical model for relative abundances among replicates and sampling groups. This type of Dirichlet-multinomial modelling (DMM) has been described previously, but its benefits and limitations are largely untested. With simulated data, we quantified the ability of DMM to detect differences in proportions between treatment and control groups, and compared the efficacy of three computational methods to implement DMM: Hamiltonian Monte Carlo (HMC), variational inference (VI), and Gibbs Markov chain Monte Carlo. We report that DMM was better able to detect shifts in relative abundances than analogous analytical tools, while identifying an acceptably low number of false positives. Among methods for implementing DMM, HMC provided the most accurate estimates of relative abundances, and VI was the most computationally efficient. The sensitivity of DMM was exemplified through analysis of previously published data describing lung microbiomes. We report that DMM identified several potentially pathogenic, bacterial taxa as more abundant in the lungs of children who aspirated foreign material during swallowing; these differences went undetected with different statistical approaches. Our results suggest that DMM has strong potential as a statistical method to guide inference in molecular ecology.

DIM: Adaptively Combining User Interests Mined at Different Stages Based on Deformable Interest Model

Mathematical Problems in Engineering ◽

10.1155/2020/4365602 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Xiaoru Wang ◽

Yueli Li ◽

Zhihong Yu ◽

Fu Li ◽

Heng Zhang ◽

...

Keyword(s):

Dirichlet Distribution ◽

Multinomial Distribution ◽

Situational Interest ◽

Personalized Recommendation ◽

User Interest ◽

Degree Of Deformation ◽

Tracking Model ◽

Interest Model ◽

Evolutional Process

User interest mining is widely used in personalized search and personalized recommendation. Traditional methods ignore that user interest forms through a process that evolves over time, so they cannot accurately describe the distribution of user interest. In this paper, we propose the interest tracking model (ITM). To account for timing, ITM uses the Dirichlet distribution and the multinomial distribution to describe the evolutional process of interest topics and frequent patterns, which adapts well to the evolution of user interest hidden in short texts across time slices. In addition, it is well known that user interest is composed of long-term interest and situational interest, the latter including short-term interest and social hot topics. State-of-the-art methods simply regard users' long-term interest as their final interest, which leaves them unable to describe the user interest distribution completely. To solve this problem, we propose the deformable interest model (DIM), which designs an objective function to combine users' long-term interest and situational interest and so mines user interest more comprehensively and accurately. Furthermore, we present the degree of deformation, which measures a subinterest's degree of influence on the final interest, and propose within DIM a real-time influence update mechanism. The mechanism adaptively updates the degree of deformation through linear iteration and reduces the dependence of the interest model on training sets. We present results on three datasets, each covering three months: Flickr users and their uploaded information, Twitter users and their tweets, and Instagram users and their uploaded information. The results show that the perplexity is reduced to 0.378, the average accuracy is increased to 94%, and the average NMI is increased to 0.20, indicating better interest prediction.

DIRICHLET DISTRIBUTION AND ESTIMATION OF PARAMETERS

Advances and Applications in Statistics ◽

10.17654/as053040401 ◽

2018 ◽

Vol 53 (4) ◽

pp. 401-421

Author(s):

A. Kübra Demirel ◽

H. Eray Çelik

Keyword(s):

Dirichlet Distribution ◽

Estimation Of Parameters
