On Consistency of Absolute Deviations Estimators of Convex Functions

Yao Luo; Eunji Lim

doi:10.5539/ijsp.v5n2p1

On Consistency of Absolute Deviations Estimators of Convex Functions

International Journal of Statistics and Probability ◽

10.5539/ijsp.v5n2p1 ◽

2016 ◽

Vol 5 (2) ◽

pp. 1

Author(s):

Yao Luo ◽

Eunji Lim

Keyword(s):

Waiting Time ◽

Convex Functions ◽

Service Rate ◽

Data Sets ◽

Single Server ◽

Computationally Efficient ◽

Data Set ◽

Least Absolute Deviations ◽

Long Run ◽

Mean Waiting Time

When estimating an unknown function from a data set of n observations, the function is often known to be convex. For example, the long-run average waiting time of a customer in a single server queue is known to be convex in the service rate (Weber 1983) even though there is no closed-form formula for the mean waiting time, and hence, it needs to be estimated from a data set. A computationally efficient way of finding the best fit of the convex function to the data set is to compute the least absolute deviations estimator minimizing the sum of absolute deviations over the set of convex functions. This estimator exhibits numerically preferred behavior since it can be computed faster and for larger data sets compared to other existing methods (Lim & Luo 2014). In this paper, we establish the validity of the least absolute deviations estimator by proving that it converges almost surely to the true function as n increases to infinity under modest assumptions.
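
To make the idea of minimizing the sum of absolute deviations over convex functions concrete, here is a minimal sketch for the one-dimensional case, written as a linear program. It is an illustration under stated assumptions, not the authors' implementation: the function name `lad_convex_fit`, the restriction to distinct scalar inputs, and the use of SciPy's `linprog` solver are all choices made here.

```python
import numpy as np
from scipy.optimize import linprog


def lad_convex_fit(x, y):
    """Fit values g_i at sorted x_i that minimize sum |y_i - g_i|
    subject to convexity (non-decreasing slopes between neighbours).

    Hypothetical helper for illustration; assumes distinct 1-D inputs.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]
    n = len(x)
    d = np.diff(x)  # gaps between neighbours, assumed strictly positive

    # Decision vector z = [g_1..g_n, t_1..t_n], where t_i bounds |y_i - g_i|.
    eye = np.eye(n)
    a_abs = np.vstack([np.hstack([eye, -eye]),    #  g_i - t_i <=  y_i
                       np.hstack([-eye, -eye])])  # -g_i - t_i <= -y_i
    b_abs = np.concatenate([y, -y])

    # Convexity: slope on [x_i, x_{i+1}] <= slope on [x_{i+1}, x_{i+2}].
    a_cvx = np.zeros((max(n - 2, 0), 2 * n))
    for i in range(n - 2):
        a_cvx[i, i] = -1.0 / d[i]
        a_cvx[i, i + 1] = 1.0 / d[i] + 1.0 / d[i + 1]
        a_cvx[i, i + 2] = -1.0 / d[i + 1]

    cost = np.concatenate([np.zeros(n), np.ones(n)])  # minimize sum of t_i
    bounds = [(None, None)] * n + [(0, None)] * n     # g free, t >= 0
    res = linprog(cost,
                  A_ub=np.vstack([a_abs, a_cvx]),
                  b_ub=np.concatenate([b_abs, np.zeros(a_cvx.shape[0])]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return x, res.x[:n]


# Example: noisy samples of a convex curve (synthetic data, for illustration only).
rng = np.random.default_rng(0)
x_demo = np.linspace(0.5, 3.0, 30)
y_demo = 1.0 / x_demo + rng.laplace(scale=0.05, size=x_demo.size)
x_sorted, g_hat = lad_convex_fit(x_demo, y_demo)
```

The dense constraint matrix keeps the sketch short; for large n a sparse formulation would be the natural choice.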

Bayesian Inference of Species Trees using Diffusion Models

Systematic Biology ◽

10.1093/sysbio/syaa051 ◽

2020 ◽

Vol 70 (1) ◽

pp. 145-161 ◽

Cited By ~ 1

Author(s):

Marnus Stoltz ◽

Boris Baeumer ◽

Remco Bouckaert ◽

Colin Fox ◽

Gordon Hiscott ◽

...

Keyword(s):

Bayesian Inference ◽

Numerical Algorithms ◽

Diffusion Models ◽

Model Parameters ◽

Data Sets ◽

Species Trees ◽

Computationally Efficient ◽

Data Set ◽

Snp Data ◽

Binary Markers

We describe a new and computationally efficient Bayesian methodology for inferring species trees and demographics from unlinked binary markers. Likelihood calculations are carried out using diffusion models of allele frequency dynamics combined with novel numerical algorithms. The diffusion approach allows for analysis of data sets containing hundreds or thousands of individuals. The method, which we call Snapper, has been implemented as part of the BEAST2 package. We conducted simulation experiments to assess numerical error, computational requirements, and accuracy recovering known model parameters. A reanalysis of soybean SNP data demonstrates that the models implemented in Snapp and Snapper can be difficult to distinguish in practice, a characteristic which we tested with further simulations. We demonstrate the scale of analysis possible using a SNP data set sampled from 399 fresh water turtles in 41 populations. [Bayesian inference; diffusion models; multi-species coalescent; SNP data; species trees; spectral methods.]

Large Deviations for the Single-Server Queue and the Reneging Paradox

Mathematics of Operations Research ◽

10.1287/moor.2021.1127 ◽

2021 ◽

Author(s):

Rami Atar ◽

Amarjit Budhiraja ◽

Paul Dupuis ◽

Ruoyu Wu

Keyword(s):

Large Deviations ◽

Decay Rate ◽

Sample Path ◽

Arrival Rate ◽

Rate Function ◽

Service Rate ◽

Single Server ◽

Large Deviations Principle ◽

Long Run ◽

The Individual

For the M/M/1+M model at the law-of-large-numbers scale, the long-run reneging count per unit time does not depend on the individual (i.e., per customer) reneging rate. This paradoxical statement has a simple proof. Less obvious is a large deviations analogue of this fact, stated as follows: the decay rate of the probability that the long-run reneging count per unit time is atypically large or atypically small does not depend on the individual reneging rate. In this paper, the sample path large deviations principle for the model is proved and the rate function is computed. Next, large time asymptotics for the reneging rate are studied for the case when the arrival rate exceeds the service rate. The key ingredient is a calculus of variations analysis of the variational problem associated with atypical reneging. A characterization of the aforementioned decay rate, given explicitly in terms of the arrival and service rate parameters of the model, is provided yielding a precise mathematical description of this paradoxical behavior.

Convexity results for single-server queues and for multiserver queues with constant service times

Journal of Applied Probability ◽

10.1017/s0021900200038948 ◽

1990 ◽

Vol 27 (02) ◽

pp. 465-468 ◽

Cited By ~ 1

Author(s):

Arie Harel

Keyword(s):

Waiting Time ◽

Service Time ◽

Sojourn Time ◽

Interarrival Time ◽

Service Rate ◽

Single Server ◽

Multiserver Queues ◽

Service Rates ◽

Constant Service Times ◽

Number Of Customers

We show that the waiting time in queue and the sojourn time of every customer in the G/G/1 and G/D/c queue are jointly convex in mean interarrival time and mean service time, and also jointly convex in mean interarrival time and service rate. Counterexamples show that this need not be the case, for the GI/GI/c queue or for the D/GI/c queue, for c ≧ 2. Also, we show that the average number of customers in the M/D/c queue is jointly convex in arrival and service rates. These results are surprising in light of the negative result for the GI/GI/2 queue (Weber (1983)).
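
As a simple illustration of convexity in the service rate, consider the special case of the M/M/1 queue. This special case is an assumption made here for tractability; the abstract covers the more general G/G/1 and G/D/c queues and joint convexity, which this sketch does not prove. With arrival rate λ held fixed and service rate μ > λ, the standard M/M/1 formula for the mean waiting time in queue can be differentiated twice in μ:

```latex
\[
W_q(\mu) = \frac{\lambda}{\mu(\mu-\lambda)} = \frac{1}{\mu-\lambda} - \frac{1}{\mu},
\qquad
\frac{d^2 W_q}{d\mu^2} = \frac{2}{(\mu-\lambda)^3} - \frac{2}{\mu^3} > 0
\quad \text{for } \mu > \lambda > 0 .
\]
```

The second derivative is positive because μ − λ < μ, so the mean waiting time in queue is convex in the service rate in this special case.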

Convexity results for single-server queues and for multiserver queues with constant service times

Journal of Applied Probability ◽

10.2307/3214668 ◽

1990 ◽

Vol 27 (2) ◽

pp. 465-468 ◽

Cited By ~ 5

Author(s):

Arie Harel

Keyword(s):

Waiting Time ◽

Service Time ◽

Sojourn Time ◽

Interarrival Time ◽

Service Rate ◽

Single Server ◽

Multiserver Queues ◽

Service Rates ◽

Constant Service Times ◽

Number Of Customers

We show that the waiting time in queue and the sojourn time of every customer in the G/G/1 and G/D/c queue are jointly convex in mean interarrival time and mean service time, and also jointly convex in mean interarrival time and service rate. Counterexamples show that this need not be the case, for the GI/GI/c queue or for the D/GI/c queue, for c ≧ 2. Also, we show that the average number of customers in the M/D/c queue is jointly convex in arrival and service rates. These results are surprising in light of the negative result for the GI/GI/2 queue (Weber (1983)).

Voxel-wise and spatial modelling of binary lesion masks: A review and comparison of methods

10.1101/2021.01.11.426223 ◽

2021 ◽

Author(s):

Petya Kindalova ◽

Ioannis Kosmidis ◽

Thomas E. Nichols

Keyword(s):

Maximum Likelihood ◽

Spatial Dependence ◽

Reference Data ◽

Spatial Modelling ◽

Maximum Likelihood Estimates ◽

Data Sets ◽

Computationally Efficient ◽

Data Set ◽

Lesion Mapping ◽

Modelling Approach

Objectives: White matter lesions are a very common finding on MRI in older adults and their presence increases the risk of stroke and dementia. Accurate and computationally efficient modelling methods are necessary to map the association of lesion incidence with risk factors, such as hypertension. However, there is no consensus in the brain mapping literature on whether a voxel-wise modelling approach is better for binary lesion data than a more computationally intensive spatial modelling approach that accounts for voxel dependence. Methods: We review three regression approaches for modelling binary lesion masks: mass-univariate probit regression modelling with either maximum likelihood estimates or mean bias-reduced estimates, and spatial Bayesian modelling, where the regression coefficients have a conditional autoregressive model prior to account for local spatial dependence. We design a novel simulation framework of artificial lesion maps to compare the three alternative lesion mapping methods. The age effect on lesion probability estimated from a reference data set (13,680 individuals from the UK Biobank) is used to simulate a realistic voxel-wise distribution of lesions across age. To mimic the real features of lesion masks, we suggest matching brain lesion summaries (total lesion volume, average lesion size and lesion count) across the reference data set and the simulated data sets. Thus, we allow for a fair comparison between the modelling approaches, under a realistic simulation setting. Results: Our findings suggest that bias-reduced estimates for voxel-wise binary-response generalized linear models (GLMs) overcome the drawbacks of infinite and biased maximum likelihood estimates and scale well for large data sets because voxel-wise estimation can be performed in parallel across voxels. Contrary to the assumption of spatial dependence being key in lesion mapping, our results show that voxel-wise bias-reduction and spatial modelling result in largely similar estimates. Conclusion: Bias-reduced estimates for voxel-wise GLMs are not only accurate but also computationally efficient, which will become increasingly important as more biobank-scale neuroimaging data sets become available.

Generalised birth and death queueing processes: recent results

Advances in Applied Probability ◽

10.2307/1425820 ◽

1977 ◽

Vol 9 (1) ◽

pp. 125-140 ◽

Cited By ~ 10

Author(s):

B. W. Conolly ◽

J. Chan

Keyword(s):

Waiting Time ◽

Queueing Systems ◽

System Size ◽

Single Server ◽

Traffic Intensity ◽

Stable Regime ◽

Mean Waiting Time ◽

Service Rates ◽

Queueing Processes ◽

Effective Service

The systems considered are single-server, though the theory has wider application to models of adaptive queueing systems. Arrival and service mechanisms are governed by state (n)-dependent mean arrival and service rates λn and µn. It is assumed that the choice of λn and µn leads to a stable regime. Formulae are sought that provide easy means of computing statistics of effectiveness of systems. A measure of traffic intensity is first defined in terms of ‘effective’ service time and inter-arrival intervals. It is shown that the latter have a renewal type connection with appropriately defined mean effective arrival and service rates λ∗ and µ∗ and that in consequence the ratio λ∗/µ∗ is the traffic intensity, equal moreover to 1 − p0, where p0 is the stable probability of an empty system, consistent with other systems. It is also shown that for first come, first served discipline the equivalent of Little's formula, L = λ∗W, holds, where W and L are the mean waiting time of an arrival and mean system size at an arbitrary epoch. In addition it appears that stable regime output intervals are statistically identical with effective inter-arrival intervals. Symmetrical moment formulae of arbitrary order are derived algebraically for effective inter-arrival and service intervals, for waiting time, for busy period and for output.
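
The relationship between effective rates and the empty-system probability can be checked numerically on a small example. The sketch below is not taken from the paper: it assumes a finite truncation of the state space, and it assumes one natural reading of the "effective" rates, namely the stationary average arrival rate and the stationary average service rate conditional on the server being busy. Under those assumptions the ratio of effective rates equals one minus the probability of an empty system. The function names are hypothetical.

```python
import math


def stationary_distribution(lam, mu):
    """Stationary probabilities p_0..p_N of a birth-death chain.

    lam[n] is the arrival rate in state n (n = 0..N-1) and mu[n] is the
    service rate in state n + 1, so both lists have length N.
    """
    weights = [1.0]
    for arrival, service in zip(lam, mu):
        # Detailed balance: p_{n+1} * mu_{n+1} = p_n * lam_n
        weights.append(weights[-1] * arrival / service)
    total = sum(weights)
    return [w / total for w in weights]


def effective_rates(lam, mu):
    """Return (p, lam_eff, mu_eff) under the assumed definitions:
    lam_eff = sum_n lam_n p_n, and
    mu_eff  = sum_{n>=1} mu_n p_n / (1 - p_0).
    """
    p = stationary_distribution(lam, mu)
    lam_eff = sum(a * p[n] for n, a in enumerate(lam))
    mu_eff = sum(s * p[n + 1] for n, s in enumerate(mu)) / (1.0 - p[0])
    return p, lam_eff, mu_eff


# Example: discouraged arrivals, lam_n = 2 / (n + 1), unit service rate,
# truncated at 30 customers (an arbitrary illustrative choice).
lam_demo = [2.0 / (n + 1) for n in range(30)]
mu_demo = [1.0] * 30
p_demo, lam_eff_demo, mu_eff_demo = effective_rates(lam_demo, mu_demo)
traffic_intensity = lam_eff_demo / mu_eff_demo
mean_system_size = sum(n * q for n, q in enumerate(p_demo))
mean_time_in_system = mean_system_size / lam_eff_demo  # Little's formula
```

In this example the stationary distribution is a truncated Poisson distribution with mean 2, so the empty-system probability is close to exp(−2).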

On the Modelling of the Mobile WiMAX (IEEE 802.16e) Uplink Scheduler

Modelling and Simulation in Engineering ◽

10.1155/2010/804939 ◽

2010 ◽

Vol 2010 ◽

pp. 1-7

Author(s):

Darmawaty Mohd Ali ◽

Kaharudin Dimyati

Keyword(s):

Waiting Time ◽

Arrival Rate ◽

Mobile Wimax ◽

Single Server ◽

Service Time Distribution ◽

Ieee 802.16E ◽

Delay Constraint ◽

Threshold Policy ◽

Polling Model ◽

Mean Waiting Time

Packet scheduling has drawn a great deal of attention in the field of wireless networks as it plays an important role in distributing shared resources in a network. The process involves allocating the bandwidth among users and determining their transmission order. In this paper an uplink (UL) scheduling algorithm for the Mobile Worldwide Interoperability for Microwave Access (WiMAX) network based on the cyclic polling model is proposed. The model in this study consists of five queues (UGS, ertPS, rtPS, nrtPS, and BE) visited by a single server. A threshold policy is imposed on the nrtPS queue to ensure that the delay constraint of real time traffic (UGS, ertPS, and rtPS) is not violated, making this approach original in comparison to the existing contributions. A mathematical model is formulated for the weighted sum of the mean waiting time of each individual queue based on the pseudo-conservation law. The results of the analysis are useful in obtaining or testing approximations for individual mean waiting times, especially when queues are asymmetric (where each queue may have different stochastic characteristics such as arrival rate and service time distribution) and when their number is large (more than 2 queues).

Band-based similarity indices for gene expression clustering and classification.

10.21203/rs.2.4296/v1 ◽

2019 ◽

Author(s):

Aurora Torrente

Keyword(s):

Gene Expression ◽

Euclidean Distance ◽

Real Data ◽

Data Sets ◽

Similarity Coefficients ◽

Computationally Efficient ◽

Data Set ◽

Similarity Indices ◽

Clustering And Classification ◽

Band Depth

Background: The concept of depth induces an ordering from centre outwards in multivariate data. Most depth definitions are unfeasible for dimensions larger than three or four, but the Modified Band Depth (MBD) is a notable exception that has proven to be a valuable tool in the analysis of gene expression data. However, given a notion of depth, there exists no straightforward method to derive a depth-based similarity or dissimilarity measure between observations to be used in standard tasks such as clustering or classification. Results: We propose a methodology to assess a data-driven (dis)similarity between two observations, taking advantage of the bands used in the computation of the MBD. To that end, we build binary vectors associated to each observation to compute the number of times each coordinate is located between the limits of the intervals defined by all possible bands in the set. Those vectors and their Boolean products are used to derive contingency tables from which standard similarity indices can be calculated. Our approach is computationally efficient and can be applied to bands formed by any number of observations from the data set. Conclusions: We have evaluated the performance of several similarity indices with respect to that of the Euclidean distance, used as benchmark, in standard clustering and classification techniques in a variety of simulated and real data sets. Our experiments show that the technique for deriving such measures is very promising, with some of the selected indices outperforming the Euclidean distance. The use of the method is not restricted to these; the extension to other similarity coefficients is straightforward.

PCQC: Selecting optimal principal components for identifying clusters with highly imbalanced class sizes in single-cell RNA-seq data

10.1101/2020.11.19.390542 ◽

2020 ◽

Author(s):

David Burstein ◽

John F. Fullard ◽

Panos Roussos

Keyword(s):

Single Cell ◽

Principal Components ◽

Data Sets ◽

Sequencing Data ◽

Computationally Efficient ◽

Data Set ◽

Cell Gene Expression ◽

Efficient Alternative ◽

Small Clusters ◽

Variance Explained

Summary: Prior to identifying clusters in single cell gene expression experiments, selecting the top principal components is a critical step for filtering out noise in the data set. Identifying these top principal components typically focuses on the total variance explained, and principal components that explain small clusters from rare populations will not necessarily capture a large percentage of variance in the data. We present a computationally efficient alternative for identifying the optimal principal components based on the tails of the distribution of variance explained for each observation. We then evaluate the efficacy of our approach in three different single cell RNA-sequencing data sets and find that our method matches, or outperforms, other selection criteria that are typically employed in the literature. Availability and implementation: pcqc is written in Python and available at github.com/RoussosLab/pcqc

SERVER WAITING TIMES IN INFINITE SUPPLY POLLING SYSTEMS WITH PREPARATION TIMES

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964815000339 ◽

2015 ◽

Vol 30 (2) ◽

pp. 153-184 ◽

Cited By ~ 1

Author(s):

Jan-Pieter L. Dorsman ◽

Nir Perel ◽

Maria Vlasiou

Keyword(s):

Waiting Time ◽

Queueing Networks ◽

Waiting Times ◽

Fixed Number ◽

Polling Systems ◽

Single Server ◽

Dynamic Allocation ◽

Mean Waiting Time ◽

Preparation Phase ◽

Stationary Waiting Time

We consider a system consisting of a single server serving a fixed number of stations. At each station, there is an infinite queue of customers that have to undergo a preparation phase before being served. This model is connected to layered queueing networks, to an extension of polling systems and, surprisingly, to random graphs. We are interested in the waiting time of the server. For the case where the server polls the stations cyclically, we give a sufficient condition for the existence of a limiting waiting-time distribution and we study the tail behavior of the stationary waiting time. Furthermore, assuming that preparation times are exponentially distributed, we describe in depth the resulting Markov chain. We also investigate a model variation where the server does not necessarily poll the stations in a cyclic order, but always serves the customer with the earliest completed preparation phase. We show that the mean waiting time under this dynamic allocation never exceeds that of the cyclic case, but that the waiting-time distributions corresponding to both cases are not necessarily stochastically ordered. Finally, we provide extensive numerical results investigating and comparing the effect of the system's parameters on the performance of the server for both models.
