A conjecture on the Feldman bandit problem

2018 ◽  
Vol 55 (1) ◽  
pp. 318-324
Author(s):  
Maher Nouiehed ◽  
Sheldon M. Ross

Abstract We consider the Bernoulli bandit problem where one of the arms has win probability α and the others β, with the identity of the α arm specified by initial probabilities. With u = max(α, β), v = min(α, β), call an arm with win probability u a good arm. Whereas it is known that the strategy of always playing the arm with the largest probability of being a good arm maximizes the expected number of wins in the first n games for all n, we conjecture that it also stochastically maximizes the number of wins. That is, we conjecture that this strategy maximizes the probability of at least k wins in the first n games for all k, n. The conjecture is proven when k = 1, and k = n, and when there are only two arms and k = n - 1.
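The strategy described in this abstract can be explored numerically. The following is a minimal sketch, not the authors' code: it assumes 0 < α, β < 1, and the function names (`update_posterior`, `greedy_arm`, `play`, `prob_at_least`) are hypothetical. It applies Bayes' rule to the probabilities that each arm is the α arm, always plays the arm most likely to be a good arm, and estimates by simulation the probability of at least k wins in n games.

```python
import random


def update_posterior(p, arm, win, alpha, beta):
    """Bayes update of P(arm j is the alpha arm) after playing `arm`."""
    post = []
    for j, pj in enumerate(p):
        # Win probability of the played arm under the hypothesis
        # "arm j is the alpha arm".
        q = alpha if j == arm else beta
        post.append(pj * (q if win else 1.0 - q))
    total = sum(post)
    return [x / total for x in post]


def greedy_arm(p, alpha, beta):
    """Index of the arm with the largest probability of being a good arm."""
    if alpha >= beta:
        # The alpha arm is the good arm: pick the most likely alpha arm.
        return max(range(len(p)), key=lambda j: p[j])
    # The beta arms are good: pick the arm least likely to be the alpha arm.
    return min(range(len(p)), key=lambda j: p[j])


def play(prior, alpha, beta, n, rng):
    """Simulate n games under the greedy strategy; return the number of wins."""
    true_arm = rng.choices(range(len(prior)), weights=prior)[0]
    p = list(prior)
    wins = 0
    for _ in range(n):
        arm = greedy_arm(p, alpha, beta)
        q = alpha if arm == true_arm else beta
        win = rng.random() < q
        wins += win
        p = update_posterior(p, arm, win, alpha, beta)
    return wins


def prob_at_least(prior, alpha, beta, n, k, trials=20000, seed=0):
    """Monte Carlo estimate of P(at least k wins in the first n games)."""
    rng = random.Random(seed)
    hits = sum(play(prior, alpha, beta, n, rng) >= k for _ in range(trials))
    return hits / trials
```

Comparing `prob_at_least` for this strategy against the same estimate for an alternative arm-selection rule would give an informal numerical check of the conjecture for particular k and n; it is not a proof.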

1980 ◽  
Vol 12 (1) ◽  
pp. 174-182 ◽  
Author(s):  
John Bather

Given a finite number of different experiments with unknown probabilities p1, p2, ···, pk of success, the multi-armed bandit problem is concerned with maximising the expected number of successes in a sequence of trials. There are many policies which ensure that the proportion of successes converges to p = max (p1, p2, ···, pk), in the long run. This property is established for a class of decision procedures which rely on randomisation, at each stage, in selecting the experiment for the next trial. Further, it is suggested that some of these procedures might perform well over any finite sequence of trials.


Optimization ◽  
1976 ◽  
Vol 7 (3) ◽  
pp. 471-475 ◽  
Author(s):  
P.W. Jones

2007 ◽  
Author(s):  
Dennis Garlick ◽  
Aaron P. Blaisdell

Genetics ◽  
1989 ◽  
Vol 123 (3) ◽  
pp. 597-601 ◽  
Author(s):  
F Tajima

Abstract The expected number of segregating sites and the expectation of the average number of nucleotide differences among DNA sequences randomly sampled from a population, which is not in equilibrium, have been developed. The results obtained indicate that, in the case where the population size has changed drastically, the number of segregating sites is influenced by the size of the current population more strongly than is the average number of nucleotide differences, while the average number of nucleotide differences is affected by the size of the original population more severely than is the number of segregating sites. The results also indicate that the average number of nucleotide differences is affected by a population bottleneck more strongly than is the number of segregating sites.
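The two statistics compared in this abstract can be computed directly from a sample of aligned sequences. The following is a minimal sketch, assuming equal-length sequences with no gaps or missing data; the function names and the toy sample are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns in which the sample is not monomorphic."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of nucleotide differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


# Toy sample of three aligned sequences (illustrative data only).
sample = ["ACGT", "ACGA", "TCGA"]
print(segregating_sites(sample))          # 2
print(mean_pairwise_differences(sample))  # 1.3333333333333333
```

In the toy sample, the first and last columns are polymorphic, and the three pairs differ at 1, 2, and 1 sites, giving an average of 4/3.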


2021 ◽  
Vol 17 (2) ◽  
pp. 1-39
Author(s):  
Mai Ben Adar Bessos ◽  
Amir Herzberg

We investigate an understudied threat: networks of stealthy routers (S-Routers), relaying messages to a hidden destination. The S-Routers relay communication along a path of multiple short-range, low-energy hops, to avoid remote localization by triangulation. Mobile devices called Interceptors can detect communication by an S-Router, but only when the Interceptor is next to the transmitting S-Router. We examine algorithms for a set of mobile Interceptors to find the destination of the communication relayed by the S-Routers. The algorithms are compared according to the number of communicating rounds before the destination is found, i.e., rounds in which data is transmitted from the source to the destination. We evaluate the algorithms analytically and using simulations, including against a parametric, optimized strategy for the S-Routers. Our main result is an Interceptors algorithm that bounds the expected number of communicating rounds by a term quasilinear in the number of S-Routers. For the case where S-Routers transmit at every round (“continuously”), we present an algorithm that improves this bound.


Genetics ◽  
1987 ◽  
Vol 117 (1) ◽  
pp. 149-153
Author(s):  
Curtis Strobeck

Abstract Unbiased estimates of θ = 4Nµ in a random mating population can be based on either the number of alleles or the average number of nucleotide differences in a sample. However, if there is population structure and the sample is drawn from a single subpopulation, these two estimates of θ behave differently. The expected number of alleles in a sample is an increasing function of the migration rates, whereas the expected average number of nucleotide differences is shown to be independent of the migration rates and equal to 4N_Tµ for a general model of population structure which includes both the island model and the circular stepping-stone model. This contrast in the behavior of these two estimates of θ is used as the basis of a test for population subdivision. Using a Monte-Carlo simulation developed so that independent samples from a single subpopulation could be obtained quickly, this test is shown to be a useful method to determine if there is population subdivision.


Genetics ◽  
1996 ◽  
Vol 143 (2) ◽  
pp. 645-659 ◽  
Author(s):  
Timothy Galitski ◽  
John R Roth

Abstract The most prominent systems for the study of adaptive mutability depend on the specialized activities of genetic elements like bacteriophage Mu and the F plasmid. Searching for general adaptive mutability, we have investigated the behavior of Salmonella typhimurium strains with chromosomal lacZ mutations. We have studied 30 revertible nonsense, missense, frameshift, and insertion alleles. One-third of the mutants produced ≥10 late revertant colonies (appearing three to seven days after plating on selective medium). For the prolific mutants, the number of late revertants showed rank correlation with the residual β-galactosidase activity; for the same mutants, revertant number showed no correlation with the nonselective reversion rate (from fluctuation tests). Leaky mutants, which grew slowly on selective medium, produced late revertants whereas tight nongrowing mutants generally did not produce late revertants. However, the number of late revertants was not proportional to residual growth. Using total residual growth and the nonselective reversion rate, the expected number of late revertants was calculated. For several leaky mutants, the observed revertant number exceeded the expected number. We suggest that excess late revertants from these mutants arise from general adaptive mutability available to any chromosomal gene.

