Respondent-driven sampling and an unusual epidemic

Journal of Applied Probability ◽

10.1017/jpr.2016.17 ◽

2016 ◽

Vol 53 (2) ◽

pp. 518-530 ◽

Cited By ~ 4

Author(s):

J. Malmros ◽

F. Liljeros ◽

T. Britton

Keyword(s):

Population Size ◽

Current Practice ◽

Relative Size ◽

Respondent Driven Sampling ◽

Study Participation ◽

Hidden Populations ◽

Infinite Population ◽

Major Outbreak

Abstract Respondent-driven sampling (RDS) is frequently used when sampling from hidden populations. In RDS, sampled individuals pass on participation coupons to at most c of their acquaintances in the community (c = 3 being a common choice). If these individuals choose to participate, they in turn pass coupons on to their acquaintances, and so on. The process of recruiting is shown to behave like a new Reed–Frost-type network epidemic, in which 'becoming infected' corresponds to study participation. We calculate R0, the probability of a major 'outbreak', and the relative size of a major outbreak for c < ∞ in the limit of infinite population size and compare to the standard Reed–Frost epidemic. Our results indicate that c should often be chosen larger than in current practice.
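The coupon mechanism described in this abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' model or their formula for R0. It is a simplified branching-process illustration under assumptions introduced here: each participant has a Poisson-distributed number of eligible acquaintances (`mean_contacts`), hands out at most `c` coupons, and each coupon leads to participation independently with probability `p`. All function and parameter names are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def mean_recruits(c, p, mean_contacts, trials=20000, seed=0):
    """Estimate the average number of new participants per participant.

    Assumptions (not taken from the paper): contacts ~ Poisson(mean_contacts),
    at most c coupons are handed out, each coupon is redeemed with probability p.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        contacts = poisson(rng, mean_contacts)
        coupons = min(c, contacts)  # cannot hand out more coupons than contacts
        total += sum(1 for _ in range(coupons) if rng.random() < p)
    return total / trials


if __name__ == "__main__":
    for c in (1, 3, 10):
        print(c, round(mean_recruits(c, p=0.5, mean_contacts=5.0), 3))
```

In this toy setting the average number of recruits per participant can never exceed `c * p`, which gives some intuition for why a small coupon limit can keep recruitment chains from taking off.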

Population Size Estimations Among Hidden Populations Using Respondent-Driven Sampling Surveys: Case Studies From Armenia

JMIR Public Health and Surveillance ◽

10.2196/12034 ◽

2019 ◽

Vol 5 (1) ◽

pp. e12034 ◽

Cited By ~ 5

Author(s):

Katherine R McLaughlin ◽

Lisa G Johnston ◽

Laura J Gamble ◽

Trdat Grigoryan ◽

Arshak Papoyan ◽

...

Keyword(s):

Population Size ◽

Case Studies ◽

Respondent Driven Sampling ◽

Hidden Populations

Population Size Estimations Among Hidden Populations Using Respondent-Driven Sampling Surveys: Case Studies From Armenia (Preprint)

10.2196/preprints.12034 ◽

2018 ◽

Author(s):

Katherine R McLaughlin ◽

Lisa G Johnston ◽

Laura J Gamble ◽

Trdat Grigoryan ◽

Arshak Papoyan ◽

...

Keyword(s):

Population Size ◽

Measurement Errors ◽

Network Size ◽

Respondent Driven Sampling ◽

Size Estimation ◽

Hidden Populations ◽

Care Needs ◽

Personal Network ◽

Successive Sampling ◽

Population Size Estimates

BACKGROUND: Estimates of the sizes of hidden populations, including female sex workers (FSW), men who have sex with men (MSM), and people who inject drugs (PWID), are essential for understanding the magnitude of vulnerabilities, health care needs, risk behaviors, and HIV and other infections.

OBJECTIVE: This article advances the successive sampling-population size estimation (SS-PSE) method by examining the performance of a modification allowing visibility to be jointly modeled with population size in the context of 15 datasets. Datasets are from respondent-driven sampling (RDS) surveys of FSW, MSM, and PWID from three cities in Armenia. We compare and evaluate the accuracy of our imputed visibility population size estimates to those found for the same populations through other unpublished methods. We then suggest questions that are useful for eliciting information needed to compute SS-PSE and provide guidelines and caveats to improve the implementation of SS-PSE for real data.

METHODS: SS-PSE approximates the RDS sampling mechanism via the successive sampling model and uses the order of selection of the sample to provide information on the distribution of network sizes over the population members. We incorporate visibility imputation, a measure of a person’s propensity to participate in the study, given that inclusion probabilities for RDS are unknown and social network sizes, often used as a proxy for inclusion probability, are subject to measurement errors from self-reported study data.

RESULTS: FSW in Yerevan (2012, 2016) and Vanadzor (2016) as well as PWID in Yerevan (2014), Gyumri (2016), and Vanadzor (2016) had great fits with prior estimations. The MSM populations in all three cities had inconsistencies with expert prior values. The maximum low prior value was larger than the minimum high prior value, making a great fit impossible. One possible explanation is the inclusion of transgender individuals in the MSM populations during these studies. There could be differences between what experts perceive as the size of the population, based on who is an eligible member of that population, and what members of the population perceive. There could also be inconsistencies among different study participants, as some may include transgender individuals in their accounting of personal network size, while others may not. Because of these difficulties, the transgender population was split apart from the MSM population for the 2018 study.

CONCLUSIONS: Prior estimations from expert opinions may not always be accurate. RDS surveys should be assessed to ensure that they have met all of the assumptions, that variables have reached convergence, and that the network structure of the population does not have bottlenecks. We recommend that SS-PSE be used in conjunction with other population size estimations commonly used in RDS, as well as results of other years of SS-PSE, to ensure generation of the most accurate size estimation.

Assessing Bias in Population Size Estimates Among Hidden Populations When Using the Service Multiplier Method Combined With Respondent-Driven Sampling Surveys: Survey Study

JMIR Public Health and Surveillance ◽

10.2196/15044 ◽

2020 ◽

Vol 6 (2) ◽

pp. e15044

Author(s):

Sungai T Chabata ◽

Elizabeth Fearon ◽

Emily L Webb ◽

Helen A Weiss ◽

James R Hargreaves ◽

...

Keyword(s):

Population Size ◽

Sex Workers ◽

Multiplier Method ◽

Survey Study ◽

Respondent Driven Sampling ◽

Size Estimation ◽

Hidden Populations ◽

Increased Risk ◽

Population Size Estimates ◽

Size Estimates

Background: Population size estimates (PSEs) for hidden populations at increased risk of HIV, including female sex workers (FSWs), are important to inform public health policy and resource allocation. The service multiplier method (SMM) is commonly used to estimate the sizes of hidden populations. We used this method to obtain PSEs for FSWs at 9 sites in Zimbabwe and explored methods for assessing potential biases that could arise in using this approach.

Objective: This study aimed to guide the assessment of biases that arise when estimating the population sizes of hidden populations using the SMM combined with respondent-driven sampling (RDS) surveys.

Methods: We conducted RDS surveys at 9 sites in late 2013, where the Sisters with a Voice program (the program), which collects program visit data of FSWs, was also present. Using the SMM, we obtained PSEs for FSWs at each site by dividing the number of FSWs who attended the program, based on program records, by the RDS-II weighted proportion of FSWs who reported attending this program in the previous 6 months in the RDS surveys. Both the RDS weighting and SMM make a number of assumptions, potentially leading to biases if the assumptions are not met. To test these assumptions, we used convergence and bottleneck plots to assess seed dependence of RDS-II proportion estimates, chi-square tests to assess if there was an association between the characteristics of FSWs and their knowledge of program existence, and logistic regression to compare the characteristics of FSWs attending the program with those recruited to RDS surveys.

Results: The PSEs ranged from 194 (95% CI 62-325) to 805 (95% CI 456-1142) across 9 sites from May to November 2013. The 95% CIs for the majority of sites were wide. In some sites, the RDS-II proportion of women who reported program use in the RDS surveys may have been influenced by the characteristics of selected seeds, and we also observed bottlenecks in some sites. There was no evidence of association between characteristics of FSWs and knowledge of program existence, and in the majority of sites, there was no evidence that the characteristics of the populations differed between RDS and program data.

Conclusions: We used a series of rigorous methods to explore potential biases in our PSEs. We were able to identify the biases and their potential direction, but we could not determine the ultimate direction of these biases in our PSEs. We have evidence that the PSEs in most sites may be biased and a suggestion that the bias is toward underestimation, and this should be considered if the PSEs are to be used. These tests for bias should be included when undertaking population size estimation using the SMM combined with RDS surveys.
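The division described in the Methods can be sketched in a few lines. This is a minimal illustration, assuming the RDS-II proportion is computed with the usual inverse-degree weights (each respondent weighted by one over their reported network size). The function names and the four-respondent dataset are hypothetical and are not taken from the study.

```python
def rds2_proportion(degrees, has_trait):
    """RDS-II style weighted proportion using inverse-degree weights.

    degrees:   reported network size of each respondent (must be > 0)
    has_trait: True if the respondent reported attending the program
    """
    weights = [1.0 / d for d in degrees]
    weighted_yes = sum(w for w, t in zip(weights, has_trait) if t)
    return weighted_yes / sum(weights)


def smm_population_size(program_count, weighted_proportion):
    """Service multiplier estimate: program records count / survey proportion."""
    if weighted_proportion <= 0:
        raise ValueError("proportion must be positive")
    return program_count / weighted_proportion


if __name__ == "__main__":
    # Hypothetical toy survey: four respondents.
    degrees = [2, 4, 4, 8]
    attended = [True, False, True, False]
    prop = rds2_proportion(degrees, attended)
    print(prop, smm_population_size(200, prop))
```

Because the survey proportion sits in the denominator, any bias or seed dependence in that proportion carries straight through to the size estimate, which is why the abstract stresses checking it.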

Assessing Bias in Population Size Estimates Among Hidden Populations When Using the Service Multiplier Method Combined With Respondent-Driven Sampling Surveys: Survey Study (Preprint)

10.2196/preprints.15044 ◽

2019 ◽

Author(s):

Sungai T Chabata ◽

Elizabeth Fearon ◽

Emily L Webb ◽

Helen A Weiss ◽

James R Hargreaves ◽

...

Keyword(s):

Population Size ◽

Sex Workers ◽

Multiplier Method ◽

Survey Study ◽

Respondent Driven Sampling ◽

Size Estimation ◽

Hidden Populations ◽

Increased Risk ◽

Population Size Estimates ◽

Size Estimates

B-Graph Sampling to Estimate the Size of a Hidden Population

Journal of Official Statistics ◽

10.1515/jos-2015-0042 ◽

2015 ◽

Vol 31 (4) ◽

pp. 723-736 ◽

Cited By ~ 1

Author(s):

Marinus Spreen ◽

Stefan Bogaerts

Keyword(s):

Social Network ◽

Sampling Design ◽

Respondent Driven Sampling ◽

Sampling Frame ◽

Hidden Populations ◽

Hidden Population ◽

Graph Sampling ◽

Graph Design ◽

Applied Design ◽

Incomplete Sampling

Abstract Link-tracing designs are often used to estimate the size of hidden populations by utilizing the relational links between their members. A major problem in studies of hidden populations is the lack of a convenient sampling frame. The most frequently applied design in studies of hidden populations is respondent-driven sampling in which no sampling frame is used. However, in some studies multiple but incomplete sampling frames are available. In this article, we introduce the B-graph design that can be used in such situations. In this design, all available incomplete sampling frames are joined and turned into one sampling frame, from which a random sample is drawn and selected respondents are asked to mention their contacts. By considering the population as a bipartite graph of a two-mode network (those from the sampling frame and those who are not on the frame), the number of respondents who are directly linked to the sampling frame members can be estimated using Chao’s and Zelterman’s estimators for sparse data. The B-graph sampling design is illustrated using the data of a social network study from Utrecht, the Netherlands.
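The two estimators named in the abstract have standard closed forms for sparse count data, sketched below. The sketch shows only the generic formulas: Chao's lower-bound estimate n + f1²/(2·f2) and Zelterman's estimate n / (1 − exp(−2·f2/f1)), where n is the number of distinct units observed, f1 the number observed exactly once, and f2 the number observed exactly twice. How the article maps B-graph nominations onto these counts is not shown here, and the example counts are hypothetical.

```python
import math
from collections import Counter


def frequency_counts(times_observed):
    """Return (n, f1, f2) from a list of per-unit observation counts (>= 1)."""
    freq = Counter(times_observed)
    return len(times_observed), freq.get(1, 0), freq.get(2, 0)


def chao_estimate(times_observed):
    """Chao's lower-bound population size estimate: n + f1^2 / (2 * f2)."""
    n, f1, f2 = frequency_counts(times_observed)
    if f2 == 0:
        raise ValueError("Chao's estimator needs at least one unit seen twice")
    return n + f1 ** 2 / (2 * f2)


def zelterman_estimate(times_observed):
    """Zelterman's estimate: n / (1 - exp(-lambda)), with lambda = 2 * f2 / f1."""
    n, f1, f2 = frequency_counts(times_observed)
    if f1 == 0 or f2 == 0:
        raise ValueError("Zelterman's estimator needs f1 > 0 and f2 > 0")
    rate = 2 * f2 / f1
    return n / (1 - math.exp(-rate))


if __name__ == "__main__":
    # Hypothetical data: how often each of 7 observed persons was mentioned.
    mentions = [1, 1, 1, 1, 2, 2, 3]
    print(chao_estimate(mentions), zelterman_estimate(mentions))
```

Both estimators rely only on the units seen once and twice, which is what makes them usable when most members of the population are mentioned rarely or not at all.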

Software Application Profile: The Anchored Multiplier calculator—a Bayesian tool to synthesize population size estimates

International Journal of Epidemiology ◽

10.1093/ije/dyz101 ◽

2019 ◽

Vol 48 (6) ◽

pp. 1744-1749

Author(s):

Paul D Wesson ◽

Willi McFarland ◽

Cong Charlie Qin ◽

Ali Mirzazadeh

Keyword(s):

Population Size ◽

Web Application ◽

Probability Distributions ◽

Public Health Research ◽

Forest Plot ◽

Hidden Populations ◽

Software Application ◽

Posterior Probability Distribution ◽

Population Size Estimates ◽

Size Estimates

Abstract Estimating the number of people in hidden populations is needed for public health research, yet available methods produce highly variable and uncertain results. The Anchored Multiplier calculator uses a Bayesian framework to synthesize multiple population size estimates to generate a consensus estimate. Users submit point estimates and lower/upper bounds which are converted to beta probability distributions and combined to form a single posterior probability distribution. The Anchored Multiplier calculator is available as a web browser-based application. The software allows for unlimited empirical population size estimates to be submitted and combined according to Bayes Theorem to form a single estimate. The software returns output as a forest plot (to visually compare data inputs and the final Anchored Multiplier estimate) and a table that displays results as population percentages and counts. The web application ‘Anchored Multiplier Calculator’ is free software and is available at [http://globalhealthsciences.ucsf.edu/resources/tools] or directly at [http://anchoredmultiplier.ucsf.edu/].
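The conversion and combination steps described in the abstract can be sketched roughly as follows. This is not the calculator's code, and the abstract does not state the exact conversion rule. Two assumptions are made here for illustration: the lower and upper bounds are read as an approximate 95% interval around the point estimate and matched to a beta distribution by the method of moments, and beta densities are combined by multiplying them, which yields another beta density. All names are hypothetical.

```python
def beta_from_estimate(point, lower, upper):
    """Method-of-moments beta parameters for a proportion estimate.

    Assumption: (lower, upper) is roughly a 95% interval, so the standard
    deviation is taken as (upper - lower) / (2 * 1.96).
    """
    sd = (upper - lower) / (2 * 1.96)
    var = sd ** 2
    common = point * (1 - point) / var - 1
    if common <= 0:
        raise ValueError("interval too wide for a beta distribution")
    return point * common, (1 - point) * common


def combine_betas(params):
    """Multiply beta densities: the product of Beta(a_i, b_i) densities is
    proportional to Beta(sum(a_i) - (k - 1), sum(b_i) - (k - 1))."""
    k = len(params)
    a = sum(p[0] for p in params) - (k - 1)
    b = sum(p[1] for p in params) - (k - 1)
    return a, b


if __name__ == "__main__":
    # Hypothetical estimates of the population share (point, lower, upper).
    estimates = [(0.020, 0.010, 0.030), (0.030, 0.015, 0.045)]
    a, b = combine_betas([beta_from_estimate(*e) for e in estimates])
    print("combined mean share:", a / (a + b))
```

Multiplying the combined mean share by the size of the general population would give a count, matching the abstract's note that results are shown as both percentages and counts.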

A Revisit of Infinite Population Models for Evolutionary Algorithms on Continuous Optimization Problems

Evolutionary Computation ◽

10.1162/evco_a_00249 ◽

2020 ◽

Vol 28 (1) ◽

pp. 55-85

Author(s):

Bo Song ◽

Victor O.K. Li

Keyword(s):

Population Dynamics ◽

Evolutionary Algorithms ◽

Population Size ◽

Optimization Problems ◽

Continuous Optimization ◽

Analytical Framework ◽

Population Models ◽

Initial Population ◽

Infinite Population ◽

Continuous Optimization Problems

Infinite population models are important tools for studying population dynamics of evolutionary algorithms. They describe how the distributions of populations change between consecutive generations. In general, infinite population models are derived from Markov chains by exploiting symmetries between individuals in the population and analyzing the limit as the population size goes to infinity. In this article, we study the theoretical foundations of infinite population models of evolutionary algorithms on continuous optimization problems. First, we show that the convergence proofs in a widely cited study were in fact problematic and incomplete. We further show that the modeling assumption of exchangeability of individuals cannot yield the transition equation. Then, in order to analyze infinite population models, we build an analytical framework based on convergence in distribution of random elements which take values in the metric space of infinite sequences. The framework is concise and mathematically rigorous. It also provides an infrastructure for studying the convergence of the stacking of operators and of iterating the algorithm which previous studies failed to address. Finally, we use the framework to prove the convergence of infinite population models for the mutation operator and the [Formula: see text]-ary recombination operator. We show that these operators can provide accurate predictions for real population dynamics as the population size goes to infinity, provided that the initial population is identically and independently distributed.

Respondent-Driven Sampling II: Deriving Valid Population Estimates from Chain-Referral Samples of Hidden Populations

Social Problems ◽

10.1525/sp.2002.49.1.11 ◽

2002 ◽

Vol 49 (1) ◽

pp. 11-34 ◽

Cited By ~ 1072

Author(s):

Douglas D. Heckathorn

Keyword(s):

Population Estimates ◽

Respondent Driven Sampling ◽

Hidden Populations

Comments on "Theoretical analysis of evolutionary algorithms with an infinite population size in continuous space. I. Basic properties of selection and mutation" [with reply]

IEEE Transactions on Neural Networks ◽

10.1109/72.661129 ◽

1998 ◽

Vol 9 (2) ◽

pp. 341-343 ◽

Cited By ~ 2

Author(s):

Yong Gao ◽

Xiaofeng Qi ◽

F. Palmieri

Keyword(s):

Theoretical Analysis ◽

Evolutionary Algorithms ◽

Population Size ◽

Continuous Space ◽

Basic Properties ◽

Infinite Population

On the prediction of simultaneous inbreeding coefficients at multiple loci

Genetics Research ◽

10.1017/s0016672303006633 ◽

2004 ◽

Vol 83 (2) ◽

pp. 113-120 ◽

Cited By ~ 8

Author(s):

JULES HERNÁNDEZ-SÁNCHEZ ◽

CHRIS S. HALEY ◽

JOHN A. WOOLLIAMS

Keyword(s):

Population Size ◽

Effective Population Size ◽

Conditional Probability ◽

Prior Probability ◽

Effective Population ◽

Infinite Population ◽

Inbreeding Coefficients ◽

Multiple Loci ◽

Monoecious Population ◽

Marker Spacing

A new deterministic method for predicting simultaneous inbreeding coefficients at three and four loci is presented. The method involves calculating the conditional probability of IBD (identical by descent) at one locus given IBD at other loci, and multiplying this probability by the prior probability of the latter loci being simultaneously IBD. The conditional probability is obtained applying a novel regression model, and the prior probability from the theory of digenic measures of Weir and Cockerham. The model was validated for a finite monoecious population mating at random, with a constant effective population size, and with or without selfing, and also for an infinite population with a constant intermediate proportion of selfing. We assumed discrete generations. Deterministic predictions were very accurate when compared with simulation results, and robust to alternative forms of implementation. These simultaneous inbreeding coefficients were more sensitive to changes in effective population size than in marker spacing. Extensions to predict simultaneous inbreeding coefficients at more than four loci are now possible.
