Effects of Community Structure on Respondent-Driven Sampling
Abstract:
Respondent-driven sampling (RDS) is a recently introduced, and
now widely used, technique for estimating disease prevalence in
hidden populations. The sample is collected through a form of
snowball sampling where current sample members recruit future
sample members. We re-interpret respondent-driven sampling as
Markov chain Monte Carlo (MCMC) importance sampling, and examine
the effects of community structure and recruitment methodology
on the variance of RDS estimates. Past work on RDS has assumed
that the variance of RDS estimates is primarily affected by
segregation between healthy and infected individuals. Using an
illustrative model, we show that this network feature, while
important, significantly understates the effects of community
structure on RDS estimates when considered in isolation. We also
show that variance is increased by a sample design feature that
allows sample members to recruit multiple future sample members.
Our observations are further substantiated by network data
collected as part of the National Longitudinal Study of
Adolescent Health. This is joint work with Matthew Salganik.
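The MCMC importance sampling view of RDS can be illustrated with a
minimal sketch. It is not the model from the talk; it assumes an
idealized setting in which recruitment is a simple random walk on
an undirected network, each sample member recruits exactly one
neighbor, and sampling is with replacement. Under those assumptions
the walk visits individuals in proportion to their degree, so
weighting each sample member by 1/degree corrects toward the
population prevalence. The two-community graph, the function names
(make_two_community_graph, rds_estimate) and all parameter values
are hypothetical choices made for illustration.

```python
import random


def make_two_community_graph(n_per_block, p_in, p_out, rng):
    """Build an undirected graph with two equal-size communities.

    Pairs in the same community are linked with probability p_in,
    pairs in different communities with probability p_out.
    Returns a dict mapping each node to a sorted list of neighbors.
    """
    n = 2 * n_per_block
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            same_block = (u < n_per_block) == (v < n_per_block)
            if rng.random() < (p_in if same_block else p_out):
                adj[u].add(v)
                adj[v].add(u)
    return {v: sorted(nbrs) for v, nbrs in adj.items()}


def rds_estimate(adj, infected, n_samples, rng):
    """Estimate prevalence from a random-walk sample.

    Assumption: one recruit per sample member, chosen uniformly
    among the recruiter's neighbors, with replacement. Each visit
    is weighted by 1/degree (importance weights for a walk whose
    stationary distribution is proportional to degree).
    """
    # Start from a node that has at least one neighbor.
    current = rng.choice([v for v in adj if adj[v]])
    weighted_infected = 0.0
    total_weight = 0.0
    for _ in range(n_samples):
        weight = 1.0 / len(adj[current])
        weighted_infected += weight * infected[current]
        total_weight += weight
        current = rng.choice(adj[current])
    return weighted_infected / total_weight


# Example: infection is confined to the first community, so the
# true prevalence is 0.5 by construction.
rng = random.Random(0)
graph = make_two_community_graph(30, p_in=0.3, p_out=0.01, rng=rng)
status = {v: 1 if v < 30 else 0 for v in graph}
estimates = [rds_estimate(graph, status, 200, rng) for _ in range(20)]
print(min(estimates), max(estimates))
```

Lowering p_out in this sketch makes the walk cross between the two
communities less often, which is one way to see how community
structure can widen the spread of repeated estimates.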
Biography:
Sharad Goel is a member of the Microeconomics and Social Systems
Group at Yahoo! Research. He received his PhD in Applied
Mathematics from Cornell University, and has held positions as a
research fellow in the math departments of Stanford University
and the University of Southern California. Sharad works at the
interface of economics, statistics and computer science, and is
particularly interested in problems of collective behavior.