Estimating the size of hidden populations from register data.

Ledberg, Anders; Wennberg, Peter

Ledberg A; Centre for Social Research on Alcohol and Drugs, SoRAD, Stockholm University, SE-10691 Stockholm, Sweden. anders.ledberg@sorad.su.se.

BMC Med Res Methodol ; 14: 58, 2014 Apr 27.

Article in English | MEDLINE | ID: mdl-24766871

ABSTRACT

BACKGROUND: Prevalence estimates of drug use, or of its consequences, are considered important in many contexts and may have substantial influence over public policy. However, it is rarely possible to simply count the relevant individuals, in particular when the defining characteristics might be illegal, as in the case of drug use. Consequently, methods are needed to estimate the size of such partly 'hidden' populations, and many such methods have been developed and used within epidemiology, including in studies of alcohol and drug use. Here we introduce a method appropriate for estimating the size of human populations given a single source of data, for example entries in a health-care registry.

METHODS: The setup is the following: during a fixed time-period, e.g. a year, individuals belonging to the target population have a non-zero probability of being "registered". Each individual might be registered multiple times, and the time-points of the registrations are recorded. Assuming that the population is closed and that the probability of being registered at least once is constant, we derive a family of maximum likelihood (ML) estimators of total population size. We study the ML estimator using Monte Carlo simulations and delimit the range of cases where it is useful. In particular, we investigate the effect of making the population heterogeneous with respect to the probability of being registered.

RESULTS: The new estimator is asymptotically unbiased, and we show that high-precision estimates can be obtained for samples covering as little as 25% of the total population size. However, if the total population size is small (say, on the order of 500), a larger fraction needs to be sampled to achieve reliable estimates. Further, we show that the estimator gives reliable estimates even when individuals differ in the probability of being registered. We also compare the ML estimator to an estimator known as Chao's estimator and show that the latter can have a substantial bias when applied to epidemiological data.

CONCLUSIONS: The population size estimator suggested herein complements existing methods and is less sensitive to certain types of dependencies typical in epidemiological data.
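The single-source setup described in the abstract can be illustrated with a minimal Python sketch. It simulates registration counts for a closed, homogeneous population and applies Chao's estimator in its commonly cited form, n + f1^2 / (2 f2), where n is the number of individuals registered at least once and f1, f2 are the numbers registered exactly once and exactly twice. This is not the authors' ML estimator, whose form the abstract does not give. The function names, the Poisson registration model, and the parameter values (population 5000, rate 0.3, giving roughly 25% coverage) are assumptions made for illustration only.

```python
import random
from collections import Counter


def poisson_draw(rate, rng):
    """Number of registrations in one unit time period (assumed Poisson process)."""
    # Count exponential inter-arrival times that fall within the period [0, 1].
    t, k = rng.expovariate(rate), 0
    while t <= 1.0:
        k += 1
        t += rng.expovariate(rate)
    return k


def simulate_counts(pop_size, rate, rng):
    """Registration counts for every individual, including the unseen zeros."""
    return [poisson_draw(rate, rng) for _ in range(pop_size)]


def chao_estimate(counts):
    """Chao's estimator of total population size from per-individual counts."""
    observed = [c for c in counts if c > 0]  # only registered individuals are visible
    freq = Counter(observed)
    n, f1, f2 = len(observed), freq[1], freq[2]
    if f2 == 0:
        # Bias-corrected variant, used here only to avoid division by zero.
        return n + f1 * (f1 - 1) / 2.0
    return n + f1 * f1 / (2.0 * f2)


# Hypothetical example: 5000 individuals, about 25% registered at least once.
rng = random.Random(0)
counts = simulate_counts(5000, 0.3, rng)
n_observed = sum(1 for c in counts if c > 0)
print("observed:", n_observed, "estimated total:", round(chao_estimate(counts)))
```

Under this homogeneous model Chao's estimator is expected to land near the true size; the bias reported in the abstract concerns epidemiological data with dependencies that this sketch does not simulate.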

Subject(s)

Heroin Dependence/epidemiology; Heroin Dependence/mortality; Population Density; Computer Simulation; Epidemiologic Methods; Humans; Likelihood Functions; Monte Carlo Method; Population Dynamics; Registries; Sample Size; Statistics as Topic

Full text: 1 | Database: MEDLINE | Main subject: Population Density / Heroin Dependence | Type of study: Health_economic_evaluation / Prognostic_studies / Risk_factors_studies | Limits: Humans | Language: En | Year: 2014 | Document type: Article
