Results 1 - 20 of 31
1.
Nat Commun ; 15(1): 4626, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38816383

ABSTRACT

The human infectious reservoir of Plasmodium falciparum is governed by transmission efficiency during vector-human contact and mosquito biting preferences. Understanding biting bias in a natural setting can help target interventions to interrupt transmission. In a 15-month cohort in western Kenya, we detected P. falciparum in indoor-resting Anopheles and human blood samples by qPCR and matched mosquito bloodmeals to cohort participants using short-tandem repeat genotyping. Using risk factor analyses and discrete choice models, we assessed mosquito biting behavior with respect to parasite transmission. Biting was highly unequal; 20% of people received 86% of bites. Biting rates were higher on males (biting rate ratio (BRR): 1.68; CI: 1.28-2.19), children 5-15 years (BRR: 1.49; CI: 1.13-1.98), and P. falciparum-infected individuals (BRR: 1.25; CI: 1.01-1.55). In aggregate, P. falciparum-infected school-age (5-15 years) boys accounted for 50% of bites potentially leading to onward transmission and had an entomological inoculation rate 6.4x higher than any other group. Additionally, infectious mosquitoes were nearly 3x more likely than non-infectious mosquitoes to bite P. falciparum-infected individuals (relative risk ratio 2.76, 95% CI 1.65-4.61). Thus, persistent P. falciparum transmission was characterized by disproportionate onward transmission from school-age boys and by the preference of infected mosquitoes to feed upon infected people.


Subject(s)
Anopheles , Insect Bites and Stings , Malaria, Falciparum , Mosquito Vectors , Plasmodium falciparum , Humans , Anopheles/parasitology , Anopheles/physiology , Animals , Plasmodium falciparum/physiology , Plasmodium falciparum/isolation & purification , Plasmodium falciparum/genetics , Malaria, Falciparum/transmission , Malaria, Falciparum/parasitology , Male , Adolescent , Child , Child, Preschool , Female , Kenya/epidemiology , Mosquito Vectors/parasitology , Mosquito Vectors/physiology , Adult , Feeding Behavior , Young Adult , Infant
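
The concentration figure reported in the abstract above (20% of people receiving 86% of bites) is a top-share statistic. As a minimal sketch of how such a share could be computed from per-person bite counts, the Python below uses a hypothetical function name, `top_share`, and an invented ten-person toy cohort chosen so the arithmetic is easy to follow. None of it comes from the study's data or code.

```python
def top_share(bite_counts, fraction=0.2):
    """Share of all bites received by the most-bitten `fraction` of people."""
    if not bite_counts:
        raise ValueError("bite_counts must be non-empty")
    total = sum(bite_counts)
    if total == 0:
        return 0.0
    ranked = sorted(bite_counts, reverse=True)
    # Always include at least one person in the "top" group.
    k = max(1, round(fraction * len(ranked)))
    return sum(ranked[:k]) / total


# Hypothetical toy cohort of 10 people (NOT data from the study):
# the two most-bitten people receive 43 of 50 bites.
toy_counts = [30, 13, 2, 2, 1, 1, 1, 0, 0, 0]
print(top_share(toy_counts))  # 0.86
```

With real matched-bloodmeal data the counts would come from genotype matching, and a rate ratio such as the reported BRR would additionally need exposure time per person, which this sketch ignores.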
2.
Proc Natl Acad Sci U S A ; 120(21): e2207185120, 2023 May 23.
Article in English | MEDLINE | ID: mdl-37192169

ABSTRACT

Collecting complete network data is expensive, time-consuming, and often infeasible. Aggregated Relational Data (ARD), which ask respondents questions of the form "How many people with trait X do you know?" provide a low-cost option when collecting complete network data is not possible. Rather than asking about connections between each pair of individuals directly, ARD collect the number of contacts the respondent knows with a given trait. Despite widespread use and a growing literature on ARD methodology, there is still no systematic understanding of when and why ARD should accurately recover features of the unobserved network. This paper provides such a characterization by deriving conditions under which statistics about the unobserved network (or functions of these statistics like regression coefficients) can be consistently estimated using ARD. We first provide consistent estimates of network model parameters for three commonly used probabilistic models: the beta-model with node-specific unobserved effects, the stochastic block model with unobserved community structure, and latent geometric space models with unobserved latent locations. A key observation is that cross-group link probabilities for a collection of (possibly unobserved) groups identify the model parameters, meaning ARD are sufficient for parameter estimation. With these estimated parameters, it is possible to simulate graphs from the fitted distribution and analyze the distribution of network statistics. We can then characterize conditions under which the simulated networks based on ARD will allow for consistent estimation of the unobserved network statistics, such as eigenvector centrality, or response functions by or of the unobserved network, such as regression coefficients.
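
The key observation in this abstract, that cross-group link probabilities identify the model parameters, can be illustrated for the stochastic block model. The sketch below is a simplified Python illustration under strong assumptions that the paper does not require: group memberships are observed, every node is surveyed, and each respondent reports exact counts. The function names (`simulate_sbm`, `collect_ard`, `estimate_link_probs`) and the parameter values are hypothetical.

```python
import random


def simulate_sbm(groups, probs, rng):
    """Undirected graph: nodes i and j link with probability probs[g_i][g_j]."""
    n = len(groups)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < probs[groups[i]][groups[j]]:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def collect_ard(neighbors, groups, n_groups):
    """ARD: for each respondent, how many of their contacts fall in each group."""
    ard = []
    for contacts in neighbors:
        counts = [0] * n_groups
        for j in contacts:
            counts[groups[j]] += 1
        ard.append(counts)
    return ard


def estimate_link_probs(ard, groups, n_groups):
    """Estimate group-to-group link probabilities from ARD counts alone."""
    sizes = [groups.count(g) for g in range(n_groups)]
    totals = [[0] * n_groups for _ in range(n_groups)]
    for i, counts in enumerate(ard):
        for k in range(n_groups):
            totals[groups[i]][k] += counts[k]
    est = [[0.0] * n_groups for _ in range(n_groups)]
    for g in range(n_groups):
        for k in range(n_groups):
            # Ordered pairs (respondent in g, possible contact in k), no self-links.
            possible = sizes[g] * (sizes[k] - 1 if g == k else sizes[k])
            est[g][k] = totals[g][k] / possible
    return est


# Assumed toy parameters: two groups of 300 nodes each.
P_TRUE = [[0.10, 0.02], [0.02, 0.08]]
groups = [0] * 300 + [1] * 300
rng = random.Random(0)
graph = simulate_sbm(groups, P_TRUE, rng)
ard = collect_ard(graph, groups, n_groups=2)
P_HAT = estimate_link_probs(ard, groups, n_groups=2)
print(P_HAT)
```

The estimator never looks at individual edges, only at the aggregated counts, which is the sense in which ARD suffice here. Handling unobserved groups or a partial sample, as the paper does, would need more machinery than this sketch shows.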

4.
Popul Health Metr ; 20(1): 3, 2022 01 10.
Article in English | MEDLINE | ID: mdl-35012587

ABSTRACT

BACKGROUND: The mortality pattern from birth to age five is known to vary by underlying cause of mortality, which has been documented in multiple instances. Many countries without high-functioning vital registration systems could benefit from estimates of age- and cause-specific mortality to inform health programming; however, to date the causes of under-five death have only been described for broad age categories such as neonates (0-27 days), infants (0-11 months), and children aged 12-59 months. METHODS: We adapt the log quadratic model to mortality patterns for children under five to all-cause child mortality and then to age- and cause-specific mortality (U5ACSM). We apply these methods to empirical sample registration system mortality data in China from 1996 to 2015. Based on these empirical data, we simulate probabilities of mortality in the case when the true relationships between age and mortality by cause are known. RESULTS: We estimate U5ACSM within 0.1-0.7 deaths per 1000 livebirths in hold-out strata for life tables constructed from the China sample registration system, representing considerable improvement compared to an error of 1.2 per 1000 livebirths using a standard approach. This improved prediction error for U5ACSM is consistently demonstrated for all-cause as well as pneumonia- and injury-specific mortality. We also consistently identified cause-specific mortality patterns in simulated mortality scenarios. CONCLUSION: The log quadratic model is a significant improvement over the standard approach for deriving U5ACSM based on both simulation and empirical results.


Subject(s)
Child Mortality , Infant Mortality , Cause of Death , Child , Child, Preschool , China/epidemiology , Humans , Infant , Infant, Newborn , Life Tables
5.
R J ; 14(4): 316-334, 2022 Dec.
Article in English | MEDLINE | ID: mdl-37974934

ABSTRACT

Verbal autopsy (VA) is a survey-based tool widely used to infer cause of death (COD) in regions without complete-coverage civil registration and vital statistics systems. In such settings, many deaths happen outside of medical facilities and are not officially documented by a medical professional. VA surveys, consisting of signs and symptoms reported by a person close to the decedent, are used to infer the COD for an individual, and to estimate and monitor the COD distribution in the population. Several classification algorithms have been developed and widely used to assign causes of death using VA data. However, the incompatibility between different idiosyncratic model implementations and required data structures makes it difficult to systematically apply and compare different methods. The openVA package provides the first standardized framework for analyzing VA data that is compatible with all openly available methods and data structures. It provides an open-source R implementation of several of the most widely used VA methods. It supports different data input and output formats, and customizable information about the associations between causes and symptoms. The paper discusses the relevant algorithms, their implementations in R packages under the openVA suite, and demonstrates the pipeline of model fitting, summary, comparison, and visualization in the R environment.

6.
Ann Appl Stat ; 16(1): 124-143, 2022 Mar.
Article in English | MEDLINE | ID: mdl-37621750

ABSTRACT

In order to implement disease-specific interventions in young age groups, policy makers in low- and middle-income countries require timely and accurate estimates of age- and cause-specific child mortality. High-quality data is not available in settings where these interventions are most needed, but there is a push to create sample registration systems that collect detailed mortality information. Current methods that estimate mortality from this data employ multistage frameworks without rigorous statistical justification that separately estimate all-cause and cause-specific mortality and are not sufficiently adaptable to capture important features of the data. We propose a flexible Bayesian modeling framework to estimate age- and cause-specific child mortality from sample registration data. We provide a theoretical justification for the framework, explore its properties via simulation, and use it to estimate mortality trends using data from the Maternal and Child Health Surveillance System in China.

7.
Epidemics ; 36: 100477, 2021 09.
Article in English | MEDLINE | ID: mdl-34171509

ABSTRACT

The novel SARS-CoV-2 virus, as it manifested in India in April 2020, showed marked heterogeneity in its transmission. Here, we used data collected from contact tracing during the lockdown in response to the first wave of COVID-19 in Punjab, a major state in India, to quantify this heterogeneity, and to examine implications for transmission dynamics. We found evidence of heterogeneity acting at multiple levels: in the number of potentially infectious contacts per index case, and in the per-contact risk of infection. Incorporating these findings in simple mathematical models of disease transmission reveals that these heterogeneities act in combination to strongly influence transmission dynamics. Standard approaches, such as representing heterogeneity through secondary case distributions, could be biased by neglecting these underlying interactions between heterogeneities. We discuss implications for policy, and for more efficient contact tracing in resource-constrained settings such as India. Our results highlight how contact tracing, an important public health measure, can also provide important insights into epidemic spread and control.


Subject(s)
COVID-19 , SARS-CoV-2 , Communicable Disease Control , Contact Tracing , Humans , India/epidemiology
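
The abstract above describes two layers of heterogeneity, in contacts per index case and in per-contact infection risk, acting in combination. The Python sketch below illustrates that idea with a toy simulation. It is not the authors' model: the two-point distributions, the assumption that risk varies by index case, and the name `secondary_cases` are all invented for illustration. Both scenarios share the same mean number of contacts (5) and mean per-contact risk (0.1).

```python
import random
import statistics


def secondary_cases(n_index, draw_contacts, draw_risk, rng):
    """Simulate secondary cases per index case: Binomial(contacts, risk)."""
    cases = []
    for _ in range(n_index):
        contacts = draw_contacts(rng)
        risk = draw_risk(rng)
        cases.append(sum(rng.random() < risk for _ in range(contacts)))
    return cases


rng = random.Random(2020)
N_INDEX = 20000  # assumed number of simulated index cases

# Homogeneous scenario: everyone has 5 contacts and risk 0.1 per contact.
homogeneous = secondary_cases(N_INDEX, lambda r: 5, lambda r: 0.1, rng)

# Heterogeneous scenario with the same means (hypothetical values):
# contacts: 10% of index cases have 32, the rest have 2   -> mean 5
# risk:     20% of index cases have 0.42, the rest 0.02   -> mean 0.1
heterogeneous = secondary_cases(
    N_INDEX,
    lambda r: 32 if r.random() < 0.1 else 2,
    lambda r: 0.42 if r.random() < 0.2 else 0.02,
    rng,
)

mean_hom = statistics.fmean(homogeneous)
mean_het = statistics.fmean(heterogeneous)
var_hom = statistics.pvariance(homogeneous)
var_het = statistics.pvariance(heterogeneous)
print(f"homogeneous:   mean={mean_hom:.2f} variance={var_hom:.2f}")
print(f"heterogeneous: mean={mean_het:.2f} variance={var_het:.2f}")
```

The two scenarios have roughly the same average number of secondary cases, but the heterogeneous one is far more dispersed, which is why a summary based only on the secondary case distribution can hide where the variation comes from.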
8.
Proc Natl Acad Sci U S A ; 117(48): 30266-30275, 2020 12 01.
Article in English | MEDLINE | ID: mdl-33208538

ABSTRACT

Many modern problems in medicine and public health leverage machine-learning methods to predict outcomes based on observable covariates. In a wide array of settings, predicted outcomes are used in subsequent statistical analysis, often without accounting for the distinction between observed and predicted outcomes. We call inference with predicted outcomes postprediction inference. In this paper, we develop methods for correcting statistical inference using outcomes predicted with arbitrarily complicated machine-learning models including random forests and deep neural nets. Rather than trying to derive the correction from first principles for each machine-learning algorithm, we observe that there is typically a low-dimensional and easily modeled representation of the relationship between the observed and predicted outcomes. We build an approach for postprediction inference that naturally fits into the standard machine-learning framework where the data are divided into training, testing, and validation sets. We train the prediction model in the training set, estimate the relationship between the observed and predicted outcomes in the testing set, and use that relationship to correct subsequent inference in the validation set. We show our postprediction inference (postpi) approach can correct bias and improve variance estimation and subsequent statistical inference with predicted outcomes. To show the broad range of applicability of our approach, we show postpi can improve inference in two distinct fields: modeling predicted phenotypes in repurposed gene expression data and modeling predicted causes of death in verbal autopsy data. Our method is available through an open-source R package: https://github.com/leekgroup/postpi.


Subject(s)
Machine Learning , Cause of Death , Computer Simulation , Humans , Organ Specificity
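
The three-set workflow in the abstract above (train the predictor, relate observed to predicted outcomes in the testing set, correct inference in the validation set) can be sketched in a few lines. The Python below is a loose illustration only. It is not the postpi package's API or its exact procedure: the data-generating process, the deliberately shrunken linear predictor standing in for a black-box model, the linear relationship model, and the bootstrap-with-simulation correction are all assumptions made for the example.

```python
import random
import statistics


def ols(x, y):
    """Simple least squares y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, (rss / (len(x) - 2)) ** 0.5


def make_data(n, rng):
    """Assumed toy process: y = 2*x + standard normal noise."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [2 * xi + rng.gauss(0, 1) for xi in x]
    return x, y


rng = random.Random(1)
x_tr, y_tr = make_data(500, rng)
x_te, y_te = make_data(500, rng)
x_va, _ = make_data(500, rng)  # validation outcomes treated as unobserved

# 1. Training set: fit the prediction model. The slope is halved on purpose
#    so that the stand-in "black box" gives systematically biased predictions.
a_tr, b_tr, _ = ols(x_tr, y_tr)


def predict(x):
    return [a_tr + 0.5 * b_tr * xi for xi in x]


# 2. Testing set: model the relationship between observed and predicted outcomes.
g0, g1, g_sd = ols(predict(x_te), y_te)

# 3. Validation set: naive inference regresses the predictions on x.
yhat_va = predict(x_va)
naive_slope = ols(x_va, yhat_va)[1]

#    Corrected inference: bootstrap, simulate outcomes from the relationship
#    model, refit, and summarize the resulting slopes.
slopes = []
for _ in range(200):
    idx = [rng.randrange(len(x_va)) for _ in x_va]
    xb = [x_va[i] for i in idx]
    yb = [g0 + g1 * yhat_va[i] + rng.gauss(0, g_sd) for i in idx]
    slopes.append(ols(xb, yb)[1])
corrected_slope = statistics.median(slopes)
corrected_se = statistics.stdev(slopes)

print(f"naive slope:     {naive_slope:.2f}")
print(f"corrected slope: {corrected_slope:.2f} (SE {corrected_se:.2f})")
```

In this toy setup the true slope is 2, the naive analysis recovers roughly half of it with almost no apparent uncertainty, and the corrected analysis moves the estimate back toward the truth with a non-trivial standard error. For real analyses the linked R package is the reference implementation.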
9.
medRxiv ; 2020 Sep 15.
Article in English | MEDLINE | ID: mdl-32995809

ABSTRACT

The novel SARS-CoV-2 virus shows marked heterogeneity in its transmission. Here, we used data collected from contact tracing during the lockdown in Punjab, a major state in India, to quantify this heterogeneity, and to examine implications for transmission dynamics. We found evidence of heterogeneity acting at multiple levels: in the number of potentially infectious contacts per index case, and in the per-contact risk of infection. Incorporating these findings in simple mathematical models of disease transmission reveals that these heterogeneities act in combination to strongly influence transmission dynamics. Standard approaches, such as representing heterogeneity through secondary case distributions, could be biased by neglecting these underlying interactions between heterogeneities. We discuss implications for policy, and for more efficient contact tracing in resource-constrained settings such as India. Our results highlight how contact tracing, an important public health measure, can also provide important insights into epidemic spread and control.

10.
BMC Med ; 18(1): 69, 2020 03 26.
Article in English | MEDLINE | ID: mdl-32213178

ABSTRACT

BACKGROUND: A verbal autopsy (VA) is an interview conducted with the caregivers of someone who has recently died to describe the circumstances of the death. In recent years, several algorithmic methods have been developed to classify cause of death using VA data. The performance of one method, InSilicoVA, was evaluated in a study by Flaxman et al., published in BMC Medicine in 2018. The results of that study are different from those previously published by our group. METHODS: Based on the description of methods in the Flaxman et al. study, we attempt to replicate the analysis to understand why the published results differ from those of our previous work. RESULTS: We failed to reproduce the results published in Flaxman et al. Most of the discrepancies we find likely result from undocumented differences in data pre-processing, and/or values assigned to key parameters governing the behavior of the algorithm. CONCLUSION: This finding highlights the importance of making replication code available along with published results. All code necessary to replicate the work described here is freely available on GitHub.


Subject(s)
Autopsy/methods , Cause of Death/trends , Humans , Research Design , Validation Studies as Topic
11.
Am Econ Rev ; 110(8): 2454-2484, 2020 Aug.
Article in English | MEDLINE | ID: mdl-34526729

ABSTRACT

Social network data are often prohibitively expensive to collect, limiting empirical network research. We propose an inexpensive and feasible strategy for network elicitation using Aggregated Relational Data (ARD): responses to questions of the form "how many of your links have trait k?" Our method uses ARD to recover parameters of a network formation model, which permits sampling from a distribution over node- or graph-level statistics. We replicate the results of two field experiments that used network data and draw similar conclusions with ARD alone.

12.
Ann Appl Stat ; 14(1): 241-256, 2020 Mar.
Article in English | MEDLINE | ID: mdl-33520049

ABSTRACT

The distribution of deaths by cause provides crucial information for public health planning, response and evaluation. About 60% of deaths globally are not registered or given a cause, limiting our ability to understand disease epidemiology. Verbal autopsy (VA) surveys are increasingly used in such settings to collect information on the signs, symptoms and medical history of people who have recently died. This article develops a novel Bayesian method for estimation of population distributions of deaths by cause using verbal autopsy data. The proposed approach is based on a multivariate probit model where associations among items in questionnaires are flexibly induced by latent factors. Using the Population Health Metrics Research Consortium labeled data that include both VA and medically certified causes of death, we assess performance of the proposed method. Further, we estimate important questionnaire items that are highly associated with causes of death. This framework provides insights that will simplify future data collection.

13.
J Comput Graph Stat ; 28(1): 185-196, 2019.
Article in English | MEDLINE | ID: mdl-31447541

ABSTRACT

Many existing statistical and machine learning tools for social network analysis focus on a single level of analysis. Methods designed for clustering optimize a global partition of the graph, whereas projection-based approaches (e.g., the latent space model in the statistics literature) represent in rich detail the roles of individuals. Many pertinent questions in sociology and economics, however, span multiple scales of analysis. Further, many questions involve comparisons across disconnected graphs that will inevitably be of different sizes, either due to missing data or the inherent heterogeneity in real-world networks. We propose a class of network models that represent network structure on multiple scales and facilitate comparison across graphs with different numbers of individuals. These models differentially invest modeling effort within subgraphs of high density, often termed communities, while maintaining a parsimonious structure between said subgraphs. We show that our model class is projective, highlighting an ongoing discussion in the social network modeling literature on the dependence of inference paradigms on the size of the observed graph. We illustrate the utility of our method using data on household relations from Karnataka, India. Supplementary material for this article is available online.

14.
BMC Med ; 17(1): 116, 2019 06 27.
Article in English | MEDLINE | ID: mdl-31242925

ABSTRACT

BACKGROUND: Verbal autopsies with physician assignment of cause of death (COD) are commonly used in settings where medical certification of deaths is uncommon. It remains unanswered if automated algorithms can replace physician assignment. METHODS: We randomized verbal autopsy interviews for deaths in 117 villages in rural India to either physician or automated COD assignment. Twenty-four trained lay (non-medical) surveyors applied the allocated method using a laptop-based electronic system. Two of 25 physicians were allocated randomly to independently code the deaths in the physician assignment arm. Six algorithms (Naïve Bayes Classifier (NBC), King-Lu, InSilicoVA, InSilicoVA-NT, InterVA-4, and SmartVA) coded each death in the automated arm. The primary outcome was concordance with the COD distribution in the standard physician-assigned arm. Four thousand six hundred fifty-one (4651) deaths were allocated to physician (standard), and 4723 to automated arms. RESULTS: The two arms were nearly identical in demographics and key symptom patterns. The average concordances of automated algorithms with the standard were 62%, 56%, and 59% for adult, child, and neonatal deaths, respectively. Automated algorithms showed inconsistent results, even for causes that are relatively easy to identify such as road traffic injuries. Automated algorithms underestimated the number of cancer and suicide deaths in adults and overestimated other injuries in adults and children. Across all ages, average weighted concordance with the standard was 62% (range 79-45%) with the best to worst ranking automated algorithms being InterVA-4, InSilicoVA-NT, InSilicoVA, SmartVA, NBC, and King-Lu. Individual-level sensitivity for causes of adult deaths in the automated arm was low between the algorithms but high between two independent physicians in the physician arm. CONCLUSIONS: While desirable, automated algorithms require further development and rigorous evaluation. 
Lay reporting of deaths paired with physician COD assignment of verbal autopsies, despite some limitations, remains a practicable method to document the patterns of mortality reliably for unattended deaths. TRIAL REGISTRATION: ClinicalTrials.gov, NCT02810366. Submitted on 11 April 2016.


Subject(s)
Autopsy/methods , Data Collection/methods , Physicians/standards , Adult , Child , Death , Female , Humans , India , Male
15.
Biostatistics ; 20(4): 549-564, 2019 10 01.
Article in English | MEDLINE | ID: mdl-29741607

ABSTRACT

In many clinical settings, a patient outcome takes the form of a scalar time series with a recovery curve shape, which is characterized by a sharp drop due to a disruptive event (e.g., surgery) and subsequent monotonic smooth rise towards an asymptotic level not exceeding the pre-event value. We propose a Bayesian model that predicts recovery curves based on information available before the disruptive event. A recovery curve of interest is the quantified sexual function of prostate cancer patients after prostatectomy surgery. We illustrate the utility of our model as a pre-treatment medical decision aid, producing personalized predictions that are both interpretable and accurate. We uncover covariate relationships that agree with and supplement those in the existing medical literature.


Subject(s)
Decision Support Techniques , Models, Statistical , Outcome Assessment, Health Care/statistics & numerical data , Prostatectomy/statistics & numerical data , Aged , Bayes Theorem , Humans , Male , Middle Aged , Prostatectomy/adverse effects
16.
J Comput Graph Stat ; 28(4): 767-777, 2019.
Article in English | MEDLINE | ID: mdl-33033426

ABSTRACT

Bayesian graphical models are a useful tool for understanding dependence relationships among many variables, particularly in situations with external prior information. In high-dimensional settings, the space of possible graphs becomes enormous, rendering even state-of-the-art Bayesian stochastic search computationally infeasible. We propose a deterministic alternative to estimate Gaussian and Gaussian copula graphical models using an Expectation Conditional Maximization (ECM) algorithm, extending the EM approach from Bayesian variable selection to graphical model estimation. We show that the ECM approach enables fast posterior exploration under a sequence of mixture priors, and can incorporate multiple sources of information.

17.
Proc Mach Learn Res ; 97: 3877-3885, 2019 Jun.
Article in English | MEDLINE | ID: mdl-33521648

ABSTRACT

In this article, we propose a new class of priors for Bayesian inference with multiple Gaussian graphical models. We introduce Bayesian treatments of two popular procedures, the group graphical lasso and the fused graphical lasso, and extend them to a continuous spike-and-slab framework to allow self-adaptive shrinkage and model selection simultaneously. We develop an EM algorithm that performs fast and dynamic explorations of posterior modes. Our approach selects sparse models efficiently and automatically with substantially smaller bias than would be induced by alternative regularization procedures. The performance of the proposed methods is demonstrated through simulation and two real data examples.

18.
Demography ; 55(5): 1979-1999, 2018 10.
Article in English | MEDLINE | ID: mdl-30276667

ABSTRACT

The digital traces that we leave online are increasingly fruitful sources of data for social scientists, including those interested in demographic research. The collection and use of digital data also presents numerous statistical, computational, and ethical challenges, motivating the development of new research approaches to address these burgeoning issues. In this article, we argue that researchers with formal training in demography, those who have a history of developing innovative approaches to using challenging data, are well positioned to contribute to this area of work. We discuss the benefits and challenges of using digital trace data for social and demographic research, and we review examples of current demographic literature that creatively use digital trace data to study processes related to fertility, mortality, and migration. Focusing on Facebook data for advertisers, a novel "digital census" that has largely been untapped by demographers, we provide illustrative and empirical examples of how demographic researchers can manage issues such as bias and representation when using digital trace data. We conclude by offering our perspective on the road ahead regarding demography and its role in the data revolution.


Subject(s)
Big Data , Data Collection/methods , Demography/methods , Research , Social Media/statistics & numerical data , Bias , Birth Rate/trends , Data Collection/ethics , Demography/ethics , Ethics, Research , Humans , Mortality/trends , Privacy , Racial Groups/statistics & numerical data , Social Media/ethics
19.
Appl Stoch Models Bus Ind ; 34(2): 87-104, 2018.
Article in English | MEDLINE | ID: mdl-29962902

ABSTRACT

Relational event data, which consist of events involving pairs of actors over time, are now commonly available at the finest of temporal resolutions. Existing continuous-time methods for modeling such data are based on point processes and directly model interaction "contagion," whereby one interaction increases the propensity of future interactions among actors, often as dictated by some latent variable structure. In this article, we present an alternative approach to using temporal-relational point process models for continuous-time event data. We characterize interactions between a pair of actors as either spurious or as resulting from an underlying, persistent connection in a latent social network. We argue that consistent deviations from expected behavior, rather than solely high frequency counts, are crucial for identifying well-established underlying social relationships. This study aims to explore these latent network structures in two contexts: one comprising college students and another involving barn swallows.

20.
Sociol Methods Res ; 46(3): 390-421, 2017 08.
Article in English | MEDLINE | ID: mdl-29033471

ABSTRACT

Despite recent and growing interest in using Twitter to examine human behavior and attitudes, there is still significant room for growth regarding the ability to leverage Twitter data for social science research. In particular, gleaning demographic information about Twitter users, a key component of much social science research, remains a challenge. This article develops an accurate and reliable data processing approach for social science researchers interested in using Twitter data to examine behaviors and attitudes, as well as the demographic characteristics of the populations expressing or engaging in them. Using information gathered from Twitter users who state an intention to not vote in the 2012 presidential election, we describe and evaluate a method for processing data to retrieve demographic information reported by users that is not encoded as text (e.g., details of images) and evaluate the reliability of these techniques. We end by assessing the challenges of this data collection strategy and discussing how large-scale social media data may benefit demographic researchers.
