Results 1 - 20 of 1,544
1.
Biostatistics; 25(2): 289-305, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-36977366

ABSTRACT

Causally interpretable meta-analysis combines information from a collection of randomized controlled trials to estimate treatment effects in a target population in which experimentation may not be possible but from which covariate information can be obtained. In such analyses, a key practical challenge is the presence of systematically missing data when some trials have collected data on one or more baseline covariates, but other trials have not, such that the covariate information is missing for all participants in the latter. In this article, we provide identification results for potential (counterfactual) outcome means and average treatment effects in the target population when covariate data are systematically missing from some of the trials in the meta-analysis. We propose three estimators for the average treatment effect in the target population, examine their asymptotic properties, and show that they have good finite-sample performance in simulation studies. We use the estimators to analyze data from two large lung cancer screening trials and target population data from the National Health and Nutrition Examination Survey (NHANES). To accommodate the complex survey design of the NHANES, we modify the methods to incorporate survey sampling weights and allow for clustering.
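
The article's estimators are not reproduced here; as a minimal sketch of the underlying transport idea for a single trial with fully observed covariates, the snippet below standardizes arm-specific outcome means to a target sample via inverse-odds-of-participation weights. The column names (A, Y) and the scikit-learn-based implementation are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def transported_arm_means(trial: pd.DataFrame, target: pd.DataFrame, covars):
    """Inverse-odds-of-participation weighting of a single trial's outcomes
    to a target population. Assumes complete covariates in both samples;
    the systematic missingness across trials handled in the article is not
    addressed here. trial must contain treatment A (0/1) and outcome Y."""
    stacked = pd.concat([trial[covars].assign(in_trial=1),
                         target[covars].assign(in_trial=0)], ignore_index=True)
    model = LogisticRegression(max_iter=1000).fit(stacked[covars], stacked["in_trial"])
    p = model.predict_proba(trial[covars])[:, 1]     # P(in trial | X)
    w = (1 - p) / p                                  # inverse odds weights
    means = {}
    for a in (0, 1):                                 # treatment arms
        idx = trial["A"] == a
        means[a] = np.average(trial.loc[idx, "Y"], weights=w[idx])
    return means
```

The transported average treatment effect in the target population is then means[1] - means[0].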


Subject(s)
Early Detection of Cancer, Lung Neoplasms, Humans, Nutrition Surveys, Lung Neoplasms/epidemiology, Computer Simulation, Research Design
2.
Mol Cell Proteomics; 22(8): 100558, 2023 08.
Article in English | MEDLINE | ID: mdl-37105364

ABSTRACT

Mass spectrometry (MS) enables high-throughput identification and quantification of proteins in complex biological samples and can provide insights into the global function of biological systems. Label-free quantification is cost-effective and suitable for the analysis of human samples. Despite rapid developments in label-free data acquisition workflows, the number of proteins quantified across samples can be limited by technical and biological variability. This variation can result in missing values, which in turn can challenge downstream data analysis tasks. General-purpose or gene expression-specific imputation algorithms are widely used to improve data completeness. Here, we propose an imputation algorithm designed for label-free MS data that is aware of the type of missingness affecting the data. On published datasets acquired by data-dependent and data-independent acquisition workflows with varying degrees of biological complexity, we demonstrate that the proposed missing value estimation procedure, based on barycenter computation, competes closely with state-of-the-art imputation algorithms in differential abundance tasks, outperforms them in the accuracy of variance estimates of the peptide abundance measurements, and better controls the false discovery rate in label-free MS experiments. The barycenter estimation procedure is implemented in the msImpute software package and is available from the Bioconductor repository.
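
The barycenter procedure itself lives in the msImpute Bioconductor package and is not reimplemented here. As a hedged illustration of missingness-type-aware imputation for label-free intensities, the sketch below applies the common down-shifted-normal heuristic for left-censored (low-abundance) missing values; the shift and width parameters are conventional defaults, not values from the article.

```python
import numpy as np

def downshift_impute(log_intensity: np.ndarray, shift=1.8, width=0.3, seed=0):
    """Impute NaNs in a (peptides x samples) log-intensity matrix by drawing
    from a normal distribution shifted below each sample's observed signal,
    reflecting the assumption that values are missing because they fell
    below the detection limit (missing not at random)."""
    rng = np.random.default_rng(seed)
    out = log_intensity.copy()
    for j in range(out.shape[1]):
        col = out[:, j]
        obs = col[~np.isnan(col)]
        mu, sd = obs.mean(), obs.std(ddof=1)
        n_missing = int(np.isnan(col).sum())
        col[np.isnan(col)] = rng.normal(mu - shift * sd, width * sd, n_missing)
    return out
```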


Subject(s)
Algorithms, Peptides, Humans, Peptides/analysis, Proteins, Mass Spectrometry/methods
3.
Genet Epidemiol; 47(1): 61-77, 2023 02.
Article in English | MEDLINE | ID: mdl-36125445

ABSTRACT

There is increasing interest in using multiple types of omics features (e.g., DNA sequences, RNA expression, methylation, protein expression, and metabolic profiles) to study how the relationships between phenotypes and genotypes may be mediated by other omics markers. Genotypes and phenotypes are typically available for all subjects in genetic studies, but some omics data are often missing for some subjects because of limitations such as cost and sample quality. In this article, we propose a powerful approach for mediation analysis that accommodates missing data among multiple mediators and allows for various interaction effects. We formulate the relationships among genetic variants, other omics measurements, and phenotypes through linear regression models. We derive the joint likelihood for models with two mediators, accounting for arbitrary patterns of missing values. Using computationally efficient and stable algorithms, we conduct maximum likelihood estimation. Our methods produce unbiased and statistically efficient estimators. We demonstrate the usefulness of our methods through simulation studies and an application to the Metabolic Syndrome in Men study.
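
A hedged, complete-data sketch of the regression formulation with two mediators (no interactions, no missing-data handling, and mediators assumed not to affect each other); the article's joint-likelihood machinery for arbitrary missingness patterns is not reproduced. Variable names are hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf

def two_mediator_effects(df: pd.DataFrame):
    """df columns: G (genotype), M1 and M2 (omics mediators), Y (phenotype).
    Indirect effects are the usual products of coefficients."""
    m1 = smf.ols("M1 ~ G", data=df).fit()
    m2 = smf.ols("M2 ~ G", data=df).fit()
    y = smf.ols("Y ~ G + M1 + M2", data=df).fit()
    indirect_m1 = m1.params["G"] * y.params["M1"]
    indirect_m2 = m2.params["G"] * y.params["M2"]
    direct = y.params["G"]
    return {"indirect_via_M1": indirect_m1,
            "indirect_via_M2": indirect_m2,
            "direct": direct,
            "total": direct + indirect_m1 + indirect_m2}
```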


Subject(s)
Mediation Analysis, Genetic Models, Humans, Genotype, Computer Simulation, Likelihood Functions, Algorithms
4.
Clin Infect Dis; 2024 Jun 02.
Article in English | MEDLINE | ID: mdl-38824440

ABSTRACT

Data on alcohol use and incident tuberculosis (TB) infection are needed. Among adults aged 15 years and older in rural Uganda (N=49,585), the estimated risk of incident TB infection was 29.2% with alcohol use vs. 19.2% without (RR: 1.49; 95% CI: 1.40-1.60). There is potential for interventions to interrupt transmission among people who drink alcohol.
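
As a worked arithmetic illustration of how such a risk ratio and Wald confidence interval are obtained, using hypothetical counts consistent with the quoted percentages (the published RR of 1.49 presumably reflects the authors' adjusted analysis):

```python
import numpy as np

# hypothetical 2x2 counts consistent with ~29.2% vs ~19.2% incident TB infection
a, n1 = 2920, 10000    # infections / persons reporting alcohol use
c, n0 = 7603, 39585    # infections / persons not reporting alcohol use

p1, p0 = a / n1, c / n0
rr = p1 / p0
se_log_rr = np.sqrt(1/a - 1/n1 + 1/c - 1/n0)   # SE of log risk ratio
ci = np.exp(np.log(rr) + np.array([-1.96, 1.96]) * se_log_rr)
print(f"crude RR = {rr:.2f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}")
```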

5.
Mol Biol Evol; 40(5), 2023 05 02.
Article in English | MEDLINE | ID: mdl-37140129

ABSTRACT

The data available for reconstructing molecular phylogenies have become wildly disparate. Phylogenomic studies can generate data for thousands of genetic markers for dozens of species, but for hundreds of other taxa, data may be available from only a few genes. Can these two types of data be integrated to combine the advantages of both, addressing the relationships of hundreds of species with thousands of genes? Here, we show that this is possible, using data from frogs. We generated a phylogenomic data set for 138 ingroup species and 3,784 nuclear markers (ultraconserved elements [UCEs]), including new UCE data from 70 species. We also assembled a supermatrix data set, including data from 97% of frog genera (441 total), with 1-307 genes per taxon. We then produced a combined phylogenomic-supermatrix data set (a "gigamatrix") containing 441 ingroup taxa and 4,091 markers but with 86% missing data overall. Likelihood analysis of the gigamatrix yielded a generally well-supported tree among families, largely consistent with trees from the phylogenomic data alone. All terminal taxa were placed in the expected families, even though 42.5% of these taxa each had >99.5% missing data and 70.2% had >90% missing data. Our results show that missing data need not be an impediment to successfully combining very large phylogenomic and supermatrix data sets, and they open the door to new studies that simultaneously maximize sampling of genes and taxa.


Subject(s)
Anura, Animals, Phylogeny, DNA Sequence Analysis, Anura/genetics, Probability
6.
Am J Epidemiol; 193(6): 908-916, 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38422371

ABSTRACT

Routinely collected testing data have been a vital resource for public health response during the COVID-19 pandemic and have revealed the extent to which Black and Hispanic persons have borne a disproportionate burden of SARS-CoV-2 infections and hospitalizations in the United States. However, missing race and ethnicity data and missed infections due to testing disparities limit the interpretation of testing data and obscure the true toll of the pandemic. We investigated potential bias arising from these 2 types of missing data through a case study carried out in Holyoke, Massachusetts, during the prevaccination phase of the pandemic. First, we estimated SARS-CoV-2 testing and case rates by race and ethnicity, imputing missing data using a joint modeling approach. We then investigated disparities in SARS-CoV-2 reported case rates and missed infections by comparing case rate estimates with estimates derived from a COVID-19 seroprevalence survey. Compared with the non-Hispanic White population, we found that the Hispanic population had similar testing rates (476 tested per 1000 vs 480 per 1000) but twice the case rate (8.1% vs 3.7%). We found evidence of inequitable testing, with a higher rate of missed infections in the Hispanic population than in the non-Hispanic White population (79 infections missed per 1000 vs 60 missed per 1000).
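
A worked sketch of the missed-infection comparison: reported case rates are contrasted with infection rates implied by seroprevalence, per 1,000 residents. The seroprevalence inputs below are back-calculated for illustration under a simple difference model; the study's estimation accounted for the survey design and imputation of missing race and ethnicity data.

```python
def missed_per_1000(seroprevalence: float, reported_case_rate: float) -> float:
    """Infections per 1,000 residents implied by serology but absent from
    reported case data; both inputs are proportions."""
    return 1000 * (seroprevalence - reported_case_rate)

# illustrative inputs only (back-calculated, not the study's estimates)
print(missed_per_1000(seroprevalence=0.160, reported_case_rate=0.081))  # Hispanic
print(missed_per_1000(seroprevalence=0.097, reported_case_rate=0.037))  # non-Hispanic White
```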


Subject(s)
COVID-19 Testing, COVID-19, Hispanic or Latino, SARS-CoV-2, Humans, COVID-19/ethnology, COVID-19/epidemiology, COVID-19/diagnosis, Massachusetts/epidemiology, COVID-19 Testing/statistics & numerical data, Hispanic or Latino/statistics & numerical data, Male, Female, Middle Aged, Healthcare Disparities/ethnology, Healthcare Disparities/statistics & numerical data, Adult, Health Status Disparities, Black or African American/statistics & numerical data, Ethnicity/statistics & numerical data, Aged, Misdiagnosis/statistics & numerical data
7.
Am J Epidemiol; 193(1): 203-213, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-37650647

ABSTRACT

We developed and validated a claims-based algorithm that classifies patients into obesity categories. Using Medicare (2007-2017) and Medicaid (2000-2014) claims data linked to 2 electronic health record (EHR) systems in Boston, Massachusetts, we identified a cohort of patients with an EHR-based body mass index (BMI) measurement (calculated as weight (kg)/height (m)²). We used regularized regression to select from 137 variables and built generalized linear models to classify patients with BMIs of ≥25, ≥30, and ≥40. We developed the prediction model using EHR system 1 (training set) and validated it in EHR system 2 (validation set). The cohort contained 123,432 patients in the Medicare population and 40,736 patients in the Medicaid population. The model comprised 97 variables in the Medicare set and 95 in the Medicaid set, including BMI-related diagnosis codes, cardiovascular and antidiabetic drugs, and obesity-related comorbidities. The areas under the receiver-operating-characteristic curve in the validation set were 0.72, 0.75, and 0.83 (Medicare) and 0.66, 0.66, and 0.70 (Medicaid) for BMIs of ≥25, ≥30, and ≥40, respectively. The positive predictive values were 81.5%, 80.6%, and 64.7% (Medicare) and 81.6%, 77.5%, and 62.5% (Medicaid) for BMIs of ≥25, ≥30, and ≥40, respectively. The proposed model can identify obesity categories in claims databases when BMI measurements are missing and can be used for confounding adjustment, defining subgroups, or probabilistic bias analysis.
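
A hedged sketch of the general workflow described (regularized selection, then classification and validation-set AUC and PPV), not the authors' fitted model; the penalty strength, classification threshold, and feature handling are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, precision_score

def fit_and_validate(X_train, y_train, X_valid, y_valid, threshold=0.5):
    """y is an indicator such as BMI >= 30 derived from the linked EHR;
    X holds claims-based candidate predictors (diagnoses, drugs, comorbidities)."""
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1, max_iter=2000)
    clf.fit(X_train, y_train)
    p = clf.predict_proba(X_valid)[:, 1]
    auc = roc_auc_score(y_valid, p)
    ppv = precision_score(y_valid, (p >= threshold).astype(int))  # PPV at the chosen cutoff
    selected = np.flatnonzero(clf.coef_[0] != 0)  # variables retained by the L1 penalty
    return auc, ppv, selected
```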


Subject(s)
Medicare, Obesity, Aged, Humans, United States/epidemiology, Obesity/epidemiology, Body Mass Index, Comorbidity, Hypoglycemic Agents, Electronic Health Records
8.
Am J Epidemiol; 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38904459

ABSTRACT

When analyzing a selected sample from a general population, selection bias can arise relative to the causal average treatment effect (ATE) for the general population, and also relative to the ATE for the selected sample itself. We provide simple graphical rules that indicate: (1) whether a selected-sample analysis will be unbiased for each ATE; and (2) whether adjusting for certain covariates could eliminate selection bias. The rules can easily be checked in a standard single-world intervention graph. When the treatment could affect selection, a third estimand of potential scientific interest is the "net treatment difference", namely the net change in outcomes that would occur for the selected sample if all members of the general population were treated versus not treated, including any effects of the treatment on which individuals are in the selected sample. We provide graphical rules for this estimand as well. We decompose bias in a selected-sample analysis relative to the general-population ATE into: (1) "internal bias" relative to the net treatment difference; and (2) "net-external bias", a discrepancy between the net treatment difference and the general-population ATE. Each bias can be assessed unambiguously via a distinct graphical rule, providing new conceptual insight into the mechanisms by which certain causal structures produce selection bias.

9.
Am J Epidemiol; 193(7): 1019-1030, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38400653

ABSTRACT

Targeted maximum likelihood estimation (TMLE) is increasingly used for doubly robust causal inference, but how missing data should be handled when using TMLE with data-adaptive approaches is unclear. Based on data (1992-1998) from the Victorian Adolescent Health Cohort Study, we conducted a simulation study to evaluate 8 missing-data methods in this context: complete-case analysis, extended TMLE incorporating an outcome-missingness model, the missing-covariate missing-indicator method, and 5 multiple imputation (MI) approaches using parametric or machine-learning models. We considered 6 scenarios that varied in the exposure/outcome generation models (presence of confounder-confounder interactions) and missingness mechanisms (whether the outcome influenced missingness in other variables and whether the missingness models included interaction/nonlinear terms). Complete-case analysis and extended TMLE had small biases when the outcome did not influence missingness in other variables. Parametric MI without interactions had large bias when the exposure/outcome generation models included interactions. Parametric MI including interactions performed best in bias and variance reduction across all settings, except when the missingness models included a nonlinear term. When choosing a method for handling missing data in the context of TMLE, researchers must consider the missingness mechanism and, for MI, compatibility with the analysis method. In many settings, a parametric MI approach that incorporates interactions and nonlinearities is expected to perform well.
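
A hedged sketch of one of the compared strategies: multiple imputation whose imputation model carries the exposure-confounder product term used by the analysis model (the "just another variable" tactic), followed by Rubin's-rules pooling. It uses scikit-learn's IterativeImputer as a generic chained-equations engine rather than the specific MI implementations evaluated in the article; column names are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

def mi_with_interaction(df: pd.DataFrame, m=20):
    """df columns: A (exposure), C (confounder), Y (outcome), with NaNs.
    The product A*C is carried into the imputation model so that imputations
    are compatible with an analysis model containing A:C."""
    work = df.copy()
    work["AC"] = work["A"] * work["C"]            # interaction passed to the imputer
    ests, variances = [], []
    for k in range(m):
        imp = IterativeImputer(sample_posterior=True, random_state=k)
        completed = pd.DataFrame(imp.fit_transform(work), columns=work.columns)
        fit = smf.ols("Y ~ A * C", data=completed).fit()
        ests.append(fit.params["A"])
        variances.append(fit.bse["A"] ** 2)
    qbar = np.mean(ests)
    within, between = np.mean(variances), np.var(ests, ddof=1)
    total_var = within + (1 + 1 / m) * between    # Rubin's rules
    return qbar, np.sqrt(total_var)
```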


Subject(s)
Causality, Humans, Likelihood Functions, Adolescent, Statistical Data Interpretation, Bias, Statistical Models, Computer Simulation
10.
Biostatistics; 24(2): 502-517, 2023 04 14.
Article in English | MEDLINE | ID: mdl-34939083

ABSTRACT

Cluster randomized trials (CRTs) randomly assign an intervention to groups of individuals (e.g., clinics or communities) and measure outcomes on individuals in those groups. While offering many advantages, this experimental design introduces challenges that are only partially addressed by existing analytic approaches. First, outcomes are often missing for some individuals within clusters. Failing to appropriately adjust for differential outcome measurement can result in biased estimates and inference. Second, CRTs often randomize limited numbers of clusters, resulting in chance imbalances on baseline outcome predictors between arms. Failing to adaptively adjust for these imbalances and other predictive covariates can result in efficiency losses. To address these methodological gaps, we propose and evaluate a novel two-stage targeted minimum loss-based estimator to adjust for baseline covariates in a manner that optimizes precision, after controlling for baseline and postbaseline causes of missing outcomes. Finite sample simulations illustrate that our approach can nearly eliminate bias due to differential outcome measurement, while existing CRT estimators yield misleading results and inferences. Application to real data from the SEARCH community randomized trial demonstrates the gains in efficiency afforded through adaptive adjustment for baseline covariates, after controlling for missingness on individual-level outcomes.
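
The two-stage targeted minimum loss-based estimator is not reproduced here; as a much simpler, hedged stand-in for the first problem it addresses, the sketch below corrects differential outcome measurement by weighting measured individuals within each cluster by the inverse probability of measurement and then contrasting cluster-level means between arms. Column names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def ipw_cluster_means(df: pd.DataFrame, covars):
    """df columns: cluster, arm (0/1), measured (0/1), Y (NaN when not measured),
    plus individual-level covariates believed to drive missingness."""
    ps = LogisticRegression(max_iter=1000).fit(df[covars], df["measured"])
    df = df.assign(w=1.0 / ps.predict_proba(df[covars])[:, 1])
    meas = df[df["measured"] == 1]
    cluster_means = (meas.groupby(["cluster", "arm"])
                         .apply(lambda g: np.average(g["Y"], weights=g["w"]))
                         .rename("mean").reset_index())
    arm_means = cluster_means.groupby("arm")["mean"].mean()  # clusters weighted equally
    return arm_means[1] - arm_means[0]   # crude intervention-effect estimate
```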


Subject(s)
Outcome Assessment (Health Care), Research Design, Humans, Randomized Controlled Trials as Topic, Probability, Bias, Cluster Analysis, Computer Simulation
11.
Biostatistics; 2023 Aug 02.
Article in English | MEDLINE | ID: mdl-37531621

ABSTRACT

Cluster randomized trials (CRTs) often enroll large numbers of participants; yet due to resource constraints, only a subset of participants may be selected for outcome assessment, and those sampled may not be representative of all cluster members. Missing data also present a challenge: if sampled individuals with measured outcomes are dissimilar from those with missing outcomes, unadjusted estimates of arm-specific endpoints and the intervention effect may be biased. Further, CRTs often enroll and randomize few clusters, limiting statistical power and raising concerns about finite sample performance. Motivated by SEARCH-TB, a CRT aimed at reducing incident tuberculosis (TB) infection, we demonstrate interlocking methods to handle these challenges. First, we extend Two-Stage targeted minimum loss-based estimation to account for three sources of missingness: (i) subsampling; (ii) measurement of baseline status among those sampled; and (iii) measurement of final status among those in the incidence cohort (persons known to be at risk at baseline). Second, we critically evaluate the assumptions under which subunits of the cluster can be considered the conditionally independent unit, improving precision and statistical power but also causing the CRT to behave like an observational study. Our application to SEARCH-TB highlights the real-world impact of different assumptions about measurement and dependence: estimates relying on unrealistic assumptions suggested the intervention increased the incidence of TB infection by 18% (risk ratio [RR]=1.18, 95% confidence interval [CI]: 0.85-1.63), while estimates accounting for the sampling scheme, missingness, and within-community dependence found the intervention decreased incident TB infection by 27% (RR=0.73, 95% CI: 0.57-0.92).

12.
Brief Bioinform; 23(1), 2022 01 17.
Article in English | MEDLINE | ID: mdl-34472591

ABSTRACT

Missing values are common in high-throughput mass spectrometry data. Two strategies are available to address missing values: (i) eliminate or impute the missing values and apply statistical methods that require complete data and (ii) use statistical methods that specifically account for missing values without imputation (imputation-free methods). This study reviews the effect of sample size and percentage of missing values on statistical inference for multiple methods under these two strategies. With increasing missingness, the ability of imputation and imputation-free methods to identify differentially and non-differentially regulated compounds in a two-group comparison study declined. Random forest and k-nearest neighbor imputation combined with a Wilcoxon test performed well in statistical testing for up to 50% missingness with little bias in estimating the effect size. Quantile regression imputation accompanied with a Wilcoxon test also had good statistical testing outcomes but substantially distorted the difference in means between groups. None of the imputation-free methods performed consistently better for statistical testing than imputation methods.
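
A hedged sketch of one of the better-performing combinations reviewed: k-nearest-neighbor imputation followed by a Wilcoxon rank-sum (Mann-Whitney) test per compound. The number of neighbors is an illustrative choice.

```python
import numpy as np
from scipy.stats import mannwhitneyu
from sklearn.impute import KNNImputer

def impute_then_test(X: np.ndarray, group: np.ndarray, k=5):
    """X: samples x compounds matrix with NaNs; group: 0/1 labels.
    Returns a rank-sum p-value per compound after kNN imputation."""
    Xi = KNNImputer(n_neighbors=k).fit_transform(X)
    pvals = np.array([
        mannwhitneyu(Xi[group == 0, j], Xi[group == 1, j]).pvalue
        for j in range(Xi.shape[1])
    ])
    return pvals
```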


Subject(s)
Research Design, Bias, Cluster Analysis, Mass Spectrometry/methods
13.
Brief Bioinform; 23(1), 2022 01 17.
Article in English | MEDLINE | ID: mdl-34882223

ABSTRACT

Clinical data are increasingly being mined to derive new medical knowledge, with the goals of enabling greater diagnostic precision, better-personalized therapeutic regimens, improved clinical outcomes, and more efficient utilization of health-care resources. However, clinical data are often only available at irregular intervals that vary between patients and types of data, with entries often being unmeasured or unknown. As a result, missing data often represent one of the major impediments to optimal knowledge derivation from clinical data. The Data Analytics Challenge on Missing data Imputation (DACMI) presented a shared clinical dataset with ground truth for evaluating and advancing the state of the art in imputing missing data for clinical time series. We extracted 13 commonly measured blood laboratory tests. To evaluate imputation performance, we randomly removed one recorded result per laboratory test per patient admission and used it as the ground truth. To the best of our knowledge, DACMI is the first shared-task challenge on clinical time series imputation. The challenge attracted 12 international teams spanning three continents across multiple industries and academia. The evaluation outcome suggests that competitive machine learning and statistical models (e.g., LightGBM, MICE, and XGBoost) coupled with carefully engineered temporal and cross-sectional features can achieve strong imputation performance. However, care needs to be taken to avoid unnecessary model complexity. The participating systems collectively experimented with a wide range of machine learning and probabilistic algorithms to combine temporal and cross-sectional imputation, and their design principles will inform future efforts to better model clinical missing data.
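
A hedged toy version of the feature engineering the strong DACMI systems are described as combining: previous/next values of the target lab (temporal) plus same-time values of the other labs (cross-sectional), feeding a boosted-tree regressor. It uses scikit-learn's histogram-based booster rather than the participants' LightGBM/XGBoost pipelines, and handles one lab per admission at a time.

```python
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor

def impute_one_lab(panel: pd.DataFrame, target: str):
    """panel: rows = successive time points of one admission, columns = lab tests.
    Temporal features: previous and next observed value of the target lab.
    Cross-sectional features: the other labs at the same time point
    (NaNs in the features are handled natively by the booster)."""
    feats = panel.drop(columns=[target]).copy()
    feats["prev"] = panel[target].ffill().shift(1)
    feats["next"] = panel[target].bfill().shift(-1)
    known = panel[target].notna()
    model = HistGradientBoostingRegressor().fit(feats[known], panel.loc[known, target])
    out = panel[target].copy()
    out[~known] = model.predict(feats[~known])
    return out
```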


Subject(s)
Algorithms, Machine Learning, Cross-Sectional Studies, Data Collection, Humans, Statistical Models
14.
Biometrics; 80(1), 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38456546

ABSTRACT

The problem of estimating the size of a population based on a subset of individuals observed across multiple data sources is often referred to as capture-recapture or multiple-systems estimation. This is fundamentally a missing data problem, where the number of unobserved individuals represents the missing data. As with any missing data problem, multiple-systems estimation requires users to make an untestable identifying assumption in order to estimate the population size from the observed data. If an appropriate identifying assumption cannot be found for a data set, no estimate of the population size should be produced based on that data set, as models with different identifying assumptions can produce arbitrarily different population size estimates, even with identical fits to the observed data. Approaches to multiple-systems estimation often do not explicitly specify identifying assumptions. This makes it difficult to decouple the specification of the model for the observed data from the identifying assumption and to provide justification for the identifying assumption. We present a re-framing of the multiple-systems estimation problem that decouples the specification of the observed-data model from the identifying assumption, and we discuss how common models fit into this framing. This approach takes advantage of existing software and facilitates various sensitivity analyses. We demonstrate our approach in a case study estimating the number of civilian casualties in the Kosovo war.
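
For intuition only, a two-list toy example under the classical list-independence identifying assumption (the article's point is precisely that such assumptions should be stated explicitly and varied); the Chapman-corrected estimator and the counts below are illustrative.

```python
def chapman_estimate(n1: int, n2: int, m: int) -> float:
    """Two-list capture-recapture under list independence.
    n1, n2: individuals on each list; m: individuals appearing on both."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

# toy counts: 400 on list A, 300 on list B, 60 on both
print(chapman_estimate(400, 300, 60))  # ~1977.7 estimated total population size
```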


Subject(s)
Population Density, Humans
15.
Biometrics; 80(1), 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38364812

ABSTRACT

People living with HIV on antiretroviral therapy often have undetectable virus levels by standard assays, but "latent" HIV still persists in viral reservoirs. Eliminating these reservoirs is the goal of HIV cure research. The quantitative viral outgrowth assay (QVOA) is commonly used to estimate the reservoir size, that is, the infectious units per million (IUPM) of HIV-persistent resting CD4+ T cells. A new variation of the QVOA, the ultra deep sequencing assay of the outgrowth virus (UDSA), was recently developed that further quantifies the number of viral lineages within a subset of infected wells. Performing the UDSA on a subset of wells provides additional information that can improve IUPM estimation. This paper considers statistical inference about the IUPM from combined dilution assay (QVOA) and deep viral sequencing (UDSA) data, even when some deep sequencing data are missing. Methods are proposed to accommodate assays with wells sequenced at multiple dilution levels and with imperfect sensitivity and specificity, and a novel bias-corrected estimator is included for small samples. The proposed methods are evaluated in a simulation study, applied to data from the University of North Carolina HIV Cure Center, and implemented in the open-source R package SLDeepAssay.
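
A hedged sketch of the classical maximum-likelihood IUPM calculation from QVOA well counts alone, under a Poisson dilution model; the article's combined QVOA+UDSA estimators, handling of missing sequencing data, and small-sample bias correction are implemented in the SLDeepAssay R package and are not reproduced here. The assay layout below is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def iupm_mle(cells_per_well, n_wells, n_positive):
    """Serial limiting dilution: at each dilution level, a well turns positive
    with probability 1 - exp(-iupm * cells / 1e6) under a Poisson model."""
    cells = np.asarray(cells_per_well, float)
    n = np.asarray(n_wells, float)
    pos = np.asarray(n_positive, float)

    def neg_log_lik(log_iupm):
        p = 1 - np.exp(-np.exp(log_iupm) * cells / 1e6)
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return -np.sum(pos * np.log(p) + (n - pos) * np.log(1 - p))

    res = minimize_scalar(neg_log_lik, bounds=(np.log(1e-3), np.log(1e4)),
                          method="bounded")
    return np.exp(res.x)

# hypothetical assay: 1e6, 2e5, 4e4 cells/well; 12 wells each; 10, 4, 1 positive
print(iupm_mle([1e6, 2e5, 4e4], [12, 12, 12], [10, 4, 1]))
```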


Subject(s)
HIV Infections, HIV-1, Humans, Virus Latency, HIV-1/genetics, CD4-Positive T-Lymphocytes, Computer Simulation, Viral Load
16.
Biometrics; 80(1), 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38281771

ABSTRACT

Statistical approaches that successfully combine multiple datasets are more powerful, efficient, and scientifically informative than separate analyses. To address variation architectures correctly and comprehensively for high-dimensional data across multiple sample sets (ie, cohorts), we propose multiple augmented reduced rank regression (maRRR), a flexible matrix regression and factorization method to concurrently learn both covariate-driven and auxiliary structured variations. We consider a structured nuclear norm objective that is motivated by random matrix theory, in which the regression or factorization terms may be shared or specific to any number of cohorts. Our framework subsumes several existing methods, such as reduced rank regression and unsupervised multimatrix factorization approaches, and includes a promising novel approach to regression and factorization of a single dataset (aRRR) as a special case. Simulations demonstrate substantial gains in power from combining multiple datasets, and from parsimoniously accounting for all structured variations. We apply maRRR to gene expression data from multiple cancer types (ie, pan-cancer) from The Cancer Genome Atlas, with somatic mutations as covariates. The method performs well with respect to prediction and imputation of held-out data, and provides new insights into mutation-driven and auxiliary variations that are shared or specific to certain cancer types.
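
maRRR itself is not reimplemented here; as a hedged reference point for the classical building block it extends, the sketch below computes plain reduced rank regression by projecting the least-squares coefficient matrix onto the leading response-space directions.

```python
import numpy as np

def reduced_rank_regression(X: np.ndarray, Y: np.ndarray, rank: int):
    """Minimize ||Y - X B||_F subject to rank(B) <= rank (identity weighting).
    Classical solution: project the OLS coefficient matrix onto the leading
    right singular vectors of the fitted values X @ B_ols."""
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T                  # leading response-space directions
    return B_ols @ V_r @ V_r.T         # rank-constrained coefficient matrix
```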


Subject(s)
Neoplasms, Humans, Multivariate Analysis, Neoplasms/genetics
17.
Stat Med; 43(6): 1238-1255, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38258282

ABSTRACT

In clinical studies, multi-state model (MSM) analysis is often used to describe the sequence of events that patients experience, enabling better understanding of disease progression. A complicating factor in many MSM studies is that the exact event times may not be known. Motivated by a real dataset of patients who received stem cell transplants, we considered the setting in which some event times were exactly observed and some were missing. In our setting, there was little information about the time intervals in which the missing event times occurred and missingness depended on the event type, given the analysis model covariates. These additional challenges limited the usefulness of some missing data methods (maximum likelihood, complete case analysis, and inverse probability weighting). We show that multiple imputation (MI) of event times can perform well in this setting. MI is a flexible method that can be used with any complete data analysis model. Through an extensive simulation study, we show that MI by predictive mean matching (PMM), in which sampling is from a set of observed times without reliance on a specific parametric distribution, has little bias when event times are missing at random, conditional on the observed data. Applying PMM separately for each sub-group of patients with a different pathway through the MSM tends to further reduce bias and improve precision. We recommend MI using PMM methods when performing MSM analysis with Markov models and partially observed event times.
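 
A bare-bones, hedged illustration of the predictive mean matching step the article recommends (single variable, single imputation); a full workflow would repeat this within a proper multiple imputation procedure and, as suggested, separately within each pathway sub-group.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def pmm_impute(X: np.ndarray, t: np.ndarray, k=5, seed=0):
    """Impute missing event times t (NaN) given covariates X.
    For each missing case, find the k observed cases whose predicted times are
    closest to the missing case's prediction and sample one observed time."""
    rng = np.random.default_rng(seed)
    obs = ~np.isnan(t)
    model = LinearRegression().fit(X[obs], t[obs])
    pred_obs, pred_mis = model.predict(X[obs]), model.predict(X[~obs])
    donor_pool = t[obs]
    out = t.copy()
    imputed = []
    for p in pred_mis:
        donors = np.argsort(np.abs(pred_obs - p))[:k]   # k nearest predictions
        imputed.append(donor_pool[rng.choice(donors)])  # draw an observed time
    out[~obs] = imputed
    return out
```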


Subject(s)
Research Design, Humans, Statistical Data Interpretation, Computer Simulation, Probability, Bias
18.
Stat Med; 43(7): 1458-1474, 2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38488532

ABSTRACT

Generalized estimating equations (GEEs) provide a useful framework for estimating marginal regression parameters based on data from cluster randomized trials (CRTs), but they can result in inaccurate parameter estimates when some outcomes are informatively missing. Existing techniques to handle missing outcomes in CRTs rely on correct specification of a propensity score model, a covariate-conditional mean outcome model, or require at least one of these two models to be correct, which can be challenging in practice. In this article, we develop new weighted GEEs to simultaneously estimate the marginal mean, scale, and correlation parameters in CRTs with missing outcomes, allowing for multiple propensity score models and multiple covariate-conditional mean models to be specified. The resulting estimators are consistent provided that any one of these models is correct. An iterative algorithm is provided for implementing this more robust estimator and practical considerations for specifying multiple models are discussed. We evaluate the performance of the proposed method through Monte Carlo simulations and apply the proposed multiply robust estimator to analyze the Botswana Combination Prevention Project, a large HIV prevention CRT designed to evaluate whether a combination of HIV-prevention measures can reduce HIV incidence.
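
A hedged sketch of the simpler, singly robust baseline that the proposed multiply robust weighting generalizes: one propensity model for outcome missingness supplies inverse probability weights, and cluster-robust standard errors stand in for a weighted GEE with an independence working structure. Variable names and the statsmodels-based implementation are assumptions for illustration.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

def ipw_marginal_model(df: pd.DataFrame):
    """df columns: cluster, arm, x (covariate), y (binary outcome, NaN if missing),
    observed (1 when y was measured). A single propensity model gives inverse
    probability weights; cluster-robust standard errors account for correlation."""
    ps = smf.logit("observed ~ arm + x", data=df).fit(disp=0)
    df = df.assign(w=1.0 / ps.predict(df))
    obs = df[df["observed"] == 1]
    glm = smf.glm("y ~ arm + x", data=obs, family=sm.families.Binomial(),
                  var_weights=obs["w"])
    return glm.fit(cov_type="cluster", cov_kwds={"groups": obs["cluster"]})
```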


Subject(s)
HIV Infections, Statistical Models, Humans, Computer Simulation, Statistical Data Interpretation, Randomized Controlled Trials as Topic, HIV Infections/epidemiology, HIV Infections/prevention & control, Cluster Analysis
19.
J Surg Oncol; 129(7): 1192-1201, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38583135

ABSTRACT

BACKGROUND: Missing data can affect the representativeness and accuracy of survey results, and sexual health-related surveys are at especially high risk of nonresponse because of their sensitive nature and associated stigma. The purpose of this study was to evaluate the proportion of patients who do not complete the BREAST-Q Sexual Well-being module relative to other BREAST-Q modules and to compare responders versus nonresponders. We secondarily examined variables associated with Sexual Well-being scores at 1 year. METHODS: We performed a retrospective analysis of patients who underwent breast reconstruction from January 2018 to December 2021 and completed any of the BREAST-Q modules postoperatively at 1 year. RESULTS: A total of 2941 patients were included. Of the four BREAST-Q domains, Sexual Well-being had the highest rate of nonresponse (47%). Patients who were separated (vs. married, OR = 0.69), whose primary language was not English (vs. English, OR = 0.60), or who had Medicaid insurance (vs. commercial, OR = 0.67) were significantly less likely to complete the Sexual Well-being module. Postmenopausal patients were significantly more likely to complete the survey than premenopausal patients. Lastly, autologous reconstruction patients were 2.93 times more likely to respond than implant-based reconstruction patients (p < 0.001), while delayed (vs. immediate, OR = 0.70, p = 0.022) and unilateral (vs. bilateral, OR = 0.80, p = 0.008) reconstruction patients were less likely to respond. History of psychiatric diagnosis, aromatase inhibitor use, and immediate breast reconstruction were significantly associated with lower Sexual Well-being scores at 1 year. CONCLUSION: Sexual Well-being is the least frequently completed BREAST-Q domain, and there are demographic and clinical differences between responders and nonresponders. We encourage providers to recognize patterns in nonresponse for Sexual Well-being to ensure that certain patient populations' sexual health concerns are not overlooked.
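
A hedged sketch of the kind of analysis behind the reported odds ratios (logistic regression of response status on demographic and clinical factors); the formula, variable coding, and column names below are illustrative, not the study's model.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def nonresponse_model(df: pd.DataFrame):
    """df columns: responded (1 = completed Sexual Well-being), plus categorical
    predictors such as marital_status, language, insurance, reconstruction_type."""
    fit = smf.logit("responded ~ C(marital_status) + C(language) + C(insurance)"
                    " + C(reconstruction_type)", data=df).fit(disp=0)
    odds_ratios = np.exp(fit.params)      # OR < 1 means less likely to respond
    ci = np.exp(fit.conf_int())
    return pd.concat([odds_ratios.rename("OR"),
                      ci.rename(columns={0: "2.5%", 1: "97.5%"})], axis=1)
```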


Subject(s)
Breast Neoplasms, Mammaplasty, Sexual Health, Humans, Female, Retrospective Studies, Mammaplasty/psychology, Middle Aged, Breast Neoplasms/surgery, Breast Neoplasms/psychology, Surveys and Questionnaires, Adult, Quality of Life, Follow-Up Studies, Aged, Sexual Behavior/psychology, Mastectomy/psychology, Prognosis
20.
BMC Med Res Methodol; 24(1): 67, 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38481152

ABSTRACT

BACKGROUND: Advancements in linking publicly available census records with vital and administrative records have enabled novel investigations in epidemiology and social history. However, in the absence of unique identifiers, the linkage of records may be uncertain or may succeed for only a subset of the census cohort, resulting in missing data. For survival analysis, differential ascertainment of event times can affect inference on risk associations and median survival. METHODS: We modify several existing approaches commonly used to handle missing survival times, including complete case analysis, censoring, weighting, and several multiple imputation methods, to accommodate this imperfect linkage situation. We then conduct simulation studies to compare the performance of the proposed approaches in estimating the association of a risk factor or exposure, in terms of the hazard ratio (HR), and median survival times in the presence of missing survival times. The effects of different missing data mechanisms and exposure-survival associations on their performance are also explored. The approaches are applied to a historical cohort of residents of Ambler, PA, established using the 1930 US census, for which only 2,440 of 4,514 individuals (54%) had death records retrievable from publicly available data sources and death certificates. Using this cohort, we examine the effects of occupational and paraoccupational asbestos exposure on survival, as well as disparities in mortality by race and gender. RESULTS: We show that imputation based on conditional survival results in less bias and greater efficiency than complete case analysis when estimating log-hazard ratios and median survival times. When the approaches are applied to the Ambler cohort, we find a significant association between occupational exposure and mortality, particularly among Black individuals and males, but not between paraoccupational exposure and mortality. DISCUSSION: This investigation illustrates the strengths and weaknesses of different imputation methods for missing survival times due to imperfect linkage of administrative or registry data. The performance of the methods may depend on the missingness process as well as the parameter being estimated and the models of interest, and such factors should be considered when choosing a method to address missing event times.
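
For concreteness, a hedged sketch of the complete-case baseline that the imputation approaches are compared against: a Cox model restricted to individuals whose death records were linked, fit with the lifelines package. Column names are hypothetical, and the conditional-survival imputation recommended by the authors is not reproduced here.

```python
import pandas as pd
from lifelines import CoxPHFitter

def complete_case_hazard_ratios(df: pd.DataFrame):
    """df columns: time (follow-up years, NaN when no death record was linked),
    event (1 = death observed), exposure (e.g., occupational asbestos), covariates."""
    cc = df.dropna(subset=["time"])              # discard unlinked individuals
    cph = CoxPHFitter()
    cph.fit(cc, duration_col="time", event_col="event")
    return cph.summary[["exp(coef)", "exp(coef) lower 95%", "exp(coef) upper 95%"]]
```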


Subject(s)
Censuses, Survival Analysis, Female, Humans, Male, Causality, Computer Simulation, Proportional Hazards Models