Results 1 - 20 of 1,079
1.
Biostatistics ; 25(2): 306-322, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-37230469

ABSTRACT

Measurement error is common in environmental epidemiologic studies, but methods for correcting measurement error in regression models with multiple environmental exposures as covariates have not been well investigated. We consider a multiple imputation approach, combining external or internal calibration samples that contain information on both true and error-prone exposures with the main study data of multiple exposures measured with error. We propose a constrained chained equations multiple imputation (CEMI) algorithm that places constraints on the imputation model parameters in the chained equations imputation based on the assumptions of strong nondifferential measurement error. We also extend the constrained CEMI method to accommodate nondetects in the error-prone exposures in the main study data. We estimate the variance of the regression coefficients using the bootstrap with two imputations of each bootstrapped sample. The constrained CEMI method is shown by simulations to outperform existing methods, namely the method that ignores measurement error, classical calibration, and regression prediction, yielding estimated regression coefficients with smaller bias and confidence intervals with coverage close to the nominal level. We apply the proposed method to the Neighborhood Asthma and Allergy Study to investigate the associations between the concentrations of multiple indoor allergens and the fractional exhaled nitric oxide level among asthmatic children in New York City. The constrained CEMI method can be implemented by imposing constraints on the imputation matrix using the mice and bootImpute packages in R.
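A minimal Python sketch of the bootstrap-then-impute variance idea described above (two imputations of each bootstrapped sample), assuming simulated data and using scikit-learn's IterativeImputer as a generic chained-equations stand-in; it does not implement the authors' constraints or the mice/bootImpute workflow.

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 400
X = rng.normal(size=(n, 3))                        # true exposures
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=n)
W = X + rng.normal(scale=0.5, size=X.shape)        # error-prone measurements
X_cal = X.copy()
X_cal[40:] = np.nan                                # true values known only in a calibration subsample
data = np.column_stack([y, W, X_cal])              # main study + internal calibration sample

B, coefs = 200, []
for b in range(B):
    boot = data[rng.integers(0, n, n)]             # bootstrap the sample first ...
    ests = []
    for m in range(2):                             # ... then impute each bootstrapped sample twice
        imp = IterativeImputer(sample_posterior=True, random_state=2 * b + m)
        completed = imp.fit_transform(boot)
        ests.append(LinearRegression().fit(completed[:, 4:], completed[:, 0]).coef_)
    coefs.append(np.mean(ests, axis=0))
coefs = np.asarray(coefs)
print("beta_hat:", coefs.mean(axis=0))
print("bootstrap SE:", coefs.std(axis=0, ddof=1))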


Subject(s)
Algorithms, Environmental Exposure, Child, Humans, Animals, Mice, Environmental Exposure/adverse effects, Epidemiologic Studies, Calibration, Bias
2.
BMC Bioinformatics ; 25(1): 236, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38997639

ABSTRACT

BACKGROUND: Homologous recombination deficiency (HRD) is a clinical indicator of responsiveness to platinum-based chemotherapy and poly ADP-ribose polymerase (PARP) inhibitors. Conventional approaches to HRD prognostication have centered on identifying deleterious mutations in the BRCA1/2 genes and on quantifying genomic scars, for example Genomic Instability Score (GIS) estimation with scarHRD. However, scarHRD cannot be applied to tumors that lack matched germline data. Although several RNA-seq-based HRD prediction algorithms have been developed, they mainly support cohort-wise classification, yielding an HRD status without a quantitative score analogous to scarHRD. This study introduces expHRD, a novel transcriptome-based framework for n-of-1-style HRD scoring. RESULTS: The prediction model was built with elastic net regression in the Cancer Genome Atlas (TCGA) pan-cancer training set, and the gene set used in the expHRD calculation was derived by bootstrapping. expHRD showed a notable correlation with scarHRD and superior performance in predicting HRD-high samples. We also performed intra- and extra-cohort evaluations of clinical feasibility in the TCGA-OV and Genomic Data Commons (GDC) ovarian cancer cohorts, respectively. An accompanying easy-to-use web service extends HRD prediction to diverse malignancies, with ovarian cancer as a representative example. CONCLUSIONS: Our approach leverages transcriptome data to predict HRD status with high precision. It addresses the challenges of limited available data, opening new avenues for using transcriptomics to inform clinical decisions.
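A hedged sketch of the modelling step named in the abstract (elastic net regression for a continuous HRD-like score), with simulated expression data standing in for TCGA and no claim to reproduce the expHRD gene set or pipeline:

import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_genes = 300, 2000                     # stand-ins for TCGA samples x expression features
X = rng.normal(size=(n_samples, n_genes))
w = np.zeros(n_genes)
w[:30] = rng.normal(size=30)                       # a small set of truly informative genes
y = X @ w + rng.normal(scale=0.5, size=n_samples)  # stand-in for a scarHRD-like GIS

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5).fit(X_tr, y_tr)
kept = np.flatnonzero(model.coef_)                 # bootstrapping such refits can stabilize the gene set
print(f"held-out R^2 = {model.score(X_te, y_te):.3f}, genes retained = {kept.size}")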


Subject(s)
Homologous Recombination, Neoplasms, Transcriptome, Humans, Transcriptome/genetics, Homologous Recombination/genetics, Neoplasms/genetics, Algorithms, Female, Gene Expression Profiling/methods
3.
Mol Biol Evol ; 40(7)2023 07 05.
Article in English | MEDLINE | ID: mdl-37440530

ABSTRACT

Likelihood-based tests of phylogenetic trees are a foundation of modern systematics. Over the past decade, an enormous wealth and diversity of model-based approaches have been developed for phylogenetic inference of both gene trees and species trees. However, while many techniques exist for conducting formal likelihood-based tests of gene trees, such frameworks are comparatively underdeveloped and underutilized for testing species tree hypotheses. To date, widely used tests of tree topology are designed to assess the fit of classical models of molecular sequence data and individual gene trees and thus are not readily applicable to the problem of species tree inference. To address this issue, we derive several analogous likelihood-based approaches for testing topologies using modern species tree models and heuristic algorithms that use gene tree topologies as input for maximum likelihood estimation under the multispecies coalescent. For the purpose of comparing support for species trees, these tests leverage the statistical procedures of their original gene tree-based counterparts that have an extended history for testing phylogenetic hypotheses at a single locus. We discuss and demonstrate a number of applications, limitations, and important considerations of these tests using simulated and empirical phylogenomic data sets that include both bifurcating topologies and reticulate network models of species relationships. Finally, we introduce the open-source R package SpeciesTopoTestR (SpeciesTopology Tests in R) that includes a suite of functions for conducting formal likelihood-based tests of species topologies given a set of input gene tree topologies.
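The resampling logic these tests inherit from their gene-tree counterparts can be illustrated with a Kishino-Hasegawa-style RELL bootstrap over per-locus log-likelihood differences; the inputs here are simulated, and a real analysis would take the log-likelihood contributions from a multispecies-coalescent model:

import numpy as np

rng = np.random.default_rng(42)
# Assumed inputs: per-locus log-likelihoods under two candidate species trees
# (simulated here; in practice these come from the MSC-based likelihood).
logL1 = rng.normal(-100.0, 5.0, size=500)
logL2 = logL1 - rng.normal(0.3, 1.0, size=500)

diffs = logL1 - logL2
delta_obs = diffs.sum()

B = 10_000
boot = np.empty(B)
for b in range(B):                                 # RELL: resample loci, not sites
    boot[b] = diffs[rng.integers(0, diffs.size, diffs.size)].sum()
boot -= boot.mean()                                # center to impose the null E[delta] = 0
p = np.mean(np.abs(boot) >= abs(delta_obs))
print(f"delta = {delta_obs:.2f}, two-sided KH-style p = {p:.4f}")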


Subject(s)
Algorithms, Models, Genetic, Phylogeny, Likelihood Functions
4.
Am J Epidemiol ; 2024 May 16.
Article in English | MEDLINE | ID: mdl-38751323

ABSTRACT

In 2023, Martinez et al. examined trends in the inclusion, conceptualization, operationalization, and analysis of race and ethnicity among studies published in US epidemiology journals. Based on a random sample of papers (N=1,050) published from 1995-2018, the authors describe the treatment of race, ethnicity, and ethnorace in the analytic sample (N=414; 39% of the baseline sample) over time. Depending on the time stratum, 19%-32% of studies lacked race data and 34%-61% lacked ethnicity data. The review supplies stark evidence of the routine omission and variability of measures of race and ethnicity in epidemiologic research. Informed by public health critical race praxis (PHCRP), this commentary discusses the implications of four problems the findings suggest pervade epidemiology: 1) a general lack of clarity about what race and ethnicity are; 2) the limited use of critical race or other theory; 3) an ironic lack of rigor in measuring race and ethnicity; and 4) the ordinariness of racism and white supremacy in epidemiology. The identified practices reflect neither current publication guidelines nor the state of knowledge on race, ethnicity, and racism; we therefore conclude by offering recommendations to move epidemiology toward more rigorous research in an increasingly diverse society.

5.
Eur J Neurosci ; 59(11): 3074-3092, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38578844

ABSTRACT

Focal structural damage to white matter tracts can result in functional deficits in stroke patients. Traditional voxel-based lesion-symptom mapping is commonly used to localize brain structures linked to neurological deficits. Emerging evidence suggests that the impact of structural focal damage may extend beyond the immediate lesion sites. In this study, we present a disconnectome mapping approach based on support vector regression (SVR) to identify brain structures and white matter pathways associated with functional deficits in stroke patients. For clinical validation, we utilized imaging data from 340 stroke patients exhibiting motor deficits. A disconnectome map was initially derived from the lesions of each patient. Bootstrap sampling was then employed to balance the sample size between the minority group of patients exhibiting right or left motor deficits and those without deficits. Subsequently, SVR analysis was used to identify voxels associated with motor deficits (p < .005). Our disconnectome-based analysis significantly outperformed alternative lesion-symptom approaches in identifying major white matter pathways within the corticospinal tracts associated with upper- and lower-limb motor deficits. Bootstrapping significantly increased the sensitivity (80%-87%) for identifying patients with motor deficits, with a minimum lesion size of 32 and 235 mm³ for right and left motor deficits, respectively. Overall, the lesion-based methods achieved lower sensitivities than those based on disconnection maps. The primary contribution of our approach lies in introducing a bootstrapped disconnectome-based mapping approach to identify lesion-derived white matter disconnections associated with functional deficits, which is particularly efficient in handling imbalanced data.
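A rough sketch of the balancing-plus-regression idea (bootstrap oversampling of the minority deficit group, then a linear SVR over voxel features, averaging voxel weights across balanced resamples); the data shapes and the omitted significance step are placeholders, not the authors' pipeline:

import numpy as np
from sklearn.svm import LinearSVR

rng = np.random.default_rng(7)
n_pat, n_vox = 340, 2000                   # disconnectome maps flattened to voxel features
X = rng.random((n_pat, n_vox))
score = rng.normal(size=n_pat)             # motor score (simulated); low = deficit
deficit = score < np.quantile(score, 0.2)  # the minority group

B, weights = 50, np.zeros(n_vox)
for b in range(B):
    # oversample so each bootstrap sample has equal numbers with/without deficit
    i_def = rng.choice(np.flatnonzero(deficit), size=200, replace=True)
    i_non = rng.choice(np.flatnonzero(~deficit), size=200, replace=True)
    idx = np.concatenate([i_def, i_non])
    weights += LinearSVR(max_iter=5000).fit(X[idx], score[idx]).coef_
weights /= B                               # average voxel weights across balanced bootstraps
# Voxels with extreme average weights are candidate disconnections; assigning
# p-values (as in the paper, p < .005) would require a permutation scheme.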


Subject(s)
Stroke, Humans, Stroke/diagnostic imaging, Stroke/physiopathology, Female, Male, Middle Aged, Aged, White Matter/diagnostic imaging, White Matter/pathology, Adult, Brain Mapping/methods, Brain/diagnostic imaging, Brain/pathology, Brain/physiopathology, Magnetic Resonance Imaging/methods, Pyramidal Tracts/diagnostic imaging, Pyramidal Tracts/pathology
6.
Syst Biol ; 72(6): 1280-1295, 2023 Dec 30.
Article in English | MEDLINE | ID: mdl-37756489

ABSTRACT

The bootstrap method is based on resampling sequence alignments and re-estimating trees. Felsenstein's bootstrap proportions (FBP) are the most common approach to assess the reliability and robustness of sequence-based phylogenies. However, when increasing taxon sampling (i.e., the number of sequences) to hundreds or thousands of taxa, FBP tend to return low support for deep branches. The transfer bootstrap expectation (TBE) has been recently suggested as an alternative to FBP. TBE is measured using a continuous transfer index in [0,1] for each bootstrap tree, instead of the binary {0,1} index used in FBP to measure the presence/absence of the branch of interest. TBE has been shown to yield higher and more informative supports while inducing a very low number of falsely supported branches. Nonetheless, it has been argued that TBE must be used with care due to sampling issues, especially in datasets with a high number of closely related taxa. In this study, we conduct multiple experiments by varying taxon sampling and comparing FBP and TBE support values on different phylogenetic depths, using empirical datasets. Our results show that the main critique of TBE stands in extreme cases with shallow branches and highly unbalanced sampling among clades, but that TBE is still robust in most cases, while FBP is inescapably negatively impacted by high taxon sampling. We suggest guidelines and good practices in TBE (and FBP) computing and interpretation.
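A small self-contained sketch of the two support measures, representing each branch as one side of its bipartition (a frozenset of taxa): FBP counts exact presence across bootstrap trees, while TBE averages the transfer index 1 - δ/(p-1); the naive all-pairs search for the minimum transfer distance δ is for clarity, not speed:

import numpy as np

def fbp_and_tbe(ref, boot_trees, taxa):
    """ref: frozenset, one side of the branch of interest; boot_trees: a list of
    bootstrap trees, each given as a list of frozensets (its bipartitions)."""
    taxa = frozenset(taxa)
    p = min(len(ref), len(taxa) - len(ref))        # size of the lighter side (needs p >= 2)
    present, transfer = [], []
    for tree in boot_trees:
        # transfer distance to a branch b = number of taxa to move to turn b into ref
        d = min(min(len(ref ^ b), len(ref ^ (taxa - b))) for b in tree)
        present.append(d == 0)                     # FBP: binary presence/absence
        transfer.append(1.0 - d / (p - 1))         # TBE: continuous transfer index
    return np.mean(present), np.mean(transfer)

taxa = "ABCDEFGH"
ref = frozenset("ABC")                             # branch of interest: ABC | DEFGH
boots = [[frozenset("ABC"), frozenset("ABCD")],    # toy bootstrap trees
         [frozenset("AB"), frozenset("ABCD")]]
print(fbp_and_tbe(ref, boots, taxa))               # FBP = 0.5, TBE = 0.75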


Subject(s)
Phylogeny, Reproducibility of Results
7.
Biometrics ; 80(3)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39166460

ABSTRACT

A common problem in clinical trials is to test whether the effect of an explanatory variable on a response of interest is similar between two groups, for example, patient or treatment groups. In this regard, similarity is defined as equivalence up to a pre-specified threshold that denotes an acceptable deviation between the two groups. The assessment is typically based on, for example, confidence intervals of differences or a suitable distance between two parametric regression models. Typically, these approaches build on the assumption of a univariate continuous or binary outcome variable. However, multivariate outcomes, especially beyond the case of bivariate binary responses, remain underexplored. This paper introduces an approach based on a generalized joint regression framework exploiting the Gaussian copula. Compared to existing methods, our approach accommodates various outcome variable scales, such as continuous, binary, categorical, and ordinal, including mixed outcomes in multi-dimensional spaces. We demonstrate the validity of this approach through a simulation study and an efficacy-toxicity case study, hence highlighting its practical relevance.
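For reference, a sketch of the conventional univariate check the abstract alludes to: declare the two groups' slopes equivalent when a 90% confidence interval for their difference lies inside a pre-specified margin (the TOST logic); the data and margin are illustrative:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
y_a = 1.0 + 0.50 * x + rng.normal(size=n)          # group A
y_b = 1.1 + 0.55 * x + rng.normal(size=n)          # group B

fit_a = sm.OLS(y_a, sm.add_constant(x)).fit()
fit_b = sm.OLS(y_b, sm.add_constant(x)).fit()
diff = fit_a.params[1] - fit_b.params[1]           # difference in slopes
se = np.sqrt(fit_a.bse[1] ** 2 + fit_b.bse[1] ** 2)

margin = 0.2                                       # pre-specified acceptable deviation
lo, hi = diff - 1.645 * se, diff + 1.645 * se      # 90% CI, i.e., TOST at the 5% level
print(f"diff = {diff:.3f}, 90% CI = ({lo:.3f}, {hi:.3f}), equivalent: {-margin < lo and hi < margin}")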


Subject(s)
Computer Simulation, Models, Statistical, Humans, Multivariate Analysis, Regression Analysis, Treatment Outcome, Biometry/methods, Clinical Trials as Topic, Data Interpretation, Statistical
8.
Biometrics ; 80(2)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38742907

ABSTRACT

We propose a new non-parametric conditional independence test for a scalar response and a functional covariate over a continuum of quantile levels. We build a Cramér-von Mises-type test statistic based on an empirical process indexed by random projections of the functional covariate, effectively avoiding the "curse of dimensionality" under the projected hypothesis, which is almost surely equivalent to the null hypothesis. The asymptotic null distribution of the proposed test statistic is obtained under some mild assumptions. The asymptotic global and local power properties of our test statistic are then investigated. We specifically demonstrate that the statistic is able to detect a broad class of local alternatives converging to the null at the parametric rate. Additionally, we recommend a simple multiplier bootstrap approach for estimating the critical values. The finite-sample performance of our statistic is examined through several Monte Carlo simulation experiments. Finally, an analysis of an EEG data set is used to show the utility and versatility of our proposed test statistic.
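A simplified scalar-covariate illustration of the multiplier bootstrap for critical values of a Cramér-von Mises-type statistic (here testing mean independence rather than the paper's projected conditional-quantile hypothesis):

import numpy as np

rng = np.random.default_rng(5)
n = 300
x = rng.normal(size=n)
y = rng.normal(size=n)                     # generated under H0: Y mean-independent of X
e = y - y.mean()                           # centered residuals

ind = x[:, None] <= x[None, :]             # indicator 1{x_i <= x_t}, one column per point

def cvm(eps):
    proc = eps @ ind / np.sqrt(n)          # empirical process evaluated at each x_t
    return np.mean(proc ** 2)              # Cramér-von Mises functional

T = cvm(e)
T_boot = np.array([cvm(rng.normal(size=n) * e) for _ in range(1000)])  # multiplier bootstrap
crit = np.quantile(T_boot, 0.95)
print(f"T = {T:.4f}, 5% critical value = {crit:.4f}, reject: {T > crit}")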


Subject(s)
Computer Simulation, Models, Statistical, Monte Carlo Method, Humans, Electroencephalography/statistics & numerical data, Data Interpretation, Statistical, Biometry/methods, Statistics, Nonparametric
9.
Stat Med ; 43(2): 256-278, 2024 01 30.
Article in English | MEDLINE | ID: mdl-37965978

ABSTRACT

Health disparity research often evaluates health outcomes across demographic subgroups. Multilevel regression and poststratification (MRP) is a popular approach for small subgroup estimation as it can stabilize estimates by fitting multilevel models and adjust for selection bias by poststratifying on auxiliary variables, which are population characteristics predictive of the analytic outcome. However, the granularity and quality of the estimates produced by MRP are limited by the availability of the auxiliary variables' joint distribution; data analysts often only have access to the marginal distributions. To overcome this limitation, we embed the estimation of population cell counts needed for poststratification into the MRP workflow: embedded MRP (EMRP). Under EMRP, we generate synthetic populations of the auxiliary variables before implementing MRP. All sources of estimation uncertainty are propagated with a fully Bayesian framework. Through simulation studies, we compare different methods of generating the synthetic populations and demonstrate EMRP's improvements over alternatives on the bias-variance tradeoff to yield valid subpopulation inferences of interest. We apply EMRP to the Longitudinal Survey of Wellbeing and estimate food insecurity prevalence among vulnerable groups in New York City. We find that all EMRP estimators can correct for the bias in classical MRP while maintaining lower standard errors and narrower confidence intervals than directly imputing with the weighted finite population Bayesian bootstrap (WFPBB) and design-based estimates. Performances from the EMRP estimators do not differ substantially from each other, though we would generally recommend using the WFPBB-MRP for its consistently high coverage rates.
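The poststratification step at the core of MRP and EMRP is just a population-weighted average of cell-level model estimates; under EMRP the cell counts come from a synthetic population. A toy sketch with made-up cells:

import numpy as np
import pandas as pd

# Assumed inputs: cell-level prevalence estimates from a multilevel model plus
# population cell counts (under EMRP, the counts come from a synthetic population).
cells = pd.DataFrame({
    "age":   ["18-34", "18-34", "35+", "35+"],
    "race":  ["A", "B", "A", "B"],
    "theta": [0.22, 0.35, 0.15, 0.28],     # modeled food-insecurity prevalence per cell
    "N":     [5000, 1500, 8000, 2500],     # population cell counts
})
overall = np.average(cells["theta"], weights=cells["N"])          # poststratified estimate
by_race = (cells.assign(wsum=cells["theta"] * cells["N"])
                .groupby("race")[["wsum", "N"]].sum().eval("wsum / N"))
print(f"overall = {overall:.3f}")
print(by_race)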


Subject(s)
Bayes Theorem, Humans, Bias, Selection Bias, Computer Simulation, Longitudinal Studies
10.
Stat Med ; 43(10): 1920-1932, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38417455

ABSTRACT

Consider the choice of outcome for overall treatment benefit in a clinical trial that measures the first time to each of several clinical events. We describe several new variants of the win ratio that incorporate the time spent in each clinical state over the common follow-up, where clinical state means the worst clinical event that has occurred by that time. One version allows restriction so that death during follow-up is most important, while time spent in other clinical states is still accounted for. Three other variants are described: one is based on the average pairwise win time; one creates a continuous outcome for each participant based on expected win times against a reference distribution; and another uses the estimated distributions of clinical state to compare the treatment arms. Finally, a combination testing approach is described to give robust power for detecting treatment benefit across a broad range of alternatives. These new methods are designed to be closer to the overall treatment benefit/harm from a patient's perspective than the ordinary win ratio. The new methods are compared to the composite event approach and the ordinary win ratio. Simulations show that when the overall treatment benefit on death is substantial, the variants based on either the participants' expected win times (EWTs) against a reference distribution or the estimated clinical state distributions have substantially higher power than either the pairwise comparison or composite event methods. The methods are illustrated by re-analysis of HF-ACTION (Heart Failure: A Controlled Trial Investigating Outcomes of Exercise Training).
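A sketch of the ordinary win ratio on a two-level hierarchy (death time first, then hospitalization time), with simulated, fully observed outcomes; handling censoring and the time-in-state variants described above requires considerably more machinery:

import numpy as np

rng = np.random.default_rng(11)
n = 100
# (death time, hospitalization time) per participant; larger is better
trt = np.column_stack([rng.exponential(12, n), rng.exponential(6, n)])
ctl = np.column_stack([rng.exponential(10, n), rng.exponential(5, n)])

wins = losses = 0
for t in trt:
    for c in ctl:
        if t[0] != c[0]:                   # compare on death first
            wins += t[0] > c[0]
            losses += t[0] < c[0]
        else:                              # only then on hospitalization
            wins += t[1] > c[1]
            losses += t[1] < c[1]
print(f"wins = {wins}, losses = {losses}, win ratio = {wins / losses:.2f}")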


Subject(s)
Heart Failure, Humans, Endpoint Determination/methods, Data Interpretation, Statistical
11.
Stat Med ; 43(2): 279-295, 2024 01 30.
Article in English | MEDLINE | ID: mdl-38124426

ABSTRACT

The use of Monte-Carlo (MC) p-values when testing the significance of a large number of hypotheses is now commonplace. In large-scale hypothesis testing, we will typically encounter at least some p-values near the threshold of significance, which require a larger number of MC replicates than p-values that are far from the threshold. As a result, some incorrect conclusions can be reached due to MC error alone; for hypotheses near the threshold, even a very large number (e.g., 10^6) of MC replicates may not be enough to guarantee conclusions reached using MC p-values. Gandy and Hahn (GH) have developed the only method that directly addresses this problem. They defined a Monte-Carlo error rate (MCER) to be the probability that any decisions on accepting or rejecting a hypothesis based on MC p-values are different from decisions based on ideal p-values; their method then makes decisions by controlling the MCER. Unfortunately, the GH method is frequently very conservative, often making no rejections at all and leaving a large number of hypotheses "undecided". In this article, we propose MERIT, a method for large-scale MC hypothesis testing that also controls the MCER but is more statistically efficient than the GH method. Through extensive simulation studies, we demonstrate that MERIT controls the MCER while making more decisions that agree with the ideal p-values than GH does. We also illustrate our method by an analysis of gene expression data from a prostate cancer study.
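The underlying issue is easy to see from the Monte-Carlo standard error of an estimated p-value; the sketch below is the textbook calculation, not the GH or MERIT procedure:

import numpy as np

rng = np.random.default_rng(9)
T_obs, B = 2.2, 10_000
T_null = rng.standard_normal(B)                    # stand-in MC draws of the null statistic
p_hat = (1 + np.sum(np.abs(T_null) >= abs(T_obs))) / (B + 1)
mc_se = np.sqrt(p_hat * (1 - p_hat) / B)           # binomial Monte-Carlo error
print(f"p_hat = {p_hat:.4f} +/- {1.96 * mc_se:.4f}")
# If p_hat +/- 1.96*mc_se straddles the significance threshold, the accept/reject
# decision could flip on another MC run -- the risk that MCER-controlling
# procedures such as GH and MERIT are built to bound.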


Subject(s)
Research Design, Humans, Computer Simulation, Probability, Monte Carlo Method
12.
Stat Med ; 43(10): 1993-2006, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38442874

ABSTRACT

When designing confirmatory Phase 3 studies, one usually evaluates one or more efficacious and safe treatment option(s) based on data from previous studies. However, several retrospective research articles reported the phenomenon of "diminished treatment effect in Phase 3" based on many case studies. Even under basic assumptions, it was shown that the commonly used estimator could substantially overestimate the efficacy of the selected group(s). As alternatives, we propose a class of computational methods that reduce estimation bias and mean squared error, cover the broader setting of multiple treatment groups, and can accommodate summary results by group as input. Based on simulation studies and a real data example, we provide practical implementation guidance for this class of methods under different scenarios. For more complicated problems, our framework can serve as a starting point with additional layers built in. The proposed methods can also be widely applied to other selection problems.
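One member of this class of computational methods can be sketched as a parametric bootstrap bias correction operating on group-level summaries (the winner's-curse setting); the numbers are illustrative and the actual proposals are more refined:

import numpy as np

rng = np.random.default_rng(2024)
means = np.array([0.30, 0.25, 0.10])               # observed efficacy by group (summary input)
ses = np.array([0.08, 0.08, 0.08])
sel = int(np.argmax(means))                        # the group carried into Phase 3

B = 20_000
sim = rng.normal(means, ses, size=(B, means.size)) # parametric bootstrap of the summaries
winners = sim.argmax(axis=1)
bias = np.mean(sim[np.arange(B), winners] - means[winners])   # selection-induced optimism
print(f"naive estimate = {means[sel]:.3f}, bias-corrected ~ {means[sel] - bias:.3f}")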


Subject(s)
Research Design, Humans, Selection Bias, Retrospective Studies, Computer Simulation, Bias
13.
Stat Med ; 43(2): 216-232, 2024 01 30.
Article in English | MEDLINE | ID: mdl-37957033

ABSTRACT

In multi-season clinical trials with a randomize-once strategy, patients enrolled from previous seasons who stay alive and remain in the study will be treated according to the initial randomization in subsequent seasons. To address the potentially selective attrition from earlier seasons for the non-randomized cohorts, we develop an inverse probability of treatment weighting method using season-specific propensity scores to produce unbiased estimates of survival functions or hazard ratios. Bootstrap variance estimators are used to account for the randomness in the estimated weights and the potential correlations in repeated events within each patient from season to season. Simulation studies show that the weighting procedure and bootstrap variance estimator provide unbiased estimates and valid inferences in Kaplan-Meier estimates and Cox proportional hazard models. Finally, data from the INVESTED trial are analyzed to illustrate the proposed method.
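A sketch of the weighting construction: fit a season-specific logistic model among the non-randomized cohorts and convert propensity scores into inverse-probability weights (the randomized season keeps weight 1). Column names and the use of a treatment-assignment model are illustrative simplifications:

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n = 1000
df = pd.DataFrame({
    "season": rng.integers(1, 4, n),               # seasons 2-3 hold non-randomized cohorts
    "age": rng.normal(65, 10, n),
    "frail": rng.binomial(1, 0.3, n),
    "treat": rng.binomial(1, 0.5, n),
})
df["w"] = 1.0                                      # season 1 was randomized: weight 1
for s in (2, 3):
    sub = df["season"] == s
    Z = df.loc[sub, ["age", "frail"]]
    ps = LogisticRegression().fit(Z, df.loc[sub, "treat"]).predict_proba(Z)[:, 1]
    df.loc[sub, "w"] = np.where(df.loc[sub, "treat"] == 1, 1 / ps, 1 / (1 - ps))
# df["w"] would now feed a weighted Cox model or Kaplan-Meier estimate; the
# variance comes from bootstrapping patients and re-estimating ps per replicate.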


Subject(s)
Models, Statistical, Humans, Proportional Hazards Models, Computer Simulation, Propensity Score, Kaplan-Meier Estimate
14.
Stat Med ; 43(15): 2894-2927, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38738397

ABSTRACT

Estimating causal effects from large experimental and observational data has become increasingly prevalent in both industry and research. The bootstrap is an intuitive and powerful technique used to construct standard errors and confidence intervals of estimators. Its application however can be prohibitively demanding in settings involving large data. In addition, modern causal inference estimators based on machine learning and optimization techniques exacerbate the computational burden of the bootstrap. The bag of little bootstraps has been proposed in non-causal settings for large data but has not yet been applied to evaluate the properties of estimators of causal effects. In this article, we introduce a new bootstrap algorithm called causal bag of little bootstraps for causal inference with large data. The new algorithm significantly improves the computational efficiency of the traditional bootstrap while providing consistent estimates and desirable confidence interval coverage. We describe its properties, provide practical considerations, and evaluate the performance of the proposed algorithm in terms of bias, coverage of the true 95% confidence intervals, and computational time in a simulation study. We apply it in the evaluation of the effect of hormone therapy on the average time to coronary heart disease using a large observational data set from the Women's Health Initiative.
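The bag-of-little-bootstraps skeleton is short enough to sketch: draw small subsets of size b ≈ n^0.6, reweight each resample to the full size n with multinomial weights, compute the causal estimate on the weighted data, and aggregate the per-subset confidence intervals. A toy difference-in-means ATE version, assuming randomized treatment:

import numpy as np

rng = np.random.default_rng(8)
n = 100_000
treat = rng.binomial(1, 0.5, n)
y = 1.0 + 0.3 * treat + rng.normal(size=n)         # true ATE = 0.3

def weighted_ate(yv, tv, w):
    return (np.sum(w * tv * yv) / np.sum(w * tv)
            - np.sum(w * (1 - tv) * yv) / np.sum(w * (1 - tv)))

b = int(n ** 0.6)                                  # little-bootstrap subset size
S, R, cis = 10, 50, []
for _ in range(S):
    idx = rng.choice(n, size=b, replace=False)
    ys, ts = y[idx], treat[idx]
    ests = [weighted_ate(ys, ts, rng.multinomial(n, np.full(b, 1 / b)))
            for _ in range(R)]                     # resample weights that sum to n
    cis.append(np.percentile(ests, [2.5, 97.5]))
lo, hi = np.mean(cis, axis=0)                      # aggregate the per-subset intervals
print(f"95% CI for the ATE: ({lo:.3f}, {hi:.3f})")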


Subject(s)
Algorithms, Causality, Computer Simulation, Humans, Female, Confidence Intervals, Coronary Disease/epidemiology, Models, Statistical, Data Interpretation, Statistical, Bias, Observational Studies as Topic/methods, Observational Studies as Topic/statistics & numerical data
15.
Stat Med ; 43(20): 3921-3942, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-38951867

ABSTRACT

For survival analysis applications, we propose a novel procedure for identifying subgroups with large treatment effects, with a focus on subgroups where treatment is potentially detrimental. The approach, termed forest search, is relatively simple and flexible. All possible subgroups are screened and selected based on hazard ratio thresholds indicative of harm, with assessment according to the standard Cox model; by reversing the role of treatment, one can instead seek to identify substantial benefit. We apply a splitting consistency criterion to identify a subgroup considered "maximally consistent with harm." The type-1 error and power for subgroup identification can be quickly approximated by numerical integration. To aid inference, we describe a bootstrap bias-corrected Cox model estimator with variance estimated by a jackknife approximation. We provide a detailed evaluation of operating characteristics in simulations and compare to virtual twins and generalized random forests, where we find the proposal to have favorable performance. In particular, in our simulation setting, the proposed approach favorably controls the type-1 error for falsely identifying heterogeneity, with higher power and classification accuracy for substantial heterogeneous effects. Two real data applications are provided for publicly available datasets from clinical trials in oncology and HIV.
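A sketch of the screening step only (fitting a treatment-only Cox model in every covariate-defined subgroup and flagging hazard ratios above a harm threshold), using the lifelines package; the consistency criterion, bias correction, and jackknife variance are omitted:

import numpy as np
import pandas as pd
from itertools import combinations
from lifelines import CoxPHFitter

rng = np.random.default_rng(4)
n = 600
df = pd.DataFrame({
    "treat": rng.binomial(1, 0.5, n),
    "x1": rng.binomial(1, 0.5, n),
    "x2": rng.binomial(1, 0.5, n),
    "time": rng.exponential(10, n),
    "event": rng.binomial(1, 0.8, n),
})
threshold, flagged = 1.25, []                      # HR above this is "indicative of harm"
for r in (1, 2):
    for combo in combinations(["x1", "x2"], r):
        mask = np.logical_and.reduce([df[f].values == 1 for f in combo])
        cph = CoxPHFitter().fit(df[mask][["treat", "time", "event"]], "time", "event")
        hr = float(np.exp(cph.params_["treat"]))
        if hr > threshold:
            flagged.append((combo, round(hr, 2)))
print(flagged)   # the splitting-consistency criterion would then pick among these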


Subject(s)
Computer Simulation, HIV Infections, Proportional Hazards Models, Humans, Survival Analysis
16.
Stat Med ; 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39044448

ABSTRACT

Logistic regression models are widely used in case-control data analysis, and testing the goodness-of-fit of their parametric model assumption is a fundamental research problem. In this article, we propose to enhance the power of the goodness-of-fit test by exploiting a monotonic density ratio model, in which the ratio of case and control densities is assumed to be a monotone function. We show that such a monotonic density ratio model is naturally induced by the retrospective case-control sampling design under the alternative hypothesis. The pool-adjacent-violator algorithm is adapted to solve for the constrained nonparametric maximum likelihood estimator under the alternative hypothesis. By measuring the discrepancy between this estimator and the semiparametric maximum likelihood estimator under the null hypothesis, we develop a new Kolmogorov-Smirnov-type statistic to test the goodness-of-fit for logistic regression models with case-control data. A bootstrap resampling procedure is suggested to approximate the p-value of the proposed test. Simulation results show that the type I error of the proposed test is well controlled and the power improvement is substantial in many cases. Three real data applications are also included for illustration.
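A simplified sketch of the ingredients: a monotone fit via scikit-learn's IsotonicRegression (whose solver is the pool-adjacent-violator algorithm), a KS-type maximum discrepancy against the logistic fit, and a bootstrap p-value generated under the null fit. This illustrates the test structure, not the paper's semiparametric likelihood construction:

import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(10)
n = 400
x = rng.normal(size=(n, 1))
d = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + x[:, 0]))))   # case/control labels

def ks_stat(xv, dv):
    logit = LogisticRegression().fit(xv, dv).predict_proba(xv)[:, 1]   # null model fit
    iso = IsotonicRegression(out_of_bounds="clip").fit(xv[:, 0], dv)   # PAVA solves this
    return np.max(np.abs(logit - iso.predict(xv[:, 0])))

T = ks_stat(x, d)
p_null = LogisticRegression().fit(x, d).predict_proba(x)[:, 1]
T_boot = np.array([ks_stat(x, rng.binomial(1, p_null)) for _ in range(200)])  # refit under H0
p_value = (1 + np.sum(T_boot >= T)) / (200 + 1)
print(f"KS-type statistic = {T:.3f}, bootstrap p = {p_value:.3f}")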

17.
Am J Bot ; 111(2): e16282, 2024 02.
Article in English | MEDLINE | ID: mdl-38334302

ABSTRACT

Although molecular phylogenetics remains the most widely used method of inferring the evolutionary history of living groups, the last decade has seen a renewed interest in morphological phylogenetics. This interest is mostly driven by the promise that integrating the fossil record into phylogenetic trees offers for our understanding of macroevolutionary processes and dynamics, and by the possibility that including fossil taxa could lead to more accurate phylogenetic hypotheses. The plant fossil record presents some challenges to its integration in a phylogenetic framework: phylogenies including plant fossils often retrieve uncertain relationships with low support, or a lack of resolution. This low support is due to the pervasiveness of morphological convergence among plant organs and to the fragmentary nature of many plant fossils, and it is often perceived as a fundamental weakness reducing the utility of plant fossils in phylogenetics. Here I discuss the importance of uncertainty in morphological phylogenetics and how we can identify important information from different patterns and types of uncertainty. I also review a set of methodologies that can allow us to understand the causes underpinning uncertainty, and how these practices can help us to further our knowledge of plant fossils. I also propose that a new visual language, including the use of networks instead of trees, represents an improvement on the old visualization based on consensus trees and more adequately serves phylogeneticists working with plant fossils. This set of methods and visualization tools represents an important way forward for a field fundamental to our understanding of the evolutionary history of plants.


Subject(s)
Biological Evolution, Fossils, Phylogeny, Uncertainty, Plants/genetics
18.
BMC Med Res Methodol ; 24(1): 73, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38515018

ABSTRACT

BACKGROUND: Misclassification bias (MB) is the deviation of measured from true values due to incorrect case assignment. This study compared MB when cystectomy status was determined using administrative database codes vs. predicted cystectomy probability. METHODS: We identified every primary cystectomy-diversion type at a single hospital from 2009 to 2019. We linked to claims data to measure the true association of cystectomy with 30 patient and hospitalization factors. Associations were also measured when cystectomy status was assigned using billing codes and using the cystectomy probability from a multivariate logistic regression model with covariates from administrative data. MB was the difference between the measured and true associations. RESULTS: 500 people underwent cystectomy (0.12% of 428,677 hospitalizations). Sensitivity and positive predictive values for cystectomy codes were 97.1% and 58.6% for incontinent diversions and 100.0% and 48.4% for continent diversions, respectively. The model accurately predicted cystectomy-incontinent diversion (c-statistic [C] 0.999, Integrated Calibration Index [ICI] 0.000) and cystectomy-continent diversion (C: 1.000, ICI 0.000) probabilities. MB was significantly lower when model-based predictions were used to impute cystectomy-diversion type status, for both incontinent cystectomy (F = 12.75; p < .0001) and continent cystectomy (F = 11.25; p < .0001). CONCLUSIONS: A model using administrative data accurately returned the probability that cystectomy by diversion type occurred during a hospitalization. Using this model to impute cystectomy status minimized MB. The accuracy of administrative database research can be increased by using probabilistic imputation to determine case status instead of individual codes.
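A sketch of probabilistic imputation as described: train a logistic model where true status is known, then draw case status from the predicted probability instead of trusting a billing code; all column names and the data-generating step are hypothetical:

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(12)
n = 5000
adm = pd.DataFrame({                               # administrative covariates (hypothetical)
    "los": rng.exponential(5, n),                  # length of stay
    "uro_ward": rng.binomial(1, 0.05, n),
    "age": rng.normal(60, 15, n),
})
# Simulated "true" status; in practice this comes from the linked gold-standard subset.
truth = rng.binomial(1, 1 / (1 + np.exp(-(-6 + 0.2 * adm["los"] + 4 * adm["uro_ward"]))))

model = LogisticRegression().fit(adm, truth)
adm["p_cyst"] = model.predict_proba(adm)[:, 1]
# Probabilistic imputation: draw case status from the predicted probability rather
# than taking a billing code at face value; repeat the draw to propagate uncertainty.
adm["cyst_imputed"] = rng.binomial(1, adm["p_cyst"])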


Subject(s)
Cystectomy, Urinary Bladder Neoplasms, Humans, Hospitalization, Probability, Bias, Databases, Factual, Urinary Bladder Neoplasms/surgery
19.
BMC Med Res Methodol ; 24(1): 148, 2024 Jul 13.
Article in English | MEDLINE | ID: mdl-39003462

ABSTRACT

We propose a compartmental model for investigating smoking dynamics in an Italian region (Tuscany). Calibrating the model on local data from 1993 to 2019, we estimate the probabilities of starting and quitting smoking and the probability of smoking relapse. Then, we forecast the evolution of smoking prevalence until 2043 and assess the impact on mortality in terms of attributable deaths. We introduce elements of novelty with respect to previous studies in this field, including a formal definition of the equations governing the model dynamics and a flexible modelling of smoking probabilities based on cubic regression splines. We estimate model parameters by defining a two-step procedure and quantify the sampling variability via a parametric bootstrap. We propose the implementation of cross-validation on a rolling basis and variance-based Global Sensitivity Analysis to check the robustness of the results and support our findings. Our results suggest a decrease in smoking prevalence among males and stability among females, over the next two decades. We estimate that, in 2023, 18% of deaths among males and 8% among females are due to smoking. We test the use of the model in assessing the impact on smoking prevalence and mortality of different tobacco control policies, including the tobacco-free generation ban recently introduced in New Zealand.
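The compartmental backbone can be sketched as a discrete-time update over never/current/former smoker shares; the constant transition probabilities below are placeholders for the spline-based, time-varying probabilities the paper estimates:

# Annual transition probabilities (hypothetical constants; the paper lets these
# vary smoothly with age and period via cubic regression splines).
p_start, p_quit, p_relapse = 0.02, 0.04, 0.01

never, current, former = 0.50, 0.30, 0.20          # initial prevalence shares
for year in range(2023, 2044):
    starting = p_start * never
    quitting = p_quit * current
    relapsing = p_relapse * former
    never -= starting
    current += starting + relapsing - quitting
    former += quitting - relapsing
print(f"projected smoking prevalence in 2043: {current:.1%}")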


Subject(s)
Forecasting, Smoking Cessation, Smoking, Humans, Italy/epidemiology, Female, Male, Smoking/epidemiology, Prevalence, Forecasting/methods, Smoking Cessation/statistics & numerical data, Adult, Middle Aged, Models, Statistical
20.
J R Stat Soc Series B Stat Methodol ; 86(2): 411-434, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38746015

ABSTRACT

Mediation analysis aims to assess if, and how, a certain exposure influences an outcome of interest through intermediate variables. This problem has recently gained a surge of attention due to the tremendous need for such analyses in scientific fields. Testing for the mediation effect (ME) is greatly challenged by the fact that the underlying null hypothesis (i.e. the absence of MEs) is composite. Most existing mediation tests are overly conservative and thus underpowered. To overcome this significant methodological hurdle, we develop an adaptive bootstrap testing framework that can accommodate different types of composite null hypotheses in the mediation pathway analysis. Applied to the product of coefficients test and the joint significance test, our adaptive testing procedures provide type I error control under the composite null, resulting in much improved statistical power compared to existing tests. Both theoretical properties and numerical examples of the proposed methodology are discussed.
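For context, the classical percentile bootstrap for the product of coefficients, whose behavior under the composite null the adaptive bootstrap is designed to fix; the data are simulated:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(13)
n = 300
x = rng.normal(size=n)
m = 0.4 * x + rng.normal(size=n)                   # mediator model: a = 0.4
y = 0.3 * m + 0.1 * x + rng.normal(size=n)         # outcome model: b = 0.3

def ab(xv, mv, yv):
    a = sm.OLS(mv, sm.add_constant(xv)).fit().params[1]
    b = sm.OLS(yv, sm.add_constant(np.column_stack([xv, mv]))).fit().params[2]
    return a * b

boot = np.empty(2000)
for i in range(boot.size):
    idx = rng.integers(0, n, n)
    boot[i] = ab(x[idx], m[idx], y[idx])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect a*b = {ab(x, m, y):.3f}, 95% percentile CI = ({lo:.3f}, {hi:.3f})")
# Under the composite null (a = 0 or b = 0) this interval is conservative --
# precisely the miscalibration the adaptive bootstrap corrects.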
