Results 1 - 20 of 145
1.
Proc Natl Acad Sci U S A ; 119(16): e2120737119, 2022 04 19.
Article in English | MEDLINE | ID: mdl-35412893

ABSTRACT

Probability models are used for many statistical tasks, notably parameter estimation, interval estimation, inference about model parameters, point prediction, and interval prediction. Thus, choosing a statistical model and accounting for uncertainty about this choice are important parts of the scientific process. Here we focus on one such choice, that of variables to include in a linear regression model. Many methods have been proposed, including Bayesian and penalized likelihood methods, and it is unclear which one to use. We compared 21 of the most popular methods by carrying out an extensive set of simulation studies based closely on real datasets that span a range of situations encountered in practical data analysis. Three adaptive Bayesian model averaging (BMA) methods performed best across all statistical tasks. These used adaptive versions of Zellner's g-prior for the parameters, where the prior variance parameter g is a function of sample size or is estimated from the data. We found that for BMA methods implemented with Markov chain Monte Carlo, 10,000 iterations were enough. Computationally, we found two of the three best methods (BMA with g=√n and empirical Bayes-local) to be competitive with the least absolute shrinkage and selection operator (LASSO), which is often preferred as a variable selection technique because of its computational efficiency. BMA performed better than Bayesian model selection (in which just one model is selected).
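As a rough illustration of the kind of method this abstract compares, the sketch below averages over all subsets of predictors in a linear regression under Zellner's g-prior with g = √n, one of the choices the abstract names. It assumes a uniform prior over models and uses the standard closed-form Bayes factor of each model against the intercept-only model. The function name `gprior_bma` is hypothetical, and exhaustive enumeration stands in for the Markov chain Monte Carlo search the abstract mentions, so it is only practical for a handful of predictors.

```python
import itertools
import numpy as np

def gprior_bma(X, y, g=None):
    """Enumerate every subset of columns of X and weight it under Zellner's g-prior.

    Returns the list of models (tuples of column indices), their posterior
    probabilities under a uniform model prior (an assumption of this sketch),
    and the posterior inclusion probability of each predictor.
    """
    n, p = X.shape
    g = np.sqrt(n) if g is None else g          # g = sqrt(n) by default
    yc = y - y.mean()
    tss = yc @ yc
    models, log_bf = [], []
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            if k == 0:
                r2 = 0.0                        # intercept-only model
            else:
                Xc = X[:, list(subset)]
                Xc = Xc - Xc.mean(axis=0)
                beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
                resid = yc - Xc @ beta
                r2 = 1.0 - (resid @ resid) / tss
            # log Bayes factor of this model against the intercept-only model
            lbf = 0.5 * (n - 1 - k) * np.log1p(g) \
                - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))
            models.append(subset)
            log_bf.append(lbf)
    log_bf = np.array(log_bf)
    post = np.exp(log_bf - log_bf.max())        # subtract max for numerical stability
    post /= post.sum()
    incl = np.array([sum(pm for m, pm in zip(models, post) if j in m)
                     for j in range(p)])
    return models, post, incl
```

Calling `gprior_bma(X, y)` on a design matrix with a few columns returns inclusion probabilities that can be read as the posterior support for each variable.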

2.
Biostatistics ; 2023 Sep 11.
Article in English | MEDLINE | ID: mdl-37697901

ABSTRACT

The traditional trial paradigm is often criticized as being slow, inefficient, and costly. Statistical approaches that leverage external trial data have emerged to make trials more efficient by augmenting the sample size. However, these approaches assume that external data are from previously conducted trials, leaving a rich source of untapped real-world data (RWD) that cannot yet be effectively leveraged. We propose a semi-supervised mixture (SS-MIX) multisource exchangeability model (MEM); a flexible, two-step Bayesian approach for incorporating RWD into randomized controlled trial analyses. The first step is a SS-MIX model on a modified propensity score and the second step is a MEM. The first step targets a representative subgroup of individuals from the trial population and the second step avoids borrowing when there are substantial differences in outcomes among the trial sample and the representative observational sample. When comparing the proposed approach to competing borrowing approaches in a simulation study, we find that our approach borrows efficiently when the trial and RWD are consistent, while mitigating bias when the trial and external data differ on either measured or unmeasured covariates. We illustrate the proposed approach with an application to a randomized controlled trial investigating intravenous hyperimmune immunoglobulin in hospitalized patients with influenza, while leveraging data from an external observational study to supplement a subgroup analysis by influenza subtype.

3.
Biostatistics ; 24(3): 669-685, 2023 Jul 14.
Article in English | MEDLINE | ID: mdl-35024790

ABSTRACT

The explosion in high-resolution data capture technologies in health has increased interest in making inferences about individual-level parameters. While technology may provide substantial data on a single individual, how best to use multisource population data to improve individualized inference remains an open research question. One possible approach, the multisource exchangeability model (MEM), is a Bayesian method for integrating data from supplementary sources into the analysis of a primary source. MEM was originally developed to improve inference for a single study by asymmetrically borrowing information from a set of similar previous studies and was further developed to apply a more computationally intensive symmetric borrowing in the context of basket trials; however, even for asymmetric borrowing, its computational burden grows exponentially with the number of supplementary sources, making it unsuitable for applications where hundreds or thousands of supplementary sources (i.e., individuals) could contribute to inference on a given individual. In this article, we propose the data-driven MEM (dMEM), a two-stage approach that includes both source selection and clustering to enable the inclusion of an arbitrary number of sources to contribute to individualized inference in a computationally tractable and data-efficient way. We illustrate the application of dMEM to individual-level human behavior and mental well-being data collected via smartphones, where our approach increases individual-level estimation precision by 84% compared with a standard no-borrowing method and outperforms recently proposed competing methods in 80% of individuals.


Subject(s)
Statistical Models , Humans , Bayes Theorem
4.
Biostatistics ; 2023 Sep 05.
Article in English | MEDLINE | ID: mdl-37669215

ABSTRACT

In recent years, multi-regional clinical trials (MRCTs) have increased in popularity in the pharmaceutical industry due to their ability to accelerate the global drug development process. To address potential challenges with MRCTs, the International Council for Harmonisation released the E17 guidance document which suggests the use of statistical methods that utilize information borrowing across regions if regional sample sizes are small. We develop an approach that allows for information borrowing via Bayesian model averaging in the context of a joint analysis of survival and longitudinal data from MRCTs. In this novel application of joint models to MRCTs, we use Laplace's method to integrate over subject-specific random effects and to approximate posterior distributions for region-specific treatment effects on the time-to-event outcome. Through simulation studies, we demonstrate that the joint modeling approach can result in an increased rejection rate when testing the global treatment effect compared with methods that analyze survival data alone. We then apply the proposed approach to data from a cardiovascular outcomes MRCT.

5.
Biostatistics ; 24(2): 262-276, 2023 04 14.
Article in English | MEDLINE | ID: mdl-34296263

ABSTRACT

Multiregional clinical trials (MRCTs) provide the benefit of more rapidly introducing drugs to the global market; however, small regional sample sizes can lead to poor estimation quality of region-specific effects when using current statistical methods. With the publication of the International Conference for Harmonisation E17 guideline in 2017, the MRCT design is recognized as a viable strategy that can be accepted by regional regulatory authorities, necessitating new statistical methods that improve the quality of region-specific inference. In this article, we develop a novel methodology for estimating region-specific and global treatment effects for MRCTs using Bayesian model averaging. This approach can be used for trials that compare two treatment groups with respect to a continuous outcome, and it allows for the incorporation of patient characteristics through the inclusion of covariates. We propose an approach that uses posterior model probabilities to quantify evidence in favor of consistency of treatment effects across all regions, and this metric can be used by regulatory authorities for drug approval. We show through simulations that the proposed modeling approach results in lower MSE than a fixed-effects linear regression model and better control of type I error rates than a Bayesian hierarchical model.


Subject(s)
Drug Approval , Research Design , Humans , Bayes Theorem , Treatment Outcome , Sample Size , Probability
6.
Metab Eng ; 83: 137-149, 2024 May.
Article in English | MEDLINE | ID: mdl-38582144

ABSTRACT

Metabolic reaction rates (fluxes) play a crucial role in comprehending cellular phenotypes and are essential in areas such as metabolic engineering, biotechnology, and biomedical research. The state-of-the-art technique for estimating fluxes is metabolic flux analysis using isotopic labelling (13C-MFA), which uses a dataset-model combination to determine the fluxes. Bayesian statistical methods are gaining popularity in the field of life sciences, but the use of 13C-MFA is still dominated by conventional best-fit approaches. The slow take-up of Bayesian approaches is, at least partly, due to the unfamiliarity of Bayesian methods to metabolic engineering researchers. To address this unfamiliarity, we here outline similarities and differences between the two approaches and highlight particular advantages of the Bayesian way of flux analysis. With a real-life example, re-analysing a moderately informative labelling dataset of E. coli, we identify situations in which Bayesian methods are advantageous and more informative, pointing to potential pitfalls of current 13C-MFA evaluation approaches. We propose the use of Bayesian model averaging (BMA) for flux inference as a means of overcoming the problem of model uncertainty through its tendency to assign low probabilities both to models that are unsupported by data and to models that are overly complex. In this capacity, BMA resembles a tempered Ockham's razor. With the tempered razor as a guide, BMA-based 13C-MFA alleviates the problem of model selection uncertainty and is thereby capable of becoming a game changer for metabolic engineering by uncovering new insights and inspiring novel approaches.


Subject(s)
Bayes Theorem , Carbon Isotopes , Escherichia coli , Carbon Isotopes/metabolism , Escherichia coli/metabolism , Escherichia coli/genetics , Metabolic Flux Analysis/methods , Biological Models , Metabolic Engineering/methods , Isotope Labeling
7.
Stat Med ; 43(4): 774-792, 2024 02 20.
Article in English | MEDLINE | ID: mdl-38081586

ABSTRACT

When long-term follow up is required for a primary endpoint in a randomized clinical trial, a valid surrogate marker can help to estimate the treatment effect and accelerate the decision process. Several model-based methods have been developed to evaluate the proportion of the treatment effect that is explained by the treatment effect on the surrogate marker. More recently, a nonparametric approach has been proposed allowing for more flexibility by avoiding the restrictive parametric model assumptions required in the model-based methods. While the model-based approaches suffer from potential mis-specification of the models, the nonparametric method fails to give desirable estimates when the sample size is small, or when the range of the data does not follow certain conditions. In this paper, we propose a Bayesian model averaging approach to estimate the proportion of treatment effect explained by the surrogate marker. Our procedure offers a compromise between the model-based approach and the nonparametric approach by introducing model flexibility via averaging over several candidate models and maintains the strength of parametric models with respect to inference. We compare our approach with previous model-based methods and the nonparametric method. Simulation studies demonstrate the advantage of our method when surrogate supports are inconsistent and sample sizes are small. We illustrate our method using data from the Diabetes Prevention Program study to examine hemoglobin A1c as a surrogate marker for fasting glucose.


Subject(s)
Diabetes Mellitus , Humans , Bayes Theorem , Computer Simulation , Sample Size , Biomarkers
8.
Lipids Health Dis ; 23(1): 109, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622701

ABSTRACT

OBJECTIVE: This study aims to investigate the association between specific lipidomes and the risk of breast cancer (BC) using the Two-Sample Mendelian Randomization (TSMR) approach and Bayesian Model Averaging Mendelian Randomization (BMA-MR) method. METHOD: The study analyzed data from large-scale GWAS datasets of 179 lipidomes to assess the relationship between lipidomes and BC risk across different molecular subtypes. TSMR was employed to explore causal relationships, while the BMA-MR method was carried out to validate the results. The study assessed heterogeneity and horizontal pleiotropy through Cochran's Q, MR-Egger intercept tests, and MR-PRESSO. Moreover, a leave-one-out sensitivity analysis was performed to evaluate the impact of individual single nucleotide polymorphisms on the MR study. RESULTS: By examining 179 lipidome traits as exposures and BC as the outcome, the study revealed significant causal effects of glycerophospholipids, sphingolipids, and glycerolipids on BC risk. Specifically, for estrogen receptor-positive BC (ER+ BC), phosphatidylcholine (P < 0.05) and phosphatidylinositol (OR: 0.916-0.966, P < 0.05) within glycerophospholipids play significant roles, along with the importance of glycerolipids (diacylglycerol (OR = 0.923, P < 0.001) and triacylglycerol (OR: 0.894-0.960, P < 0.05)). However, the study did not observe a noteworthy impact of sphingolipids on ER+ BC. In the case of estrogen receptor-negative BC (ER- BC), not only glycerophospholipids, sphingolipids (OR = 1.085, P = 0.008), and glycerolipids (OR = 0.909, P = 0.002) exerted an influence, but the protective effect of sterols (OR: 1.034-1.056, P < 0.05) was also discovered. The prominence of glycerolipids was minimal in ER- BC. Phosphatidylethanolamine (OR: 1.091-1.119, P < 0.05) had an important causal effect in ER- BC. CONCLUSIONS: The findings reveal that phosphatidylinositol and triglyceride levels decreased the risk of BC, indicating a potential protective role of these lipid molecules. Moreover, the study elucidates BC's intricate lipid metabolic pathways, highlighting diverse lipidome structural variations that may have varying effects in different molecular subtypes.
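The two-sample MR workflow named in this abstract usually rests on an inverse-variance weighted (IVW) estimate and a Cochran's Q heterogeneity check. The sketch below shows the textbook fixed-effect version of both, assuming per-SNP summary statistics are already harmonised. The function and argument names are hypothetical, and this is not the study's own pipeline (MR-Egger, MR-PRESSO and BMA-MR are not reproduced).

```python
import numpy as np

def ivw_mr(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW Mendelian randomization estimate with Cochran's Q.

    Inputs are per-SNP effect sizes on the exposure and on the outcome
    (log-odds scale for a binary outcome) and the outcome standard errors.
    """
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    se = np.asarray(se_outcome, dtype=float)
    ratio = by / bx                        # Wald ratio for each SNP
    w = bx ** 2 / se ** 2                  # first-order inverse variance of each ratio
    est = np.sum(w * ratio) / np.sum(w)    # IVW causal estimate
    se_est = np.sqrt(1.0 / np.sum(w))
    q = np.sum(w * (ratio - est) ** 2)     # Cochran's Q, df = number of SNPs - 1
    return est, se_est, q
```

For a binary outcome such as BC, `np.exp(est)` gives the odds ratio per unit of exposure; a large Q relative to its degrees of freedom suggests heterogeneity across instruments.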


Subject(s)
Lipidomics , Neoplasms , Bayes Theorem , Mendelian Randomization Analysis , Glycerophospholipids , Phosphatidylinositols , Sphingolipids , Estrogen Receptors/genetics , Genome-Wide Association Study
9.
J Biopharm Stat ; 34(3): 349-365, 2024 May.
Article in English | MEDLINE | ID: mdl-38105583

ABSTRACT

Selecting a safe and clinically beneficial dose can be difficult in drug development. Dose justification often relies on dose-response modeling where parametric assumptions are made in advance which may not adequately fit the data. This is especially problematic in longitudinal dose-response models, where additional parametric assumptions must be made. This paper proposes a class of longitudinal dose-response models to be used in the Bayesian model averaging paradigm which improve trial operating characteristics while maintaining flexibility a priori. A new longitudinal model for non-monotonic longitudinal profiles is proposed. The benefits and trade-offs of the proposed approach are demonstrated through a case study and simulation.


Subject(s)
Statistical Models , Humans , Bayes Theorem , Computer Simulation , Drug Dose-Response Relationship
10.
Multivariate Behav Res ; : 1-21, 2024 May 11.
Article in English | MEDLINE | ID: mdl-38733319

ABSTRACT

Network psychometrics uses graphical models to assess the network structure of psychological variables. An important task in their analysis is determining which variables are unrelated in the network, i.e., are independent given the rest of the network variables. This conditional independence structure is a gateway to understanding the causal structure underlying psychological processes. Thus, it is crucial to have an appropriate method for evaluating conditional independence and dependence hypotheses. Bayesian approaches to testing such hypotheses allow researchers to differentiate between absence of evidence and evidence of absence of connections (edges) between pairs of variables in a network. Three Bayesian approaches to assessing conditional independence have been proposed in the network psychometrics literature. We believe that their theoretical foundations are not widely known, and therefore we provide a conceptual review of the proposed methods and highlight their strengths and limitations through a simulation study. We also illustrate the methods using an empirical example with data on Dark Triad Personality. Finally, we provide recommendations on how to choose the optimal method and discuss the current gaps in the literature on this important topic.

11.
Alzheimers Dement ; 20(7): 4702-4716, 2024 07.
Article in English | MEDLINE | ID: mdl-38779851

ABSTRACT

INTRODUCTION: Patients with subjective memory complaints (SMC) may include subgroups with different neuropsychological profiles and risks of cognitive impairment. METHODS: Cluster analysis was performed on two datasets (n: 630 and 734) comprising demographic and neuropsychological data from SMC and healthy controls (HC). Survival analyses were conducted on clusters. Bayesian model averaging assessed the predictive utility of clusters and other biomarkers. RESULTS: Two clusters with higher and lower than average cognitive performance were detected in SMC and HC. Assignment to the lower performance cluster increased the risk of cognitive impairment in both datasets (hazard ratios: 1.78 and 2.96; log-rank P: 0.04 and <0.001) and was associated with lower hippocampal volumes and higher tau/amyloid beta 42 ratios in cerebrospinal fluid. The effect of SMC was small and confounded by mood. DISCUSSION: This study provides evidence of the presence of cognitive clusters that hold biological significance and predictive value for cognitive decline in SMC and HC. HIGHLIGHTS: Patients with subjective memory complaints include two cognitive clusters. Assignment to the lower performance cluster increases risk of cognitive impairment. This cluster shows a pattern of biomarkers consistent with incipient Alzheimer's disease pathology. The same cognitive cluster structure is found in healthy controls. The effect of memory complaints on risk of cognitive decline is small and confounded.


Subject(s)
Cognitive Dysfunction , Memory Disorders , Neuropsychological Tests , Humans , Female , Male , Aged , Cluster Analysis , Neuropsychological Tests/statistics & numerical data , Cognitive Dysfunction/cerebrospinal fluid , Amyloid beta-Peptides/cerebrospinal fluid , tau Proteins/cerebrospinal fluid , Biomarkers/cerebrospinal fluid , Bayes Theorem , Hippocampus/pathology , Middle Aged , Peptide Fragments/cerebrospinal fluid
12.
J Environ Manage ; 354: 120252, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38394869

ABSTRACT

Data-driven machine learning approaches are promising substitutes for physically based groundwater numerical models and can capture input-output relationships to reduce computational burden. However, their performance and reliability are strongly influenced by different sources of uncertainty. Conventional studies generally rely on a stand-alone machine learning surrogate approach and fail to account for errors in model outputs resulting from structural deficiencies. To overcome this issue, this study proposes a flexible integrated Bayesian machine learning modeling (IBMLM) method to explicitly quantify uncertainties originating from structures and parameters of machine learning surrogate models. An Expectation-Maximization (EM) algorithm is combined with Bayesian model averaging (BMA) to find the maximum likelihood and construct the posterior predictive distribution. Three machine learning approaches representing different model complexity are incorporated in the framework, including artificial neural network (ANN), support vector machine (SVM) and random forest (RF). The proposed IBMLM method is demonstrated in a field-scale real-world "1500-foot" sand aquifer, Baton Rouge, USA, where overexploitation caused serious saltwater intrusion (SWI) issues. This study adds to the understanding of how chloride concentration transport responds to multi-dimensional extraction-injection remediation strategies in a sophisticated saltwater intrusion model. Results show that most IBMLM exhibit r values above 0.98 and NSE values above 0.93, both slightly higher than individual machine learning models, confirming that the IBMLM is well established to provide better model predictions than individual machine learning models, while maintaining the advantage of high computing efficiency. The IBMLM is found useful to predict saltwater intrusion without running the physically based numerical simulation model. We conclude that an explicit consideration of machine learning model structure uncertainty along with parameters improves accuracy and reliability of predictions, and also corrects uncertainty bounds. The applicability of the IBMLM framework can be extended in regions where a physical hydrogeologic model is difficult to build due to lack of subsurface information.
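The combination of EM with BMA mentioned in this abstract can be sketched as follows, assuming the common formulation in which the predictive density is a mixture of Gaussians centred on each surrogate's prediction with a shared variance. This is a generic illustration, not the authors' IBMLM code. The function name `bma_em` and its arguments are hypothetical, and the columns of `preds` would hold, for example, ANN, SVM and RF outputs.

```python
import numpy as np

def bma_em(preds, obs, n_iter=200, tol=1e-8):
    """Estimate BMA weights and a shared variance by EM.

    preds: array of shape (T, K) with the predictions of K surrogate models.
    obs:   array of shape (T,) with the matching observations.
    """
    T, K = preds.shape
    w = np.full(K, 1.0 / K)                      # start from equal weights
    err2 = (obs[:, None] - preds) ** 2
    sigma2 = err2.mean()
    for _ in range(n_iter):
        # E-step: responsibility of each model for each observation
        log_dens = -0.5 * np.log(2.0 * np.pi * sigma2) - 0.5 * err2 / sigma2
        log_r = np.log(np.maximum(w, 1e-300)) + log_dens
        log_r -= log_r.max(axis=1, keepdims=True)
        z = np.exp(log_r)
        z /= z.sum(axis=1, keepdims=True)
        # M-step: update the weights and the shared variance
        w_new = z.mean(axis=0)
        sigma2 = (z * err2).sum() / T
        converged = np.abs(w_new - w).max() < tol
        w = w_new
        if converged:
            break
    return w, sigma2
```

The BMA point prediction is then `preds @ w`, and the mixture itself supplies the predictive uncertainty bounds.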


Subject(s)
Groundwater , Uncertainty , Bayes Theorem , Reproducibility of Results , Groundwater/chemistry , Machine Learning
13.
Environ Geochem Health ; 46(7): 253, 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38884835

ABSTRACT

Urinary cadmium (U-Cd) values are indicators for determining chronic cadmium toxicity, and previous studies have calculated U-Cd indicators using renal injury biomarkers. However, most of these studies have been conducted in adult populations, and there is a lack of research on U-Cd thresholds in preschool children. We aimed to apply benchmark dose (BMD) analysis to estimate the U-Cd threshold level associated with renal impairment in preschool children in a cadmium-polluted area. A total of 518 preschool children aged 3-5 years were selected by systematic sampling (275 boys, 243 girls). Urinary cadmium and three biomarkers of early renal injury (urinary N-acetyl-β-D-glucosaminidase, UNAG; urinary β2-microglobulin, Uβ2-MG; urinary retinol-binding protein, URBP) were determined. Bayesian model averaging estimated the BMD and lower confidence interval limit (BMDL) of U-Cd. The median U-Cd levels in both boys and girls exceeded the recommended national standard threshold (5 µg/g cr) and U-Cd levels were higher in girls than in boys. Urinary N-acetyl-β-D-glucosaminidase (UNAG) was the most sensitive biomarker of renal effects in preschool children. The overall BMDL5 (BMDL at a benchmark response value of 5) was 2.76 µg/g cr. In the gender analysis, the BMDL5 values were 1.92 µg/g cr for boys and 4.12 µg/g cr for girls. This study shows that the U-Cd threshold (BMDL5) is lower than the national standard (5 µg/g cr) and boys' BMDL5 was lower than the limit set by the European Parliament and Council in 2019 (2 µg/g cr), which provides a reference point for making U-Cd thresholds for preschool children.


Subject(s)
Bayes Theorem , Biomarkers , Cadmium , Humans , Preschool Child , Male , Female , Cadmium/urine , Biomarkers/urine , Environmental Pollutants/urine , Acetylglucosaminidase/urine , Benchmarking , Environmental Exposure , beta 2-Microglobulin/urine , Retinol-Binding Proteins/urine , Environmental Monitoring/methods
14.
Behav Res Methods ; 56(3): 1260-1282, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37099263

ABSTRACT

Researchers conduct meta-analyses in order to synthesize information across different studies. Compared to standard meta-analytic methods, Bayesian model-averaged meta-analysis offers several practical advantages including the ability to quantify evidence in favor of the absence of an effect, the ability to monitor evidence as individual studies accumulate indefinitely, and the ability to draw inferences based on multiple models simultaneously. This tutorial introduces the concepts and logic underlying Bayesian model-averaged meta-analysis and illustrates its application using the open-source software JASP. As a running example, we perform a Bayesian meta-analysis on language development in children. We show how to conduct a Bayesian model-averaged meta-analysis and how to interpret the results.
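To make the idea of drawing inferences from multiple models at once concrete, the sketch below turns the marginal likelihoods of a small model set into posterior model probabilities and a model-averaged (inclusion) Bayes factor for the presence of an effect. It assumes four models (fixed or random effects, each with and without an effect) whose log marginal likelihoods have already been computed elsewhere. The function name `inclusion_bf` and the numbers in the usage note are hypothetical; JASP itself is a graphical program and is not being scripted here.

```python
import numpy as np

def inclusion_bf(log_marginal_lik, prior_prob, has_effect):
    """Posterior model probabilities and inclusion Bayes factor for an effect.

    log_marginal_lik: log marginal likelihood of each model.
    prior_prob:       prior probability of each model (sums to 1).
    has_effect:       True for models that include the effect.
    """
    log_ml = np.asarray(log_marginal_lik, dtype=float)
    prior = np.asarray(prior_prob, dtype=float)
    mask = np.asarray(has_effect, dtype=bool)
    post = prior * np.exp(log_ml - log_ml.max())   # stabilised Bayes rule
    post /= post.sum()
    post_odds = post[mask].sum() / post[~mask].sum()
    prior_odds = prior[mask].sum() / prior[~mask].sum()
    return post, post_odds / prior_odds
```

With models ordered as fixed-H0, fixed-H1, random-H0, random-H1, equal prior probabilities, and each H1 model three times as likely as its H0 counterpart, the inclusion Bayes factor for the effect is 3.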


Subject(s)
Research Design , Software , Child , Humans , Bayes Theorem
15.
Environ Monit Assess ; 196(7): 614, 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38871960

ABSTRACT

Global warming upsets the environmental balance and leads to more frequent and severe climatic events. These extreme events include floods, droughts, and heatwaves. These widespread extreme events disrupt various sectors of ecosystems directly. However, among all these events, drought is one of the most prolonged climatic events that significantly destroys the ecosystem. Therefore, accurate and efficient assessment of droughts is necessary to mitigate their detrimental impacts. In recent years, several drought indices based on global climate models (GCMs) of Coupled Model Intercomparison Project Phase 6 (CMIP6) have been proposed to quantify and monitor droughts. However, each index has its advantages and limitations. As each index ensembles different models by using different statistical approaches, it is well known that the margin of error is always a part of statistics. Therefore, this study proposed a new drought index to reduce the uncertainty involved in the assessment of droughts. The proposed index, named the Ridge Ensemble Standardized Drought Index (RESDI), is based on an innovative ensemble approach termed the ridge parameters and distance-based weighting (RDW) scheme. The RDW scheme builds on two types of methods, i.e., ridge regression and a divergence-based method. In this research, we ensemble 18 different GCMs of CMIP6 using the RDW scheme. A comparative analysis of the RDW scheme is performed against the simple model average (SMA) and Bayesian model averaging (BMA) schemes at 32 locations on the Tibetan plateau. The comparison revealed that RDW has lower mean absolute error (MAE) and root-mean-square error (RMSE). Therefore, the developed RESDI based on RDW is used to project drought properties under three distinct shared socioeconomic pathway (SSP) scenarios: SSP1-2.6, SSP2-4.5, and SSP5-8.5, across seven different time scales (1, 3, 7, 9, 12, 24, and 48). The projected data is then standardized by using the K-components Gaussian mixture model (K-CGMM). In addition, the study employs steady-state probabilities (SSPs) to determine the long-term behavior of drought. The outcome of this research shows that "normal drought (ND)" has the highest probability of occurrence under all scenarios and time scales.
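The steady-state probabilities mentioned in this abstract can be illustrated with a first-order Markov chain over drought classes. The sketch below assumes the drought index has already been binned into integer class labels. It estimates the transition matrix from a label sequence and solves for the stationary distribution. The function names are hypothetical and the study's own classification thresholds are not reproduced.

```python
import numpy as np

def transition_matrix(states, n_states):
    """Row-normalised transition counts from a sequence of class labels 0..n_states-1."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    rows = counts.sum(axis=1, keepdims=True)
    # classes never visited get a uniform row so the matrix stays stochastic
    return np.divide(counts, rows,
                     out=np.full_like(counts, 1.0 / n_states),
                     where=rows > 0)

def steady_state(P):
    """Solve pi @ P = pi subject to sum(pi) = 1 by least squares."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi
```

The class with the largest entry of `steady_state(P)` is the one the chain occupies most often in the long run, which is the sense in which a class such as "normal drought" can have the highest probability of occurrence.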


Subject(s)
Droughts , Environmental Monitoring , Environmental Monitoring/methods , Climate Change , Ecosystem , Theoretical Models , Global Warming , Climate
16.
Environ Monit Assess ; 196(3): 284, 2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38374477

ABSTRACT

Accurate and reliable air temperature forecasts are necessary for predicting and responding to thermal disasters such as heat strokes. Forecasts from Numerical Weather Prediction (NWP) models contain biases which require post-processing. Studies assessing the skill of probabilistic post-processing techniques (PPTs) on temperature forecasts in India are lacking. This study aims to evaluate probabilistic post-processing approaches such as Nonhomogeneous Gaussian Regression (NGR) and Bayesian Model Averaging (BMA) for improving daily temperature forecasts from two NWP models, namely, the European Centre for Medium Range Weather Forecasts (ECMWF) and the Global Ensemble Forecast System (GEFS), across the Indian subcontinent. Apart from that, the effect of probabilistic PPT on heatwave prediction skills across India is also evaluated. Results show that probabilistic PPT comprehensively outperform traditional approaches in forecasting temperatures across India at all lead times. In the Himalayan regions where the forecast skill of raw forecasts is low, the probabilistic techniques are not able to produce skillful forecasts even though they perform much better than traditional techniques. The NGR method is found to be the best performing PPT across the Indian region. Post-processing Tmax forecasts using the NGR approach was found to considerably improve the heatwave prediction skill across highly heatwave prone regions in India. The outcomes of this study will be helpful in setting up improved heatwave prediction and early warning systems in India.


Subject(s)
Environmental Monitoring , Heat Stroke , Humans , Temperature , Bayes Theorem , Environmental Monitoring/methods , Weather
17.
Entropy (Basel) ; 26(7)2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39056962

ABSTRACT

Most statistical modeling applications involve the consideration of a candidate collection of models based on various sets of explanatory variables. The candidate models may also differ in terms of the structural formulations for the systematic component and the posited probability distributions for the random component. A common practice is to use an information criterion to select a model from the collection that provides an optimal balance between fidelity to the data and parsimony. The analyst then typically proceeds as if the chosen model was the only model ever considered. However, such a practice fails to account for the variability inherent in the model selection process, which can lead to inappropriate inferential results and conclusions. In recent years, inferential methods have been proposed for multimodel frameworks that attempt to provide an appropriate accounting of modeling uncertainty. In the frequentist paradigm, such methods should ideally involve model selection probabilities, i.e., the relative frequencies of selection for each candidate model based on repeated sampling. Model selection probabilities can be conveniently approximated through bootstrapping. When the Akaike information criterion is employed, Akaike weights are also commonly used as a surrogate for selection probabilities. In this work, we show that the conventional bootstrap approach for approximating model selection probabilities is impacted by bias. We propose a simple correction to adjust for this bias. We also argue that Akaike weights do not provide adequate approximations for selection probabilities, although they do provide a crude gauge of model plausibility.
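The two quantities contrasted in this abstract can be computed side by side in a small sketch: Akaike weights from a set of AIC values, and the conventional (uncorrected) bootstrap estimate of model selection probabilities. It assumes Gaussian linear candidate models defined by column subsets. The function names are hypothetical, and the bias correction proposed by the authors is not reproduced because the abstract does not specify it.

```python
import numpy as np

def akaike_weights(aic):
    """Akaike weights: exp(-delta/2) normalised over the candidate set."""
    aic = np.asarray(aic, dtype=float)
    delta = aic - aic.min()
    w = np.exp(-0.5 * delta)
    return w / w.sum()

def gaussian_aic(X, y, cols):
    """AIC (up to an additive constant) of a Gaussian linear model with an intercept."""
    n = len(y)
    D = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    rss = np.sum((y - D @ beta) ** 2)
    return n * np.log(rss / n) + 2 * (D.shape[1] + 1)   # +1 for the error variance

def bootstrap_selection_freq(X, y, candidates, n_boot=500, seed=0):
    """Share of bootstrap resamples in which each candidate has the lowest AIC."""
    rng = np.random.default_rng(seed)
    n = len(y)
    wins = np.zeros(len(candidates))
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)                # resample rows with replacement
        aics = [gaussian_aic(X[idx], y[idx], c) for c in candidates]
        wins[int(np.argmin(aics))] += 1
    return wins / n_boot
```

Comparing `akaike_weights` on the full-sample AIC values with `bootstrap_selection_freq` on the same candidates shows how far the two can diverge, which is the gap the abstract argues about.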

18.
Biometrics ; 79(4): 3586-3598, 2023 12.
Article in English | MEDLINE | ID: mdl-36594642

ABSTRACT

Sponsors often rely on multi-regional clinical trials (MRCTs) to introduce new treatments more rapidly into the global market. Many commonly used statistical methods do not account for regional differences, and small regional sample sizes frequently result in lower estimation quality of region-specific treatment effects. The International Council for Harmonization E17 guidelines suggest consideration of methods that allow for information borrowing across regions to improve estimation. In response to these guidelines, we develop a novel methodology to estimate global and region-specific treatment effects from MRCTs with time-to-event endpoints using Bayesian model averaging (BMA). This approach accounts for the possibility of heterogeneous treatment effects between regions, and we discuss how to assess the consistency of these effects using posterior model probabilities. We obtain posterior samples of the treatment effects using a Laplace approximation, and we show through simulation studies that the proposed modeling approach estimates region-specific treatment effects with lower mean squared error than a Cox proportional hazards model while resulting in a similar rejection rate of the global treatment effect. We then apply the BMA approach to data from the LEADER trial, an MRCT designed to evaluate the cardiovascular safety of an anti-diabetic treatment.


Subject(s)
Statistical Models , Research Design , Bayes Theorem , Sample Size , Computer Simulation
19.
Stat Med ; 42(27): 4990-5006, 2023 11 30.
Article in English | MEDLINE | ID: mdl-37705361

ABSTRACT

In immuno-oncology clinical trials, multiple immunological biomarkers are usually examined over time to comprehensively and appropriately evaluate the efficacy of treatments. Because predicting patients' future survival statuses on the basis of such recorded longitudinal information might be of great interest, joint modeling of longitudinal and time-to-event data has been intensively discussed as a toolkit to implement such a prediction. To achieve a desirable predictive performance, averaging over multiple candidate predictive models to account for the model uncertainty might be a more suitable statistical approach than selecting the single best model. Although Bayesian model averaging can be one of the approaches, several problems related to model weights with marginal likelihoods have been discussed. To address these problems, we here propose a Bayesian predictive model averaging (BPMA) method that uses Bayesian leave-one-out cross-validation predictive densities to account for the subject-specific and time-dependent nature of the prediction. We examine the operating characteristics of the proposed BPMA method in terms of the predictive accuracy (ie, the calibration and discrimination abilities) in extensive simulation studies. In addition, we discuss the strengths and limitations of the proposed method by applying it to an immuno-oncology clinical trial in patients with advanced ovarian cancer.


Subject(s)
Neoplasms , Humans , Bayes Theorem , Computer Simulation , Statistical Models , Neoplasms/therapy , Probability , Uncertainty , Clinical Trials as Topic
20.
BMC Med Res Methodol ; 23(1): 163, 2023 07 06.
Article in English | MEDLINE | ID: mdl-37415112

ABSTRACT

INTRODUCTION: The length of hospital stay (LOHS) caused by COVID-19 has imposed a financial burden and cost on the healthcare service system and a high psychological burden on patients and health workers. The purpose of this study is to adopt the Bayesian model averaging (BMA) based on linear regression models and to determine the predictors of the LOHS of COVID-19. METHODS: In this historical cohort study, from 5100 COVID-19 patients who had registered in the hospital database, 4996 patients were eligible to enter the study. The data included demographic, clinical, biomarkers, and LOHS. Factors affecting the LOHS were fitted in six models, including the stepwise method, AIC, BIC in classical linear regression models, two BMA using Occam's Window and Markov Chain Monte Carlo (MCMC) methods, and GBDT algorithm, a new method of machine learning. RESULTS: The average length of hospitalization was 6.7 ± 5.7 days. In fitting classical linear models, both stepwise and AIC methods (R² = 0.168 and adjusted R² = 0.165) performed better than BIC (R² = 0.160 and adjusted R² = 0.158). In fitting the BMA, Occam's Window model has performed better than MCMC with R² = 0.174. The GBDT method, with the value of R² = 0.64, has performed worse than the BMA in the testing dataset but not in the training dataset. Based on the six fitted models, ICU hospitalization, respiratory distress, age, diabetes, CRP, PO2, WBC, AST, BUN, and NLR were associated significantly with predicting LOHS of COVID-19. CONCLUSION: The BMA with Occam's Window method has a better fit and better performance in predicting factors affecting the LOHS in the testing dataset than other models.
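The Occam's Window variant of BMA named in this abstract can be sketched in a few lines, assuming posterior model probabilities are approximated from BIC values with equal prior model probabilities and that models more than a fixed factor less probable than the best one are discarded. The function name `occams_window` and the cutoff ratio of 20 are illustrative assumptions, not values taken from the study.

```python
import numpy as np

def occams_window(bic, ratio=20.0):
    """Keep models within Occam's window and renormalise their weights.

    bic:   BIC value of each candidate model (lower is better).
    ratio: a model is dropped when the best model is more than `ratio`
           times as probable (assumed cutoff for this sketch).
    """
    bic = np.asarray(bic, dtype=float)
    post = np.exp(-0.5 * (bic - bic.min()))   # BIC approximation, equal model priors
    post /= post.sum()
    keep = post >= post.max() / ratio
    weights = np.where(keep, post, 0.0)
    return keep, weights / weights.sum()
```

Averaging the coefficient estimates of the retained models with these weights gives the model-averaged regression that is then compared against stepwise, AIC and BIC selection.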


Subject(s)
COVID-19 , Humans , Cohort Studies , Bayes Theorem , Hospitalization , Length of Stay , Seizures