Results 1 - 20 of 84
1.
Am J Epidemiol ; 193(2): 377-388, 2024 Feb 05.
Article in English | MEDLINE | ID: mdl-37823269

ABSTRACT

Propensity score analysis is a common approach to addressing confounding in nonrandomized studies. Its implementation, however, requires important assumptions (e.g., positivity). The disease risk score (DRS) is an alternative confounding score that can relax some of these assumptions. Like the propensity score, the DRS summarizes multiple confounders into a single score, on which conditioning by matching allows the estimation of causal effects. However, matching relies on arbitrary choices for pruning out data (e.g., matching ratio, algorithm, and caliper width) and may be computationally demanding. Alternatively, weighting methods, common in propensity score analysis, are easy to implement and may entail fewer choices, yet none have been developed for the DRS. Here we present 2 weighting approaches: One derives directly from inverse probability weighting; the other, named target distribution weighting, relates to importance sampling. We empirically show that inverse probability weighting and target distribution weighting display performance comparable to matching techniques in terms of bias but outperform them in terms of efficiency (mean squared error) and computational speed (up to >870 times faster in an illustrative study). We illustrate implementation of the methods in 2 case studies where we investigate placebo treatments for multiple sclerosis and administration of aspirin in stroke patients.
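The weighting mechanics can be illustrated in a simple simulation. Note this sketch uses a known propensity score on simulated data as a generic stand-in for a fitted confounding score; it is not the authors' DRS-based estimator, whose exact form is given in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Simulated observational data: one confounder x drives both treatment
# assignment and the outcome; the true treatment effect is 2.
x = rng.normal(size=n)
p_treat = 1 / (1 + np.exp(-x))            # true propensity, known by construction
t = rng.binomial(1, p_treat)
y = 2 * t + x + rng.normal(size=n)

# The naive contrast is confounded by x.
naive = y[t == 1].mean() - y[t == 0].mean()

# Inverse probability weighting: reweight each arm to the full population.
w = np.where(t == 1, 1 / p_treat, 1 / (1 - p_treat))
ipw = (np.average(y[t == 1], weights=w[t == 1])
       - np.average(y[t == 0], weights=w[t == 0]))

print(f"naive: {naive:.2f}, IPW: {ipw:.2f}")  # IPW should be close to 2
```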


Subjects
Stroke, Humans, Propensity Score, Risk Factors, Bias, Causality, Stroke/epidemiology, Stroke/etiology, Computer Simulation
2.
Stat Med ; 43(3): 514-533, 2024 02 10.
Article in English | MEDLINE | ID: mdl-38073512

ABSTRACT

Missing data are a common problem in medical research and are typically addressed using multiple imputation. Although traditional imputation methods allow for valid statistical inference when data are missing at random (MAR), their implementation is problematic when the presence of missingness depends on unobserved variables, that is, when data are missing not at random (MNAR). Unfortunately, this MNAR situation is rather common in observational studies, registries, and other sources of real-world data. While several imputation methods have been proposed for individual studies with MNAR data, their application and validity in large datasets with a multilevel structure remain unclear. We therefore explored in depth the consequences of MNAR data in hierarchical data, and proposed a novel multilevel imputation method for common missing-data patterns in clustered datasets. This method is based on the principles of Heckman selection models and adopts a two-stage meta-analysis approach to impute binary and continuous variables that may be outcomes or predictors and that are systematically or sporadically missing. After evaluating the proposed imputation model in simulated scenarios, we illustrate its use in a cross-sectional community survey to estimate the prevalence of malaria parasitemia in children aged 2-10 years in five regions of Uganda.


Subjects
Biomedical Research, Child, Humans, Cross-Sectional Studies, Uganda/epidemiology
3.
BMC Med Res Methodol ; 24(1): 91, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641771

ABSTRACT

Observational data provide invaluable real-world information in medicine, but certain methodological considerations are required to derive causal estimates. In this systematic review, we evaluated the methodology and reporting quality of individual-level patient data meta-analyses (IPD-MAs) conducted with non-randomized exposures, published in 2009, 2014, and 2019, that sought to estimate a causal relationship in medicine. We screened over 16,000 titles and abstracts, reviewed 45 full-text articles out of the 167 deemed potentially eligible, and included 29 in the analysis. Unfortunately, we found that causal methodologies were rarely implemented and that reporting was generally poor across studies. Specifically, only three of the 29 articles used quasi-experimental methods, and no study used G-methods to adjust for time-varying confounding. To address these issues, we propose stronger collaborations between physicians and methodologists to ensure that causal methodologies are properly implemented in IPD-MAs. In addition, we put forward a suggested checklist of reporting guidelines for IPD-MAs that utilize causal methods. This checklist could improve reporting, thereby potentially enhancing the quality and trustworthiness of IPD-MAs, which can be considered one of the most valuable sources of evidence for health policy.


Subjects
Medicine, Research Design, Humans, Checklist
4.
Stat Med ; 42(19): 3508-3528, 2023 08 30.
Article in English | MEDLINE | ID: mdl-37311563

ABSTRACT

External validation of the discriminative ability of prediction models is of key importance. However, the interpretation of such evaluations is challenging, as the ability to discriminate depends on both the sample characteristics (ie, case-mix) and the generalizability of predictor coefficients, but most discrimination indices do not provide any insight into their respective contributions. To disentangle differences in discriminative ability across external validation samples due to a lack of model generalizability from differences in sample characteristics, we propose propensity-weighted measures of discrimination. These weighted metrics, which are derived from propensity scores for sample membership, are standardized for case-mix differences between the model development and validation samples, allowing for a fair comparison of discriminative ability in terms of model characteristics in a target population of interest. We illustrate our methods with the validation of eight prediction models for deep vein thrombosis in 12 external validation data sets and assess our methods in a simulation study. In the illustrative example, propensity score standardization reduced between-study heterogeneity of discrimination, indicating that between-study variability was partially attributable to case-mix. The simulation study showed that only flexible propensity-score methods (allowing for non-linear effects) produced unbiased estimates of model discrimination in the target population, and only when the positivity assumption was met. Propensity score-based standardization may facilitate the interpretation of (heterogeneity in) discriminative ability of a prediction model as observed across multiple studies, and may guide model updating strategies for a particular target population. Careful propensity score modeling with attention for non-linear relations is recommended.
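The core quantity, a concordance statistic computed with per-subject standardization weights, can be sketched as follows. The weights here are supplied directly on toy data rather than derived from a fitted membership propensity model, as they would be in the paper's approach.

```python
import numpy as np

def weighted_c_statistic(y, score, w):
    """Concordance between predicted scores and a binary outcome with
    per-subject weights (e.g. propensity-based standardization weights);
    w = 1 for everyone recovers the ordinary C-statistic."""
    y, score, w = map(np.asarray, (y, score, w))
    ev, ne = y == 1, y == 0
    # Pairwise weights and concordance (ties count half) over all
    # event/non-event pairs.
    pair_w = np.outer(w[ev], w[ne])
    diff = score[ev][:, None] - score[ne][None, :]
    conc = pair_w * (diff > 0) + 0.5 * pair_w * (diff == 0)
    return conc.sum() / pair_w.sum()

y     = [1, 1, 0, 0]
score = [0.8, 0.6, 0.4, 0.7]

c_plain = weighted_c_statistic(y, score, [1, 1, 1, 1])   # 3 of 4 pairs concordant
c_rewt  = weighted_c_statistic(y, score, [1, 1, 1, 2])   # up-weight one non-event
print(c_plain, c_rewt)
```

Reweighting shifts the C-statistic toward the case-mix the weights represent, which is the mechanism the paper exploits to separate case-mix from model generalizability.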


Subjects
Benchmarking, Diagnosis-Related Groups, Humans, Computer Simulation
5.
Stat Med ; 42(8): 1188-1206, 2023 04 15.
Article in English | MEDLINE | ID: mdl-36700492

ABSTRACT

When data are available from individual patients receiving either a treatment or a control intervention in a randomized trial, various statistical and machine learning methods can be used to develop models for predicting future outcomes under the two conditions, and thus to predict treatment effect at the patient level. These predictions can subsequently guide personalized treatment choices. Although several methods for validating prediction models are available, little attention has been given to measuring the performance of predictions of personalized treatment effect. In this article, we propose a range of measures that can be used to this end. We start by defining two dimensions of model accuracy for treatment effects, for a single outcome: discrimination for benefit and calibration for benefit. We then amalgamate these two dimensions into an additional concept, decision accuracy, which quantifies the model's ability to identify patients for whom the benefit from treatment exceeds a given threshold. Subsequently, we propose a series of performance measures related to these dimensions and discuss estimating procedures, focusing on randomized data. Our methods are applicable for continuous or binary outcomes, for any type of prediction model, as long as it uses baseline covariates to predict outcomes under treatment and control. We illustrate all methods using two simulated datasets and a real dataset from a trial in depression. We implement all methods in the R package predieval. Results suggest that the proposed measures can be useful in evaluating and comparing the performance of competing models in predicting individualized treatment effect.


Subjects
Statistical Models, Precision Medicine, Randomized Controlled Trials as Topic, Humans, Treatment Outcome, Clinical Decision Rules
6.
Nephrol Dial Transplant ; 36(10): 1837-1850, 2021 09 27.
Article in English | MEDLINE | ID: mdl-33051669

ABSTRACT

BACKGROUND: Accurate risk prediction is needed to provide personalized healthcare for chronic kidney disease (CKD) patients. An overload of prognosis studies is being published, ranging from individual biomarker studies to full prediction studies. We aim to systematically appraise published prognosis studies investigating multiple biomarkers and their role in risk predictions. Our primary objective was to investigate whether the prognostic models reported in the literature were of sufficient quality and to externally validate them. METHODS: We undertook a systematic review and appraised the quality of studies reporting multivariable prognosis models for end-stage renal disease (ESRD), cardiovascular (CV) events and mortality in CKD patients. We subsequently externally validated these models in a randomized trial that included patients from a broad CKD population. RESULTS: We identified 91 papers describing 36 multivariable models for prognosis of ESRD, 50 for CV events, 46 for mortality and 17 for a composite outcome. Most studies were deemed of moderate quality. Moreover, they often adopted different definitions for the primary outcome and rarely reported full model equations (21% of the included studies). External validation was performed in the Multifactorial Approach and Superior Treatment Efficacy in Renal Patients with the Aid of Nurse Practitioners trial (n = 788, with 160 events for ESRD, 79 for CV and 102 for mortality). The 24 models that reported full model equations showed great variability in their performance, although calibration remained fairly adequate for most models, except when predicting mortality (calibration slope >1.5). CONCLUSIONS: This review shows that there is an abundance of multivariable prognosis models for the CKD population. Most studies were considered of moderate quality, and they were reported and analysed in such a manner that their results cannot directly be used in follow-up research or in clinical practice.


Subjects
Chronic Kidney Failure, Chronic Renal Insufficiency, Biomarkers, Humans, Chronic Kidney Failure/diagnosis, Chronic Kidney Failure/therapy, Prognosis, Chronic Renal Insufficiency/diagnosis, Chronic Renal Insufficiency/therapy, Treatment Outcome
7.
Stat Med ; 40(15): 3533-3559, 2021 07 10.
Article in English | MEDLINE | ID: mdl-33948970

ABSTRACT

Prediction models often yield inaccurate predictions for new individuals. Large data sets from pooled studies or electronic healthcare records may alleviate this with an increased sample size and variability in sample characteristics. However, existing strategies for prediction model development generally do not account for heterogeneity in predictor-outcome associations between different settings and populations. This limits the generalizability of developed models (even from large, combined, clustered data sets) and necessitates local revisions. We aim to develop methodology for producing prediction models that require less tailoring to different settings and populations. We adopt internal-external cross-validation to assess and reduce heterogeneity in models' predictive performance during the development. We propose a predictor selection algorithm that optimizes the (weighted) average performance while minimizing its variability across the hold-out clusters (or studies). Predictors are added iteratively until the estimated generalizability is optimized. We illustrate this by developing a model for predicting the risk of atrial fibrillation and updating an existing one for diagnosing deep vein thrombosis, using individual participant data from 20 cohorts (N = 10 873) and 11 diagnostic studies (N = 10 014), respectively. Meta-analysis of calibration and discrimination performance in each hold-out cluster shows that trade-offs between average and heterogeneity of performance occurred. Our methodology enables the assessment of heterogeneity of prediction model performance during model development in multiple or clustered data sets, thereby informing researchers on predictor selection to improve the generalizability to different settings and populations, and reduce the need for model tailoring. Our methodology has been implemented in the R package metamisc.
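The hold-one-cluster-out evaluation that internal-external cross-validation builds on can be sketched as below, using simulated clusters and a plain least-squares model; the paper's iterative predictor-selection algorithm is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated IPD from 5 clusters (studies); cluster-specific intercepts
# induce between-cluster heterogeneity.
clusters = []
for k in range(5):
    n = 500
    x = rng.normal(size=(n, 2))
    y = 1.0 + 0.3 * k + x @ np.array([0.8, -0.5]) + rng.normal(size=n)
    clusters.append((x, y))

def fit_ols(x, y):
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def mse(beta, x, y):
    X = np.column_stack([np.ones(len(x)), x])
    return float(np.mean((y - X @ beta) ** 2))

# Internal-external cross-validation: hold out each cluster in turn,
# fit on the remaining clusters, record hold-out performance.
holdout_mse = []
for k in range(len(clusters)):
    x_tr = np.vstack([c[0] for i, c in enumerate(clusters) if i != k])
    y_tr = np.concatenate([c[1] for i, c in enumerate(clusters) if i != k])
    beta = fit_ols(x_tr, y_tr)
    holdout_mse.append(mse(beta, *clusters[k]))

# Average performance and its between-cluster variability: the two
# quantities a generalizability-oriented selection algorithm trades off.
print(f"mean MSE: {np.mean(holdout_mse):.2f}, SD: {np.std(holdout_mse):.2f}")
```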


Subjects
Research Design, Calibration, Humans
8.
Stat Med ; 40(19): 4230-4251, 2021 08 30.
Article in English | MEDLINE | ID: mdl-34031906

ABSTRACT

In prediction model research, external validation is needed to examine an existing model's performance using data independent of those used for model development. Current external validation studies often suffer from small sample sizes and consequently imprecise predictive performance estimates. To address this, we propose how to determine the minimum sample size needed for a new external validation study of a prediction model for a binary outcome. Our calculations aim to precisely estimate calibration (Observed/Expected and calibration slope), discrimination (C-statistic), and clinical utility (net benefit). For each measure, we propose closed-form and iterative solutions for calculating the minimum sample size required. These require specifying: (i) target SEs (confidence interval widths) for each estimate of interest, (ii) the anticipated outcome event proportion in the validation population, (iii) the prediction model's anticipated (mis)calibration and variance of linear predictor values in the validation population, and (iv) potential risk thresholds for clinical decision-making. The calculations can also be used to inform whether the sample size of an existing (already collected) dataset is adequate for external validation. We illustrate our proposal for external validation of a prediction model for mechanical heart valve failure with an expected outcome event proportion of 0.018. Calculations suggest at least 9835 participants (177 events) are required to precisely estimate the calibration and discrimination measures, with this number driven by the calibration slope criterion, which we anticipate will often be the case. Also, 6443 participants (116 events) are required to precisely estimate net benefit at a risk threshold of 8%. Software code is provided.
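For the O/E criterion alone, a closed-form calculation along these lines is possible. This is a sketch assuming the model is well calibrated on average (O/E near 1) with an illustrative precision target; the calibration-slope criterion that drove the 9835 figure requires the fuller calculations in the paper.

```python
import math

def n_for_oe(phi, target_se_ln_oe):
    """Minimum sample size so that SE(ln(O/E)) <= target, assuming
    O/E is about 1, in which case var(ln(O/E)) is approximately
    (1 - phi) / (n * phi), where phi is the anticipated outcome
    event proportion in the validation population."""
    return math.ceil((1 - phi) / (phi * target_se_ln_oe ** 2))

phi = 0.018                                # event proportion from the abstract's example
n = n_for_oe(phi, target_se_ln_oe=0.10)    # illustrative precision target, not the paper's
print(n, "participants,", math.ceil(n * phi), "events")
```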


Subjects
Statistical Models, Theoretical Models, Calibration, Humans, Prognosis, Sample Size
9.
Stat Med ; 40(13): 3066-3084, 2021 06 15.
Article in English | MEDLINE | ID: mdl-33768582

ABSTRACT

Individual participant data (IPD) from multiple sources allows external validation of a prognostic model across multiple populations. Often this reveals poor calibration, potentially causing poor predictive performance in some populations. However, rather than discarding the model outright, it may be possible to modify the model to improve performance using recalibration techniques. We use IPD meta-analysis to identify the simplest method to achieve good model performance. We examine four options for recalibrating an existing time-to-event model across multiple populations: (i) shifting the baseline hazard by a constant, (ii) re-estimating the shape of the baseline hazard, (iii) adjusting the prognostic index as a whole, and (iv) adjusting individual predictor effects. For each strategy, IPD meta-analysis examines (heterogeneity in) model performance across populations. Additionally, the probability of achieving good performance in a new population can be calculated allowing ranking of recalibration methods. In an applied example, IPD meta-analysis reveals that the existing model had poor calibration in some populations, and large heterogeneity across populations. However, re-estimation of the intercept substantially improved the expected calibration in new populations, and reduced between-population heterogeneity. Comparing recalibration strategies showed that re-estimating both the magnitude and shape of the baseline hazard gave the highest predicted probability of good performance in a new population. In conclusion, IPD meta-analysis allows a prognostic model to be externally validated in multiple settings, and enables recalibration strategies to be compared and ranked to decide on the least aggressive recalibration strategy to achieve acceptable external model performance without discarding existing model information.
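Strategy (i) can be sketched for the simpler binary-outcome case: shift the model intercept until the average predicted risk matches the observed event rate in the new population. This is an illustrative analogue on simulated data, not the paper's time-to-event implementation.

```python
import numpy as np

def recalibrate_intercept(lp, y, tol=1e-10):
    """Find the shift delta so that mean(sigmoid(lp + delta)) equals the
    observed event rate -- the simplest 'update the baseline' repair for
    a model that is systematically mis-calibrated in a new population.
    Uses bisection; mean predicted risk is monotone in delta."""
    lo, hi = -10.0, 10.0
    target = np.mean(y)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        mean_pred = np.mean(1 / (1 + np.exp(-(lp + mid))))
        lo, hi = (mid, hi) if mean_pred < target else (lo, mid)
    return (lo + hi) / 2

rng = np.random.default_rng(2)
lp = rng.normal(-1.0, 1.0, size=5000)                # linear predictors from an existing model
y = rng.binomial(1, 1 / (1 + np.exp(-(lp + 0.7))))   # new population: true risks shifted by +0.7

delta = recalibrate_intercept(lp, y)
print(f"estimated shift: {delta:.2f}")               # should be near the simulated 0.7
```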


Subjects
Data Analysis, Research Design, Calibration, Humans, Meta-Analysis as Topic, Probability, Prognosis
10.
Stat Med ; 40(26): 5961-5981, 2021 11 20.
Article in English | MEDLINE | ID: mdl-34402094

ABSTRACT

Randomized trials typically estimate average relative treatment effects, but decisions on the benefit of a treatment are possibly better informed by more individualized predictions of the absolute treatment effect. In case of a binary outcome, these predictions of absolute individualized treatment effect require knowledge of the individual's risk without treatment and incorporation of a possibly differential treatment effect (ie, varying with patient characteristics). In this article, we lay out the causal structure of individualized treatment effect in terms of potential outcomes and describe the required assumptions that underlie a causal interpretation of its prediction. Subsequently, we describe regression models and model estimation techniques that can be used to move from average to more individualized treatment effect predictions. We focus mainly on logistic regression-based methods that are both well-known and naturally provide the required probabilistic estimates. We incorporate key components from both causal inference and prediction research to arrive at individualized treatment effect predictions. While the separate components are well known, their successful amalgamation is very much an ongoing field of research. We cut the problem down to its essentials in the setting of a randomized trial, discuss the importance of a clear definition of the estimand of interest, provide insight into the required assumptions, and give guidance with respect to modeling and estimation options. Simulated data illustrate the potential of different modeling options across scenarios that vary both average treatment effect and treatment effect heterogeneity. Two applied examples illustrate individualized treatment effect prediction in randomized trial data.
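The prediction step can be sketched with hypothetical, pre-estimated coefficients for a logistic model containing a treatment-covariate interaction; the point is the move from risk under each arm to an absolute individualized benefit.

```python
import math

# Hypothetical coefficients from a logistic model fitted to trial data:
# logit(risk) = b0 + b_x * x + b_t * t + b_xt * x * t
b0, b_x, b_t, b_xt = -1.0, 0.5, -0.8, 0.3

def risk(x, t):
    """Predicted outcome risk for covariate value x under arm t (0 or 1)."""
    lp = b0 + b_x * x + b_t * t + b_xt * x * t
    return 1 / (1 + math.exp(-lp))

# Individualized absolute treatment effect: risk without minus with treatment.
x_patient = 1.0
benefit = risk(x_patient, t=0) - risk(x_patient, t=1)
print(f"risk untreated: {risk(x_patient, 0):.3f}, "
      f"risk treated: {risk(x_patient, 1):.3f}, "
      f"absolute benefit: {benefit:.3f}")
```

Because of the interaction term, the absolute benefit varies with x even though the model is simple, which is exactly the individualization the abstract describes.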


Subjects
Randomized Controlled Trials as Topic, Causality, Humans, Longitudinal Studies
11.
Stat Med ; 39(10): 1440-1457, 2020 05 15.
Article in English | MEDLINE | ID: mdl-32022311

ABSTRACT

Because real-world evidence on drug efficacy involves nonrandomized studies, statistical methods adjusting for confounding are needed. In this context, prognostic score (PGS) analysis has recently been proposed as a method for causal inference. It aims to restore balance across the different treatment groups by identifying subjects with a similar prognosis for a given reference exposure ("control"). This requires the development of a multivariable prognostic model in the control arm of the study sample, which is then extrapolated to the different treatment arms. Unfortunately, large cohorts for developing prognostic models are not always available. Prognostic models are therefore subject to a dilemma between overfitting and parsimony; the latter is prone to a violation of the assumption of no unmeasured confounders when important covariates are ignored. Although it is possible to limit overfitting by using penalization strategies, an alternative approach is to adopt evidence synthesis. Aggregating previously published prognostic models may improve the generalizability of PGS, while accounting for a large set of covariates, even when limited individual participant data are available. In this article, we extend a method for prediction model aggregation to PGS analysis in nonrandomized studies. We conduct extensive simulations to assess the validity of model aggregation, compared with other methods of PGS analysis, for estimating marginal treatment effects. We show that aggregating existing PGS into a "meta-score" is robust to misspecification, even when elementary scores wrongly omit confounders or focus on different outcomes. We illustrate our methods in a setting of treatments for asthma.


Subjects
Statistical Models, Causality, Humans, Prognosis
12.
Stat Med ; 39(25): 3591-3607, 2020 11 10.
Article in English | MEDLINE | ID: mdl-32687233

ABSTRACT

Missing data present challenges for development and real-world application of clinical prediction models. While these challenges have received considerable attention in the development setting, there is only sparse research on the handling of missing data in applied settings. The main unique feature of handling missing data in these settings is that missing data methods have to be performed for a single new individual, precluding direct application of mainstay methods used during model development. Correspondingly, we propose that it is desirable to perform model validation using missing data methods that transfer to practice in single new patients. This article compares existing and new methods to account for missing data for a new individual in the context of prediction. These methods are based on (i) submodels based on observed data only, (ii) marginalization over the missing variables, or (iii) imputation based on fully conditional specification (also known as chained equations). They were compared in an internal validation setting to highlight the use of missing data methods that transfer to practice while validating a model. As a reference, they were compared to the use of multiple imputation by chained equations in a set of test patients, because this has been used in validation studies in the past. The methods were evaluated in a simulation study where performance was measured by means of optimism corrected C-statistic and mean squared prediction error. Furthermore, they were applied in data from a large Dutch cohort of prophylactic implantable cardioverter defibrillator patients.
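Approach (i), submodels based on observed data only, can be sketched as follows. The coefficient sets are hypothetical; in practice one submodel would be fitted per anticipated missingness pattern during development.

```python
import math

# Hypothetical pattern submodels: one logistic model per missingness
# pattern, each fitted (at development time) using only the predictors
# that pattern observes.  Keys are the sets of observed variables.
submodels = {
    frozenset({"age", "biomarker"}): {"intercept": -2.0, "age": 0.04, "biomarker": 0.8},
    frozenset({"age"}):              {"intercept": -1.6, "age": 0.05},
}

def predict(patient):
    """Pick the submodel matching the variables observed for this single
    new patient -- no imputation model is needed at prediction time,
    which is what makes the method transfer to practice."""
    observed = frozenset(k for k, v in patient.items() if v is not None)
    coefs = submodels[observed]
    lp = coefs["intercept"] + sum(coefs[k] * patient[k] for k in observed)
    return 1 / (1 + math.exp(-lp))

complete = {"age": 60, "biomarker": 1.2}
missing  = {"age": 60, "biomarker": None}   # biomarker unavailable at prediction time
print(predict(complete), predict(missing))
```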


Subjects
Computer Simulation, Cohort Studies, Humans
13.
Stat Med ; 39(15): 2115-2137, 2020 07 10.
Article in English | MEDLINE | ID: mdl-32350891

ABSTRACT

Precision medicine research often searches for treatment-covariate interactions, that is, when a treatment effect (eg, measured as a mean difference, odds ratio, hazard ratio) changes across values of a participant-level covariate (eg, age, gender, biomarker). Single trials do not usually have sufficient power to detect genuine treatment-covariate interactions, which motivates the sharing of individual participant data (IPD) from multiple trials for meta-analysis. Here, we provide statistical recommendations for conducting and planning an IPD meta-analysis of randomized trials to examine treatment-covariate interactions. For conduct, two-stage and one-stage statistical models are described, and we recommend: (i) interactions should be estimated directly, and not by calculating differences in meta-analysis results for subgroups; (ii) interaction estimates should be based solely on within-study information; (iii) continuous covariates and outcomes should be analyzed on their continuous scale; (iv) nonlinear relationships should be examined for continuous covariates, using a multivariate meta-analysis of the trend (eg, using restricted cubic spline functions); and (v) translation of interactions into clinical practice is nontrivial, requiring individualized treatment effect prediction. For planning, we describe first why the decision to initiate an IPD meta-analysis project should not be based on between-study heterogeneity in the overall treatment effect; and second, how to calculate the power of a potential IPD meta-analysis project in advance of IPD collection, conditional on characteristics (eg, number of participants, standard deviation of covariates) of the trials (potentially) promising their IPD. Real IPD meta-analysis projects are used for illustration throughout.
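Recommendations (i) and (ii) can be sketched in a two-stage analysis: estimate the interaction within each trial with the covariate centred per trial, then pool by inverse variance. Simulated data; a sketch of the principle, not the paper's full toolkit.

```python
import numpy as np

rng = np.random.default_rng(3)

def trial_interaction(x, t, y):
    """Within-trial interaction estimate and its variance from OLS of
    y ~ t + xc + t:xc, with xc the covariate centred within the trial
    so that only within-study information contributes."""
    xc = x - x.mean()
    X = np.column_stack([np.ones_like(x), t, xc, t * xc])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[3], cov[3, 3]

# Simulated IPD: 5 trials, true interaction 0.5; trial means of x differ,
# and these across-trial differences must NOT leak into the estimate.
ests, variances = [], []
for k in range(5):
    n = 2000
    x = rng.normal(loc=k, size=n)          # covariate mean shifts by trial
    t = rng.binomial(1, 0.5, size=n)
    y = 0.2 * t + 0.1 * x + 0.5 * t * (x - k) + rng.normal(size=n)
    b, v = trial_interaction(x, t, y)
    ests.append(b)
    variances.append(v)

# Second stage: fixed-effect inverse-variance pooling of the interactions.
w = 1 / np.array(variances)
pooled = float(np.sum(w * ests) / np.sum(w))
print(f"pooled interaction: {pooled:.3f}")   # true simulated value is 0.5
```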


Subjects
Data Analysis, Statistical Models, Humans, Meta-Analysis as Topic, Proportional Hazards Models
14.
BMC Med ; 17(1): 109, 2019 06 13.
Article in English | MEDLINE | ID: mdl-31189462

ABSTRACT

BACKGROUND: The Framingham risk models and pooled cohort equations (PCE) are widely used and advocated in guidelines for predicting 10-year risk of developing coronary heart disease (CHD) and cardiovascular disease (CVD) in the general population. Over the past few decades, these models have been extensively validated within different populations, which provided mounting evidence that local tailoring is often necessary to obtain accurate predictions. The objective was to systematically review and summarize the predictive performance of three widely advocated cardiovascular risk prediction models (Framingham Wilson 1998, Framingham ATP III 2002 and PCE 2013) in men and women separately, to assess the generalizability of performance across different subgroups and geographical regions, and to determine sources of heterogeneity in the findings across studies. METHODS: A search was performed in October 2017 to identify studies investigating the predictive performance of the aforementioned models. Studies were included if they externally validated one or more of the original models in the general population for the same outcome as the original model. We assessed risk of bias for each validation and extracted data on population characteristics and model performance. Performance estimates (observed versus expected (OE) ratio and c-statistic) were summarized using random-effects models, and sources of heterogeneity were explored with meta-regression. RESULTS: The search identified 1585 studies, of which 38 were included, describing a total of 112 external validations. Results indicate that, on average, all models overestimate the 10-year risk of CHD and CVD (pooled OE ratio ranged from 0.58 (95% CI 0.43-0.73; Wilson men) to 0.79 (95% CI 0.60-0.97; ATP III women)). Overestimation was most pronounced for high-risk individuals and European populations. Further, discriminative performance was better in women for all models. There was considerable heterogeneity in the c-statistic between studies, likely due to differences in population characteristics. CONCLUSIONS: The Framingham Wilson, ATP III and PCE discriminate comparably well, but all overestimate the risk of developing CVD, especially in higher-risk populations. Because the extent of miscalibration varied substantially across settings, we highly recommend that researchers further explore reasons for overprediction and that the models be updated for specific populations.
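The pooling of performance estimates described here can be sketched with DerSimonian-Laird random-effects meta-analysis of O/E ratios on the log scale; the O/E values below are hypothetical, not those of the review.

```python
import math

def pool_oe(oe, se_ln_oe):
    """DerSimonian-Laird random-effects pooling of O/E ratios on the
    log scale; returns the pooled O/E and the between-study tau^2."""
    y = [math.log(r) for r in oe]
    w = [1 / s**2 for s in se_ln_oe]
    # Fixed-effect mean and Cochran's Q for the heterogeneity estimate.
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1 / (s**2 + tau2) for s in se_ln_oe]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    return math.exp(pooled), tau2

# Hypothetical validations: O/E < 1 means the model over-predicts risk.
oe       = [0.55, 0.70, 0.62, 0.90]
se_ln_oe = [0.10, 0.08, 0.12, 0.15]
pooled, tau2 = pool_oe(oe, se_ln_oe)
print(f"pooled O/E: {pooled:.2f}, tau^2: {tau2:.3f}")
```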


Subjects
Cardiovascular Diseases/diagnosis, Theoretical Models, Aged, Cardiovascular Diseases/epidemiology, Cohort Studies, Female, Humans, Male, Predictive Value of Tests, Prognosis, Risk Assessment/methods, Risk Factors
15.
Stat Med ; 38(11): 2013-2029, 2019 05 20.
Article in English | MEDLINE | ID: mdl-30652333

ABSTRACT

In nonrandomised studies, inferring causal effects requires appropriate methods for addressing confounding bias. Although it is common to adopt propensity score analysis for this purpose, prognostic score analysis has recently been proposed as an alternative strategy. While both approaches were originally introduced to estimate causal effects for binary interventions, the theory of the propensity score has since been extended to the case of general treatment regimes. Indeed, many treatments are not assigned in a binary fashion and require a certain extent of dosing. Hence, researchers may often be interested in estimating treatment effects across multiple exposures. To the best of our knowledge, prognostic score analysis has not yet been generalised to this case. In this article, we describe the theory of prognostic scores for causal inference with general treatment regimes. Our methods can be applied to compare multiple treatments using nonrandomised data, a topic of great relevance in contemporary evaluations of clinical interventions. We propose estimators for the average treatment effects in different populations of interest, the validity of which is assessed through a series of simulations. Finally, we present an illustrative case in which we estimate the effect of the delay to aspirin administration on a composite outcome of death or dependence at 6 months in stroke patients.


Subjects
Prognosis, Treatment Outcome, Algorithms, Humans, Statistical Models, Propensity Score
16.
Stat Med ; 38(22): 4290-4309, 2019 09 30.
Article in English | MEDLINE | ID: mdl-31373722

ABSTRACT

Clinical prediction models aim to provide estimates of absolute risk for a diagnostic or prognostic endpoint. Such models may be derived from data from various studies in the context of a meta-analysis. We describe and propose approaches for assessing heterogeneity in predictor effects and predictions arising from models based on data from different sources. These methods are illustrated in a case study with patients suffering from traumatic brain injury, where we aim to predict 6-month mortality based on individual patient data using meta-analytic techniques (15 studies, n = 11 022 patients). The insights into various aspects of heterogeneity are important to develop better models and understand problems with the transportability of absolute risk predictions.


Subjects
Meta-Analysis as Topic, Statistical Models, Probability, Risk Assessment/methods, Computer Simulation, Humans
17.
BMC Med Res Methodol ; 19(1): 183, 2019 09 02.
Article in English | MEDLINE | ID: mdl-31477023

ABSTRACT

BACKGROUND: Individual participant data meta-analysis (IPD-MA) is considered the gold standard for investigating subgroup effects. Frequently used regression-based approaches to detect subgroups in IPD-MA are: meta-regression, per-subgroup meta-analysis (PS-MA), meta-analysis of interaction terms (MA-IT), naive one-stage IPD-MA (ignoring potential study-level confounding), and centred one-stage IPD-MA (accounting for potential study-level confounding). Clear guidance on these analyses is lacking, and clinical researchers may use approaches with suboptimal efficiency to investigate subgroup effects in an IPD setting. Therefore, our aim is to review and compare the aforementioned methods and provide recommendations on which should be preferred. METHODS: We conducted a simulation study in which we generated IPD of randomised trials and varied the magnitude of the subgroup effect (0, 25, 50% relative reduction), between-study treatment effect heterogeneity (none, medium, large), ecological bias (none, quantitative, qualitative), sample size (50, 100, 200), and number of trials (5, 10) for binary, continuous and time-to-event outcomes. For each scenario, we assessed the power, false positive rate (FPR) and bias of the aforementioned five approaches. RESULTS: Naive and centred IPD-MA yielded the highest power while preserving an acceptable FPR around the nominal 5% in all scenarios. Centred IPD-MA showed slightly less biased estimates than naive IPD-MA. Similar results were obtained for MA-IT, except when analysing binary outcomes (where it yielded less power and FPR < 5%). PS-MA showed similar power to MA-IT in non-heterogeneous scenarios, but its power collapsed as heterogeneity increased and decreased even further in the presence of ecological bias. PS-MA suffered from inflated FPRs in non-heterogeneous settings and showed biased estimates in all scenarios. Meta-regression showed poor power (< 20%) in all scenarios and completely biased results in settings with qualitative ecological bias. CONCLUSIONS: Our results indicate that subgroup detection in IPD-MA requires careful modelling. Naive and centred IPD-MA performed equally well, but because the centred approach yields less biased estimates in the presence of ecological bias, we recommend the latter.
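The centred one-stage approach recommended above amounts to centring the effect modifier on its trial mean, so that the treatment-covariate interaction uses only within-trial information and is shielded from ecological (across-trial) bias. A minimal numpy sketch with a continuous outcome; the simulated data, effect sizes and variable names are invented for illustration and do not come from the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate IPD from 5 trials with a true within-trial subgroup (interaction) effect.
n_trials, n_per = 5, 200
trial = np.repeat(np.arange(n_trials), n_per)
x = rng.normal(size=n_trials * n_per) + 0.5 * trial   # covariate with trial-level drift
treat = rng.integers(0, 2, size=n_trials * n_per)
eps = rng.normal(size=n_trials * n_per)
y = 1.0 * treat + 0.5 * x + 0.8 * treat * x + eps     # true interaction = 0.8

# Centre the covariate on its trial mean: the treatment-by-(x - mean) interaction
# then uses only within-trial information, guarding against ecological bias.
trial_mean = np.array([x[trial == t].mean() for t in range(n_trials)])
x_c = x - trial_mean[trial]

# One-stage model: intercept, trial dummies, treatment, centred covariate, interaction.
dummies = (trial[:, None] == np.arange(1, n_trials)).astype(float)
X = np.column_stack([np.ones_like(x), dummies, treat, x_c, treat * x_c])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(round(beta[-1], 2))   # estimated within-trial subgroup effect, close to 0.8
```

A naive one-stage model would use the uncentred covariate in the interaction term, mixing within- and across-trial information.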


Subjects
Algorithms, Biometry/methods, Meta-Analysis as Topic, Models, Statistical, Computer Simulation, Humans, Regression Analysis
18.
Nephrol Dial Transplant ; 33(7): 1259-1268, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29462353

ABSTRACT

Background: Delayed graft function (DGF) is a common complication after kidney transplantation in the Netherlands, where brain-death and circulatory-death donor kidneys are now accepted in equal numbers. To identify cases at increased risk of developing DGF, various multivariable algorithms have been proposed. The objective was to validate the reproducibility of four predictive algorithms, by Irish et al. (A risk prediction model for delayed graft function in the current era of deceased donor renal transplantation. Am J Transplant 2010;10:2279-2286) (USA), Jeldres et al. (Prediction of delayed graft function after renal transplantation. Can Urol Assoc J 2009;3:377-382) (Canada), Chapal et al. (A useful scoring system for the prediction and management of delayed graft function following kidney transplantation from cadaveric donors. Kidney Int 2014;86:1130-1139) (France) and Zaza et al. (Predictive model for delayed graft function based on easily available pre-renal transplant variables. Intern Emerg Med 2015;10:135-141) (Italy), according to a novel framework for external validation. Methods: We conducted a prospective observational study with data from the Dutch Organ Transplantation Registry (NOTR). Renal transplant recipients from all eight Dutch academic medical centres between 2002 and 2012 who received a deceased-donor allograft were included (N = 3333). The four prediction algorithms were reconstructed from donor, recipient and transplantation data. Their predictive value for DGF was validated with c-statistics, calibration statistics and net benefit analysis. Case-mix (un)relatedness was investigated with a membership model and the mean and standard deviation of the linear predictor. Results: The prevalence of DGF was 37%. Despite a significantly different case-mix, the US algorithm by Irish was the most reproducible, with a c-index of 0.761 (range 0.756-0.762), and was well calibrated over the complete range of predicted probabilities of DGF. The US model had a net benefit of 0.242 at a threshold probability of 0.25, compared with a net benefit of 0.089 at the same threshold in the original study, equivalent to correctly identifying DGF in 24 cases per 100 patients (true positives) without an increase in the number of false positives. Conclusions: The US model by Irish et al. was generalizable and best transportable to Dutch recipients of a deceased-donor kidney. The algorithm detects an increased risk of DGF after allocation and enables us to improve individual patient management.
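The net-benefit figures quoted in the abstract come from decision-curve analysis, where net benefit at threshold probability p_t is TP/n - (FP/n) x p_t/(1 - p_t). A small illustrative implementation on toy data (not the NOTR cohort):

```python
import numpy as np

def net_benefit(y_true, p_hat, threshold):
    """Net benefit of treating patients whose predicted risk exceeds `threshold`:
    TP/n - (FP/n) * threshold / (1 - threshold)."""
    y_true = np.asarray(y_true)
    flagged = np.asarray(p_hat) >= threshold
    n = y_true.size
    tp = np.sum(flagged & (y_true == 1))
    fp = np.sum(flagged & (y_true == 0))
    return tp / n - (fp / n) * threshold / (1 - threshold)

# Toy check at the 0.25 threshold with 37% prevalence (as in the study):
y = np.array([1] * 37 + [0] * 63)
p = y.astype(float)                  # a hypothetical perfect predictor
print(net_benefit(y, p, 0.25))       # -> 0.37: all 37 cases flagged, no false positives
```

A net benefit of 0.242 at p_t = 0.25 is read the same way: the model is worth 24.2 true positives per 100 patients, net of the harm of false positives at that threshold.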


Subjects
Delayed Graft Function/etiology, Kidney Transplantation/adverse effects, Models, Statistical, Registries/statistics & numerical data, Tissue Donors, Adolescent, Adult, Aged, Delayed Graft Function/epidemiology, Female, Graft Survival, Humans, Male, Middle Aged, Netherlands/epidemiology, Prospective Studies, Time Factors, Transplantation, Homologous, Young Adult
19.
Stat Med ; 36(28): 4529-4539, 2017 Dec 10.
Article in English | MEDLINE | ID: mdl-27891652

ABSTRACT

Prediction models fitted with logistic regression often show poor performance when applied in populations other than the development population. Model updating may improve predictions. Previously suggested methods vary in how extensively they update the model. We aim to define a strategy for selecting an appropriate update method that balances the amount of evidence for updating in the new patient sample against the danger of overfitting. We consider three update methods: recalibration in the large (re-estimation of the model intercept), recalibration (re-estimation of intercept and slope) and model revision (re-estimation of all coefficients). We propose a closed testing procedure that allows the extensiveness of the updating to increase progressively from a minimum (the original model) to a maximum (a completely revised model). The procedure involves multiple testing while approximately maintaining the chosen type I error rate. We illustrate this approach with three clinical examples: patients with prostate cancer, traumatic brain injury and children presenting with fever. The need for updating the prostate cancer model was driven entirely by a different model intercept in the update sample (adjustment: 2.58). Separate testing of model revision against the original model showed statistically significant results but led to overfitting (calibration slope at internal validation = 0.86). The closed testing procedure selected recalibration in the large as the update method, without overfitting. The advantage of the closed testing procedure was confirmed by the other two examples. We conclude that the proposed closed testing procedure may be useful for selecting appropriate update methods for previously developed prediction models. Copyright © 2016 John Wiley & Sons, Ltd.
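Because the update methods are nested (each re-estimates strictly more parameters than the last), the closed test can be built from likelihood-ratio comparisons. A simplified, hypothetical sketch on a simulated update sample; model revision is omitted here because it would require the individual covariates, and the data and numbers are invented for illustration:

```python
import numpy as np

def fit_logistic(X, y, offset):
    """ML logistic fit with a fixed offset term; returns the maximised log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(25):                               # Newton-Raphson iterations
        eta = offset + X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    eta = offset + X @ beta
    return y @ eta - np.sum(np.logaddexp(0, eta))

rng = np.random.default_rng(1)
n = 500
lp = rng.normal(size=n)                               # original model's linear predictor
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(lp + 1.0))))  # update sample: intercept shift only

ones = np.ones((n, 1))
ll0 = y @ lp - np.sum(np.logaddexp(0, lp))            # original model (nothing re-estimated)
ll1 = fit_logistic(ones, y, offset=lp)                # recalibration in the large
ll2 = fit_logistic(np.column_stack([ones, lp]), y, offset=np.zeros(n))  # recalibration

CRIT = {1: 3.841, 2: 5.991}                           # chi-square 95% critical values
if 2 * (ll2 - ll0) <= CRIT[2]:                        # global test: any updating needed?
    selected = "original"
elif 2 * (ll2 - ll1) > CRIT[1]:                       # is re-estimating the slope needed too?
    selected = "recalibration"
else:
    selected = "recalibration in the large"
print(selected)
```

With an intercept-only miscalibration, the procedure stops at recalibration in the large rather than escalating to a full re-fit, which is the overfitting-avoidance behaviour the abstract describes.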


Subjects
Biometry/methods, Logistic Models, Risk Assessment/methods, Brain Injuries/epidemiology, Child, Child, Preschool, Computer Simulation, Female, Fever/epidemiology, Humans, Infant, Male, Middle Aged, Probability, Prostatic Neoplasms/epidemiology, Regression Analysis
20.
Stat Med ; 36(8): 1210-1226, 2017 04 15.
Article in English | MEDLINE | ID: mdl-28083901

ABSTRACT

Non-randomized studies aim to reveal whether or not interventions are effective in real-life clinical practice, and there is a growing interest in including such evidence in the decision-making process. We evaluate existing methodologies and present new approaches to using non-randomized evidence in a network meta-analysis of randomized controlled trials (RCTs) when the aim is to assess relative treatment effects. We first discuss how to assess compatibility between the two types of evidence. We then present and compare an array of alternative methods that allow the inclusion of non-randomized studies in a network meta-analysis of RCTs: the naïve data synthesis, the design-adjusted synthesis, the use of non-randomized evidence as prior information and the use of three-level hierarchical models. We apply some of the methods in two previously published clinical examples comparing percutaneous interventions for the treatment of coronary in-stent restenosis and antipsychotics in patients with schizophrenia. We discuss in depth the advantages and limitations of each method, and we conclude that the inclusion of real-world evidence from non-randomized studies has the potential to corroborate findings from RCTs, increase precision and enhance the decision-making process. Copyright © 2017 John Wiley & Sons, Ltd.
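Of the approaches listed, the design-adjusted synthesis has a particularly compact pairwise form: inflate the variances of the non-randomized studies before inverse-variance pooling, so they are down-weighted relative to the RCTs. A hypothetical fixed-effect sketch, not the paper's network model; the effect sizes and the down-weighting factor alpha are invented for illustration:

```python
import numpy as np

def pooled_effect(theta, se, randomized, alpha=1.0):
    """Fixed-effect inverse-variance pooling with design adjustment: variances of
    non-randomized studies are divided by alpha (0 < alpha <= 1), i.e. inflated,
    which down-weights them relative to RCTs. alpha = 1 is the naive synthesis."""
    var = np.asarray(se, dtype=float) ** 2
    var = np.where(randomized, var, var / alpha)
    w = 1.0 / var
    return np.sum(w * theta) / np.sum(w), np.sqrt(1.0 / np.sum(w))

# Three RCTs and two precise non-randomized studies (hypothetical log odds ratios).
theta = np.array([-0.30, -0.10, -0.25, -0.60, -0.55])
se = np.array([0.15, 0.20, 0.18, 0.10, 0.12])
rct = np.array([True, True, True, False, False])

naive, _ = pooled_effect(theta, se, rct, alpha=1.0)     # non-randomized at face value
adjusted, _ = pooled_effect(theta, se, rct, alpha=0.3)  # their weight reduced by 70%
print(round(naive, 3), round(adjusted, 3))
```

Because the non-randomized studies here are the most precise, the naive pooled estimate is pulled toward them; the design-adjusted estimate sits closer to the RCT evidence, which is the trade-off the abstract discusses.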


Subjects
Clinical Trials as Topic, Data Interpretation, Statistical, Network Meta-Analysis, Randomized Controlled Trials as Topic, Clinical Trials as Topic/statistics & numerical data, Humans, Models, Statistical, Randomized Controlled Trials as Topic/statistics & numerical data, Statistics as Topic, Treatment Outcome