ABSTRACT
BACKGROUND: Despite many systematic reviews and meta-analyses examining the associations of pregnancy complications with risk of type 2 diabetes mellitus (T2DM) and hypertension, previous umbrella reviews have only examined a single pregnancy complication. Here we have synthesised evidence from systematic reviews and meta-analyses on the associations of a wide range of pregnancy-related complications with risk of developing T2DM and hypertension. METHODS: Medline, Embase and the Cochrane Database of Systematic Reviews were searched from inception until 26 September 2022 for systematic reviews and meta-analyses examining the association between pregnancy complications and risk of T2DM and hypertension. Screening of articles, data extraction and quality appraisal (AMSTAR2) were conducted independently by two reviewers using Covidence software. Data were extracted for studies that examined the risk of T2DM and hypertension in pregnant women with the pregnancy complication compared to pregnant women without the pregnancy complication. Summary estimates of each review were presented using tables, forest plots and narrative synthesis and reported following Preferred Reporting Items for Overviews of Reviews (PRIOR) guidelines. RESULTS: Ten systematic reviews were included. Two pregnancy complications were identified. Gestational diabetes mellitus (GDM): One review showed GDM was associated with a 10-fold higher risk of T2DM at least 1 year after pregnancy (relative risk (RR) 9.51 (95% confidence interval (CI) 7.14 to 12.67)), and although the association differed by ethnicity (white: RR 16.28 (95% CI 15.01 to 17.66), non-white: RR 10.38 (95% CI 4.61 to 23.39), mixed: RR 8.31 (95% CI 5.44 to 12.69)), the differences between subgroups were not statistically significant at the 5% significance level. Another review showed GDM was associated with higher mean blood pressure at least 3 months postpartum (mean difference in systolic blood pressure: 2.57 (95% CI 1.74 to 3.40) mmHg and mean difference in diastolic blood pressure: 1.89 (95% CI 1.32 to 2.46) mmHg). Hypertensive disorders of pregnancy (HDP): Three reviews showed women with a history of HDP were 3 to 6 times more likely to develop hypertension at least 6 weeks after pregnancy compared to women without HDP (meta-analysis with the largest number of studies: odds ratio (OR) 4.33 (3.51 to 5.33)), and one review reported a higher rate of T2DM after HDP (hazard ratio (HR) 2.24 (1.95 to 2.58)) at least a year after pregnancy. One of the three reviews and five other reviews reported women with a history of preeclampsia were 3 to 7 times more likely to develop hypertension at least 6 weeks postpartum (meta-analysis with the largest number of studies: OR 3.90 (3.16 to 4.82)), with one of these reviews reporting that the association was greatest in women from Asia (Asia: OR 7.54 (95% CI 2.49 to 22.81), Europe: OR 2.19 (95% CI 0.30 to 16.02), North and South America: OR 3.32 (95% CI 1.26 to 8.74)). CONCLUSIONS: GDM and HDP are associated with a greater risk of developing T2DM and hypertension. Common confounders adjusted for across the included studies in the reviews were maternal age, pre-pregnancy and current body mass index (BMI), socioeconomic status, smoking status, parity, family history of T2DM or cardiovascular disease, ethnicity, and time of delivery. Further research is needed to evaluate the value of embedding these pregnancy complications as part of assessment for future risk of T2DM and chronic hypertension.
Subjects
Type 2 Diabetes Mellitus, Gestational Diabetes, Hypertension, Pre-Eclampsia, Female, Humans, Pregnancy, Type 2 Diabetes Mellitus/complications, Type 2 Diabetes Mellitus/epidemiology, Gestational Diabetes/prevention & control, Hypertension/complications, Hypertension/epidemiology, Parity, Systematic Reviews as Topic, Meta-Analysis as Topic
ABSTRACT
PURPOSE: To develop and validate prediction models for the risk of future work absence and level of presenteeism in adults with musculoskeletal disorders (MSD) seeking primary healthcare. METHODS: Six studies from the West Midlands/Northwest regions of England, recruiting adults consulting primary care with MSD, were included for model development and internal-external cross-validation (IECV). The primary outcome was any work absence within 6 months of their consultation. Secondary outcomes included 6-month presenteeism and 12-month work absence. Ten candidate predictors were included: age; sex; multisite pain; baseline pain score; pain duration; job type; anxiety/depression; comorbidities; absence in the previous 6 months; and baseline presenteeism. RESULTS: For the 6-month absence model, 2179 participants (215 absences) were available across five studies. Calibration was promising, although it varied across studies, with a pooled calibration slope of 0.93 (95% CI: 0.41-1.46) on IECV. On average, the model discriminated well between those with work absence within 6 months and those without (IECV-pooled C-statistic 0.76, 95% CI: 0.66-0.86). The 6-month presenteeism model, while well calibrated on average, showed some individual-level variation in predictive accuracy, and the 12-month absence model was poorly calibrated due to the small sample size available for model development. CONCLUSIONS: The developed models predict 6-month work absence and presenteeism with reasonable accuracy, on average, in adults consulting with MSD. The model to predict 12-month absence was poorly calibrated and is not yet ready for use in practice. This information may support shared decision-making and targeting occupational health interventions at those with a higher risk of absence or presenteeism in the 6 months following consultation. Further external validation is needed before the models' use can be recommended or their impact on patients can be fully assessed.
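The internal-external cross-validation procedure described above is straightforward to sketch in code. The following is a minimal illustration, not the study's analysis code: it assumes a pooled data frame with a study identifier, the binary 6-month absence outcome, and a few hypothetically named predictors, and it computes the held-out C-statistic and calibration slope that would then be pooled in a random-effects meta-analysis.

```python
# Internal-external cross-validation (IECV) sketch -- illustrative only, not the
# study code. Assumes a pooled DataFrame `ipd` with a 'study' column, a binary
# 'absence' outcome, and the (hypothetically named) predictor columns below.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

PREDICTORS = ["age", "pain_score", "absence_prev_6m"]  # placeholder subset of the ten

def iecv(ipd: pd.DataFrame) -> pd.DataFrame:
    rows = []
    for held_out in ipd["study"].unique():
        train = ipd[ipd["study"] != held_out]
        test = ipd[ipd["study"] == held_out]
        fit = sm.Logit(train["absence"], sm.add_constant(train[PREDICTORS])).fit(disp=0)
        # Linear predictor (log-odds) for the held-out study
        lp = np.dot(sm.add_constant(test[PREDICTORS]), fit.params)
        c_stat = roc_auc_score(test["absence"], lp)
        # Calibration slope: coefficient of the linear predictor in the held-out study
        slope = sm.Logit(test["absence"], sm.add_constant(lp)).fit(disp=0).params[1]
        rows.append({"held_out_study": held_out,
                     "c_statistic": c_stat,
                     "calibration_slope": slope})
    # Study-specific estimates would then be pooled with a random-effects meta-analysis.
    return pd.DataFrame(rows)
```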
ABSTRACT
BACKGROUND: Relapse and recurrence of depression are common, contributing to the overall burden of depression globally. Accurate prediction of relapse or recurrence while patients are well would allow the identification of high-risk individuals and may effectively guide the allocation of interventions to prevent relapse and recurrence. AIMS: To review prognostic models developed to predict the risk of relapse, recurrence, sustained remission, or recovery in adults with remitted major depressive disorder. METHOD: We searched the Cochrane Library (current issue); Ovid MEDLINE (1946 onwards); Ovid Embase (1980 onwards); Ovid PsycINFO (1806 onwards); and Web of Science (1900 onwards) up to May 2021. We included development and external validation studies of multivariable prognostic models. We assessed risk of bias of included studies using the Prediction model risk of bias assessment tool (PROBAST). RESULTS: We identified 12 eligible prognostic model studies (11 unique prognostic models): 8 model development-only studies, 3 model development and external validation studies, and 1 external validation-only study. Multiple estimates of performance measures were not available, so meta-analysis was not possible. Eleven out of the 12 included studies were assessed as being at high overall risk of bias and none examined clinical utility. CONCLUSIONS: Due to the high risk of bias of the included studies, poor predictive performance and limited external validation of the models identified, presently available clinical prediction models for relapse and recurrence of depression are not yet sufficiently developed for deployment in clinical settings. There is a need for improved prognosis research in this clinical area and future studies should conform to best practice methodological and reporting guidelines.
Subjects
Major Depressive Disorder, Adult, Chronic Disease, Depression, Major Depressive Disorder/diagnosis, Humans, Prognosis, Recurrence
ABSTRACT
Previous articles in Statistics in Medicine describe how to calculate the sample size required for external validation of prediction models with continuous and binary outcomes. The minimum sample size criteria aim to ensure precise estimation of key measures of a model's predictive performance, including measures of calibration, discrimination, and net benefit. Here, we extend the sample size guidance to prediction models with a time-to-event (survival) outcome, to cover external validation in datasets containing censoring. A simulation-based framework is proposed, which calculates the sample size required to target a particular confidence interval width for the calibration slope measuring the agreement between predicted risks (from the model) and observed risks (derived using pseudo-observations to account for censoring) on the log cumulative hazard scale. The precision of calibration curves, discrimination, and net benefit estimates can also be checked within this framework. The process requires assumptions about the validation population in terms of the (i) distribution of the model's linear predictor and (ii) event and censoring distributions. Existing information can inform this; in particular, the linear predictor distribution can be approximated using the C-index or Royston's D statistic from the model development article, together with the overall event risk. We demonstrate how the approach can be used to calculate the sample size required to validate a prediction model for recurrent venous thromboembolism. Ideally the sample size should ensure precise calibration across the entire range of predicted risks, but must at least ensure adequate precision in regions important for clinical decision-making. Stata and R code are provided.
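A rough flavour of this simulation-based framework can be given in code. The sketch below is not the article's implementation (which works with pseudo-observations on the log cumulative hazard scale); instead it estimates the calibration slope as the coefficient of the model's linear predictor in a refitted Cox model and records the expected 95% confidence interval width at candidate sample sizes. The linear predictor, event, and censoring distributions are assumed values for illustration only.

```python
# Simulation sketch: expected CI width for the calibration slope at sample size n.
# A simplified Cox-based stand-in, not the article's pseudo-observation approach.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2024)

def expected_ci_width(n, n_sim=200, lp_sd=0.8, base_rate=0.1, cens_rate=0.05):
    """Average 95% CI width for the calibration slope over simulated validation sets."""
    widths = []
    for _ in range(n_sim):
        lp = rng.normal(0.0, lp_sd, n)                       # assumed linear predictor
        t_event = rng.exponential(1.0 / (base_rate * np.exp(lp)))  # assumed event times
        t_cens = rng.exponential(1.0 / cens_rate, n)         # assumed censoring times
        time = np.minimum(t_event, t_cens)
        event = (t_event <= t_cens).astype(int)
        df = pd.DataFrame({"time": time, "event": event, "lp": lp})
        cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
        widths.append(2 * 1.96 * cph.standard_errors_["lp"])
    return float(np.mean(widths))

# Increase n until the expected CI width meets the chosen target (e.g. 0.2).
for n in (500, 1000, 2000):
    print(n, round(expected_ci_width(n), 3))
```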
Subjects
Statistical Models, Calibration, Computer Simulation, Humans, Prognosis, Sample Size
ABSTRACT
Clinical prediction models (CPMs) can predict clinically relevant outcomes or events. Typically, prognostic CPMs are derived to predict the risk of a single future outcome. However, there are many medical applications where two or more outcomes are of interest, meaning this should be more widely reflected in CPMs so they can accurately estimate the joint risk of multiple outcomes simultaneously. A potentially naïve approach to multi-outcome risk prediction is to derive a CPM for each outcome separately, then multiply the predicted risks. This approach is only valid if the outcomes are conditionally independent given the covariates, and it fails to exploit the potential relationships between the outcomes. This paper outlines several approaches that could be used to develop CPMs for multiple binary outcomes. We consider four methods, ranging in complexity and conditional independence assumptions: namely, probabilistic classifier chains, multinomial logistic regression, multivariate logistic regression, and a Bayesian probit model. These are compared with methods that rely on conditional independence: separate univariate CPMs and stacked regression. Employing a simulation study and real-world example, we illustrate that CPMs for joint risk prediction of multiple outcomes should only be derived using methods that model the residual correlation between outcomes. In such a situation, our results suggest that probabilistic classifier chains, multinomial logistic regression or the Bayesian probit model are all appropriate choices. We call into question the development of CPMs for each outcome in isolation when multiple correlated or structurally related outcomes are of interest and recommend more multivariate approaches to risk prediction.
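To make the contrast concrete, the sketch below (illustrative only, not taken from the paper) compares the naïve product of two separately fitted logistic models with a multinomial logistic regression fitted to the four joint outcome categories, which allows for residual correlation between the two binary outcomes. The simulated data and effect sizes are assumptions.

```python
# Joint risk of two correlated binary outcomes: naive product of separate models
# versus multinomial logistic regression on the four joint categories.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
# Simulate correlated outcomes via a shared latent variable (residual correlation).
shared = rng.normal(size=n)
y1 = (0.8 * x + shared + rng.normal(size=n) > 0).astype(int)
y2 = (0.5 * x + shared + rng.normal(size=n) > 0).astype(int)
X = sm.add_constant(x)

# Naive approach: separate univariate models, multiply predicted risks.
p1 = sm.Logit(y1, X).fit(disp=0).predict(X)
p2 = sm.Logit(y2, X).fit(disp=0).predict(X)
naive_joint = p1 * p2               # valid only under conditional independence

# Joint modelling: one multinomial model over the 4 categories (00, 01, 10, 11).
joint_cat = y1 * 2 + y2
mn = sm.MNLogit(joint_cat, X).fit(disp=0)
pred = mn.predict(X)                # columns ordered by category 0..3
joint_11 = pred[:, 3]               # P(y1 = 1, y2 = 1 | x)

print("mean naive P(1,1):      ", round(float(naive_joint.mean()), 3))
print("mean multinomial P(1,1):", round(float(joint_11.mean()), 3))
print("empirical P(1,1):       ", round(float((y1 & y2).mean()), 3))
```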
Subjects
Statistical Models, Bayes Theorem, Computer Simulation, Humans, Logistic Models, Prognosis
ABSTRACT
In prediction model research, external validation is needed to examine an existing model's performance using data independent to that for model development. Current external validation studies often suffer from small sample sizes and consequently imprecise predictive performance estimates. To address this, we propose how to determine the minimum sample size needed for a new external validation study of a prediction model for a binary outcome. Our calculations aim to precisely estimate calibration (Observed/Expected and calibration slope), discrimination (C-statistic), and clinical utility (net benefit). For each measure, we propose closed-form and iterative solutions for calculating the minimum sample size required. These require specifying: (i) target SEs (confidence interval widths) for each estimate of interest, (ii) the anticipated outcome event proportion in the validation population, (iii) the prediction model's anticipated (mis)calibration and variance of linear predictor values in the validation population, and (iv) potential risk thresholds for clinical decision-making. The calculations can also be used to inform whether the sample size of an existing (already collected) dataset is adequate for external validation. We illustrate our proposal for external validation of a prediction model for mechanical heart valve failure with an expected outcome event proportion of 0.018. Calculations suggest at least 9835 participants (177 events) are required to precisely estimate the calibration and discrimination measures, with this number driven by the calibration slope criterion, which we anticipate will often be the case. Also, 6443 participants (116 events) are required to precisely estimate net benefit at a risk threshold of 8%. Software code is provided.
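The closed-form and iterative solutions themselves are given in the article; as a hedged companion, the simulation sketch below checks the precision achieved at a candidate validation sample size, using an assumed normal distribution for the linear predictor and an assumed well-calibrated model. The approximate standard error for ln(O/E) is a standard binomial approximation, not a quotation of the article's formulae, and the inputs are not those of the worked example.

```python
# Simulation check of precision at a candidate external-validation sample size
# (binary outcome). Companion to, not a reproduction of, the closed-form criteria.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

def precision_at_n(n, n_sim=500, lp_mean=-4.0, lp_sd=1.0):
    """Expected SEs of ln(O/E) and the calibration slope, assuming LP ~ N(lp_mean, lp_sd)."""
    se_lnoe, se_slope = [], []
    for _ in range(n_sim):
        lp = rng.normal(lp_mean, lp_sd, n)        # assumed linear predictor values
        p = 1 / (1 + np.exp(-lp))                 # predicted risks (model assumed calibrated)
        y = rng.binomial(1, p)
        o = y.sum()
        se_lnoe.append(np.sqrt((1 - o / n) / o))  # binomial approximation to SE of ln(O/E)
        fit = sm.Logit(y, sm.add_constant(lp)).fit(disp=0)
        se_slope.append(fit.bse[1])               # SE of the calibration slope
    return np.mean(se_lnoe), np.mean(se_slope)

se1, se2 = precision_at_n(10000)
print(f"expected SE of ln(O/E): {se1:.3f}, of calibration slope: {se2:.3f}")
```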
Subjects
Statistical Models, Theoretical Models, Calibration, Humans, Prognosis, Sample Size
ABSTRACT
Clinical prediction models provide individualized outcome predictions to inform patient counseling and clinical decision making. External validation is the process of examining a prediction model's performance in data independent to that used for model development. Current external validation studies often suffer from small sample sizes, and subsequently imprecise estimates of a model's predictive performance. To address this, we propose how to determine the minimum sample size needed for external validation of a clinical prediction model with a continuous outcome. Four criteria are proposed that target precise estimates of (i) R2 (the proportion of variance explained), (ii) calibration-in-the-large (agreement between predicted and observed outcome values on average), (iii) calibration slope (agreement between predicted and observed values across the range of predicted values), and (iv) the variance of observed outcome values. Closed-form sample size solutions are derived for each criterion, which require the user to specify anticipated values of the model's performance (in particular R2) and the outcome variance in the external validation dataset. A sensible starting point is to base values on those for the model development study, as obtained from the publication or study authors. The largest sample size required to meet all four criteria is the recommended minimum sample size needed in the external validation dataset. The calculations can also be applied to estimate expected precision when an existing dataset with a fixed sample size is available, to help gauge if it is adequate. We illustrate the proposed methods on a case study predicting fat-free mass in children.
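As a hedged illustration of the general logic (not a reproduction of the article's derivations), targeting a 95% confidence interval of total width w for calibration-in-the-large, i.e. the mean of observed minus predicted outcome values when those differences have standard deviation sigma, gives:

```latex
\operatorname{SE}(\hat{\delta}) \approx \frac{\sigma}{\sqrt{n}}
\quad\Longrightarrow\quad
w \approx 2 \times 1.96 \times \frac{\sigma}{\sqrt{n}}
\quad\Longrightarrow\quad
n \geq \left(\frac{3.92\,\sigma}{w}\right)^{2}.
```

For example, with an assumed standard deviation of 5 units and a target interval width of 1 unit, this gives n >= (3.92 x 5)^2, approximately 385 participants. The article's four criteria follow the same pattern of fixing a target precision and solving for n, with the largest resulting n taken forward.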
Subjects
Statistical Models, Calibration, Child, Humans, Prognosis, Sample Size
ABSTRACT
Individual participant data (IPD) from multiple sources allows external validation of a prognostic model across multiple populations. Often this reveals poor calibration, potentially causing poor predictive performance in some populations. However, rather than discarding the model outright, it may be possible to modify the model to improve performance using recalibration techniques. We use IPD meta-analysis to identify the simplest method to achieve good model performance. We examine four options for recalibrating an existing time-to-event model across multiple populations: (i) shifting the baseline hazard by a constant, (ii) re-estimating the shape of the baseline hazard, (iii) adjusting the prognostic index as a whole, and (iv) adjusting individual predictor effects. For each strategy, IPD meta-analysis examines (heterogeneity in) model performance across populations. Additionally, the probability of achieving good performance in a new population can be calculated allowing ranking of recalibration methods. In an applied example, IPD meta-analysis reveals that the existing model had poor calibration in some populations, and large heterogeneity across populations. However, re-estimation of the intercept substantially improved the expected calibration in new populations, and reduced between-population heterogeneity. Comparing recalibration strategies showed that re-estimating both the magnitude and shape of the baseline hazard gave the highest predicted probability of good performance in a new population. In conclusion, IPD meta-analysis allows a prognostic model to be externally validated in multiple settings, and enables recalibration strategies to be compared and ranked to decide on the least aggressive recalibration strategy to achieve acceptable external model performance without discarding existing model information.
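A minimal sketch of the least aggressive end of this recalibration spectrum, using the lifelines package (an assumption; not the authors' software): refitting a Cox model in the new population with the existing prognostic index as the only covariate re-estimates the baseline hazard (strategies i-ii) and yields a single adjustment to the prognostic index as a whole (strategy iii). Column names are hypothetical.

```python
# Recalibration sketch for an existing time-to-event model in a new population.
# Illustrative only: assumes `new_pop` holds follow-up 'time', an 'event'
# indicator, and 'pi', the prognostic index computed from the existing model.
import pandas as pd
from lifelines import CoxPHFitter

def recalibrate(new_pop: pd.DataFrame) -> CoxPHFitter:
    cph = CoxPHFitter()
    # Strategy (iii): estimate a single coefficient on the whole prognostic index;
    # a value near 1 suggests the index needs little adjustment in this population.
    cph.fit(new_pop[["time", "event", "pi"]], duration_col="time", event_col="event")
    print(f"calibration slope for the prognostic index: {cph.params_['pi']:.2f}")
    # The refit also re-estimates the baseline hazard (its magnitude and shape),
    # covering strategies (i)-(ii); strategy (iv) would refit individual predictor effects.
    print(cph.baseline_cumulative_hazard_.head())
    return cph
```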
Subjects
Data Analysis, Research Design, Calibration, Humans, Meta-Analysis as Topic, Probability, Prognosis
ABSTRACT
BACKGROUND: Pre-eclampsia is a leading cause of maternal and perinatal mortality and morbidity. Early identification of women at risk during pregnancy is required to plan management. Although there are many published prediction models for pre-eclampsia, few have been validated in external data. Our objective was to externally validate published prediction models for pre-eclampsia using individual participant data (IPD) from UK studies, to evaluate whether any of the models can accurately predict the condition when used within the UK healthcare setting. METHODS: IPD from 11 UK cohort studies (217,415 pregnant women) within the International Prediction of Pregnancy Complications (IPPIC) pre-eclampsia network contributed to external validation of published prediction models, identified by systematic review. Cohorts that measured all predictor variables in at least one of the identified models and reported pre-eclampsia as an outcome were included for validation. We reported the model predictive performance as discrimination (C-statistic), calibration (calibration plots, calibration slope, calibration-in-the-large), and net benefit. Performance measures were estimated separately in each available study and then, where possible, combined across studies in a random-effects meta-analysis. RESULTS: Of 131 published models, 67 provided the full model equation and 24 could be validated in 11 UK cohorts. Most of the models showed modest discrimination with summary C-statistics between 0.6 and 0.7. The calibration of the predicted compared to observed risk was generally poor for most models with observed calibration slopes less than 1, indicating that predictions were generally too extreme, although confidence intervals were wide. There was large between-study heterogeneity in each model's calibration-in-the-large, suggesting poor calibration of the predicted overall risk across populations. In a subset of models, the net benefit of using the models to inform clinical decisions appeared small and limited to probability thresholds between 5 and 7%. CONCLUSIONS: The evaluated models had modest predictive performance, with key limitations such as poor calibration (likely due to overfitting in the original development datasets), substantial heterogeneity, and small net benefit across settings. The evidence to support the use of these prediction models for pre-eclampsia in clinical decision-making is limited. Any models that we could not validate should be examined in terms of their predictive performance, net benefit, and heterogeneity across multiple UK settings before consideration for use in practice. TRIAL REGISTRATION: PROSPERO ID: CRD42015029349 .
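For reference, the performance measures named above are simple to compute within each cohort before meta-analysis. The snippet below is an illustration rather than the IPPIC analysis code: it returns the C-statistic, calibration slope, and net benefit at a chosen risk threshold (5% by default, within the 5-7% range mentioned above) from a vector of predicted risks and observed outcomes.

```python
# Per-cohort predictive performance: C-statistic, calibration slope, net benefit.
# `p` = predicted pre-eclampsia risks from a model (strictly between 0 and 1),
# `y` = observed 0/1 outcomes in one validation cohort.
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

def performance(y: np.ndarray, p: np.ndarray, threshold: float = 0.05):
    lp = np.log(p / (1 - p))                              # convert risks to log-odds
    c_stat = roc_auc_score(y, p)
    slope = sm.Logit(y, sm.add_constant(lp)).fit(disp=0).params[1]
    treat = p >= threshold                                # "treat" if risk exceeds threshold
    tp = np.sum(treat & (y == 1)) / len(y)
    fp = np.sum(treat & (y == 0)) / len(y)
    net_benefit = tp - fp * threshold / (1 - threshold)
    return c_stat, slope, net_benefit
```

The cohort-specific estimates would then be combined across studies in a random-effects meta-analysis, as described above.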
Subjects
Pre-Eclampsia/diagnosis, Pregnancy Complications/diagnosis, Female, Humans, Pregnancy, Prognosis, Reproducibility of Results, Research Design, Risk Assessment
ABSTRACT
A one-stage individual participant data (IPD) meta-analysis synthesizes IPD from multiple studies using a general or generalized linear mixed model. This produces summary results (eg, about treatment effect) in a single step, whilst accounting for clustering of participants within studies (via a stratified study intercept, or random study intercepts) and between-study heterogeneity (via random treatment effects). We use simulation to evaluate the performance of restricted maximum likelihood (REML) and maximum likelihood (ML) estimation of one-stage IPD meta-analysis models for synthesizing randomized trials with continuous or binary outcomes. Three key findings are identified. First, for ML or REML estimation of stratified intercept or random intercepts models, a t-distribution based approach generally improves coverage of confidence intervals for the summary treatment effect, compared with a z-based approach. Second, when using ML estimation of a one-stage model with a stratified intercept, the treatment variable should be coded using "study-specific centering" (ie, 1/0 minus the study-specific proportion of participants in the treatment group), as this reduces the bias in the between-study variance estimate (compared with 1/0 and other coding options). Third, REML estimation reduces downward bias in between-study variance estimates compared with ML estimation, and does not depend on the treatment variable coding; for binary outcomes, this requires REML estimation of the pseudo-likelihood, although this may not be stable in some situations (eg, when data are sparse). Two applied examples are used to illustrate the findings.
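The "study-specific centering" point can be made concrete for a continuous outcome with a small statsmodels sketch (a hedged illustration, not the simulation code from the paper): the treatment indicator is centred within each trial before fitting a one-stage model with stratified (fixed) trial intercepts and a random treatment effect estimated by REML. The data frame and column names are assumptions.

```python
# One-stage IPD meta-analysis sketch: stratified trial intercepts, random
# treatment effects, and study-specific centering of the treatment indicator.
# `ipd` is an assumed DataFrame with columns 'trial', 'treat' (0/1), and 'y'.
import pandas as pd
import statsmodels.formula.api as smf

def one_stage_ipd(ipd: pd.DataFrame):
    # Center treatment within each trial (1/0 minus the trial's treated proportion),
    # the coding reported above to reduce bias in the between-trial variance under ML.
    ipd = ipd.copy()
    ipd["treat_c"] = ipd["treat"] - ipd.groupby("trial")["treat"].transform("mean")
    model = smf.mixedlm(
        "y ~ C(trial) + treat_c",    # stratified (fixed) intercept per trial
        data=ipd,
        groups=ipd["trial"],
        re_formula="0 + treat_c",    # random treatment effect, no random intercept
    )
    fit = model.fit(reml=True)        # REML rather than ML, as recommended above
    # fit.params["treat_c"] is the summary treatment effect;
    # fit.cov_re holds the estimated between-trial variance of the treatment effect.
    return fit
```

Note that the default intervals here are normal-based; the t-distribution-based intervals recommended above would need the degrees of freedom adjusting separately.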
Subjects
Statistical Models, Bias, Cluster Analysis, Computer Simulation, Humans, Linear Models
ABSTRACT
OBJECTIVES: The ability to efficiently and accurately predict future risk of primary total hip and knee replacement (THR/TKR) in earlier stages of osteoarthritis (OA) has potentially important applications. We aimed to develop and validate two models to estimate an individual's risk of primary THR and TKR in patients newly presenting to primary care. METHODS: We identified two cohorts of patients aged ≥40 years newly consulting with hip pain/OA or knee pain/OA in the Clinical Practice Research Datalink. Candidate predictors were identified by systematic review, a novel hypothesis-free 'Record-Wide Association Study' with replication, and panel consensus. Cox proportional hazards models accounting for competing risk of death were applied to derive risk algorithms for THR and TKR. Internal-external cross-validation (IECV) was then applied over geographical regions to validate the two models. RESULTS: 45 predictors for THR and 53 for TKR were identified, reviewed and selected by the panel. 301 052 and 416 030 patients newly consulting between 1992 and 2015 were identified in the hip and knee cohorts, respectively (median follow-up 6 years). The resultant C-statistics were 0.73 (0.72, 0.73) and 0.79 (0.78, 0.79) for the THR model (20 predictors) and the TKR model (24 predictors), respectively. The IECV C-statistics ranged from 0.70 to 0.74 (THR model) and 0.76 to 0.82 (TKR model); the IECV calibration slopes ranged from 0.93 to 1.07 (THR model) and 0.92 to 1.12 (TKR model). CONCLUSIONS: Two prediction models with good discrimination and calibration that estimate individuals' risk of THR and TKR have been developed and validated in large-scale, nationally representative data, and are readily automated in electronic patient records.
Subjects
Hip Arthroplasty/statistics & numerical data, Knee Arthroplasty/statistics & numerical data, Decision Support Techniques, Hip Osteoarthritis/surgery, Knee Osteoarthritis/surgery, Adult, Calibration, Factual Databases, Female, Humans, Male, Middle Aged, Prospective Studies, Reproducibility of Results, Risk Assessment/methods, Risk Assessment/standards, United Kingdom
ABSTRACT
In the medical literature, hundreds of prediction models are being developed to predict health outcomes in individuals. For continuous outcomes, typically a linear regression model is developed to predict an individual's outcome value conditional on values of multiple predictors (covariates). To improve model development and reduce the potential for overfitting, a suitable sample size is required in terms of the number of subjects (n) relative to the number of predictor parameters (p) for potential inclusion. We propose that the minimum value of n should meet the following four key criteria: (i) small optimism in predictor effect estimates as defined by a global shrinkage factor of ≥0.9; (ii) a small absolute difference of ≤0.05 between the apparent and adjusted R2; (iii) precise estimation (a margin of error ≤10% of the true value) of the model's residual standard deviation; and similarly, (iv) precise estimation of the mean predicted outcome value (model intercept). The criteria require prespecification of the user's chosen p and the model's anticipated R2, as informed by previous studies. The value of n that meets all four criteria provides the minimum sample size required for model development. In an applied example, a new model to predict lung function in African-American women using 25 predictor parameters requires at least 918 subjects to meet all criteria, corresponding to at least 36.7 subjects per predictor parameter. Even larger sample sizes may be needed to additionally ensure precise estimates of key predictor effects, especially when important categorical predictors have low prevalence in certain categories.
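One of the four criteria can be illustrated with elementary algebra (a sketch of the underlying idea, not a reproduction of the article's full calculation): requiring the apparent and adjusted R2 to differ by at most delta, and using R2_adj = 1 - (1 - R2)(n - 1)/(n - p - 1), leads to a closed-form minimum n. The inputs below are hypothetical, and this single criterion would be taken together with the other three, with the largest resulting n used.

```python
# Minimum n so that the apparent and adjusted R-squared differ by at most `delta`.
# Illustrates the logic of criterion (ii) only, with hypothetical inputs.
import math

def n_for_r2_difference(p: int, r2: float, delta: float = 0.05) -> int:
    # R2_apparent - R2_adjusted is approximately (1 - R2) * p / (n - p - 1) <= delta
    return math.ceil(p * (1 - r2) / delta + p + 1)

print(n_for_r2_difference(p=25, r2=0.3))   # hypothetical values, not the paper's example
```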
Subjects
Multivariate Analysis, Sample Size, Black or African American, Computer Simulation, Female, Humans, Respiratory Function Tests
ABSTRACT
One-stage individual participant data meta-analysis models should account for within-trial clustering, but it is currently debated how to do this. For continuous outcomes modeled using a linear regression framework, two competing approaches are a stratified intercept or a random intercept. The stratified approach involves estimating a separate intercept term for each trial, whereas the random intercept approach assumes that trial intercepts are drawn from a normal distribution. Here, through an extensive simulation study for continuous outcomes, we evaluate the impact of using the stratified and random intercept approaches on statistical properties of the summary treatment effect estimate. Further aims are to compare (i) competing estimation options for the one-stage models, including maximum likelihood and restricted maximum likelihood, and (ii) competing options for deriving confidence intervals (CI) for the summary treatment effect, including the standard normal-based 95% CI, and more conservative approaches of Kenward-Roger and Satterthwaite, which inflate CIs to account for uncertainty in variance estimates. The findings reveal that, for an individual participant data meta-analysis of randomized trials with a 1:1 treatment:control allocation ratio and heterogeneity in the treatment effect, (i) bias and coverage of the summary treatment effect estimate are very similar when using stratified or random intercept models with restricted maximum likelihood, and thus either approach could be taken in practice, (ii) CIs are generally best derived using either a Kenward-Roger or Satterthwaite correction, although occasionally overly conservative, and (iii) if maximum likelihood is required, a random intercept performs better than a stratified intercept model. An illustrative example is provided.
Subjects
Meta-Analysis as Topic, Statistical Models, Confidence Intervals, Statistical Data Interpretation, Humans, Likelihood Functions, Linear Models, Normal Distribution, Treatment Outcome
ABSTRACT
BACKGROUND: Researchers and funders should consider the statistical power of planned Individual Participant Data (IPD) meta-analysis projects, as they are often time-consuming and costly. We propose simulation-based power calculations utilising a two-stage framework, and illustrate the approach for a planned IPD meta-analysis of randomised trials with continuous outcomes where the aim is to identify treatment-covariate interactions. METHODS: The simulation approach has four steps: (i) specify an underlying (data generating) statistical model for trials in the IPD meta-analysis; (ii) use readily available information (e.g. from publications) and prior knowledge (e.g. number of studies promising IPD) to specify model parameter values (e.g. control group mean, intervention effect, treatment-covariate interaction); (iii) simulate an IPD meta-analysis dataset of a particular size from the model, and apply a two-stage IPD meta-analysis to obtain the summary estimate of interest (e.g. interaction effect) and its associated p-value; (iv) repeat the previous step (e.g. thousands of times), then estimate the power to detect a genuine effect by the proportion of summary estimates with a significant p-value. RESULTS: In a planned IPD meta-analysis of lifestyle interventions to reduce weight gain in pregnancy, 14 trials (1183 patients) promised their IPD to examine a treatment-BMI interaction (i.e. whether baseline BMI modifies intervention effect on weight gain). Using our simulation-based approach, a two-stage IPD meta-analysis has < 60% power to detect a reduction of 1 kg weight gain for a 10-unit increase in BMI. Additional IPD from ten other published trials (containing 1761 patients) would improve power to over 80%, but only if a fixed-effect meta-analysis was appropriate. Pre-specified adjustment for prognostic factors would increase power further. Incorrect dichotomisation of BMI would reduce power by over 20%, similar to immediately throwing away IPD from ten trials. CONCLUSIONS: Simulation-based power calculations could inform the planning and funding of IPD projects, and should be used routinely.
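The four-step recipe maps directly onto code. The sketch below uses the setting described above of 14 trials and roughly 1183 patients, with a treatment-BMI interaction equivalent to a 1 kg reduction in weight gain per 10-unit increase in BMI, but the per-trial sizes, main effects, and residual standard deviation are assumed values for illustration; it is not the planned analysis code.

```python
# Steps (i)-(iv): simulation-based power for a treatment-covariate interaction
# in a two-stage IPD meta-analysis (illustrative parameter values throughout).
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(42)

def ipd_power(n_trials=14, n_per_trial=85, interaction=-0.1, sd=4.0, n_sim=1000):
    significant = 0
    for _ in range(n_sim):
        est, var = [], []
        for _ in range(n_trials):
            bmi = rng.normal(28, 5, n_per_trial)            # assumed BMI distribution
            treat = rng.integers(0, 2, n_per_trial)         # 1:1 allocation
            y = (0.2 * bmi - 1.0 * treat                    # assumed main effects
                 + interaction * treat * (bmi - 28)         # treatment-BMI interaction
                 + rng.normal(0, sd, n_per_trial))          # assumed residual SD
            X = sm.add_constant(np.column_stack([bmi, treat, treat * (bmi - 28)]))
            fit = sm.OLS(y, X).fit()
            est.append(fit.params[3])
            var.append(fit.bse[3] ** 2)
        w = 1 / np.array(var)                               # fixed-effect inverse-variance pooling
        pooled = np.sum(w * np.array(est)) / np.sum(w)
        se = np.sqrt(1 / np.sum(w))
        p_value = 2 * stats.norm.sf(abs(pooled / se))
        significant += p_value < 0.05
    return significant / n_sim                              # power = proportion significant

print(f"estimated power: {ipd_power():.2f}")
```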
Subjects
Computer Simulation, Gestational Weight Gain/physiology, Overweight/prevention & control, Pregnancy Complications/prevention & control, Algorithms, Body Mass Index, Female, Humans, Statistical Models, Overweight/physiopathology, Pregnancy, Pregnancy Complications/physiopathology, Randomized Controlled Trials as Topic
ABSTRACT
OBJECTIVES: Risk of bias assessments are important in meta-analyses of both aggregate and individual participant data (IPD). There is limited evidence on whether and how risk of bias of included studies or datasets in IPD meta-analyses (IPDMAs) is assessed. We review how risk of bias is currently assessed, reported, and incorporated in IPDMAs of test accuracy and clinical prediction model studies and provide recommendations for improvement. STUDY DESIGN AND SETTING: We searched PubMed (January 2018-May 2020) to identify IPDMAs of test accuracy and prediction models, then elicited whether each IPDMA assessed risk of bias of included studies and, if so, how assessments were reported and subsequently incorporated into the IPDMAs. RESULTS: Forty-nine IPDMAs were included. Nineteen of 27 (70%) test accuracy IPDMAs assessed risk of bias, compared to 5 of 22 (23%) prediction model IPDMAs. Seventeen of 19 (89%) test accuracy IPDMAs used the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool, but no tool was used consistently among prediction model IPDMAs. Of IPDMAs assessing risk of bias, 7 (37%) test accuracy IPDMAs and 1 (20%) prediction model IPDMA provided details on the information sources (e.g., the original manuscript, IPD, primary investigators) used to inform judgments, and 4 (21%) test accuracy IPDMAs and 1 (20%) prediction model IPDMA provided information on whether assessments were done before or after obtaining the IPD of the included studies or datasets. Of all included IPDMAs, only seven test accuracy IPDMAs (26%) and one prediction model IPDMA (5%) incorporated risk of bias assessments into their meta-analyses. For future IPDMA projects, we provide guidance on how to adapt tools such as the Prediction model Risk Of Bias ASsessment Tool (PROBAST; for prediction models) and QUADAS-2 (for test accuracy) to assess risk of bias of included primary studies and their IPD. CONCLUSION: Risk of bias assessments and their reporting need to be improved in IPDMAs of test accuracy and, especially, prediction model studies. Using recommended tools, both before and after IPD are obtained, will address this.
Subjects
Data Accuracy, Statistical Models, Humans, Prognosis, Bias
ABSTRACT
BACKGROUND: Relapse of depression is common and contributes to the overall associated morbidity and burden. We lack evidence-based tools to estimate an individual's risk of relapse after treatment in primary care, which may help us more effectively target relapse prevention. OBJECTIVE: The objective was to develop and validate a prognostic model to predict risk of relapse of depression in primary care. METHODS: Multilevel logistic regression models were developed, using individual participant data from seven primary care-based studies (n=1244), to predict relapse of depression. The model was internally validated using bootstrapping, and generalisability was explored using internal-external cross-validation. FINDINGS: Residual depressive symptoms (OR: 1.13 (95% CI: 1.07 to 1.20), p<0.001) and baseline depression severity (OR: 1.07 (1.04 to 1.11), p<0.001) were associated with relapse. The validated model had low discrimination (C-statistic 0.60 (0.55-0.65)) and miscalibration concerns (calibration slope 0.81 (0.31-1.31)). On secondary analysis, being in a relationship was associated with reduced risk of relapse (OR: 0.43 (0.28-0.67), p<0.001); this remained statistically significant after correction for multiple significance testing. CONCLUSIONS: We could not predict risk of depression relapse with sufficient accuracy in primary care data, using routinely recorded measures. Relationship status warrants further research to explore its role as a prognostic factor for relapse. CLINICAL IMPLICATIONS: Until we can accurately stratify patients according to risk of relapse, a universal approach to relapse prevention may be most beneficial, either during acute-phase treatment or post remission. Where possible, this could be guided by the presence or absence of known prognostic factors (eg, residual depressive symptoms) and targeted towards these. TRIAL REGISTRATION NUMBER: NCT04666662.
Subjects
Primary Health Care, Recurrence, Humans, Female, Male, Prognosis, Middle Aged, Adult, Depression/diagnosis, Depression/epidemiology, Depression/psychology, Aged, Secondary Prevention, Depressive Disorder/diagnosis, Depressive Disorder/epidemiology, Depressive Disorder/psychology
ABSTRACT
Objective: To predict birth weight at various potential gestational ages of delivery based on data routinely available at the first antenatal visit. Design: Individual participant data meta-analysis. Data sources: Individual participant data of four cohorts (237 228 pregnancies) from the International Prediction of Pregnancy Complications (IPPIC) network dataset. Eligibility criteria for selecting studies: Studies in the IPPIC network were identified by searching major databases for studies reporting risk factors for adverse pregnancy outcomes, such as pre-eclampsia, fetal growth restriction, and stillbirth, from database inception to August 2019. Data of four IPPIC cohorts (237 228 pregnancies) from the US (National Institute of Child Health and Human Development, 2018; 233 483 pregnancies), UK (Allen et al, 2017; 1045 pregnancies), Norway (STORK Groruddalen research programme, 2010; 823 pregnancies), and Australia (Rumbold et al, 2006; 1877 pregnancies) were included in the development of the model. Results: The IPPIC birth weight model was developed with random intercept regression models with backward elimination for variable selection. Internal-external cross-validation was performed to assess the study-specific and pooled performance of the model, reported as calibration slope, calibration-in-the-large, and observed versus expected average birth weight ratio. Meta-analysis showed that the apparent performance of the model had good calibration (calibration slope 0.99, 95% confidence interval (CI) 0.88 to 1.10; calibration-in-the-large 44.5 g, -18.4 to 107.3) with an observed versus expected average birth weight ratio of 1.02 (95% CI 0.97 to 1.07). The proportion of variation in birth weight explained by the model (R2) was 46.9% (range 32.7-56.1% across cohorts). On internal-external cross-validation, the model showed good calibration and predictive performance when validated in three cohorts with a calibration slope of 0.90 (Allen cohort), 1.04 (STORK Groruddalen cohort), and 1.07 (Rumbold cohort), calibration-in-the-large of -22.3 g (Allen cohort), -33.42 g (Rumbold cohort), and 86.4 g (STORK Groruddalen cohort), and observed versus expected ratio of 0.99 (Rumbold cohort), 1.00 (Allen cohort), and 1.03 (STORK Groruddalen cohort); respective pooled estimates were 1.00 (95% CI 0.78 to 1.23; calibration slope), 9.7 g (-154.3 to 173.8; calibration-in-the-large), and 1.00 (0.94 to 1.07; observed v expected ratio). The model predictions were more accurate (smaller mean square error) at the lower end of predicted birth weight, which is important in informing clinical decision making. Conclusions: The IPPIC birth weight model allowed birth weight predictions for a range of possible gestational ages. The model explained about 50% of individual variation in birth weights, was well calibrated (especially in babies at high risk of fetal growth restriction and its complications), and showed promising performance in four different populations included in the individual participant data meta-analysis. Further research to examine the generalisability of performance in other countries, settings, and subgroups is required. Trial registration: PROSPERO CRD42019135045.
ABSTRACT
INTRODUCTION: The number of people with diabetes mellitus is increasing globally and consequently so too is diabetic retinopathy (DR). Most patients with diabetes are monitored through the diabetic eye screening programme (DESP) until they have signs of retinopathy and these changes progress, requiring referral into hospital eye services (HES). Here, they continue to be monitored until they require treatment. Due to current pressures on HES, delays can occur, leading to harm. There is a need to triage patients based on their individual risk. At present, patients are stratified according to retinopathy stage alone, yet other risk factors like glycated haemoglobin (HbA1c) may be useful. Therefore, a prediction model that combines multiple prognostic factors to predict progression will be useful for triage in this setting to improve care. We previously developed a Diabetic Retinopathy Progression model to Treatment or Vision Loss (DRPTVL-UK) using a large primary care database. The aim of the present study is to externally validate the DRPTVL-UK model in a secondary care setting, specifically in a population under care by HES. This study will also provide an opportunity to update the model by considering additional predictors not previously available. METHODS AND ANALYSIS: We will use a retrospective cohort of 2400 patients with diabetes aged 12 years and over, referred from DESP to NHS hospital trusts with referable DR between 2013 and 2016, with follow-up information recorded until December 2021. We will evaluate the external validity of the DRPTVL-UK model using measures of discrimination, calibration and net benefit. In addition, consensus meetings will be held to agree on acceptable risk thresholds for triage within the HES system. ETHICS AND DISSEMINATION: This study was approved by REC (ref 22/SC/0425, 05/12/2022, Hampshire A Research Ethics Committee). The results of the study will be published in a peer-reviewed journal and presented at clinical conferences. TRIAL REGISTRATION NUMBER: ISRCTN 10956293.
Subjects
Diabetes Mellitus, Diabetic Retinopathy, Humans, Diabetic Retinopathy/diagnosis, Diabetic Retinopathy/therapy, Diabetic Retinopathy/epidemiology, Retrospective Studies, Vision Disorders, Risk Factors, Glycated Hemoglobin
ABSTRACT
OBJECTIVE: The purpose of this study was to develop and externally validate multivariable prediction models for future pain intensity outcomes to inform targeted interventions for patients with neck or low back pain in primary care settings. METHODS: Model development data were obtained from a group of 679 adults with neck or low back pain who consulted a participating United Kingdom general practice. Predictors included self-report items regarding pain severity and impact from the STarT MSK Tool. Pain intensity at 2 and 6 months was modeled separately for continuous and dichotomized outcomes using linear and logistic regression, respectively. External validation of all models was conducted in a separate group of 586 patients recruited from a similar population, with patients' predictor information collected both at the point of consultation and 2 to 4 weeks later using self-report questionnaires. Calibration and discrimination of the models were assessed separately using STarT MSK Tool data from both time points to assess differences in predictive performance. RESULTS: Pain intensity and patients' reports that their condition would last a long time contributed most to predictions of future pain intensity, conditional on the other variables. On external validation, models were reasonably well calibrated on average when using tool measurements taken 2 to 4 weeks after consultation (calibration slope = 0.848 [95% CI = 0.767 to 0.928] for the 2-month pain intensity score), but performance was poor using point-of-consultation tool data (calibration slope for the 2-month pain intensity score of 0.650 [95% CI = 0.549 to 0.750]). CONCLUSION: Model predictive accuracy was good when predictors were measured 2 to 4 weeks after primary care consultation, but poor when measured at the point of consultation. Future research will explore whether additional, nonmodifiable predictors improve point-of-consultation predictive performance. IMPACT: External validation demonstrated that these individualized prediction models were not sufficiently accurate to recommend their use in clinical practice. Further research is required to improve performance through inclusion of additional nonmodifiable risk factors.