2.
Cochrane Database Syst Rev ; 12: CD013139, 2021 12 21.
Article in English | MEDLINE | ID: mdl-34931303

ABSTRACT

BACKGROUND: The Revised Cardiac Risk Index (RCRI) is a widely acknowledged prognostic model to estimate preoperatively the probability of developing in-hospital major adverse cardiac events (MACE) in patients undergoing noncardiac surgery. However, the RCRI does not always make accurate predictions, so various studies have investigated whether biomarkers added to or compared with the RCRI could improve this. OBJECTIVES: Primary: To investigate the added predictive value of biomarkers to the RCRI to preoperatively predict in-hospital MACE and other adverse outcomes in patients undergoing noncardiac surgery. Secondary: To investigate the prognostic value of biomarkers compared to the RCRI to preoperatively predict in-hospital MACE and other adverse outcomes in patients undergoing noncardiac surgery. Tertiary: To investigate the prognostic value of other prediction models compared to the RCRI to preoperatively predict in-hospital MACE and other adverse outcomes in patients undergoing noncardiac surgery. SEARCH METHODS: We searched MEDLINE and Embase from 1 January 1999 (the year that the RCRI was published) until 25 June 2020. We also searched ISI Web of Science and SCOPUS for articles referring to the original RCRI development study in that period. SELECTION CRITERIA: We included studies among adults who underwent noncardiac surgery, reporting on (external) validation of the RCRI and: - the addition of biomarker(s) to the RCRI; or - the comparison of the predictive accuracy of biomarker(s) to the RCRI; or - the comparison of the predictive accuracy of the RCRI to other models. Besides MACE, all other adverse outcomes were considered for inclusion. DATA COLLECTION AND ANALYSIS: We developed a data extraction form based on the CHARMS checklist. Independent pairs of authors screened references, extracted data and assessed risk of bias and concerns regarding applicability according to PROBAST. 
For biomarkers and prediction models that were added or compared to the RCRI in ≥ 3 different articles, we described study characteristics and findings in further detail. We did not apply GRADE as no guidance is available for prognostic model reviews. MAIN RESULTS: We screened 3960 records and included 107 articles. Across all objectives we rated risk of bias as high in ≥ 1 domain in 90% of included studies, particularly in the analysis domain. Statistical pooling or meta-analysis of reported results was impossible due to heterogeneity in various aspects: outcomes used, scale by which the biomarker was added/compared to the RCRI, prediction horizons and studied populations. Added predictive value of biomarkers to the RCRI: Fifty-one studies reported on the added value of biomarkers to the RCRI. Sixty-nine different predictors were identified, derived from blood (29%), imaging (33%) or other sources (38%). Addition of NT-proBNP, troponin or their combination improved the RCRI for predicting MACE (median delta c-statistics: 0.08, 0.14 and 0.12 for NT-proBNP, troponin and their combination, respectively). The median total net reclassification index (NRI) was 0.16 and 0.74 after addition of troponin and NT-proBNP to the RCRI, respectively. Calibration was not reported. To predict myocardial infarction, the median delta c-statistic when NT-proBNP was added to the RCRI was 0.09; for predicting all-cause mortality and MACE combined, it was 0.06. For BNP and copeptin, data were not sufficient to provide results on their added predictive performance, for any of the outcomes. Comparison of the predictive value of biomarkers to the RCRI: Fifty-one studies assessed the predictive performance of biomarkers alone compared to the RCRI. We identified 60 unique predictors derived from blood (38%), imaging (30%) or other sources, such as the American Society of Anesthesiologists (ASA) classification (32%). 
Predictions were similar between the ASA classification and the RCRI for all studied outcomes. In studies different from those identified in objective 1, the median delta c-statistic was 0.15 and 0.12 in favour of BNP and NT-proBNP alone, respectively, when compared to the RCRI, for the prediction of MACE. For C-reactive protein, the predictive performance was similar to the RCRI. For other biomarkers and outcomes, data were insufficient to provide summary results. One study reported on calibration and none on reclassification. Comparison of the predictive value of other prognostic models to the RCRI: Fifty-two articles compared the predictive ability of the RCRI to other prognostic models. Of these, 42% developed a new prediction model, 22% updated the RCRI or another prediction model, and 37% validated an existing prediction model. None of the other prediction models showed better performance in predicting MACE than the RCRI. To predict myocardial infarction and cardiac arrest, the median delta c-statistic was 0.11 in favour of ACS-NSQIP-MICA compared to the RCRI. To predict all-cause mortality, the median delta c-statistic was 0.15 in favour of ACS-NSQIP-SRS compared to the RCRI. Predictive performance was not better for CHADS2, CHA2DS2-VASc, R2CHADS2, Goldman index, Detsky index or VSG-CRI compared to the RCRI for any of the outcomes. Calibration and reclassification were reported in only one and three studies, respectively. AUTHORS' CONCLUSIONS: Studies included in this review suggest that the predictive performance of the RCRI in predicting MACE is improved when NT-proBNP, troponin or their combination are added. Other studies indicate that BNP and NT-proBNP, when used in isolation, may even have a higher discriminative performance than the RCRI. There was insufficient evidence of a difference between the predictive accuracy of the RCRI and other prediction models in predicting MACE. 
However, ACS-NSQIP-MICA and ACS-NSQIP-SRS outperformed the RCRI in predicting myocardial infarction and cardiac arrest combined, and all-cause mortality, respectively. Nevertheless, the results cannot be interpreted as conclusive due to high risks of bias in a majority of papers, and pooling was impossible due to heterogeneity in outcomes, prediction horizons, biomarkers and studied populations. Future research on the added prognostic value of biomarkers to existing prediction models should focus on biomarkers with good predictive accuracy in other settings (e.g. diagnosis of myocardial infarction) and identification of biomarkers from omics data. They should be compared to novel biomarkers with so far insufficient evidence compared to established ones, including NT-proBNP or troponins. Adherence to recent guidance for prediction model studies (e.g. TRIPOD; PROBAST) and use of standardised outcome definitions in primary studies is highly recommended to facilitate systematic review and meta-analyses in the future.
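
Illustrative sketch (not part of the abstract): the two discrimination and reclassification measures reported above, the delta c-statistic and the total NRI, could be computed along the following lines. The eight patients, the variable names `rcri_only` and `rcri_plus_biomarker`, and the choice of the category-free form of the NRI are all assumptions made for illustration; the abstract does not state which NRI variant the primary studies used.

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Share of (event, non-event) pairs in which the event case has the
    higher predicted risk; tied pairs count as one half."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e, n in product(events, non_events):
        if e > n:
            score += 1.0
        elif e == n:
            score += 0.5
    return score / (len(events) * len(non_events))


def total_nri(old_risks, new_risks, outcomes):
    """Category-free total NRI: net upward movement among events plus
    net downward movement among non-events."""
    up = {0: 0, 1: 0}
    down = {0: 0, 1: 0}
    count = {0: 0, 1: 0}
    for old, new, y in zip(old_risks, new_risks, outcomes):
        count[y] += 1
        if new > old:
            up[y] += 1
        elif new < old:
            down[y] += 1
    if count[0] == 0 or count[1] == 0:
        raise ValueError("need at least one event and one non-event")
    nri_events = (up[1] - down[1]) / count[1]
    nri_non_events = (down[0] - up[0]) / count[0]
    return nri_events + nri_non_events


# Hypothetical data: 1 = in-hospital MACE, 0 = no MACE.
outcomes = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical predicted risks from the RCRI alone ...
rcri_only = [0.10, 0.05, 0.20, 0.10, 0.05, 0.02, 0.20, 0.02]
# ... and from the RCRI with a biomarker added.
rcri_plus_biomarker = [0.30, 0.15, 0.15, 0.08, 0.05, 0.02, 0.18, 0.03]

c_old = c_statistic(rcri_only, outcomes)
c_new = c_statistic(rcri_plus_biomarker, outcomes)
delta_c = c_new - c_old
nri = total_nri(rcri_only, rcri_plus_biomarker, outcomes)

print(f"c-statistic, RCRI alone:       {c_old:.2f}")
print(f"c-statistic, RCRI + biomarker: {c_new:.2f}")
print(f"delta c-statistic:             {delta_c:.2f}")
print(f"total NRI:                     {nri:.2f}")
```

A positive delta c-statistic means the extended model ranks patients with and without the outcome better than the RCRI alone. Neither measure says anything about calibration, which the review notes was rarely reported.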


Subject(s)
Heart Arrest , Myocardial Infarction , Adult , Bias , Biomarkers , Humans , Peptide Fragments , Predictive Value of Tests , Prognosis , Risk Assessment
3.
BMJ Open ; 11(7): e048008, 2021 07 09.
Article in English | MEDLINE | ID: mdl-34244270

ABSTRACT

INTRODUCTION: The Transparent Reporting of a multivariable prediction model of Individual Prognosis Or Diagnosis (TRIPOD) statement and the Prediction model Risk Of Bias ASsessment Tool (PROBAST) were both published to improve the reporting and critical appraisal of prediction model studies for diagnosis and prognosis. This paper describes the processes and methods that will be used to develop an extension to the TRIPOD statement (TRIPOD-artificial intelligence, AI) and the PROBAST (PROBAST-AI) tool for prediction model studies that applied machine learning techniques. METHODS AND ANALYSIS: TRIPOD-AI and PROBAST-AI will be developed following published guidance from the EQUATOR Network, and will comprise five stages. Stage 1 will comprise two systematic reviews (across all medical fields and specifically in oncology) to examine the quality of reporting in published machine-learning-based prediction model studies. In stage 2, we will consult a diverse group of key stakeholders using a Delphi process to identify items to be considered for inclusion in TRIPOD-AI and PROBAST-AI. Stage 3 will be virtual consensus meetings to consolidate and prioritise key items to be included in TRIPOD-AI and PROBAST-AI. Stage 4 will involve developing the TRIPOD-AI checklist and the PROBAST-AI tool, and writing the accompanying explanation and elaboration papers. In the final stage, stage 5, we will disseminate TRIPOD-AI and PROBAST-AI via journals, conferences, blogs, websites (including TRIPOD, PROBAST and EQUATOR Network) and social media. TRIPOD-AI will provide researchers working on prediction model studies based on machine learning with a reporting guideline that can help them report key details that readers need to evaluate the study quality and interpret its findings, potentially reducing research waste. 
We anticipate PROBAST-AI will help researchers, clinicians, systematic reviewers and policymakers critically appraise the design, conduct and analysis of machine learning based prediction model studies, with a robust standardised tool for bias evaluation. ETHICS AND DISSEMINATION: Ethical approval has been granted by the Central University Research Ethics Committee, University of Oxford on 10-December-2020 (R73034/RE001). Findings from this study will be disseminated through peer-review publications. PROSPERO REGISTRATION NUMBER: CRD42019140361 and CRD42019161764.


Subject(s)
Artificial Intelligence , Checklist , Bias , Humans , Prognosis , Research Design , Risk Assessment
4.
J Clin Epidemiol ; 132: 142-145, 2021 04.
Article in English | MEDLINE | ID: mdl-33775387

ABSTRACT

Clinical prediction models play an increasingly important role in contemporary clinical care, by informing healthcare professionals, patients and their relatives about outcome risks, with the aim to facilitate (shared) medical decision making and improve health outcomes. Diagnostic prediction models aim to calculate an individual's risk that a disease is already present, whilst prognostic prediction models aim to calculate the risk of particular health states occurring in the future. This article serves as a primer for diagnostic and prognostic clinical prediction models, by discussing the basic terminology, some of the inherent challenges, and the need for validation of predictive performance and the evaluation of impact of these models in clinical care.


Subject(s)
Decision Support Techniques , Models, Statistical , Humans , Prognosis
5.
BMJ Open ; 10(11): e038832, 2020 Nov 11.
Article in English | MEDLINE | ID: mdl-33177137

ABSTRACT

INTRODUCTION: Studies addressing the development and/or validation of diagnostic and prognostic prediction models are abundant in most clinical domains. Systematic reviews have shown that the methodological and reporting quality of prediction model studies is suboptimal. Due to the increasing availability of larger, routinely collected and complex medical data, and the rising application of Artificial Intelligence (AI) or machine learning (ML) techniques, the number of prediction model studies is expected to increase even further. Prediction models developed using AI or ML techniques are often labelled as a 'black box' and little is known about their methodological and reporting quality. Therefore, this comprehensive systematic review aims to evaluate the reporting quality, the methodological conduct, and the risk of bias of prediction model studies that applied ML techniques for model development and/or validation. METHODS AND ANALYSIS: A search will be performed in PubMed to identify studies developing and/or validating prediction models using any ML methodology and across all medical fields. Studies will be included if they were published between January 2018 and December 2019, predict patient-related outcomes, use any study design or data source, and are available in English. Screening of search results and data extraction from included articles will be performed by two independent reviewers. The primary outcomes of this systematic review are: (1) the adherence of ML-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD), and (2) the risk of bias in such studies as assessed using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). A narrative synthesis will be conducted for all included studies. 
Findings will be stratified by study type, medical field and prevalent ML methods, and will inform necessary extensions or updates of TRIPOD and PROBAST to better address prediction model studies that used AI or ML techniques. ETHICS AND DISSEMINATION: Ethical approval is not required for this study because only available published data will be analysed. Findings will be disseminated through peer-reviewed publications and scientific conferences. SYSTEMATIC REVIEW REGISTRATION: PROSPERO, CRD42019161764.


Subject(s)
Machine Learning , Research Design , Bias , Humans , Prognosis , Systematic Reviews as Topic
6.
Cochrane Database Syst Rev ; 7: CD012022, 2020 07 31.
Article in English | MEDLINE | ID: mdl-32735048

ABSTRACT

BACKGROUND: Chronic lymphocytic leukaemia (CLL) is the most common cancer of the lymphatic system in Western countries. Several clinical and biological factors for CLL have been identified. However, it remains unclear which of the available prognostic models combining those factors can be used in clinical practice to predict long-term outcome in people newly-diagnosed with CLL. OBJECTIVES: To identify, describe and appraise all prognostic models developed to predict overall survival (OS), progression-free survival (PFS) or treatment-free survival (TFS) in newly-diagnosed (previously untreated) adults with CLL, and meta-analyse their predictive performances. SEARCH METHODS: We searched MEDLINE (from January 1950 to June 2019 via Ovid), Embase (from 1974 to June 2019) and registries of ongoing trials (to 5 March 2020) for development and validation studies of prognostic models for untreated adults with CLL. In addition, we screened the reference lists and citation indices of included studies. SELECTION CRITERIA: We included all prognostic models developed for CLL which predict OS, PFS, or TFS, provided they combined prognostic factors known before treatment initiation, and any studies that tested the performance of these models in individuals other than the ones included in model development (i.e. 'external model validation studies'). We included studies of adults with confirmed B-cell CLL who had not received treatment prior to the start of the study. We did not restrict the search based on study design. DATA COLLECTION AND ANALYSIS: We developed a data extraction form to collect information based on the Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS). Independent pairs of review authors screened references, extracted data and assessed risk of bias according to the Prediction model Risk Of Bias ASsessment Tool (PROBAST). 
For models that were externally validated at least three times, we aimed to perform a quantitative meta-analysis of their predictive performance, notably their calibration (proportion of people predicted to experience the outcome who do so) and discrimination (ability to differentiate between people with and without the event) using a random-effects model. When a model categorised individuals into risk categories, we pooled outcome frequencies per risk group (low, intermediate, high and very high). We did not apply GRADE as guidance is not yet available for reviews of prognostic models. MAIN RESULTS: From 52 eligible studies, we identified 12 externally validated models: six were developed for OS, one for PFS and five for TFS. In general, reporting of the studies was poor, especially predictive performance measures for calibration and discrimination; but also basic information, such as eligibility criteria and the recruitment period of participants, was often missing. We rated almost all studies at high or unclear risk of bias according to PROBAST. Overall, the applicability of the models and their validation studies was low or unclear; the most common reasons were inappropriate handling of missing data and serious reporting deficiencies concerning eligibility criteria, recruitment period, observation time and prediction performance measures. We report the results for three models predicting OS, which had available data from more than three external validation studies. CLL International Prognostic Index (CLL-IPI): This score includes five prognostic factors: age, clinical stage, IgHV mutational status, B2-microglobulin and TP53 status. Calibration: for the low-, intermediate- and high-risk groups, the pooled five-year survival per risk group from validation studies corresponded to the frequencies observed in the model development study. In the very high-risk group, predicted survival from CLL-IPI was lower than observed from external validation studies. 
Discrimination: the pooled c-statistic of seven external validation studies (3307 participants, 917 events) was 0.72 (95% confidence interval (CI) 0.67 to 0.77). The 95% prediction interval (PI) of this model for the c-statistic, which describes the expected interval for the model's discriminative ability in a new external validation study, ranged from 0.59 to 0.83. Barcelona-Brno score: Aimed at simplifying the CLL-IPI, this score includes three prognostic factors: IgHV mutational status, del(17p) and del(11q). Calibration: for the low- and intermediate-risk groups, the pooled survival per risk group corresponded to the frequencies observed in the model development study, although the score seems to overestimate survival for the high-risk group. Discrimination: the pooled c-statistic of four external validation studies (1755 participants, 416 events) was 0.64 (95% CI 0.60 to 0.67); 95% PI 0.59 to 0.68. MDACC 2007 index score: The authors presented two versions of this model including six prognostic factors to predict OS: age, B2-microglobulin, absolute lymphocyte count, gender, clinical stage and number of nodal groups. Only one validation study was available for the more comprehensive version of the model, a formula with a nomogram, while seven studies (5127 participants, 994 events) validated the simplified version of the model, the index score. Calibration: for the low- and intermediate-risk groups, the pooled survival per risk group corresponded to the frequencies observed in the model development study, although the score seems to overestimate survival for the high-risk group. Discrimination: the pooled c-statistic of the seven external validation studies for the index score was 0.65 (95% CI 0.60 to 0.70); 95% PI 0.51 to 0.77. 
AUTHORS' CONCLUSIONS: Despite the large number of published studies of prognostic models for OS, PFS or TFS for newly-diagnosed, untreated adults with CLL, only a minority of these (N = 12) have been externally validated for their respective primary outcome. Three models have undergone sufficient external validation to enable meta-analysis of the model's ability to predict survival outcomes. Lack of reporting prevented us from summarising calibration as recommended. Of the three models, the CLL-IPI shows the best discrimination, despite overestimation. However, performance of the models may change for individuals with CLL who receive improved treatment options, as the models included in this review were tested mostly on retrospective cohorts receiving a traditional treatment regimen. In conclusion, this review shows a clear need to improve the conducting and reporting of both prognostic model development and external validation studies. For prognostic models to be used as tools in clinical practice, the development of the models (and their subsequent validation studies) should adapt to include the latest therapy options to accurately predict performance. Adaptations should be timely.
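
Illustrative sketch (not part of the abstract): one common way to pool c-statistics from several external validation studies with a random-effects model, and to derive both a 95% confidence interval and a 95% prediction interval. The abstract only states that a random-effects model was used, so the logit transformation, the DerSimonian-Laird estimator of between-study variance and the five study results below are assumptions for illustration and do not reproduce the review's data. The t critical value is supplied by the caller (3.182 for 5 studies, that is, 3 degrees of freedom) to keep the example free of third-party dependencies.

```python
import math
from statistics import NormalDist


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def pool_c_statistics(c_values, ci_lowers, ci_uppers, t_crit):
    """Random-effects pooling of c-statistics on the logit scale
    (DerSimonian-Laird). t_crit is the 97.5% quantile of a t distribution
    with k - 2 degrees of freedom, used for the prediction interval."""
    k = len(c_values)
    if k < 3:
        raise ValueError("a prediction interval needs at least 3 studies")
    z = NormalDist().inv_cdf(0.975)

    # Effect sizes and within-study variances on the logit scale; the
    # standard error is recovered from each reported 95% CI.
    y = [logit(c) for c in c_values]
    v = [((logit(u) - logit(l)) / (2.0 * z)) ** 2
         for l, u in zip(ci_lowers, ci_uppers)]

    # Fixed-effect weights and Cochran's Q.
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird between-study variance, truncated at zero.
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / scale)

    # Random-effects summary estimate and its standard error.
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_mu = math.sqrt(1.0 / sum(w_re))

    ci = (expit(mu - z * se_mu), expit(mu + z * se_mu))
    half_width = t_crit * math.sqrt(tau2 + se_mu ** 2)
    pi = (expit(mu - half_width), expit(mu + half_width))
    return {"pooled": expit(mu), "ci": ci, "pi": pi, "tau2": tau2}


# Hypothetical validation results: c-statistic with 95% CI per study.
c_values = [0.70, 0.75, 0.66, 0.72, 0.78]
ci_lowers = [0.64, 0.70, 0.58, 0.66, 0.71]
ci_uppers = [0.76, 0.80, 0.74, 0.78, 0.85]

result = pool_c_statistics(c_values, ci_lowers, ci_uppers, t_crit=3.182)
print(f"pooled c-statistic: {result['pooled']:.2f}")
print(f"95% CI: {result['ci'][0]:.2f} to {result['ci'][1]:.2f}")
print(f"95% PI: {result['pi'][0]:.2f} to {result['pi'][1]:.2f}")
```

The prediction interval is always at least as wide as the confidence interval because it adds the between-study variance to the uncertainty of the pooled estimate. This is why the intervals quoted in the abstract (for example 0.59 to 0.83 against 0.67 to 0.77 for the CLL-IPI) differ so much.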


Subject(s)
Leukemia, Lymphocytic, Chronic, B-Cell/mortality , Models, Theoretical , Adult , Age Factors , Bias , Biomarkers, Tumor , Calibration , Confidence Intervals , Discriminant Analysis , Disease-Free Survival , Female , Genes, p53/genetics , Humans , Immunoglobulin Heavy Chains/genetics , Immunoglobulin Variable Region/genetics , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Male , Neoplasm Staging , Prognosis , Progression-Free Survival , Receptors, Antigen, B-Cell/genetics , Reproducibility of Results , Tumor Suppressor Protein p53/genetics
8.
Cochrane Database Syst Rev ; 1: CD012643, 2020 01 13.
Article in English | MEDLINE | ID: mdl-31930780

ABSTRACT

BACKGROUND: Hodgkin lymphoma (HL) is one of the most common haematological malignancies in young adults and, with cure rates of 90%, has become curable for the majority of individuals. Positron emission tomography (PET) is an imaging tool used to monitor a tumour's metabolic activity, stage and progression. Interim PET during chemotherapy has been posited as a prognostic factor in individuals with HL to distinguish between those with a poor prognosis and those with a better prognosis. This distinction is important to inform decision-making on the clinical pathway of individuals with HL. OBJECTIVES: To determine whether in previously untreated adults with HL receiving first-line therapy, interim PET scan results can distinguish between those with a poor prognosis and those with a better prognosis, and thereby predict survival outcomes in each group. SEARCH METHODS: We searched MEDLINE, Embase, CENTRAL and conference proceedings up until April 2019. We also searched one trial registry (ClinicalTrials.gov). SELECTION CRITERIA: We included retrospective and prospective studies evaluating interim PET scans in a minimum of 10 individuals with HL (all stages) undergoing first-line therapy. Interim PET was defined as conducted during therapy (after one, two, three or four treatment cycles). The minimum follow-up period was at least 12 months. We excluded studies if the trial design allowed treatment modification based on the interim PET scan results. DATA COLLECTION AND ANALYSIS: We developed a data extraction form according to the Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS). Two teams of two review authors independently screened the studies, extracted data on overall survival (OS), progression-free survival (PFS) and PET-associated adverse events (AEs), assessed risk of bias (per outcome) according to the Quality in Prognosis Studies (QUIPS) tool, and assessed the certainty of the evidence (GRADE). 
We contacted investigators to obtain missing information and data. MAIN RESULTS: Our literature search yielded 11,277 results. In total, we included 23 studies (99 references) with 7335 newly-diagnosed individuals with classic HL (all stages). Participants in 16 studies underwent (interim) PET combined with computed tomography (PET-CT), compared to PET only in the remaining seven studies. The standard chemotherapy regimen included ABVD (16 studies), compared to BEACOPP or other regimens (seven studies). Most studies (N = 21) conducted interim PET scans after two cycles (PET2) of chemotherapy, although PET1, PET3 and PET4 were also reported in some studies. In the meta-analyses, we used PET2 data if available as we wanted to ensure homogeneity between studies. In most studies interim PET scan results were evaluated according to the Deauville 5-point scale (N = 12). Eight studies were not included in meta-analyses due to missing information and/or data; results were reported narratively. For the remaining studies, we pooled the unadjusted hazard ratio (HR). The timing of the outcome measurement was after two or three years (the median follow-up time ranged from 22 to 65 months) in the pooled studies. Eight studies explored the independent prognostic ability of interim PET by adjusting for other established prognostic factors (e.g. disease stage, B symptoms). We did not pool the results because the multivariable analyses adjusted for a different set of factors in each study. Overall survival: Twelve (out of 23) studies reported OS. Six of these were assessed as low risk of bias in all of the first four domains of QUIPS (study participation, study attrition, prognostic factor measurement and outcome measurement). The other six studies were assessed as unclear, moderate or high risk of bias in at least one of these four domains. Four studies were assessed as low risk, and eight studies as high risk of bias for the domain other prognostic factors (covariates). 
Nine studies were assessed as low risk, and three studies as high risk of bias for the domain 'statistical analysis and reporting'. We pooled nine studies with 1802 participants. Participants with HL who have a negative interim PET scan result probably have a large advantage in OS compared to those with a positive interim PET scan result (unadjusted HR 5.09, 95% confidence interval (CI) 2.64 to 9.81, I² = 44%, moderate-certainty evidence). In absolute values, this means that 900 out of 1000 participants with a negative interim PET scan result will probably survive longer than three years compared to 585 (95% CI 356 to 757) out of 1000 participants with a positive result. Adjusted results from two studies also indicate an independent prognostic value of interim PET scan results (moderate-certainty evidence). Progression-free survival: Twenty-one studies reported PFS. Eleven out of 21 were assessed as low risk of bias in the first four domains. The remaining were assessed as unclear, moderate or high risk of bias in at least one of the four domains. Eleven studies were assessed as low risk, and ten studies as high risk of bias for the domain other prognostic factors (covariates). Eight studies were assessed as high risk, thirteen as low risk of bias for statistical analysis and reporting. We pooled 14 studies with 2079 participants. Participants who have a negative interim PET scan result may have an advantage in PFS compared to those with a positive interim PET scan result, but the evidence is very uncertain (unadjusted HR 4.90, 95% CI 3.47 to 6.90, I² = 45%, very low-certainty evidence). This means that 850 out of 1000 participants with a negative interim PET scan result may be progression-free longer than three years compared to 451 (95% CI 326 to 569) out of 1000 participants with a positive result. Adjusted results (not pooled) from eight studies also indicate that there may be an independent prognostic value of interim PET scan results (low-certainty evidence). 
PET-associated adverse events: No study measured PET-associated AEs. AUTHORS' CONCLUSIONS: This review provides moderate-certainty evidence that interim PET scan results predict OS, and very low-certainty evidence that interim PET scan results predict progression-free survival in treated individuals with HL. This evidence is primarily based on unadjusted data. More studies are needed to test the adjusted prognostic ability of interim PET against established prognostic factors.
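
Illustrative sketch (not part of the abstract): how a pooled hazard ratio can be turned into the "out of 1000" figures quoted above. The abstract does not say how its absolute values were derived. Under a proportional-hazards assumption, survival in the PET-positive group equals survival in the PET-negative group raised to the power of the HR, and that relation reproduces the reported numbers. The function names are hypothetical.

```python
def survival_from_hr(reference_survival, hazard_ratio):
    """Survival proportion in the comparison group under proportional
    hazards: S_comparison = S_reference ** HR."""
    if not 0.0 < reference_survival < 1.0:
        raise ValueError("reference_survival must lie strictly between 0 and 1")
    if hazard_ratio <= 0.0:
        raise ValueError("hazard_ratio must be positive")
    return reference_survival ** hazard_ratio


def per_1000(reference_survival, hr, hr_low, hr_high):
    """Point estimate and 95% CI bounds per 1000 participants.
    A larger HR means lower survival, so the upper HR bound gives the
    lower survival bound and vice versa."""
    point = round(1000 * survival_from_hr(reference_survival, hr))
    lower = round(1000 * survival_from_hr(reference_survival, hr_high))
    upper = round(1000 * survival_from_hr(reference_survival, hr_low))
    return point, lower, upper


# Overall survival: 900 per 1000 PET-negative, HR 5.09 (2.64 to 9.81).
os_result = per_1000(0.900, 5.09, 2.64, 9.81)
# Progression-free survival: 850 per 1000 PET-negative, HR 4.90 (3.47 to 6.90).
pfs_result = per_1000(0.850, 4.90, 3.47, 6.90)

print("OS  per 1000, PET-positive: %d (95%% CI %d to %d)" % os_result)
print("PFS per 1000, PET-positive: %d (95%% CI %d to %d)" % pfs_result)
```

With these inputs the sketch returns 585 (356 to 757) for OS and 451 (326 to 569) for PFS, matching the abstract's figures.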


Subject(s)
Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Hodgkin Disease/drug therapy , Positron Emission Tomography Computed Tomography/methods , Chemoradiotherapy , Decision Making , Disease Progression , Disease-Free Survival , Humans , Prognosis , Young Adult
9.
Cochrane Database Syst Rev ; 9: CD012643, 2019 09 16.
Article in English | MEDLINE | ID: mdl-31525824

ABSTRACT

BACKGROUND: Hodgkin lymphoma (HL) is one of the most common haematological malignancies in young adults and, with cure rates of 90%, has become curable for the majority of individuals. Positron emission tomography (PET) is an imaging tool used to monitor a tumour's metabolic activity, stage and progression. Interim PET during chemotherapy has been posited as a prognostic factor in individuals with HL to distinguish between those with a poor prognosis and those with a better prognosis. This distinction is important to inform decision-making on the clinical pathway of individuals with HL. OBJECTIVES: To determine whether in previously untreated adults with HL receiving first-line therapy, interim PET scan results can distinguish between those with a poor prognosis and those with a better prognosis, and thereby predict survival outcomes in each group. SEARCH METHODS: We searched MEDLINE, Embase, CENTRAL and conference proceedings up until April 2019. We also searched one trial registry (ClinicalTrials.gov). SELECTION CRITERIA: We included retrospective and prospective studies evaluating interim PET scans in a minimum of 10 individuals with HL (all stages) undergoing first-line therapy. Interim PET was defined as conducted during therapy (after one, two, three or four treatment cycles). The minimum follow-up period was at least 12 months. We excluded studies if the trial design allowed treatment modification based on the interim PET scan results. DATA COLLECTION AND ANALYSIS: We developed a data extraction form according to the Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS). Two teams of two review authors independently screened the studies, extracted data on overall survival (OS), progression-free survival (PFS) and PET-associated adverse events (AEs), assessed risk of bias (per outcome) according to the Quality in Prognosis Studies (QUIPS) tool, and assessed the certainty of the evidence (GRADE). 
We contacted investigators to obtain missing information and data. MAIN RESULTS: Our literature search yielded 11,277 results. In total, we included 23 studies (99 references) with 7335 newly-diagnosed individuals with classic HL (all stages). Participants in 16 studies underwent (interim) PET combined with computed tomography (PET-CT), compared to PET only in the remaining seven studies. The standard chemotherapy regimen included ABVD (16 studies), compared to BEACOPP or other regimens (seven studies). Most studies (N = 21) conducted interim PET scans after two cycles (PET2) of chemotherapy, although PET1, PET3 and PET4 were also reported in some studies. In the meta-analyses, we used PET2 data if available as we wanted to ensure homogeneity between studies. In most studies interim PET scan results were evaluated according to the Deauville 5-point scale (N = 12). Eight studies were not included in meta-analyses due to missing information and/or data; results were reported narratively. For the remaining studies, we pooled the unadjusted hazard ratio (HR). The timing of the outcome measurement was after two or three years (the median follow-up time ranged from 22 to 65 months) in the pooled studies. Eight studies explored the independent prognostic ability of interim PET by adjusting for other established prognostic factors (e.g. disease stage, B symptoms). We did not pool the results because the multivariable analyses adjusted for a different set of factors in each study. Overall survival: Twelve (out of 23) studies reported OS. Six of these were assessed as low risk of bias in all of the first four domains of QUIPS (study participation, study attrition, prognostic factor measurement and outcome measurement). The other six studies were assessed as unclear, moderate or high risk of bias in at least one of these four domains. Nine studies were assessed as high risk, and three studies as moderate risk of bias for the domain study confounding. 
Eight studies were assessed as low risk, and four studies as high risk of bias for the domain statistical analysis and reporting. We pooled nine studies with 1802 participants. Participants with HL who have a negative interim PET scan result probably have a large advantage in OS compared to those with a positive interim PET scan result (unadjusted HR 5.09, 95% confidence interval (CI) 2.64 to 9.81, I² = 44%, moderate-certainty evidence). In absolute values, this means that 900 out of 1000 participants with a negative interim PET scan result will probably survive longer than three years compared to 585 (95% CI 356 to 757) out of 1000 participants with a positive result. Adjusted results from two studies also indicate an independent prognostic value of interim PET scan results (moderate-certainty evidence). Progression-free survival: Twenty-one studies reported PFS. Eleven out of 21 were assessed as low risk of bias in the first four domains. The remaining were assessed as unclear, moderate or high risk of bias in at least one of the four domains. Eleven studies were assessed as high risk, nine studies as moderate risk and one study as low risk of bias for study confounding. Eight studies were assessed as high risk, three as moderate risk and nine as low risk of bias for statistical analysis and reporting. We pooled 14 studies with 2079 participants. Participants who have a negative interim PET scan result may have an advantage in PFS compared to those with a positive interim PET scan result, but the evidence is very uncertain (unadjusted HR 4.90, 95% CI 3.47 to 6.90, I² = 45%, very low-certainty evidence). 
This means that 850 out of 1000 participants with a negative interim PET scan result may be progression-free longer than three years compared to 451 (95% CI 326 to 569) out of 1000 participants with a positive result. Adjusted results (not pooled) from eight studies also indicate that there may be an independent prognostic value of interim PET scan results (low-certainty evidence). PET-associated adverse events: No study measured PET-associated AEs. AUTHORS' CONCLUSIONS: This review provides moderate-certainty evidence that interim PET scan results predict OS, and very low-certainty evidence that interim PET scan results predict progression-free survival in treated individuals with HL. This evidence is primarily based on unadjusted data. More studies are needed to test the adjusted prognostic ability of interim PET against established prognostic factors.
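
The absolute numbers quoted in this abstract (for example 585 out of 1000 for OS) can be related to the hazard ratios with a short calculation. The abstract does not say how the conversion was done; the sketch below assumes proportional hazards, so that survival in the PET-positive group equals the PET-negative survival raised to the power of the HR. The function names are illustrative only.

```python
def survival_from_hr(baseline_survival: float, hazard_ratio: float) -> float:
    """Survival in the comparison group, assuming proportional hazards:
    S_comparison = S_baseline ** HR (an assumption, not stated in the abstract)."""
    return baseline_survival ** hazard_ratio


def per_1000(probability: float) -> int:
    """Express a probability as a whole number of participants per 1000."""
    return round(1000 * probability)


# Overall survival: 900 per 1000 in the PET-negative group, HR 5.09 (95% CI 2.64 to 9.81).
os_point = per_1000(survival_from_hr(0.900, 5.09))
# A larger HR means lower survival, so the upper HR limit gives the lower bound.
os_low = per_1000(survival_from_hr(0.900, 9.81))
os_high = per_1000(survival_from_hr(0.900, 2.64))

# Progression-free survival: 850 per 1000, HR 4.90 (95% CI 3.47 to 6.90).
pfs_point = per_1000(survival_from_hr(0.850, 4.90))
pfs_low = per_1000(survival_from_hr(0.850, 6.90))
pfs_high = per_1000(survival_from_hr(0.850, 3.47))

print(os_point, os_low, os_high)     # 585 356 757
print(pfs_point, pfs_low, pfs_high)  # 451 326 569
```

Under this assumption the calculation reproduces the per-1000 figures reported above.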


Subject(s)
Chemoradiotherapy/methods , Hodgkin Disease/diagnostic imaging , Hodgkin Disease/drug therapy , Positron-Emission Tomography/methods , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Disease Progression , Disease-Free Survival , Humans , Prognosis , Randomized Controlled Trials as Topic
10.
Stat Methods Med Res ; 28(8): 2455-2474, 2019 08.
Article in English | MEDLINE | ID: mdl-29966490

ABSTRACT

Binary logistic regression is one of the most frequently applied statistical approaches for developing clinical prediction models. Developers of such models often rely on an Events Per Variable criterion (EPV), notably EPV ≥10, to determine the minimal sample size required and the maximum number of candidate predictors that can be examined. We present an extensive simulation study in which we studied the influence of EPV, events fraction, number of candidate predictors, the correlations and distributions of candidate predictor variables, area under the ROC curve, and predictor effects on out-of-sample predictive performance of prediction models. The out-of-sample performance (calibration, discrimination and probability prediction error) of developed prediction models was studied before and after regression shrinkage and variable selection. The results indicate that EPV does not have a strong relation with metrics of predictive performance, and is not an appropriate criterion for (binary) prediction model development studies. We show that out-of-sample predictive performance can better be approximated by considering the number of predictors, the total sample size and the events fraction. We propose that the development of new sample size criteria for prediction models should be based on these three parameters, and provide suggestions for improving sample size determination.
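
To make the EPV criterion concrete, here is a minimal sketch of how EPV relates to the three quantities the abstract proposes instead (number of predictors, total sample size and events fraction). The `Design` class and the two example designs are hypothetical and are not taken from the simulation study.

```python
from dataclasses import dataclass


@dataclass
class Design:
    """A hypothetical development data set for a binary prediction model."""
    n_total: int        # total sample size
    n_events: int       # number of participants with the outcome
    n_predictors: int   # number of candidate predictors

    @property
    def events_fraction(self) -> float:
        return self.n_events / self.n_total

    @property
    def epv(self) -> float:
        """Events per variable: outcome events divided by candidate predictors."""
        return self.n_events / self.n_predictors


# Two illustrative designs that both satisfy EPV = 10 ...
small_balanced = Design(n_total=200, n_events=100, n_predictors=10)
large_rare = Design(n_total=1000, n_events=100, n_predictors=10)

for design in (small_balanced, large_rare):
    print(design.epv, design.n_total, design.events_fraction)
# ... yet differ in total sample size and events fraction.
```

The point of the sketch is only that a single EPV value can describe quite different designs, which is consistent with the abstract's suggestion to look at the three parameters directly.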


Subject(s)
Models, Statistical , Sample Size , Computer Simulation , Humans , Logistic Models , Research Design
11.
Stat Med ; 38(7): 1276-1296, 2019 03 30.
Article in English | MEDLINE | ID: mdl-30357870

ABSTRACT

When designing a study to develop a new prediction model with binary or time-to-event outcomes, researchers should ensure their sample size is adequate in terms of the number of participants (n) and outcome events (E) relative to the number of predictor parameters (p) considered for inclusion. We propose that the minimum values of n and E (and subsequently the minimum number of events per predictor parameter, EPP) should be calculated to meet the following three criteria: (i) small optimism in predictor effect estimates as defined by a global shrinkage factor of ≥0.9, (ii) small absolute difference of ≤0.05 in the model's apparent and adjusted Nagelkerke's R², and (iii) precise estimation of the overall risk in the population. Criteria (i) and (ii) aim to reduce overfitting conditional on a chosen p, and require prespecification of the model's anticipated Cox-Snell R², which we show can be obtained from previous studies. The values of n and E that meet all three criteria provide the minimum sample size required for model development. Upon application of our approach, a new diagnostic model for Chagas disease requires an EPP of at least 4.8 and a new prognostic model for recurrent venous thromboembolism requires an EPP of at least 23. This reinforces why rules of thumb (eg, 10 EPP) should be avoided. Researchers might additionally ensure the sample size gives precise estimates of key predictor effects; this is especially important when key categorical predictors have few events in some categories, as this may substantially increase the numbers required.
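
As a rough illustration of criteria (i) and (iii), the sketch below computes candidate minimum sample sizes. The abstract gives no formulas, so two assumptions are made: for criterion (i), the closed form n = p / ((S − 1) ln(1 − R²_CS / S)) with target shrinkage S; for criterion (iii), a binary outcome whose overall risk is estimated to within a 0.05 margin of error at 95% confidence. The input values are hypothetical, and criterion (ii) is not implemented.

```python
import math


def n_for_shrinkage(p: int, r2_cs: float, shrinkage: float = 0.9) -> int:
    """Criterion (i): smallest n giving an expected global shrinkage factor
    of at least `shrinkage`, under the assumed closed form
    n = p / ((S - 1) * ln(1 - R2_CS / S))."""
    return math.ceil(p / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage)))


def n_for_overall_risk(event_fraction: float, margin: float = 0.05,
                       z: float = 1.96) -> int:
    """Criterion (iii), binary outcome: smallest n for which the overall risk
    is estimated within +/- `margin` (the margin is an assumption)."""
    return math.ceil((z / margin) ** 2 * event_fraction * (1 - event_fraction))


def events_per_parameter(n: int, event_fraction: float, p: int) -> float:
    """EPP implied by a sample size, an events fraction and p parameters."""
    return n * event_fraction / p


# Hypothetical inputs: 25 predictor parameters, anticipated Cox-Snell R2 of 0.051,
# and an anticipated events fraction of 0.5.
n_i = n_for_shrinkage(p=25, r2_cs=0.051)
n_iii = n_for_overall_risk(event_fraction=0.5)
n_min = max(n_i, n_iii)  # the largest requirement drives the design
print(n_i, n_iii, events_per_parameter(n_min, 0.5, 25))
```

Taking the maximum over the criteria mirrors the abstract's statement that the chosen n and E must satisfy all criteria at once.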


Subject(s)
Multivariate Analysis , Regression Analysis , Sample Size , Computer Simulation , Humans , Time
12.
Stat Methods Med Res ; 28(9): 2768-2786, 2019 09.
Article in English | MEDLINE | ID: mdl-30032705

ABSTRACT

It is widely recommended that any developed diagnostic or prognostic prediction model is externally validated in terms of its predictive performance measured by calibration and discrimination. When multiple validations have been performed, a systematic review followed by a formal meta-analysis helps to summarize overall performance across multiple settings, and reveals under which circumstances the model performs suboptimally and may need adjustment. We discuss how to undertake meta-analysis of the performance of prediction models with either a binary or a time-to-event outcome. We address how to deal with incomplete availability of study-specific results (performance estimates and their precision), and how to produce summary estimates of the c-statistic, the observed:expected ratio and the calibration slope. Furthermore, we discuss the implementation of frequentist and Bayesian meta-analysis methods, and propose novel empirically-based prior distributions to improve estimation of between-study heterogeneity in small samples. Finally, we illustrate all methods using two examples: meta-analysis of the predictive performance of EuroSCORE II and of the Framingham Risk Score. All examples and meta-analysis models have been implemented in our newly developed R package "metamisc".
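
To show what a frequentist random-effects summary of a performance measure can look like, here is a minimal sketch that pools log observed:expected ratios with the DerSimonian-Laird estimator. This estimator is a substitute chosen for brevity; the abstract does not say which estimator the authors use, the code is not the "metamisc" API, and the study values are invented.

```python
import math


def dersimonian_laird(estimates, std_errors):
    """Random-effects pooling with the DerSimonian-Laird estimate of the
    between-study variance. Returns (pooled estimate, its SE, tau squared)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    # Cochran's Q measures the spread of the studies around the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    return pooled, math.sqrt(1.0 / sum(re_weights)), tau2


# Hypothetical log(O:E) ratios and standard errors from four validation studies.
log_oe = [math.log(0.8), math.log(1.1), math.log(0.95), math.log(1.3)]
se_log_oe = [0.10, 0.15, 0.08, 0.20]

pooled, se, tau2 = dersimonian_laird(log_oe, se_log_oe)
lower, upper = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(math.exp(pooled), lower, upper, tau2)
```

The summary is computed on the log scale and back-transformed, so the interval for the O:E ratio stays positive.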


Subject(s)
Meta-Analysis as Topic , Models, Statistical , Research Design , Risk Assessment/methods , Bayes Theorem , Calibration , Humans , Prognosis , Systematic Reviews as Topic
15.
Health Serv Insights ; 11: 1178632918785133, 2018.
Article in English | MEDLINE | ID: mdl-30083056

ABSTRACT

BACKGROUND: When profiling health care providers, adjustment for case-mix is essential. However, conventional risk adjustment methods may perform poorly, especially when provider volumes are small or events rare. Propensity score (PS) methods, commonly used in observational studies of binary treatments, have been shown to perform well when the number of observations and/or events is low and can be extended to a multiple provider setting. The objective of this study was to evaluate the performance of different risk adjustment methods when profiling multiple health care providers that perform highly protocolized procedures, such as coronary artery bypass grafting. METHODS: In a simulation study, provider effects estimated using PS adjustment, PS weighting, PS matching, and multivariable logistic regression were compared in terms of bias, coverage and mean squared error (MSE) when varying the event rate, sample size, provider volumes, and number of providers. An empirical example from the field of cardiac surgery was used to demonstrate the different methods. RESULTS: Overall, PS adjustment, PS weighting, and logistic regression resulted in provider effects with low amounts of bias and good coverage. PS matching and PS weighting with trimming led to biased effects and high MSE across several scenarios. Moreover, PS matching is not practical to implement when the number of providers surpasses three. CONCLUSIONS: None of the PS methods clearly outperformed logistic regression, except when sample sizes were relatively small. Propensity score matching performed worse than the other PS methods considered.

16.
Eur J Prev Cardiol ; 25(6): 642-650, 2018 04.
Article in English | MEDLINE | ID: mdl-29411690

ABSTRACT

Background: Cardiovascular disease (CVD) prevention is commonly focused on providing individuals at high predicted CVD risk with preventive medication. Because CVD risk increases rapidly with age, current risk-based selection of individuals mainly targets the elderly. However, the lifelong (preventable) consequences of CVD events may be larger in younger individuals. The purpose of this paper is to investigate if health benefits from preventive treatment may increase when the selection strategy is further optimised. Methods: Data from three Dutch cohorts were combined (n = 47469, men:women 1:1.92) and classified into subgroups based on age and gender. The Framingham global risk score was used to estimate 10-year CVD risk. The associated lifelong burden of CVD events according to this 10-year CVD risk was expressed as quality-adjusted life years lost. Based on this approach, the additional health benefits from preventive treatment, reducing this 10-year CVD risk, from selecting individuals based on their expected CVD burden rather than their expected CVD risk were estimated. These benefits were expressed as quality-adjusted life years gained over lifetime. Results: When using the current selection strategy (10% risk threshold), 32% of the individuals were selected for preventive treatment. When the same proportion was selected based on burden, more younger and fewer older individuals would receive treatment. Across all individuals, the gain in quality-adjusted life years was 217 between the two strategies, over a 10-year time horizon. In addition, when combining the strategies 5% extra eligible individuals were selected resulting in a gain of 628 quality-adjusted life years. Conclusion: Improvement of the selection approach of individuals can help to reduce further the CVD burden. Selecting individuals for preventive treatment based on their expected CVD burden will provide more younger and fewer older individuals with treatment, and will reduce the overall CVD burden.


Subject(s)
Cardiovascular Diseases/prevention & control , Primary Prevention/methods , Public Health , Quality-Adjusted Life Years , Risk Assessment/methods , Aged , Cardiovascular Diseases/economics , Cardiovascular Diseases/epidemiology , Cost-Benefit Analysis , Female , Humans , Male , Middle Aged , Netherlands/epidemiology , Primary Prevention/economics , Risk Factors
17.
Stat Methods Med Res ; 27(5): 1351-1364, 2018 05.
Article in English | MEDLINE | ID: mdl-27487843

ABSTRACT

Network meta-analysis (NMA) is a common approach to summarizing relative treatment effects from randomized trials with different treatment comparisons. Most NMAs are based on published aggregate data (AD) and have limited possibilities for investigating the extent of network consistency and between-study heterogeneity. Given that individual participant data (IPD) are considered the gold standard in evidence synthesis, we explored statistical methods for IPD-NMA and investigated their potential advantages and limitations, compared with AD-NMA. We discuss several one-stage random-effects NMA models that account for within-trial imbalances, treatment effect modifiers, missing response data and longitudinal responses. We illustrate all models in a case study of 18 antidepressant trials with a continuous endpoint (the Hamilton Depression Score). All trials suffered from drop-out; missingness of longitudinal responses ranged from 21 to 41% after 6 weeks of follow-up. Our results indicate that NMA based on IPD may lead to increased precision of estimated treatment effects. Furthermore, it can help to improve network consistency and explain between-study heterogeneity by adjusting for participant-level effect modifiers and adopting more advanced models for dealing with missing response data. We conclude that implementation of IPD-NMA should be considered when trials are affected by substantial drop-out, and when treatment effects are potentially influenced by participant-level covariates.


Subject(s)
Data Interpretation, Statistical , Network Meta-Analysis , Antidepressive Agents/therapeutic use , Depression/drug therapy , Humans , Longitudinal Studies , Models, Statistical , Patient Dropouts/statistics & numerical data , Treatment Outcome
18.
Stat Methods Med Res ; 27(11): 3505-3522, 2018 11.
Article in English | MEDLINE | ID: mdl-28480827

ABSTRACT

If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and therefore we recommend these scales to be used for meta-analysis. An illustrated example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.
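
The recommended scales can be sketched as follows: the C-statistic is moved to the logit scale and E/O to the log scale before pooling, and results are back-transformed afterwards. The standard errors on the transformed scales use a first-order delta-method approximation, which is an assumption here and not something the abstract specifies; the numeric inputs are hypothetical.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def logit_c_statistic(c: float, se_c: float):
    """C-statistic and its SE on the logit scale (delta method: SE / (c * (1 - c)))."""
    return logit(c), se_c / (c * (1.0 - c))


def log_eo_ratio(eo: float, se_eo: float):
    """E/O ratio and its SE on the log scale (delta method: SE / ratio)."""
    return math.log(eo), se_eo / eo


def back_transformed_ci(estimate: float, se: float, inverse, z: float = 1.96):
    """95% interval computed on the transformed scale, then mapped back."""
    return inverse(estimate - z * se), inverse(estimate), inverse(estimate + z * se)


# Hypothetical study: C = 0.95 with SE 0.05. A symmetric interval on the original
# scale (0.95 +/- 1.96 * 0.05) would exceed 1; the logit-scale interval cannot.
est, se = logit_c_statistic(0.95, 0.05)
c_low, c_mid, c_high = back_transformed_ci(est, se, inv_logit)

log_est, log_se = log_eo_ratio(1.20, 0.15)
eo_low, eo_mid, eo_high = back_transformed_ci(log_est, log_se, math.exp)
print(c_low, c_mid, c_high, eo_low, eo_mid, eo_high)
```

Working on these scales keeps the C-statistic within (0, 1) and E/O above 0 after back-transformation.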


Subject(s)
Calibration , Forecasting , Models, Statistical , Algorithms , Biomedical Research/statistics & numerical data , Treatment Outcome , Validation Studies as Topic
19.
Diagn Progn Res ; 1: 8, 2017.
Article in English | MEDLINE | ID: mdl-31093539

ABSTRACT

BACKGROUND: The use of multinomial logistic regression models is advocated for modeling the associations of covariates with three or more mutually exclusive outcome categories. As compared to a binary logistic regression analysis, the simultaneous modeling of multiple outcome categories using a multinomial model often better resembles the clinical setting, where a physician typically must distinguish between more than two possible diagnoses or outcome events for an individual patient (e.g., the differential diagnosis). A disadvantage of the multinomial logistic model is that the interpretation of its results is often complex. In particular, the calculation of predicted probabilities for the various outcomes requires a series of careful calculations. Nomograms are widely used in studies reporting binary logistic regression models to facilitate the interpretation of the results and allow the calculation of the predicted probability for individuals. METHODS AND RESULTS: In this paper we outline an approach for deriving a generic nomogram for multinomial logistic regression models and an accompanying scoring chart that can further simplify the calculation of predicted multinomial probabilities. We illustrate the use of the nomogram and scoring chart and their interpretation using a clinical example. CONCLUSIONS: The generic multinomial nomogram and scoring chart can be used irrespective of the number of outcome categories that are present.
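
The "series of careful calculations" behind multinomial predicted probabilities can be sketched directly. The example assumes the usual baseline-category parameterisation, in which each non-reference outcome has its own intercept and coefficients and the reference outcome has a linear predictor of zero. The outcome names, coefficients and patient values are made up for illustration and do not come from the paper.

```python
import math


def multinomial_probabilities(x, coefficients, reference="reference"):
    """Predicted probabilities for a baseline-category multinomial logit model.

    x            -- list of predictor values for one individual
    coefficients -- dict mapping each non-reference outcome to
                    (intercept, [one coefficient per predictor])
    """
    # One linear predictor per non-reference outcome.
    linear = {
        outcome: intercept + sum(b * xi for b, xi in zip(betas, x))
        for outcome, (intercept, betas) in coefficients.items()
    }
    # The reference outcome contributes exp(0) = 1 to the denominator.
    denominator = 1.0 + sum(math.exp(lp) for lp in linear.values())
    probabilities = {reference: 1.0 / denominator}
    for outcome, lp in linear.items():
        probabilities[outcome] = math.exp(lp) / denominator
    return probabilities


# Hypothetical three-outcome model with two predictors.
model = {
    "diagnosis_A": (-1.0, [0.8, 0.02]),
    "diagnosis_B": (-2.0, [0.3, 0.05]),
}
patient = [1.0, 40.0]
predicted = multinomial_probabilities(patient, model, reference="no_disease")
print(predicted, sum(predicted.values()))
```

The same function works for any number of outcome categories, since each additional category only adds one term to the denominator.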

20.
Eur J Prev Cardiol ; 23(16): 1755-1765, 2016 11.
Article in English | MEDLINE | ID: mdl-27378766

ABSTRACT

OBJECTIVE: There is uncertainty about the direction and magnitude of the associations between parity, breastfeeding and the risk of coronary heart disease (CHD). We examined the separate and combined associations of parity and breastfeeding practices with the incidence of CHD later in life among women in a large, pan-European cohort study. METHODS: Data were used from European Prospective Investigation into Cancer and Nutrition (EPIC)-CVD, a case-cohort study nested within the EPIC prospective study of 520,000 participants from 10 countries. Information on reproductive history was available for 14,917 women, including 5138 incident cases of CHD. Using Prentice-weighted Cox regression separately for each country followed by a random-effects meta-analysis, we calculated hazard ratios (HRs) and 95% confidence intervals (CIs) for CHD, after adjustment for age, study centre and several socioeconomic and biological risk factors. RESULTS: Compared with nulliparous women, the adjusted HR was 1.19 (95% CI: 1.01-1.41) among parous women; HRs were higher among women with more children (e.g., adjusted HR: 1.95 (95% CI: 1.19-3.20) for women with five or more children). Compared with women who did not breastfeed, the adjusted HR was 0.71 (95% CI: 0.52-0.98) among women who breastfed. For childbearing women who never breastfed, the adjusted HR was 1.58 (95% CI: 1.09-2.30) compared with nulliparous women, whereas for childbearing women who breastfed, the adjusted HR was 1.19 (95% CI: 0.99-1.43). CONCLUSION: Having more children was associated with a higher risk of CHD later in life, whereas breastfeeding was associated with a lower CHD risk. Women who both had children and breastfed did have a non-significantly higher risk of CHD.


Subject(s)
Breast Feeding , Coronary Disease/epidemiology , Parity , Population Surveillance , Risk Assessment/methods , Adult , Age of Onset , Coronary Disease/etiology , Europe/epidemiology , Female , Follow-Up Studies , Forecasting , Humans , Incidence , Middle Aged , Pregnancy , Prospective Studies , Risk Factors