Results 1 - 20 of 187
1.
Lancet Oncol ; 25(10): 1371-1386, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39362250

ABSTRACT

BACKGROUND: Multiple risk-prediction models are used in clinical practice to triage patients as being at low risk or high risk of ovarian cancer. In the ROCkeTS study, we aimed to identify the best diagnostic test for ovarian cancer in symptomatic patients, through head-to-head comparisons of risk-prediction models, in a real-world setting. Here, we report the results for the postmenopausal cohort. METHODS: In this multicentre, prospective diagnostic accuracy study, we recruited newly presenting female patients aged 16-90 years with non-specific symptoms and raised CA125 or abnormal ultrasound results (or both) who had been referred via rapid access, elective clinics, or emergency presentations from 23 hospitals in the UK. Patients with normal CA125 and simple ovarian cysts of smaller than 5 cm in diameter, active non-ovarian malignancy, or previous ovarian malignancy, or those who were pregnant or declined a transvaginal scan, were ineligible. In this analysis, only postmenopausal participants were included. Participants completed a symptom questionnaire, gave a blood sample, and had transabdominal and transvaginal ultrasounds performed by International Ovarian Tumour Analysis consortium (IOTA)-certified sonographers. Index tests were Risk of Malignancy 1 (RMI1) at a threshold of 200, Risk of Malignancy Algorithm (ROMA) at multiple thresholds, IOTA Assessment of Different Neoplasias in the Adnexa (ADNEX) at thresholds of 3% and 10%, IOTA SRRisk model at thresholds of 3% and 10%, IOTA Simple Rules (malignant vs benign, or inconclusive), and CA125 at 35 IU/mL. In a post-hoc analysis, the Ovarian Adnexal and Reporting Data System (ORADS) at 10% was derived from IOTA ultrasound variables using established methods since ORADS was described after completion of recruitment. Index tests were conducted by study staff masked to the results of the reference standard. The comparator was RMI1 at the 250 threshold (the current UK National Health Service standard of care). 
The reference standard was surgical or biopsy tissue histology or cytology within 3 months, or a self-reported diagnosis of ovarian cancer at 12 month follow-up. The primary outcome was diagnostic accuracy at predicting primary invasive ovarian cancer versus benign or normal histology, assessed by analysing the sensitivity, specificity, C-index, area under receiver operating characteristic curve, positive and negative predictive values, and calibration plots in participants with conclusive reference standard results and available index test data. This study is registered with the International Standard Randomised Controlled Trial Number registry (ISRCTN17160843). FINDINGS: Between July 13, 2015, and Nov 30, 2018, 1242 postmenopausal patients were recruited, of whom 215 (17%) had primary ovarian cancer. 166 participants had missing, inconclusive, or other reference standard results; therefore, data from a maximum of 1076 participants were used to assess the index tests for the primary outcome. Compared with RMI1 at 250 (sensitivity 82·9% [95% CI 76·7 to 88·0], specificity 87·4% [84·9 to 89·6]), IOTA ADNEX at 10% was more sensitive (difference of -13·9% [-20·2 to -7·6], p<0·0001) but less specific (difference of 28·5% [24·7 to 32·3], p<0·0001). ROMA at 29·9 had similar sensitivity (difference of -3·6% [-9·1 to 1·9], p=0·24) but lower specificity (difference of 5·2% [2·5 to 8·0], p=0·0001). RMI1 at 200 had similar sensitivity (difference of -2·1% [-4·7 to 0·5], p=0·13) but lower specificity (difference of 3·0% [1·7 to 4·3], p<0·0001). IOTA SRRisk model at 10% had similar sensitivity (difference of -4·3% [-11·0 to -2·3], p=0·23) but lower specificity (difference of 16·2% [12·6 to 19·8], p<0·0001). IOTA Simple Rules had similar sensitivity (difference of -1·6% [-9·3 to 6·2], p=0·82) and specificity (difference of -2·2% [-5·1 to 0·6], p=0·14). 
CA125 at 35 IU/mL had similar sensitivity (difference of -2·1% [-6·6 to 2·3], p=0·42) but higher specificity (difference of 6·7% [4·3 to 9·1], p<0·0001). In a post-hoc analysis, when compared with RMI1 at 250, ORADS achieved similar sensitivity (difference of -2·1%, 95% CI -8·6 to 4·3, p=0·60) and lower specificity (difference of 10·2%, 95% CI 6·8 to 13·6, p<0·0001). INTERPRETATION: In view of its higher sensitivity than RMI1 at 250, despite some loss in specificity, we recommend that IOTA ADNEX at 10% should be considered as the new standard-of-care diagnostic in ovarian cancer for postmenopausal patients. FUNDING: UK National Institute of Health Research.


Subjects
CA-125 Antigen; Ovarian Neoplasms; Postmenopause; Humans; Female; Middle Aged; Ovarian Neoplasms/diagnosis; Ovarian Neoplasms/blood; Ovarian Neoplasms/epidemiology; Ovarian Neoplasms/pathology; Ovarian Neoplasms/diagnostic imaging; Aged; Prospective Studies; Adult; United Kingdom/epidemiology; Risk Assessment; Aged, 80 and over; CA-125 Antigen/blood; Adolescent; Young Adult; Predictive Value of Tests; Ultrasonography; Risk Factors
2.
Br J Cancer ; 130(6): 934-940, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38243011

ABSTRACT

BACKGROUND: Several diagnostic prediction models to help clinicians discriminate between benign and malignant adnexal masses are available. This study is a head-to-head comparison of the performance of the Assessment of Different NEoplasias in the adneXa (ADNEX) model with that of the Risk of Ovarian Malignancy Algorithm (ROMA). METHODS: This is a retrospective study based on prospectively included consecutive women with an adnexal tumour scheduled for surgery at five oncology centres and one non-oncology centre in four countries between 2015 and 2019. The reference standard was histology. Model performance for ADNEX and ROMA was evaluated regarding discrimination, calibration, and clinical utility. RESULTS: The primary analysis included 894 patients, of whom 434 (49%) had a malignant tumour. The area under the receiver operating characteristic curve (AUC) was 0.92 (95% CI 0.88-0.95) for ADNEX with CA125, 0.90 (0.84-0.94) for ADNEX without CA125, and 0.85 (0.80-0.89) for ROMA. ROMA, and to a lesser extent ADNEX, underestimated the risk of malignancy. Clinical utility was highest for ADNEX. ROMA had no clinical utility at decision thresholds <27%. CONCLUSIONS: ADNEX had better ability to discriminate between benign and malignant adnexal tumours and higher clinical utility than ROMA. CLINICAL TRIAL REGISTRATION: clinicaltrials.gov NCT01698632 and NCT02847832.


Subjects
Adnexal Diseases; Ovarian Neoplasms; Humans; Female; Retrospective Studies; Ultrasonography; Ovarian Neoplasms/diagnosis; Ovarian Neoplasms/surgery; Ovarian Neoplasms/pathology; Adnexal Diseases/diagnosis; Adnexal Diseases/surgery; Adnexal Diseases/pathology; Algorithms; Sensitivity and Specificity; CA-125 Antigen
3.
N Engl J Med ; 385(2): 107-118, 2021 07 08.
Article in English | MEDLINE | ID: mdl-34106556

ABSTRACT

BACKGROUND: Observational studies have shown that fetoscopic endoluminal tracheal occlusion (FETO) has been associated with increased survival among infants with severe pulmonary hypoplasia due to isolated congenital diaphragmatic hernia on the left side, but data from randomized trials are lacking. METHODS: In this open-label trial conducted at centers with experience in FETO and other types of prenatal surgery, we randomly assigned, in a 1:1 ratio, women carrying singleton fetuses with severe isolated congenital diaphragmatic hernia on the left side to FETO at 27 to 29 weeks of gestation or expectant care. Both treatments were followed by standardized postnatal care. The primary outcome was infant survival to discharge from the neonatal intensive care unit. We used a group-sequential design with five prespecified interim analyses for superiority, with a maximum sample size of 116 women. RESULTS: The trial was stopped early for efficacy after the third interim analysis. In an intention-to-treat analysis that included 80 women, 40% of infants (16 of 40) in the FETO group survived to discharge, as compared with 15% (6 of 40) in the expectant care group (relative risk, 2.67; 95% confidence interval [CI], 1.22 to 6.11; two-sided P = 0.009). Survival to 6 months of age was identical to the survival to discharge (relative risk, 2.67; 95% CI, 1.22 to 6.11). The incidence of preterm, prelabor rupture of membranes was higher among women in the FETO group than among those in the expectant care group (47% vs. 11%; relative risk, 4.51; 95% CI, 1.83 to 11.9), as was the incidence of preterm birth (75% vs. 29%; relative risk, 2.59; 95% CI, 1.59 to 4.52). One neonatal death occurred after emergency delivery for placental laceration from fetoscopic balloon removal, and one neonatal death occurred because of failed balloon removal. 
In an analysis that included 11 additional participants with data that were available after the trial was stopped, survival to discharge was 36% among infants in the FETO group and 14% among those in the expectant care group (relative risk, 2.65; 95% CI, 1.21 to 6.09). CONCLUSIONS: In fetuses with isolated severe congenital diaphragmatic hernia on the left side, FETO performed at 27 to 29 weeks of gestation resulted in a significant benefit over expectant care with respect to survival to discharge, and this benefit was sustained to 6 months of age. FETO increased the risks of preterm, prelabor rupture of membranes and preterm birth. (Funded by the European Commission and others; TOTAL ClinicalTrials.gov number, NCT01240057.).
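
The headline relative risk in this abstract follows directly from the reported counts (16 of 40 versus 6 of 40). A minimal Python sketch of that arithmetic is given here; the function name is illustrative and not taken from the article, and the reported confidence interval comes from the trial's group-sequential analysis, which this sketch does not attempt to reproduce.

```python
def relative_risk(events_treated, n_treated, events_control, n_control):
    """Ratio of the event proportion in the treated group to that in the control group."""
    if n_treated <= 0 or n_control <= 0 or events_control <= 0:
        raise ValueError("group sizes and control events must be positive")
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    return risk_treated / risk_control


# Survival to discharge as reported: 16 of 40 (FETO) vs 6 of 40 (expectant care).
rr = relative_risk(16, 40, 6, 40)
print(round(rr, 2))  # 2.67
```

The same function applied to other counts in these abstracts gives a quick consistency check on reported point estimates.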


Subjects
Balloon Occlusion; Fetal Therapies; Hernias, Diaphragmatic, Congenital/therapy; Trachea/surgery; Adult; Balloon Occlusion/adverse effects; Balloon Occlusion/instrumentation; Balloon Occlusion/methods; Female; Fetal Membranes, Premature Rupture/epidemiology; Fetal Therapies/adverse effects; Fetoscopy; Gestational Age; Hernias, Diaphragmatic, Congenital/mortality; Humans; Intention to Treat Analysis; Obstetric Labor, Premature/epidemiology; Patient Acuity; Pregnancy; Premature Birth/epidemiology; Watchful Waiting
4.
N Engl J Med ; 385(2): 119-129, 2021 07 08.
Article in English | MEDLINE | ID: mdl-34106555

ABSTRACT

BACKGROUND: Fetoscopic endoluminal tracheal occlusion (FETO) has been associated with increased postnatal survival among infants with severe pulmonary hypoplasia due to isolated congenital diaphragmatic hernia on the left side, but data are lacking to inform its effects in infants with moderate disease. METHODS: In this open-label trial conducted at many centers with experience in FETO and other types of prenatal surgery, we randomly assigned, in a 1:1 ratio, women carrying singleton fetuses with a moderate isolated congenital diaphragmatic hernia on the left side to FETO at 30 to 32 weeks of gestation or expectant care. Both treatments were followed by standardized postnatal care. The primary outcomes were infant survival to discharge from a neonatal intensive care unit (NICU) and survival without oxygen supplementation at 6 months of age. RESULTS: In an intention-to-treat analysis involving 196 women, 62 of 98 infants in the FETO group (63%) and 49 of 98 infants in the expectant care group (50%) survived to discharge (relative risk, 1.27; 95% confidence interval [CI], 0.99 to 1.63; two-sided P = 0.06). At 6 months of age, 53 of 98 infants (54%) in the FETO group and 43 of 98 infants (44%) in the expectant care group were alive without oxygen supplementation (relative risk, 1.23; 95% CI, 0.93 to 1.65). The incidence of preterm, prelabor rupture of membranes was higher among women in the FETO group than among those in the expectant care group (44% vs. 12%; relative risk, 3.79; 95% CI, 2.13 to 6.91), as was the incidence of preterm birth (64% vs. 22%, respectively; relative risk, 2.86; 95% CI, 1.94 to 4.34), but FETO was not associated with any other serious maternal complications. There were two spontaneous fetal deaths (one in each group) without obvious cause and one neonatal death that was associated with balloon removal.
CONCLUSIONS: This trial involving fetuses with moderate congenital diaphragmatic hernia on the left side did not show a significant benefit of FETO performed at 30 to 32 weeks of gestation over expectant care with respect to survival to discharge or the need for oxygen supplementation at 6 months. FETO increased the risks of preterm, prelabor rupture of membranes and preterm birth. (Funded by the European Commission and others; TOTAL ClinicalTrials.gov number, NCT00763737.).


Subjects
Balloon Occlusion; Hernias, Diaphragmatic, Congenital/therapy; Trachea/surgery; Adult; Balloon Occlusion/adverse effects; Balloon Occlusion/instrumentation; Balloon Occlusion/methods; Female; Fetal Membranes, Premature Rupture/epidemiology; Fetal Therapies/adverse effects; Fetoscopy; Gestational Age; Hernias, Diaphragmatic, Congenital/mortality; Humans; Intention to Treat Analysis; Obstetric Labor, Premature/epidemiology; Patient Acuity; Pregnancy; Premature Birth/epidemiology; Watchful Waiting
5.
Stat Med ; 43(6): 1119-1134, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38189632

ABSTRACT

Tuning hyperparameters, such as the regularization parameter in Ridge or Lasso regression, is often aimed at improving the predictive performance of risk prediction models. In this study, various hyperparameter tuning procedures for clinical prediction models were systematically compared and evaluated in low-dimensional data. The focus was on out-of-sample predictive performance (discrimination, calibration, and overall prediction error) of risk prediction models developed using Ridge, Lasso, Elastic Net, or Random Forest. The influence of sample size, number of predictors and events fraction on performance of the hyperparameter tuning procedures was studied using extensive simulations. The results indicate important differences between tuning procedures in calibration performance, while generally showing similar discriminative performance. The one-standard-error rule for tuning applied to cross-validation (1SE CV) often resulted in severe miscalibration. Standard non-repeated and repeated cross-validation (both 5-fold and 10-fold) performed similarly well and outperformed the other tuning procedures. Bootstrap showed a slight tendency to more severe miscalibration than standard cross-validation-based tuning procedures. Differences between tuning procedures were larger for smaller sample sizes, lower events fractions and fewer predictors. These results imply that the choice of tuning procedure can have a profound influence on the predictive performance of prediction models. The results support the application of standard 5-fold or 10-fold cross-validation that minimizes out-of-sample prediction error. Despite an increased computational burden, we found no clear benefit of repeated over non-repeated cross-validation for hyperparameter tuning. We warn against the potentially detrimental effects on model calibration of the popular 1SE CV rule for tuning prediction models in low-dimensional settings.


Subjects
Research Design; Humans; Computer Simulation; Sample Size
6.
J Biomed Inform ; 155: 104666, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38848886

ABSTRACT

OBJECTIVE: Class imbalance is sometimes considered a problem when developing clinical prediction models and assessing their performance. To address it, correction strategies involving manipulations of the training dataset, such as random undersampling or oversampling, are frequently used. The aim of this article is to illustrate the consequences of these class imbalance correction strategies on clinical prediction models' internal validity in terms of calibration and discrimination performances. METHODS: We used both heuristic intuition and formal mathematical reasoning to characterize the relations between conditional probabilities of interest and probabilities targeted when using random undersampling or oversampling. We propose a plug-in estimator that represents a natural correction for predictions obtained from models that have been trained on artificially balanced datasets ("naïve" models). We conducted a Monte Carlo simulation with two different data generation processes and present a real-world example using data from the International Stroke Trial database to empirically demonstrate the consequences of applying random resampling techniques for class imbalance correction on calibration and discrimination (in terms of Area Under the ROC, AUC) for logistic regression and tree-based prediction models. RESULTS: Across our simulations and in the real-world example, calibration of the naïve models was very poor. The models using the plug-in estimator generally outperformed the models relying on class imbalance correction in terms of calibration while achieving the same discrimination performance. CONCLUSION: Random resampling techniques for class imbalance correction do not generally improve discrimination performance (i.e., AUC), and their use is hard to justify when aiming at providing calibrated predictions. Improper use of such class imbalance correction techniques can lead to suboptimal data usage and less valid risk prediction models.
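
One common way to map a prediction from a model trained on artificially balanced data back to the original event fraction is the prior-shift odds correction sketched below. This is an assumption made for illustration: the abstract does not state the formula of its plug-in estimator, so the sketch may not coincide with it, and the function name, parameter names, and numbers are hypothetical.

```python
def correct_for_resampling(p_resampled, prevalence_original, prevalence_resampled=0.5):
    """Rescale a predicted probability from the resampled to the original prevalence.

    Works on the odds scale: the model's odds are multiplied by the ratio of the
    original prevalence odds to the resampled prevalence odds.
    Assumes all three inputs lie strictly between 0 and 1.
    """
    for value in (p_resampled, prevalence_original, prevalence_resampled):
        if not 0.0 < value < 1.0:
            raise ValueError("inputs must lie strictly between 0 and 1")
    odds_model = p_resampled / (1.0 - p_resampled)
    odds_original = prevalence_original / (1.0 - prevalence_original)
    odds_resampled = prevalence_resampled / (1.0 - prevalence_resampled)
    corrected_odds = odds_model * odds_original / odds_resampled
    return corrected_odds / (1.0 + corrected_odds)


# Hypothetical example: a model trained on 50/50 balanced data predicts 0.5
# for a patient from a population where the true event fraction is 10%.
print(correct_for_resampling(0.5, 0.10))
```

Because the correction is a monotone transformation of the predicted probability, it changes calibration but leaves the ranking of patients, and therefore the AUC, unchanged.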


Subjects
Monte Carlo Method; Humans; Calibration; ROC Curve; Models, Statistical; Area Under Curve; Computer Simulation; Logistic Models; Algorithms; Risk Assessment/methods
8.
Ann Intern Med ; 176(1): 105-114, 2023 01.
Article in English | MEDLINE | ID: mdl-36571841

ABSTRACT

Risk prediction models need thorough validation to assess their performance. Validation of models for survival outcomes poses challenges due to the censoring of observations and the varying time horizon at which predictions can be made. This article describes measures to evaluate predictions and the potential improvement in decision making from survival models based on Cox proportional hazards regression. As a motivating case study, the authors consider the prediction of the composite outcome of recurrence or death (the "event") in patients with breast cancer after surgery. They developed a simple Cox regression model with 3 predictors, as in the Nottingham Prognostic Index, in 2982 women (1275 events over 5 years of follow-up) and externally validated this model in 686 women (285 events over 5 years). Improvement in performance was assessed after the addition of progesterone receptor as a prognostic biomarker. The model predictions can be evaluated across the full range of observed follow-up times or for the event occurring by the end of a fixed time horizon of interest. The authors first discuss recommended statistical measures that evaluate model performance in terms of discrimination, calibration, or overall performance. Further, they evaluate the potential clinical utility of the model to support clinical decision making according to a net benefit measure. They provide SAS and R code to illustrate internal and external validation. The authors recommend the proposed set of performance measures for transparent reporting of the validity of predictions from survival models.


Subjects
Breast Neoplasms; Humans; Female; Proportional Hazards Models; Prognosis
9.
BMC Med ; 21(1): 70, 2023 02 24.
Article in English | MEDLINE | ID: mdl-36829188

ABSTRACT

BACKGROUND: Clinical prediction models should be validated before implementation in clinical practice. But is favorable performance at internal validation or one external validation sufficient to claim that a prediction model works well in the intended clinical context? MAIN BODY: We argue to the contrary because (1) patient populations vary, (2) measurement procedures vary, and (3) populations and measurements change over time. Hence, we have to expect heterogeneity in model performance between locations and settings, and across time. It follows that prediction models are never truly validated. This does not imply that validation is not important. Rather, the current focus on developing new models should shift to a focus on more extensive, well-conducted, and well-reported validation studies of promising models. CONCLUSION: Principled validation strategies are needed to understand and quantify heterogeneity, monitor performance over time, and update prediction models when appropriate. Such strategies will help to ensure that prediction models stay up-to-date and safe to support clinical decision-making.

10.
BMC Med Res Methodol ; 23(1): 276, 2023 11 24.
Article in English | MEDLINE | ID: mdl-38001421

ABSTRACT

BACKGROUND: Assessing malignancy risk is important to choose appropriate management of ovarian tumors. We compared six algorithms to estimate the probabilities that an ovarian tumor is benign, borderline malignant, stage I primary invasive, stage II-IV primary invasive, or secondary metastatic. METHODS: This retrospective cohort study used 5909 patients recruited from 1999 to 2012 for model development, and 3199 patients recruited from 2012 to 2015 for model validation. Patients were recruited at oncology referral or general centers and underwent an ultrasound examination and surgery ≤ 120 days later. We developed models using standard multinomial logistic regression (MLR), Ridge MLR, random forest (RF), XGBoost, neural networks (NN), and support vector machines (SVM). We used nine clinical and ultrasound predictors but developed models with or without CA125. RESULTS: Most tumors were benign (3980 in development and 1688 in validation data), secondary metastatic tumors were least common (246 and 172). The c-statistic (AUROC) to discriminate benign from any type of malignant tumor ranged from 0.89 to 0.92 for models with CA125, from 0.89 to 0.91 for models without. The multiclass c-statistic ranged from 0.41 (SVM) to 0.55 (XGBoost) for models with CA125, and from 0.42 (SVM) to 0.51 (standard MLR) for models without. Multiclass calibration was best for RF and XGBoost. Estimated probabilities for a benign tumor in the same patient often differed by more than 0.2 (20% points) depending on the model. Net Benefit for diagnosing malignancy was similar for algorithms at the commonly used 10% risk threshold, but was slightly higher for RF at higher thresholds. Comparing models, between 3% (XGBoost vs. NN, with CA125) and 30% (NN vs. SVM, without CA125) of patients fell on opposite sides of the 10% threshold. CONCLUSION: Although several models had similarly good performance, individual probability estimates varied substantially.
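
Net Benefit at a given risk threshold is commonly computed as the true-positive proportion minus the false-positive proportion weighted by the odds of the threshold. A minimal Python sketch of that formula follows; the counts are hypothetical and are not data from this study, and the function name is illustrative.

```python
def net_benefit(true_positives, false_positives, n_total, threshold):
    """Net Benefit of classifying patients as high risk at a given risk threshold.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    weight = threshold / (1.0 - threshold)
    return true_positives / n_total - (false_positives / n_total) * weight


# Hypothetical counts at the 10% risk threshold mentioned in the abstract:
# 80 true positives and 150 false positives among 1000 patients.
print(net_benefit(80, 150, 1000, 0.10))
```

At a 10% threshold each false positive is weighted at one ninth of a true positive, which is why models with different false-positive counts can still have similar Net Benefit at that threshold.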


Subjects
Ovarian Neoplasms; Female; Humans; Retrospective Studies; Uncertainty; Ovarian Neoplasms/diagnostic imaging; Ovarian Neoplasms/pathology; Logistic Models; Algorithms; CA-125 Antigen
11.
Eur Heart J ; 43(31): 2921-2930, 2022 08 14.
Article in English | MEDLINE | ID: mdl-35639667

ABSTRACT

The medical field has seen a rapid increase in the development of artificial intelligence (AI)-based prediction models. With the introduction of such AI-based prediction model tools and software in cardiovascular patient care, the cardiovascular researcher and healthcare professional are challenged to understand the opportunities as well as the limitations of the AI-based predictions. In this article, we present 12 critical questions for cardiovascular health professionals to ask when confronted with an AI-based prediction model. We aim to support medical professionals to distinguish the AI-based prediction models that can add value to patient care from the AI that does not.


Subjects
Artificial Intelligence; Cardiovascular Diseases; Health Personnel; Humans; Software
12.
Am J Obstet Gynecol ; 226(4): 560.e1-560.e24, 2022 04.
Article in English | MEDLINE | ID: mdl-34808130

ABSTRACT

BACKGROUND: Two randomized controlled trials compared the neonatal and infant outcomes after fetoscopic endoluminal tracheal occlusion with expectant prenatal management in fetuses with severe and moderate isolated congenital diaphragmatic hernia, respectively. Fetoscopic endoluminal tracheal occlusion was carried out at 27+0 to 29+6 weeks' gestation (referred to as "early") for severe and at 30+0 to 31+6 weeks ("late") for moderate hypoplasia. The reported absolute increase in the survival to discharge was 13% (95% confidence interval, -1 to 28; P=.059) and 25% (95% confidence interval, 6-46; P=.0091) for moderate and severe hypoplasia. OBJECTIVE: Data from the 2 trials were pooled to study the heterogeneity of the treatment effect by observed over expected lung-to-head ratio and explore the effect of gestational age at balloon insertion. STUDY DESIGN: Individual participant data from the 2 trials were reanalyzed. Women were assessed between 2008 and 2020 at 14 experienced fetoscopic endoluminal tracheal occlusion centers and were randomized in a 1:1 ratio to either expectant management or fetoscopic endoluminal tracheal occlusion. All received standardized postnatal management. The combined data involved 287 patients (196 with moderate hypoplasia and 91 with severe hypoplasia). The primary endpoint was survival to discharge from the neonatal intensive care unit. The secondary endpoints were survival to 6 months of age, survival to 6 months without oxygen supplementation, and gestational age at live birth. Penalized regression was used with the following covariates: intervention (fetoscopic endoluminal tracheal occlusion vs expectant), early balloon insertion (yes vs no), observed over expected lung-to-head ratio, liver herniation (yes vs no), and trial (severe vs moderate). The interaction between intervention and the observed over expected lung-to-head ratio was evaluated to study treatment effect heterogeneity. 
RESULTS: For survival to discharge, the adjusted odds ratio of fetoscopic endoluminal tracheal occlusion was 1.78 (95% confidence interval, 1.05-3.01; P=.031). The additional effect of early balloon insertion was highly uncertain (adjusted odds ratio, 1.53; 95% confidence interval, 0.60-3.91; P=.370). When combining these 2 effects, the adjusted odds ratio of fetoscopic endoluminal tracheal occlusion with early balloon insertion was 2.73 (95% confidence interval, 1.15-6.49). The results for survival to 6 months and survival to 6 months without oxygen dependence were comparable. The gestational age at delivery was on average 1.7 weeks earlier (95% confidence interval, 1.1-2.3) following fetoscopic endoluminal tracheal occlusion with late insertion and 3.2 weeks earlier (95% confidence interval, 2.3-4.1) following fetoscopic endoluminal tracheal occlusion with early insertion compared with expectant management. There was no evidence that the effect of fetoscopic endoluminal tracheal occlusion depended on the observed over expected lung-to-head ratio for any of the endpoints. CONCLUSION: This analysis suggests that fetoscopic endoluminal tracheal occlusion increases survival for both moderate and severe lung hypoplasia. The difference between the results for the Tracheal Occlusion To Accelerate Lung growth trials, when considered apart, may be because of the difference in the time point of balloon insertion. However, the effect of the time point of balloon insertion could not be robustly assessed because of a small sample size and the confounding effect of disease severity. Fetoscopic endoluminal tracheal occlusion with early balloon insertion in particular strongly increases the risk for preterm delivery.


Subjects
Balloon Occlusion; Hernias, Diaphragmatic, Congenital; Balloon Occlusion/methods; Female; Fetoscopy/methods; Hernias, Diaphragmatic, Congenital/surgery; Humans; Infant; Infant, Newborn; Lung/surgery; Pregnancy; Trachea/surgery
13.
Stat Med ; 41(8): 1334-1360, 2022 04 15.
Article in English | MEDLINE | ID: mdl-34897756

ABSTRACT

Calibration is a vital aspect of the performance of risk prediction models, but research in the context of ordinal outcomes is scarce. This study compared calibration measures for risk models predicting a discrete ordinal outcome, and investigated the impact of the proportional odds assumption on calibration and overfitting. We studied the multinomial, cumulative, adjacent category, continuation ratio, and stereotype logit/logistic models. To assess calibration, we investigated calibration intercepts and slopes, calibration plots, and the estimated calibration index. Using large sample simulations, we studied the performance of models for risk estimation under various conditions, assuming that the true model has either a multinomial logistic form or a cumulative logit proportional odds form. Small sample simulations were used to compare the tendency for overfitting between models. As a case study, we developed models to diagnose the degree of coronary artery disease (five categories) in symptomatic patients. When the true model was multinomial logistic, proportional odds models often yielded poor risk estimates, with calibration slopes deviating considerably from unity even on large model development datasets. The stereotype logistic model improved the calibration slope, but still provided biased risk estimates for individual patients. When the true model had a cumulative logit proportional odds form, multinomial logistic regression provided biased risk estimates, although these biases were modest. Nonproportional odds models require more parameters to be estimated from the data, and hence suffered more from overfitting. Despite larger sample size requirements, we generally recommend multinomial logistic regression for risk prediction modeling of discrete ordinal outcomes.


Subjects
Calibration; Humans; Logistic Models; Probability; Sample Size
14.
BMC Med Res Methodol ; 22(1): 101, 2022 04 08.
Article in English | MEDLINE | ID: mdl-35395724

ABSTRACT

BACKGROUND: Describe and evaluate the methodological conduct of prognostic prediction models developed using machine learning methods in oncology. METHODS: We conducted a systematic review in MEDLINE and Embase between 01/01/2019 and 05/09/2019, for studies developing a prognostic prediction model using machine learning methods in oncology. We used the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement, Prediction model Risk Of Bias ASsessment Tool (PROBAST) and CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS) to assess the methodological conduct of included publications. Results were summarised by modelling type: regression-, non-regression-based and ensemble machine learning models. RESULTS: Sixty-two publications met inclusion criteria developing 152 models across all publications. Forty-two models were regression-based, 71 were non-regression-based and 39 were ensemble models. A median of 647 individuals (IQR: 203 to 4059) and 195 events (IQR: 38 to 1269) were used for model development, and 553 individuals (IQR: 69 to 3069) and 50 events (IQR: 17.5 to 326.5) for model validation. A higher number of events per predictor was used for developing regression-based models (median: 8, IQR: 7.1 to 23.5), compared to alternative machine learning (median: 3.4, IQR: 1.1 to 19.1) and ensemble models (median: 1.7, IQR: 1.1 to 6). Sample size was rarely justified (n = 5/62; 8%). Some or all continuous predictors were categorised before modelling in 24 studies (39%). 46% (n = 24/62) of models reporting predictor selection before modelling used univariable analyses, a common method across all modelling types. Ten out of 24 models for time-to-event outcomes accounted for censoring (42%). A split sample approach was the most popular method for internal validation (n = 25/62, 40%). Calibration was reported in 11 studies.
Less than half of models were reported or made available. CONCLUSIONS: The methodological conduct of machine learning based clinical prediction models is poor. Guidance is urgently needed, with increased awareness and education of minimum prediction modelling standards. Particular focus is needed on sample size estimation, development and validation analysis methods, and ensuring the model is available for independent validation, to improve quality of machine learning based clinical prediction models.


Subjects
Machine Learning , Medical Oncology , Research Design , Bias , Humans , Prognosis
15.
Acta Obstet Gynecol Scand ; 101(1): 46-55, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34817062

ABSTRACT

INTRODUCTION: There is no global agreement on how best to determine the viability and location of a pregnancy of unknown location using biomarkers. Measurements of progesterone and β human chorionic gonadotropin (βhCG) are still used in clinical practice to exclude the possibility of a viable intrauterine pregnancy (VIUP). We evaluate the predictive value of progesterone, βhCG, and βhCG ratio cut-off levels to exclude a VIUP in women with a pregnancy of unknown location. MATERIAL AND METHODS: This was a secondary analysis of prospective multicenter study data of consecutive women with a pregnancy of unknown location between January 2015 and 2017 collected from dedicated early pregnancy assessment units of eight hospitals. Single progesterone and serial βhCG measurements were taken. Women were followed up until final pregnancy outcome between 11 and 14 weeks of gestation was confirmed using transvaginal ultrasonography: (1) VIUP, (2) non-viable intrauterine pregnancy or failed pregnancy of unknown location, and (3) ectopic pregnancy or persisting pregnancy of unknown location. The predictive value of cut-off levels for ruling out VIUP was evaluated across a range of values likely to be encountered clinically for progesterone, βhCG, and βhCG ratio. RESULTS: Data from 2507 of 3272 (76.6%) women were suitable for analysis. All had data for βhCG levels, 2248 (89.7%) had progesterone levels, and 1809 (72.2%) had βhCG ratio. The likelihood of viability falls with the progesterone level. Although the median progesterone level associated with viability was 59 nmol/L, VIUP were identified with levels as low as 5 nmol/L. No single βhCG cut-off ruled out viability with certainty: even when the level was more than 3000 IU/L, 39/358 (11%) women had a VIUP. The probability of viability decreases with the βhCG ratio. Although the median βhCG ratio associated with viability was 2.26, VIUP were identified with ratios as low as 1.02. A progesterone level below 2 nmol/L and βhCG ratio below 0.87 were unlikely to be associated with viability but were not definitive when considering multiple imputation. CONCLUSIONS: Cut-off levels for βhCG, βhCG ratio, and progesterone cannot safely be used clinically to exclude viability in early pregnancy. Although βhCG ratio and progesterone have slightly better performance in comparison, single βhCG used in this manner is highly unreliable.


Subjects
Ectopic Pregnancy/diagnosis , Prenatal Diagnosis , Adult , Chorionic Gonadotropin/metabolism , Human Chorionic Gonadotropin beta Subunit/metabolism , Cohort Studies , Female , Humans , London , Predictive Value of Tests , Pregnancy , Ectopic Pregnancy/blood , Progesterone/metabolism , Prospective Studies , State Medicine
16.
Transfus Med ; 32(4): 306-317, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35543403

ABSTRACT

OBJECTIVE: To assess the prognostic value of pre-operative haemoglobin concentration (Hb) for identifying patients who develop severe post-operative anaemia or require blood transfusion following primary total hip or knee, or unicompartmental knee arthroplasty (THA, TKA, UKA). BACKGROUND: Pre-operative group and save (G&S) and post-operative Hb measurement may be unnecessary for many patients undergoing hip and knee arthroplasty, provided individuals at greatest risk of severe post-operative anaemia can be identified. METHODS AND MATERIALS: Patients undergoing THA, TKA, or UKA between 2011 and 2018 were included. Outcomes were post-operative Hb below 70 and 80 g/L, and peri-operative blood transfusion. Logistic regression assessed the association between pre-operative Hb and each outcome. Decision curve analysis compared strategies for selecting patients for G&S and post-operative Hb measurement. RESULTS: 10 015 THA, TKA and UKA procedures were performed in 8582 patients. The incidence of blood transfusion (4.5%) decreased during the study. Using procedure-specific Hb thresholds to select patients for pre-operative G&S and post-operative Hb testing had a greater net benefit than selecting all patients, no patients, or patients with pre-operative anaemia. CONCLUSIONS: Pre-operative G&S and post-operative Hb measurement may not be indicated for UKA or TKA when adopting restrictive transfusion thresholds, provided clinicians accept a 0.1% risk of patients developing severe undiagnosed post-operative anaemia (Hb < 70 g/L). The decision to perform these blood tests for THA patients should be based on local institutional data and selection of acceptable risk thresholds.


Subjects
Anemia , Knee Arthroplasty , Anemia/diagnosis , Anemia/etiology , Anemia/therapy , Knee Arthroplasty/adverse effects , Knee Arthroplasty/methods , Blood Transfusion , Hematologic Tests , Hemoglobins/analysis , Humans
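
Note on record 16: the abstract compares testing strategies by net benefit in a decision curve analysis. The sketch below shows the usual net benefit calculation, true positives per patient minus false positives per patient weighted by the odds of the risk threshold. Every number in it (cohort size, counts flagged by the Hb rule, the 10% threshold) is hypothetical and chosen only for illustration; none comes from the study.

```python
def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit of a testing strategy at a given risk threshold.

    true_pos / false_pos: patients selected for testing who do / do not
    have the outcome; n: cohort size; threshold: risk at which a clinician
    would just be willing to test.
    """
    weight = threshold / (1.0 - threshold)  # odds of the threshold
    return true_pos / n - weight * false_pos / n


# Hypothetical cohort: 1000 patients, 45 of whom have the outcome.
n = 1000
events = 45
threshold = 0.10  # hypothetical risk threshold

strategies = {
    # Testing everyone captures all events but also every non-event.
    "test all": net_benefit(events, n - events, n, threshold),
    # Testing no one has a net benefit of zero by definition.
    "test none": 0.0,
    # Hypothetical Hb rule: flags 200 patients, 40 of them with the outcome.
    "Hb threshold (hypothetical)": net_benefit(40, 160, n, threshold),
}

best = max(strategies, key=strategies.get)
for name, value in strategies.items():
    print(f"{name}: {value:.4f}")
print("highest net benefit:", best)
```

A full decision curve repeats this calculation over a range of thresholds and plots one line per strategy.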
17.
Gynecol Obstet Invest ; 87(1): 54-61, 2022.
Article in English | MEDLINE | ID: mdl-35152217

ABSTRACT

OBJECTIVES: The aim of this study was to develop a model that can discriminate between different etiologies of abnormal uterine bleeding. DESIGN: The International Endometrial Tumor Analysis 1 study is a multicenter observational diagnostic study in 18 bleeding clinics in 9 countries. Consecutive women with abnormal vaginal bleeding presenting for ultrasound examination (n = 2,417) were recruited. The histology was obtained from endometrial sampling, D&C, hysteroscopic resection, hysterectomy, or ultrasound follow-up for >1 year. METHODS: A model was developed using multinomial regression based on age, body mass index, and ultrasound predictors to distinguish between: (1) endometrial atrophy, (2) endometrial polyp or intracavitary myoma, (3) endometrial malignancy or atypical hyperplasia, and (4) proliferative/secretory changes, endometritis, or hyperplasia without atypia. The model was validated using leave-center-out cross-validation and bootstrapping. The main outcomes are the model's ability to discriminate between the four outcomes and the calibration of risk estimates. RESULTS: The median age in 2,417 women was 50 (interquartile range 43-57). 414 (17%) women had endometrial atrophy; 996 (41%) had a polyp or myoma; 155 (6%) had an endometrial malignancy or atypical hyperplasia; and 852 (35%) had proliferative/secretory changes, endometritis, or hyperplasia without atypia. The model distinguished well between malignant and benign histology (c-statistic 0.88, 95% CI: 0.85-0.91) and between all benign histologies. The probabilities for each of the four outcomes were over- or underestimated depending on the center. LIMITATIONS: Not all patients had a diagnosis based on histology. The model over- or underestimated the risk for certain outcomes in some centers, indicating local recalibration is advisable. CONCLUSIONS: The proposed model reliably distinguishes between four histological outcomes. This is the first model to discriminate between several outcomes and is the only model applicable when menopausal status is uncertain. The model could be useful for patient management and counseling, and aid in the interpretation of ultrasound findings. Future research is needed to externally validate and locally recalibrate the model.


Subjects
Endometrial Hyperplasia , Endometrial Neoplasms , Endometritis , Myoma , Polyps , Precancerous Conditions , Uterine Diseases , Uterine Neoplasms , Atrophy/complications , Atrophy/diagnostic imaging , Atrophy/pathology , Endometrial Hyperplasia/complications , Endometrial Hyperplasia/diagnostic imaging , Endometrial Hyperplasia/pathology , Endometrial Neoplasms/pathology , Endometritis/complications , Endometritis/diagnostic imaging , Endometritis/pathology , Endometrium/diagnostic imaging , Endometrium/pathology , Female , Humans , Hyperplasia/complications , Hyperplasia/pathology , Male , Myoma/complications , Myoma/pathology , Polyps/pathology , Precancerous Conditions/complications , Uterine Diseases/pathology , Uterine Hemorrhage/diagnostic imaging , Uterine Hemorrhage/etiology , Uterine Hemorrhage/pathology , Uterine Neoplasms/complications , Uterine Neoplasms/diagnostic imaging , Uterine Neoplasms/pathology
18.
Stat Med ; 40(4): 859-864, 2021 Feb 20.
Article in English | MEDLINE | ID: mdl-33283904

ABSTRACT

In 2019 we published a pair of articles in Statistics in Medicine that describe how to calculate the minimum sample size for developing a multivariable prediction model with a continuous outcome, or with a binary or time-to-event outcome. As for any sample size calculation, the approach requires the user to specify anticipated values for key parameters. In particular, for a prediction model with a binary outcome, the outcome proportion and a conservative estimate for the overall fit of the developed model as measured by the Cox-Snell R² (proportion of variance explained) must be specified. This proposal raises the question of how to identify a plausible value for R² in advance of model development. Our articles suggest researchers should identify R² from closely related models already published in their field. In this letter, we present details on how to derive R² using the reported C statistic (AUROC) for such existing prediction models with a binary outcome. The C statistic is commonly reported, and so our approach allows researchers to obtain R² for subsequent sample size calculations for new models. Stata and R code are provided, along with a small simulation study.


Subjects
Sample Size , Computer Simulation , Humans
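
Note on record 18: the letter describes obtaining a Cox-Snell R² from a reported C statistic and outcome proportion. The Python sketch below is one way such a conversion could be approximated by simulation; it is not the authors' Stata or R code. It assumes that the linear predictor is normally distributed with unit variance in both outcome groups (so the group means differ by √2·Φ⁻¹(C)), and the function name, sample size and seed are illustrative choices.

```python
import math
import random
from statistics import NormalDist


def cox_snell_r2_from_c(c_stat, prevalence, n=20000, seed=1):
    """Approximate the Cox-Snell R^2 implied by a C statistic (sketch)."""
    rng = random.Random(seed)
    # Assumed binormal model: mean gap that yields the requested C statistic.
    mu = math.sqrt(2.0) * NormalDist().inv_cdf(c_stat)
    n_events = round(prevalence * n)
    n_non = n - n_events
    # Simulated linear predictor: events ~ N(mu, 1), non-events ~ N(0, 1).
    x = [rng.gauss(mu, 1.0) for _ in range(n_events)]
    x += [rng.gauss(0.0, 1.0) for _ in range(n_non)]
    y = [1] * n_events + [0] * n_non

    # Fit logistic regression of y on x by Newton-Raphson,
    # starting from the intercept-only solution.
    a, b = math.log(n_events / n_non), 0.0
    for _ in range(50):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-9:
            break

    def loglik(intercept, slope):
        total = 0.0
        for xi, yi in zip(x, y):
            eta = intercept + slope * xi
            total += yi * eta - math.log1p(math.exp(eta))
        return total

    ll_null = loglik(math.log(n_events / n_non), 0.0)
    ll_model = loglik(a, b)
    # Cox-Snell R^2 from the likelihood ratio statistic.
    return 1.0 - math.exp(-2.0 * (ll_model - ll_null) / n)


if __name__ == "__main__":
    print(round(cox_snell_r2_from_c(0.80, 0.10), 3))
```

For a fixed outcome proportion, the returned value rises with the C statistic and stays below the maximum attainable Cox-Snell R², which for a binary outcome is less than 1.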
20.
Eur J Clin Invest ; 50(5): e13229, 2020 May.
Article in English | MEDLINE | ID: mdl-32281648

ABSTRACT

The role of P-values for null hypothesis testing is under debate. We aim to explore the impact of the significance threshold on estimates for the strengths of associations ("effects") and the implications for different types of epidemiological research. We consider situations with normal distribution of a true effect, while varying the effect size. We confirm the occurrence of "testimation bias": estimating effect size only if the test was statistically significant leads to exaggerated results. The absolute bias is largest for true effects around 0.7 times the size of the standard error: +220% bias if effects are selected after testing with P < .05, and +335% if tested with P < .005. Less bias was found for testing with P < .20 (+130%) and larger true effect sizes. We conclude that a lower P-value threshold for declaring statistical significance implies more exaggeration in an estimated effect. This implies that if a low threshold is used, effect size estimation should not be attempted, for example in the context of selecting promising discoveries that need further validation. Confirmatory studies, such as randomized controlled trials, might stick to the 0.05 threshold if adequately powered, while prediction modelling studies should use an even higher threshold, such as 0.2, to avoid strongly biased effect estimates.


Subjects
Biomedical Research , Statistics as Topic , Bias , Diagnosis , Humans , Prognosis , Reproducibility of Results
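
Note on record 20: the "testimation bias" described in the abstract can be illustrated with a closed-form calculation. The sketch below assumes an estimate that is normally distributed around the true effect with a standard error of 1, a two-sided test, and a signed estimate that is kept only when the test is significant. These assumptions and the function name are illustrative and may differ from the paper's exact setup.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def relative_bias(true_effect, alpha):
    """Relative bias of an estimate reported only if two-sided P < alpha.

    The estimate X is assumed to follow N(true_effect, 1), i.e. the true
    effect is expressed in units of the standard error.
    """
    z = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)  # critical value
    # Probability of significance in each tail.
    p_hi = 1.0 - STD_NORMAL.cdf(z - true_effect)
    p_lo = STD_NORMAL.cdf(-z - true_effect)
    # Partial expectations E[X; X > z] and E[X; X < -z].
    m_hi = true_effect * p_hi + STD_NORMAL.pdf(z - true_effect)
    m_lo = true_effect * p_lo - STD_NORMAL.pdf(-z - true_effect)
    conditional_mean = (m_hi + m_lo) / (p_hi + p_lo)
    return (conditional_mean - true_effect) / true_effect


if __name__ == "__main__":
    for alpha in (0.20, 0.05, 0.005):
        bias = relative_bias(0.7, alpha)
        print(f"P < {alpha}: relative bias {bias:+.0%}")
```

Under these assumptions, a true effect of 0.7 standard errors gives relative biases close to the percentages quoted in the abstract, and the bias grows as the significance threshold is lowered.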