Results 1 - 20 of 49
1.
NPJ Digit Med ; 7(1): 98, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38637674

ABSTRACT

Accurate prediction of recurrence and progression in non-muscle invasive bladder cancer (NMIBC) is essential to inform management and eligibility for clinical trials. Despite substantial interest in developing artificial intelligence (AI) applications in NMIBC, their clinical readiness remains unclear. This systematic review aimed to critically appraise AI studies predicting NMIBC outcomes, and to identify common methodological and reporting pitfalls. MEDLINE, EMBASE, Web of Science, and Scopus were searched from inception to February 5th, 2024 for AI studies predicting NMIBC recurrence or progression. APPRAISE-AI was used to assess methodological and reporting quality of these studies. The performance of AI and non-AI approaches included within these studies was compared. A total of 15 studies (five on recurrence, four on progression, and six on both) were included. All studies were retrospective, with a median follow-up of 71 months (IQR 32-93) and median cohort size of 125 (IQR 93-309). Most studies were low quality, with only one classified as high quality. While AI models generally outperformed non-AI approaches with respect to accuracy, c-index, sensitivity, and specificity, this margin of benefit varied with study quality (median absolute performance difference was 10 for low, 22 for moderate, and 4 for high quality studies). Common pitfalls included dataset limitations, heterogeneous outcome definitions, methodological flaws, suboptimal model evaluation, and reproducibility issues. Recommendations to address these challenges are proposed. These findings emphasise the need for collaborative efforts between urological and AI communities paired with rigorous methodologies to develop higher quality models, enabling AI to reach its potential in enhancing NMIBC care.

2.
JAMA Netw Open ; 6(9): e2335377, 2023 09 05.
Article in English | MEDLINE | ID: mdl-37747733

ABSTRACT

Importance: Artificial intelligence (AI) has gained considerable attention in health care, yet concerns have been raised around appropriate methods and fairness. Current AI reporting guidelines do not provide a means of quantifying overall quality of AI research, limiting their ability to compare models addressing the same clinical question. Objective: To develop a tool (APPRAISE-AI) to evaluate the methodological and reporting quality of AI prediction models for clinical decision support. Design, Setting, and Participants: This quality improvement study evaluated AI studies in the model development, silent, and clinical trial phases using the APPRAISE-AI tool, a quantitative method for evaluating quality of AI studies across 6 domains: clinical relevance, data quality, methodological conduct, robustness of results, reporting quality, and reproducibility. These domains included 24 items with a maximum overall score of 100 points. Points were assigned to each item, with higher points indicating stronger methodological or reporting quality. The tool was applied to a systematic review on machine learning to estimate sepsis that included articles published until September 13, 2019. Data analysis was performed from September to December 2022. Main Outcomes and Measures: The primary outcomes were interrater and intrarater reliability and the correlation between APPRAISE-AI scores and expert scores, 3-year citation rate, number of Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) low risk-of-bias domains, and overall adherence to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement. Results: A total of 28 studies were included. Overall APPRAISE-AI scores ranged from 33 (low quality) to 67 (high quality). Most studies were moderate quality. The 5 lowest scoring items included source of data, sample size calculation, bias assessment, error analysis, and transparency. 
Overall APPRAISE-AI scores were associated with expert scores (Spearman ρ, 0.82; 95% CI, 0.64-0.91; P < .001), 3-year citation rate (Spearman ρ, 0.69; 95% CI, 0.43-0.85; P < .001), number of QUADAS-2 low risk-of-bias domains (Spearman ρ, 0.56; 95% CI, 0.24-0.77; P = .002), and adherence to the TRIPOD statement (Spearman ρ, 0.87; 95% CI, 0.73-0.94; P < .001). Intraclass correlation coefficient ranges for interrater and intrarater reliability were 0.74 to 1.00 for individual items, 0.81 to 0.99 for individual domains, and 0.91 to 0.98 for overall scores. Conclusions and Relevance: In this quality improvement study, APPRAISE-AI demonstrated strong interrater and intrarater reliability and correlated well with several study quality measures. This tool may provide a quantitative approach for investigators, reviewers, editors, and funding organizations to compare the research quality across AI studies for clinical decision support.


Subjects
Artificial Intelligence , Decision Support Systems, Clinical , Humans , Reproducibility of Results , Machine Learning , Clinical Relevance
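
The abstract above reports Spearman ρ between APPRAISE-AI scores and other quality measures. As a minimal sketch of how such a rank correlation is computed, the function below ranks both variables (averaging tied ranks) and takes the Pearson correlation of the ranks. The function names and the example numbers are illustrative assumptions, not data or code from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("rank variance is zero; rho is undefined")
    return cov / (sx * sy)


# Hypothetical tool scores and expert scores for five studies.
tool_scores = [33, 41, 52, 58, 67]
expert_scores = [40, 35, 60, 55, 70]
print(round(spearman_rho(tool_scores, expert_scores), 3))
```

The confidence intervals quoted in the abstract would need an additional step (for example a bootstrap), which this sketch does not attempt.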
3.
Can Urol Assoc J ; 17(11): E395-E401, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37549345

ABSTRACT

INTRODUCTION: The use of artificial intelligence (AI) in urology is gaining significant traction. While previous reviews of AI applications in urology exist, there have been few attempts to synthesize existing literature on urothelial cancer (UC). METHODS: Comprehensive searches based on the concepts of "AI" and "urothelial cancer" were conducted in MEDLINE, EMBASE, Web of Science, and Scopus. Study selection and data abstraction were conducted by two independent reviewers. Two independent raters assessed study quality in a random sample of 25 studies with the prediction model risk of bias assessment tool (PROBAST) and the standardized reporting of machine learning applications in urology (STREAM-URO) framework. RESULTS: From a database search of 4581 studies, 227 were included. By area of research, 33% focused on image analysis, 26% on genomics, 16% on radiomics, and 15% on clinicopathology. Thematic content analysis identified qualitative trends in AI models employed and variables for feature extraction. Only 19% of studies compared performance of AI models to non-AI methods. All selected studies demonstrated high risk of bias for analysis and overall concern with Cohen's kappa (k)=0.68. Selected studies met 66% of STREAM-URO items, with k=0.76. CONCLUSIONS: The use of AI in UC is a topic of increasing importance; however, there is a need for improved standardized reporting, as evidenced by the high risk of bias and low methodologic quality identified in the included studies.
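
Inter-rater agreement in the abstract above is summarised with Cohen's kappa. The following is a minimal sketch of that statistic for two raters, assuming categorical labels per study. The function name and the example ratings are hypothetical and are not taken from the review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical risk-of-bias ratings from two raters for four studies.
a = ["high", "high", "low", "low"]
b = ["high", "low", "low", "low"]
print(cohens_kappa(a, b))
```

Here the raters agree on three of four items (0.75) against a chance agreement of 0.5, so kappa is well below the raw agreement rate.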

4.
Oral Dis ; 2023 Jul 01.
Article in English | MEDLINE | ID: mdl-37392423

ABSTRACT

OBJECTIVES: This systematic review aimed at evaluating the performance of artificial intelligence (AI) models in detecting dental caries on oral photographs. METHODS: Methodological characteristics and performance metrics of clinical studies reporting on deep learning and other machine learning algorithms were assessed. The risk of bias was evaluated using the quality assessment of diagnostic accuracy studies 2 (QUADAS-2) tool. A systematic search was conducted in EMBASE, Medline, and Scopus. RESULTS: Out of 3410 identified records, 19 studies were included with six and seven studies having low risk of biases and applicability concerns for all the domains, respectively. Metrics varied widely and were assessed on multiple levels. F1-scores for classification and detection tasks were 68.3%-94.3% and 42.8%-95.4%, respectively. Irrespective of the task, F1-scores were 68.3%-95.4% for professional cameras, 78.8%-87.6% for intraoral cameras, and 42.8%-80% for smartphone cameras. Few studies allowed assessment of AI performance for lesions of different severity. CONCLUSION: Automatic detection of dental caries using AI may provide objective verification of clinicians' diagnoses and facilitate patient-clinician communication and teledentistry. Future studies should consider more robust study designs, employ comparable and standardized metrics, and focus on the severity of caries lesions.
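
The review above compares studies by F1-score. As a brief reminder of how that metric follows from confusion-matrix counts, here is a minimal sketch. The function name and the counts are illustrative assumptions and do not come from any of the included studies.

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, from raw counts."""
    if tp < 0 or fp < 0 or fn < 0:
        raise ValueError("counts must be non-negative")
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical detector output: 8 lesions found, 2 false alarms, 4 missed.
print(round(f1_score(tp=8, fp=2, fn=4), 3))
```

Because F1 ignores true negatives, values reported at the tooth, surface, or image level are not directly comparable, which may partly explain the wide ranges quoted in the abstract.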

5.
JAMIA Open ; 6(3): ooad046, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37425489

ABSTRACT

Background: Standard ontologies are critical for interoperability and multisite analyses of health data. Nevertheless, mapping concepts to ontologies is often done with generic tools and is labor-intensive. Contextualizing candidate concepts within source data is also done in an ad hoc manner. Methods and Results: We present AnnoDash, a flexible dashboard to support annotation of concepts with terms from a given ontology. Text-based similarity is used to identify likely matches, and large language models are used to improve ontology ranking. A convenient interface is provided to visualize observations associated with a concept, supporting the disambiguation of vague concept descriptions. Time-series plots contrast the concept with known clinical measurements. We evaluated the dashboard qualitatively against several ontologies (SNOMED CT, LOINC, etc.) by using MIMIC-IV measurements. The dashboard is web-based and step-by-step instructions for deployment are provided, simplifying usage for nontechnical audiences. The modular code structure enables users to extend upon components, including improving similarity scoring, constructing new plots, or configuring new ontologies. Conclusion: AnnoDash, an improved clinical terminology annotation tool, can facilitate data harmonizing by promoting mapping of clinical data. AnnoDash is freely available at https://github.com/justin13601/AnnoDash (https://doi.org/10.5281/zenodo.8043943).

6.
Int J Nurs Stud ; 145: 104529, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37307638

ABSTRACT

BACKGROUND: Institutions struggle with successful use of sepsis alerts within electronic health records. OBJECTIVE: Test the association of sepsis screening measurement criteria in discrimination of mortality and detection of sepsis in a large dataset. DESIGN: Retrospective cohort study using a large United States (U.S.) intensive care database. The Institutional Review Board exempt status was obtained from Kansas University Medical Center Human Research Protection Program (10-1-2015). SETTING: 334 U.S. hospitals participating in the eICU Research Institute. PARTICIPANTS: Nine hundred twelve thousand five hundred and nine adult intensive care admissions from 183 hospitals. METHODS: Exposures included: systemic inflammatory response syndrome criteria ≥ 2 (Sepsis-1); systemic inflammatory response syndrome criteria with organ failure criteria ≥ 3.5 points (Sepsis-2); and sepsis-related organ failure assessment score ≥ 2 and quick score ≥ 2 (Sepsis-3). Discrimination of outcomes was determined with/without (adjusted/unadjusted) baseline risk exposure to a model. The area under the receiver operating characteristic curve (AUROC) and odds ratios (ORs) for each decile of baseline risk of sepsis or death were assessed. RESULTS: Within the eligible cohort of 912,509, a total of 86,219 (9.4 %) patients did not survive their hospital stay and 186,870 (20.5 %) met the definition of suspected sepsis. For suspected sepsis discrimination, Sepsis-2 (unadjusted AUROC 0.67, 99 % CI: 0.66-0.67 and adjusted AUROC 0.77, 99 % CI: 0.77-0.77) outperformed Sepsis-3 (SOFA unadjusted AUROC 0.61, 99 % CI: 0.61-0.61 and adjusted AUROC 0.74, 99 % CI: 0.74-0.74) (qSOFA unadjusted AUROC 0.59, 99 % CI: 0.59-0.60 and adjusted AUROC 0.73, 99 % CI: 0.73-0.73). Sepsis-2 also outperformed Sepsis-1 (unadjusted AUROC 0.58, 99 % CI: 0.58-0.58 and adjusted AUROC 0.73, 99 % CI: 0.73-0.73). Differences between AUROCs were statistically significant.
Sepsis-2 ORs were higher for the outcome of suspected sepsis when considering deciles of risk than the other measurement systems. CONCLUSIONS AND RELEVANCE: Sepsis-2 outperformed other systems in suspected sepsis detection and was comparable to SOFA in prognostic accuracy of mortality in adult intensive care patients.


Subjects
Sepsis , Humans , Adult , Cohort Studies , Retrospective Studies , Hospital Mortality , Sepsis/diagnosis , Intensive Care Units , Prognosis , ROC Curve
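
Discrimination in the study above is reported as AUROC. A minimal sketch of that quantity, assuming binary labels (1 = event, 0 = no event) and a numeric score per patient, is the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half. The function name and example values are hypothetical; the quadratic loop is for clarity and would be too slow for a cohort of the size described.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical screening scores for four admissions.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))
```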
7.
Lancet Digit Health ; 5(7): e435-e445, 2023 07.
Article in English | MEDLINE | ID: mdl-37211455

ABSTRACT

BACKGROUND: Accurate prediction of side-specific extraprostatic extension (ssEPE) is essential for performing nerve-sparing surgery to mitigate treatment-related side-effects such as impotence and incontinence in patients with localised prostate cancer. Artificial intelligence (AI) might provide robust and personalised ssEPE predictions to better inform nerve-sparing strategy during radical prostatectomy. We aimed to develop, externally validate, and perform an algorithmic audit of an AI-based Side-specific Extra-Prostatic Extension Risk Assessment tool (SEPERA). METHODS: Each prostatic lobe was treated as an individual case such that each patient contributed two cases to the overall cohort. SEPERA was trained on 1022 cases from a community hospital network (Trillium Health Partners; Mississauga, ON, Canada) between 2010 and 2020. Subsequently, SEPERA was externally validated on 3914 cases across three academic centres: Princess Margaret Cancer Centre (Toronto, ON, Canada) from 2008 to 2020; L'Institut Mutualiste Montsouris (Paris, France) from 2010 to 2020; and Jules Bordet Institute (Brussels, Belgium) from 2015 to 2020. Model performance was characterised by area under the receiver operating characteristic curve (AUROC), area under the precision recall curve (AUPRC), calibration, and net benefit. SEPERA was compared against contemporary nomograms (ie, Sayyid nomogram, Soeterik nomogram [non-MRI and MRI]), as well as a separate logistic regression model using the same variables included in SEPERA. An algorithmic audit was performed to assess model bias and identify common patient characteristics among predictive errors. FINDINGS: Overall, 2468 patients comprising 4936 cases (ie, prostatic lobes) were included in this study. SEPERA was well calibrated and had the best performance across all validation cohorts (pooled AUROC of 0·77 [95% CI 0·75-0·78] and pooled AUPRC of 0·61 [0·58-0·63]). 
In patients with pathological ssEPE despite benign ipsilateral biopsies, SEPERA correctly predicted ssEPE in 72 (68%) of 106 cases compared with the other models (47 [44%] in the logistic regression model, none in the Sayyid model, 13 [12%] in the Soeterik non-MRI model, and five [5%] in the Soeterik MRI model). SEPERA had higher net benefit than the other models to predict ssEPE, enabling more patients to safely undergo nerve-sparing. In the algorithmic audit, no evidence of model bias was observed, with no significant difference in AUROC when stratified by race, biopsy year, age, biopsy type (systematic only vs systematic and MRI-targeted biopsy), biopsy location (academic vs community), and D'Amico risk group. According to the audit, the most common errors were false positives, particularly for older patients with high-risk disease. No aggressive tumours (ie, grade >2 or high-risk disease) were found among false negatives. INTERPRETATION: We demonstrated the accuracy, safety, and generalisability of using SEPERA to personalise nerve-sparing approaches during radical prostatectomy. FUNDING: None.


Subjects
Artificial Intelligence , Prostate , Male , Humans , Retrospective Studies , Prostatectomy , Risk Assessment
9.
Chest ; 164(2): 355-368, 2023 08.
Article in English | MEDLINE | ID: mdl-37040818

ABSTRACT

BACKGROUND: Evidence regarding acute kidney injury associated with concomitant administration of vancomycin and piperacillin-tazobactam is conflicting, particularly in patients in the ICU. RESEARCH QUESTION: Does a difference exist in the association between commonly prescribed empiric antibiotics on ICU admission (vancomycin and piperacillin-tazobactam, vancomycin and cefepime, and vancomycin and meropenem) and acute kidney injury? STUDY DESIGN AND METHODS: This was a retrospective cohort study using data from the eICU Research Institute, which contains records for ICU stays between 2010 and 2015 across 335 hospitals. Patients were enrolled if they received vancomycin and piperacillin-tazobactam, vancomycin and cefepime, or vancomycin and meropenem exclusively. Patients initially admitted to the ED were included. Patients with hospital stay duration of < 1 h, receiving dialysis, or with missing data were excluded. Acute kidney injury was defined as Kidney Disease: Improving Global Outcomes stage 2 or 3 based on serum creatinine component. Propensity score matching was used to match patients in the control (vancomycin and meropenem or vancomycin and cefepime) and treatment (vancomycin and piperacillin-tazobactam) groups, and ORs were calculated. Sensitivity analyses were performed to study the effect of longer courses of combination therapy and patients with renal insufficiency on admission. RESULTS: Thirty-five thousand six hundred fifty-four patients met inclusion criteria (vancomycin and piperacillin-tazobactam, n = 27,459; vancomycin and cefepime, n = 6,371; vancomycin and meropenem, n = 1,824). 
Vancomycin and piperacillin-tazobactam was associated with a higher risk of acute kidney injury and initiation of dialysis when compared with that of both vancomycin and cefepime (acute kidney injury: OR, 1.37 [95% CI, 1.25-1.49]; dialysis: OR, 1.28 [95% CI, 1.14-1.45]) and vancomycin and meropenem (acute kidney injury: OR, 1.27 [95% CI, 1.06-1.52]; dialysis: OR, 1.56 [95% CI, 1.23-2.00]). The odds of developing acute kidney injury were especially pronounced in patients without renal insufficiency receiving a longer duration of vancomycin and piperacillin-tazobactam therapy compared with vancomycin and meropenem therapy. INTERPRETATION: Vancomycin and piperacillin-tazobactam is associated with a higher risk of acute kidney injury than both vancomycin and cefepime and vancomycin and meropenem in patients in the ICU, especially for patients with normal initial kidney function requiring longer durations of therapy. Clinicians should consider vancomycin and meropenem or vancomycin and cefepime to reduce the risk of nephrotoxicity for patients in the ICU.


Subjects
Acute Kidney Injury , Anti-Bacterial Agents , Humans , Anti-Bacterial Agents/therapeutic use , Cefepime/adverse effects , Vancomycin/adverse effects , Retrospective Studies , Meropenem/adverse effects , Critical Illness/therapy , Piperacillin/adverse effects , Drug Therapy, Combination , Piperacillin, Tazobactam Drug Combination/adverse effects , Acute Kidney Injury/chemically induced , Acute Kidney Injury/epidemiology
11.
Sci Data ; 10(1): 1, 2023 01 03.
Article in English | MEDLINE | ID: mdl-36596836

ABSTRACT

Digital data collection during routine clinical practice is now ubiquitous within hospitals. The data contains valuable information on the care of patients and their response to treatments, offering exciting opportunities for research. Typically, data are stored within archival systems that are not intended to support research. These systems are often inaccessible to researchers and structured for optimal storage, rather than interpretability and analysis. Here we present MIMIC-IV, a publicly available database sourced from the electronic health record of the Beth Israel Deaconess Medical Center. Information available includes patient measurements, orders, diagnoses, procedures, treatments, and deidentified free-text clinical notes. MIMIC-IV is intended to support a wide array of research studies and educational material, helping to reduce barriers to conducting clinical research.


Subjects
Electronic Health Records , Humans , Databases, Factual , Hospitals
12.
J Am Med Inform Assoc ; 30(4): 718-725, 2023 03 16.
Article in English | MEDLINE | ID: mdl-36688534

ABSTRACT

OBJECTIVE: Convert the Medical Information Mart for Intensive Care (MIMIC)-IV database into Health Level 7 Fast Healthcare Interoperability Resources (FHIR). Additionally, generate and publish an openly available demo of the resources, and create a FHIR Implementation Guide to support and clarify the usage of MIMIC-IV on FHIR. MATERIALS AND METHODS: FHIR profiles and terminology system of MIMIC-IV were modeled from the base FHIR R4 resources. Data and terminology were reorganized from the relational structure into FHIR according to the profiles. Resources generated were validated for conformance with the FHIR profiles. Finally, FHIR resources were published as newline delimited JSON files and the profiles were packaged into an implementation guide. RESULTS: The modeling of MIMIC-IV in FHIR resulted in 25 profiles, 2 extensions, 35 ValueSets, and 34 CodeSystems. An implementation guide encompassing the FHIR modeling can be accessed at mimic.mit.edu/fhir/mimic. The generated demo dataset contained 100 patients and over 915 000 resources. The full dataset contained 315 000 patients covering approximately 5 840 000 resources. The final datasets in NDJSON format are accessible on PhysioNet. DISCUSSION: Our work highlights the challenges and benefits of generating a real-world FHIR store. The challenges arise from terminology mapping and profiling modeling decisions. The benefits come from the extensively validated openly accessible data created as a result of the modeling work. CONCLUSION: The newly created MIMIC-IV on FHIR provides one of the first accessible deidentified critical care FHIR datasets. The extensive real-world data found in MIMIC-IV on FHIR will be invaluable for research and the development of healthcare applications.


Subjects
Health Level Seven , Information Dissemination , Information Storage and Retrieval , Patients , Information Storage and Retrieval/methods , Information Storage and Retrieval/standards , Humans , Datasets as Topic , Reproducibility of Results , Electronic Health Records , Information Dissemination/methods
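
The abstract above states that the FHIR resources were published as newline delimited JSON (NDJSON), where each line holds one complete JSON object. A minimal sketch of reading such data with the Python standard library follows. The helper name and the three sample records are hypothetical and are not taken from the published dataset; only the `resourceType` field is assumed, since every FHIR resource carries it.

```python
import json
from collections import Counter


def count_resource_types(lines):
    """Count FHIR resources by resourceType in an iterable of NDJSON lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        resource = json.loads(line)  # each line is one complete JSON object
        counts[resource.get("resourceType", "Unknown")] += 1
    return counts


# Hypothetical three-record sample for illustration only.
sample = [
    '{"resourceType": "Patient", "id": "example-1"}',
    '{"resourceType": "Patient", "id": "example-2"}',
    '{"resourceType": "Observation", "id": "example-3", "status": "final"}',
]
print(count_resource_types(sample))
```

Because the function accepts any iterable of strings, an open text file handle can be passed directly, which keeps memory use flat for files with millions of resources.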
13.
JAMIA Open ; 5(4): ooac105, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36570030

ABSTRACT

EHR-based sepsis research often uses heterogeneous definitions of sepsis, leading to poor generalizability and difficulty in comparing studies to each other. We have developed OpenSep, an open-source pipeline for sepsis phenotyping according to the Sepsis-3 definition, as well as determination of time of sepsis onset and SOFA scores. The Minimal Sepsis Data Model was developed alongside the pipeline to enable the execution of the pipeline on diverse sources of electronic health record data. The pipeline's accuracy was validated by applying it to the MIMIC-IV version 1.0 data and comparing sepsis onset and SOFA scores to those produced by the pipeline developed by the curators of MIMIC. We demonstrated high reliability between both the sepsis onsets and SOFA scores; however, the use of the Minimal Sepsis Data Model developed for this work allows our pipeline to be applied more broadly to data sources beyond MIMIC.

14.
15.
Sci Data ; 9(1): 487, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35948551

ABSTRACT

Chest radiographs allow for the meticulous examination of a patient's chest but demand specialized training for proper interpretation. Automated analysis of medical imaging has become increasingly accessible with the advent of machine learning (ML) algorithms. Large labeled datasets are key elements for training and validation of these ML solutions. In this paper we describe the Brazilian labeled chest x-ray dataset, BRAX: an automatically labeled dataset designed to assist researchers in the validation of ML models. The dataset contains 24,959 chest radiography studies from patients presenting to a large general Brazilian hospital. A total of 40,967 images are available in the BRAX dataset. All images have been verified by trained radiologists and de-identified to protect patient privacy. Fourteen labels were derived from free-text radiology reports written in Brazilian Portuguese using Natural Language Processing.


Subjects
Algorithms , Natural Language Processing , Radiography, Thoracic , Brazil , Humans , X-Rays
16.
Crit Care Med ; 50(7): 1040-1050, 2022 07 01.
Article in English | MEDLINE | ID: mdl-35354159

ABSTRACT

OBJECTIVES: To develop and demonstrate the feasibility of a Global Open Source Severity of Illness Score (GOSSIS)-1 for critical care patients, which generalizes across healthcare systems and countries. DESIGN: A merger of several critical care multicenter cohorts derived from registry and electronic health record data. Data were split into training (70%) and test (30%) sets, using each set exclusively for development and evaluation, respectively. Missing data were imputed when not available. SETTING/PATIENTS: Two large multicenter datasets from Australia and New Zealand (Australian and New Zealand Intensive Care Society Adult Patient Database [ANZICS-APD]) and the United States (eICU Collaborative Research Database [eICU-CRD]) representing 249,229 and 131,051 patients, respectively. ANZICS-APD and eICU-CRD contributed data from 162 and 204 hospitals, respectively. The cohort included all ICU admissions discharged in 2014-2015, excluding patients less than 16 years old, admissions less than 6 hours, and those with a previous ICU stay. INTERVENTIONS: Not applicable. MEASUREMENTS AND MAIN RESULTS: GOSSIS-1 uses data collected during the ICU stay's first 24 hours, including extrema values for vital signs and laboratory results, admission diagnosis, the Glasgow Coma Scale, chronic comorbidities, and admission/demographic variables. The datasets showed significant variation in admission-related variables, case-mix, and average physiologic state. Despite this heterogeneity, test set discrimination of GOSSIS-1 was high (area under the receiver operator characteristic curve [AUROC], 0.918; 95% CI, 0.915-0.921) and calibration was excellent (standardized mortality ratio [SMR], 0.986; 95% CI, 0.966-1.005; Brier score, 0.050). Performance was held within ANZICS-APD (AUROC, 0.925; SMR, 0.982; Brier score, 0.047) and eICU-CRD (AUROC, 0.904; SMR, 0.992; Brier score, 0.055). 
Compared with GOSSIS-1, Acute Physiology and Chronic Health Evaluation (APACHE)-IIIj (ANZICS-APD) and APACHE-IVa (eICU-CRD), had worse discrimination with AUROCs of 0.904 and 0.869, and poorer calibration with SMRs of 0.594 and 0.770, and Brier scores of 0.059 and 0.063, respectively. CONCLUSIONS: GOSSIS-1 is a modern, free, open-source inhospital mortality prediction algorithm for critical care patients, achieving excellent discrimination and calibration across three countries.


Subjects
Critical Care , Intensive Care Units , APACHE , Adolescent , Adult , Australia , Hospital Mortality , Humans
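
Calibration in the abstract above is summarised by the standardized mortality ratio (SMR) and the Brier score. A minimal sketch of both, assuming binary outcomes (1 = died, 0 = survived) and one predicted mortality probability per patient, is shown below. The function names and the four-patient example are illustrative assumptions, not GOSSIS-1 code or data.

```python
def standardized_mortality_ratio(outcomes, predicted):
    """SMR = observed deaths / expected deaths (sum of predicted risks)."""
    expected = sum(predicted)
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return sum(outcomes) / expected


def brier_score(outcomes, predicted):
    """Mean squared difference between predicted risk and observed outcome."""
    if len(outcomes) != len(predicted) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for o, p in zip(outcomes, predicted)) / len(outcomes)


# Hypothetical outcomes and predicted risks for four ICU stays.
died = [1, 0, 0, 1]
risk = [0.9, 0.1, 0.2, 0.8]
print(round(standardized_mortality_ratio(died, risk), 3))
print(round(brier_score(died, risk), 3))
```

An SMR near 1 means predicted and observed deaths roughly balance overall, while a lower Brier score indicates predictions closer to the observed outcomes for individual patients.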
17.
Sci Rep ; 12(1): 2726, 2022 02 17.
Article in English | MEDLINE | ID: mdl-35177653

ABSTRACT

Temporal dataset shift associated with changes in healthcare over time is a barrier to deploying machine learning-based clinical decision support systems. Algorithms that learn robust models by estimating invariant properties across time periods for domain generalization (DG) and unsupervised domain adaptation (UDA) might be suitable to proactively mitigate dataset shift. The objective was to characterize the impact of temporal dataset shift on clinical prediction models and benchmark DG and UDA algorithms on improving model robustness. In this cohort study, intensive care unit patients from the MIMIC-IV database were categorized by year groups (2008-2010, 2011-2013, 2014-2016 and 2017-2019). Tasks were predicting mortality, long length of stay, sepsis and invasive ventilation. Feedforward neural networks were used as prediction models. The baseline experiment trained models using empirical risk minimization (ERM) on 2008-2010 (ERM[08-10]) and evaluated them on subsequent year groups. DG experiment trained models using algorithms that estimated invariant properties using 2008-2016 and evaluated them on 2017-2019. UDA experiment leveraged unlabelled samples from 2017 to 2019 for unsupervised distribution matching. DG and UDA models were compared to ERM[08-16] models trained using 2008-2016. Main performance measures were area-under-the-receiver-operating-characteristic curve (AUROC), area-under-the-precision-recall curve and absolute calibration error. Threshold-based metrics including false-positives and false-negatives were used to assess the clinical impact of temporal dataset shift and its mitigation strategies. In the baseline experiments, dataset shift was most evident for sepsis prediction (maximum AUROC drop, 0.090; 95% confidence interval (CI), 0.080-0.101). 
Considering a scenario of 100 consecutively admitted patients showed that ERM[08-10] applied to 2017-2019 was associated with one additional false-negative among 11 patients with sepsis, when compared to the model applied to 2008-2010. When compared with ERM[08-16], DG and UDA experiments failed to produce more robust models (range of AUROC difference, - 0.003 to 0.050). In conclusion, DG and UDA failed to produce more robust models compared to ERM in the setting of temporal dataset shift. Alternate approaches are required to preserve model performance over time in clinical medicine.


Subjects
Databases, Factual , Intensive Care Units , Length of Stay , Models, Biological , Neural Networks, Computer , Sepsis , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged , Sepsis/mortality , Sepsis/therapy
18.
J Immunother ; 44(8): 307-318, 2021 10 01.
Article in English | MEDLINE | ID: mdl-34406158

ABSTRACT

Long-term survival outcomes among melanoma patients with brain metastases treated with immune checkpoint inhibitors are limited. In this retrospective study at 2 centers, metastatic melanoma patients with radiographic evidence of brain metastases who received anti-programmed death-1 (PD-1) monotherapy or nivolumab in combination with ipilimumab between 2014 and 2017 were included. Overall survival (OS) was assessed in diagnosis-specific graded prognostic assessment (ds-GPA) and melanoma-molecular graded prognostic assessment (molGPA) prognostic risk groups. Baseline clinical covariates were used to identify predictors of OS in univariate/multivariable Cox proportional-hazards models. A total of 84 patients (58 monotherapy, 26 combination) were included with a median duration of follow-up of 43.4 months (maximum: 5.1 y). The median OS [95% confidence interval (CI)] was 3.1 months (1.8, 7) for ds-GPA 0-1, 22.1 months [5.4, not reached (NR)] for ds-GPA 2 and NR (24.9, NR) for ds-GPA 3-4 in the monotherapy cohort [hazard ratio (HR) for ds-GPA 3-4 vs. 0-1: 0.13 (95% CI: 0.052, 0.32); 0.29 (95% CI: 0.12, 0.63) for ds-GPA 2 vs. 0-1]. The median OS was 1.1 months (95% CI: 0.3, NR) for ds-GPA 0-1, 11.8 months (95% CI: 2.9, 23.3) for ds-GPA 2 and 24.4 months (95% CI: 3.4, NR) for ds-GPA 3-4 in the combination cohort [HR for 3-4 vs. 0-1: 0.013 (95% CI: 0.0012, 0.14); HR for ds-GPA 2 vs. 0-1: 0.033 (0.0035, 0.31)]. Predictors associated with longer survival included ds-GPA or molGPA>1 (among prognostic indices), neutrophil-to-lymphocyte ratio (<4 vs. ≥4), while high lactate dehydrogenase, neurological symptoms, and leptomeningeal metastases were associated with shorter survival. Baseline ds-GPA/molGPA>1 and neutrophil-to-lymphocyte ratio <4 were strong predictors of long-term survival to anti-PD-1-based immune checkpoint inhibitors in melanoma brain metastases patients previously naive to anti-PD-1 therapy in a real-world clinical setting treated at independent centers.


Subjects
Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Brain Neoplasms/drug therapy , CTLA-4 Antigen/antagonists & inhibitors , Immune Checkpoint Inhibitors/therapeutic use , Ipilimumab/therapeutic use , Melanoma/drug therapy , Nivolumab/therapeutic use , Programmed Cell Death 1 Receptor/antagonists & inhibitors , Adult , Aged , Aged, 80 and over , Brain Neoplasms/mortality , Brain Neoplasms/secondary , Female , Humans , Male , Melanoma/mortality , Melanoma/pathology , Middle Aged , Prognosis , Proportional Hazards Models , Retrospective Studies , Young Adult
19.
PLoS One ; 16(7): e0253933, 2021.
Article in English | MEDLINE | ID: mdl-34260619

ABSTRACT

BACKGROUND: Studies in patients receiving invasive ventilation show important differences in use of low tidal volume (VT) ventilation (LTVV) between females and males. The aims of this study were to describe temporal changes in VT and to determine what factors drive the sex difference in use of LTVV. METHODS AND FINDINGS: This is a post hoc analysis of 2 large longitudinal projects in 59 ICUs in the United States, the 'Medical Information Mart for Intensive Care III' (MIMIC III) and the 'eICU Collaborative Research DataBase'. The proportion of patients under LTVV (median VT < 8 ml/kg PBW) was the primary outcome. Mediation analysis, a method to dissect total effect into direct and indirect effects, was used to understand which factors drive the sex difference. We included 3614 (44%) females and 4593 (56%) males. Median VT declined over the years, but with a persistent difference between females (from median 10.2 (9.1 to 11.4) to 8.2 (7.5 to 9.1) ml/kg PBW) vs. males (from median 9.2 [IQR 8.2 to 10.1] to 7.3 [IQR 6.6 to 8.0] ml/kg PBW) (P < .001). In females versus males, use of LTVV increased from 5 to 50% versus from 12 to 78% (difference, -27% [-29% to -25%]; P < .001). The sex difference was mainly driven by patients' body height and actual body weight (adjusted average causal mediation effect, -30% [-33% to -27%]; P < .001, and 4% [3% to 4%]; P < .001). CONCLUSIONS: While LTVV is increasingly used in females and males, females continue to receive LTVV less often than males. The sex difference is mainly driven by patients' body height and actual body weight, and not necessarily by sex. Use of LTVV in females could improve by paying more attention to a correct calculation of VT, i.e., using the correct body height.


Subjects
Intensive Care Units , Mediation Analysis , Respiration, Artificial , Sex Characteristics , Body Weight , Cohort Studies , Female , Humans , Male , Multivariate Analysis , Tidal Volume
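
The study above defines LTVV as a median VT below 8 ml/kg predicted body weight (PBW) and points to body height as the key input. The abstract does not state which PBW formula was used; the sketch below assumes the widely used ARDS Network formula (50 kg for males or 45.5 kg for females, plus 0.91 kg per cm of height above 152.4 cm). The function names and the example patient are hypothetical.

```python
def predicted_body_weight_kg(height_cm, sex):
    """PBW in kg, assuming the ARDS Network formula (not stated in the abstract)."""
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    base = 50.0 if sex == "male" else 45.5
    return base + 0.91 * (height_cm - 152.4)


def tidal_volume_ml_per_kg_pbw(vt_ml, height_cm, sex):
    """Tidal volume normalised to predicted body weight."""
    return vt_ml / predicted_body_weight_kg(height_cm, sex)


# Hypothetical patient: female, true height 160 cm, set VT 450 ml.
true_ratio = tidal_volume_ml_per_kg_pbw(450, 160, "female")
# The same setting evaluated with a height overestimated by 10 cm.
wrong_ratio = tidal_volume_ml_per_kg_pbw(450, 170, "female")
print(round(true_ratio, 2), round(wrong_ratio, 2))
```

With the overestimated height the same 450 ml appears to fall under the 8 ml/kg PBW threshold, whereas with the true height it exceeds it, which illustrates why the authors stress using the correct body height.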
20.
NPJ Digit Med ; 4(1): 25, 2021 Feb 15.
Article in English | MEDLINE | ID: mdl-33589700

ABSTRACT

Image-based teleconsultation using smartphones has become increasingly popular. In parallel, deep learning algorithms have been developed to detect radiological findings in chest X-rays (CXRs). However, the feasibility of using smartphones to automate this process has yet to be evaluated. This study developed a recalibration method to build deep learning models to detect radiological findings on CXR photographs. Two publicly available databases (MIMIC-CXR and CheXpert) were used to build the models, and four derivative datasets containing 6453 CXR photographs were collected to evaluate model performance. After recalibration, the model achieved areas under the receiver operating characteristic curve of 0.80 (95% confidence interval: 0.78-0.82), 0.88 (0.86-0.90), 0.81 (0.79-0.84), 0.79 (0.77-0.81), 0.84 (0.80-0.88), and 0.90 (0.88-0.92), respectively, for detecting cardiomegaly, edema, consolidation, atelectasis, pneumothorax, and pleural effusion. The recalibration strategy, respectively, recovered 84.9%, 83.5%, 53.2%, 57.8%, 69.9%, and 83.0% of performance losses of the uncalibrated model. We conclude that the recalibration method can transfer models from digital CXRs to CXR photographs, which is expected to support physicians' clinical work.
