Results 1 - 20 of 256
1.
EBioMedicine ; 103: 105116, 2024 May.
Article in English | MEDLINE | ID: mdl-38636199

ABSTRACT

BACKGROUND: Deep learning facilitates large-scale automated imaging evaluation of body composition. However, associations of body composition biomarkers with medical phenotypes have been underexplored. Phenome-wide association study (PheWAS) techniques search for medical phenotypes associated with biomarkers. A PheWAS integrating large-scale analysis of imaging biomarkers and electronic health record (EHR) data could discover previously unreported associations and validate expected associations. Here we use PheWAS methodology to determine the association of abdominal CT-based skeletal muscle metrics with medical phenotypes in a large North American cohort. METHODS: An automated deep learning pipeline was used to measure skeletal muscle index (SMI; biomarker of myopenia) and skeletal muscle density (SMD; biomarker of myosteatosis) from abdominal CT scans of adults between 2012 and 2018. A PheWAS was performed with logistic regression using patient sex and age as covariates to assess for associations between CT-derived muscle metrics and 611 common EHR-derived medical phenotypes. PheWAS P values were considered significant at a Bonferroni corrected threshold (α = 0.05/1222). FINDINGS: 17,646 adults (mean age, 56 years ± 19 [SD]; 57.5% women) were included. CT-derived SMI was significantly associated with 268 medical phenotypes; SMD with 340 medical phenotypes. Previously unreported associations with the highest magnitude of significance included higher SMI with decreased cardiac dysrhythmias (OR [95% CI], 0.59 [0.55-0.64]; P < 0.0001), decreased epilepsy (OR, 0.59 [0.50-0.70]; P < 0.0001), and increased elevated prostate-specific antigen (OR, 1.84 [1.47-2.31]; P < 0.0001), and higher SMD with decreased decubitus ulcers (OR, 0.36 [0.31-0.42]; P < 0.0001), sleep disorders (OR, 0.39 [0.32-0.47]; P < 0.0001), and osteomyelitis (OR, 0.43 [0.36-0.52]; P < 0.0001). 
INTERPRETATION: PheWAS methodology reveals previously unreported associations between CT-derived biomarkers of myopenia and myosteatosis and EHR medical phenotypes. The high-throughput PheWAS technique applied on a population scale can generate research hypotheses related to myopenia and myosteatosis and can be adapted to research possible associations of other imaging biomarkers with hundreds of EHR medical phenotypes. FUNDING: National Institutes of Health, Stanford AIMI-HAI pilot grant, Stanford Precision Health and Integrated Diagnostics, Stanford Cardiovascular Institute, Stanford Center for Digital Health, and Stanford Knight-Hennessy Scholars.


Subjects
Phenotype , Tomography, X-Ray Computed , Humans , Male , Female , Middle Aged , Tomography, X-Ray Computed/methods , Adult , Aged , Body Composition , Biomarkers , Phenomics/methods , Genome-Wide Association Study , Muscle, Skeletal/diagnostic imaging , Muscle, Skeletal/metabolism , Electronic Health Records , Deep Learning
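As a rough illustration of the statistics in the PheWAS above (not the study's code), the Bonferroni threshold and a Wald-style odds-ratio confidence interval can be sketched in a few lines. The 2×2 counts in the helper are hypothetical; the actual analysis fit logistic regressions with age and sex as covariates.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald 95% CI from a 2x2 table:
    a/b = cases/non-cases with the exposure, c/d = without it.
    (Hypothetical helper; the study used covariate-adjusted
    logistic regression rather than raw 2x2 tables.)"""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Bonferroni-corrected threshold used in the study:
# alpha = 0.05 over 611 phenotypes x 2 muscle metrics = 1222 tests
ALPHA = 0.05 / 1222
```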
2.
NPJ Digit Med ; 7(1): 42, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38383884

ABSTRACT

A major barrier to deploying healthcare AI is trustworthiness. One form of trustworthiness is a model's robustness across subgroups: while models may exhibit expert-level performance on aggregate metrics, they often rely on non-causal features, leading to errors in hidden subgroups. To take a step closer towards trustworthy seizure onset detection from EEG, we propose to leverage annotations that are produced by healthcare personnel in routine clinical workflows-which we refer to as workflow notes-that include multiple event descriptions beyond seizures. Using workflow notes, we first show that by scaling training data to 68,920 EEG hours, seizure onset detection performance significantly improves by 12.3 AUROC (Area Under the Receiver Operating Characteristic) points compared to relying on smaller training sets with gold-standard labels. Second, we reveal that our binary seizure onset detection model underperforms on clinically relevant subgroups (e.g., up to a margin of 6.5 AUROC points between pediatrics and adults), while having significantly higher FPRs (False Positive Rates) on EEG clips showing non-epileptiform abnormalities (+19 FPR points). To improve model robustness to hidden subgroups, we train a multilabel model that classifies 26 attributes other than seizures (e.g., spikes and movement artifacts) and significantly improve overall performance (+5.9 AUROC points) while greatly improving performance among subgroups (up to +8.3 AUROC points) and decreasing false positives on non-epileptiform abnormalities (by 8 FPR points). Finally, we find that our multilabel model improves clinical utility (false positives per 24 EEG hours) by a factor of 2×.
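The AUROC figures quoted above have a simple rank interpretation: the probability that a randomly chosen positive clip is scored above a randomly chosen negative one. A minimal (quadratic-time) sketch of that computation, with made-up scores:

```python
def auroc(scores_pos, scores_neg):
    """AUROC via the Mann-Whitney formulation: fraction of
    positive/negative pairs where the positive (e.g., seizure-onset)
    clip gets the higher score; ties count as half."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in scores_pos
        for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))
```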

3.
J Biomed Inform ; 147: 104522, 2023 11.
Article in English | MEDLINE | ID: mdl-37827476

ABSTRACT

OBJECTIVE: Audit logs in electronic health record (EHR) systems capture interactions of providers with clinical data. We determine if machine learning (ML) models trained using audit logs in conjunction with clinical data ("observational supervision") outperform ML models trained using clinical data alone in clinical outcome prediction tasks, and whether they are more robust to temporal distribution shifts in the data. MATERIALS AND METHODS: Using clinical and audit log data from Stanford Healthcare, we trained and evaluated various ML models including logistic regression, support vector machine (SVM) classifiers, neural networks, random forests, and gradient boosted machines (GBMs) on clinical EHR data, with and without audit logs for two clinical outcome prediction tasks: major adverse kidney events within 120 days of ICU admission (MAKE-120) in acute kidney injury (AKI) patients and 30-day readmission in acute stroke patients. We further tested the best performing models using patient data acquired during different time-intervals to evaluate the impact of temporal distribution shifts on model performance. RESULTS: Performance generally improved for all models when trained with clinical EHR data and audit log data compared with those trained with only clinical EHR data, with GBMs tending to have the overall best performance. GBMs trained with clinical EHR data and audit logs outperformed GBMs trained without audit logs in both clinical outcome prediction tasks: AUROC 0.88 (95% CI: 0.85-0.91) vs. 0.79 (95% CI: 0.77-0.81), respectively, for MAKE-120 prediction in AKI patients, and AUROC 0.74 (95% CI: 0.71-0.77) vs. 0.63 (95% CI: 0.62-0.64), respectively, for 30-day readmission prediction in acute stroke patients. The performance of GBM models trained using audit log and clinical data degraded less in later time-intervals than models trained using only clinical data. 
CONCLUSION: Observational supervision with audit logs improved the performance of ML models trained to predict important clinical outcomes in patients with AKI and acute stroke, and improved robustness to temporal distribution shifts.


Subjects
Acute Kidney Injury , Stroke , Humans , Electronic Health Records , Hospitalization , Prognosis
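A minimal sketch of the "observational supervision" idea above, assuming hypothetical audit-log event records with an `action` field: derive per-action counts and concatenate them with the clinical feature vector before model training. The study's actual feature engineering and models (e.g., GBMs) are more elaborate.

```python
from collections import Counter

def audit_log_features(events, actions=("chart_view", "order_entry", "note_write")):
    """Count how often each EHR action appears in a patient's audit
    log; the action names here are illustrative placeholders."""
    counts = Counter(e["action"] for e in events)
    return [counts.get(a, 0) for a in actions]

def build_row(clinical, events):
    # Final feature vector = clinical EHR features + audit-log counts,
    # ready to feed to any tabular model (logistic regression, GBM, ...).
    return list(clinical) + audit_log_features(events)
```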
4.
Breast Cancer Res ; 25(1): 92, 2023 08 06.
Article in English | MEDLINE | ID: mdl-37544983

ABSTRACT

BACKGROUND: Breast density is strongly associated with breast cancer risk. Fully automated quantitative density assessment methods have recently been developed that could facilitate large-scale studies, although data on associations with long-term breast cancer risk are limited. We examined LIBRA assessments and breast cancer risk and compared results to prior assessments using Cumulus, an established computer-assisted method requiring manual thresholding. METHODS: We conducted a cohort study among 21,150 non-Hispanic white female participants of the Research Program in Genes, Environment and Health of Kaiser Permanente Northern California who were 40-74 years at enrollment, followed for up to 10 years, and had archived processed screening mammograms acquired on Hologic or General Electric full-field digital mammography (FFDM) machines and prior Cumulus density assessments available for analysis. Dense area (DA), non-dense area (NDA), and percent density (PD) were assessed using LIBRA software. Cox regression was used to estimate hazard ratios (HRs) for breast cancer associated with DA, NDA and PD modeled continuously in standard deviation (SD) increments, adjusting for age, mammogram year, body mass index, parity, first-degree family history of breast cancer, and menopausal hormone use. We also examined differences by machine type and breast view. RESULTS: The adjusted HRs for breast cancer associated with each SD increment of DA, NDA and PD were 1.36 (95% confidence interval, 1.18-1.57), 0.85 (0.77-0.93) and 1.44 (1.26-1.66) for LIBRA and 1.44 (1.33-1.55), 0.81 (0.74-0.89) and 1.54 (1.34-1.77) for Cumulus, respectively. LIBRA results were generally similar by machine type and breast view, although associations were strongest for Hologic machines and mediolateral oblique views. Results were also similar during the first 2 years, 2-5 years and 5-10 years after the baseline mammogram. 
CONCLUSION: Associations with breast cancer risk were generally similar for LIBRA and Cumulus density measures and were sustained for up to 10 years. These findings support the suitability of fully automated LIBRA assessments on processed FFDM images for large-scale research on breast density and cancer risk.


Subjects
Breast Neoplasms , Female , Humans , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/epidemiology , Breast Density , Cohort Studies , Whites , Breast/diagnostic imaging , Mammography/methods , Risk Factors , Case-Control Studies
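Because a Cox model is log-linear in the covariate, the per-SD hazard ratios reported above extrapolate multiplicatively to larger increments. A one-line sketch of that arithmetic (illustrative only, using the study's reported per-SD HR for percent density):

```python
def hr_for_increment(hr_per_sd, n_sd):
    """Under a log-linear Cox model, the hazard ratio for an n-SD
    change in the covariate is the per-SD hazard ratio to the n-th
    power (hr_per_sd ** n_sd), since log-hazard is linear in it."""
    return hr_per_sd ** n_sd

# e.g., per-SD HR of 1.44 implies roughly 2.07 for a 2-SD increase
two_sd_hr = hr_for_increment(1.44, 2)
```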
5.
J Med Imaging (Bellingham) ; 10(3): 034004, 2023 May.
Article in English | MEDLINE | ID: mdl-37388280

ABSTRACT

Purpose: Our study investigates whether graph-based fusion of imaging data with non-imaging electronic health record (EHR) data can improve the prediction of disease trajectories for patients with coronavirus disease 2019 (COVID-19) beyond the performance of imaging or non-imaging EHR data alone. Approach: We present a fusion framework for fine-grained clinical outcome prediction [discharge, intensive care unit (ICU) admission, or death] that fuses imaging and non-imaging information using a similarity-based graph structure. Node features are represented by image embeddings, and edges are encoded with clinical or demographic similarity. Results: Experiments on data collected from the Emory Healthcare Network indicate that our fusion modeling scheme performs consistently better than predictive models developed using only imaging or non-imaging features, with areas under the receiver operating characteristic curve of 0.76, 0.90, and 0.75 for discharge from hospital, mortality, and ICU admission, respectively. External validation was performed on data collected from the Mayo Clinic. Our scheme also highlights known biases in the model's predictions, such as bias against patients with a history of alcohol abuse and bias based on insurance status. Conclusions: Our study signifies the importance of fusing multiple data modalities for accurate prediction of clinical trajectories. The proposed graph structure can model relationships between patients based on non-imaging EHR data, and graph convolutional networks can fuse this relationship information with imaging data to predict future disease trajectories more effectively than models employing only imaging or non-imaging data. Our graph-based fusion framework can be easily extended to other prediction tasks to efficiently combine imaging data with non-imaging clinical data.
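A toy sketch of the similarity-based edge formation described above, assuming hypothetical patient dictionaries with a few non-imaging EHR attributes; the paper's actual similarity metric and node features (image embeddings) are richer than this.

```python
def similarity_edge(p1, p2, keys=("age_group", "sex", "comorbidity")):
    """Hypothetical edge weight: fraction of shared non-imaging EHR
    attributes between two patients (1.0 = identical on all keys)."""
    shared = sum(p1[k] == p2[k] for k in keys)
    return shared / len(keys)

def build_graph(patients, threshold=0.5):
    """Connect patient pairs whose clinical/demographic similarity
    exceeds the threshold; returns (i, j, weight) edge tuples."""
    edges = []
    for i in range(len(patients)):
        for j in range(i + 1, len(patients)):
            w = similarity_edge(patients[i], patients[j])
            if w > threshold:
                edges.append((i, j, w))
    return edges
```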

6.
Tomography ; 9(3): 995-1009, 2023 05 11.
Article in English | MEDLINE | ID: mdl-37218941

ABSTRACT

Preclinical imaging is a critical component in translational research with significant complexities in workflow and site differences in deployment. Importantly, the National Cancer Institute's (NCI) precision medicine initiative emphasizes the use of translational co-clinical oncology models to address the biological and molecular bases of cancer prevention and treatment. The use of oncology models, such as patient-derived tumor xenografts (PDX) and genetically engineered mouse models (GEMMs), has ushered in an era of co-clinical trials by which preclinical studies can inform clinical trials and protocols, thus bridging the translational divide in cancer research. Similarly, preclinical imaging fills a translational gap as an enabling technology for translational imaging research. Unlike clinical imaging, where equipment manufacturers strive to meet standards in practice at clinical sites, standards are neither fully developed nor implemented in preclinical imaging. This fundamentally limits the collection and reporting of metadata to qualify preclinical imaging studies, thereby hindering open science and impacting the reproducibility of co-clinical imaging research. To begin to address these issues, the NCI co-clinical imaging research program (CIRP) conducted a survey to identify metadata requirements for reproducible quantitative co-clinical imaging. The enclosed consensus-based report summarizes co-clinical imaging metadata information (CIMI) to support quantitative co-clinical imaging research with broad implications for capturing co-clinical data, enabling interoperability and data sharing, as well as potentially leading to updates to the preclinical Digital Imaging and Communications in Medicine (DICOM) standard.


Subjects
Metadata , Neoplasms , Animals , Mice , Humans , Reproducibility of Results , Diagnostic Imaging , Neoplasms/diagnostic imaging , Reference Standards
7.
Tomography ; 9(2): 810-828, 2023 04 10.
Article in English | MEDLINE | ID: mdl-37104137

ABSTRACT

Co-clinical trials are the concurrent or sequential evaluation of therapeutics in both patients clinically and patient-derived xenografts (PDX) pre-clinically, in a manner designed to match the pharmacokinetics and pharmacodynamics of the agent(s) used. The primary goal is to determine the degree to which PDX cohort responses recapitulate patient cohort responses at the phenotypic and molecular levels, such that pre-clinical and clinical trials can inform one another. A major issue is how to manage, integrate, and analyze the abundance of data generated across both spatial and temporal scales, as well as across species. To address this issue, we are developing MIRACCL (molecular and imaging response analysis of co-clinical trials), a web-based analytical tool. For prototyping, we simulated data for a co-clinical trial in "triple-negative" breast cancer (TNBC) by pairing pre- (T0) and on-treatment (T1) magnetic resonance imaging (MRI) from the I-SPY2 trial, as well as PDX-based T0 and T1 MRI. Baseline (T0) and on-treatment (T1) RNA expression data were also simulated for TNBC and PDX. Image features derived from both datasets were cross-referenced to omic data to evaluate MIRACCL functionality for correlating and displaying MRI-based changes in tumor size, vascularity, and cellularity with changes in mRNA expression as a function of treatment.


Subjects
Triple Negative Breast Neoplasms , Humans , Triple Negative Breast Neoplasms/pathology , Magnetic Resonance Imaging , Image Processing, Computer-Assisted
8.
IEEE Trans Med Imaging ; 42(7): 1932-1943, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37018314

ABSTRACT

The collection and curation of large-scale medical datasets from multiple institutions is essential for training accurate deep learning models, but privacy concerns often hinder data sharing. Federated learning (FL) is a promising solution that enables privacy-preserving collaborative learning among different institutions, but it generally suffers from performance deterioration due to heterogeneous data distributions and a lack of quality labeled data. In this paper, we present a robust and label-efficient self-supervised FL framework for medical image analysis. Our method introduces a novel Transformer-based self-supervised pre-training paradigm that pre-trains models directly on decentralized target task datasets using masked image modeling, to facilitate more robust representation learning on heterogeneous data and effective knowledge transfer to downstream models. Extensive empirical results on simulated and real-world medical imaging non-IID federated datasets show that masked image modeling with Transformers significantly improves the robustness of models against various degrees of data heterogeneity. Notably, under severe data heterogeneity, our method, without relying on any additional pre-training data, achieves an improvement of 5.06%, 1.53% and 4.58% in test accuracy on retinal, dermatology and chest X-ray classification compared to the supervised baseline with ImageNet pre-training. In addition, we show that our federated self-supervised pre-training methods yield models that generalize better to out-of-distribution data and perform more effectively when fine-tuning with limited labeled data, compared to existing FL algorithms. The code is available at https://github.com/rui-yan/SSL-FL.


Subjects
Algorithms , Diagnostic Imaging , Radiography , Retina
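For context, the classic FedAvg aggregation step that underlies most federated learning setups, including the baselines here, can be sketched as a dataset-size-weighted parameter average. Parameters are shown as flat lists for simplicity, and this omits the paper's masked-image-modeling pre-training entirely.

```python
def fedavg(client_weights, client_sizes):
    """Minimal FedAvg sketch: average each model parameter across
    clients, weighted by each client's local dataset size.
    client_weights: one flat parameter list per client."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * s for w, s in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]
```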
9.
Article in English | MEDLINE | ID: mdl-37018684

ABSTRACT

Reduction in the 30-day readmission rate is an important quality factor for hospitals, as it can reduce the overall cost of care and improve patient post-discharge outcomes. While deep-learning-based studies have shown promising empirical results, prior models for hospital readmission prediction have several limitations: (a) only patients with certain conditions are considered, (b) data temporality is not leveraged, (c) individual admissions are assumed to be independent of each other, which ignores patient similarity, and (d) models are limited to single-modality or single-center data. In this study, we propose a multimodal, spatiotemporal graph neural network (MM-STGNN) for prediction of 30-day all-cause hospital readmission, which fuses in-patient multimodal, longitudinal data and models patient similarity using a graph. Using longitudinal chest radiographs and electronic health records from two independent centers, we show that MM-STGNN achieved an area under the receiver operating characteristic curve (AUROC) of 0.79 on both datasets. Furthermore, MM-STGNN significantly outperformed the current clinical reference standard, LACE+ (AUROC=0.61), on the internal dataset. For the subset of patients with heart disease, our model significantly outperformed baselines such as gradient boosting and Long Short-Term Memory models (e.g., AUROC improved by 3.7 points in patients with heart disease). Qualitative interpretability analysis indicated that while patients' primary diagnoses were not explicitly used to train the model, features crucial for model prediction may reflect patients' diagnoses. Our model could be utilized as an additional clinical decision aid during discharge disposition and for triaging high-risk patients for closer post-discharge follow-up and potential preventive measures.

10.
J Cardiovasc Electrophysiol ; 34(5): 1164-1174, 2023 05.
Article in English | MEDLINE | ID: mdl-36934383

ABSTRACT

BACKGROUND: Structural changes in the left atrium (LA) modestly predict outcomes in patients undergoing catheter ablation for atrial fibrillation (AF). Machine learning (ML) is a promising approach to personalize AF management strategies and improve predictive risk models after catheter ablation by integrating atrial geometry from cardiac computed tomography (CT) scans and patient-specific clinical data. We hypothesized that ML approaches based on a patient's specific data can identify responders to AF ablation. METHODS: Consecutive patients undergoing AF ablation who had preprocedural CT scans, demographics, and 1-year follow-up data were included in the study for a retrospective analysis. The model inputs were CT-derived morphological features from left atrial segmentation (including the shape and volume of the LA, LA appendage, and pulmonary vein ostia), deep features learned directly from raw CT images, and clinical data. These were merged in a framework designed to learn their individual importance and produce the optimal classification. RESULTS: Three hundred twenty-one patients (64.2 ± 10.6 years, 69% male, 40% paroxysmal AF) were analyzed. After 10-fold nested cross-validation, the model trained to merge and learn appropriate weights for clinical, morphological, and imaging data (AUC 0.821) outperformed those trained solely on clinical data (AUC 0.626), morphological data (AUC 0.659), or imaging data (AUC 0.764). CONCLUSION: Our ML approach provides an end-to-end automated technique to predict AF ablation outcomes using deep learning from CT images and derived structural properties of the LA, augmented by clinical data in a merged ML framework. This can help develop personalized strategies for patient selection in the invasive management of AF.


Subjects
Atrial Fibrillation , Catheter Ablation , Pulmonary Veins , Humans , Male , Female , Atrial Fibrillation/diagnostic imaging , Atrial Fibrillation/surgery , Atrial Fibrillation/etiology , Retrospective Studies , Treatment Outcome , Heart Atria/diagnostic imaging , Heart Atria/surgery , Tomography, X-Ray Computed/methods , Catheter Ablation/adverse effects , Catheter Ablation/methods , Machine Learning , Recurrence , Pulmonary Veins/diagnostic imaging , Pulmonary Veins/surgery
11.
Comput Biol Med ; 154: 106594, 2023 03.
Article in English | MEDLINE | ID: mdl-36753979

ABSTRACT

State-of-the-art (SOTA) convolutional neural network models have been widely adopted in medical imaging and applied to address different clinical problems. However, the complexity and scale of such models may not be justified in medical imaging given the available resource budget, and increasing the number of representative feature maps for the classification task decreases model explainability. Moreover, current data normalization practice is fixed prior to model development and disregards the specifics of the data domain. Acknowledging these issues, this work proposes a new scalable model family called PlexusNet, whose architecture is regulated by its block design and by scaling the network's depth, width, and branching; computational cost constraints guided the dimensions of this scaling and design. PlexusNet includes a new learnable data normalization algorithm for better data generalization. We applied a simple yet effective neural architecture search to design PlexusNet models tailored to five clinical classification problems; they achieve performance noninferior to the SOTA models ResNet-18 and EfficientNet B0/1 while requiring roughly ten-fold less parameter capacity and fewer representative feature maps than the smallest SOTA models with comparable performance. Visualization of the representative features revealed distinguishable clusters associated with the categories, based on latent features generated by PlexusNet. The package and source code are at https://github.com/oeminaga/PlexusNet.git.


Subjects
Algorithms , Neural Networks, Computer , Diagnostic Imaging , Adaptation, Physiological
12.
J Clin Med ; 12(1)2023 Jan 03.
Article in English | MEDLINE | ID: mdl-36615186

ABSTRACT

With the progression of diabetic retinopathy (DR) from the non-proliferative (NPDR) to the proliferative (PDR) stage, the possibility of vision impairment increases significantly. Therefore, it is clinically important to detect progression to the PDR stage for proper intervention. We propose a segmentation-assisted DR classification methodology that builds on (and improves) current methods by using a fully convolutional network (FCN) to segment retinal neovascularizations (NV) in retinal images prior to image classification. This study utilizes the Kaggle EyePACS dataset, containing retinal photographs from patients with varying degrees of DR (mild, moderate, and severe NPDR, and PDR). Two graders (a board-certified ophthalmologist and a trained medical student) annotated the NV. Segmentation was performed by training an FCN to locate neovascularization on 669 retinal fundus photographs labeled with PDR status according to NV presence. The trained segmentation model was used to locate probable NV in images from the classification dataset. Finally, a CNN was trained to classify the combined images and probability maps into categories of PDR. The mean accuracy of segmentation-assisted classification was 87.71% on the test set (SD = 7.71%). Segmentation-assisted classification of PDR achieved accuracy that was 7.74% better than classification alone. Our study shows that segmentation assistance improves identification of the most severe stage of diabetic retinopathy and has the potential to improve deep learning performance in other imaging problems with limited data availability.
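The "combined images and probability maps" input can be pictured as stacking the predicted NV probability map onto the image as an extra channel. A toy sketch with nested lists standing in for image tensors; the study does not specify its input pipeline at this level, so this is only one plausible reading.

```python
def stack_channels(rgb_image, nv_prob_map):
    """Append the predicted neovascularization probability map as a
    fourth channel, so a classifier sees image + segmentation evidence
    together. rgb_image: H x W x 3 nested lists; nv_prob_map: H x W."""
    h, w = len(rgb_image), len(rgb_image[0])
    assert len(nv_prob_map) == h and len(nv_prob_map[0]) == w
    return [
        [rgb_image[y][x] + [nv_prob_map[y][x]] for x in range(w)]
        for y in range(h)
    ]
```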

13.
medRxiv ; 2022 Oct 28.
Article in English | MEDLINE | ID: mdl-36324799

ABSTRACT

We propose a relational graph to incorporate clinical similarity between patients while building personalized clinical event predictors, with a focus on hospitalized COVID-19 patients. Our graph formation process fuses heterogeneous data, i.e., chest X-rays as node features and non-imaging EHR data for edge formation. While a node represents a snapshot in time for a single patient, the weighted edge structure encodes complex clinical patterns among patients. Whereas age and gender have been used in the past for patient graph formation, our method incorporates complex clinical history while avoiding manual feature selection. The model learns from the patient's own data as well as from patterns among clinically similar patients. Our visualization study investigates the effect of a node's 'neighborhood' on its predictiveness and showcases the model's tendency to focus on edge-connected patients with highly suggestive clinical features in common with the node. The proposed model generalizes well by allowing the edge formation process to adapt to an external cohort.

14.
JCO Clin Cancer Inform ; 6: e2200019, 2022 06.
Article in English | MEDLINE | ID: mdl-35802836

ABSTRACT

PURPOSE: For real-world evidence, it is convenient to use routinely collected data from the electronic medical record (EMR) to measure survival outcomes. However, patients can become lost to follow-up, causing incomplete data and biased survival time estimates. We quantified this issue for patients with metastatic cancer seen in an academic health system by comparing survival estimates from EMR data only and from EMR data combined with high-quality cancer registry data. MATERIALS AND METHODS: Patients diagnosed with metastatic cancer from 2008 to 2014 were included in this retrospective study. Patients who were diagnosed with cancer or received their initial treatment within our system were included in the institutional cancer registry and this study. Overall survival was calculated using the Kaplan-Meier method. Survival curves were generated in two ways: using EMR follow-up data alone and using EMR data supplemented with data from the Stanford Cancer Registry/California Cancer Registry. RESULTS: Four thousand seventy-seven patients were included. The median follow-up using EMR + Cancer Registry data was 19.9 months, and the median follow-up in surviving patients was 67.6 months. There were 1,301 deaths recorded in the EMR and 3,140 deaths recorded in the Cancer Registry. The median overall survival from the date of cancer diagnosis using EMR data was 58.7 months (95% CI, 54.2 to 63.2); using EMR + Cancer Registry data, it was 20.8 months (95% CI, 19.6 to 22.3). A similar pattern was seen using the date of first systemic therapy or date of first hospital admission as the baseline date. CONCLUSION: Using EMR data alone, survival time was overestimated compared with EMR + Cancer Registry data.


Subjects
Electronic Health Records , Neoplasms , Follow-Up Studies , Humans , Neoplasms/diagnosis , Neoplasms/therapy , Registries , Retrospective Studies
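The Kaplan-Meier method referenced in the study above can be sketched in a few lines; `events` marks observed deaths (1) versus censored follow-up (0). Undercounting deaths, as in EMR-only data, turns deaths into censorings and inflates the estimated survival, which is exactly the bias the study quantifies.

```python
def kaplan_meier(times, events):
    """Minimal Kaplan-Meier estimator (no tie grouping).
    times: follow-up durations; events: 1 = death observed,
    0 = censored. Returns (time, survival) pairs at event times."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk, s, curve = len(times), 1.0, []
    for i in order:
        if events[i]:
            s *= (at_risk - 1) / at_risk  # step down at each death
            curve.append((times[i], s))
        at_risk -= 1  # censored subjects silently leave the risk set
    return curve
```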
15.
Circ Arrhythm Electrophysiol ; 15(8): e010850, 2022 08.
Article in English | MEDLINE | ID: mdl-35867397

ABSTRACT

BACKGROUND: Machine learning is a promising approach to personalize atrial fibrillation management strategies for patients after catheter ablation. Prior atrial fibrillation ablation outcome prediction studies applied classical machine learning methods to hand-crafted clinical scores, and none have leveraged intracardiac electrograms or 12-lead surface electrocardiograms for outcome prediction. We hypothesized that (1) machine learning models trained on electrograms or electrocardiogram (ECG) signals can perform better at predicting patient outcomes after atrial fibrillation ablation than existing clinical scores and (2) multimodal fusion of electrogram, ECG, and clinical features can further improve the prediction of patient outcomes. METHODS: Consecutive patients who underwent catheter ablation between 2015 and 2017 with panoramic left atrial electrogram before ablation and clinical follow-up for at least 1 year following ablation were included. Convolutional neural network and a novel multimodal fusion framework were developed for predicting 1-year atrial fibrillation recurrence after catheter ablation from electrogram, ECG signals, and clinical features. The models were trained and validated using 10-fold cross-validation on patient-level splits. RESULTS: One hundred fifty-six patients (64.5±10.5 years, 74% male, 42% paroxysmal) were analyzed. Using electrogram signals alone, the convolutional neural network achieved an area under the receiver operating characteristics curve (AUROC) of 0.731, outperforming the existing APPLE scores (AUROC=0.644) and CHA2DS2-VASc scores (AUROC=0.650). Similarly using 12-lead ECG alone, the convolutional neural network achieved an AUROC of 0.767. Combining electrogram, ECG, and clinical features, the fusion model achieved an AUROC of 0.859, outperforming single and dual modality models. 
CONCLUSIONS: Deep neural networks trained on electrogram or ECG signals improved the prediction of catheter ablation outcome compared with existing clinical scores, and fusion of electrogram, ECG, and clinical features further improved the prediction. This suggests the promise of using machine learning to help treatment planning for patients after catheter ablation.


Subjects
Atrial Fibrillation , Catheter Ablation , Atrial Fibrillation/diagnosis , Atrial Fibrillation/etiology , Atrial Fibrillation/surgery , Catheter Ablation/adverse effects , Female , Heart Atria/surgery , Humans , Machine Learning , Male , Predictive Value of Tests , Recurrence , Treatment Outcome
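As a point of contrast with the learned multimodal fusion model above, the simplest baseline is a fixed-weight late fusion of per-modality recurrence probabilities. The weights and probabilities below are hypothetical; the paper learns the combination end to end rather than fixing it.

```python
def late_fusion(modality_probs, weights):
    """Weighted average of per-modality probabilities (e.g.,
    electrogram, ECG, clinical). A fixed-weight stand-in for the
    learned fusion; weights are normalized to sum to 1."""
    assert len(modality_probs) == len(weights)
    total = sum(weights)
    return sum(p * w for p, w in zip(modality_probs, weights)) / total
```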
16.
Radiol Artif Intell ; 4(3): e210174, 2022 May.
Article in English | MEDLINE | ID: mdl-35652118

ABSTRACT

Purpose: To develop a deep learning-based risk stratification system for thyroid nodules using US cine images. Materials and Methods: In this retrospective study, 192 biopsy-confirmed thyroid nodules (175 benign, 17 malignant) in 167 unique patients (mean age, 56 years ± 16 [SD], 137 women) undergoing cine US between April 2017 and May 2018 with American College of Radiology (ACR) Thyroid Imaging Reporting and Data System (TI-RADS)-structured radiology reports were evaluated. A deep learning-based system that exploits the cine images obtained during three-dimensional volumetric thyroid scans and outputs malignancy risk was developed and compared, using fivefold cross-validation, against a two-dimensional (2D) deep learning-based model (Static-2DCNN), a radiomics-based model using cine images (Cine-Radiomics), and the ACR TI-RADS level, with histopathologic diagnosis as ground truth. The system was used to revise the ACR TI-RADS recommendation, and its diagnostic performance was compared against the original ACR TI-RADS. Results: The system achieved higher average area under the receiver operating characteristic curve (AUC, 0.88) than Static-2DCNN (0.72, P = .03) and tended toward higher average AUC than Cine-Radiomics (0.78, P = .16) and ACR TI-RADS level (0.80, P = .21). The system downgraded recommendations for 92 benign and two malignant nodules and upgraded none. The revised recommendation achieved higher specificity (139 of 175, 79.4%) than the original ACR TI-RADS (47 of 175, 26.9%; P < .001), with no difference in sensitivity (12 of 17, 71% and 14 of 17, 82%, respectively; P = .63). 
Conclusion: The risk stratification system using US cine images had higher diagnostic performance than prior models and improved specificity of ACR TI-RADS when used to revise ACR TI-RADS recommendation. Keywords: Neural Networks, US, Abdomen/GI, Head/Neck, Thyroid, Computer Applications-3D, Oncology, Diagnosis, Supervised Learning, Transfer Learning, Convolutional Neural Network (CNN). Supplemental material is available for this article. © RSNA, 2022.
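The model comparisons above rest on the area under the ROC curve. As a minimal sketch of how an AUC can be computed as a rank statistic (Mann-Whitney U divided by the number of positive-negative pairs), with hypothetical malignancy scores standing in for model outputs:

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a malignant (positive) case scores
    above a benign (negative) one; ties count as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical scores: 3 malignant and 4 benign nodules
pos = [0.9, 0.8, 0.6]
neg = [0.3, 0.4, 0.7, 0.2]
print(round(auc(pos, neg), 3))  # 0.917
```

In a fivefold cross-validation such as the study's, this quantity would be computed once per fold and averaged.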

17.
IEEE J Biomed Health Inform ; 26(9): 4635-4644, 2022 09.
Article in English | MEDLINE | ID: mdl-35749336

ABSTRACT

Federated learning is an emerging research paradigm that enables collaborative training of deep learning models without sharing patient data. However, data are usually heterogeneous across institutions, which may degrade the performance of models trained with federated learning. In this study, we propose a novel heterogeneity-aware federated learning method, SplitAVG, to overcome the performance drop caused by data heterogeneity in federated learning. Unlike previous federated methods that require complex heuristic training or hyperparameter tuning, SplitAVG uses simple network-split and feature-map-concatenation strategies to encourage the federated model to train an unbiased estimator of the target data distribution. We compare SplitAVG with seven state-of-the-art federated learning methods, using centrally hosted training data as the baseline, on a suite of both synthetic and real-world federated datasets. We find that the performance of models trained with all comparison federated learning methods degrades significantly as the degree of data heterogeneity increases. In contrast, SplitAVG achieves results comparable to the baseline under all heterogeneous settings: on highly heterogeneous data partitions, it reaches 96.2% of the baseline accuracy on a diabetic retinopathy binary classification dataset and 110.4% of the baseline mean absolute error on a bone age prediction dataset. We conclude that SplitAVG can effectively overcome the performance drop caused by variability in data distributions across institutions. Experimental results also show that SplitAVG can be adapted to different base convolutional neural networks (CNNs) and generalized to various types of medical imaging tasks. The code is publicly available at https://github.com/zm17943/SplitAVG.
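The network-split and feature-map-concatenation idea can be sketched with a toy two-layer network in NumPy. This is not the paper's implementation (see the linked repository for that): the split point, layer sizes, and client data are hypothetical, and the backward pass is omitted. Each client runs only the front sub-network on its local data; the server concatenates the resulting feature maps along the batch axis, so the back sub-network trains on a batch drawn from the pooled distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical split: front sub-network weights are broadcast to clients,
# the back sub-network stays on the server.
W_front = rng.normal(size=(8, 16))
W_back = rng.normal(size=(16, 2))

def client_forward(x_local):
    """Client-side pass: intermediate feature maps leave, raw data does not."""
    return relu(x_local @ W_front)

# Two clients with heterogeneous local batches (different input means)
x_a = rng.normal(loc=0.0, size=(4, 8))
x_b = rng.normal(loc=2.0, size=(6, 8))

# Server concatenates feature maps along the batch axis and finishes the pass
feats = np.concatenate([client_forward(x_a), client_forward(x_b)], axis=0)
logits = feats @ W_back
print(logits.shape)  # (10, 2)
```

The concatenated batch is what lets the server-side sub-network see an (approximately) unbiased sample of the target distribution despite skewed client partitions.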


Subjects
Deep Learning , Diagnostic Imaging , Humans , Neural Networks, Computer , Radiography
18.
Radiol Artif Intell ; 4(2): e210092, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35391762

ABSTRACT

Purpose: To automatically identify a cohort of patients with pancreatic cystic lesions (PCLs) and extract PCL measurements from historical CT and MRI reports using natural language processing (NLP) and a question answering system. Materials and Methods: Institutional review board approval was obtained for this retrospective Health Insurance Portability and Accountability Act-compliant study, and the requirement to obtain informed consent was waived. A cohort of free-text CT and MRI reports generated between January 1991 and July 2019 that covered the pancreatic region was identified. A PCL identification model was developed by modifying a rule-based information extraction model; measurement extraction was performed using a state-of-the-art question answering system. The system's performance was evaluated against radiologists' annotations. Results: For this study, 430,426 free-text radiology reports from 199,783 unique patients were identified. The NLP model for identifying PCL was applied to 1000 test samples. The interobserver agreement between the model and two radiologists was almost perfect (Fleiss κ = 0.951), and the false-positive rate and true-positive rate were 3.0% and 98.2%, respectively, against consensus of radiologists' annotations as ground truths. The overall accuracy and Lin concordance correlation coefficient for measurement extraction were 0.958 and 0.874, respectively, against radiologists' annotations as ground truths. Conclusion: An NLP-based system was developed that identifies patients with PCLs and extracts measurements from a large single-institution archive of free-text radiology reports. This approach may prove valuable to study the natural history and potential risks of PCLs and can be applied to many other use cases. Keywords: Informatics, Abdomen/GI, Pancreas, Cysts, Computer Applications-General (Informatics), Named Entity Recognition. Supplemental material is available for this article.
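To make the extraction target concrete, here is a toy regex-based measurement extractor. The study itself used a transformer question answering system, not rules like these; this sketch only illustrates the kind of output (linear measurements normalized to a common unit) the system produces from free-text reports:

```python
import re

# Matches a number followed by a linear unit, e.g. "12 mm" or "0.9 cm"
MEAS = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b", re.IGNORECASE)

def extract_measurements_cm(report_text):
    """Return all linear measurements found in a report, normalized to cm."""
    out = []
    for value, unit in MEAS.findall(report_text):
        v = float(value)
        out.append(v / 10.0 if unit.lower() == "mm" else v)
    return out

report = "A 12 mm cystic lesion in the pancreatic head, previously 0.9 cm."
print(extract_measurements_cm(report))  # [1.2, 0.9]
```

A QA system replaces the fixed pattern with a model that answers "what is the size of the cyst?" against the report text, which tolerates phrasing a regex would miss.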
© RSNA, 2022. See also commentary by Horii in this issue.

19.
Nat Commun ; 13(1): 1014, 2022 02 23.
Article in English | MEDLINE | ID: mdl-35197467

ABSTRACT

Randomized clinical trials (RCTs) are the gold standard for informing treatment decisions. Observational studies are often plagued by selection bias, and expert-selected covariates may insufficiently adjust for confounding. We explore how unstructured clinical text can be used to reduce selection bias and improve medical practice. We develop a framework based on natural language processing to uncover interpretable potential confounders from text. We validate our method by comparing the estimated hazard ratio (HR) with and without the confounders against established RCTs. We apply our method to four cohorts built from localized prostate and lung cancer datasets from the Stanford Cancer Institute and show that our method shifts the HR estimate towards the RCT results. The uncovered terms can also be interpreted by oncologists for clinical insights. We present this proof-of-concept study to enable more credible causal inference using observational data, uncover meaningful insights from clinical text, and inform high-stakes medical decisions.
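The first step of such a framework is turning each patient's notes into candidate confounder covariates that can enter a hazard model alongside the expert-selected ones. A minimal sketch with a hypothetical term vocabulary (the paper derives its interpretable terms from the text with NLP rather than from a fixed list):

```python
# Hypothetical candidate confounder terms; in the actual framework these
# are mined from the clinical notes, not hand-specified.
VOCAB = ["smoker", "diabetes", "metastasis", "performance status"]

def confounder_features(note_text):
    """Binary indicators for candidate confounder terms in one note.
    These indicators would be added as covariates when estimating the
    hazard ratio, alongside expert-selected covariates."""
    text = note_text.lower()
    return {term: int(term in text) for term in VOCAB}

note = "Former smoker with diabetes; good performance status."
print(confounder_features(note))
```

Comparing the HR estimated with and without these text-derived covariates, as the abstract describes, then quantifies how much confounding the text captures.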


Subjects
Electronic Health Records , Lung Neoplasms , Causality , Humans , Lung Neoplasms/drug therapy , Male , Observational Studies as Topic , Randomized Controlled Trials as Topic , Research Design
20.
J Digit Imaging ; 35(3): 524-533, 2022 06.
Article in English | MEDLINE | ID: mdl-35149938

ABSTRACT

Scoliosis is a condition of abnormal lateral spinal curvature affecting an estimated 2 to 3% of the US population, or seven million people. The Cobb angle is the standard measurement of spinal curvature in scoliosis but is known to have high interobserver and intraobserver variability. Thus, the objective of this study was to build and validate a system for automatic quantitative evaluation of the Cobb angle and to compare AI-generated and human reports in the clinical setting. After IRB approval was obtained, we retrospectively collected 2150 frontal view scoliosis radiographs at a tertiary referral center (January 1, 2019, to January 1, 2021, ≥ 16 years old, no hardware). The dataset was partitioned into 1505 training (70%), 215 validation (10%), and 430 test images (20%). All thoracic and lumbar vertebral bodies were segmented with bounding boxes, generating approximately 36,550 object annotations that were used to train a Faster R-CNN Resnet-101 object detection model. A controller algorithm was written to localize vertebral centroid coordinates and derive the Cobb properties (angle and endplate) of dominant and secondary curves. AI-derived Cobb angle measurements were compared to the clinical report measurements, and the Spearman rank-order correlation was significant (ρ = 0.89, p < 0.001). Mean difference between AI and clinical report angle measurements was 7.34° (95% CI: 5.90-8.78°), which is similar to published literature (up to 10°). We demonstrate the feasibility of an AI system to automate measurement of level-by-level spinal angulation with performance comparable to radiologists.
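The geometric core of the pipeline above (the controller's Cobb derivation, not the detection model) reduces to the angle between two endplate lines. A minimal sketch with hypothetical endplate landmark points, using the standard definition: the angle between the superior endplate of the upper-end vertebra and the inferior endplate of the lower-end vertebra:

```python
import math

def cobb_angle_deg(upper_endplate, lower_endplate):
    """Cobb angle between two endplates, each given as a pair of
    (x, y) landmark points along the endplate line."""
    def slope_angle(p, q):
        return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))
    a = abs(slope_angle(*upper_endplate) - slope_angle(*lower_endplate)) % 180.0
    return min(a, 180.0 - a)  # acute angle between the two lines

# Hypothetical landmarks: +20° and -15° endplate tilts -> 35° curve
upper = ((0.0, 0.0), (1.0, math.tan(math.radians(20.0))))
lower = ((0.0, 0.0), (1.0, math.tan(math.radians(-15.0))))
print(round(cobb_angle_deg(upper, lower), 1))  # 35.0
```

In the study's pipeline, the endplate lines would come from the Faster R-CNN vertebral detections rather than manual landmarks, with the controller selecting the most-tilted end vertebrae of each curve.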


Subjects
Scoliosis , Adolescent , Artificial Intelligence , Humans , Lumbar Vertebrae/diagnostic imaging , Machine Learning , Reproducibility of Results , Retrospective Studies , Scoliosis/diagnostic imaging