Results 1 - 20 of 54
1.
Radiology ; 310(3): e232780, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38501952

ABSTRACT

Background Mirai, a state-of-the-art deep learning-based algorithm for predicting short-term breast cancer risk, outperforms standard clinical risk models. However, Mirai is a black box, risking overreliance on the algorithm and incorrect diagnoses. Purpose To identify whether bilateral dissimilarity underpins Mirai's reasoning process; create a simplified, intelligible model, AsymMirai, using bilateral dissimilarity; and determine if AsymMirai may approximate Mirai's performance in 1-5-year breast cancer risk prediction. Materials and Methods This retrospective study involved mammograms obtained from patients in the EMory BrEast imaging Dataset, known as EMBED, from January 2013 to December 2020. To approximate 1-5-year breast cancer risk predictions from Mirai, another deep learning-based model, AsymMirai, was built with an interpretable module: local bilateral dissimilarity (localized differences between left and right breast tissue). Pearson correlation coefficients were computed between the risk scores of Mirai and those of AsymMirai. Subgroup analysis was performed in patients for whom AsymMirai's year-over-year reasoning was consistent. AsymMirai and Mirai risk scores were compared using the area under the receiver operating characteristic curve (AUC), and 95% CIs were calculated using the DeLong method. Results Screening mammograms (n = 210 067) from 81 824 patients (mean age, 59.4 years ± 11.4 [SD]) were included in the study. Deep learning-extracted bilateral dissimilarity produced similar risk scores to those of Mirai (1-year risk prediction, r = 0.6832; 4-5-year prediction, r = 0.6988) and achieved similar performance as Mirai. For AsymMirai, the 1-year breast cancer risk AUC was 0.79 (95% CI: 0.73, 0.85) (Mirai, 0.84; 95% CI: 0.79, 0.89; P = .002), and the 5-year risk AUC was 0.66 (95% CI: 0.63, 0.69) (Mirai, 0.71; 95% CI: 0.68, 0.74; P < .001). 
In a subgroup of 183 patients for whom AsymMirai repeatedly highlighted the same tissue over time, AsymMirai achieved a 3-year AUC of 0.92 (95% CI: 0.86, 0.97). Conclusion Localized bilateral dissimilarity, an imaging marker for breast cancer risk, approximated the predictive power of Mirai and was a key to Mirai's reasoning. © RSNA, 2024 Supplemental material is available for this article See also the editorial by Freitas in this issue.
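The head-to-head comparison above rests on two standard quantities: the Pearson correlation between the two models' risk scores and each model's AUC against outcomes. A minimal sketch with synthetic stand-in scores (the DeLong confidence intervals reported in the abstract would need a dedicated implementation):

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical stand-in data: risk scores from two models on the same exams,
# with the second model constructed to correlate with the first (as the
# study reports for AsymMirai vs. Mirai). Not the study's actual scores.
mirai_scores = rng.random(500)
asym_scores = 0.7 * mirai_scores + 0.3 * rng.random(500)
cancer = (mirai_scores + rng.normal(0, 0.3, 500) > 0.8).astype(int)

r, _ = pearsonr(mirai_scores, asym_scores)       # agreement between models
auc_mirai = roc_auc_score(cancer, mirai_scores)  # discrimination of each model
auc_asym = roc_auc_score(cancer, asym_scores)
```

With correlated scores, the interpretable surrogate's AUC tracks the original model's, which is the pattern the abstract reports.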


Subjects
Breast Neoplasms , Deep Learning , Humans , Middle Aged , Female , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/epidemiology , Retrospective Studies , Mammography , Breast
2.
Curr Atheroscler Rep ; 26(4): 91-102, 2024 04.
Article in English | MEDLINE | ID: mdl-38363525

ABSTRACT

PURPOSE OF REVIEW: Bias in artificial intelligence (AI) models can result in unintended consequences. In cardiovascular imaging, biased AI models used in clinical practice can negatively affect patient outcomes. Biased AI models result from decisions made when training and evaluating a model. This paper is a comprehensive guide for AI development teams to understand assumptions in datasets and chosen metrics for outcomes/ground truth, and how these translate to real-world performance for cardiovascular disease (CVD). RECENT FINDINGS: CVDs are the number one cause of mortality worldwide; however, the prevalence, burden, and outcomes of CVD vary across gender and race. Several biomarkers have also been shown to vary among different populations and ethnic/racial groups. Inequalities in clinical trial inclusion, clinical presentation, diagnosis, and treatment are preserved in the health data ultimately used to train AI algorithms, leading to potential biases in model performance. Although AI models themselves can be biased, AI can also help to mitigate bias (e.g., through bias-auditing tools). In this review, we describe in detail implicit and explicit biases in the care of cardiovascular disease that may be present in existing datasets but are not obvious to model developers. We review disparities in CVD outcomes across genders and race groups, differences in treatment of historically marginalized groups, and disparities in clinical trials for various cardiovascular diseases and outcomes. We then summarize CVD AI literature that demonstrates bias in CVD AI, as well as approaches in which AI is being used to mitigate such bias.


Subjects
Artificial Intelligence , Cardiovascular Diseases , Female , Male , Humans , Cardiovascular Diseases/diagnostic imaging , Algorithms , Bias
3.
J Digit Imaging ; 35(2): 137-152, 2022 04.
Article in English | MEDLINE | ID: mdl-35022924

ABSTRACT

In recent years, generative adversarial networks (GANs) have gained tremendous popularity for various imaging-related tasks such as artificial image generation to support AI training. GANs are especially useful for medical imaging tasks, where training datasets are usually limited in size and heavily imbalanced against the diseased class. We present a systematic review, following the PRISMA guidelines, of recent GAN architectures used for medical image analysis to help readers make an informed decision before employing GANs in developing medical image classification and segmentation models. We extracted 54 papers, published from January 2015 to August 2020, that highlight the capabilities and applications of GANs in medical imaging and met the inclusion criteria for meta-analysis. Our results show four main GAN architectures used for segmentation or classification in medical imaging. We provide a comprehensive overview of recent trends in the application of GANs to clinical diagnosis through medical image segmentation and classification, and ultimately share experiences for task-based GAN implementations.


Subjects
Image Processing, Computer-Assisted , Neural Networks, Computer , Humans , Image Processing, Computer-Assisted/methods
4.
J Biomed Inform ; 123: 103918, 2021 11.
Article in English | MEDLINE | ID: mdl-34560275

ABSTRACT

OBJECTIVE: With increasing patient complexity and data stored in fragmented health information systems, automated and time-efficient ways of gathering important information from patients' medical histories are needed for effective clinical decision making. Using COVID-19 as a case study, we developed a query-bot information retrieval system with user feedback that allows clinicians to ask natural questions to retrieve data from patient notes. MATERIALS AND METHODS: We applied clinicalBERT, a pre-trained contextual language model, to our dataset of patient notes to obtain sentence embeddings, using K-Means clustering to reduce computation time for real-time interaction. The Rocchio algorithm was then employed to incorporate user feedback and improve retrieval performance. RESULTS: In an iterative feedback loop experiment, the MAP for the final iteration was 0.93/0.94 (versus an initial MAP of 0.66/0.52) for generic queries, and 1.0/1.0 (versus 0.79/0.83) for COVID-19-specific queries, confirming that the contextual model handles ambiguity in natural language queries and that feedback improves retrieval performance. The user-in-the-loop experiment also outperformed the automated pseudo-relevance feedback method. Moreover, the null hypothesis of identical precision between initial retrieval and relevance feedback was rejected with high statistical significance (p ≪ 0.05). Compared with Word2Vec, TF-IDF, and bioBERT models, clinicalBERT works optimally considering the balance between response precision and user feedback. DISCUSSION: Our model works well for generic as well as COVID-19-specific queries. However, some generic queries are not answered as well as others because clustering reduces query performance and vague relations between queries and sentences are considered non-relevant. We also tested our model on queries with the same meaning but different expressions and demonstrated that these query variations yielded similar performance after incorporation of user feedback.
CONCLUSION: We developed an NLP-based query-bot that handles synonyms and natural language ambiguity in order to retrieve relevant information from the patient chart. User feedback is critical to improving model performance.
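The Rocchio relevance-feedback step described above has a compact closed form: the query vector is pulled toward user-marked relevant sentences and pushed away from non-relevant ones. A minimal sketch on toy 3-d embeddings (the weights alpha, beta, gamma are common textbook defaults, not the paper's settings):

```python
import numpy as np

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query embedding toward relevant sentences and away from
    non-relevant ones (classic Rocchio relevance feedback)."""
    q = alpha * np.asarray(query, dtype=float)
    if len(relevant):
        q = q + beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q = q - gamma * np.mean(nonrelevant, axis=0)
    return q

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-d "embeddings": one relevant and one non-relevant sentence.
query = np.array([1.0, 0.0, 0.0])
rel = np.array([[0.0, 1.0, 0.0]])
nonrel = np.array([[0.0, 0.0, 1.0]])

updated = rocchio(query, rel, nonrel)
# After feedback, the query is more similar to the relevant sentence
# and less similar to the non-relevant one.
```

In the paper's setting, the vectors would be clinicalBERT sentence embeddings rather than these toy coordinates.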


Subjects
COVID-19 , Algorithms , Feedback , Humans , Information Storage and Retrieval , SARS-CoV-2
5.
Int J Hyperthermia ; 38(1): 130-135, 2021.
Article in English | MEDLINE | ID: mdl-33541151

ABSTRACT

OBJECTIVE: To develop a thermochromic tissue-mimicking phantom (TTMP) with an embedded 3D-printed bone mimic of the lumbar spine to evaluate MRgFUS ablation of the facet joint and medial branch nerve. MATERIALS AND METHODS: Multiple 3D-printed materials were selected and characterized by measurements of the speed of sound and linear acoustic attenuation coefficient using a through-transmission technique. A 3D model of the lumbar spine was segmented from a de-identified CT scan and 3D printed. The 3D-printed spine was embedded within a TTMP with a thermochromic ink color-change setpoint of 60 °C. Multiple high-energy sonications were targeted to the facet joints and the anatomical location of the medial branch nerve using an ExAblate MRgFUS system connected to a 3T MR scanner. The phantom was dissected to assess sonication targets and the surrounding structures for color change as compared to the expected region of ablation on MR-thermometry. RESULTS: The measured sound attenuation coefficient and speed of sound of gypsum were 240 Np/m-MHz and 2471 m/s, respectively, the closest among the tested materials to published values for cortical bone. Following sonication, dissection of the TTMP revealed good concordance between the regions of color change within the phantom and the expected areas of ablation on MR-thermometry. No heat deposition was observed in critical areas, including the spinal canal and nerve roots, on either color change or MRI. CONCLUSION: Ablated regions in the TTMP correlated well with expected ablations based on MR-thermometry. These findings demonstrate the utility of an anatomic spine phantom in evaluating MRgFUS sonication for facet joint and medial branch nerve ablations.
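The through-transmission characterization above reduces to two short formulas. Assuming a standard water-substitution setup, the gypsum values reported in the abstract can be reproduced from an amplitude ratio and a time-of-flight shift; the numbers below are back-calculated illustrations, not the authors' raw measurements:

```python
import math

def attenuation_coeff(a_ref, a_sample, thickness_m, freq_mhz):
    """Linear attenuation coefficient (Np/m-MHz) from the through-transmission
    amplitude ratio: reference (water-only) path vs. the sample path."""
    return math.log(a_ref / a_sample) / (thickness_m * freq_mhz)

def sound_speed(thickness_m, dt_s, c_water=1480.0):
    """Sample sound speed (m/s) from the time-of-flight advance dt relative
    to the water-only path (substitution method): 1/c = 1/c_w - dt/d."""
    return 1.0 / (1.0 / c_water - dt_s / thickness_m)

# A 1 cm sample attenuating a 1 MHz pulse by a factor e^-2.4 gives the
# gypsum-like coefficient from the abstract.
alpha = attenuation_coeff(1.0, math.exp(-2.4), 0.01, 1.0)  # 240 Np/m-MHz
# The time shift that would yield the reported gypsum sound speed:
c_gypsum = sound_speed(0.01, 0.01 * (1 / 1480 - 1 / 2471))  # ~2471 m/s
```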


Subjects
High-Intensity Focused Ultrasound Ablation , Thermometry , Zygapophyseal Joint , Magnetic Resonance Imaging , Phantoms, Imaging , Ultrasonography
6.
J Digit Imaging ; 34(4): 1005-1013, 2021 08.
Article in English | MEDLINE | ID: mdl-34405297

ABSTRACT

Real-time execution of machine learning (ML) pipelines on radiology images is difficult due to limited computing resources in clinical environments, whereas running them in research clusters requires efficient data transfer capabilities. We developed Niffler, an open-source Digital Imaging and Communications in Medicine (DICOM) framework that enables ML and processing pipelines in research clusters by efficiently retrieving images from the hospitals' PACS and extracting the metadata from the images. We deployed Niffler at our institution (Emory Healthcare, the largest healthcare network in the state of Georgia) and retrieved data from 715 scanners spanning 12 sites, up to 350 GB/day continuously in real time as a DICOM data stream over the past 2 years. We also used Niffler to retrieve images in bulk on demand based on user-provided filters to facilitate several research projects. This paper presents the architecture and three such use cases of Niffler. First, we executed an IVC filter detection and segmentation pipeline on abdominal radiographs in real time, which classified 989 test images with an accuracy of 96.0%. Second, we applied the Niffler Metadata Extractor to understand the operational efficiency of individual MRI systems based on calculated metrics. We benchmarked the accuracy of the calculated exam time windows by comparing Niffler against the Clinical Data Warehouse (CDW). Niffler accurately identified the scanners' examination timeframes and idling times, whereas the CDW falsely depicted several exam overlaps due to human errors. Third, with metadata extracted from the images by Niffler, we identified scanners with misconfigured time settings and reconfigured five scanners. Our evaluations highlight how Niffler enables real-time ML and processing pipelines in a research cluster.
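The MRI-utilization use case above boils down to computing gaps between consecutive exam windows on each scanner. A minimal sketch with hypothetical timestamps (the exam windows and their derivation from DICOM fields are illustrative, not Niffler's actual API):

```python
from datetime import datetime

# Hypothetical exam windows (start, end) derived from DICOM timestamps,
# e.g., the acquisition times of the first and last series of each exam.
exams = [
    ("2021-03-01 08:00", "2021-03-01 08:45"),
    ("2021-03-01 09:15", "2021-03-01 10:00"),
    ("2021-03-01 10:05", "2021-03-01 11:00"),
]

def idle_minutes(exams):
    """Gaps between consecutive exams on one scanner, in minutes."""
    fmt = "%Y-%m-%d %H:%M"
    spans = sorted((datetime.strptime(s, fmt), datetime.strptime(e, fmt))
                   for s, e in exams)
    return [(nxt[0] - cur[1]).total_seconds() / 60
            for cur, nxt in zip(spans, spans[1:])
            if nxt[0] > cur[1]]

gaps = idle_minutes(exams)  # [30.0, 5.0]
```

Overlapping windows (as the CDW falsely reported) would simply produce no positive gap, which is how such inconsistencies surface in a check like this.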


Subjects
Radiology Information Systems , Radiology , Data Warehousing , Humans , Machine Learning , Radiography
7.
Radiology ; 290(2): 456-464, 2019 02.
Article in English | MEDLINE | ID: mdl-30398430

ABSTRACT

Purpose To develop and validate a deep learning algorithm that predicts the final diagnosis of Alzheimer disease (AD), mild cognitive impairment, or neither at fluorine 18 (18F) fluorodeoxyglucose (FDG) PET of the brain and compare its performance to that of radiologic readers. Materials and Methods Prospective 18F-FDG PET brain images from the Alzheimer's Disease Neuroimaging Initiative (ADNI) (2109 imaging studies from 2005 to 2017, 1002 patients) and retrospective independent test set (40 imaging studies from 2006 to 2016, 40 patients) were collected. Final clinical diagnosis at follow-up was recorded. Convolutional neural network of InceptionV3 architecture was trained on 90% of ADNI data set and tested on the remaining 10%, as well as the independent test set, with performance compared to radiologic readers. Model was analyzed with sensitivity, specificity, receiver operating characteristic (ROC), saliency map, and t-distributed stochastic neighbor embedding. Results The algorithm achieved area under the ROC curve of 0.98 (95% confidence interval: 0.94, 1.00) when evaluated on predicting the final clinical diagnosis of AD in the independent test set (82% specificity at 100% sensitivity), an average of 75.8 months prior to the final diagnosis, which in ROC space outperformed reader performance (57% [four of seven] sensitivity, 91% [30 of 33] specificity; P < .05). Saliency map demonstrated attention to known areas of interest but with focus on the entire brain. Conclusion By using fluorine 18 fluorodeoxyglucose PET of the brain, a deep learning algorithm developed for early prediction of Alzheimer disease achieved 82% specificity at 100% sensitivity, an average of 75.8 months prior to the final diagnosis. © RSNA, 2018 Online supplemental material is available for this article. See also the editorial by Larvie in this issue.


Subjects
Alzheimer Disease/diagnostic imaging , Deep Learning , Image Interpretation, Computer-Assisted/methods , Positron-Emission Tomography/methods , Aged , Aged, 80 and over , Algorithms , Cognitive Dysfunction/diagnostic imaging , Female , Fluorodeoxyglucose F18/therapeutic use , Humans , Male , Middle Aged , Retrospective Studies , Sensitivity and Specificity
8.
Crit Care Med ; 52(2): 345-348, 2024 02 01.
Article in English | MEDLINE | ID: mdl-38240516
9.
J Digit Imaging ; 32(2): 228-233, 2019 04.
Article in English | MEDLINE | ID: mdl-30465142

ABSTRACT

Applying state-of-the-art machine learning techniques to medical images requires thorough selection and normalization of input data. One such step in digital mammography screening for breast cancer is the labeling and removal of special diagnostic views, in which diagnostic tools or magnification are applied to assist in the assessment of suspicious initial findings. Because a common task in medical informatics is prediction of disease and its stage, these special diagnostic views, which are enriched among diseased cases, will bias machine learning disease predictions. To automate this process, we developed a machine learning pipeline that uses both DICOM headers and images to predict such views, allowing for their removal and the generation of unbiased datasets. We achieve an AUC of 99.72% in predicting special mammogram views when combining both types of models. Finally, we apply these models to clean a dataset of about 772,000 images with an expected sensitivity of 99.0%. The pipeline presented in this paper can be applied to other datasets to obtain high-quality image sets suitable for training disease-detection algorithms.
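A header-only version of the special-view filter described above can be sketched as a simple lookup on view-modifier fields. The field names and modifier values below are illustrative stand-ins, not the paper's actual feature set, and a real pipeline would read them from DICOM files with a DICOM library:

```python
# Hypothetical DICOM header fields for three mammogram images.
headers = [
    {"ViewPosition": "CC",  "ViewModifier": ""},
    {"ViewPosition": "MLO", "ViewModifier": "MAGNIFICATION"},
    {"ViewPosition": "CC",  "ViewModifier": "SPOT COMPRESSION"},
]

# Modifiers that mark diagnostic (non-screening) views in this sketch.
SPECIAL_MODIFIERS = {"MAGNIFICATION", "SPOT COMPRESSION"}

def is_special_view(header):
    """Flag special diagnostic views so they can be excluded from training."""
    return header.get("ViewModifier", "").upper() in SPECIAL_MODIFIERS

flags = [is_special_view(h) for h in headers]  # [False, True, True]
```

The paper combines such header signals with an image-based model, which catches cases where the headers are missing or wrong.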


Subjects
Breast Neoplasms/diagnostic imaging , Machine Learning , Mammography/classification , Mammography/methods , Automation , Datasets as Topic , Female , Humans , Radiology Information Systems , Sensitivity and Specificity
10.
J Digit Imaging ; 32(1): 30-37, 2019 02.
Article in English | MEDLINE | ID: mdl-30128778

ABSTRACT

Breast cancer is a leading cause of cancer death among women in the USA. Screening mammography is effective in reducing mortality, but has a high rate of unnecessary recalls and biopsies. While deep learning can be applied to mammography, large-scale labeled datasets, which are difficult to obtain, are required. We aim to remove many barriers of dataset development by automatically harvesting data from existing clinical records using a hybrid framework combining traditional NLP and IBM Watson. An expert reviewer manually annotated 3521 breast pathology reports with one of four outcomes: left positive, right positive, bilateral positive, negative. Traditional NLP techniques using seven different machine learning classifiers were compared to IBM Watson's automated natural language classifier. Techniques were evaluated using precision, recall, and F-measure. Logistic regression outperformed all other traditional machine learning classifiers and was used for subsequent comparisons. Both traditional NLP and Watson's NLC performed well for cases under 1024 characters with weighted average F-measures above 0.96 across all classes. Performance of traditional NLP was lower for cases over 1024 characters with an F-measure of 0.83. We demonstrate a hybrid framework using traditional NLP techniques combined with IBM Watson to annotate over 10,000 breast pathology reports for development of a large-scale database to be used for deep learning in mammography. Our work shows that traditional NLP and IBM Watson perform extremely well for cases under 1024 characters and can accelerate the rate of data annotation.
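The traditional-NLP arm described above (term-frequency features plus logistic regression, the best-performing classical classifier) can be sketched in a few lines of scikit-learn. The toy reports below are invented stand-ins for the annotated pathology reports and cover three of the four outcome classes for brevity:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy reports; real training data were expert-annotated reports.
reports = [
    "invasive ductal carcinoma identified in left breast core biopsy",
    "left breast biopsy positive for malignancy",
    "right breast mass consistent with carcinoma",
    "carcinoma present in right breast specimen",
    "benign fibroadenoma no evidence of malignancy",
    "no carcinoma identified benign breast tissue",
]
labels = ["left_positive", "left_positive", "right_positive",
          "right_positive", "negative", "negative"]

# TF-IDF term-document features feeding a logistic regression classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(reports, labels)
pred = clf.predict(["left breast core biopsy shows carcinoma"])[0]
```

Precision, recall, and F-measure as used in the paper can then be computed with scikit-learn's standard metric functions on a held-out set.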


Subjects
Breast Neoplasms/diagnostic imaging , Deep Learning/statistics & numerical data , Electronic Health Records/statistics & numerical data , Image Interpretation, Computer-Assisted/methods , Mammography/methods , Breast/diagnostic imaging , Databases, Factual , Female , Humans , Middle Aged
11.
J Digit Imaging ; 31(2): 245-251, 2018 04.
Article in English | MEDLINE | ID: mdl-28924815

ABSTRACT

Magnetic resonance imaging (MRI) protocoling can be time- and resource-intensive, and protocols can often be suboptimal depending on the expertise or preferences of the protocoling radiologist. Providing a best-practice recommendation for an MRI protocol has the potential to improve efficiency and decrease the likelihood of a suboptimal or erroneous study. The goal of this study was to develop and validate a machine learning-based natural language classifier that can automatically assign the use of intravenous contrast for musculoskeletal MRI protocols based on the free-text clinical indication of the study, thereby improving the efficiency of the protocoling radiologist and potentially decreasing errors. We utilized a deep learning-based natural language classification system from IBM Watson, a question-answering supercomputer that gained fame after challenging the best human players on Jeopardy! in 2011. We compared this solution to a series of traditional machine learning-based natural language processing techniques that utilize a term-document frequency matrix. Each classifier was trained with 1240 MRI protocols plus their respective clinical indications and validated with a test set of 280. Ground truth for contrast assignment was obtained from the clinical record. For evaluation of inter-reader agreement, a blinded second radiologist analyzed all cases and determined contrast assignment based only on the free-text clinical indication. In the test set, Watson demonstrated an overall accuracy of 83.2% when compared with the original protocol. This was similar to the overall accuracy of 80.2% achieved by an ensemble of eight traditional machine learning algorithms based on a term-document matrix. When compared with the second reader's contrast assignment, Watson achieved 88.6% agreement. When evaluating only the subset of cases where the original protocol and second reader were concordant (n = 251), agreement climbed further to 90.0%.
The classifier was relatively robust to spelling and grammatical errors, which were frequent. Implementation of this automated MR contrast determination system as a clinical decision support tool may save considerable time and effort for the radiologist and potentially decrease error rates, while requiring no change in order entry or workflow.
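The inter-reader comparisons above are raw percent agreement; a chance-corrected statistic such as Cohen's kappa (not reported in the abstract) is often computed alongside it. A minimal sketch with hypothetical contrast assignments:

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score

# Hypothetical contrast assignments for eight protocols: classifier vs. a
# blinded second reader (invented data, not the study's cases).
model  = ["contrast", "no", "contrast", "no",
          "contrast", "no", "no", "contrast"]
reader = ["contrast", "no", "contrast", "contrast",
          "contrast", "no", "no", "contrast"]

agreement = accuracy_score(reader, model)  # raw percent agreement (0.875 here)
kappa = cohen_kappa_score(reader, model)   # chance-corrected agreement (0.75 here)
```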


Subjects
Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Musculoskeletal Diseases/diagnostic imaging , Natural Language Processing , Algorithms , Contrast Media/administration & dosage , Humans , Injections, Intravenous , Musculoskeletal System/diagnostic imaging , Reproducibility of Results , Retrospective Studies
12.
JMIR Med Educ ; 10: e46500, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38376896

ABSTRACT

BACKGROUND: Artificial intelligence (AI) and machine learning (ML) are poised to have a substantial impact in the health care space. While a plethora of web-based resources exist to teach programming skills and ML model development, there are few introductory curricula specifically tailored to medical students without a background in data science or programming. Programs that do exist are often restricted to a specific specialty. OBJECTIVE: We hypothesized that a 1-month elective for fourth-year medical students, composed of high-quality existing web-based resources and a project-based structure, would empower students to learn about the impact of AI and ML in their chosen specialty and begin contributing to innovation in their field of interest. This study aims to evaluate the success of this elective in improving self-reported confidence scores in AI and ML. We also share our curriculum with other educators who may be interested in adopting it. METHODS: This elective was offered in 2 tracks: technical (for students who were already competent programmers) and nontechnical (with no technical prerequisites, focusing on building a conceptual understanding of AI and ML). Students established a conceptual foundation of knowledge using curated web-based resources and relevant research papers, and were then tasked with completing 3 projects in their chosen specialty: a data set analysis, a literature review, and an AI project proposal. The project-based nature of the elective was designed to be self-guided and flexible to each student's interest area and career goals. Students' success was measured by self-reported confidence in AI and ML skills in pre- and post-elective surveys. Qualitative feedback on students' experiences was also collected. RESULTS: This web-based, self-directed elective was offered on a pass-or-fail basis each month to fourth-year students at Emory University School of Medicine beginning in May 2021.
As of June 2022, a total of 19 students had successfully completed the elective, representing a wide range of chosen specialties: diagnostic radiology (n=3), general surgery (n=1), internal medicine (n=5), neurology (n=2), obstetrics and gynecology (n=1), ophthalmology (n=1), orthopedic surgery (n=1), otolaryngology (n=2), pathology (n=2), and pediatrics (n=1). Students' self-reported confidence scores for AI and ML rose by 66% after this 1-month elective. In qualitative surveys, students overwhelmingly reported enthusiasm and satisfaction with the course and commented that its self-direction, flexibility, and project-based design were essential. CONCLUSIONS: Course participants dove deep into applications of AI in their wide-ranging specialties, produced substantial project deliverables, and generally reported satisfaction with their elective experience. The authors are hopeful that a brief, 1-month investment in AI and ML education during medical school will empower this next generation of physicians to pave the way for AI and ML innovation in health care.


Subjects
Artificial Intelligence , Education, Medical , Humans , Curriculum , Internet , Students, Medical
13.
Commun Med (Lond) ; 4(1): 21, 2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38374436

ABSTRACT

BACKGROUND: Breast density is an important risk factor for breast cancer, complemented by a higher risk of cancers being missed during screening of dense breasts due to reduced sensitivity of mammography. Automated, deep learning-based prediction of breast density could provide subject-specific risk assessment and flag difficult cases during screening. However, there is a lack of evidence for generalisability across imaging techniques and, importantly, across race. METHODS: This study used a large, racially diverse dataset with 69,697 mammographic studies comprising 451,642 individual images from 23,057 female participants. A deep learning model was developed for four-class BI-RADS density prediction. A comprehensive performance evaluation assessed the generalisability across two imaging techniques, full-field digital mammography (FFDM) and two-dimensional synthetic (2DS) mammography. A detailed subgroup performance and bias analysis assessed the generalisability across participants' race. RESULTS: Here we show that a model trained only on FFDM achieves a 4-class BI-RADS classification accuracy of 80.5% (79.7-81.4) on FFDM and 79.4% (78.5-80.2) on unseen 2DS data. When trained on both FFDM and 2DS images, the performance increases to 82.3% (81.4-83.0) and 82.3% (81.3-83.1), respectively. Racial subgroup analysis shows unbiased performance across Black, White, and Asian participants, despite a separate analysis confirming that race can be predicted from the images with a high accuracy of 86.7% (86.0-87.4). CONCLUSIONS: Deep learning-based breast density prediction generalises across imaging techniques and race. No substantial disparities are found for any subgroup, including races that were never seen during model development, suggesting that density predictions are unbiased.


Women with dense breasts have a higher risk of breast cancer. For dense breasts, it is also more difficult to spot cancer in mammograms, which are the X-ray images commonly used for breast cancer screening. Thus, knowing about an individual's breast density provides important information to doctors and screening participants. This study investigated whether an artificial intelligence algorithm (AI) can be used to accurately determine the breast density by analysing mammograms. The study tested whether such an algorithm performs equally well across different imaging devices, and importantly, across individuals from different self-reported race groups. A large, racially diverse dataset was used to evaluate the algorithm's performance. The results show that there were no substantial differences in the accuracy for any of the groups, providing important assurances that AI can be used safely and ethically for automated prediction of breast density.
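The racial-subgroup analysis described above amounts to a per-group accuracy table. A minimal sketch with hypothetical predictions, where the labels a-d stand in for the four BI-RADS density classes:

```python
import pandas as pd

# Hypothetical per-study predictions with self-reported race (invented data,
# not the study's cohort), standing in for four-class BI-RADS outputs.
df = pd.DataFrame({
    "race": ["Black", "Black", "White", "White", "Asian", "Asian"],
    "true_density": ["a", "b", "c", "d", "b", "c"],
    "pred_density": ["a", "b", "c", "c", "b", "c"],
})

# Accuracy per race group: mark each row correct/incorrect, then average.
per_group = (df.assign(correct=df.true_density == df.pred_density)
               .groupby("race")["correct"].mean())
```

A bias audit of the kind the paper performs would compare these per-group accuracies (with confidence intervals) against the overall accuracy and flag substantial gaps.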

14.
Curr Probl Diagn Radiol ; 53(3): 346-352, 2024.
Article in English | MEDLINE | ID: mdl-38302303

ABSTRACT

Breast cancer is the most common type of cancer in women, and early abnormality detection using mammography can significantly improve breast cancer survival rates. Diverse datasets are required to improve the training and validation of deep learning (DL) systems for autonomous breast cancer diagnosis. However, only a small number of mammography datasets are publicly available. This constraint has created challenges when comparing different DL models using the same dataset. The primary contribution of this study is the comprehensive description of a selection of currently available public mammography datasets. The information available on publicly accessible datasets is summarized and their usability reviewed to enable more effective models to be developed for breast cancer detection and to improve understanding of existing models trained using these datasets. This study aims to bridge the existing knowledge gap by offering researchers and practitioners a valuable resource to develop and assess DL models in breast cancer diagnosis.


Subjects
Breast Neoplasms , Deep Learning , Female , Humans , Mammography , Breast Neoplasms/diagnostic imaging , Early Detection of Cancer
15.
EBioMedicine ; 104: 105174, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38821021

ABSTRACT

BACKGROUND: Chest X-rays (CXR) are essential for diagnosing a variety of conditions, but model generalizability issues limit efficacy when models are applied to new populations. Generative AI, particularly denoising diffusion probabilistic models (DDPMs), offers a promising approach to generating synthetic images and enhancing dataset diversity. This study investigates the impact of synthetic data supplementation on the performance and generalizability of medical imaging research. METHODS: The study employed DDPMs to create synthetic CXRs conditioned on demographic and pathological characteristics from the CheXpert dataset. These synthetic images were used to supplement training datasets for pathology classifiers, with the aim of improving their performance. The evaluation involved three datasets (CheXpert, MIMIC-CXR, and Emory Chest X-ray) and various experiments, including supplementing real data with synthetic data, training with purely synthetic data, and mixing synthetic data with external datasets. Performance was assessed using the area under the receiver operating characteristic curve (AUROC). FINDINGS: Adding synthetic data to real datasets resulted in a notable increase in AUROC values (up to 0.02 in internal and external test sets with 1000% supplementation, p-value <0.01 in all instances). When classifiers were trained exclusively on synthetic data, they achieved performance levels comparable to those trained on real data with 200%-300% data supplementation. The combination of real and synthetic data from different sources demonstrated enhanced model generalizability, increasing model AUROC from 0.76 to 0.80 on the internal test set (p-value <0.01). INTERPRETATION: Synthetic data supplementation significantly improves the performance and generalizability of pathology classifiers in medical imaging. FUNDING: Dr.
Gichoya is a 2022 Robert Wood Johnson Foundation Harold Amos Medical Faculty Development Program awardee and declares support from RSNA Health Disparities grant (#EIHD2204), Lacuna Fund (#67), Gordon and Betty Moore Foundation, NIH (NIBIB) MIDRC grant under contracts 75N92020C00008 and 75N92020C00021, and NHLBI Award Number R01HL167811.


Subjects
Diagnostic Imaging , ROC Curve , Humans , Diagnostic Imaging/methods , Algorithms , Radiography, Thoracic/methods , Image Processing, Computer-Assisted/methods , Databases, Factual , Area Under Curve , Models, Statistical
16.
medRxiv ; 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38260571

ABSTRACT

Background: To create an opportunistic screening strategy using multitask deep learning methods to stratify prediction of coronary artery calcium (CAC) and associated cardiovascular risk from frontal chest x-rays (CXR) and minimal data from electronic health records (EHR). Methods: In this retrospective study, 2,121 patients with available computed tomography (CT) scans and corresponding CXR images were collected internally (Mayo Enterprise), with calculated CAC scores binned into 3 categories (0, 1-99, and 100+) as ground truth for model training. Results from the internal training were tested on multiple external datasets (domestic (EUH) and foreign (VGHTPE)) with significant racial and ethnic differences, and classification performance was compared. Findings: Classification among the 0, 1-99, and 100+ CAC categories was moderate on both the internal test set and the external datasets, reaching an average F1-score of 0.66 for Mayo, 0.62 for EUH, and 0.61 for VGHTPE. For the clinically relevant binary task of 0 vs 400+ CAC classification, the model reached an average AUROC of 0.84 across the internal test and external datasets. Interpretation: The fusion model trained on CXR performed better (0.84 average AUROC on internal and external datasets) than existing state-of-the-art models for predicting CAC scores, which reached 0.73 AUROC on internal data only, with robust performance on external datasets. Thus, our proposed model may be used as a robust, first-pass opportunistic screening method for cardiovascular risk from regular chest radiographs. For community use, the trained model and the inference code can be downloaded with an academic open-source license from https://github.com/jeong-jasonji/MTL_CAC_classification . Funding: The study was partially supported by National Institutes of Health award 1R01HL155410-01A1.
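The three-category binning used as ground truth above can be sketched directly from Agatston scores, with thresholds following the abstract's bins (the example scores are illustrative):

```python
def cac_bin(agatston_score):
    """Bin an Agatston CAC score into the abstract's three training categories."""
    if agatston_score == 0:
        return "0"
    return "1-99" if agatston_score < 100 else "100+"

bins = [cac_bin(s) for s in [0, 15, 99, 100, 450]]
# ['0', '1-99', '1-99', '100+', '100+']
```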

17.
BJU Int ; 112(4): 508-16, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23746198

ABSTRACT

OBJECTIVE: To characterise the feasibility and safety of a novel transurethral ultrasound (US)-therapy device combined with real-time multi-plane magnetic resonance imaging (MRI)-based temperature monitoring and temperature feedback control, to enable spatiotemporally precise regional ablation of simulated prostate gland lesions in a preclinical canine model. To correlate ablation volumes measured with intra-procedural cumulative thermal damage estimates, post-procedural MRI, and histopathology. MATERIALS AND METHODS: Three dogs were treated with three targeted ablations each, using a prototype MRI-guided transurethral US-therapy system (Philips Healthcare, Vantaa, Finland). MRI provided images for treatment planning, guidance, real-time multi-planar thermometry, as well as post-treatment evaluation of efficacy. After treatment, specimens underwent histopathological analysis to determine the extent of necrosis and cell viability. Statistical analyses (Pearson's correlation, Student's t-test) were used to evaluate the correlation between ablation volumes measured with intra-procedural cumulative thermal damage estimates, post-procedural MRI, and histopathology. RESULTS: MRI combined with a transurethral US-therapy device enabled multi-planar temperature monitoring at the target as well as in surrounding tissues, allowing for safe, targeted, and controlled ablations of prescribed lesions. Ablated volumes measured by cumulative thermal dose positively correlated with volumes determined by histopathological analysis (r² = 0.83, P < 0.001). Post-procedural contrast-enhanced and diffusion-weighted MRI showed a positive correlation with non-viable areas on histopathological analysis (r² = 0.89, P < 0.001, and r² = 0.91, P = 0.003, respectively). Additionally, there was a positive correlation between ablated volumes according to cumulative thermal dose and volumes identified on post-procedural contrast-enhanced MRI (r² = 0.77, P < 0.01). There was no difference in mean ablation volumes assessed with the various analysis methods (P > 0.05, Student's t-test). CONCLUSIONS: MRI-guided transurethral US therapy enabled safe and targeted ablations of prescribed lesions in a preclinical canine prostate model. Ablation volumes were reliably predicted by intra- and post-procedural imaging. Clinical studies are needed to confirm the feasibility, safety, oncological control, and functional outcomes of this therapy in patients in whom focal therapy is indicated.
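The volume correlations reported in this abstract are squared Pearson coefficients between paired volume measurements. A minimal self-contained sketch of the statistic (illustrative only, not the authors' analysis code; inputs are hypothetical paired volume lists):

```python
def pearson_r2(xs, ys):
    """Squared Pearson correlation between two paired measurement
    series, e.g. thermal-dose volumes vs histopathology volumes."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # covariance numerator and the two variance terms
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov * cov / (vx * vy)
```

A value near 1 indicates the two measurement methods track each other closely, as with the thermal-dose vs histopathology volumes (r² = 0.83) above.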


Subjects
Magnetic Resonance Imaging, Prostatic Neoplasms/pathology, Prostatic Neoplasms/therapy, Ultrasonic Therapy/methods, Animals, Dogs, Male, Models, Anatomic, Urethra
18.
IEEE J Biomed Health Inform ; 27(8): 3936-3947, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37167055

ABSTRACT

Automated curation of noisy external data in the medical domain has long been in high demand, as AI technologies need to be validated against multiple sources of clean, annotated data. Identifying the variance between internal and external sources is a fundamental step in curating a high-quality dataset, as the data distributions from different sources can vary significantly and subsequently affect the performance of AI models. The primary challenges for detecting data shifts are (1) accessing private data across healthcare institutions for manual detection and (2) the lack of automated approaches that learn efficient shift-data representations without training samples. To overcome these problems, we propose an automated pipeline called MedShift that detects the top shift samples and evaluates the significance of shift data without sharing data between internal and external organizations. MedShift employs unsupervised anomaly detectors to learn the internal distribution, identifies external samples showing significant shift, and then compares detector performance. To quantify the effects of detected shift data, we train a multi-class classifier that learns internal domain knowledge and evaluate its classification performance for each class in external domains after dropping the shift data. We also propose a data quality metric to quantify the dissimilarity between internal and external datasets. We verify the efficacy of MedShift using musculoskeletal radiographs (MURA) and chest X-ray datasets from multiple external sources. Our experiments show that our proposed shift-data detection pipeline can help medical centers curate high-quality datasets more efficiently.
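The core MedShift idea, scoring external samples by how far they fall from the internal distribution and flagging the top-scoring ones, can be sketched in one dimension. This is a deliberately simplified stand-in: the paper uses learned unsupervised anomaly detectors over image features, whereas this sketch uses a plain z-score distance, and all names and data are hypothetical:

```python
from statistics import mean, pstdev

def shift_scores(internal, external):
    """Score each external sample by its standardized distance from
    the internal distribution; higher score = more likely a shift sample."""
    mu, sigma = mean(internal), pstdev(internal) or 1.0
    return [abs(x - mu) / sigma for x in external]

def top_shift_samples(internal, external, k=2):
    """Return the indices of the k most-shifted external samples."""
    scores = shift_scores(internal, external)
    return sorted(range(len(external)), key=lambda i: -scores[i])[:k]
```

Dropping the flagged samples and re-evaluating a classifier trained on internal data then quantifies how much the detected shift actually hurts downstream performance, mirroring the evaluation step described above.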

19.
Front Big Data ; 6: 1173038, 2023.
Article in English | MEDLINE | ID: mdl-37139170

ABSTRACT

Data integration is a well-motivated problem in the clinical data science domain. Availability of patient data, reference clinical cases, and research datasets has the potential to advance the healthcare industry. However, the unstructured (text, audio, or video) and heterogeneous nature of the data, the variety of data standards and formats, and patient privacy constraints make data interoperability and integration a challenge. Clinical text is further categorized into different semantic groups and may be stored in different files and formats. Even the same organization may store cases in different data structures, making data integration more challenging. With such inherent complexity, domain experts and domain knowledge are often necessary to perform data integration; however, expert human labor is time- and cost-prohibitive. To overcome the variability in the structure, format, and content of the different data sources, we map the text into common categories and compute similarity within those categories. In this paper, we present a method to categorize and merge clinical data by considering the underlying semantics of the cases and using reference information about the cases to perform data integration. Evaluation shows that we were able to merge 88% of clinical data from five different sources.
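The "map to common categories, then compute similarity within them" step can be sketched with a token-set similarity and a threshold-based pairing rule. This is a hedged illustration of the general approach, not the paper's method: the authors' semantic categorization is richer, and the function names, threshold, and sample cases here are all hypothetical:

```python
def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two clinical text snippets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def merge_cases(cases_a, cases_b, threshold=0.5):
    """Pair cases from two sources whose text in the same semantic
    category is similar enough to be treated as the same case."""
    merged = []
    for ka, va in cases_a.items():
        for kb, vb in cases_b.items():
            if jaccard(va, vb) >= threshold:
                merged.append((ka, kb))
    return merged
```

In practice the comparison would run per semantic category (e.g. history vs findings) rather than over whole records, which is what makes the category mapping step necessary.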

20.
J Med Imaging (Bellingham) ; 10(3): 034004, 2023 May.
Article in English | MEDLINE | ID: mdl-37388280

ABSTRACT

Purpose: Our study investigates whether graph-based fusion of imaging data with non-imaging electronic health record (EHR) data can improve the prediction of disease trajectories for patients with coronavirus disease 2019 (COVID-19) beyond the performance achievable with imaging or non-imaging EHR data alone. Approach: We present a fusion framework for fine-grained clinical outcome prediction [discharge, intensive care unit (ICU) admission, or death] that fuses imaging and non-imaging information using a similarity-based graph structure. Node features are represented by image embeddings, and edges are encoded with clinical or demographic similarity. Results: Experiments on data collected from the Emory Healthcare Network indicate that our fusion modeling scheme performs consistently better than predictive models developed using only imaging or only non-imaging features, with areas under the receiver operating characteristic curve of 0.76, 0.90, and 0.75 for discharge from hospital, mortality, and ICU admission, respectively. External validation was performed on data collected from the Mayo Clinic. Our scheme also highlights known biases in the model's predictions, such as bias against patients with a history of alcohol abuse and bias based on insurance status. Conclusions: Our study signifies the importance of fusing multiple data modalities for accurate prediction of clinical trajectories. The proposed graph structure can model relationships between patients based on non-imaging EHR data, and graph convolutional networks can fuse this relationship information with imaging data to predict future disease trajectories more effectively than models employing only imaging or only non-imaging data. Our graph-based fusion modeling framework can be easily extended to other prediction tasks to efficiently combine imaging data with non-imaging clinical data.
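The similarity-based graph construction described above, where edges encode clinical or demographic similarity between patients, can be sketched with a simple agreement count over categorical EHR fields. This is an illustrative stand-in under stated assumptions (the paper's similarity measure, feature set, and threshold are not specified here; all names and records below are hypothetical):

```python
def build_patient_graph(patients, threshold=2):
    """Connect patients whose non-imaging EHR features agree on at
    least `threshold` fields; edge weights carry the agreement count.
    Node features (image embeddings) would be attached separately."""
    edges = {}
    ids = list(patients)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            agree = sum(patients[a][f] == patients[b][f] for f in patients[a])
            if agree >= threshold:
                edges[(a, b)] = agree
    return edges
```

A graph convolutional network would then propagate each node's image embedding along these weighted edges, so that clinically similar patients inform each other's outcome prediction.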
