Search | BVS (Virtual Health Library) - Ministry of Health

1.

AI and machine learning in medical imaging: key points from development to translation.

Samala, Ravi K; Drukker, Karen; Shukla-Dave, Amita; Chan, Heang-Ping; Sahiner, Berkman; Petrick, Nicholas; Greenspan, Hayit; Mahmood, Usman; Summers, Ronald M; Tourassi, Georgia; Deserno, Thomas M; Regge, Daniele; Näppi, Janne J; Yoshida, Hiroyuki; Huo, Zhimin; Chen, Quan; Vergara, Daniel; Cha, Kenny H; Mazurchuk, Richard; Grizzard, Kevin T; Huisman, Henkjan; Morra, Lia; Suzuki, Kenji; Armato, Samuel G; Hadjiiski, Lubomir.

BJR Artif Intell ; 1(1): ubae006, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38828430

ABSTRACT

Innovation in medical imaging artificial intelligence (AI)/machine learning (ML) demands extensive data collection, algorithmic advancements, and rigorous performance assessments encompassing aspects such as generalizability, uncertainty, bias, fairness, trustworthiness, and interpretability. Achieving widespread integration of AI/ML algorithms into diverse clinical tasks will demand a steadfast commitment to overcoming issues in model design, development, and performance assessment. The complexities of AI/ML clinical translation present substantial challenges, requiring engagement with relevant stakeholders, assessment of cost-effectiveness for user and patient benefit, timely dissemination of information relevant to robust functioning throughout the AI/ML lifecycle, consideration of regulatory compliance, and feedback loops for real-world performance evidence. This commentary addresses several hurdles for the development and adoption of AI/ML technologies in medical imaging. Comprehensive attention to these underlying and often subtle factors is critical not only for tackling the challenges but also for exploring novel opportunities for the advancement of AI in radiology.

2.

Artificial intelligence in medicine: mitigating risks and maximizing benefits via quality assurance, quality control, and acceptance testing.

Mahmood, Usman; Shukla-Dave, Amita; Chan, Heang-Ping; Drukker, Karen; Samala, Ravi K; Chen, Quan; Vergara, Daniel; Greenspan, Hayit; Petrick, Nicholas; Sahiner, Berkman; Huo, Zhimin; Summers, Ronald M; Cha, Kenny H; Tourassi, Georgia; Deserno, Thomas M; Grizzard, Kevin T; Näppi, Janne J; Yoshida, Hiroyuki; Regge, Daniele; Mazurchuk, Richard; Suzuki, Kenji; Morra, Lia; Huisman, Henkjan; Armato, Samuel G; Hadjiiski, Lubomir.

BJR Artif Intell ; 1(1): ubae003, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38476957

ABSTRACT

The adoption of artificial intelligence (AI) tools in medicine poses challenges to existing clinical workflows. This commentary discusses the necessity of context-specific quality assurance (QA), emphasizing the need for robust QA measures with quality control (QC) procedures that encompass (1) acceptance testing (AT) before clinical use, (2) continuous QC monitoring, and (3) adequate user training. The discussion also covers essential components of AT and QA, illustrated with real-world examples. We also highlight what we see as the shared responsibility of manufacturers or vendors, regulators, healthcare systems, medical physicists, and clinicians to enact appropriate testing and oversight to ensure a safe and equitable transformation of medicine through AI.

3.

Detection of Severe Lung Infection on Chest Radiographs of COVID-19 Patients: Robustness of AI Models across Multi-Institutional Data.

Sobiecki, André; Hadjiiski, Lubomir M; Chan, Heang-Ping; Samala, Ravi K; Zhou, Chuan; Stojanovska, Jadranka; Agarwal, Prachi P.

Diagnostics (Basel) ; 14(3), 2024 Feb 05.

Article in English | MEDLINE | ID: mdl-38337857

ABSTRACT

The diagnosis of severe COVID-19 lung infection is important because it carries a higher risk for the patient and requires prompt treatment with oxygen therapy and hospitalization, whereas patients with less severe lung infection often remain under observation. In addition, patients with severe infections are more likely to have long-standing residual changes in their lungs and may need follow-up imaging. We have developed deep learning neural network models for classifying severe vs. non-severe lung infections in COVID-19 patients on chest radiographs (CXR). A deep learning U-Net model was developed to segment the lungs. Inception-v1 and Inception-v4 models were trained for the classification of severe vs. non-severe COVID-19 infection. Four CXR datasets from multi-country and multi-institutional sources were used to develop and evaluate the models. The combined dataset consisted of 5748 cases and 6193 CXR images with physicians' severity ratings as reference standard. The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance. We studied the reproducibility of classification performance using different combinations of training and validation data sets. We also evaluated the generalizability of the trained deep learning models using both independent internal and external test sets. On the independent test sets, the Inception-v1 based models achieved AUCs ranging between 0.81 ± 0.02 and 0.84 ± 0.0, while the Inception-v4 models achieved AUCs ranging between 0.85 ± 0.06 and 0.89 ± 0.01. These results demonstrate the promise of using deep learning models in differentiating COVID-19 patients with severe from non-severe lung infection on chest radiographs.
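As an illustration of the AUC metric used above, the following minimal sketch computes the AUC as the probability that a randomly chosen positive (severe) case receives a higher score than a randomly chosen negative (non-severe) case, with ties counted as one half. The function name and the scores are hypothetical and are not taken from the study.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model outputs, for illustration only.
severe_scores = [0.9, 0.8, 0.4]
non_severe_scores = [0.7, 0.3, 0.2, 0.1]
print(round(auc(severe_scores, non_severe_scores), 3))
```

This pairwise form is equivalent to the area under the empirical ROC curve; it is quadratic in the number of cases, which is acceptable for a sketch but not for large test sets.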

4.

Decision region analysis for generalizability of artificial intelligence models: estimating model generalizability in the case of cross-reactivity and population shift.

Burgon, Alexis; Sahiner, Berkman; Petrick, Nicholas; Pennello, Gene; Cha, Kenny H; Samala, Ravi K.

J Med Imaging (Bellingham) ; 11(1): 014501, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38283653

ABSTRACT

Purpose: Understanding an artificial intelligence (AI) model's ability to generalize to its target population is critical to ensuring the safe and effective usage of AI in medical devices. A traditional generalizability assessment relies on the availability of large, diverse datasets, which are difficult to obtain in many medical imaging applications. We present an approach for enhanced generalizability assessment by examining the decision space beyond the available testing data distribution. Approach: Vicinal distributions of virtual samples are generated by interpolating between triplets of test images. The generated virtual samples leverage the characteristics already in the test set, increasing the sample diversity while remaining close to the AI model's data manifold. We demonstrate the generalizability assessment approach on the non-clinical tasks of classifying patient sex, race, COVID status, and age group from chest x-rays. Results: Decision region composition analysis for generalizability indicated that a disproportionately large portion of the decision space belonged to a single "preferred" class for each task, despite comparable performance on the evaluation dataset. Evaluation using cross-reactivity and population shift strategies indicated a tendency to overpredict samples as belonging to the preferred class (e.g., COVID negative) for patients whose subgroup was not represented in the model development data. Conclusions: An analysis of an AI model's decision space has the potential to provide insight into model generalizability. Our approach uses the analysis of composition of the decision space to obtain an improved assessment of model generalizability in the case of limited test data.
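The idea of interpolating between triplets of test images and then examining the composition of the decision region can be sketched as follows. This is only a schematic under stated assumptions: the regular grid of barycentric weights, the flat-list image representation, and the names `barycentric_weights`, `virtual_samples`, and `region_composition` are all hypothetical, and the abstract does not specify how the authors sample the interpolation weights.

```python
def barycentric_weights(steps):
    """Regular grid of weights (w1, w2, w3) with w1 + w2 + w3 = 1 (assumed scheme)."""
    return [(i / steps, j / steps, (steps - i - j) / steps)
            for i in range(steps + 1)
            for j in range(steps + 1 - i)]


def virtual_samples(img_a, img_b, img_c, steps=4):
    """Convex combinations of three images, each given as a flat list of pixel values."""
    return [[w1 * a + w2 * b + w3 * c for a, b, c in zip(img_a, img_b, img_c)]
            for w1, w2, w3 in barycentric_weights(steps)]


def region_composition(samples, classify):
    """Fraction of virtual samples assigned to each class by the model under test."""
    counts = {}
    for sample in samples:
        label = classify(sample)
        counts[label] = counts.get(label, 0) + 1
    return {label: count / len(samples) for label, count in counts.items()}


# Toy stand-ins: three two-pixel "images" and a threshold "model".
samples = virtual_samples([0.0, 0.0], [1.0, 0.0], [0.0, 1.0], steps=4)
composition = region_composition(samples, lambda s: "pos" if s[0] > 0.5 else "neg")
print(composition)
```

A composition dominated by one class across many triplets would correspond to the "preferred" class behavior the abstract describes.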

5.

Methodology for Good Machine Learning with Multi-Omics Data.

Coroller, Thibaud; Sahiner, Berkman; Amatya, Anup; Gossmann, Alexej; Karagiannis, Konstantinos; Moloney, Conor; Samala, Ravi K; Santana-Quintero, Luis; Solovieff, Nadia; Wang, Craig; Amiri-Kordestani, Laleh; Cao, Qian; Cha, Kenny H; Charlab, Rosane; Cross, Frank H; Hu, Tingting; Huang, Ruihao; Kraft, Jeffrey; Krusche, Peter; Li, Yutong; Li, Zheng; Mazo, Ilya; Paul, Rahul; Schnakenberg, Susan; Serra, Paolo; Smith, Sean; Song, Chi; Su, Fei; Tiwari, Mohit; Vechery, Colin; Xiong, Xin; Zarate, Juan Pablo; Zhu, Hao; Chakravartty, Arunava; Liu, Qi; Ohlssen, David; Petrick, Nicholas; Schneider, Julie A; Walderhaug, Mark; Zuber, Emmanuel.

Clin Pharmacol Ther ; 115(4): 745-757, 2024 04.

Article in English | MEDLINE | ID: mdl-37965805

ABSTRACT

In 2020, Novartis Pharmaceuticals Corporation and the U.S. Food and Drug Administration (FDA) started a 4-year scientific collaboration to approach complex new data modalities and advanced analytics. The scientific question was to find novel radio-genomics-based prognostic and predictive factors for HR+/HER2- metastatic breast cancer under a Research Collaboration Agreement. This collaboration has been providing valuable insights to help successfully implement future scientific projects, particularly using artificial intelligence and machine learning. This tutorial aims to provide tangible guidelines for a multi-omics project that includes multidisciplinary expert teams spanning different institutions. We cover key ideas, such as "maintaining effective communication" and "following good data science practices," followed by the four steps of exploratory projects, namely (1) plan, (2) design, (3) develop, and (4) disseminate. We break each step into smaller concepts with strategies for implementation and provide illustrations from our collaboration to further give the readers actionable guidance.

Subjects

Artificial Intelligence, Multiomics, Humans, Machine Learning, Genomics

6.

Regulatory considerations for medical imaging AI/ML devices in the United States: concepts and challenges.

Petrick, Nicholas; Chen, Weijie; Delfino, Jana G; Gallas, Brandon D; Kang, Yanna; Krainak, Daniel; Sahiner, Berkman; Samala, Ravi K.

J Med Imaging (Bellingham) ; 10(5): 051804, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37361549

ABSTRACT

Purpose: To introduce developers to medical device regulatory processes and data considerations in artificial intelligence and machine learning (AI/ML) device submissions and to discuss ongoing AI/ML-related regulatory challenges and activities. Approach: AI/ML technologies are being used in an increasing number of medical imaging devices, and the fast evolution of these technologies presents novel regulatory challenges. We provide AI/ML developers with an introduction to U.S. Food and Drug Administration (FDA) regulatory concepts, processes, and fundamental assessments for a wide range of medical imaging AI/ML device types. Results: The device type for an AI/ML device and appropriate premarket regulatory pathway is based on the level of risk associated with the device and informed by both its technological characteristics and intended use. AI/ML device submissions contain a wide array of information and testing to facilitate the review process with the model description, data, nonclinical testing, and multi-reader multi-case testing being critical aspects of the AI/ML device review process for many AI/ML device submissions. The agency is also involved in AI/ML-related activities that support guidance document development, good machine learning practice development, AI/ML transparency, AI/ML regulatory research, and real-world performance assessment. Conclusion: FDA's AI/ML regulatory and scientific efforts support the joint goals of ensuring patients have access to safe and effective AI/ML devices over the entire device lifecycle and stimulating medical AI/ML innovation.

7.

Data drift in medical machine learning: implications and potential remedies.

Sahiner, Berkman; Chen, Weijie; Samala, Ravi K; Petrick, Nicholas.

Br J Radiol ; 96(1150): 20220878, 2023 Oct.

Article in English | MEDLINE | ID: mdl-36971405

ABSTRACT

Data drift refers to differences between the data used in training a machine learning (ML) model and that applied to the model in real-world operation. Medical ML systems can be exposed to various forms of data drift, including differences between the data sampled for training and used in clinical operation, differences between medical practices or context of use between training and clinical use, and time-related changes in patient populations, disease patterns, and data acquisition, to name a few. In this article, we first review the terminology used in ML literature related to data drift, define distinct types of drift, and discuss in detail potential causes within the context of medical applications with an emphasis on medical imaging. We then review the recent literature regarding the effects of data drift on medical ML systems, which overwhelmingly show that data drift can be a major cause for performance deterioration. We then discuss methods for monitoring data drift and mitigating its effects with an emphasis on pre- and post-deployment techniques. Some of the potential methods for drift detection and issues around model retraining when drift is detected are included. Based on our review, we find that data drift is a major concern in medical ML deployment and that more research is needed so that ML models can identify drift early, incorporate effective mitigation strategies and resist performance decay.
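One simple way to monitor a single input feature for drift is to compare its distribution in the training data against recent deployment data. The sketch below uses the two-sample Kolmogorov-Smirnov statistic (the maximum gap between the two empirical cumulative distribution functions). The choice of this statistic, the `drift_flag` helper, and the 0.3 threshold are assumptions made for illustration; the abstract does not say which detection methods the article covers.

```python
import bisect


def ks_statistic(reference, current):
    """Maximum absolute difference between the empirical CDFs of two samples."""
    ref = sorted(reference)
    cur = sorted(current)
    largest_gap = 0.0
    for value in sorted(set(ref) | set(cur)):
        cdf_ref = bisect.bisect_right(ref, value) / len(ref)
        cdf_cur = bisect.bisect_right(cur, value) / len(cur)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_cur))
    return largest_gap


def drift_flag(reference, current, threshold=0.3):
    """Flag drift when the KS statistic exceeds a threshold (hypothetical value)."""
    return ks_statistic(reference, current) > threshold


# Hypothetical feature values, e.g. mean image intensity per exam.
training_values = [0.41, 0.43, 0.45, 0.47, 0.50, 0.52]
deployment_values = [0.55, 0.58, 0.60, 0.61, 0.63, 0.66]
print(ks_statistic(training_values, deployment_values),
      drift_flag(training_values, deployment_values))
```

In practice a threshold would be chosen from the sampling distribution of the statistic for the sample sizes involved, and a per-feature test like this would not catch drift that only shows up in the joint distribution of several features.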

Subjects

Machine Learning, Medical Informatics Computing

8.

Computerized Decision Support for Bladder Cancer Treatment Response Assessment in CT Urography: Effect on Diagnostic Accuracy in Multi-Institution Multi-Specialty Study.

Sun, Di; Hadjiiski, Lubomir; Alva, Ajjai; Zakharia, Yousef; Joshi, Monika; Chan, Heang-Ping; Garje, Rohan; Pomerantz, Lauren; Elhag, Dean; Cohan, Richard H; Caoili, Elaine M; Kerr, Wesley T; Cha, Kenny H; Kirova-Nedyalkova, Galina; Davenport, Matthew S; Shankar, Prasad R; Francis, Isaac R; Shampain, Kimberly; Meyer, Nathaniel; Barkmeier, Daniel; Woolen, Sean; Palmbos, Phillip L; Weizer, Alon Z; Samala, Ravi K; Zhou, Chuan; Matuszak, Martha.

Tomography ; 8(2): 644-656, 2022 03 02.

Article in English | MEDLINE | ID: mdl-35314631

ABSTRACT

This observer study investigates the effect of a computerized artificial intelligence (AI)-based decision support system (CDSS-T) on physicians' diagnostic accuracy in assessing bladder cancer treatment response. The performance of 17 observers was evaluated when assessing bladder cancer treatment response without and with CDSS-T using pre- and post-chemotherapy CTU scans in 123 patients having 157 pre- and post-treatment cancer pairs. The impact of cancer case difficulty, observers' clinical experience, institution affiliation, specialty, and assessment times on the observers' diagnostic performance with and without CDSS-T was analyzed. The average performance of the 17 observers improved significantly (p = 0.002) when aided by CDSS-T. The cancer case difficulty, institution affiliation, specialty, and assessment times influenced the observers' performance without CDSS-T. The AI-based decision support system has the potential to improve the diagnostic accuracy in assessing bladder cancer treatment response and result in more consistent performance among all physicians.

Subjects

Clinical Decision Support Systems, Urinary Bladder Neoplasms, Artificial Intelligence, Humans, X-Ray Computed Tomography, Urinary Bladder Neoplasms/diagnostic imaging, Urinary Bladder Neoplasms/therapy, Urography

9.

Effect of Dose Level on Radiologists' Detection of Microcalcifications in Digital Breast Tomosynthesis: An Observer Study with Breast Phantoms.

Chan, Heang-Ping; Helvie, Mark A; Klein, Katherine A; McLaughlin, Carol; Neal, Colleen H; Oudsema, Rebecca; Rahman, W Tania; Roubidoux, Marilyn A; Hadjiiski, Lubomir M; Zhou, Chuan; Samala, Ravi K.

Acad Radiol ; 29 Suppl 1: S42-S49, 2022 01.

Article in English | MEDLINE | ID: mdl-32950384

ABSTRACT

OBJECTIVES: To compare radiologists' sensitivity, confidence level, and reading efficiency of detecting microcalcifications in digital breast tomosynthesis (DBT) at two clinically relevant dose levels. MATERIALS AND METHODS: Six 5-cm-thick heterogeneous breast phantoms embedded with a total of 144 simulated microcalcification clusters of four speck sizes were imaged at two dose modes by a clinical DBT system. The DBT volumes at the two dose levels were read independently by six MQSA radiologists and one fellow with 1-33 years (median 12 years) of experience in a fully-crossed counter-balanced manner. The radiologist located each potential cluster and rated its conspicuity and his/her confidence that the marked location contained a cluster. The differences in the results between the two dose modes were analyzed by two-tailed paired t-test. RESULTS: Compared to the lower-dose mode, the average glandular dose in the higher-dose mode for the 5-cm phantoms increased from 1.34 to 2.07 mGy. The detection sensitivity increased for all speck sizes and significantly for the two smaller sizes (p < 0.05). An average of 13.8% fewer false positive clusters was marked. The average conspicuity rating and the radiologists' confidence level were higher for all speck sizes and reached significance (p < 0.05) for the three larger sizes. The average reading time per detected cluster reduced significantly (p < 0.05) by an average of 13.2%. CONCLUSION: For a 5-cm-thick breast, an increase in average glandular dose from 1.34 to 2.07 mGy for DBT imaging increased the conspicuity of microcalcifications, improved the detection sensitivity by radiologists, increased their confidence levels, reduced false positive detections, and increased the reading efficiency.
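The paired t-test named above compares two measurements taken on the same readers. A minimal sketch of the test statistic follows, assuming one value per reader at each dose mode; the function name and the sensitivities are hypothetical and are not data from the study. Converting the statistic to a two-tailed p value requires the Student t distribution with n - 1 degrees of freedom, which the Python standard library does not provide, so that step is left out.

```python
import math
import statistics


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for matched measurements."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical per-reader sensitivities at the lower and higher dose modes.
lower_dose = [0.60, 0.55, 0.70, 0.65, 0.58, 0.62, 0.66]
higher_dose = [0.68, 0.61, 0.74, 0.73, 0.63, 0.70, 0.69]
t_value, dof = paired_t(lower_dose, higher_dose)
print(f"t = {t_value:.2f} with {dof} degrees of freedom")
```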

Subjects

Breast Neoplasms, Calcinosis, Breast/diagnostic imaging, Calcinosis/diagnostic imaging, Female, Humans, Male, Mammography/methods, Imaging Phantoms, Radiologists

10.

Risks of feature leakage and sample size dependencies in deep feature extraction for breast mass classification.

Samala, Ravi K; Chan, Heang-Ping; Hadjiiski, Lubomir; Helvie, Mark A.

Med Phys ; 48(6): 2827-2837, 2021 Jun.

Article in English | MEDLINE | ID: mdl-33368376

ABSTRACT

PURPOSE: Transfer learning is commonly used in deep learning for medical imaging to alleviate the problem of limited available data. In this work, we studied the risk of feature leakage and its dependence on sample size when using a pretrained deep convolutional neural network (DCNN) as a feature extractor for the classification of breast masses in mammography. METHODS: Feature leakage occurs when the training set is used for feature selection and classifier modeling while the cost function is guided by the validation performance or informed by the test performance. The high-dimensional feature space extracted from a pretrained DCNN suffers from the curse of dimensionality; feature subsets that can provide excessively optimistic performance can be found for the validation set or test set if the latter is allowed unlimited reuse during algorithm development. We designed a simulation study to examine feature leakage when using a DCNN as the feature extractor for mass classification in mammography. Four thousand five hundred and seventy-seven unique mass lesions were partitioned by patient into three sets: 3222 for training, 508 for validation, and 847 for independent testing. Three pretrained DCNNs, AlexNet, GoogLeNet, and VGG16, were first compared using a training set in fourfold cross validation and one was selected as the feature extractor. To assess generalization errors, the independent test set was sequestered as truly unseen cases. Training sets with sizes ranging from 10% to 75% of the available training set were simulated by random drawing, in addition to the full (100%) training set. Three commonly used feature classifiers, the linear discriminant, the support vector machine, and the random forest, were evaluated. A sequential feature selection method was used to find feature subsets that could achieve high classification performance in terms of the area under the receiver operating characteristic curve (AUC) in the validation set. The extent of feature leakage and the impact of training set size were analyzed by comparison to the performance in the unseen test set. RESULTS: All three classifiers showed large generalization error between the validation set and the independent sequestered test set at all sample sizes. The generalization error decreased as the sample size increased. At 100% of the sample size, one classifier achieved an AUC as high as 0.91 on the validation set while the corresponding performance on the unseen test set only reached an AUC of 0.72. CONCLUSIONS: Our results demonstrate that large generalization errors can occur in AI tools due to feature leakage. Without evaluation on unseen test cases, optimistically biased performance may be reported inadvertently, and can lead to unrealistic expectations and reduce confidence for clinical implementation.
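The leakage mechanism described above can be reproduced on a small scale with features that carry no information at all. In the sketch below, greedy forward selection picks the feature subset that maximizes validation AUC, and the same subset is then scored on a sequestered test set. Everything here is an assumption made for illustration: the features are pure Gaussian noise rather than DCNN features, the "classifier" is an unweighted sum of the selected features, and the set sizes are arbitrary. On noise like this, the selected subset's validation AUC is typically well above 0.5 while the unseen test AUC is expected to stay near chance.

```python
import random


def auc(scores, labels):
    """AUC from pairwise comparisons of positive and negative scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_noise_set(n_cases, n_features, rng):
    """Alternating binary labels with features that are independent of the labels."""
    labels = [i % 2 for i in range(n_cases)]
    features = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
                for _ in range(n_cases)]
    return features, labels


def score(features, subset):
    """Toy classifier: unweighted sum of the selected features."""
    return [sum(row[j] for j in subset) for row in features]


def forward_select(val_x, val_y, n_features, max_size):
    """Greedily add the feature that most improves validation AUC."""
    subset, best = [], 0.0
    for _ in range(max_size):
        candidates = [j for j in range(n_features) if j not in subset]
        top_auc, top_j = max((auc(score(val_x, subset + [j]), val_y), j)
                             for j in candidates)
        if top_auc <= best:
            break
        subset.append(top_j)
        best = top_auc
    return subset, best


rng = random.Random(0)
val_x, val_y = make_noise_set(40, 200, rng)      # reused during "development"
test_x, test_y = make_noise_set(400, 200, rng)   # sequestered, never used for selection
subset, val_auc = forward_select(val_x, val_y, 200, 5)
test_auc = auc(score(test_x, subset), test_y)
print(f"validation AUC {val_auc:.2f}, unseen test AUC {test_auc:.2f}")
```

The gap between the two printed values is the optimistic bias introduced by selecting features on the set that is also used to report performance.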

Subjects

Mammography, Neural Networks (Computer), Algorithms, Breast/diagnostic imaging, Humans, Sample Size

11.

Intraobserver Variability in Bladder Cancer Treatment Response Assessment With and Without Computerized Decision Support.

Hadjiiski, Lubomir M; Cha, Kenny H; Cohan, Richard H; Chan, Heang-Ping; Caoili, Elaine M; Davenport, Matthew S; Samala, Ravi K; Weizer, Alon Z; Alva, Ajjai; Kirova-Nedyalkova, Galina; Shampain, Kimberly; Meyer, Nathaniel; Barkmeier, Daniel; Woolen, Sean A; Shankar, Prasad R; Francis, Isaac R; Palmbos, Phillip L.

Tomography ; 6(2): 194-202, 2020 06.

Article in English | MEDLINE | ID: mdl-32548296

ABSTRACT

We evaluated the intraobserver variability of physicians aided by a computerized decision-support system for treatment response assessment (CDSS-T) to identify patients who show complete response to neoadjuvant chemotherapy for bladder cancer, and the effects of the intraobserver variability on physicians' assessment accuracy. A CDSS-T tool was developed that uses a combination of deep learning neural network and radiomic features from computed tomography (CT) scans to detect bladder cancers that have fully responded to neoadjuvant treatment. Pre- and postchemotherapy CT scans of 157 bladder cancers from 123 patients were collected. In a multireader, multicase observer study, physician-observers estimated the likelihood of pathologic T0 disease by viewing paired pre/posttreatment CT scans placed side by side on an in-house-developed graphical user interface. Five abdominal radiologists, 4 diagnostic radiology residents, 2 oncologists, and 1 urologist participated as observers. They first provided an estimate without CDSS-T and then with CDSS-T. A subset of cases was evaluated twice to study the intraobserver variability and its effects on observer consistency. The mean areas under the curves for assessment of pathologic T0 disease were 0.85 for CDSS-T alone, 0.76 for physicians without CDSS-T and improved to 0.80 for physicians with CDSS-T (P = .001) in the original evaluation, and 0.78 for physicians without CDSS-T and improved to 0.81 for physicians with CDSS-T (P = .010) in the repeated evaluation. The intraobserver variability was significantly reduced with CDSS-T (P < .0001). The CDSS-T can significantly reduce physicians' variability and improve their accuracy for identifying complete response of muscle-invasive bladder cancer to neoadjuvant chemotherapy.

Subjects

Clinical Decision Support Systems, Urinary Bladder Neoplasms, Humans, Observer Variation, Physicians, X-Ray Computed Tomography, Urinary Bladder Neoplasms/diagnostic imaging, Urinary Bladder Neoplasms/drug therapy

12.

Computer-aided diagnosis in the era of deep learning.

Chan, Heang-Ping; Hadjiiski, Lubomir M; Samala, Ravi K.

Med Phys ; 47(5): e218-e227, 2020 Jun.

Article in English | MEDLINE | ID: mdl-32418340

ABSTRACT

Computer-aided diagnosis (CAD) has been a major field of research for the past few decades. CAD uses machine learning methods to analyze imaging and/or nonimaging patient data and makes assessment of the patient's condition, which can then be used to assist clinicians in their decision-making process. The recent success of the deep learning technology in machine learning spurs new research and development efforts to improve CAD performance and to develop CAD for many other complex clinical tasks. In this paper, we discuss the potential and challenges in developing CAD tools using deep learning technology or artificial intelligence (AI) in general, the pitfalls and lessons learned from CAD in screening mammography and considerations needed for future implementation of CAD or AI in clinical use. It is hoped that the past experiences and the deep learning technology will lead to successful advancement and lasting growth in this new era of CAD, thereby enabling CAD to deliver intelligent aids to improve health care.

Subjects

Deep Learning, Computer-Assisted Diagnosis/methods, Humans

13.

Generalization error analysis for deep convolutional neural network with transfer learning in breast cancer diagnosis.

Samala, Ravi K; Chan, Heang-Ping; Hadjiiski, Lubomir M; Helvie, Mark A; Richter, Caleb D.

Phys Med Biol ; 65(10): 105002, 2020 05 11.

Article in English | MEDLINE | ID: mdl-32208369

ABSTRACT

Deep convolutional neural network (DCNN), now popularly called artificial intelligence (AI), has shown the potential to improve over previous computer-assisted tools in medical imaging developed in the past decades. A DCNN has millions of free parameters that need to be trained, but the training sample set is limited in size for most medical imaging tasks so that transfer learning is typically used. Automatic data mining may be an efficient way to enlarge the collected data set but the data can be noisy such as incorrect labels or even a wrong type of image. In this work we studied the generalization error of DCNN with transfer learning in medical imaging for the task of classifying malignant and benign masses on mammograms. With a finite available data set, we simulated a training set containing corrupted data or noisy labels. The balance between learning and memorization of the DCNN was manipulated by varying the proportion of corrupted data in the training set. The generalization error of DCNN was analyzed by the area under the receiver operating characteristic curve for the training and test sets and the weight changes after transfer learning. The study demonstrates that the transfer learning strategy of DCNN for such tasks needs to be designed properly, taking into consideration the constraints of the available training set having limited size and quality for the classification task at hand, to minimize memorization and improve generalizability.
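The noisy-label part of such a simulation can be sketched as flipping the binary labels of a chosen fraction of training cases. The helper name, the 0 = benign and 1 = malignant coding, and the 20% corruption level are hypothetical; the abstract does not state how the authors generated the corrupted data.

```python
import random


def corrupt_labels(labels, fraction, rng):
    """Flip the binary label of a randomly chosen fraction of cases."""
    n_flip = round(fraction * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


# Hypothetical training labels: 0 = benign, 1 = malignant.
clean = [0, 1] * 50
noisy = corrupt_labels(clean, 0.20, random.Random(1))
print(sum(a != b for a, b in zip(clean, noisy)), "of", len(clean), "labels flipped")
```

Training the same network on sets with increasing corruption fractions, and comparing training AUC against test AUC at each level, would then expose how much of the fit reflects memorization rather than learning.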

Subjects

Breast Neoplasms/diagnostic imaging, Deep Learning, Computer-Assisted Image Processing/methods, Female, Humans, Mammography, ROC Curve

14.

Deep Learning in Medical Image Analysis.

Chan, Heang-Ping; Samala, Ravi K; Hadjiiski, Lubomir M; Zhou, Chuan.

Adv Exp Med Biol ; 1213: 3-21, 2020.

Article in English | MEDLINE | ID: mdl-32030660

ABSTRACT

Deep learning is the state-of-the-art machine learning approach. The success of deep learning in many pattern recognition applications has brought excitement and high expectations that deep learning, or artificial intelligence (AI), can bring revolutionary changes in health care. Early studies of deep learning applied to lesion detection or classification have reported superior performance compared to those by conventional techniques or even better than radiologists in some tasks. The potential of applying deep-learning-based medical image analysis to computer-aided diagnosis (CAD), thus providing decision support to clinicians and improving the accuracy and efficiency of various diagnostic and treatment processes, has spurred new research and development efforts in CAD. Despite the optimism in this new era of machine learning, the development and implementation of CAD or AI tools in clinical practice face many challenges. In this chapter, we will discuss some of these issues and efforts needed to develop robust deep-learning-based CAD tools and integrate these tools into the clinical workflow, thereby advancing towards the goal of providing reliable intelligent aids for patient care.

Subjects

Deep Learning, Computer-Assisted Diagnosis, Diagnostic Imaging, Computer-Assisted Image Interpretation, Humans

15.

CAD and AI for breast cancer-recent development and challenges.

Chan, Heang-Ping; Samala, Ravi K; Hadjiiski, Lubomir M.

Br J Radiol ; 93(1108): 20190580, 2020 Apr.

Article in English | MEDLINE | ID: mdl-31742424

ABSTRACT

Computer-aided diagnosis (CAD) has been a popular area of research and development in the past few decades. In CAD, machine learning methods and multidisciplinary knowledge and techniques are used to analyze the patient information and the results can be used to assist clinicians in their decision making process. CAD may analyze imaging information alone or in combination with other clinical data. It may provide the analyzed information directly to the clinician or correlate the analyzed results with the likelihood of certain diseases based on statistical modeling of the past cases in the population. CAD systems can be developed to provide decision support for many applications in the patient care processes, such as lesion detection, characterization, cancer staging, treatment planning and response assessment, recurrence and prognosis prediction. The new state-of-the-art machine learning technique, known as deep learning (DL), has revolutionized speech and text recognition as well as computer vision. The potential of major breakthrough by DL in medical image analysis and other CAD applications for patient care has brought about unprecedented excitement of applying CAD, or artificial intelligence (AI), to medicine in general and to radiology in particular. In this paper, we will provide an overview of the recent developments of CAD using DL in breast imaging and discuss some challenges and practical issues that may impact the advancement of artificial intelligence and its integration into clinical workflow.

Subjects

Artificial Intelligence/trends, Breast Neoplasms/diagnostic imaging, Computer-Assisted Diagnosis/trends, Bibliometrics, Clinical Decision Support Systems, Deep Learning/trends, Computer-Assisted Diagnosis/methods, Female, Humans, Computer-Assisted Image Interpretation/methods, Magnetic Resonance Imaging/methods, Magnetic Resonance Imaging/trends, Mammography/methods, Neural Networks (Computer), Health Care Quality Assurance, Radiology/education, Mammary Ultrasonography/methods, Mammary Ultrasonography/trends

16.

Breast Cancer Diagnosis in Digital Breast Tomosynthesis: Effects of Training Sample Size on Multi-Stage Transfer Learning Using Deep Neural Nets.

Samala, Ravi K; Hadjiiski, Lubomir; Helvie, Mark A; Richter, Caleb D; Cha, Kenny H.

IEEE Trans Med Imaging ; 38(3): 686-696, 2019 03.

Article in English | MEDLINE | ID: mdl-31622238

ABSTRACT

In this paper, we developed a deep convolutional neural network (CNN) for the classification of malignant and benign masses in digital breast tomosynthesis (DBT) using a multi-stage transfer learning approach that utilized data from similar auxiliary domains for intermediate-stage fine-tuning. Breast imaging data from DBT, digitized screen-film mammography, and digital mammography totaling 4039 unique regions of interest (1797 malignant and 2242 benign) were collected. Using cross validation, we selected the best transfer network from six transfer networks by varying the level up to which the convolutional layers were frozen. In a single-stage transfer learning approach, knowledge from CNN trained on the ImageNet data was fine-tuned directly with the DBT data. In a multi-stage transfer learning approach, knowledge learned from ImageNet was first fine-tuned with the mammography data and then fine-tuned with the DBT data. Two transfer networks were compared for the second-stage transfer learning by freezing most of the CNN structures versus freezing only the first convolutional layer. We studied the dependence of the classification performance on training sample size for various transfer learning and fine-tuning schemes by varying the training data from 1% to 100% of the available sets. The area under the receiver operating characteristic curve (AUC) was used as a performance measure. The view-based AUC on the test set for single-stage transfer learning was 0.85 ± 0.05 and improved significantly (p < 0.05) to 0.91 ± 0.03 for multi-stage learning. This paper demonstrated that, when the training sample size from the target domain is limited, an additional stage of transfer learning using data from a similar auxiliary domain is advantageous.

Subjects

Breast Neoplasms/diagnostic imaging , Machine Learning , Mammography/methods , Neural Networks, Computer , Area Under Curve , Humans , Michigan , Sample Size

17.

Deep Learning Approach for Assessment of Bladder Cancer Treatment Response.

Wu, Eric; Hadjiiski, Lubomir M; Samala, Ravi K; Chan, Heang-Ping; Cha, Kenny H; Richter, Caleb; Cohan, Richard H; Caoili, Elaine M; Paramagul, Chintana; Alva, Ajjai; Weizer, Alon Z.

Tomography ; 5(1): 201-208, 2019 03.

Article in English | MEDLINE | ID: mdl-30854458

ABSTRACT

We compared the performance of different deep-learning convolutional neural network (DL-CNN) models for bladder cancer treatment response assessment based on transfer learning by freezing different DL-CNN layers and varying the DL-CNN structure. Pre- and posttreatment computed tomography scans of 123 patients (cancers, 129; pre- and posttreatment cancer pairs, 158) undergoing chemotherapy were collected. After chemotherapy, 33% of patients had T0 stage cancer (complete response). Regions of interest in pre- and posttreatment scans were extracted from the segmented lesions and combined into hybrid pre-post image pairs (h-ROIs). Training (pairs, 94; h-ROIs, 6209), validation (10 pairs) and test sets (54 pairs) were obtained. The DL-CNN consisted of 2 convolution (C1-C2), 2 locally connected (L3-L4), and 1 fully connected layer. The DL-CNN was trained with h-ROIs to classify cancers as fully responding (stage T0) or not fully responding to chemotherapy. Two radiologists provided lesion likelihood of being stage T0 posttreatment. The test area under the ROC curve (AUC) was 0.73 for T0 prediction by the base DL-CNN structure with randomly initialized weights. The base DL-CNN structure with pretrained weights and transfer learning (no frozen layers) achieved test AUC of 0.79. The test AUCs for 3 modified DL-CNN structures (different C1-C2 max pooling filter sizes, strides, and padding, with transfer learning) were 0.72, 0.86, and 0.69. For the base DL-CNN with (C1) frozen, (C1-C2) frozen, and (C1-C2-L3) frozen, the test AUCs were 0.81, 0.78, and 0.71, respectively. The radiologists' AUCs were 0.76 and 0.77. DL-CNN performed better with pretrained than randomly initialized weights.
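The freezing configurations compared in the abstract above (no frozen layers, C1 frozen, C1-C2 frozen, C1-C2-L3 frozen) can be pictured with a framework-free sketch. This is a schematic under stated assumptions, not the authors' implementation: the `Layer` class, the helper names, and the label `F5` for the fully connected layer are all hypothetical, and no weights are actually trained.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    trainable: bool = True


def build_base_network():
    # Layer order follows the abstract: 2 convolution, 2 locally connected,
    # 1 fully connected. "F5" is an assumed label, not taken from the paper.
    return [Layer("C1"), Layer("C2"), Layer("L3"), Layer("L4"), Layer("F5")]


def freeze_through(layers, last_frozen):
    """Freeze all layers from the input up to and including `last_frozen`.

    Passing None leaves every layer trainable (full fine-tuning).
    """
    names = [layer.name for layer in layers]
    if last_frozen is not None and last_frozen not in names:
        raise ValueError(f"unknown layer: {last_frozen}")
    cutoff = -1 if last_frozen is None else names.index(last_frozen)
    for index, layer in enumerate(layers):
        layer.trainable = index > cutoff
    return layers


for last in (None, "C1", "C2", "L3"):
    net = freeze_through(build_base_network(), last)
    print(last, [layer.name for layer in net if layer.trainable])
```

In a real deep-learning framework the `trainable` flag would correspond to excluding a layer's weights from gradient updates during fine-tuning.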

Subjects

Deep Learning , Urinary Bladder Neoplasms/diagnostic imaging , Urinary Bladder Neoplasms/drug therapy , Antineoplastic Agents/therapeutic use , Cystectomy , Decision Support Systems, Clinical , Drug Monitoring/methods , Humans , Neoadjuvant Therapy/methods , ROC Curve , Radiographic Image Interpretation, Computer-Assisted/methods , Sensitivity and Specificity , Tomography, X-Ray Computed/methods , Transfer, Psychology , Treatment Outcome , Urography/methods

18.

Diagnostic Accuracy of CT for Prediction of Bladder Cancer Treatment Response with and without Computerized Decision Support.

Cha, Kenny H; Hadjiiski, Lubomir M; Cohan, Richard H; Chan, Heang-Ping; Caoili, Elaine M; Davenport, Matthew S; Samala, Ravi K; Weizer, Alon Z; Alva, Ajjai; Kirova-Nedyalkova, Galina; Shampain, Kimberly; Meyer, Nathaniel; Barkmeier, Daniel; Woolen, Sean; Shankar, Prasad R; Francis, Isaac R; Palmbos, Phillip.

Acad Radiol ; 26(9): 1137-1145, 2019 09.

Article in English | MEDLINE | ID: mdl-30424999

ABSTRACT

RATIONALE AND OBJECTIVES: To evaluate whether a computed tomography (CT)-based computerized decision-support system for muscle-invasive bladder cancer treatment response assessment (CDSS-T) can improve identification of patients who have responded completely to neoadjuvant chemotherapy. MATERIALS AND METHODS: Following Institutional Review Board approval, pre-chemotherapy and post-chemotherapy CT scans of 123 subjects with 157 muscle-invasive bladder cancer foci were collected retrospectively. CT data were analyzed with a CDSS-T that uses a combination of deep-learning convolutional neural network and radiomic features to distinguish muscle-invasive bladder cancers that have fully responded to neoadjuvant treatment from those that have not. Leave-one-case-out cross-validation was used to minimize overfitting. Five attending abdominal radiologists, four diagnostic radiology residents, two attending oncologists, and one attending urologist estimated the likelihood of pathologic T0 disease (complete response) by viewing paired pre/post-treatment CT scans placed side-by-side on an internally-developed graphical user interface. The observers provided an estimate without use of CDSS-T and then were permitted to revise their estimate after a CDSS-T-derived likelihood score was displayed. Observer estimates were analyzed with multi-reader, multi-case receiver operating characteristic methodology. The area under the curve (AUC) and the statistical significance of the difference were estimated. RESULTS: The mean AUCs for assessment of pathologic T0 disease were 0.80 for CDSS-T alone, 0.74 for physicians not using CDSS-T, and 0.77 for physicians using CDSS-T. The increase in the physicians' performance was statistically significant (P < .05). CONCLUSION: CDSS-T improves physician performance for identifying complete response of muscle-invasive bladder cancer to neoadjuvant chemotherapy.
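The abstract above uses leave-one-case-out cross-validation. A minimal sketch of that splitting scheme follows, assuming that each sample carries a case identifier and that all lesions from one case are held out together so that no patient contributes to both training and testing. The function name and the sample data are hypothetical.

```python
def leave_one_case_out(samples):
    """Yield (held_out_case, train_items, test_items) for each distinct case.

    `samples` is a list of (case_id, item) pairs; a case may own several items.
    """
    case_ids = sorted({case_id for case_id, _ in samples})
    for held_out in case_ids:
        train = [item for case_id, item in samples if case_id != held_out]
        test = [item for case_id, item in samples if case_id == held_out]
        yield held_out, train, test


# Hypothetical data: case_a has two lesions, the other cases one each.
samples = [
    ("case_a", "lesion_1"),
    ("case_a", "lesion_2"),
    ("case_b", "lesion_3"),
    ("case_c", "lesion_4"),
]
folds = list(leave_one_case_out(samples))
for held_out, train, test in folds:
    print(held_out, "train:", train, "test:", test)
```

Grouping by case rather than by lesion matters when a subject has more than one cancer focus, as in this data set (123 subjects, 157 foci).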

Subjects

Radiographic Image Interpretation, Computer-Assisted/methods , Tomography, X-Ray Computed , Urinary Bladder Neoplasms/diagnostic imaging , Urinary Bladder Neoplasms/drug therapy , Adult , Aged , Aged, 80 and over , Area Under Curve , Chemotherapy, Adjuvant , Decision Support Systems, Clinical , Deep Learning , Female , Humans , Immunoglobulin G/therapeutic use , Male , Melphalan/therapeutic use , Middle Aged , Neoadjuvant Therapy , Neoplasm Invasiveness , Neoplasm Staging , ROC Curve , Retrospective Studies , Treatment Outcome , Urinary Bladder Neoplasms/pathology

19.

Deep-learning convolutional neural network: Inner and outer bladder wall segmentation in CT urography.

Gordon, Marshall N; Hadjiiski, Lubomir M; Cha, Kenny H; Samala, Ravi K; Chan, Heang-Ping; Cohan, Richard H; Caoili, Elaine M.

Med Phys ; 46(2): 634-648, 2019 Feb.

Article in English | MEDLINE | ID: mdl-30520055

ABSTRACT

PURPOSE: We are developing a computerized segmentation tool for the inner and outer bladder wall as a part of an image analysis pipeline for CT urography (CTU). MATERIALS AND METHODS: A data set of 172 CTU cases was collected retrospectively with Institutional Review Board (IRB) approval. The data set was randomly split into two independent sets of training (81 cases) and testing (92 cases) which were manually outlined for both the inner and outer wall. We trained a deep-learning convolutional neural network (DL-CNN) to distinguish the bladder wall from the inside and outside of the bladder using neighborhood information. Approximately 240 000 regions of interest (ROIs) of 16 × 16 pixels in size were extracted from regions in the training cases identified by the manually outlined inner and outer bladder walls to form a training set for the DL-CNN; half of the ROIs were selected to include the bladder wall and the other half were selected to exclude the bladder wall with some of these ROIs being inside the bladder and the rest outside the bladder entirely. The DL-CNN trained on these ROIs was applied to the cases in the test set slice-by-slice to generate a bladder wall likelihood map where the gray level of a given pixel represents the likelihood that a given pixel would belong to the bladder wall. We then used the DL-CNN likelihood map as an energy term in the energy equation of a cascaded level sets method to segment the inner and outer bladder wall. The DL-CNN segmentation with level sets was compared to the three-dimensional (3D) hand-segmented contours as a reference standard. RESULTS: For the inner wall contour, the training set achieved the average volume intersection, average volume error, average absolute volume error, and average distance of 90.0 ± 8.7%, -4.2 ± 18.4%, 12.9 ± 13.9%, and 3.0 ± 1.6 mm, respectively. The corresponding values for the test set were 86.9 ± 9.6%, -8.3 ± 37.7%, 18.4 ± 33.8%, and 3.4 ± 1.8 mm, respectively. For the outer wall contour, the training set achieved the values of 93.7 ± 3.9%, -7.8 ± 11.4%, 10.3 ± 9.3%, and 3.0 ± 1.2 mm, respectively. The corresponding values for the test set were 87.5 ± 9.9%, -1.2 ± 20.8%, 11.9 ± 17.0%, and 3.5 ± 2.3 mm, respectively. CONCLUSIONS: Our study demonstrates that DL-CNN-assisted level sets can effectively segment bladder walls from the inner bladder and outer structures despite a lack of consistent distinctions along the inner wall. However, even with the addition of level sets, the inner and outer walls may still be over-segmented and the DL-CNN-assisted level sets may incorrectly segment parts of the prostate that overlap with the outer bladder wall. The outer wall segmentation was improved compared to our previous method and the DL-CNN-assisted level sets were also able to segment the inner bladder wall with similar performance. This study shows the DL-CNN-assisted level set segmentation tool can effectively segment the inner and outer wall of the bladder.
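The volume-based measures reported above can be illustrated with a small sketch. The abstract does not state the formulas, so the definitions used here are assumptions: volume intersection is the share of the reference volume covered by the segmentation, and volume error is (reference volume minus segmented volume) divided by the reference volume, so that a negative value indicates over-segmentation. The function name and the toy voxel sets are hypothetical, and the average-distance measure is omitted.

```python
def volume_metrics(reference, segmented):
    """Compare two voxel sets (sets of (x, y, z) tuples); results in percent.

    Assumed definitions (not confirmed by the abstract):
      volume intersection = |reference & segmented| / |reference|
      volume error        = (|reference| - |segmented|) / |reference|
    """
    if not reference:
        raise ValueError("reference volume must not be empty")
    v_ref = len(reference)
    v_seg = len(segmented)
    intersection_pct = 100.0 * len(reference & segmented) / v_ref
    error_pct = 100.0 * (v_ref - v_seg) / v_ref
    return {
        "volume_intersection_pct": intersection_pct,
        "volume_error_pct": error_pct,
        "abs_volume_error_pct": abs(error_pct),
    }


# Toy single-slice example: a 4 x 4 reference and a shifted, larger 5 x 4 result.
reference = {(x, y, 0) for x in range(4) for y in range(4)}
segmented = {(x, y, 0) for x in range(1, 6) for y in range(4)}
metrics = volume_metrics(reference, segmented)
print(metrics)
```

In this toy case the segmentation covers 12 of 16 reference voxels but contains 20 voxels in total, so the signed volume error is negative under the assumed convention.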

Subjects

Deep Learning , Image Processing, Computer-Assisted/methods , Tomography, X-Ray Computed , Urinary Bladder/diagnostic imaging , Urography , Humans , Radiation Dosage , Urinary Bladder/anatomy & histology

20.

Evolutionary pruning of transfer learned deep convolutional neural network for breast cancer diagnosis in digital breast tomosynthesis.

Samala, Ravi K; Chan, Heang-Ping; Hadjiiski, Lubomir M; Helvie, Mark A; Richter, Caleb; Cha, Kenny.

Phys Med Biol ; 63(9): 095005, 2018 05 01.

Article in English | MEDLINE | ID: mdl-29616660

ABSTRACT

Deep learning models are highly parameterized, resulting in difficulty in inference and transfer learning for image recognition tasks. In this work, we propose a layered pathway evolution method to compress a deep convolutional neural network (DCNN) for classification of masses in digital breast tomosynthesis (DBT). The objective is to prune the number of tunable parameters while preserving the classification accuracy. In the first stage transfer learning, 19 632 augmented regions-of-interest (ROIs) from 2454 mass lesions on mammograms were used to train a DCNN pre-trained on ImageNet. In the second stage transfer learning, the DCNN was used as a feature extractor followed by feature selection and random forest classification. The pathway evolution was performed using a genetic algorithm in an iterative approach with tournament selection driven by count-preserving crossover and mutation. The second stage was trained with 9120 DBT ROIs from 228 mass lesions using leave-one-case-out cross-validation. The DCNN was reduced by 87% in the number of neurons, 34% in the number of parameters, and 95% in the number of multiply-and-add operations required in the convolutional layers. The test AUCs on 89 mass lesions from 94 independent DBT cases before and after pruning were 0.88 and 0.90, respectively, and the difference was not statistically significant (p > 0.05). The proposed DCNN compression approach can reduce the number of required operations by 95% while maintaining the classification performance. The approach can be extended to other deep neural networks and imaging tasks where transfer learning is appropriate.
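The abstract above mentions tournament selection with count-preserving crossover and mutation but gives no operator definitions. The sketch below shows one possible reading, offered as an assumption rather than the authors' method: each candidate is a binary mask over prunable units, and "count-preserving" is taken to mean that every operator keeps the number of active units fixed. All function names, the mask encoding, and the fitness values are hypothetical.

```python
import random


def tournament_select(population, fitness, k, rng):
    """Sample k distinct candidates and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitness[i])
    return population[best]


def count_preserving_crossover(parent_a, parent_b, rng):
    """Child keeps units active in both parents, then fills the remaining
    slots from units active in exactly one parent, so the count is unchanged."""
    if len(parent_a) != len(parent_b) or sum(parent_a) != sum(parent_b):
        raise ValueError("parents must have equal length and active count")
    both = [i for i, (a, b) in enumerate(zip(parent_a, parent_b)) if a and b]
    either = [i for i, (a, b) in enumerate(zip(parent_a, parent_b)) if a != b]
    needed = sum(parent_a) - len(both)
    chosen = set(both) | set(rng.sample(either, needed))
    return [1 if i in chosen else 0 for i in range(len(parent_a))]


def count_preserving_mutation(mask, rng):
    """Deactivate one active unit and activate one inactive unit."""
    active = [i for i, bit in enumerate(mask) if bit]
    inactive = [i for i, bit in enumerate(mask) if not bit]
    child = list(mask)
    if active and inactive:
        child[rng.choice(active)] = 0
        child[rng.choice(inactive)] = 1
    return child


rng = random.Random(0)
population = [[1, 1, 0, 0, 1, 0], [0, 1, 1, 0, 0, 1], [1, 0, 0, 1, 1, 0]]
fitness = [0.88, 0.90, 0.85]  # hypothetical validation scores, one per mask
parent_a = tournament_select(population, fitness, 2, rng)
parent_b = tournament_select(population, fitness, 2, rng)
child = count_preserving_mutation(
    count_preserving_crossover(parent_a, parent_b, rng), rng
)
print(child, sum(child))
```

Keeping the active count fixed means each generation searches among networks of the same size, so the compression level would be set by the count rather than by the fitness function.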

Subjects

Breast Neoplasms/diagnostic imaging , Breast Neoplasms/diagnosis , Deep Learning , Image Processing, Computer-Assisted/methods , Mammography/methods , Neural Networks, Computer , Female , Humans
