1.
Radiology ; 290(3): 621-628, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30526359

ABSTRACT

Purpose: To investigate the combination of mammography radiomics and quantitative three-compartment breast (3CB) image analysis of dual-energy mammography to limit unnecessary benign breast biopsies. Materials and Methods: For this prospective study, dual-energy craniocaudal and mediolateral oblique mammograms were obtained immediately before biopsy in 109 women (mean age, 51 years; range, 31-85 years) with Breast Imaging Reporting and Data System category 4 or 5 breast masses (35 invasive cancers, 74 benign) from 2013 through 2017. The three quantitative compartments of water, lipid, and protein thickness at each pixel were calculated from the attenuation at high and low energy by using a within-image phantom. Masses were automatically segmented and features were extracted from the low-energy mammograms and the quantitative compartment images. Tenfold cross-validations using a linear discriminant classifier with predefined feature signatures helped differentiate between malignant and benign masses by means of (a) water-lipid-protein composition images alone, (b) mammography radiomics alone, and (c) a combined image analysis of both. Positive predictive value of biopsy performed (PPV3) at maximum sensitivity was the primary performance metric, and results were compared with those for conventional diagnostic digital mammography. Results: The PPV3 for conventional diagnostic digital mammography in our data set was 32.1% (35 of 109; 95% confidence interval [CI]: 23.9%, 41.3%), with a sensitivity of 100%. In comparison, combined mammography radiomics plus quantitative 3CB image analysis had PPV3 of 49% (34 of 70; 95% CI: 36.5%, 58.9%; P < .001), with a sensitivity of 97% (34 of 35; 95% CI: 90.3%, 100%; P < .001) and 35.8% (39 of 109) fewer total biopsies (P < .001). Conclusion: Quantitative three-compartment breast image analysis of breast masses combined with mammography radiomics has the potential to reduce unnecessary breast biopsies.
© RSNA, 2018 Online supplemental material is available for this article.
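
The headline figures in this abstract are simple ratios of the counts it quotes, so they can be checked by hand. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative assumptions and do not come from the paper.

```python
def ppv3(cancers_biopsied, biopsies_performed):
    """Positive predictive value of biopsies performed: cancers found / biopsies done."""
    return cancers_biopsied / biopsies_performed

# Counts quoted in the abstract.
total_masses = 109       # every mass was biopsied under conventional reading
total_cancers = 35
combined_biopsies = 70   # biopsies still recommended by the combined analysis
combined_cancers = 34    # cancers among those 70 biopsies

ppv3_conventional = ppv3(total_cancers, total_masses)
ppv3_combined = ppv3(combined_cancers, combined_biopsies)
sensitivity_combined = combined_cancers / total_cancers
biopsies_avoided = (total_masses - combined_biopsies) / total_masses

print(f"PPV3, conventional: {ppv3_conventional:.1%}")
print(f"PPV3, combined:     {ppv3_combined:.1%}")
print(f"Sensitivity:        {sensitivity_combined:.1%}")
print(f"Biopsies avoided:   {biopsies_avoided:.1%}")
```

The trade-off is visible in the counts: 39 biopsies are avoided at the cost of one cancer (34 of 35) falling below the decision threshold.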


Subjects
Breast Diseases/diagnostic imaging , Breast Diseases/pathology , Mammography/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Adult , Aged , Aged, 80 and over , Biopsy , Diagnosis, Differential , Female , Humans , Middle Aged , Predictive Value of Tests , Prospective Studies , Sensitivity and Specificity
2.
Cancer ; 122(5): 748-57, 2016 Mar 01.
Article in English | MEDLINE | ID: mdl-26619259

ABSTRACT

BACKGROUND: The objective of this study was to demonstrate that computer-extracted image phenotypes (CEIPs) of biopsy-proven breast cancer on magnetic resonance imaging (MRI) can accurately predict pathologic stage. METHODS: The authors used a data set of deidentified breast MRIs organized by the National Cancer Institute in The Cancer Imaging Archive. In total, 91 biopsy-proven breast cancers were analyzed from patients who had information available on pathologic stage (stage I, n = 22; stage II, n = 58; stage III, n = 11) and surgically verified lymph node status (negative lymph nodes, n = 46; ≥ 1 positive lymph node, n = 44; no lymph nodes examined, n = 1). Tumors were characterized according to 1) radiologist-measured size and 2) CEIP. Then, models were built that combined 2 CEIPs to predict tumor pathologic stage and lymph node involvement, and the models were evaluated in a leave-1-out, cross-validation analysis with the area under the receiver operating characteristic curve (AUC) as the value of interest. RESULTS: Tumor size was the most powerful predictor of pathologic stage, but CEIPs that captured biologic behavior also emerged as predictive (eg, stage I and II vs stage III demonstrated an AUC of 0.83). No size measure was successful in the prediction of positive lymph nodes, but adding a CEIP that described tumor "homogeneity" significantly improved discrimination (AUC = 0.62; P = .003) compared with chance. CONCLUSIONS: The current results indicate that MRI phenotypes have promise for predicting breast cancer pathologic stage and lymph node status. Cancer 2016;122:748-757. © 2015 American Cancer Society.
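
In a leave-one-out analysis, each case is scored by a model trained on all the other cases, and the pooled held-out scores are then summarized by the AUC. The sketch below shows only the last step, the empirical AUC as the fraction of positive/negative pairs ranked correctly. The scores are hypothetical and the function name is an assumption, not something taken from the paper.

```python
def auc(positive_scores, negative_scores):
    """Empirical AUC: chance that a positive case outscores a negative one (ties count half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical held-out scores, for illustration only.
node_positive = [0.8, 0.4]
node_negative = [0.6, 0.2]
example_auc = auc(node_positive, node_negative)
print(example_auc)
```

An AUC of 0.5 corresponds to chance, which is the reference point for the lymph node result (AUC = 0.62) reported above.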


Subjects
Breast Neoplasms/pathology , Carcinoma, Ductal, Breast/pathology , Carcinoma, Lobular/pathology , Image Processing, Computer-Assisted/methods , Lymph Nodes/pathology , Adult , Aged , Aged, 80 and over , Female , Humans , Magnetic Resonance Imaging , Middle Aged , Neoplasm Staging , Phenotype , Prognosis , ROC Curve
3.
Radiology ; 281(2): 382-391, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27144536

ABSTRACT

Purpose: To investigate relationships between computer-extracted breast magnetic resonance (MR) imaging phenotypes with multigene assays of MammaPrint, Oncotype DX, and PAM50 to assess the role of radiomics in evaluating the risk of breast cancer recurrence. Materials and Methods: Analysis was conducted on an institutional review board-approved retrospective data set of 84 deidentified, multi-institutional breast MR examinations from the National Cancer Institute Cancer Imaging Archive, along with clinical, histopathologic, and genomic data from The Cancer Genome Atlas. The data set of biopsy-proven invasive breast cancers included 74 (88%) ductal, eight (10%) lobular, and two (2%) mixed cancers. Of these, 73 (87%) were estrogen receptor positive, 67 (80%) were progesterone receptor positive, and 19 (23%) were human epidermal growth factor receptor 2 positive. For each case, computerized radiomics of the MR images yielded computer-extracted tumor phenotypes of size, shape, margin morphology, enhancement texture, and kinetic assessment. Regression and receiver operating characteristic analysis were conducted to assess the predictive ability of the MR radiomics features relative to the multigene assay classifications. Results: Multiple linear regression analyses demonstrated significant associations (R2 = 0.25-0.32, r = 0.5-0.56, P < .0001) between radiomics signatures and multigene assay recurrence scores. Important radiomics features included tumor size and enhancement texture, which indicated tumor heterogeneity. Use of radiomics in the task of distinguishing between good and poor prognosis yielded area under the receiver operating characteristic curve values of 0.88 (standard error, 0.05), 0.76 (standard error, 0.06), 0.68 (standard error, 0.08), and 0.55 (standard error, 0.09) for MammaPrint, Oncotype DX, PAM50 risk of relapse based on subtype, and PAM50 risk of relapse based on subtype and proliferation, respectively, with all but the latter showing statistical difference from chance. Conclusion: Quantitative breast MR imaging radiomics shows promise for image-based phenotyping in assessing the risk of breast cancer recurrence. © RSNA, 2016 Online supplemental material is available for this article.
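
The statement that all but the last AUC differ from chance can be sanity-checked from the reported values and standard errors. The sketch below uses a simple normal approximation, z = (AUC - 0.5) / SE. This is an assumed back-of-the-envelope check, not necessarily the test the authors used, and the dictionary labels are shorthand introduced here.

```python
import math

def z_vs_chance(auc, standard_error):
    """Distance of an AUC from chance (0.5) in standard-error units."""
    return (auc - 0.5) / standard_error

def two_sided_p(z):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))

# AUC and standard error as reported in the abstract; labels are shorthand.
reported = {
    "MammaPrint": (0.88, 0.05),
    "Oncotype DX": (0.76, 0.06),
    "PAM50 ROR-S": (0.68, 0.08),
    "PAM50 ROR-P": (0.55, 0.09),
}

p_values = {name: two_sided_p(z_vs_chance(a, se)) for name, (a, se) in reported.items()}
significant = [name for name, p in p_values.items() if p < 0.05]
print(significant)
```

Under this approximation the first three assays clear the 0.05 level and the last does not, which agrees with the abstract.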


Subjects
Breast Neoplasms/genetics , Breast Neoplasms/pathology , Genomics/methods , Magnetic Resonance Imaging/methods , Neoplasm Recurrence, Local/genetics , Neoplasm Recurrence, Local/pathology , Adult , Aged , Aged, 80 and over , Biomarkers, Tumor/analysis , Female , Gene Expression , Humans , Image Enhancement , Image Interpretation, Computer-Assisted , Middle Aged , Phenotype , Predictive Value of Tests , Retrospective Studies , Risk Assessment
4.
AJR Am J Roentgenol ; 206(6): 1341-50, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27043979

ABSTRACT

OBJECTIVE: The objective of our study was to assess and compare, in a reader study, radiologists' performance in the detection of breast cancer using full-field digital mammography (FFDM) alone and using FFDM with 3D automated breast ultrasound (ABUS). MATERIALS AND METHODS: In this multireader, multicase, sequential-design reader study, 17 Mammography Quality Standards Act-qualified radiologists interpreted a cancer-enriched set of FFDM and ABUS examinations. All imaging studies were of asymptomatic women with BI-RADS C or D breast density. Readers first interpreted FFDM alone and subsequently interpreted FFDM combined with ABUS. The analysis included 185 cases: 133 noncancers and 52 biopsy-proven cancers. Of the 52 cancer cases, the screening FFDM images were interpreted as showing BI-RADS 1 or 2 findings in 31 cases and BI-RADS 0 findings in 21 cases. For the cases interpreted as BI-RADS 0, a forced BI-RADS score was also given. Reader performance was compared in terms of area under the ROC curve (AUC), sensitivity, and specificity. RESULTS: The AUC was 0.72 for FFDM alone and 0.82 for FFDM combined with ABUS, yielding a statistically significant 14% relative improvement in AUC (i.e., change in AUC = 0.10 [95% CI, 0.07-0.14]; p < 0.001). When a cutpoint of BI-RADS 3 was used, the sensitivity across all readers was 57.5% for FFDM alone and 74.1% for FFDM with ABUS, yielding a statistically significant increase in sensitivity (p < 0.001) (relative increase = 29%). Overall specificity was 78.1% for FFDM alone and 76.1% for FFDM with ABUS (p = 0.496). For only the mammography-negative cancers, the average AUC was 0.60 for FFDM alone and 0.75 for FFDM with ABUS, yielding a statistically significant 25% relative improvement in AUC with the addition of ABUS (p < 0.001).
CONCLUSION: Combining mammography with ABUS, compared with mammography alone, significantly improved readers' detection of breast cancers in women with dense breast tissue without substantially affecting specificity.
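
The abstract mixes absolute changes (AUC up by 0.10) with relative ones (14%, 29%, 25%). The short sketch below recomputes the relative figures from the reported before and after values; the helper name is an assumption made for illustration.

```python
def relative_change(before, after):
    """Change expressed as a fraction of the starting value."""
    return (after - before) / before

auc_all = relative_change(0.72, 0.82)            # all cases, FFDM vs FFDM + ABUS
sensitivity_all = relative_change(57.5, 74.1)    # sensitivity at a BI-RADS 3 cutpoint
auc_mammo_negative = relative_change(0.60, 0.75) # mammography-negative cancers only

print(f"{auc_all:.0%}, {sensitivity_all:.0%}, {auc_mammo_negative:.0%}")
```

Relative changes depend on the baseline: the same 0.10 gain in AUC would be a smaller relative improvement if FFDM alone had started at a higher value.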


Subjects
Breast Neoplasms/diagnostic imaging , Carcinoma/diagnostic imaging , Mammography , Ultrasonography, Mammary , Adolescent , Adult , Aged , Aged, 80 and over , Early Detection of Cancer , Female , Humans , Middle Aged , Predictive Value of Tests , ROC Curve , Retrospective Studies , Young Adult
5.
J Med Imaging (Bellingham) ; 11(5): 054501, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39280239

ABSTRACT

Significance: Uterine fibroids (UFs) can pose a serious health risk to women. UFs are benign tumors that vary in clinical presentation from asymptomatic to causing debilitating symptoms. UF management is limited by our inability to predict UF growth rate and future morbidity. Aim: We aim to develop a predictive model to identify UFs with increased growth rates and possible resultant morbidity. Approach: We retrospectively analyzed 44 expertly outlined UFs from 20 patients who underwent two multi-parametric MR imaging exams as part of a prospective study over an average of 16 months. We identified 44 initial features by extracting quantitative magnetic resonance imaging (MRI) features plus morphological and textural radiomics features from DCE, T2, and apparent diffusion coefficient sequences. Principal component analysis reduced dimensionality, with the smallest number of components explaining over 97.5% of the variance selected. Employing a leave-one-fibroid-out scheme, a linear discriminant analysis classifier utilized these components to output a growth risk score. Results: The classifier incorporated the first three principal components and achieved an area under the receiver operating characteristic curve of 0.80 (95% confidence interval [0.69; 0.91]), effectively distinguishing UFs growing faster than the median growth rate of 0.93 cm³/year/fibroid from slower-growing ones within the cohort. Time-to-event analysis, dividing the cohort based on the median growth risk score, yielded a hazard ratio of 0.33 [0.15; 0.76], demonstrating potential clinical utility. Conclusion: We developed a promising predictive model utilizing quantitative MRI features and principal component analysis to identify UFs with increased growth rates. Furthermore, the model's discrimination ability supports its potential clinical utility in developing tailored patient and fibroid-specific management once validated on a larger cohort.
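
The component-selection rule described in the Approach (keep the smallest number of principal components whose cumulative explained variance exceeds 97.5%) can be sketched in a few lines. The eigenvalues below are hypothetical, chosen only so that the rule lands on three components; they are not the study's values, and the function name is an assumption.

```python
def n_components_for_variance(eigenvalues, threshold=0.975):
    """Smallest number of leading components whose cumulative variance share exceeds threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total > threshold:
            return k
    return len(ordered)

# Hypothetical PCA eigenvalues (variance along each component), for illustration only.
example_eigenvalues = [6.1, 2.4, 1.3, 0.15, 0.05]
n_selected = n_components_for_variance(example_eigenvalues)
print(n_selected)
```

In a leave-one-fibroid-out scheme, a selection step like this would normally be repeated inside each fold so that the held-out fibroid does not influence the components; whether the study did so is not stated in the abstract.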

6.
Med Phys ; 51(3): 1812-1821, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37602841

ABSTRACT

BACKGROUND: Artificial intelligence/computer-aided diagnosis (AI/CADx) and its use of radiomics have shown potential in diagnosis and prognosis of breast cancer. Performance metrics such as the area under the receiver operating characteristic (ROC) curve (AUC) are frequently used as figures of merit for the evaluation of CADx. Methods for evaluating lesion-based measures of performance may enhance the assessment of AI/CADx pipelines, particularly in the situation of comparing performances by classifier. PURPOSE: The purpose of this study was to investigate the use case of two standard classifiers to (1) compare overall classification performance of the classifiers in the task of distinguishing between benign and malignant breast lesions using radiomic features extracted from dynamic contrast-enhanced magnetic resonance (DCE-MR) images, (2) define a new repeatability metric (termed sureness), and (3) use sureness to examine if one classifier provides an advantage in AI diagnostic performance by lesion when using radiomic features. METHODS: Images of 1052 breast lesions (201 benign, 851 cancers) had been retrospectively collected under HIPAA/IRB compliance. The lesions had been segmented automatically using a fuzzy c-means method and thirty-two radiomic features had been extracted. Classification was investigated for the task of malignant lesions (81% of the dataset) versus benign lesions (19%). Two classifiers (linear discriminant analysis [LDA] and support vector machines [SVM]) were trained and tested within 0.632 bootstrap analyses (2000 iterations). Whole-set classification performance was evaluated at two levels: (1) the 0.632+ bias-corrected area under the ROC curve (AUC) and (2) performance metric curves which give variability in operating sensitivity and specificity at a target operating point (95% target sensitivity). Sureness was defined as 1 minus the 95% confidence interval of the classifier output for each lesion for each classifier.
Lesion-based repeatability was evaluated at two levels: (1) repeatability profiles, which represent the distribution of sureness across the decision threshold and (2) sureness of each lesion. The latter was used to identify lesions with better sureness with one classifier over another while maintaining lesion-based performance across the bootstrap iterations. RESULTS: In classification performance assessment, the median and 95% CI of difference in AUC between the two classifiers did not show evidence of difference (ΔAUC = -0.003 [-0.031, 0.018]). Both classifiers achieved the target sensitivity. Sureness was more consistent across the classifier output range for the SVM classifier than the LDA classifier. The SVM resulted in a net gain of 33 benign lesions and 307 cancers with higher sureness and maintained lesion-based performance. However, with the LDA there was a notable percentage of benign lesions (42%) with better sureness but lower lesion-based performance. CONCLUSIONS: When there is no evidence for difference in performance between classifiers using AUC or other performance summary measures, a lesion-based sureness metric may provide additional insight into AI pipeline design. These findings present and emphasize the utility of lesion-based repeatability via sureness in AI/CADx as a complementary enhancement to other evaluation measures.
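
A per-lesion sureness value can be sketched from the set of classifier outputs a lesion receives across bootstrap iterations. The sketch below reads sureness as one minus the width of the central 95% interval of those outputs, and it assumes outputs are scaled to [0, 1] and that a percentile interval is used. Both are assumptions made for illustration, since the abstract does not spell out these details.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of an already sorted list."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

def sureness(outputs):
    """One minus the width of the central 95% interval of one lesion's classifier outputs."""
    ordered = sorted(outputs)
    width = percentile(ordered, 97.5) - percentile(ordered, 2.5)
    return 1.0 - width

# Hypothetical outputs for two lesions across bootstrap iterations.
stable_lesion = [0.90, 0.91, 0.92, 0.91, 0.90]
unstable_lesion = [0.20, 0.50, 0.80, 0.40, 0.90]
print(sureness(stable_lesion), sureness(unstable_lesion))
```

A lesion whose output barely moves across iterations gets a sureness near 1, while one whose output swings widely gets a low value, independent of whether the classification is correct.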


Subjects
Artificial Intelligence , Breast Neoplasms , Humans , Female , Retrospective Studies , Magnetic Resonance Imaging/methods , Breast Neoplasms/pathology , Machine Learning
7.
J Med Imaging (Bellingham) ; 11(2): 024504, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38576536

ABSTRACT

Purpose: The Medical Imaging and Data Resource Center (MIDRC) was created to facilitate medical imaging machine learning (ML) research for tasks including early detection, diagnosis, prognosis, and assessment of treatment response related to the coronavirus disease 2019 pandemic and beyond. The purpose of this work was to create a publicly available metrology resource to assist researchers in evaluating the performance of their medical image analysis ML algorithms. Approach: An interactive decision tree, called MIDRC-MetricTree, has been developed, organized by the type of task that the ML algorithm was trained to perform. The criteria for this decision tree were that (1) users can select information such as the type of task, the nature of the reference standard, and the type of the algorithm output and (2) based on the user input, recommendations are provided regarding appropriate performance evaluation approaches and metrics, including literature references and, when possible, links to publicly available software/code as well as short tutorial videos. Results: Five types of tasks were identified for the decision tree: (a) classification, (b) detection/localization, (c) segmentation, (d) time-to-event (TTE) analysis, and (e) estimation. As an example, the classification branch of the decision tree includes two-class (binary) and multiclass classification tasks and provides suggestions for methods, metrics, software/code recommendations, and literature references for situations where the algorithm produces either binary or non-binary (e.g., continuous) output and for reference standards with negligible or non-negligible variability and unreliability. Conclusions: The publicly available decision tree is a resource to assist researchers in conducting task-specific performance evaluations, including classification, detection/localization, segmentation, TTE, and estimation tasks.

8.
BJR Artif Intell ; 1(1): ubae006, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38828430

ABSTRACT

Innovation in medical imaging artificial intelligence (AI)/machine learning (ML) demands extensive data collection, algorithmic advancements, and rigorous performance assessments encompassing aspects such as generalizability, uncertainty, bias, fairness, trustworthiness, and interpretability. Achieving widespread integration of AI/ML algorithms into diverse clinical tasks will demand a steadfast commitment to overcoming issues in model design, development, and performance assessment. The complexities of AI/ML clinical translation present substantial challenges, requiring engagement with relevant stakeholders, assessment of cost-effectiveness for user and patient benefit, timely dissemination of information relevant to robust functioning throughout the AI/ML lifecycle, consideration of regulatory compliance, and feedback loops for real-world performance evidence. This commentary addresses several hurdles for the development and adoption of AI/ML technologies in medical imaging. Comprehensive attention to these underlying and often subtle factors is critical not only for tackling the challenges but also for exploring novel opportunities for the advancement of AI in radiology.

9.
BJR Artif Intell ; 1(1): ubae003, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38476957

ABSTRACT

The adoption of artificial intelligence (AI) tools in medicine poses challenges to existing clinical workflows. This commentary discusses the necessity of context-specific quality assurance (QA), emphasizing the need for robust QA measures with quality control (QC) procedures that encompass (1) acceptance testing (AT) before clinical use, (2) continuous QC monitoring, and (3) adequate user training. The discussion also covers essential components of AT and QA, illustrated with real-world examples. We also highlight what we see as the shared responsibility of manufacturers or vendors, regulators, healthcare systems, medical physicists, and clinicians to enact appropriate testing and oversight to ensure a safe and equitable transformation of medicine through AI.

10.
Br J Radiol ; 96(1150): 20221152, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37698542

ABSTRACT

Artificial intelligence (AI), in one form or another, has been a part of medical imaging for decades. The recent evolution of AI into approaches such as deep learning has dramatically accelerated the application of AI across a wide range of radiologic settings. Despite the promises of AI, developers and users of AI technology must be fully aware of its potential biases and pitfalls, and this knowledge must be incorporated throughout the AI system development pipeline that involves training, validation, and testing. Grand challenges offer an opportunity to advance the development of AI methods for targeted applications and provide a mechanism for both directing and facilitating the development of AI systems. In the process, a grand challenge centralizes (with the challenge organizers) the burden of providing a valid benchmark test set to assess performance and generalizability of participants' models and the collection and curation of image metadata, clinical/demographic information, and the required reference standard. The most relevant grand challenges are those designed to maximize the open-science nature of the competition, with code and trained models deposited for future public access. The ultimate goal of AI grand challenges is to foster the translation of AI systems from competition to research benefit and patient care. Rather than reference the many medical imaging grand challenges that have been organized by groups such as MICCAI, RSNA, AAPM, and grand-challenge.org, this review assesses the role of grand challenges in promoting AI technologies for research advancement and for eventual clinical implementation, including their promises and limitations.


Subjects
Artificial Intelligence , Radiology , Humans , Radiography , Diagnostic Imaging , Patient Care
11.
J Med Imaging (Bellingham) ; 10(5): 051801, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37915406

ABSTRACT

The editorial introduces the JMI Special Section on Artificial Intelligence for Medical Imaging in Clinical Practice.

12.
J Med Imaging (Bellingham) ; 10(4): 044504, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37608852

ABSTRACT

Purpose: Image-based prediction of coronavirus disease 2019 (COVID-19) severity and resource needs can be an important means to address the COVID-19 pandemic. In this study, we propose an artificial intelligence/machine learning (AI/ML) COVID-19 prognosis method to predict patients' needs for intensive care by analyzing chest X-ray radiography (CXR) images using deep learning. Approach: The dataset consisted of 8357 CXR exams from 5046 COVID-19-positive patients as confirmed by reverse transcription polymerase chain reaction (RT-PCR) tests for the SARS-CoV-2 virus, with a training/validation/test split of 64%/16%/20% at the patient level. Our model involved a DenseNet121 network with a sequential transfer learning technique employed to train on a sequence of gradually more specific and complex tasks: (1) fine-tuning a model pretrained on ImageNet using a previously established CXR dataset with a broad spectrum of pathologies; (2) refining on another established dataset to detect pneumonia; and (3) fine-tuning using our in-house training/validation datasets to predict patients' needs for intensive care within 24, 48, 72, and 96 h following the CXR exams. The classification performances were evaluated on our independent test set (CXR exams of 1048 patients) using the area under the receiver operating characteristic curve (AUC) as the figure of merit in the task of distinguishing between those COVID-19-positive patients who required intensive care following the imaging exam and those who did not. Results: Our proposed AI/ML model achieved an AUC (95% confidence interval) of 0.78 (0.74, 0.81) when predicting the need for intensive care 24 h in advance, and at least 0.76 (0.73, 0.80) for 48 h or more in advance using predictions based on the AI prognostic marker derived from CXR images. Conclusions: This AI/ML prediction model for patients' needs for intensive care has the potential to support both clinical decision-making and resource management.
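
Because many patients contribute more than one exam, the split has to be made on patients rather than on exams, otherwise images of the same person could appear in both training and test sets. The sketch below shows one way to do this with the 64%/16%/20% proportions from the abstract. The identifiers, the seed, and the function name are hypothetical, and this is not the authors' code.

```python
import random

def split_by_patient(exam_to_patient, fractions=(0.64, 0.16, 0.20), seed=0):
    """Assign every exam to train/validation/test so that no patient spans two subsets."""
    patients = sorted(set(exam_to_patient.values()))
    random.Random(seed).shuffle(patients)
    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))
    subset_of = {}
    for index, patient in enumerate(patients):
        if index < n_train:
            subset_of[patient] = "train"
        elif index < n_train + n_val:
            subset_of[patient] = "validation"
        else:
            subset_of[patient] = "test"
    # Every exam inherits the subset of the patient it belongs to.
    return {exam: subset_of[patient] for exam, patient in exam_to_patient.items()}

# Hypothetical data: 25 patients with two exams each.
exams = {f"exam{p}_{k}": f"patient{p}" for p in range(25) for k in range(2)}
assignment = split_by_patient(exams)
```

With this layout the exam-level proportions only approximate 64/16/20, since patients differ in how many exams they contribute.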

13.
J Med Imaging (Bellingham) ; 10(6): 064502, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37990686

ABSTRACT

Purpose: Given the dependence of radiomic-based computer-aided diagnosis artificial intelligence on accurate lesion segmentation, we assessed the performances of 2D and 3D U-Nets in breast lesion segmentation on dynamic contrast-enhanced (DCE) magnetic resonance imaging (MRI) relative to fuzzy c-means (FCM) and radiologist segmentations. Approach: Using 994 unique breast lesions imaged with DCE-MRI, three segmentation algorithms (FCM clustering, 2D and 3D U-Net convolutional neural networks) were investigated. Center slice segmentations produced by FCM, 2D U-Net, and 3D U-Net were evaluated using radiologist segmentations as truth, and volumetric segmentations produced by 2D U-Net slices and 3D U-Net were compared using FCM as a surrogate reference standard. Fivefold cross-validation by lesion was conducted on the U-Nets; Dice similarity coefficient (DSC) and Hausdorff distance (HD) served as performance metrics. Segmentation performances were compared across different input image and lesion types. Results: 2D U-Net outperformed 3D U-Net for center slice (DSC, HD p<0.001) and volume segmentations (DSC, HD p<0.001). 2D U-Net outperformed FCM in center slice segmentation (DSC p<0.001). The use of second postcontrast subtraction images showed greater performance than first postcontrast subtraction images using the 2D and 3D U-Net (DSC p<0.05). Additionally, mass segmentation outperformed nonmass segmentation from first and second postcontrast subtraction images using 2D and 3D U-Nets (DSC, HD p<0.001). Conclusions: Results suggest that 2D U-Net is promising in segmenting mass and nonmass enhancing breast lesions from first and second postcontrast subtraction MRIs and thus could be an effective alternative to FCM or 3D U-Net.
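
The two metrics named in the Approach measure different things: the Dice similarity coefficient rewards overlap, while the Hausdorff distance reports the worst boundary disagreement. The sketch below computes both on masks stored as sets of (row, column) pixel coordinates. The brute-force distance search and the toy masks are assumptions made for illustration, not the study's implementation.

```python
import math

def dice(mask_a, mask_b):
    """Dice similarity coefficient between two masks given as sets of pixel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

def hausdorff(mask_a, mask_b):
    """Symmetric Hausdorff distance between two non-empty point sets (brute force)."""
    def directed(source, target):
        return max(min(math.dist(p, q) for q in target) for p in source)
    return max(directed(mask_a, mask_b), directed(mask_b, mask_a))

# Toy example: a 4x4 reference square and the same square shifted one column right.
reference = {(r, c) for r in range(4) for c in range(4)}
predicted = {(r, c) for r in range(4) for c in range(1, 5)}

print(dice(reference, predicted))
print(hausdorff(reference, predicted))
```

A higher DSC and a lower HD both indicate better agreement, which is why the abstract reports them side by side.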

14.
J Med Imaging (Bellingham) ; 10(6): 064501, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38074627

ABSTRACT

Purpose: The Medical Imaging and Data Resource Center (MIDRC) is a multi-institutional effort to accelerate medical imaging machine intelligence research and create a publicly available image repository/commons as well as a sequestered commons for performance evaluation and benchmarking of algorithms. After de-identification, approximately 80% of the medical images and associated metadata become part of the open commons and 20% are sequestered from the open commons. To ensure that both commons are representative of the population available, we introduced a stratified sampling method to balance the demographic characteristics across the two datasets. Approach: Our method uses multi-dimensional stratified sampling where several demographic variables of interest are sequentially used to separate the data into individual strata, each representing a unique combination of variables. Within each resulting stratum, patients are assigned to the open or sequestered commons. We applied this algorithm to an example dataset containing 5000 patients using the variables of race, age, sex at birth, ethnicity, COVID-19 status, and image modality, and compared the resulting demographic distributions with those from naïve random sampling of the dataset over 2000 independent trials. Results: Resulting prevalence of each demographic variable matched the prevalence from the input dataset within one standard deviation. Mann-Whitney U test results supported the hypothesis that sequestration by stratified sampling provided more balanced subsets than naïve randomization, except for demographic subcategories with very low prevalence. Conclusions: The developed multi-dimensional stratified sampling algorithm can partition a large dataset while maintaining balance across several variables, superior to the balance achieved from naïve randomization.
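
The stratify-then-assign idea in the Approach can be sketched as follows: group patients by their combination of variable values, then split each group roughly 80/20. This is a simplified illustration, not the MIDRC implementation; the record layout, variable names, and rounding rule are assumptions.

```python
import random
from collections import defaultdict

def stratified_split(patients, keys, sequestered_fraction=0.2, seed=0):
    """Assign patients to 'open' or 'sequestered' within each stratum of the given keys."""
    strata = defaultdict(list)
    for patient in patients:
        strata[tuple(patient[k] for k in keys)].append(patient["id"])
    rng = random.Random(seed)
    assignment = {}
    for stratum in sorted(strata):  # fixed order keeps the result reproducible
        ids = strata[stratum]
        rng.shuffle(ids)
        n_sequestered = round(sequestered_fraction * len(ids))
        for index, patient_id in enumerate(ids):
            assignment[patient_id] = "sequestered" if index < n_sequestered else "open"
    return assignment

# Hypothetical cohort: 100 patients spread evenly over 2 x 2 strata.
cohort = [
    {"id": i, "sex": "F" if i % 2 == 0 else "M", "modality": "CT" if i % 4 < 2 else "CR"}
    for i in range(100)
]
assignment = stratified_split(cohort, keys=("sex", "modality"))
```

Rounding within a stratum is where very small strata lose balance (a stratum of two patients cannot be split 80/20), which is consistent with the exception the abstract notes for subcategories with very low prevalence.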

15.
J Med Imaging (Bellingham) ; 10(6): 061105, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37469387

ABSTRACT

Purpose: The Medical Imaging and Data Resource Center (MIDRC) open data commons was launched to accelerate the development of artificial intelligence (AI) algorithms to help address the COVID-19 pandemic. The purpose of this study was to quantify longitudinal representativeness of the demographic characteristics of the primary MIDRC dataset compared to the United States general population (US Census) and COVID-19 positive case counts from the Centers for Disease Control and Prevention (CDC). Approach: The Jensen-Shannon distance (JSD), a measure of similarity of two distributions, was used to longitudinally measure the representativeness of the distribution of (1) all unique patients in the MIDRC data to the 2020 US Census and (2) all unique COVID-19 positive patients in the MIDRC data to the case counts reported by the CDC. The distributions were evaluated in the demographic categories of age at index, sex, race, ethnicity, and the combination of race and ethnicity. Results: Representativeness of the MIDRC data by ethnicity and the combination of race and ethnicity was impacted by the percentage of CDC case counts for which this was not reported. The distributions by sex and race have retained their level of representativeness over time. Conclusion: The representativeness of the open medical imaging datasets in the curated public data commons at MIDRC has evolved over time as the number of contributing institutions and overall number of subjects have grown. The use of metrics such as the JSD supports measurement of representativeness, which is one step needed for fair and generalizable AI algorithm development.
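
The Jensen-Shannon distance is the square root of the Jensen-Shannon divergence, which averages the Kullback-Leibler divergences of the two distributions from their midpoint. The sketch below uses base-2 logarithms, which bounds the distance between 0 (identical) and 1 (no overlap). The choice of base and the example counts are assumptions; the abstract does not give them.

```python
import math

def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance (base 2) between two distributions given as counts or shares."""
    total_p, total_q = sum(p), sum(q)
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    midpoint = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with zero probability contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, midpoint) + 0.5 * kl(q, midpoint)
    return math.sqrt(max(divergence, 0.0))

# Hypothetical category counts for a dataset and a reference population.
dataset_counts = [520, 480]
reference_shares = [0.51, 0.49]
print(jensen_shannon_distance(dataset_counts, reference_shares))
```

Tracking this value at successive data releases gives the longitudinal view described in the abstract: a falling distance means the dataset is moving closer to the reference population in that category.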

16.
J Med Imaging (Bellingham) ; 10(6): 061104, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37125409

ABSTRACT

Purpose: To recognize and address various sources of bias essential for algorithmic fairness and trustworthiness and to contribute to a just and equitable deployment of AI in medical imaging. There is an increasing interest in developing medical imaging-based machine learning methods, also known as medical imaging artificial intelligence (AI), for the detection, diagnosis, prognosis, and risk assessment of disease with the goal of clinical implementation. These tools are intended to help improve traditional human decision-making in medical imaging. However, biases introduced in the steps toward clinical deployment may impede their intended function, potentially exacerbating inequities. Specifically, medical imaging AI can propagate or amplify biases introduced in the many steps from model inception to deployment, resulting in a systematic difference in the treatment of different groups. Approach: Our multi-institutional team included medical physicists, medical imaging artificial intelligence/machine learning (AI/ML) researchers, experts in AI/ML bias, statisticians, physicians, and scientists from regulatory bodies. We identified sources of bias in AI/ML and mitigation strategies for these biases, and developed recommendations for best practices in medical imaging AI/ML development. Results: Five main steps along the roadmap of medical imaging AI/ML were identified: (1) data collection, (2) data preparation and annotation, (3) model development, (4) model evaluation, and (5) model deployment. Within these steps, or bias categories, we identified 29 sources of potential bias, many of which can impact multiple steps, as well as mitigation strategies. Conclusions: Our findings provide a valuable resource to researchers, clinicians, and the public at large.

17.
Med Phys ; 50(2): e1-e24, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36565447

ABSTRACT

Rapid advances in artificial intelligence (AI) and machine learning, and specifically in deep learning (DL) techniques, have enabled broad application of these methods in health care. The promise of the DL approach has spurred further interest in computer-aided diagnosis (CAD) development and applications using both "traditional" machine learning methods and newer DL-based methods. We use the term CAD-AI to refer to this expanded clinical decision support environment that uses traditional and DL-based AI methods. Numerous studies have been published to date on the development of machine learning tools for computer-aided, or AI-assisted, clinical tasks. However, most of these machine learning models are not ready for clinical deployment. It is of paramount importance to ensure that a clinical decision support tool undergoes proper training and rigorous validation of its generalizability and robustness before adoption for patient care in the clinic. To address these important issues, the American Association of Physicists in Medicine (AAPM) Computer-Aided Image Analysis Subcommittee (CADSC) is charged, in part, to develop recommendations on practices and standards for the development and performance assessment of computer-aided decision support systems. The committee has previously published two opinion papers on the evaluation of CAD systems and issues associated with user training and quality assurance of these systems in the clinic. With machine learning techniques continuing to evolve and CAD applications expanding to new stages of the patient care process, the current task group report considers the broader issues common to the development of most, if not all, CAD-AI applications and their translation from the bench to the clinic. The goal is to bring attention to the proper training and validation of machine learning algorithms that may improve their generalizability and reliability and accelerate the adoption of CAD-AI systems for clinical decision support.


Subjects
Artificial Intelligence; Diagnosis, Computer-Assisted; Humans; Reproducibility of Results; Diagnosis, Computer-Assisted/methods; Diagnostic Imaging; Machine Learning
18.
JAMA Netw Open ; 6(2): e230524, 2023 02 01.
Article in English | MEDLINE | ID: mdl-36821110

ABSTRACT

Importance: An accurate and robust artificial intelligence (AI) algorithm for detecting cancer in digital breast tomosynthesis (DBT) could significantly improve detection accuracy and reduce health care costs worldwide. Objectives: To make training and evaluation data for the development of AI algorithms for DBT analysis available, to develop well-defined benchmarks, and to create publicly available code for existing methods. Design, Setting, and Participants: This diagnostic study is based on a multi-institutional international grand challenge in which research teams developed algorithms to detect lesions in DBT. A data set of 22 032 reconstructed DBT volumes was made available to research teams. Phase 1, in which teams were provided 700 scans from the training set, 120 from the validation set, and 180 from the test set, took place from December 2020 to January 2021, and phase 2, in which teams were given the full data set, took place from May to July 2021. Main Outcomes and Measures: The overall performance was evaluated by mean sensitivity for biopsied lesions using only DBT volumes with biopsied lesions; ties were broken by including all DBT volumes. Results: A total of 8 teams participated in the challenge. The team with the highest mean sensitivity for biopsied lesions was the NYU B-Team, with 0.957 (95% CI, 0.924-0.984), and the second-place team, ZeDuS, had a mean sensitivity of 0.926 (95% CI, 0.881-0.964). When the results were aggregated, the mean sensitivity for all submitted algorithms was 0.879; for only those who participated in phase 2, it was 0.926. Conclusions and Relevance: In this diagnostic study, an international competition produced algorithms with high sensitivity for using AI to detect lesions on DBT images. 
A standardized performance benchmark for the detection task using publicly available clinical imaging data was released, with detailed descriptions and analyses of submitted algorithms accompanied by a public release of their predictions and code for selected methods. These resources will serve as a foundation for future research on computer-assisted diagnosis methods for DBT, significantly lowering the barrier of entry for new researchers.


Subjects
Artificial Intelligence; Breast Neoplasms; Humans; Female; Benchmarking; Mammography/methods; Algorithms; Radiographic Image Interpretation, Computer-Assisted/methods; Breast Neoplasms/diagnostic imaging
19.
J Med Imaging (Bellingham) ; 9(3): 035502, 2022 May.
Article in English | MEDLINE | ID: mdl-35656541

ABSTRACT

Purpose: The aim of this study is to (1) demonstrate a graphical method and interpretation framework to extend performance evaluation beyond receiver operating characteristic curve analysis and (2) assess the impact of disease prevalence and variability in training and testing sets, particularly when a specific operating point is used. Approach: The proposed performance metric curves (PMCs) simultaneously assess sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), and the 95% confidence intervals thereof, as a function of the threshold for the decision variable. We investigated the utility of PMCs using six example operating points associated with commonly used methods to select operating points (including the Youden index and maximum mutual information). As an example, we applied PMCs to the task of distinguishing between malignant and benign breast lesions using human-engineered radiomic features extracted from dynamic contrast-enhanced magnetic resonance images. The dataset had 1885 lesions, with the images acquired in 2015 and 2016 serving as the training set (1450 lesions) and those acquired in 2017 as the test set (435 lesions). Our study used this dataset in two ways: (1) the clinical dataset itself and (2) simulated datasets with features based on the clinical set but with five different disease prevalences. The median and 95% CI of the number of type I (false positive) and type II (false negative) errors were determined for each operating point of interest. Results: PMCs from both the clinical and simulated datasets demonstrated that PMCs could support interpretation of the impact of decision threshold choice on type I and type II errors of classification, particularly relevant to prevalence. Conclusion: PMCs allow simultaneous evaluation of the four performance metrics of sensitivity, specificity, PPV, and NPV as a function of the decision threshold. 
This may create a better understanding of two-class classifier performance in machine learning.
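
The idea of tracing sensitivity, specificity, PPV, and NPV against the decision threshold can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`performance_metric_curves`, `youden_threshold`, `ppv_at_prevalence`), the convention that a score at or above the threshold counts as positive, and the toy data are all assumptions made for illustration. The 95% confidence intervals described in the abstract are omitted here.

```python
import numpy as np


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den > 0 else float("nan")


def performance_metric_curves(scores, labels, thresholds):
    """Sensitivity, specificity, PPV, and NPV at each decision threshold.

    Assumed convention: a case is called positive when score >= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    curves = {"sensitivity": [], "specificity": [], "ppv": [], "npv": []}
    for t in thresholds:
        called_pos = scores >= t
        tp = np.sum(called_pos & labels)    # true positives
        fp = np.sum(called_pos & ~labels)   # type I errors
        fn = np.sum(~called_pos & labels)   # type II errors
        tn = np.sum(~called_pos & ~labels)  # true negatives
        curves["sensitivity"].append(_ratio(tp, tp + fn))
        curves["specificity"].append(_ratio(tn, tn + fp))
        curves["ppv"].append(_ratio(tp, tp + fp))
        curves["npv"].append(_ratio(tn, tn + fn))
    return {name: np.array(vals) for name, vals in curves.items()}


def youden_threshold(curves, thresholds):
    """Threshold that maximizes Youden's J = sensitivity + specificity - 1."""
    j = curves["sensitivity"] + curves["specificity"] - 1.0
    return thresholds[int(np.nanargmax(j))]


def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """PPV implied by Bayes' rule at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Toy data (hypothetical): four lesions, two malignant.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
thresholds = [0.0, 0.35, 0.9]

curves = performance_metric_curves(scores, labels, thresholds)
best = youden_threshold(curves, thresholds)
```

Holding sensitivity and specificity fixed while lowering the prevalence passed to `ppv_at_prevalence` lowers the resulting PPV, which is the kind of prevalence dependence the abstract highlights for a fixed operating point.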

20.
J Med Imaging (Bellingham) ; 8(3): 031901, 2021 May.
Article in English | MEDLINE | ID: mdl-34179216

ABSTRACT

The editorial introduces the Special Section on Radiogenomics in Prognosis and Treatment for Volume 8 Issue 3 of the Journal of Medical Imaging.
