Results 1 - 13 of 13
4.
Nat Commun; 15(1): 524, 2024 Jan 15.
Article in English | MEDLINE | ID: mdl-38225244

ABSTRACT

Artificial intelligence (AI) systems have been shown to help dermatologists diagnose melanoma more accurately; however, they lack transparency, hindering user acceptance. Explainable AI (XAI) methods can help to increase transparency, yet often lack precise, domain-specific explanations. Moreover, the impact of XAI methods on dermatologists' decisions has not yet been evaluated. Building upon previous research, we introduce an XAI system that provides precise and domain-specific explanations alongside its differential diagnoses of melanomas and nevi. Through a three-phase study, we assess its impact on dermatologists' diagnostic accuracy, diagnostic confidence, and trust in the XAI support. Our results show strong alignment between XAI and dermatologist explanations. We also show that dermatologists' confidence in their diagnoses and their trust in the support system increase significantly with XAI compared to conventional AI. This study highlights dermatologists' willingness to adopt such XAI systems, promoting their future use in the clinic.
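As one illustration of how the reported alignment between XAI and dermatologist explanations could be quantified, the sketch below computes the intersection-over-union between a thresholded model saliency map and an expert-annotated region; the arrays, threshold, and function name are hypothetical and not taken from the study.

```python
# Minimal sketch (not the authors' code): overlap between a model saliency
# map and a dermatologist-annotated region as a proxy for explanation alignment.
import numpy as np

def explanation_iou(saliency: np.ndarray, expert_mask: np.ndarray,
                    quantile: float = 0.9) -> float:
    """Binarise the saliency map at a quantile threshold and compute
    intersection-over-union with the expert's binary annotation mask."""
    threshold = np.quantile(saliency, quantile)
    model_mask = saliency >= threshold
    intersection = np.logical_and(model_mask, expert_mask).sum()
    union = np.logical_or(model_mask, expert_mask).sum()
    return float(intersection) / float(union) if union > 0 else 0.0

# Hypothetical example: a 224x224 saliency map and a rectangular expert mask.
rng = np.random.default_rng(0)
saliency = rng.random((224, 224))
expert_mask = np.zeros((224, 224), dtype=bool)
expert_mask[80:150, 90:160] = True
print(f"Explanation IoU: {explanation_iou(saliency, expert_mask):.3f}")
```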


Subjects
Melanoma, Trust, Humans, Artificial Intelligence, Dermatologists, Melanoma/diagnosis, Differential Diagnosis
5.
JAMA Dermatol; 160(3): 303-311, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38324293

ABSTRACT

Importance: The development of artificial intelligence (AI)-based melanoma classifiers typically calls for large, centralized datasets, requiring hospitals to give away their patient data, which raises serious privacy concerns. To address this concern, decentralized federated learning has been proposed, where classifier development is distributed across hospitals. Objective: To investigate whether a more privacy-preserving federated learning approach can achieve comparable diagnostic performance to a classical centralized (ie, single-model) and ensemble learning approach for AI-based melanoma diagnostics. Design, Setting, and Participants: This multicentric, single-arm diagnostic study developed a federated model for melanoma-nevus classification using histopathological whole-slide images prospectively acquired at 6 German university hospitals between April 2021 and February 2023 and benchmarked it using both a holdout and an external test dataset. Data analysis was performed from February to April 2023. Exposures: All whole-slide images were retrospectively analyzed by an AI-based classifier without influencing routine clinical care. Main Outcomes and Measures: The area under the receiver operating characteristic curve (AUROC) served as the primary end point for evaluating the diagnostic performance. Secondary end points included balanced accuracy, sensitivity, and specificity. Results: The study included 1025 whole-slide images of clinically melanoma-suspicious skin lesions from 923 patients, consisting of 388 histopathologically confirmed invasive melanomas and 637 nevi. The median (range) age at diagnosis was 58 (18-95) years for the training set, 57 (18-93) years for the holdout test dataset, and 61 (18-95) years for the external test dataset; the median (range) Breslow thickness was 0.70 (0.10-34.00) mm, 0.70 (0.20-14.40) mm, and 0.80 (0.30-20.00) mm, respectively. The federated approach (0.8579; 95% CI, 0.7693-0.9299) performed significantly worse than the classical centralized approach (0.9024; 95% CI, 0.8379-0.9565) in terms of AUROC on a holdout test dataset (pairwise Wilcoxon signed-rank, P < .001) but performed significantly better (0.9126; 95% CI, 0.8810-0.9412) than the classical centralized approach (0.9045; 95% CI, 0.8701-0.9331) on an external test dataset (pairwise Wilcoxon signed-rank, P < .001). Notably, the federated approach performed significantly worse than the ensemble approach on both the holdout (0.8867; 95% CI, 0.8103-0.9481) and external test dataset (0.9227; 95% CI, 0.8941-0.9479). Conclusions and Relevance: The findings of this diagnostic study suggest that federated learning is a viable approach for the binary classification of invasive melanomas and nevi on a clinically representative distributed dataset. Federated learning can improve privacy protection in AI-based melanoma diagnostics while simultaneously promoting collaboration across institutions and countries. Moreover, it may have the potential to be extended to other image classification tasks in digital cancer histopathology and beyond.
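The core of such a federated setup is that only model weights, never patient images, leave each hospital. The sketch below shows one round of federated averaging (FedAvg) under that assumption; it is illustrative and not the study's actual implementation.

```python
# Minimal sketch: aggregate locally trained models by weighted averaging of
# their parameters (FedAvg). Hospital data never leaves its site.
from typing import Dict, List
import torch

def federated_average(local_states: List[Dict[str, torch.Tensor]],
                      sample_counts: List[int]) -> Dict[str, torch.Tensor]:
    """Average per-hospital state_dicts, weighted by local sample counts."""
    total = sum(sample_counts)
    averaged: Dict[str, torch.Tensor] = {}
    for key in local_states[0]:
        averaged[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(local_states, sample_counts)
        )
    return averaged

# Hypothetical usage: six hospitals each return a locally fine-tuned state_dict.
# global_model.load_state_dict(federated_average(hospital_states, hospital_sizes))
```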


Subjects
Dermatology, Melanoma, Nevus, Skin Neoplasms, Humans, Melanoma/diagnosis, Artificial Intelligence, Retrospective Studies, Skin Neoplasms/diagnosis, Nevus/diagnosis
8.
Eur J Cancer; 169: 146-155, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35569281

ABSTRACT

BACKGROUND: Targeted therapies for metastatic uveal melanoma have shown limited benefit in biomarker-unselected populations. The Treat20 Plus study prospectively evaluated the feasibility of a precision oncology strategy in routine clinical practice. PATIENTS AND METHODS: Fresh biopsies were analyzed by high-throughput genomics (whole-genome, whole-exome, and RNA sequencing). A multidisciplinary molecular and immunologic tumor board (MiTB) made individualized treatment recommendations based on the identified molecular aberrations, the patient's situation, and drug and clinical trial availability. Therapy selection was at the discretion of the treating physician. The primary endpoint was the feasibility of the precision oncology clinical program. RESULTS: Molecular analyses were available for 39/45 patients (87%). The MiTB provided treatment recommendations for 40/45 patients (89%), of whom 27/45 (60%) received ≥1 matched therapy. First-line matched therapies included MEK inhibitors (n = 15), MET inhibitors (n = 10), sorafenib (n = 1), and nivolumab (n = 1). The best response to first-line matched therapy was partial response in one patient (nivolumab based on tumor mutational burden), mixed response in two patients, and stable disease in 12 patients, for a clinical benefit rate of 56%. The matched therapy population had a median progression-free survival and overall survival of 3.3 and 13.9 months, respectively. The growth modulation index with matched therapy was >1.33 in 6/17 patients (35%) with prior systemic therapy, suggesting clinical benefit. CONCLUSIONS: A precision oncology approach was feasible for patients with metastatic uveal melanoma, with 60% receiving a therapy matched to identified molecular aberrations. The clinical benefit observed after checkpoint inhibition highlights the value of tumor mutational burden testing.
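The growth modulation index cited above is commonly computed as the ratio of progression-free survival on the current (matched) therapy to the time to progression on the most recent prior therapy, with values above 1.33 conventionally read as meaningful benefit. A small worked sketch with made-up numbers:

```python
# Minimal sketch: growth modulation index (GMI) = PFS on matched therapy /
# time to progression on the most recent prior therapy. Values are hypothetical.
def growth_modulation_index(pfs_current_months: float,
                            ttp_prior_months: float) -> float:
    return pfs_current_months / ttp_prior_months

pairs = [(6.2, 3.1), (2.0, 4.5), (5.0, 3.5)]  # hypothetical (PFS2, TTP1) pairs in months
for pfs2, ttp1 in pairs:
    gmi = growth_modulation_index(pfs2, ttp1)
    print(f"GMI = {gmi:.2f} -> {'benefit' if gmi > 1.33 else 'no benefit'}")
```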


Subjects
Second Primary Neoplasms, Uveal Neoplasms, Tumor Biomarkers/genetics, Feasibility Studies, Humans, Melanoma, Second Primary Neoplasms/drug therapy, Nivolumab/therapeutic use, Precision Medicine, Prospective Studies, Uveal Neoplasms/drug therapy, Uveal Neoplasms/genetics
9.
Eur J Cancer; 167: 54-69, 2022 May.
Article in English | MEDLINE | ID: mdl-35390650

ABSTRACT

BACKGROUND: Due to their ability to solve complex problems, deep neural networks (DNNs) are becoming increasingly popular in medical applications. However, decision-making by such algorithms is essentially a black-box process that renders it difficult for physicians to judge whether the decisions are reliable. The use of explainable artificial intelligence (XAI) is often suggested as a solution to this problem. We investigate how XAI is used for skin cancer detection: how is it used during the development of new DNNs? What kinds of visualisations are commonly used? Are there systematic evaluations of XAI with dermatologists or dermatopathologists? METHODS: Google Scholar, PubMed, IEEE Xplore, ScienceDirect and Scopus were searched for peer-reviewed studies published between January 2017 and October 2021 applying XAI to dermatological images: the search terms histopathological image, whole-slide image, clinical image, dermoscopic image, skin, dermatology, explainable, interpretable and XAI were used in various combinations. Only studies concerned with skin cancer were included. RESULTS: 37 publications fulfilled our inclusion criteria. Most studies (19/37) simply applied existing XAI methods to their classifier to interpret its decision-making. Some studies (4/37) proposed new XAI methods or improved upon existing techniques. 14/37 studies addressed specific questions such as bias detection and the impact of XAI on human-machine interactions. However, only three of them evaluated the performance and confidence of humans using computer-aided diagnosis (CAD) systems with XAI. CONCLUSION: XAI is commonly applied during the development of DNNs for skin cancer detection. However, a systematic and rigorous evaluation of its usefulness in this scenario is lacking.
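For readers unfamiliar with the pattern the review describes most often (applying an existing XAI method to a trained classifier), the sketch below computes plain gradient saliency for an arbitrary torchvision model; the model and input are placeholders, not taken from any reviewed study.

```python
# Minimal sketch: vanilla gradient saliency, one of the simplest existing XAI
# methods, applied post hoc to an image classifier.
import torch
from torchvision.models import resnet18

model = resnet18(weights=None)  # placeholder; in practice a fine-tuned skin-lesion classifier
model.eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # hypothetical input image
logits = model(image)
predicted_class = logits.argmax(dim=1).item()

# Gradient of the predicted-class logit with respect to the input pixels.
logits[0, predicted_class].backward()
saliency = image.grad.abs().max(dim=1).values  # (1, 224, 224) heat map
print(saliency.shape)
```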


Subjects
Artificial Intelligence, Skin Neoplasms, Algorithms, Humans, Neural Networks (Computer), Skin Neoplasms/diagnosis
10.
Eur J Cancer; 173: 307-316, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35973360

ABSTRACT

BACKGROUND: Image-based cancer classifiers suffer from a variety of problems that negatively affect their performance. For example, variation in image brightness or the use of different cameras can already suffice to diminish performance. Ensemble solutions, in which the predictions of multiple models are combined into one, can mitigate these problems. However, ensembles are computationally intensive and less transparent to practitioners than single-model solutions. Constructing model soups, by averaging the weights of multiple models into a single model, could circumvent these limitations while still improving performance. OBJECTIVE: To investigate the performance of model soups for a dermoscopic melanoma-nevus skin cancer classification task with respect to (1) generalisation to images from other clinics, (2) robustness against small image changes and (3) calibration, such that confidences correspond closely to the actual predictive uncertainties. METHODS: We construct model soups by fine-tuning pre-trained models on seven different image resolutions and subsequently averaging their weights. Performance is evaluated on a multi-source dataset including holdout and external components. RESULTS: We find that model soups improve generalisation and calibration on the external component while maintaining performance on the holdout component. For robustness, we observe performance improvements for perturbed test images, while the performance on corrupted test images remains on par. CONCLUSIONS: Overall, souping for skin cancer classifiers has a positive effect on generalisation, robustness and calibration. It is easy for practitioners to implement, and by combining multiple models into a single model, complexity is reduced. This could be an important factor in achieving clinical applicability, as less complexity generally means more transparency.
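A minimal sketch of the souping idea, assuming PyTorch checkpoints fine-tuned from the same pre-trained weights; this is illustrative and not the authors' code.

```python
# Minimal sketch: a uniform "model soup" averages the parameters of several
# fine-tuned checkpoints into one model, avoiding ensemble inference cost.
from typing import Dict, List
import torch

def make_soup(state_dicts: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
    """Uniformly average parameter tensors across fine-tuned checkpoints."""
    soup: Dict[str, torch.Tensor] = {}
    for key in state_dicts[0]:
        soup[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return soup

# Hypothetical usage: checkpoints fine-tuned at seven image resolutions.
# soup_state = make_soup([torch.load(path) for path in checkpoint_paths])
# model.load_state_dict(soup_state)
```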


Subjects
Melanoma, Skin Neoplasms, Dermoscopy/methods, Humans, Melanoma/diagnostic imaging, Sensitivity and Specificity, Skin Neoplasms/diagnostic imaging, Cutaneous Malignant Melanoma
11.
Eur J Cancer; 149: 94-101, 2021 May.
Article in English | MEDLINE | ID: mdl-33838393

ABSTRACT

BACKGROUND: Clinicians and pathologists traditionally use patient data in addition to clinical examination to support their diagnoses. OBJECTIVES: We investigated whether combining histologic whole-slide image (WSI) analysis based on convolutional neural networks (CNNs) with commonly available patient data (age, sex and anatomical site of the lesion) in a binary melanoma/nevus classification task could increase performance compared with CNNs alone. METHODS: We used 431 WSIs from two different laboratories and analysed the performance of classifiers that used the image data or the patient data individually, as well as three common fusion techniques. Furthermore, we tested a naive combination of patient data and an image classifier: for cases interpreted as 'uncertain' (CNN output score <0.7), the decision of the CNN was replaced by the decision of the patient data classifier. RESULTS: The CNN on its own achieved the best performance (mean ± standard deviation of five individual runs) with an AUROC of 92.30% ± 0.23% and a balanced accuracy of 83.17% ± 0.38%. While classification performance was not significantly improved in general by any of the tested fusions, the naive strategy of replacing the image classifier with the patient data classifier on slides with low output scores improved balanced accuracy to 86.72% ± 0.36%. CONCLUSION: In most cases, the CNN on its own was so accurate that patient data integration did not provide any benefit. However, incorporating patient data for lesions that were classified by the CNN with low 'confidence' improved balanced accuracy.
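The naive combination described above reduces to a simple decision rule. The sketch below interprets the 'output score' as the confidence of the CNN's predicted class, which is an assumption on our part; variable and function names are illustrative.

```python
# Minimal sketch of the fallback rule: keep the CNN decision when it is
# confident, otherwise defer to the patient-data (age/sex/site) classifier.
def fused_decision(cnn_is_melanoma: bool, cnn_confidence: float,
                   patient_data_is_melanoma: bool,
                   threshold: float = 0.7) -> bool:
    """Return True for melanoma, False for nevus."""
    if cnn_confidence >= threshold:
        return cnn_is_melanoma           # confident CNN: keep the image-based decision
    return patient_data_is_melanoma      # uncertain CNN: use the patient-data classifier

# Hypothetical case: CNN says nevus with confidence 0.62, patient data says melanoma.
print(fused_decision(cnn_is_melanoma=False, cnn_confidence=0.62,
                     patient_data_is_melanoma=True))
```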


Assuntos
Interpretação de Imagem Assistida por Computador , Melanoma/patologia , Microscopia , Redes Neurais de Computação , Nevo/patologia , Neoplasias Cutâneas/patologia , Adulto , Fatores Etários , Idoso , Bases de Dados Factuais , Feminino , Alemanha , Humanos , Masculino , Melanoma/classificação , Pessoa de Meia-Idade , Nevo/classificação , Valor Preditivo dos Testes , Reprodutibilidade dos Testes , Estudos Retrospectivos , Fatores Sexuais , Neoplasias Cutâneas/classificação
12.
Eur J Cancer; 155: 191-199, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34388516

ABSTRACT

BACKGROUND: One prominent application for deep learning-based classifiers is skin cancer classification on dermoscopic images. However, classifier evaluation is often limited to holdout data which can mask common shortcomings such as susceptibility to confounding factors. To increase clinical applicability, it is necessary to thoroughly evaluate such classifiers on out-of-distribution (OOD) data. OBJECTIVE: The objective of the study was to establish a dermoscopic skin cancer benchmark in which classifier robustness to OOD data can be measured. METHODS: Using a proprietary dermoscopic image database and a set of image transformations, we create an OOD robustness benchmark and evaluate the robustness of four different convolutional neural network (CNN) architectures on it. RESULTS: The benchmark contains three data sets, Skin Archive Munich (SAM), SAM-corrupted (SAM-C) and SAM-perturbed (SAM-P), and is publicly available for download. To maintain the benchmark's OOD status, ground truth labels are not provided and test results should be sent to us for assessment. The SAM data set contains 319 unmodified and biopsy-verified dermoscopic melanoma (n = 194) and nevus (n = 125) images. SAM-C and SAM-P contain images from SAM which were artificially modified to test a classifier against low-quality inputs and to measure its prediction stability over small image changes, respectively. All four CNNs showed susceptibility to corruptions and perturbations. CONCLUSIONS: This benchmark provides three data sets which allow for OOD testing of binary skin cancer classifiers. Our classifier performance confirms the shortcomings of CNNs and provides a frame of reference. Altogether, this benchmark should facilitate a more thorough evaluation process and thereby enable the development of more robust skin cancer classifiers.
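To make the corruption/perturbation idea concrete, the sketch below probes a classifier with two simple image modifications and measures how often its prediction flips; the transformations are illustrative and not the actual SAM-C/SAM-P pipelines.

```python
# Minimal sketch: compare predictions on clean versus artificially modified
# dermoscopic images to estimate robustness to low-quality or shifted inputs.
import torch

def gaussian_noise(images: torch.Tensor, sigma: float = 0.05) -> torch.Tensor:
    return (images + sigma * torch.randn_like(images)).clamp(0.0, 1.0)

def brightness_shift(images: torch.Tensor, delta: float = 0.2) -> torch.Tensor:
    return (images + delta).clamp(0.0, 1.0)

@torch.no_grad()
def prediction_flip_rate(model, images: torch.Tensor, corrupt) -> float:
    """Fraction of images whose predicted class changes after corruption."""
    clean_pred = model(images).argmax(dim=1)
    corrupted_pred = model(corrupt(images)).argmax(dim=1)
    return (clean_pred != corrupted_pred).float().mean().item()

# Hypothetical usage with a batch of dermoscopic images scaled to [0, 1]:
# print(prediction_flip_rate(model, batch, gaussian_noise))
# print(prediction_flip_rate(model, batch, brightness_shift))
```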


Subjects
Benchmarking/standards, Neural Networks (Computer), Skin Neoplasms/classification, Humans
13.
Eur J Cancer; 156: 202-216, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34509059

ABSTRACT

BACKGROUND: Multiple studies have compared the performance of artificial intelligence (AI)-based models for automated skin cancer classification to human experts, thus setting the cornerstone for a successful translation of AI-based tools into clinicopathological practice. OBJECTIVE: The objective of the study was to systematically analyse the current state of research on reader studies involving melanoma and to assess their potential clinical relevance by evaluating three main aspects: test set characteristics (holdout/out-of-distribution data set, composition), test setting (experimental/clinical, inclusion of metadata) and representativeness of participating clinicians. METHODS: PubMed, Medline and ScienceDirect were screened for peer-reviewed studies published between 2017 and 2021 and dealing with AI-based skin cancer classification involving melanoma. The search terms skin cancer classification, deep learning, convolutional neural network (CNN), melanoma (detection), digital biomarkers, histopathology and whole slide imaging were combined. Based on the search results, only studies that considered direct comparison of AI results with clinicians and had a diagnostic classification as their main objective were included. RESULTS: A total of 19 reader studies fulfilled the inclusion criteria. Of these, 11 CNN-based approaches addressed the classification of dermoscopic images; 6 concentrated on the classification of clinical images, whereas 2 dermatopathological studies utilised digitised histopathological whole slide images. CONCLUSIONS: All 19 included studies demonstrated superior or at least equivalent performance of CNN-based classifiers compared with clinicians. However, almost all studies were conducted in highly artificial settings based exclusively on single images of the suspicious lesions. Moreover, test sets mainly consisted of holdout images and did not represent the full range of patient populations and melanoma subtypes encountered in clinical practice.
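A typical reader-study comparison places each clinician's sensitivity/specificity operating point against the classifier's ROC curve on the same test set. The sketch below uses entirely synthetic scores and hypothetical reader values to illustrate that comparison; it does not reproduce any of the reviewed studies.

```python
# Minimal sketch: compare a classifier's ROC curve with clinician operating points.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)                                   # 0 = nevus, 1 = melanoma
y_score = np.clip(0.3 * y_true + rng.normal(0.4, 0.2, 200), 0.0, 1.0)   # toy classifier scores

fpr, tpr, _ = roc_curve(y_true, y_score)
print(f"Classifier AUROC: {roc_auc_score(y_true, y_score):.3f}")

# Hypothetical clinician operating points as (sensitivity, specificity).
readers = {"Reader A": (0.80, 0.75), "Reader B": (0.70, 0.85)}
for name, (sens, spec) in readers.items():
    # Classifier sensitivity at the reader's false-positive rate, read off the ROC curve.
    cnn_sens = float(np.interp(1.0 - spec, fpr, tpr))
    print(f"{name}: reader sensitivity {sens:.2f} vs classifier sensitivity "
          f"{cnn_sens:.2f} at specificity {spec:.2f}")
```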


Subjects
Dermatologists, Dermoscopy, Computer-Aided Diagnosis, Computer-Assisted Image Interpretation, Melanoma/pathology, Microscopy, Neural Networks (Computer), Pathologists, Skin Neoplasms/pathology, Automation, Biopsy, Clinical Competence, Deep Learning, Humans, Melanoma/classification, Predictive Value of Tests, Reproducibility of Results, Skin Neoplasms/classification