Results 1 - 20 of 79
1.
Article in English | MEDLINE | ID: mdl-38980777

ABSTRACT

Image analysis can play an important role in supporting histopathological diagnoses of lung cancer, with deep learning methods already achieving remarkable results. However, due to the large scale of whole-slide images (WSIs), creating manual pixel-wise annotations from expert pathologists is expensive and time-consuming. In addition, the heterogeneity of tumors and similarities in the morphological phenotype of tumor subtypes have caused inter-observer variability in annotations, which limits optimal performance. Effective use of weak labels could potentially alleviate these issues. In this paper, we propose a two-stage transformer-based weakly supervised learning framework called Simple Shuffle-Remix Vision Transformer (SSRViT). Firstly, we introduce a Shuffle-Remix Vision Transformer (SRViT) to retrieve discriminative local tokens and extract effective representative features. Then, the token features are selected and aggregated to generate sparse representations of WSIs, which are fed into a simple transformer-based classifier (SViT) for slide-level prediction. Experimental results demonstrate that the performance of our proposed SSRViT is significantly improved compared with other state-of-the-art methods in discriminating between adenocarcinoma, pulmonary sclerosing pneumocytoma and normal lung tissue (accuracy of 96.9% and AUC of 99.6%).

2.
PLoS One ; 19(5): e0301969, 2024.
Article in English | MEDLINE | ID: mdl-38771787

ABSTRACT

PURPOSE: This study aims to introduce an innovative multi-step pipeline for automatic tumor-stroma ratio (TSR) quantification as a potential prognostic marker for pancreatic cancer, addressing the limitations of existing staging systems and the lack of commonly used prognostic biomarkers. METHODS: The proposed approach involves a deep-learning-based method for the automatic segmentation of tumor epithelial cells, tumor bulk, and stroma from whole-slide images (WSIs). Models were trained using five-fold cross-validation and evaluated on an independent external test set. TSR was computed based on the segmented components. Additionally, TSR's predictive value for six-month survival on the independent external dataset was assessed. RESULTS: Median Dice (inter-quartile range (IQR)) was 0.751 (0.15) and 0.726 (0.25) for tumor epithelium segmentation on the internal and external test sets, respectively. Median Dice was 0.76 (0.11) and 0.863 (0.17) for tumor bulk segmentation on the internal and external test sets, respectively. TSR was evaluated as an independent prognostic marker, demonstrating a cross-validation AUC of 0.61±0.12 for predicting six-month survival on the external dataset. CONCLUSION: Our pipeline for automatic TSR quantification offers promising potential as a prognostic marker for pancreatic cancer. The results underscore the feasibility of computational biomarker discovery in enhancing patient outcome prediction, thus contributing to personalized patient management.
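
Note: the segmentation results above are reported as a median Dice score with its inter-quartile range. The following is a minimal sketch of how such a summary can be computed, not the authors' evaluation code. The function names, the flat 0/1 mask representation, the empty-mask convention and the choice of quartile method are all assumptions made for illustration.

```python
from statistics import median, quantiles


def dice(pred, ref):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def median_iqr(scores):
    """Median and inter-quartile range of per-image scores (needs >= 2 scores)."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q3 - q1


if __name__ == "__main__":
    # Toy per-image Dice scores, purely illustrative.
    print(median_iqr([0.6, 0.7, 0.8, 0.9]))
```

Quartile conventions differ between tools, so an IQR computed this way may deviate slightly from one produced by another statistics package.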


Subjects
Biomarkers, Tumor; Pancreatic Neoplasms; Humans; Pancreatic Neoplasms/pathology; Pancreatic Neoplasms/diagnosis; Pancreatic Neoplasms/mortality; Prognosis; Female; Stromal Cells/pathology; Male; Deep Learning; Aged; Middle Aged; Image Processing, Computer-Assisted/methods
3.
Nat Methods ; 21(2): 182-194, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38347140

ABSTRACT

Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multistage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides a reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Although focused on biomedical image analysis, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. The work serves to enhance global comprehension of a key topic in image analysis validation.


Subjects
Artificial Intelligence
4.
Nat Methods ; 21(2): 195-212, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38347141

ABSTRACT

Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint, a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases.


Subjects
Algorithms; Image Processing, Computer-Assisted; Machine Learning; Semantics
5.
Sci Rep ; 14(1): 5004, 2024 02 29.
Article in English | MEDLINE | ID: mdl-38424226

ABSTRACT

White matter hyperintensities (WMH) are the most prevalent markers of cerebral small vessel disease (SVD), which is the major vascular risk factor for dementia. Microvascular pathology and neuroinflammation are suggested to drive the transition from normal-appearing white matter (NAWM) to WMH, particularly in individuals with hypertension. However, current imaging techniques cannot capture ongoing NAWM changes. The transition from NAWM into WMH is a continuous process, yet white matter lesions are often examined dichotomously, which may explain their underlying heterogeneity. Therefore, we examined microvascular and neurovascular inflammation pathology in NAWM and severe WMH three-dimensionally, along with gradual magnetic resonance imaging (MRI) fluid-attenuated inversion recovery (FLAIR) signal (sub-)segmentation. In WMH, the vascular network exhibited reduced length and complexity compared to NAWM. Neuroinflammation was more severe in WMH. Vascular inflammation was more pronounced in NAWM, suggesting its potential significance in converting NAWM into WMH. Moreover, the (sub-)segmentation of FLAIR signal displayed varying degrees of vascular pathology, particularly within WMH regions. These findings highlight the intricate interplay between microvascular pathology and neuroinflammation in the transition from NAWM to WMH. Further examination of neurovascular inflammation across MRI-visible alterations could deepen our understanding of WMH conversion and, with it, of how to improve the prognosis of SVD.


Subjects
White Matter; Humans; White Matter/pathology; Neuroinflammatory Diseases; Magnetic Resonance Imaging/methods; Inflammation/diagnostic imaging; Inflammation/pathology; Risk Factors
6.
Med Image Anal ; 93: 103063, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38194735

ABSTRACT

The frequency of basal cell carcinoma (BCC) cases is putting an increasing strain on dermatopathologists. BCC is the most common type of skin cancer, and its incidence is increasing rapidly worldwide. AI can play a significant role in reducing the time and effort required for BCC diagnostics and thus improve the overall efficiency of the process. However, training such an AI system in a fully supervised fashion would require a large amount of pixel-level annotation by already strained dermatopathologists. Therefore, in this study, our primary objective was to develop a weakly supervised method for the identification of basal cell carcinoma (BCC) and the stratification of BCC into low-risk and high-risk categories within histopathology whole-slide images (WSI). We compared Clustering-constrained Attention Multiple instance learning (CLAM) with StreamingCLAM and hypothesized that the latter would be the superior approach. A total of 5147 images were used to train and validate the models, which were subsequently tested on an internal set of 949 images and an external set of 183 images. The labels for training were automatically extracted from free-text pathology reports using a rule-based approach. All data has been made available through the COBRA dataset. The results showed that both the CLAM and StreamingCLAM models achieved high performance for the detection of BCC, with an area under the ROC curve (AUC) of 0.994 and 0.997, respectively, on the internal test set and 0.983 and 0.993 on the external dataset. Furthermore, the models performed well on risk stratification, with AUC values of 0.912 and 0.931, respectively, on the internal set, and 0.851 and 0.883 on the external set. On every metric, the StreamingCLAM model outperformed the CLAM model or was on par with it. The performance of both models was comparable to that of two pathologists who scored 240 BCC-positive slides. Additionally, in the public test set, StreamingCLAM demonstrated a comparable AUC of 0.958, markedly superior to CLAM's 0.803. This difference was statistically significant and emphasized the strength and better adaptability of the StreamingCLAM approach.


Subjects
Carcinoma, Basal Cell; Skin Neoplasms; Humans; Carcinoma, Basal Cell/diagnostic imaging; Area Under Curve; Skin Neoplasms/diagnostic imaging; Supervised Machine Learning
7.
Med Image Anal ; 93: 103088, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38228075

ABSTRACT

The ability to detect anomalies, i.e. anything not seen during training or out-of-distribution (OOD), in medical imaging applications is essential for successfully deploying machine learning systems. Filtering out OOD data using unsupervised learning is especially promising because it does not require costly annotations. A new class of models called AnoDDPMs, based on denoising diffusion probabilistic models (DDPMs), has recently achieved significant progress in unsupervised OOD detection. This work provides a benchmark for unsupervised OOD detection methods in digital pathology. By leveraging fast sampling techniques, we apply AnoDDPM on a large enough scale for whole-slide image analysis on the complete test set of the Camelyon16 challenge. Based on ROC analysis, we show that AnoDDPMs can detect OOD data with an AUC of up to 94.13 and 86.93 on two patch-level OOD detection tasks, outperforming the other unsupervised methods. We observe that AnoDDPMs alter the semantic properties of inputs, replacing anomalous data with more benign-looking tissue. Furthermore, we highlight the flexibility of AnoDDPM towards different information bottlenecks by evaluating reconstruction errors for inputs with different signal-to-noise ratios. While there is still a significant performance gap with fully supervised learning, AnoDDPMs show considerable promise in the field of OOD detection in digital pathology.


Subjects
Benchmarking; Image Processing, Computer-Assisted; Humans; Diffusion; Machine Learning; Models, Statistical
8.
Comput Biol Med ; 170: 108018, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38281317

ABSTRACT

In histopathology practice, scanners, tissue processing, staining, and image acquisition protocols vary from center to center, resulting in subtle variations in images. Vanilla convolutional neural networks are sensitive to such domain shifts. Data augmentation is a popular way to improve domain generalization. Currently, state-of-the-art domain generalization in computational pathology is achieved using a manually curated set of augmentation transforms. However, manual tuning of augmentation parameters is time-consuming and can lead to sub-optimal generalization performance. Meta-learning frameworks can provide efficient ways to find optimal training hyper-parameters, including data augmentation. In this study, we hypothesize that an automated search of augmentation hyper-parameters can provide superior generalization performance and reduce experimental optimization time. We select four state-of-the-art automatic augmentation methods from general computer vision and investigate their capacity to improve domain generalization in histopathology. We analyze their performance on data from 25 centers across two different tasks: tumor metastasis detection in lymph nodes and breast cancer tissue type classification. On tumor metastasis detection, most automatic augmentation methods achieve comparable performance to state-of-the-art manual augmentation. On breast cancer tissue type classification, the leading automatic augmentation method significantly outperforms state-of-the-art manual data augmentation.


Subjects
Breast Neoplasms; Deep Learning; Humans; Female; Image Processing, Computer-Assisted/methods; Neural Networks, Computer; Breast
9.
Sci Rep ; 14(1): 1497, 2024 01 17.
Article in English | MEDLINE | ID: mdl-38233535

ABSTRACT

Whole-mount sectioning is a technique in histopathology where a full slice of tissue, such as a transversal cross-section of a prostate specimen, is prepared on a large microscope slide without further sectioning into smaller fragments. Although this technique can offer improved correlation with pre-operative imaging and is paramount for multimodal research, it is not commonly employed due to its technical difficulty, associated cost and cumbersome integration in (digital) pathology workflows. In this work, we present a computational tool named PythoStitcher which reconstructs artificial whole-mount sections from digitized tissue fragments, thereby bringing the benefits of whole-mount sections to pathology labs currently unable to employ this technique. Our proposed algorithm consists of a multi-step approach where it (i) automatically determines how fragments need to be reassembled, (ii) iteratively optimizes the stitch using a genetic algorithm and (iii) efficiently reconstructs the final artificial whole-mount section on full resolution (0.25 µm/pixel). PythoStitcher was validated on a total of 198 cases spanning five datasets with a varying number of tissue fragments originating from different organs from multiple centers. PythoStitcher successfully reconstructed the whole-mount section in 86-100% of cases for a given dataset with a residual registration mismatch of 0.65-2.76 mm on automatically selected landmarks. It is expected that our algorithm can aid pathology labs unable to employ whole-mount sectioning through faster clinical case evaluation and improved radiology-pathology correlation workflows.


Subjects
Algorithms; Diagnostic Imaging; Image Processing, Computer-Assisted; Humans
10.
ArXiv ; 2024 Feb 23.
Article in English | MEDLINE | ID: mdl-36945687

ABSTRACT

Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation.

11.
Pathobiology ; 91(1): 8-17, 2024.
Article in English | MEDLINE | ID: mdl-36791682

ABSTRACT

The expanding digitalization of routine diagnostic histological slides holds a potential to apply artificial intelligence (AI) to pathology, including bone marrow (BM) histology. In this perspective, we describe potential tasks in diagnostics that can be supported, investigations that can be guided, and questions that can be answered by the future application of AI on whole-slide images of BM biopsies. These range from characterization of cell lineages and quantification of cells and stromal structures to disease prediction. First glimpses show an exciting potential to detect subtle phenotypic changes with AI that are due to specific genotypes. The discussion is illustrated by examples of current AI research using BM biopsy slides. In addition, we briefly discuss current challenges for implementation of AI-supported diagnostics.


Subjects
Artificial Intelligence; Bone Marrow; Humans; Biopsy; Cell Lineage; Genotype
12.
Article in English | MEDLINE | ID: mdl-37831571

ABSTRACT

Many inherently ambiguous tasks in medical imaging suffer from inter-observer variability, resulting in a reference standard defined by a distribution of labels with high variance. Training only on a consensus or majority vote label, as is common in medical imaging, discards valuable information on uncertainty amongst a panel of experts. In this work, we propose to train on the full label distribution to predict the uncertainty within a panel of experts and the most likely ground-truth label. To do so, we propose a new stochastic classification framework based on the conditional variational auto-encoder, which we refer to as the Latent Doctor Model (LDM). In an extensive comparative analysis, we compare the LDM with a model trained on the majority vote label and other methods capable of learning a distribution of labels. We show that the LDM is able to reproduce the reference-standard distribution significantly better than the majority vote baseline. Compared to the other baseline methods, we demonstrate that the LDM performs best at modeling the label distribution and its corresponding uncertainty in two prostate tumor grading tasks. Furthermore, we show competitive performance of the LDM with the more computationally demanding deep ensembles on a tumor budding classification task.
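
Note: the contrast drawn above between training on a majority-vote label and training on the full label distribution can be illustrated with a small sketch. This is not the Latent Doctor Model, which the abstract describes as a conditional variational auto-encoder; it only shows the simpler idea of a soft target built from panel votes. The class names, votes and predicted probabilities are hypothetical.

```python
import math
from collections import Counter


def label_distribution(votes, classes):
    """Empirical distribution of expert votes over the given classes."""
    counts = Counter(votes)
    return [counts[c] / len(votes) for c in classes]


def majority_one_hot(votes, classes):
    """One-hot target for the majority vote (ties go to the earlier class)."""
    counts = Counter(votes)
    winner = max(classes, key=lambda c: counts[c])
    return [1.0 if c == winner else 0.0 for c in classes]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy of a predicted distribution against a target distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


if __name__ == "__main__":
    classes = ["A", "B", "C"]            # hypothetical grades
    votes = ["A", "A", "B", "A", "B"]    # hypothetical panel of five experts
    soft = label_distribution(votes, classes)
    hard = majority_one_hot(votes, classes)
    overconfident = [0.98, 0.01, 0.01]
    calibrated = [0.60, 0.39, 0.01]
    # The soft target penalizes the overconfident prediction; the hard target rewards it.
    print(cross_entropy(soft, overconfident), cross_entropy(soft, calibrated))
    print(cross_entropy(hard, overconfident), cross_entropy(hard, calibrated))
```

With a one-hot majority target, the disagreement of two of the five experts is invisible to the loss; with the soft target it is not.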

13.
Med Image Anal ; 88: 102881, 2023 08.
Article in English | MEDLINE | ID: mdl-37437452

ABSTRACT

Current hardware limitations make it impossible to train convolutional neural networks on gigapixel image inputs directly. Recent developments in weakly supervised learning, such as attention-gated multiple instance learning, have shown promising results, but often use multi-stage or patch-wise training strategies risking suboptimal feature extraction, which can negatively impact performance. In this paper, we propose to train a ResNet-34 encoder with an attention-gated classification head in an end-to-end fashion, which we call StreamingCLAM, using a streaming implementation of convolutional layers. This allows us to train end-to-end on 4-gigapixel microscopic images using only slide-level labels. We achieve a mean area under the receiver operating characteristic curve of 0.9757 for metastatic breast cancer detection (CAMELYON16), close to fully supervised approaches using pixel-level annotations. Our model can also detect MYC-gene translocation in histologic slides of diffuse large B-cell lymphoma, achieving a mean area under the ROC curve of 0.8259. Furthermore, we show that our model offers a degree of interpretability through the attention mechanism.
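
Note: the abstract above mentions an attention-gated classification head but does not spell out its form. As a rough illustration, the sketch below uses the gated-attention pooling commonly found in the multiple instance learning literature: each instance embedding gets a score from a tanh branch multiplied by a sigmoid gate, the scores are softmax-normalized, and the embeddings are averaged with those weights. All names, shapes and toy weights are assumptions, and the pure-Python lists stand in for tensors.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def gated_attention_pool(instances, V, U, w):
    """Return (attention weights, pooled embedding) for a bag of instances.

    instances: K embeddings of dimension D; V, U: L x D matrices; w: length L.
    """
    scores = []
    for h in instances:
        tanh_branch = [math.tanh(x) for x in matvec(V, h)]
        gate = [1.0 / (1.0 + math.exp(-x)) for x in matvec(U, h)]
        scores.append(sum(wi * t * g for wi, t, g in zip(w, tanh_branch, gate)))
    # Numerically stable softmax over the K instance scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    attention = [e / norm for e in exps]
    dim = len(instances[0])
    pooled = [sum(a * h[d] for a, h in zip(attention, instances)) for d in range(dim)]
    return attention, pooled


if __name__ == "__main__":
    bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy instance embeddings
    V = [[1.0, 0.0], [0.0, 1.0]]
    U = [[0.0, 0.0], [0.0, 0.0]]                 # gate fixed at 0.5 for the toy case
    w = [1.0, 0.0]
    print(gated_attention_pool(bag, V, U, w))
```

The attention weights are what give such a head a degree of interpretability: instances with larger weights contribute more to the slide-level prediction.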


Subjects
Breast Neoplasms; Neural Networks, Computer; Humans; Female; Breast Neoplasms/diagnostic imaging; Breast Neoplasms/pathology; ROC Curve
14.
Clin Genitourin Cancer ; 21(5): e352-e361, 2023 10.
Article in English | MEDLINE | ID: mdl-37164814

ABSTRACT

INTRODUCTION: Prostate specific membrane antigen (PSMA) directed radioligand therapy (RLT) is a novel therapy for metastatic castration-resistant prostate cancer (mCRPC) patients. However, it is still poorly understood why approximately 40% of the patients do not respond to PSMA-RLT. The aims of this study were to evaluate the pretreatment PSMA expression on immunohistochemistry (IHC) and PSMA uptake on PET/CT imaging in mCRPC patients who underwent PSMA-RLT. We correlated these parameters and a cell proliferation marker (Ki67) to the therapeutic efficacy of PSMA-RLT. PATIENTS AND METHODS: In this retrospective study, mCRPC patients who underwent PSMA-RLT were analyzed. Patient biopsies were scored for immunohistochemical Ki67 expression, PSMA staining intensity and percentage of cells with PSMA expression. Moreover, the PSMA tracer uptake of the tumor lesion(s) and healthy organs on PET/CT imaging was assessed. The primary outcome was to evaluate the association between histological PSMA protein expression of tumor in pre-PSMA-RLT biopsies and the PSMA uptake on PSMA PET/CT imaging of the biopsied lesion. Secondary outcomes were to assess the relationship between PSMA expression and Ki67 on IHC and the progression free survival (PFS) and overall survival (OS) following PSMA-RLT. RESULTS: In total, 22 mCRPC patients were included in this study. Nineteen (86%) patients showed a high and homogeneous PSMA expression of >80% on IHC. Three (14%) patients had low PSMA expression on IHC. Although there was limited PSMA uptake on PET/CT imaging, these 3 patients had lower PSMA uptake on PET/CT imaging compared to the patients with high PSMA expression on IHC. Yet, no correlation was found between PSMA uptake on PET/CT imaging and PSMA expression on IHC (SUVmax: R2 = 0.046 and SUVavg: R2 = 0.036). The 3 patients had a shorter PFS compared to the patients with high PSMA expression on IHC (HR: 4.76, 95% CI: 1.14-19.99; P = .033). Patients with low Ki67 expression had a longer PFS and OS compared to patients with a high Ki67 expression (HR: 0.40, 95% CI: 0.15-1.06; P = .013). CONCLUSION: The PSMA uptake on PSMA-PET/CT generally followed the PSMA expression on IHC. However, heterogeneity may be missed on PSMA-PET/CT. Immunohistochemical PSMA and Ki67 expression in fresh tumor biopsies may help predict the treatment efficacy of PSMA-RLT in mCRPC patients. This needs to be further explored in prospective cohorts.


Subjects
Positron Emission Tomography Computed Tomography; Prostatic Neoplasms, Castration-Resistant; Male; Humans; Ki-67 Antigen; Positron Emission Tomography Computed Tomography/methods; Prostatic Neoplasms, Castration-Resistant/diagnostic imaging; Prostatic Neoplasms, Castration-Resistant/radiotherapy; Prostatic Neoplasms, Castration-Resistant/metabolism; Retrospective Studies; Prospective Studies; Prostate-Specific Antigen; Dipeptides/therapeutic use; Treatment Outcome; Biopsy
15.
Med Image Anal ; 85: 102755, 2023 04.
Article in English | MEDLINE | ID: mdl-36724605

ABSTRACT

Recently, large, high-quality public datasets have led to the development of convolutional neural networks that can detect lymph node metastases of breast cancer at the level of expert pathologists. Many cancers, regardless of the site of origin, can metastasize to lymph nodes. However, collecting and annotating high-volume, high-quality datasets for every cancer type is challenging. In this paper we investigate how to leverage existing high-quality datasets most efficiently in multi-task settings for closely related tasks. Specifically, we explore different training and domain adaptation strategies, including prevention of catastrophic forgetting, for breast, colon and head-and-neck cancer metastasis detection in lymph nodes. Our results show state-of-the-art performance on colon and head-and-neck cancer metastasis detection tasks. We show the effectiveness of adaptation of networks from one cancer type to another to obtain multi-task metastasis detection networks. Furthermore, we show that leveraging existing high-quality datasets can significantly boost performance on new target tasks and that catastrophic forgetting can be effectively mitigated. Last, we compare different mitigation strategies.


Subjects
Breast Neoplasms; Head and Neck Neoplasms; Humans; Female; Lymphatic Metastasis/pathology; Neural Networks, Computer; Lymph Nodes/pathology; Breast Neoplasms/pathology
16.
Med Image Anal ; 83: 102655, 2023 01.
Article in English | MEDLINE | ID: mdl-36306568

ABSTRACT

Machine learning model deployment in clinical practice demands real-time risk assessment to identify situations in which the model is uncertain. Once deployed, models should be accurate for classes seen during training while providing informative estimates of uncertainty to flag abnormalities and unseen classes for further analysis. Although recent developments in uncertainty estimation have resulted in an increasing number of methods, a rigorous empirical evaluation of their performance on large-scale digital pathology datasets is lacking. This work provides a benchmark for evaluating prevalent methods on multiple datasets by comparing the uncertainty estimates on both in-distribution and realistic near and far out-of-distribution (OOD) data on a whole-slide level. To this end, we aggregate uncertainty values from patch-based classifiers to whole-slide level uncertainty scores. We show that results found in classical computer vision benchmarks do not always translate to the medical imaging setting. Specifically, we demonstrate that deep ensembles perform best at detecting far-OOD data but can be outperformed on a more challenging near-OOD detection task by multi-head ensembles trained for optimal ensemble diversity. Furthermore, we demonstrate the harmful impact OOD data can have on the performance of deployed machine learning models. Overall, we show that uncertainty estimates can be used to discriminate in-distribution from OOD data with high AUC scores. Still, model deployment might require careful tuning based on prior knowledge of prospective OOD data.
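
Note: the abstract above describes aggregating patch-level uncertainty values into a whole-slide score and then judging OOD detection by AUC, without stating the uncertainty measure or the aggregation rule. The sketch below assumes predictive entropy per patch and a plain mean over patches, and computes AUC as the fraction of (OOD, in-distribution) slide pairs ranked correctly. The function names and toy probabilities are hypothetical.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy of one patch's class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def slide_uncertainty(patch_probs):
    """Assumed aggregation: mean patch entropy over the whole slide."""
    entropies = [predictive_entropy(p) for p in patch_probs]
    return sum(entropies) / len(entropies)


def auc(ood_scores, in_scores):
    """Probability that an OOD slide scores higher than an in-distribution one."""
    wins = 0.0
    for s_ood in ood_scores:
        for s_in in in_scores:
            if s_ood > s_in:
                wins += 1.0
            elif s_ood == s_in:
                wins += 0.5  # ties count as half a win
    return wins / (len(ood_scores) * len(in_scores))


if __name__ == "__main__":
    in_slide = [[0.99, 0.01], [0.95, 0.05]]   # confident patches
    ood_slide = [[0.60, 0.40], [0.50, 0.50]]  # uncertain patches
    scores_in = [slide_uncertainty(in_slide)]
    scores_ood = [slide_uncertainty(ood_slide)]
    print(auc(scores_ood, scores_in))
```

The pairwise AUC is quadratic in the number of slides, which is acceptable for a sketch. A mean is only one possible aggregation; a maximum or an upper percentile would emphasize small anomalous regions instead.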


Subjects
Machine Learning; Pathology; Humans; Prospective Studies
17.
Data Brief ; 45: 108739, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36426089

ABSTRACT

In the present work, we present a publicly available, expert-segmented representative dataset of 158 3.0 Tesla biparametric MRIs [1]. There is an increasing number of studies investigating prostate and prostate carcinoma segmentation using deep learning (DL) with 3D architectures [2], [3], [4], [5], [6], [7]. The development of robust and data-driven DL models for prostate segmentation and assessment is currently constrained by the limited number of openly available expert-annotated datasets [8], [9], [10]. The dataset contains 3.0 Tesla MRI images of the prostate of patients with suspected prostate cancer. Patients over 50 years of age who had a 3.0 Tesla MRI scan of the prostate that met PI-RADS version 2.1 technical standards were included. All patients received a subsequent biopsy or surgery so that the MRI diagnosis could be verified/matched with the histopathologic diagnosis. For patients who had undergone multiple MRIs, the last MRI, which was less than six months before biopsy/surgery, was included. All patients were examined at a German university hospital (Charité Universitätsmedizin Berlin) between 02/2016 and 01/2020. All MRIs were acquired with two 3.0 Tesla MRI scanners (Siemens VIDA and Skyra, Siemens Healthineers, Erlangen, Germany). Axial T2W sequences and axial diffusion-weighted sequences (DWI) with apparent diffusion coefficient maps (ADC) were included in the data set. T2W sequences and ADC maps were annotated by two board-certified radiologists with 6 and 8 years of experience, respectively. For T2W sequences, the central gland (central zone and transitional zone) and peripheral zone were segmented. If areas of suspected prostate cancer (PIRADS score of ≥ 4) were identified on examination, they were segmented in both the T2W sequences and ADC maps. Because restricted diffusion is best seen in DWI images with high b-values, only these images were selected and all images with low b-values were discarded. Data were then anonymized and converted to NIfTI (Neuroimaging Informatics Technology Initiative) format.

18.
Cancers (Basel) ; 14(21)2022 Nov 03.
Article in English | MEDLINE | ID: mdl-36358842

ABSTRACT

Poor generalizability is a major barrier to clinical implementation of artificial intelligence in digital pathology. The aim of this study was to test the generalizability of a pretrained deep learning model to a new diagnostic setting and to a small change in surgical indication. A deep learning model for breast cancer metastases detection in sentinel lymph nodes, trained on CAMELYON multicenter data, was used as a base model, and achieved an AUC of 0.969 (95% CI 0.926-0.998) and FROC of 0.838 (95% CI 0.757-0.913) on CAMELYON16 test data. On local sentinel node data, the base model performance dropped to AUC 0.929 (95% CI 0.800-0.998) and FROC 0.744 (95% CI 0.566-0.912). On data with a change in surgical indication (axillary dissections) the base model performance indicated an even larger drop with a FROC of 0.503 (95% CI 0.201-0.911). The model was retrained with addition of local data, resulting in about a 4% increase for both AUC and FROC for sentinel nodes, and an increase of 11% in AUC and 49% in FROC for axillary nodes. Pathologist qualitative evaluation of the retrained model's output showed no missed positive slides. False positives, false negatives and one previously undetected micro-metastasis were observed. The study highlights the generalization challenge even when using a multicenter trained model, and that a small change in indication can considerably impact the model's performance.

19.
Nat Commun ; 13(1): 4128, 2022 07 15.
Article in English | MEDLINE | ID: mdl-35840566

ABSTRACT

International challenges have become the de facto standard for comparative assessment of image analysis algorithms. Although segmentation is the most widely investigated medical image processing task, the various challenges have been organized to focus only on specific clinical tasks. We organized the Medical Segmentation Decathlon (MSD), a biomedical image analysis challenge in which algorithms compete in a multitude of both tasks and modalities, to investigate the hypothesis that a method capable of performing well on multiple tasks will generalize well to a previously unseen task and potentially outperform a custom-designed solution. MSD results confirmed this hypothesis; moreover, the MSD winner continued to generalize well to a wide range of other clinical problems for the next two years. Three main conclusions can be drawn from this study: (1) state-of-the-art image segmentation algorithms generalize well when retrained on unseen tasks; (2) consistent algorithmic performance across multiple tasks is a strong surrogate of algorithmic generalizability; (3) the training of accurate AI segmentation models is now commoditized to scientists that are not versed in AI model training.


Subjects
Algorithms; Image Processing, Computer-Assisted; Image Processing, Computer-Assisted/methods
20.
Comput Biol Med ; 148: 105817, 2022 09.
Article in English | MEDLINE | ID: mdl-35841780

ABSTRACT

BACKGROUND: The development of deep learning (DL) models for prostate segmentation on magnetic resonance imaging (MRI) depends on expert-annotated data and reliable baselines, which are often not publicly available. This limits both reproducibility and comparability. METHODS: Prostate158 consists of 158 expert annotated biparametric 3T prostate MRIs comprising T2w sequences and diffusion-weighted sequences with apparent diffusion coefficient maps. Two U-ResNets trained for segmentation of anatomy (central gland, peripheral zone) and suspicious lesions for prostate cancer (PCa) with a PI-RADS score of ≥4 served as baseline algorithms. Segmentation performance was evaluated using the Dice similarity coefficient (DSC), the Hausdorff distance (HD), and the average surface distance (ASD). The Wilcoxon test with Bonferroni correction was used to evaluate differences in performance. The generalizability of the baseline model was assessed using the open datasets Medical Segmentation Decathlon and PROSTATEx. RESULTS: Compared to Reader 1, the models achieved a DSC/HD/ASD of 0.88/18.3/2.2 for the central gland, 0.75/22.8/1.9 for the peripheral zone, and 0.45/36.7/17.4 for PCa. Compared with Reader 2, the DSC/HD/ASD were 0.88/17.5/2.6 for the central gland, 0.73/33.2/1.9 for the peripheral zone, and 0.4/39.5/19.1 for PCa. Interrater agreement measured in DSC/HD/ASD was 0.87/11.1/1.0 for the central gland, 0.75/15.8/0.74 for the peripheral zone, and 0.6/18.8/5.5 for PCa. Segmentation performances on the Medical Segmentation Decathlon and PROSTATEx were 0.82/22.5/3.4; 0.86/18.6/2.5 for the central gland, and 0.64/29.2/4.7; 0.71/26.3/2.2 for the peripheral zone. CONCLUSIONS: We provide an openly accessible, expert-annotated 3T dataset of prostate MRI and a reproducible benchmark to foster the development of prostate segmentation algorithms.


Subjects
Prostate; Prostatic Neoplasms; Algorithms; Humans; Magnetic Resonance Imaging; Male; Reproducibility of Results; Retrospective Studies