Search | BVS Enfermería Search Portal

HER2 challenge contest: a detailed assessment of automated HER2 scoring algorithms in whole slide images of breast cancer tissues.

Qaiser, Talha; Mukherjee, Abhik; Reddy Pb, Chaitanya; Munugoti, Sai D; Tallam, Vamsi; Pitkäaho, Tomi; Lehtimäki, Taina; Naughton, Thomas; Berseth, Matt; Pedraza, Aníbal; Mukundan, Ramakrishnan; Smith, Matthew; Bhalerao, Abhir; Rodner, Erik; Simon, Marcel; Denzler, Joachim; Huang, Chao-Hui; Bueno, Gloria; Snead, David; Ellis, Ian O; Ilyas, Mohammad; Rajpoot, Nasir.

Histopathology; 72(2): 227-238, 2018 Jan.

Article in English | MEDLINE | ID: mdl-28771788

ABSTRACT

AIMS: Evaluating expression of the human epidermal growth factor receptor 2 (HER2) by visual examination of immunohistochemistry (IHC) on invasive breast cancer (BCa) is a key part of the diagnostic assessment of BCa because of its recognized importance as a predictive and prognostic marker in clinical practice. However, visual scoring of HER2 is subjective and consequently prone to interobserver variability. Given the prognostic and therapeutic implications of HER2 scoring, a more objective method is required. In this paper, we report on a recent automated HER2 scoring contest, held in conjunction with the annual PathSoc meeting in Nottingham in June 2016, aimed at systematically comparing and advancing state-of-the-art artificial intelligence (AI)-based automated methods for HER2 scoring. METHODS AND RESULTS: The contest data set comprised digitized whole slide images (WSI) of sections from 86 cases of invasive breast carcinoma stained with both haematoxylin and eosin (H&E) and IHC for HER2. The contesting algorithms automatically predicted scores of the IHC slides for an unseen subset of the data set, and the predicted scores were compared with the 'ground truth' (a consensus score from at least two experts). We also report on a simple 'Man versus Machine' contest for the scoring of HER2 and show that the automated methods could beat the pathology experts on this contest data set. CONCLUSIONS: This paper presents a benchmark for comparing the performance of automated algorithms for scoring of HER2. It also demonstrates the enormous potential of automated algorithms in assisting the pathologist with objective IHC scoring.

Subject(s)

Algorithms; Biomarkers, Tumor/analysis; Breast Neoplasms/diagnosis; Image Interpretation, Computer-Assisted/methods; Receptor, ErbB-2/analysis; Female; Humans; Immunohistochemistry

Diagnostic Assessment of Deep Learning Algorithms for Detection of Lymph Node Metastases in Women With Breast Cancer.

Ehteshami Bejnordi, Babak; Veta, Mitko; Johannes van Diest, Paul; van Ginneken, Bram; Karssemeijer, Nico; Litjens, Geert; van der Laak, Jeroen A W M; Hermsen, Meyke; Manson, Quirine F; Balkenhol, Maschenka; Geessink, Oscar; Stathonikos, Nikolaos; van Dijk, Marcory Crf; Bult, Peter; Beca, Francisco; Beck, Andrew H; Wang, Dayong; Khosla, Aditya; Gargeya, Rishab; Irshad, Humayun; Zhong, Aoxiao; Dou, Qi; Li, Quanzheng; Chen, Hao; Lin, Huang-Jing; Heng, Pheng-Ann; Haß, Christian; Bruni, Elia; Wong, Quincy; Halici, Ugur; Öner, Mustafa Ümit; Cetin-Atalay, Rengul; Berseth, Matt; Khvatkov, Vitali; Vylegzhanin, Alexei; Kraus, Oren; Shaban, Muhammad; Rajpoot, Nasir; Awan, Ruqayya; Sirinukunwattana, Korsuk; Qaiser, Talha; Tsang, Yee-Wah; Tellez, David; Annuscheit, Jonas; Hufnagl, Peter; Valkonen, Mira; Kartasalo, Kimmo; Latonen, Leena; Ruusuvuori, Pekka; Liimatainen, Kaisa.

JAMA; 318(22): 2199-2210, 2017 Dec 12.

Article in English | MEDLINE | ID: mdl-29234806

ABSTRACT

Importance: Application of deep learning algorithms to whole-slide pathology images can potentially improve diagnostic accuracy and efficiency. Objective: Assess the performance of automated deep learning algorithms at detecting metastases in hematoxylin and eosin-stained tissue sections of lymph nodes of women with breast cancer and compare it with pathologists' diagnoses in a diagnostic setting. Design, Setting, and Participants: Researcher challenge competition (CAMELYON16) to develop automated solutions for detecting lymph node metastases (November 2015-November 2016). A training data set of whole-slide images from 2 centers in the Netherlands with (n = 110) and without (n = 160) nodal metastases verified by immunohistochemical staining was provided to challenge participants to build algorithms. Algorithm performance was evaluated in an independent test set of 129 whole-slide images (49 with and 80 without metastases). The same test set of corresponding glass slides was also evaluated by a panel of 11 pathologists with time constraint (WTC) from the Netherlands to ascertain likelihood of nodal metastases for each slide in a flexible 2-hour session, simulating routine pathology workflow, and by 1 pathologist without time constraint (WOTC). Exposures: Deep learning algorithms submitted as part of a challenge competition or pathologist interpretation. Main Outcomes and Measures: The presence of specific metastatic foci and the absence vs presence of lymph node metastasis in a slide or image using receiver operating characteristic curve analysis. The 11 pathologists participating in the simulation exercise rated their diagnostic confidence as definitely normal, probably normal, equivocal, probably tumor, or definitely tumor. Results: The area under the receiver operating characteristic curve (AUC) for the algorithms ranged from 0.556 to 0.994. The top-performing algorithm achieved a lesion-level, true-positive fraction comparable with that of the pathologist WOTC (72.4% [95% CI, 64.3%-80.4%]) at a mean of 0.0125 false-positives per normal whole-slide image. For the whole-slide image classification task, the best algorithm (AUC, 0.994 [95% CI, 0.983-0.999]) performed significantly better than the pathologists WTC in a diagnostic simulation (mean AUC, 0.810 [range, 0.738-0.884]; P < .001). The top 5 algorithms had a mean AUC that was comparable with the pathologist interpreting the slides in the absence of time constraints (mean AUC, 0.960 [range, 0.923-0.994] for the top 5 algorithms vs 0.966 [95% CI, 0.927-0.998] for the pathologist WOTC). Conclusions and Relevance: In the setting of a challenge competition, some deep learning algorithms achieved better diagnostic performance than a panel of 11 pathologists participating in a simulation exercise designed to mimic routine pathology workflow; algorithm performance was comparable with an expert pathologist interpreting whole-slide images without time constraints. Whether this approach has clinical utility will require evaluation in a clinical setting.
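As an illustration of the slide-level AUC reported above, the following is a minimal sketch, not the challenge's evaluation code. It uses the rank-based (Mann-Whitney) reading of AUC: the probability that a randomly chosen slide with metastasis receives a higher score than a randomly chosen slide without, with ties counted as half. The mapping of the five confidence ratings to the numbers 1 to 5, the names `RATING_SCORE` and `roc_auc`, and the six example slides are all assumptions made for this sketch.

```python
# Hypothetical ordinal encoding of the five confidence ratings (an assumption).
RATING_SCORE = {
    "definitely normal": 1,
    "probably normal": 2,
    "equivocal": 3,
    "probably tumor": 4,
    "definitely tumor": 5,
}


def roc_auc(labels, scores):
    """Return the AUC for binary labels (1 = metastasis, 0 = normal).

    Computed as the fraction of (positive, negative) pairs in which the
    positive slide has the higher score; tied pairs count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative slide")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: three slides with metastasis, three without.
example_labels = [1, 1, 1, 0, 0, 0]
example_ratings = [
    "definitely tumor", "probably tumor", "equivocal",
    "equivocal", "probably normal", "definitely normal",
]
example_scores = [RATING_SCORE[r] for r in example_ratings]
example_auc = roc_auc(example_labels, example_scores)
```

The same function applies unchanged to continuous algorithm outputs, since only the ordering of the scores matters.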

Subject(s)

Breast Neoplasms/pathology; Lymphatic Metastasis/diagnosis; Machine Learning; Pathologists; Algorithms; Female; Humans; Lymphatic Metastasis/pathology; Pathology, Clinical; ROC Curve

Standardized Assessment of Automatic Segmentation of White Matter Hyperintensities and Results of the WMH Segmentation Challenge.

Kuijf, Hugo J; Biesbroek, J Matthijs; De Bresser, Jeroen; Heinen, Rutger; Andermatt, Simon; Bento, Mariana; Berseth, Matt; Belyaev, Mikhail; Cardoso, M Jorge; Casamitjana, Adria; Collins, D Louis; Dadar, Mahsa; Georgiou, Achilleas; Ghafoorian, Mohsen; Jin, Dakai; Khademi, April; Knight, Jesse; Li, Hongwei; Llado, Xavier; Luna, Miguel; Mahmood, Qaiser; McKinley, Richard; Mehrtash, Alireza; Ourselin, Sebastien; Park, Bo-Yong; Park, Hyunjin; Park, Sang Hyun; Pezold, Simon; Puybareau, Elodie; Rittner, Leticia; Sudre, Carole H; Valverde, Sergi; Vilaplana, Veronica; Wiest, Roland; Xu, Yongchao; Xu, Ziyue; Zeng, Guodong; Zhang, Jianguo; Zheng, Guoyan; Chen, Christopher; van der Flier, Wiesje; Barkhof, Frederik; Viergever, Max A; Biessels, Geert Jan.

IEEE Trans Med Imaging; 38(11): 2556-2568, 2019 Nov.

Article in English | MEDLINE | ID: mdl-30908194

ABSTRACT

Quantification of cerebral white matter hyperintensities (WMH) of presumed vascular origin is of key importance in many neurological research studies. Currently, measurements are often still obtained from manual segmentations on brain MR images, which is a laborious procedure. Automatic WMH segmentation methods exist, but a standardized comparison of the performance of such methods is lacking. We organized a scientific challenge, in which developers could evaluate their methods on a standardized multi-center/-scanner image dataset, giving an objective comparison: the WMH Segmentation Challenge. Sixty T1 + FLAIR images from three MR scanners were released with the manual WMH segmentations for training. A test set of 110 images from five MR scanners was used for evaluation. The segmentation methods had to be containerized and submitted to the challenge organizers. Five evaluation metrics were used to rank the methods: 1) Dice similarity coefficient; 2) modified Hausdorff distance (95th percentile); 3) absolute log-transformed volume difference; 4) sensitivity for detecting individual lesions; and 5) F1-score for individual lesions. In addition, the methods were ranked on their inter-scanner robustness. Twenty participants submitted their methods for evaluation. This paper provides a detailed analysis of the results. In brief, there is a cluster of four methods that rank significantly better than the other methods, with one clear winner. The inter-scanner robustness ranking shows that not all the methods generalize to unseen scanners. The challenge remains open for future submissions and provides a public platform for method evaluation.
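Three of the five metrics listed above have standard textbook definitions that can be sketched briefly. The following is a minimal illustration, not the challenge's evaluation code. It assumes segmentations are given as flat sequences of 0/1 voxel labels and that lesion-level true-positive, false-positive, and false-negative counts have already been obtained; how individual lesions are matched between the manual and automatic segmentations is not stated in the abstract and is left out. The function names are hypothetical.

```python
def dice_coefficient(reference, prediction):
    """Dice similarity coefficient: 2|A and B| / (|A| + |B|) for binary masks."""
    if len(reference) != len(prediction):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for r, p in zip(reference, prediction) if r and p)
    total = sum(1 for r in reference if r) + sum(1 for p in prediction if p)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (an assumption).
        return 1.0
    return 2.0 * overlap / total


def lesion_sensitivity(true_pos, false_neg):
    """Fraction of reference lesions that were detected: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def lesion_f1(true_pos, false_pos, false_neg):
    """F1-score over individual lesions: 2TP / (2TP + FP + FN)."""
    return 2.0 * true_pos / (2.0 * true_pos + false_pos + false_neg)


# Made-up example: four voxels, one voxel of overlap.
example_reference = [1, 1, 0, 0]
example_prediction = [1, 0, 1, 0]
example_dice = dice_coefficient(example_reference, example_prediction)
```

Dice rewards voxel-level overlap and is dominated by large lesions, whereas the lesion-level sensitivity and F1-score weight every lesion equally regardless of size, which is one reason a ranking might combine both kinds of metric.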

Subject(s)

Image Processing, Computer-Assisted/methods; Magnetic Resonance Imaging/methods; White Matter/diagnostic imaging; Aged; Algorithms; Female; Humans; Male; Middle Aged
