ABSTRACT
Large language models (LLMs) are being intensively researched for a variety of healthcare domains. This systematic review and meta-analysis assesses the current applications, methodologies, and performance of LLMs in clinical oncology. A mixed-methods approach was used to extract, summarize, and compare methodological approaches and outcomes. The review includes 34 studies. LLMs are primarily evaluated on their ability to answer oncologic questions across various domains. The meta-analysis reveals substantial performance variance, driven by diverse methodologies and evaluation criteria. Differences in inherent model capabilities, prompting strategies, and oncological subdomains further contribute to this heterogeneity. The absence of standardized, LLM-specific reporting protocols leads to methodological disparities, which must be addressed to ensure comparability across LLM studies and, ultimately, to enable the reliable integration of LLM technologies into clinical practice.
ABSTRACT
BACKGROUND: Early detection of melanoma, a potentially lethal type of skin cancer with high prevalence worldwide, improves patient prognosis. In retrospective studies, artificial intelligence (AI) has proven helpful for enhancing melanoma detection. However, few prospective studies confirm these promising results. Existing studies are limited by small sample sizes, overly homogeneous datasets, or the lack of rare melanoma subtypes, preventing a fair and thorough evaluation of AI and its generalizability, a crucial aspect for its application in the clinical setting. METHODS: We therefore assessed "All Data are Ext" (ADAE), an established open-source ensemble algorithm for detecting melanomas, by comparing its diagnostic accuracy to that of dermatologists on a prospectively collected, external, heterogeneous test set spanning eight distinct hospitals, four different camera setups, rare melanoma subtypes, and special anatomical sites. We advanced the algorithm with real test-time augmentation (R-TTA, i.e., providing real photographs of lesions taken from multiple angles and averaging the predictions) and evaluated its generalization capabilities. RESULTS: Overall, the AI showed a higher balanced accuracy than dermatologists (0.798, 95% confidence interval (CI) 0.779-0.814 vs. 0.781, 95% CI 0.760-0.802; p = 4.0e-145), achieving a higher sensitivity (0.921, 95% CI 0.900-0.942 vs. 0.734, 95% CI 0.701-0.770; p = 3.3e-165) at the cost of a lower specificity (0.673, 95% CI 0.641-0.702 vs. 0.828, 95% CI 0.804-0.852; p = 3.3e-165). CONCLUSION: As the algorithm exhibited a significant performance advantage on our heterogeneous dataset consisting exclusively of melanoma-suspicious lesions, AI may offer the potential to support dermatologists, particularly in diagnosing challenging cases.
Melanoma is a type of skin cancer that can spread to other parts of the body, often resulting in death. Early detection improves survival rates. Computational tools that use artificial intelligence (AI) can be used to detect melanoma. However, few studies have checked how well the AI works on real-world data obtained from patients. We tested a previously developed AI tool on data obtained from eight different hospitals that used different types of cameras, which also included images taken of rare melanoma types and from a range of different parts of the body. The AI tool was more likely to correctly identify melanoma than dermatologists. This AI tool could be used to help dermatologists diagnose melanoma, particularly those that are difficult for dermatologists to diagnose.
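The metrics reported above can be made concrete with a small sketch (illustrative only, not the study's evaluation code; all function names are our own): balanced accuracy is the mean of sensitivity and specificity, and R-TTA averages malignancy probabilities over several real photographs of the same lesion.

```python
# Hedged sketch (not the authors' code): balanced accuracy from binary
# labels, and R-TTA-style averaging of per-photograph predictions.

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels (1 = melanoma)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def balanced_accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)   # recall on melanomas
    specificity = tn / (tn + fp)   # recall on benign lesions
    return (sensitivity + specificity) / 2

def rtta_prediction(view_probs, threshold=0.5):
    """Average malignancy probabilities over several real photographs of the
    same lesion (real test-time augmentation), then threshold the mean."""
    mean_prob = sum(view_probs) / len(view_probs)
    return int(mean_prob >= threshold)
```

Balanced accuracy is the natural headline metric here because it weighs melanomas and benign lesions equally even when the test set is imbalanced.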
ABSTRACT
Early diagnosis of cutaneous squamous cell carcinoma (cSCC) is essential to initiate adequate targeted treatment. Noninvasive diagnostic technologies could obviate the need for multiple biopsies and reduce tumor recurrence. To assess the performance of noninvasive technologies for cSCC diagnostics, 947 relevant records were identified through a systematic literature search. Of the 15 studies selected for this systematic review, 7 were included in the meta-analysis, comprising 1144 patients, 224 cSCC lesions, and 1729 clinical diagnoses. Overall sensitivity was 92% (95% confidence interval [CI] = 86.6-96.4%) for high-frequency ultrasound, 75% (95% CI = 65.7-86.2%) for optical coherence tomography, and 63% (95% CI = 51.3-69.1%) for reflectance confocal microscopy. Overall specificity was 88% (95% CI = 82.7-92.5%), 95% (95% CI = 92.7-97.3%), and 96% (95% CI = 94.8-97.4%), respectively. Physicians' expertise is key to the high diagnostic performance of the investigated devices, as they provide additional tissue information that requires physician interpretation in the absence of sufficiently standardized diagnostic criteria. Furthermore, few deep learning studies were identified; integrating deep learning into the investigated devices is thus a promising direction for future research in cSCC diagnosis.
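For intuition, a single study's sensitivity with a confidence interval can be sketched as below. The review's exact pooling and CI method is not stated here, so this uses the Wilson score interval as one common, assumed choice; the function names are illustrative.

```python
import math

# Hedged sketch: a per-study sensitivity estimate plus a 95% Wilson score
# confidence interval (an assumed method - meta-analytic pooling across
# studies is more involved than this single-proportion calculation).

def sensitivity(true_positives, diseased):
    """Fraction of diseased lesions (here, cSCC) correctly detected."""
    return true_positives / diseased

def wilson_ci(k, n, z=1.96):
    """95% Wilson score interval for a proportion k/n."""
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half, centre + half
```

The Wilson interval behaves better than the naive Wald interval when proportions are near 0 or 1, which is typical for the high specificities reported above.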
ABSTRACT
BACKGROUND: To reduce smoking uptake in adolescents, the medical students' network Education Against Tobacco (EAT) has developed a school-based intervention involving a face-aging mobile app (Smokerface). METHODS: A two-arm cluster-randomized controlled trial was conducted to evaluate the 2016 EAT intervention, which employed the mobile app Smokerface and was delivered by medical students. Schools were randomized to the intervention or control group. Surveys were conducted at baseline (pre-intervention) and at 9, 16, and 24 months post-intervention via paper-and-pencil questionnaires. The primary outcome was the difference in within-group changes in smoking prevalence between the intervention and control group at 24 months. RESULTS: Overall, 144 German secondary schools comprising 11,286 pupils participated in the baseline survey, of which 100 schools participated in the baseline and at least one of the follow-up surveys, yielding 7437 pupils in the analysis sample. After 24 months, smoking prevalence was numerically lower in the intervention group than in the control group (12.9% vs. 14.3%); however, between-group differences in the change in smoking prevalence between baseline and the 24-month follow-up (OR = 0.83, 95% CI: 0.64-1.09) were not statistically significant (p = 0.176). Intention to start smoking among baseline non-smokers declined non-significantly in the intervention group (p = 0.064) and remained essentially unchanged in the control group, but between-group differences in changes at the 24-month follow-up (OR = 0.88, 95% CI: 0.64-1.21) were not statistically significant (p = 0.417). CONCLUSION: While a trend towards beneficial effects of the intervention on smoking prevalence as well as on intention to start smoking among baseline non-smokers was observed, our smoking prevention trial demonstrated no significant effect of the intervention.
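As a textbook illustration of the effect measure reported above (not the trial's cluster-adjusted model, which must account for randomization at the school level), an unadjusted odds ratio with a 95% Wald interval can be computed from a 2x2 table; the function name and counts are hypothetical.

```python
import math

# Hedged sketch: unadjusted odds ratio with a 95% Wald confidence interval.
# The trial's reported ORs come from a model adjusting for clustering, so
# this simple 2x2 calculation is illustrative only.

def odds_ratio_ci(a, b, c, d, z=1.96):
    """a, b = smokers / non-smokers (intervention);
    c, d = smokers / non-smokers (control)."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi
```

An OR below 1 with a CI crossing 1, as in the trial (OR = 0.83, 0.64-1.09), points in the beneficial direction without reaching statistical significance.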
ABSTRACT
In the spectrum of colorectal tumors, microsatellite-stable (MSS) tumors with DNA polymerase ε (POLE) mutations exhibit a hypermutated profile and hold the potential to respond to immunotherapy similarly to their microsatellite-unstable (MSI) counterparts. Yet, due to their rarity and the associated testing costs, systematic screening for these mutations is not commonly pursued. Notably, the histopathological phenotype resulting from POLE mutations is theorized to resemble that of MSI. This resemblance could not only facilitate their detection by a transformer-based deep learning (DL) system trained on MSI pathology slides, but also give MSS patients with POLE mutations access to enhanced treatment options that might otherwise be overlooked. To harness this potential, we trained a DL classifier on a large dataset with ground-truth microsatellite status and subsequently validated its capabilities for MSI and POLE detection across three external cohorts. Our model accurately identified MSI status in both the internal and external resection cohorts using pathology images alone. Notably, with a classification threshold of 0.5, over 75% of POLE driver-mutant patients in the external resection cohorts were flagged as "positive" by the DL system trained on MSI status. In a clinical setting, deploying this DL model as a preliminary screening tool could facilitate the efficient identification of clinically relevant MSI and POLE mutations in colorectal tumors in a single pass.
ABSTRACT
BACKGROUND: Ultraviolet (UV) exposure behaviors can directly impact an individual's skin cancer risk, with many habits formed during childhood and adolescence. We explored the utility of a photoaging smartphone application to motivate youth to improve their sun safety practices. METHODS: Participants completed a preintervention survey to gather baseline sun safety perceptions and behaviors. Participants then used a photoaging mobile application to view the projected effects of chronic UV exposure on their own facial image over time, followed by a postintervention survey to assess motivation to engage in future sun safety practices. RESULTS: The study sample included 87 participants (median [interquartile range (IQR)] age, 14 [11-16] years). Most participants were White (50.6%) and reported a skin type that burns a little and tans easily (42.5%). Preintervention sun exposure behaviors revealed that 33 participants (37.9%) mostly or always used sunscreen on a sunny day, 48 (55.2%) experienced at least one sunburn over the past year, 26 (30.6%) engaged in outdoor sunbathing at least once during the past year, and none (0%) used indoor tanning beds. Participants without skin of color (18 [41.9%], p = .02) and older participants (24 [41.4%], p = .007) more often agreed that they felt better with a tan. Most participants agreed the intervention increased their motivation to practice sun-protective behaviors (wear sunscreen, 74 [85.1%]; wear hats, 64 [74.4%]; avoid indoor tanning, 73 [83.9%]; avoid outdoor tanning, 68 [79%]). CONCLUSION: The findings of this cross-sectional study suggest that a photoaging smartphone application may serve as a useful tool to promote sun safety behaviors from a young age.
ABSTRACT
The variation in histologic staining between different medical centers is one of the most profound challenges in the field of computer-aided diagnosis. The appearance disparity of pathological whole-slide images makes algorithms less reliable, which in turn impedes the widespread applicability of downstream tasks like cancer diagnosis. Furthermore, different stainings introduce biases into training that, under domain shift, degrade test performance. Therefore, in this paper we propose MultiStain-CycleGAN, a multi-domain approach to stain normalization based on CycleGAN. Our modifications to CycleGAN allow us to normalize images of different origins without retraining or using different models. We perform an extensive evaluation of our method using various metrics and compare it to commonly used multi-domain-capable methods. First, we evaluate how well our method fools a domain classifier that tries to assign a medical center to an image. Then, we test our normalization on the tumor classification performance of a downstream classifier. Furthermore, we evaluate the image quality of the normalized images using the structural similarity index (SSIM) and the reduction of the domain shift using the Fréchet inception distance (FID). We show that our method is multi-domain capable, provides among the highest image quality of the compared methods, and most reliably fools the domain classifier while keeping tumor classifier performance high. By reducing the domain influence, biases in the data can be removed on the one hand and the origin of the whole-slide image can be disguised on the other, thus enhancing patient data privacy.
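The Fréchet inception distance used above compares Gaussian fits to deep image features. As a hedged intuition aid (not the paper's implementation, which uses multivariate Inception features), the closed form for one-dimensional features reduces to a squared difference of means plus a squared difference of standard deviations:

```python
import statistics

# Hedged sketch: the Fréchet distance between two 1-D Gaussians fitted to
# feature samples. The real FID uses multivariate Inception-network features;
# this scalar special case only conveys the idea.

def frechet_1d(xs, ys):
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    return (mx - my) ** 2 + (sx - sy) ** 2
```

A distance of zero means the two feature distributions are indistinguishable under the Gaussian fit, which is why a lower FID after normalization indicates a smaller residual domain shift.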
ABSTRACT
Although pathological tissue analysis is typically performed on single 2-dimensional (2D) histologic reference slides, 3-dimensional (3D) reconstruction from a sequence of histologic sections could provide novel opportunities for spatial analysis of the extracted tissue. In this review, we analyze recent works published after 2018 and report information on the extracted tissue types, the section thickness, and the number of sections used for reconstruction. By analyzing the technological requirements for 3D reconstruction, we observe that software tools exist, both free and commercial, that include the functionality to perform 3D reconstruction from a sequence of histologic images. Through the analysis of the most recent works, we provide an overview of the workflows and tools currently used for 3D reconstruction from histologic sections and address points for future work, such as the lack of a common file format and computer-aided analysis of the reconstructed model.
ABSTRACT
Importance: The development of artificial intelligence (AI)-based melanoma classifiers typically calls for large, centralized datasets, requiring hospitals to share their patient data, which raises serious privacy concerns. To address this concern, decentralized federated learning has been proposed, where classifier development is distributed across hospitals. Objective: To investigate whether a more privacy-preserving federated learning approach can achieve comparable diagnostic performance to a classical centralized (ie, single-model) and ensemble learning approach for AI-based melanoma diagnostics. Design, Setting, and Participants: This multicentric, single-arm diagnostic study developed a federated model for melanoma-nevus classification using histopathological whole-slide images prospectively acquired at 6 German university hospitals between April 2021 and February 2023 and benchmarked it using both a holdout and an external test dataset. Data analysis was performed from February to April 2023. Exposures: All whole-slide images were retrospectively analyzed by an AI-based classifier without influencing routine clinical care. Main Outcomes and Measures: The area under the receiver operating characteristic curve (AUROC) served as the primary end point for evaluating the diagnostic performance. Secondary end points included balanced accuracy, sensitivity, and specificity. Results: The study included 1025 whole-slide images of clinically melanoma-suspicious skin lesions from 923 patients, consisting of 388 histopathologically confirmed invasive melanomas and 637 nevi. The median (range) age at diagnosis was 58 (18-95) years for the training set, 57 (18-93) years for the holdout test dataset, and 61 (18-95) years for the external test dataset; the median (range) Breslow thickness was 0.70 (0.10-34.00) mm, 0.70 (0.20-14.40) mm, and 0.80 (0.30-20.00) mm, respectively.
The federated approach (0.8579; 95% CI, 0.7693-0.9299) performed significantly worse than the classical centralized approach (0.9024; 95% CI, 0.8379-0.9565) in terms of AUROC on a holdout test dataset (pairwise Wilcoxon signed-rank, P < .001) but performed significantly better (0.9126; 95% CI, 0.8810-0.9412) than the classical centralized approach (0.9045; 95% CI, 0.8701-0.9331) on an external test dataset (pairwise Wilcoxon signed-rank, P < .001). Notably, the federated approach performed significantly worse than the ensemble approach on both the holdout (0.8867; 95% CI, 0.8103-0.9481) and external test dataset (0.9227; 95% CI, 0.8941-0.9479). Conclusions and Relevance: The findings of this diagnostic study suggest that federated learning is a viable approach for the binary classification of invasive melanomas and nevi on a clinically representative distributed dataset. Federated learning can improve privacy protection in AI-based melanoma diagnostics while simultaneously promoting collaboration across institutions and countries. Moreover, it may have the potential to be extended to other image classification tasks in digital cancer histopathology and beyond.
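The core federated step described above can be sketched as federated averaging (FedAvg): each hospital trains locally and returns only model weights, which the server combines as a sample-weighted mean. This is a minimal illustration under the assumption that the study used a FedAvg-style aggregation; the function name and flat weight vectors are our own simplification.

```python
# Hedged sketch: FedAvg-style aggregation. Each hospital contributes its
# locally trained weights and its training-set size; patient data never
# leaves the site - only the weight vectors are pooled.

def federated_average(site_weights, site_sizes):
    """site_weights: one flat weight vector per hospital;
    site_sizes: number of training slides each hospital contributed."""
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]
```

Weighting by site size keeps the aggregate close to what centralized training on the pooled data would emphasize, while preserving the privacy benefit of never moving the slides.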
ABSTRACT
BACKGROUND: Artificial intelligence (AI) has numerous applications in pathology, supporting diagnosis and prognostication in cancer. However, most AI models are trained on highly selected data, typically one tissue slide per patient. In reality, especially for large surgical resection specimens, dozens of slides can be available for each patient. Manually sorting and labelling whole-slide images (WSIs) is a very time-consuming process, hindering the direct application of AI to the collected tissue samples from large cohorts. In this study, we addressed this issue by developing a deep-learning (DL)-based method for automatic curation of large pathology datasets with several slides per patient. METHODS: We collected multiple large multicentric datasets of colorectal cancer histopathological slides from the United Kingdom (FOXTROT, N = 21,384 slides; CR07, N = 7985 slides) and Germany (DACHS, N = 3606 slides). These datasets contained multiple types of tissue slides, including bowel resection specimens, endoscopic biopsies, lymph node resections, immunohistochemistry-stained slides, and tissue microarrays. We developed, trained, and tested a deep convolutional neural network model to predict the type of slide from the slide overview (thumbnail) image. The primary statistical endpoint was the macro-averaged area under the receiver operating characteristic curve (AUROC) for detection of the type of slide. RESULTS: In the primary dataset (FOXTROT), the algorithm achieved a high classification performance, with an AUROC of 0.995 (95% confidence interval [CI]: 0.994-0.996), and was able to accurately predict the type of slide from the thumbnail image alone. In the two external test cohorts (CR07, DACHS), AUROCs of 0.982 (95% CI: 0.979-0.985) and 0.875 (95% CI: 0.864-0.887) were observed, indicating the generalizability of the trained model to unseen datasets.
With a confidence threshold of 0.95, the model reached an accuracy of 94.6% (7331 classified cases) in CR07 and 85.1% (2752 classified cases) in the DACHS cohort. CONCLUSION: Our findings show that the low-resolution thumbnail image is sufficient to accurately classify the type of slide in digital pathology. This can help researchers make the vast resource of existing pathology archives accessible to modern AI models with only minimal manual annotation.
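The confidence-threshold step above amounts to a triage rule: a slide is auto-labelled only when the network's top class probability clears the threshold, and the remainder are deferred to manual sorting. A minimal sketch, assuming softmax-style per-class probabilities (the function and data layout are illustrative, not the study's pipeline):

```python
# Hedged sketch: confidence-threshold triage. Slides whose top predicted
# probability clears the threshold are auto-classified; the rest are
# deferred. Returns the fraction auto-classified and the accuracy on them.

def triage(probs, labels, threshold=0.95):
    """probs: one {slide_type: probability} dict per slide;
    labels: the true slide type for each slide."""
    kept = [
        (max(p, key=p.get), y)            # (predicted type, true type)
        for p, y in zip(probs, labels)
        if max(p.values()) >= threshold   # only confident predictions
    ]
    coverage = len(kept) / len(labels)
    accuracy = sum(pred == y for pred, y in kept) / len(kept)
    return coverage, accuracy
```

Raising the threshold trades coverage for accuracy, which matches the reported numbers: at 0.95 the model classifies most slides while keeping the error rate on the classified subset low.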
ABSTRACT
Pathologists routinely use immunohistochemical (IHC)-stained tissue slides against MelanA in addition to hematoxylin and eosin (H&E)-stained slides to improve their accuracy in diagnosing melanomas. The use of diagnostic deep learning (DL)-based support systems for automated examination of tissue morphology and cellular composition has been well studied for standard H&E-stained tissue slides, whereas few studies analyze IHC slides using DL. We therefore investigated the separate and joint performance of ResNets trained on MelanA and corresponding H&E-stained slides. The MelanA classifier achieved an area under the receiver operating characteristic curve (AUROC) of 0.82 and 0.74 on out-of-distribution (OOD) datasets, similar to the H&E-based benchmark classification of 0.81 and 0.75, respectively. A combined classifier using MelanA and H&E achieved AUROCs of 0.85 and 0.81 on the OOD datasets. DL MelanA-based assistance systems thus match the performance of the benchmark H&E classification and may be improved by multi-stain classification to assist pathologists in their clinical routine.
ABSTRACT
Artificial intelligence (AI) systems have been shown to help dermatologists diagnose melanoma more accurately; however, they lack transparency, hindering user acceptance. Explainable AI (XAI) methods can help to increase transparency, yet often lack precise, domain-specific explanations. Moreover, the impact of XAI methods on dermatologists' decisions had not yet been evaluated. Building upon previous research, we introduce an XAI system that provides precise and domain-specific explanations alongside its differential diagnoses of melanomas and nevi. Through a three-phase study, we assess its impact on dermatologists' diagnostic accuracy, diagnostic confidence, and trust in the XAI support. Our results show strong alignment between XAI and dermatologist explanations. We also show that dermatologists' confidence in their diagnoses and their trust in the support system increase significantly with XAI compared to conventional AI. This study highlights dermatologists' willingness to adopt such XAI systems, promoting future use in the clinic.
ABSTRACT
BACKGROUND: Precise prognosis prediction in patients with colorectal cancer (ie, forecasting survival) is pivotal for individualised treatment and care. Histopathological tissue slides of colorectal cancer specimens contain rich prognostically relevant information. However, existing studies do not have multicentre external validation with real-world sample processing protocols, and algorithms are not yet widely used in clinical routine. METHODS: In this retrospective, multicentre study, we collected tissue samples from four groups of patients with resected colorectal cancer from Australia, Germany, and the USA. We developed and externally validated a deep learning-based prognostic-stratification system for automatic prediction of overall and cancer-specific survival in patients with resected colorectal cancer. We used the model-predicted risk scores to stratify patients into different risk groups and compared survival outcomes between these groups. Additionally, we evaluated the prognostic value of these risk groups after adjusting for established prognostic variables. FINDINGS: We trained and validated our model on a total of 4428 patients. We found that patients could be divided into high-risk and low-risk groups on the basis of the deep learning-based risk score. On the internal test set, the group with a high-risk score had a worse prognosis than the group with a low-risk score, as reflected by a hazard ratio (HR) of 4·50 (95% CI 3·33-6·09) for overall survival and 8·35 (5·06-13·78) for disease-specific survival (DSS). We found consistent performance across three large external test sets. In a test set of 1395 patients, the high-risk group had a lower DSS than the low-risk group, with an HR of 3·08 (2·44-3·89). In two additional test sets, the HRs for DSS were 2·23 (1·23-4·04) and 3·07 (1·78-5·3). We showed that the prognostic value of the deep learning-based risk score is independent of established clinical risk factors. 
INTERPRETATION: Our findings indicate that attention-based self-supervised deep learning can robustly predict clinical outcomes in patients with colorectal cancer, generalising across different populations and serving as a potential new prognostic tool in clinical decision making for colorectal cancer management. We release all source code and trained models under an open-source licence, allowing other researchers to reuse and build upon our work. FUNDING: The German Federal Ministry of Health, the Max-Eder-Programme of German Cancer Aid, the German Federal Ministry of Education and Research, the German Academic Exchange Service, and the EU.
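The risk-stratification step described above can be illustrated with a small sketch: patients are split into groups by their model-predicted risk score, and each group's survival is summarized with a Kaplan-Meier estimator. This is an assumed, simplified reading (the study reports Cox hazard ratios, which need a dedicated survival library; the median split and function names here are our own illustration).

```python
import statistics

# Hedged sketch: stratify patients at the median predicted risk score, then
# estimate a group's survival curve with the Kaplan-Meier product-limit
# estimator. Illustrative only - the study's grouping and Cox models differ.

def median_split(risk_scores):
    """Assign each patient to the 'high' or 'low' risk group."""
    cut = statistics.median(risk_scores)
    return ["high" if r > cut else "low" for r in risk_scores]

def kaplan_meier(times, events):
    """times: follow-up duration per patient; events: 1 = death observed,
    0 = censored. Returns (event time, survival probability) pairs."""
    at_risk = len(times)
    surv, curve = 1.0, []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths:
            surv *= 1 - deaths / at_risk   # product-limit update
            curve.append((t, surv))
        at_risk -= sum(1 for ti in times if ti == t)  # deaths + censored leave
    return curve
```

Comparing the two groups' curves (or fitting a Cox model on the group indicator) yields the hazard ratios reported above; censored patients contribute to the at-risk counts until they drop out.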
ABSTRACT
BACKGROUND: Sentinel lymph node (SLN) status is a clinically important prognostic biomarker in breast cancer and is used to guide therapy, especially for hormone receptor-positive, HER2-negative cases. However, invasive lymph node staging is increasingly omitted before therapy, and studies such as the randomised Intergroup Sentinel Mamma (INSEMA) trial address the potential for further de-escalation of axillary surgery. It would therefore be helpful to accurately predict pretherapeutic sentinel status from medical images. METHODS: Using a ResNet-50 architecture pretrained on ImageNet and a previously successful strategy, we trained deep learning (DL)-based image analysis algorithms to predict sentinel status on hematoxylin/eosin-stained images of predominantly luminal, primary breast tumours from the INSEMA trial and three additional, independent cohorts (The Cancer Genome Atlas (TCGA) and cohorts from the university hospitals of Mannheim and Regensburg), and compared their performance with that of a logistic regression using clinical data only. Performance on an INSEMA hold-out set was investigated in a blinded manner. RESULTS: None of the generated image analysis algorithms yielded an area under the receiver operating characteristic curve significantly better than random on the test sets, including the hold-out test set from INSEMA. In contrast, the logistic regression fitted on the Mannheim cohort retained better-than-random performance on INSEMA and Regensburg. Including the image analysis model output in the logistic regression did not further improve performance on INSEMA. CONCLUSIONS: Employing DL-based image analysis on histological slides, we could not predict SLN status for unseen cases in the INSEMA trial and other predominantly luminal cohorts.
ABSTRACT
BACKGROUND: Historically, cancer diagnoses have been made by pathologists using two-dimensional histological slides. However, with the advent of digital pathology and artificial intelligence, slides are being digitised, providing new opportunities to integrate their information. Since nature is 3-dimensional (3D), it seems intuitive to digitally reassemble the 3D structure for diagnosis. OBJECTIVE: To develop the first 3D histology model of human melanoma with full data and code availability; further, to evaluate the 3D simulation together with experienced pathologists in the field and discuss the implications of digital 3D models for the future of digital pathology. METHODS: A malignant melanoma of the skin was digitised in 3 µm cuts by a slide scanner; open-source software was then leveraged to construct the 3D model. A total of nine pathologists from four different countries, each with at least 10 years of experience in the histologic diagnosis of melanoma, tested the model and discussed their experiences as well as implications for future pathology. RESULTS: We successfully constructed and tested the first 3D model of human melanoma. Based on testing, 88.9% of pathologists believe the technology is likely to enter routine pathology within the next 10 years. Advantages include a better reflection of the anatomy, 3D assessment of symmetry, and the opportunity to evaluate different tissue levels simultaneously; limitations include the high consumption of tissue and a still-inferior resolution due to computational constraints. CONCLUSIONS: 3D histology models are promising for the digital pathology of cancer, and of melanoma specifically; however, limitations remain that need to be carefully addressed.
ABSTRACT
Studies have shown that colorectal cancer (CRC) prognosis can be predicted by deep learning-based analysis of histological tissue sections of the primary tumor. So far, this has been achieved using a binary prediction. Survival curves might contain more detailed information and thus enable more fine-grained risk prediction. We therefore established survival curve-based CRC survival predictors and benchmarked them against standard binary survival predictors, comparing their performance extensively on the clinical high- and low-risk subsets of one internal and three external cohorts. Survival curve-based risk prediction achieved a risk stratification very similar to that of binary risk prediction for this task. Exchanging other components of the pipeline, namely the input tissue and the feature extractor, had largely identical effects on model performance regardless of the type of risk prediction. An ensemble of all survival curve-based models exhibited a more robust performance, as did a similar ensemble based on binary risk prediction. Patients could be further stratified within clinical risk groups. However, performance still varied across cohorts, indicating limited generalization of all investigated image analysis pipelines, whereas models using clinical data performed robustly on all cohorts.
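The ensembling step described above can be sketched minimally: the member models' predicted survival curves (evaluated on a common time grid) are averaged pointwise, and a scalar risk score is then derived from the ensemble curve. This is an assumed, illustrative reduction - the paper does not specify this exact aggregation, and the function names and the risk-score definition are ours.

```python
# Hedged sketch: pointwise averaging of predicted survival curves across an
# ensemble, plus one simple way to collapse a curve into a scalar risk score
# (1 minus the mean survival probability over the evaluated time grid).

def ensemble_curve(model_curves):
    """model_curves: one survival curve per model, all on the same time grid,
    each a list of survival probabilities."""
    n_models = len(model_curves)
    n_points = len(model_curves[0])
    return [sum(c[i] for c in model_curves) / n_models for i in range(n_points)]

def risk_score(curve):
    """Higher score = worse predicted survival."""
    return 1 - sum(curve) / len(curve)
```

Averaging curves rather than binary outputs preserves the finer-grained temporal information that motivates survival curve-based prediction in the first place; the scalar score is only needed when patients must be ranked or median-split into risk groups.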