ABSTRACT
Histopathology is the reference standard for pathology diagnosis, and has evolved with the digitization of glass slides [ie, whole slide images (WSIs)]. While trained histopathologists are able to diagnose diseases by examining WSIs visually, this process is time consuming and prone to variability. To address these issues, artificial intelligence models are being developed to generate slide-level representations of WSIs, summarizing the entire slide as a single vector. This enables various computational pathology applications, including interslide search, multimodal training, and slide-level classification. Achieving expressive and robust slide-level representations hinges on patch feature extraction and aggregation steps. This study proposed an additional binary patch grouping (BPG) step, a plugin that can be integrated into various slide-level representation pipelines, to enhance the quality of slide-level representation in bone marrow histopathology. BPG excludes patches with less clinical relevance through minimal interaction with the pathologist; a one-time human intervention for the entire process. This study further investigated domain-general versus domain-specific feature extraction models based on convolution and attention and examined two different feature aggregation methods, with and without BPG, showing BPG's generalizability. The results showed that using BPG boosts the performance of WSI retrieval (mean average precision at 10) by 4% and improves WSI classification (weighted-F1) by 5% compared to not using BPG. Additionally, domain-general large models and parameterized pooling produced the best-quality slide-level representations.
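The retrieval metric named above, mean average precision at 10, can be illustrated with a minimal sketch. It assumes each query slide returns a ranked list of diagnosis labels for the retrieved slides, and that a hit is a retrieved slide sharing the query's label. The function names, the example labels, and the normalization by the number of hits within the top k are illustrative assumptions, not necessarily the definition used in the study.

```python
from typing import List, Sequence, Tuple


def average_precision_at_k(query_label: str,
                           retrieved_labels: Sequence[str],
                           k: int = 10) -> float:
    """AP@k for one query: mean of the precision values at each rank with a hit.

    Assumption: normalization is by the number of hits in the top k
    (other conventions divide by min(k, number of relevant items)).
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(retrieved_labels[:k], start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(queries: List[Tuple[str, Sequence[str]]],
                                k: int = 10) -> float:
    """Average AP@k over all (query_label, ranked_retrieved_labels) pairs."""
    scores = [average_precision_at_k(label, ranked, k) for label, ranked in queries]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical labels, for illustration only.
example_queries = [
    ("AML", ["AML", "MDS", "AML"]),
    ("MDS", ["AML", "AML"]),
]
example_map = mean_average_precision_at_k(example_queries, k=10)
```

Under this convention a query with no hit in the top k contributes zero, so the mean penalizes slides for which nothing similar is retrieved.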
Subject(s)
Artificial Intelligence , Bone Marrow , Humans , Dietary Supplements , Pathologists
ABSTRACT
Brain metastases can occur in nearly half of patients with early and locally advanced (stage I-III) non-small cell lung cancer (NSCLC). There are no reliable histopathologic or molecular means to identify those who are likely to develop brain metastases. We sought to determine if deep learning (DL) could be applied to routine H&E-stained primary tumor tissue sections from stage I-III NSCLC patients to predict the development of brain metastasis. Diagnostic slides from 158 patients with stage I-III NSCLC followed for at least 5 years for the development of brain metastases (Met+, 65 patients) versus no progression (Met-, 93 patients) were subjected to whole-slide imaging. Three separate iterations were performed by first selecting 118 cases (45 Met+, 73 Met-) to train and validate the DL algorithm, while 40 separate cases (20 Met+, 20 Met-) were used as the test set. The DL algorithm results were compared to a blinded review by four expert pathologists. The DL-based algorithm was able to distinguish the eventual development of brain metastases with an accuracy of 87% (p < 0.0001) compared with an average of 57.3% by the four pathologists and appears to be particularly useful in predicting brain metastases in stage I patients. The DL algorithm appears to focus on a complex set of histologic features. DL-based algorithms using routine H&E-stained slides may distinguish patients who are likely to develop brain metastases from those who will remain disease free over extended (>5 year) follow-up and may thus be spared systemic therapy. © 2024 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
Subject(s)
Brain Neoplasms , Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Humans , Lung Neoplasms/pathology , Carcinoma, Non-Small-Cell Lung/pathology , Algorithms , Pathologists
ABSTRACT
No standard tool to measure pathologist workload currently exists. An accurate measure of workload is needed for determining the number of pathologists to be hired, distributing the workload fairly among pathologists, and assessing the overall cost of pathology consults. Initially, simple tools such as counting cases or slides were used to give an estimate of the workload. More recently, multiple workload models, including relative value units (RVUs), the Royal College of Pathologists (RCP) point system, Level 4 Equivalent (L4E), Work2Quality (W2Q), and the University of Washington, Seattle (UW) slide count method, have been developed. There is no "ideal" model that is universally accepted. The main differences among the models come from the weights assigned to different specimen types, differential calculations for organs, and the capture of additional tasks needed for safe and timely patient care. Academic centers tend to see more complex cases that require extensive sampling and additional testing, while community-based and private laboratories deal more with biopsies. Additionally, some systems do not account for teaching, participation in multidisciplinary rounds, quality assurance activities, and medical oversight. A successful workload model needs to be continually updated to reflect the current state of practice.
Awareness about physician burnout has gained attention in recent years and has been added to the World Health Organization's International Classification of Diseases (World Health Organization, WHO) as an occupational phenomenon. However, the extent to which this affects pathologists is not well understood. According to the WHO, burnout syndrome is diagnosed by the presence of three components: emotional exhaustion, depersonalization from one's work (cynicism related to one's job), and a low sense of personal achievement or accomplishment. Three drivers of burnout are the demand for productivity, lack of recognition, and electronic health records.
Prominent consequences of physician burnout are economic and personal costs to the public and to the providers.
Wellness is physical and mental well-being that allows individuals to manage stress effectively and to thrive in both their professional and personal lives. To achieve wellness, it is necessary to understand the root causes of burnout, including over-work and working under stressful conditions. Wellness is more than the absence of stress or burnout, and the responsibility of wellness should be shared by pathologists themselves, their healthcare organization, and governing bodies. Each pathologist needs to take their own path to achieve wellness.
Subject(s)
Burnout, Professional , Pathologists , Workload , Humans
ABSTRACT
BACKGROUND: In 2022, our team launched the pioneering national proficiency testing (PT) scheme for the pathological diagnosis of breast cancer, rapidly establishing its credibility throughout China. To continuously monitor and improve the proficiency of Chinese pathologists in breast pathology, the second round of the PT scheme was initiated in 2023, expanding the number of participating institutions and conducting a nationwide investigation into the interpretation of HER2 0, 1+, and 2+/FISH- categories in China. METHODS: The methodology employed in the current round of the PT scheme closely mirrored that of the preceding cycle in 2022 and was designed and implemented according to the "Conformity assessment-General requirements for proficiency testing" (GB/T27043-2012/ISO/IEC 17043:2010). More importantly, we utilized a statistics-based method to generate assigned values to enhance their robustness and credibility. RESULTS: The final PT results, published on the website of the National Quality Control Center for Cancer ( http://117.133.40.88:3927 ), showed that all participants passed the testing. However, a few institutions demonstrated systemic biases in scoring HER2 0, 1+, and 2+/FISH-, with accuracy levels below 59%, which were considered unsatisfactory. In particular, the concordance rate for HER2 0 cases was only 78.1%, indicating challenges in distinguishing HER2 0 from low HER2 expression. Areas for improvement in the interpretation of histologic type and grade were also noted. CONCLUSIONS: Our PT scheme demonstrated high proficiency in diagnosing breast cancer in China. However, it also identified systemic biases in scoring HER2 0, 1+, and 2+/FISH- at some institutions. More importantly, our study highlighted challenges in the evaluation at the extreme lower end of the HER2 staining spectrum, a crucial area for further research. It also revealed the need for improvements in interpreting histologic types and grades.
These findings underscore the importance of robust quality assurance mechanisms, like the nationwide PT scheme conducted in this study, to maintain high diagnostic standards and identify areas requiring further training and enhancement.
Subject(s)
Breast Neoplasms , Laboratory Proficiency Testing , Receptor, ErbB-2 , Humans , Female , Breast Neoplasms/diagnosis , Breast Neoplasms/pathology , Breast Neoplasms/metabolism , Receptor, ErbB-2/metabolism , China , In Situ Hybridization, Fluorescence/standards , Biomarkers, Tumor , Pathologists
ABSTRACT
Carcinoma of unknown primary (CUP) is a heterogeneous group of metastatic cancers in which the site of origin is not identifiable. These carcinomas have a poor outcome due to their late presentation with metastatic disease, difficulty in identifying the origin and delay in treatment. The aim of the pathologist is to broadly classify and subtype the cancer and, where possible, to confirm the likely primary site as this information best predicts patient outcome and guides treatment. In this review, we provide histopathologists with diagnostic practice points which contribute to identifying the primary origin in such cases. We present the current clinical evaluation and management from the point of view of the oncologist. We discuss the role of the pathologist in the diagnostic pathway including the control of pre-analytical conditions, assessment of sample adequacy, diagnosis of cancer including diagnostic pitfalls, and evaluation of prognostic and predictive markers. An integrated diagnostic report is ideal in cases of CUP, with results discussed at a forum such as a molecular tumour board and matched with targeted treatment. This highly specialized evolving area ultimately leads to personalized oncology and potentially improved outcomes for patients.
Subject(s)
Carcinoma , Neoplasms, Unknown Primary , Humans , Neoplasms, Unknown Primary/diagnosis , Neoplasms, Unknown Primary/pathology , Neoplasms, Unknown Primary/therapy , Pathologists , Carcinoma/diagnosis , Carcinoma/metabolism , Prognosis
ABSTRACT
PURPOSE: Ki-67 expression levels in breast cancer have prognostic and predictive significance. Therefore, accurate Ki-67 evaluation is important for optimal patient care. Although an algorithm developed by the International Ki-67 in Breast Cancer Working Group (IKWG) improves interobserver variability, it is tedious and time-consuming. In this study, we simplified the IKWG algorithm and evaluated its interobserver agreement among breast pathologists in Ki-67 evaluation. METHODS: Six subspecialized breast pathologists (4 juniors, 2 seniors) assessed the percentage of positive cells in 5% increments in 57 immunostained Ki-67 slides. The time spent on each slide was recorded. Two rounds of ring study (R1, R2) were performed before and after training with the modified IKWG algorithm (eyeballing method at 400× instead of counting 100 tumor nuclei per area). Concordance was assessed using Kendall's and Kappa coefficients. RESULTS: Analysis of ordinal scale ratings for all categories with 5% increments showed almost perfect agreement in R1 (0.821) and substantial agreement in R2 (0.793); seniors and juniors had substantial agreement in R1 (0.718 vs. 0.649) and R2 (0.756 vs. 0.658). In dichotomous scale analysis using 20% as the cutoff, the overall agreement was moderate in R1 (0.437) and R2 (0.479), among seniors (R1: 0.436; R2: 0.437) and juniors (R1: 0.445; R2: 0.505). Average scoring time per case was higher in R2 (71 vs. 37 s). CONCLUSION: The modified IKWG algorithm does not significantly improve interobserver agreement. A better algorithm or assistance from digital image analysis is needed to improve interobserver variability in Ki-67 evaluation.
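The dichotomous agreement analysis described above can be illustrated with a small sketch of a kappa coefficient. The abstract does not state which kappa variant was used for six raters, so the two-rater Cohen's kappa below is a stand-in chosen for simplicity. Treating scores at or above 20% as "high" is also an assumption, and the scores are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def dichotomize(scores: Sequence[float], cutoff: float = 20.0) -> List[str]:
    """Map Ki-67 percentages to 'high' (>= cutoff, an assumed convention) or 'low'."""
    return ["high" if s >= cutoff else "low" for s in scores]


def cohen_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Two-rater Cohen's kappa: (observed - expected) / (1 - expected)."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must score the same non-empty set of slides")
    # Observed agreement: fraction of slides given the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical Ki-67 scores (percent positive cells) from two pathologists.
pathologist_1 = dichotomize([5, 10, 25, 30, 15, 40])
pathologist_2 = dichotomize([5, 20, 25, 30, 10, 15])
example_kappa = cohen_kappa(pathologist_1, pathologist_2)
```

Because kappa discounts agreement expected by chance, two raters who agree on four of six slides score well below 0.67 here, which is one reason dichotomized agreement can look weaker than raw percent agreement.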
Subject(s)
Breast Neoplasms , Humans , Female , Breast Neoplasms/pathology , Ki-67 Antigen/metabolism , Observer Variation , Pathologists , Breast/pathology , Reproducibility of Results
ABSTRACT
Recent progress in computational pathology has been driven by deep learning. While code and data availability are essential to reproduce findings from preceding publications, ensuring a deep learning model's reusability is more challenging. For that, the codebase should be well-documented and easy to integrate into existing workflows, and models should be robust toward noise and generalizable toward data from different sources. Strikingly, only a few computational pathology algorithms have been reused by other researchers so far, let alone employed in a clinical setting. To assess the current state of reproducibility and reusability of computational pathology algorithms, we evaluated peer-reviewed articles available in PubMed, published between January 2019 and March 2021, in 5 use cases: stain normalization; tissue type segmentation; evaluation of cell-level features; genetic alteration prediction; and inference of grading, staging, and prognostic information. We compiled criteria for data and code availability and statistical result analysis and assessed them in 160 publications. We found that only one-quarter (41 of 160 publications) made code publicly available. Among these 41 studies, three-quarters (30 of 41) analyzed their results statistically, half of them (20 of 41) released their trained model weights, and approximately a third (16 of 41) used an independent cohort for evaluation. Our review is intended for both pathologists interested in deep learning and researchers applying algorithms to computational pathology challenges. We provide a detailed overview of publications with published code in the field, list reusable data handling tools, and provide criteria for reproducibility and reusability.
Subject(s)
Deep Learning , Humans , Reproducibility of Results , Algorithms , Pathologists
ABSTRACT
Conventional histopathology involves expensive and labor-intensive processes that often consume tissue samples, rendering them unavailable for other analyses. We present a novel end-to-end workflow for pathology powered by hyperspectral microscopy and deep learning. First, we developed a custom hyperspectral microscope to nondestructively image the autofluorescence of unstained tissue sections. We then trained a deep learning model to use autofluorescence to generate virtual histologic stains, which avoids the cost and variability of chemical staining procedures and conserves tissue samples. We showed that the virtual images reproduce the histologic features present in the real-stained images using a randomized nonalcoholic steatohepatitis (NASH) scoring comparison study, where both real and virtual stains are scored by pathologists (D.T., A.D.B., R.K.P.). The test showed moderate-to-good concordance between pathologists' scoring on corresponding real and virtual stains. Finally, we developed deep learning-based models for automated NASH Clinical Research Network score prediction. We showed that the end-to-end automated pathology platform is comparable with an independent panel of pathologists for NASH Clinical Research Network scoring when evaluated against the expert pathologist consensus scores. This study provides proof of concept for this virtual staining strategy, which could improve cost, efficiency, and reliability in pathology and enable novel approaches to spatial biology research.
Subject(s)
Deep Learning , Non-alcoholic Fatty Liver Disease , Humans , Microscopy , Reproducibility of Results , Pathologists
ABSTRACT
We review B-cell neoplasms in the 5th edition of the World Health Organization classification of hematolymphoid tumors (WHO-HEM5). The revised classification is based on a multidisciplinary approach including input from pathologists, clinicians, and other experts. The WHO-HEM5 follows a hierarchical structure allowing the use of family (class)-level definitions when defining diagnostic criteria are partially met or a complete investigational workup is not possible. Disease types and subtypes have expanded compared with the WHO revised 4th edition (WHO-HEM4R), mainly because of the expansion in genomic knowledge of these diseases. In this review, we focus on highlighting changes and updates in the classification of B-cell lymphomas, providing a comparison with WHO-HEM4R, and offering guidance on how the new classification can be applied to the diagnosis of B-cell lymphomas in routine practice.
Subject(s)
Hematologic Neoplasms , Lymphoma, B-Cell , Humans , Lymphoma, B-Cell/pathology , World Health Organization , Pathologists , Hematologic Neoplasms/pathology
ABSTRACT
Pathologists have, over several decades, developed criteria for diagnosing and grading prostate cancer. However, this knowledge has not, so far, been included in the design of convolutional neural networks (CNN) for prostate cancer detection and grading. Further, it is not known whether the features learned by machine-learning algorithms coincide with diagnostic features used by pathologists. We propose a framework that forces algorithms to learn the cellular and subcellular differences between benign and cancerous prostate glands in digital slides from hematoxylin and eosin-stained tissue sections. After accurate gland segmentation and exclusion of the stroma, the central component of the pipeline, named HistoEM, utilizes a histogram embedding of features from the latent space of the CNN encoder. Each gland is represented by 128 feature-wise histograms that provide the input into a second network for benign vs cancer classification of the whole gland. Cancer glands are further processed by a U-Net structured network to separate low-grade from high-grade cancer. Our model demonstrates performance similar to that of other state-of-the-art prostate cancer grading models with gland-level resolution. To understand the features learned by HistoEM, we first rank features based on the distance between benign and cancer histograms and visualize the tissue origins of the 2 most important features. A heatmap of pixel activation by each feature is generated using Grad-CAM and overlaid on nuclear segmentation outlines. We conclude that HistoEM, similar to pathologists, uses nuclear features for the detection of prostate cancer. Altogether, this novel approach can be broadly deployed to visualize computer-learned features in histopathology images.
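The idea of representing a gland by feature-wise histograms can be sketched as follows. This is a loose illustration under stated assumptions, not the HistoEM implementation: it supposes that each pixel (or latent location) inside a gland has a feature vector scaled to the range 0 to 1, and it builds one normalized histogram per feature channel. The bin count, value range, and function name are hypothetical, and the toy example uses 2 channels where the abstract describes 128.

```python
from typing import List, Sequence


def histogram_embedding(features: Sequence[Sequence[float]],
                        n_bins: int = 8,
                        lo: float = 0.0,
                        hi: float = 1.0) -> List[List[float]]:
    """Return one normalized histogram per feature channel for a single gland.

    features: per-location feature vectors, all of the same length.
    Values outside [lo, hi] are clipped into the first or last bin.
    """
    if not features:
        raise ValueError("a gland needs at least one feature vector")
    n_features = len(features[0])
    width = (hi - lo) / n_bins
    counts = [[0.0] * n_bins for _ in range(n_features)]
    for vector in features:
        for channel, value in enumerate(vector):
            clipped = min(max(value, lo), hi)
            # The top edge (value == hi) is folded into the last bin.
            bin_index = min(int((clipped - lo) / width), n_bins - 1)
            counts[channel][bin_index] += 1.0
    n_locations = len(features)
    # Normalizing by location count makes glands of different size comparable.
    return [[c / n_locations for c in channel_counts] for channel_counts in counts]


# Toy gland with 3 locations and 2 feature channels (hypothetical values).
toy_gland = [[0.0, 0.9], [0.1, 0.95], [0.6, 1.0]]
toy_embedding = histogram_embedding(toy_gland, n_bins=2)
```

A fixed-length histogram per channel gives every gland the same input size regardless of its area, which is what allows a second network to classify whole glands.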
Subject(s)
Pathologists , Prostatic Neoplasms , Male , Humans , Workflow , Neural Networks, Computer , Algorithms , Prostatic Neoplasms/pathology
ABSTRACT
Evidence-based medicine (EBM) can be an unfamiliar territory for those working in tumor pathology research, and there is a great deal of uncertainty about how to undertake an EBM approach to planning and reporting histopathology-based studies. In this article, reviewed and endorsed by the World Health Organization International Agency for Research on Cancer's International Collaboration for Cancer Classification and Research, we aim to help pathologists and researchers understand the basics of planning an evidence-based tumor pathology research study and offer recommendations on how to report the findings from such studies. We introduce some basic EBM concepts, a framework for research questions, and thoughts on study design and emphasize the concept of reporting standards. There are many study-specific reporting guidelines available, and we provide an overview of these. However, existing reporting guidelines perhaps do not always fit tumor pathology research papers, and hence, here, we collate the key reporting data set together into one generic checklist that we think will simplify the task for pathologists. The article aims to complement our recent hierarchy of evidence for tumor pathology and glossary of evidence (study) types in tumor pathology. Together, these articles should help any researcher get to grips with the basics of EBM for planning and publishing research in tumor pathology, as well as encourage an improved standard of the reports available to us all in the literature.
Subject(s)
Evidence-Based Medicine , Neoplasms , World Health Organization , Humans , Neoplasms/pathology , Neoplasms/classification , Pathologists , Biomedical Research , Research Design/standards , Pathology/standards , Evidence Gaps
ABSTRACT
Somatic tumor testing in prostate cancer (PCa) can guide treatment options by identifying clinically actionable variants in DNA damage repair genes, including acquired variants not detected using germline testing alone. Guidelines currently recommend performing somatic tumor testing in metastatic PCa, whereas there is no consensus on the role of testing in regional disease, and the optimal testing strategy is only evolving. This study evaluates the frequency, distribution, and pathologic correlates of somatic DNA damage repair mutations in metastatic and localized PCa following the implementation of pathologist-driven reflex testing at diagnosis. A cohort of 516 PCa samples was sequenced using a custom next-generation sequencing panel including homologous recombination repair and mismatch repair genes. Variants were classified based on the Association for Molecular Pathology/American Society of Clinical Oncology/College of American Pathologists guidelines. In total, 183 (35.5%) patients had at least one variant: 72 of 516 (13.9%) patients had at least 1 tier I or tier II variant, whereas 111 of 516 (21.5%) patients had a tier III variant. Tier I/II variant(s) were identified in 27% (12/44) of metastatic biopsy samples and 13% (61/472) of primary samples. Overall, 12% (62/516) of patients had at least 1 tier I/II variant in a homologous recombination repair gene, whereas 2.9% (10/516) had at least 1 tier I/II variant in a mismatch repair gene. The presence of a tier I/II variant was not significantly associated with the grade group (GG) or presence of intraductal/cribriform carcinoma in the primary tumor. Among the 309 reflex-tested hormone-naive primary tumors, tier I/II variants were identified in 10% (31/309) of cases, distributed as follows: 9.2% (9/98) GG2; 9% (9/100) GG3; 9.1% (4/44) GG4; and 13.4% (9/67) GG5 cases.
Our findings confirm the use of somatic tumor testing in detecting variants of clinical significance in PCa and provide insights that can inform the design of testing strategies. Pathologist-initiated reflex testing streamlines the availability of the results for clinical decision-making; however, pathologic parameters such as GG and the presence of intraductal/cribriform carcinoma may not be reliable to guide patient selection.
Subject(s)
Prostatic Neoplasms , Tertiary Care Centers , Humans , Male , Prostatic Neoplasms/genetics , Prostatic Neoplasms/pathology , Prostatic Neoplasms/diagnosis , Aged , Middle Aged , Mutation , High-Throughput Nucleotide Sequencing , Pathologists
ABSTRACT
Several studies have developed various artificial intelligence (AI) models for immunohistochemical analysis of programmed death ligand 1 (PD-L1) in patients with non-small cell lung carcinoma; however, none have focused on specific ways by which AI-assisted systems could help pathologists determine the tumor proportion score (TPS). In this study, we developed an AI model to calculate the TPS of the PD-L1 22C3 assay and evaluated whether and how this AI-assisted system could help pathologists determine the TPS and analyze how AI-assisted systems could affect pathologists' assessment accuracy. We assessed the 4 methods of the AI-assisted system: (1 and 2) pathologists first assessed and then referred to automated AI scoring results (1, positive tumor cell percentage; 2, positive tumor cell percentage and visualized overlay image) for final confirmation, and (3 and 4) pathologists referred to the automated AI scoring results (3, positive tumor cell percentage; 4, positive tumor cell percentage and visualized overlay image) while determining TPS. Mixed-model analysis was used to calculate the odds ratios (ORs) with 95% CI for AI-assisted TPS methods 1 to 4 compared with pathologists' scoring. For all 584 samples of the tissue microarray, the OR for AI-assisted TPS methods 1 to 4 was 0.94 to 1.07 and not statistically significant. Of them, we found 332 discordant cases, on which the pathologists' judgments were inconsistent; the ORs for AI-assisted TPS methods 1, 2, 3, and 4 were 1.28 (1.06-1.54; P = .012), 1.29 (1.06-1.55; P = .010), 1.28 (1.06-1.54; P = .012), and 1.29 (1.06-1.55; P = .010), respectively, which were statistically significant. For discordant cases, the OR for each AI-assisted TPS method compared with the others was 0.99 to 1.01 and not statistically significant. This study emphasized the usefulness of the AI-assisted system for cases in which pathologists had difficulty determining the PD-L1 TPS.
Subject(s)
B7-H1 Antigen , Biomarkers, Tumor , Carcinoma, Non-Small-Cell Lung , Deep Learning , Immunohistochemistry , Lung Neoplasms , Pathologists , Humans , Carcinoma, Non-Small-Cell Lung/pathology , Carcinoma, Non-Small-Cell Lung/metabolism , Lung Neoplasms/pathology , Lung Neoplasms/metabolism , B7-H1 Antigen/analysis , Immunohistochemistry/methods , Biomarkers, Tumor/analysis , Female , Male , Reproducibility of Results
ABSTRACT
Whole slide imaging is becoming a routine procedure in clinical diagnosis. Advanced image analysis techniques have been developed to assist pathologists in disease diagnosis, staging, subtype classification, and risk stratification. Recently, deep learning algorithms have achieved state-of-the-art performances in various imaging analysis tasks, including tumor region segmentation, nuclei detection, and disease classification. However, widespread clinical use of these algorithms is hampered by their performances often degrading due to image quality issues commonly seen in real-world pathology imaging data such as low resolution, blurring regions, and staining variation. Restore-Generative Adversarial Network (GAN), a deep learning model, was developed to improve the imaging qualities by restoring blurred regions, enhancing low resolution, and normalizing staining colors. The results demonstrate that Restore-GAN can significantly improve image quality, which leads to improved model robustness and performance for existing deep learning algorithms in pathology image analysis. Restore-GAN has the potential to be used to facilitate the applications of deep learning models in digital pathology analyses.
Subject(s)
Algorithms , Pathologists , Humans , Cell Nucleus , Image Processing, Computer-Assisted , Staining and Labeling
ABSTRACT
Colorectal cancer (CRC) is one of the most common types of cancer among men and women. The grading of dysplasia and the detection of adenocarcinoma are important clinical tasks in the diagnosis of CRC and shape the patients' follow-up plans. This study evaluated the feasibility of deep learning models for the classification of colorectal lesions into four classes: benign, low-grade dysplasia, high-grade dysplasia, and adenocarcinoma. To this end, a deep neural network was developed on a training set of 655 whole slide images of digitized colorectal resection slides from a tertiary medical institution; and the network was evaluated on an internal test set of 234 slides, as well as on an external test set of 606 adenocarcinoma slides from The Cancer Genome Atlas database. The model achieved an overall accuracy, sensitivity, and specificity of 95.5%, 91.0%, and 97.1%, respectively, on the internal test set, and an accuracy and sensitivity of 98.5% for adenocarcinoma detection task on the external test set. Results suggest that such deep learning models can potentially assist pathologists in grading colorectal dysplasia, detecting adenocarcinoma, prescreening, and prioritizing the reviewing of suspicious cases to improve the turnaround time for patients with a high risk of CRC. Furthermore, the high sensitivity on the external test set suggests the model's generalizability in detecting colorectal adenocarcinoma on whole slide images across different institutions.
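The reported accuracy, sensitivity, and specificity can be related to confusion-matrix counts with a short sketch. It treats one class as positive and the rest as negative, which is an assumed simplification of the four-class task. The counts are hypothetical, chosen only to show why accuracy and sensitivity coincide on a test set that contains positive slides only, as the external adenocarcinoma set does.

```python
from typing import Dict, Optional


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, Optional[float]]:
    """Accuracy, sensitivity, and specificity from confusion-matrix counts.

    A metric is None when its denominator is zero, e.g. specificity
    on a test set that contains no negative slides.
    """
    total = tp + fp + tn + fn
    positives = tp + fn
    negatives = tn + fp
    return {
        "accuracy": (tp + tn) / total if total else None,
        "sensitivity": tp / positives if positives else None,
        "specificity": tn / negatives if negatives else None,
    }


# Hypothetical counts for a 606-slide set in which every slide is adenocarcinoma.
all_positive_set = binary_metrics(tp=597, fp=0, tn=0, fn=9)
```

With no negative slides, every correct call is a true positive, so accuracy reduces to sensitivity and specificity cannot be estimated from that set.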
Subject(s)
Adenocarcinoma , Colorectal Neoplasms , Deep Learning , Male , Humans , Female , Neural Networks, Computer , Adenocarcinoma/diagnosis , Adenocarcinoma/pathology , Pathologists , Hyperplasia , Colorectal Neoplasms/diagnosis
ABSTRACT
Artificial intelligence (AI)-based diagnostic tools can offer numerous benefits to the field of histopathology, including improved diagnostic accuracy, efficiency and productivity. As a result, such tools are likely to have an increasing role in routine practice. However, all AI tools are prone to errors, and these AI-associated errors have been identified as a major risk in the introduction of AI into healthcare. The errors made by AI tools are different, in terms of both cause and nature, to the errors made by human pathologists. As highlighted by the National Institute for Health and Care Excellence, it is imperative that practising pathologists understand the potential limitations of AI tools, including the errors made. Pathologists are in a unique position to be gatekeepers of AI tool use, maximizing patient benefit while minimizing harm. Furthermore, their pathological knowledge is essential to understanding when, and why, errors have occurred and so to developing safer future algorithms. This paper summarises the literature on errors made by AI diagnostic tools in histopathology. These include erroneous errors, data concerns (data bias, hidden stratification, data imbalances, distributional shift, and lack of generalisability), reinforcement of outdated practices, unsafe failure mode, automation bias, and insensitivity to impact. Methods to reduce errors in both tool design and clinical use are discussed, and the practical roles for pathologists in error minimisation are highlighted. This aims to inform and empower pathologists to move safely through this seismic change in practice and help ensure that novel AI tools are adopted safely.
Subject(s)
Artificial Intelligence , Pathologists , Humans , Algorithms
ABSTRACT
BACKGROUND AND AIMS: ChatGPT is a powerful artificial intelligence (AI) chatbot developed by the OpenAI research laboratory which is capable of analysing human input and generating human-like responses. Early research into the potential application of ChatGPT in healthcare has focused mainly on clinical and administrative functions. The diagnostic ability and utility of ChatGPT in histopathology is not well defined. We benchmarked the performance of ChatGPT against pathologists in diagnostic histopathology, and evaluated the collaborative potential between pathologists and ChatGPT to deliver more accurate diagnoses. METHODS AND RESULTS: In Part 1 of the study, pathologists and ChatGPT were subjected to a series of questions encompassing common diagnostic conundrums in histopathology. For Part 2, pathologists reviewed a series of challenging virtual slides and provided their diagnoses before and after consultation with ChatGPT. We found that ChatGPT performed worse than pathologists in reaching the correct diagnosis. Consultation with ChatGPT provided limited help and information generated from ChatGPT is dependent on the prompts provided by the pathologists and is not always correct. Finally, we surveyed pathologists who rated the diagnostic accuracy of ChatGPT poorly, but found it useful as an advanced search engine. CONCLUSIONS: The use of ChatGPT4 as a diagnostic tool in histopathology is limited by its inherent shortcomings. Judicious evaluation of the information and histopathology diagnosis generated from ChatGPT4 is essential and cannot replace the acuity and judgement of a pathologist. However, future advances in generative AI may expand its role in the field of histopathology.
Subject(s)
Artificial Intelligence , Pathologists , Humans , Biopsy , Referral and Consultation , Software
ABSTRACT
A growing body of research supports stromal tumour-infiltrating lymphocyte (TIL) density in breast cancer as a robust prognostic and predictive biomarker. The gold standard for stromal TIL density quantitation in breast cancer is pathologist visual assessment using haematoxylin and eosin-stained slides. Artificial intelligence/machine-learning algorithms are in development to automate the stromal TIL scoring process, and must be validated against a reference standard such as pathologist visual assessment. Visual TIL assessment may suffer from significant interobserver variability. To improve interobserver agreement, regulatory science experts at the US Food and Drug Administration partnered with academic pathologists internationally to create a freely available online continuing medical education (CME) course to train pathologists in assessing breast cancer stromal TILs using an interactive format with expert commentary. Here we describe and provide a user guide to this CME course, whose content was designed to improve pathologist accuracy in scoring breast cancer TILs. We also suggest subsequent steps to translate knowledge into clinical practice with proficiency testing.
Subject(s)
Breast Neoplasms , Humans , Female , Pathologists , Lymphocytes, Tumor-Infiltrating , Artificial Intelligence , Prognosis
ABSTRACT
In recent years anatomical pathology has been revolutionised by the incorporation of molecular findings into routine diagnostic practice, and in some diseases the presence of specific molecular alterations is now essential for diagnosis. Spatial transcriptomics describes a group of technologies that provide up to transcriptome-wide expression profiling while preserving the spatial origin of the data, with many of these technologies able to provide these data using a single tissue section. Spatial transcriptomics allows expression profiling of highly specific areas within a tissue section potentially to subcellular resolution, and allows correlation of expression data with morphology, tissue type and location relative to other structures. While largely still research laboratory-based, several spatial transcriptomics methods have now achieved compatibility with formalin-fixed paraffin-embedded tissue (FFPE), allowing their use in diagnostic tissue samples, and with further development potentially leading to their incorporation in routine anatomical pathology practice. This mini review provides an overview of spatial transcriptomics methods, with an emphasis on platforms compatible with FFPE tissue, approaches to assess the data and potential applications in anatomical pathology practice.
Subject(s)
Gene Expression Profiling , Pathologists , Humans , Paraffin Embedding/methods , Gene Expression Profiling/methods , Transcriptome , Formaldehyde/metabolism
ABSTRACT
Many patients with non-small cell lung cancer do not receive guideline-recommended, biomarker-directed therapy, despite the potential for improved clinical outcomes. Access to timely, accurate, and comprehensive molecular profiling, including targetable protein overexpression, is essential to allow fully informed treatment decisions to be taken. In turn, this requires optimal tissue management to protect and maximize the use of this precious finite resource. Here, a group of leading thoracic pathologists recommend factors to consider for optimal tissue management. Starting from when lung cancer is first suspected, keeping predictive biomarker testing in the front of the mind should drive the development of practices and procedures that conserve tissue appropriately to support molecular characterization and treatment selection.