ABSTRACT
Recent advances in the field of immuno-oncology have brought transformative changes in the management of cancer patients. The immune profile of tumours has been found to have key value in predicting disease prognosis and treatment response in various cancers. Multiplex immunohistochemistry and immunofluorescence have emerged as potent tools for the simultaneous detection of multiple protein biomarkers in a single tissue section, thereby expanding opportunities for molecular and immune profiling while preserving tissue samples. By establishing the phenotype of individual tumour cells when distributed within a mixed cell population, the identification of clinically relevant biomarkers with high-throughput multiplex immunophenotyping of tumour samples has great potential to guide appropriate treatment choices. Moreover, the emergence of novel multi-marker imaging approaches can now provide unprecedented insights into the tumour microenvironment, including the potential interplay between various cell types. However, there are significant challenges to widespread integration of these technologies in daily research and clinical practice. This review addresses the challenges and potential solutions within a structured framework of action from a regulatory and clinical trial perspective. New developments within the field of immunophenotyping using multiplexed tissue imaging platforms and associated digital pathology are also described, with a specific focus on translational implications across different subtypes of cancer. © 2024 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
Subject(s)
Breast Neoplasms , Humans , Female , Biomarkers, Tumor/genetics , Prognosis , Phenotype , United Kingdom , Tumor Microenvironment
ABSTRACT
This work puts forth and demonstrates the utility of a reporting framework for collecting and evaluating annotations of medical images used for training and testing artificial intelligence (AI) models in assisting detection and diagnosis. AI has unique reporting requirements, as shown by the AI extensions to the Consolidated Standards of Reporting Trials (CONSORT) and Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) checklists and the proposed AI extensions to the Standards for Reporting Diagnostic Accuracy (STARD) and Transparent Reporting of a Multivariable Prediction model for Individual Prognosis or Diagnosis (TRIPOD) checklists. AI for detection and/or diagnostic image analysis requires complete, reproducible, and transparent reporting of the annotations and metadata used in training and testing data sets. In an earlier work by other researchers, an annotation workflow and quality checklist for computational pathology annotations were proposed. In this manuscript, we operationalize this workflow into an evaluable quality checklist that applies to any reader-interpreted medical images, and we demonstrate its use for an annotation effort in digital pathology. We refer to this quality framework as the Collection and Evaluation of Annotations for Reproducible Reporting of Artificial Intelligence (CLEARR-AI).
Subject(s)
Artificial Intelligence , Checklist , Humans , Prognosis , Image Processing, Computer-Assisted , Research Design
ABSTRACT
A growing body of research supports stromal tumour-infiltrating lymphocyte (TIL) density in breast cancer as a robust prognostic and predictive biomarker. The gold standard for stromal TIL density quantitation in breast cancer is pathologist visual assessment using haematoxylin and eosin-stained slides. Artificial intelligence/machine-learning algorithms are in development to automate the stromal TIL scoring process, and must be validated against a reference standard such as pathologist visual assessment. Visual TIL assessment may suffer from significant interobserver variability. To improve interobserver agreement, regulatory science experts at the US Food and Drug Administration partnered with academic pathologists internationally to create a freely available online continuing medical education (CME) course to train pathologists in assessing breast cancer stromal TILs using an interactive format with expert commentary. Here we describe and provide a user guide to this CME course, whose content was designed to improve pathologist accuracy in scoring breast cancer TILs. We also suggest subsequent steps to translate knowledge into clinical practice with proficiency testing.
Subject(s)
Breast Neoplasms , Humans , Female , Pathologists , Lymphocytes, Tumor-Infiltrating , Artificial Intelligence , Prognosis
ABSTRACT
Quantifying tumor-infiltrating lymphocytes (TILs) in breast cancer tumors is a challenging task for pathologists. With the advent of whole slide imaging that digitizes glass slides, it is possible to apply computational models to quantify TILs for pathologists. Development of computational models requires significant time, expertise, consensus, and investment. To reduce this burden, we are preparing a dataset for developers to validate their models and a proposal to the Medical Device Development Tool (MDDT) program in the Center for Devices and Radiological Health of the U.S. Food and Drug Administration (FDA). If the FDA qualifies the dataset for its submitted context of use, model developers can use it in a regulatory submission within the qualified context of use without additional documentation. Our dataset aims at reducing the regulatory burden placed on developers of models that estimate the density of TILs and will allow head-to-head comparison of multiple computational models on the same data. In this paper, we discuss the MDDT preparation and submission process, including the feedback we received from our initial interactions with the FDA and propose how a qualified MDDT validation dataset could be a mechanism for open, fair, and consistent measures of computational model performance. Our experiences will help the community understand what the FDA considers relevant and appropriate (from the perspective of the submitter), at the early stages of the MDDT submission process, for validating stromal TIL density estimation models and other potential computational models. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland. This article has been contributed to by U.S. Government employees and their work is in the public domain in the USA.
Subject(s)
Lymphocytes, Tumor-Infiltrating , Pathologists , United States , Humans , United States Food and Drug Administration , Lymphocytes, Tumor-Infiltrating/pathology , United Kingdom
ABSTRACT
Modern histologic imaging platforms coupled with machine learning methods have provided new opportunities to map the spatial distribution of immune cells in the tumor microenvironment. However, there exists no standardized method for describing or analyzing spatial immune cell data, and most reported spatial analyses are rudimentary. In this review, we provide an overview of two approaches for reporting and analyzing spatial data (raster versus vector-based). We then provide a compendium of spatial immune cell metrics that have been reported in the literature, summarizing prognostic associations in the context of a variety of cancers. We conclude by discussing two well-described clinical biomarkers, the breast cancer stromal tumor infiltrating lymphocytes score and the colon cancer Immunoscore, and describe investigative opportunities to improve clinical utility of these spatial biomarkers. © 2023 The Pathological Society of Great Britain and Ireland.
Subject(s)
Colonic Neoplasms , Humans , Biomarkers , Benchmarking , Lymphocytes, Tumor-Infiltrating , Spatial Analysis , Tumor Microenvironment
ABSTRACT
The clinical significance of the tumor-immune interaction in breast cancer is now established, and tumor-infiltrating lymphocytes (TILs) have emerged as predictive and prognostic biomarkers for patients with triple-negative (estrogen receptor, progesterone receptor, and HER2-negative) breast cancer and HER2-positive breast cancer. How computational assessments of TILs might complement manual TIL assessment in trial and daily practices is currently debated. Recent efforts to use machine learning (ML) to automatically evaluate TILs have shown promising results. We review state-of-the-art approaches and identify pitfalls and challenges of automated TIL evaluation by studying the root cause of ML discordances in comparison to manual TIL quantification. We categorize our findings into four main topics: (1) technical slide issues, (2) ML and image analysis aspects, (3) data challenges, and (4) validation issues. The main reason for discordant assessments is the inclusion of false-positive areas or cells identified by performance on certain tissue patterns or design choices in the computational implementation. To aid the adoption of ML for TIL assessment, we provide an in-depth discussion of ML and image analysis, including validation issues that need to be considered before reliable computational reporting of TILs can be incorporated into the trial and routine clinical management of patients with triple-negative breast cancer. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
Subject(s)
Mammary Neoplasms, Animal , Triple Negative Breast Neoplasms , Humans , Animals , Lymphocytes, Tumor-Infiltrating , Biomarkers , Machine Learning
ABSTRACT
Graph data models are an emerging approach to structure clinical and biomedical information. These models offer intriguing opportunities for novel approaches in healthcare, such as disease phenotyping, risk prediction, and personalized precision care. The combination of data and information in a graph model to create knowledge graphs has rapidly expanded in biomedical research, but the integration of real-world data from the electronic health record has been limited. To broadly apply knowledge graphs to EHR and other real-world data, a deeper understanding of how to represent these data in a standardized graph model is needed. We provide an overview of the state-of-the-art research for clinical and biomedical data integration and summarize the potential to accelerate healthcare and precision medicine research through insight generation from integrated knowledge graphs.
Subject(s)
Algorithms , Biomedical Research , Humans , Pattern Recognition, Automated , Phenotype , Precision Medicine
ABSTRACT
BACKGROUND: Clinical babesiosis is diagnosed, and parasite burden is determined, by microscopic inspection of a thick or thin Giemsa-stained peripheral blood smear. However, quantitative analysis by manual microscopy is subject to error. As such, methods for the automated measurement of percent parasitemia in digital microscopic images of peripheral blood smears could improve clinical accuracy, relative to the predicate method. METHODS: Individual erythrocyte images were manually labeled as "parasite" or "normal" and were used to train a model for binary image classification. The best model was then used to calculate percent parasitemia from a clinical validation dataset, and values were compared to a clinical reference value. Lastly, model interpretability was examined using an integrated gradient to identify pixels most likely to influence classification decisions. RESULTS: The precision and recall of the model during development testing were 0.92 and 1.00, respectively. In clinical validation, the model returned increasing positive signal with increasing mean reference value. However, there were 2 highly erroneous false positive values returned by the model. Further, the model incorrectly assessed 3 cases well above the clinical threshold of 10%. The integrated gradient suggested potential sources of false positives including rouleaux formations, cell boundaries, and precipitate as deterministic factors in negative erythrocyte images. CONCLUSIONS: While the model demonstrated highly accurate single cell classification and correctly assessed most slides, several false positives were highly incorrect. This project highlights the need for integrated testing of machine learning-based models, even when models in the development phase perform well.
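The quantities named in this abstract can be illustrated with a minimal Python sketch. It assumes that percent parasitemia is the share of classified erythrocytes labeled "parasite" and that precision and recall follow their standard definitions; the function names and the example counts are hypothetical (the counts were chosen only so that the output matches the reported 0.92 and 1.00) and are not taken from the study.

```python
from typing import Sequence, Tuple


def percent_parasitemia(labels: Sequence[str]) -> float:
    """Return the percentage of erythrocytes classified as 'parasite'."""
    if not labels:
        raise ValueError("no erythrocyte classifications supplied")
    infected = sum(1 for label in labels if label == "parasite")
    return 100.0 * infected / len(labels)


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Standard precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical smear: 3 infected cells among 100 classified erythrocytes.
smear = ["parasite"] * 3 + ["normal"] * 97
print(percent_parasitemia(smear))   # 3.0

# Hypothetical confusion counts that would yield the reported 0.92 / 1.00.
print(precision_recall(tp=23, fp=2, fn=0))   # (0.92, 1.0)
```

Because the slide-level value is a ratio over all classified cells, even a small per-cell false positive rate can inflate the estimate on a smear with few or no infected cells, which is consistent with the abstract's call for integrated testing.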
Subject(s)
Babesia , Parasitemia , Erythrocytes , Humans , Microscopy/methods , Neural Networks, Computer , Parasitemia/diagnosis
ABSTRACT
PURPOSE: To compare patient preferences for eyeglasses prescribed using a low-cost, portable wavefront autorefractor versus standard subjective refraction (SR). DESIGN: Randomized, cross-over clinical trial. PARTICIPANTS: Patients aged 18 to 40 years presenting with refractive errors (REs) to a tertiary eye hospital in Southern India. METHODS: Participants underwent SR followed by autorefraction (AR) using the monocular version of the QuickSee device (PlenOptika Inc). An independent optician, masked to the refraction approach, prepared eyeglasses based on each refraction approach. Participants (masked to refraction source) were randomly assigned to use SR- or AR-based eyeglasses first, followed by the other pair, for 1 week each. At the end of each week, participants had their vision checked and were interviewed about their experience with the eyeglasses. MAIN OUTCOME MEASURES: Proportion of patients preferring eyeglasses prescribed using AR and SR. RESULTS: The 400 participants enrolled between March 26, 2018, and August 2, 2019, had a mean (standard deviation) age of 28.4 (6.6) years, and 68.8% were women. There was a strong correlation between spherical equivalents using SR and AR (r = 0.97, P < 0.001) with a mean difference of -0.07 diopters (D) (95% limits of agreement [LoA], -0.68 to 0.83). Of the 301 patients (75.2%) who completed both follow-up visits, 50.5% (n = 152) and 49.5% (n = 149) preferred glasses prescribed using SR and AR, respectively (95% CI, 45.7-56.3; P = 0.86). There were no differences in demographic or vision characteristics between participants with different preferences (P > 0.05 for all). CONCLUSIONS: We observed a strong agreement between the prescriptions from SR and AR, and eyeglasses prescribed using SR and AR were equally preferred by patients. Wider use of prescribing based on AR alone in resource-limited settings is supported by these findings.
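For readers unfamiliar with limits of agreement, the following minimal Python sketch shows one common way to compute them, assuming the usual Bland-Altman form (mean difference plus or minus 1.96 standard deviations of the paired differences). The abstract does not state how its limits were derived, and the function name and spherical-equivalent values below are hypothetical, not study data.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def limits_of_agreement(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean difference, lower 95% limit, upper 95% limit) for paired values a - b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return mean_diff, mean_diff - spread, mean_diff + spread


# Hypothetical spherical equivalents in diopters for four eyes.
autorefraction = [-1.25, -2.00, -0.75, -3.25]
subjective = [-1.00, -2.25, -0.50, -3.00]

mean_diff, lower, upper = limits_of_agreement(autorefraction, subjective)
print(round(mean_diff, 3), round(lower, 3), round(upper, 3))
```

A high correlation coefficient alone does not show that two methods are interchangeable, which is why agreement is usually summarized by the spread of the paired differences as well.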
Subject(s)
Eyeglasses , Prescriptions , Refractive Errors/diagnosis , Retinoscopy/economics , Retinoscopy/standards , Adolescent , Adult , Cross-Over Studies , Double-Blind Method , Female , Humans , Male , Patient Acceptance of Health Care , Refraction, Ocular/physiology , Refractive Errors/physiopathology , Refractive Errors/therapy , Reproducibility of Results , Young Adult
ABSTRACT
PURPOSE: We propose an addition to the Snosek et al. classification to include a subtype variant of sternalis muscle: mixed type and triple subtype. METHODS: Dissection of the anterior thorax of a 96-year-old female cadaver revealed bilateral sternalis muscles with an undocumented variant of the right sternalis muscle. RESULTS: The left sternalis muscle presented as a simple type-left single using the Snosek et al. classification scheme. The right sternalis muscle revealed a previously undocumented classification type. It consisted of three bellies and two heads, with the lateral head formed by two converging bellies and the medial head formed from the superficial medial belly. CONCLUSIONS: The unique presentation of right sternalis muscle can be classified by expanding the Snosek et al. classification scheme to include triple-bellied subtypes. This presentation is classified as a mixed type-right triple, with single bicipital converging and single bicipital diverging. Documentation of sternalis muscle variations can prevent misdiagnoses within the anterior thorax.
Subject(s)
Muscle, Skeletal/anatomy & histology , Sternum/anatomy & histology , Thoracic Wall/anatomy & histology , Aged, 80 and over , Anatomic Variation , Cadaver , Dissection , Female , Humans
ABSTRACT
Importance: Physicians who belong to minoritized racial and ethnic groups remain underrepresented and underpromoted. Serving as a chief resident is an important position of leadership and prestige, and indicates a benchmark for future professional success. However, it is unknown if disparities in race and/or sex exist in the chief resident selection process. Objective: To describe race, ethnicity, and sex of emergency medicine (EM) chief residents and determine the association of racial identity and the intersectionality of race and sex for selecting chief residents in US emergency medicine departments. Design, Setting, and Participants: This cohort study analyzed data collected from the Association of American Medical Colleges and the Electronic Residency Application Service in the graduating classes of 2017 and 2018. Data were analyzed between December 2021 and January 2023. Main Outcomes and Measures: Relative risk (RR) of selection for chief residency for Black, Asian, and Hispanic EM residents in comparison with White counterparts. Results: Among 3408 studied residents, 738 (21.7%) served as chief resident (2253 male [66.1%]; 451 Asian [13.2%], 144 Black [4.2%], 158 Hispanic [4.6%], 239 more than 1 race [7.0%], 46 other [1.3%], and 2370 White [69.5%]). Of chiefs, 81 (11.0%) identified as Asian, 17 (2.3%) as Black, and 26 (3.5%) as Hispanic. Asian residents were 78% (95% CI, 63%-96%) as likely to be promoted to chief resident compared with White peers, and Black residents were 51% (95% CI, 32%-80%) as likely as White residents. In our fully adjusted model, racial differences remained significant for Black residents, who were half as likely as White residents to be selected for chief residency (adjusted risk ratio [aRR], 0.55; 95% CI, 0.36-0.82). Overall, White women were most likely to be selected for chief residency and 20% more likely to be selected than their White male counterparts (aRR, 1.20; 95% CI, 1.03-1.39). In comparison, women underrepresented in medicine (a category that included residents identified as Black, Hispanic, American Indian or Alaskan Native, and Native Hawaiian or Other Pacific Islander) were least likely to be selected for chief promotion, and 50% as likely to be selected for chief resident compared with White men (aRR, 0.50; 95% CI, 0.06-0.66). Conclusions and Relevance: In this 2024 nationally representative study of EM residents, chief promotion was lower among residents identifying as Asian or Black, and in particular, women underrepresented in medicine. This study's findings suggest further review of chief resident selection process by residency programs and accreditation bodies is needed to ensure workforce equity for promotion and opportunities for leadership.
Subject(s)
Emergency Medicine , Internship and Residency , Humans , Internship and Residency/statistics & numerical data , Emergency Medicine/education , Emergency Medicine/statistics & numerical data , Female , Male , United States , Adult , Cohort Studies
ABSTRACT
OBJECTIVES: To introduce quantum computing technologies as a tool for biomedical research and highlight future applications within healthcare, focusing on its capabilities, benefits, and limitations. TARGET AUDIENCE: Investigators seeking to explore quantum computing and create quantum-based applications for healthcare and biomedical research. SCOPE: Quantum computing requires specialized hardware, known as quantum processing units, that use quantum bits (qubits) instead of classical bits to perform computations. This article will cover (1) proposed applications where quantum computing offers advantages to classical computing in biomedicine; (2) an introduction to how quantum computers operate, tailored for biomedical researchers; (3) recent progress that has expanded access to quantum computing; and (4) challenges, opportunities, and proposed solutions to integrate quantum computing in biomedical applications.
Subject(s)
Biomedical Research , Quantum Theory , Humans , Delivery of Health Care , Computing Methodologies
ABSTRACT
Studies during the COVID-19 pandemic showed that children had heightened nasal innate immune responses compared with adults. To evaluate the role of nasal viruses and bacteria in driving these responses, we performed cytokine profiling and comprehensive, symptom-agnostic testing for respiratory viruses and bacterial pathobionts in nasopharyngeal samples from children tested for SARS-CoV-2 in 2021-22 (n = 467). Respiratory viruses and/or pathobionts were highly prevalent (82% of symptomatic and 30% asymptomatic children; 90 and 49% for children <5 years). Virus detection and load correlated with the nasal interferon response biomarker CXCL10, and the previously reported discrepancy between SARS-CoV-2 viral load and nasal interferon response was explained by viral coinfections. Bacterial pathobionts correlated with a distinct proinflammatory response with elevated IL-1ß and TNF but not CXCL10. Furthermore, paired samples from healthy 1-year-olds collected 1-2 wk apart revealed frequent respiratory virus acquisition or clearance, with mucosal immunophenotype changing in parallel. These findings reveal that frequent, dynamic host-pathogen interactions drive nasal innate immune activation in children.
Subject(s)
COVID-19 , Immunity, Innate , SARS-CoV-2 , Humans , Immunity, Innate/immunology , Child, Preschool , Infant , COVID-19/immunology , COVID-19/virology , Child , SARS-CoV-2/immunology , Female , Male , Nasopharynx/immunology , Nasopharynx/virology , Nasopharynx/microbiology , Viral Load , Nasal Mucosa/immunology , Nasal Mucosa/virology , Nasal Mucosa/microbiology , Cytokines/metabolism , Cytokines/immunology , Host-Pathogen Interactions/immunology , Adolescent , Nose/immunology , Nose/virology , Nose/microbiology , Coinfection/immunology , Coinfection/virology
ABSTRACT
Purpose: Validation of artificial intelligence (AI) algorithms in digital pathology with a reference standard is necessary before widespread clinical use, but few examples focus on creating a reference standard based on pathologist annotations. This work assesses the results of a pilot study that collects density estimates of stromal tumor-infiltrating lymphocytes (sTILs) in breast cancer biopsy specimens. This work will inform the creation of a validation dataset for the evaluation of AI algorithms fit for a regulatory purpose. Approach: Collaborators and crowdsourced pathologists contributed glass slides, digital images, and annotations. Here, "annotations" refer to any marks, segmentations, measurements, or labels a pathologist adds to a report, image, region of interest (ROI), or biological feature. Pathologists estimated sTILs density in 640 ROIs from hematoxylin and eosin stained slides of 64 patients via two modalities: an optical light microscope and two digital image viewing platforms. Results: The pilot study generated 7373 sTILs density estimates from 29 pathologists. Analysis of annotations found the variability of density estimates per ROI increases with the mean; the root mean square differences were 4.46, 14.25, and 26.25 as the mean density ranged from 0% to 10%, 11% to 40%, and 41% to 100%, respectively. The pilot study informs three areas of improvement for future work: technical workflows, annotation platforms, and agreement analysis methods. Upgrades to the workflows and platforms will improve operability and increase annotation speed and consistency. Conclusions: Exploratory data analysis demonstrates the need to develop new statistical approaches for agreement. The pilot study dataset and analysis methods are publicly available to allow community feedback. The development and results of the validation dataset will be publicly available to serve as an instructive tool that can be replicated by developers and researchers.
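As an illustration of the variability measure mentioned in this abstract, the sketch below computes a root mean square difference across all pairs of pathologist estimates for a single ROI. This pairwise definition is an assumption made for illustration; the abstract does not specify how the study computed its values, and the function name and example estimates are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def rms_pairwise_difference(estimates: Sequence[float]) -> float:
    """Root mean square of the differences between every pair of estimates."""
    pairs = list(combinations(estimates, 2))
    if not pairs:
        raise ValueError("need at least two estimates for one ROI")
    return sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))


# Hypothetical sTILs density estimates (percent) from three pathologists for one ROI.
roi_estimates = [10.0, 20.0, 30.0]
print(round(rms_pairwise_difference(roi_estimates), 2))
```

Grouping ROIs by their mean density and averaging such a measure within each group would be one way to examine how reader variability grows with the mean, the pattern the abstract reports.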
ABSTRACT
PURPOSE: Validating artificial intelligence algorithms for clinical use in medical images is a challenging endeavor due to a lack of standard reference data (ground truth). This topic typically occupies a small portion of the discussion in research papers since most of the efforts are focused on developing novel algorithms. In this work, we present a collaboration to create a validation dataset of pathologist annotations for algorithms that process whole slide images. We focus on data collection and evaluation of algorithm performance in the context of estimating the density of stromal tumor-infiltrating lymphocytes (sTILs) in breast cancer. METHODS: We digitized 64 glass slides of hematoxylin- and eosin-stained invasive ductal carcinoma core biopsies prepared at a single clinical site. A collaborating pathologist selected 10 regions of interest (ROIs) per slide for evaluation. We created training materials and workflows to crowdsource pathologist image annotations in two modes: an optical microscope and two digital platforms. The microscope platform allows the same ROIs to be evaluated in both modes. The workflows collect the ROI type, a decision on whether the ROI is appropriate for estimating the density of sTILs, and if appropriate, the sTIL density value for that ROI. RESULTS: In total, 19 pathologists made 1645 ROI evaluations during a data collection event and the following 2 weeks. The pilot study yielded an abundant number of cases with nominal sTIL infiltration. Furthermore, we found that the sTIL densities are correlated within a case, and there is notable pathologist variability. Consequently, we outline plans to improve our ROI and case sampling methods. We also outline statistical methods to account for ROI correlations within a case and pathologist variability when validating an algorithm. CONCLUSION: We have built workflows for efficient data collection and tested them in a pilot study. As we prepare for pivotal studies, we will investigate methods to use the dataset as an external validation tool for algorithms. We will also consider what it will take for the dataset to be fit for a regulatory purpose: study size, patient population, and pathologist training and qualifications. To this end, we will elicit feedback from the Food and Drug Administration via the Medical Device Development Tool program and from the broader digital pathology and AI community. Ultimately, we intend to share the dataset, statistical methods, and lessons learned.
ABSTRACT
Unlocking the full potential of pathology data by gaining computational access to histological pixel data and metadata (digital pathology) is one of the key promises of computational pathology. Despite scientific progress and several regulatory approvals for primary diagnosis using whole-slide imaging, true clinical adoption at scale is slower than anticipated. In the U.S., advances in digital pathology are often siloed pursuits by individual stakeholders, and to our knowledge, there has not been a systematic approach to advance the field through a regulatory science initiative. The Alliance for Digital Pathology (the Alliance) is a recently established, volunteer, collaborative, regulatory science initiative to standardize digital pathology processes to speed up innovation to patients. The purpose is: (1) to account for the patient perspective by including patient advocacy; (2) to investigate and develop methods and tools for the evaluation of effectiveness, safety, and quality to specify risks and benefits in the precompetitive phase; (3) to help strategize the sequence of clinically meaningful deliverables; (4) to encourage and streamline the development of ground-truth data sets for machine learning model development and validation; and (5) to clarify regulatory pathways by investigating relevant regulatory science questions. The Alliance accepts participation from all stakeholders, and we solicit clinically relevant proposals that will benefit the field at large. The initiative will dissolve once a clinical, interoperable, modularized, integrated solution (from tissue acquisition to diagnostic algorithm) has been implemented. In times of rapidly evolving discoveries, scientific input from subject-matter experts is one essential element to inform regulatory guidance and decision-making. The Alliance aims to establish and promote synergistic regulatory science efforts that will leverage diverse inputs to move digital pathology forward and ultimately improve patient care.
ABSTRACT
Assessment of tumor-infiltrating lymphocytes (TILs) is increasingly recognized as an integral part of the prognostic workflow in triple-negative (TNBC) and HER2-positive breast cancer, as well as many other solid tumors. This recognition has come about thanks to standardized visual reporting guidelines, which helped to reduce inter-reader variability. Now, there are ripe opportunities to employ computational methods that extract spatio-morphologic predictive features, enabling computer-aided diagnostics. We detail the benefits of computational TILs assessment and the readiness of TILs scoring for computational assessment, and outline considerations for overcoming key barriers to clinical translation in this arena. Specifically, we discuss: 1. ensuring computational workflows closely capture visual guidelines and standards; 2. challenges in, and thoughts on, standards for the assessment of algorithms, including training, preanalytical, analytical, and clinical validation; 3. perspectives on how to realize the potential of machine learning models and to overcome the perceptual and practical limits of visual scoring.