ABSTRACT
The US is experiencing an opioid epidemic, with opioid overdose causing more than 100 deaths per day. Early identification of patients at high risk of opioid overdose (OD) can enable targeted preventive interventions. We aim to build a deep learning model that predicts which patients are at high risk of opioid overdose and identifies the most relevant features. The study included information on 5,231,614 patients from the Health Facts database with at least one opioid prescription between January 1, 2008 and December 31, 2017. Potential predictors (n = 1185) were extracted to build a feature matrix for prediction. Long Short-Term Memory (LSTM)-based models were built to predict overdose risk at the next hospital visit, and their performance was compared with that of other machine learning methods using standard evaluation metrics. Our sequential deep learning models built upon LSTM outperformed the other methods on opioid overdose prediction; LSTM with an attention mechanism achieved the highest F1 score (F1: 0.7815, AUROC: 0.8449). The model can also reveal the top-ranked predictive features, including medications and vital signs, via the permutation importance method. This study demonstrates that a temporal deep learning based predictive model can achieve promising results in identifying patients at risk of opioid overdose from their electronic health record history. It provides an alternative informatics-based approach to improving clinical decision support for possible early detection and intervention to reduce opioid overdose.
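The abstract does not describe how permutation importance was implemented; as a minimal, hypothetical sketch of the general technique used to rank predictive features (the scoring model and data below are illustrative toys, not the paper's EHR model):

```python
import random

def permutation_importance(model_score, X, y, n_repeats=10, seed=0):
    """Estimate each feature's importance as the average drop in a
    model's score after randomly shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = model_score(X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - model_score(X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy "model": accuracy of predicting y from feature 0 alone, so
# shuffling feature 0 should hurt the score; feature 1 is constant
# and therefore should get (near) zero importance.
def score(X, y):
    return sum(1 for row, label in zip(X, y) if (row[0] > 0) == label) / len(y)

X = [[1, 5], [-1, 5], [2, 5], [-2, 5]] * 25
y = [True, False, True, False] * 25
imp = permutation_importance(score, X, y)
```

The same idea applies unchanged to a trained LSTM: shuffle one input feature across patients and measure the degradation of the evaluation metric.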
Subject(s)
Deep Learning , Opiate Overdose , Opioid Analgesics/adverse effects , Electronic Health Records , Humans , Prescriptions
ABSTRACT
The Cancer Genome Atlas (TCGA) represents one of several international consortia dedicated to performing comprehensive genomic and epigenomic analyses of selected tumour types to advance our understanding of disease and provide an open-access resource for worldwide cancer research. Thirty-three tumour types (selected by histology or tissue of origin, to include both common and rare diseases), comprising >11 000 specimens, were subjected to DNA sequencing, copy number and methylation analysis, and transcriptomic, proteomic and histological evaluation. Each cancer type was analysed individually to identify tissue-specific alterations, and make correlations across different molecular platforms. The final dataset was then normalized and combined for the PanCancer Initiative, which seeks to identify commonalities across different cancer types or cells of origin/lineage, or within anatomically or morphologically related groups. An important resource generated along with the rich molecular studies is an extensive digital pathology slide archive, composed of frozen section tissue directly related to the tissues analysed as part of TCGA, and representative formalin-fixed paraffin-embedded, haematoxylin and eosin (H&E)-stained diagnostic slides. These H&E image resources have primarily been used to verify diagnoses and histological subtypes with some limited extraction of standard pathological variables such as mitotic activity, grade, and lymphocytic infiltrates. Largely overlooked is the richness of these scanned images for more sophisticated feature extraction approaches coupled with machine learning, and ultimately correlation with molecular features and clinical endpoints. Here, we document initial attempts to exploit TCGA imaging archives, and describe some of the tools, and the rapidly evolving image analysis/feature extraction landscape. 
Our hope is to inform, and ultimately inspire and challenge, the pathology and cancer research communities to exploit these imaging resources so that the full potential of this integral platform of TCGA can be used to complement and enhance the insightful integrated analyses from the genomic and epigenomic platforms. Copyright © 2017 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
Subject(s)
Tumor Biomarkers/genetics , Genomics/methods , Neoplasms/genetics , Neoplasms/pathology , Molecular Pathology/methods , Genetic Databases , Genetic Epigenesis , Genetic Predisposition to Disease , Human Genome , Humans , Computer-Assisted Image Interpretation , Neoplasms/therapy , Phenotype , Predictive Value of Tests
ABSTRACT
We propose a sparse Convolutional Autoencoder (CAE) for simultaneous nucleus detection and feature extraction in histopathology tissue images. Our CAE detects nuclei in image patches and encodes them into sparse feature maps that capture both the location and appearance of nuclei. A primary contribution of our work is the development of an unsupervised detection network that exploits the characteristics of histopathology image patches. The pretrained nucleus detection and feature extraction modules in our CAE can be fine-tuned for supervised learning in an end-to-end fashion. We evaluate our method on four datasets and achieve state-of-the-art results. In addition, we achieve comparable performance at only 5% of the fully-supervised annotation cost.
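The paper's sparsity mechanism is learned inside the network; as a crude, hypothetical stand-in for how a sparse feature map can encode detections, here is a top-k sparsification of a 2D activation map in which the surviving peaks act as predicted nucleus locations:

```python
def topk_sparsify(feature_map, k):
    """Zero out all but the k strongest activations in a 2D map, so the
    surviving nonzero peaks mark candidate detections. This is only an
    illustration of sparse maps, not the CAE's learned encoder."""
    flat = sorted(((v, i, j)
                   for i, row in enumerate(feature_map)
                   for j, v in enumerate(row)), reverse=True)
    keep = {(i, j) for _, i, j in flat[:k]}
    return [[v if (i, j) in keep else 0.0
             for j, v in enumerate(row)]
            for i, row in enumerate(feature_map)]

def detections(sparse_map):
    """Interpret nonzero positions as detected nucleus centers."""
    return [(i, j) for i, row in enumerate(sparse_map)
            for j, v in enumerate(row) if v > 0]

fmap = [[0.1, 0.9, 0.0],
        [0.2, 0.1, 0.8],
        [0.0, 0.3, 0.1]]
sparse = topk_sparsify(fmap, k=2)
```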
ABSTRACT
We propose a software platform that integrates methods and tools for multi-objective parameter auto-tuning in tissue image segmentation workflows. The goal of our work is to provide an approach for improving the accuracy of nucleus/cell segmentation pipelines by tuning their input parameters. The shape, size, and texture features of nuclei in tissue are important biomarkers for disease prognosis, and accurate computation of these features depends on accurate delineation of nuclear boundaries. Input parameters in many nucleus segmentation workflows affect segmentation accuracy and have to be tuned for optimal performance. This is a time-consuming and computationally expensive process; automating this step facilitates more robust image segmentation workflows and enables more efficient application of image analysis in large image datasets. Our software platform adjusts the parameters of a nuclear segmentation algorithm to maximize the quality of image segmentation results while minimizing the execution time. It implements several optimization methods to search the parameter space efficiently. In addition, the methodology is developed to execute on high-performance computing systems to reduce the execution time of the parameter tuning phase. These capabilities are packaged in a Docker container for easy deployment and can be used through a user-friendly interface extension in 3D Slicer. Our results using three real-world image segmentation workflows demonstrate that the proposed solution is able to (1) search a small fraction (about 100 points) of the parameter space, which contains billions to trillions of points, and improve the quality of segmentation output by 1.20×, 1.29×, and 1.29× on average; (2) decrease the execution time of a segmentation workflow by up to 11.79× while improving output quality; and (3) effectively use parallel systems to accelerate the parameter tuning and segmentation phases.
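The platform's actual optimizers are not specified in the abstract; as a minimal sketch of the core idea, sampling only ~100 points of a huge parameter space while trading segmentation quality against execution time, here is a hypothetical scalarized random search (the surrogate `evaluate` function and parameter names are invented for illustration):

```python
import random

def auto_tune(evaluate, param_space, budget=100, time_weight=0.25, seed=42):
    """Sample `budget` random points from the parameter space and keep
    the one with the best quality-vs-time trade-off. `evaluate(params)`
    returns (quality, exec_time); higher quality, lower time is better."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(budget):
        params = {name: rng.choice(values) for name, values in param_space.items()}
        quality, exec_time = evaluate(params)
        score = quality - time_weight * exec_time  # scalarized bi-objective
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical surrogate for a nucleus-segmentation workflow: quality
# peaks at threshold=0.5, and larger window sizes cost more time.
def evaluate(p):
    quality = 1.0 - abs(p["threshold"] - 0.5)
    exec_time = p["window"] / 100.0
    return quality, exec_time

space = {"threshold": [i / 10 for i in range(1, 10)],
         "window": [25, 50, 100, 200]}
best, score = auto_tune(evaluate, space)
```

A true multi-objective tuner would keep a Pareto front instead of a single scalarized score, but the sampling structure is the same.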
Subject(s)
Cell Nucleus , Cell Tracking/methods , Computer-Assisted Image Processing/methods , Algorithms , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/pathology , Glioblastoma/diagnostic imaging , Glioblastoma/pathology , Humans , Software , User-Computer Interface , Workflow
ABSTRACT
Motivation: Sensitivity analysis and parameter tuning are important processes in large-scale image analysis. They are very costly because image analysis workflows must be executed several times to systematically correlate output variations with parameter changes or to tune parameters. An integrated solution with minimum user interaction that uses effective methodologies and high performance computing is required to scale these studies to large imaging datasets and expensive analysis workflows. Results: The experiments with two segmentation workflows show that the proposed approach can (i) quickly identify and prune parameters that are non-influential; (ii) search a small fraction (about 100 points) of a parameter search space with billions to trillions of points and improve the quality of segmentation results (Dice and Jaccard metrics) by as much as 1.42× compared to the results from the default parameters; (iii) attain good scalability on a high-performance cluster with several effective optimizations. Conclusions: Our work demonstrates the feasibility of performing sensitivity analyses, parameter studies and auto-tuning with large datasets. The proposed framework can enable the quantification of error estimations and output variations in image segmentation pipelines. Availability and Implementation: Source code: https://github.com/SBU-BMI/region-templates/ . Contact: teodoro@unb.br. Supplementary information: Supplementary data are available at Bioinformatics online.
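Step (i), pruning non-influential parameters, can be sketched with a simple one-at-a-time sensitivity screen; the surrogate workflow and parameter names below are hypothetical, and the real framework uses more sophisticated sensitivity-analysis methods:

```python
def prune_noninfluential(run, defaults, candidates, tol=1e-3):
    """One-at-a-time screen: vary each parameter over its candidate
    values while holding the rest at defaults, and keep only the
    parameters whose output range exceeds `tol`."""
    influential = []
    for name, values in candidates.items():
        outputs = []
        for v in values:
            params = dict(defaults)
            params[name] = v
            outputs.append(run(params))
        if max(outputs) - min(outputs) > tol:
            influential.append(name)
    return influential

# Hypothetical segmentation surrogate: the Dice score depends on the
# threshold but is completely flat in the smoothing parameter.
def run(p):
    return 0.8 - 0.4 * abs(p["threshold"] - 0.5) + 0.0 * p["smoothing"]

defaults = {"threshold": 0.5, "smoothing": 3}
candidates = {"threshold": [0.2, 0.5, 0.8], "smoothing": [1, 3, 5]}
kept = prune_noninfluential(run, defaults, candidates)
```

Pruning `smoothing` here shrinks the space to be searched before the expensive tuning phase begins.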
Subject(s)
Algorithms , Computer-Assisted Image Processing/methods , Brain Neoplasms/pathology , Glioblastoma/pathology , Humans
ABSTRACT
UNLABELLED: GlycoPattern is a Web-based bioinformatics resource that supports the analysis of glycan array data for the Consortium for Functional Glycomics. The resource includes algorithms and tools to discover structural motifs, a heatmap visualization to compare multiple experiments, hierarchical clustering of glycan-binding proteins with respect to their binding motifs, and a structural search feature over the experimental data. AVAILABILITY AND IMPLEMENTATION: GlycoPattern is freely available on the Web at http://glycopattern.emory.edu, with all major browsers supported.
Subject(s)
Glycomics/methods , Microarray Analysis/methods , Polysaccharides/chemistry , Software , Algorithms , Carrier Proteins/chemistry , Carrier Proteins/metabolism , Cluster Analysis , Data Mining , Internet , Polysaccharides/metabolism
ABSTRACT
Deep learning models have shown promise in histopathology image analysis, but their opaque decision-making process poses challenges in high-risk medical scenarios. Here we introduce HIPPO, an explainable AI method that interrogates attention-based multiple instance learning (ABMIL) models in computational pathology by generating counterfactual examples through tissue patch modifications in whole slide images. Applying HIPPO to ABMIL models trained to detect breast cancer metastasis reveals that they may overlook small tumors and can be misled by non-tumor tissue, while attention maps, widely used for interpretation, often highlight regions that do not directly influence predictions. By interpreting ABMIL models trained on a prognostic prediction task, HIPPO identified tissue areas with stronger prognostic effects than high-attention regions, which sometimes showed counterintuitive influences on risk scores. These findings demonstrate HIPPO's capacity for comprehensive model evaluation, bias detection, and quantitative hypothesis testing. HIPPO greatly expands the capabilities of explainable AI tools to support the trustworthy and reliable development, deployment, and regulation of weakly supervised models in computational pathology.
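The counterfactual idea, modify the bag of tissue patches and observe whether the slide-level prediction changes, can be sketched in a few lines; the bag-level "model" below is an invented stand-in for a trained ABMIL model, not HIPPO's implementation:

```python
def patch_necessity(predict, patches, region, threshold=0.5):
    """Counterfactual test in the spirit of HIPPO: remove a region's
    patches from the bag and check whether the slide-level prediction
    flips, i.e. whether the region is *necessary* for the call."""
    original = predict(patches)
    counterfactual = predict([p for p in patches if p not in region])
    return {
        "original": original,
        "counterfactual": counterfactual,
        "flips_prediction": (original >= threshold) != (counterfactual >= threshold),
    }

# Hypothetical bag-level scorer: tumor probability grows with the
# number of tumor patches in the bag (stand-in for an ABMIL model).
def toy_predict(bag):
    tumor = sum(1 for p in bag if p.startswith("tumor"))
    return min(1.0, tumor / 3)

bag = ["tumor_1", "tumor_2", "normal_1", "normal_2"]
result = patch_necessity(toy_predict, bag, region=["tumor_1", "tumor_2"])
```

Unlike attention maps, which only report where the model looked, this intervention measures whether a region actually drives the prediction.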
ABSTRACT
Digital pathology has seen a proliferation of deep learning models in recent years, but many models are not readily reusable. To address this challenge, we developed WSInfer: an open-source software ecosystem designed to streamline the sharing and reuse of deep learning models for digital pathology. The increased access to trained models can augment research on the diagnostic, prognostic, and predictive capabilities of digital pathology.
ABSTRACT
Large-scale, multi-site collaboration is becoming indispensable for a wide range of research and clinical activities in oncology. To facilitate the next generation of advances in cancer biology, precision oncology and the population sciences, it will be necessary to develop and implement data management and analytic tools that empower investigators to reliably and objectively detect, characterize and chronicle the phenotypic and genomic changes that occur during the transformation from the benign to cancerous state and throughout the course of disease progression. To facilitate these efforts it is incumbent upon the informatics community to establish the workflows and architectures that automate the aggregation and organization of a growing range and number of clinical data types and modalities, from new molecular and laboratory tests to sophisticated diagnostic imaging studies. In an attempt to meet those challenges, leading health care centers across the country are making steep investments to establish enterprise-wide data warehouses. A significant limitation of many data warehouses, however, is that they are designed to support only alphanumeric information. In contrast to those traditional designs, the system that we have developed supports automated collection and mining of multimodal data including genomics, digital pathology and radiology images. In this paper, our team describes the design, development and implementation of a multi-modal Clinical & Research Data Warehouse (CRDW) that is tightly integrated with a suite of computational and machine-learning tools to provide actionable insight into the underlying characteristics of the tumor environment that would not be revealed using standard methods and tools.
The system features a flexible Extract, Transform and Load (ETL) interface that enables it to aggregate data originating from different clinical and research sources, adapting to the specific EHR and other data sources in use at a given deployment site.
ABSTRACT
The Cancer Genome Atlas (TCGA) project has generated gene expression data that divides glioblastoma (GBM) into four transcriptional classes: proneural, neural, classical, and mesenchymal. Because transcriptional class is only partially explained by underlying genomic alterations, we hypothesize that the tumor microenvironment may also have an impact. In this study, we focused on necrosis and angiogenesis because their presence is both prognostically and biologically significant. These features were quantified in digitized histological images of TCGA GBM frozen section slides that were immediately adjacent to samples used for molecular analysis. Correlating these features with transcriptional data, we found that the mesenchymal transcriptional class was significantly enriched with GBM samples that contained a high degree of necrosis. Furthermore, among 2422 genes that correlated with the degree of necrosis in GBMs, transcription factors known to drive the mesenchymal expression class were most closely related, including C/EBP-β, C/EBP-δ, STAT3, FOSL2, bHLHE40, and RUNX1. Non-mesenchymal GBMs in the TCGA data set were found to become more transcriptionally similar to the mesenchymal class with increasing levels of necrosis. In addition, high expression levels of the master mesenchymal factors C/EBP-β, C/EBP-δ, and STAT3 were associated with a poor prognosis. Strong, specific expression of C/EBP-β and C/EBP-δ by hypoxic, perinecrotic cells in GBM likely accounts for their tight association with necrosis and may be related to their association with poor prognosis.
Subject(s)
Glioblastoma/genetics , Tumor Microenvironment/genetics , Tumor Biomarkers/metabolism , CCAAT-Enhancer-Binding Protein-beta/metabolism , CCAAT-Enhancer-Binding Protein-delta/metabolism , Cell Hypoxia/physiology , Neoplastic Gene Expression Regulation , Neoplasm-Related Genes , Glioblastoma/blood supply , Glioblastoma/metabolism , Glioblastoma/pathology , Humans , Macrophages/metabolism , Mesenchymal Stem Cells/pathology , Mutation , Necrosis , Neoplasm Proteins/metabolism , Pathologic Neovascularization , Prognosis , STAT3 Transcription Factor/metabolism , Signal Transduction/physiology , Transcription Factors/physiology , Genetic Transcription , Cultured Tumor Cells , Up-Regulation
ABSTRACT
The growing amount of data in operational electronic health record systems provides unprecedented opportunity for its reuse for many tasks, including comparative effectiveness research. However, there are many caveats to the use of such data. Electronic health record data from clinical settings may be inaccurate, incomplete, transformed in ways that undermine their meaning, unrecoverable for research, of unknown provenance, of insufficient granularity, and incompatible with research protocols. Nevertheless, the quantity and real-world nature of these data provide impetus for their use. We therefore develop a list of caveats to inform would-be users of such data, as well as an informatics roadmap that aims to ensure this opportunity to augment comparative effectiveness research can be fully leveraged.
Subject(s)
Comparative Effectiveness Research/organization & administration , Data Collection/methods , Data Collection/standards , Electronic Health Records/organization & administration , Research Design/standards , Comparative Effectiveness Research/standards , Statistical Data Interpretation , Electronic Health Records/standards , Humans , Insurance Utilization Review/organization & administration
ABSTRACT
OBJECTIVE: To create an analytics platform for specifying and detecting clinical phenotypes and other derived variables in electronic health record (EHR) data for quality improvement investigations. MATERIALS AND METHODS: We have developed an architecture for an Analytic Information Warehouse (AIW). It supports transforming data represented in different physical schemas into a common data model, specifying derived variables in terms of the common model to enable their reuse, computing derived variables while enforcing invariants and ensuring correctness and consistency of data transformations, long-term curation of derived data, and export of derived data into standard analysis tools. It includes software that implements these features and a computing environment that enables secure high-performance access to and processing of large datasets extracted from EHRs. RESULTS: We have implemented and deployed the architecture in production locally. The software is available as open source. We have used it as part of hospital operations in a project to reduce rates of hospital readmission within 30 days. The project examined the association of over 100 derived variables representing disease and co-morbidity phenotypes with readmissions in 5 years of data from our institution's clinical data warehouse and the UHC Clinical Database (CDB). The CDB contains administrative data from over 200 hospitals that are in academic medical centers or affiliated with such centers. DISCUSSION AND CONCLUSION: A widely available platform for managing and detecting phenotypes in EHR data could accelerate the use of such data in quality improvement and comparative effectiveness studies.
Subject(s)
Electronic Health Records , Software , Algorithms , Database Management Systems , Patient Readmission
ABSTRACT
Inflammatory bowel disease (IBD) is characterized by chronic, dysregulated inflammation in the gastrointestinal tract. The heterogeneity of IBD is reflected through two major subtypes, Crohn's Disease (CD) and Ulcerative Colitis (UC). CD and UC differ across symptomatic presentation, histology, immune responses, and treatment. While colitis mouse models have been influential in deciphering IBD pathogenesis, no single model captures the full heterogeneity of clinical disease. The translational capacity of mouse models may be augmented by shifting to multi-mouse-model studies that aggregate analysis across various well-controlled phenotypes. Here, we assess the value of histology in multi-mouse-model characterizations by building upon a previous pipeline that detects histological disease classes in hematoxylin and eosin (H&E)-stained murine colons. Specifically, we map immune marker positivity on serially sectioned slides to H&E histological classes in the dextran sodium sulfate (DSS) chemical induction model and the intestinal epithelium-specific, inducible Villin-CreERT2;Klf5fl/fl (Klf5ΔIND) genetic model. In this study, we construct an initial framework for defining H&E-patch-based immunophenotypes from these IHC-H&E mappings.
Subject(s)
Ulcerative Colitis , Colitis , Crohn Disease , Inflammatory Bowel Diseases , Animals , Mice , Colitis/chemically induced , Phenotype , Inflammation , Animal Disease Models
ABSTRACT
BACKGROUND AND OBJECTIVE: Histopathology is the gold standard for diagnosis of many cancers. Recent advances in computer vision, specifically deep learning, have facilitated the analysis of histopathology images for many tasks, including the detection of immune cells and microsatellite instability. However, it remains difficult to identify optimal models and training configurations for different histopathology classification tasks due to the abundance of available architectures and the lack of systematic evaluations. Our objective in this work is to present a software tool that addresses this need and enables robust, systematic evaluation of neural network models for patch classification in histology in a lightweight, easy-to-use package for both algorithm developers and biomedical researchers. METHODS: Here we present ChampKit (Comprehensive Histopathology Assessment of Model Predictions toolKit): an extensible, fully reproducible evaluation toolkit that is a one-stop shop for training and evaluating deep neural networks for patch classification. ChampKit curates a broad range of public datasets. It enables training and evaluation of models supported by timm directly from the command line, without the need for users to write any code. External models are enabled through a straightforward API and minimal coding. As a result, ChampKit facilitates the evaluation of existing and new models and deep learning architectures on pathology datasets, making it more accessible to the broader scientific community. To demonstrate the utility of ChampKit, we establish baseline performance for a subset of possible models that could be employed with ChampKit, focusing on several popular deep learning models, namely ResNet18, ResNet50, and R26-ViT, a hybrid vision transformer. In addition, we compare each model trained either from random weight initialization or with transfer learning from ImageNet-pretrained models.
For ResNet18, we also consider transfer learning from a self-supervised pretrained model. RESULTS: The main result of this paper is the ChampKit software. Using ChampKit, we were able to systematically evaluate multiple neural networks across six datasets. We observed mixed results when evaluating the benefits of pretraining versus random initialization, with no clear benefit except in the low-data regime, where transfer learning was found to be beneficial. Surprisingly, we found that transfer learning from self-supervised weights rarely improved performance, which runs counter to other areas of computer vision. CONCLUSIONS: Choosing the right model for a given digital pathology dataset is nontrivial. ChampKit fills this gap by enabling the evaluation of hundreds of existing (or user-defined) deep learning models across a variety of pathology tasks. Source code and data for the tool are freely accessible at https://github.com/SBU-BMI/champkit.
Subject(s)
Neoplasms , Computer Neural Networks , Humans , Algorithms , Software , Histological Techniques
ABSTRACT
Despite recent methodological advancements in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. Concurrently, these factors also dramatically increase the difficulty of developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we report on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) signs and symptom extraction in an open NLP framework, using data from a subset of sites participating in the National COVID Cohort Collaborative (N3C). We then empirically highlight the benefits of multi-site data for both symbolic and statistical methods, and highlight the need for federated annotation and evaluation to resolve several pitfalls encountered in the course of these efforts.
Subject(s)
COVID-19 , Natural Language Processing , Humans , Electronic Health Records , Algorithms
ABSTRACT
BACKGROUND: AKI is associated with mortality in patients hospitalized with coronavirus disease 2019 (COVID-19); however, its incidence, geographic distribution, and temporal trends since the start of the pandemic are understudied. METHODS: Electronic health record data were obtained from 53 health systems in the United States in the National COVID Cohort Collaborative. We selected hospitalized adults diagnosed with COVID-19 between March 6, 2020, and January 6, 2022. AKI was determined with serum creatinine and diagnosis codes. Time was divided into 16-week periods (P1-6) and geographical regions into Northeast, Midwest, South, and West. Multivariable models were used to analyze the risk factors for AKI or mortality. RESULTS: Of a total cohort of 336,473, 129,176 (38%) patients had AKI. Fifty-six thousand three hundred and twenty-two (17%) lacked a diagnosis code but had AKI based on the change in serum creatinine. Similar to patients coded for AKI, these patients had higher mortality compared with those without AKI. The incidence of AKI was highest in P1 (47%; 23,097/48,947), lower in P2 (37%; 12,102/32,513), and relatively stable thereafter. Compared with the Midwest, the Northeast, South, and West had higher adjusted odds of AKI in P1. Subsequently, the South and West regions continued to have the highest relative AKI odds. In multivariable models, AKI defined by either serum creatinine or diagnosis code, and the severity of AKI, were associated with mortality. CONCLUSIONS: The incidence and distribution of COVID-19-associated AKI changed since the first wave of the pandemic in the United States. PODCAST: This article contains a podcast at https://dts.podtrac.com/redirect.mp3/www.asn-online.org/media/podcast/CJASN/2023_08_08_CJN0000000000000192.mp3.
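The serum-creatinine arm of the AKI definition follows the standard KDIGO criteria; as a simplified sketch of that rule (ignoring the 48-hour/7-day timing windows and the urine-output criterion, which the real definition includes):

```python
def aki_from_scr(baseline_scr, peak_scr):
    """Simplified KDIGO serum-creatinine criterion: flag AKI when peak
    SCr (mg/dL) rises by >= 0.3 over baseline or reaches >= 1.5x
    baseline. Timing windows and urine output are omitted here."""
    return (peak_scr - baseline_scr >= 0.3) or (peak_scr >= 1.5 * baseline_scr)

def aki_stage(baseline_scr, peak_scr):
    """Simplified KDIGO staging by SCr ratio alone: stage 2 at >= 2.0x
    baseline, stage 3 at >= 3.0x baseline or peak SCr >= 4.0 mg/dL."""
    if not aki_from_scr(baseline_scr, peak_scr):
        return 0
    ratio = peak_scr / baseline_scr
    if ratio >= 3.0 or peak_scr >= 4.0:
        return 3
    if ratio >= 2.0:
        return 2
    return 1

flags = [aki_from_scr(1.0, 1.4), aki_from_scr(1.0, 1.2), aki_from_scr(0.6, 1.3)]
```

Applying such a rule to SCr trajectories is what allows uncoded AKI cases, those lacking a diagnosis code, to be detected.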
Subject(s)
Acute Kidney Injury , COVID-19 , Adult , Humans , COVID-19/complications , COVID-19/epidemiology , Retrospective Studies , Creatinine , Risk Factors , Acute Kidney Injury/diagnosis , Hospital Mortality
ABSTRACT
Pathology is a medical subspecialty that practices the diagnosis of disease. Microscopic examination of tissue reveals information enabling the pathologist to render accurate diagnoses and to guide therapy. The basic process by which anatomic pathologists render diagnoses has remained relatively unchanged over the last century, yet advances in information technology now offer significant opportunities in image-based diagnostic and research applications. Pathology has lagged behind other healthcare practices such as radiology where digital adoption is widespread. As devices that generate whole slide images become more practical and affordable, practices will increasingly adopt this technology and eventually produce an explosion of data that will quickly eclipse the already vast quantities of radiology imaging data. These advances are accompanied by significant challenges for data management and storage, but they also introduce new opportunities to improve patient care by streamlining and standardizing diagnostic approaches and uncovering disease mechanisms. Computer-based image analysis is already available in commercial diagnostic systems, but further advances in image analysis algorithms are warranted in order to fully realize the benefits of digital pathology in medical discovery and patient care. In coming decades, pathology image analysis will extend beyond the streamlining of diagnostic workflows and minimizing interobserver variability and will begin to provide diagnostic assistance, identify therapeutic targets, and predict patient outcomes and therapeutic responses.
ABSTRACT
Background: Deep learning methods have demonstrated remarkable performance in pathology image analysis, but they are computationally very demanding. The aim of our study is to reduce their computational cost to enable their use with large tissue image datasets. Methods: We propose a method called Network Auto-Reduction (NAR) that simplifies a Convolutional Neural Network (CNN) to minimize the computational cost of performing a prediction. NAR performs a compound scaling in which the width, depth, and resolution dimensions of the network are reduced together to maintain a balance among them in the resulting simplified network. We compare our method with a state-of-the-art solution called ResRep. The evaluation is carried out with popular CNN architectures and a real-world application that identifies distributions of tumor-infiltrating lymphocytes in tissue images. Results: The experimental results show that both ResRep and NAR are able to generate simplified, more efficient versions of ResNet50 V2. The simplified versions by ResRep and NAR require 1.32× and 3.26× fewer floating-point operations (FLOPs), respectively, than the original network without a loss in classification power as measured by the Area under the Curve (AUC) metric. When applied to a deeper and more computationally expensive network, Inception V4, NAR is able to generate a version that requires 4× fewer FLOPs than the original version with the same AUC performance. Conclusions: NAR achieves substantial reductions in the execution cost of two popular CNN architectures, while resulting in small or no loss in model accuracy. Such cost savings can significantly improve the use of deep learning methods in digital pathology. They can enable studies with larger tissue image datasets and facilitate the use of less expensive and more accessible graphics processing units (GPUs), thus reducing the computing costs of a study.
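The abstract does not give NAR's exact scaling rule; as an illustrative sketch of compound scaling in the shrinking direction (the coefficients below are hypothetical, borrowed from the EfficientNet-style convention of scaling depth, width, and resolution by related factors):

```python
def compound_reduce(base_depth, base_width, base_resolution,
                    alpha=1.2, beta=1.1, gamma=1.15, phi=1.0):
    """Shrink a CNN's depth (layers), width (channels), and input
    resolution together, instead of pruning only one dimension, so the
    reduced network stays balanced. Coefficients are illustrative."""
    depth = max(1, round(base_depth * alpha ** -phi))
    width = max(1, round(base_width * beta ** -phi))
    resolution = max(32, round(base_resolution * gamma ** -phi))
    return depth, width, resolution

def rel_flops(depth, width, resolution, base):
    """Convolutional FLOPs scale roughly as depth * width^2 * resolution^2."""
    d0, w0, r0 = base
    return (depth / d0) * (width / w0) ** 2 * (resolution / r0) ** 2

base = (50, 64, 224)  # e.g. a ResNet50-like network at 224x224 input
reduced = compound_reduce(*base, phi=2.0)
```

Because FLOPs grow quadratically in width and resolution, even a modest joint reduction compounds into a large cost saving.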
ABSTRACT
Inflammatory bowel disease (IBD) is a chronic immune-mediated disease of the gastrointestinal tract. While therapies exist, response can be limited within the patient population. Researchers have thus studied mouse models of colitis to further understand pathogenesis and identify new treatment targets. Flow cytometry and RNA-sequencing can phenotype immune populations with single-cell resolution but provide no spatial context. Spatial context may be particularly important in colitis mouse models, due to the simultaneous presence of colonic regions that are involved or uninvolved with disease. These regions can be identified on hematoxylin and eosin (H&E)-stained colonic tissue slides based on the presence of abnormal or normal histology. However, detection of such regions requires expert interpretation by pathologists. This can be a tedious process that may be difficult to perform consistently across experiments. To this end, we trained a deep learning model to detect 'Involved' and 'Uninvolved' regions from H&E-stained colonic tissue slides. Our model was trained on specimens from controls and three mouse models of colitis: the dextran sodium sulfate (DSS) chemical induction model, the recently established intestinal epithelium-specific, inducible Klf5ΔIND (Villin-CreERT2;Klf5fl/fl) genetic model, and one that combines both induction methods. Image patches predicted to be 'Involved' and 'Uninvolved' were extracted across mice to cluster and identify histological classes. We quantified the proportion of 'Uninvolved' patches and 'Involved' patch classes in murine Swiss-rolled colons. Furthermore, we trained linear discriminant analysis classifiers on these patch proportions to predict mouse model and clinical score bins in a prospectively treated cohort of mice. Such a pipeline has the potential to reveal histological links and improve synergy between various colitis mouse model studies to identify new therapeutic targets and pathophysiological mechanisms.
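The downstream classification step, turning per-patch predictions into per-mouse proportion vectors and classifying on them, can be sketched as follows; the class names and mouse data are invented, and a simple nearest-centroid rule stands in for the paper's linear discriminant analysis:

```python
from collections import Counter

def class_proportions(patch_labels, classes):
    """Turn one mouse's per-patch predictions into a fixed-length
    vector of class proportions (the classifier's feature vector)."""
    counts = Counter(patch_labels)
    total = len(patch_labels)
    return [counts[c] / total for c in classes]

def nearest_centroid(train_vectors, train_labels, query):
    """Minimal linear stand-in for an LDA classifier: assign the query
    to the class whose mean proportion vector is closest."""
    centroids = {}
    for label in set(train_labels):
        rows = [v for v, l in zip(train_vectors, train_labels) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(centroids[label], query))

classes = ["Uninvolved", "Involved_A", "Involved_B"]
# Hypothetical mice: DSS colons dominated by 'Involved' patch classes,
# control colons dominated by 'Uninvolved' patches.
train = [
    (class_proportions(["Involved_A"] * 8 + ["Uninvolved"] * 2, classes), "DSS"),
    (class_proportions(["Involved_A"] * 7 + ["Involved_B"] * 3, classes), "DSS"),
    (class_proportions(["Uninvolved"] * 9 + ["Involved_B"] * 1, classes), "control"),
    (class_proportions(["Uninvolved"] * 10, classes), "control"),
]
vectors, labels = zip(*train)
query = class_proportions(["Involved_A"] * 6 + ["Uninvolved"] * 4, classes)
prediction = nearest_centroid(list(vectors), list(labels), query)
```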
Subject(s)
Colitis , Deep Learning , Animals , Colon/pathology , Dextran Sulfate/toxicity , Animal Disease Models , Humans , Mice , Inbred C57BL Mice
ABSTRACT
Background: Acute kidney injury (AKI) is associated with mortality in patients hospitalized with COVID-19; however, its incidence, geographic distribution, and temporal trends since the start of the pandemic are understudied. Methods: Electronic health record data were obtained from 53 health systems in the United States (US) in the National COVID Cohort Collaborative (N3C). We selected hospitalized adults diagnosed with COVID-19 between March 6, 2020, and January 6, 2022. AKI was determined with serum creatinine (SCr) and diagnosis codes. Time was divided into 16-week periods (P1-6) and geographical regions into Northeast, Midwest, South, and West. Multivariable models were used to analyze the risk factors for AKI or mortality. Results: Out of a total cohort of 306,061, 126,478 (41.0%) patients had AKI. Among these, 17.9% lacked a diagnosis code but had AKI based on the change in SCr. Similar to patients coded for AKI, these patients had higher mortality compared to those without AKI. The incidence of AKI was highest in P1 (49.3%), lower in P2 (40.6%), and relatively stable thereafter. Compared to the Midwest, the Northeast, South, and West had higher adjusted AKI incidence in P1; subsequently, the South and West regions continued to have the highest relative incidence. In multivariable models, AKI defined by either SCr or diagnosis code, and the severity of AKI, were associated with mortality. Conclusions: Uncoded cases of COVID-19-associated AKI are common and associated with mortality. The incidence and distribution of COVID-19-associated AKI have changed since the first wave of the pandemic in the US.