Search | Regional Portal of the BVS Paraguay

1.

Navigating electronic health record accuracy by examination of sex incongruent conditions.

Cai, Ling; DeBerardinis, Ralph J; Zhan, Xiaowei; Xiao, Guanghua; Xie, Yang.

J Am Med Inform Assoc ; 2024 Sep 10.

Article in English | MEDLINE | ID: mdl-39254529

ABSTRACT

OBJECTIVE: The increasing reliance on electronic health records (EHRs) for research and clinical care necessitates robust methods for assessing data quality and identifying inconsistencies. To address this need, we developed and applied the incongruence rate (IR) using sex-specific medical conditions. We also characterized participants with incongruent records to better understand the scope and nature of data discrepancies. MATERIALS AND METHODS: In this cross-sectional study, we used the All of Us Research Program's latest version 7 (v7) EHR data to identify prevalent sex-specific conditions and evaluated the occurrence of incongruent cases, quantified as IR. RESULTS: Among the 92 597 males and 152 551 females with condition occurrence data available from All of Us and sex-conformed gender, we identified 167 prevalent sex-specific conditions. Among the 37 537 biological males and 95 499 biological females with these sex-specific conditions, we detected an overall IR of 0.86%. Attempts to include non-cisgender participants resulted in an inflated overall IR. Additionally, a significant proportion of participants with incongruent conditions also presented with conditions congruent to their biological sex, indicating a mix of accurate and erroneous records. These incongruences were not geographically or temporally isolated, suggesting systematic issues in EHR data integrity. DISCUSSION: Our findings call attention to the existence of systemic data incongruences in sex-specific conditions and the need for robust validation checks. Extending IR evaluation to non-cisgender participants or non-sex-based conditions remains a challenge. CONCLUSION: The sex condition-specific IR, when applied to adult populations, provides a valuable metric for data quality assessment in EHRs.
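
The abstract above does not spell out the exact formula for the IR. As a minimal sketch, assuming the IR is the share of participants with at least one sex-specific condition who have at least one record whose condition sex differs from their recorded sex, it could be computed as follows. The `Record` type and its field names are hypothetical and are not taken from the All of Us data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One sex-specific condition occurrence (hypothetical schema)."""
    person_id: int
    sex: str            # recorded sex of the participant: "M" or "F"
    condition_sex: str  # sex the condition is specific to: "M" or "F"


def incongruence_rate(records):
    """Assumed definition: participants with >= 1 incongruent record,
    divided by participants with >= 1 sex-specific condition record."""
    with_condition = set()
    incongruent = set()
    for r in records:
        with_condition.add(r.person_id)
        if r.sex != r.condition_sex:
            incongruent.add(r.person_id)
    if not with_condition:
        raise ValueError("no participants with sex-specific conditions")
    return len(incongruent) / len(with_condition)


# Toy data: participant 2 has both a congruent and an incongruent record.
records = [
    Record(1, "F", "F"),
    Record(2, "M", "M"),
    Record(2, "M", "F"),
    Record(3, "F", "F"),
    Record(4, "M", "M"),
]
ir = incongruence_rate(records)
print(f"IR = {ir:.2%}")
```

Counting participants rather than records keeps a single participant with many erroneous rows from dominating the rate; a record-level variant would be an equally plausible reading of the abstract.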

2.

Mapping cellular interactions from spatially resolved transcriptomics data.

Zhu, James; Wang, Yunguan; Chang, Woo Yong; Malewska, Alicia; Napolitano, Fabiana; Gahan, Jeffrey C; Unni, Nisha; Zhao, Min; Yuan, Rongqing; Wu, Fangjiang; Yue, Lauren; Guo, Lei; Zhao, Zhuo; Chen, Danny Z; Hannan, Raquibul; Zhang, Siyuan; Xiao, Guanghua; Mu, Ping; Hanker, Ariella B; Strand, Douglas; Arteaga, Carlos L; Desai, Neil; Wang, Xinlei; Xie, Yang; Wang, Tao.

Nat Methods ; 2024 Sep 03.

Article in English | MEDLINE | ID: mdl-39227721

ABSTRACT

Cell-cell communication (CCC) is essential to how life forms and functions. However, accurate, high-throughput mapping of how expression of all genes in one cell affects expression of all genes in another cell has become possible only recently, through the introduction of spatially resolved transcriptomics (SRT) technologies, especially those that achieve single-cell resolution. Nevertheless, substantial challenges remain in analyzing such highly complex data properly. Here, we introduce a multiple-instance learning framework, Spacia, to detect CCCs from data generated by SRTs, by uniquely exploiting their spatial modality. We highlight Spacia's power to overcome fundamental limitations of popular analytical tools for inference of CCCs, including loss of single-cell resolution, restriction to ligand-receptor relationships and prior interaction databases, high false positive rates and, most importantly, the lack of consideration of the multiple-sender-to-one-receiver paradigm. We evaluated the fitness of Spacia for three commercialized single-cell resolution SRT technologies: MERSCOPE/Vizgen, CosMx/NanoString and Xenium/10x. Overall, Spacia represents a notable step in advancing quantitative theories of cellular communications.

3.

STIE: Single-cell level deconvolution, convolution, and clustering in in situ capturing-based spatial transcriptomics.

Zhu, Shijia; Kubota, Naoto; Wang, Shidan; Wang, Tao; Xiao, Guanghua; Hoshida, Yujin.

Nat Commun ; 15(1): 7559, 2024 Aug 30.

Article in English | MEDLINE | ID: mdl-39214995

ABSTRACT

In in situ capturing-based spatial transcriptomics, spots of the same size printed at fixed locations cannot precisely capture randomly located single cells, and therefore inherently fail to profile the transcriptome at the single-cell level. To address this, we present STIE, an Expectation Maximization algorithm that aligns the spatial transcriptome to its matched histology image-based nuclear morphology and recovers missing cells from the ~70% gap area, thereby achieving true single-cell level, whole-slide scale deconvolution, convolution, and clustering for both low- and high-resolution spots. STIE characterizes cell-type-specific gene expression and shows higher concordance with true cell-type-specific transcriptomic signatures than other spot- and subspot-level methods. Furthermore, STIE reveals single-cell level insights, for instance, a lower actual spot resolution than the reported spot size, unbiased evaluation of cell type colocalization, superior power of high-resolution spots in distinguishing nuanced cell types, and spatial cell-cell interactions at the single-cell level rather than the spot level.

Subject(s)

Algorithms; Gene Expression Profiling; Single-Cell Analysis; Transcriptome; Single-Cell Analysis/methods; Gene Expression Profiling/methods; Cluster Analysis; Animals; Humans; Image Processing, Computer-Assisted/methods; Mice

4.

Enhancing Medical Imaging Segmentation with GB-SAM: A Novel Approach to Tissue Segmentation Using Granular Box Prompts.

Villanueva-Miranda, Ismael; Rong, Ruichen; Quan, Peiran; Wen, Zhuoyu; Zhan, Xiaowei; Yang, Donghan M; Chi, Zhikai; Xie, Yang; Xiao, Guanghua.

Cancers (Basel) ; 16(13), 2024 Jun 28.

Article in English | MEDLINE | ID: mdl-39001452

ABSTRACT

Recent advances in foundation models have revolutionized model development in digital pathology, reducing dependence on the extensive manual annotations required by traditional methods. The ability of foundation models to generalize well with few-shot learning addresses critical barriers in adapting models to diverse medical imaging tasks. This work presents the Granular Box Prompt Segment Anything Model (GB-SAM), an improved version of the Segment Anything Model (SAM) fine-tuned using granular box prompts with limited training data. GB-SAM aims to reduce the dependency on expert pathologist annotators by enhancing the efficiency of the automated annotation process. Granular box prompts are small box regions derived from ground truth masks, conceived to replace the conventional approach of using a single large box covering the entire H&E-stained image patch. This method allows a localized and detailed analysis of gland morphology, enhancing the segmentation accuracy of individual glands and reducing the ambiguity that larger boxes might introduce in morphologically complex regions. We compared the performance of our GB-SAM model against U-Net trained on different sizes of the CRAG dataset. We evaluated the models across histopathological datasets, including CRAG, GlaS, and Camelyon16. With reduced training data, GB-SAM consistently outperformed U-Net and showed less degradation in segmentation performance. Specifically, on the CRAG dataset, GB-SAM achieved a Dice coefficient of 0.885 compared to U-Net's 0.857 when trained on 25% of the data. Additionally, GB-SAM demonstrated segmentation stability on the CRAG testing dataset and superior generalization across unseen datasets, including challenging lymph node segmentation in Camelyon16, where it achieved a Dice coefficient of 0.740 versus U-Net's 0.491. Furthermore, compared to SAM-Path and Med-SAM, GB-SAM showed competitive performance. GB-SAM achieved a Dice score of 0.900 on the CRAG dataset, while SAM-Path achieved 0.884. On the GlaS dataset, Med-SAM reported a Dice score of 0.956, whereas GB-SAM achieved 0.885 with significantly less training data. These results highlight GB-SAM's advanced segmentation capabilities and reduced dependency on large datasets, indicating its potential for practical deployment in digital pathology, particularly in settings with limited annotated datasets.
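
The comparisons above are reported as Dice coefficients. For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), applied to two binary masks. It is an illustration only and not the evaluation code used in the study; the function name and the toy masks are made up for this example.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as nested lists of 0/1.

    Returns 1.0 when both masks are empty, a common convention.
    """
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")
    intersection = 0
    pred_area = 0
    truth_area = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            pred_area += 1 if p else 0
            truth_area += 1 if t else 0
            intersection += 1 if (p and t) else 0
    if pred_area + truth_area == 0:
        return 1.0
    return 2.0 * intersection / (pred_area + truth_area)


# Toy 2x3 masks: 3 predicted pixels, 3 true pixels, 2 shared.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
dice = dice_coefficient(pred_mask, true_mask)
print(round(dice, 3))
```

A score of 1.0 means perfect overlap and 0.0 means no overlap, so the gap between 0.740 and 0.491 on Camelyon16 reflects a substantial difference in shared pixels.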

5.

Deep Learning-Based Automated Measurement of Murine Bone Length in Radiographs.

Rong, Ruichen; Denton, Kristin; Jin, Kevin W; Quan, Peiran; Wen, Zhuoyu; Kozlitina, Julia; Lyon, Stephen; Wang, Aileen; Wise, Carol A; Beutler, Bruce; Yang, Donghan M; Li, Qiwei; Rios, Jonathan J; Xiao, Guanghua.

Bioengineering (Basel) ; 11(7), 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-39061752

ABSTRACT

Genetic mouse models of skeletal abnormalities have demonstrated promise in the identification of phenotypes relevant to human skeletal diseases. Traditionally, phenotypes are assessed by manually examining radiographs, a tedious and potentially error-prone process. In response, this study developed a deep learning-based model that streamlines the measurement of murine bone lengths from radiographs in an accurate and reproducible manner. A bone detection and measurement pipeline utilizing the Keypoint R-CNN algorithm with an EfficientNet-B3 feature extraction backbone was developed to detect murine bone positions and measure their lengths. The pipeline was developed utilizing 94 X-ray images with expert annotations of the start and end positions of each murine bone. The accuracy of our pipeline was evaluated on an independent test dataset of 592 images, and further validated on a previously published dataset of 21,300 mouse radiographs. The results showed that our model performed comparably to humans in measuring tibia and femur lengths (R2 > 0.92, p-value = 0) and significantly outperformed humans in measuring pelvic lengths in terms of precision and consistency. Furthermore, the model improved the precision and consistency of genetic association mapping results, identifying significant associations between genetic mutations and skeletal phenotypes with reduced variability. This study demonstrates the feasibility and efficiency of automated murine bone length measurement in the identification of mouse models of abnormal skeletal phenotypes.
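
The abstract describes annotating the start and end position of each bone and measuring its length. One plausible final step, sketched here under assumptions, is the Euclidean distance between the two detected keypoints scaled by the image resolution. The function name, the coordinates, and the `mm_per_pixel` calibration value are hypothetical; the abstract does not state how pixel distances were converted to physical units.

```python
import math


def bone_length_mm(start, end, mm_per_pixel):
    """Length between two (x, y) keypoints in pixels, scaled to millimetres.

    mm_per_pixel is an assumed calibration factor for the radiograph.
    """
    if mm_per_pixel <= 0:
        raise ValueError("mm_per_pixel must be positive")
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return math.hypot(dx, dy) * mm_per_pixel


# Hypothetical keypoints for one bone, as a keypoint detector might return.
femur_start = (120.0, 340.0)
femur_end = (150.0, 380.0)
length = bone_length_mm(femur_start, femur_end, mm_per_pixel=0.1)
print(f"{length:.2f} mm")
```

Because the measurement is a deterministic function of the two keypoints, repeated runs on the same image give identical lengths, which is consistent with the reproducibility the abstract emphasizes.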

6.

A Pan-Cancer Patient-Derived Xenograft Histology Image Repository with Genomic and Pathologic Annotations Enables Deep Learning Analysis.

White, Brian S; Woo, Xing Yi; Koc, Soner; Sheridan, Todd; Neuhauser, Steven B; Wang, Shidan; Evrard, Yvonne A; Chen, Li; Foroughi Pour, Ali; Landua, John D; Mashl, R Jay; Davies, Sherri R; Fang, Bingliang; Raso, Maria Gabriela; Evans, Kurt W; Bailey, Matthew H; Chen, Yeqing; Xiao, Min; Rubinstein, Jill C; Sanderson, Brian J; Lloyd, Michael W; Domanskyi, Sergii; Dobrolecki, Lacey E; Fujita, Maihi; Fujimoto, Junya; Xiao, Guanghua; Fields, Ryan C; Mudd, Jacqueline L; Xu, Xiaowei; Hollingshead, Melinda G; Jiwani, Shahanawaz; Acevedo, Saul; Davis-Dusenbery, Brandi N; Robinson, Peter N; Moscow, Jeffrey A; Doroshow, James H; Mitsiades, Nicholas; Kaochar, Salma; Pan, Chong-Xian; Carvajal-Carmona, Luis G; Welm, Alana L; Welm, Bryan E; Govindan, Ramaswamy; Li, Shunqiang; Davies, Michael A; Roth, Jack A; Meric-Bernstam, Funda; Xie, Yang; Herlyn, Meenhard; Ding, Li.

Cancer Res ; 84(13): 2060-2072, 2024 Jul 02.

Article in English | MEDLINE | ID: mdl-39082680

ABSTRACT

Patient-derived xenografts (PDX) model human intra- and intertumoral heterogeneity in the context of the intact tissue of immunocompromised mice. Histologic imaging via hematoxylin and eosin (H&E) staining is routinely performed on PDX samples, which could be harnessed for computational analysis. Prior studies of large clinical H&E image repositories have shown that deep learning analysis can identify intercellular and morphologic signals correlated with disease phenotype and therapeutic response. In this study, we developed an extensive, pan-cancer repository of >1,000 PDX and paired parental tumor H&E images. These images, curated from the PDX Development and Trial Centers Research Network Consortium, had a range of associated genomic and transcriptomic data, clinical metadata, pathologic assessments of cell composition, and, in several cases, detailed pathologic annotations of neoplastic, stromal, and necrotic regions. The amenability of these images to deep learning was highlighted through three applications: (i) development of a classifier for neoplastic, stromal, and necrotic regions; (ii) development of a predictor of xenograft-transplant lymphoproliferative disorder; and (iii) application of a published predictor of microsatellite instability. Together, this PDX Development and Trial Centers Research Network image repository provides a valuable resource for controlled digital pathology analysis, both for the evaluation of technical issues and for the development of computational image-based methods that make clinical predictions based on PDX treatment studies. Significance: A pan-cancer repository of >1,000 patient-derived xenograft hematoxylin and eosin-stained images will facilitate cancer biology investigations through histopathologic analysis and contributes important model system data that expand existing human histology repositories.

Subject(s)

Deep Learning; Neoplasms; Humans; Animals; Mice; Neoplasms/genetics; Neoplasms/pathology; Neoplasms/diagnostic imaging; Genomics/methods; Heterografts; Xenograft Model Antitumor Assays; Lymphoproliferative Disorders/genetics; Lymphoproliferative Disorders/pathology; Image Processing, Computer-Assisted/methods

7.

iIMPACT: integrating image and molecular profiles for spatial transcriptomics analysis.

Jiang, Xi; Wang, Shidan; Guo, Lei; Zhu, Bencong; Wen, Zhuoyu; Jia, Liwei; Xu, Lin; Xiao, Guanghua; Li, Qiwei.

Genome Biol ; 25(1): 147, 2024 Jun 06.

Article in English | MEDLINE | ID: mdl-38844966

ABSTRACT

Current clustering analysis of spatial transcriptomics data primarily relies on molecular information and fails to fully exploit the morphological features present in histology images, leading to compromised accuracy and interpretability. To overcome these limitations, we have developed a multi-stage statistical method called iIMPACT. It identifies and defines histology-based spatial domains based on AI-reconstructed histology images and spatial context of gene expression measurements, and detects domain-specific differentially expressed genes. Through multiple case studies, we demonstrate iIMPACT outperforms existing methods in accuracy and interpretability and provides insights into the cellular spatial organization and landscape of functional genes within spatial transcriptomics data.

Subject(s)

Gene Expression Profiling; Transcriptome; Gene Expression Profiling/methods; Humans; Cluster Analysis; Image Processing, Computer-Assisted/methods

8.

A critical assessment of using ChatGPT for extracting structured data from clinical notes.

Huang, Jingwei; Yang, Donghan M; Rong, Ruichen; Nezafati, Kuroush; Treager, Colin; Chi, Zhikai; Wang, Shidan; Cheng, Xian; Guo, Yujia; Klesse, Laura J; Xiao, Guanghua; Peterson, Eric D; Zhan, Xiaowei; Xie, Yang.

NPJ Digit Med ; 7(1): 106, 2024 May 01.

Article in English | MEDLINE | ID: mdl-38693429

ABSTRACT

Existing natural language processing (NLP) methods to convert free-text clinical notes into structured data often require problem-specific annotations and model training. This study aims to evaluate ChatGPT's capacity to extract information from free-text medical notes efficiently and comprehensively. We developed a large language model (LLM)-based workflow, utilizing systems engineering methodology and a spiral "prompt engineering" process, and leveraging OpenAI's API for batch querying ChatGPT. We evaluated the effectiveness of this method using a dataset of more than 1000 lung cancer pathology reports and a dataset of 191 pediatric osteosarcoma pathology reports, comparing the ChatGPT-3.5 (gpt-3.5-turbo-16k) outputs with expert-curated structured data. In the lung cancer dataset, ChatGPT-3.5 extracted pathological classifications with an overall accuracy of 89%, outperforming two traditional NLP methods. The performance is influenced by the design of the instructive prompt. Our case analysis shows that most misclassifications were due to the lack of highly specialized pathology terminology and erroneous interpretation of TNM staging rules. Reproducibility testing shows relatively stable performance of ChatGPT-3.5 over time. In the pediatric osteosarcoma dataset, ChatGPT-3.5 classified grades and margin status with accuracies of 98.6% and 100%, respectively. Our study shows the feasibility of using ChatGPT to process large volumes of clinical notes for structured information extraction without requiring extensive task-specific human annotation and model training. The results underscore the potential role of LLMs in transforming unstructured healthcare data into structured formats, thereby supporting research and aiding clinical decision-making.

9.

A Lung Cancer Mouse Model Database.

Cai, Ling; Gao, Ying; DeBerardinis, Ralph J; Acquaah-Mensah, George; Aidinis, Vassilis; Beane, Jennifer E; Biswal, Shyam; Chen, Ting; Concepcion-Crisol, Carla P; Grüner, Barbara M; Jia, Deshui; Jones, Robert; Kurie, Jonathan M; Lee, Min Gyu; Lindahl, Per; Lissanu, Yonathan; Lorz Lopez, Maria Corina; Martinelli, Rosanna; Mazur, Pawel K; Mazzilli, Sarah A; Mii, Shinji; Moll, Herwig; Moorehead, Roger; Morrisey, Edward E; Ng, Sheng Rong; Oser, Matthew G; Pandiri, Arun R; Powell, Charles A; Ramadori, Giorgio; Santos Lafuente, Mirentxu; Snyder, Eric; Sotillo, Rocio; Su, Kang-Yi; Taki, Tetsuro; Taparra, Kekoa; Xia, Yifeng; van Veen, Ed; Winslow, Monte M; Xiao, Guanghua; Rudin, Charles M; Oliver, Trudy G; Xie, Yang; Minna, John D.

bioRxiv ; 2024 May 14.

Article in English | MEDLINE | ID: mdl-38464291

ABSTRACT

Lung cancer, the leading cause of cancer mortality, exhibits diverse histological subtypes and genetic complexities. Numerous preclinical mouse models have been developed to study lung cancer, but data from these models are disparate, siloed, and difficult to compare in a centralized fashion. Here we established the Lung Cancer Mouse Model Database (LCMMDB), an extensive repository of 1,354 samples from 77 transcriptomic datasets covering 974 samples from genetically engineered mouse models (GEMMs), 368 samples from carcinogen-induced models, and 12 samples from a spontaneous model. Meticulous curation and collaboration with data depositors have produced a robust and comprehensive database, enhancing the fidelity of the genetic landscape it depicts. The LCMMDB aligns 859 tumors from GEMMs with human lung cancer mutations, enabling comparative analysis and revealing a pressing need to broaden the diversity of genetic aberrations modeled in GEMMs. Accompanying this resource, we developed a web application that offers researchers intuitive tools for in-depth gene expression analysis. With standardized reprocessing of gene expression data, the LCMMDB serves as a powerful platform for cross-study comparison and lays the groundwork for future research, aiming to bridge the gap between mouse models and human lung cancer for improved translational relevance.

10.

Deep convolutional neural network and IoT technology for healthcare.

Wassan, Sobia; Dongyan, Hu; Suhail, Beenish; Jhanjhi, N Z; Xiao, Guanghua; Ahmed, Suhail; Murugesan, Raja Kumar.

Digit Health ; 10: 20552076231220123, 2024.

Article in English | MEDLINE | ID: mdl-38250147

ABSTRACT

Background: Deep learning is an AI technology that trains computers to analyze data in a manner loosely modeled on the human brain. Deep learning algorithms can find complex patterns in images, text, audio, and other data types to provide accurate predictions and conclusions. Deep learning models are also known as deep neural networks. Their layers are the input layer, the hidden layers, and the output layer: data is taken in by the input layer, processed by the hidden layers, and emitted by the output layer. Deep learning has many advantages over traditional machine learning algorithms such as k-nearest neighbors, support vector machines, and regression approaches. Deep learning models can handle more complex data than traditional machine learning methods.
Objectives: This research aims to find the ideal number of hidden layers for the neural network and the best activation function variations. The article also thoroughly analyzes how various frameworks can be used to create a comparison or fast neural networks. The final goal of the article is to investigate innovative techniques that speed up the training of neural networks without losing accuracy.
Methods: A sample data set from 2001 was obtained from www.Kaggle.com. We can reduce the total number of layers in the deep learning model, which enables us to use our time more efficiently. We use two fully connected layers with ReLU activation. If the input value is larger than zero, the ReLU activation outputs the value directly; otherwise it returns 0.
Results: We use multiple parameters to determine the most effective way to test how well our method works. In the next paragraph, we discuss how the calculation changes secret-shared values. Using 19 training-set features, we train our model to predict the (numerical) healthcare cost target feature. We found that 0.89503 was the best choice because it gave us a good fit (R2) and let us set enough coefficients to 0. To develop our stable model with this set of parameters, we require 26 iterations. On the training set, the model attains an R2 of 0.89503, an MSE of 0.01094, an RMSE of 0.10458, a mean residual deviance of 0.01094, a mean absolute error (MAE) of 0.07452, and a root mean squared log error (RMSLE) of 0.07207. After training the model on the training set, we applied the same parameters to the test set and obtained an R2 of 0.90707, an MSE of 0.01045, an RMSE of 0.10224, a mean residual deviance of 0.01045, an MAE of 0.06954, and an RMSLE of 0.07051, validating our solution approach. The objective value of our secured model is higher than that of the scikit-learn model, although the former performs better on goodness-of-fit criteria. As a result, our protected model performs quite well, marginally outperforming the (highly optimized) scikit-learn model. Using backpropagation and stochastic gradient descent, deep learning trains artificial neural networks with several interconnected layers. The network may contain hidden layers of neurons with tanh, rectifier, and maxout activation functions. Advanced features such as momentum training, dropout, adaptive learning rate, rate annealing, and L1 or L2 regularization provide exceptional prediction performance. Each node trains the global model's parameters on its own data with multiple threads (asynchronously), and the global model is then gradually updated by model averaging across the entire network. The method is executed on a single-node H2O cluster started by the operator. Execution is parallel even though only a single node is involved. The number of threads can be adjusted in the settings menu under Preferences and General; by default, the optimal number of threads for the system is used. Successful predictions on the healthcare data sets are made using the H2O Deep Learning operator. Because the label is binomial, a classification is performed. The Splitting Validation operator creates test and training datasets to evaluate the model. The default settings of the Deep Learning operator are used; in other words, we construct two hidden layers, each containing 50 neurons. The Accuracy measure is computed by linking the annotated Sample Set with a Performer (Binominal Classification) operator. Table 3 displays the Deep Learning Model, the labeled data, and the Performance Vector that resulted from the technique.
Conclusions: Deep learning algorithms can be used to design systems that report data on patients and deliver warnings to medical applications or electronic health information if there are changes in the patient's health. This helps ensure that each patient receives proper, effective care at the proper time. A healthcare decision support system was presented using the Internet of Things (IoT) and deep learning methods. In the proposed system, we examined the capability of integrating deep learning technology into automatic diagnosis and IoT capabilities for faster message exchange over the Internet. We selected a suitable neural network (NN) structure (number of hidden layers and activation function classes) to construct the e-health system. In addition, the e-health system relied on data from doctors to train the NN. In validation, the overall evaluation of the proposed healthcare diagnostic system shows dependability under various patient conditions. Based on evaluation and simulation findings, a feed-forward NN with two hidden layers whose neurons use the tanh function performs more effectively than other NNs. To overcome challenges, this study integrates artificial intelligence with IoT. This study aims to determine the NN's optimal layer counts and activation function variations.
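
The Methods section above mentions two fully connected layers with ReLU activation. As a minimal illustration, ReLU passes positive inputs through unchanged and maps everything else to zero, i.e. relu(x) = max(0, x). The sketch below applies it in a tiny two-layer forward pass; the weights, biases, and inputs are made-up values for illustration and are unrelated to the study's trained model or to the H2O implementation.

```python
def relu(x):
    """Rectified linear unit: x for positive inputs, 0 otherwise."""
    return x if x > 0 else 0.0


def dense_relu(inputs, weights, biases):
    """One fully connected layer followed by ReLU.

    weights holds one row of coefficients per output neuron.
    """
    return [
        relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical parameters for a 2 -> 2 -> 1 network.
features = [1.0, 2.0]
hidden = dense_relu(features, [[0.5, 0.25], [-1.0, 0.25]], [0.0, 0.0])
output = dense_relu(hidden, [[2.0, 1.0]], [-0.5])
print(hidden, output)
```

In this toy pass the second hidden neuron receives a negative pre-activation and is clipped to zero, which is the behaviour that distinguishes ReLU from a purely linear layer.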

11.

MetaNorm: incorporating meta-analytic priors into normalization of NanoString nCounter data.

Barth, Jackson; Yang, Yuqiu; Xiao, Guanghua; Wang, Xinlei.

Bioinformatics ; 40(1), 2024 Jan 02.

Article in English | MEDLINE | ID: mdl-38237909

ABSTRACT

MOTIVATION: Non-informative or diffuse prior distributions are widely employed in Bayesian data analysis to maintain objectivity. However, when meaningful prior information exists and can be identified, using an informative prior distribution to accurately reflect current knowledge may lead to superior outcomes and greater efficiency. RESULTS: We propose MetaNorm, a Bayesian algorithm for normalizing NanoString nCounter gene expression data. MetaNorm is based on RCRnorm, a powerful method designed under an integrated series of hierarchical models that allow various sources of error to be explained by different types of probes in the nCounter system. However, a lack of accurate prior information, weak computational efficiency, and occasional instability of estimates weaken the approach despite its impressive performance. MetaNorm employs priors carefully constructed from a rigorous meta-analysis to leverage information from large public data. Combined with additional algorithmic enhancements, MetaNorm improves on RCRnorm by yielding more stable estimation of normalized values, better convergence diagnostics and superior computational efficiency. AVAILABILITY AND IMPLEMENTATION: R code for replicating the meta-analysis and the normalization function can be found at github.com/jbarth216/MetaNorm.

Subject(s)

Algorithms; Data Analysis; Bayes Theorem

12.

Mapping Cellular Interactions from Spatially Resolved Transcriptomics Data.

Zhu, James; Wang, Yunguan; Chang, Woo Yong; Malewska, Alicia; Napolitano, Fabiana; Gahan, Jeffrey C; Unni, Nisha; Zhao, Min; Yuan, Rongqing; Wu, Fangjiang; Yue, Lauren; Guo, Lei; Zhao, Zhuo; Chen, Danny Z; Hannan, Raquibul; Zhang, Siyuan; Xiao, Guanghua; Mu, Ping; Hanker, Ariella B; Strand, Douglas; Arteaga, Carlos L; Desai, Neil; Wang, Xinlei; Xie, Yang; Wang, Tao.

bioRxiv ; 2024 Jan 25.

Article in English | MEDLINE | ID: mdl-37781617

ABSTRACT

Cell-cell communication (CCC) is essential to how life forms and functions. However, accurate, high-throughput mapping of how expression of all genes in one cell affects expression of all genes in another cell has become possible only recently, through the introduction of spatially resolved transcriptomics technologies (SRTs), especially those that achieve single-cell resolution. Nevertheless, significant challenges remain in analyzing such highly complex data properly. Here, we introduce a Bayesian multi-instance learning framework, spacia, to detect CCCs from data generated by SRTs, by uniquely exploiting their spatial modality. We highlight spacia's power to overcome fundamental limitations of popular analytical tools for inference of CCCs, including loss of single-cell resolution, restriction to ligand-receptor relationships and prior interaction databases, high false positive rates, and most importantly the lack of consideration of the multiple-sender-to-one-receiver paradigm. We evaluated the fitness of spacia for all three commercialized single-cell resolution SRT technologies: MERSCOPE/Vizgen, CosMx/NanoString, and Xenium/10X. Spacia unveiled how endothelial cells, fibroblasts and B cells in the tumor microenvironment contribute to Epithelial-Mesenchymal Transition and lineage plasticity in prostate cancer cells. We deployed spacia in a set of pan-cancer datasets and showed that B cells also participate in PDL1/PD1 signaling in tumors. We demonstrated that a CD8+ T cell/PDL1 effectiveness signature derived from spacia analyses is associated with patient survival and response to immune checkpoint inhibitor treatments in 3,354 patients. We revealed differential spatial interaction patterns between γδ T cells and liver hepatocytes in healthy and cancerous contexts. Overall, spacia represents a notable step in advancing quantitative theories of cellular communications.

13.

Deep Learning-Based H-Score Quantification of Immunohistochemistry-Stained Images.

Wen, Zhuoyu; Luo, Danni; Wang, Shidan; Rong, Ruichen; Evers, Bret M; Jia, Liwei; Fang, Yisheng; Daoud, Elena V; Yang, Shengjie; Gu, Zifan; Arner, Emily N; Lewis, Cheryl M; Solis Soto, Luisa M; Fujimoto, Junya; Behrens, Carmen; Wistuba, Ignacio I; Yang, Donghan M; Brekken, Rolf A; O'Donnell, Kathryn A; Xie, Yang; Xiao, Guanghua.

Mod Pathol ; 37(2): 100398, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38043788

ABSTRACT

Immunohistochemistry (IHC) is a well-established and commonly used staining method for clinical diagnosis and biomedical research. In most IHC images, the target protein is conjugated with a specific antibody and stained using diaminobenzidine (DAB), resulting in a brown coloration, whereas hematoxylin serves as a blue counterstain for cell nuclei. The protein expression level is quantified through the H-score, calculated from DAB staining intensity within the target cell region. Traditionally, this process requires evaluation by 2 expert pathologists, which is both time consuming and subjective. To enhance the efficiency and accuracy of this process, we have developed an automatic algorithm for quantifying the H-score of IHC images. To characterize protein expression in specific cell regions, a deep learning model for region recognition was trained based on hematoxylin staining only, achieving pixel accuracy for each class ranging from 0.92 to 0.99. Within the desired area, the algorithm categorizes DAB intensity of each pixel as negative, weak, moderate, or strong staining and calculates the final H-score based on the percentage of each intensity category. Overall, this algorithm takes an IHC image as input and directly outputs the H-score within a few seconds, significantly enhancing the speed of IHC image analysis. This automated tool provides H-score quantification with precision and consistency comparable to experienced pathologists but at a significantly reduced cost during IHC diagnostic workups. It holds significant potential to advance biomedical research reliant on IHC staining for protein expression quantification.
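
The abstract says the final H-score is calculated from the percentage of pixels in each intensity category. A minimal sketch of that last step follows, assuming the conventional H-score weighting, H = 1 × (% weak) + 2 × (% moderate) + 3 × (% strong), which ranges from 0 to 300. The function name and the pixel counts are hypothetical, and the abstract does not state the intensity thresholds used to assign pixels to categories, so the sketch starts from counts that are already categorized.

```python
# Conventional H-score weights per staining intensity category.
H_SCORE_WEIGHTS = {"negative": 0, "weak": 1, "moderate": 2, "strong": 3}


def h_score(pixel_counts):
    """H-score (0 to 300) from per-category pixel counts in the target region.

    pixel_counts maps each of "negative", "weak", "moderate", "strong"
    to the number of pixels assigned to that category.
    """
    total = sum(pixel_counts[name] for name in H_SCORE_WEIGHTS)
    if total == 0:
        raise ValueError("target region contains no pixels")
    return sum(
        weight * 100.0 * pixel_counts[name] / total
        for name, weight in H_SCORE_WEIGHTS.items()
    )


# Hypothetical counts: 50% negative, 30% weak, 15% moderate, 5% strong.
counts = {"negative": 500, "weak": 300, "moderate": 150, "strong": 50}
score = h_score(counts)
print(score)
```

Restricting the counts to the region recognized from hematoxylin staining, as the abstract describes, keeps background and non-target cells from diluting the percentages.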

Subject(s)

Deep Learning; Humans; Immunohistochemistry; Hematoxylin/metabolism; Algorithms; Cell Nucleus/metabolism

14.

Deep learning of cell spatial organizations identifies clinically relevant insights in tissue images.

Wang, Shidan; Rong, Ruichen; Zhou, Qin; Yang, Donghan M; Zhang, Xinyi; Zhan, Xiaowei; Bishop, Justin; Chi, Zhikai; Wilhelm, Clare J; Zhang, Siyuan; Pickering, Curtis R; Kris, Mark G; Minna, John; Xie, Yang; Xiao, Guanghua.

Nat Commun ; 14(1): 7872, 2023 Dec 11.

Article in English | MEDLINE | ID: mdl-38081823

ABSTRACT

Recent advancements in tissue imaging techniques have facilitated the visualization and identification of various cell types within physiological and pathological contexts. Despite the emergence of cell-cell interaction studies, there is a lack of methods for evaluating individual spatial interactions. In this study, we introduce Ceograph, a cell spatial organization-based graph convolutional network designed to analyze cell spatial organization (for example, the cell spatial distribution, morphology, proximity, and interactions) derived from pathology images. Ceograph identifies key cell spatial organization features by accurately predicting their influence on patient clinical outcomes. In patients with oral potentially malignant disorders, our model highlights reduced structural concordance and increased closeness in epithelial substrata as driving features for an elevated risk of malignant transformation. In lung cancer patients, Ceograph detects elongated tumor nuclei and diminished stroma-stroma closeness as biomarkers for insensitivity to EGFR tyrosine kinase inhibitors. With its potential to predict various clinical outcomes, Ceograph offers a deeper understanding of biological processes and supports the development of personalized therapeutic strategies.

Subject(s)

Deep Learning , Lung Neoplasms , Humans , Cell Communication , Cell Nucleus , Lung Neoplasms/diagnostic imaging

15.

Reconstructing Spatial Transcriptomics at the Single-cell Resolution with BayesDeep.

Jiang, Xi; Dong, Lei; Wang, Shidan; Wen, Zhuoyu; Chen, Mingyi; Xu, Lin; Xiao, Guanghua; Li, Qiwei.

bioRxiv ; 2023 Dec 08.

Article in English | MEDLINE | ID: mdl-38106214

ABSTRACT

Spatially resolved transcriptomics (SRT) techniques have revolutionized the characterization of molecular profiles while preserving spatial and morphological context. However, most next-generation sequencing-based SRT techniques are limited to measuring gene expression in a confined array of spots, capturing only a fraction of the spatial domain. Typically, these spots encompass gene expression from a few to hundreds of cells, underscoring a critical need for more detailed, single-cell resolution SRT data to enhance our understanding of biological functions within the tissue context. Addressing this challenge, we introduce BayesDeep, a novel Bayesian hierarchical model that leverages cellular morphological data from histology images, commonly paired with SRT data, to reconstruct SRT data at the single-cell resolution. BayesDeep effectively models count data from SRT studies via a negative binomial regression model. This model incorporates explanatory variables such as cell types and nuclei-shape information for each cell extracted from the paired histology image. A feature selection scheme is integrated to examine the association between the morphological and molecular profiles, thereby improving the model robustness. We applied BayesDeep to two real SRT datasets, successfully demonstrating its capability to reconstruct SRT data at the single-cell resolution. This advancement not only yields new biological insights but also significantly enhances various downstream analyses, such as pseudotime and cell-cell communication.
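For readers unfamiliar with negative binomial regression on count data, the core likelihood can be sketched as below. This is a generic illustration under stated assumptions, not the BayesDeep model itself: the Bayesian hierarchy, priors, spot-to-cell aggregation, and feature selection are all omitted, and the function names, the mean/dispersion parameterization (variance = mu + mu²/phi), the log link, and the toy covariates are choices made here for illustration.

```python
import math


def nb_log_pmf(y, mu, phi):
    """Log-probability of count y under a negative binomial with
    mean mu and dispersion phi (variance = mu + mu**2 / phi)."""
    return (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (phi + mu))
        + y * math.log(mu / (phi + mu))
    )


def nb_regression_loglik(counts, covariates, beta, phi):
    """Sum the log-likelihood over cells, with a log link: log(mu_i) = x_i . beta."""
    total = 0.0
    for y, x in zip(counts, covariates):
        mu = math.exp(sum(b * v for b, v in zip(beta, x)))
        total += nb_log_pmf(y, mu, phi)
    return total


# Toy data (made up): intercept, a cell-type indicator, and a nucleus-shape feature.
covariates = [(1.0, 0.0, 0.3), (1.0, 1.0, 0.8), (1.0, 1.0, 0.5)]
counts = [2, 7, 4]
print(nb_regression_loglik(counts, covariates, beta=(0.5, 0.9, 0.4), phi=2.0))
```

The dispersion parameter `phi` lets the variance exceed the mean, which is the usual reason for preferring a negative binomial over a Poisson model for sequencing counts.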

16.

Osteosarcoma Explorer: A Data Commons With Clinical, Genomic, Protein, and Tissue Imaging Data for Osteosarcoma Research.

Yang, Donghan M; Zhou, Qinbo; Furman-Cline, Lauren; Cheng, Xian; Luo, Danni; Lai, Hongyin; Li, Yueqi; Jin, Kevin W; Yao, Bo; Leavey, Patrick J; Rakheja, Dinesh; Lo, Tammy; Hall, David; Barkauskas, Donald A; Shulman, David S; Janeway, Katherine; Khanna, Chand; Gorlick, Richard; Menzies, Christopher; Zhan, Xiaowei; Xiao, Guanghua; Skapek, Stephen X; Xu, Lin; Klesse, Laura J; Crompton, Brian D; Xie, Yang.

JCO Clin Cancer Inform ; 7: e2300104, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37956387

ABSTRACT

PURPOSE: Osteosarcoma research advancement requires enhanced data integration across different modalities and sources. Current osteosarcoma research, encompassing clinical, genomic, protein, and tissue imaging data, is hindered by the siloed landscape of data generation and storage. MATERIALS AND METHODS: Clinical, molecular profiling, and tissue imaging data for 573 patients with pediatric osteosarcoma were collected from four public and institutional sources. A common data model incorporating standardized terminology was created to facilitate the transformation, integration, and loading of source data into a relational database. On the basis of this database, a data commons accompanied by a user-friendly web portal was developed, enabling various data exploration and analytics functions. RESULTS: The Osteosarcoma Explorer (OSE) was released to the public in 2021. Leveraging a comprehensive and harmonized data set on the backend, the OSE offers a wide range of functions, including Cohort Discovery, Patient Dashboard, Image Visualization, and Online Analysis. Since its initial release, the OSE has seen increasing use by the osteosarcoma research community and has provided solid, continuous user support. To our knowledge, the OSE is the largest (N = 573) and most comprehensive research data commons for pediatric osteosarcoma, a rare disease. This project demonstrates an effective framework for data integration and data commons development that can be readily applied to other projects sharing similar goals. CONCLUSION: The OSE offers an online exploration and analysis platform for integrated clinical, molecular profiling, and tissue imaging data of osteosarcoma. Its underlying data model, database, and web framework support continuous expansion onto new data modalities and sources.

Subject(s)

Data Management , Osteosarcoma , Child , Humans , Databases, Factual , Genomics , Osteosarcoma/diagnostic imaging , Osteosarcoma/genetics

17.

Artificial intelligence in mental healthcare: an overview and future perspectives.

Jin, Kevin W; Li, Qiwei; Xie, Yang; Xiao, Guanghua.

Br J Radiol ; 96(1150): 20230213, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37698582

ABSTRACT

Artificial intelligence is disrupting the field of mental healthcare through applications in computational psychiatry, which leverages quantitative techniques to inform our understanding, detection, and treatment of mental illnesses. This paper provides an overview of artificial intelligence technologies in modern mental healthcare and surveys recent advances made by researchers, focusing on the nascent field of digital psychiatry. We also consider the ethical implications of artificial intelligence playing a greater role in mental healthcare.

Subject(s)

Mental Disorders , Mental Health Services , Psychiatry , Humans , Artificial Intelligence , Delivery of Health Care/methods , Mental Disorders/diagnosis , Mental Disorders/therapy

18.

ScopeViewer: A Browser-Based Solution for Visualizing Spatial Transcriptomics Data.

Luo, Danni; Robertson, Sophie; Zhan, Yuanchun; Rong, Ruichen; Wang, Shidan; Jiang, Xi; Yang, Sen; Palmer, Suzette; Jia, Liwei; Li, Qiwei; Xiao, Guanghua; Zhan, Xiaowei.

bioRxiv ; 2023 Jul 25.

Article in English | MEDLINE | ID: mdl-37546786

ABSTRACT

Motivation: Spatial transcriptomics (ST) enables a high-resolution interrogation of molecular characteristics within specific spatial contexts and tissue morphology. Despite its potential, visualization of ST data is a challenging task due to the complexities in handling, sharing and visualizing large image datasets together with molecular information. Results: We introduce ScopeViewer, a browser-based software designed to overcome these challenges. ScopeViewer offers the following functionalities: (1) It visualizes large image data and associated annotations at various zoom levels, allowing for intricate exploration of the data; (2) It enables dual interactive viewing of the original images along with their annotations, providing a comprehensive understanding of the context; (3) It displays spatial molecular features with optimized bandwidth, ensuring a smooth user experience; and (4) It bolsters data security by circumventing data transfers. Availability: ScopeViewer is available at: https://datacommons.swmed.edu/scopeviewer.

19.

A Deep Learning Onion Peeling Approach to Measure Oral Epithelium Layer Number.

Zhang, Xinyi; Gleber-Netto, Frederico O; Wang, Shidan; Jin, Kevin W; Yang, Donghan M; Gillenwater, Ann M; Myers, Jeffrey N; Ferrarotto, Renata; Pickering, Curtis R; Xiao, Guanghua.

Cancers (Basel) ; 15(15), 2023 Jul 31.

Article in English | MEDLINE | ID: mdl-37568707

ABSTRACT

Head and neck squamous cell carcinoma (HNSCC), specifically in the oral cavity (oral squamous cell carcinoma, OSCC), is a common, complex cancer that significantly affects patients' quality of life. Early diagnosis typically improves prognoses yet relies on pathologist examination of histology images that exhibit high inter- and intra-observer variation. The advent of deep learning has automated this analysis, notably with object segmentation. However, techniques for automated oral dysplasia diagnosis have been limited to shape or cell stain information, without addressing the diagnostic potential in counting the number of cell layers in the oral epithelium. Our study attempts to address this gap by combining the existing U-Net and HD-Staining architectures for segmenting the oral epithelium and introducing a novel algorithm that we call Onion Peeling for counting the epithelium layer number. Experimental results show a close correlation between our algorithmic and expert manual layer counts, demonstrating the feasibility of automated layer counting. We also show the clinical relevance of oral epithelial layer number to grading oral dysplasia severity through survival analysis. Overall, our study shows that automated counting of oral epithelium layers can represent a potential addition to the digital pathology toolbox. Model generalizability and accuracy could be improved further with a larger training dataset.

20.

Unsupervised domain adaptation for nuclei segmentation: Adapting from hematoxylin & eosin stained slides to immunohistochemistry stained slides using a curriculum approach.

Wang, Shidan; Rong, Ruichen; Gu, Zifan; Fujimoto, Junya; Zhan, Xiaowei; Xie, Yang; Xiao, Guanghua.

Comput Methods Programs Biomed ; 241: 107768, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37619429

ABSTRACT

BACKGROUND AND OBJECTIVE: Unsupervised domain adaptation (UDA) is a powerful approach in tackling domain discrepancies and reducing the burden of laborious and error-prone pixel-level annotations for instance segmentation. However, the domain adaptation strategies utilized in previous instance segmentation models pool all the labeled/detected instances together to train the instance-level GAN discriminator, which neglects the differences among multiple instance categories. Such pooling prevents UDA instance segmentation models from learning categorical correspondence between source and target domains for accurate instance classification. METHODS: To tackle this challenge, we propose an Instance Segmentation CycleGAN (ISC-GAN) algorithm for UDA multiclass-instance segmentation. We conduct extensive experiments on the multiclass nuclei recognition task to transfer knowledge from hematoxylin and eosin to immunohistochemistry stained pathology images. Specifically, we fuse CycleGAN with Mask R-CNN to learn categorical correspondence with image-level domain adaptation and virtual supervision. Moreover, we utilize Curriculum Learning to separate the learning process into two steps: (1) learning segmentation only on labeled source data, and (2) learning target domain segmentation with paired virtual labels generated by ISC-GAN. The performance was further improved through experiments with other strategies, including Shared Weights, Knowledge Distillation, and Expanded Source Data. RESULTS: Compared with the baseline model and the three UDA instance detection and segmentation models, ISC-GAN achieves state-of-the-art performance, with 39.1% average precision and 48.7% average recall. The source code of ISC-GAN is available at https://github.com/sdw95927/InstanceSegmentation-CycleGAN. CONCLUSION: ISC-GAN adapted knowledge from hematoxylin and eosin to immunohistochemistry stained pathology images, suggesting the potential for reducing the need for large annotated pathological image datasets in deep learning and computer vision tasks.

Subject(s)

Algorithms , Curriculum , Eosine Yellowish-(YS) , Hematoxylin , Immunohistochemistry
