Results 1 - 20 of 39
1.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38980369

ABSTRACT

Recent studies have extensively used deep learning algorithms to analyze gene expression to predict disease diagnosis, treatment effectiveness, and survival outcomes. Survival analysis studies on diseases with high mortality rates, such as cancer, are indispensable. However, deep learning models are plagued by overfitting owing to the limited sample size relative to the large number of genes. Consequently, the latest style-transfer deep generative models have been implemented to generate gene expression data. However, these models are limited in their applicability for clinical purposes because they generate only transcriptomic data. Therefore, this study proposes ctGAN, which enables the combined transformation of gene expression and survival data using a generative adversarial network (GAN). ctGAN improves survival analysis by augmenting data through style transformations between breast cancer and 11 other cancer types. We evaluated the concordance index (C-index) enhancements compared with previous models to demonstrate its superiority. Performance improvements were observed in nine of the 11 cancer types. Moreover, ctGAN outperformed previous models in seven out of the 11 cancer types, with colon adenocarcinoma (COAD) exhibiting the most significant improvement (median C-index increase of ~15.70%). Furthermore, integrating the generated COAD enhanced the log-rank p-value (0.041) compared with using only the real COAD (p-value = 0.797). Based on the data distribution, we demonstrated that the model generated highly plausible data. In clustering evaluation, ctGAN exhibited the highest performance in most cases (89.62%). These findings suggest that ctGAN can be meaningfully utilized to predict disease progression and select personalized treatments in the medical field.


Subject(s)
Deep Learning , Humans , Survival Analysis , Algorithms , Neoplasms/genetics , Neoplasms/mortality , Gene Expression Profiling/methods , Neural Networks, Computer , Computational Biology/methods , Breast Neoplasms/genetics , Breast Neoplasms/mortality , Female , Gene Expression Regulation, Neoplastic
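
The abstract of entry 1 reports its gains as changes in the concordance index (C-index). As a rough illustration of what that metric measures, the sketch below computes Harrell's C-index for right-censored survival data in plain Python. The function name, the argument layout, and the choice to skip pairs with tied times are assumptions made for this example; the abstract does not say how ctGAN's evaluation was implemented.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times:       observed follow-up time per subject
    events:      1 if the event was observed, 0 if the subject was censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the subject with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier
        # time is an observed event (simplification: tied times are skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # the model ranked the pair correctly
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the median increase quoted in the abstract describes a shift on this scale.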
2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38960404

ABSTRACT

Recent advances in microfluidics and sequencing technologies allow researchers to explore cellular heterogeneity at single-cell resolution. In recent years, deep learning frameworks, such as generative models, have brought great changes to the analysis of transcriptomic data. Nevertheless, relying on the potential space of these generative models alone is insufficient to generate biological explanations. In addition, most of the previous work based on generative models is limited to shallow neural networks with one to three layers of latent variables, which may limit the capabilities of the models. Here, we propose a deep interpretable generative model called d-scIGM for single-cell data analysis. d-scIGM combines sawtooth connectivity techniques and residual networks, thereby constructing a deep generative framework. In addition, d-scIGM incorporates hierarchical prior knowledge of biological domains to enhance the interpretability of the model. We show that d-scIGM achieves excellent performance in a variety of fundamental tasks, including clustering, visualization, and pseudo-temporal inference. Through topic pathway studies, we found that d-scIGM-learned topics are better enriched for biologically meaningful pathways compared to the baseline models. Furthermore, the analysis of drug response data shows that d-scIGM can capture drug response patterns in large-scale experiments, which provides a promising way to elucidate the underlying biological mechanisms. Lastly, in the melanoma dataset, d-scIGM accurately identified different cell types and revealed multiple melanin-related driver genes and key pathways, which are critical for understanding disease mechanisms and drug development.


Subject(s)
Deep Learning , RNA-Seq , Single-Cell Gene Expression Analysis , Humans , Algorithms , Computational Biology/methods , Neural Networks, Computer , RNA-Seq/methods , Single-Cell Gene Expression Analysis/methods
3.
Brief Bioinform ; 24(6)2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37756591

ABSTRACT

In drug discovery, a key problem is how to improve biological activity and ADMET properties starting from a specific structure, a process known as structural optimization. Using a deep generative model to generate molecules with desired drug-like properties from a starting scaffold would provide a powerful tool to accelerate structural optimization. However, existing generative models struggle to extract molecular features efficiently in 3D space in order to generate drug-like 3D molecules. Moreover, most existing ADMET prediction models predict different properties with a single model, which can reduce prediction accuracy on some datasets. To generate molecules effectively from a specific scaffold and provide a basis for structural optimization, we present 3D-SMGE (3-Dimensional Scaffold-based Molecular Generation and Evaluation), which consists of molecular generation and ADMET property prediction. For molecular generation, we propose 3D-SMG, a novel deep generative model for the end-to-end design of 3D molecules. In the 3D-SMG model, we designed the cross-aggregated continuous-filter convolution (ca-cfconv), which achieves efficient, low-cost 3D spatial feature extraction while ensuring rotational invariance in atomic space. 3D-SMG was shown to generate valid, unique and novel molecules with high drug-likeness. In addition, the proposed data-adaptive multi-model ADMET prediction method outperformed or matched the best evaluation metrics on 24 out of 27 ADMET benchmark datasets. 3D-SMGE is anticipated to emerge as a powerful tool for hit-to-lead structural optimization and to accelerate the drug discovery process.

4.
Nano Lett ; 24(15): 4447-4453, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38588344

ABSTRACT

Modern microscopy techniques can be used to investigate soft nano-objects at the nanometer scale. However, time-consuming microscopy measurements combined with low numbers of observable polydisperse objects often limit the statistics. We propose a method for identifying the most representative objects from their respective point clouds. These point cloud data are obtained, for example, through the localization of single emitters in super-resolution fluorescence microscopy. External stimuli, such as temperature, can cause changes in the shape and properties of adaptive objects. Due to the demanding and time-consuming nature of super-resolution microscopy experiments, only a limited number of temperature steps can be performed. Therefore, we propose a deep generative model that learns the underlying point distribution of temperature-dependent microgels, enabling the reliable generation of unlimited samples with an arbitrary number of localizations. Our method greatly cuts down the data collection effort across diverse experimental conditions, proving invaluable for soft condensed matter studies.

5.
Small ; : e2402685, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38770745

ABSTRACT

Designing novel materials is greatly dependent on understanding the design principles, physical mechanisms, and modeling methods of material microstructures, requiring experienced designers with expertise and several rounds of trial and error. Although recent advances in deep generative networks have enabled the inverse design of material microstructures, most studies involve property-conditional generation and focus on a specific type of structure, resulting in limited generation diversity and poor human-computer interaction. In this study, a pioneering text-to-microstructure deep generative network (Txt2Microstruct-Net) is proposed that enables the generation of 3D material microstructures directly from text prompts without additional optimization procedures. The Txt2Microstruct-Net model is trained on a large microstructure-caption paired dataset that is extensible using the algorithms provided. Moreover, the model is sufficiently flexible to generate different geometric representations, such as voxels and point clouds. The model's performance is also demonstrated in the inverse design of material microstructures and metamaterials. It has promising potential for interactive microstructure design when associated with large language models and could be a user-friendly tool for material design and discovery.

6.
BMC Bioinformatics ; 24(1): 297, 2023 Jul 21.
Article in English | MEDLINE | ID: mdl-37480001

ABSTRACT

BACKGROUND: Protein engineering aims to improve the functional properties of existing proteins to meet people's needs. Current deep learning-based models have captured evolutionary, functional, and biochemical features contained in amino acid sequences. However, the existing generative models need to be improved when capturing the relationship between amino acid sites on longer sequences. At the same time, the distribution of protein sequences in the homologous family has a specific positional relationship in the latent space. We want to use this relationship to search for new variants directly from the vicinity of better-performing variants. RESULTS: To improve the representation learning ability of the model for longer sequences and the similarity between the generated sequences and the original sequences, we propose a temporal variational autoencoder (T-VAE) model. T-VAE consists of an encoder and a decoder. The encoder expands the receptive field of neurons in the network structure by dilated causal convolution, thereby improving the encoding representation ability for longer sequences. The decoder decodes the sampled data into variants closely resembling the original sequence. CONCLUSION: Compared to other models, the Pearson correlation coefficient between the protein fitness values predicted by T-VAE and the ground-truth values was higher, and the mean absolute deviation was lower. In addition, the T-VAE model has a better representation learning ability for longer sequences when comparing the encoding of protein sequences of different lengths. These results show that our model has more advantages in representation learning for longer sequences. To verify the model's generative effect, we also calculated the sequence identity between the generated data and the input data. The sequence identity obtained by T-VAE improved by 12.9% compared to the baseline model.


Subject(s)
Amino Acids , Biological Evolution , Humans , Mutant Proteins , Amino Acid Sequence , Learning
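
Entry 6 states that the T-VAE encoder widens its receptive field with dilated causal convolution. The sketch below shows, in plain Python, what a single dilated causal convolution computes and how the receptive field of a stack grows with the dilation rates. The function names, the zero left-padding, and the example dilation schedule are assumptions for illustration only; the abstract gives no layer sizes or dilation rates.

```python
def dilated_causal_conv1d(x, weights, dilation):
    """1D causal convolution: out[t] uses only x[t], x[t - d], x[t - 2d], ...

    Positions before the start of the sequence are treated as zeros,
    so the output has the same length as the input.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            idx = t - i * dilation  # look back i * dilation steps, never forward
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input positions that can influence one output of the stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

With a kernel of size 2 and the hypothetical dilation schedule 1, 2, 4, 8, four layers already cover 16 sequence positions, which is why doubling the dilation per layer is a common way to reach long-range residues cheaply.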
7.
BMC Bioinformatics ; 24(1): 233, 2023 Jun 05.
Article in English | MEDLINE | ID: mdl-37277701

ABSTRACT

BACKGROUND: Three-dimensional structures of protein-ligand complexes provide valuable insights into their interactions and are crucial for molecular biological studies and drug design. However, their high-dimensional and multimodal nature hinders end-to-end modeling, and earlier approaches depend inherently on existing protein structures. To overcome these limitations and expand the range of complexes that can be accurately modeled, it is necessary to develop efficient end-to-end methods. RESULTS: We introduce an equivariant diffusion-based generative model that learns the joint distribution of ligand and protein conformations conditioned on the molecular graph of a ligand and the sequence representation of a protein extracted from a pre-trained protein language model. Benchmark results show that this protein structure-free model is capable of generating diverse structures of protein-ligand complexes, including those with correct binding poses. Further analyses indicate that the proposed end-to-end approach is particularly effective when the ligand-bound protein structure is not available. CONCLUSION: The present results demonstrate the effectiveness and generative capability of our end-to-end complex structure modeling framework with diffusion-based generative models. We suppose that this framework will lead to better modeling of protein-ligand complexes, and we expect further improvements and wide applications.


Subject(s)
Drug Design , Proteins , Ligands , Proteins/chemistry , Protein Conformation
8.
Neuroimage ; 274: 120142, 2023 Jul 01.
Article in English | MEDLINE | ID: mdl-37120044

ABSTRACT

Resting-state magnetoencephalography (MEG) data show complex but structured spatiotemporal patterns. However, the neurophysiological basis of these signal patterns is not fully known and the underlying signal sources are mixed in MEG measurements. Here, we developed a method based on the nonlinear independent component analysis (ICA), a generative model trainable with unsupervised learning, to learn representations from resting-state MEG data. After being trained with a large dataset from the Cam-CAN repository, the model has learned to represent and generate patterns of spontaneous cortical activity using latent nonlinear components, which reflects principal cortical patterns with specific spectral modes. When applied to the downstream classification task of audio-visual MEG, the nonlinear ICA model achieves competitive performance with deep neural networks despite limited access to labels. We further validate the generalizability of the model across different datasets by applying it to an independent neurofeedback dataset for decoding the subject's attentional states, providing a real-time feature extraction and decoding mindfulness and thought-inducing tasks with an accuracy of around 70% at the individual level, which is much higher than obtained by linear ICA or other baseline methods. Our results demonstrate that nonlinear ICA is a valuable addition to existing tools, particularly suited for unsupervised representation learning of spontaneous MEG activity which can then be applied to specific goals or tasks when labelled data are scarce.


Subject(s)
Magnetoencephalography , Neurofeedback , Humans , Magnetoencephalography/methods , Brain/physiology , Neurofeedback/methods , Neural Networks, Computer , Attention
9.
Brief Bioinform ; 22(6)2021 Nov 05.
Article in English | MEDLINE | ID: mdl-34415297

ABSTRACT

Deep generative models have surged in popularity in the deep learning community since they were proposed. These models are designed to generate new synthetic data, including images, videos and texts, by fitting approximate distributions of the data. In the last few years, deep generative models have shown superior performance in drug discovery, especially de novo molecular design. In this study, deep generative models are reviewed to survey the recent advances in de novo molecular design for drug discovery. In addition, we divide those models into two categories based on molecular representations in silico. These two classical types of models are then reported in detail, and their pros and cons are discussed. We also indicate the current challenges in deep generative models for de novo molecular design. Automatic de novo molecular design is promising, but there is still a long road to be explored.


Subject(s)
Deep Learning , Drug Design/methods , Drug Discovery/methods , Models, Molecular
10.
Sensors (Basel) ; 22(24)2022 Dec 16.
Article in English | MEDLINE | ID: mdl-36560269

ABSTRACT

Chemical agents are one of the major threats to soldiers in modern warfare, so it is important to detect chemical agents rapidly and accurately on battlefields. Raman spectroscopy-based detectors are widely used but have many limitations. The Raman spectrum changes unpredictably due to various environmental factors, and it is hard for detectors to make appropriate judgments about new chemical substances without prior information. Thus, the existing detectors, which rely on inflexible rule-based techniques, cannot deal with such problems flexibly and reactively. Artificial intelligence (AI)-based detection techniques can be good alternatives to the existing techniques for chemical agent detection. To build AI-based detection systems, sufficient amounts of training data are required, but it is not easy to produce and handle fatal chemical agents, which makes it difficult to secure data in advance. To overcome these limitations, in this paper, we propose a distributed Raman spectrum data augmentation system that leverages federated learning (FL) with deep generative models, such as a generative adversarial network (GAN) and an autoencoder. Furthermore, the proposed system combines various additional techniques to generate a large number of Raman spectrum data that are both realistic and diverse. We implemented the proposed system and conducted diverse experiments to evaluate it. The evaluation results validated that the proposed system can train the models more quickly through cooperation among decentralized troops without exchanging raw data, and can generate realistic Raman spectrum data. Moreover, we confirmed that the classification model in the proposed system learned much faster and outperformed the existing systems.


Subject(s)
Artificial Intelligence , Learning , Computer Communication Networks , Judgment
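
Entry 10 describes decentralized units training models cooperatively through federated learning without exchanging raw data. A common way to realize this is for each client to send only its model parameters to a coordinator, which averages them weighted by local dataset size. The sketch below shows that aggregation step. The FedAvg-style weighting, the function name, and the flat list representation of parameters are assumptions for illustration; the abstract does not name the aggregation rule the authors used.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample count.

    client_weights: one flat list of floats per client (same length each)
    client_sizes:   number of local training samples per client
    Only parameters are shared here; raw spectra never leave the clients.
    """
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for k in range(dim):
            averaged[k] += weights[k] * size / total
    return averaged
```

In a full training loop the coordinator would send the averaged parameters back to every client and repeat for several rounds; that loop is omitted here.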
11.
Sensors (Basel) ; 22(20)2022 Oct 17.
Article in English | MEDLINE | ID: mdl-36298252

ABSTRACT

Research on face recognition with masked faces has been increasingly important due to the prolonged COVID-19 pandemic. To make face recognition practical and robust, a large amount of face image data should be acquired for training purposes. However, it is difficult to obtain masked face images for each human subject. To cope with this difficulty, this paper proposes a simple yet practical method to synthesize a realistic masked face for an unseen face image. For this, a cascade of two convolutional auto-encoders (CAEs) has been designed. The former CAE generates a pose-alike face wearing a mask pattern, which is expected to fit the input face in terms of pose view. The output of the former CAE is readily fed into the secondary CAE for extracting a segmentation map that localizes the mask region on the face. Using the segmentation map, the mask pattern can be successfully fused with the input face by means of simple image processing techniques. The proposed method relies on face appearance reconstruction without any facial landmark detection or localization techniques. Extensive experiments with the GTAV Face database and Labeled Faces in the Wild (LFW) database show that the two complementary generators could rapidly and accurately produce synthetic faces even for challenging input faces (e.g., low-resolution face of 25 × 25 pixels with out-of-plane rotations).


Subject(s)
COVID-19 , Facial Recognition , Humans , Pandemics , Image Processing, Computer-Assisted/methods , Databases, Factual
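
Entry 11 says the mask pattern is fused with the input face using the segmentation map and simple image processing. One plausible reading is a per-pixel blend in which the segmentation map acts as an alpha channel. The sketch below implements that blend on grayscale images stored as nested lists. The blending formula and the function name are assumptions; the abstract does not specify the exact fusion operation.

```python
def fuse_mask(face, mask_pattern, seg_map):
    """Blend a mask pattern onto a face using a segmentation map.

    face, mask_pattern: 2D lists of pixel intensities with the same shape
    seg_map:            2D list in [0, 1]; 1.0 marks mask pixels, 0.0 face pixels
    """
    return [
        [s * m + (1.0 - s) * f for f, m, s in zip(face_row, mask_row, seg_row)]
        for face_row, mask_row, seg_row in zip(face, mask_pattern, seg_map)
    ]
```

Fractional values in the segmentation map give a soft transition along the mask boundary, which avoids a hard seam in the synthesized face.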
12.
Entropy (Basel) ; 24(8)2022 Jul 26.
Article in English | MEDLINE | ID: mdl-35893009

ABSTRACT

Convolutional neural networks have greatly improved the performance of image super-resolution. However, perceptual networks have problems such as blurred line structures and a lack of high-frequency information when reconstructing image textures. To mitigate these issues, a generative adversarial network based on multiscale asynchronous learning is proposed in this paper, whereby a pyramid structure is employed in the network model to integrate high-frequency information at different scales. Our scheme employs a U-net as a discriminator to focus on the consistency of adjacent pixels in the input image and uses the LPIPS loss for perceptual extreme super-resolution with stronger supervision. Experiments on benchmark datasets and independent datasets Set5, Set14, BSD100, and SunHays80 show that our approach is effective in restoring detailed texture information from low-resolution images.

13.
Neuroimage ; 241: 118423, 2021 Nov 01.
Article in English | MEDLINE | ID: mdl-34303794

ABSTRACT

Resting state functional magnetic resonance imaging (rsfMRI) data exhibits complex but structured patterns. However, the underlying origins are unclear and entangled in rsfMRI data. Here we establish a variational auto-encoder, as a generative model trainable with unsupervised learning, to disentangle the unknown sources of rsfMRI activity. After being trained with large data from the Human Connectome Project, the model has learned to represent and generate patterns of cortical activity and connectivity using latent variables. The latent representation and its trajectory represent the spatiotemporal characteristics of rsfMRI activity. The latent variables reflect the principal gradients of the latent trajectory and drive activity changes in cortical networks. Representational geometry captured as covariance or correlation between latent variables, rather than cortical connectivity, can be used as a more reliable feature to accurately identify subjects from a large group, even if only a short period of data is available in each subject. Our results demonstrate that VAE is a valuable addition to existing tools, particularly suited for unsupervised representation learning of resting state fMRI activity.


Subject(s)
Brain/diagnostic imaging , Connectome/methods , Individuality , Magnetic Resonance Imaging/methods , Rest , Unsupervised Machine Learning , Brain/physiology , Databases, Factual , Humans , Rest/physiology
14.
Comput Biol Med ; 168: 107738, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37995536

ABSTRACT

Electronic medical records (EMR) have considerable potential to advance healthcare technologies, including medical AI. Nevertheless, due to the privacy issues associated with sharing patients' personal information, it is difficult to utilize them sufficiently. Generative models based on deep learning can solve this problem by creating synthetic data similar to real patient data. However, the data used for training these deep learning models are at risk of being leaked through malicious attacks. This means that traditional deep learning-based generative models cannot completely solve the privacy issues. Therefore, we propose a method to prevent the leakage of training data by protecting the model from malicious attacks using local differential privacy (LDP). Our method was evaluated in terms of utility and privacy. Experimental results demonstrated that the proposed method can generate medical data with reasonable performance while protecting training data from malicious attacks.


Subject(s)
Electronic Health Records , Privacy , Humans , Health Facilities
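
Entry 14 relies on local differential privacy (LDP), in which each value is randomized before it leaves its owner. The textbook LDP mechanism for a single binary attribute is randomized response, sketched below together with the standard correction that recovers an unbiased population rate from the noisy reports. This is offered only as an illustration of the LDP idea; the abstract does not state which mechanism the authors applied, and the function names and the `rng` parameter are assumptions.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it.

    This satisfies epsilon-local differential privacy for one binary value.
    `rng` is a random.Random instance so results can be reproduced.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def estimate_rate(reports, epsilon):
    """Unbiased estimate of the true fraction of 1s from noisy reports.

    Requires epsilon > 0, otherwise the reports carry no information.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    # E[observed] = (1 - p) + rate * (2p - 1), solved for rate.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

Smaller values of `epsilon` flip more answers and give stronger privacy, at the cost of a noisier estimate; this is the utility versus privacy trade-off the abstract evaluates.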
15.
Adv Sci (Weinh) ; 11(26): e2400829, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38704695

ABSTRACT

Self-assembling peptides have numerous applications in medicine, food chemistry, and nanotechnology. However, their discovery has traditionally been serendipitous rather than driven by rational design. Here, HydrogelFinder, a foundation model, is developed for the rational design of self-assembling peptides from scratch. This model explores self-assembly properties through molecular structure, leveraging 1,377 self-assembling non-peptidal small molecules to navigate chemical space and improve structural diversity. Utilizing HydrogelFinder, 111 peptide candidates were generated and 17 peptides were synthesized; the self-assembly and biophysical characteristics of nine peptides, ranging from 1 to 10 amino acids, were subsequently validated experimentally, all within a 19-day workflow. Notably, the two de novo-designed self-assembling peptides demonstrated low cytotoxicity and biocompatibility, as confirmed by live/dead assays. This work highlights the capacity of HydrogelFinder to diversify the design of self-assembling peptides through non-peptidal small molecules, offering a powerful toolkit and paradigm for future peptide discovery endeavors.


Subject(s)
Peptides , Peptides/chemistry
16.
bioRxiv ; 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38370817

ABSTRACT

This study introduces the Deep Normative Tractometry (DNT) framework, which encodes the joint distribution of both macrostructural and microstructural profiles of the brain white matter tracts through a variational autoencoder (VAE). By training on data from healthy controls, DNT learns the normative distribution of tract data and can delineate along-tract micro- and macro-structural abnormalities. Leveraging a large sample size via generative pre-training, we assess DNT's generalizability using transfer learning on data from an independent cohort acquired in India. Our findings demonstrate DNT's capacity to detect widespread diffusivity abnormalities along tracts in mild cognitive impairment and Alzheimer's disease, aligning closely with results from the Bundle Analytics (BUAN) tractometry pipeline. By incorporating tract geometry information, DNT may be able to distinguish disease-related abnormalities in anisotropy from tract macrostructure, and shows promise in enhancing fine-scale mapping and detection of white matter alterations in neurodegenerative conditions.

17.
Interdiscip Sci ; 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38683279

ABSTRACT

The structures of fentanyl and its analogues are easy to modify, and few types have been included in databases so far, which allows criminals to evade the supervision of the relevant departments. This paper introduces a molecular graph-based transformer model, combined with a data augmentation method based on substructure replacement, to generate novel fentanyl analogues. A total of 140,000 molecules were generated, and after a series of screening steps, 36,799 potential fentanyl analogues were obtained. We calculated the molecular properties of these 36,799 potential fentanyl analogues. The results showed that the model could learn some properties of the original fentanyl molecules. We compared the molecules generated by the transformer model with substructure-replacement data augmentation against those generated by two other deep learning-based molecular generation models, and found that the model in this paper can generate more novel potential fentanyl analogues. Finally, the findings indicate that a transformer model based on molecular graphs helps us explore the structure of potential fentanyl molecules and understand the distribution of the original fentanyl molecules.

18.
Accid Anal Prev ; 205: 107688, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38917716

ABSTRACT

Crash scenario-based testing is crucial for assessing autonomous driving safety. However, existing studies on scenario generation tend to prioritize concrete scenarios for direct testing, neglecting the construction of fundamentally functional scenarios with a broader range. Police-reported historical crash data is a valuable supplement, yet detecting all potential crash scenarios is laborious. In order to address this issue, this study proposes an adaptive search sampling framework based on deep generative model and surrogate model (SM) to extract master scenario samples from police-reported historical crash data. The framework starts with selecting representative samples from the full crash dataset as initial master scenario samples using various sampling techniques. Evaluation indexes are then constructed, and derived scenario samples are synthesized using the deep generative model. To enhance efficiency, an SM is established to replace the generative model's training and data generation process. Based on the SM, an adaptive search sampling method is developed, which iteratively adjusts the sampling strategy using the Similarity Score to achieve comprehensive sampling. Experimental results demonstrate the notable advantage of the adaptive search sampling method over other sampling methods. Furthermore, statistical analysis and visualization assessments confirm the effectiveness and accuracy of the proposed method.


Subject(s)
Accidents, Traffic , Automobile Driving , Police , Humans , Automobile Driving/legislation & jurisprudence , Automobile Driving/statistics & numerical data , Accidents, Traffic/prevention & control , Models, Statistical
19.
Cell Syst ; 15(2): 180-192.e7, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38387441

ABSTRACT

Analyzing colocalization of single cells with heterogeneous molecular phenotypes is essential for understanding cell-cell interactions and cellular responses to external stimuli, as well as their biological functions in diseases and tissues. However, existing computational methodologies identify colocalization patterns only between predefined cell populations, which can obscure the molecular signatures arising from intercellular communication. Here, we introduce DeepCOLOR, a computational framework based on a deep generative model that recovers intercellular colocalization networks with single-cell resolution by the integration of single-cell and spatial transcriptomes. Along with colocalized population detection accuracy that is superior to existing methods in a simulated dataset, DeepCOLOR identified plausible cell-cell interaction candidates between colocalized single cells and segregated cell populations defined by the colocalization relationships in mouse brain tissues, human squamous cell carcinoma samples, and human lung tissues infected with SARS-CoV-2. DeepCOLOR is applicable to studying cell-cell interactions behind various spatial niches. A record of this paper's transparent peer review process is included in the supplemental information.


Subject(s)
Cell Communication , Peer Review , Humans , Animals , Mice , Phenotype , SARS-CoV-2 , Single-Cell Analysis
20.
Comput Biol Med ; 173: 108322, 2024 May.
Article in English | MEDLINE | ID: mdl-38554658

ABSTRACT

Patient-derived organoids have proven to be a highly relevant model for evaluating disease mechanisms and drug efficacies, as they closely recapitulate in vivo physiology. Colorectal cancer organoids, specifically, exhibit a diverse range of morphologies, which have been analyzed with image-based profiling. However, the relationship between morphological subtypes and functional parameters of the organoids remains underexplored. Here, we identified two distinct morphological subtypes ("cystic" and "solid") across 31360 bright field images using image-based profiling, which correlated differently with the viability and apoptosis level of colorectal cancer organoids. Leveraging object detection neural networks, we were able to categorize single organoids, with the "cystic" subtype achieving higher viability scores than the "solid" subtype. Furthermore, a deep generative model was proposed to predict apoptosis intensity based on an apoptosis-featured dataset encompassing over 17000 bright field and matched fluorescent images. Notably, a significant correlation of 0.91 between the predicted value and ground truth was achieved, underscoring the feasibility of this generative model as a potential means for assessing organoid functional parameters. The underlying cellular heterogeneity of the organoids, i.e., conserved colonic cell types and rare immune components, was also verified with scRNA sequencing, implying a compromised tumor microenvironment. Additionally, the "cystic" subtype was identified as a relapse phenotype featuring intestinal stem cell signatures, suggesting that this visually discernible relapse phenotype shows potential as a novel biomarker for colorectal cancer diagnosis and prognosis. In summary, our findings demonstrate that the morphological heterogeneity of colorectal cancer organoids explicitly recapitulates the association of phenotypic features and exogenous perturbations through image-based profiling, providing new insights into disease mechanisms.


Subject(s)
Colorectal Neoplasms , Deep Learning , Humans , Colorectal Neoplasms/genetics , Neoplasm Recurrence, Local/metabolism , Neoplasm Recurrence, Local/pathology , Organoids/metabolism , Organoids/pathology , Recurrence , Tumor Microenvironment
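
Entry 20 summarizes prediction quality as a correlation of 0.91 between predicted apoptosis intensity and ground truth. Assuming this refers to the Pearson correlation coefficient (the abstract does not name the coefficient), the sketch below shows how such a value would be computed from paired predictions and measurements. The function name and input layout are hypothetical.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)
```

The coefficient measures only linear agreement, so a high value does not by itself rule out a systematic offset between predicted and measured intensities.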