Results 1 - 20 of 597
1.
Cell ; 183(5): 1249-1263.e23, 2020 11 25.
Article in English | MEDLINE | ID: mdl-33181068

ABSTRACT

The hippocampal-entorhinal system is important for spatial and relational memory tasks. We formally link these domains, provide a mechanistic understanding of the hippocampal role in generalization, and offer unifying principles underlying many entorhinal and hippocampal cell types. We propose medial entorhinal cells form a basis describing structural knowledge, and hippocampal cells link this basis with sensory representations. Adopting these principles, we introduce the Tolman-Eichenbaum machine (TEM). After learning, TEM entorhinal cells display diverse properties resembling apparently bespoke spatial responses, such as grid, band, border, and object-vector cells. TEM hippocampal cells include place and landmark cells that remap between environments. Crucially, TEM also aligns with empirically recorded representations in complex non-spatial tasks. TEM also generates predictions that hippocampal remapping is not random as previously believed; rather, structural knowledge is preserved across environments. We confirm this structural transfer over remapping in simultaneously recorded place and grid cells.


Subject(s)
Entorhinal Cortex/physiology , Generalization, Psychological , Hippocampus/physiology , Memory/physiology , Models, Neurological , Animals , Knowledge , Place Cells/cytology , Sensation , Task Performance and Analysis
2.
Annu Rev Neurosci ; 44: 253-273, 2021 07 08.
Article in English | MEDLINE | ID: mdl-33730510

ABSTRACT

The central theme of this review is the dynamic interaction between information selection and learning. We pose a fundamental question about this interaction: How do we learn what features of our experiences are worth learning about? In humans, this process depends on attention and memory, two cognitive functions that together constrain representations of the world to features that are relevant for goal attainment. Recent evidence suggests that the representations shaped by attention and memory are themselves inferred from experience with each task. We review this evidence and place it in the context of work that has explicitly characterized representation learning as statistical inference. We discuss how inference can be scaled to real-world decisions by approximating beliefs based on a small number of experiences. Finally, we highlight some implications of this inference process for human decision-making in social environments.


Subject(s)
Cognition , Learning , Attention , Humans
3.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38446740

ABSTRACT

Protein annotation has long been a challenging task in computational biology. Gene Ontology (GO) has become one of the most popular frameworks to describe protein functions and their relationships. Prediction of a protein annotation with proper GO terms demands high-quality GO term representation learning, which aims to learn a low-dimensional dense vector representation with accompanying semantic meaning for each functional label, also known as embedding. However, existing GO term embedding methods, which mainly take into account ancestral co-occurrence information, have yet to capture the full topological information in the GO-directed acyclic graph (DAG). In this study, we propose a novel GO term representation learning method, PO2Vec, to utilize the partial order relationships to improve the GO term representations. Extensive evaluations show that PO2Vec achieves better outcomes than existing embedding methods in a variety of downstream biological tasks. Based on PO2Vec, we further developed a new protein function prediction method PO2GO, which demonstrates superior performance measured in multiple metrics and annotation specificity as well as few-shot prediction capability in the benchmarks. These results suggest that the high-quality representation of GO structure is critical for diverse biological tasks including computational protein annotation.


Subject(s)
Benchmarking , Computational Biology , Gene Ontology , Learning , Molecular Sequence Annotation
4.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701412

ABSTRACT

Trajectory inference is a crucial task in single-cell RNA-sequencing downstream analysis, which can reveal the dynamic processes of biological development, including cell differentiation. Dimensionality reduction is an important step in the trajectory inference process. However, most existing trajectory methods rely on cell features derived from traditional dimensionality reduction methods, such as principal component analysis and uniform manifold approximation and projection. These methods are not specifically designed for trajectory inference and fail to fully leverage prior information from upstream analysis, limiting their performance. Here, we introduce scCRT, a novel dimensionality reduction model for trajectory inference. To use prior information to learn accurate cell representations, scCRT integrates two feature learning components: a cell-level pairwise module and a cluster-level contrastive module. The cell-level module focuses on learning accurate cell representations in a reduced-dimensionality space while maintaining the cell-cell positional relationships in the original space. The cluster-level contrastive module uses prior cell state information to aggregate similar cells, preventing excessive dispersion in the low-dimensional space. Experimental findings from 54 real and 81 synthetic datasets, totaling 135 datasets, highlighted the superior performance of scCRT compared with commonly used trajectory inference methods. Additionally, an ablation study revealed that both cell-level and cluster-level modules enhance the model's ability to learn accurate cell features, facilitating cell lineage inference. The source code of scCRT is available at https://github.com/yuchen21-web/scCRT-for-scRNA-seq.


Subject(s)
Algorithms , Single-Cell Gene Expression Analysis , Computational Biology/methods , RNA-Seq/methods , Single-Cell Gene Expression Analysis/methods , Software
5.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36642408

ABSTRACT

Current machine learning-based methods have achieved inspiring predictions in the scenarios of mono-type and multi-type drug-drug interactions (DDIs), but they all ignore enhancive and depressive pharmacological changes triggered by DDIs. In addition, these pharmacological changes are asymmetric since the roles of two drugs in an interaction are different. More importantly, these pharmacological changes imply significant topological patterns among DDIs. To address the above issues, we first leverage Balance theory and Status theory in social networks to reveal the topological patterns among directed pharmacological DDIs, which are modeled as a signed and directed network. Then, we design a novel graph representation learning model named SGRL-DDI (social theory-enhanced graph representation learning for DDI) to realize the multitask prediction of DDIs. The SGRL-DDI model captures the task-joint information by integrating relation graph convolutional networks with Balance and Status patterns. Moreover, we utilize task-specific deep neural networks to perform two tasks, including the prediction of enhancive/depressive DDIs and the prediction of directed DDIs. Based on DDI entries collected from DrugBank, the superiority of our model is demonstrated by the comparison with other state-of-the-art methods. Furthermore, the ablation study verifies that Balance and Status patterns help characterize directed pharmacological DDIs, and that jointly learning the two tasks provides better DDI representations than individual tasks. Last, we demonstrate the practical effectiveness of our model with a version-dependent test, in which 88.47% and 81.38% of the DDIs newly added in the latest release of DrugBank are validated in the two prediction tasks, respectively.


Subject(s)
Machine Learning , Neural Networks, Computer , Drug Interactions
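
Note on entry 5: the abstract models DDIs as a signed network and appeals to Balance theory. As a rough illustration only, the sketch below applies the standard Balance-theory rule (a triangle is balanced when the product of its edge signs is positive) to a toy undirected graph. The mapping of +1 to enhancive and -1 to depressive, the drug labels, and the function names are assumptions made for this example; the sketch ignores edge direction and Status theory, and it is not the SGRL-DDI implementation.

```python
from itertools import combinations


def triangle_is_balanced(s_ab, s_bc, s_ac):
    """Balance theory: a triangle is balanced if the product of its signs is positive."""
    return s_ab * s_bc * s_ac > 0


def balanced_fraction(signed_edges):
    """Fraction of balanced triangles in an undirected signed graph.

    signed_edges maps frozenset({u, v}) -> +1 or -1.
    """
    nodes = sorted({n for edge in signed_edges for n in edge})
    balanced = total = 0
    for a, b, c in combinations(nodes, 3):
        keys = (frozenset((a, b)), frozenset((b, c)), frozenset((a, c)))
        # Only closed triangles (all three edges present) are counted.
        if all(k in signed_edges for k in keys):
            total += 1
            balanced += triangle_is_balanced(*(signed_edges[k] for k in keys))
    return balanced / total if total else 0.0


# Hypothetical toy network: +1 = enhancive, -1 = depressive (assumed encoding).
edges = {
    frozenset(("A", "B")): +1,
    frozenset(("B", "C")): -1,
    frozenset(("A", "C")): -1,
    frozenset(("C", "D")): +1,
    frozenset(("B", "D")): +1,
}
fraction = balanced_fraction(edges)
```

Here the triangle A-B-C has one positive and two negative edges and is balanced, while B-C-D has a single negative edge and is not.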
6.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36642409

ABSTRACT

Protein language models, trained on millions of biologically observed sequences, generate feature-rich numerical representations of protein sequences. These representations, called sequence embeddings, can infer structure-functional properties, despite protein language models being trained on primary sequence alone. While sequence embeddings have been applied toward tasks such as structure and function prediction, applications toward alignment-free sequence classification have been hindered by the lack of studies to derive, quantify and evaluate relationships between protein sequence embeddings. Here, we develop workflows and visualization methods for the classification of protein families using sequence embedding derived from protein language models. A benchmark of manifold visualization methods reveals that Neighbor Joining (NJ) embedding trees are highly effective in capturing global structure while achieving similar performance in capturing local structure compared with popular dimensionality reduction techniques such as t-SNE and UMAP. The statistical significance of hierarchical clusters on a tree is evaluated by resampling embeddings using a variational autoencoder (VAE). We demonstrate the application of our methods in the classification of two well-studied enzyme superfamilies, phosphatases and protein kinases. Our embedding-based classifications remain consistent with and extend upon previously published sequence alignment-based classifications. We also propose a new hierarchical classification for the S-Adenosyl-L-Methionine (SAM) enzyme superfamily which has been difficult to classify using traditional alignment-based approaches. Beyond applications in sequence classification, our results further suggest NJ trees are a promising general method for visualizing high-dimensional data sets.


Subject(s)
Amino Acid Sequence , Proteins , Cluster Analysis , Proteins/chemistry , Sequence Alignment
7.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37742053

ABSTRACT

Identifying potential bacteriophage (phage) candidates to treat bacterial infections plays an essential role in the research of human pathogens. Computational approaches are recognized as a valid way to predict bacteria and target phages. However, most of the current methods only utilize lower-order biological information without considering the higher-order connectivity patterns, which help to improve the predictive accuracy. Therefore, we developed a novel microbial heterogeneous interaction network (MHIN)-based model called PTBGRP to predict new phages for bacterial hosts. Specifically, PTBGRP first constructs an MHIN by integrating phage-bacteria interaction (PBI) and six bacteria-bacteria interaction networks with their biological attributes. Then, different representation learning methods are deployed to extract higher-level biological features and lower-level topological features from MHIN. Finally, PTBGRP employs a deep neural network as the classifier to predict unknown PBI pairs based on the fused biological information. Experimental results demonstrated that PTBGRP achieves the best performance on the corresponding ESKAPE pathogens and PBI dataset when compared with state-of-the-art methods. In addition, case studies of Klebsiella pneumoniae and Staphylococcus aureus further indicate that the consideration of rich heterogeneous information enables PTBGRP to accurately predict PBI from a more comprehensive perspective. The webserver of the PTBGRP predictor is freely available at http://120.77.11.78/PTBGRP/.


Subject(s)
Bacteriophages , Staphylococcal Infections , Humans , Learning , Bacteria , Neural Networks, Computer
8.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38040492

ABSTRACT

Accurate prediction of TCR-pMHC binding is important for the development of cancer immunotherapies, especially TCR-based agents. Existing algorithms often experience diminished performance when dealing with unseen epitopes, primarily due to the complexity in TCR-pMHC recognition patterns and the scarcity of available data for training. We have developed a novel deep learning model, 'TCR Antigen Binding Recognition' based on BERT, named as TABR-BERT. Leveraging BERT's potent representation learning capabilities, TABR-BERT effectively captures essential information regarding TCR-pMHC interactions from TCR sequences, antigen epitope sequences and epitope-MHC binding. By transferring this knowledge to predict TCR-pMHC recognition, TABR-BERT demonstrated better results in benchmark tests than existing methods, particularly for unseen epitopes.


Subject(s)
Algorithms , Receptors, Antigen, T-Cell , Receptors, Antigen, T-Cell/genetics , Protein Binding , Epitopes/metabolism , Machine Learning
9.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37950905

ABSTRACT

Cancer genomics is dedicated to elucidating the genes and pathways that contribute to cancer progression and development. Identifying cancer genes (CGs) associated with the initiation and progression of cancer is critical for characterization of molecular-level mechanisms in cancer research. In recent years, the growing availability of high-throughput molecular data and advancements in deep learning technologies have enabled the modelling of complex interactions and topological information within genomic data. Nevertheless, because of the limited labelled data, pinpointing CGs from a multitude of potential mutations remains an exceptionally challenging task. To address this, we propose a novel deep learning framework, termed self-supervised masked graph learning (SMG), which comprises SMG reconstruction (pretext task) and task-specific fine-tuning (downstream task). In the pretext task, the nodes of multi-omic featured protein-protein interaction (PPI) networks are randomly substituted with a defined mask token. The PPI networks are then reconstructed using the graph neural network (GNN)-based autoencoder, which explores the node correlations in a self-prediction manner. In the downstream tasks, the pre-trained GNN encoder embeds the input networks into feature graphs, whereas a task-specific layer proceeds with the final prediction. To assess the performance of the proposed SMG method, benchmarking experiments are performed on three node-level tasks (identification of CGs, essential genes and healthy driver genes) and one graph-level task (identification of disease subnetwork) across eight PPI networks. Benchmarking experiments and performance comparison with existing state-of-the-art methods demonstrate the superiority of SMG on multi-omic feature engineering.


Subject(s)
Neoplasms , Oncogenes , Mutation , Benchmarking , Genes, Essential , Genomics , Neoplasms/genetics
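
Note on entry 9: the pretext task in the abstract randomly substitutes network nodes with a mask token before reconstruction. The sketch below illustrates only that masking step on a toy feature dictionary. The mask rate, the string placeholder `MASK_TOKEN`, the gene labels, and the function name are hypothetical; in a real model the mask token would more plausibly be a vector, and the abstract does not specify these details.

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical placeholder standing in for the mask token


def mask_nodes(features, mask_rate, seed=0):
    """Replace the features of a random subset of nodes with MASK_TOKEN.

    features maps node id -> feature vector. Returns the masked copy and the
    set of masked node ids (the reconstruction targets).
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_mask = int(round(mask_rate * len(features)))
    masked_ids = set(rng.sample(sorted(features), n_mask))
    masked = {
        node: (MASK_TOKEN if node in masked_ids else vec)
        for node, vec in features.items()
    }
    return masked, masked_ids


# Toy multi-omic features for four hypothetical genes.
features = {
    "gene_a": [0.1, 0.9],
    "gene_b": [0.4, 0.2],
    "gene_c": [0.7, 0.3],
    "gene_d": [0.5, 0.5],
}
masked, masked_ids = mask_nodes(features, mask_rate=0.5, seed=0)
```

An autoencoder would then be trained to recover the original features of the nodes in `masked_ids` from the masked graph.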
10.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38084923

ABSTRACT

The stability of the gut microenvironment is inextricably linked to human health, with the onset of many diseases accompanied by dysbiosis of the gut microbiota. It has been reported that there are differences in the microbial community composition between patients and healthy individuals, and many microbes are considered potential biomarkers. Accurately identifying these biomarkers can lead to more precise and reliable clinical decision-making. To improve the accuracy of microbial biomarker identification, this study introduces WSGMB, a computational framework that uses the relative abundance of microbial taxa and health status as inputs. This method has two main contributions: (1) viewing the microbial co-occurrence network as a weighted signed graph and applying graph convolutional neural network techniques for graph classification; (2) designing a new architecture to compute the role transitions of each microbial taxon between health and disease networks, thereby identifying disease-related microbial biomarkers. The weighted signed graph neural network enhances the quality of graph embeddings; quantifying the importance of microbes in different co-occurrence networks better identifies those microbes critical to health. Microbes are ranked according to their importance change scores, and when this score exceeds a set threshold, the microbe is considered a biomarker. This framework's identification performance is validated by comparing the biomarkers identified by WSGMB with actual microbial biomarkers associated with specific diseases from public literature databases. The study tests the proposed computational framework using actual microbial community data from colorectal cancer and Crohn's disease samples and compares it with the most advanced microbial biomarker identification methods. The results show that the WSGMB method outperforms similar approaches in the accuracy of microbial biomarker identification.


Subject(s)
Crohn Disease , Gastrointestinal Microbiome , Microbiota , Humans , Neural Networks, Computer , Biomarkers
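
Note on entry 10: the abstract states that microbes are ranked by an importance change score and flagged as biomarkers when the score exceeds a set threshold. The sketch below shows that ranking-and-threshold step in isolation. Defining the change score as the absolute difference between a taxon's importance in the disease and health networks is an assumption for illustration, as are the taxon labels, the numbers, and the threshold; the abstract does not say how WSGMB computes the score.

```python
def rank_biomarkers(health_importance, disease_importance, threshold):
    """Rank taxa by importance change and keep those above the threshold.

    The change score used here (absolute difference) is an assumed stand-in.
    Returns a list of (taxon, score) pairs sorted from largest to smallest.
    """
    changes = {
        taxon: abs(disease_importance[taxon] - health_importance[taxon])
        for taxon in health_importance
    }
    ranked = sorted(changes.items(), key=lambda kv: kv[1], reverse=True)
    return [(taxon, score) for taxon, score in ranked if score > threshold]


# Made-up importance values for three hypothetical taxa.
health = {"taxon_a": 0.10, "taxon_b": 0.40, "taxon_c": 0.25}
disease = {"taxon_a": 0.60, "taxon_b": 0.45, "taxon_c": 0.05}

biomarkers = rank_biomarkers(health, disease, threshold=0.15)
biomarker_names = [taxon for taxon, _ in biomarkers]
```

With these toy values, `taxon_a` and `taxon_c` pass the threshold and `taxon_b` does not.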
11.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37253698

ABSTRACT

Spatially resolved transcriptomics (SRT) enable the comprehensive characterization of transcriptomic profiles in the context of tissue microenvironments. Unveiling spatial transcriptional heterogeneity needs to effectively incorporate spatial information accounting for the substantial spatial correlation of expression measurements. Here, we develop a computational method, SpaSRL (spatially aware self-representation learning), which flexibly enhances and decodes spatial transcriptional signals to simultaneously achieve spatial domain detection and spatial functional genes identification. This novel tunable spatially aware strategy of SpaSRL not only balances spatial and transcriptional coherence for the two tasks, but also can transfer spatial correlation constraint between them based on a unified model. In addition, this joint analysis by SpaSRL deciphers accurate and fine-grained tissue structures and ensures the effective extraction of biologically informative genes underlying spatial architecture. We verified the superiority of SpaSRL on spatial domain detection, spatial functional genes identification and data denoising using multiple SRT datasets obtained by different platforms and tissue sections. Our results illustrate SpaSRL's utility in flexible integration of spatial information and novel discovery of biological insights from spatial transcriptomic datasets.


Subject(s)
Gene Expression Profiling , Learning , Transcriptome
12.
BMC Biol ; 22(1): 156, 2024 Jul 18.
Article in English | MEDLINE | ID: mdl-39020316

ABSTRACT

BACKGROUND: Identification of potential drug-target interactions (DTIs) with high accuracy is a key step in drug discovery and repositioning, especially concerning specific drug targets. Traditional experimental methods for identifying the DTIs are arduous, time-intensive, and financially burdensome. In addition, robust computational methods have been developed for predicting the DTIs and are widely applied in drug discovery research. However, advancing more precise algorithms for predicting DTIs is essential to meet the stringent standards demanded by drug discovery. RESULTS: We proposed a novel method called GSRF-DTI, which integrates networks with a deep learning algorithm to identify DTIs. Firstly, GSRF-DTI learned the embedding representation of drugs and targets by integrating multiple drug association information and target association information, respectively. Then, GSRF-DTI considered the influence of drug-target pair (DTP) association on DTI prediction to construct a drug-target pair network (DTP-NET). Next, we utilized GraphSAGE on DTP-NET to learn the potential features of the network and applied random forest (RF) to predict the DTIs. Furthermore, we conducted ablation experiments to validate the necessity of integrating different types of network features for identifying DTIs. It is worth noting that GSRF-DTI proposed three novel DTIs. CONCLUSIONS: GSRF-DTI not only considered the influence of the interaction relationship between drug and target but also considered the impact of DTP association relationship on DTI prediction. We initially use GraphSAGE to aggregate the neighbor information of nodes for better identification. Experimental analysis on Luo's dataset and the newly constructed dataset revealed that the GSRF-DTI framework outperformed several state-of-the-art methods significantly.


Subject(s)
Drug Discovery , Drug Discovery/methods , Deep Learning , Computational Biology/methods , Algorithms , Pharmaceutical Preparations
13.
BMC Bioinformatics ; 25(1): 251, 2024 Jul 31.
Article in English | MEDLINE | ID: mdl-39085787

ABSTRACT

BACKGROUND: Detecting event triggers in biomedical texts, which contain domain knowledge and context-dependent terms, is more challenging than in general-domain texts. Most state-of-the-art models rely mainly on external resources such as linguistic tools and knowledge bases to improve system performance. However, they lack effective mechanisms to obtain semantic clues from label specification and sentence context. Given its success in image classification, label representation learning is a promising approach to enhancing biomedical event trigger detection models by leveraging the rich semantics of pre-defined event type labels. RESULTS: In this paper, we propose the Biomedical Label-based Synergistic representation Learning (BioLSL) model, which effectively utilizes event type labels by learning their correlation with trigger words and enriches the representation contextually. The BioLSL model consists of three modules. Firstly, the Domain-specific Joint Encoding module employs a transformer-based, domain-specific pre-trained architecture to jointly encode input sentences and pre-defined event type labels. Secondly, the Label-based Synergistic Representation Learning module learns the semantic relationships between input texts and event type labels, and generates a Label-Trigger Aware Representation (LTAR) and a Label-Context Aware Representation (LCAR) for enhanced semantic representations. Finally, the Trigger Classification module makes structured predictions, where each label is predicted with respect to its neighbours. We conduct experiments on three benchmark BioNLP datasets, namely MLEE, GE09, and GE11, to evaluate our proposed BioLSL model. Results show that BioLSL has achieved state-of-the-art performance, outperforming the baseline models. CONCLUSIONS: The proposed BioLSL model demonstrates good performance for biomedical event trigger detection without using any external resources. 
This suggests that label representation learning and context-aware enhancement are promising directions for improving the task. The key enhancement is that BioLSL effectively learns to construct semantic linkages between the event mentions and type labels, which provide the latent information of label-trigger and label-context relationships in biomedical texts. Moreover, additional experiments on BioLSL show that it performs exceptionally well in data-scarce scenarios with limited training data.


Subject(s)
Semantics , Natural Language Processing , Machine Learning , Data Mining/methods , Algorithms
14.
BMC Bioinformatics ; 25(1): 116, 2024 Mar 16.
Article in English | MEDLINE | ID: mdl-38493095

ABSTRACT

BACKGROUND: The integration of single-cell RNA sequencing data from multiple experimental batches and diverse biological conditions holds significant importance in the study of cellular heterogeneity. RESULTS: To expedite the exploration of systematic disparities under various biological contexts, we propose an scRNA-seq integration method called scDisco, which involves a domain-adaptive decoupling representation learning strategy for the integration of dissimilar single-cell RNA data. It constructs a condition-specific domain-adaptive network founded on variational autoencoders. scDisco not only effectively reduces batch effects but also successfully disentangles biological effects and condition-specific effects, and further augments condition-specific representations through condition-specific Domain-Specific Batch Normalization layers. This enhancement enables the identification of genes specific to particular conditions. The effectiveness and robustness of scDisco as an integration method were analyzed using both simulated and real datasets, and the results demonstrate that scDisco can yield high-quality visualizations and quantitative outcomes. Furthermore, scDisco has been validated using real datasets, affirming its proficiency in cell clustering quality, retaining batch-specific cell types and identifying condition-specific genes. CONCLUSION: scDisco is an effective integration method based on variational autoencoders, which improves analytical tasks of reducing batch effects, cell clustering, retaining batch-specific cell types and identifying condition-specific genes.


Subject(s)
Learning , Single-Cell Gene Expression Analysis , Cluster Analysis , RNA , Single-Cell Analysis , Sequence Analysis, RNA , Gene Expression Profiling , Algorithms
15.
BMC Bioinformatics ; 25(1): 250, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39080535

ABSTRACT

BACKGROUND: The potential benefits of drug combination synergy in cancer medicine are significant, yet the risks must be carefully managed due to the possibility of increased toxicity. Although artificial intelligence applications have demonstrated notable success in predicting drug combination synergy, several key challenges persist: (1) Existing models often predict average synergy values across a restricted range of testing dosages, neglecting crucial dose amounts and the mechanisms of action of the drugs involved. (2) Many graph-based models rely on static protein-protein interactions, failing to adapt to dynamic and higher-order relationships. These limitations constrain the applicability of current methods. RESULTS: We introduce SAFER, a Sub-hypergraph Attention-based graph model, addressing these issues by incorporating complex relationships among biological knowledge networks and considering dosing effects on subject-specific networks. SAFER outperformed previous models on the benchmark and the independent test set. The analysis of subgraph attention weight for the lung cancer cell line highlighted JAK-STAT signaling pathway, PRDM12, ZNF781, and CDC5L that have been implicated in lung fibrosis. CONCLUSIONS: SAFER presents an interpretable framework designed to identify drug-responsive signals. Tailored for comprehending dose effects on subject-specific molecular contexts, our model uniquely captures dose-level drug combination responses. This capability unlocks previously inaccessible avenues of investigation compared to earlier models. Furthermore, the SAFER framework can be leveraged by future inquiries to investigate molecular networks that uniquely characterize individual patients and can be applied to prioritize personalized effective treatment based on safe dose combinations.


Subject(s)
Neural Networks, Computer , Humans , Cell Line, Tumor , Drug Synergism , Lung Neoplasms/drug therapy , Lung Neoplasms/metabolism , Dose-Response Relationship, Drug , Signal Transduction/drug effects , Antineoplastic Agents/pharmacology
16.
BMC Bioinformatics ; 25(1): 47, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38291362

ABSTRACT

Drug-drug interactions (DDI) are a critical concern in healthcare due to their potential to cause adverse effects and compromise patient safety. Supervised machine learning models for DDI prediction need to be optimized to learn abstract, transferable features, and generalize to larger chemical spaces, primarily due to the scarcity of high-quality labeled DDI data. Inspired by recent advances in computer vision, we present SMR-DDI, a self-supervised framework that leverages contrastive learning to embed drugs into a scaffold-based feature space. Molecular scaffolds represent the core structural motifs that drive pharmacological activities, making them valuable for learning informative representations. Specifically, we pre-trained SMR-DDI on a large-scale unlabeled molecular dataset. We generated augmented views for each molecule via SMILES enumeration and optimized the embedding process through contrastive loss minimization between views. This enables the model to capture relevant and robust molecular features while reducing noise. We then transfer the learned representations for the downstream prediction of DDI. Experiments show that the new feature space has comparable expressivity to state-of-the-art molecular representations and achieved competitive DDI prediction results while training on less data. Additional investigations also revealed that pre-training on more extensive and diverse unlabeled molecular datasets improved the model's capability to embed molecules more effectively. Our results highlight contrastive learning as a promising approach for DDI prediction that can identify potentially hazardous drug combinations using only structural information.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Humans , Drug Interactions , Supervised Machine Learning
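
Note on entry 16: the abstract describes optimizing embeddings by minimizing a contrastive loss between augmented views of the same molecule. The abstract does not name the loss, so the sketch below uses an InfoNCE-style loss with cosine similarity as an assumed stand-in. The two-dimensional vectors, the temperature value, and the function names are illustrative only; in the described setting the positive would be the embedding of a second SMILES enumeration of the same molecule.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss for one anchor (assumed loss, not confirmed by the source).

    The loss is low when the anchor is closer to its positive view than to
    the negatives (views of other molecules).
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy embeddings: the positive is nearly aligned with the anchor.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

loss_aligned = info_nce(anchor, positive, negatives)
# Swapping in an orthogonal "positive" should give a larger loss.
loss_misaligned = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

Minimizing this quantity over many molecules pulls views of the same molecule together and pushes views of different molecules apart.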
17.
Hum Brain Mapp ; 45(1): e26581, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38224537

ABSTRACT

Eating behavior is highly heterogeneous across individuals and cannot be fully explained using only the degree of obesity. We utilized unsupervised machine learning and functional connectivity measures to explore the heterogeneity of eating behaviors measured by a self-assessment instrument using 424 healthy adults (mean ± standard deviation [SD] age = 47.07 ± 18.89 years; 67% female). We generated low-dimensional representations of functional connectivity using resting-state functional magnetic resonance imaging and estimated latent features using the feature representation capabilities of an autoencoder by nonlinearly compressing the functional connectivity information. The clustering approaches applied to latent features identified three distinct subgroups. The subgroups exhibited different levels of hunger traits, while their body mass indices were comparable. The results were replicated in an independent dataset consisting of 212 participants (mean ± SD age = 38.97 ± 19.80 years; 35% female). The model interpretation technique of integrated gradients revealed that the between-group differences in the integrated gradient maps were associated with functional reorganization in heteromodal association and limbic cortices and reward-related subcortical structures such as the accumbens, amygdala, and caudate. The cognitive decoding analysis revealed that these systems are associated with reward- and emotion-related systems. Our findings provide insights into the macroscopic brain organization of eating behavior-related subgroups independent of obesity.


Subject(s)
Magnetic Resonance Imaging , Obesity , Adult , Humans , Female , Middle Aged , Aged , Young Adult , Male , Magnetic Resonance Imaging/methods , Brain/diagnostic imaging , Brain Mapping/methods , Feeding Behavior
18.
Hum Brain Mapp ; 45(11): e26795, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39045881

ABSTRACT

The architecture of the brain is too complex to be intuitively surveyable without the use of compressed representations that project its variation into a compact, navigable space. The task is especially challenging with high-dimensional data, such as gene expression, where the joint complexity of anatomical and transcriptional patterns demands maximum compression. The established practice is to use standard principal component analysis (PCA), whose computational felicity is offset by limited expressivity, especially at great compression ratios. Employing whole-brain, voxel-wise Allen Brain Atlas transcription data, here we systematically compare compressed representations based on the most widely supported linear and non-linear methods (PCA, kernel PCA, non-negative matrix factorisation (NMF), t-stochastic neighbour embedding (t-SNE), uniform manifold approximation and projection (UMAP), and deep auto-encoding), quantifying reconstruction fidelity, anatomical coherence, and predictive utility across signalling, microstructural, and metabolic targets, drawn from large-scale open-source MRI and PET data. We show that deep auto-encoders yield superior representations across all metrics of performance and target domains, supporting their use as the reference standard for representing transcription patterns in the human brain.


Subject(s)
Brain , Magnetic Resonance Imaging , Genetic Transcription , Humans , Brain/diagnostic imaging , Brain/metabolism , Genetic Transcription/physiology , Positron-Emission Tomography , Computer-Assisted Image Processing/methods , Principal Component Analysis , Data Compression/methods , Atlases as Topic
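
The reconstruction-fidelity idea in the abstract above can be made concrete with a small sketch. This is an illustration under stated assumptions, not the authors' benchmark: it covers only the linear (PCA) side, compresses to a single component found by power iteration, and uses tiny synthetic data sets. It shows why a linear projection reconstructs linearly structured data exactly but leaves a residual on curved data, which is the limitation that motivates non-linear methods.

```python
import math


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def top_component(centered, iters=500):
    """Leading principal axis via power iteration on X^T X."""
    d = len(centered[0])
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
        w = [sum(s * r[j] for s, r in zip(scores, centered)) for j in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def reconstruction_mse(rows):
    """Mean squared error after compressing each row to one PCA score."""
    centered = center(rows)
    v = top_component(centered)
    d = len(v)
    total = 0.0
    for r in centered:
        score = sum(r[j] * v[j] for j in range(d))
        total += sum((r[j] - score * v[j]) ** 2 for j in range(d))
    return total / (len(rows) * d)


# Synthetic data (illustration only): points on a line vs. points on a parabola.
linear_data = [[float(t), 2.0 * t, -1.0 * t] for t in range(5)]
curved_data = [[float(t), float(t * t)] for t in range(-2, 3)]

mse_linear = reconstruction_mse(linear_data)
mse_curved = reconstruction_mse(curved_data)
```

On the straight-line data one component suffices, so the error is essentially zero; on the parabola a single linear axis cannot follow the curve and a residual remains.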
19.
Brief Bioinform ; 23(5), 2022 09 20.
Article in English | MEDLINE | ID: mdl-35901452

ABSTRACT

Measuring the semantic similarity between Gene Ontology (GO) terms is a fundamental step in numerous functional bioinformatics applications. To fully exploit the metadata of GO terms, word embedding-based methods have been proposed recently to map GO terms to low-dimensional feature vectors. However, these representation methods commonly overlook the key information hidden in the whole GO structure and the relationship between GO terms. In this paper, we propose a novel representation model for GO terms, named GT2Vec, which jointly considers the GO graph structure obtained by graph contrastive learning and the semantic description of GO terms based on BERT encoders. Our method is evaluated on a protein similarity task on a collection of benchmark datasets. The experimental results demonstrate the effectiveness of using a joint encoding graph structure and textual node descriptors to learn vector representations for GO terms.


Subject(s)
Computational Biology , Semantics , Computational Biology/methods , Gene Ontology , Metadata
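
Once GO terms are mapped to vectors, as the abstract above describes, a protein similarity score can be derived from the vectors of each protein's annotated terms. The sketch below is an assumption-laden illustration, not GT2Vec itself: the term identifiers and two-dimensional vectors are hypothetical placeholders, and the best-match-average aggregation is one common choice that the abstract does not confirm.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def best_match_average(terms_a, terms_b, vectors):
    """Average, in both directions, of each term's best cosine match in the other set."""
    def one_way(src, dst):
        best = [max(cosine(vectors[s], vectors[t]) for t in dst) for s in src]
        return sum(best) / len(src)
    return 0.5 * (one_way(terms_a, terms_b) + one_way(terms_b, terms_a))


# Hypothetical term vectors; real embeddings would come from a trained model.
vectors = {
    "TERM:A": [1.0, 0.0],
    "TERM:B": [0.0, 1.0],
    "TERM:C": [1.0, 1.0],
}

sim_same = best_match_average(["TERM:A", "TERM:C"], ["TERM:A", "TERM:C"], vectors)
sim_disjoint = best_match_average(["TERM:A"], ["TERM:B"], vectors)
sim_partial = best_match_average(["TERM:A"], ["TERM:C"], vectors)
```

Identical annotation sets score about 1, orthogonal term vectors score 0, and partially related terms fall in between.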
20.
Brief Bioinform ; 23(2), 2022 03 10.
Article in English | MEDLINE | ID: mdl-35018418

ABSTRACT

Spatial structures of proteins are closely related to protein functions. Integrating protein structures improves the performance of protein-protein interaction (PPI) prediction. However, the limited quantity of known protein structures restricts the application of structure-based prediction methods. Utilizing the predicted protein structure information is a promising method to improve the performance of sequence-based prediction methods. We propose a novel end-to-end framework, TAGPPI, to predict PPIs using protein sequence alone. TAGPPI extracts multi-dimensional features by employing 1D convolution operation on protein sequences and graph learning method on contact maps constructed from AlphaFold. A contact map contains abundant spatial structure information, which is difficult to obtain from 1D sequence data directly. We further demonstrate that the spatial information learned from contact maps improves the ability of TAGPPI in PPI prediction tasks. We compare the performance of TAGPPI with those of nine state-of-the-art sequence-based methods, and TAGPPI outperforms such methods in all metrics. To the best of our knowledge, this is the first method to use the predicted protein topology structure graph for sequence-based PPI prediction. More importantly, our proposed architecture could be extended to other prediction tasks related to proteins.


Subject(s)
Machine Learning , Proteins , Amino Acid Sequence , Proteins/metabolism
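
The two feature sources named in the abstract above, 1D convolution over the sequence and a graph built from a contact map, can be sketched in a few lines. This is a hedged illustration, not the TAGPPI implementation: the coordinates are made up, the 8-angstrom contact threshold is an assumed (though commonly used) cutoff, and the convolution kernel and per-residue signal are arbitrary.

```python
import math


def contact_map(coords, threshold=8.0):
    """Binary matrix: 1 where two residues lie within `threshold` of each other."""
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) <= threshold else 0 for j in range(n)]
        for i in range(n)
    ]


def neighbours(cmap):
    """Adjacency list of the residue graph implied by the contact map (no self-loops)."""
    return {
        i: [j for j, hit in enumerate(row) if hit and j != i]
        for i, row in enumerate(cmap)
    }


def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation, as in a convolution layer without padding."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Hypothetical residue coordinates along a straight chain, 3.8 units apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(coords)
graph = neighbours(cmap)

# Arbitrary per-residue signal and kernel, for illustration only.
features = conv1d([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0])
```

The adjacency list is the form a graph learning method would typically consume, while the convolution output stands in for sequence-derived features; how the two are combined is specific to the published architecture and is not reproduced here.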