Results 1 - 20 of 4,827
1.
Annu Rev Neurosci ; 42: 407-432, 2019 07 08.
Article in English | MEDLINE | ID: mdl-31283895

ABSTRACT

The brain's function is to enable adaptive behavior in the world. To this end, the brain processes information about the world. The concept of representation links the information processed by the brain back to the world and enables us to understand what the brain does at a functional level. The appeal of making the connection between brain activity and what it represents has been irresistible to neuroscience, despite the fact that representational interpretations pose several challenges: We must define which aspects of brain activity matter, how the code works, and how it supports computations that contribute to adaptive behavior. It has been suggested that we might drop representational language altogether and seek to understand the brain, more simply, as a dynamical system. In this review, we argue that the concept of representation provides a useful link between dynamics and computational function and ask which aspects of brain activity should be analyzed to achieve a representational understanding. We peel the onion of brain representations in search of the layers (the aspects of brain activity) that matter to computation. The article provides an introduction to the motivation and mathematics of representational models, a critical discussion of their assumptions and limitations, and a preview of future directions in this area.


Subject(s)
Brain Mapping , Brain/pathology , Cognition/physiology , Models, Neurological , Humans , Magnetic Resonance Imaging/methods
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701413

ABSTRACT

With the emergence of large amounts of single-cell RNA sequencing (scRNA-seq) data, the exploration of computational methods has become critical in revealing biological mechanisms. Clustering is a representative approach for deciphering the cellular heterogeneity embedded in scRNA-seq data. However, due to the diversity of datasets, none of the existing single-cell clustering methods shows overwhelming performance on all datasets. Weighted ensemble methods have been proposed to integrate multiple results and improve heterogeneity analysis. These methods usually assign weights according to the reliability of each base clustering result, ignoring the fact that the same base clustering can perform differently on different cells. In this paper, we propose scEWE, a self-representative ensemble learning framework based on a high-order element-wise weighting strategy. By assigning different base clustering weights to individual cells, we carefully construct and optimize the consensus matrix. In addition, we extract the high-order information between cells, which enhances the ability to represent the similarity relationships between cells. scEWE is experimentally shown to significantly outperform state-of-the-art methods, which demonstrates the effectiveness of the method and supports its potential application to complex single-cell data analysis problems.
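
Illustrative note: the idea of weighting each base clustering per cell, rather than with one global weight, can be sketched as a weighted co-association (consensus) matrix. The sketch below is a generic illustration under stated assumptions, not the scEWE algorithm: the function name `weighted_consensus`, the product rule for combining the two cells' weights, and the toy data are all hypothetical, and the abstract's optimization and high-order steps are not reproduced.

```python
def weighted_consensus(labelings, weights):
    """Build a cell-by-cell consensus matrix from several base clusterings.

    labelings[m][i] is the cluster label of cell i in base clustering m.
    weights[m][i] is an assumed reliability of base clustering m for cell i.
    Entry (i, j) is the weighted fraction of clusterings that co-cluster i and j.
    """
    n_cells = len(labelings[0])
    consensus = [[0.0] * n_cells for _ in range(n_cells)]
    for i in range(n_cells):
        for j in range(n_cells):
            agree = 0.0  # weight of clusterings that put i and j together
            total = 0.0  # weight of all clusterings for this pair
            for labels, w in zip(labelings, weights):
                pair_weight = w[i] * w[j]  # assumption: product of cell weights
                total += pair_weight
                if labels[i] == labels[j]:
                    agree += pair_weight
            consensus[i][j] = agree / total if total > 0 else 0.0
    return consensus


# Toy example: two base clusterings of three cells (made-up values).
labelings = [[0, 0, 1], [0, 1, 1]]
weights = [[1.0, 1.0, 1.0], [1.0, 0.5, 1.0]]
consensus = weighted_consensus(labelings, weights)
```

In this toy case the second clustering is trusted less for the middle cell, so its disagreement about cells 0 and 1 counts for less than it would under a single global weight.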


Subject(s)
Sequence Analysis, RNA , Single-Cell Analysis , Single-Cell Analysis/methods , Cluster Analysis , Sequence Analysis, RNA/methods , Algorithms , Computational Biology/methods , Humans , RNA-Seq/methods
3.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38695119

ABSTRACT

Sequence similarity is of paramount importance in biology, as similar sequences tend to have similar function and share common ancestry. Scoring matrices, such as PAM or BLOSUM, play a crucial role in all bioinformatics algorithms for identifying similarities, but have the drawback that they are fixed, independent of context. We propose a new scoring method for amino acid similarity that remedies this weakness, being contextually dependent. It relies on recent advances in deep learning architectures that employ self-supervised learning in order to leverage the power of enormous amounts of unlabelled data to generate contextual embeddings, which are vector representations for words. These ideas have been applied to protein sequences, producing embedding vectors for protein residues. We propose the E-score between two residues as the cosine similarity between their embedding vector representations. Thorough testing on a wide variety of reference multiple sequence alignments indicates that the alignments produced using the new E-score method, especially ProtT5-score, are significantly better than those obtained using BLOSUM matrices. The new method proposes to change the way alignments are computed, with far-reaching implications in all areas of textual data that use sequence similarity. The program to compute alignments based on various E-scores is available as a web server at e-score.csd.uwo.ca. The source code is freely available for download from github.com/lucian-ilie/E-score.
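
Illustrative note: the abstract defines the E-score between two residues as the cosine similarity of their embedding vectors. A minimal sketch of that definition follows, assuming per-residue embeddings are already available as plain lists of floats. The function names and the toy 3-dimensional vectors are hypothetical, and this is not the authors' implementation (real protein language model embeddings have far more dimensions).

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def e_score_matrix(emb_a, emb_b):
    """Pairwise scores between the residues of two sequences.

    emb_a[i] and emb_b[j] are assumed per-residue embedding vectors.
    """
    return [[cosine_similarity(x, y) for y in emb_b] for x in emb_a]


# Toy embeddings, made up for illustration only.
seq_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
seq_b = [[2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
scores = e_score_matrix(seq_a, seq_b)
```

Because each residue's embedding depends on its sequence context, the same pair of amino acids can score differently at different positions, unlike a fixed substitution matrix such as BLOSUM.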


Subject(s)
Algorithms , Computational Biology , Sequence Alignment , Sequence Alignment/methods , Computational Biology/methods , Software , Sequence Analysis, Protein/methods , Amino Acid Sequence , Proteins/chemistry , Proteins/genetics , Deep Learning , Databases, Protein
4.
Proc Natl Acad Sci U S A ; 120(49): e2311250120, 2023 Dec 05.
Article in English | MEDLINE | ID: mdl-38015838

ABSTRACT

When two people coincidentally have something in common (such as a name or birthday), they tend to like each other more and are thus more likely to offer help and comply with requests. This dynamic can have important legal and ethical consequences whenever these incidental similarities give rise to unfair favoritism. Using a large-scale, longitudinal natural experiment, covering nearly 200,000 annual earnings forecasts over more than 25 y, we show that when a CEO and a securities analyst share a first name, the analyst's financial forecast is more accurate. We offer evidence that name matching improves forecast accuracy due to CEOs privately sharing pertinent information with name-matched analysts. Additionally, we show that this effect is especially pronounced among CEO-analyst pairs who share an uncommon first name. Our research thus demonstrates how incidental similarities can give way to special treatment. Whereas most investigations of the effects of similarity consider only one-shot interactions, we use a longitudinal dataset to show that the effect of name matching diminishes over time with more interactions between CEOs and analysts. We also point to the findings of an experiment suggesting that favoritism born of sharing a name may evade straightforward regulation in part due to people's perception that name similarity would exert little influence on them. Taken together, our work offers insight into when private disclosures are likely to be made. Our results suggest that the effectiveness of regulatory policies can be significantly impacted by psychological factors shaping the context in which they are implemented.


Subject(s)
Disclosure , Names , Humans
5.
Proc Natl Acad Sci U S A ; 120(51): e2302401120, 2023 Dec 19.
Article in English | MEDLINE | ID: mdl-38096414

ABSTRACT

Complex topographies exhibit universal properties when fluvial erosion dominates landscape evolution over other geomorphological processes. Similarly, we show that the solutions of a minimalist landscape evolution model display invariant behavior as the impact of soil diffusion diminishes compared to fluvial erosion at the landscape scale, yielding complete self-similarity with respect to a dimensionless channelization index. Approaching its zero limit, soil diffusion becomes confined to a region of vanishing area and large concavity or convexity, corresponding to the locus of the ridge and valley network. We demonstrate these results using one-dimensional analytical solutions and two-dimensional numerical simulations, supported by real-world topographic observations. Our findings on the landscape self-similarity and the localized diffusion resemble the self-similarity of turbulent flows and the role of viscous dissipation. Topographic singularities in the vanishing diffusion limit are suggestive of shock waves and singularities observed in nonlinear complex systems.

6.
J Neurosci ; 44(26)2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38740441

ABSTRACT

Humans make decisions about food every day. The visual system provides important information that forms a basis for these food decisions. Although previous research has focused on visual object and category representations in the brain, it is still unclear how visually presented food is encoded by the brain. Here, we investigate the time-course of food representations in the brain. We used time-resolved multivariate analyses of electroencephalography (EEG) data, obtained from human participants (both sexes), to determine which food features are represented in the brain and whether focused attention is needed for this. We recorded EEG while participants engaged in two different tasks. In one task, the stimuli were task relevant, whereas in the other task, the stimuli were not task relevant. Our findings indicate that the brain can differentiate between food and nonfood items from ∼112 ms after the stimulus onset. The neural signal at later latencies contained information about food naturalness, how much the food was transformed, as well as the perceived caloric content. This information was present regardless of the task. Information about whether food is immediately ready to eat, however, was only present when the food was task relevant and presented at a slow presentation rate. Furthermore, the recorded brain activity correlated with the behavioral responses in an odd-item-out task. The fast representation of these food features, along with the finding that this information is used to guide food categorization decision-making, suggests that these features are important dimensions along which the representation of foods is organized.


Subject(s)
Brain , Electroencephalography , Food , Photic Stimulation , Humans , Male , Female , Brain/physiology , Adult , Electroencephalography/methods , Young Adult , Photic Stimulation/methods , Reaction Time/physiology , Time Factors , Attention/physiology , Decision Making/physiology
7.
J Neurosci ; 44(4)2024 01 24.
Article in English | MEDLINE | ID: mdl-38267235

ABSTRACT

Low-level features are typically continuous (e.g., the gamut between two colors), but semantic information is often categorical (there is no corresponding gradient between dog and turtle) and hierarchical (animals live in land, water, or air). To determine the impact of these differences on cognitive representations, we characterized the geometry of perceptual spaces of five domains: a domain dominated by semantic information (animal names presented as words), a domain dominated by low-level features (colored textures), and three intermediate domains (animal images, lightly texturized animal images that were easy to recognize, and heavily texturized animal images that were difficult to recognize). Each domain had 37 stimuli derived from the same animal names. From 13 participants (9F), we gathered similarity judgments in each domain via an efficient psychophysical ranking paradigm. We then built geometric models of each domain for each participant, in which distances between stimuli accounted for participants' similarity judgments and intrinsic uncertainty. Remarkably, the five domains had similar global properties: each required 5-7 dimensions, and a modest amount of spherical curvature provided the best fit. However, the arrangement of the stimuli within these embeddings depended on the level of semantic information: dendrograms derived from semantic domains (word, image, and lightly texturized images) were more "tree-like" than those from feature-dominated domains (heavily texturized images and textures). Thus, the perceptual spaces of domains along this feature-dominated to semantic-dominated gradient shift to a tree-like organization when semantic information dominates, while retaining a similar global geometry.


Subject(s)
Judgment , Turtles , Humans , Animals , Dogs , Semantics , Uncertainty , Water
8.
J Neurosci ; 44(21)2024 May 22.
Article in English | MEDLINE | ID: mdl-38569925

ABSTRACT

When we perceive a scene, our brain processes various types of visual information simultaneously, ranging from sensory features, such as line orientations and colors, to categorical features, such as objects and their arrangements. Whereas the role of sensory and categorical visual representations in predicting subsequent memory has been studied using isolated objects, their impact on memory for complex scenes remains largely unknown. To address this gap, we conducted an fMRI study in which female and male participants encoded pictures of familiar scenes (e.g., an airport picture) and later recalled them, while rating the vividness of their visual recall. Outside the scanner, participants had to distinguish each seen scene from three similar lures (e.g., three airport pictures). We modeled the sensory and categorical visual features of multiple scenes using both early and late layers of a deep convolutional neural network. Then, we applied representational similarity analysis to determine which brain regions represented stimuli in accordance with the sensory and categorical models. We found that categorical, but not sensory, representations predicted subsequent memory. In line with the previous result, only for the categorical model, the average recognition performance of each scene exhibited a positive correlation with the average visual dissimilarity between the item in question and its respective lures. These results strongly suggest that even in memory tests that ostensibly rely solely on visual cues (such as forced-choice visual recognition with similar distractors), memory decisions for scenes may be primarily influenced by categorical rather than sensory representations.
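
Illustrative note: representational similarity analysis, as named in the abstract, compares the dissimilarity structure of brain responses with that of a model. The sketch below shows the general technique under simplifying assumptions and is not the study's pipeline: dissimilarity is taken as 1 minus the Pearson correlation between activity patterns, and the two dissimilarity matrices are compared with a Pearson correlation (rank correlations are often used in practice). All function names and the toy patterns are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rdm(patterns):
    """Representational dissimilarity matrix: 1 - correlation per stimulus pair."""
    n = len(patterns)
    return [[1.0 - pearson(patterns[i], patterns[j]) for j in range(n)]
            for i in range(n)]


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries, flattened row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def rsa_score(brain_patterns, model_patterns):
    """Correlate the brain RDM with the model RDM."""
    return pearson(upper_triangle(rdm(brain_patterns)),
                   upper_triangle(rdm(model_patterns)))


# Toy data: 4 stimuli x 4 measurement channels (made-up values).
brain = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 9.0],
         [4.0, 3.0, 2.0, 1.0], [1.0, 3.0, 2.0, 4.0]]
# A "model" whose features are a rescaled copy of the brain patterns.
model = [[2.0 * v + 1.0 for v in row] for row in brain]
score = rsa_score(brain, model)
```

Because correlation distance ignores offset and positive scaling, the rescaled model yields the same dissimilarity structure as the brain patterns and the score is close to 1. In a study like the one above, the model patterns would instead come from early or late layers of a neural network.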


Subject(s)
Magnetic Resonance Imaging , Pattern Recognition, Visual , Recognition, Psychology , Humans , Male , Female , Adult , Young Adult , Recognition, Psychology/physiology , Pattern Recognition, Visual/physiology , Photic Stimulation/methods , Visual Perception/physiology , Brain/physiology , Brain/diagnostic imaging , Mental Recall/physiology , Brain Mapping
9.
J Neurosci ; 44(12)2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38331583

ABSTRACT

Capacity limitations in visual tasks can be observed when the number of task-related objects increases. An influential idea is that such capacity limitations are determined by competition at the neural level: two objects that are encoded by shared neural populations interfere more in behavior (e.g., visual search) than two objects encoded by separate neural populations. However, the neural representational similarity of objects varies across brain regions and across time, raising the questions of where and when competition determines task performance. Furthermore, it is unclear whether the association between neural representational similarity and task performance is common or unique across tasks. Here, we used neural representational similarity derived from fMRI, MEG, and a deep neural network (DNN) to predict performance on two visual search tasks involving the same objects and requiring the same responses but differing in instructions: cued visual search and oddball visual search. Separate groups of human participants (both sexes) viewed the individual objects in neuroimaging experiments to establish the neural representational similarity between those objects. Results showed that performance on both search tasks could be predicted by neural representational similarity throughout the visual system (fMRI), from 80 ms after onset (MEG), and in all DNN layers. Stepwise regression analysis, however, revealed task-specific associations, with unique variability in oddball search performance predicted by early/posterior neural similarity and unique variability in cued search task performance predicted by late/anterior neural similarity. These results reveal that capacity limitations in superficially similar visual search tasks may reflect competition at different stages of visual processing.


Subject(s)
Brain , Magnetic Resonance Imaging , Male , Female , Humans , Magnetic Resonance Imaging/methods , Brain/diagnostic imaging , Brain/physiology , Visual Perception/physiology , Cues , Brain Mapping , Neural Networks, Computer , Pattern Recognition, Visual/physiology
10.
J Neurosci ; 44(1)2024 Jan 03.
Article in English | MEDLINE | ID: mdl-38050089

ABSTRACT

The hippocampus plays a central role as a coordinate system or index of information stored in neocortical loci. Nonetheless, it remains unclear how hippocampal processes integrate with cortical information to facilitate successful memory encoding. Thus, the goal of the current study was to identify specific hippocampal-cortical interactions that support object encoding. We collected fMRI data while 19 human participants (7 female and 12 male) encoded images of real-world objects and tested their memory for object concepts and image exemplars (i.e., conceptual and perceptual memory). Representational similarity analysis revealed robust representations of visual and semantic information in canonical visual (e.g., occipital cortex) and semantic (e.g., angular gyrus) regions in the cortex, but not in the hippocampus. Critically, hippocampal functions modulated the mnemonic impact of cortical representations that are most pertinent to future memory demands, or transfer-appropriate representations. Subsequent perceptual memory was best predicted by the strength of visual representations in ventromedial occipital cortex in coordination with hippocampal activity and pattern information during encoding. In parallel, subsequent conceptual memory was best predicted by the strength of semantic representations in left inferior frontal gyrus and angular gyrus in coordination with either hippocampal activity or semantic representational strength during encoding. We found no evidence for transfer-incongruent hippocampal-cortical interactions supporting subsequent memory (i.e., no hippocampal interactions with cortical visual/semantic representations supported conceptual/perceptual memory). Collectively, these results suggest that diverse hippocampal functions flexibly modulate cortical representations of object properties to satisfy distinct future memory demands.

SIGNIFICANCE STATEMENT: The hippocampus is theorized to index pieces of information stored throughout the cortex to support episodic memory. Yet how hippocampal processes integrate with cortical representation of stimulus information remains unclear. Using fMRI, we examined various forms of hippocampal-cortical interactions during object encoding in relation to subsequent performance on conceptual and perceptual memory tests. Our results revealed novel hippocampal-cortical interactions that utilize semantic and visual representations in transfer-appropriate manners: conceptual memory supported by hippocampal modulation of frontoparietal semantic representations, and perceptual memory supported by hippocampal modulation of occipital visual representations. These findings provide important insights into the neural mechanisms underlying the formation of information-rich episodic memory and underscore the value of studying the flexible interplay between brain regions for complex cognition.


Subject(s)
Brain Mapping , Memory, Episodic , Humans , Male , Female , Hippocampus , Parietal Lobe , Prefrontal Cortex , Magnetic Resonance Imaging
11.
J Neurosci ; 44(2)2024 Jan 10.
Article in English | MEDLINE | ID: mdl-37963765

ABSTRACT

Recently, multi-voxel pattern analysis has verified that information can be removed from working memory (WM) via three distinct operations (replacement, suppression, or clearing) compared to information being maintained (Kim et al., 2020). While univariate analyses and classifier importance maps in Kim et al. (2020) identified brain regions that contribute to these operations, they did not elucidate whether these regions represent the operations similarly or uniquely. Using Leiden community detection on a sample of 55 humans (17 male), we identified four brain networks, each of which has a unique configuration of multi-voxel activity patterns by which it represents these WM operations. The visual network (VN) shows similar multi-voxel patterns for maintain and replace, which are highly dissimilar from suppress and clear, suggesting this network differentiates whether an item is held in WM or not. The somatomotor network (SMN) shows a distinct multi-voxel pattern for clear relative to the other operations, indicating the uniqueness of this operation. The default mode network (DMN) has distinct patterns for suppress and clear, but these two operations are more similar to each other than to maintain and replace, a pattern intermediate to that of the VN and SMN. The frontoparietal control network (FPCN) displays distinct multi-voxel patterns for each of the four operations, suggesting that this network likely plays an important role in implementing these WM operations. These results indicate that the operations involved in removing information from WM can be performed in parallel by distinct brain networks, each of which has a particular configuration by which it represents these operations.


Subject(s)
Brain , Memory, Short-Term , Male , Humans , Brain/diagnostic imaging , Brain/surgery , Brain Mapping , Photic Stimulation , Magnetic Resonance Imaging/methods
12.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36907663

ABSTRACT

The discovery of drug-target interactions (DTIs) is a pivotal process in pharmaceutical development. Computational approaches are a promising and efficient alternative to tedious and costly wet-lab experiments for predicting novel DTIs from numerous candidates. Recently, with the availability of abundant heterogeneous biological information from diverse data sources, computational methods have been able to leverage multiple drug and target similarities to boost the performance of DTI prediction. Similarity integration is an effective and flexible strategy to extract crucial information across complementary similarity views, providing a compressed input for any similarity-based DTI prediction model. However, existing similarity integration methods filter and fuse similarities from a global perspective, neglecting the utility of similarity views for each drug and target. In this study, we propose a Fine-Grained Selective similarity integration approach, called FGS, which employs a local interaction consistency-based weight matrix to capture and exploit the importance of similarities at a finer granularity in both similarity selection and combination steps. We evaluate FGS on five DTI prediction datasets under various prediction settings. Experimental results show that our method not only outperforms similarity integration competitors with comparable computational costs, but also achieves better prediction performance than state-of-the-art DTI prediction approaches by collaborating with conventional base models. Furthermore, case studies on the analysis of similarity weights and on the verification of novel predictions confirm the practical ability of FGS.


Subject(s)
Drug Development , Drug Discovery , Drug Discovery/methods , Drug Interactions
13.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-36971393

ABSTRACT

MOTIVATION: A large number of studies have shown that circular RNA (circRNA) affects biological processes by competitively binding miRNA, providing a new perspective for the diagnosis and treatment of human diseases. Therefore, exploring the potential circRNA-miRNA interactions (CMIs) is an important and urgent task at present. Although some computational methods have been tried, their performance is limited by the incompleteness of feature extraction in sparse networks and the low computational efficiency of lengthy data. RESULTS: In this paper, we propose JSNDCMI, which combines the multi-structure feature extraction framework and Denoising Autoencoder (DAE) to meet the challenge of CMI prediction in sparse networks. In detail, JSNDCMI integrates functional similarity and local topological structure similarity in the CMI network through the multi-structure feature extraction framework, then forces the neural network to learn the robust representation of features through DAE and finally uses the Gradient Boosting Decision Tree classifier to predict the potential CMIs. JSNDCMI produces the best performance in the 5-fold cross-validation of all data sets. In the case study, seven of the top 10 CMIs with the highest score were verified in PubMed. AVAILABILITY: The data and source code can be found at https://github.com/1axin/JSNDCMI.


Subject(s)
MicroRNAs , Humans , MicroRNAs/genetics , RNA, Circular , Neural Networks, Computer , Software , Computational Biology/methods
14.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36592062

ABSTRACT

Recent studies have revealed that long noncoding RNAs (lncRNAs) are closely linked to several human diseases, providing new opportunities for their use in detection and therapy. Many graph propagation and similarity fusion approaches can be used for predicting potential lncRNA-disease associations. However, existing similarity fusion approaches suffer from noise and self-similarity loss in the fusion process. To address these problems, a new prediction approach, termed SSMF-BLNP, based on organically combining selective similarity matrix fusion (SSMF) and bidirectional linear neighborhood label propagation (BLNP), is proposed in this paper to predict lncRNA-disease associations. In SSMF, self-similarity networks of lncRNAs and diseases are obtained by selective preprocessing and nonlinear iterative fusion. The fusion process assigns weights to each initial similarity network and introduces a unit matrix that can reduce noise and compensate for the loss of self-similarity. In BLNP, the initial lncRNA-disease associations are employed in both lncRNA and disease directions as label information for linear neighborhood label propagation. The propagation is then performed on the self-similarity network obtained from SSMF to derive the scoring matrix for predicting the relationships between lncRNAs and diseases. Experimental results showed that SSMF-BLNP performed better than seven other state-of-the-art approaches. Furthermore, a case study demonstrated up to 100% and 80% accuracy in 10 lncRNAs associated with hepatocellular carcinoma and 10 lncRNAs associated with renal cell carcinoma, respectively. The source code and datasets used in this paper are available at: https://github.com/RuiBingo/SSMF-BLNP.
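
Illustrative note: label propagation over a similarity network, the general family that BLNP belongs to, can be sketched as an iteration that mixes each node's neighbor scores with its known labels. The sketch below is a generic, one-directional version under stated assumptions, not the SSMF-BLNP method: the row normalization, the mixing parameter `alpha`, the iteration count, and the toy similarity and association matrices are all hypothetical, and neither the linear neighborhood weights nor the bidirectional step is reproduced. The authors' code is at the URL given in the abstract.

```python
def row_normalize(sim):
    """Scale each row of a similarity matrix so that it sums to 1."""
    normalized = []
    for row in sim:
        total = sum(row)
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return normalized


def propagate(sim, labels, alpha=0.5, n_iter=100):
    """Iterate F <- alpha * W F + (1 - alpha) * Y.

    sim:    n x n similarity among lncRNAs (assumed given).
    labels: n x k known lncRNA-disease associations (0 or 1), the matrix Y.
    Returns an n x k matrix of association scores.
    """
    w = row_normalize(sim)
    n = len(labels)
    k = len(labels[0])
    scores = [row[:] for row in labels]
    for _ in range(n_iter):
        scores = [
            [alpha * sum(w[i][j] * scores[j][c] for j in range(n))
             + (1.0 - alpha) * labels[i][c]
             for c in range(k)]
            for i in range(n)
        ]
    return scores


# Toy network: lncRNA 1 has no known association but resembles lncRNA 0.
sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.1], [0.1, 0.1, 1.0]]
labels = [[1, 0], [0, 0], [0, 1]]
scores = propagate(sim, labels)
```

In this toy case the unlabeled lncRNA 1 ends up with a higher score for disease 0 than for disease 1, because it is far more similar to the lncRNA known to be associated with disease 0.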


Subject(s)
RNA, Long Noncoding , Humans , Algorithms , Computational Biology/methods , RNA, Long Noncoding/genetics , Software , Carcinoma, Hepatocellular/genetics , Carcinoma, Renal Cell/genetics , Liver Neoplasms/genetics , Kidney Neoplasms/genetics
15.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36810579

ABSTRACT

Phosphorylation is an essential mechanism for regulating protein activities. Determining kinase-specific phosphorylation sites by experiments involves time-consuming and expensive analyses. Although several studies proposed computational methods to model kinase-specific phosphorylation sites, they typically required abundant experimentally verified phosphorylation sites to yield reliable predictions. Nevertheless, the number of experimentally verified phosphorylation sites for most kinases is relatively small, and the targeting phosphorylation sites are still unidentified for some kinases. In fact, there is little research related to these understudied kinases in the literature. Thus, this study aims to create predictive models for these understudied kinases. A kinase-kinase similarity network was generated by merging the sequence-, functional-, protein-domain- and 'STRING'-related similarities. Thus, besides sequence data, protein-protein interactions and functional pathways were also considered to aid predictive modelling. This similarity network was then integrated with a classification of kinase groups to yield highly similar kinases to a specific understudied type of kinase. Their experimentally verified phosphorylation sites were leveraged as positive sites to train predictive models. The experimentally verified phosphorylation sites of the understudied kinase were used for validation. Results demonstrate that 82 out of 116 understudied kinases were predicted with adequate performance via the proposed modelling strategy, achieving a balanced accuracy of 0.81, 0.78, 0.84, 0.84, 0.85, 0.82, 0.90, 0.82 and 0.85, for the 'TK', 'Other', 'STE', 'CAMK', 'TKL', 'CMGC', 'AGC', 'CK1' and 'Atypical' groups, respectively. Therefore, this study demonstrates that web-like predictive networks can reliably capture the underlying patterns in such understudied kinases by harnessing relevant sources of similarities to predict their specific phosphorylation sites.
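
Illustrative note: the abstract reports balanced accuracy but does not define it. Assuming the usual binary definition (the mean of sensitivity and specificity), a minimal sketch is shown below. The function name and the toy labels are hypothetical and do not come from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # fraction of true sites recovered
    specificity = tn / (tn + fp)  # fraction of non-sites rejected
    return (sensitivity + specificity) / 2.0


# Toy example with few positives, as is typical for phosphorylation sites.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)
```

On these toy labels plain accuracy is 0.8, while balanced accuracy is lower because only half of the positive sites are recovered. This is why the metric is commonly preferred when positive examples are scarce.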


Subject(s)
Protein Kinases , Phosphorylation , Protein Kinases/genetics , Protein Kinases/metabolism
16.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37482409

ABSTRACT

Numerous biological studies have shown that considering disease-associated micro RNAs (miRNAs) as potential biomarkers or therapeutic targets offers new avenues for the diagnosis of complex diseases. Computational methods have gradually been introduced to reveal disease-related miRNAs. Considering that previous models have not fused sufficiently diverse similarities, that their inappropriate fusion methods may lead to poor quality of the comprehensive similarity network and that their results are often limited by insufficiently known associations, we propose a computational model called Generative Adversarial Matrix Completion Network based on Multi-source Data Fusion (GAMCNMDF) for miRNA-disease association prediction. We create a diverse network connecting miRNAs and diseases, which is then represented using a matrix. The main task of GAMCNMDF is to complete the matrix and obtain the predicted results. The main innovations of GAMCNMDF are reflected in two aspects: GAMCNMDF integrates diverse data sources and employs a nonlinear fusion approach to update the similarity networks of miRNAs and diseases. Also, some additional information is provided to GAMCNMDF in the form of a 'hint' so that GAMCNMDF can work successfully even when complete data are not available. Compared with other methods, the outcomes of 10-fold cross-validation on two distinct databases validate the superior performance of GAMCNMDF with statistically significant results. It is worth mentioning that we apply GAMCNMDF in the identification of underlying small molecule-related miRNAs, yielding outstanding performance results in this specific domain. In addition, two case studies about two important neoplasms show that GAMCNMDF is a promising prediction method.


Subject(s)
MicroRNAs , Neoplasms , Humans , MicroRNAs/genetics , Algorithms , Computational Biology/methods , Neoplasms/genetics , Databases, Genetic , Genetic Predisposition to Disease
17.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37544658

ABSTRACT

MOTIVATION: Recent advances in spatially resolved transcriptomics (ST) technologies enable the measurement of gene expression profiles while preserving cellular spatial context. Linking the gene expression of cells with their spatial distribution is essential for a better understanding of the tissue microenvironment and biological processes. However, effectively combining gene expression data with spatial information to identify spatial domains remains challenging. RESULTS: To address this issue, we propose a novel unsupervised learning framework named STMGCN for identifying spatial domains using multi-view graph convolution networks (MGCNs). Specifically, to fully exploit spatial information, we first construct multiple neighbor graphs (views) with different similarity measures based on the spatial coordinates. Then, STMGCN learns multiple view-specific embeddings by combining gene expression with each neighbor graph through graph convolution networks. Finally, to capture the importance of different graphs, we introduce an attention mechanism to adaptively fuse the view-specific embeddings and thus derive the final spot embedding. STMGCN allows for the effective utilization of spatial context to enhance the expressive power of the latent embeddings with multiple graph convolutions. We apply STMGCN to two simulation datasets and five real spatial transcriptomics datasets with different resolutions across distinct platforms. The experimental results demonstrate that STMGCN obtains competitive results in spatial domain identification compared with five state-of-the-art methods, including spatial and non-spatial alternatives. In addition, STMGCN can detect spatially variable genes with enriched expression patterns in the identified domains. Overall, STMGCN is a powerful and efficient computational framework for identifying spatial domains in spatial transcriptomics data.
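
The attention-based fusion step described in this abstract can be illustrated with a minimal sketch. It is not the STMGCN implementation: the abstract does not specify the attention form, so the dot-product scoring, the fixed `query` vector (learned in a real model), and the function names are assumptions made for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_views(view_embeddings, query):
    """Fuse the view-specific embeddings of one spot into a single embedding.

    view_embeddings: list of equal-length vectors, one per neighbor graph (view).
    query: hypothetical attention vector (an assumption; fixed here, not learned).
    """
    # Score each view by its dot product with the attention vector.
    scores = [sum(q * x for q, x in zip(query, emb)) for emb in view_embeddings]
    # Normalize the scores into weights that sum to 1.
    weights = softmax(scores)
    dim = len(view_embeddings[0])
    # The fused embedding is the weighted sum of the view-specific embeddings.
    fused = [
        sum(w * emb[d] for w, emb in zip(weights, view_embeddings))
        for d in range(dim)
    ]
    return fused, weights

# Toy example: three views, two-dimensional embeddings.
views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = fuse_views(views, query=[0.5, 0.5])
```

In this toy case the third view receives the largest weight because it aligns best with the assumed attention vector; in a trained model the weights would instead reflect the learned importance of each graph.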


Subject(s)
Gene Expression Profiling , Transcriptome , Computer Simulation
18.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36445207

ABSTRACT

Driven by multi-omics data, some multi-view clustering algorithms have been successfully applied to cancer subtype prediction, aiming to identify subtypes with biometric differences within the same cancer, thereby improving the clinical prognosis of patients and supporting the design of personalized treatment plans. Because the number of patients in omics data is much smaller than the number of genes, multi-view spectral clustering based on similarity learning has been widely developed. However, these algorithms still suffer from several problems, such as over-reliance of the clustering results on the quality of pre-defined similarity matrices, inability to reasonably handle noise and redundant information in high-dimensional omics data, and neglect of complementary information between omics data. This paper proposes a multi-view spectral clustering with latent representation learning (MSCLRL) method to alleviate these problems. First, MSCLRL generates a corresponding low-dimensional latent representation for each omics data type, which can effectively retain the unique information of each omics and improve the robustness and accuracy of the similarity matrix. Second, MSCLRL assigns appropriate weights to the obtained latent representations and performs global similarity learning to generate an integrated similarity matrix. Third, the integrated similarity matrix is fed back to update the low-dimensional representation of each omics. Finally, the final integrated similarity matrix is used for clustering. In experiments on 10 benchmark multi-omics datasets and 2 separate cancer case studies, the proposed method obtained statistically and biologically meaningful cancer subtypes.


Subject(s)
Multiomics , Neoplasms , Humans , Algorithms , Neoplasms/genetics , Cluster Analysis
19.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36585781

ABSTRACT

Genetic similarity matrices are commonly used to assess population substructure (PS) in genetic studies. Through simulation studies and by application to whole-genome sequencing (WGS) data, we evaluate the performance of three genetic similarity matrices: the unweighted and weighted Jaccard similarity matrices and the genetic relationship matrix. We describe different scenarios that can create numerical pitfalls and lead to incorrect conclusions in some instances. We consider scenarios in which PS is assessed based on loci that are located across the genome ('globally') and based on loci from a specific genomic region ('locally'). We also compare scenarios in which PS is evaluated based on loci from different minor allele frequency bins: common (>5%), low-frequency (5-0.5%) and rare (<0.5%) single-nucleotide variations (SNVs). Overall, we observe that all approaches provide the best clustering performance when computed based on rare SNVs. The performance of the similarity matrices is very similar for common and low-frequency variants, but for rare variants, the unweighted Jaccard matrix provides preferable clustering features. Based on visual inspection and in terms of standard clustering metrics, its clusters are the densest and the best separated in the principal component analysis of variants with rare SNVs compared with the other methods and different allele frequency cutoffs. In an application, we assessed the role of rare variants in local and global PS, using WGS data from multiethnic Alzheimer's disease data sets and European or East Asian populations from the 1000 Genomes Project.
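
As a rough illustration of an unweighted Jaccard similarity matrix, the sketch below treats each individual as the set of variants at which they carry at least one minor allele and compares those sets pairwise. The abstract does not give the exact definition used in the study, so this carrier-set formulation, the handling of individuals with no variants, and the toy genotypes are all assumptions.

```python
def jaccard_matrix(genotypes):
    """Pairwise unweighted Jaccard similarity between individuals.

    genotypes: list of per-individual lists of minor-allele counts
    (0, 1 or 2) per variant; all rows must have the same length.
    """
    # Represent each individual by the set of variant indices they carry.
    carriers = [{j for j, g in enumerate(row) if g > 0} for row in genotypes]
    n = len(carriers)
    sim = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            union = carriers[a] | carriers[b]
            shared = carriers[a] & carriers[b]
            # Convention chosen for this sketch: if neither individual
            # carries any variant, the similarity is set to 0.
            sim[a][b] = len(shared) / len(union) if union else 0.0
    return sim

# Toy data (hypothetical): 3 individuals, 4 rare variants.
toy = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 2, 0, 1],
]
S = jaccard_matrix(toy)
```

Here the first two individuals share one of two carried variants and the third shares none with either, so the matrix separates the third individual from the first two.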


Subject(s)
Genome , Genomics , Principal Component Analysis , Gene Frequency , Computer Simulation , Genome-Wide Association Study , Polymorphism, Single Nucleotide
20.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37248747

ABSTRACT

Human Phenotype Ontology (HPO)-based approaches have recently gained popularity as a tool for genomic diagnostics of rare diseases. However, these approaches do not make full use of the available information on disease and patient phenotypes. We present a new method called Phen2Disease, which uses the bidirectional maximum matching semantic similarity between the phenotype sets of patients and diseases to prioritize diseases and genes. We conducted comprehensive experiments on six real data cohorts with 2051 cases (Cohort 1, n = 384; Cohort 2, n = 281; Cohort 3, n = 185; Cohort 4, n = 784; Cohort 5, n = 208; and Cohort 6, n = 209) and two simulated data cohorts with 1000 cases. The results showed that Phen2Disease outperforms three state-of-the-art methods when only phenotype information and the HPO knowledge base are used, particularly in cohorts with lower average numbers of HPO terms. We also observed that patients with higher information content scores have more specific information, leading to more accurate predictions. Moreover, Phen2Disease provides high interpretability by presenting ranked diseases and patient HPO terms. Our method provides a novel approach to utilizing phenotype data for genomic diagnostics of rare diseases, with potential for clinical impact. Phen2Disease is freely available on GitHub at https://github.com/ZhuLab-Fudan/Phen2Disease.
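
The idea of bidirectional maximum matching between two phenotype sets can be sketched as follows. This is not the Phen2Disease code: the abstract does not state the term-level similarity or any weighting, so the toy ontology, the hypothetical term IDs, the ancestor-overlap similarity, and the plain averaging of the two directions are assumptions made for illustration.

```python
def best_match_average(source_terms, target_terms, term_sim):
    """Average, over source terms, of the best similarity to any target term."""
    if not source_terms or not target_terms:
        return 0.0
    total = sum(max(term_sim(s, t) for t in target_terms) for s in source_terms)
    return total / len(source_terms)

def bidirectional_similarity(patient_terms, disease_terms, term_sim):
    """Combine patient-to-disease and disease-to-patient matching scores."""
    forward = best_match_average(patient_terms, disease_terms, term_sim)
    backward = best_match_average(disease_terms, patient_terms, term_sim)
    # Assumption: the two directions are averaged with equal weight.
    return (forward + backward) / 2

# Toy ontology (hypothetical IDs): each term maps to itself plus its ancestors.
ANCESTORS = {
    "T:seizure": {"T:seizure", "T:neuro", "T:root"},
    "T:ataxia": {"T:ataxia", "T:neuro", "T:root"},
    "T:short_stature": {"T:short_stature", "T:growth", "T:root"},
}

def ancestor_jaccard(a, b):
    """Stand-in term similarity: Jaccard overlap of ancestor sets."""
    set_a, set_b = ANCESTORS[a], ANCESTORS[b]
    return len(set_a & set_b) / len(set_a | set_b)

patient = ["T:seizure", "T:short_stature"]
diseases = {
    "disease_a": ["T:seizure", "T:ataxia"],
    "disease_b": ["T:short_stature"],
}
scores = {
    name: bidirectional_similarity(patient, terms, ancestor_jaccard)
    for name, terms in diseases.items()
}
# Rank candidate diseases from most to least similar to the patient.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Scoring in both directions penalizes a disease whose annotated phenotypes are not observed in the patient as well as patient phenotypes that the disease does not explain; a real system would replace the stand-in similarity with an ontology-aware measure.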


Subject(s)
Biological Ontologies , Rare Diseases , Humans , Semantics , Genomics , Phenotype