Search | Virtual Health Library

1.

scMLC: an accurate and robust multiplex community detection method for single-cell multi-omics data.

Chen, Yuxuan; Zheng, Ruiqing; Liu, Jin; Li, Min.

Brief Bioinform ; 25(2), 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38493339

ABSTRACT

Clustering cells based on single-cell multi-modal sequencing technologies provides an unprecedented opportunity to create high-resolution cell atlases, reveal critical cellular states and study health and disease. However, effectively integrating different sequencing data for cell clustering remains a challenging task. Motivated by the successful application of Louvain in scRNA-seq data, we propose a single-cell multi-modal Louvain clustering framework, called scMLC, to tackle this problem. scMLC builds multiplex single- and cross-modal cell-to-cell networks to capture modal-specific and consistent information between modalities and then adopts a robust multiplex community detection method to obtain reliable cell clusters. In comparison with 15 state-of-the-art clustering methods on seven real datasets simultaneously measuring gene expression and chromatin accessibility, scMLC achieves better accuracy and stability in most datasets. Synthetic results also indicate that the cell-network-based integration strategy of multi-omics data is superior to other strategies in terms of generalization. Moreover, scMLC is flexible and can be extended to single-cell sequencing data with more than two modalities.

Subject(s)

Chromatin, Multiomics, Cluster Analysis, Algorithms, RNA Sequence Analysis
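
As an illustration of the quantity that Louvain-style methods optimize on multiplex networks, the following is a minimal sketch that scores one shared partition across several layers by summing the Newman modularity of each layer. It is not the scMLC implementation: the function names, the toy layers `rna_layer` and `atac_layer`, and the choice to sum per-layer modularity without inter-layer coupling are all assumptions made for this example.

```python
from collections import defaultdict


def layer_modularity(edges, partition):
    """Newman modularity of one undirected, weighted layer.

    edges: iterable of (u, v, weight); partition: dict node -> community id.
    """
    total = 0.0                    # total edge weight m of the layer
    internal = defaultdict(float)  # edge weight inside each community
    strength = defaultdict(float)  # summed node strength per community
    for u, v, w in edges:
        total += w
        strength[partition[u]] += w
        strength[partition[v]] += w
        if partition[u] == partition[v]:
            internal[partition[u]] += w
    if total == 0:
        return 0.0
    return sum(internal[c] / total - (strength[c] / (2 * total)) ** 2
               for c in strength)


def multiplex_modularity(layers, partition):
    """Sum of per-layer modularity for one partition shared by all layers."""
    return sum(layer_modularity(edges, partition) for edges in layers)


# Hypothetical toy data: six cells, two modalities.
triangles = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
             (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0)]
rna_layer = triangles + [(2, 3, 1.0)]  # two triangles joined by a bridge
atac_layer = list(triangles)           # same triangles, no bridge

two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_two = multiplex_modularity([rna_layer, atac_layer], two_groups)
q_one = multiplex_modularity([rna_layer, atac_layer], one_group)
print(q_two, q_one)
```

A Louvain-type procedure would search over partitions to increase this score; here only the scoring step is shown, and the two-group partition scores higher than placing every cell in one community.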

2.

The multiscale topological organization of the functional brain network in adolescent PTSD.

Corredor, David; Segobin, Shailendra; Hinault, Thomas; Eustache, Francis; Dayan, Jacques; Guillery-Girard, Bérengère; Naveau, Mikaël.

Cereb Cortex ; 34(6), 2024 Jun 04.

Article in English | MEDLINE | ID: mdl-38864573

ABSTRACT

The experience of an extremely aversive event can produce enduring deleterious behavioral, and neural consequences, among which posttraumatic stress disorder (PTSD) is a representative example. Although adolescence is a period of great exposure to potentially traumatic events, the effects of trauma during adolescence remain understudied in clinical neuroscience. In this exploratory work, we aim to study the whole-cortex functional organization of 14 adolescents with PTSD using a data-driven method tailored to our population of interest. To do so, we built on the network neuroscience framework and specifically on multilayer (multisubject) community analysis to study the functional connectivity of the brain. We show, across different topological scales (the number of communities composing the cortex), a hyper-colocalization between regions belonging to occipital and pericentral regions and hypo-colocalization in middle temporal, posterior-anterior medial, and frontal cortices in the adolescent PTSD group compared to a nontrauma exposed group of adolescents. These preliminary results raise the question of an altered large-scale cortical organization in adolescent PTSD, opening an interesting line of research for future investigations.

Subject(s)

Brain, Magnetic Resonance Imaging, Post-Traumatic Stress Disorders, Humans, Post-Traumatic Stress Disorders/physiopathology, Post-Traumatic Stress Disorders/diagnostic imaging, Post-Traumatic Stress Disorders/psychology, Adolescent, Female, Male, Brain/physiopathology, Brain/diagnostic imaging, Neural Pathways/physiopathology, Brain Mapping/methods, Nerve Net/physiopathology, Nerve Net/diagnostic imaging, Cerebral Cortex/physiopathology, Cerebral Cortex/diagnostic imaging

3.

COMSE: analysis of single-cell RNA-seq data using community detection-based feature selection.

Luo, Qinhuan; Chen, Yaozhu; Lan, Xun.

BMC Biol ; 22(1): 167, 2024 Aug 07.

Article in English | MEDLINE | ID: mdl-39113021

ABSTRACT

BACKGROUND: Single-cell RNA sequencing enables studying cells individually, yet high gene dimensions and low cell numbers challenge analysis. Moreover, only a subset of the detected genes is involved in the biological processes underlying cell-type-specific functions. RESULTS: In this study, we present COMSE, an unsupervised feature selection framework using community detection to capture informative genes from scRNA-seq data. COMSE identified homogeneous cell substates with high resolution, as demonstrated by distinguishing different cell cycle stages. Evaluations based on real and simulated scRNA-seq datasets showed that COMSE outperformed other methods in cell clustering assignment, even at high dropout rates. We also demonstrate that by identifying communities of genes associated with batch effects, COMSE separates signals reflecting biological differences from noise arising from differences in sequencing protocols, thereby enabling integrated analysis of scRNA-seq datasets from different sources. CONCLUSIONS: COMSE provides an efficient unsupervised framework that selects highly informative genes in scRNA-seq data, improving the identification of cell substates and cell clustering. It identifies gene subsets that reveal biological and technical heterogeneity, supporting applications like batch effect correction and pathway analysis. It also provides robust results for bulk RNA-seq data analysis.

Subject(s)

RNA-Seq, Single-Cell Gene Expression Analysis, Animals, Humans, Mice, RNA-Seq/methods

4.

Hierarchical Modular Structure of the Drosophila Connectome.

Kunin, Alexander B; Guo, Jiahao; Bassler, Kevin E; Pitkow, Xaq; Josic, Kresimir.

J Neurosci ; 43(37): 6384-6400, 2023 Sep 13.

Article in English | MEDLINE | ID: mdl-37591738

ABSTRACT

The structure of neural circuitry plays a crucial role in brain function. Previous studies of brain organization generally had to trade off between coarse descriptions at a large scale and fine descriptions on a small scale. Researchers have now reconstructed tens to hundreds of thousands of neurons at synaptic resolution, enabling investigations into the interplay between global, modular organization, and cell type-specific wiring. Analyzing data of this scale, however, presents unique challenges. To address this problem, we applied novel community detection methods to analyze the synapse-level reconstruction of an adult female Drosophila melanogaster brain containing >20,000 neurons and 10 million synapses. Using a machine-learning algorithm, we find the most densely connected communities of neurons by maximizing a generalized modularity density measure. We resolve the community structure at a range of scales, from large (on the order of thousands of neurons) to small (on the order of tens of neurons). We find that the network is organized hierarchically, and larger-scale communities are composed of smaller-scale structures. Our methods identify well-known features of the fly brain, including its sensory pathways. Moreover, focusing on specific brain regions, we are able to identify subnetworks with distinct connectivity types. For example, manual efforts have identified layered structures in the fan-shaped body. Our methods not only automatically recover this layered structure, but also resolve finer connectivity patterns to downstream and upstream areas. We also find a novel modular organization of the superior neuropil, with distinct clusters of upstream and downstream brain regions dividing the neuropil into several pathways. 
These methods show that the fine-scale, local network reconstructions made possible by modern experimental methods are sufficiently detailed to identify the organization of the brain across scales, and enable novel predictions about the structure and function of its parts. SIGNIFICANCE STATEMENT: The Hemibrain is a partial connectome of an adult female Drosophila melanogaster brain containing >20,000 neurons and 10 million synapses. Analyzing the structure of a network of this size requires novel and efficient computational tools. We applied a new community detection method to automatically uncover the modular structure in the Hemibrain dataset by maximizing a generalized modularity measure. This allowed us to resolve the community structure of the fly hemibrain at a range of spatial scales, revealing a hierarchical organization of the network, where larger-scale modules are composed of smaller-scale structures. The method also allowed us to identify subnetworks with distinct cell and connectivity structures, such as the layered structures in the fan-shaped body, and the modular organization of the superior neuropil. Thus, network analysis methods can be adapted to the connectomes being reconstructed using modern experimental methods to reveal the organization of the brain across scales. This supports the view that such connectomes will allow us to uncover the organizational structure of the brain, which can ultimately lead to a better understanding of its function.

Subject(s)

Connectome, Pentaerythritol Tetranitrate, Female, Animals, Drosophila, Drosophila melanogaster, Brain, Neurons

5.

Multi-omics analysis reveals epigenetically regulated processes and patient classification in lung adenocarcinoma.

Brativnyk, Anastasia; Ankill, Jørgen; Helland, Åslaug; Fleischer, Thomas.

Int J Cancer ; 155(2): 282-297, 2024 Jul 15.

Article in English | MEDLINE | ID: mdl-38489486

ABSTRACT

Aberrant DNA methylation is a hallmark of many cancer types. Despite our knowledge of epigenetic and transcriptomic alterations in lung adenocarcinoma (LUAD), we lack robust multi-modal molecular classifications for patient stratification. This is partly because the impact of epigenetic alterations on lung cancer development and progression is still not fully understood. To that end, we identified disease-associated processes under epigenetic regulation in LUAD. We performed a genome-wide expression-methylation Quantitative Trait Loci (emQTL) analysis by integrating DNA methylation and gene expression data from 453 patients in the TCGA cohort. Using a community detection algorithm, we identified distinct communities of CpG-gene associations with diverse biological processes. Interestingly, we identified a community linked to hormone response and lipid metabolism; the identified CpGs in this community were enriched in enhancer regions and binding regions of transcription factors such as FOXA1/2, GRHL2, HNF1B, AR, and ESR1. Furthermore, the CpGs were connected to their associated genes through chromatin interaction loops. These findings suggest that the expression of genes involved in hormone response and lipid metabolism in LUAD is epigenetically regulated through DNA methylation and enhancer-promoter interactions. By applying consensus clustering on the integrated expression-methylation pattern of the emQTL-genes and CpGs linked to hormone response and lipid metabolism, we further identified subclasses of patients with distinct prognoses. This novel patient stratification was validated in an independent patient cohort of 135 patients and showed increased prognostic significance compared to previously defined molecular subtypes.

Subject(s)

Lung Adenocarcinoma, CpG Islands, DNA Methylation, Genetic Epigenesis, Neoplastic Gene Expression Regulation, Lung Neoplasms, Quantitative Trait Loci, Humans, Lung Adenocarcinoma/genetics, Lung Adenocarcinoma/pathology, Lung Neoplasms/genetics, Lung Neoplasms/pathology, CpG Islands/genetics, Female, Male, Adenocarcinoma/genetics, Adenocarcinoma/pathology, Gene Expression Profiling/methods, Multiomics

6.

Community detection in the human connectome: Method types, differences and their impact on inference.

Brooks, Skylar J; Jones, Victoria O; Wang, Haotian; Deng, Chengyuan; Golding, Staunton G H; Lim, Jethro; Gao, Jie; Daoutidis, Prodromos; Stamoulis, Catherine.

Hum Brain Mapp ; 45(5): e26669, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38553865

ABSTRACT

Community structure is a fundamental topological characteristic of optimally organized brain networks. Currently, there is no clear standard or systematic approach for selecting the most appropriate community detection method. Furthermore, the impact of method choice on the accuracy and robustness of estimated communities (and network modularity), as well as method-dependent relationships between network communities and cognitive and other individual measures, are not well understood. This study analyzed large datasets of real brain networks (estimated from resting-state fMRI from n = 5251 pre/early adolescents in the adolescent brain cognitive development [ABCD] study), and n = 5338 synthetic networks with heterogeneous, data-inspired topologies, with the goal of investigating and comparing three classes of community detection methods: (i) modularity maximization-based (Newman and Louvain), (ii) probabilistic (Bayesian inference within the framework of stochastic block modeling (SBM)), and (iii) geometric (based on graph Ricci flow). Extensive comparisons between methods and their individual accuracy (relative to the ground truth in synthetic networks), and reliability (when applied to multiple fMRI runs from the same brains) suggest that the underlying brain network topology plays a critical role in the accuracy, reliability and agreement of community detection methods. Consistent method (dis)similarities, and their correlations with topological properties, were estimated across fMRI runs. Based on synthetic graphs, most methods performed similarly and had comparable high accuracy only in some topological regimes, specifically those corresponding to developed connectomes with at least quasi-optimal community organization. 
In contrast, in densely and/or weakly connected networks with difficult to detect communities, the methods yielded highly dissimilar results, with Bayesian inference within SBM having significantly higher accuracy compared to all others. Associations between method-specific modularity and demographic, anthropometric, physiological and cognitive parameters showed mostly method invariance but some method dependence as well. Although method sensitivity to different levels of community structure may in part explain method-dependent associations between modularity estimates and parameters of interest, method dependence also highlights potential issues of reliability and reproducibility. These findings suggest that a probabilistic approach, such as Bayesian inference in the framework of SBM, may provide consistently reliable estimates of community structure across network topologies. In addition, to maximize robustness of biological inferences, identified network communities and their cognitive, behavioral and other correlates should be confirmed with multiple reliable detection methods.

Subject(s)

Connectome, Adolescent, Humans, Connectome/methods, Reproducibility of Results, Bayes Theorem, Brain/diagnostic imaging, Brain/physiology, Magnetic Resonance Imaging/methods

7.

Subgrouping with Chain Graphical VAR Models.

Park, Jonathan J; Chow, Sy-Miin; Epskamp, Sacha; Molenaar, Peter C M.

Multivariate Behav Res ; 59(3): 543-565, 2024.

Article in English | MEDLINE | ID: mdl-38351547

ABSTRACT

Recent years have seen the emergence of an "idio-thetic" class of methods to bridge the gap between nomothetic and idiographic inference. These methods describe nomothetic trends in idiographic processes by pooling intraindividual information across individuals to inform group-level inference or vice versa. The current work introduces a novel "idio-thetic" model: the subgrouped chain graphical vector autoregression (scGVAR). The scGVAR is unique in its ability to identify subgroups of individuals who share common dynamic network structures in both lag(1) and contemporaneous effects. Results from Monte Carlo simulations indicate that the scGVAR shows promise over similar approaches when clusters of individuals differ in their contemporaneous dynamics, and that it shows increased sensitivity in detecting nuanced group differences while keeping Type-I error rates low. In contrast, a competing approach, the Alternating Least Squares VAR (ALS VAR), performed well when groups were separated by larger distances. Further considerations are provided regarding applications of the ALS VAR and scGVAR on real data and the strengths and limitations of both methods.

Subject(s)

Computer Simulation, Statistical Models, Monte Carlo Method, Humans, Computer Simulation/statistics & numerical data, Statistical Data Interpretation, Least-Squares Analysis

8.

Improving the Walktrap Algorithm Using K-Means Clustering.

Brusco, Michael; Steinley, Douglas; Watts, Ashley L.

Multivariate Behav Res ; 59(2): 266-288, 2024.

Article in English | MEDLINE | ID: mdl-38361218

ABSTRACT

The walktrap algorithm is one of the most popular community-detection methods in psychological research. Several simulation studies have shown that it is often effective at determining the correct number of communities and assigning items to their proper community. Nevertheless, it is important to recognize that the walktrap algorithm relies on hierarchical clustering because it was originally developed for networks much larger than those encountered in psychological research. In this paper, we present and demonstrate a computational alternative to the hierarchical algorithm that is conceptually easier to understand. More importantly, we show that better solutions to the sum-of-squares optimization problem that is heuristically tackled by hierarchical clustering in the walktrap algorithm can often be obtained using exact or approximate methods for K-means clustering. Three simulation studies and analyses of empirical networks were completed to assess the impact of better sum-of-squares solutions.

Subject(s)

Algorithms, Computer Simulation, Cluster Analysis
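
The idea of swapping the hierarchical step of Walktrap for K-means can be sketched as follows. This is an illustrative reading, not the authors' procedure: it assumes the standard Walktrap node distance (t-step transition probabilities with each column scaled by the inverse square root of the node degree), a connected graph with no isolated nodes, and a plain Lloyd's K-means with deterministic farthest-point seeding. All function names and the toy adjacency matrix are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def walk_embedding(adj, t=3):
    """Rows of P^t D^(-1/2); Euclidean distances between rows equal the
    Walktrap node-to-node distances for random walks of length t."""
    n = len(adj)
    deg = [sum(row) for row in adj]  # assumes no isolated nodes
    p = [[adj[i][j] / deg[i] for j in range(n)] for i in range(n)]
    pt = p
    for _ in range(t - 1):
        pt = matmul(pt, p)
    return [[pt[i][k] / deg[k] ** 0.5 for k in range(n)] for i in range(n)]


def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda pt: min(sq_dist(pt, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(pt, centers[c]))
                      for pt in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy network: two triangles joined by the edge 2-3.
adj = [[0, 1, 1, 0, 0, 0],
       [1, 0, 1, 0, 0, 0],
       [1, 1, 0, 1, 0, 0],
       [0, 0, 1, 0, 1, 1],
       [0, 0, 0, 1, 0, 1],
       [0, 0, 0, 1, 1, 0]]
labels = kmeans(walk_embedding(adj, t=3), k=2)
print(labels)
```

Unlike the hierarchical step, K-means needs the number of communities K up front, so in practice it would be run over a range of K values and the solutions compared.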

9.

Exploring Estimation Procedures for Reducing Dimensionality in Psychological Network Modeling.

Shi, Dingjing; Christensen, Alexander P; Day, Eric Anthony; Golino, Hudson F; Garrido, Luis Eduardo.

Multivariate Behav Res ; : 1-27, 2024 Sep 16.

Article in English | MEDLINE | ID: mdl-39279587

ABSTRACT

To understand psychological data, it is crucial to examine the structure and dimensions of variables. In this study, we examined alternative estimation algorithms to the conventional GLASSO-based exploratory graph analysis (EGA) in network psychometric models to assess the dimensionality structure of the data. The study applied Bayesian conjugate or Jeffreys' priors to estimate the graphical structure and then used the Louvain community detection algorithm to partition and identify groups of nodes, which allowed the detection of the multi- and unidimensional factor structures. Monte Carlo simulations suggested that the two alternative Bayesian estimation algorithms had comparable or better performance when compared with the GLASSO-based EGA and conventional parallel analysis (PA). When estimating the multidimensional factor structure, the analytically based method (i.e., EGA.analytical) showed the best balance between accuracy and mean biased/absolute errors, with the highest accuracy tied with EGA but with the smallest errors. The sampling-based approach (EGA.sampling) yielded higher accuracy and smaller errors than PA; lower accuracy but also lower errors than EGA. Techniques from the two algorithms had more stable performance than EGA and PA across different data conditions. When estimating the unidimensional structure, the PA technique performed the best, followed closely by EGA, and then EGA.analytical and EGA.sampling. Furthermore, the study explored four full Bayesian techniques to assess dimensionality in network psychometrics. The results demonstrated superior performance when using Bayesian hypothesis testing or deriving posterior samples of graph structures under small sample sizes. The study recommends using the EGA.analytical technique as an alternative tool for assessing dimensionality and advocates for the usefulness of the EGA.sampling method as a valuable alternate technique. 
The findings also indicated encouraging results for extending the regularization-based network modeling EGA method to the Bayesian framework and discussed future directions in this line of work. The study illustrated the practical application of the techniques to two empirical examples in R.

10.

Urban synergistic carbon emissions reduction research: A perspective on spatial complexity and link prediction.

Zhang, Bin; Yin, Jian; Ding, Rui; Chen, Shihui; Luo, Xinyuan; Wei, Danqi.

J Environ Manage ; 370: 122505, 2024 Sep 17.

Article in English | MEDLINE | ID: mdl-39293117

ABSTRACT

Reducing urban carbon emissions (UCEs) holds paramount importance for global sustainable development. However, the complexity of interactions among urban spatial units has impeded further research on UCEs. This study investigates synergistic emission reduction between cities by analyzing the spatial complexity within the UCEs network. The future potential for synergistic carbon emissions reduction is predicted by the link prediction algorithm. A case study conducted in the Pearl River Basin of China demonstrates that the UCEs network has a complex spatial structure, and the synergistic capacity of emission reduction among cities is enhanced. The core cities in the UCEs network, including Dongguan, Shenzhen, and Guangzhou, have spillover effects that contribute to synergistic emission reduction. Community detection reveals that the common characteristics associated with UCEs become concentrated, thereby enhancing the synergy of joint efforts between cities. The link prediction algorithm indicates a high probability of strengthened carbon emission connections in the Pearl River Delta, alongside those between upstream cities, which shows potential in forecasting synergistic emission reductions. Our research framework offers a comprehensive analysis for synergistic emission reduction from the spatial complexity of UCEs network and link prediction. It acts as a worthwhile reference for developing differentiated policies on synergistic emission reduction.

11.

Comparing community detection algorithms in psychometric networks: A Monte Carlo simulation.

Christensen, Alexander P; Garrido, Luis Eduardo; Guerra-Peña, Kiero; Golino, Hudson.

Behav Res Methods ; 56(3): 1485-1505, 2024 Mar.

Article in English | MEDLINE | ID: mdl-37326769

ABSTRACT

Identifying the correct number of factors in multivariate data is fundamental to psychological measurement. Factor analysis has a long tradition in the field, but it has been challenged recently by exploratory graph analysis (EGA), an approach based on network psychometrics. EGA first estimates a network and then applies the Walktrap community detection algorithm. Simulation studies have demonstrated that EGA has comparable or better accuracy for recovering the same number of communities as there are factors in the simulated data than factor analytic methods. Despite EGA's effectiveness, there has yet to be an investigation into whether other sparsity induction methods or community detection algorithms could achieve equivalent or better performance. Furthermore, unidimensional structures are fundamental to psychological measurement yet they have been sparsely studied in simulations using community detection algorithms. In the present study, we performed a Monte Carlo simulation using the zero-order correlation matrix, GLASSO, and two variants of a non-regularized partial correlation sparsity induction methods with several community detection algorithms. We examined the performance of these method-algorithm combinations in both continuous and polytomous data across a variety of conditions. The results indicate that the Fast-greedy, Louvain, and Walktrap algorithms paired with the GLASSO method were consistently among the most accurate and least-biased overall.

Subject(s)

Algorithms, Humans, Monte Carlo Method, Psychometrics, Computer Simulation

12.

Comparing the Clique Percolation algorithm to other overlapping community detection algorithms in psychological networks: A Monte Carlo simulation study.

Ribeiro Santiago, Pedro Henrique; Soares, Gustavo Hermes; Quintero, Adrian; Jamieson, Lisa.

Behav Res Methods ; 56(7): 7219-7240, 2024 Oct.

Article in English | MEDLINE | ID: mdl-38693441

ABSTRACT

In psychological networks, one limitation of the most used community detection algorithms is that they can only assign each node (symptom) to a unique community, without being able to identify overlapping symptoms. The clique percolation (CP) is an algorithm that identifies overlapping symptoms but its performance has not been evaluated in psychological networks. In this study, we compare the CP with model parameters chosen based on fuzzy modularity (CPMod) with two other alternatives, the ratio of the two largest communities (CPRat), and entropy (CPEnt). We evaluate their performance to: (1) identify the correct number of latent factors (i.e., communities); and (2) identify the observed variables with substantive (and equally sized) cross-loadings (i.e., overlapping symptoms). We carried out simulations under 972 conditions (3x2x2x3x3x3x3): (1) data categories (continuous, polytomous and dichotomous); (2) number of factors (two and four); (3) number of observed variables per factor (four and eight); (4) factor correlations (0.0, 0.5, and 0.7); (5) size of primary factor loadings (0.40, 0.55, and 0.70); (6) proportion of observed variables with substantive cross-loadings (0.0%, 12.5%, and 25.0%); and (7) sample size (300, 500, and 1000). Performance was evaluated through the Omega index, Mean Bias Error (MBE), Mean Absolute Error (MAE), sensitivity, specificity, and mean number of isolated nodes. We also evaluated two other methods, Exploratory Factor Analysis and the Walktrap algorithm modified to consider overlap (EFA-Ov and Walk-Ov, respectively). The Walk-Ov displayed the best performance across most conditions and is the recommended option to identify communities with overlapping symptoms in psychological networks.

Subject(s)

Algorithms, Monte Carlo Method, Humans, Computer Simulation, Fuzzy Logic
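
The fully crossed design described in this abstract (3x2x2x3x3x3x3 = 972 conditions) can be enumerated with a short sketch like the one below. The factor levels are taken from the abstract; the dictionary keys and variable names are made up for this example and are not the authors' code.

```python
from itertools import product

# Factor levels as listed in the abstract; key names are hypothetical.
design = {
    "data_type": ["continuous", "polytomous", "dichotomous"],
    "n_factors": [2, 4],
    "vars_per_factor": [4, 8],
    "factor_correlation": [0.0, 0.5, 0.7],
    "primary_loading": [0.40, 0.55, 0.70],
    "cross_loading_share": [0.0, 0.125, 0.25],
    "sample_size": [300, 500, 1000],
}

# One dict per simulation condition, covering every combination of levels.
conditions = [dict(zip(design, levels)) for levels in product(*design.values())]
print(len(conditions))  # 972
```

Each entry of `conditions` could then be passed to a data-generating routine, which is outside the scope of this sketch.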

13.

Game Theoretic Clustering for Finding Strong Communities.

Zhao, Chao; Al-Bashabsheh, Ali; Chan, Chung.

Entropy (Basel) ; 26(3), 2024 Mar 18.

Article in English | MEDLINE | ID: mdl-38539779

ABSTRACT

We address the challenge of identifying meaningful communities by proposing a model based on convex game theory and a measure of community strength. Many existing community detection methods fail to provide unique solutions, and it remains unclear how the solutions depend on initial conditions. Our approach identifies strong communities with a hierarchical structure, visualizable as a dendrogram, and computable in polynomial time using submodular function minimization. This framework extends beyond graphs to hypergraphs or even polymatroids. In the case when the model is graphical, a more efficient algorithm based on the max-flow min-cut algorithm can be devised. Though not achieving near-linear time complexity, the pursuit of practical algorithms is an intriguing avenue for future research. Our work serves as the foundation, offering an analytical framework that yields unique solutions with clear operational meaning for the communities identified.

14.

F-Deepwalk: A Community Detection Model for Transport Networks.

Guo, Jiaao; Liang, Qinghuai; Zhao, Jiaqi.

Entropy (Basel) ; 26(8), 2024 Aug 22.

Article in English | MEDLINE | ID: mdl-39202185

ABSTRACT

The design of transportation networks is generally performed on the basis of the division of a metropolitan region into communities. With the combination of the scale, population density, and travel characteristics of each community, the transportation routes and stations can be more precisely determined to meet the travel demand of residents within each of the communities as well as the transportation links among communities. To accurately divide urban communities, the original word vector sampling method is improved on the classic Deepwalk model, proposing a Random Walk (RW) algorithm in which the sampling is modified with the generalized travel cost and improved logit model. Urban spatial community detection is realized with the K-means algorithm, building the F-Deepwalk model. Using the basic road network as an example, the experimental results show that the Deepwalk model, which considers the generalized travel cost of residents, has a higher profile coefficient, and the performance of the model improves with the reduction of random walk length. At the same time, taking the Shijiazhuang urban rail transit network as an example, the accuracy of the model is further verified.
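
One way to picture a random walk whose sampling depends on generalized travel cost is the sketch below. It is an assumption-laden illustration rather than the F-Deepwalk algorithm: it uses a plain multinomial logit choice, with the probability of moving to a neighbor proportional to exp(-theta * cost), and the network, costs, `theta`, and function name are all hypothetical. In a DeepWalk-style pipeline the sampled walks would then be embedded and the embeddings clustered with K-means; only the sampling step is shown.

```python
import math
import random


def logit_walk(neighbors, cost, start, length, theta=1.0, rng=random):
    """Sample one walk in which cheaper links are more likely to be taken.

    neighbors: dict node -> list of adjacent nodes
    cost: dict (u, v) -> generalized travel cost of moving from u to v
    """
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        options = neighbors[current]
        if not options:  # dead end: stop early
            break
        # Multinomial logit weights: lower cost gives higher probability.
        weights = [math.exp(-theta * cost[(current, nxt)]) for nxt in options]
        walk.append(rng.choices(options, weights=weights, k=1)[0])
    return walk


# Hypothetical four-node road network with symmetric link costs.
base_cost = {("A", "B"): 1.0, ("A", "C"): 2.5, ("B", "C"): 1.5, ("C", "D"): 4.0}
cost, neighbors = {}, {}
for (u, v), c in base_cost.items():
    cost[(u, v)] = cost[(v, u)] = c
    neighbors.setdefault(u, []).append(v)
    neighbors.setdefault(v, []).append(u)

walk = logit_walk(neighbors, cost, "A", 8, theta=1.0, rng=random.Random(0))
print(walk)
```

With `theta` set to 0 the walk reduces to the uniform sampling of the classic DeepWalk model, which makes the parameter a convenient knob for comparing the two.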

15.

Uncertainty in GNN Learning Evaluations: A Comparison between Measures for Quantifying Randomness in GNN Community Detection.

Leeney, William; McConville, Ryan.

Entropy (Basel) ; 26(1), 2024 Jan 17.

Article in English | MEDLINE | ID: mdl-38248203

ABSTRACT

(1) The enhanced capability of graph neural networks (GNNs) in unsupervised community detection of clustered nodes is attributed to their capacity to encode both the connectivity and feature information spaces of graphs. The identification of latent communities holds practical significance in various domains, from social networks to genomics. Current real-world performance benchmarks are perplexing due to the multitude of decisions influencing GNN evaluations for this task. (2) Three metrics are compared to assess the consistency of algorithm rankings in the presence of randomness. The consistency and quality of performance between the results under a hyperparameter optimisation with the default hyperparameters is evaluated. (3) The results compare hyperparameter optimisation with default hyperparameters, revealing a significant performance loss when neglecting hyperparameter investigation. A comparison of metrics indicates that ties in ranks can substantially alter the quantification of randomness. (4) Ensuring adherence to the same evaluation criteria may result in notable differences in the reported performance of methods for this task. The W randomness coefficient, based on the Wasserstein distance, is identified as providing the most robust assessment of randomness.

16.

clusterMaker2: a major update to clusterMaker, a multi-algorithm clustering app for Cytoscape.

Utriainen, Maija; Morris, John H.

BMC Bioinformatics ; 24(1): 134, 2023 Apr 05.

Article in English | MEDLINE | ID: mdl-37020209

ABSTRACT

BACKGROUND: Since the initial publication of clusterMaker, the need for tools to analyze large biological datasets has only increased. New datasets are significantly larger than a decade ago, and new experimental techniques such as single-cell transcriptomics continue to drive the need for clustering or classification techniques to focus on portions of datasets of interest. While many libraries and packages exist that implement various algorithms, there remains the need for clustering packages that are easy to use, integrated with visualization of the results, and integrated with other commonly used tools for biological data analysis. clusterMaker2 has added several new algorithms, including two entirely new categories of analyses: node ranking and dimensionality reduction. Furthermore, many of the new algorithms have been implemented using the Cytoscape jobs API, which provides a mechanism for executing remote jobs from within Cytoscape. Together, these advances facilitate meaningful analyses of modern biological datasets despite their ever-increasing size and complexity. RESULTS: The use of clusterMaker2 is exemplified by reanalyzing the yeast heat shock expression experiment that was included in our original paper; however, here we explored this dataset in significantly more detail. Combining this dataset with the yeast protein-protein interaction network from STRING, we were able to perform a variety of analyses and visualizations from within clusterMaker2, including Leiden clustering to break the entire network into smaller clusters, hierarchical clustering to look at the overall expression dataset, dimensionality reduction using UMAP to find correlations between our hierarchical visualization and the UMAP plot, fuzzy clustering, and cluster ranking. Using these techniques, we were able to explore the highest-ranking cluster and determine that it represents a strong contender for proteins working together in response to heat shock. We found a series of clusters that, when re-explored as fuzzy clusters, provide a better presentation of mitochondrial processes. CONCLUSIONS: clusterMaker2 represents a significant advance over the previously published version, and most importantly, provides an easy-to-use tool to perform clustering and to visualize clusters within the Cytoscape network context. The new algorithms should be welcome to the large population of Cytoscape users, particularly the new dimensionality reduction and fuzzy clustering techniques.

Subject(s)

Mobile Applications, Saccharomyces cerevisiae, Algorithms, Protein Interaction Maps, Cluster Analysis

17.

Molecular complex detection in protein interaction networks through reinforcement learning.

Palukuri, Meghana V; Patil, Ridhi S; Marcotte, Edward M.

BMC Bioinformatics ; 24(1): 306, 2023 Aug 02.

Article in English | MEDLINE | ID: mdl-37532987

ABSTRACT

BACKGROUND: Proteins often assemble into higher-order complexes to perform their biological functions. Such protein-protein interactions (PPI) are often experimentally measured for pairs of proteins and summarized in a weighted PPI network, to which community detection algorithms can be applied to define the various higher-order protein complexes. Current methods include unsupervised and supervised approaches, often assuming that protein complexes manifest only as dense subgraphs. Supervised approaches focus not on how to find complexes in a network, but only on learning which subgraphs correspond to complexes; the search itself is currently handled by heuristics. However, learning to walk trajectories on a network to identify protein complexes leads naturally to a reinforcement learning (RL) approach, a strategy not extensively explored for community detection. Here, we develop and evaluate a reinforcement learning pipeline for community detection on weighted protein-protein interaction networks to detect new protein complexes. The algorithm is trained to calculate the value of different subgraphs encountered while walking on the network to reconstruct known complexes. A distributed prediction algorithm then scales the RL pipeline to search for novel protein complexes on large PPI networks. RESULTS: The reinforcement learning pipeline is applied to a human PPI network consisting of 8k proteins and 60k PPI, which results in 1,157 protein complexes. The method demonstrated competitive accuracy with improved speed compared to previous algorithms. We highlight protein complexes such as C4orf19, C18orf21, and KIAA1522, which are currently minimally characterized. Additionally, the results suggest that TMC04 is a putative additional subunit of the KICSTOR complex and confirm the involvement of C15orf41 in a higher-order complex with HIRA, CDAN1, ASF1A, and by 3D structural modeling.
CONCLUSIONS: Reinforcement learning offers several distinct advantages for community detection, including scalability and knowledge of the walk trajectories defining those communities. Applied to currently available human protein interaction networks, this method had comparable accuracy with other algorithms and notable savings in computational time, and in turn, led to clear predictions of protein function and interactions for several uncharacterized human proteins.
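The walk-and-score idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the learned value function is replaced here by a fixed weighted-density score, and the toy network, the protein labels `P1` to `P5`, the function names (`weighted_density`, `neighbours`, `grow_complex`) and the `min_value` threshold are all assumptions made purely for illustration.

```python
# Toy weighted PPI network (hypothetical proteins and weights).
WEIGHTS = {
    frozenset(("P1", "P2")): 0.9,
    frozenset(("P1", "P3")): 0.8,
    frozenset(("P2", "P3")): 0.85,
    frozenset(("P3", "P4")): 0.1,
    frozenset(("P4", "P5")): 0.9,
}


def weighted_density(nodes, weights):
    """Sum of edge weights inside `nodes` divided by the number of node pairs.

    Stand-in for a learned value function (assumption, not the paper's model).
    """
    ordered = sorted(nodes)
    n = len(ordered)
    if n < 2:
        return 0.0
    total = sum(
        weights.get(frozenset((u, v)), 0.0)
        for i, u in enumerate(ordered)
        for v in ordered[i + 1:]
    )
    return total / (n * (n - 1) / 2)


def neighbours(nodes, weights):
    """Nodes outside `nodes` that share at least one edge with it."""
    out = set()
    for edge in weights:
        if len(edge & nodes) == 1:
            out |= edge - nodes
    return out


def grow_complex(seed, weights, min_value=0.5):
    """Walk outward from `seed`, adding the best-scoring neighbour each step.

    Stops when no neighbour remains or when the best extension would score
    below `min_value` (a hypothetical stopping rule).
    """
    current = set(seed)
    while True:
        candidates = neighbours(current, weights)
        if not candidates:
            return current
        best = max(
            sorted(candidates),
            key=lambda node: weighted_density(current | {node}, weights),
        )
        if weighted_density(current | {best}, weights) < min_value:
            return current
        current.add(best)


print(sorted(grow_complex({"P1", "P2"}, WEIGHTS)))
```

In a reinforcement learning setting the score of each candidate subgraph would be learned from known complexes rather than fixed in advance; the greedy loop above only shows where such a value function would plug in.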

Subject(s)

Algorithms, Protein Interaction Maps, Humans, Transcription Factors, Protein Interaction Mapping/methods, Computational Biology/methods, Glycoproteins, Nuclear Proteins, Cell Cycle Proteins, Molecular Chaperones

18.

Connectivity-based Meta-Bands: A new approach for automatic frequency band identification in connectivity analyses.

Rodríguez-González, Víctor; Núñez, Pablo; Gómez, Carlos; Shigihara, Yoshihito; Hoshi, Hideyuki; Tola-Arribas, Miguel Ángel; Cano, Mónica; Guerrero, Ángel; García-Azorín, David; Hornero, Roberto; Poza, Jesús.

Neuroimage ; 280: 120332, 2023 Oct 15.

Article in English | MEDLINE | ID: mdl-37619796

ABSTRACT

The majority of electroencephalographic (EEG) and magnetoencephalographic (MEG) studies filter and analyse neural signals in specific frequency ranges, known as "canonical" frequency bands. However, this segmentation is not exempt from limitations, mainly due to the lack of adaptation to the neural idiosyncrasies of each individual. In this study, we introduce a new data-driven method to automatically identify frequency ranges based on the topological similarity of the frequency-dependent functional neural network. The resting-state neural activity of 195 cognitively healthy subjects from three different databases (MEG: 123 subjects; EEG1: 27 subjects; EEG2: 45 subjects) was analysed. In a first step, MEG and EEG signals were filtered with a narrow-band filter bank (1 Hz bandwidth) from 1 to 70 Hz with a 0.5 Hz step. Next, the connectivity in each of these filtered signals was estimated using the orthogonalized version of the amplitude envelope correlation to obtain the frequency-dependent functional neural network. Finally, a community detection algorithm was used to identify communities in the frequency domain showing a similar network topology. We have called this approach the "Connectivity-based Meta-Bands" (CMB) algorithm. Additionally, two types of synthetic signals were used to configure the hyper-parameters of the CMB algorithm. We observed that the classical approaches to band segmentation are partially aligned with the underlying network topologies at group level for the MEG signals, but they are missing individual idiosyncrasies that may be biasing previous studies, as revealed by our methodology. On the other hand, the sensitivity of EEG signals to reflect this underlying frequency-dependent network structure is limited, revealing a simpler frequency parcellation, not aligned with that defined by the "canonical" frequency bands. To the best of our knowledge, this is the first study that proposes an unsupervised band segmentation method based on the topological similarity of functional neural networks across frequencies. This methodology fully accounts for subject-specific patterns, providing more robust and personalized analyses, and paving the way for new studies focused on exploring the frequency-dependent structure of brain connectivity.
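The first and last steps of the procedure described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the published CMB code: it assumes that 1 Hz and 70 Hz are the centre frequencies of the first and last narrow bands, it uses plain Pearson correlation between flattened connectivity vectors as the similarity measure, and the function names and toy vectors are hypothetical. The filtering, the amplitude envelope correlation and the community detection step itself are omitted.

```python
from math import sqrt


def filter_bank(f_min=1.0, f_max=70.0, step=0.5, bandwidth=1.0):
    """Return (low, centre, high) edges in Hz for each narrow band.

    Assumes f_min and f_max are band centres (an interpretation of the text).
    """
    n_bands = int(round((f_max - f_min) / step)) + 1
    bands = []
    for i in range(n_bands):
        centre = f_min + i * step
        bands.append((centre - bandwidth / 2, centre, centre + bandwidth / 2))
    return bands


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def similarity_matrix(networks):
    """Pairwise similarity between frequency-specific connectivity vectors.

    `networks` holds one flattened connectivity vector per frequency bin; the
    resulting matrix is what a community detection step would operate on.
    """
    return [[pearson(a, b) for b in networks] for a in networks]


bands = filter_bank()

# Toy connectivity vectors for three frequency bins (made-up values).
toy_networks = [
    [0.9, 0.1, 0.2],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.7],
]
similarity = similarity_matrix(toy_networks)
print(len(bands), bands[0], bands[-1])
```

Frequency bins whose networks are highly similar (here the first two toy bins) would be candidates for the same meta-band once a community detection algorithm is applied to the similarity matrix.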

Subject(s)

Electroencephalography, Magnetoencephalography, Humans, Algorithms, Brain, Factual Databases

19.

DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method.

Chu, Yanyi; Shan, Xiaoqi; Chen, Tianhang; Jiang, Mingming; Wang, Yanjing; Wang, Qiankun; Salahub, Dennis Russell; Xiong, Yi; Wei, Dong-Qing.

Brief Bioinform ; 22(3)2021 May 20.

Article in English | MEDLINE | ID: mdl-32964234

ABSTRACT

Identifying drug-target interactions (DTIs) is an important step for drug discovery and drug repositioning. To reduce the experimental cost, a large number of computational approaches have been proposed for this task. Machine learning-based models, especially binary classification models, have been developed to predict whether a drug-target pair interacts or not. However, there is still much room for improvement in the performance of current methods. Multi-label learning can overcome some difficulties caused by single-label learning in order to improve the predictive performance. The key challenge faced by multi-label learning is the exponential-sized output space, and considering label correlations can help to overcome this challenge. In this paper, we facilitate multi-label classification by introducing community detection methods for DTI prediction, named DTI-MLCD. Moreover, we updated the gold standard data set by adding 15,000 more positive DTI samples relative to the data set that has been widely used by most previously published DTI prediction methods since 2008. The proposed DTI-MLCD is applied to both data sets, demonstrating its superiority over other machine learning methods and several existing methods. The data sets and source code of this study are freely available at https://github.com/a96123155/DTI-MLCD.

Subject(s)

Algorithms, Computational Biology/methods, Machine Learning, Pharmaceutical Preparations/metabolism, Proteins/metabolism, Computer Simulation, Drug Discovery/methods, Drug Repositioning/methods, Internet, Molecular Targeted Therapy/methods, Pharmaceutical Preparations/administration & dosage, Pharmaceutical Preparations/chemistry, Protein Binding, Proteins/antagonists & inhibitors, Proteins/chemistry, Reproducibility of Results

20.

Over-representation analysis of angiogenic factors in immunosuppressive mechanisms in neoplasms and neurological conditions during COVID-19.

Chatterjee, S; Sanjeev, B S.

Microb Pathog ; 185: 106386, 2023 Dec.

Article in English | MEDLINE | ID: mdl-37865274

ABSTRACT

BACKGROUND: Recent studies emphasized the necessity to identify key (human) biological processes and pathways targeted by the Coronaviridae family of viruses, especially severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Coronavirus disease (COVID-19) caused death rates of up to 33-55% in patients with malignant neoplasms and Alzheimer's disease. Given this scenario, we identified biological processes and pathways involved in various diseases which are most likely affected by COVID-19. METHODS: The COVID-19 DisGeNET data set (v4.0) contains associations between various diseases and human genes known to interact with viruses from the Coronaviridae family; it was obtained from the IntAct Coronavirus data set annotated with DisGeNET data. We constructed the disease-gene network to identify genes that are involved in various comorbid diseased states. Communities in the disease-gene network were identified using the Louvain method, and functional enrichment through over-representation analysis was used to discover significant biological processes and pathways shared between COVID-19 and other diseases. RESULTS: The COVID-19 DisGeNET data set (v4.0) comprised 828 human genes and 10,473 diseases (including various phenotypes) that together constituted the nodes of the disease-gene network. Each of the 70,210 edges connects a human gene with an associated disease. The top 10 genes linked to the largest number of diseases were VEGFA, BCL2, CTNNB1, ALB, COX2, AGT, HLA-A, HMOX1, FGF2 and COMT. The most vulnerable group of patients thus discovered had comorbid conditions such as carcinomas, malignant neoplasms and Alzheimer's disease. Finally, we identified 15 potentially useful biological processes and pathways for improved therapies. Vascular endothelial growth factor (VEGF) is the key mediator of angiogenesis in cancer. It is widely distributed in the brain and plays a crucial role in brain inflammation, regulating the level of angiopoietins. With a degree of 1899, VEGFA was associated with the maximum number of diseases in the disease-gene network. Previous studies have indicated that increased levels of VEGFA in the blood result in dyspnea, pulmonary edema (PE), acute lung injury (ALI) and acute respiratory distress syndrome (ARDS). In the case of COVID-19 patients with neoplasms and other neurological symptoms, our results indicate VEGFA as a therapeutic target for inflammation suppression. As VEGFs are known to disproportionately affect cancer patients, improving endothelial permeability and vasodilation with anti-VEGF therapy could lead to suppression of inflammation and also improve oxygenation. As an outcome of our study, we make the case for clinical investigations towards anti-VEGF therapies for such comorbid conditions affected by COVID-19 for better therapeutic outcomes.
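Two of the quantities mentioned in this abstract, the degree of a gene in a disease-gene network and an over-representation p-value, can be sketched as below. This is a minimal illustration rather than the authors' workflow: the edge list uses placeholder disease labels (`D1`, `D2`, ...), the counts passed to the enrichment test are invented, the function names are hypothetical, and over-representation is assumed to be tested with a one-sided hypergeometric test, which is a common choice but is not stated in the abstract. The Louvain community detection step is not reproduced.

```python
from collections import defaultdict
from math import comb


def gene_degrees(edges):
    """Number of distinct diseases linked to each gene.

    `edges` is an iterable of (gene, disease) pairs from a bipartite network.
    """
    diseases_by_gene = defaultdict(set)
    for gene, disease in edges:
        diseases_by_gene[gene].add(disease)
    return {gene: len(ds) for gene, ds in diseases_by_gene.items()}


def ora_p_value(n_universe, n_annotated, n_selected, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_universe:  genes in the background set
    n_annotated: background genes annotated to the pathway
    n_selected:  genes in the community being tested
    n_overlap:   selected genes that carry the annotation
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy edge list: gene symbols are taken from the abstract, disease labels
# are placeholders and the links are invented for illustration.
toy_edges = [
    ("VEGFA", "D1"), ("VEGFA", "D2"), ("VEGFA", "D3"),
    ("BCL2", "D1"), ("BCL2", "D2"),
    ("COMT", "D3"),
]
degrees = gene_degrees(toy_edges)
ranked = sorted(degrees, key=lambda g: (-degrees[g], g))

# Invented counts: 20 background genes, 5 annotated, 4 selected, 3 overlapping.
p_value = ora_p_value(20, 5, 4, 3)
print(ranked, round(p_value, 4))
```

Ranking genes by degree reproduces, on toy data, the kind of "top genes linked to the largest number of diseases" list reported in the abstract; the enrichment function would be applied once per community and per candidate pathway, usually followed by a multiple-testing correction.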

Subject(s)

Alzheimer Disease, COVID-19, Neoplasms, Humans, SARS-CoV-2/genetics, SARS-CoV-2/metabolism, Vascular Endothelial Growth Factor A/genetics, Vascular Endothelial Growth Factor A/metabolism, Alzheimer Disease/genetics, Inflammation, Neoplasms/genetics, Immunosuppressive Agents
