Results 1 - 20 of 38
1.
Nucleic Acids Res ; 52(W1): W481-W488, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38783119

ABSTRACT

In recent decades, the development of new drugs has become increasingly expensive and inefficient, and the molecular mechanisms of most pharmaceuticals remain poorly understood. In response, computational systems and network medicine tools have emerged to identify potential drug repurposing candidates. However, these tools often require complex installation and lack intuitive visual network mining capabilities. To tackle these challenges, we introduce Drugst.One, a platform that assists specialized computational medicine tools in becoming user-friendly, web-based utilities for drug repurposing. With just three lines of code, Drugst.One turns any systems biology software into an interactive web tool for modeling and analyzing complex protein-drug-disease networks. Demonstrating its broad adaptability, Drugst.One has been successfully integrated with 21 computational systems medicine tools. Available at https://drugst.one, Drugst.One has significant potential for streamlining the drug discovery process, allowing researchers to focus on essential aspects of pharmaceutical treatment research.


Subject(s)
Drug Repositioning , Software , Drug Repositioning/methods , Humans , Internet , Drug Discovery/methods , Systems Biology/methods , Computational Biology/methods
2.
Bioinformatics ; 39(5)2023 05 04.
Article in English | MEDLINE | ID: mdl-37084262

ABSTRACT

MOTIVATION: Advances in omics technologies have revolutionized cancer research by producing massive datasets. A common approach to deciphering these complex data is to apply embedding algorithms to molecular interaction networks. These algorithms find a low-dimensional space in which similarities between the network nodes are best preserved. Currently available embedding approaches mine the gene embeddings directly to uncover new cancer-related knowledge. However, these gene-centric approaches produce incomplete knowledge, since they do not account for the functional implications of genomic alterations. We propose a new, function-centric perspective and approach, to complement the knowledge obtained from omic data. RESULTS: We introduce our Functional Mapping Matrix (FMM) to explore the functional organization of different tissue-specific and species-specific embedding spaces generated by a Non-negative Matrix Tri-Factorization algorithm. Also, we use our FMM to define the optimal dimensionality of these molecular interaction network embedding spaces. For this optimal dimensionality, we compare the FMMs of the most prevalent cancers in humans to FMMs of their corresponding control tissues. We find that cancer alters the positions in the embedding space of cancer-related functions, while it keeps the positions of the noncancer-related ones. We exploit this spatial 'movement' to predict novel cancer-related functions. Finally, we predict novel cancer-related genes that the currently available methods for gene-centric analyses cannot identify; we validate these predictions by literature curation and retrospective analyses of patient survival data. AVAILABILITY AND IMPLEMENTATION: Data and source code can be accessed at https://github.com/gaiac/FMM.


Subject(s)
Neoplasms , Humans , Retrospective Studies , Neoplasms/genetics , Software , Algorithms , Genomics/methods
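Several abstracts in this listing rely on Non-negative Matrix Tri-Factorization (NMTF), which approximates a non-negative matrix X by a product F S G^T of non-negative factors. The sketch below is a generic multiplicative-update scheme written in plain Python for illustration; it is not the authors' implementation, and the function names, the toy matrix and the choice of ranks are assumptions.

```python
import random

def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a dense matrix."""
    return [list(col) for col in zip(*a)]

def frobenius_error(x, f, s, g):
    """Squared Frobenius norm of X - F S G^T."""
    approx = matmul(matmul(f, s), transpose(g))
    return sum((xv - av) ** 2 for xr, ar in zip(x, approx) for xv, av in zip(xr, ar))

def multiplicative_step(m, numer, denom, eps=1e-12):
    """Element-wise m * numer / denom; non-negative inputs stay non-negative."""
    return [[mv * nv / (dv + eps) for mv, nv, dv in zip(mr, nr, dr)]
            for mr, nr, dr in zip(m, numer, denom)]

def nmtf(x, k1, k2, iterations=200, seed=0):
    """Factorize X (n x m) as F (n x k1), S (k1 x k2), G (m x k2)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    f = [[rng.random() + 0.1 for _ in range(k1)] for _ in range(n)]
    s = [[rng.random() + 0.1 for _ in range(k2)] for _ in range(k1)]
    g = [[rng.random() + 0.1 for _ in range(k2)] for _ in range(m)]
    for _ in range(iterations):
        # F <- F * (X G S^T) / (F S G^T G S^T)
        gst = matmul(g, transpose(s))
        f = multiplicative_step(f, matmul(x, gst),
                                matmul(f, matmul(transpose(gst), gst)))
        # G <- G * (X^T F S) / (G S^T F^T F S)
        fs = matmul(f, s)
        g = multiplicative_step(g, matmul(transpose(x), fs),
                                matmul(g, matmul(transpose(fs), fs)))
        # S <- S * (F^T X G) / (F^T F S G^T G)
        ft = transpose(f)
        numer = matmul(matmul(ft, x), g)
        denom = matmul(matmul(matmul(ft, f), s), matmul(transpose(g), g))
        s = multiplicative_step(s, numer, denom)
    return f, s, g

# Hypothetical toy matrix with two obvious blocks.
toy_x = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
initial_error = frobenius_error(toy_x, *nmtf(toy_x, 2, 2, iterations=0))
final_error = frobenius_error(toy_x, *nmtf(toy_x, 2, 2, iterations=200))
```

In this setup the rows of F and G play the role of embeddings of the row and column entities, and S links the two embedding spaces. Published NMTF variants often add orthogonality or graph-regularization constraints that this sketch omits.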
3.
Bioinformatics ; 38(18): 4344-4351, 2022 09 15.
Article in English | MEDLINE | ID: mdl-35916710

ABSTRACT

MOTIVATION: Cancer is a genetic disease in which accumulated mutations of driver genes induce a functional reorganization of the cell by reprogramming cellular pathways. Current approaches identify cancer pathways as those most internally perturbed by gene expression changes. However, driver genes characteristically perform hub roles between pathways. Therefore, we hypothesize that cancer pathways should be identified by changes in their pathway-pathway relationships. RESULTS: To learn an embedding space that captures the relationships between pathways in a healthy cell, we propose pathway-driven non-negative matrix tri-factorization. In this space, we determine condition-specific (i.e. diseased and healthy) embeddings of pathways and genes. Based on these embeddings, we define our 'NMTF centrality' to measure a pathway's or gene's functional importance, and our 'moving distance', to measure the change in its functional relationships. We combine both measures to predict 15 genes and pathways involved in four major cancers, predicting 60 gene-cancer associations in total, covering 28 unique genes. To further exploit driver genes' tendency to perform hub roles, we model our network data using graphlet adjacency, which considers nodes adjacent if their interaction patterns form specific shapes (e.g. paths or triangles). We find that the predicted genes rewire pathway-pathway interactions in the immune system and provide literature evidence that many are druggable (15/28) and implicated in the associated cancers (47/60). We predict six druggable cancer-specific drug targets. AVAILABILITY AND IMPLEMENTATION: The code and data are available at: https://gitlab.bsc.es/swindels/pathway_driven_nmtf. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Neoplasms , Humans , Neoplasms/genetics , Algorithms , Mutation , Drug Delivery Systems
4.
Int J Mol Sci ; 24(2)2023 Jan 11.
Article in English | MEDLINE | ID: mdl-36674947

ABSTRACT

The COVID-19 pandemic is an acute and rapidly evolving global health crisis. To better understand this disease's molecular basis and design therapeutic strategies, we built upon the recently proposed concept of an integrated cell, iCell, fusing three omics, tissue-specific human molecular interaction networks. We applied this methodology to construct infected and control iCells using gene expression data from patient samples and three cell lines. We found large differences between patient-based and cell line-based iCells (both infected and control), suggesting that cell lines are ill-suited to studying this disease. We compared patient-based infected and control iCells and uncovered genes whose functioning (wiring patterns in iCells) is altered by the disease. We validated in the literature that 18 out of the top 20 of the most rewired genes are indeed COVID-19-related. Since only three of these genes are targets of approved drugs, we applied another data fusion step to predict drugs for re-purposing. We confirmed with molecular docking that the predicted drugs can bind to their predicted targets. Our most interesting prediction is artenimol, an antimalarial agent targeting ZFP62, one of our newly identified COVID-19-related genes. This drug is a derivative of artemisinin drugs that are already under clinical investigation for their potential role in the treatment of COVID-19. Our results demonstrate further applicability of the iCell framework for integrative comparative studies of human diseases.


Subject(s)
COVID-19 , Humans , COVID-19/genetics , Molecular Docking Simulation , Pandemics , Drug Repositioning
5.
Bioinformatics ; 37(7): 1000-1007, 2021 05 17.
Article in English | MEDLINE | ID: mdl-32886115

ABSTRACT

MOTIVATION: Biological and cellular systems are often modeled as graphs in which vertices represent objects of interest (genes, proteins and drugs) and edges represent relational ties between these objects (binds-to, interacts-with and regulates). This approach has been highly successful owing to the theory, methodology and software that support analysis and learning on graphs. Graphs, however, suffer from information loss when modeling physical systems due to their inability to accurately represent multiobject relationships. Hypergraphs, a generalization of graphs, provide a framework to mitigate information loss and unify disparate graph-based methodologies. RESULTS: We present a hypergraph-based approach for modeling biological systems and formulate vertex classification, edge classification and link prediction problems on (hyper)graphs as instances of vertex classification on (extended, dual) hypergraphs. We then introduce a novel kernel method on vertex- and edge-labeled (colored) hypergraphs for analysis and learning. The method is based on exact and inexact (via hypergraph edit distances) enumeration of hypergraphlets; i.e. small hypergraphs rooted at a vertex of interest. We empirically evaluate this method on fifteen biological networks and show its potential use in a positive-unlabeled setting to estimate the interactome sizes in various species. AVAILABILITY AND IMPLEMENTATION: https://github.com/jlugomar/hypergraphlet-kernels. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Proteins , Software
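The abstract above casts edge-level problems as vertex-level problems on a dual hypergraph. As a minimal sketch of that idea, assuming a hypergraph is stored as a mapping from hyperedge names to vertex sets, the dual simply swaps the two roles. The function name and the toy complex and pathway labels are hypothetical, and the extended hypergraph construction used in the paper is not reproduced here.

```python
def dual_hypergraph(hyperedges):
    """Build the dual: every vertex becomes a hyperedge that contains
    the names of the original hyperedges it belonged to."""
    dual = {}
    for edge_name, vertices in hyperedges.items():
        for vertex in vertices:
            dual.setdefault(vertex, set()).add(edge_name)
    return dual

# Hypothetical toy data: one protein complex and one pathway sharing P3.
toy_hyperedges = {
    "complex_A": {"P1", "P2", "P3"},
    "pathway_B": {"P3", "P4"},
}
toy_dual = dual_hypergraph(toy_hyperedges)
```

Because hyperedges can hold any number of vertices, a three-protein complex is kept as one object rather than being flattened into three pairwise edges. Applying the function twice returns the original mapping as long as no hyperedge is empty.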
6.
Bioinformatics ; 36(Suppl_2): i804-i812, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381834

ABSTRACT

MOTIVATION: Molecular interactions have been successfully modeled and analyzed as networks, where nodes represent molecules and edges represent the interactions between them. These networks revealed that molecules with similar local network structure also have similar biological functions. The most sensitive measures of network structure are based on graphlets. However, graphlet-based methods thus far are only applicable to unweighted networks, whereas real-world molecular networks may have weighted edges that can represent the probability of an interaction occurring in the cell. This information is commonly discarded when applying thresholds to generate unweighted networks, which may lead to information loss. RESULTS: We introduce probabilistic graphlets as a tool for analyzing the local wiring patterns of probabilistic networks. To assess their performance compared to unweighted graphlets, we generate synthetic networks based on different well-known random network models and edge probability distributions and demonstrate that probabilistic graphlets outperform their unweighted counterparts in distinguishing network structures. Then we model different real-world molecular interaction networks as weighted graphs with probabilities as weights on edges and we analyze them with our new weighted graphlets-based methods. We show that due to their probabilistic nature, probabilistic graphlet-based methods more robustly capture biological information in these data, while simultaneously showing a higher sensitivity to identify condition-specific functions compared to their unweighted graphlet-based method counterparts. AVAILABILITY AND IMPLEMENTATION: Our implementation of probabilistic graphlets is available at https://github.com/Serdobe/Probabilistic_Graphlets. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Probability
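To make the idea of counting graphlets on a probabilistic network concrete, the sketch below computes the expected number of triangles touching a node when each edge carries an existence probability. It assumes edges are independent, so a triangle's probability is the product of its three edge probabilities; the exact definition used in the paper may differ, and the function name and toy network are hypothetical.

```python
from itertools import combinations

def expected_triangles(node, prob):
    """Expected number of triangles touching `node`, assuming independent edges.

    `prob` maps frozenset({u, v}) to the probability that edge u-v exists.
    """
    neighbours = sorted({v for edge in prob if node in edge for v in edge if v != node})
    total = 0.0
    for u, w in combinations(neighbours, 2):
        # A missing edge between the two neighbours has probability 0.
        p_uw = prob.get(frozenset((u, w)), 0.0)
        total += prob[frozenset((node, u))] * prob[frozenset((node, w))] * p_uw
    return total

# Hypothetical toy network with one uncertain triangle a-b-c.
toy_prob = {
    frozenset(("a", "b")): 0.5,
    frozenset(("b", "c")): 0.5,
    frozenset(("a", "c")): 1.0,
    frozenset(("a", "d")): 0.9,
}
triangles_at_a = expected_triangles("a", toy_prob)
```

When every probability equals 1 the result reduces to the ordinary triangle count, which is the information that thresholding a weighted network would keep while discarding the uncertainty.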
7.
Bioinformatics ; 35(19): 3727-3734, 2019 10 01.
Article in English | MEDLINE | ID: mdl-30821317

ABSTRACT

MOTIVATION: Protein-protein interactions (PPIs) are usually modeled as networks. These networks have extensively been studied using graphlets, small induced subgraphs capturing the local wiring patterns around nodes in networks. They revealed that proteins involved in similar functions tend to be similarly wired. However, such simple models can only represent pairwise relationships and cannot fully capture the higher-order organization of protein interactomes, including protein complexes. RESULTS: To model the multi-scale organization of these complex biological systems, we utilize simplicial complexes from computational geometry. The question is how to mine these new representations of protein interactomes to reveal additional biological information. To address this, we define simplets, a generalization of graphlets to simplicial complexes. By using simplets, we define a sensitive measure of similarity between simplicial complex representations that allows for clustering them according to their data types better than clustering them by using other state-of-the-art measures, e.g. spectral distance, or facet distribution distance. We model human and baker's yeast protein interactomes as simplicial complexes that capture PPIs and protein complexes as simplices. On these models, we show that our newly introduced simplet-based methods cluster proteins by function better than the clustering methods that use the standard PPI networks, uncovering the new underlying functional organization of the cell. We demonstrate the existence of the functional geometry in the protein interactome data and the superiority of our simplet-based methods to effectively mine for new biological information hidden in the complexity of the higher-order organization of protein interactomes. AVAILABILITY AND IMPLEMENTATION: Codes and datasets are freely available at http://www0.cs.ucl.ac.uk/staff/natasa/Simplets/. 
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Protein Interaction Mapping , Cluster Analysis , Humans , Proteins , Saccharomyces cerevisiae
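The abstract above models protein complexes as simplices. As an illustrative sketch, assuming each complex is given as a set of protein names (a facet), the simplicial complex is obtained by closing the facets under taking non-empty subsets. The function name and the toy facets are hypothetical, and this does not reproduce the simplet counting introduced in the paper.

```python
from itertools import combinations

def simplicial_closure(facets):
    """Return every non-empty face of the given facets as a set of frozensets."""
    faces = set()
    for facet in facets:
        members = sorted(facet)
        for size in range(1, len(members) + 1):
            faces.update(frozenset(c) for c in combinations(members, size))
    return faces

# Hypothetical toy data: one three-protein complex and one pairwise interaction.
toy_facets = [{"A", "B", "C"}, {"C", "D"}]
toy_faces = simplicial_closure(toy_facets)
```

Here the complex {A, B, C} contributes a filled triangle in addition to its three edges, which is exactly the higher-order information a plain PPI network cannot distinguish from three unrelated pairwise interactions.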
8.
Bioinformatics ; 35(24): 5226-5234, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31192358

ABSTRACT

MOTIVATION: Laplacian matrices capture the global structure of networks and are widely used to study biological networks. However, the local structure of the network around a node can also capture biological information. Local wiring patterns are typically quantified by counting how often a node touches different graphlets (small, connected, induced sub-graphs). Currently available graphlet-based methods do not consider whether nodes are in the same network neighbourhood. To combine graphlet-based topological information and membership of nodes to the same network neighbourhood, we generalize the Laplacian to the Graphlet Laplacian, by considering a pair of nodes to be 'adjacent' if they simultaneously touch a given graphlet. RESULTS: We utilize Graphlet Laplacians to generalize spectral embedding, spectral clustering and network diffusion. Applying Graphlet Laplacian-based spectral embedding, we visually demonstrate that Graphlet Laplacians capture biological functions. This result is quantified by applying Graphlet Laplacian-based spectral clustering, which uncovers clusters enriched in biological functions dependent on the underlying graphlet. We explain the complementarity of biological functions captured by different Graphlet Laplacians by showing that they capture different local topologies. Finally, diffusing pan-cancer gene mutation scores based on different Graphlet Laplacians, we find complementary sets of cancer-related genes. Hence, we demonstrate that Graphlet Laplacians capture topology-function and topology-disease relationships in biological networks. AVAILABILITY AND IMPLEMENTATION: http://www0.cs.ucl.ac.uk/staff/natasa/graphlet-laplacian/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Cluster Analysis , Protein Interaction Mapping
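The abstract above treats two nodes as 'adjacent' when they simultaneously touch a given graphlet. The following sketch illustrates that construction for a single graphlet, the triangle: the graphlet adjacency counts the triangles shared by each node pair, and the Laplacian is the degree matrix minus that adjacency. Any scaling or normalization applied in the paper is not reproduced, and the function name and toy graph are hypothetical.

```python
from itertools import combinations

def triangle_graphlet_laplacian(nodes, edges):
    """Laplacian D - A where A[u][v] counts triangles containing both u and v."""
    edge_set = {frozenset(e) for e in edges}
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    adjacency = [[0] * size for _ in range(size)]
    for a, b, c in combinations(nodes, 3):
        pairs = ((a, b), (b, c), (a, c))
        if all(frozenset(p) in edge_set for p in pairs):
            # Every pair of nodes in the triangle touches it simultaneously.
            for u, v in pairs:
                adjacency[index[u]][index[v]] += 1
                adjacency[index[v]][index[u]] += 1
    laplacian = [[-adjacency[i][j] for j in range(size)] for i in range(size)]
    for i in range(size):
        laplacian[i][i] = sum(adjacency[i])
    return laplacian

# Hypothetical toy graph: triangle a-b-c with a pendant node d attached to c.
toy_nodes = ["a", "b", "c", "d"]
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
toy_laplacian = triangle_graphlet_laplacian(toy_nodes, toy_edges)
```

Node d is connected in the ordinary graph but isolated under triangle adjacency, which shows how different graphlets can expose different local topologies of the same network.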
9.
Bioinformatics ; 34(17): i944-i953, 2018 09 01.
Article in English | MEDLINE | ID: mdl-30423061

ABSTRACT

Motivation: Molecular interactions have widely been modelled as networks. The local wiring patterns around molecules in molecular networks are linked with their biological functions. However, networks model only pairwise interactions between molecules and cannot explicitly and directly capture the higher-order molecular organization, such as protein complexes and pathways. Hence, we ask if hypergraphs (hypernetworks), which directly capture entire complexes and pathways along with protein-protein interactions (PPIs), carry additional functional information beyond what can be uncovered from networks of pairwise molecular interactions. The mathematical formalism of a hypergraph has long been known, but not often used in studying molecular networks due to the lack of sophisticated algorithms for mining the underlying biological information hidden in the wiring patterns of molecular systems modelled as hypernetworks. Results: We propose a new, multi-scale, protein interaction hypernetwork model that utilizes hypergraphs to capture different scales of protein organization, including PPIs, protein complexes and pathways. In analogy to graphlets, we introduce hypergraphlets, small, connected, non-isomorphic, induced sub-hypergraphs of a hypergraph, to quantify the local wiring patterns of these multi-scale molecular hypergraphs and to mine them for new biological information. We apply them to model the multi-scale protein networks of baker's yeast and human and show that the higher-order molecular organization captured by these hypergraphs is strongly related to the underlying biology. Importantly, we demonstrate that our new models and data mining tools reveal different, but complementary biological information compared with classical PPI networks. We apply our hypergraphlets to successfully predict biological functions of uncharacterized proteins. Availability and implementation: Code and data are available online at http://www0.cs.ucl.ac.uk/staff/natasa/hypergraphlets.


Subject(s)
Protein Interaction Mapping/methods , Algorithms , Humans , Proteins/metabolism , Saccharomyces cerevisiae/metabolism
10.
Bioinformatics ; 32(8): 1195-203, 2016 04 15.
Article in English | MEDLINE | ID: mdl-26668003

ABSTRACT

MOTIVATION: Discovering patterns in networks of protein-protein interactions (PPIs) is a central problem in systems biology. Alignments between these networks aid functional understanding as they uncover important information, such as evolutionarily conserved pathways, protein complexes and functional orthologs. However, the complexity of the multiple network alignment problem grows exponentially with the number of networks being aligned and designing a multiple network aligner that is both scalable and that produces biologically relevant alignments is a challenging task that has not been fully addressed. The objective of multiple network alignment is to create clusters of nodes that are evolutionarily and functionally conserved across all networks. Unfortunately, the alignment methods proposed thus far do not meet this objective as they are guided by pairwise scores that do not utilize the entire functional and evolutionary information across all networks. RESULTS: To overcome this weakness, we propose Fuse, a new multiple network alignment algorithm that works in two steps. First, it computes our novel protein functional similarity scores by fusing information from wiring patterns of all aligned PPI networks and sequence similarities between their proteins. This is in contrast with the previous tools that are all based on protein similarities in pairs of networks being aligned. Our comprehensive new protein similarity scores are computed by the Non-negative Matrix Tri-Factorization (NMTF) method that predicts associations between proteins whose homology (from sequences) and functioning similarity (from wiring patterns) are supported by all networks. Using the five largest and most complete PPI networks from BioGRID, we show that NMTF predicts a large number of protein pairs that are biologically consistent. Second, to identify clusters of aligned proteins over all networks, Fuse uses our novel maximum weight k-partite matching approximation algorithm. We compare Fuse with the state-of-the-art multiple network aligners and show that (i) by using only sequence alignment scores, Fuse already outperforms other aligners and produces a larger number of biologically consistent clusters that cover all aligned PPI networks and (ii) using both sequence alignments and topological NMTF-predicted scores leads to the best multiple network alignments thus far. AVAILABILITY AND IMPLEMENTATION: Our dataset and software are freely available from the web site: http://bio-nets.doc.ic.ac.uk/Fuse/ CONTACT: natasha@imperial.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Protein Interaction Mapping , Algorithms , Proteins , Sequence Alignment , Software
11.
Proteomics ; 16(5): 741-58, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26677817

ABSTRACT

We provide an overview of recent developments in big data analyses in the context of precision medicine and health informatics. With the advance in technologies capturing molecular and medical data, we entered the era of "Big Data" in biology and medicine. These data offer many opportunities to advance precision medicine. We outline key challenges in precision medicine and present recent advances in data integration-based methods to uncover personalized information from big data produced by various omics studies. We survey recent integrative methods for disease subtyping, biomarker discovery, and drug repurposing, and list the tools that are available to domain scientists. Given the ever-growing nature of these big data, we highlight key issues that big data integration methods will face.


Subject(s)
Biomarkers/analysis , Computational Biology/methods , Drug Repositioning/methods , Medical Informatics/methods , Precision Medicine/methods , Biomedical Research , Epigenomics/methods , Humans , Metabolomics/methods , Proteomics/methods , Transcriptome/genetics
12.
Bioinformatics ; 31(13): 2182-9, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25725498

ABSTRACT

MOTIVATION: Discovering and understanding patterns in networks of protein-protein interactions (PPIs) is a central problem in systems biology. Alignments between these networks aid functional understanding as they uncover important information, such as evolutionarily conserved pathways, protein complexes and functional orthologs. A few methods have been proposed for global PPI network alignments, but because of the NP-completeness of the underlying sub-graph isomorphism problem, producing topologically and biologically accurate alignments remains a challenge. RESULTS: We introduce a novel global network alignment tool, Lagrangian GRAphlet-based ALigner (L-GRAAL), which directly optimizes both the protein and the interaction functional conservations, using a novel alignment search heuristic based on integer programming and Lagrangian relaxation. We compare L-GRAAL with the state-of-the-art network aligners on the largest available PPI networks from BioGRID and observe that L-GRAAL uncovers the largest common sub-graphs between the networks, as measured by edge-correctness and symmetric sub-structures scores, which allow transferring more functional information across networks. We assess the biological quality of the protein mappings using the semantic similarity of their Gene Ontology annotations and observe that L-GRAAL best uncovers functionally conserved proteins. Furthermore, we introduce for the first time a measure of the semantic similarity of the mapped interactions and show that L-GRAAL also best uncovers functionally conserved interactions. In addition, we illustrate on the PPI networks of baker's yeast and human the ability of L-GRAAL to predict new PPIs. Finally, L-GRAAL's results are the first to show that topological information is more important than sequence information for uncovering functionally conserved interactions. AVAILABILITY AND IMPLEMENTATION: L-GRAAL is coded in C++. Software is available at: http://bio-nets.doc.ic.ac.uk/L-GRAAL/.
CONTACT: n.malod-dognin@imperial.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Computational Biology/methods , Computer Graphics , Protein Interaction Mapping/methods , Proteins/metabolism , Software , Databases, Protein , Humans , Molecular Sequence Annotation , Saccharomyces cerevisiae/metabolism , Systems Biology
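The abstract above scores alignments by edge-correctness. A minimal sketch of the commonly used definition follows: the fraction of edges in the source network whose endpoints are mapped onto an edge of the target network. The function name, argument layout and toy networks are assumptions for illustration and are not taken from the L-GRAAL software.

```python
def edge_correctness(source_edges, target_edges, mapping):
    """Fraction of source edges that the node mapping sends onto target edges."""
    target = {frozenset(e) for e in target_edges}
    source = [tuple(e) for e in source_edges]
    conserved = sum(
        1
        for u, v in source
        if u in mapping
        and v in mapping
        and frozenset((mapping[u], mapping[v])) in target
    )
    return conserved / len(source)

# Hypothetical toy alignment of a triangle onto a three-node path.
toy_source = [(1, 2), (2, 3), (3, 1)]
toy_target = [("a", "b"), ("b", "c")]
toy_mapping = {1: "a", 2: "b", 3: "c"}
toy_ec = edge_correctness(toy_source, toy_target, toy_mapping)
```

Two of the three source edges land on target edges in this toy case. A score of 1 would mean the whole source network is embedded in the target as a common sub-graph.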
13.
Bioinformatics ; 31(10): 1632-9, 2015 May 15.
Article in English | MEDLINE | ID: mdl-25609797

ABSTRACT

MOTIVATION: Proteins underlie the functioning of a cell and the wiring of proteins in a protein-protein interaction network (PIN) relates to their biological functions. Proteins with similar wiring in the PIN (topology around them) have been shown to have similar functions. This property has been successfully exploited for predicting protein functions. Topological similarity is also used to guide network alignment algorithms that find similarly wired proteins between PINs of different species; these similarities are used to transfer annotation across PINs, e.g. from model organisms to human. To refine these functional predictions and annotation transfers, we need to gain insight into the variability of the topology-function relationships. For example, a function may be significantly associated with specific topologies, while another function may be weakly associated with several different topologies. Also, the topology-function relationships may differ between different species. RESULTS: To improve our understanding of topology-function relationships and of their conservation among species, we develop a statistical framework that is built upon canonical correlation analysis. Using the graphlet degrees to represent the wiring around proteins in PINs and gene ontology (GO) annotations to describe their functions, our framework: (i) characterizes statistically significant topology-function relationships in a given species, and (ii) uncovers the functions that have conserved topology in PINs of different species, which we term topologically orthologous functions. We apply our framework to PINs of yeast and human, identifying seven biological process and two cellular component GO terms to be topologically orthologous for the two organisms.


Subject(s)
Algorithms , Computational Biology/methods , Gene Regulatory Networks , Protein Interaction Mapping/methods , Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Conserved Sequence , Gene Ontology , Humans , Molecular Sequence Annotation , Saccharomyces cerevisiae/genetics
14.
Bioinformatics ; 30(9): 1259-65, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24443377

ABSTRACT

MOTIVATION: Protein structure alignment is key for transferring information from well-studied proteins to less studied ones. Structural alignment identifies the most precise mapping of equivalent residues, as structures are more conserved during evolution than sequences. Among the methods for aligning protein structures, maximum Contact Map Overlap (CMO) has received sustained attention during the past decade. Yet, known algorithms exhibit modest performance and are not applicable for large-scale comparison. RESULTS: Graphlets are small induced subgraphs that are used to design sensitive topological similarity measures between nodes and networks. By generalizing graphlets to ordered graphs, we introduce GR-Align, a CMO heuristic that is suited for database searches. On the Proteus_300 set (44 850 protein domain pairs), GR-Align is several orders of magnitude faster than the state-of-the-art CMO solvers Apurva, MSVNS and AlEigen7, and its similarity score is in better agreement with the structural classification of proteins. On a large-scale experiment on the Gold-standard benchmark dataset (3 207 270 protein domain pairs), GR-Align is several orders of magnitude faster than the state-of-the-art protein structure comparison tools TM-Align, DaliLite, MATT and Yakusa, while achieving similar classification performances. Finally, we illustrate the difference between GR-Align's flexible alignments and the traditional ones by querying a flexible protein in the Astral-40 database (11 154 protein domains). In this experiment, GR-Align's top scoring alignments are not only in better agreement with structural classification of proteins, but also allow transferring more information across proteins.


Subject(s)
Proteins/chemistry , Algorithms , Carbon/chemistry , Humans , Protein Conformation , Time Factors
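The Contact Map Overlap objective mentioned above can be sketched as follows: given the residue contacts of two proteins and an order-preserving alignment between residue indices, the overlap is the number of contacts in the first protein that map onto contacts in the second. This only evaluates a given alignment; it does not search for the best one, which is the hard part that GR-Align approximates. The function names and toy contact maps are hypothetical.

```python
def is_order_preserving(alignment):
    """Check that aligned residue indices increase together along both chains."""
    targets = [alignment[i] for i in sorted(alignment)]
    return all(a < b for a, b in zip(targets, targets[1:]))

def contact_map_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of protein A that the alignment maps onto contacts of B."""
    contact_set_b = {frozenset(c) for c in contacts_b}
    return sum(
        1
        for i, j in contacts_a
        if i in alignment
        and j in alignment
        and frozenset((alignment[i], alignment[j])) in contact_set_b
    )

# Hypothetical toy contact maps indexed by residue position.
toy_contacts_a = [(0, 2), (1, 3), (2, 4)]
toy_contacts_b = [(0, 2), (1, 4)]
toy_alignment = {0: 0, 1: 1, 2: 2, 3: 4}
toy_overlap = contact_map_overlap(toy_contacts_a, toy_contacts_b, toy_alignment)
```

The third contact of protein A is not counted because residue 4 is left unaligned, so gaps in the alignment directly lower the score.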
15.
Nucleic Acids Res ; 40(Web Server issue): W303-9, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22553365

ABSTRACT

CSA is a web server for the computation, evaluation and comprehensive comparison of pairwise protein structure alignments. Its exact alignment engine computes either optimal, top-scoring alignments or heuristic alignments with quality guarantee for the inter-residue distance-based scorings of contact map overlap, PAUL, DALI and MATRAS. These and additional, uploaded alignments are compared using a number of quality measures and intuitive visualizations. CSA brings new insight into the structural relationship of the protein pairs under investigation and is a valuable tool for studying structural similarities. It is available at http://csa.project.cwi.nl.


Subject(s)
Software , Structural Homology, Protein , Algorithms , Calmodulin/chemistry , Internet
16.
Bioinform Adv ; 4(1): vbae075, 2024.
Article in English | MEDLINE | ID: mdl-38827411

ABSTRACT

Summary: Common approaches for deciphering biological networks involve network embedding algorithms. These approaches strictly focus on clustering the genes' embedding vectors and interpreting such clusters to reveal the hidden information of the networks. However, the difficulty in interpreting the genes' clusters and the limitations of the functional annotation resources hinder the identification of the cell's currently unknown functioning mechanisms. We propose a new approach that shifts this functional exploration from the embedding vectors of genes in space to the axes of the space itself. Our methodology better disentangles biological information from the embedding space than the classic gene-centric approach. Moreover, it uncovers new data-driven functional interactions that are unregistered in the functional ontologies, but biologically coherent. Furthermore, we exploit these interactions to define new higher-level annotations that we term Axes-Specific Functional Annotations and validate them through literature curation. Finally, we leverage our methodology to discover evolutionary connections between cellular functions and the evolution of species. Availability and implementation: Data and source code can be accessed at https://gitlab.bsc.es/sdoria/axes-of-biology.git.

17.
Sci Rep ; 14(1): 10983, 2024 05 14.
Article in English | MEDLINE | ID: mdl-38744869

ABSTRACT

Parkinson's disease (PD) is a complex neurodegenerative disorder without a cure. The onset of PD symptoms corresponds to 50% loss of midbrain dopaminergic (mDA) neurons, limiting early-stage understanding of PD. To shed light on early PD development, we study time series scRNA-seq datasets of mDA neurons obtained from patient-derived induced pluripotent stem cell differentiation. We develop a new data integration method based on Non-negative Matrix Tri-Factorization that integrates these datasets with molecular interaction networks, producing condition-specific "gene embeddings". By mining these embeddings, we predict 193 PD-related genes that are largely supported (49.7%) in the literature and are specific to the investigated PINK1 mutation. Enrichment analysis in Kyoto Encyclopedia of Genes and Genomes pathways highlights 10 PD-related molecular mechanisms perturbed during early PD development. Finally, investigating the top 20 prioritized genes reveals 12 previously unrecognized genes associated with PD that represent interesting drug targets.


Subject(s)
Dopaminergic Neurons , Parkinson Disease , Parkinson Disease/genetics , Parkinson Disease/pathology , Humans , Dopaminergic Neurons/metabolism , Dopaminergic Neurons/pathology , RNA-Seq/methods , Induced Pluripotent Stem Cells/metabolism , Mesencephalon/metabolism , Mesencephalon/pathology , Gene Regulatory Networks , Mutation , Cell Differentiation/genetics , Multiomics , Single-Cell Gene Expression Analysis
19.
PLoS One ; 18(4): e0284084, 2023.
Article in English | MEDLINE | ID: mdl-37098010

ABSTRACT

Antithrombin resistance is a rare subtype of hereditary thrombophilia caused by prothrombin gene variants, leading to thrombotic disorders. Recently, the Prothrombin Belgrade variant has been reported as a specific variant that leads to antithrombin resistance in two Serbian families with thrombosis. However, due to clinical data scarcity and the inapplicability of traditional genome-wide association studies (GWAS), a broader perspective on molecular and phenotypic mechanisms associated with the Prothrombin Belgrade variant is yet to be uncovered. Here, we propose an integrative framework to address the lack of genomic samples and support the genomic signal from the full genome sequences of five heterozygous subjects by integrating it with subjects' phenotypes and the genes' molecular interactions. Our goal is to identify candidate thrombophilia-related genes for which our subjects possess germline variants by focusing on the resulting gene clusters of our integrative framework. We applied a Non-negative Matrix Tri-Factorization-based method to simultaneously integrate different data sources, taking into account the observed phenotypes. In other words, our data-integration framework reveals gene clusters involved with this rare disease by fusing different datasets. Our results are in concordance with the current literature about antithrombin resistance. We also found candidate disease-related genes that need to be further investigated. CD320, RTEL1, UCP2, APOA5 and PROZ participate in healthy-specific or disease-specific subnetworks involving thrombophilia-annotated genes and are related to general thrombophilia mechanisms according to the literature. Moreover, the ADRA2A and TBXA2R subnetworks analysis suggested that their variants may have a protective effect due to their connection with decreased platelet activation. The results show that our method can give insights into antithrombin resistance even if a small amount of genetic data is available. 
Our framework is also customizable, meaning that it applies to any other rare disease.


Subject(s)
Thrombophilia , Thrombosis , Humans , Prothrombin , Genome-Wide Association Study , Rare Diseases , Mutation , Thrombophilia/genetics , Antithrombins , Anticoagulants , Antithrombin III , Phenotype
20.
Cancer Gene Ther ; 30(10): 1330-1345, 2023 10.
Article in English | MEDLINE | ID: mdl-37420093

ABSTRACT

Therapy Induced Senescence (TIS) leads to sustained growth arrest of cancer cells. The associated cytostasis has been shown to be reversible and cells escaping senescence further enhance the aggressiveness of cancers. Chemicals specifically targeting senescent cells, so-called senolytics, constitute a promising avenue for improved cancer treatment in combination with targeted therapies. Understanding how cancer cells evade senescence is needed to optimise the clinical benefits of this therapeutic approach. Here we characterised the response of three different NRAS mutant melanoma cell lines to a combination of CDK4/6 and MEK inhibitors over 33 days. Transcriptomic data show that all cell lines trigger a senescence programme coupled with strong induction of interferons. Kinome profiling revealed the activation of Receptor Tyrosine Kinases (RTKs) and enriched downstream signaling of neurotrophin, ErbB and insulin pathways. Characterisation of the miRNA interactome associates miR-211-5p with resistant phenotypes. Finally, iCell-based integration of bulk and single-cell RNA-seq data identifies biological processes perturbed during senescence and predicts 90 new genes involved in its escape. Overall, our data associate insulin signaling with persistence of a senescent phenotype and suggest a new role for interferon gamma in senescence escape through the induction of EMT and the activation of ERK5 signaling.


Subject(s)
Insulins , Melanoma , Humans , Multiomics , Cell Line, Tumor , Melanoma/drug therapy , Melanoma/genetics , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/therapeutic use , Insulins/therapeutic use , Cellular Senescence/genetics , Membrane Proteins/genetics , GTP Phosphohydrolases/genetics , GTP Phosphohydrolases/therapeutic use