1.
Nucleic Acids Res ; 52(W1): W481-W488, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38783119

ABSTRACT

In recent decades, the development of new drugs has become increasingly expensive and inefficient, and the molecular mechanisms of most pharmaceuticals remain poorly understood. In response, computational systems and network medicine tools have emerged to identify potential drug repurposing candidates. However, these tools often require complex installation and lack intuitive visual network mining capabilities. To tackle these challenges, we introduce Drugst.One, a platform that assists specialized computational medicine tools in becoming user-friendly, web-based utilities for drug repurposing. With just three lines of code, Drugst.One turns any systems biology software into an interactive web tool for modeling and analyzing complex protein-drug-disease networks. Demonstrating its broad adaptability, Drugst.One has been successfully integrated with 21 computational systems medicine tools. Available at https://drugst.one, Drugst.One has significant potential for streamlining the drug discovery process, allowing researchers to focus on essential aspects of pharmaceutical treatment research.


Subject(s)
Drug Repositioning , Software , Drug Repositioning/methods , Humans , Internet , Drug Discovery/methods , Systems Biology/methods , Computational Biology/methods
2.
Bioinformatics ; 39(5), 2023 May 04.
Article in English | MEDLINE | ID: mdl-37084262

ABSTRACT

MOTIVATION: Advances in omics technologies have revolutionized cancer research by producing massive datasets. A common approach to deciphering these complex data is to apply embedding algorithms to molecular interaction networks. These algorithms find a low-dimensional space in which similarities between the network nodes are best preserved. Currently available embedding approaches mine the gene embeddings directly to uncover new cancer-related knowledge. However, these gene-centric approaches produce incomplete knowledge, since they do not account for the functional implications of genomic alterations. We propose a new, function-centric perspective and approach to complement the knowledge obtained from omic data. RESULTS: We introduce our Functional Mapping Matrix (FMM) to explore the functional organization of different tissue-specific and species-specific embedding spaces generated by a Non-negative Matrix Tri-Factorization algorithm. We also use our FMM to define the optimal dimensionality of these molecular interaction network embedding spaces. For this optimal dimensionality, we compare the FMMs of the most prevalent cancers in humans to the FMMs of their corresponding control tissues. We find that cancer alters the positions in the embedding space of cancer-related functions, while it keeps the positions of the noncancer-related ones. We exploit this spatial 'movement' to predict novel cancer-related functions. Finally, we predict novel cancer-related genes that the currently available methods for gene-centric analyses cannot identify; we validate these predictions by literature curation and retrospective analyses of patient survival data. AVAILABILITY AND IMPLEMENTATION: Data and source code can be accessed at https://github.com/gaiac/FMM.


Subject(s)
Neoplasms , Humans , Retrospective Studies , Neoplasms/genetics , Software , Algorithms , Genomics/methods
3.
Bioinformatics ; 38(18): 4344-4351, 2022 Sep 15.
Article in English | MEDLINE | ID: mdl-35916710

ABSTRACT

MOTIVATION: Cancer is a genetic disease in which accumulated mutations of driver genes induce a functional reorganization of the cell by reprogramming cellular pathways. Current approaches identify cancer pathways as those most internally perturbed by gene expression changes. However, driver genes characteristically perform hub roles between pathways. Therefore, we hypothesize that cancer pathways should be identified by changes in their pathway-pathway relationships. RESULTS: To learn an embedding space that captures the relationships between pathways in a healthy cell, we propose pathway-driven non-negative matrix tri-factorization. In this space, we determine condition-specific (i.e. diseased and healthy) embeddings of pathways and genes. Based on these embeddings, we define our 'NMTF centrality' to measure a pathway's or gene's functional importance, and our 'moving distance' to measure the change in its functional relationships. We combine both measures to predict 15 genes and pathways involved in four major cancers, predicting 60 gene-cancer associations in total, covering 28 unique genes. To further exploit driver genes' tendency to perform hub roles, we model our network data using graphlet adjacency, which considers nodes adjacent if their interaction patterns form specific shapes (e.g. paths or triangles). We find that the predicted genes rewire pathway-pathway interactions in the immune system and provide literature evidence that many are druggable (15/28) and implicated in the associated cancers (47/60). We predict six druggable cancer-specific drug targets. AVAILABILITY AND IMPLEMENTATION: The code and data are available at: https://gitlab.bsc.es/swindels/pathway_driven_nmtf. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Neoplasms , Humans , Neoplasms/genetics , Algorithms , Mutation , Drug Delivery Systems
4.
Int J Mol Sci ; 24(2), 2023 Jan 11.
Article in English | MEDLINE | ID: mdl-36674947

ABSTRACT

The COVID-19 pandemic is an acute and rapidly evolving global health crisis. To better understand this disease's molecular basis and design therapeutic strategies, we built upon the recently proposed concept of an integrated cell, iCell, fusing three omics, tissue-specific human molecular interaction networks. We applied this methodology to construct infected and control iCells using gene expression data from patient samples and three cell lines. We found large differences between patient-based and cell line-based iCells (both infected and control), suggesting that cell lines are ill-suited to studying this disease. We compared patient-based infected and control iCells and uncovered genes whose functioning (wiring patterns in iCells) is altered by the disease. We validated in the literature that 18 out of the top 20 of the most rewired genes are indeed COVID-19-related. Since only three of these genes are targets of approved drugs, we applied another data fusion step to predict drugs for re-purposing. We confirmed with molecular docking that the predicted drugs can bind to their predicted targets. Our most interesting prediction is artenimol, an antimalarial agent targeting ZFP62, one of our newly identified COVID-19-related genes. This drug is a derivative of artemisinin drugs that are already under clinical investigation for their potential role in the treatment of COVID-19. Our results demonstrate further applicability of the iCell framework for integrative comparative studies of human diseases.


Subject(s)
COVID-19 , Humans , COVID-19/genetics , Molecular Docking Simulation , Pandemics , Drug Repositioning
5.
Bioinformatics ; 37(7): 1000-1007, 2021 May 17.
Article in English | MEDLINE | ID: mdl-32886115

ABSTRACT

MOTIVATION: Biological and cellular systems are often modeled as graphs in which vertices represent objects of interest (genes, proteins and drugs) and edges represent relational ties between these objects (binds-to, interacts-with and regulates). This approach has been highly successful owing to the theory, methodology and software that support analysis and learning on graphs. Graphs, however, suffer from information loss when modeling physical systems due to their inability to accurately represent multiobject relationships. Hypergraphs, a generalization of graphs, provide a framework to mitigate information loss and unify disparate graph-based methodologies. RESULTS: We present a hypergraph-based approach for modeling biological systems and formulate vertex classification, edge classification and link prediction problems on (hyper)graphs as instances of vertex classification on (extended, dual) hypergraphs. We then introduce a novel kernel method on vertex- and edge-labeled (colored) hypergraphs for analysis and learning. The method is based on exact and inexact (via hypergraph edit distances) enumeration of hypergraphlets; i.e. small hypergraphs rooted at a vertex of interest. We empirically evaluate this method on fifteen biological networks and show its potential use in a positive-unlabeled setting to estimate the interactome sizes in various species. AVAILABILITY AND IMPLEMENTATION: https://github.com/jlugomar/hypergraphlet-kernels. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
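
The reduction of edge classification to vertex classification on a dual hypergraph can be illustrated with a minimal sketch. It assumes a hypergraph stored as a mapping from hyperedge names to vertex sets; the function name `dual_hypergraph` and the toy data are hypothetical and are not taken from the authors' repository.

```python
from collections import defaultdict


def dual_hypergraph(hyperedges):
    """Swap the roles of vertices and hyperedges.

    hyperedges: dict mapping a hyperedge name to the set of vertices it joins.
    Returns a dict mapping each original vertex to the set of hyperedge names
    that contain it. In the dual, every original hyperedge is a vertex, so a
    label on a hyperedge becomes a label on a vertex.
    """
    dual = defaultdict(set)
    for edge_name, vertices in hyperedges.items():
        for vertex in vertices:
            dual[vertex].add(edge_name)
    return dict(dual)


# Toy data (illustrative only): two multi-protein relations and one pairwise tie.
relations = {
    "complex_1": {"A", "B", "C"},
    "complex_2": {"B", "C"},
    "binds": {"C", "D"},
}
dual = dual_hypergraph(relations)
print(sorted(dual["C"]))
```

Here a hyperedge such as `complex_1` can join three proteins at once, which an ordinary graph edge cannot express without splitting it into pairs.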


Subject(s)
Proteins , Software
6.
Bioinformatics ; 36(Suppl_2): i804-i812, 2020 Dec 30.
Article in English | MEDLINE | ID: mdl-33381834

ABSTRACT

MOTIVATION: Molecular interactions have been successfully modeled and analyzed as networks, where nodes represent molecules and edges represent the interactions between them. These networks revealed that molecules with similar local network structure also have similar biological functions. The most sensitive measures of network structure are based on graphlets. However, graphlet-based methods thus far are only applicable to unweighted networks, whereas real-world molecular networks may have weighted edges that can represent the probability of an interaction occurring in the cell. This information is commonly discarded when applying thresholds to generate unweighted networks, which may lead to information loss. RESULTS: We introduce probabilistic graphlets as a tool for analyzing the local wiring patterns of probabilistic networks. To assess their performance compared to unweighted graphlets, we generate synthetic networks based on different well-known random network models and edge probability distributions and demonstrate that probabilistic graphlets outperform their unweighted counterparts in distinguishing network structures. Then we model different real-world molecular interaction networks as weighted graphs with probabilities as weights on edges and we analyze them with our new weighted graphlet-based methods. We show that due to their probabilistic nature, probabilistic graphlet-based methods more robustly capture biological information in these data, while simultaneously showing a higher sensitivity to identify condition-specific functions compared to their unweighted graphlet-based method counterparts. AVAILABILITY AND IMPLEMENTATION: Our implementation of probabilistic graphlets is available at https://github.com/Serdobe/Probabilistic_Graphlets. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
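
The information loss caused by thresholding can be seen on the simplest graphlet, the triangle. The sketch below is an illustration under a stated assumption, namely that edges occur independently, so a triangle exists with probability equal to the product of its three edge probabilities. The function name, the toy probabilities and the 0.6 threshold are hypothetical, and the paper's own definitions may differ.

```python
from itertools import combinations


def expected_triangles(edge_prob, node):
    """Expected number of triangles touching `node`.

    edge_prob maps frozenset({u, v}) to the probability that edge u-v exists.
    Assumption: edges are independent, so a triangle's probability is the
    product of its three edge probabilities.
    """
    def p(u, v):
        return edge_prob.get(frozenset((u, v)), 0.0)

    neighbours = sorted({w for e in edge_prob if node in e for w in e if w != node})
    return sum(p(node, u) * p(node, v) * p(u, v)
               for u, v in combinations(neighbours, 2))


# Toy probabilistic network (illustrative values only).
edge_prob = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.5,
    frozenset(("b", "c")): 0.8,
    frozenset(("c", "d")): 0.4,
}

# Thresholding at 0.6 keeps only a-b and b-c, so the triangle disappears.
hard = {e: 1.0 for e, pr in edge_prob.items() if pr >= 0.6}

print(expected_triangles(edge_prob, "a"))  # probabilistic count
print(expected_triangles(hard, "a"))       # count after thresholding
```

In this toy case the probabilistic count for node `a` is positive while the thresholded network reports no triangle at all.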


Subject(s)
Algorithms , Probability
7.
Bioinformatics ; 35(19): 3727-3734, 2019 Oct 01.
Article in English | MEDLINE | ID: mdl-30821317

ABSTRACT

MOTIVATION: Protein-protein interactions (PPIs) are usually modeled as networks. These networks have extensively been studied using graphlets, small induced subgraphs capturing the local wiring patterns around nodes in networks. They revealed that proteins involved in similar functions tend to be similarly wired. However, such simple models can only represent pairwise relationships and cannot fully capture the higher-order organization of protein interactomes, including protein complexes. RESULTS: To model the multi-scale organization of these complex biological systems, we utilize simplicial complexes from computational geometry. The question is how to mine these new representations of protein interactomes to reveal additional biological information. To address this, we define simplets, a generalization of graphlets to simplicial complexes. By using simplets, we define a sensitive measure of similarity between simplicial complex representations that allows for clustering them according to their data types better than clustering them by using other state-of-the-art measures, e.g. spectral distance, or facet distribution distance. We model human and baker's yeast protein interactomes as simplicial complexes that capture PPIs and protein complexes as simplices. On these models, we show that our newly introduced simplet-based methods cluster proteins by function better than the clustering methods that use the standard PPI networks, uncovering the new underlying functional organization of the cell. We demonstrate the existence of the functional geometry in the protein interactome data and the superiority of our simplet-based methods to effectively mine for new biological information hidden in the complexity of the higher-order organization of protein interactomes. AVAILABILITY AND IMPLEMENTATION: Codes and datasets are freely available at http://www0.cs.ucl.ac.uk/staff/natasa/Simplets/. 
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
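
A simplicial complex can be built from its maximal simplices (facets) by adding every non-empty subset of each facet. The following minimal sketch shows that closure step only; it does not implement simplets, and the function name and the toy protein labels are hypothetical.

```python
from itertools import combinations


def simplicial_closure(facets):
    """Return every simplex implied by the given facets.

    Each facet is a set of vertices, for example the members of a protein
    complex or the two ends of a PPI. A simplicial complex must contain all
    non-empty subsets (faces) of each of its simplices.
    """
    simplices = set()
    for facet in facets:
        members = sorted(facet)
        for size in range(1, len(members) + 1):
            simplices.update(frozenset(c) for c in combinations(members, size))
    return simplices


# Toy example: one three-protein complex and one pairwise interaction.
facets = [{"P1", "P2", "P3"}, {"P3", "P4"}]
complex_ = simplicial_closure(facets)

# Count simplices by dimension (dimension = number of vertices - 1).
by_dim = {}
for simplex in complex_:
    by_dim[len(simplex) - 1] = by_dim.get(len(simplex) - 1, 0) + 1
print(sorted(by_dim.items()))
```

The three-protein complex contributes a filled triangle (a 2-simplex) together with its edges and vertices, which is what distinguishes it from three unrelated pairwise interactions.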


Subject(s)
Algorithms , Protein Interaction Mapping , Cluster Analysis , Humans , Proteins , Saccharomyces cerevisiae
8.
Bioinformatics ; 35(24): 5226-5234, 2019 Dec 15.
Article in English | MEDLINE | ID: mdl-31192358

ABSTRACT

MOTIVATION: Laplacian matrices capture the global structure of networks and are widely used to study biological networks. However, the local structure of the network around a node can also capture biological information. Local wiring patterns are typically quantified by counting how often a node touches different graphlets (small, connected, induced sub-graphs). Currently available graphlet-based methods do not consider whether nodes are in the same network neighbourhood. To combine graphlet-based topological information and membership of nodes to the same network neighbourhood, we generalize the Laplacian to the Graphlet Laplacian, by considering a pair of nodes to be 'adjacent' if they simultaneously touch a given graphlet. RESULTS: We utilize Graphlet Laplacians to generalize spectral embedding, spectral clustering and network diffusion. Applying Graphlet Laplacian-based spectral embedding, we visually demonstrate that Graphlet Laplacians capture biological functions. This result is quantified by applying Graphlet Laplacian-based spectral clustering, which uncovers clusters enriched in biological functions dependent on the underlying graphlet. We explain the complementarity of biological functions captured by different Graphlet Laplacians by showing that they capture different local topologies. Finally, diffusing pan-cancer gene mutation scores based on different Graphlet Laplacians, we find complementary sets of cancer-related genes. Hence, we demonstrate that Graphlet Laplacians capture topology-function and topology-disease relationships in biological networks. AVAILABILITY AND IMPLEMENTATION: http://www0.cs.ucl.ac.uk/staff/natasa/graphlet-laplacian/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
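
The idea of treating two nodes as 'adjacent' when they simultaneously touch a given graphlet can be sketched for the triangle. The code below is an illustration under assumptions: the graphlet adjacency entry for a pair of nodes is taken to be the number of triangles containing both, and the Laplacian is formed as degree matrix minus adjacency matrix without any normalisation, which may differ from the paper's definition. All names are hypothetical.

```python
def triangle_laplacian(nodes, edges):
    """Unnormalised Laplacian of the triangle-based graphlet adjacency.

    nodes: list of node labels (fixes the row and column order).
    edges: list of undirected edges, each listed once, with no self-loops.
    Two nodes are 'adjacent' once for every triangle that contains both,
    which equals their number of common neighbours when they share an edge.
    """
    index = {n: i for i, n in enumerate(nodes)}
    nbrs = {n: set() for n in nodes}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)

    size = len(nodes)
    lap = [[0] * size for _ in range(size)]
    for u, v in edges:
        shared = len(nbrs[u] & nbrs[v])  # triangles through edge u-v
        i, j = index[u], index[v]
        lap[i][j] -= shared              # off-diagonal: minus adjacency
        lap[j][i] -= shared
        lap[i][i] += shared              # diagonal: graphlet-based degree
        lap[j][j] += shared
    return lap


nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
lap = triangle_laplacian(nodes, edges)
for row in lap:
    print(row)
```

Node `d` has an ordinary edge to `c` but takes part in no triangle, so its row is all zeros. This shows how a graphlet-specific Laplacian can ignore connections that the standard Laplacian would count.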


Subject(s)
Algorithms , Cluster Analysis , Protein Interaction Mapping
9.
Bioinformatics ; 34(17): i944-i953, 2018 Sep 01.
Article in English | MEDLINE | ID: mdl-30423061

ABSTRACT

Motivation: Molecular interactions have widely been modelled as networks. The local wiring patterns around molecules in molecular networks are linked with their biological functions. However, networks model only pairwise interactions between molecules and cannot explicitly and directly capture the higher-order molecular organization, such as protein complexes and pathways. Hence, we ask if hypergraphs (hypernetworks), which directly capture entire complexes and pathways along with protein-protein interactions (PPIs), carry additional functional information beyond what can be uncovered from networks of pairwise molecular interactions. The mathematical formalism of a hypergraph has long been known, but not often used in studying molecular networks due to the lack of sophisticated algorithms for mining the underlying biological information hidden in the wiring patterns of molecular systems modelled as hypernetworks. Results: We propose a new, multi-scale, protein interaction hypernetwork model that utilizes hypergraphs to capture different scales of protein organization, including PPIs, protein complexes and pathways. In analogy to graphlets, we introduce hypergraphlets, small, connected, non-isomorphic, induced sub-hypergraphs of a hypergraph, to quantify the local wiring patterns of these multi-scale molecular hypergraphs and to mine them for new biological information. We apply them to model the multi-scale protein networks of baker's yeast and human and show that the higher-order molecular organization captured by these hypergraphs is strongly related to the underlying biology. Importantly, we demonstrate that our new models and data mining tools reveal different, but complementary biological information compared with classical PPI networks. We apply our hypergraphlets to successfully predict biological functions of uncharacterized proteins. Availability and implementation: Code and data are available online at http://www0.cs.ucl.ac.uk/staff/natasa/hypergraphlets.


Subject(s)
Protein Interaction Mapping/methods , Algorithms , Humans , Proteins/metabolism , Saccharomyces cerevisiae/metabolism
10.
Mol Syst Biol ; 13(3): 918, 2017 Mar 15.
Article in English | MEDLINE | ID: mdl-28298427

ABSTRACT

G-protein-coupled receptors (GPCRs) are the largest family of integral membrane receptors with key roles in regulating signaling pathways targeted by therapeutics, but are difficult to study using existing proteomics technologies due to their complex biochemical features. To obtain a global view of GPCR-mediated signaling and to identify novel components of their pathways, we used a modified membrane yeast two-hybrid (MYTH) approach and identified interacting partners for 48 selected full-length human ligand-unoccupied GPCRs in their native membrane environment. The resulting GPCR interactome connects 686 proteins by 987 unique interactions, including 299 membrane proteins involved in a diverse range of cellular functions. To demonstrate the biological relevance of the GPCR interactome, we validated novel interactions of the GPR37, serotonin 5-HT4d, and adenosine ADORA2A receptors. Our data represent the first large-scale interactome mapping for human GPCRs and provide a valuable resource for the analysis of signaling pathways involving this druggable family of integral membrane proteins.


Subject(s)
Protein Interaction Mapping/methods , Protein Interaction Maps , G-Protein-Coupled Receptors/metabolism , Cell Membrane/metabolism , Humans , Adenosine A2A Receptor/metabolism , Serotonin 5-HT4 Receptors/metabolism , Signal Transduction , Two-Hybrid System Techniques
11.
Bioinformatics ; 32(8): 1195-203, 2016 Apr 15.
Article in English | MEDLINE | ID: mdl-26668003

ABSTRACT

MOTIVATION: Discovering patterns in networks of protein-protein interactions (PPIs) is a central problem in systems biology. Alignments between these networks aid functional understanding as they uncover important information, such as evolutionarily conserved pathways, protein complexes and functional orthologs. However, the complexity of the multiple network alignment problem grows exponentially with the number of networks being aligned, and designing a multiple network aligner that is both scalable and that produces biologically relevant alignments is a challenging task that has not been fully addressed. The objective of multiple network alignment is to create clusters of nodes that are evolutionarily and functionally conserved across all networks. Unfortunately, the alignment methods proposed thus far do not meet this objective as they are guided by pairwise scores that do not utilize the entire functional and evolutionary information across all networks. RESULTS: To overcome this weakness, we propose Fuse, a new multiple network alignment algorithm that works in two steps. First, it computes our novel protein functional similarity scores by fusing information from wiring patterns of all aligned PPI networks and sequence similarities between their proteins. This is in contrast with the previous tools that are all based on protein similarities in pairs of networks being aligned. Our comprehensive new protein similarity scores are computed by a Non-negative Matrix Tri-Factorization (NMTF) method that predicts associations between proteins whose homology (from sequences) and functioning similarity (from wiring patterns) are supported by all networks. Using the five largest and most complete PPI networks from BioGRID, we show that NMTF predicts a large number of protein pairs that are biologically consistent. Second, to identify clusters of aligned proteins over all networks, Fuse uses our novel maximum weight k-partite matching approximation algorithm. We compare Fuse with the state-of-the-art multiple network aligners and show that (i) by using only sequence alignment scores, Fuse already outperforms other aligners and produces a larger number of biologically consistent clusters that cover all aligned PPI networks and (ii) using both sequence alignments and topological NMTF-predicted scores leads to the best multiple network alignments thus far. AVAILABILITY AND IMPLEMENTATION: Our dataset and software are freely available from the web site: http://bio-nets.doc.ic.ac.uk/Fuse/ CONTACT: natasha@imperial.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Protein Interaction Mapping , Algorithms , Proteins , Sequence Alignment , Software
12.
Proteomics ; 16(5): 741-58, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26677817

ABSTRACT

We provide an overview of recent developments in big data analyses in the context of precision medicine and health informatics. With the advance in technologies capturing molecular and medical data, we have entered the era of "Big Data" in biology and medicine. These data offer many opportunities to advance precision medicine. We outline key challenges in precision medicine and present recent advances in data integration-based methods to uncover personalized information from big data produced by various omics studies. We survey recent integrative methods for disease subtyping, biomarker discovery, and drug repurposing, and list the tools that are available to domain scientists. Given the ever-growing nature of these big data, we highlight key issues that big data integration methods will face.


Subject(s)
Biomarkers/analysis , Computational Biology/methods , Drug Repositioning/methods , Medical Informatics/methods , Precision Medicine/methods , Biomedical Research , Epigenomics/methods , Humans , Metabolomics/methods , Proteomics/methods , Transcriptome/genetics
13.
Bioinformatics ; 31(13): 2182-9, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25725498

ABSTRACT

MOTIVATION: Discovering and understanding patterns in networks of protein-protein interactions (PPIs) is a central problem in systems biology. Alignments between these networks aid functional understanding as they uncover important information, such as evolutionarily conserved pathways, protein complexes and functional orthologs. A few methods have been proposed for global PPI network alignments, but because of the NP-completeness of the underlying sub-graph isomorphism problem, producing topologically and biologically accurate alignments remains a challenge. RESULTS: We introduce a novel global network alignment tool, Lagrangian GRAphlet-based ALigner (L-GRAAL), which directly optimizes both the protein and the interaction functional conservations, using a novel alignment search heuristic based on integer programming and Lagrangian relaxation. We compare L-GRAAL with the state-of-the-art network aligners on the largest available PPI networks from BioGRID and observe that L-GRAAL uncovers the largest common sub-graphs between the networks, as measured by edge-correctness and symmetric sub-structure scores, which allow transferring more functional information across networks. We assess the biological quality of the protein mappings using the semantic similarity of their Gene Ontology annotations and observe that L-GRAAL best uncovers functionally conserved proteins. Furthermore, we introduce for the first time a measure of the semantic similarity of the mapped interactions and show that L-GRAAL also best uncovers functionally conserved interactions. In addition, we illustrate on the PPI networks of baker's yeast and human the ability of L-GRAAL to predict new PPIs. Finally, L-GRAAL's results are the first to show that topological information is more important than sequence information for uncovering functionally conserved interactions. AVAILABILITY AND IMPLEMENTATION: L-GRAAL is coded in C++. Software is available at: http://bio-nets.doc.ic.ac.uk/L-GRAAL/. 
CONTACT: n.malod-dognin@imperial.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
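
Edge correctness, one of the topological scores mentioned above, is commonly defined as the fraction of edges in the first (smaller) network that an alignment maps onto edges of the second network. The sketch below uses that common definition; the function name and toy networks are hypothetical, and this is not L-GRAAL's code.

```python
def edge_correctness(edges_a, edges_b, mapping):
    """Fraction of edges of network A that the alignment maps onto edges of B.

    edges_a, edges_b: lists of undirected edges given as (u, v) tuples.
    mapping: dict sending every node of A to a node of B (the alignment).
    """
    target = {frozenset(edge) for edge in edges_b}
    conserved = sum(
        1 for u, v in edges_a
        if frozenset((mapping[u], mapping[v])) in target
    )
    return conserved / len(edges_a)


# Toy example: a triangle in A aligned onto a path in B.
edges_a = [("a1", "a2"), ("a2", "a3"), ("a1", "a3")]
edges_b = [("b1", "b2"), ("b2", "b3"), ("b3", "b4")]
mapping = {"a1": "b1", "a2": "b2", "a3": "b3"}

ec = edge_correctness(edges_a, edges_b, mapping)
print(round(ec, 3))
```

Two of the three edges of A land on edges of B, so the alignment conserves two thirds of A's interactions.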


Subject(s)
Algorithms , Computational Biology/methods , Computer Graphics , Protein Interaction Mapping/methods , Proteins/metabolism , Software , Protein Databases , Humans , Molecular Sequence Annotation , Saccharomyces cerevisiae/metabolism , Systems Biology
14.
Bioinformatics ; 31(16): 2697-704, 2015 Aug 15.
Article in English | MEDLINE | ID: mdl-25810431

ABSTRACT

MOTIVATION: Network comparison is a computationally intractable problem with important applications in systems biology and other domains. A key challenge is to properly quantify similarity between wiring patterns of two networks in an alignment-free fashion. Alignment-based methods also exist, but they aim to identify an actual node mapping between networks and as such serve a different purpose. Various alignment-free methods that use different global network properties (e.g. degree distribution) have been proposed. Methods based on small local subgraphs called graphlets perform the best in the alignment-free network comparison task, due to the high level of topological detail that graphlets can capture. Among different graphlet-based methods, Graphlet Correlation Distance (GCD) was shown to be the most accurate for comparing networks. Recently, a new graphlet-based method called NetDis was proposed, which was claimed to be superior. We argue against this, as the performance of NetDis was not properly evaluated to position it correctly among the other alignment-free methods. RESULTS: We evaluate the performance of available alignment-free network comparison methods, including GCD and NetDis. We do this by measuring the accuracy of each method (in a systematic precision-recall framework) in terms of how well the method can group (cluster) topologically similar networks. By testing this on both synthetic and real-world networks from different domains, we show that GCD remains the most accurate, noise-tolerant and computationally efficient alignment-free method. That is, we show that NetDis does not outperform the other methods, as originally claimed, while it is also computationally more expensive. Furthermore, since NetDis is dependent on the choice of a network null model (unlike the other graphlet-based methods), we show that its performance is highly sensitive to the choice of this parameter. Finally, we find that its performance is not independent of network sizes and densities, as originally claimed. CONTACT: natasha@imperial.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Regulatory Networks , Protein Interaction Mapping/methods , Sequence Alignment/methods , Algorithms , Cluster Analysis , Humans , Phylogeny , Reference Standards
15.
Bioinformatics ; 31(10): 1632-9, 2015 May 15.
Article in English | MEDLINE | ID: mdl-25609797

ABSTRACT

MOTIVATION: Proteins underlie the functioning of a cell, and the wiring of proteins in a protein-protein interaction network (PIN) relates to their biological functions. Proteins with similar wiring in the PIN (topology around them) have been shown to have similar functions. This property has been successfully exploited for predicting protein functions. Topological similarity is also used to guide network alignment algorithms that find similarly wired proteins between PINs of different species; these similarities are used to transfer annotation across PINs, e.g. from model organisms to human. To refine these functional predictions and annotation transfers, we need to gain insight into the variability of the topology-function relationships. For example, a function may be significantly associated with specific topologies, while another function may be weakly associated with several different topologies. Also, the topology-function relationships may differ between different species. RESULTS: To improve our understanding of topology-function relationships and of their conservation among species, we develop a statistical framework that is built upon canonical correlation analysis. Using the graphlet degrees to represent the wiring around proteins in PINs and gene ontology (GO) annotations to describe their functions, our framework: (i) characterizes statistically significant topology-function relationships in a given species, and (ii) uncovers the functions that have conserved topology in PINs of different species, which we term topologically orthologous functions. We apply our framework to PINs of yeast and human, identifying seven biological process and two cellular component GO terms to be topologically orthologous for the two organisms.


Subject(s)
Algorithms , Computational Biology/methods , Gene Regulatory Networks , Protein Interaction Mapping/methods , Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Conserved Sequence , Gene Ontology , Humans , Molecular Sequence Annotation , Saccharomyces cerevisiae/genetics
16.
Bioinformatics ; 30(17): i594-600, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-25161252

ABSTRACT

MOTIVATION: Recently, a shift was made from using Gene Ontology (GO) to evaluate molecular network data to using these data to construct and evaluate GO. Dutkowski et al. provide the first evidence that a large part of GO can be reconstructed solely from topologies of molecular networks. Motivated by this work, we develop a novel data integration framework that integrates multiple types of molecular network data to reconstruct and update GO. We ask how much of GO can be recovered by integrating various molecular interaction data. RESULTS: We introduce a computational framework for integration of various biological networks using penalized non-negative matrix tri-factorization (PNMTF). It takes all network data in a matrix form and performs simultaneous clustering of genes and GO terms, inducing new relations between genes and GO terms (annotations) and between GO terms themselves. To improve the accuracy of our predicted relations, we extend the integration methodology to include additional topological information represented as the similarity in wiring around non-interacting genes. Surprisingly, by integrating topologies of baker's yeast protein-protein interaction, genetic interaction (GI) and co-expression networks, our method reports as related 96% of GO terms that are directly related in GO. The inclusion of the wiring similarity of non-interacting genes contributes 6% to this large GO term association capture. Furthermore, we use our method to infer new relationships between GO terms solely from the topologies of these networks and validate 44% of our predictions in the literature. In addition, our integration method reproduces 48% of cellular component, 41% of molecular function and 41% of biological process GO terms, outperforming the previous method in the former two domains of GO. Finally, we predict new GO annotations of yeast genes and validate our predictions through GI profiling. 
AVAILABILITY AND IMPLEMENTATION: Supplementary Tables of new GO term associations and predicted gene annotations are available at http://bio-nets.doc.ic.ac.uk/GO-Reconstruction/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Ontology , Gene Regulatory Networks , Protein Interaction Mapping , Algorithms , Cluster Analysis , Computational Biology/methods , Gene Expression , Molecular Sequence Annotation , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
17.
Bioinformatics ; 30(9): 1259-65, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24443377

ABSTRACT

MOTIVATION: Protein structure alignment is key for transferring information from well-studied proteins to less studied ones. Structural alignment identifies the most precise mapping of equivalent residues, as structures are more conserved during evolution than sequences. Among the methods for aligning protein structures, maximum Contact Map Overlap (CMO) has received sustained attention during the past decade. Yet, known algorithms exhibit modest performance and are not applicable to large-scale comparison. RESULTS: Graphlets are small induced subgraphs that are used to design sensitive topological similarity measures between nodes and networks. By generalizing graphlets to ordered graphs, we introduce GR-Align, a CMO heuristic that is suited for database searches. On the Proteus_300 set (44 850 protein domain pairs), GR-Align is several orders of magnitude faster than the state-of-the-art CMO solvers Apurva, MSVNS and AlEigen7, and its similarity score is in better agreement with the structural classification of proteins. In a large-scale experiment on the Gold-standard benchmark dataset (3 207 270 protein domain pairs), GR-Align is several orders of magnitude faster than the state-of-the-art protein structure comparison tools TM-Align, DaliLite, MATT and Yakusa, while achieving similar classification performance. Finally, we illustrate the difference between GR-Align's flexible alignments and the traditional ones by querying a flexible protein in the Astral-40 database (11 154 protein domains). In this experiment, GR-Align's top scoring alignments are not only in better agreement with the structural classification of proteins, but also allow transferring more information across proteins.
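
The Contact Map Overlap objective mentioned in this abstract can be illustrated with a minimal sketch. It is not GR-Align and not any of the solvers named above; it only scores a given alignment by counting the contacts of one protein that are mapped onto contacts of the other. The function name, the representation of contacts as index pairs and the toy data are assumptions made for illustration.

```python
def contact_map_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of protein A that the alignment maps onto contacts of B.

    contacts_a, contacts_b: sets of (i, j) residue-index pairs with i < j
        (hypothetical representation of a contact map).
    alignment: dict mapping residue indices of A to residue indices of B.
    """
    overlap = 0
    for i, j in contacts_a:
        # A contact can only be preserved if both of its residues are aligned.
        if i in alignment and j in alignment:
            mapped = tuple(sorted((alignment[i], alignment[j])))
            if mapped in contacts_b:
                overlap += 1
    return overlap


# Toy data, invented for this example.
contacts_a = {(0, 2), (1, 3), (2, 4)}
contacts_b = {(0, 2), (1, 4), (3, 5)}
alignment = {0: 0, 1: 1, 2: 2, 3: 4, 4: 5}

score = contact_map_overlap(contacts_a, contacts_b, alignment)
```

Maximum CMO then asks for the order-preserving alignment that maximizes this score, which is the hard part that heuristics address; the sketch covers only the scoring step.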


Subject(s)
Proteins/chemistry , Algorithms , Carbon/chemistry , Humans , Protein Conformation , Time Factors
18.
BMC Bioinformatics ; 15: 304, 2014 Sep 17.
Article in English | MEDLINE | ID: mdl-25228247

ABSTRACT

BACKGROUND: Understanding the relationship between diseases based on the underlying biological mechanisms is one of the greatest challenges in modern biology and medicine. Exploring disease-disease associations by using system-level biological data is expected to improve our current knowledge of disease relationships, which may lead to further improvements in disease diagnosis, prognosis and treatment. RESULTS: We took advantage of diverse biological data including disease-gene associations and a large-scale molecular network to gain novel insights into disease relationships. We analysed and compared four publicly available disease-gene association datasets, then applied three disease similarity measures, namely an annotation-based, a function-based and a topology-based measure, to estimate the similarity scores between diseases. We systematically evaluated disease associations obtained by these measures against a statistical measure of comorbidity which was derived from a large number of medical patient records. Our results show that the correlation between our similarity measures and comorbidity scores is substantially higher than expected at random, confirming that our similarity measures are able to recover comorbidity associations. We also demonstrated that our predicted disease associations correlated significantly more strongly than expected at random with disease associations generated from genome-wide association studies. Furthermore, we evaluated our predicted disease associations via mining the literature on PubMed, and presented case studies to demonstrate how these novel disease associations can be used to enhance our current knowledge of disease relationships. CONCLUSIONS: We present three similarity measures for predicting disease associations. The strong correlation between our predictions and known disease associations demonstrates the ability of our measures to provide novel insights into disease relationships.
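
As a rough illustration of what a disease similarity measure based on disease-gene associations can look like, the sketch below scores each pair of diseases by the Jaccard index of their associated gene sets. The abstract does not define its measures, so the choice of the Jaccard index is an assumption, and the disease names, gene names and function names are hypothetical.

```python
from itertools import combinations


def jaccard_similarity(genes_a, genes_b):
    """Jaccard index of two gene sets: |intersection| / |union|."""
    union = genes_a | genes_b
    if not union:
        return 0.0  # two empty gene sets share nothing
    return len(genes_a & genes_b) / len(union)


def pairwise_similarity(disease_genes):
    """Score every unordered pair of diseases (keys in sorted order)."""
    names = sorted(disease_genes)
    return {
        (a, b): jaccard_similarity(disease_genes[a], disease_genes[b])
        for a, b in combinations(names, 2)
    }


# Hypothetical disease-gene associations, invented for this example.
disease_genes = {
    "disease_A": {"G1", "G2", "G3"},
    "disease_B": {"G2", "G3", "G4"},
    "disease_C": {"G5"},
}

scores = pairwise_similarity(disease_genes)
```

Scores of this kind could then be compared with an external reference such as comorbidity scores, which is the style of evaluation the abstract describes.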


Subject(s)
Disease/genetics , Genomics/methods , Gene Ontology , Genome-Wide Association Study , Humans , Molecular Sequence Annotation , PubMed
19.
Breast Cancer Res Treat ; 148(2): 455-62, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25248409

ABSTRACT

The goal of targeted cancer therapies is to specifically block oncogenic signalling, thus maximising efficacy while reducing side-effects for patients. The gamma-secretase (GS) complex is an attractive therapeutic target in haematological malignancies and solid tumours, with major pharmaceutical activity to identify optimal inhibitors. Within GS, nicastrin (NCSTN) offers an opportunity for therapeutic intervention using blocking monoclonal antibodies (mAbs). Here we explore the role of anti-nicastrin monoclonal antibodies, which we have developed as specific, multi-faceted inhibitors of proliferation and invasive traits of triple-negative breast cancer cells. We use 3D in vitro proliferation and invasion assays as well as orthotopic and tail vein injection in vivo xenograft model systems of triple-negative breast cancer. Nicastrin was assessed in patient samples by RNAScope. Anti-NCSTN mAb clone-2H6 demonstrated superior anti-tumour efficacy to clone-10C11 and the RO4929097 small molecule GS inhibitor, acting by inhibiting GS enzymatic activity and Notch signalling in vitro and in vivo. Confirming the clinical relevance of nicastrin as a target, we report evidence of increased NCSTN mRNA levels by RNA in situ hybridization (RNAScope) in a large cohort of oestrogen receptor negative breast cancers, conferring independent prognostic significance for disease-free survival in multivariate analysis. We demonstrate here that targeting NCSTN using specific mAbs may represent a novel mode of treatment for invasive triple-negative breast cancer, for which there are few targeted therapeutic options. Furthermore, we propose that measuring NCSTN in patient samples using RNAScope technology may serve as a companion diagnostic for anti-NCSTN therapy in the clinic.


Subject(s)
Amyloid Precursor Protein Secretases/antagonists & inhibitors , Antibodies, Monoclonal/pharmacology , Gene Expression Regulation, Neoplastic/drug effects , Membrane Glycoproteins/antagonists & inhibitors , Triple Negative Breast Neoplasms/drug therapy , Amyloid Precursor Protein Secretases/metabolism , Animals , Apoptosis/drug effects , Blotting, Western , Cell Movement/drug effects , Cell Proliferation/drug effects , Female , Flow Cytometry , Humans , Membrane Glycoproteins/metabolism , Mice , Mice, Inbred BALB C , Mice, Nude , Neoplasm Invasiveness , Triple Negative Breast Neoplasms/immunology , Triple Negative Breast Neoplasms/metabolism , Triple Negative Breast Neoplasms/pathology , Tumor Cells, Cultured , Xenograft Model Antitumor Assays
20.
Mol Genet Genomics ; 289(5): 727-34, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24728588

ABSTRACT

Systems biology aims at creating mathematical models, i.e., computational reconstructions of biological systems and processes that will result in a new level of understanding: the elucidation of the basic and presumably conserved "design" and "engineering" principles of biomolecular systems. Thus, systems biology will move biology from a phenomenological to a predictive science. Mathematical modeling of biological networks and processes has already greatly improved our understanding of many cellular processes. However, given the massive amount of qualitative and quantitative data currently produced and the number of burning questions in health care and biotechnology that need to be solved, the field is still in its early phases. It requires novel approaches for abstraction, for modeling bioprocesses that follow different biochemical and biophysical rules, and for combining different modules into larger models that still allow realistic simulation with the computational power available today. We have identified and discussed the currently most prominent problems in systems biology: (1) how to bridge different scales of modeling abstraction, (2) how to bridge the gap between topological and mechanistic modeling, and (3) how to bridge the wet and dry laboratory gap. The future success of systems biology largely depends on bridging the recognized gaps.


Subject(s)
Biomedical Research/standards , Systems Biology , Humans , Models, Biological , Reference Standards