ABSTRACT
Prior knowledge about DNA-binding transcription factors (dbTFs), transcription co-regulators (coTFs) and general transcription factors (GTFs) is crucial for the study and understanding of the regulation of transcription. This is reflected by the many publications and database resources describing knowledge about TFs. We previously launched the TFCheckpoint database, an integrated resource focused on human, mouse and rat dbTFs, providing users access to a comprehensive overview of these proteins. Here, we describe TFCheckpoint 2.0 (https://www.tfcheckpoint.org/index.php), comprising 13 collections of dbTFs, coTFs and GTFs. TFCheckpoint 2.0 provides an easy and versatile cross-referencing system for users to view and download collections that may otherwise be cumbersome to find, compare and retrieve.
Subject(s)
Databases, Genetic , Gene Expression Regulation , Transcription Factors , Animals , Humans , Mice , Rats , Internet , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
Knowledge about transcription factor binding and regulation, target genes, cis-regulatory modules and topologically associating domains is not only defined by functional associations like biological processes or diseases but also has a determinative genome location aspect. Here, we exploit these location and functional aspects together to develop new strategies to enable advanced data querying. Many databases have been developed to provide information about enhancers, but a schema that allows the standardized representation of data, securing interoperability between resources, has been lacking. In this work, we use knowledge graphs for the standardized representation of enhancers and topologically associating domains, together with data about their target genes, transcription factors, location on the human genome, and functional data about diseases and gene ontology annotations. We used this schema to integrate twenty-five enhancer datasets and two domain datasets, creating the most powerful integrative resource in this field to date. The knowledge graphs have been implemented using the Resource Description Framework and integrated within the open-access BioGateway knowledge network, generating a resource that contains an interoperable set of knowledge graphs (enhancers, TADs, genes, proteins, diseases, GO terms, and interactions between domains). We show how advanced queries, which combine functional and location restrictions, can be used to develop new hypotheses about functional aspects of gene expression regulation.
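A query that combines a location restriction with a functional restriction, as described above, can be sketched with a toy in-memory list of triples. This is only an illustration under stated assumptions: the identifiers, predicate names (`regulates`, `associated_with`, `chromosome`, `start`, `end`) and coordinates are invented, and the sketch does not reproduce the BioGateway schema or its RDF query interface.

```python
from collections import defaultdict

# Toy triples (subject, predicate, object). All identifiers, predicates and
# coordinates are hypothetical and do not come from BioGateway.
TRIPLES = [
    ("enh1", "chromosome", "chr1"), ("enh1", "start", 1000), ("enh1", "end", 1500),
    ("enh1", "regulates", "geneA"),
    ("enh2", "chromosome", "chr1"), ("enh2", "start", 9000), ("enh2", "end", 9400),
    ("enh2", "regulates", "geneB"),
    ("enh3", "chromosome", "chr2"), ("enh3", "start", 1200), ("enh3", "end", 1300),
    ("enh3", "regulates", "geneA"),
    ("geneA", "associated_with", "diseaseX"),
    ("geneB", "associated_with", "diseaseY"),
]


def build_index(triples):
    """Index triples as subject -> predicate -> list of objects (plain dicts)."""
    index = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        index[s][p].append(o)
    return {s: dict(props) for s, props in index.items()}


def enhancers_in_region_for_disease(index, chrom, region_start, region_end, disease):
    """Enhancers overlapping a region whose target gene is linked to a disease."""
    hits = []
    for subject, props in index.items():
        if "regulates" not in props:
            continue  # not an enhancer record
        # Location restriction: same chromosome and overlapping interval.
        if props["chromosome"][0] != chrom:
            continue
        if props["end"][0] < region_start or props["start"][0] > region_end:
            continue
        # Functional restriction: a target gene is associated with the disease.
        targets = props["regulates"]
        if any(disease in index.get(g, {}).get("associated_with", []) for g in targets):
            hits.append(subject)
    return sorted(hits)


index = build_index(TRIPLES)
result = enhancers_in_region_for_disease(index, "chr1", 0, 5000, "diseaseX")
print(result)
```

In a real triple store the same two restrictions would normally be expressed in a single graph query; the loop above only makes the logic of the combined filter explicit.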
Subject(s)
Databases, Genetic , Enhancer Elements, Genetic , Gene Expression Regulation , Humans , Transcription Factors/metabolism , Transcription Factors/genetics , Genome, Human , Gene Ontology
ABSTRACT
Causal molecular interactions represent key building blocks used in computational modeling, where they facilitate the assembly of regulatory networks. Logical regulatory networks can be used to predict biological and cellular behaviors by system perturbations and in silico simulations. Today, broad sets of causal interactions are available in a variety of biological knowledge resources. However, different visions, based on distinct biological interests, have led to the development of multiple ways to describe and annotate causal molecular interactions. It can therefore be challenging to efficiently explore various resources of causal interaction and maintain an overview of recorded contextual information that ensures valid use of the data. This review lists the different types of public resources with causal interactions, the different views on biological processes that they represent, the various data formats they use for data representation and storage, and the data exchange and conversion procedures that are available to extract and download these interactions. This may further raise awareness among the targeted audience, i.e. logical modelers and other scientists interested in molecular causal interactions, but also database managers and curators, about the abundance and variety of causal molecular interaction data, and the variety of tools and approaches to convert them into one interoperable resource.
Subject(s)
Computer Simulation , Databases, Factual , Models, Biological , Software
ABSTRACT
The fast accumulation of biological data calls for their integration, analysis and exploitation through more systematic approaches. The generation of novel, relevant hypotheses from this enormous quantity of data remains challenging. Logical models have long been used to answer a variety of questions regarding the dynamical behaviours of regulatory networks. As the number of published logical models increases, there is a pressing need for systematic model annotation, referencing and curation in community-supported and standardised formats. This article summarises the key topics and future directions of a meeting entitled 'Annotation and curation of computational models in biology', organised as part of the 2019 [BC]2 conference. The purpose of the meeting was to develop and drive forward a plan towards the standardised annotation of logical models, review and connect various ongoing projects of experts from different communities involved in the modelling and annotation of molecular biological entities, interactions, pathways and models. This article defines a roadmap towards the annotation and curation of logical models, including milestones for best practices and minimum standard requirements.
Subject(s)
Computational Biology/methods , Models, Biological , Practice Guidelines as Topic , Reproducibility of Results
ABSTRACT
SUMMARY: We present a set of software packages that provide uniform access to diverse biological vocabulary resources that are instrumental for current biocuration efforts and tools. The Unified Biological Dictionaries (UniBioDicts or UBDs) provide a single query-interface for accessing the online API services of leading biological data providers. Given a search string, UBDs return a list of matching term, identifier and metadata units from databases (e.g. UniProt), controlled vocabularies (e.g. PSI-MI) and ontologies (e.g. GO, via BioPortal). This functionality can be connected to input fields (user-interface components) that offer autocomplete lookup for these dictionaries. UBDs create a unified gateway for accessing life science concepts, helping curators find annotation terms across resources (based on descriptive metadata and unambiguous identifiers), and helping data users search and retrieve the right query terms. AVAILABILITY AND IMPLEMENTATION: The UBDs are available through npm and the code is available in the GitHub organisation UniBioDicts (https://github.com/UniBioDicts) under the Affero GPL license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
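The idea of a single query interface over several vocabulary resources can be sketched as follows. This is not the UniBioDicts API (the packages are distributed through npm); the class names, the `search` method and the identifiers below are hypothetical, and in-memory lists stand in for the online services.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    term: str
    identifier: str
    source: str


class ListDictionary:
    """Hypothetical in-memory stand-in for one vocabulary resource."""

    def __init__(self, source, entries):
        self.source = source
        self.entries = entries  # list of (term, identifier) pairs

    def search(self, query):
        q = query.lower()
        return [Match(t, i, self.source) for t, i in self.entries if q in t.lower()]


class UnifiedDictionary:
    """Fan one search string out to several dictionaries behind one interface."""

    def __init__(self, dictionaries):
        self.dictionaries = dictionaries

    def search(self, query):
        q = query.lower()
        matches = [m for d in self.dictionaries for m in d.search(query)]
        # Prefix matches first (useful for autocomplete), then alphabetical.
        return sorted(matches, key=lambda m: (not m.term.lower().startswith(q), m.term.lower()))


# Placeholder identifiers, not real database accessions.
proteins = ListDictionary("protein-db", [("tumor protein p53", "PROT:1")])
ontology = ListDictionary("ontology", [("p53 binding", "ONT:7"), ("protein binding", "ONT:2")])
hits = UnifiedDictionary([proteins, ontology]).search("p53")
for hit in hits:
    print(hit.term, hit.identifier, hit.source)
```

Returning the source alongside each term and identifier is what lets an autocomplete field show the user which resource a suggestion comes from.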
ABSTRACT
MOTIVATION: A large variety of molecular interactions occurs between biomolecular components in cells. When a molecular interaction results in a regulatory effect, exerted by one component onto a downstream component, a so-called 'causal interaction' takes place. Causal interactions constitute the building blocks in our understanding of larger regulatory networks in cells. These causal interactions and the biological processes they enable (e.g. gene regulation) need to be described with a careful appreciation of the underlying molecular reactions. A proper description of this information enables archiving, sharing and reuse by humans and for automated computational processing. Various representations of causal relationships between biological components are currently used in a variety of resources. RESULTS: Here, we propose a checklist that accommodates current representations, called the Minimum Information about a Molecular Interaction CAusal STatement (MI2CAST). This checklist defines both the required core information, as well as a comprehensive set of other contextual details valuable to the end user and relevant for reusing and reproducing causal molecular interaction information. The MI2CAST checklist can be used as reporting guidelines when annotating and curating causal statements, while fostering uniformity and interoperability of the data across resources. AVAILABILITY AND IMPLEMENTATION: The checklist together with examples is accessible at https://github.com/MI2CAST/MI2CAST. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
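A checklist that separates required core information from optional contextual details lends itself to a simple completeness check. The sketch below is hedged: the field names are assumptions chosen for illustration and are not quoted from the MI2CAST specification, which should be consulted for the actual terms.

```python
# Assumed field names, for illustration only (not taken from the checklist itself).
REQUIRED_FIELDS = ("source", "target", "effect", "reference", "evidence")
CONTEXT_FIELDS = ("mechanism", "taxon", "cell_type", "compartment")


def check_statement(statement):
    """Return (missing required fields, contextual fields that were provided)."""
    missing = [f for f in REQUIRED_FIELDS if not statement.get(f)]
    context = [f for f in CONTEXT_FIELDS if statement.get(f)]
    return missing, context


# A made-up causal statement with an empty reference and no evidence entry.
statement = {
    "source": "protein A",
    "target": "gene B",
    "effect": "up-regulates",
    "reference": "",
    "taxon": "human",
}
missing, context = check_statement(statement)
print("missing:", missing, "context:", context)
```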
Subject(s)
Software , Causality , Humans
ABSTRACT
Computational models of biological processes provide one of the most powerful methods for a detailed analysis of the mechanisms that drive the behavior of complex systems. Logic-based modeling has enhanced our understanding and interpretation of those systems. Defining rules that determine how the output activity of biological entities is regulated by their respective inputs has proven to be challenging. Partly this is because of the inherent noise in data that allows multiple model parameterizations to fit the experimental observations, but some of it is also due to the fact that models become increasingly larger, making the use of automated tools to assemble the underlying rules indispensable. We present several Boolean function metrics that provide modelers with the appropriate framework to analyze the impact of a particular model parameterization. We demonstrate the link between a semantic characterization of a Boolean function and its consistency with the model's underlying regulatory structure. We further define the properties that outline such consistency and show that several of the Boolean functions under study violate them, questioning their biological plausibility and subsequent use. We also illustrate that regulatory functions can have major differences with regard to their asymptotic output behavior, with some of them being biased towards specific Boolean outcomes when others are dependent on the ratio between activating and inhibitory regulators. Application results show that in a specific signaling cancer network, the function bias can be used to guide the choice of logical operators for a model that matches data observations. Moreover, graph analysis indicates that commonly used Boolean functions become more biased with increasing numbers of regulators, supporting the idea that rule specification can effectively determine regulatory outcome despite the complex dynamics of biological networks.
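The notion of a Boolean function being biased towards one outcome can be made concrete by counting the fraction of input combinations for which it returns True. The two rule forms below are common ways of combining activators and inhibitors; that they correspond to the functions analysed in the paper is an assumption, and the names `and_not`, `or_not` and `truth_density` are chosen here for illustration.

```python
from itertools import product


def and_not(activators, inhibitors):
    """ON if at least one activator is ON and no inhibitor is ON."""
    return any(activators) and not any(inhibitors)


def or_not(activators, inhibitors):
    """ON if at least one activator is ON or at least one inhibitor is OFF."""
    return any(activators) or not all(inhibitors)


def truth_density(rule, n_act, n_inh):
    """Fraction of all input combinations for which the rule outputs True."""
    count = 0
    for bits in product([False, True], repeat=n_act + n_inh):
        count += rule(bits[:n_act], bits[n_act:])
    return count / 2 ** (n_act + n_inh)


for n in (1, 2, 3, 4):
    print(n, truth_density(and_not, n, n), truth_density(or_not, n, n))
```

With equal numbers of activators and inhibitors, the first rule drifts towards OFF and the second towards ON as regulators are added, which is the kind of bias that increases with the number of regulators.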
Subject(s)
Benchmarking , Signal Transduction , Gene Regulatory Networks , Logic
ABSTRACT
BACKGROUND: Treating patients with combinations of drugs that have synergistic effects has become widespread practice in the clinic. Drugs work synergistically when the observed effect of a drug combination is larger than the effect predicted by the reference model. The reference model is a theoretical null model that returns the combined effect of given doses of drugs under the assumption that these drugs do not interact. There is ongoing debate on what it means for drugs to not interact. The controversy transcends mathematical punctuality, as different non-interaction principles result in different reference models. A well-known and long-established reference model is Loewe's reference model. Loewe's view of non-interaction was purely intuitive: two drugs do not interact if all combinations of doses that result in a certain given effect lie on a straight line. RESULTS: We show that Loewe's reference model can be obtained from much more fundamental principles. First, we introduce the new notion of complementary dose. Second, we reformulate the existing concept of equivalent dose, and our formulation is more general than existing ones. Finally, a very general non-interaction principle is put forward. The proposed non-interaction principle represents a certain interplay between complementary and equivalent doses: drugs are non-interacting if complementarity is preserved under equivalence. It is then shown that Loewe's reference model naturally follows from these principles by an appropriate choice of complementarity. CONCLUSIONS: The presented work increases insight into Loewe's reference model for drug combinations, which is achieved by introducing a very general non-interaction principle that refers neither to a specific dose-response curve nor to any property of applicable dose-response curves.
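Loewe's straight-line condition is usually written as d1/D1(E) + d2/D2(E) = 1, where D1(E) and D2(E) are the doses of each drug alone that produce effect E. The sketch below evaluates the left-hand side for one observed combination. It assumes Hill-shaped single-drug curves purely so that the single-drug doses can be computed; the principle in the abstract does not depend on any particular curve, and the parameter values are arbitrary.

```python
def dose_for_effect(effect, ec50, hill):
    """Invert a Hill curve E = d^h / (ec50^h + d^h) for an effect in (0, 1)."""
    return ec50 * (effect / (1.0 - effect)) ** (1.0 / hill)


def loewe_index(d1, d2, observed_effect, drug1, drug2):
    """d1/D1(E) + d2/D2(E): 1 is Loewe additive, below 1 suggests synergy."""
    dose1_alone = dose_for_effect(observed_effect, *drug1)
    dose2_alone = dose_for_effect(observed_effect, *drug2)
    return d1 / dose1_alone + d2 / dose2_alone


# Arbitrary (ec50, hill) parameters for two hypothetical drugs.
drug1 = (2.0, 1.0)
drug2 = (4.0, 1.0)

additive = loewe_index(1.0, 2.0, 0.5, drug1, drug2)
stronger = loewe_index(1.0, 2.0, 0.8, drug1, drug2)
print(additive, stronger)
```

The first call describes a dose pair that sits exactly on the straight line between the two single-drug doses for a 50% effect; the second describes the same doses producing a larger effect than that line predicts.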
Subject(s)
Drug Combinations , Models, Theoretical , Dose-Response Relationship, Drug , Drug Interactions , Drug Synergism , Humans , Pharmaceutical Preparations/metabolism , Reference Standards
ABSTRACT
SUMMARY: The BioGateway App is a Cytoscape (version 3) plugin designed to provide easy query access to the BioGateway RDF triple store, which contains functional and interaction information for proteins from several curated resources. For explorative network building, we have added a comprehensive dataset with regulatory relationships of mammalian DNA binding transcription factors and their target genes, compiled both from curated resources and from a text mining effort. Query results are visualised using the inherent flexibility of the Cytoscape framework, and network links can be checked against curated database records or against the original publication. AVAILABILITY: Install through the Cytoscape application manager or visit www.biogateway.eu for download and tutorial documents. SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.
ABSTRACT
In this paper, we tell the story of efforts currently underway, on diverse fronts, to build digital knowledge repositories ('knowledge-bases') to support research in the life sciences. If successful, knowledge bases will be part of a new knowledge infrastructure, one capable of facilitating ever more comprehensive computational models of biological systems. Such an infrastructure would, however, represent a sea-change in the technological management and manipulation of complex data, inducing a generational shift in how questions are asked and answered and results published and circulated. Integrating such knowledge bases into the daily workflow of the lab thus destabilizes a number of well-established habits which biologists rely on to ensure the quality of the knowledge they produce, evaluate, communicate and exploit. As the story we tell here shows, such destabilization introduces a situation of unfamiliarity, one that carries with it epistemic risks. It should elicit, to use Niklas Luhmann's terms, the question of trust: a shared recognition that the reliability of research practices is being risked, but that such a risk is worth taking in view of what may be gained. And yet, the problem of trust is being unexpectedly silenced. How that silencing has come about, why it matters, and what might yet be done forms the heart of this paper.
Subject(s)
Biological Science Disciplines , Databases, Factual , Knowledge , Research/organization & administration , Trust , Humans , Microarray Analysis/methods
ABSTRACT
Discovery of efficient anti-cancer drug combinations is a major challenge, since experimental testing of all possible combinations is clearly impossible. Recent efforts to computationally predict drug combination responses retain this experimental search space, as model definitions typically rely on extensive drug perturbation data. We developed a dynamical model representing a cell fate decision network in the AGS gastric cancer cell line, relying on background knowledge extracted from literature and databases. We defined a set of logical equations recapitulating AGS data observed in cells in their baseline proliferative state. Using the modeling software GINsim, model reduction and simulation compression techniques were applied to cope with the vast state space of large logical models and enable simulations of pairwise applications of specific signaling inhibitory chemical substances. Our simulations predicted synergistic growth inhibitory action of five combinations from a total of 21 possible pairs. Four of the predicted synergies were confirmed in AGS cell growth real-time assays, including known effects of combined MEK-AKT or MEK-PI3K inhibitions, along with novel synergistic effects of combined TAK1-AKT or TAK1-PI3K inhibitions. Our strategy reduces the dependence on a priori drug perturbation experimentation for well-characterized signaling networks, by demonstrating that a model predictive of combinatorial drug effects can be inferred from background knowledge on unperturbed and proliferating cancer cells. Our modeling approach can thus contribute to preclinical discovery of efficient anticancer drug combinations, and thereby to development of strategies to tailor treatment to individual cancer patients.
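The screening logic described above (simulate each pairwise inhibition and look for pairs that stop growth when neither single inhibition does) can be sketched on a toy Boolean network. The wiring below is invented for illustration and is not the published AGS model; the sketch uses plain synchronous updates rather than GINsim, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy network: each rule maps the current state to a node's next value.
RULES = {
    "MEK": lambda s: True,   # treated as constitutively active (assumption)
    "AKT": lambda s: True,   # treated as constitutively active (assumption)
    "ERK": lambda s: s["MEK"],
    "Growth": lambda s: s["ERK"] or s["AKT"],
}


def stable_state(rules, inhibited=()):
    """Apply synchronous updates from the all-ON state until a fixed point is reached."""
    state = {n: n not in inhibited for n in rules}
    for _ in range(2 ** len(rules)):
        nxt = {n: False if n in inhibited else bool(rule(state)) for n, rule in rules.items()}
        if nxt == state:
            return state
        state = nxt
    raise RuntimeError("no fixed point reached (the network cycles)")


def synergistic_pairs(rules, targets, readout="Growth"):
    """Pairs that switch the readout OFF although neither single inhibition does."""
    single_off = {t for t in targets if not stable_state(rules, (t,))[readout]}
    return [
        pair
        for pair in combinations(targets, 2)
        if not single_off & set(pair) and not stable_state(rules, pair)[readout]
    ]


pairs = synergistic_pairs(RULES, ["MEK", "AKT", "ERK"])
print(pairs)
```

Enumerating pairs with `itertools.combinations` also shows where a count such as 21 can come from: it is the number of ways to choose 2 agents out of 7.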
Subject(s)
Antineoplastic Agents/pharmacology , Computational Biology/methods , Drug Synergism , Stomach Neoplasms/drug therapy , Antineoplastic Agents/therapeutic use , Cell Line, Tumor , Cell Proliferation/drug effects , Computer Simulation , Drug Discovery , Humans , Models, Biological
ABSTRACT
Abortive transcription initiation can be rate-limiting for promoter escape and therefore represents a barrier to productive gene expression. The mechanism for abortive initiation is unknown, but the amount of abortive transcript is known to vary with the composition of the initial transcribed sequence (ITS). Here, we used a thermodynamic model of translocation combined with experimental validation to investigate the relationship between ITS and promoter escape on a set of phage T5 N25 promoters. We found a strong, negative correlation between RNAP's propensity to occupy the pretranslocated state during initial transcription and the efficiency of promoter escape (r = -0.67; p < 10^-6). This correlation was almost entirely caused by free energy changes due to variation in the RNA 3' dinucleotide sequence at each step, implying that this sequence element controls the disposition of initial transcribing complexes. We tested our model experimentally by constructing a set of novel N25-ITS promoter variants; quantitative transcription analysis again showed a strong correlation (r = -0.81; p < 10^-6). Our results support a model in which sequence-directed bias for the pretranslocated state during scrunching results in increased backtracking, which limits the efficiency of promoter escape. This provides an answer to the long-standing issue of how sequence composition of the ITS affects promoter escape efficiency.
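The link between a free energy difference and the propensity to occupy the pretranslocated state can be illustrated with a generic two-state Boltzmann occupancy. This is a simplified sketch, not the thermodynamic model used in the study: it assumes only two states per step, and the free energy values passed in are arbitrary inputs rather than measured parameters.

```python
import math

RT = 0.616  # kcal/mol at about 37 °C (R = 1.987e-3 kcal/(mol*K), T = 310 K)


def pretranslocated_fraction(delta_g):
    """Two-state occupancy of the pretranslocated state.

    delta_g = G_post - G_pre in kcal/mol; a positive value favours the
    pretranslocated state.
    """
    return 1.0 / (1.0 + math.exp(-delta_g / RT))


def mean_pretranslocated_fraction(delta_gs):
    """Average occupancy over the steps of an initial transcribed sequence."""
    fractions = [pretranslocated_fraction(dg) for dg in delta_gs]
    return sum(fractions) / len(fractions)


# Arbitrary per-step free energy differences for two hypothetical sequences.
print(mean_pretranslocated_fraction([0.5, 1.0, 0.2]))
print(mean_pretranslocated_fraction([-0.5, -1.0, -0.2]))
```

A per-sequence summary of this kind is the sort of quantity that could then be correlated with measured promoter escape efficiency.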
Subject(s)
Bacteriophages/genetics , Escherichia coli/virology , Gene Expression Regulation, Viral , Promoter Regions, Genetic , Transcription Initiation, Genetic , Base Sequence , DNA-Directed RNA Polymerases/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Escherichia coli Proteins/metabolism , Thermodynamics
ABSTRACT
MOTIVATION: The comparison of genes and gene products across species depends on high-quality tools to determine the relationships between gene or protein sequences from various species. Although some excellent applications are available and widely used, their performance leaves room for improvement. RESULTS: We developed orthAgogue: a multithreaded C application for high-speed estimation of homology relations in massive datasets, operated via a flexible and easy command-line interface. AVAILABILITY: The orthAgogue software is distributed under the GNU license. The source code and binaries compiled for Linux are available at https://code.google.com/p/orthagogue/.
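One elementary way to estimate homology relations from pairwise similarity scores is the reciprocal best hit criterion, sketched below. This is named plainly as a swapped-in, simpler technique: it is not a description of orthAgogue's algorithm or command-line interface, and the sequence identifiers and scores are invented.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict mapping (query, subject) -> similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Invented scores between sequences of two hypothetical species.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b2"): 120, ("a3", "b2"): 150}
b_vs_a = {("b1", "a1"): 190, ("b2", "a3"): 140, ("b2", "a2"): 110}
rbh = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh)
```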
Subject(s)
Sequence Homology , Software , Algorithms , Sequence Alignment
ABSTRACT
BACKGROUND: Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. RESULTS: We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. CONCLUSIONS: Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. 
This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.
Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , Gene Expression Regulation , Gene Regulatory Networks , Genomics/methods , Models, Biological , Signal Transduction , Humans , Knowledge Bases , Protein Interaction Maps , Semantics
ABSTRACT
SUMMARY: Gene regulatory network assembly and analysis requires high-quality knowledge sources that cover functional aspects of the various components of the gene regulatory machinery. A multiplicity of resources exists with information about mammalian transcription factors (TFs); yet, only few of these provide sufficiently accurate classifications of the functional roles of individual TFs, or standardized evidence that would justify the information on which these functional classifications are based. We compiled the list of all putative TFs from nine different resources, ignored factors such as general TFs, mediator complexes and chromatin modifiers, and for the remaining factors checked the available literature for references that support their function as a true sequence-specific DNA-binding RNA polymerase II TF (DbTF). The results are available in the TFcheckpoint database, an exhaustive collection of TFs annotated according to experimental and other evidence on their function as true DbTFs. TFcheckpoint.org provides a high-quality and comprehensive knowledge source for genome-scale regulatory network studies. AVAILABILITY: The TFcheckpoint database is freely available at www.tfcheckpoint.org
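The compilation procedure described above (pool candidates from several resources, set aside excluded classes, then record which candidates have supporting literature) can be sketched with plain sets and counters. All resource names, protein names and the reference placeholder are hypothetical, and the function names are chosen for this illustration only.

```python
from collections import Counter


def compile_candidates(resources, excluded):
    """Count, for each non-excluded candidate, how many resources list it."""
    support = Counter()
    for proteins in resources.values():
        for protein in set(proteins):
            if protein not in excluded:
                support[protein] += 1
    return dict(support)


def classify(support, literature_evidence):
    """Split candidates by whether a supporting reference is recorded."""
    with_evidence = sorted(p for p in support if literature_evidence.get(p))
    without_evidence = sorted(p for p in support if not literature_evidence.get(p))
    return with_evidence, without_evidence


# Hypothetical input lists; the names are placeholders, not real proteins.
resources = {
    "resourceA": ["TF1", "TF2", "GTF1"],
    "resourceB": ["TF1", "CHROM1"],
    "resourceC": ["TF1", "TF2", "TF3"],
}
excluded = {"GTF1", "CHROM1"}  # e.g. general TFs and chromatin modifiers
support = compile_candidates(resources, excluded)
supported, unsupported = classify(support, {"TF1": ["ref-1"], "TF2": []})
print(support, supported, unsupported)
```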
Subject(s)
Databases, Genetic , RNA Polymerase II/analysis , Transcription Factors/analysis , Animals , DNA/metabolism , Humans , Internet , Protein Binding , RNA Polymerase II/chemistry , Software , Transcription Factors/chemistry
ABSTRACT
Psoriasis arises from complex interactions between keratinocytes and immune cells, leading to uncontrolled inflammation, immune hyperactivation, and a perturbed keratinocyte life cycle. Despite the availability of drugs for psoriasis management, the disease remains incurable. Treatment response variability calls for new tools and approaches to comprehend the mechanisms underlying disease development. We present a Boolean multiscale population model that captures the dynamics of cell-specific phenotypes in psoriasis, integrating discrete logical formalism and population dynamics simulations. Through simulations and network analysis, the model predictions suggest that targeting neutrophil activation in conjunction with inhibition of either prostaglandin E2 (PGE2) or STAT3 shows promise comparable to interleukin-17 (IL-17) inhibition, one of the most effective treatment options for moderate and severe cases. Our findings underscore the significance of considering complex intercellular interactions and intracellular signaling in psoriasis and highlight the importance of computational approaches in unraveling complex biological systems for drug target identification.
ABSTRACT
Plants are continuously exposed to changing environmental conditions and must, as sessile organisms, possess sophisticated acclimative mechanisms. To gain insight into systemic responses to local virus infection or wounding, we performed comparative LC-MS/MS protein profiling of distal, virus-free leaves four and five days after local inoculation of Arabidopsis thaliana plants with either Oilseed rape mosaic virus (ORMV) or inoculation buffer alone. Our study revealed biomarkers for systemic signaling in response to wounding and compatible virus infection in Arabidopsis, which should prove useful in further addressing the trigger-specific systemic response network and the elusive systemic signals. We observed responses common to ORMV and mock treatment as well as protein profile changes that are specific to local virus infection or mechanical wounding (mock treatment) alone, which provides evidence for the existence of more than one systemic signal to induce these distinct changes. Comparison of the systemic responses between time points indicated that the responses build up over time. Our data indicate stress-specific changes in proteins involved in jasmonic and abscisic acid signaling, intracellular transport, compartmentalization of enzyme activities, protein folding and synthesis, and energy and carbohydrate metabolism. In addition, a virus-triggered systemic signal appears to suppress antiviral host defense.
Subject(s)
Arabidopsis Proteins/isolation & purification , Arabidopsis/genetics , Gene Expression Regulation, Plant , Plant Diseases/genetics , Plant Leaves/genetics , Arabidopsis/immunology , Arabidopsis/virology , Arabidopsis Proteins/genetics , Arabidopsis Proteins/immunology , Chromatography, Liquid , Gene Expression Profiling , Molecular Sequence Annotation , Plant Diseases/immunology , Plant Diseases/virology , Plant Leaves/immunology , Plant Leaves/virology , Proteomics , Signal Transduction , Tandem Mass Spectrometry , Tobamovirus/immunology
ABSTRACT
A hallmark of the development of solid and hematological malignancies is the dysregulation of apoptosis, which leads to an imbalance between cell proliferation, cell survival and death. Halogenated boroxine K2[B3O3F4OH] (HB) is a derivative of cyclic anhydride of boronic acid, with reproducible anti-tumor and anti-proliferative effects in different cell models. Notably, these changes are observed to be more profound in tumor cells than in normal cells. Here, we investigated the underlying mechanisms through an extensive evaluation of (a) deregulated target genes and (b) their interactions and links with main apoptotic pathway genes upon treatment with an optimized concentration of HB. To provide deeper insights into the mechanism of action of HB, we performed identification, visualization, and pathway association of differentially expressed genes (DEGs) involved in regulation of apoptosis among tumor and non-tumor cells upon HB treatment. We report that HB at a concentration of 0.2 mg/mL drives tumor cells to apoptosis, whereas non-tumor cells are not affected. Comparison of DEG profiles, gene interactions and pathway associations suggests that the HB effect and tumor 'selectivity' can be explained by Bax/Bak-independent mitochondrial depolarization by ROS generation and TRAIL-like activation, followed by permanent inhibition of the NFκB signaling pathway specifically in tumor cells.
Subject(s)
Apoptosis , Leukemia , Humans , Leukemia/metabolism , Signal Transduction , NF-kappa B/metabolism , Cell Proliferation
ABSTRACT
Colorectal cancer (CRC) is one of the most prevalent cancers, driven by several factors including deregulations in intracellular signalling pathways. Small extracellular vesicles (sEVs) are nanosized protein-packaged particles released from cells, which are present in liquid biopsies. Here, we characterised the proteome landscape of sEVs and their cells of origin in three CRC cell lines HCT116, HT29 and SW620 to explore molecular traits that could be exploited as cancer biomarker candidates and how intracellular signalling can be assessed by sEV analysis instead of directly obtaining the cell of origin itself. Our findings revealed that sEV cargo clearly reflects its cell of origin with proteins of the PI3K-AKT pathway highly represented in sEVs. Proteins known to be involved in CRC were detected in both cells and sEVs including KRAS, ARAF, mTOR, PDPK1 and MAPK1, while TGFB1 and TGFBR2, known to be key players in epithelial cancer carcinogenesis, were found to be enriched in sEVs. Furthermore, the phosphopeptide-enriched profiling of cell lysates demonstrated a distinct pattern between cell lines and highlighted potential phosphoproteomic targets to be investigated in sEVs. The total proteomic and phosphoproteomics profiles described in the current work can serve as a source to identify candidates for cancer biomarkers that can potentially be assessed from liquid biopsies.
ABSTRACT
BACKGROUND: More than one million terms from biomedical ontologies and controlled vocabularies are available through the Ontology Lookup Service (OLS). Although OLS provides ample possibility for querying and browsing terms, the visualization of parts of the ontology graphs is rather limited and inflexible. RESULTS: We created the OLSVis web application, a visualiser for browsing all ontologies available in the OLS database. OLSVis shows customisable subgraphs of the OLS ontologies. Subgraphs are animated via a real-time force-based layout algorithm which is fully interactive: each time the user makes a change, e.g. browsing to a new term, hiding, adding, or dragging terms, the algorithm performs smooth and only essential reorganisations of the graph. This assures an optimal viewing experience, because subsequent screen layouts are not grossly altered, and users can easily navigate through the graph. URL: http://ols.wordvis.com CONCLUSIONS: The OLSVis web application provides a user-friendly tool to visualise ontologies from the OLS repository. It broadens the possibilities to investigate and select ontology subgraphs through a smooth visualisation method.
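A force-based layout that changes the picture only gradually can be sketched as a single small update step: every pair of nodes repels, linked nodes are pulled towards a preferred distance, and positions move by a small fraction of the net force. This is a generic force-directed step written under assumed parameter values; it is not the algorithm implemented in OLSVis, and the function and parameter names are hypothetical.

```python
import math


def layout_step(positions, edges, repulsion=1.0, spring=0.1, rest_length=1.0, step=0.1):
    """Return new positions after one small force-directed update."""
    forces = {n: [0.0, 0.0] for n in positions}
    nodes = list(positions)

    # Pairwise repulsion, falling off with the square of the distance.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx = positions[a][0] - positions[b][0]
            dy = positions[a][1] - positions[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            f = repulsion / dist ** 2
            forces[a][0] += f * dx / dist
            forces[a][1] += f * dy / dist
            forces[b][0] -= f * dx / dist
            forces[b][1] -= f * dy / dist

    # Spring attraction along edges, pulling towards rest_length.
    for a, b in edges:
        dx = positions[b][0] - positions[a][0]
        dy = positions[b][1] - positions[a][1]
        dist = math.hypot(dx, dy) or 1e-9
        f = spring * (dist - rest_length)
        forces[a][0] += f * dx / dist
        forces[a][1] += f * dy / dist
        forces[b][0] -= f * dx / dist
        forces[b][1] -= f * dy / dist

    # A small step keeps consecutive layouts close to each other.
    return {n: (positions[n][0] + step * forces[n][0],
                positions[n][1] + step * forces[n][1]) for n in positions}


linked = layout_step({"a": (0.0, 0.0), "b": (5.0, 0.0)}, [("a", "b")])
unlinked = layout_step({"c": (0.0, 0.0), "d": (0.5, 0.0)}, [])
print(linked, unlinked)
```

Calling the step repeatedly after each user action (adding, hiding or dragging a term) lets the graph settle without grossly altering the previous layout.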