1.
Nucleic Acids Res ; 51(D1): D539-D545, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36382402

ABSTRACT

The CORUM database has been providing comprehensive reference information about experimentally characterized, mammalian protein complexes and their associated biological and biomedical properties since 2007. Given that most catalytic and regulatory functions of the cell are carried out by protein complexes, their composition and characterization is of greatest importance in basic and disease biology. The new CORUM 4.0 release encompasses 5204 protein complexes offering the largest and most comprehensive publicly available dataset of manually curated mammalian protein complexes. The CORUM dataset is built from 5299 different genes, representing 26% of the protein coding genes in humans. Complex information from 3354 scientific articles is mainly obtained from human (70%), mouse (16%) and rat (9%) cells and tissues. Recent curation work includes sets of protein complexes, Functional Complex Groups, that offer comprehensive collections of published data in specific biological processes and molecular functions. In addition, a new graphical analysis tool was implemented that displays co-expression data from the subunits of protein complexes. CORUM is freely accessible at http://mips.helmholtz-muenchen.de/corum/.


Subject(s)
Databases, Protein , Multiprotein Complexes , Animals , Humans , Mice , Rats , Databases, Factual , Mammals , Multiprotein Complexes/chemistry
2.
Viruses ; 14(7)2022 07 21.
Article in English | MEDLINE | ID: mdl-35891571

ABSTRACT

Human endogenous retroviruses (HERVs), normally silenced by methylation or mutations, can be reactivated by multiple environmental factors, including infections with exogenous viruses. In this work, we investigated the transcriptional activity of HERVs in human A549 cells infected by two wild-type (PR8M, SC35M) and one mutated (SC35MΔNS1) strains of influenza A virus (IAV). We found that the majority of differentially expressed HERVs (DEHERVs) and genes (DEGs) were up-regulated in the infected cells, with the most significantly enriched biological processes associated with the genes differentially expressed exclusively in SC35MΔNS1 being linked to the immune system. Most DEHERVs in PR8M and SC35M are mammalian apparent LTR retrotransposons, while in SC35MΔNS1, more HERV loci from the HERVW9 group were differentially expressed. Furthermore, up-regulated pairs of HERVs and genes in close chromosomal proximity to each other tended to be associated with immune responses, which implies that specific HERV groups might have the potential to trigger specific gene networks and influence host immunological pathways.


Subject(s)
Endogenous Retroviruses , Influenza A virus , Animals , Antiviral Agents , Endogenous Retroviruses/genetics , Humans , Immune System , Influenza A virus/genetics , Mammals , Retroelements
4.
Mol Syst Biol ; 17(10): e10387, 2021 10.
Article in English | MEDLINE | ID: mdl-34664389

ABSTRACT

We need to effectively combine the knowledge from surging literature with complex datasets to propose mechanistic models of SARS-CoV-2 infection, improving data interpretation and predicting key targets of intervention. Here, we describe a large-scale community effort to build an open access, interoperable and computable repository of COVID-19 molecular mechanisms. The COVID-19 Disease Map (C19DMap) is a graphical, interactive representation of disease-relevant molecular mechanisms linking many knowledge sources. Notably, it is a computational resource for graph-based analyses and disease modelling. To this end, we established a framework of tools, platforms and guidelines necessary for a multifaceted community of biocurators, domain experts, bioinformaticians and computational biologists. The diagrams of the C19DMap, curated from the literature, are integrated with relevant interaction and text mining databases. We demonstrate the application of network analysis and modelling approaches by concrete examples to highlight new testable hypotheses. This framework helps to find signatures of SARS-CoV-2 predisposition, treatment response or prioritisation of drug candidates. Such an approach may help deal with new waves of COVID-19 or similar pandemics in the long-term perspective.


Subject(s)
COVID-19/immunology , Computational Biology/methods , Databases, Factual , SARS-CoV-2/immunology , Software , Antiviral Agents/therapeutic use , COVID-19/genetics , COVID-19/virology , Computer Graphics , Cytokines/genetics , Cytokines/immunology , Data Mining/statistics & numerical data , Gene Expression Regulation , Host Microbial Interactions/genetics , Host Microbial Interactions/immunology , Humans , Immunity, Cellular/drug effects , Immunity, Humoral/drug effects , Immunity, Innate/drug effects , Lymphocytes/drug effects , Lymphocytes/immunology , Lymphocytes/virology , Metabolic Networks and Pathways/genetics , Metabolic Networks and Pathways/immunology , Myeloid Cells/drug effects , Myeloid Cells/immunology , Myeloid Cells/virology , Protein Interaction Mapping , SARS-CoV-2/drug effects , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity , Signal Transduction , Transcription Factors/genetics , Transcription Factors/immunology , Viral Proteins/genetics , Viral Proteins/immunology , COVID-19 Drug Treatment
6.
Cell Stem Cell ; 28(9): 1566-1581.e8, 2021 09 02.
Article in English | MEDLINE | ID: mdl-33951478

ABSTRACT

The biological function and disease association of human endogenous retroviruses (HERVs) are largely elusive. HERV-K(HML-2) has been associated with neurotoxicity, but there is no clear understanding of its role or mechanistic basis. We addressed the physiological functions of HERV-K(HML-2) in neuronal differentiation using CRISPR engineering to activate or repress its expression levels in a human-pluripotent-stem-cell-based system. We found that elevated HERV-K(HML-2) transcription is detrimental to the development and function of cortical neurons. These effects are cell-type-specific, as dopaminergic neurons are unaffected. Moreover, high HERV-K(HML-2) transcription alters cortical layer formation in forebrain organoids. HERV-K(HML-2) transcriptional activation leads to hyperactivation of NTRK3 expression and other neurodegeneration-related genes. Direct activation of NTRK3 phenotypically resembles HERV-K(HML-2) induction, and reducing NTRK3 levels in the context of HERV-K(HML-2) induction restores cortical neuron differentiation. Hence, these findings unravel a cell-type-specific role for HERV-K(HML-2) in cortical neuron development.


Subject(s)
Endogenous Retroviruses , Cell Differentiation , Humans , Transcriptional Activation
8.
Sci Rep ; 10(1): 4350, 2020 03 09.
Article in English | MEDLINE | ID: mdl-32152446

ABSTRACT

Isoform switching is a recently characterized hallmark of cancer, and often translates to the loss or gain of domains mediating protein interactions and thus the re-wiring of the interactome. Recent computational tools leverage domain-domain interaction data to resolve the condition-specific interaction networks from RNA-Seq data, accounting for the domain content of the primary transcripts expressed. Here, we used The Cancer Genome Atlas RNA-Seq datasets to generate 642 patient-specific pairs of interactomes corresponding to both the tumor and the healthy tissues across 13 cancer types. The comparison of these interactomes provided a list of patient-specific edgetic perturbations of the interactomes associated with the cancerous state. We found that among the identified perturbations, select sets are robustly shared between patients at the multi-cancer, cancer-specific and cancer sub-type specific levels. Interestingly, the majority of the alterations do not directly involve significantly mutated genes; nevertheless, they strongly correlate with patient survival. The findings (available at EdgeExplorer: "http://webclu.bio.wzw.tum.de/EdgeExplorer") are a new source of potential biomarkers for classifying cancer types and the proteins we identified are potential anti-cancer therapy targets.


Subject(s)
Biomarkers, Tumor , Disease Susceptibility , Neoplasms/etiology , Neoplasms/metabolism , Computational Biology/methods , Gene Expression Profiling , Humans , Neoplasms/mortality , Neoplasms/pathology , Prognosis , Protein Interaction Mapping , Protein Isoforms , Structure-Activity Relationship
9.
Nucleic Acids Res ; 47(D1): D559-D563, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357367

ABSTRACT

CORUM is a database that provides a manually curated repository of experimentally characterized protein complexes from mammalian organisms, mainly human (67%), mouse (15%) and rat (10%). Given the vital functions of these macromolecular machines, their identification and functional characterization is foundational to our understanding of normal and disease biology. The new CORUM 3.0 release encompasses 4274 protein complexes offering the largest and most comprehensive publicly available dataset of mammalian protein complexes. The CORUM dataset is built from 4473 different genes, representing 22% of the protein coding genes in humans. Protein complexes are described by a protein complex name, subunit composition, cellular functions as well as the literature references. Information about stoichiometry of subunits depends on availability of experimental data. Recent developments include a graphical tool displaying known interactions between subunits. This allows the prediction of structural interconnections within protein complexes of unknown structure. In addition, we present a set of 58 protein complexes with alternatively spliced subunits. Those were found to affect cellular functions such as regulation of apoptotic activity, protein complex assembly or define cellular localization. CORUM is freely accessible at http://mips.helmholtz-muenchen.de/corum/.


Subject(s)
Databases, Protein , Multiprotein Complexes/chemistry , Multiprotein Complexes/metabolism , Alternative Splicing , Animals , Humans , Mice , Multiprotein Complexes/genetics , Protein Conformation , Protein Interaction Mapping , Protein Isoforms/genetics , Protein Isoforms/metabolism , Protein Subunits/chemistry , Protein Subunits/metabolism , Rats
10.
Orphanet J Rare Dis ; 13(1): 22, 2018 01 25.
Article in English | MEDLINE | ID: mdl-29370821

ABSTRACT

BACKGROUND: Thoroughly annotated data resources are a key requirement in phenotype dependent analysis and diagnosis of diseases in the area of precision medicine. Recent work has shown that curation and systematic annotation of human phenome data can significantly improve the quality and selectivity for the interpretation of inherited diseases. We have therefore developed PhenoDis, a comprehensive, manually annotated database providing symptomatic, genetic and imprinting information about rare cardiac diseases. RESULTS: PhenoDis includes 214 rare cardiac diseases from Orphanet and 94 more from OMIM. For phenotypic characterization of the diseases, we performed manual annotation of diseases with articles from the biomedical literature. Detailed description of disease symptoms required the use of 2247 different terms from the Human Phenotype Ontology (HPO). Diseases listed in PhenoDis frequently cover a broad spectrum of symptoms with 28% from the branch of 'cardiovascular abnormality' and others from areas such as neurological (11.5%) and metabolism (6%). We collected extensive information on the frequency of symptoms in respective diseases as well as on disease-associated genes and imprinting data. The analysis of the abundance of symptoms in patient studies revealed that most of the annotated symptoms (71%) are found in less than half of the patients of a particular disease. Comprehensive and systematic characterization of symptoms including their frequency is a pivotal prerequisite for computer based prediction of diseases and disease causing genetic variants. To this end, PhenoDis provides in-depth annotation for a complete group of rare diseases, including information on pathogenic and likely pathogenic genetic variants for 206 diseases as listed in ClinVar. We integrated all results in an online database (http://mips.helmholtz-muenchen.de/phenodis/) with multiple search options and provide the complete dataset for download. CONCLUSION: PhenoDis provides a comprehensive set of manually annotated rare cardiac diseases that enables computational approaches for disease prediction via decision support systems and phenotype-driven strategies for the identification of disease causing genes.


Subject(s)
Heart Diseases/genetics , Heart Diseases/pathology , Rare Diseases/genetics , Rare Diseases/pathology , Computational Biology/methods , Databases, Genetic , Genetic Variation/genetics , Genomics/methods , Heart Diseases/metabolism , Humans , Phenotype , Precision Medicine/methods , Rare Diseases/metabolism
11.
Sci Rep ; 7(1): 4555, 2017 07 04.
Article in English | MEDLINE | ID: mdl-28676676

ABSTRACT

Recognizing that insights into the modulation of sleep duration can emerge by exploring the functional relationships among genes, we used this strategy to explore the genome-wide association results for this trait. We detected two major signalling pathways (ion channels and the ERBB signalling family of tyrosine kinases) that could be replicated across meta-analyses of independent GWA studies. To investigate the significance of these pathways for sleep modulation, we performed transcriptome analyses of short sleeping flies' heads (knockdown for the ABCC9 gene homolog; dSur). We found significant alterations in gene expression in the short sleeping knockdowns versus control flies, which correspond to pathways associated with sleep duration in our human studies. Most notably, the expression of Rho and EGFR (members of the ERBB signalling pathway) genes was down- and up-regulated, respectively, consistent with the established role of these genes for sleep consolidation in Drosophila. Using a disease multifactorial interaction network, we showed that many of the genes of the pathways indicated to be relevant for sleep duration had functional evidence of their involvement with sleep regulation, circadian rhythms, insulin secretion, gluconeogenesis and lipogenesis.


Subject(s)
Gene Expression Regulation , Signal Transduction , Sleep/physiology , Animals , Computational Biology , Drosophila/physiology , ErbB Receptors/metabolism , Gene Expression Profiling , Gene Regulatory Networks , Genome-Wide Association Study , Genomics , Humans , Meta-Analysis as Topic , Phenotype , Polymorphism, Single Nucleotide , Transcriptome
12.
PLoS One ; 11(9): e0163362, 2016.
Article in English | MEDLINE | ID: mdl-27662471

ABSTRACT

BACKGROUND: Cardiomyopathies represent a rare group of disorders often of genetic origin. While approximately 50% of genetic causes are known for other types of cardiomyopathies, the genetic spectrum of restrictive cardiomyopathy (RCM) is largely unknown. The aim of the present study was to identify the genetic background of idiopathic RCM and to map the obtained genetic variants to novel signalling pathways using in silico protein network analysis. PATIENTS AND METHODS: We used an Illumina MiSeq setup to screen for 108 cardiomyopathy and arrhythmia-associated genes in 24 patients with idiopathic RCM. Pathogenicity of genetic variants was classified according to the American College of Medical Genetics and Genomics classification. RESULTS: Pathogenic and likely-pathogenic variants were detected in 13 of 24 patients, resulting in an overall genotype-positive rate of 54%. Half of the genotype-positive patients carried a combination of pathogenic, likely-pathogenic variants and variants of unknown significance. The most frequent combination included mutations in sarcomeric and cytoskeletal genes (38%). A bioinformatics approach underlined the mechanotransducing protein networks important for RCM pathogenesis. CONCLUSIONS: Multiple gene mutations were detected in half of the RCM cases, with a combination of sarcomeric and cytoskeletal gene mutations being the most common. Mutations of genes encoding sarcomeric, cytoskeletal, and Z-line-associated proteins appear to have a predominant role in the development of RCM.

13.
Neuron ; 86(5): 1189-202, 2015 Jun 03.
Article in English | MEDLINE | ID: mdl-26050039

ABSTRACT

Depression risk is exacerbated by genetic factors and stress exposure; however, the biological mechanisms through which these factors interact to confer depression risk are poorly understood. One putative biological mechanism implicates variability in the ability of cortisol, released in response to stress, to trigger a cascade of adaptive genomic and non-genomic processes through glucocorticoid receptor (GR) activation. Here, we demonstrate that common genetic variants in long-range enhancer elements modulate the immediate transcriptional response to GR activation in human blood cells. These functional genetic variants increase risk for depression and co-heritable psychiatric disorders. Moreover, these risk variants are associated with inappropriate amygdala reactivity, a transdiagnostic psychiatric endophenotype and an important stress hormone response trigger. Network modeling and animal experiments suggest that these genetic differences in GR-induced transcriptional activation may mediate the risk for depression and other psychiatric disorders by altering a network of functionally related stress-sensitive genes in blood and brain.


Subject(s)
Brain/physiology , Genetic Variation/genetics , Mental Disorders/diagnosis , Mental Disorders/genetics , Stress, Psychological/genetics , Transcriptome/genetics , Animals , Cohort Studies , Forecasting , Gene Regulatory Networks/genetics , Humans , Male , Mice , Mice, Inbred C57BL , Polymorphism, Single Nucleotide/genetics , Risk Factors , Stress, Psychological/diagnosis
14.
Nucleic Acids Res ; 42(Database issue): D396-400, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24214996

ABSTRACT

Knowledge about non-interacting proteins (NIPs) is important for training the algorithms to predict protein-protein interactions (PPIs) and for assessing the false positive rates of PPI detection efforts. We present the second version of Negatome, a database of proteins and protein domains that are unlikely to engage in physical interactions (available online at http://mips.helmholtz-muenchen.de/proj/ppi/negatome). Negatome is derived by manual curation of literature and by analyzing three-dimensional structures of protein complexes. The main methodological innovation in Negatome 2.0 is the utilization of an advanced text mining procedure to guide the manual annotation process. Potential non-interactions were identified by a modified version of Excerbt, a text mining tool based on semantic sentence analysis. Manual verification shows that nearly half of the text mining results with the highest confidence values correspond to NIP pairs. Compared to the first version, the contents of the database have grown by over 300%.


Subject(s)
Databases, Protein , Protein Interaction Domains and Motifs , Protein Interaction Mapping , Data Mining , Internet , Molecular Sequence Annotation , Protein Conformation
15.
Genome Biol ; 13(7): R62, 2012 Jul 18.
Article in English | MEDLINE | ID: mdl-22809392

ABSTRACT

The pathobiology of common diseases is influenced by heterogeneous factors interacting in complex networks. CIDeR (http://mips.helmholtz-muenchen.de/cider/) is a publicly available, manually curated, integrative database of metabolic and neurological disorders. The resource provides structured information on 18,813 experimentally validated interactions between molecules, bioprocesses and environmental factors extracted from the scientific literature. Systematic annotation and interactive graphical representation of disease networks make CIDeR a versatile knowledge base for biologists, analysis of large-scale data and systems biology approaches.


Subject(s)
Databases, Factual , Metabolic Diseases/metabolism , Nervous System Diseases/metabolism , Gene Regulatory Networks , Humans , Metabolic Diseases/genetics , Metabolic Networks and Pathways , Nervous System Diseases/genetics , Software , Systems Biology
16.
Genome Biol ; 11(1): R6, 2010 Jan 20.
Article in English | MEDLINE | ID: mdl-20089154

ABSTRACT

In recent years, microRNAs have been shown to play important roles in physiological as well as malignant processes. The PhenomiR database (http://mips.helmholtz-muenchen.de/phenomir) provides data from 542 studies that investigate deregulation of microRNA expression in diseases and biological processes as a systematic, manually curated resource. Using the PhenomiR dataset, we could demonstrate that, depending on disease type, independent information from cell culture studies contrasts with conclusions drawn from patient studies.


Subject(s)
Computational Biology/methods , MicroRNAs/genetics , Algorithms , Biochemistry/methods , Cluster Analysis , Disease/genetics , Gene Expression Profiling , Genes , Genome , Humans , Internet , Lod Score , MicroRNAs/metabolism , Models, Biological , Models, Genetic
17.
Nucleic Acids Res ; 38(Database issue): D540-4, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19920129

ABSTRACT

The Negatome is a collection of protein and domain pairs that are unlikely to be engaged in direct physical interactions. The database currently contains experimentally supported non-interacting protein pairs derived from two distinct sources: by manual curation of literature and by analyzing protein complexes with known 3D structure. More stringent lists of non-interacting pairs were derived from these two datasets by excluding interactions detected by high-throughput approaches. Additionally, non-interacting protein domains have been derived from the stringent manual and structural data, respectively. The Negatome is much less biased toward functionally dissimilar proteins than the negative data derived by randomly selecting proteins from different cellular locations. It can be used to evaluate protein and domain interactions from new experiments and improve the training of interaction prediction algorithms. The Negatome database is available at http://mips.helmholtz-muenchen.de/proj/ppi/negatome.


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Protein Interaction Mapping , Proteins/chemistry , Algorithms , Animals , Computational Biology/trends , Databases, Protein , Genome, Fungal , Humans , Information Storage and Retrieval/methods , Internet , Protein Structure, Tertiary , Saccharomyces cerevisiae/metabolism , Software
18.
Nucleic Acids Res ; 38(Database issue): D497-501, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19884131

ABSTRACT

CORUM is a database that provides a manually curated repository of experimentally characterized protein complexes from mammalian organisms, mainly human (64%), mouse (16%) and rat (12%). Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. The new CORUM 2.0 release encompasses 2837 protein complexes offering the largest and most comprehensive publicly available dataset of mammalian protein complexes. The CORUM dataset is built from 3198 different genes, representing approximately 16% of the protein coding genes in humans. Each protein complex is described by a protein complex name, subunit composition, function as well as the literature reference that characterizes the respective protein complex. Recent developments include mapping of functional annotation to Gene Ontology terms as well as cross-references to Entrez Gene identifiers. In addition, a 'Phylogenetic Conservation' analysis tool was implemented that analyses the potential occurrence of orthologous protein complex subunits in mammals and other selected groups of organisms. This allows one to predict the occurrence of protein complexes in different phylogenetic groups. CORUM is freely accessible at (http://mips.helmholtz-muenchen.de/genre/proj/corum/index.html).


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Protein , Multiprotein Complexes , Animals , Computational Biology/trends , Humans , Information Storage and Retrieval/methods , Internet , Mice , Phylogeny , Protein Structure, Tertiary , Rats , Saccharomyces cerevisiae/genetics , Software
19.
Nucleic Acids Res ; 36(Database issue): D646-50, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17965090

ABSTRACT

Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. The CORUM (http://mips.gsf.de/genre/proj/corum/index.html) database is a collection of experimentally verified mammalian protein complexes. Information is manually derived by critical reading of the scientific literature by expert annotators. Information about protein complexes includes protein complex names, subunits, literature references as well as the function of the complexes. For functional annotation, we use the FunCat catalogue, which makes it possible to organize the protein complex space into biologically meaningful subsets. The database contains more than 1750 protein complexes that are built from 2400 different genes, thus representing 12% of the protein-coding genes in humans. A web-based system is available to query, view and download the data. CORUM provides a comprehensive dataset of protein complexes for discoveries in systems biology, analyses of protein networks and protein complex-associated diseases. Comparable to the MIPS reference dataset of protein complexes from yeast, CORUM intends to serve as a reference for mammalian protein complexes.


Subject(s)
Databases, Protein , Multiprotein Complexes/physiology , Animals , Humans , Internet , Mice , Multiprotein Complexes/analysis , Multiprotein Complexes/chemistry , Rats , User-Computer Interface
20.
BMC Bioinformatics ; 8: 261, 2007 Jul 21.
Article in English | MEDLINE | ID: mdl-17659089

ABSTRACT

BACKGROUND: Unsupervised annotation of proteins by software pipelines suffers from very high error rates. Spurious functional assignments are usually caused by unwarranted homology-based transfer of information from existing database entries to the new target sequences. We have previously demonstrated that data mining in large sequence annotation databanks can help identify annotation items that are strongly associated with each other, and that exceptions from strong positive association rules often point to potential annotation errors. Here we investigate the applicability of negative association rule mining to revealing erroneously assigned annotation items. RESULTS: Almost all exceptions from strong negative association rules are connected to at least one wrong attribute in the feature combination making up the rule. The fraction of annotation features flagged by this approach as suspicious is strongly enriched in errors and constitutes about 0.6% of the whole body of the similarity-transferred annotation in the PEDANT genome database. Positive rule mining does not identify two thirds of these errors. The approach based on exceptions from negative rules is much more specific than positive rule mining, but its coverage is significantly lower. CONCLUSION: Mining of both negative and positive association rules is a potent tool for finding significant trends in protein annotation and flagging doubtful features for further inspection.


Subject(s)
Algorithms , Databases, Genetic/statistics & numerical data , Databases, Protein/statistics & numerical data , Genome , Information Storage and Retrieval/methods , Proteins/genetics , Amino Acid Sequence , Computational Biology/methods , Protein Structure, Secondary , Protein Structure, Tertiary , Sequence Analysis, Protein , Software