1.
Orphanet J Rare Dis ; 13(1): 22, 2018 01 25.
Article in English | MEDLINE | ID: mdl-29370821

ABSTRACT

BACKGROUND: Thoroughly annotated data resources are a key requirement in phenotype dependent analysis and diagnosis of diseases in the area of precision medicine. Recent work has shown that curation and systematic annotation of human phenome data can significantly improve the quality and selectivity for the interpretation of inherited diseases. We have therefore developed PhenoDis, a comprehensive, manually annotated database providing symptomatic, genetic and imprinting information about rare cardiac diseases. RESULTS: PhenoDis includes 214 rare cardiac diseases from Orphanet and 94 more from OMIM. For phenotypic characterization of the diseases, we performed manual annotation of diseases with articles from the biomedical literature. Detailed description of disease symptoms required the use of 2247 different terms from the Human Phenotype Ontology (HPO). Diseases listed in PhenoDis frequently cover a broad spectrum of symptoms with 28% from the branch of 'cardiovascular abnormality' and others from areas such as neurological (11.5%) and metabolism (6%). We collected extensive information on the frequency of symptoms in respective diseases as well as on disease-associated genes and imprinting data. The analysis of the abundance of symptoms in patient studies revealed that most of the annotated symptoms (71%) are found in less than half of the patients of a particular disease. Comprehensive and systematic characterization of symptoms including their frequency is a pivotal prerequisite for computer based prediction of diseases and disease causing genetic variants. To this end, PhenoDis provides in-depth annotation for a complete group of rare diseases, including information on pathogenic and likely pathogenic genetic variants for 206 diseases as listed in ClinVar. We integrated all results in an online database ( http://mips.helmholtz-muenchen.de/phenodis/ ) with multiple search options and provide the complete dataset for download. 
CONCLUSION: PhenoDis provides a comprehensive set of manually annotated rare cardiac diseases that enables computational approaches for disease prediction via decision support systems and phenotype-driven strategies for the identification of disease causing genes.


Subject(s)
Heart Diseases/genetics , Heart Diseases/pathology , Rare Diseases/genetics , Rare Diseases/pathology , Computational Biology/methods , Databases, Genetic , Genetic Variation/genetics , Genomics/methods , Heart Diseases/metabolism , Humans , Phenotype , Precision Medicine/methods , Rare Diseases/metabolism
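
The abstract above argues that symptom frequencies are a prerequisite for computer-based disease prediction. A minimal sketch of that idea follows, assuming a simple additive score in which each observed symptom contributes its reported frequency for a disease. The disease names, term identifiers, frequencies and the scoring rule are illustrative assumptions, not PhenoDis data or the authors' method.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical annotations: disease -> {phenotype term ID -> fraction of patients
# showing the symptom}. All values are placeholders, not taken from PhenoDis.
ANNOTATIONS: Dict[str, Dict[str, float]] = {
    "disease_A": {"HP:0001638": 0.90, "HP:0001250": 0.20},
    "disease_B": {"HP:0001638": 0.30, "HP:0001943": 0.60},
}


def rank_diseases(
    observed: Set[str], annotations: Dict[str, Dict[str, float]]
) -> List[Tuple[str, float]]:
    """Rank diseases by the summed frequency of the observed symptoms."""
    scores = []
    for disease, frequencies in annotations.items():
        # Symptoms not annotated for a disease contribute nothing to its score.
        score = sum(frequencies.get(term, 0.0) for term in observed)
        scores.append((disease, score))
    # Highest score first.
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Calling `rank_diseases({"HP:0001638"}, ANNOTATIONS)` places `disease_A` first, because the shared symptom is far more frequent there. A frequency-blind overlap count would tie the two diseases.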
2.
Bioinformatics ; 33(10): 1565-1567, 2017 05 15.
Article in English | MEDLINE | ID: mdl-28069593

ABSTRACT

Summary: Analysis of Next Generation Sequencing (NGS) data requires the processing of large datasets by chaining various tools with complex input and output formats. In order to automate data analysis, we propose to standardize NGS tasks into modular workflows. This simplifies reliable handling and processing of NGS data, and corresponding solutions become substantially more reproducible and easier to maintain. Here, we present a documented, Linux-based toolbox of 42 processing modules that are combined to construct workflows facilitating a variety of tasks such as DNAseq and RNAseq analysis. We also describe important technical extensions. The high throughput executor (HTE) helps to increase the reliability and to reduce manual interventions when processing complex datasets. We also provide a dedicated binary manager that assists users in obtaining the modules' executables and keeping them up to date. As the basis for this actively developed toolbox we use the workflow management software KNIME. Availability and Implementation: See http://ibisngs.github.io/knime4ngs for nodes and user manual (GPLv3 license). Contact: robert.kueffner@helmholtz-muenchen.de. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Software , Reproducibility of Results , Workflow
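
The abstract above describes chaining processing modules into workflows and an executor that reduces manual intervention. The sketch below shows the general pattern in plain Python, not in KNIME: modules are functions applied in sequence, and each is retried on failure. The retry behaviour, the function names and the toy modules are assumptions for illustration; the abstract does not state how the HTE works internally.

```python
from typing import Callable, List, Optional

# A "module" is modelled here as a function from input data to output data.
Module = Callable[[str], str]


def run_with_retries(module: Module, data: str, max_attempts: int = 3) -> str:
    """Run one module, retrying on RuntimeError up to max_attempts times."""
    last_error: Optional[RuntimeError] = None
    for _ in range(max_attempts):
        try:
            return module(data)
        except RuntimeError as err:
            last_error = err  # remember the failure and try again
    raise RuntimeError(f"module failed after {max_attempts} attempts") from last_error


def run_workflow(modules: List[Module], data: str) -> str:
    """Feed the output of each module into the next one."""
    for module in modules:
        data = run_with_retries(module, data)
    return data


# Toy stand-ins for real processing steps (hypothetical).
def trim(data: str) -> str:
    return data.strip()


def upper(data: str) -> str:
    return data.upper()
```

For example, `run_workflow([trim, upper], "  acgt ")` returns `"ACGT"`. Keeping each step behind the same interface is what makes a workflow of this kind easy to rearrange and rerun.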
4.
Pharmacopsychiatry ; 46 Suppl 1: S2-9, 2013 May.
Article in English | MEDLINE | ID: mdl-23599241

ABSTRACT

Psychiatric diseases provoke human tragedies. Asocial behaviour, mood imbalance, uncontrolled affect, and cognitive malfunction are the price for the biological and social complexity of neurobiology. Understanding the etiology and influencing the onset and progress of mental diseases remain of utmost importance, but despite much improved care for patients, more than 100 years of research have not succeeded in explaining the basic disease mechanisms or in enabling rational treatment. With the advent of genome-based technologies, much hope has been placed in joining the various dimensions of -omics data to uncover the secrets of mental illness. Big Data as generated by -omics do not come with explanations. In this essay, I will discuss the inherent, not well understood methodological foundations and problems that seriously obstruct the striving for quick success and may render lucky strikes impossible.


Subject(s)
Brain/physiopathology , Mental Disorders/pathology , Models, Neurological , Systems Biology , Humans , Nerve Net/pathology , Nerve Net/physiopathology , Neural Pathways/pathology , Neural Pathways/physiopathology
7.
Pharmacopsychiatry ; 44 Suppl 1: S2-8, 2011 May.
Article in English | MEDLINE | ID: mdl-21544742

ABSTRACT

Understanding mental disorders and their neurobiological basis encompasses the conceptual management of "complexity" and "dynamics". For example, affective disorders exhibit several fluctuating state variables on psychological and biological levels, and data collected at these system levels suggest quasi-chaotic periodicity, which motivates the use of concepts and tools from the mathematics of nonlinear dynamic systems. In this regard, we demonstrate that the concept of "Dynamic Diseases" could be a fruitful approach for theory and empirical research in neuropsychiatry. In a first step, as an example, we focus on the analysis of dynamic cortisol regulation, which is important for understanding depressive disorders. In this case, our message is that extremely complex phenomena of a disease may be explained as resulting from perplexingly simple nonlinear interactions of a very small number of variables. Additionally, we propose that widely used complex circuit diagrams representing the macroanatomic structures and connectivities of the brain involved in major depression or other mental disorders may be "animated" by quantification, even by using expert-based estimations (dummy variables), and we show how. This method of modeling makes it possible to develop exploratory computer-based numerical models that can be explored by computer simulations (in-silico experiments). Inter- and intracellular molecular networks involved in affective disorders could also be modeled by this procedure. We want to stimulate future research in this theoretical context.


Subject(s)
Depression/physiopathology , Depressive Disorder/physiopathology , Disease , Mental Disorders/physiopathology , Mood Disorders/physiopathology , Neurobiology , Systems Biology , Brain/anatomy & histology , Brain/pathology , Brain/physiopathology , Computer Simulation , Depressive Disorder/pathology , Humans , Hydrocortisone/metabolism , Mental Disorders/pathology , Models, Biological , Mood Disorders/metabolism , Mood Disorders/pathology , Neuropsychiatry , Nonlinear Dynamics , Signal Transduction
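
The claim above, that complex behaviour can arise from simple nonlinear interactions of very few variables, can be illustrated with the logistic map, a standard textbook example with one variable and one parameter. This is a generic illustration chosen here as an assumption; it is not the cortisol regulation model discussed in the article.

```python
from typing import List


def logistic_trajectory(r: float, x0: float, steps: int) -> List[float]:
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return all visited values."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Same rule, three regimes, depending only on the parameter r:
steady = logistic_trajectory(2.5, 0.2, 200)   # settles on a fixed point (1 - 1/r)
cycle = logistic_trajectory(3.2, 0.2, 500)    # alternates between two values
chaotic = logistic_trajectory(4.0, 0.2, 200)  # irregular, sensitive to x0
```

The point of the sketch is that the qualitative behaviour changes from stable to periodic to irregular without adding any variable, only by changing `r`.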
8.
Nucleic Acids Res ; 39(Database issue): D220-4, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21109531

ABSTRACT

The Munich Information Center for Protein Sequences (MIPS at the Helmholtz Center for Environmental Health, Neuherberg, Germany) has many years of experience in providing annotated collections of biological data. Selected data sets of high relevance, such as model genomes, are subjected to careful manual curation, while the bulk of high-throughput data is annotated by automatic means. High-quality reference resources developed in the past and still actively maintained include Saccharomyces cerevisiae, Neurospora crassa and Arabidopsis thaliana genome databases as well as several protein interaction data sets (MPACT, MPPI and CORUM). More recent projects are PhenomiR, the database on microRNA-related phenotypes, and MIPS PlantsDB for integrative and comparative plant genome research. The interlinked resources SIMAP and PEDANT provide homology relationships as well as up-to-date and consistent annotation for 38,000,000 protein sequences. PPLIPS and CCancer are versatile tools for proteomics and functional genomics interfacing to a database of compilations from gene lists extracted from literature. A novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis. All databases described here, as well as the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.helmholtz-muenchen.de).


Subject(s)
Databases, Genetic , Data Mining , Databases, Protein , Genes, Neoplasm , Genome, Plant , Genomics , Metabolomics , MicroRNAs/metabolism , Phenotype , Proteomics , Sequence Analysis, Protein , Systems Integration
9.
Pharmacopsychiatry ; 43 Suppl 1: S2-8, 2010 May.
Article in English | MEDLINE | ID: mdl-20480444

ABSTRACT

Understanding the synapse and its role in the development of psychiatric disorders is not only a demanding but also a highly relevant challenge for neuroscience. With the advancement of modern high-throughput technologies, the amount of data collected becomes incomprehensible and the volume of information intractable for the individual scientist. Why does Systems Biology open alternatives for organizing information and deducing knowledge that can be scrutinized by rationally designed experiments? We discuss some of the fundamental ideas on why Systems Biology is indeed an alternative to reductionism and show an example of how semantics may help to exploit the rich source of the scientific literature to generate qualitative models of functional modules.


Subject(s)
Models, Neurological , Synapses/physiology , Systems Biology/methods , Alzheimer Disease/physiopathology , Brain/physiology , Brain/physiopathology , Humans , Neurosciences/methods , Signal Processing, Computer-Assisted , Software , User-Computer Interface
10.
Nucleic Acids Res ; 38(Database issue): D497-501, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19884131

ABSTRACT

CORUM is a database that provides a manually curated repository of experimentally characterized protein complexes from mammalian organisms, mainly human (64%), mouse (16%) and rat (12%). Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. The new CORUM 2.0 release encompasses 2837 protein complexes offering the largest and most comprehensive publicly available dataset of mammalian protein complexes. The CORUM dataset is built from 3198 different genes, representing approximately 16% of the protein coding genes in humans. Each protein complex is described by a protein complex name, subunit composition, function as well as the literature reference that characterizes the respective protein complex. Recent developments include mapping of functional annotation to Gene Ontology terms as well as cross-references to Entrez Gene identifiers. In addition, a 'Phylogenetic Conservation' analysis tool was implemented that analyses the potential occurrence of orthologous protein complex subunits in mammals and other selected groups of organisms. This allows one to predict the occurrence of protein complexes in different phylogenetic groups. CORUM is freely accessible at (http://mips.helmholtz-muenchen.de/genre/proj/corum/index.html).


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Protein , Multiprotein Complexes , Animals , Computational Biology/trends , Humans , Information Storage and Retrieval/methods , Internet , Mice , Phylogeny , Protein Structure, Tertiary , Rats , Saccharomyces cerevisiae/genetics , Software
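
The 'Phylogenetic Conservation' analysis mentioned above checks whether the subunits of a complex have orthologs in other organisms. A minimal sketch of one way to express this follows, assuming conservation is summarized as the fraction of subunits with a known ortholog in a target organism. The function name, the data layout and the example entries are hypothetical; the abstract does not describe the tool's actual scoring.

```python
from typing import Dict, List, Set


def conserved_fraction(
    subunits: List[str], orthologs: Dict[str, Set[str]], organism: str
) -> float:
    """Fraction of complex subunits with an ortholog in the given organism.

    orthologs maps a subunit name to the set of organisms in which an
    ortholog is known (hypothetical layout).
    """
    if not subunits:
        return 0.0
    hits = sum(1 for s in subunits if organism in orthologs.get(s, set()))
    return hits / len(subunits)


# Placeholder data, not taken from CORUM.
EXAMPLE_COMPLEX = ["subunit_1", "subunit_2", "subunit_3", "subunit_4"]
EXAMPLE_ORTHOLOGS = {
    "subunit_1": {"mouse", "rat", "yeast"},
    "subunit_2": {"mouse", "rat"},
    "subunit_3": {"mouse"},
}
```

With this data, `conserved_fraction(EXAMPLE_COMPLEX, EXAMPLE_ORTHOLOGS, "mouse")` is 0.75 and the same call for `"yeast"` is 0.25, which is the kind of signal one could use to predict whether a complex is likely to occur in another phylogenetic group.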
11.
Bioinformatics ; 25(1): 141-3, 2009 Jan 01.
Article in English | MEDLINE | ID: mdl-19010804

ABSTRACT

UNLABELLED: Cross-mapping of gene and protein identifiers between different databases is a tedious and time-consuming task. To overcome this, we developed CRONOS, a cross-reference server that contains entries from five mammalian organisms presented by major gene and protein information resources. Sequence similarity analysis of the mapped entries shows that the cross-references are highly accurate. In total, up to 18 different identifier types can be used for identification of cross-references. The quality of the mapping could be improved substantially by exclusion of ambiguous gene and protein names, which were manually validated. Organism-specific lists of ambiguous terms, which are valuable for a variety of bioinformatics applications such as text mining, are available for download. AVAILABILITY: CRONOS is freely available to non-commercial users at http://mips.gsf.de/genre/proj/cronos/index.html; web services are available at http://mips.gsf.de/CronosWSService/CronosWS?wsdl.


Subject(s)
Computational Biology/instrumentation , Computational Biology/methods , Internet , Software , Animals , Genes , Humans , Proteins
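
The abstract above reports that excluding ambiguous names substantially improves mapping quality. The sketch below shows the basic mechanism: names that resolve to more than one identifier are set aside, and only one-to-one names are used for cross-referencing. The names, identifiers and the purely automatic detection are assumptions for illustration; CRONOS validated its ambiguous terms manually.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_name_index(records: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Group identifiers by case-insensitive name."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, identifier in records:
        index[name.lower()].add(identifier)
    return dict(index)


def unambiguous(index: Dict[str, Set[str]]) -> Dict[str, str]:
    """Keep only names that point to exactly one identifier."""
    return {name: next(iter(ids)) for name, ids in index.items() if len(ids) == 1}


def ambiguous_terms(index: Dict[str, Set[str]]) -> List[str]:
    """Names that point to several identifiers and should be excluded."""
    return sorted(name for name, ids in index.items() if len(ids) > 1)


# Hypothetical records: (name, identifier in some target database).
RECORDS = [("TP53", "ID_1"), ("tp53", "ID_1"), ("CAT", "ID_2"), ("CAT", "ID_3")]
```

Here `ambiguous_terms(build_name_index(RECORDS))` returns `["cat"]`, and only `"tp53"` survives in the unambiguous mapping.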
12.
Methods Inf Med ; 47(4): 283-95, 2008.
Article in English | MEDLINE | ID: mdl-18690362

ABSTRACT

OBJECTIVES: To clarify challenges and research topics for informatics in health and to describe new approaches for interdisciplinary collaboration and education. METHODS: Research challenges and possible solutions were elaborated by scientists of two universities using an interdisciplinary approach, in a series of meetings over several months. RESULTS AND CONCLUSION: In order to translate scientific results from bench to bedside and further into an evidence-based and efficient health system, intensive collaboration is needed between experts from medicine, biology, informatics, engineering, public health, as well as social and economic sciences. Research challenges can be attributed to four areas: bioinformatics and systems biology, biomedical engineering and informatics, health informatics and individual healthcare, and public health informatics. In order to bridge existing gaps between different disciplines and cultures, we suggest focusing on interdisciplinary education, taking an integrative approach and starting interdisciplinary practice at early stages of education.


Subject(s)
Biomedical Research , Medical Informatics , Public Health Informatics , Evidence-Based Medicine , Research/education
13.
Nucleic Acids Res ; 36(Web Server issue): W140-4, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18463135

ABSTRACT

The generation of expressed sequence tag (EST) libraries offers an affordable approach to investigate organisms if no genome sequence is available. OREST (http://mips.gsf.de/genre/proj/orest/index.html) is a server-based EST analysis pipeline that allows the rapid analysis of large amounts of ESTs or cDNAs from mammals and fungi. In order to assign the ESTs to genes or proteins, OREST maps DNA sequences to reference datasets of gene products and, in a second step, to complete genome sequences. Mapping against genome sequences recovers an additional 13% of EST data, which would otherwise escape further analysis. To enable functional analysis of the datasets, ESTs are functionally annotated using the hierarchical FunCat annotation scheme as well as GO annotation terms. OREST also allows prediction of the association of gene products and diseases by Morbid Map (OMIM) classification. A statistical analysis of the results of the dataset is possible with the included PROMPT software, which provides information about enrichment and depletion of functional and disease annotation terms. OREST was successfully applied for the identification and functional characterization of more than 3000 EST sequences of the common marmoset monkey (Callithrix jacchus) as part of an international collaboration.


Subject(s)
Expressed Sequence Tags/chemistry , Software , Animals , Chromosome Mapping , Genes, Fungal , Humans , Internet , Mammals/genetics , Mice , Proteins/genetics , Rats , Saccharomyces cerevisiae/genetics , Sequence Analysis, DNA
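
The abstract above mentions statistical analysis of enrichment and depletion of annotation terms. One common way to test enrichment is the upper tail of the hypergeometric distribution, sketched below. The choice of this test and all parameter names are assumptions; the abstract does not state which statistics PROMPT applies.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """P(X >= hits) for X ~ Hypergeometric(population, annotated, sample).

    population: number of genes in the background set
    annotated:  background genes carrying the annotation term
    sample:     number of genes in the analysed set (e.g. mapped ESTs)
    hits:       analysed genes carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

As a small check, with a background of 10 genes of which 5 carry a term, drawing 2 genes that both carry it has probability `enrichment_p_value(10, 5, 2, 2)`, which equals 10/45. Depletion can be tested analogously with the lower tail.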
14.
Nucleic Acids Res ; 36(Database issue): D646-50, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17965090

ABSTRACT

Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. The CORUM (http://mips.gsf.de/genre/proj/corum/index.html) database is a collection of experimentally verified mammalian protein complexes. Information is manually derived by expert annotators through critical reading of the scientific literature. Information about protein complexes includes protein complex names, subunits, literature references as well as the function of the complexes. For functional annotation, we use the FunCat catalogue, which makes it possible to organize the protein complex space into biologically meaningful subsets. The database contains more than 1750 protein complexes that are built from 2400 different genes, thus representing 12% of the protein-coding genes in human. A web-based system is available to query, view and download the data. CORUM provides a comprehensive dataset of protein complexes for discoveries in systems biology, analyses of protein networks and protein complex-associated diseases. Comparable to the MIPS reference dataset of protein complexes from yeast, CORUM intends to serve as a reference for mammalian protein complexes.


Subject(s)
Databases, Protein , Multiprotein Complexes/physiology , Animals , Humans , Internet , Mice , Multiprotein Complexes/analysis , Multiprotein Complexes/chemistry , Rats , User-Computer Interface
15.
Nucleic Acids Res ; 36(Database issue): D196-201, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18158298

ABSTRACT

The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) combines automatic processing of large amounts of sequences with manual annotation of selected model genomes. Due to the massive growth of the available data, the depth of annotation varies widely between independent databases. Also, the criteria for the transfer of information from known to orthologous sequences are diverse. Coping with the task of global in-depth genome annotation has become infeasible. Therefore, our efforts are dedicated to three levels of annotation: (i) the curation of selected genomes, in particular from fungal and plant taxa (e.g. CYGD, MNCDB, MatDB), (ii) the comprehensive, consistent, automatic annotation employing exhaustive methods for the computation of sequence similarities and sequence-related attributes as well as the classification of individual sequences (SIMAP, PEDANT and FunCat) and (iii) the compilation of manually curated databases for protein interactions based on scrutinized information from the literature to serve as an accepted set of reliable annotated interaction data (MPACT, MPPI, CORUM). All databases and tools described as well as the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).


Subject(s)
Databases, Protein , Fungal Proteins/chemistry , Fungal Proteins/genetics , Plant Proteins/chemistry , Plant Proteins/genetics , Fungal Proteins/metabolism , Genome, Fungal , Genome, Plant , Genomics , Internet , Plant Proteins/metabolism , Protein Interaction Mapping , Sequence Analysis, Protein , Software , User-Computer Interface
16.
Drug Discov Today Technol ; 3(2): 145-51, 2006.
Article in English | MEDLINE | ID: mdl-24980401

ABSTRACT

Data from large-scale genome projects, transcriptomics and proteomics experiments have provided scientists with a wealth of information establishing the basis for the investigation of cellular processes. To understand biological function beyond the single gene by the discovery and characterization of functional protein networks, bioinformatics analysis requires information about two additional attributes associated with the gene products: (i) high-level protein function prediction of experimentally uncharacterized proteins and (ii) systematic classification of protein function. This article describes the basic properties of protein classification systems and discusses examples of their implementation.

17.
Nucleic Acids Res ; 34(Database issue): D169-72, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381839

ABSTRACT

The Munich Information Center for Protein Sequences (MIPS at the GSF), Neuherberg, Germany, provides resources related to genome information. Manually curated databases for several reference organisms are maintained. Several of these databases are described elsewhere in this and other recent NAR database issues. In a complementary effort, a comprehensive set of >400 genomes automatically annotated with the PEDANT system is maintained. The main goal of our current work on creating and maintaining genome databases is to extend gene-centered information to information on interactions within a generic comprehensive framework. We have concentrated our efforts along three lines: (i) the development of suitable comprehensive data structures and database technology, communication and query tools to include a wide range of different types of information, enabling the representation of complex information such as functional modules or networks (Genome Research Environment System), (ii) the development of databases covering computable information such as the basic evolutionary relations among all genes, namely SIMAP, the sequence similarity matrix, and the CABiNet network analysis framework and (iii) the compilation and manual annotation of information related to interactions such as protein-protein interactions or other types of relations (e.g. MPCDB, MPPI, CYGD). All databases described and the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.gsf.de).


Subject(s)
Databases, Genetic , Genomics , Proteins/genetics , Animals , Computational Biology/methods , Evolution, Molecular , Internet , Mice , Models, Genetic , Protein Interaction Mapping , User-Computer Interface
18.
Nucleic Acids Res ; 34(Database issue): D252-6, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381858

ABSTRACT

Similarity Matrix of Proteins (SIMAP) (http://mips.gsf.de/simap) provides a database based on a pre-computed similarity matrix covering the similarity space formed by >4 million amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and is updated incrementally. For sequence similarity searches and pairwise alignments, we implemented a grid-enabled software system, which is based on FASTA heuristics and the Smith-Waterman algorithm. Our ProtInfo system allows querying by protein sequences covered by the SIMAP dataset as well as by fragments of these sequences, highly similar sequences and title words. Each sequence in the database is supplemented with pre-calculated features generated by detailed sequence analyses. By providing WWW interfaces as well as web-services, we offer the SIMAP resource as an efficient and comprehensive tool for sequence similarity searches.


Subject(s)
Databases, Protein , Sequence Homology, Amino Acid , Internet , Sequence Alignment , Software , User-Computer Interface
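
The abstract above names the Smith-Waterman algorithm as one basis of the similarity computation. A compact sketch of the scoring part of that algorithm follows, using a simple match/mismatch scheme and a linear gap penalty. The scoring values are assumptions chosen for readability; production protein searches typically use substitution matrices and affine gaps, and SIMAP's actual parameters are not given in the abstract.

```python
def smith_waterman_score(
    a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2
) -> int:
    """Best local alignment score between a and b (Smith-Waterman, linear gaps)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero: this is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

With the default scores, `smith_waterman_score("TTACGTAA", "GGACGTCC")` is 8, coming from the shared `ACGT` core, while two sequences without any common residue score 0. Only two rows are kept in memory, which is sufficient when just the score, not the alignment itself, is needed.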
19.
Nucleic Acids Res ; 34(Database issue): D568-71, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381934

ABSTRACT

MfunGD (http://mips.gsf.de/genre/proj/mfungd/) provides a resource for annotated mouse proteins and their occurrence in protein networks. Manual annotation concentrates on proteins which are found to interact physically with other proteins. Accordingly, manually curated information from a protein-protein interaction database (MPPI) and a database of mammalian protein complexes is interconnected with MfunGD. Protein function annotation is performed using the Functional Catalogue (FunCat) annotation scheme, which is widely used for the analysis of protein networks. The dataset is also supplemented with information about the literature that was used in the annotation process as well as links to the SIMAP FASTA database, the PEDANT protein analysis system and cross-references to external resources. Proteins that have not yet been manually inspected are annotated automatically by a graphical probabilistic model and/or superparamagnetic clustering. The database is continuously expanding to include the rapidly growing amount of functional information about gene products from mouse. MfunGD is implemented in GenRE, a J2EE-based component-oriented multi-tier architecture following the separation of concerns principle.


Subject(s)
Databases, Genetic , Genomics , Mice/genetics , Multiprotein Complexes/genetics , Multiprotein Complexes/physiology , Animals , Internet , Multiprotein Complexes/chemistry , Proteomics , Software , User-Computer Interface
20.
Nucleic Acids Res ; 33(Database issue): D364-8, 2005 Jan 01.
Article in English | MEDLINE | ID: mdl-15608217

ABSTRACT

The Comprehensive Yeast Genome Database (CYGD) compiles a comprehensive data resource for information on the cellular functions of the yeast Saccharomyces cerevisiae and related species, chosen as the best understood model organism for eukaryotes. The database serves as a common resource generated by a European consortium, going beyond the provision of sequence information and functional annotations on individual genes and proteins. In addition, it provides information on the physical and functional interactions among proteins as well as other genetic elements. These cellular networks include metabolic and regulatory pathways, signal transduction and transport processes as well as co-regulated gene clusters. As more yeast genomes are published, their annotation is greatly facilitated by using S. cerevisiae as a reference. CYGD provides a way of exploring related genomes with the aid of the S. cerevisiae genome as a backbone and SIMAP, the Similarity Matrix of Proteins. The comprehensive resource is available at http://mips.gsf.de/genre/proj/yeast/.


Subject(s)
Databases, Genetic , Genome, Fungal , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Binding Sites , Genomics , Membrane Proteins/analysis , Membrane Transport Proteins/analysis , Membrane Transport Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/metabolism , Sequence Analysis, Protein , Transcription Factors/metabolism , User-Computer Interface