1.
PLoS Comput Biol ; 16(2): e1007613, 2020 02.
Article in English | MEDLINE | ID: mdl-32032351

ABSTRACT

There is an increasing need to use genome and transcriptome sequencing to genetically diagnose patients suffering from suspected monogenic rare diseases. The proper detection of compound heterozygous variant combinations as disease-causing candidates is a challenge in diagnostic workflows as haplotype information is lost by currently used next-generation sequencing technologies. Consequently, computational tools are required to phase, or resolve the haplotype of, the high number of heterozygous variants in the exome or genome of each patient. Here we present SmartPhase, a phasing tool designed to efficiently reduce the set of potential compound heterozygous variant pairs in genetic diagnosis pipelines. The phasing algorithm of SmartPhase creates haplotypes using both parental genotype information and reads generated by DNA or RNA sequencing and is thus well suited to resolve the phase of rare variants. To inform the user about the reliability of a phasing prediction, it computes a confidence score, which is essential for selecting error-free predictions. It incorporates existing haplotype information and applies logical rules to determine variants that can be excluded as causing a recessive, monogenic disease. SmartPhase can phase either all possible variant pairs in predefined genetic loci or preselected variant pairs of interest, thus keeping the focus on clinically relevant results. We compared SmartPhase to WhatsHap, one of the leading comparable phasing tools, using simulated data and a real clinical cohort of 921 patients. On both data sets, SmartPhase generated error-free predictions using our derived confidence score threshold. It outperformed WhatsHap with regard to the percentage of resolved pairs when parental genotype information is available. On the cohort data, SmartPhase enabled on average the exclusion of approximately 22% of the input variant pairs in each singleton patient and 44% in each trio patient.
SmartPhase is implemented as an open-source Java tool and freely available at http://ibis.helmholtz-muenchen.de/smartphase/.
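
The idea of using parental genotypes to decide whether two heterozygous variants lie on the same haplotype (cis) or on different ones (trans) can be illustrated with a minimal sketch. This is not SmartPhase's algorithm or API: the function names `alt_origin` and `phase_pair` are hypothetical, genotypes are assumed to be encoded as alternate-allele counts (0, 1 or 2), the child is assumed to be heterozygous at both sites, and read-based phasing and confidence scoring are left out entirely.

```python
from typing import Optional, Tuple

# Hypothetical encoding: (mother_alt_count, father_alt_count) for one variant
# at which the child is heterozygous. Counts are 0, 1 or 2.
ParentalCounts = Tuple[int, int]


def alt_origin(mother_alt: int, father_alt: int) -> Optional[str]:
    """Infer which parent transmitted the child's single alternate allele.

    Returns None when the origin cannot be determined from genotypes alone
    (both parents heterozygous, de novo variant, or inconsistent input).
    """
    # A homozygous-alternate parent always transmits the alternate allele.
    if mother_alt == 2 and father_alt < 2:
        return "maternal"
    if father_alt == 2 and mother_alt < 2:
        return "paternal"
    # Otherwise the allele is informative only if exactly one parent carries it.
    if mother_alt > 0 and father_alt == 0:
        return "maternal"
    if father_alt > 0 and mother_alt == 0:
        return "paternal"
    return None


def phase_pair(first: ParentalCounts, second: ParentalCounts) -> str:
    """Classify a pair of heterozygous variants as cis, trans or unresolved."""
    origin_first = alt_origin(*first)
    origin_second = alt_origin(*second)
    if origin_first is None or origin_second is None:
        return "unresolved"
    # Same parental origin means both alternate alleles sit on one haplotype.
    return "cis" if origin_first == origin_second else "trans"
```

Under these assumptions, a pair classified as cis could be excluded as a compound heterozygous candidate, whereas a trans pair remains a candidate and an unresolved pair would need read-based evidence.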


Subject(s)
Heterozygote , Rare Diseases/diagnosis , Haplotypes , High-Throughput Nucleotide Sequencing/methods , Humans , Rare Diseases/genetics , Reproducibility of Results
2.
Yeast ; 36(4): 161-165, 2019 04.
Article in English | MEDLINE | ID: mdl-30650215

ABSTRACT

From 1989 to 1997, the yeast genome was sequenced by an international consortium initiated and conducted by André Goffeau (1935-2018). The article describes the pioneering collaboration of yeast scientists from a bioinformatics perspective. Indeed, the yeast genome has turned bioinformatics from an exotic hobby of a few nerds into a discipline indispensable for answering biological questions using computational methods.


Subject(s)
Computational Biology/history , Genome, Fungal , Saccharomyces cerevisiae/genetics , History, 20th Century , History, 21st Century
3.
Toxins (Basel) ; 8(1)2015 Dec 25.
Article in English | MEDLINE | ID: mdl-26712789

ABSTRACT

Increasing frequencies of 3-acetyl-deoxynivalenol (3-ADON)-producing strains of Fusarium graminearum (3-ADON chemotype) have been reported in North America and Asia. 3-ADON is nearly nontoxic at the level of the ribosomal target and has to be deacetylated to cause inhibition of protein biosynthesis. Plant cells can efficiently remove the acetyl groups of 3-ADON, but the underlying genes are as yet unknown. We therefore performed a study of the family of candidate carboxylesterase (CXE) genes of the monocot model plant Brachypodium distachyon. We report the identification and characterization of the first plant enzymes responsible for deacetylation of trichothecene toxins. The product of the BdCXE29 gene efficiently deacetylates T-2 toxin to HT-2 toxin, NX-2 to NX-3, both 3-ADON and 15-acetyl-deoxynivalenol (15-ADON) into deoxynivalenol and, to a lesser degree, also fusarenon X into nivalenol. The BdCXE52 esterase showed lower activity than BdCXE29 when expressed in yeast and accepts 3-ADON, NX-2, 15-ADON and, to a limited extent, fusarenon X as substrates. Expression of these Brachypodium genes in yeast increases the toxicity of 3-ADON, suggesting that highly similar genes existing in crop plants may act as susceptibility factors in Fusarium head blight disease.


Subject(s)
Brachypodium/genetics , Carboxylic Ester Hydrolases/genetics , Carboxylic Ester Hydrolases/metabolism , Plant Proteins/genetics , Plant Proteins/metabolism , Trichothecenes/metabolism , Acetylation , Brachypodium/enzymology , Genes, Plant , Saccharomyces cerevisiae/genetics , Trichothecenes/chemistry , Trichothecenes/toxicity
4.
Stem Cell Reports ; 5(5): 702-715, 2015 Nov 10.
Article in English | MEDLINE | ID: mdl-26527384

ABSTRACT

Hematopoietic stem cells (HSCs) are preserved in co-cultures with UG26-1B6 stromal cells or their conditioned medium. We performed a genome-wide study of gene expression changes of UG26-1B6 stromal cells in contact with Lineage⁻ SCA-1⁺ KIT⁺ (LSK) cells. This analysis identified connective tissue growth factor (CTGF) to be upregulated in response to LSK cells. We found that co-culture of HSCs on CTGF knockdown stroma (shCtgf) shows impaired engraftment and long-term quality. Further experiments demonstrated that CD34⁻ CD48⁻ CD150⁺ LSK (CD34⁻ SLAM) cell numbers from shCtgf co-cultures increase in G0 and senescence and show delayed time to first cell division. To understand this observation, a CTGF signaling network model was assembled, which was experimentally validated. In co-culture experiments of CD34⁻ SLAM cells with shCtgf stromal cells, we found that SMAD2/3-dependent signaling was activated, with increasing p27(Kip1) expression and downregulating cyclin D1. Our data support the view that LSK cells modulate gene expression in the niche to maintain repopulating HSC activity.


Subject(s)
Cell Cycle , Connective Tissue Growth Factor/pharmacology , Hematopoietic Stem Cells/cytology , Stromal Cells/metabolism , Animals , Cell Line , Cells, Cultured , Connective Tissue Growth Factor/metabolism , Cyclin D1/genetics , Cyclin D1/metabolism , Cyclin-Dependent Kinase Inhibitor p27/genetics , Cyclin-Dependent Kinase Inhibitor p27/metabolism , Hematopoietic Stem Cells/drug effects , Hematopoietic Stem Cells/metabolism , Mice , Mice, Inbred C57BL , Smad2 Protein/metabolism , Smad3 Protein/metabolism , Stem Cell Niche
5.
Genome Med ; 7: 102, 2015 Sep 29.
Article in English | MEDLINE | ID: mdl-26419521

ABSTRACT

The cause of a complex disease cannot be pinpointed to a single origin; rather, a highly complex network of many factors that interact on different levels over time and space is disturbed. This complexity requires novel approaches to diagnosis, treatment, and prevention. To foster the necessary shift to a pro-active systems medicine, proof-of-concept studies are needed. Here, we highlight several systems approaches that have been shown to work within the field of respiratory medicine, and we propose the next steps for broader implementation.


Subject(s)
Systems Analysis , Delivery of Health Care , Disease Management , Humans
6.
PLoS One ; 9(10): e110311, 2014.
Article in English | MEDLINE | ID: mdl-25333987

ABSTRACT

Fungal secondary metabolite biosynthesis genes are of major interest due to the pharmacological properties of their products (like mycotoxins and antibiotics). The genome of the plant pathogenic fungus Fusarium graminearum codes for a large number of candidate enzymes involved in secondary metabolite biosynthesis. However, the chemical nature of most enzymatic products of proteins encoded by putative secondary metabolism biosynthetic genes is largely unknown. Based on our analysis we present 67 gene clusters with significant enrichment of predicted secondary metabolism related enzymatic functions. 20 gene clusters with unknown metabolites exhibit strong gene expression correlation in planta and presumably play a role in virulence. Furthermore, the identification of conserved and over-represented putative transcription factor binding sites serves as additional evidence for cluster co-regulation. Orthologous cluster search provided insight into the evolution of secondary metabolism clusters. Some clusters are characteristic for the Fusarium phylum while others show evidence of horizontal gene transfer as orthologs can be found in representatives of the Botrytis or Cochliobolus lineage. The presented candidate clusters provide valuable targets for experimental examination.


Subject(s)
Fusarium/genetics , Gene Transfer, Horizontal , Genes, Fungal , Genome, Fungal , Multigene Family , Secondary Metabolism/genetics , Cluster Analysis , Evolution, Molecular , Gene Expression Profiling , Gene Expression Regulation, Fungal , Nucleotide Motifs , Promoter Regions, Genetic
7.
Nucleic Acids Res ; 42(21)2014 Dec 01.
Article in English | MEDLINE | ID: mdl-25294834

ABSTRACT

Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional as they reflect the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE (http://mips.helmholtz-muenchen.de/cogere), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources into a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η² (eta squared) that is derived by a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research.
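
To make the η² score concrete, the following minimal sketch computes the correlation ratio as the between-group sum of squares divided by the total sum of squares. It is a simplified one-way version for illustration only; the abstract states that COGERE derives η² from a two-way analysis of variance, which is not reproduced here. The function name `eta_squared` and the grouping of target-gene expression values by regulator state are assumptions, and every group is assumed to be non-empty.

```python
from statistics import mean
from typing import List, Sequence


def eta_squared(groups: Sequence[Sequence[float]]) -> float:
    """One-way correlation ratio: SS_between / SS_total.

    Each inner sequence holds target-gene expression values observed
    under one (hypothetical) discretized regulator state.
    """
    values: List[float] = [x for group in groups for x in group]
    grand_mean = mean(values)
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_between = sum(
        len(group) * (mean(group) - grand_mean) ** 2 for group in groups
    )
    # With no variation at all, no dependency can be measured.
    return ss_between / ss_total if ss_total > 0 else 0.0
```

A value near 1 indicates that the grouping explains most of the variance in the target's expression, while a value near 0 indicates little dependency; unlike a linear correlation coefficient, the score does not assume a monotonic relationship.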


Subject(s)
Gene Regulatory Networks , Models, Genetic , Animals , Cell Line, Tumor , Gene Expression Profiling , Humans , Mice , MicroRNAs/metabolism , Neoplasms/genetics , Transcription Factors/metabolism
8.
Nucleic Acids Res ; 42(Database issue): D279-84, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24165881

ABSTRACT

The Similarity Matrix of Proteins (SIMAP, http://mips.gsf.de/simap/) database has been designed to massively accelerate computationally expensive protein sequence analysis tasks in bioinformatics. It provides pre-calculated sequence similarities interconnecting the entire known protein sequence universe, complemented by pre-calculated protein features and domains, similarity clusters and functional annotations. SIMAP covers all major public protein databases as well as many consistently re-annotated metagenomes from different repositories. As of September 2013, SIMAP contains >163 million proteins corresponding to ∼70 million non-redundant sequences. SIMAP uses the sensitive FASTA search heuristics, the Smith-Waterman alignment algorithm, the InterPro database of protein domain models and the BLAST2GO functional annotation algorithm. SIMAP assists biologists by facilitating the interactive exploration of the protein sequence universe. Web-Service and DAS interfaces allow connecting SIMAP with any other bioinformatic tool and resource. All-against-all protein sequence similarity matrices of project-specific protein collections are generated on request. Recent improvements allow SIMAP to cover the rapidly growing sequenced protein sequence universe. New Web-Service interfaces enhance the connectivity of SIMAP. Novel tools for interactive extraction of protein similarity networks have been added. Open access to SIMAP is provided through the web portal; the portal also contains instructions and links for software access and flat file downloads.


Subject(s)
Databases, Protein , Molecular Sequence Annotation , Sequence Analysis, Protein , Internet , Protein Structure, Tertiary , Sequence Alignment , User-Computer Interface
9.
Neurogenetics ; 15(1): 49-57, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24241507

ABSTRACT

Approximately 20% of individuals with Parkinson's disease (PD) report a positive family history. Yet, a large portion of causal and disease-modifying variants is still unknown. We used exome sequencing in two affected individuals from a family with late-onset PD to identify 15 potentially causal variants. Segregation analysis and frequency assessment in 862 PD cases and 1,014 ethnically matched controls highlighted variants in EEF1D and LRRK1 as the best candidates. Mutation screening of the coding regions of these genes in 862 cases and 1,014 controls revealed several novel non-synonymous variants in both genes in cases and controls. An in silico multi-model bioinformatics analysis was used to prioritize identified variants in LRRK1 for functional follow-up. However, protein expression, subcellular localization, and cell viability were not affected by the identified variants. Although it has yet to be proven conclusively that variants in LRRK1 are indeed causative of PD, our data strengthen a possible role for LRRK1 in addition to LRRK2 in the genetic underpinnings of PD but, at the same time, highlight the difficulties encountered in the study of rare variants identified by next-generation sequencing in diseases with autosomal dominant or complex patterns of inheritance.


Subject(s)
Genetic Variation , Parkinson Disease/genetics , Protein Serine-Threonine Kinases/genetics , Algorithms , Cell Survival , DNA Mutational Analysis , Exome , Family Health , Female , Gene Dosage , Gene Frequency , Genetic Predisposition to Disease , Genotype , Germany , Humans , Male , Middle Aged , Models, Genetic , Mutation , Oligonucleotide Array Sequence Analysis , Peptide Elongation Factor 1/genetics , Phenotype
10.
Mol Plant Microbe Interact ; 26(7): 781-92, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23550529

ABSTRACT

Plant small-molecule UDP-glycosyltransferases (UGT) glycosylate a vast number of endogenous substances but also act in detoxification of metabolites produced by plant-pathogenic microorganisms. The ability to inactivate the Fusarium graminearum mycotoxin deoxynivalenol (DON) into DON-3-O-glucoside is crucial for resistance of cereals. We analyzed the UGT gene family of the monocot model species Brachypodium distachyon and functionally characterized two gene clusters containing putative orthologs of previously identified DON-detoxification genes from Arabidopsis thaliana and barley. Analysis of transcription showed that UGT encoded in both clusters are highly inducible by DON and expressed at much higher levels upon infection with a wild-type DON-producing F. graminearum strain compared with infection with a mutant deficient in DON production. Expression of these genes in a toxin-sensitive strain of Saccharomyces cerevisiae revealed that only two B. distachyon UGT encoded by members of a cluster of six genes homologous to the DON-inactivating barley HvUGT13248 were able to convert DON into DON-3-O-glucoside. Also, a single copy gene from Sorghum bicolor orthologous to this cluster and one of three putative orthologs of rice exhibit this ability. Seemingly, the UGT genes undergo rapid evolution and changes in copy number, making it difficult to identify orthologs with conserved substrate specificity.


Subject(s)
Brachypodium/enzymology , Fusarium/pathogenicity , Glycosyltransferases/metabolism , Plant Diseases/microbiology , Trichothecenes/metabolism , Amino Acid Sequence , Brachypodium/genetics , Fusarium/chemistry , Gene Dosage , Gene Expression Regulation, Plant , Gene Order , Glucosides/metabolism , Glycosyltransferases/genetics , Molecular Sequence Data , Multigene Family , Mutation , Mycotoxins/genetics , Mycotoxins/metabolism , Oryza/enzymology , Oryza/genetics , Phylogeny , Plant Proteins/genetics , Plant Proteins/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Sorghum/enzymology , Sorghum/genetics , Species Specificity , Synteny
11.
BMC Genomics ; 13: 490, 2012 Sep 18.
Article in English | MEDLINE | ID: mdl-22988944

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) have provided a large set of genetic loci influencing the risk for many common diseases. Association studies typically analyze one specific trait in single populations in an isolated fashion without taking into account the potential phenotypic and genetic correlation between traits. However, GWA data can be efficiently used to identify overlapping loci with analogous or contrasting effects on different diseases. RESULTS: Here, we describe a new approach to systematically prioritize and interpret available GWA data. We focus on the analysis of joint and disjoint genetic determinants across diseases. Using network analysis, we show that variant-based approaches are superior to locus-based analyses. In addition, we provide a prioritization of disease loci based on network properties and discuss the roles of hub loci across several diseases. We demonstrate that, in general, agonistic associations appear to reflect current disease classifications, and present the potential use of effect sizes in refining and revising these agonistic signals. We further identify potential branching points in disease etiologies based on antagonistic variants and describe plausible small-scale models of the underlying molecular switches. CONCLUSIONS: The observation that a surprisingly high fraction (>15%) of the SNPs considered in our study are associated both agonistically and antagonistically with related as well as unrelated disorders indicates that the molecular mechanisms influencing causes and progress of human diseases are in part interrelated. Genetic overlaps between two diseases also suggest the importance of the affected entities in the specific pathogenic pathways and should be investigated further.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Cluster Analysis , Genetic Loci , Genome, Human , Humans , Odds Ratio
12.
Nature ; 477(7362): 54-60, 2011 Aug 31.
Article in English | MEDLINE | ID: mdl-21886157

ABSTRACT

Genome-wide association studies (GWAS) have identified many risk loci for complex diseases, but effect sizes are typically small and information on the underlying biological processes is often lacking. Associations with metabolic traits as functional intermediates can overcome these problems and potentially inform individualized therapy. Here we report a comprehensive analysis of genotype-dependent metabolic phenotypes using a GWAS with non-targeted metabolomics. We identified 37 genetic loci associated with blood metabolite concentrations, of which 25 show effect sizes that are unusually high for GWAS and account for 10-60% differences in metabolite levels per allele copy. Our associations provide new functional insights for many disease-related associations that have been reported in previous studies, including those for cardiovascular and kidney disorders, type 2 diabetes, cancer, gout, venous thromboembolism and Crohn's disease. The study advances our knowledge of the genetic basis of metabolic individuality in humans and generates many new hypotheses for biomedical and pharmaceutical research.


Subject(s)
Biomedical Research , Drug Industry , Genetic Variation , Genome-Wide Association Study , Metabolism/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Blood/metabolism , Child , Chronic Disease , Coronary Artery Disease/genetics , Diabetes Mellitus/genetics , Female , Genetic Loci/genetics , Genotype , Humans , Male , Metabolomics , Middle Aged , Pharmacogenetics , Renal Insufficiency/genetics , Risk Factors , Venous Thromboembolism/genetics , Young Adult
13.
Bioinformatics ; 27(10): 1346-50, 2011 May 15.
Article in English | MEDLINE | ID: mdl-21441577

ABSTRACT

MOTIVATION: Pairing between the target sequence and the 6-8 nt long seed sequence of the miRNA is the most important feature for miRNA target site prediction. Novel high-throughput technologies such as Argonaute HITS-CLIP now allow a detailed study of miRNA:mRNA duplexes. These interaction maps enable a first discrimination between functional and non-functional target sites in bulk. Prediction algorithms apply different seed paradigms to identify miRNA target sites. Therefore, a quantitative assessment of miRNA target site prediction is of major interest. RESULTS: We identified a set of canonical seed types based on a transcriptome-wide analysis of experimentally verified functional target sites. We confirmed the specificity of long seeds but we found that the majority of functional target sites are formed by less specific seeds of only 6 nt, indicating a crucial role of this type. A substantial fraction of genuine target sites are non-conserved. Moreover, the majority of functional sites remain uncovered by common prediction methods.
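
Seed pairing as described above can be sketched as a simple string search: take the seed starting at nucleotide 2 of the miRNA, form its reverse complement, and look for that motif in the target sequence. The sketch below is an illustration under stated assumptions, not the method evaluated in the article: the function names `seed_site` and `find_seed_sites` are hypothetical, only perfect Watson-Crick matches are considered, and sequences are assumed to be uppercase RNA written 5' to 3'.

```python
from typing import List

# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str, length: int = 6) -> str:
    """Return the target motif (5'->3') pairing with the miRNA seed.

    The seed is taken from miRNA nucleotide 2 onward, so length=6
    covers positions 2-7 and length=7 covers positions 2-8.
    """
    seed = mirna[1:1 + length]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_sites(mirna: str, utr: str, length: int = 6) -> List[int]:
    """Return 0-based start positions of perfect seed matches in the UTR."""
    site = seed_site(mirna, length)
    positions: List[int] = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping sites are also reported.
        start = utr.find(site, start + 1)
    return positions
```

For example, with the let-7a sequence `UGAGGUAGUAGGUUGUAUAGUU` as input, the 6 nt motif searched for is `UACCUC`. Because a 6 nt motif occurs by chance far more often than a 7 or 8 nt one, such a search illustrates why short seeds are less specific.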


Subject(s)
Algorithms , Gene Expression Profiling , MicroRNAs/chemistry , MicroRNAs/genetics , Animals , Base Sequence , Eukaryotic Initiation Factors/metabolism , Humans , Mice , MicroRNAs/metabolism , Oligonucleotide Array Sequence Analysis , Oligonucleotides/genetics , Oligonucleotides/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism
14.
Nucleic Acids Res ; 39(Database issue): D637-9, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21051345

ABSTRACT

The MIPS Fusarium graminearum Genome Database (FGDB) was established as a comprehensive genome database on one of the most devastating fungal plant pathogens of wheat, barley and maize. The current version of FGDB v3.1 provides information on the full manually revised gene set based on the Broad Institute assembly FG3 genome sequence. The results of gene prediction tools were integrated with the help of comparative data on related species to result in a set of 13,718 annotated protein coding genes. This rigorous approach involved adding or modifying gene models and represents a coding sequence gold standard for the genus Fusarium. The gene loci improvements result in 2461 genes which either are new or have different structures compared to the Broad Institute assembly 3 gene set. Moreover, the database serves as a convenient entry point to explore expression data results and to obtain information on the Affymetrix GeneChip probe sets. The resource is accessible at http://mips.gsf.de/genre/proj/FGDB/.


Subject(s)
Databases, Genetic , Fusarium/genetics , Fungal Proteins/genetics , Fusarium/metabolism , Gene Expression Profiling , Genome, Fungal , Molecular Sequence Annotation
15.
PLoS One ; 5(11): e13953, 2010 Nov 11.
Article in English | MEDLINE | ID: mdl-21085649

ABSTRACT

BACKGROUND: Metabolomics is the rapidly evolving field of the comprehensive measurement of ideally all endogenous metabolites in a biological fluid. However, no single analytic technique covers the entire spectrum of the human metabolome. Here we present results from a multiplatform study, in which we investigate what kind of results can presently be obtained in the field of diabetes research when combining metabolomics data collected on a complementary set of analytical platforms in the framework of an epidemiological study. METHODOLOGY/PRINCIPAL FINDINGS: 40 individuals with self-reported diabetes and 60 controls (male, over 54 years) were randomly selected from the participants of the population-based KORA (Cooperative Health Research in the Region of Augsburg) study, representing an extensively phenotyped sample of the general German population. Concentrations of over 420 unique small molecules were determined in overnight-fasting blood using three different techniques, covering nuclear magnetic resonance and tandem mass spectrometry. Known biomarkers of diabetes could be replicated by this multiple metabolomic platform approach, including sugar metabolites (1,5-anhydroglucitol), ketone bodies (3-hydroxybutyrate), and branched chain amino acids. In some cases, diabetes-related medication can be detected (pioglitazone, salicylic acid). CONCLUSIONS/SIGNIFICANCE: Our study depicts the promising potential of metabolomics in diabetes research by identification of a series of known and also novel, deregulated metabolites that associate with diabetes. Key observations include perturbations of metabolic pathways linked to kidney dysfunction (3-indoxyl sulfate), lipid metabolism (glycerophospholipids, free fatty acids), and interaction with the gut microflora (bile acids). Our study suggests that metabolic markers hold the potential to detect diabetes-related complications already under sub-clinical conditions in the general population.


Subject(s)
Diabetes Mellitus, Type 2/metabolism , Metabolomics/methods , Aged , Amino Acid Sequence , Amino Acids/metabolism , Arachidonic Acid/metabolism , Carbohydrate Metabolism , Diabetes Mellitus, Type 2/blood , Diabetes Mellitus, Type 2/epidemiology , Fatty Acids/metabolism , Germany/epidemiology , Glucose/metabolism , Humans , Ketone Bodies/metabolism , Male , Middle Aged , Molecular Sequence Data
16.
Nat Genet ; 42(12): 1131-4, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21057504

ABSTRACT

An isolated defect of respiratory chain complex I activity is a frequent biochemical abnormality in mitochondrial disorders. Despite intensive investigation in recent years, in most instances, the molecular basis underpinning complex I defects remains unknown. We report whole-exome sequencing of a single individual with severe, isolated complex I deficiency. This analysis, followed by filtering with a prioritization of mitochondrial proteins, led us to identify compound heterozygous mutations in ACAD9, which encodes a poorly understood member of the mitochondrial acyl-CoA dehydrogenase protein family. We demonstrated the pathogenic role of the ACAD9 variants by the correction of the complex I defect on expression of the wildtype ACAD9 protein in fibroblasts derived from affected individuals. ACAD9 screening of 120 additional complex I-defective index cases led us to identify two additional unrelated cases and a total of five pathogenic ACAD9 alleles.


Subject(s)
Acyl-CoA Dehydrogenases/genetics , Electron Transport Complex I/deficiency , Exons/genetics , Mutation/genetics , Sequence Analysis, DNA , Acyl-CoA Dehydrogenases/chemistry , Amino Acid Sequence , Cell Line , Child , Child, Preschool , Electron Transport Complex I/metabolism , Electrophoresis, Gel, Two-Dimensional , Female , Fibroblasts/drug effects , Fibroblasts/metabolism , Genetic Complementation Test , Humans , Infant , Male , Molecular Sequence Data , Riboflavin/pharmacology , Transduction, Genetic
17.
Nucleic Acids Res ; 38(Database issue): D223-6, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19906725

ABSTRACT

The prediction of protein function as well as the reconstruction of evolutionary genesis employing sequence comparison at large is still the most powerful tool in sequence analysis. Due to the exponential growth of the number of known protein sequences and the subsequent quadratic growth of the similarity matrix, the computation of the Similarity Matrix of Proteins (SIMAP) becomes a computationally intensive task. The SIMAP database provides a comprehensive and up-to-date pre-calculation of the protein sequence similarity matrix, sequence-based features and sequence clusters. As of September 2009, SIMAP covers 48 million proteins and more than 23 million non-redundant sequences. Novel features of SIMAP include the expansion of the sequence space by including databases such as ENSEMBL as well as the integration of metagenomes based on their consistent processing and annotation. Furthermore, protein function predictions by Blast2GO are pre-calculated for all sequences in SIMAP and the data access and query functions have been improved. SIMAP assists biologists to query the up-to-date sequence space systematically and facilitates large-scale downstream projects in computational biology. Access to SIMAP is freely provided through the web portal for individuals (http://mips.gsf.de/simap/) and for programmatic access through DAS (http://webclu.bio.wzw.tum.de/das/) and Web-Service (http://mips.gsf.de/webservices/services/SimapService2.0?wsdl).


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Databases, Protein , Proteins/chemistry , Animals , Computational Biology/trends , Humans , Information Storage and Retrieval/methods , Internet , Open Reading Frames , Protein Structure, Tertiary , Sequence Analysis, Protein , Software , User-Computer Interface
18.
Nat Genet ; 42(2): 137-41, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20037589

ABSTRACT

Serum metabolite concentrations provide a direct readout of biological processes in the human body, and they are associated with disorders such as cardiovascular and metabolic diseases. We present a genome-wide association study (GWAS) of 163 metabolic traits measured in human blood from 1,809 participants from the KORA population, with replication in 422 participants of the TwinsUK cohort. For eight out of nine replicated loci (FADS1, ELOVL2, ACADS, ACADM, ACADL, SPTLC3, ETFDH and SLC16A9), the genetic variant is located in or near genes encoding enzymes or solute carriers whose functions match the associating metabolic traits. In our study, the use of metabolite concentration ratios as proxies for enzymatic reaction rates reduced the variance and yielded robust statistical associations with P values ranging from 3 × 10⁻²⁴ to 6.5 × 10⁻¹⁷⁹. These loci explained 5.6%-36.3% of the observed variance in metabolite concentrations. For several loci, associations with clinically relevant parameters have been reported previously.


Subject(s)
Genetic Variation , Genome, Human/genetics , Genome-Wide Association Study , Metabolome/genetics , Delta-5 Fatty Acid Desaturase , Genetic Loci/genetics , Humans , Polymorphism, Single Nucleotide/genetics , Reproducibility of Results , United Kingdom
19.
Cytometry A ; 75(10): 816-32, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19739086

ABSTRACT

Recent developments in proteomics technology offer new opportunities for clinical applications in hospital or specialized laboratories including the identification of novel biomarkers, monitoring of disease, detecting adverse effects of drugs, and environmental hazards. Advanced spectrometry technologies and the development of new protein array formats have brought these analyses to a standard, which now has the potential to be used in clinical diagnostics. Besides standardization of methodologies and distribution of proteomic data into public databases, the nature of the human body fluid proteome with its high dynamic range in protein concentrations, its quantitation problems, and its extreme complexity present enormous challenges. Molecular cell biology (cytomics) with its link to proteomics is a new fast moving scientific field, which addresses functional cell analysis and bioinformatic approaches to search for novel cellular proteomic biomarkers or their release products into body fluids that provide better insight into the enormous biocomplexity of disease processes and are suitable for patient stratification, therapeutic monitoring, and prediction of prognosis. Experience from studies of in vitro diagnostics and especially in clinical chemistry showed that the majority of errors occur in the preanalytical phase and the setup of the diagnostic strategy. This is also true for clinical proteomics where similar preanalytical variables such as inter- and intra-assay variability due to biological variations or proteolytic activities in the sample will most likely also influence the results of proteomics studies. However, before complex proteomic analysis can be introduced at a broader level into the clinic, standardization of the preanalytical phase including patient preparation, sample collection, sample preparation, sample storage, measurement, and data analysis is another issue which has to be improved.
In this report, we discuss the recent advances and applications that fulfill the criteria for clinical proteomics with the focus on cellular proteomics (cytoproteomics) as related to preanalytical and analytical standardization and to quality control measures required for effective implementation of these technologies and analytes into routine laboratory testing to generate novel actionable health information. It will then be crucial to design and carry out clinical studies that can eventually identify novel clinical diagnostic strategies based on these techniques and validate their impact on clinical decision making.


Subject(s)
Cells/metabolism , Proteomics/methods , Proteomics/trends , Analytic Sample Preparation Methods , Computational Biology , Humans , Proteomics/standards , Statistics as Topic
20.
PLoS One ; 4(7): e6473, 2009 Jul 31.
Article in English | MEDLINE | ID: mdl-19649282

ABSTRACT

It is known that miRNA target sites are very short and the effect of miRNA-target site interaction alone appears to be unspecific. Recent experiments suggest further context signals involved in miRNA target site recognition and regulation. Here, we present a novel GC-rich RNA motif downstream of experimentally supported miRNA target sites in human mRNAs with no similarity to previously reported functional motifs. We demonstrate that the novel motif can be found in at least one third of all transcripts regulated by miRNAs. Furthermore, we show that motif occurrence and the frequency of miRNA target sites as well as the stability of their duplex structures correlate. The finding that the novel motif is significantly associated with miRNA target sites suggests a functional role of the motif in miRNA target site biology. Beyond that, the novel motif has the potential to improve prediction of miRNA target sites significantly.


Subject(s)
Enhancer Elements, Genetic , MicroRNAs/genetics , 3' Untranslated Regions , Humans