1.
Nat Methods ; 9(4): 345-50, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22453911

ABSTRACT

The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures and our data-distribution practices.


Subjects
Databases, Protein; Protein Interaction Mapping; Proteins/metabolism; Periodicals as Topic; Protein Binding; Proteins/chemistry; Quality Control
2.
Nucleic Acids Res ; 39(Database issue): D220-4, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21109531

ABSTRACT

The Munich Information Center for Protein Sequences (MIPS at the Helmholtz Center for Environmental Health, Neuherberg, Germany) has many years of experience in providing annotated collections of biological data. Selected data sets of high relevance, such as model genomes, are subjected to careful manual curation, while the bulk of high-throughput data is annotated by automatic means. High-quality reference resources developed in the past and still actively maintained include Saccharomyces cerevisiae, Neurospora crassa and Arabidopsis thaliana genome databases as well as several protein interaction data sets (MPACT, MPPI and CORUM). More recent projects are PhenomiR, the database on microRNA-related phenotypes, and MIPS PlantsDB for integrative and comparative plant genome research. The interlinked resources SIMAP and PEDANT provide homology relationships as well as up-to-date and consistent annotation for 38,000,000 protein sequences. PPLIPS and CCancer are versatile tools for proteomics and functional genomics interfacing to a database of compilations from gene lists extracted from literature. A novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis. All databases described here, as well as the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.helmholtz-muenchen.de).


Subjects
Databases, Genetic; Data Mining; Databases, Protein; Genes, Neoplasm; Genome, Plant; Genomics; Metabolomics; MicroRNAs/metabolism; Phenotype; Proteomics; Sequence Analysis, Protein; Systems Integration
3.
BMC Genomics ; 13: 490, 2012 Sep 18.
Article in English | MEDLINE | ID: mdl-22988944

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) have provided a large set of genetic loci influencing the risk for many common diseases. Association studies typically analyze one specific trait in single populations in an isolated fashion without taking into account the potential phenotypic and genetic correlation between traits. However, GWA data can be efficiently used to identify overlapping loci with analogous or contrasting effects on different diseases. RESULTS: Here, we describe a new approach to systematically prioritize and interpret available GWA data. We focus on the analysis of joint and disjoint genetic determinants across diseases. Using network analysis, we show that variant-based approaches are superior to locus-based analyses. In addition, we provide a prioritization of disease loci based on network properties and discuss the roles of hub loci across several diseases. We demonstrate that, in general, agonistic associations appear to reflect current disease classifications, and present the potential use of effect sizes in refining and revising these agonistic signals. We further identify potential branching points in disease etiologies based on antagonistic variants and describe plausible small-scale models of the underlying molecular switches. CONCLUSIONS: The observation that a surprisingly high fraction (>15%) of the SNPs considered in our study are associated both agonistically and antagonistically with related as well as unrelated disorders indicates that the molecular mechanisms influencing causes and progress of human diseases are in part interrelated. Genetic overlaps between two diseases also suggest the importance of the affected entities in the specific pathogenic pathways and should be investigated further.


Subjects
Genome-Wide Association Study; Polymorphism, Single Nucleotide; Cluster Analysis; Genetic Loci; Genome, Human; Humans; Odds Ratio
4.
Bioinformatics ; 27(10): 1346-50, 2011 May 15.
Article in English | MEDLINE | ID: mdl-21441577

ABSTRACT

MOTIVATION: Pairing between the target sequence and the 6-8 nt long seed sequence of the miRNA presents the most important feature for miRNA target site prediction. Novel high-throughput technologies such as Argonaute HITS-CLIP afford meanwhile a detailed study of miRNA:mRNA duplices. These interaction maps enable a first discrimination between functional and non-functional target sites in a bulky fashion. Prediction algorithms apply different seed paradigms to identify miRNA target sites. Therefore, a quantitative assessment of miRNA target site prediction is of major interest. RESULTS: We identified a set of canonical seed types based on a transcriptome wide analysis of experimentally verified functional target sites. We confirmed the specificity of long seeds but we found that the majority of functional target sites are formed by less specific seeds of only 6 nt indicating a crucial role of this type. A substantial fraction of genuine target sites are non-conserved. Moreover, the majority of functional sites remain uncovered by common prediction methods.
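
Illustrative note (not part of the cited abstract): the idea of a 6 nt seed match can be sketched in a few lines of Python. The sketch assumes the common convention that the seed starts at miRNA position 2 and that a site is a perfect Watson-Crick complement of the seed; the function names, the `seed_len` parameter and the toy sequences are hypothetical and do not reproduce the seed types defined in the paper.

```python
from typing import List


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A/U/G/C only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna.upper()))


def seed_sites(mirna: str, target: str, seed_len: int = 6) -> List[int]:
    """Find 0-based start positions in `target` that pair perfectly with the seed.

    Assumption: the seed is miRNA positions 2..(seed_len + 1), counted from
    the 5' end, and only perfect Watson-Crick pairing counts as a site.
    """
    seed = mirna.upper()[1:1 + seed_len]
    site = reverse_complement(seed)  # what the seed looks like on the mRNA
    target = target.upper()
    return [
        i
        for i in range(len(target) - len(site) + 1)
        if target[i:i + len(site)] == site
    ]


if __name__ == "__main__":
    toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # toy input, 5'->3'
    toy_utr = "AAUACCUCAA"                # toy 3'-UTR fragment, 5'->3'
    print(seed_sites(toy_mirna, toy_utr, seed_len=6))
    print(seed_sites(toy_mirna, toy_utr, seed_len=8))
```

Raising `seed_len` from 6 to 8 makes a match more specific but rarer, which is the trade-off the abstract describes.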


Subjects
Algorithms; Gene Expression Profiling; MicroRNAs/chemistry; MicroRNAs/genetics; Animals; Base Sequence; Eukaryotic Initiation Factors/metabolism; Humans; Mice; MicroRNAs/metabolism; Oligonucleotide Array Sequence Analysis; Oligonucleotides/genetics; Oligonucleotides/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism
5.
BMC Bioinformatics ; 11: 522, 2010 Oct 20.
Article in English | MEDLINE | ID: mdl-20961418

ABSTRACT

BACKGROUND: Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. RESULTS: Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. CONCLUSIONS: In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach both on artificial and real-world data. It is readily applicable to any further problem.
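
Illustrative note (not part of the cited abstract): the abstract refers to the multiplicative update rules of non-negative matrix factorization (NMF). The sketch below shows the standard Lee-Seung updates for the squared-error objective on a small bipartite adjacency matrix. It is plain NMF, swapped in for illustration, and not the authors' fuzzy k-partite algorithm. All names (`nmf`, `memberships`, the toy matrix) are hypothetical.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def frobenius_error(v: Matrix, w: Matrix, h: Matrix) -> float:
    """Squared Frobenius distance between V and W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))


def nmf(v: Matrix, rank: int, iterations: int = 200, seed: int = 0,
        eps: float = 1e-9) -> Tuple[Matrix, Matrix]:
    """Factor V (n x m) into W (n x rank) and H (rank x m), all entries >= 0."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)] for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)] for i in range(n)]
    return w, h


def memberships(w: Matrix) -> Matrix:
    """Normalize each row of W so it reads as fuzzy cluster memberships."""
    return [[x / (sum(row) or 1.0) for x in row] for row in w]


if __name__ == "__main__":
    # Toy bipartite adjacency: rows = one node type, columns = the other.
    v = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
    w, h = nmf(v, rank=2)
    for row in memberships(w):
        print([round(x, 3) for x in row])
```

Because every update multiplies by a non-negative ratio, the factors stay non-negative without an explicit projection step, which is one reason such rules are popular for soft clustering.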


Subjects
Cluster Analysis; Computational Biology/methods; Models, Biological; Algorithms; Pattern Recognition, Automated/methods; Transcription Factors
6.
Nat Biotechnol ; 25(8): 894-8, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17687370

ABSTRACT

A wealth of molecular interaction data is available in the literature, ranging from large-scale datasets to a single interaction confirmed by several different techniques. These data are all too often reported either as free text or in tables of variable format, and are often missing key pieces of information essential for a full understanding of the experiment. Here we propose MIMIx, the minimum information required for reporting a molecular interaction experiment. Adherence to these reporting guidelines will result in publications of increased clarity and usefulness to the scientific community and will support the rapid, systematic capture of molecular interaction data in public databases, thereby improving access to valuable interaction data.


Subjects
Databases, Protein/standards; Guidelines as Topic; Information Storage and Retrieval/standards; Protein Interaction Mapping/standards; Proteomics/standards; Research/standards; Humans; Internationality
7.
Nucleic Acids Res ; 36(Database issue): D651-5, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17999995

ABSTRACT

DIMA, the domain interaction map, has evolved from a simple web server for domain phylogenetic profiling into an integrative prediction resource combining both experimental data on domain-domain interactions and predictions from two different algorithms. With this update, DIMA obtains greatly improved coverage at the level of genomes and domains as well as with respect to available prediction approaches. The domain phylogenetic profiling method now uses SIMAP as its backend for exhaustive domain hit coverage: 7038 Pfam domains were profiled over 460 completely sequenced genomes. Domain pair exclusion predictions were produced from 83 969 distinct protein-protein interactions obtained from IntAct resulting in 21 513 domain pairs with significant domain pair exclusion algorithm scores. Additional predictions applying the same algorithm to predicted protein interactions from STRING yielded 2378 high-confidence pairs. Experimental data comes from iPfam (3074) and 3did (3034 pairs), two databases identifying domain contacts in solved protein structures. Taken together, these two resources yielded 3653 distinct interacting domain pairs. DIMA is available at http://mips.gsf.de/genre/proj/dima.


Subjects
Databases, Protein; Protein Interaction Domains and Motifs; Protein Interaction Mapping; Internet; Phylogeny; Protein Interaction Domains and Motifs/genetics; Proteins/classification; User-Computer Interface
8.
Nucleic Acids Res ; 36(Database issue): D289-92, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18037617

ABSTRACT

Protein sequences are the most important source of evolutionary and functional information for new proteins. In order to facilitate the computationally intensive tasks of sequence analysis, the Similarity Matrix of Proteins (SIMAP) database aims to provide a comprehensive and up-to-date dataset of the pre-calculated sequence similarity matrix and sequence-based features like InterPro domains for all proteins contained in the major public sequence databases. As of September 2007, SIMAP covers approximately 17 million proteins and more than 6 million non-redundant sequences and provides a complete annotation based on InterPro 16. Novel features of SIMAP include a new, portlet-based web portal providing multiple, structured views on retrieved proteins and integration of protein clusters and a unique search method for similar domain architectures. Access to SIMAP is freely provided for academic use through the web portal for individuals at http://mips.gsf.de/simap/ and through Web Services for programmatic access at http://mips.gsf.de/webservices/services/SimapService2.0?wsdl.


Subjects
Databases, Protein; Sequence Alignment; Sequence Analysis, Protein; Internet; Protein Structure, Tertiary; Proteins/classification; User-Computer Interface
9.
Nucleic Acids Res ; 36(Database issue): D646-50, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17965090

ABSTRACT

Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. The CORUM (http://mips.gsf.de/genre/proj/corum/index.html) database is a collection of experimentally verified mammalian protein complexes. Information is manually derived by critical reading of the scientific literature from expert annotators. Information about protein complexes includes protein complex names, subunits, literature references as well as the function of the complexes. For functional annotation, we use the FunCat catalogue that enables to organize the protein complex space into biologically meaningful subsets. The database contains more than 1750 protein complexes that are built from 2400 different genes, thus representing 12% of the protein-coding genes in human. A web-based system is available to query, view and download the data. CORUM provides a comprehensive dataset of protein complexes for discoveries in systems biology, analyses of protein networks and protein complex-associated diseases. Comparable to the MIPS reference dataset of protein complexes from yeast, CORUM intends to serve as a reference for mammalian protein complexes.


Subjects
Databases, Protein; Multiprotein Complexes/physiology; Animals; Humans; Internet; Mice; Multiprotein Complexes/analysis; Multiprotein Complexes/chemistry; Rats; User-Computer Interface
10.
Nucleic Acids Res ; 34(Database issue): D252-6, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381858

ABSTRACT

Similarity Matrix of Proteins (SIMAP) (http://mips.gsf.de/simap) provides a database based on a pre-computed similarity matrix covering the similarity space formed by >4 million amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and is updated incrementally. For sequence similarity searches and pairwise alignments, we implemented a grid-enabled software system, which is based on FASTA heuristics and the Smith-Waterman algorithm. Our ProtInfo system allows querying by protein sequences covered by the SIMAP dataset as well as by fragments of these sequences, highly similar sequences and title words. Each sequence in the database is supplemented with pre-calculated features generated by detailed sequence analyses. By providing WWW interfaces as well as web-services, we offer the SIMAP resource as an efficient and comprehensive tool for sequence similarity searches.
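
Illustrative note (not part of the cited abstract): the Smith-Waterman algorithm named above computes the best local alignment score by dynamic programming. The sketch below is a minimal version with a linear gap penalty and a flat match/mismatch score. These scoring values and the function name are assumptions for illustration, whereas production protein searches use substitution matrices and affine gaps. It says nothing about how SIMAP itself is implemented.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local alignment score between strings a and b.

    Uses two rolling rows of the dynamic-programming matrix; every cell is
    clamped at 0 so an alignment may start anywhere (local alignment).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diagonal = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACACACTA", "AGCACACA"))
```

The quadratic cost of filling this matrix for every sequence pair is why a pre-computed similarity matrix, combined with faster heuristics as a pre-filter, is attractive at database scale.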


Subjects
Databases, Protein; Sequence Homology, Amino Acid; Internet; Sequence Alignment; Software; User-Computer Interface
11.
Nucleic Acids Res ; 34(Database issue): D436-41, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381906

ABSTRACT

In recent years, the Munich Information Center for Protein Sequences (MIPS) yeast protein-protein interaction (PPI) dataset has been used in numerous analyses of protein networks and has been called a gold standard because of its quality and comprehensiveness [H. Yu, N. M. Luscombe, H. X. Lu, X. Zhu, Y. Xia, J. D. Han, N. Bertin, S. Chung, M. Vidal and M. Gerstein (2004) Genome Res., 14, 1107-1118]. MPact and the yeast protein localization catalog provide information related to the proximity of proteins in yeast. Beside the integration of high-throughput data, information about experimental evidence for PPIs in the literature was compiled by experts adding up to 4300 distinct PPIs connecting 1500 proteins in yeast. As the interaction data is a complementary part of CYGD, interactive mapping of data on other integrated data types such as the functional classification catalog [A. Ruepp, A. Zollner, D. Maier, K. Albermann, J. Hani, M. Mokrejs, I. Tetko, U. Güldener, G. Mannhaupt, M. Münsterkötter and H. W. Mewes (2004) Nucleic Acids Res., 32, 5539-5545] is possible. A survey of signaling proteins and comparison with pathway data from KEGG demonstrates that based on these manually annotated data only an extensive overview of the complexity of this functional network can be obtained in yeast. The implementation of a web-based PPI-analysis tool allows analysis and visualization of protein interaction networks and facilitates integration of our curated data with high-throughput datasets. The complete dataset as well as user-defined sub-networks can be retrieved easily in the standardized PSI-MI format. The resource can be accessed through http://mips.gsf.de/genre/proj/mpact.


Subjects
Databases, Protein; Protein Interaction Mapping; Saccharomyces cerevisiae Proteins/metabolism; Internet; Saccharomyces cerevisiae/metabolism; Signal Transduction; Software; Two-Hybrid System Techniques; User-Computer Interface
12.
Nucleic Acids Res ; 34(Database issue): D456-8, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381910

ABSTRACT

The MIPS Fusarium graminearum Genome Database (FGDB) is a comprehensive genome database on one of the most devastating fungal plant pathogens of wheat and barley. FGDB provides information on two gene sets independently derived by automated annotation of the F. graminearum genome sequence. A complete manually revised gene set will be completed within the near future. The initial results of systematic manual correction of gene calls are already part of the current gene set. The database can be accessed to retrieve information from bioinformatics analyses and functional classifications of the proteins. The data are also organized in the well established MIPS catalogs and novel query techniques are available to search the data. The comprehensive set of gene calls was also used for the design of an Affymetrix GeneChip. The resource is accessible on http://mips.gsf.de/genre/proj/fusarium/.


Subjects
Databases, Genetic; Fusarium/genetics; Genome, Fungal; Fungal Proteins/classification; Fungal Proteins/genetics; Fungal Proteins/physiology; Genomics; Internet; User-Computer Interface
13.
Nucleic Acids Res ; 34(Database issue): D568-71, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381934

ABSTRACT

MfunGD (http://mips.gsf.de/genre/proj/mfungd/) provides a resource for annotated mouse proteins and their occurrence in protein networks. Manual annotation concentrates on proteins which are found to interact physically with other proteins. Accordingly, manually curated information from a protein-protein interaction database (MPPI) and a database of mammalian protein complexes is interconnected with MfunGD. Protein function annotation is performed using the Functional Catalogue (FunCat) annotation scheme which is widely used for the analysis of protein networks. The dataset is also supplemented with information about the literature that was used in the annotation process as well as links to the SIMAP Fasta database, the Pedant protein analysis system and cross-references to external resources. Proteins that so far were not manually inspected are annotated automatically by a graphical probabilistic model and/or superparamagnetic clustering. The database is continuously expanding to include the rapidly growing amount of functional information about gene products from mouse. MfunGD is implemented in GenRE, a J2EE-based component-oriented multi-tier architecture following the separation of concern principle.


Subjects
Databases, Genetic; Genomics; Mice/genetics; Multiprotein Complexes/genetics; Multiprotein Complexes/physiology; Animals; Internet; Multiprotein Complexes/chemistry; Proteomics; Software; User-Computer Interface
14.
BMC Biol ; 5: 44, 2007 Oct 09.
Article in English | MEDLINE | ID: mdl-17925023

ABSTRACT

BACKGROUND: Molecular interaction information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to this very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but focused purely on protein-protein interactions. RESULTS: The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format MITAB2.5 has been developed for the benefit of users who require only minimal information in an easy to access configuration. CONCLUSION: The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sector, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel.
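
Illustrative note (not part of the cited abstract): a tab-delimited MITAB2.5 record has 15 columns, so it can be read with a few lines of code. The sketch below is shown in Python rather than the Perl mentioned in the abstract. The dictionary keys are my own labels for the 15 columns, the sample record is invented, and splitting multi-valued fields on `|` is a simplification that ignores quoted values.

```python
from typing import Dict, List

# Hypothetical short labels for the 15 MITAB2.5 columns, in file order.
MITAB25_COLUMNS = [
    "id_a", "id_b",
    "alt_ids_a", "alt_ids_b",
    "aliases_a", "aliases_b",
    "detection_methods", "first_authors", "publication_ids",
    "taxid_a", "taxid_b",
    "interaction_types", "source_databases",
    "interaction_ids", "confidence_scores",
]


def parse_mitab25_line(line: str) -> Dict[str, List[str]]:
    """Split one MITAB2.5 record into a dict of value lists.

    A lone "-" marks an empty field; multiple values are separated by "|".
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != len(MITAB25_COLUMNS):
        raise ValueError(f"expected 15 columns, found {len(fields)}")
    return {
        name: ([] if value == "-" else value.split("|"))
        for name, value in zip(MITAB25_COLUMNS, fields)
    }


if __name__ == "__main__":
    # Invented record for demonstration only.
    sample = "\t".join([
        "uniprotkb:P00001", "uniprotkb:P00002", "-", "-", "-", "-",
        'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2007)", "pubmed:0000000",
        "taxid:9606(human)", "taxid:9606(human)",
        'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0469"(IntAct)',
        "intact:EBI-0000000", "-",
    ])
    record = parse_mitab25_line(sample)
    print(record["id_a"], record["id_b"], record["publication_ids"])
```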


Subjects
Databases, Protein/standards; Natural Language Processing; Protein Interaction Mapping/methods; Proteomics/methods; Computational Biology; Computer Graphics; Database Management Systems; Proteomics/standards; User-Computer Interface
15.
Methods Mol Biol ; 396: 3-15, 2007.
Article in English | MEDLINE | ID: mdl-18025682

ABSTRACT

Conserved domains carry many of the functional features found in the proteins of an organism. This includes not only catalytic activity, substrate binding, and structural features but also molecular adapters, which mediate the physical interactions between proteins or proteins with other molecules. In addition, two conserved domains can be linked not by physical contact but by a common function like forming a binding pocket. Although a wealth of experimental data has been collected and carefully curated for protein-protein interactions, as of today little useful data is available from major databases with respect to relations on the domain level. This lack of data makes computational prediction of domain-domain interactions a very important endeavor. In this chapter, we discuss the available experimental data (iPfam) and describe some important approaches to the problem of identifying interacting and/or functionally linked domain pairs from different kinds of input data. Specifically, we will discuss phylogenetic profiling on the level of conserved protein domains on one hand and inference of domain-interactions from observed or predicted protein-protein interactions datasets on the other. We explore the predictive power of these predictions and point out the importance of deploying as many different methods as possible for the best results.


Subjects
Proteins/chemistry; Catalysis; Protein Conformation; Proteins/metabolism; Substrate Specificity
16.
Bioinformatics ; 21 Suppl 2: ii42-6, 2005 Sep 01.
Article in English | MEDLINE | ID: mdl-16204123

ABSTRACT

MOTIVATION: Sequence similarity searches are of great importance in bioinformatics. Exhaustive searches for homologous proteins in databases are computationally expensive and can be replaced by a database of pre-calculated homologies in many cases. Retrieving similarities from an incrementally updated database instead of repeatedly recalculating them should provide homologs much faster and frees computational resources for other purposes. RESULTS: We have implemented SIMAP, a database containing the similarity space formed by almost all amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and allows incremental updates. We have implemented a powerful backbone for similarity computation, which is based on FASTA heuristics. By providing WWW interfaces as well as web services, we make our data accessible to the worldwide community. We have also adapted procedures to detect putative orthologs as example applications. AVAILABILITY: The SIMAP portal page providing links to SIMAP services is publicly available: http://mips.gsf.de/services/analysis/simap/. The web services can be accessed under http://mips.gsf.de/proj/hobitws/services/RPCSimapService?wsdl and http://mips.gsf.de/proj/hobitws/services/DocSimapService?wsdl


Subjects
Databases, Protein; Information Storage and Retrieval/methods; Proteins/chemistry; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Sequence Homology, Amino Acid; Software; Algorithms; Amino Acid Sequence; Computer Simulation; Database Management Systems; Models, Chemical; Models, Molecular; Molecular Sequence Data; Proteins/ultrastructure; User-Computer Interface
17.
Stem Cell Reports ; 5(5): 702-715, 2015 Nov 10.
Article in English | MEDLINE | ID: mdl-26527384

ABSTRACT

Hematopoietic stem cells (HSCs) are preserved in co-cultures with UG26-1B6 stromal cells or their conditioned medium. We performed a genome-wide study of gene expression changes of UG26-1B6 stromal cells in contact with Lineage⁻ SCA-1⁺ KIT⁺ (LSK) cells. This analysis identified connective tissue growth factor (CTGF) to be upregulated in response to LSK cells. We found that co-culture of HSCs on CTGF knockdown stroma (shCtgf) shows impaired engraftment and long-term quality. Further experiments demonstrated that CD34⁻ CD48⁻ CD150⁺ LSK (CD34⁻ SLAM) cell numbers from shCtgf co-cultures increase in G0 and senescence and show delayed time to first cell division. To understand this observation, a CTGF signaling network model was assembled, which was experimentally validated. In co-culture experiments of CD34⁻ SLAM cells with shCtgf stromal cells, we found that SMAD2/3-dependent signaling was activated, with increasing p27(Kip1) expression and downregulating cyclin D1. Our data support the view that LSK cells modulate gene expression in the niche to maintain repopulating HSC activity.


Subjects
Cell Cycle; Connective Tissue Growth Factor/pharmacology; Hematopoietic Stem Cells/cytology; Stromal Cells/metabolism; Animals; Cell Line; Cells, Cultured; Connective Tissue Growth Factor/metabolism; Cyclin D1/genetics; Cyclin D1/metabolism; Cyclin-Dependent Kinase Inhibitor p27/genetics; Cyclin-Dependent Kinase Inhibitor p27/metabolism; Hematopoietic Stem Cells/drug effects; Hematopoietic Stem Cells/metabolism; Mice; Mice, Inbred C57BL; Smad2 Protein/metabolism; Smad3 Protein/metabolism; Stem Cell Niche
18.
PLoS One ; 7(5): e36694, 2012.
Article in English | MEDLINE | ID: mdl-22606281

ABSTRACT

Genome-wide association studies (GWAS) have become an effective tool to map genes and regions contributing to multifactorial human diseases and traits. A comparably small number of variants identified by GWAS are known to have a direct effect on protein structure whereas the majority of variants is thought to exert their moderate influences on the phenotype through regulatory changes in mRNA expression. MicroRNAs (miRNAs) have been identified as powerful posttranscriptional regulators of mRNAs. Binding to their target sites, which are mostly located within the 3'-untranslated region (3'-UTR) of mRNA transcripts, they modulate mRNA expression and stability. Until today almost all human mRNA transcripts are known to harbor at least one miRNA target site with an average of over 20 miRNA target sites per transcript. Among 5,101 GWAS-identified sentinel single nucleotide polymorphisms (SNPs) that correspond to 18,884 SNPs in linkage disequilibrium (LD) with the sentinels (r² ≥ 0.8) we identified a significant overrepresentation of SNPs that affect the 3'-UTR of genes (OR = 2.33, 95% CI = 2.12-2.57, P < 10⁻⁵²). This effect was even stronger considering all SNPs in one LD bin a single signal (OR = 4.27, 95% CI = 3.84-4.74, P < 10⁻¹¹⁴). Based on crosslinking immunoprecipitation data we identified four mechanisms affecting miRNA regulation by 3'-UTR mutations: (i) deletion or (ii) creation of miRNA recognition elements within validated RNA-induced silencing complex binding sites, (iii) alteration of 3'-UTR splicing leading to a loss of binding sites, and (iv) change of binding affinity due to modifications of 3'-UTR folding. We annotated 53 SNPs of a total of 288 trait-associated 3'-UTR SNPs as mediating at least one of these mechanisms. Using a qualitative systems biology approach, we demonstrate how our findings can be used to support biological interpretation of GWAS results as well as to provide new experimentally testable hypotheses.
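
Illustrative note (not part of the cited abstract): an odds ratio (OR) with a 95% confidence interval, as reported above, is commonly derived from a 2x2 contingency table. The sketch below uses the standard log-odds (Wald) interval. The abstract does not state which interval method the authors used, and the counts in the example are invented, so this does not reproduce the published figures.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    Table layout (an assumption of this sketch):
        a = trait-associated SNPs inside the region of interest
        b = trait-associated SNPs outside it
        c = background SNPs inside the region
        d = background SNPs outside it
    z = 1.96 gives an approximate 95% interval on the log-odds scale.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return (
        odds_ratio,
        math.exp(log_or - z * standard_error),
        math.exp(log_or + z * standard_error),
    )


if __name__ == "__main__":
    # Invented counts, for demonstration only.
    print(odds_ratio_ci(20, 80, 10, 90))
```

An interval that excludes 1, as both intervals quoted in the abstract do, indicates enrichment that is unlikely to arise by chance alone.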


Subjects
MicroRNAs/genetics; Polymorphism, Genetic; 3' Untranslated Regions; Chromosomal Proteins, Non-Histone/genetics; Genetic Complementation Test; Genome-Wide Association Study; Humans; Linkage Disequilibrium; Lipid Metabolism/genetics; Liver Cirrhosis, Biliary/genetics; Models, Genetic; Mutation; Polymorphism, Single Nucleotide; RNA Splicing; RNA Stability; Systems Biology
19.
PLoS One ; 5(11): e13698, 2010 Nov 04.
Article in English | MEDLINE | ID: mdl-21079808

ABSTRACT

BACKGROUND: Gene expression as governed by the interplay of the components of regulatory networks is indeed one of the most complex fundamental processes in biological systems. Although several methods have been published to unravel the hierarchical structure of regulatory networks, weaknesses such as the incorrect or inconsistent assignment of elements to their hierarchical levels, the incapability to cope with cyclic dependencies within the networks or the need for a manual curation to retrieve non-overlapping levels remain unsolved. METHODOLOGY/RESULTS: We developed HiNO as a significant improvement of the so-called breadth-first-search (BFS) method. While BFS is capable of determining the overall hierarchical structures from gene regulatory networks, it especially has problems solving feed-forward type of loops leading to conflicts within the level assignments. We resolved these problems by adding a recursive correction approach consisting of two steps. First each vertex is placed on the lowest level that this vertex and its regulating vertices are assigned to (downgrade procedure). Second, vertices are assigned to the next higher level (upgrade procedure) if they have successors with the same level assignment and have themselves no regulators. We evaluated HiNO by comparing it with the BFS method by applying them to the regulatory networks from Saccharomyces cerevisiae and Escherichia coli, respectively. The comparison shows clearly how conflicts in level assignment are resolved in HiNO in order to produce correct hierarchical structures even on the local levels in an automated fashion. CONCLUSIONS: We showed that the resolution of conflicting assignments clearly improves the BFS-method. While we restricted our analysis to gene regulatory networks, our approach is suitable to deal with any directed hierarchical networks structure such as the interaction of microRNAs or the action of non-coding RNAs in general. Furthermore we provide a user-friendly web-interface for HiNO that enables the extraction of the hierarchical structure of any directed regulatory network. AVAILABILITY: HiNO is freely accessible at http://mips.helmholtz-muenchen.de/hino/.
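
Illustrative note (not part of the cited abstract): the feed-forward conflict mentioned above can be reproduced with a plain breadth-first level assignment. The sketch below is only the BFS baseline plus a conflict check. It does not implement HiNO's downgrade and upgrade corrections, whose details are not given in the abstract. Function names and the toy network are hypothetical, and levels are counted from 0 at the regulators that have no regulators themselves.

```python
from collections import deque
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (regulator, target)


def bfs_levels(edges: List[Edge]) -> Dict[str, int]:
    """Assign each node the BFS distance from the nearest unregulated node.

    Nodes reachable only through cycles without an unregulated entry point
    receive no level in this simple sketch.
    """
    nodes = {node for edge in edges for node in edge}
    targets = {target for _, target in edges}
    successors: Dict[str, List[str]] = {node: [] for node in nodes}
    for regulator, target in edges:
        successors[regulator].append(target)

    levels = {node: 0 for node in nodes - targets}  # top-level regulators
    queue = deque(sorted(levels))
    while queue:
        node = queue.popleft()
        for target in successors[node]:
            if target not in levels:  # first visit fixes the level
                levels[target] = levels[node] + 1
                queue.append(target)
    return levels


def same_level_edges(edges: List[Edge], levels: Dict[str, int]) -> List[Edge]:
    """List regulatory edges whose two ends were placed on the same level."""
    return [
        (regulator, target)
        for regulator, target in edges
        if regulator in levels and target in levels
        and levels[regulator] == levels[target]
    ]


if __name__ == "__main__":
    # Feed-forward loop: A regulates B and C, and B also regulates C.
    ffl = [("A", "B"), ("A", "C"), ("B", "C")]
    levels = bfs_levels(ffl)
    print(levels)
    print(same_level_edges(ffl, levels))
```

In the toy loop, B and C land on the same level although B regulates C, which is the kind of conflict the abstract says plain BFS leaves unresolved.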


Subjects
Algorithms; Computational Biology/methods; Gene Regulatory Networks/genetics; Models, Genetic; Cluster Analysis; Escherichia coli/genetics; Escherichia coli Proteins/genetics; Gene Expression Profiling; Internet; Reproducibility of Results; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae Proteins/genetics
20.
PLoS One ; 4(7): e6473, 2009 Jul 31.
Article in English | MEDLINE | ID: mdl-19649282

ABSTRACT

It is known that miRNA target sites are very short and the effect of miRNA-target site interaction alone appears as being unspecific. Recent experiments suggest further context signals involved in miRNA target site recognition and regulation. Here, we present a novel GC-rich RNA motif downstream of experimentally supported miRNA target sites in human mRNAs with no similarity to previously reported functional motifs. We demonstrate that the novel motif can be found in at least one third of all transcripts regulated by miRNAs. Furthermore, we show that motif occurrence and the frequency of miRNA target sites as well as the stability of their duplex structures correlate. The finding, that the novel motif is significantly associated with miRNA target sites, suggests a functional role of the motif in miRNA target site biology. Beyond, the novel motif has the impact to improve prediction of miRNA target sites significantly.


Subjects
Enhancer Elements, Genetic; MicroRNAs/genetics; 3' Untranslated Regions; Humans