1.
Mol Cell; 63(4): 579-592, 2016 Aug 18.
Article in English | MEDLINE | ID: mdl-27540857

ABSTRACT

Gene fusions are common cancer-causing mutations, but the molecular principles by which fusion protein products affect interaction networks and cause disease are not well understood. Here, we perform an integrative analysis of the structural, interactomic, and regulatory properties of thousands of putative fusion proteins. We demonstrate that genes that form fusions (i.e., parent genes) tend to be highly connected hub genes, whose protein products are enriched in structured and disordered interaction-mediating features. Fusion often results in the loss of these parental features and the depletion of regulatory sites such as post-translational modifications. Fusion products disproportionately connect proteins that did not previously interact in the protein interaction network. In this manner, fusion products can escape cellular regulation and constitutively rewire protein interaction networks. We suggest that the deregulation of central, interaction-prone proteins may represent a widespread mechanism by which fusion proteins alter the topology of cellular signaling pathways and promote cancer.


Subject(s)
Gene Fusion; Neoplasm Proteins/genetics; Neoplasm Proteins/metabolism; Neoplasms/genetics; Neoplasms/metabolism; Protein Interaction Maps; Computational Biology; Databases, Protein; Humans; Protein Interaction Mapping; Protein Processing, Post-Translational; Signal Transduction; Transcription Factors/genetics; Transcription Factors/metabolism; Ubiquitination
2.
Nucleic Acids Res; 47(D1): D490-D494, 2019 Jan 8.
Article in English | MEDLINE | ID: mdl-30445555

ABSTRACT

Here, we present a major update to the SUPERFAMILY database and the webserver. We describe the addition of a new SUPERFAMILY 2.0 profile HMM library containing a total of 27 623 HMMs. The database now includes Superfamily domain annotations for millions of protein sequences taken from the Universal Protein Resource Knowledgebase (UniProtKB) and the National Center for Biotechnology Information (NCBI). This addition constitutes about 51 and 45 million distinct protein sequences obtained from UniProtKB and NCBI respectively. Currently, the database contains annotations for 63 244 and 102 151 complete genomes taken from UniProtKB and NCBI respectively. The current sequence collection and genome update is the biggest so far in the history of SUPERFAMILY updates. In order to deal with the massive wealth of information, here we introduce a new SUPERFAMILY 2.0 webserver (http://supfam.org). Currently, the webserver mainly focuses on the search, retrieval and display of Superfamily annotation for the entire sequence and genome collection in the database.


Subject(s)
Databases, Protein; Protein Domains; Proteome/chemistry; Genome; Internet; Markov Chains; Protein Domains/genetics; Sequence Analysis, Protein
3.
Plant Physiol; 173(2): 1371-1390, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27909045

ABSTRACT

Of the three classes of enzymes involved in ubiquitination, ubiquitin-conjugating enzymes (E2) have often been incorrectly considered to play merely an auxiliary role in the process, and few E2 enzymes have been investigated in plants. To reveal the role of E2 in plant innate immunity, we identified and cloned 40 tomato genes encoding ubiquitin E2 proteins. Thioester assays indicated that the majority of the genes encode enzymatically active E2. Phylogenetic analysis classified the 40 tomato E2 enzymes into 13 groups, of which members of group III were found to interact and act specifically with AvrPtoB, a Pseudomonas syringae pv tomato effector that uses its ubiquitin ligase (E3) activity to suppress host immunity. Knocking down the expression of group III E2 genes in Nicotiana benthamiana diminished the AvrPtoB-promoted degradation of the Fen kinase and the AvrPtoB suppression of host immunity-associated programmed cell death. Importantly, silencing group III E2 genes also resulted in reduced pattern-triggered immunity (PTI). By contrast, programmed cell death induced by several effector-triggered immunity elicitors was not affected in group III-silenced plants. Functional characterization suggested redundancy among group III members for their role in the suppression of plant immunity by AvrPtoB and in PTI and identified UBIQUITIN-CONJUGATING11 (UBC11), UBC28, UBC29, UBC39, and UBC40 as playing a more significant role in PTI than other group III members. Our work builds a foundation for the further characterization of E2s in plant immunity and reveals that AvrPtoB has evolved a strategy for suppressing host immunity that is difficult for the plant to thwart.


Subject(s)
Plant Immunity/physiology; Plant Proteins/immunology; Solanum lycopersicum/genetics; Ubiquitin-Conjugating Enzymes/immunology; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Cell Death; Gene Silencing; Genome, Plant; Host-Pathogen Interactions/immunology; Solanum lycopersicum/cytology; Solanum lycopersicum/immunology; Solanum lycopersicum/microbiology; Phylogeny; Plant Proteins/genetics; Plant Proteins/metabolism; Plants, Genetically Modified; Protein Serine-Threonine Kinases/genetics; Protein Serine-Threonine Kinases/metabolism; Pseudomonas syringae/pathogenicity; Nicotiana/genetics; Nicotiana/metabolism; Ubiquitin-Conjugating Enzymes/genetics; Ubiquitin-Conjugating Enzymes/metabolism; Ubiquitination
4.
Proc Natl Acad Sci U S A; 112(38): 11893-8, 2015 Sep 22.
Article in English | MEDLINE | ID: mdl-26324906

ABSTRACT

The most diverse marine ecosystems, coral reefs, depend upon a functional symbiosis between a cnidarian animal host (the coral) and intracellular photosynthetic dinoflagellate algae. The molecular and cellular mechanisms underlying this endosymbiosis are not well understood, in part because of the difficulties of experimental work with corals. The small sea anemone Aiptasia provides a tractable laboratory model for investigating these mechanisms. Here we report on the assembly and analysis of the Aiptasia genome, which will provide a foundation for future studies and has revealed several features that may be key to understanding the evolution and function of the endosymbiosis. These features include genomic rearrangements and taxonomically restricted genes that may be functionally related to the symbiosis, aspects of host dependence on alga-derived nutrients, a novel and expanded cnidarian-specific family of putative pattern-recognition receptors that might be involved in the animal-algal interactions, and extensive lineage-specific horizontal gene transfer. Extensive integration of genes of prokaryotic origin, including genes for antimicrobial peptides, presumably reflects an intimate association of the animal-algal pair also with its prokaryotic microbiome.


Subject(s)
Anthozoa/physiology; Genome/genetics; Sea Anemones/genetics; Symbiosis/genetics; Animals; Chromosomes/genetics; Evolution, Molecular; Gene Expression Profiling; Gene Transfer, Horizontal/genetics; Genome Size; Microbial Interactions/genetics; Models, Biological; Molecular Sequence Annotation; Phylogeny; Repetitive Sequences, Nucleic Acid/genetics; Synteny/genetics
5.
Nucleic Acids Res; 43(10): 4814-22, 2015 May 26.
Article in English | MEDLINE | ID: mdl-25934802

ABSTRACT

We have discovered that positions of splice junctions in genes are constrained by the tolerance for disorder-promoting amino acids in the translated protein region. It is known that efficient splicing requires nucleotide bias at the splice junction; the preferred usage produces a distribution of amino acids that is disorder-promoting. We observe that efficiency of splicing, as seen in the amino-acid distribution, is not compromised to accommodate globular structure. Thus we infer that it is the positions of splice junctions in the gene that must be under constraint by the local protein environment. Examining exonic splicing enhancers (short DNA motifs) found near the splice junction in the gene reveals that they are more prevalent in exons that encode disordered protein regions than in exons encoding structured regions. Thus we also conclude that local protein features constrain efficient splicing more in structure than in disorder.


Subject(s)
Intrinsically Disordered Proteins/genetics; RNA Splice Sites; Amino Acids/analysis; Animals; Eukaryota/genetics; Exons; Nucleotide Motifs; Nucleotides/analysis
6.
Nucleic Acids Res; 43(Database issue): D227-33, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25414345

ABSTRACT

We present updates to the SUPERFAMILY 1.75 (http://supfam.org) online resource and protein sequence collection. The hidden Markov model library that provides sequence homology to SCOP structural domains remains unchanged at version 1.75. In the last 4 years SUPERFAMILY has more than doubled its holding of curated complete proteomes over all cellular life, from 1400 proteomes reported previously in 2010 up to 3258 at present. Outside of the main sequence collection, SUPERFAMILY continues to provide domain annotation for sequences provided by other resources such as UniProt, Ensembl, PDB, much of JGI Phytozome and selected subcollections of NCBI RefSeq. Despite this growth in data volume, SUPERFAMILY now provides users with an expanded and daily updated phylogenetic tree of life (sTOL). This tree is built with genomic-scale domain annotation data as before, but is constantly updated when new species are introduced to the sequence library. Our Gene Ontology and other functional and phenotypic annotations previously reported have stood up to critical assessment by the function prediction community. We have now introduced these data in an integrated manner online at the level of an individual sequence and, in the case of whole genomes, with enrichment analysis against a taxonomically defined background.


Subject(s)
Databases, Protein; Protein Structure, Tertiary; Gene Ontology; Molecular Sequence Annotation; Phylogeny; Proteins/classification; Proteins/genetics; Proteome/chemistry; Sequence Analysis, Protein
7.
Nucleic Acids Res; 43(Database issue): D382-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25348407

ABSTRACT

Genome3D (http://www.genome3d.eu) is a collaborative resource that provides predicted domain annotations and structural models for key sequences. Since introducing Genome3D in a previous NAR paper, we have substantially extended and improved the resource. We have annotated representatives from Pfam families to improve coverage of diverse sequences and added a fast sequence search to the website to allow users to find Genome3D-annotated sequences similar to their own. We have improved and extended the Genome3D data, enlarging the source data set from three model organisms to 10, and adding VIVACE, a resource new to Genome3D. We have analysed and updated Genome3D's SCOP/CATH mapping. Finally, we have improved the superposition tools, which now give users a more powerful interface for investigating similarities and differences between structural models.


Subject(s)
Databases, Protein; Molecular Sequence Annotation; Protein Structure, Tertiary; Algorithms; Genomics; Internet; Models, Molecular; Protein Structure, Tertiary/genetics; Sequence Analysis, Protein
8.
Mol Biol Evol; 31(6): 1364-74, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24692656

ABSTRACT

Humans are composed of hundreds of cell types. As the genomic DNA of each somatic cell is identical, cell type is determined by what is expressed and when. Until recently, little has been reported about the determinants of human cell identity, particularly from the joint perspective of gene evolution and expression. Here, we chart the evolutionary past of all documented human cell types via the collective histories of proteins, the principal product of gene expression. FANTOM5 data provide cell-type-specific digital expression of human protein-coding genes and the SUPERFAMILY resource is used to provide protein domain annotation. The evolutionary epoch in which each protein was created is inferred by comparison with domain annotation of all other completely sequenced genomes. Studying the distribution across epochs of genes expressed in each cell type reveals insights into human cellular evolution in terms of protein innovation. For each cell type, its history of protein innovation is charted based on the genes it expresses. Combining the histories of all cell types enables us to create a timeline of cell evolution. This timeline identifies the possibility that our common ancestor Coelomata (cavity-forming animals) provided the innovation required for the innate immune system, whereas cells which now form the human brain have followed a trajectory of continually accumulating novel proteins since Opisthokonta (boundary of animals and fungi). We conclude that exaptation of existing domain architectures into new contexts is the dominant source of cell-type-specific domain architectures.


Subject(s)
Evolution, Molecular; Phylogeny; Proteins/chemistry; Proteins/genetics; Eukaryotic Cells; Humans; Immunity, Innate; Protein Structure, Tertiary; Sequence Analysis, Protein; Transcriptome
9.
Environ Microbiol; 17(1): 4-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25339269

ABSTRACT

We present the Proteome Quality Index (PQI; http://pqi-list.org), a much-needed resource for users of bacterial and eukaryotic proteomes. Completely sequenced genomes for which there is an available set of protein sequences (the proteome) are given a one- to five-star rating supported by 11 different metrics of quality. The database indexes over 3000 proteomes at the time of writing and is provided via a website for browsing, filtering and downloading. Previous to this work, there was no systematic way to account for the large variability in quality of the thousands of proteomes, and this is likely to have profoundly influenced the outcome of many published studies, in particular large-scale comparative analyses. The lack of a measure of proteome quality is likely due to the difficulty in producing one, a problem that we have approached by integrating multiple metrics. The continued development and improvement of the index will require the contribution of additional metrics by us and by others; the PQI provides a useful point of reference for the scientific community, but it is only the first step towards a 'standard' for the field.


Subject(s)
Databases, Protein; Proteome/standards; Genome; Internet
10.
Nucleic Acids Res; 41(Database issue): D508-16, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203878

ABSTRACT

We present the Database of Disordered Protein Prediction (D²P²), available at http://d2p2.pro (including website source code). A battery of disorder predictors and their variants, VL-XT, VSL2b, PrDOS, PV2, Espritz and IUPred, were run on all protein sequences from 1765 complete proteomes (to be updated as more genomes are completed). Integrated with these results are all of the predicted (mostly structured) SCOP domains using the SUPERFAMILY predictor. These disorder/structure annotations together enable comparison of the disorder predictors with each other and examination of the overlap between disordered predictions and SCOP domains on a large scale. D²P² will increase our understanding of the interplay between disorder and structure, the genomic distribution of disorder, and its evolutionary history. The parsed data are made available in a unified format for download as flat files or SQL tables either by genome, by predictor, or for the complete set. An interactive website provides a graphical view of each protein annotated with the SCOP domains and disordered regions from all predictors overlaid (or shown as a consensus). There are statistics and tools for browsing and comparing genomes and their disorder within the context of their position on the tree of life.


Subject(s)
Databases, Protein; Protein Conformation; Genome; Internet; Protein Structure, Tertiary; Proteins/chemistry; Proteins/genetics; Sequence Analysis, Protein
11.
Nat Commun; 14(1): 919, 2023 Feb 17.
Article in English | MEDLINE | ID: mdl-36808136

ABSTRACT

Cohort-wide sequencing studies have revealed that the largest category of variants is those deemed 'rare', even for the subset located in coding regions (99% of known coding variants are seen in less than 1% of the population). Associative methods give some understanding of how rare genetic variants influence disease and organism-level phenotypes. But here we show that additional discoveries can be made through a knowledge-based approach using protein domains and ontologies (function and phenotype) that considers all coding variants regardless of allele frequency. We describe an ab initio, genetics-first method making molecular knowledge-based interpretations for exome-wide non-synonymous variants for phenotypes at the organism and cellular level. By using this reverse approach, we identify plausible genetic causes for developmental disorders that have eluded other established methods and present molecular hypotheses for the causal genetics of 40 phenotypes generated from a direct-to-consumer genotype cohort. This system offers a chance to extract further discovery from genetic data after standard tools have been applied.
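
The 1% cut-off quoted in the abstract can be illustrated with a minimal sketch. It assumes each variant comes with a population frequency expressed as a fraction between 0 and 1; the function name, variant identifiers and frequency values are hypothetical and are not taken from the paper, whose own method considers all coding variants regardless of frequency.

```python
RARE_THRESHOLD = 0.01  # "less than 1% of the population", as quoted in the abstract


def split_by_frequency(frequencies, threshold=RARE_THRESHOLD):
    """Partition variant IDs into (rare, common) sets by population frequency.

    `frequencies` maps a variant ID to a fraction between 0 and 1.
    A variant is counted as rare only when it is strictly below the threshold.
    """
    rare = {variant for variant, freq in frequencies.items() if freq < threshold}
    common = set(frequencies) - rare
    return rare, common


# Hypothetical variants and frequencies, for illustration only.
example = {"var_a": 0.002, "var_b": 0.30, "var_c": 0.01}
rare, common = split_by_frequency(example)
print(sorted(rare), sorted(common))
```

A frequency-based filter of this kind would discard nothing in a knowledge-based approach like the one described; the split is shown only to make the 'rare' category concrete.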


Subject(s)
Exome; Genetic Predisposition to Disease; Humans; Phenotype; Genotype; Gene Frequency
12.
Protein Sci; 25(5): 1030-6, 2016 May.
Article in English | MEDLINE | ID: mdl-26941008

ABSTRACT

We have identified that the collagen helix has the potential to be disruptive to analyses of intrinsically disordered proteins. The collagen helix is an extended fibrous structure that is both promiscuous and repetitive. Whilst its sequence is predicted to be disordered, this type of protein structure is not typically considered as intrinsic disorder. Here, we show that collagen-encoding proteins skew the distribution of exon lengths in genes. We find that previous results, demonstrating that exons encoding disordered regions are more likely to be symmetric, are due to the abundance of the collagen helix. Other related results, showing increased levels of alternative splicing in disorder-encoding exons, still hold after considering collagen-containing proteins. Aside from analyses of exons, we find that the set of proteins that contain collagen significantly alters the amino acid composition of regions predicted as disordered. We conclude that research in this area should be conducted in the light of the collagen helix.
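
For readers unfamiliar with the term, an exon is conventionally called symmetric when its length is a multiple of three nucleotides, so including or skipping it leaves the downstream reading frame intact. The following is a minimal sketch of that check, assuming exon lengths are given in nucleotides; the function names and the sample lengths are hypothetical and are not taken from the paper.

```python
def is_symmetric(exon_length):
    """Return True when the exon length (in nucleotides) is a multiple of three."""
    if exon_length <= 0:
        raise ValueError("exon length must be a positive number of nucleotides")
    return exon_length % 3 == 0


def symmetric_fraction(exon_lengths):
    """Fraction of exons in the list whose length preserves the reading frame."""
    if not exon_lengths:
        raise ValueError("need at least one exon length")
    symmetric_count = sum(1 for length in exon_lengths if is_symmetric(length))
    return symmetric_count / len(exon_lengths)


# Hypothetical exon lengths, for illustration only.
example_lengths = [54, 45, 100, 108, 77]
print(symmetric_fraction(example_lengths))
```

Comparing this fraction between exon sets with and without collagen-containing proteins is one simple way to see the kind of skew the abstract describes.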


Subject(s)
Alternative Splicing; Collagen/chemistry; Collagen/genetics; Exons; Amino Acid Sequence; Genome, Human; Humans; Intrinsically Disordered Proteins/chemistry; Intrinsically Disordered Proteins/genetics; Protein Conformation; Protein Structure, Secondary
13.
Genome Biol Evol; 8(7): 2118-32, 2016 Jul 14.
Article in English | MEDLINE | ID: mdl-27358427

ABSTRACT

To progress our understanding of molecular evolution from a collection of well-studied genes toward the level of the cell, we must consider whole systems. Here, we reveal the evolution of an important intracellular signaling system. The calcium-signaling toolkit is made up of different multidomain proteins that have undergone duplication, recombination, sequence divergence, and selection. The picture of evolution, considering the repertoire of proteins in the toolkit of both extant organisms and ancestors, is radically different from that of other systems. In eukaryotes, the repertoire increased in both abundance and diversity at a far greater rate than general genomic expansion. We describe how calcium-based intracellular signaling evolution differs not only in rate but in nature, and how this correlates with the disparity of plants and animals.


Subject(s)
Calcium Signaling/genetics; Calcium-Binding Proteins/genetics; Evolution, Molecular; Animals; Calcium-Binding Proteins/chemistry; Calcium-Binding Proteins/metabolism; Eukaryota/genetics
14.
Nat Genet; 48(3): 331-5, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26780608

ABSTRACT

Transdifferentiation, the process of converting from one cell type to another without going through a pluripotent state, has great promise for regenerative medicine. The identification of key transcription factors for reprogramming is currently limited by the cost of exhaustive experimental testing of plausible sets of factors, an approach that is inefficient and unscalable. Here we present a predictive system (Mogrify) that combines gene expression data with regulatory network information to predict the reprogramming factors necessary to induce cell conversion. We have applied Mogrify to 173 human cell types and 134 tissues, defining an atlas of cellular reprogramming. Mogrify correctly predicts the transcription factors used in known transdifferentiations. Furthermore, we validated two new transdifferentiations predicted by Mogrify. We provide a practical and efficient mechanism for systematically implementing novel cell conversions, facilitating the generalization of reprogramming of human cells. Predictions are made available to help rapidly further the field of cell conversion.


Subject(s)
Cell Differentiation/genetics; Cell Transdifferentiation/genetics; Cellular Reprogramming/genetics; Gene Regulatory Networks; Fibroblasts; Humans; Induced Pluripotent Stem Cells; Regenerative Medicine; Transcription Factors/biosynthesis; Transcription Factors/genetics
15.
Biochimie; 119: 269-77, 2015 Dec.
Article in English | MEDLINE | ID: mdl-25980317

ABSTRACT

To help evaluate how protein function impacts on genome evolution, we introduce a new concept of 'architecture plasticity potential' - the capacity to form distinct domain architectures - either for an individual domain or, more generally, for a set of domains grouped by shared function. We devise a scoring metric to measure the plasticity potential for these domain sets, and evaluate how function has changed over time for different species. Applying this metric to a phylogenetic tree of eukaryotic genomes, we find that the involvement of each function is not random but highly selective. For certain lineages there is a strong bias for evolution to involve domains related to certain functions. In general, eukaryotic genomes, particularly those of animals, expand complex functional activities such as signalling and regulation, but at the cost of reducing metabolic processes. We also observe differential evolution of transcriptional regulation and a unique evolutionary role of channel regulators; crucially, this is only observable in terms of the architecture plasticity potential. Our findings provide a new layer of information to understand the significance of function in eukaryotic genome evolution. A web search tool, available at http://supfam.org/Pevo, offers a wide spectrum of options for exploring functional importance in eukaryotic genome evolution.


Subject(s)
Eukaryota/genetics; Evolution, Molecular; Genome; Genomics/methods; Models, Genetic; Proteome/chemistry; Animals; Cell Lineage; Cell Plasticity; Databases, Genetic; Databases, Protein; Eukaryota/cytology; Eukaryota/metabolism; Humans; Internet; Phylogeny; Protein Structure, Tertiary; Proteome/genetics; Proteome/metabolism; Search Engine; Structural Homology, Protein
16.
Curr Opin Struct Biol; 27: 129-37, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25198166

ABSTRACT

The seven-transmembrane (7TM) helix fold of G-protein coupled receptors (GPCRs) has been adapted for a wide variety of physiologically important signaling functions. Here, we discuss the diversity in the structured and disordered regions of GPCRs based on the recently published crystal structures and sequence analysis of all human GPCRs. A comparison of the structures of rhodopsin-like receptors (class A), secretin-like receptors (class B), metabotropic receptors (class C) and frizzled receptors (class F) shows that the relative arrangement of the transmembrane helices is conserved across all four GPCR classes, although individual receptors can be activated by ligand binding at varying positions within and around the transmembrane helical bundle. A systematic analysis of GPCR sequences reveals the presence of disordered segments on the cytoplasmic side, abundant post-translational modification sites, evidence for alternative splicing and several putative linear peptide motifs that have the potential to mediate interactions with cytosolic proteins. While the structured regions permit the receptor to bind diverse ligands, the disordered regions appear to have an underappreciated role in modulating downstream signaling in response to the cellular state. An integrated paradigm combining the knowledge of structured and disordered regions is imperative for gaining a holistic understanding of the GPCR (un)structure-function relationship.


Subject(s)
Receptors, G-Protein-Coupled/chemistry; Animals; Cell Membrane/chemistry; Cell Membrane/metabolism; Humans; Receptors, G-Protein-Coupled/metabolism
17.
Sci Rep; 3: 2015, 2013.
Article in English | MEDLINE | ID: mdl-23778980

ABSTRACT

We report a daily-updated sequenced/species Tree Of Life (sTOL) as a reference for the increasing number of cellular organisms with their genomes sequenced. The sTOL builds on a likelihood-based weight calibration algorithm to consolidate NCBI taxonomy information in concert with unbiased sampling of molecular characters from whole genomes of all sequenced organisms. By quantifying the extent of agreement between taxonomic and molecular data, we observe that many potential improvements can be made to the status quo classification, particularly in the Fungi kingdom; we also see that the current state of many animal genomes is rather poor. To augment the use of sTOL in providing evolutionary contexts, we integrate an ontology infrastructure and demonstrate its utility for evolutionary understanding of nuclear receptors, stem cells and eukaryotic genomes. The sTOL (http://supfam.org/SUPERFAMILY/sTOL) provides a binary tree of (sequenced) life, and contributes to an analytical platform linking genome evolution, function and phenotype.


Subject(s)
Databases, Genetic; Genome; Genomics; Phylogeny; Animals; Computational Biology/methods; Databases, Genetic/standards; Genomics/methods; Genomics/standards; Internet