Search | BVS IEC

Smithers, Ben; Oates, Matt; Gough, Julian.

Nucleic Acids Res ; 47(10): 4970-4973, 2019 06 04.

Article in English | MEDLINE | ID: mdl-30997511

ABSTRACT

The alignment between the boundaries of protein domains and the boundaries of exons could provide evidence for the evolution of proteins via domain shuffling, but literature in the field has so far struggled to conclusively show this. Here, on larger data sets than previously possible, we do finally show that this phenomenon is indisputably found widely across the eukaryotic tree. In contrast, the alignment between exons and the boundaries of intrinsically disordered regions of proteins is not a general property of eukaryotes. Most interesting of all is the discovery that domain-exon alignment is much more common in recently evolved protein sequences than older ones.

Subjects

Eukaryotic Cells/metabolism; Exons/genetics; Introns/genetics; Proteins/genetics; Animals; Evolution, Molecular; Genome/genetics; Humans

The SUPERFAMILY 2.0 database: a significant proteome update and a new webserver.

Pandurangan, Arun Prasad; Stahlhacke, Jonathan; Oates, Matt E; Smithers, Ben; Gough, Julian.

Nucleic Acids Res ; 47(D1): D490-D494, 2019 01 08.

Article in English | MEDLINE | ID: mdl-30445555

ABSTRACT

Here, we present a major update to the SUPERFAMILY database and the webserver. We describe the addition of a new SUPERFAMILY 2.0 profile HMM library containing a total of 27 623 HMMs. The database now includes Superfamily domain annotations for millions of protein sequences taken from the Universal Protein Resource Knowledgebase (UniProtKB) and the National Center for Biotechnology Information (NCBI). This addition constitutes about 51 and 45 million distinct protein sequences obtained from UniProtKB and NCBI respectively. Currently, the database contains annotations for 63 244 and 102 151 complete genomes taken from UniProtKB and NCBI respectively. The current sequence collection and genome update is the biggest so far in the history of SUPERFAMILY updates. In order to deal with the massive wealth of information, here we introduce a new SUPERFAMILY 2.0 webserver (http://supfam.org). Currently, the webserver mainly focuses on the search, retrieval and display of Superfamily annotation for the entire sequence and genome collection in the database.

Subjects

Databases, Protein; Protein Domains; Proteome/chemistry; Genome; Internet; Markov Chains; Protein Domains/genetics; Sequence Analysis, Protein

InterPro in 2017-beyond protein family and domain annotations.

Finn, Robert D; Attwood, Teresa K; Babbitt, Patricia C; Bateman, Alex; Bork, Peer; Bridge, Alan J; Chang, Hsin-Yu; Dosztányi, Zsuzsanna; El-Gebali, Sara; Fraser, Matthew; Gough, Julian; Haft, David; Holliday, Gemma L; Huang, Hongzhan; Huang, Xiaosong; Letunic, Ivica; Lopez, Rodrigo; Lu, Shennan; Marchler-Bauer, Aron; Mi, Huaiyu; Mistry, Jaina; Natale, Darren A; Necci, Marco; Nuka, Gift; Orengo, Christine A; Park, Youngmi; Pesseat, Sebastien; Piovesan, Damiano; Potter, Simon C; Rawlings, Neil D; Redaschi, Nicole; Richardson, Lorna; Rivoire, Catherine; Sangrador-Vegas, Amaia; Sigrist, Christian; Sillitoe, Ian; Smithers, Ben; Squizzato, Silvano; Sutton, Granger; Thanki, Narmada; Thomas, Paul D; Tosatto, Silvio C E; Wu, Cathy H; Xenarios, Ioannis; Yeh, Lai-Su; Young, Siew-Yit; Mitchell, Alex L.

Nucleic Acids Res ; 45(D1): D190-D199, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27899635

ABSTRACT

InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPro's predictive models, which are provided by its member databases. Here, we report recent developments with InterPro and its associated software, including the addition of two new databases (SFLD and CDD), and the functionality to include residue-level annotation and prediction of intrinsic disorder. These developments enrich the annotations provided by InterPro, increase the overall number of residues annotated and allow more specific functional inferences.

Subjects

Computational Biology/methods; Databases, Protein; Protein Interaction Domains and Motifs; Software; Humans; Molecular Sequence Annotation; Phylogeny

Splice junctions are constrained by protein disorder.

Smithers, Ben; Oates, Matt E; Gough, Julian.

Nucleic Acids Res ; 43(10): 4814-22, 2015 May 26.

Article in English | MEDLINE | ID: mdl-25934802

ABSTRACT

We have discovered that positions of splice junctions in genes are constrained by the tolerance for disorder-promoting amino acids in the translated protein region. It is known that efficient splicing requires nucleotide bias at the splice junction; the preferred usage produces a distribution of amino acids that is disorder-promoting. We observe that efficiency of splicing, as seen in the amino-acid distribution, is not compromised to accommodate globular structure. Thus we infer that it is the positions of splice junctions in the gene that must be under constraint by the local protein environment. Examining exonic splicing enhancers (short DNA motifs) found near the splice junction in the gene reveals that these are more prevalent in exons that encode disordered protein regions than in exons encoding structured regions. Thus we also conclude that local protein features constrain efficient splicing more in structure than in disorder.

Subjects

Intrinsically Disordered Proteins/genetics; RNA Splice Sites; Amino Acids/analysis; Animals; Eukaryota/genetics; Exons; Nucleotide Motifs; Nucleotides/analysis

The SUPERFAMILY 1.75 database in 2014: a doubling of data.

Oates, Matt E; Stahlhacke, Jonathan; Vavoulis, Dimitrios V; Smithers, Ben; Rackham, Owen J L; Sardar, Adam J; Zaucha, Jan; Thurlby, Natalie; Fang, Hai; Gough, Julian.

Nucleic Acids Res ; 43(Database issue): D227-33, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25414345

ABSTRACT

We present updates to the SUPERFAMILY 1.75 (http://supfam.org) online resource and protein sequence collection. The hidden Markov model library that provides sequence homology to SCOP structural domains remains unchanged at version 1.75. In the last 4 years SUPERFAMILY has more than doubled its holding of curated complete proteomes over all cellular life, from 1400 proteomes reported previously in 2010 up to 3258 at present. Outside of the main sequence collection, SUPERFAMILY continues to provide domain annotation for sequences provided by other resources such as UniProt, Ensembl, PDB, much of JGI Phytozome and selected subcollections of NCBI RefSeq. Despite this growth in data volume, SUPERFAMILY now provides users with an expanded and daily updated phylogenetic tree of life (sTOL). This tree is built with genomic-scale domain annotation data as before, but constantly updated when new species are introduced to the sequence library. Our Gene Ontology and other functional and phenotypic annotations previously reported have stood up to critical assessment by the function prediction community. We have now introduced these data in an integrated manner online at the level of an individual sequence, and, in the case of whole genomes, with enrichment analysis against a taxonomically defined background.

Subjects

Databases, Protein; Protein Structure, Tertiary; Gene Ontology; Molecular Sequence Annotation; Phylogeny; Proteins/classification; Proteins/genetics; Proteome/chemistry; Sequence Analysis, Protein

A proteome quality index.

Zaucha, Jan; Stahlhacke, Jonathan; Oates, Matt E; Thurlby, Natalie; Rackham, Owen J L; Fang, Hai; Smithers, Ben; Gough, Julian.

Environ Microbiol ; 17(1): 4-9, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25339269

ABSTRACT

We present the Proteome Quality Index (PQI; http://pqi-list.org), a much-needed resource for users of bacterial and eukaryotic proteomes. Completely sequenced genomes for which there is an available set of protein sequences (the proteome) are given a one- to five-star rating supported by 11 different metrics of quality. The database indexes over 3000 proteomes at the time of writing and is provided via a website for browsing, filtering and downloading. Prior to this work, there was no systematic way to account for the large variability in quality of the thousands of proteomes, and this is likely to have profoundly influenced the outcome of many published studies, in particular large-scale comparative analyses. The lack of a measure of proteome quality is likely due to the difficulty in producing one, a problem that we have approached by integrating multiple metrics. The continued development and improvement of the index will require the contribution of additional metrics by us and by others; the PQI provides a useful point of reference for the scientific community, but it is only the first step towards a 'standard' for the field.

Subjects

Databases, Protein; Proteome/standards; Genome; Internet

Three reasons protein disorder analysis makes more sense in the light of collagen.

Smithers, Ben; Oates, Matt E; Tompa, Peter; Gough, Julian.

Protein Sci ; 25(5): 1030-6, 2016 May.

Article in English | MEDLINE | ID: mdl-26941008

ABSTRACT

We have identified that the collagen helix has the potential to be disruptive to analyses of intrinsically disordered proteins. The collagen helix is an extended fibrous structure that is both promiscuous and repetitive. Whilst its sequence is predicted to be disordered, this type of protein structure is not typically considered as intrinsic disorder. Here, we show that collagen-encoding proteins skew the distribution of exon lengths in genes. We find that previous results, demonstrating that exons encoding disordered regions are more likely to be symmetric, are due to the abundance of the collagen helix. Other related results, showing increased levels of alternative splicing in disorder-encoding exons, still hold after considering collagen-containing proteins. Aside from analyses of exons, we find that the set of proteins that contain collagen significantly alters the amino acid composition of regions predicted as disordered. We conclude that research in this area should be conducted in the light of the collagen helix.

Subjects

Alternative Splicing; Collagen/chemistry; Collagen/genetics; Exons; Amino Acid Sequence; Genome, Human; Humans; Intrinsically Disordered Proteins/chemistry; Intrinsically Disordered Proteins/genetics; Protein Conformation; Protein Structure, Secondary

An expanded evaluation of protein function prediction methods shows an improvement in accuracy.

Jiang, Yuxiang; Oron, Tal Ronnen; Clark, Wyatt T; Bankapur, Asma R; D'Andrea, Daniel; Lepore, Rosalba; Funk, Christopher S; Kahanda, Indika; Verspoor, Karin M; Ben-Hur, Asa; Koo, Da Chen Emily; Penfold-Brown, Duncan; Shasha, Dennis; Youngs, Noah; Bonneau, Richard; Lin, Alexandra; Sahraeian, Sayed M E; Martelli, Pier Luigi; Profiti, Giuseppe; Casadio, Rita; Cao, Renzhi; Zhong, Zhaolong; Cheng, Jianlin; Altenhoff, Adrian; Skunca, Nives; Dessimoz, Christophe; Dogan, Tunca; Hakala, Kai; Kaewphan, Suwisa; Mehryary, Farrokh; Salakoski, Tapio; Ginter, Filip; Fang, Hai; Smithers, Ben; Oates, Matt; Gough, Julian; Törönen, Petri; Koskinen, Patrik; Holm, Liisa; Chen, Ching-Tai; Hsu, Wen-Lian; Bryson, Kevin; Cozzetto, Domenico; Minneci, Federico; Jones, David T; Chapman, Samuel; Bkc, Dukka; Khan, Ishita K; Kihara, Daisuke; Ofer, Dan.

Genome Biol ; 17(1): 184, 2016 09 07.

Article in English | MEDLINE | ID: mdl-27604469

ABSTRACT

BACKGROUND: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. RESULTS: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. CONCLUSIONS: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.

Subjects

Computational Biology; Proteins/chemistry; Software; Structure-Activity Relationship; Algorithms; Databases, Protein; Gene Ontology; Humans; Molecular Sequence Annotation; Proteins/genetics

Sequential transcriptional changes dictate safe and effective antigen-specific immunotherapy.

Burton, Bronwen R; Britton, Graham J; Fang, Hai; Verhagen, Johan; Smithers, Ben; Sabatos-Peyton, Catherine A; Carney, Laura J; Gough, Julian; Strobel, Stephan; Wraith, David C.

Nat Commun ; 5: 4741, 2014 Sep 03.

Article in English | MEDLINE | ID: mdl-25182274

ABSTRACT

Antigen-specific immunotherapy combats autoimmunity or allergy by reinstating immunological tolerance to target antigens without compromising immune function. Optimization of dosing strategy is critical for effective modulation of pathogenic CD4(+) T-cell activity. Here we report that dose escalation is imperative for safe, subcutaneous delivery of the high self-antigen doses required for effective tolerance induction and elicits anergic, interleukin (IL)-10-secreting regulatory CD4(+) T cells. Analysis of the CD4(+) T-cell transcriptome, at consecutive stages of escalating dose immunotherapy, reveals progressive suppression of transcripts positively regulating inflammatory effector function and repression of cell cycle pathways. We identify transcription factors, c-Maf and NFIL3, and negative co-stimulatory molecules, LAG-3, TIGIT, PD-1 and TIM-3, which characterize this regulatory CD4(+) T-cell population and whose expression correlates with the immunoregulatory cytokine IL-10. These results provide a rationale for dose escalation in T-cell-directed immunotherapy and reveal novel immunological and transcriptional signatures as surrogate markers of successful immunotherapy.

Subjects

Autoantigens/administration & dosage; CD4-Positive T-Lymphocytes/drug effects; Desensitization, Immunologic/methods; Encephalomyelitis, Autoimmune, Experimental/therapy; Peptides/administration & dosage; Transcriptome/drug effects; Animals; Antigens, CD/genetics; Antigens, CD/immunology; Autoantigens/chemistry; Autoantigens/immunology; Basic-Leucine Zipper Transcription Factors/genetics; Basic-Leucine Zipper Transcription Factors/immunology; CD4-Positive T-Lymphocytes/immunology; CD4-Positive T-Lymphocytes/pathology; Clonal Anergy/drug effects; Complex Mixtures/administration & dosage; Complex Mixtures/immunology; Dose-Response Relationship, Immunologic; Encephalomyelitis, Autoimmune, Experimental/chemically induced; Encephalomyelitis, Autoimmune, Experimental/immunology; Encephalomyelitis, Autoimmune, Experimental/pathology; Female; Freund's Adjuvant/administration & dosage; Freund's Adjuvant/immunology; Gene Expression Regulation; Hepatitis A Virus Cellular Receptor 2; Injections, Subcutaneous; Interleukin-10/genetics; Interleukin-10/immunology; Male; Mice; Mice, Transgenic; Peptides/chemistry; Peptides/immunology; Programmed Cell Death 1 Receptor/genetics; Programmed Cell Death 1 Receptor/immunology; Proto-Oncogene Proteins c-maf/genetics; Proto-Oncogene Proteins c-maf/immunology; Receptors, Immunologic/genetics; Receptors, Immunologic/immunology; Receptors, Virus/genetics; Receptors, Virus/immunology; Spinal Cord/chemistry; Transcriptome/immunology; Lymphocyte Activation Gene 3 Protein
