Search | BVS Violência e Saúde

The InterPro protein families and domains database: 20 years on.

Blum, Matthias; Chang, Hsin-Yu; Chuguransky, Sara; Grego, Tiago; Kandasaamy, Swaathi; Mitchell, Alex; Nuka, Gift; Paysan-Lafosse, Typhaine; Qureshi, Matloob; Raj, Shriya; Richardson, Lorna; Salazar, Gustavo A; Williams, Lowri; Bork, Peer; Bridge, Alan; Gough, Julian; Haft, Daniel H; Letunic, Ivica; Marchler-Bauer, Aron; Mi, Huaiyu; Natale, Darren A; Necci, Marco; Orengo, Christine A; Pandurangan, Arun P; Rivoire, Catherine; Sigrist, Christian J A; Sillitoe, Ian; Thanki, Narmada; Thomas, Paul D; Tosatto, Silvio C E; Wu, Cathy H; Bateman, Alex; Finn, Robert D.

Nucleic Acids Res ; 49(D1): D344-D354, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33156333

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.

Subjects

Databases, Protein; Proteins/chemistry; Amino Acid Sequence; COVID-19/metabolism; Internet; Molecular Sequence Annotation; Protein Domains; Protein Interaction Maps; SARS-CoV-2/metabolism; Sequence Alignment

InterPro in 2019: improving coverage, classification and access to protein sequence annotations.

Mitchell, Alex L; Attwood, Teresa K; Babbitt, Patricia C; Blum, Matthias; Bork, Peer; Bridge, Alan; Brown, Shoshana D; Chang, Hsin-Yu; El-Gebali, Sara; Fraser, Matthew I; Gough, Julian; Haft, David R; Huang, Hongzhan; Letunic, Ivica; Lopez, Rodrigo; Luciani, Aurélien; Madeira, Fabio; Marchler-Bauer, Aron; Mi, Huaiyu; Natale, Darren A; Necci, Marco; Nuka, Gift; Orengo, Christine; Pandurangan, Arun P; Paysan-Lafosse, Typhaine; Pesseat, Sebastien; Potter, Simon C; Qureshi, Matloob A; Rawlings, Neil D; Redaschi, Nicole; Richardson, Lorna J; Rivoire, Catherine; Salazar, Gustavo A; Sangrador-Vegas, Amaia; Sigrist, Christian J A; Sillitoe, Ian; Sutton, Granger G; Thanki, Narmada; Thomas, Paul D; Tosatto, Silvio C E; Yong, Siew-Yit; Finn, Robert D.

Nucleic Acids Res ; 47(D1): D351-D360, 2019 01 08.

Article in English | MEDLINE | ID: mdl-30398656

ABSTRACT

The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro, and provide greater flexibility in terms of data access. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB, and discuss how our evaluation of residue coverage may help guide future curation activities.

Subjects

Databases, Protein; Molecular Sequence Annotation; Animals; Databases, Genetic; Gene Ontology; Humans; Internet; Multigene Family; Protein Domains/genetics; Sequence Homology, Amino Acid; Software; User-Computer Interface

InterPro in 2017-beyond protein family and domain annotations.

Finn, Robert D; Attwood, Teresa K; Babbitt, Patricia C; Bateman, Alex; Bork, Peer; Bridge, Alan J; Chang, Hsin-Yu; Dosztányi, Zsuzsanna; El-Gebali, Sara; Fraser, Matthew; Gough, Julian; Haft, David; Holliday, Gemma L; Huang, Hongzhan; Huang, Xiaosong; Letunic, Ivica; Lopez, Rodrigo; Lu, Shennan; Marchler-Bauer, Aron; Mi, Huaiyu; Mistry, Jaina; Natale, Darren A; Necci, Marco; Nuka, Gift; Orengo, Christine A; Park, Youngmi; Pesseat, Sebastien; Piovesan, Damiano; Potter, Simon C; Rawlings, Neil D; Redaschi, Nicole; Richardson, Lorna; Rivoire, Catherine; Sangrador-Vegas, Amaia; Sigrist, Christian; Sillitoe, Ian; Smithers, Ben; Squizzato, Silvano; Sutton, Granger; Thanki, Narmada; Thomas, Paul D; Tosatto, Silvio C E; Wu, Cathy H; Xenarios, Ioannis; Yeh, Lai-Su; Young, Siew-Yit; Mitchell, Alex L.

Nucleic Acids Res ; 45(D1): D190-D199, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27899635

ABSTRACT

InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPro's predictive models, which are provided by its member databases. Here, we report recent developments with InterPro and its associated software, including the addition of two new databases (SFLD and CDD), and the functionality to include residue-level annotation and prediction of intrinsic disorder. These developments enrich the annotations provided by InterPro, increase the overall number of residues annotated and allow more specific functional inferences.

Subjects

Computational Biology/methods; Databases, Protein; Protein Interaction Domains and Motifs; Software; Humans; Molecular Sequence Annotation; Phylogeny

The InterPro protein families database: the classification resource after 15 years.

Mitchell, Alex; Chang, Hsin-Yu; Daugherty, Louise; Fraser, Matthew; Hunter, Sarah; Lopez, Rodrigo; McAnulla, Craig; McMenamin, Conor; Nuka, Gift; Pesseat, Sebastien; Sangrador-Vegas, Amaia; Scheremetjew, Maxim; Rato, Claudia; Yong, Siew-Yit; Bateman, Alex; Punta, Marco; Attwood, Teresa K; Sigrist, Christian J A; Redaschi, Nicole; Rivoire, Catherine; Xenarios, Ioannis; Kahn, Daniel; Guyot, Dominique; Bork, Peer; Letunic, Ivica; Gough, Julian; Oates, Matt; Haft, Daniel; Huang, Hongzhan; Natale, Darren A; Wu, Cathy H; Orengo, Christine; Sillitoe, Ian; Mi, Huaiyu; Thomas, Paul D; Finn, Robert D.

Nucleic Acids Res ; 43(Database issue): D213-21, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25428371

ABSTRACT

The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36,766 member database signatures integrated into 26,238 InterPro entries, an increase of over 3993 entries (5081 signatures) since 2012.

Subjects

Databases, Protein; Proteins/classification; Bacteria/metabolism; Gene Ontology; Protein Structure, Tertiary; Proteins/genetics; Sequence Analysis, Protein; Software

EBI metagenomics--a new resource for the analysis and archiving of metagenomic data.

Hunter, Sarah; Corbett, Matthew; Denise, Hubert; Fraser, Matthew; Gonzalez-Beltran, Alejandra; Hunter, Christopher; Jones, Philip; Leinonen, Rasko; McAnulla, Craig; Maguire, Eamonn; Maslen, John; Mitchell, Alex; Nuka, Gift; Oisel, Arnaud; Pesseat, Sebastien; Radhakrishnan, Rajesh; Rocca-Serra, Philippe; Scheremetjew, Maxim; Sterk, Peter; Vaughan, Daniel; Cochrane, Guy; Field, Dawn; Sansone, Susanna-Assunta.

Nucleic Acids Res ; 42(Database issue): D600-6, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24165880

ABSTRACT

Metagenomics is a relatively recently established but rapidly expanding field that uses high-throughput next-generation sequencing technologies to characterize the microbial communities inhabiting different ecosystems (including oceans, lakes, soil, tundra, plants and body sites). Metagenomics brings with it a number of challenges, including the management, analysis, storage and sharing of data. In response to these challenges, we have developed a new metagenomics resource (http://www.ebi.ac.uk/metagenomics/) that allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive.

Subjects

Databases, Genetic; Metagenomics; Gene Expression Profiling; Internet; Metabolomics; Proteomics; Software

InterProScan 5: genome-scale protein function classification.

Jones, Philip; Binns, David; Chang, Hsin-Yu; Fraser, Matthew; Li, Weizhong; McAnulla, Craig; McWilliam, Hamish; Maslen, John; Mitchell, Alex; Nuka, Gift; Pesseat, Sebastien; Quinn, Antony F; Sangrador-Vegas, Amaia; Scheremetjew, Maxim; Yong, Siew-Yit; Lopez, Rodrigo; Hunter, Sarah.

Bioinformatics ; 30(9): 1236-40, 2014 May 01.

Article in English | MEDLINE | ID: mdl-24451626

ABSTRACT

MOTIVATION: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs of the software and the complete reimplementation of the software framework, resulting in a flexible and stable system that is able to use multiprocessor machines and/or conventional clusters to achieve scalable distributed data analysis. InterProScan is freely available for download from the EMBL-EBI FTP site and the open source code is hosted at Google Code.

Subjects

Genome; Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Proteins/analysis; Arabidopsis/chemistry; Arabidopsis/genetics; Cluster Analysis; Programming Languages; Proteins/genetics; Proteins/metabolism; Software
