1.
Nucleic Acids Res ; 49(D1): D266-D273, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33237325

ABSTRACT

CATH (https://www.cathdb.info) identifies domains in protein structures from wwPDB and classifies these into evolutionary superfamilies, thereby providing structural and functional annotations. There are two levels: CATH-B, a daily snapshot of the latest domain structures and superfamily assignments, and CATH+, with additional derived data, such as predicted sequence domains and functionally coherent sequence subsets (Functional Families or FunFams). The latest CATH+ release, version 4.3, significantly increases coverage of structural and sequence data, with an addition of 65,351 fully classified domain structures (+15%), providing 500,238 structural domains and 151 million predicted sequence domains (+59%) assigned to 5481 superfamilies. The FunFam generation pipeline has been re-engineered to cope with the increased influx of data. Three times more sequences are captured in FunFams, with a concomitant increase in functional purity, information content and structural coverage. FunFam expansion increases the structural annotations provided for experimental GO terms (+59%). We also present CATH-FunVar web pages displaying variations in protein sequences and their proximity to known or predicted functional sites. We present two case studies: (1) putative cancer drivers and (2) SARS-CoV-2 proteins. Finally, we have improved links to and from CATH, including SCOP, InterPro, Aquaria and 2DProt.


Subjects
Computational Biology/statistics & numerical data , Databases, Protein/statistics & numerical data , Protein Domains , Proteins/chemistry , Amino Acid Sequence , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19/virology , Computational Biology/methods , Epidemics , Humans , Internet , Molecular Sequence Annotation , Proteins/genetics , Proteins/metabolism , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , Sequence Analysis, Protein/methods , Sequence Homology, Amino Acid , Viral Proteins/chemistry , Viral Proteins/genetics , Viral Proteins/metabolism
2.
Bioinformatics ; 37(8): 1099-1106, 2021 05 23.
Article in English | MEDLINE | ID: mdl-33135053

ABSTRACT

MOTIVATION: Identification of functional sites in proteins is essential for functional characterization, variant interpretation and drug design. Several methods are available for predicting either a generic functional site or specific types of functional site. Here, we present FunSite, a machine learning predictor that identifies catalytic, ligand-binding and protein-protein interaction functional sites using features derived from protein sequence and structure, and evolutionary data from CATH functional families (FunFams). RESULTS: FunSite's prediction performance was rigorously benchmarked using cross-validation and a holdout dataset. FunSite outperformed other publicly available functional site prediction methods. We show that conserved residues in FunFams are enriched in functional sites. We found that FunSite's performance depends greatly on the quality of functional site annotations and the information content of FunFams in the training data. Finally, we analyze which structural and evolutionary features are most predictive for functional sites. AVAILABILITY AND IMPLEMENTATION: https://github.com/UCL/cath-funsite-predictor. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Machine Learning , Proteins , Amino Acid Sequence , Biological Evolution , Humans , Proteins/genetics
3.
Nucleic Acids Res ; 47(D1): D280-D284, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30398663

ABSTRACT

This article provides an update of the latest data and developments within the CATH protein structure classification database (http://www.cathdb.info). The resource provides two levels of release: CATH-B, a daily snapshot of the latest structural domain boundaries and superfamily assignments, and CATH+, which adds layers of derived data, such as predicted sequence domains, functional annotations and functional clustering (known as Functional Families or FunFams). The most recent CATH+ release (version 4.2) provides a huge update in the coverage of structural data. This release increases the number of fully classified domains by over 40% (from 308 999 to 434 857 structural domains), corresponding to an almost two-fold increase in sequence data (from 53 million to over 95 million predicted domains) organised into 6119 superfamilies. The coverage of high-resolution protein PDB chains that contain at least one assigned CATH domain is now 90.2% (increased from 82.3% in the previous release). A number of highly requested features have also been implemented in our web pages: allowing the user to view an alignment between their query sequence and a representative FunFam structure, and providing tools that make it easier to view the full structural context (multi-domain architecture) of domains and chains.


Subjects
Databases, Protein , Genome , Amino Acid Sequence , Animals , Conserved Sequence , Gene Ontology , Humans , Models, Molecular , Molecular Sequence Annotation , Multigene Family/genetics , Protein Conformation , Protein Domains/genetics , Sequence Alignment , Sequence Homology, Amino Acid , Structure-Activity Relationship
4.
J Immunol ; 191(11): 5398-409, 2013 Dec 01.
Article in English | MEDLINE | ID: mdl-24146041

ABSTRACT

EBV elicits primary CD8(+) T cell responses that, by T cell cloning from infectious mononucleosis (IM) patients, appear skewed toward immediate early (IE) and some early (E) lytic cycle proteins, with late (L) proteins rarely targeted. However, L Ag-specific responses have been detected regularly in polyclonal T cell cultures from long-term virus carriers. To resolve this apparent difference between responses to primary and persistent infection, 13 long-term carriers were screened in ex vivo IFN-γ ELISPOT assays using peptides spanning the two IE, six representative E, and seven representative L proteins. This revealed memory CD8 responses to 44 new lytic cycle epitopes that straddle all three protein classes but, in terms of both frequency and size, maintain the IE > E > L hierarchy of immunodominance. Having identified the HLA restriction of 10 (including 7 L) new epitopes using memory CD8(+) T cell clones, we looked in HLA-matched IM patients and found such reactivities but typically at low levels, explaining why they had gone undetected in the original IM clonal screens. Wherever tested, all CD8(+) T cell clones against these novel lytic cycle epitopes recognized lytically infected cells naturally expressing their target Ag. Surprisingly, however, clones against the most frequently recognized L Ag, the BNRF1 tegument protein, also recognized latently infected, growth-transformed cells. We infer that BNRF1 is also a latent Ag that could be targeted in T cell therapy of EBV-driven B-lymphoproliferative disease.


Subjects
CD8-Positive T-Lymphocytes/immunology , Herpesvirus 4, Human/immunology , Infectious Mononucleosis/immunology , Amino Acid Sequence , CD8-Positive T-Lymphocytes/virology , Cells, Cultured , Enzyme-Linked Immunospot Assay , HLA Antigens/metabolism , Humans , Immunodominant Epitopes/immunology , Immunodominant Epitopes/metabolism , Interferon-gamma/metabolism , Molecular Sequence Data , Peptide Fragments/immunology , Peptide Fragments/metabolism , Protein Binding , Viral Envelope Proteins/immunology , Viral Envelope Proteins/metabolism , Virus Latency/immunology
5.
Elife ; 12, 2023 10 03.
Article in English | MEDLINE | ID: mdl-37787768

ABSTRACT

Many proteins remain poorly characterized even in well-studied organisms, presenting a bottleneck for research. We applied phenomics and machine-learning approaches with Schizosaccharomyces pombe for broad cues on protein functions. We assayed colony-growth phenotypes to measure the fitness of deletion mutants for 3509 non-essential genes in 131 conditions with different nutrients, drugs, and stresses. These analyses exposed phenotypes for 3492 mutants, including 124 mutants of 'priority unstudied' proteins conserved in humans, providing varied functional clues. For example, over 900 proteins were newly implicated in the resistance to oxidative stress. Phenotype-correlation networks suggested roles for poorly characterized proteins through 'guilt by association' with known proteins. For complementary functional insights, we predicted Gene Ontology (GO) terms using machine learning methods exploiting protein-network and protein-homology data (NET-FF). We obtained 56,594 high-scoring GO predictions, of which 22,060 also featured high information content. Our phenotype-correlation data and NET-FF predictions showed a strong concordance with existing PomBase GO annotations and protein networks, with integrated analyses revealing 1675 novel GO predictions for 783 genes, including 47 predictions for 23 priority unstudied proteins. Experimental validation identified new proteins involved in cellular aging, showing that these predictions and phenomics data provide a rich resource to uncover new protein functions.


Subjects
Schizosaccharomyces pombe Proteins , Schizosaccharomyces , Humans , Phenomics , Schizosaccharomyces pombe Proteins/genetics , Phenotype , Schizosaccharomyces/genetics , Machine Learning
6.
Sci Rep ; 10(1): 18517, 2020 10 28.
Article in English | MEDLINE | ID: mdl-33116184

ABSTRACT

Alzheimer's disease (AD), the most prevalent form of dementia, is a progressive and devastating neurodegenerative condition for which there are no effective treatments. Understanding the molecular pathology of AD during disease progression may identify new ways to reduce neuronal damage. Here, we present a longitudinal study tracking dynamic proteomic alterations in the brains of an inducible Drosophila melanogaster model of AD expressing the Arctic mutant Aβ42 gene. We identified 3093 proteins from flies that were induced to express Aβ42 and age-matched healthy controls using label-free quantitative ion-mobility data independent analysis mass spectrometry. Of these, 228 proteins were significantly altered by Aβ42 accumulation and were enriched for AD-associated processes. Network analyses further revealed that these proteins have distinct hub and bottleneck properties in the brain protein interaction network, suggesting that several may have significant effects on brain function. Our unbiased analysis provides useful insights into the key processes governing the progression of amyloid toxicity and forms a basis for further functional analyses in model organisms and translation to mammalian systems.


Subjects
Amyloid beta-Peptides/metabolism , Brain/metabolism , Peptide Fragments/metabolism , Protein Interaction Maps/physiology , Alzheimer Disease/metabolism , Alzheimer Disease/physiopathology , Amyloid beta-Peptides/physiology , Animals , Disease Models, Animal , Disease Progression , Drosophila Proteins/metabolism , Drosophila melanogaster/metabolism , Longitudinal Studies , Neurons/metabolism , Peptide Fragments/physiology , Proteomics/methods