1.
Nucleic Acids Res ; 51(D1): D384-D388, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36477806

ABSTRACT

NLM's conserved domain database (CDD) is a collection of protein domain and protein family models constructed as multiple sequence alignments. Its main purpose is to provide annotation for protein and translated nucleotide sequences with the location of domain footprints and associated functional sites, and to define protein domain architecture as a basis for assigning gene product names and putative/predicted function. CDD has been available publicly for over 20 years and has grown substantially during that time. Maintaining an archive of pre-computed annotation continues to be a challenge and has slowed down the cadence of CDD releases. CDD curation staff builds hierarchical classifications of large protein domain families, adds models for novel domain families via surveillance of the protein 'dark matter' that currently lacks annotation, and now spends considerable effort on providing names and attribution for conserved domain architectures. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.


Subjects
Databases, Protein , Proteins , Humans , Amino Acid Sequence , Conserved Sequence , Protein Structure, Tertiary , Proteins/chemistry , Proteins/genetics , Protein Domains
2.
Nucleic Acids Res ; 49(D1): D1020-D1028, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33270901

ABSTRACT

The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) contains nearly 200 000 bacterial and archaeal genomes and 150 million proteins with up-to-date annotation. Changes in the Prokaryotic Genome Annotation Pipeline (PGAP) since 2018 have resulted in a substantial reduction in spurious annotation. The hierarchical collection of protein family models (PFMs) used by PGAP as evidence for structural and functional annotation was expanded to over 35 000 protein profile hidden Markov models (HMMs), 12 300 BlastRules and 36 000 curated CDD architectures. As a result, >122 million or 79% of RefSeq proteins are now named based on a match to a curated PFM. Gene symbols, Enzyme Commission numbers or supporting publication attributes are available on over 40% of the PFMs and are inherited by the proteins and features they name, facilitating multi-genome analyses and connections to the literature. In adherence with the principles of FAIR (findable, accessible, interoperable, reusable), the PFMs are available in the Protein Family Models Entrez database to any user. Finally, the reference and representative genome set, a taxonomically diverse subset of RefSeq prokaryotic genomes, is now recalculated regularly and available for download and homology searches with BLAST. RefSeq is found at https://www.ncbi.nlm.nih.gov/refseq/.


Subjects
Computational Biology/methods , Databases, Genetic , Genome, Archaeal/genetics , Genome, Bacterial/genetics , Molecular Sequence Annotation/methods , Proteins/genetics , Data Curation/methods , Data Mining/methods , Genomics/methods , Internet , Proteins/classification , User-Computer Interface
3.
Nucleic Acids Res ; 48(D1): D265-D268, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31777944

ABSTRACT

As NLM's Conserved Domain Database (CDD) enters its 20th year of operations as a publicly available resource, CDD curation staff continues to develop hierarchical classifications of widely distributed protein domain families, and to record conserved sites associated with molecular function, so that they can be mapped onto user queries in support of hypothesis-driven biomolecular research. CDD offers both an archive of pre-computed domain annotations as well as live search services for both single protein or nucleotide queries and larger sets of protein query sequences. CDD staff has continued to characterize protein families via conserved domain architectures and has built up a significant corpus of curated domain architectures in support of naming bacterial proteins in RefSeq. These architecture definitions are available via SPARCLE, the Subfamily Protein Architecture Labeling Engine. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.


Subjects
Databases, Protein , Protein Domains , Amino Acid Sequence , Conserved Sequence
4.
Nucleic Acids Res ; 46(D1): D851-D860, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29112715

ABSTRACT

The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) provides annotation for over 95 000 prokaryotic genomes that meet standards for sequence quality, completeness, and freedom from contamination. Genomes are annotated by a single Prokaryotic Genome Annotation Pipeline (PGAP) to provide users with a resource that is as consistent and accurate as possible. Notable recent changes include the development of a hierarchical evidence scheme, a new focus on curating annotation evidence sources, the addition and curation of protein profile hidden Markov models (HMMs), release of an updated pipeline (PGAP-4), and comprehensive re-annotation of RefSeq prokaryotic genomes. Antimicrobial resistance proteins have been reannotated comprehensively, improved structural annotation of insertion sequence transposases and selenoproteins is provided, curated complex domain architectures have given upgraded names to millions of multidomain proteins, and we introduce a new kind of annotation rule, BlastRules. Continual curation of supporting evidence, and propagation of improved names onto RefSeq proteins ensures that the functional annotation of genomes is kept current. An increasing share of our annotation now derives from HMMs and other sets of annotation rules that are portable by nature, and available for download and for reuse by other investigators. RefSeq is found at https://www.ncbi.nlm.nih.gov/refseq/.


Subjects
Data Curation , Databases, Nucleic Acid , Genome , Molecular Sequence Annotation , Prokaryotic Cells , Archaea/genetics , Bacteria/genetics , Databases, Protein , Eukaryota/genetics , Forecasting , Humans , Sequence Homology , Software , Viruses/genetics
5.
Nucleic Acids Res ; 45(D1): D200-D203, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899674

ABSTRACT

NCBI's Conserved Domain Database (CDD) aims at annotating biomolecular sequences with the location of evolutionarily conserved protein domain footprints, and functional sites inferred from such footprints. An archive of pre-computed domain annotation is maintained for proteins tracked by NCBI's Entrez database, and live search services are offered as well. CDD curation staff supplements a comprehensive collection of protein domain and protein family models, which have been imported from external providers, with representations of selected domain families that are curated in-house and organized into hierarchical classifications of functionally distinct families and sub-families. CDD also supports comparative analyses of protein families via conserved domain architectures, and a recent curation effort focuses on providing functional characterizations of distinct subfamily architectures using SPARCLE: Subfamily Protein Architecture Labeling Engine. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.


Subjects
Computational Biology/methods , Databases, Protein , Protein Interaction Domains and Motifs , Proteins , Information Dissemination , Internet , Proteins/chemistry , Proteins/classification , Proteins/genetics
6.
Nucleic Acids Res ; 43(Database issue): D222-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25414356

ABSTRACT

NCBI's CDD, the Conserved Domain Database, enters its 15th year as a public resource for the annotation of proteins with the location of conserved domain footprints. Going forward, we strive to improve the coverage and consistency of domain annotation provided by CDD. We maintain a live search system as well as an archive of pre-computed domain annotation for sequences tracked in NCBI's Entrez protein database, which can be retrieved for single sequences or in bulk. We also maintain import procedures so that CDD contains domain models and domain definitions provided by several collections available in the public domain, as well as those produced by an in-house curation effort. The curation effort aims at increasing coverage and providing finer-grained classifications of common protein domains, for which a wealth of functional and structural data has become available. CDD curation generates alignment models of representative sequence fragments, which are in agreement with domain boundaries as observed in protein 3D structure, and which model the structurally conserved cores of domain families as well as annotate conserved features. CDD can be accessed at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.


Subjects
Databases, Protein , Protein Structure, Tertiary , Amino Acid Motifs , Amino Acid Sequence , Conserved Sequence , Data Curation
7.
Nucleic Acids Res ; 41(Database issue): D348-52, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23197659

ABSTRACT

CDD, the Conserved Domain Database, is part of NCBI's Entrez query and retrieval system and is also accessible via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml. CDD provides annotation of protein sequences with the location of conserved domain footprints and functional sites inferred from these footprints. Pre-computed annotation is available via Entrez, and interactive search services accept single protein or nucleotide queries, as well as batch submissions of protein query sequences, utilizing RPS-BLAST to rapidly identify putative matches. CDD incorporates several protein domain and full-length protein model collections, and maintains an active curation effort that aims at providing fine grained classifications for major and well-characterized protein domain families, as supported by available protein three-dimensional (3D) structure and the published literature. To this date, the majority of protein 3D structures are represented by models tracked by CDD, and CDD curators are characterizing novel families that emerge from protein structure determination efforts.


Subjects
Databases, Protein , Protein Conformation , Protein Structure, Tertiary , Amino Acid Sequence , Conserved Sequence , Internet , Models, Molecular , Molecular Sequence Annotation , Proteins/chemistry , Proteins/classification , Proteins/genetics , Sequence Analysis, Protein
8.
Nucleic Acids Res ; 39(Database issue): D225-9, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21109532

ABSTRACT

NCBI's Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD includes manually curated domain models that make use of protein 3D structure to refine domain models and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. As CDD also imports domain family models from a variety of external sources, it is a partially redundant collection. To simplify protein annotation, redundant models and models describing homologous families are clustered into superfamilies. By default, domain footprints are annotated with the corresponding superfamily designation, on top of which specific annotation may indicate high-confidence assignment of family membership. Pre-computed domain annotation is available for proteins in the Entrez/Protein dataset, and a novel interface, Batch CD-Search, allows the computation and download of annotation for large sets of protein queries. CDD can be accessed via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.


Subjects
Databases, Protein , Protein Structure, Tertiary , Amino Acid Sequence , Conserved Sequence , Models, Biological , Proteins/classification , Sequence Analysis, Protein
9.
Nucleic Acids Res ; 37(Database issue): D205-10, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18984618

ABSTRACT

NCBI's Conserved Domain Database (CDD) is a collection of multiple sequence alignments and derived database search models, which represent protein domains conserved in molecular evolution. The collection can be accessed at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml, and is also part of NCBI's Entrez query and retrieval system, cross-linked to numerous other resources. CDD provides annotation of domain footprints and conserved functional sites on protein sequences. Precalculated domain annotation can be retrieved for protein sequences tracked in NCBI's Entrez system, and CDD's collection of models can be queried with novel protein sequences via the CD-Search service at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. Starting with the latest version of CDD, v2.14, information from redundant and homologous domain models is summarized at a superfamily level, and domain annotation on proteins is flagged as either 'specific' (identifying molecular function with high confidence) or as 'non-specific' (identifying superfamily membership only).


Subjects
Databases, Protein , Protein Structure, Tertiary , Amino Acid Sequence , Conserved Sequence , Proteins/classification , Sequence Alignment , Sequence Analysis, Protein
10.
Proc Natl Acad Sci U S A ; 100(15): 8758-63, 2003 Jul 22.
Article in English | MEDLINE | ID: mdl-12840145

ABSTRACT

Previous in vitro studies showed that the bromodomain binds to acetyllysines on histone tails, leading to the proposal that the domain is involved in deciphering the histone code. However, there is little in vivo evidence supporting the binding of bromodomains to acetylated chromatin in the native environment. Brd4 is a member of the BET family that carries two bromodomains. It associates with mitotic chromosomes, a feature characteristic of the family. Here, we studied the interaction of Brd4 with chromatin in living cells by photobleaching. Brd4 was mobile and interacted with chromatin with a rapid "on and off" mode of binding. This interaction required both bromodomains. Indicating a preferential interaction with acetylated chromatin, Brd4 became less mobile upon increased chromatin acetylation caused by a histone deacetylase inhibitor. Providing biochemical support, salt solubility of Brd4 was markedly reduced upon increased histone acetylation. This change also required both bromodomains. In peptide binding assays, Brd4 avidly bound to di- and tetraacetylated histone H4 and diacetylated H3, but weakly or not at all to mono- and unacetylated H3 and H4. By contrast, it did not bind to unacetylated H4 or H3. Further, Brd4 colocalized with acetylated H4 and H3 in noncentromeric regions of mitotic chromosomes. This colocalization also required both bromodomains. These observations indicate that Brd4 specifically recognizes acetylated histone codes, and this recognition is passed onto the chromatin of newly divided cells.


Subjects
Chromatin/metabolism , Oncogene Proteins, Fusion/metabolism , Acetylation , Amino Acid Sequence , Animals , Cell Line , Chromatin/chemistry , Fluorescence Recovery After Photobleaching , Histones/chemistry , Histones/metabolism , Interphase , Mice , Mitosis , Molecular Sequence Data , Nuclear Proteins , Oncogene Proteins, Fusion/genetics , Point Mutation , Protein Binding , Recombinant Fusion Proteins/genetics , Recombinant Fusion Proteins/metabolism , Transcription Factors
11.
Infect Immun ; 70(12): 6948-60, 2002 Dec.
Article in English | MEDLINE | ID: mdl-12438374

ABSTRACT

Apical membrane antigen 1 (AMA1) is regarded as a leading malaria blood-stage vaccine candidate. While the overall structure of AMA1 is conserved in Plasmodium spp., numerous AMA1 allelic variants of P. falciparum have been described. The effect of AMA1 allelic diversity on the ability of a recombinant AMA1 vaccine to protect against human infection by different P. falciparum strains is unknown. We characterize two allelic forms of AMA1 that were both produced in Pichia pastoris at a sufficient economy of scale to be usable for clinical vaccine studies. Both proteins were used to immunize rabbits, singly and in combination, in order to evaluate their immunogenicity and the ability of elicited antibodies to block the growth of different P. falciparum clones. Both antigens, when used alone, elicited high homologous anti-AMA1 titers, with reduced strain cross-reactivity. Similarly, sera from rabbits immunized with a single antigen were capable of blocking the growth of homologous parasite strains at levels theoretically sufficient to clear parasite infections. However, heterologous inhibition was significantly reduced, providing experimental evidence that AMA1 allelic diversity is a result of immune pressure. Encouragingly, rabbits immunized with a combination of both antigens exhibited titers and levels of parasite inhibition as good as those of the single-antigen-immunized rabbits for each of the homologous parasite lines, and consequently exhibited a broadening of allelic diversity coverage.


Subjects
Alleles , Antigens, Protozoan , Genetic Variation , Malaria Vaccines/immunology , Membrane Proteins/immunology , Membrane Proteins/metabolism , Plasmodium falciparum/immunology , Protozoan Proteins/immunology , Protozoan Proteins/metabolism , Amino Acid Sequence , Animals , Antibodies, Protozoan/blood , Humans , Immunization , Immunization Schedule , Malaria Vaccines/genetics , Malaria, Falciparum/prevention & control , Membrane Proteins/genetics , Molecular Sequence Data , Plasmodium falciparum/genetics , Plasmodium falciparum/growth & development , Protozoan Proteins/genetics , Rabbits , Recombinant Proteins/genetics , Recombinant Proteins/immunology , Recombinant Proteins/metabolism