1.
Blood ; 140(10): 1167-1181, 2022 Sep 08.
Article in English | MEDLINE | ID: mdl-35853161

ABSTRACT

Patients with acute myeloid leukemia (AML) often achieve remission after allogeneic hematopoietic cell transplantation (allo-HCT) but subsequently die of relapse driven by leukemia cells resistant to elimination by allogeneic T cells based on decreased major histocompatibility complex II (MHC-II) expression and apoptosis resistance. Here we demonstrate that mouse-double-minute-2 (MDM2) inhibition can counteract immune evasion of AML. MDM2 inhibition induced MHC class I and II expression in murine and human AML cells. Using xenografts of human AML and syngeneic mouse models of leukemia, we show that MDM2 inhibition enhanced cytotoxicity against leukemia cells and improved survival. MDM2 inhibition also led to increases in tumor necrosis factor-related apoptosis-inducing ligand receptor-1 and -2 (TRAIL-R1/2) on leukemia cells and higher frequencies of CD8+CD27lowPD-1lowTIM-3low T cells, with features of cytotoxicity (perforin+CD107a+TRAIL+) and longevity (bcl-2+IL-7R+). CD8+ T cells isolated from leukemia-bearing MDM2 inhibitor-treated allo-HCT recipients exhibited higher glycolytic activity and enrichment for nucleotides and their precursors compared with vehicle control subjects. T cells isolated from MDM2 inhibitor-treated AML-bearing mice eradicated leukemia in secondary AML-bearing recipients. Mechanistically, the MDM2 inhibitor-mediated effects were p53-dependent because p53 knockdown abolished TRAIL-R1/2 and MHC-II upregulation, whereas p53 binding to TRAIL-R1/2 promoters increased upon MDM2 inhibition. The observations in the mouse models were complemented by data from human individuals. Patient-derived AML cells exhibited increased TRAIL-R1/2 and MHC-II expression on MDM2 inhibition. In summary, we identified a targetable vulnerability of AML cells to allogeneic T-cell-mediated cytotoxicity through the restoration of p53-dependent TRAIL-R1/2 and MHC-II production via MDM2 inhibition.


Subjects
Leukemia, Myeloid, Acute; Tumor Suppressor Protein p53; Animals; Apoptosis; Humans; Leukemia, Myeloid, Acute/genetics; Major Histocompatibility Complex; Mice; Proto-Oncogene Proteins c-mdm2/metabolism; Transplantation, Homologous; Tumor Suppressor Protein p53/genetics; Tumor Suppressor Protein p53/metabolism; Up-Regulation
2.
Br J Haematol ; 203(2): 264-281, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37539479

ABSTRACT

Acute myeloid leukaemia (AML) relapse after allogeneic haematopoietic cell transplantation (allo-HCT) is often driven by immune-related mechanisms and associated with poor prognosis. Immune checkpoint inhibitors combined with hypomethylating agents (HMA) may restore or enhance the graft-versus-leukaemia effect. Still, data about using this combination regimen after allo-HCT are limited. We conducted a prospective, phase II, open-label, single-arm study in which we treated patients with haematological AML relapse after allo-HCT with HMA plus the anti-PD-1 antibody nivolumab. The response was correlated with DNA-, RNA- and protein-based single-cell technology assessments to identify biomarkers associated with therapeutic efficacy. Sixteen patients received a median number of 2 (range 1-7) nivolumab applications. The overall response rate (CR/PR) at day 42 was 25%, and another 25% of the patients achieved stable disease. The median overall survival was 15.6 months. High-parametric cytometry documented a higher frequency of activated (ICOS+, HLA-DR+), low senescence (KLRG1-, CD57-) CD8+ effector T cells in responders. We confirmed these findings in a preclinical model. Single-cell transcriptomics revealed a pro-inflammatory rewiring of the expression profile of T and myeloid cells in responders. In summary, the study indicates that the post-allo-HCT HMA/nivolumab combination induces anti-AML immune responses in selected patients and could be considered as a bridging approach to a second allo-HCT. Trial-registration: EudraCT-No. 2017-002194-18.

3.
PLoS Comput Biol ; 18(10): e1010610, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36260616

ABSTRACT

Proteins that are known only at a sequence level outnumber those with an experimental characterization by orders of magnitude. Classifying protein regions (domains) into homologous families can generate testable functional hypotheses for yet unannotated sequences. Existing domain family resources typically use at least some degree of manual curation: they grow slowly over time and leave a large fraction of the protein sequence space unclassified. Here we describe automatic clustering by Density Peak Clustering of UniRef50 v. 2017_07, a protein sequence database including approximately 23M sequences. We performed a radical re-implementation of a pipeline we previously developed in order to allow handling millions of sequences and data volumes of the order of 3 terabytes. The modified pipeline, which we call DPCfam, finds ∼45,000 protein clusters in UniRef50. Our automatic classification is in close correspondence to those of the Pfam and ECOD resources: in particular, about 81% of medium-large Pfam families and 72% of ECOD families can be mapped to clusters generated by DPCfam. In addition, our protocol finds more than 14,000 clusters consisting of protein regions with no Pfam annotation, which are therefore candidates for representing novel protein families. These results are made available to the scientific community through a dedicated repository.


Subjects
Proteins; Databases, Protein; Proteins/genetics; Cluster Analysis; Amino Acid Sequence; Protein Domains
4.
BMC Bioinformatics ; 22(1): 121, 2021 Mar 12.
Article in English | MEDLINE | ID: mdl-33711918

ABSTRACT

BACKGROUND: The identification of protein families is of outstanding practical importance for in silico protein annotation and is at the basis of several bioinformatic resources. Pfam is possibly the most well known protein family database, built in many years of work by domain experts with extensive use of manual curation. This approach is generally very accurate, but it is quite time consuming and it may suffer from a bias generated from the hand-curation itself, which is often guided by the available experimental evidence. RESULTS: We introduce a procedure that aims to identify automatically putative protein families. The procedure is based on Density Peak Clustering and uses as input only local pairwise alignments between protein sequences. In the experiment we present here, we ran the algorithm on about 4000 full-length proteins with at least one domain classified by Pfam as belonging to the Pseudouridine synthase and Archaeosine transglycosylase (PUA) clan. We obtained 71 automatically-generated sequence clusters with at least 100 members. While our clusters were largely consistent with the Pfam classification, showing good overlap with either single or multi-domain Pfam family architectures, we also observed some inconsistencies. The latter were inspected using structural and sequence based evidence, which suggested that the automatic classification captured evolutionary signals reflecting non-trivial features of protein family architectures. Based on this analysis we identified a putative novel pre-PUA domain as well as alternative boundaries for a few PUA or PUA-associated families. As a first indication that our approach was unlikely to be clan-specific, we performed the same analysis on the P53 clan, obtaining comparable results. 
CONCLUSIONS: The clustering procedure described in this work takes advantage of the information contained in a large set of pairwise alignments and successfully identifies a set of putative families and family architectures in an unsupervised manner. Comparison with the Pfam classification highlights significant overlap and points to interesting differences, suggesting that our new algorithm could have potential in applications related to automatic protein classification. Testing this hypothesis, however, will require further experiments on large and diverse sequence datasets.
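
The Density Peak Clustering idea named in this abstract can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not the authors' procedure: it takes a plain symmetric distance matrix instead of local pairwise alignments, and the function name `density_peak_clustering`, the cutoff `d_c` and the `n_centers` parameter are hypothetical choices made only for this example.

```python
def density_peak_clustering(dist, d_c, n_centers):
    """Toy Density Peak Clustering on a symmetric distance matrix.

    dist: list of lists of pairwise distances (assumed input format).
    d_c: distance cutoff used to count neighbours (hypothetical parameter).
    n_centers: number of cluster centres to keep, assumed to be >= 1.
    """
    n = len(dist)
    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index,
    # which is a simplification of the strict "higher density" rule.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n   # distance to the nearest point earlier in `order`
    parent = [-1] * n   # index of that nearest point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour: use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j
    # Centres are the points with the largest rho * delta; the densest
    # point is always forced to be a centre so every point has a labelled root.
    gamma = [rho[i] * delta[i] for i in range(n)]
    gamma[order[0]] = float("inf")
    centers = sorted(range(n), key=lambda i: (-gamma[i], i))[:n_centers]
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Every other point inherits the label of its parent, in density order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


# Two well-separated groups on a line, used as stand-ins for sequences.
points = [0, 1, 2, 50, 51, 52]
dist = [[abs(a - b) for b in points] for a in points]
labels = density_peak_clustering(dist, d_c=1.5, n_centers=2)
```

In this sketch the number of centres is passed in explicitly; choosing it automatically from the density and distance values is left out to keep the example short.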


Subjects
Proteins; Sequence Alignment; Amino Acid Sequence; Cluster Analysis; Databases, Protein; Humans; Proteins/genetics
5.
Nucleic Acids Res ; 44(D1): D279-85, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26673716

ABSTRACT

In the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences and species reported on the website restricted to this smaller set. Building families on reference proteomes sequences brings greater stability, which decreases the amount of manual curation required to maintain them. It also reduces the number of sequences displayed on the website, whilst still providing access to many important model organisms. Matches to the full UniProtKB database are, however, still available and Pfam annotations for individual UniProtKB sequences can still be retrieved. Some Pfam entries (1.6%) which have no matches to reference proteomes remain; we are working with UniProt to see if sequences from them can be incorporated into reference proteomes. Pfam-B, the automatically-generated supplement to Pfam, has been removed. The current release (Pfam 29.0) includes 16 295 entries and 559 clans. The facility to view the relationship between families within a clan has been improved by the introduction of a new tool.


Subjects
Databases, Protein; Proteins/classification; Proteome/chemistry; Sequence Alignment; Sequence Analysis, Protein; Molecular Sequence Annotation
6.
Brief Bioinform ; 16(5): 865-72, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25614388

ABSTRACT

Transport systems comprise roughly 10% of all proteins in a cell, playing critical roles in many processes. Improving and expanding their classification is an important goal that can affect studies ranging from comparative genomics to potential drug target searches. It is not surprising that different classification systems for transport proteins have arisen, be it within a specialized database, focused on this functional class of proteins, or as part of a broader classification system for all proteins. Two such databases are the Transporter Classification Database (TCDB) and the Protein family (Pfam) database. As part of a long-term endeavor to improve consistency between the two classification systems, we have compared transporter annotations in the two databases to understand the rationale for differences and to improve both systems. Differences sometimes reflect the fact that one database has a particular transporter family while the other does not. Differing family definitions and hierarchical organizations were reconciled, resulting in recognition of 69 Pfam 'Domains of Unknown Function', which proved to be transport protein families to be renamed using TCDB annotations. Of over 400 potential new Pfam families identified from TCDB, 10% have already been added to Pfam, and TCDB has created 60 new entries based on Pfam data. This work, for the first time, reveals the benefits of comprehensive database comparisons and explains the differences between Pfam and TCDB.


Subjects
Databases, Protein; Proteins/chemistry
7.
Nature ; 473(7345): 50-4, 2011 May 05.
Article in English | MEDLINE | ID: mdl-21471968

ABSTRACT

Saccharides have a central role in the nutrition of all living organisms. Whereas several saccharide uptake systems are shared between the different phylogenetic kingdoms, the phosphoenolpyruvate-dependent phosphotransferase system exists almost exclusively in bacteria. This multi-component system includes an integral membrane protein EIIC that transports saccharides and assists in their phosphorylation. Here we present the crystal structure of an EIIC from Bacillus cereus that transports diacetylchitobiose. The EIIC is a homodimer, with an expansive interface formed between the amino-terminal halves of the two protomers. The carboxy-terminal half of each protomer has a large binding pocket that contains a diacetylchitobiose, which is occluded from both sides of the membrane with its site of phosphorylation near the conserved His250 and Glu334 residues. The structure shows the architecture of this important class of transporters, identifies the determinants of substrate binding and phosphorylation, and provides a framework for understanding the mechanism of sugar translocation.


Subjects
Bacillus cereus/enzymology; Membrane Transport Proteins/chemistry; Models, Molecular; Binding Sites; Carbohydrate Metabolism; Crystallization; Phosphorylation; Protein Structure, Quaternary; Protein Structure, Tertiary
8.
Nature ; 471(7338): 336-40, 2011 Mar 17.
Article in English | MEDLINE | ID: mdl-21317882

ABSTRACT

The TrkH/TrkG/KtrB proteins mediate K(+) uptake in bacteria and probably evolved from simple K(+) channels by multiple gene duplications or fusions. Here we present the crystal structure of a TrkH from Vibrio parahaemolyticus. TrkH is a homodimer, and each protomer contains an ion permeation pathway. A selectivity filter, similar in architecture to those of K(+) channels but significantly shorter, is lined by backbone and side-chain oxygen atoms. Functional studies showed that TrkH is selective for permeation of K(+) and Rb(+) over smaller ions such as Na(+) or Li(+). Immediately intracellular to the selectivity filter are an intramembrane loop and an arginine residue, both highly conserved, which constrict the permeation pathway. Substituting the arginine with an alanine significantly increases the rate of K(+) flux. These results reveal the molecular basis of K(+) selectivity and suggest a novel gating mechanism for this large and important family of membrane transport proteins.


Subjects
Potassium Channels/chemistry; Potassium Channels/metabolism; Vibrio parahaemolyticus/chemistry; ATP-Binding Cassette Transporters/chemistry; Amino Acid Sequence; Crystallography, X-Ray; Escherichia coli Proteins/chemistry; Ion Channel Gating; Ion Transport; Models, Molecular; Molecular Sequence Data; Potassium/metabolism; Structure-Activity Relationship; Substrate Specificity
9.
Nucleic Acids Res ; 43(Database issue): D382-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25348407

ABSTRACT

Genome3D (http://www.genome3d.eu) is a collaborative resource that provides predicted domain annotations and structural models for key sequences. Since introducing Genome3D in a previous NAR paper, we have substantially extended and improved the resource. We have annotated representatives from Pfam families to improve coverage of diverse sequences and added a fast sequence search to the website to allow users to find Genome3D-annotated sequences similar to their own. We have improved and extended the Genome3D data, enlarging the source data set from three model organisms to 10, and adding VIVACE, a resource new to Genome3D. We have analysed and updated Genome3D's SCOP/CATH mapping. Finally, we have improved the superposition tools, which now give users a more powerful interface for investigating similarities and differences between structural models.


Subjects
Databases, Protein; Molecular Sequence Annotation; Protein Structure, Tertiary; Algorithms; Genomics; Internet; Models, Molecular; Protein Structure, Tertiary/genetics; Sequence Analysis, Protein
10.
Nucleic Acids Res ; 43(Database issue): D213-21, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25428371

ABSTRACT

The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping of Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36,766 member database signatures integrated into 26,238 InterPro entries, an increase of over 3993 entries (5081 signatures), since 2012.


Subjects
Databases, Protein; Proteins/classification; Bacteria/metabolism; Gene Ontology; Protein Structure, Tertiary; Proteins/genetics; Sequence Analysis, Protein; Software
11.
Nature ; 467(7319): 1074-80, 2010 Oct 28.
Article in English | MEDLINE | ID: mdl-20981093

ABSTRACT

The plant SLAC1 anion channel controls turgor pressure in the aperture-defining guard cells of plant stomata, thereby regulating the exchange of water vapour and photosynthetic gases in response to environmental signals such as drought or high levels of carbon dioxide. Here we determine the crystal structure of a bacterial homologue (Haemophilus influenzae) of SLAC1 at 1.20 Å resolution, and use structure-inspired mutagenesis to analyse the conductance properties of SLAC1 channels. SLAC1 is a symmetrical trimer composed from quasi-symmetrical subunits, each having ten transmembrane helices arranged from helical hairpin pairs to form a central five-helix transmembrane pore that is gated by an extremely conserved phenylalanine residue. Conformational features indicate a mechanism for control of gating by kinase activation, and electrostatic features of the pore coupled with electrophysiological characteristics indicate that selectivity among different anions is largely a function of the energetic cost of ion dehydration.


Subjects
Arabidopsis Proteins/chemistry; Bacterial Proteins/chemistry; Haemophilus influenzae/chemistry; Membrane Proteins/chemistry; Plant Stomata/metabolism; Structural Homology, Protein; Amino Acid Sequence; Animals; Arabidopsis/genetics; Arabidopsis/metabolism; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Crystallography, X-Ray; Electric Conductivity; Haemophilus influenzae/genetics; Ion Channel Gating; Models, Molecular; Molecular Sequence Data; Oocytes/metabolism; Phenylalanine/chemistry; Phenylalanine/metabolism; Static Electricity; Substrate Specificity; Xenopus laevis
12.
Nucleic Acids Res ; 42(Web Server issue): W337-43, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24799431

ABSTRACT

PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein-protein binding sites (ISIS2), protein-polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.


Subjects
Protein Conformation; Software; Amino Acid Substitution; Binding Sites; Gene Ontology; Internet; Intrinsically Disordered Proteins/chemistry; Membrane Proteins/chemistry; Mutation; Protein Interaction Mapping; Proteins/analysis; Proteins/genetics; Proteins/metabolism; Sequence Alignment; Sequence Analysis, Protein; Sequence Homology, Amino Acid
13.
Nucleic Acids Res ; 42(Database issue): D222-30, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24288371

ABSTRACT

Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures.


Subjects
Databases, Protein; Sequence Alignment; Sequence Analysis, Protein; Internet; Intrinsically Disordered Proteins/chemistry; Protein Conformation; Proteins/chemistry; Proteins/classification; Proteins/genetics; Proteome/chemistry; Sequence Analysis, DNA
14.
BMC Bioinformatics ; 16: 7, 2015 Jan 16.
Article in English | MEDLINE | ID: mdl-25592227

ABSTRACT

BACKGROUND: N-terminal domains of BVU_4064 and BF1687 proteins from Bacteroides vulgatus and Bacteroides fragilis respectively are members of the Pfam family PF12985 (DUF3869). Proteins containing a domain from this family can be found in most Bacteroides species and, in large numbers, in all human gut microbiome samples. Both BVU_4064 and BF1687 proteins have a consensus lipobox motif implying they are anchored to the membrane, but their functions are otherwise unknown. The C-terminal half of BVU_4064 is assigned to protein family PF12986 (DUF3870); the equivalent part of BF1687 was unclassified. RESULTS: Crystal structures of both BVU_4064 and BF1687 proteins, solved at the JCSG center, show strikingly similar three-dimensional structures. The main difference between the two is that the two domains in the BVU_4064 protein are connected by a short linker, as opposed to a longer insertion made of 4 helices placed linearly along with a strand that is added to the C-terminal domain in the BF1687 protein. The N-terminal domain in both proteins, corresponding to the PF12985 (DUF3869) domain is a β-sandwich with pre-albumin-like fold, found in many proteins belonging to the Transthyretin clan of Pfam. The structures of C-terminal domains of both proteins, corresponding to the PF12986 (DUF3870) domain in BVU_4064 protein and an unclassified domain in the BF1687 protein, show significant structural similarity to bacterial pore-forming toxins. A helix in this domain is in an analogous position to a loop connecting the second and third strands in the toxin structures, where this loop is implicated to play a role in the toxin insertion into the host cell membrane. The same helix also points to the groove between the N- and C-terminal domains that are loosely held together by hydrophobic and hydrogen bond interactions. The presence of several conserved residues in this region together with these structural determinants could make it a functionally important region in these proteins. CONCLUSIONS: Structural analysis of BVU_4064 and BF1687 points to possible roles in mediating multiple interactions on the cell-surface/extracellular matrix. In particular the N-terminal domain could be involved in adhesive interactions, the C-terminal domain and the inter-domain groove in lipid or carbohydrate interactions.


Subjects
Bacterial Proteins/analysis; Bacterial Proteins/chemistry; Bacteroides/chemistry; Cell Adhesion Molecules/metabolism; Lipids/chemistry; Membrane Proteins/metabolism; Amino Acid Sequence; Bacterial Proteins/metabolism; Cell Adhesion/physiology; Cell Adhesion Molecules/chemistry; Crystallography, X-Ray; Humans; Membrane Proteins/chemistry; Molecular Sequence Data; Protein Folding; Protein Structure, Tertiary; Sequence Analysis, Protein; Sequence Homology, Amino Acid
15.
Nucleic Acids Res ; 41(12): e121, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23598997

ABSTRACT

Detection of protein homology via sequence similarity has important applications in biology, from protein structure and function prediction to reconstruction of phylogenies. Although current methods for aligning protein sequences are powerful, challenges remain, including problems with homologous overextension of alignments and with regions under convergent evolution. Here, we test the ability of the profile hidden Markov model method HMMER3 to correctly assign homologous sequences to >13,000 manually curated families from the Pfam database. We identify problem families using protein regions that match two or more Pfam families not currently annotated as related in Pfam. We find that HMMER3 E-value estimates seem to be less accurate for families that feature periodic patterns of compositional bias, such as the ones typically observed in coiled-coils. These results support the continued use of manually curated inclusion thresholds in the Pfam database, especially on the subset of families that have been identified as problematic in experiments such as these. They also highlight the need for developing new methods that can correct for this particular type of compositional bias.
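
The screening step described in this abstract, finding protein regions that match two or more families not annotated as related, can be sketched in Python. This is a hedged illustration only: the input layout (tuples of family, start, end), the `clan_of` mapping and the function name `conflicting_overlaps` are assumptions made for the example and are not taken from the paper or from any Pfam or HMMER interface.

```python
from itertools import combinations


def conflicting_overlaps(matches, clan_of):
    """Return overlapping match pairs whose families are not annotated as related.

    matches: list of (family, start, end) on one sequence, with 1-based
             inclusive coordinates (assumed layout).
    clan_of: dict mapping family -> clan identifier; families that are
             missing from the dict are treated as belonging to no clan.
    """
    conflicts = []
    for (fam_a, s_a, e_a), (fam_b, s_b, e_b) in combinations(matches, 2):
        if fam_a == fam_b:
            continue  # repeated matches to one family are not a conflict
        overlap = min(e_a, e_b) - max(s_a, s_b) + 1
        if overlap <= 0:
            continue  # the two regions do not share any residue
        clan_a, clan_b = clan_of.get(fam_a), clan_of.get(fam_b)
        if clan_a is not None and clan_a == clan_b:
            continue  # same clan: the families are already annotated as related
        conflicts.append((fam_a, fam_b, overlap))
    return conflicts


# Hypothetical family names and coordinates, for illustration only.
example_matches = [("FamA", 10, 50), ("FamB", 40, 90),
                   ("FamC", 45, 60), ("FamD", 100, 120)]
example_clans = {"FamA": "CL1", "FamB": "CL1"}
example_conflicts = conflicting_overlaps(example_matches, example_clans)
```

Each reported tuple gives the two family names and the number of shared residues, which could then be used to rank candidate problem families.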


Subjects
Sequence Analysis, Protein/methods; Sequence Homology, Amino Acid; Evolution, Molecular; Markov Chains; Membrane Proteins/chemistry; Protein Structure, Secondary; Protein Structure, Tertiary; Proteins/classification; Sequence Alignment
16.
BMC Bioinformatics ; 15: 1, 2014 Jan 03.
Article in English | MEDLINE | ID: mdl-24383880

ABSTRACT

BACKGROUND: The Acel_2062 protein from Acidothermus cellulolyticus is a protein of unknown function. Initial sequence analysis predicted that it was a metallopeptidase from the presence of a motif conserved amongst the Asp-zincins, which are peptidases that contain a single, catalytic zinc ion ligated by the histidines and aspartic acid within the motif (HEXXHXXGXXD). The Acel_2062 protein was chosen by the Joint Center for Structural Genomics for crystal structure determination to explore novel protein sequence space and structure-based function annotation. RESULTS: The crystal structure confirmed that the Acel_2062 protein consisted of a single, zincin-like metallopeptidase-like domain. The Met-turn, a structural feature thought to be important for a Met-zincin because it stabilizes the active site, is absent, and its stabilizing role may have been conferred to the C-terminal Tyr113. In our crystallographic model there are two molecules in the asymmetric unit and from size-exclusion chromatography, the protein dimerizes in solution. A water molecule is present in the putative zinc-binding site in one monomer, which is replaced by one of two observed conformations of His95 in the other. CONCLUSIONS: The Acel_2062 protein is structurally related to the zincins. It contains the minimum structural features of a member of this protein superfamily, and can be described as a "mini-zincin". There is a striking parallel with the structure of a mini-Glu-zincin, which represents the minimum structure of a Glu-zincin (a metallopeptidase in which the third zinc ligand is a glutamic acid). Rather than being an ancestral state, phylogenetic analysis suggests that the mini-zincins are derived from larger proteins.
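
The HEXXHXXGXXD motif quoted in this abstract, with X standing for any residue, can be searched for with a regular expression. The Python sketch below is only an illustration: the function name `find_asp_zincin_motif` and the example sequence are made up for this note, and a real analysis would normally rely on profile-based tools rather than an exact pattern.

```python
import re

# HEXXHXXGXXD, where each X is read as "any single residue".
ASP_ZINCIN_MOTIF = re.compile(r"HE..H..G..D")


def find_asp_zincin_motif(sequence):
    """Return 1-based start positions of non-overlapping motif matches."""
    return [m.start() + 1 for m in ASP_ZINCIN_MOTIF.finditer(sequence.upper())]


# Made-up sequence for illustration; it is not Acel_2062.
motif_positions = find_asp_zincin_motif("MKTHEAAHAAGAADLL")
```

Positions are reported 1-based to follow the usual residue-numbering convention for protein sequences.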


Subjects
Bacterial Proteins/chemistry; Metalloproteases/chemistry; Zinc/chemistry; Actinomycetales/chemistry; Actinomycetales/enzymology; Amino Acid Motifs; Amino Acid Sequence; Bacterial Proteins/metabolism; Dimerization; Metalloproteases/metabolism; Models, Molecular; Molecular Sequence Data; Phylogeny; Protein Subunits; Sequence Alignment; Zinc/metabolism
17.
Genome Res ; 21(6): 898-907, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21482623

ABSTRACT

High-throughput X-ray absorption spectroscopy was used to measure transition metal content based on quantitative detection of X-ray fluorescence signals for 3879 purified proteins from several hundred different protein families generated by the New York SGX Research Center for Structural Genomics. Approximately 9% of the proteins analyzed showed the presence of transition metal atoms (Zn, Cu, Ni, Co, Fe, or Mn) in stoichiometric amounts. The method is highly automated and highly reliable based on comparison of the results to crystal structure data derived from the same protein set. To leverage the experimental metalloprotein annotations, we used a sequence-based de novo prediction method, MetalDetector, to identify Cys and His residues that bind to transition metals for the redundancy-reduced subset of 2411 sequences sharing <70% sequence identity and having at least one His or Cys. As the HT-XAS identifies metal type and protein binding, while the bioinformatics analysis identifies metal-binding residues, the results were combined to identify putative metal-binding sites in the proteins and their associated families. We explored the combination of this data with homology models to generate detailed structure models of metal-binding sites for representative proteins. Finally, we used extended X-ray absorption fine structure data from two of the purified Zn metalloproteins to validate predicted metalloprotein binding site structures. This combination of experimental and bioinformatics approaches provides comprehensive active site analysis on the genome scale for metalloproteins as a class, revealing new insights into metalloprotein structure and function.


Subjects
Metalloproteins/chemistry; Software; X-Ray Absorption Spectroscopy/methods; Binding Sites/genetics; Computational Biology/methods; Fluorescence; Genomics/methods; Metals, Heavy/analysis; Synchrotrons
18.
Nucleic Acids Res ; 40(Database issue): D290-301, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22127870

ABSTRACT

Pfam is a widely used database of protein families, currently containing more than 13,000 manually curated protein families as of release 26.0. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/). Here, we report on changes that have occurred since our 2010 NAR paper (release 24.0). Over the last 2 years, we have generated 1840 new families and increased coverage of the UniProt Knowledgebase (UniProtKB) to nearly 80%. Notably, we have taken the step of opening up the annotation of our families to the Wikipedia community, by linking Pfam families to relevant Wikipedia pages and encouraging the Pfam and Wikipedia communities to improve and expand those pages. We continue to improve the Pfam website and add new visualizations, such as the 'sunburst' representation of taxonomic distribution of families. In this work we additionally address two topics that will be of particular interest to the Pfam community. First, we explain the definition and use of family-specific, manually curated gathering thresholds. Second, we discuss some of the features of domains of unknown function (also known as DUFs), which constitute a rapidly growing class of families within Pfam.


Subjects
Databases, Protein; Proteins/classification; Encyclopedias as Topic; Internet; Protein Structure, Tertiary; Sequence Homology, Amino Acid
19.
Nucleic Acids Res ; 40(Database issue): D306-12, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22096229

ABSTRACT

InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.


Subjects
Databases, Protein; Protein Structure, Tertiary; Proteins/classification; Proteins/physiology; Sequence Analysis, Protein; Software; Terminology as Topic; User-Computer Interface
20.
Sci Data ; 11(1): 568, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38824125

ABSTRACT

Technological advances in massively parallel sequencing have led to an exponential growth in the number of known protein sequences. Much of this growth originates from metagenomic projects producing new sequences from environmental and clinical samples. The Unified Human Gastrointestinal Proteome (UHGP) catalogue is one of the most relevant metagenomic datasets with applications ranging from medicine to biology. However, the low levels of sequence annotation may impair its usability. This work aims to produce a family classification of UHGP sequences to facilitate downstream structural and functional annotation. This is achieved through the release of the DPCfam-UHGP50 dataset containing 10,778 putative protein families generated using DPCfam clustering, an unsupervised pipeline grouping sequences into single or multi-domain architectures. DPCfam-UHGP50 considerably improves family coverage at protein and residue levels compared to the manually curated repository Pfam. In the hope that DPCfam-UHGP50 will foster future discoveries in the field of metagenomics of the human gut, we release a FAIR-compliant database of our results that is easily accessible via a searchable web server and Zenodo repository.


Subjects
Proteome; Humans; Gastrointestinal Tract/metabolism; Cluster Analysis; Molecular Sequence Annotation; Metagenomics; Databases, Protein