Results 1 - 20 of 66
1.
Blood ; 140(10): 1167-1181, 2022 09 08.
Article in English | MEDLINE | ID: mdl-35853161

ABSTRACT

Patients with acute myeloid leukemia (AML) often achieve remission after allogeneic hematopoietic cell transplantation (allo-HCT) but subsequently die of relapse driven by leukemia cells resistant to elimination by allogeneic T cells based on decreased major histocompatibility complex II (MHC-II) expression and apoptosis resistance. Here we demonstrate that mouse-double-minute-2 (MDM2) inhibition can counteract immune evasion of AML. MDM2 inhibition induced MHC class I and II expression in murine and human AML cells. Using xenografts of human AML and syngeneic mouse models of leukemia, we show that MDM2 inhibition enhanced cytotoxicity against leukemia cells and improved survival. MDM2 inhibition also led to increases in tumor necrosis factor-related apoptosis-inducing ligand receptor-1 and -2 (TRAIL-R1/2) on leukemia cells and higher frequencies of CD8+CD27lowPD-1lowTIM-3low T cells, with features of cytotoxicity (perforin+CD107a+TRAIL+) and longevity (bcl-2+IL-7R+). CD8+ T cells isolated from leukemia-bearing MDM2 inhibitor-treated allo-HCT recipients exhibited higher glycolytic activity and enrichment for nucleotides and their precursors compared with vehicle control subjects. T cells isolated from MDM2 inhibitor-treated AML-bearing mice eradicated leukemia in secondary AML-bearing recipients. Mechanistically, the MDM2 inhibitor-mediated effects were p53-dependent because p53 knockdown abolished TRAIL-R1/2 and MHC-II upregulation, whereas p53 binding to TRAIL-R1/2 promoters increased upon MDM2 inhibition. The observations in the mouse models were complemented by data from human individuals. Patient-derived AML cells exhibited increased TRAIL-R1/2 and MHC-II expression on MDM2 inhibition. In summary, we identified a targetable vulnerability of AML cells to allogeneic T-cell-mediated cytotoxicity through the restoration of p53-dependent TRAIL-R1/2 and MHC-II production via MDM2 inhibition.


Subject(s)
Acute Myeloid Leukemia, Tumor Suppressor Protein p53, Animals, Apoptosis, Humans, Acute Myeloid Leukemia/genetics, Major Histocompatibility Complex, Mice, Proto-Oncogene Proteins c-mdm2/metabolism, Homologous Transplantation, Tumor Suppressor Protein p53/genetics, Tumor Suppressor Protein p53/metabolism, Up-Regulation
2.
Br J Haematol ; 203(2): 264-281, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37539479

ABSTRACT

Acute myeloid leukaemia (AML) relapse after allogeneic haematopoietic cell transplantation (allo-HCT) is often driven by immune-related mechanisms and associated with poor prognosis. Immune checkpoint inhibitors combined with hypomethylating agents (HMA) may restore or enhance the graft-versus-leukaemia effect. Still, data about using this combination regimen after allo-HCT are limited. We conducted a prospective, phase II, open-label, single-arm study in which we treated patients with haematological AML relapse after allo-HCT with HMA plus the anti-PD-1 antibody nivolumab. The response was correlated with DNA-, RNA- and protein-based single-cell technology assessments to identify biomarkers associated with therapeutic efficacy. Sixteen patients received a median number of 2 (range 1-7) nivolumab applications. The overall response rate (CR/PR) at day 42 was 25%, and another 25% of the patients achieved stable disease. The median overall survival was 15.6 months. High-parametric cytometry documented a higher frequency of activated (ICOS+, HLA-DR+), low senescence (KLRG1-, CD57-) CD8+ effector T cells in responders. We confirmed these findings in a preclinical model. Single-cell transcriptomics revealed a pro-inflammatory rewiring of the expression profile of T and myeloid cells in responders. In summary, the study indicates that the post-allo-HCT HMA/nivolumab combination induces anti-AML immune responses in selected patients and could be considered as a bridging approach to a second allo-HCT. Trial registration: EudraCT No. 2017-002194-18.

3.
PLoS Comput Biol ; 18(10): e1010610, 2022 10.
Article in English | MEDLINE | ID: mdl-36260616

ABSTRACT

Proteins that are known only at a sequence level outnumber those with an experimental characterization by orders of magnitude. Classifying protein regions (domains) into homologous families can generate testable functional hypotheses for yet unannotated sequences. Existing domain family resources typically use at least some degree of manual curation: they grow slowly over time and leave a large fraction of the protein sequence space unclassified. We here describe automatic clustering by Density Peak Clustering of UniRef50 v. 2017_07, a protein sequence database including approximately 23M sequences. We performed a radical re-implementation of a pipeline we previously developed in order to allow handling millions of sequences and data volumes of the order of 3 terabytes. The modified pipeline, which we call DPCfam, finds ∼45,000 protein clusters in UniRef50. Our automatic classification is in close correspondence to the ones of the Pfam and ECOD resources: in particular, about 81% of medium-large Pfam families and 72% of ECOD families can be mapped to clusters generated by DPCfam. In addition, our protocol finds more than 14,000 clusters constituted of protein regions with no Pfam annotation, which are therefore candidates for representing novel protein families. These results are made available to the scientific community through a dedicated repository.


Subject(s)
Proteins, Protein Databases, Proteins/genetics, Cluster Analysis, Amino Acid Sequence, Protein Domains
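
The Density Peak Clustering step named in the abstract above can be illustrated with a minimal sketch. It shows only the generic density-peak idea (local density, distance to the nearest denser point, label inheritance) and is not the DPCfam pipeline itself. The function name, the cutoff `d_c`, the fixed `n_clusters`, and the toy one-dimensional points are all assumptions made for illustration.

```python
def density_peak_clustering(dist, d_c, n_clusters):
    """Toy density-peak clustering on a symmetric distance matrix (list of lists).

    Hypothetical helper for illustration; not the DPCfam implementation.
    """
    n = len(dist)
    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n   # distance to the nearest denser point
    parent = [-1] * n   # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # By convention the densest point gets its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j
    # Cluster centres: the n_clusters points with the largest rho * delta.
    centres = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))[:n_clusters]
    labels = [-1] * n
    for label, c in enumerate(centres):
        labels[c] = label
    # Remaining points inherit the label of their nearest denser neighbour.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


# Toy input (assumed): two well-separated groups on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
print(density_peak_clustering(dist, d_c=1.5, n_clusters=2))  # [0, 0, 0, 1, 1, 1]
```

In a sequence setting the distance matrix would come from pairwise alignments rather than from points on a line, and centres are usually chosen by thresholds rather than by a fixed count; both simplifications here are for brevity.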
4.
BMC Bioinformatics ; 22(1): 121, 2021 Mar 12.
Article in English | MEDLINE | ID: mdl-33711918

ABSTRACT

BACKGROUND: The identification of protein families is of outstanding practical importance for in silico protein annotation and is at the basis of several bioinformatic resources. Pfam is possibly the most well known protein family database, built in many years of work by domain experts with extensive use of manual curation. This approach is generally very accurate, but it is quite time consuming and it may suffer from a bias generated from the hand-curation itself, which is often guided by the available experimental evidence. RESULTS: We introduce a procedure that aims to identify automatically putative protein families. The procedure is based on Density Peak Clustering and uses as input only local pairwise alignments between protein sequences. In the experiment we present here, we ran the algorithm on about 4000 full-length proteins with at least one domain classified by Pfam as belonging to the Pseudouridine synthase and Archaeosine transglycosylase (PUA) clan. We obtained 71 automatically-generated sequence clusters with at least 100 members. While our clusters were largely consistent with the Pfam classification, showing good overlap with either single or multi-domain Pfam family architectures, we also observed some inconsistencies. The latter were inspected using structural and sequence based evidence, which suggested that the automatic classification captured evolutionary signals reflecting non-trivial features of protein family architectures. Based on this analysis we identified a putative novel pre-PUA domain as well as alternative boundaries for a few PUA or PUA-associated families. As a first indication that our approach was unlikely to be clan-specific, we performed the same analysis on the P53 clan, obtaining comparable results. 
CONCLUSIONS: The clustering procedure described in this work takes advantage of the information contained in a large set of pairwise alignments and successfully identifies a set of putative families and family architectures in an unsupervised manner. Comparison with the Pfam classification highlights significant overlap and points to interesting differences, suggesting that our new algorithm could have potential in applications related to automatic protein classification. Testing this hypothesis, however, will require further experiments on large and diverse sequence datasets.


Subject(s)
Proteins, Sequence Alignment, Amino Acid Sequence, Cluster Analysis, Protein Databases, Humans, Proteins/genetics
5.
Nucleic Acids Res ; 44(D1): D279-85, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26673716

ABSTRACT

In the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences and species reported on the website restricted to this smaller set. Building families on reference proteomes sequences brings greater stability, which decreases the amount of manual curation required to maintain them. It also reduces the number of sequences displayed on the website, whilst still providing access to many important model organisms. Matches to the full UniProtKB database are, however, still available and Pfam annotations for individual UniProtKB sequences can still be retrieved. Some Pfam entries (1.6%) which have no matches to reference proteomes remain; we are working with UniProt to see if sequences from them can be incorporated into reference proteomes. Pfam-B, the automatically-generated supplement to Pfam, has been removed. The current release (Pfam 29.0) includes 16 295 entries and 559 clans. The facility to view the relationship between families within a clan has been improved by the introduction of a new tool.


Subject(s)
Protein Databases, Proteins/classification, Proteome/chemistry, Sequence Alignment, Protein Sequence Analysis, Molecular Sequence Annotation
6.
Brief Bioinform ; 16(5): 865-72, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25614388

ABSTRACT

Transport systems comprise roughly 10% of all proteins in a cell, playing critical roles in many processes. Improving and expanding their classification is an important goal that can affect studies ranging from comparative genomics to potential drug target searches. It is not surprising that different classification systems for transport proteins have arisen, be it within a specialized database, focused on this functional class of proteins, or as part of a broader classification system for all proteins. Two such databases are the Transporter Classification Database (TCDB) and the Protein family (Pfam) database. As part of a long-term endeavor to improve consistency between the two classification systems, we have compared transporter annotations in the two databases to understand the rationale for differences and to improve both systems. Differences sometimes reflect the fact that one database has a particular transporter family while the other does not. Differing family definitions and hierarchical organizations were reconciled, resulting in recognition of 69 Pfam 'Domains of Unknown Function', which proved to be transport protein families to be renamed using TCDB annotations. Of over 400 potential new Pfam families identified from TCDB, 10% have already been added to Pfam, and TCDB has created 60 new entries based on Pfam data. This work, for the first time, reveals the benefits of comprehensive database comparisons and explains the differences between Pfam and TCDB.


Subject(s)
Protein Databases, Proteins/chemistry
7.
Nature ; 473(7345): 50-4, 2011 May 05.
Article in English | MEDLINE | ID: mdl-21471968

ABSTRACT

Saccharides have a central role in the nutrition of all living organisms. Whereas several saccharide uptake systems are shared between the different phylogenetic kingdoms, the phosphoenolpyruvate-dependent phosphotransferase system exists almost exclusively in bacteria. This multi-component system includes an integral membrane protein EIIC that transports saccharides and assists in their phosphorylation. Here we present the crystal structure of an EIIC from Bacillus cereus that transports diacetylchitobiose. The EIIC is a homodimer, with an expansive interface formed between the amino-terminal halves of the two protomers. The carboxy-terminal half of each protomer has a large binding pocket that contains a diacetylchitobiose, which is occluded from both sides of the membrane with its site of phosphorylation near the conserved His250 and Glu334 residues. The structure shows the architecture of this important class of transporters, identifies the determinants of substrate binding and phosphorylation, and provides a framework for understanding the mechanism of sugar translocation.


Subject(s)
Bacillus cereus/enzymology, Membrane Transport Proteins/chemistry, Molecular Models, Binding Sites, Carbohydrate Metabolism, Crystallization, Phosphorylation, Quaternary Protein Structure, Tertiary Protein Structure
8.
Nature ; 471(7338): 336-40, 2011 Mar 17.
Article in English | MEDLINE | ID: mdl-21317882

ABSTRACT

The TrkH/TrkG/KtrB proteins mediate K(+) uptake in bacteria and probably evolved from simple K(+) channels by multiple gene duplications or fusions. Here we present the crystal structure of a TrkH from Vibrio parahaemolyticus. TrkH is a homodimer, and each protomer contains an ion permeation pathway. A selectivity filter, similar in architecture to those of K(+) channels but significantly shorter, is lined by backbone and side-chain oxygen atoms. Functional studies showed that TrkH is selective for permeation of K(+) and Rb(+) over smaller ions such as Na(+) or Li(+). Immediately intracellular to the selectivity filter are an intramembrane loop and an arginine residue, both highly conserved, which constrict the permeation pathway. Substituting the arginine with an alanine significantly increases the rate of K(+) flux. These results reveal the molecular basis of K(+) selectivity and suggest a novel gating mechanism for this large and important family of membrane transport proteins.


Subject(s)
Potassium Channels/chemistry, Potassium Channels/metabolism, Vibrio parahaemolyticus/chemistry, ATP-Binding Cassette Transporters/chemistry, Amino Acid Sequence, X-Ray Crystallography, Escherichia coli Proteins/chemistry, Ion Channel Gating, Ion Transport, Molecular Models, Molecular Sequence Data, Potassium/metabolism, Structure-Activity Relationship, Substrate Specificity
9.
Nucleic Acids Res ; 43(Database issue): D382-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25348407

ABSTRACT

Genome3D (http://www.genome3d.eu) is a collaborative resource that provides predicted domain annotations and structural models for key sequences. Since introducing Genome3D in a previous NAR paper, we have substantially extended and improved the resource. We have annotated representatives from Pfam families to improve coverage of diverse sequences and added a fast sequence search to the website to allow users to find Genome3D-annotated sequences similar to their own. We have improved and extended the Genome3D data, enlarging the source data set from three model organisms to 10, and adding VIVACE, a resource new to Genome3D. We have analysed and updated Genome3D's SCOP/CATH mapping. Finally, we have improved the superposition tools, which now give users a more powerful interface for investigating similarities and differences between structural models.


Subject(s)
Protein Databases, Molecular Sequence Annotation, Tertiary Protein Structure, Algorithms, Genomics, Internet, Molecular Models, Tertiary Protein Structure/genetics, Protein Sequence Analysis
10.
Nucleic Acids Res ; 43(Database issue): D213-21, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25428371

ABSTRACT

The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping of Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36,766 member database signatures integrated into 26,238 InterPro entries, an increase of over 3993 entries (5081 signatures), since 2012.


Subject(s)
Protein Databases, Proteins/classification, Bacteria/metabolism, Gene Ontology, Tertiary Protein Structure, Proteins/genetics, Protein Sequence Analysis, Software
11.
Nature ; 467(7319): 1074-80, 2010 Oct 28.
Article in English | MEDLINE | ID: mdl-20981093

ABSTRACT

The plant SLAC1 anion channel controls turgor pressure in the aperture-defining guard cells of plant stomata, thereby regulating the exchange of water vapour and photosynthetic gases in response to environmental signals such as drought or high levels of carbon dioxide. Here we determine the crystal structure of a bacterial homologue (Haemophilus influenzae) of SLAC1 at 1.20 Å resolution, and use structure-inspired mutagenesis to analyse the conductance properties of SLAC1 channels. SLAC1 is a symmetrical trimer composed from quasi-symmetrical subunits, each having ten transmembrane helices arranged from helical hairpin pairs to form a central five-helix transmembrane pore that is gated by an extremely conserved phenylalanine residue. Conformational features indicate a mechanism for control of gating by kinase activation, and electrostatic features of the pore coupled with electrophysiological characteristics indicate that selectivity among different anions is largely a function of the energetic cost of ion dehydration.


Subject(s)
Arabidopsis Proteins/chemistry, Bacterial Proteins/chemistry, Haemophilus influenzae/chemistry, Membrane Proteins/chemistry, Plant Stomata/metabolism, Protein Structural Homology, Amino Acid Sequence, Animals, Arabidopsis/genetics, Arabidopsis/metabolism, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, X-Ray Crystallography, Electric Conductivity, Haemophilus influenzae/genetics, Ion Channel Gating, Molecular Models, Molecular Sequence Data, Oocytes/metabolism, Phenylalanine/chemistry, Phenylalanine/metabolism, Static Electricity, Substrate Specificity, Xenopus laevis
12.
Nucleic Acids Res ; 42(Web Server issue): W337-43, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24799431

ABSTRACT

PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein-protein binding sites (ISIS2), protein-polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.


Subject(s)
Protein Conformation, Software, Amino Acid Substitution, Binding Sites, Gene Ontology, Internet, Intrinsically Disordered Proteins/chemistry, Membrane Proteins/chemistry, Mutation, Protein Interaction Mapping, Proteins/analysis, Proteins/genetics, Proteins/metabolism, Sequence Alignment, Protein Sequence Analysis, Amino Acid Sequence Homology
13.
Nucleic Acids Res ; 42(Database issue): D222-30, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24288371

ABSTRACT

Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures.


Subject(s)
Protein Databases, Sequence Alignment, Protein Sequence Analysis, Internet, Intrinsically Disordered Proteins/chemistry, Protein Conformation, Proteins/chemistry, Proteins/classification, Proteins/genetics, Proteome/chemistry, DNA Sequence Analysis
14.
BMC Bioinformatics ; 16: 7, 2015 Jan 16.
Article in English | MEDLINE | ID: mdl-25592227

ABSTRACT

BACKGROUND: N-terminal domains of BVU_4064 and BF1687 proteins from Bacteroides vulgatus and Bacteroides fragilis respectively are members of the Pfam family PF12985 (DUF3869). Proteins containing a domain from this family can be found in most Bacteroides species and, in large numbers, in all human gut microbiome samples. Both BVU_4064 and BF1687 proteins have a consensus lipobox motif implying they are anchored to the membrane, but their functions are otherwise unknown. The C-terminal half of BVU_4064 is assigned to protein family PF12986 (DUF3870); the equivalent part of BF1687 was unclassified. RESULTS: Crystal structures of both BVU_4064 and BF1687 proteins, solved at the JCSG center, show strikingly similar three-dimensional structures. The main difference between the two is that the two domains in the BVU_4064 protein are connected by a short linker, as opposed to a longer insertion made of 4 helices placed linearly along with a strand that is added to the C-terminal domain in the BF1687 protein. The N-terminal domain in both proteins, corresponding to the PF12985 (DUF3869) domain is a β-sandwich with pre-albumin-like fold, found in many proteins belonging to the Transthyretin clan of Pfam. The structures of C-terminal domains of both proteins, corresponding to the PF12986 (DUF3870) domain in BVU_4064 protein and an unclassified domain in the BF1687 protein, show significant structural similarity to bacterial pore-forming toxins. A helix in this domain is in an analogous position to a loop connecting the second and third strands in the toxin structures, where this loop is implicated to play a role in the toxin insertion into the host cell membrane. The same helix also points to the groove between the N- and C-terminal domains that are loosely held together by hydrophobic and hydrogen bond interactions.
The presence of several conserved residues in this region together with these structural determinants could make it a functionally important region in these proteins. CONCLUSIONS: Structural analysis of BVU_4064 and BF1687 points to possible roles in mediating multiple interactions on the cell-surface/extracellular matrix. In particular the N-terminal domain could be involved in adhesive interactions, the C-terminal domain and the inter-domain groove in lipid or carbohydrate interactions.


Subject(s)
Bacterial Proteins/analysis, Bacterial Proteins/chemistry, Bacteroides/chemistry, Cell Adhesion Molecules/metabolism, Lipids/chemistry, Membrane Proteins/metabolism, Amino Acid Sequence, Bacterial Proteins/metabolism, Cell Adhesion/physiology, Cell Adhesion Molecules/chemistry, X-Ray Crystallography, Humans, Membrane Proteins/chemistry, Molecular Sequence Data, Protein Folding, Tertiary Protein Structure, Protein Sequence Analysis, Amino Acid Sequence Homology
15.
Nucleic Acids Res ; 41(12): e121, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23598997

ABSTRACT

Detection of protein homology via sequence similarity has important applications in biology, from protein structure and function prediction to reconstruction of phylogenies. Although current methods for aligning protein sequences are powerful, challenges remain, including problems with homologous overextension of alignments and with regions under convergent evolution. Here, we test the ability of the profile hidden Markov model method HMMER3 to correctly assign homologous sequences to >13,000 manually curated families from the Pfam database. We identify problem families using protein regions that match two or more Pfam families not currently annotated as related in Pfam. We find that HMMER3 E-value estimates seem to be less accurate for families that feature periodic patterns of compositional bias, such as the ones typically observed in coiled-coils. These results support the continued use of manually curated inclusion thresholds in the Pfam database, especially on the subset of families that have been identified as problematic in experiments such as these. They also highlight the need for developing new methods that can correct for this particular type of compositional bias.


Subject(s)
Protein Sequence Analysis/methods, Amino Acid Sequence Homology, Molecular Evolution, Markov Chains, Membrane Proteins/chemistry, Secondary Protein Structure, Tertiary Protein Structure, Proteins/classification, Sequence Alignment
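
The screening idea in the abstract above (flagging protein regions that match two or more families not annotated as related) can be sketched as a simple interval-overlap check. This is only an assumed illustration, not the authors' procedure: the function name, the hit tuples, the `clan_of` mapping, and the family and clan labels are all hypothetical.

```python
from itertools import combinations


def conflicting_overlaps(hits, clan_of, min_overlap=1):
    """Report overlapping hits from families that are not annotated as related.

    hits: list of (family, start, end) with 1-based inclusive coordinates.
    clan_of: dict mapping a family to its clan; families without a clan are absent.
    """
    conflicts = []
    for (fam_a, s_a, e_a), (fam_b, s_b, e_b) in combinations(hits, 2):
        if fam_a == fam_b:
            continue
        # Number of residues shared by the two matched regions.
        overlap = min(e_a, e_b) - max(s_a, s_b) + 1
        if overlap < min_overlap:
            continue
        clan_a, clan_b = clan_of.get(fam_a), clan_of.get(fam_b)
        if clan_a is not None and clan_a == clan_b:
            continue  # same clan: already annotated as related
        conflicts.append((fam_a, fam_b, overlap))
    return conflicts


# Hypothetical hits on one protein; the labels are placeholders, not real Pfam entries.
hits = [("FAM_A", 10, 100), ("FAM_B", 90, 150), ("FAM_C", 95, 120)]
clan_of = {"FAM_A": "CLAN_1", "FAM_C": "CLAN_1"}
print(conflicting_overlaps(hits, clan_of))
```

A real analysis would take the hits from profile HMM search output and would probably require a larger `min_overlap`; the default of one residue is chosen here only to keep the example small.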
16.
BMC Bioinformatics ; 15: 1, 2014 Jan 03.
Article in English | MEDLINE | ID: mdl-24383880

ABSTRACT

BACKGROUND: The Acel_2062 protein from Acidothermus cellulolyticus is a protein of unknown function. Initial sequence analysis predicted that it was a metallopeptidase from the presence of a motif conserved amongst the Asp-zincins, which are peptidases that contain a single, catalytic zinc ion ligated by the histidines and aspartic acid within the motif (HEXXHXXGXXD). The Acel_2062 protein was chosen by the Joint Center for Structural Genomics for crystal structure determination to explore novel protein sequence space and structure-based function annotation. RESULTS: The crystal structure confirmed that the Acel_2062 protein consisted of a single, zincin-like metallopeptidase-like domain. The Met-turn, a structural feature thought to be important for a Met-zincin because it stabilizes the active site, is absent, and its stabilizing role may have been conferred to the C-terminal Tyr113. In our crystallographic model there are two molecules in the asymmetric unit and from size-exclusion chromatography, the protein dimerizes in solution. A water molecule is present in the putative zinc-binding site in one monomer, which is replaced by one of two observed conformations of His95 in the other. CONCLUSIONS: The Acel_2062 protein is structurally related to the zincins. It contains the minimum structural features of a member of this protein superfamily, and can be described as a "mini-zincin". There is a striking parallel with the structure of a mini-Glu-zincin, which represents the minimum structure of a Glu-zincin (a metallopeptidase in which the third zinc ligand is a glutamic acid). Rather than being an ancestral state, phylogenetic analysis suggests that the mini-zincins are derived from larger proteins.


Subject(s)
Bacterial Proteins/chemistry, Metalloproteases/chemistry, Zinc/chemistry, Actinomycetales/chemistry, Actinomycetales/enzymology, Amino Acid Motifs, Amino Acid Sequence, Bacterial Proteins/metabolism, Dimerization, Metalloproteases/metabolism, Molecular Models, Molecular Sequence Data, Phylogeny, Protein Subunits, Sequence Alignment, Zinc/metabolism
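
The Asp-zincin motif quoted in the abstract above (HEXXHXXGXXD, with X standing for any residue) lends itself to a small pattern-matching sketch. The function name and the toy sequence are assumptions for illustration; the sequence below is made up and is not that of Acel_2062.

```python
import re

# Motif as written in the abstract; X is treated as "any residue".
MOTIF = "HEXXHXXGXXD"
# A lookahead pattern so that overlapping occurrences are also reported.
MOTIF_RE = re.compile("(?=" + MOTIF.replace("X", ".") + ")")


def find_motif(sequence):
    """Return the 0-based start position of every motif occurrence."""
    return [m.start() for m in MOTIF_RE.finditer(sequence.upper())]


# Invented toy sequence with one occurrence starting at index 3.
toy_sequence = "MKT" + "HEAAHAAGAAD" + "LLK"
print(find_motif(toy_sequence))  # [3]
```

A plain regular expression only tests sequence identity at the fixed positions; it says nothing about whether the residues are positioned to bind zinc, which is why the abstract relies on the crystal structure.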
17.
Genome Res ; 21(6): 898-907, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21482623

ABSTRACT

High-throughput X-ray absorption spectroscopy was used to measure transition metal content based on quantitative detection of X-ray fluorescence signals for 3879 purified proteins from several hundred different protein families generated by the New York SGX Research Center for Structural Genomics. Approximately 9% of the proteins analyzed showed the presence of transition metal atoms (Zn, Cu, Ni, Co, Fe, or Mn) in stoichiometric amounts. The method is highly automated and highly reliable based on comparison of the results to crystal structure data derived from the same protein set. To leverage the experimental metalloprotein annotations, we used a sequence-based de novo prediction method, MetalDetector, to identify Cys and His residues that bind to transition metals for the redundancy reduced subset of 2411 sequences sharing <70% sequence identity and having at least one His or Cys. As the HT-XAS identifies metal type and protein binding, while the bioinformatics analysis identifies metal-binding residues, the results were combined to identify putative metal-binding sites in the proteins and their associated families. We explored the combination of this data with homology models to generate detailed structure models of metal-binding sites for representative proteins. Finally, we used extended X-ray absorption fine structure data from two of the purified Zn metalloproteins to validate predicted metalloprotein binding site structures. This combination of experimental and bioinformatics approaches provides comprehensive active site analysis on the genome scale for metalloproteins as a class, revealing new insights into metalloprotein structure and function.


Subject(s)
Metalloproteins/chemistry, Software, X-Ray Absorption Spectroscopy/methods, Binding Sites/genetics, Computational Biology/methods, Fluorescence, Genomics/methods, Heavy Metals/analysis, Synchrotrons
18.
Nucleic Acids Res ; 40(Database issue): D290-301, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22127870

ABSTRACT

Pfam is a widely used database of protein families, currently containing more than 13,000 manually curated protein families as of release 26.0. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/). Here, we report on changes that have occurred since our 2010 NAR paper (release 24.0). Over the last 2 years, we have generated 1840 new families and increased coverage of the UniProt Knowledgebase (UniProtKB) to nearly 80%. Notably, we have taken the step of opening up the annotation of our families to the Wikipedia community, by linking Pfam families to relevant Wikipedia pages and encouraging the Pfam and Wikipedia communities to improve and expand those pages. We continue to improve the Pfam website and add new visualizations, such as the 'sunburst' representation of taxonomic distribution of families. In this work we additionally address two topics that will be of particular interest to the Pfam community. First, we explain the definition and use of family-specific, manually curated gathering thresholds. Second, we discuss some of the features of domains of unknown function (also known as DUFs), which constitute a rapidly growing class of families within Pfam.


Subject(s)
Protein Databases, Proteins/classification, Encyclopedias as Topic, Internet, Tertiary Protein Structure, Amino Acid Sequence Homology
19.
Nucleic Acids Res ; 40(Database issue): D306-12, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22096229

ABSTRACT

InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.


Subject(s)
Protein Databases, Tertiary Protein Structure, Proteins/classification, Proteins/physiology, Protein Sequence Analysis, Software, Terminology as Topic, User-Computer Interface
20.
Sci Data ; 11(1): 568, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38824125

ABSTRACT

Technological advances in massively parallel sequencing have led to an exponential growth in the number of known protein sequences. Much of this growth originates from metagenomic projects producing new sequences from environmental and clinical samples. The Unified Human Gastrointestinal Proteome (UHGP) catalogue is one of the most relevant metagenomic datasets with applications ranging from medicine to biology. However, the low levels of sequence annotation may impair its usability. This work aims to produce a family classification of UHGP sequences to facilitate downstream structural and functional annotation. This is achieved through the release of the DPCfam-UHGP50 dataset containing 10,778 putative protein families generated using DPCfam clustering, an unsupervised pipeline grouping sequences into single or multi-domain architectures. DPCfam-UHGP50 considerably improves family coverage at protein and residue levels compared to the manually curated repository Pfam. In the hope that DPCfam-UHGP50 will foster future discoveries in the field of metagenomics of the human gut, we release a FAIR-compliant database of our results that is easily accessible via a searchable web server and Zenodo repository.


Subject(s)
Proteome, Humans, Gastrointestinal Tract/metabolism, Cluster Analysis, Molecular Sequence Annotation, Metagenomics, Protein Databases
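
The abstract above compares family coverage "at protein and residue levels". One plausible reading of those two quantities is sketched below: the fraction of proteins with at least one annotated region, and the fraction of residues falling inside any annotated region. The function name, the input layout, and the toy numbers are assumptions; the paper's exact definitions may differ.

```python
def coverage(proteins):
    """Return (protein-level coverage, residue-level coverage).

    proteins: dict name -> (length, [(start, end), ...]) where regions are
    1-based inclusive and may overlap. Assumes a non-empty input.
    """
    covered_proteins = 0
    covered_residues = 0
    total_residues = 0
    for length, regions in proteins.values():
        total_residues += length
        if regions:
            covered_proteins += 1
        # Walk regions in order so overlapping residues are counted once.
        last_end = 0
        for start, end in sorted(regions):
            start = max(start, last_end + 1)
            if end >= start:
                covered_residues += end - start + 1
                last_end = end
    return covered_proteins / len(proteins), covered_residues / total_residues


# Invented toy data set: three proteins, one of them without any annotation.
toy = {
    "p1": (100, [(1, 50), (40, 60)]),
    "p2": (100, []),
    "p3": (200, [(101, 200)]),
}
print(coverage(toy))
```

Here two of the three proteins carry an annotation, while 160 of the 400 residues are covered, which shows how the two measures can diverge on the same data.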