Results 1 - 20 of 85
1.
Nucleic Acids Res ; 52(D1): D762-D769, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37962425

ABSTRACT

The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) contains over 315 000 bacterial and archaeal genomes and 236 million proteins with up-to-date and consistent annotation. In the past 3 years, we have expanded the diversity of the RefSeq collection by including the best quality metagenome-assembled genomes (MAGs) submitted to INSDC (DDBJ, ENA and GenBank), while maintaining its quality by adding validation checks. Assemblies are now more stringently evaluated for contamination and for completeness of annotation prior to acceptance into RefSeq. MAGs now account for over 17 000 assemblies in RefSeq, split over 165 orders and 362 families. Changes in the Prokaryotic Genome Annotation Pipeline (PGAP), which is used to annotate nearly all RefSeq assemblies, include better detection of protein-coding genes. Nearly 83% of RefSeq proteins are now named by a curated Protein Family Model, a 4.7% increase over the past three years. In addition to literature citations, Enzyme Commission numbers, and gene symbols, Gene Ontology terms are now assigned to 48% of RefSeq proteins, allowing for easier multi-genome comparison. RefSeq is found at https://www.ncbi.nlm.nih.gov/refseq/. PGAP is available as a stand-alone tool able to produce GenBank-ready files at https://github.com/ncbi/pgap.


Subjects
Archaea , Bacteria , Nucleic Acid Databases , Metagenome , Archaea/genetics , Bacteria/genetics , Nucleic Acid Databases/standards , Nucleic Acid Databases/trends , Archaeal Genome/genetics , Bacterial Genome/genetics , Internet , Molecular Sequence Annotation , Proteins/genetics
2.
Nucleic Acids Res ; 51(D1): D418-D427, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36350672

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. Here, we report recent developments with InterPro (version 90.0) and its associated software, including updates to data content and to the website. These developments extend and enrich the information provided by InterPro, and provide more user-friendly access to the data. Additionally, we have worked on adding Pfam website features to the InterPro website, as the Pfam website will be retired in late 2022. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB. Moreover, we report the development of a card game as a method of engaging the non-scientific community. Finally, we discuss the benefits and challenges brought by the use of artificial intelligence for protein structure prediction.


Subjects
Protein Databases , Humans , Amino Acid Sequence , Artificial Intelligence , Internet , Proteins/chemistry , Software
3.
J Bacteriol ; 206(1): e0017323, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38084967

ABSTRACT

The LPXTG protein-sorting signal, found in surface proteins of various Gram-positive pathogens, was the founding member of a growing panel of prokaryotic small C-terminal sorting domains. Sortase A cleaves LPXTG, exosortases (XrtA and XrtB) cleave the PEP-CTERM sorting signal, archaeosortase A cleaves PGF-CTERM, and rhombosortase cleaves GlyGly-CTERM domains. Four sorting signal domains without previously known processing proteases are the MYXO-CTERM, JDVT-CTERM, Synerg-CTERM, and CGP-CTERM domains. These exhibit the standard tripartite architecture of a short signature motif, a hydrophobic transmembrane segment, and an Arg-rich cluster. Each has an invariant cysteine in its signature motif. Computational evidence strongly suggests that each of these four Cys-containing sorting signals is processed, at least in part, by a cognate family of glutamic-type intramembrane endopeptidases related to the eukaryotic type II CAAX-processing protease Rce1. For the MYXO-CTERM sorting signals of different lineages, their sorting enzymes, called myxosortases, include MrtX (MXAN_2755 in Myxococcus xanthus), MrtC, and MrtP, all with radically different N-terminal domains but with a conserved core. Related predicted sorting enzymes were also identified for JDVT-CTERM (MrtJ), Synerg-CTERM (MrtS), and CGP-CTERM (MrtA). This work establishes a major new family of protein-sorting housekeeping endopeptidases contributing to the surface attachment of proteins in prokaryotes.
IMPORTANCE Homologs of the eukaryotic type II CAAX-box protease Rce1, a membrane-embedded endopeptidase found in yeast and human ER and involved in sorting proteins to their proper cellular locations, are abundant in prokaryotes but not well understood there. This bioinformatics paper identifies several subgroups of the family as cognate endopeptidases for four protein-sorting signals processed by previously unknown machinery. Sorting signals with newly identified processing enzymes include three novel ones, but also MYXO-CTERM, which had been the focus of previous experimental work in the model fruiting and gliding bacterium Myxococcus xanthus. The new findings will substantially improve our understanding of Cys-containing C-terminal protein-sorting signals and of protein trafficking generally in bacteria and archaea.
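
The tripartite architecture named in this abstract (signature motif, hydrophobic transmembrane segment, Arg-rich cluster) can be illustrated with a toy scan of a protein's C-terminal tail. This is only a sketch under stated assumptions: the function name, residue sets, and thresholds are hypothetical, the motif shown is the LPXTG pattern mentioned in the abstract, and the paper itself relies on curated models rather than a regular expression.

```python
import re

# Residue sets and thresholds are illustrative assumptions, not values from the paper.
HYDROPHOBIC = set("AILMFVWGC")
BASIC = set("RK")


def has_tripartite_tail(protein: str, motif: str = r"LP.TG", tail_len: int = 45,
                        min_tm: int = 12, min_basic: int = 2) -> bool:
    """Return True if the C-terminal tail shows motif -> hydrophobic run -> basic cluster."""
    tail = protein[-tail_len:]
    match = re.search(motif, tail)
    if match is None:
        return False
    rest = tail[match.end():]

    # Find the longest uninterrupted run of hydrophobic residues after the motif.
    best_len, best_end, run = 0, 0, 0
    for i, aa in enumerate(rest):
        run = run + 1 if aa in HYDROPHOBIC else 0
        if run > best_len:
            best_len, best_end = run, i + 1
    if best_len < min_tm:
        return False

    # Count basic residues that follow the hydrophobic run.
    basic = sum(aa in BASIC for aa in rest[best_end:])
    return basic >= min_basic


# Synthetic sequences made up for this example; they are not real proteins.
with_signal = "M" + "S" * 60 + "LPETG" + "EE" + "AVLLGAIAVLGLLAFA" + "RKRRK"
without_motif = "M" + "S" * 60 + "EE" + "AVLLGAIAVLGLLAFA" + "RKRRK"
```

A different signature motif could be passed through the `motif` parameter; a heuristic like this would serve at most as a quick pre-filter before model-based annotation.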


Subjects
Cysteine , Peptide Hydrolases , Humans , Cysteine/metabolism , Protein Transport , Peptide Hydrolases/metabolism , Membrane Proteins/metabolism , Bacteria/metabolism , Saccharomyces cerevisiae
4.
J Bacteriol ; 205(1): e0025922, 2023 Jan 26.
Article in English | MEDLINE | ID: mdl-36598231

ABSTRACT

The bioinformatics of a nine-gene locus, designated selenocysteine-assisted organometallic (SAO), was investigated after identifying six new selenoprotein families and constructing hidden Markov models (HMMs) that find and annotate members of those families. Four are selenoproteins in most SAO loci, including Clostridium difficile. They include two ABC transporter subunits, namely, permease SaoP, with selenocysteine (U) at the channel-gating position, and substrate-binding subunit SaoB. Cytosolic selenoproteins include SaoL, homologous to MerB organomercurial lyases from mercury resistance loci, and SaoT, related to thioredoxins. SaoL, SaoB, and surface protein SaoC (an occasional selenoprotein) share an unusual CU dipeptide motif, which is rare in selenoproteins but found in selenoprotein variants of mercury resistance transporter subunit MerT. A nonselenoprotein, SaoE, shares homology with Cu/Zn efflux and arsenical efflux pumps. The organization of the SAO system suggests substrate interaction with surface-exposed selenoproteins, followed by import, metabolism that may cleave a carbon-to-heavy metal bond, and finally metal efflux. A novel type of mercury resistance is possible, but SAO instead may support fermentative metabolism, with selenocysteine-mediated formation of organometallic intermediates, followed by import, degradation, and metal efflux. Phylogenetic profiling shows SAO loci consistently co-occur with Stickland fermentation markers but even more consistently with 8Fe-9S cofactor-type double-cubane proteins. Hypothesizing that the SAO system forms organometallic intermediates, we investigated the known methylmercury formation protein families HgcA and HgcB. Both families contained overlooked selenoproteins. Most HgcAs have a CU motif N terminal to their previously accepted start sites. Seeking additional rare and overlooked selenoproteins may help reveal more cryptic aspects of microbial biochemistry.
IMPORTANCE This work adds 8 novel prokaryotic selenoproteins to the 80 or so families previously known. It describes the SAO (selenocysteine-assisted organometallic) locus, with the most selenoproteins of any known system. The rare CU motif recurs throughout, suggesting the formation and degradation of organometallic compounds. That suggestion triggered a reexamination of HgcA and HgcB, which are methylmercury formation proteins that can adversely impact food safety. Both are selenoproteins, once corrected, with HgcA again showing a CU motif. The SAO system is plausibly a mercury resistance locus for selenium-dependent anaerobes. But instead, it may exploit heavy metals as cofactors in organometallic intermediate-forming pathways that circumvent high activation energies and facilitate the breakdown of otherwise poorly accessible nutrients. SAO could provide an edge that helps Clostridium difficile, an important pathogen, establish disease.


Subjects
Clostridioides difficile , Mercury , Methylmercury Compounds , Clostridioides difficile/genetics , Clostridioides difficile/metabolism , Selenocysteine/metabolism , Phylogeny , Selenoproteins/genetics , Selenoproteins/metabolism
5.
Nucleic Acids Res ; 49(D1): D1020-D1028, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33270901

ABSTRACT

The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) contains nearly 200 000 bacterial and archaeal genomes and 150 million proteins with up-to-date annotation. Changes in the Prokaryotic Genome Annotation Pipeline (PGAP) since 2018 have resulted in a substantial reduction in spurious annotation. The hierarchical collection of protein family models (PFMs) used by PGAP as evidence for structural and functional annotation was expanded to over 35 000 protein profile hidden Markov models (HMMs), 12 300 BlastRules and 36 000 curated CDD architectures. As a result, >122 million or 79% of RefSeq proteins are now named based on a match to a curated PFM. Gene symbols, Enzyme Commission numbers or supporting publication attributes are available on over 40% of the PFMs and are inherited by the proteins and features they name, facilitating multi-genome analyses and connections to the literature. In adherence with the principles of FAIR (findable, accessible, interoperable, reusable), the PFMs are available in the Protein Family Models Entrez database to any user. Finally, the reference and representative genome set, a taxonomically diverse subset of RefSeq prokaryotic genomes, is now recalculated regularly and available for download and homology searches with BLAST. RefSeq is found at https://www.ncbi.nlm.nih.gov/refseq/.


Subjects
Computational Biology/methods , Genetic Databases , Archaeal Genome/genetics , Bacterial Genome/genetics , Molecular Sequence Annotation/methods , Proteins/genetics , Data Curation/methods , Data Mining/methods , Genomics/methods , Internet , Proteins/classification , User-Computer Interface
6.
Nucleic Acids Res ; 49(D1): D344-D354, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33156333

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.


Subjects
Protein Databases , Proteins/chemistry , Amino Acid Sequence , COVID-19/metabolism , Internet , Molecular Sequence Annotation , Protein Domains , Protein Interaction Maps , SARS-CoV-2/metabolism , Sequence Alignment
7.
Antimicrob Agents Chemother ; 66(4): e0033322, 2022 Apr 19.
Article in English | MEDLINE | ID: mdl-35380458

ABSTRACT

Assigning names to β-lactamase variants has been inconsistent and has led to confusion in the published literature. The common availability of whole genome sequencing has resulted in an exponential growth in the number of new β-lactamase genes. In November 2021 an international group of β-lactamase experts met virtually to develop a consensus for the way naturally-occurring β-lactamase genes should be named. This document formalizes the process for naming novel β-lactamases, followed by their subsequent publication.


Subjects
beta-Lactamase Inhibitors , beta-Lactamases , Consensus , beta-Lactamases/genetics
8.
Article in English | MEDLINE | ID: mdl-31712217

ABSTRACT

Unlike for classes A and B, a standardized amino acid numbering scheme has not been proposed for the class C (AmpC) β-lactamases, which complicates communication in the field. Here, we propose a scheme developed through a collaborative approach that considers both sequence and structure, preserves traditional numbering of catalytically important residues (Ser64, Lys67, Tyr150, and Lys315), is adaptable to new variants or enzymes yet to be discovered and includes a variation for genetic and epidemiological applications.


Subjects
Bacterial Proteins/classification , Gram-Negative Bacteria/genetics , Gram-Positive Bacteria/genetics , Mutation , Terminology as Topic , beta-Lactam Resistance/genetics , beta-Lactamases/classification , Amino Acid Sequence , Anti-Bacterial Agents/chemistry , Anti-Bacterial Agents/pharmacology , Bacterial Proteins/antagonists & inhibitors , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Gene Expression , Gram-Negative Bacteria/drug effects , Gram-Negative Bacteria/enzymology , Gram-Positive Bacteria/drug effects , Gram-Positive Bacteria/enzymology , International Cooperation , Secondary Protein Structure , Sequence Alignment , Amino Acid Sequence Homology , beta-Lactamase Inhibitors/chemistry , beta-Lactamase Inhibitors/pharmacology , beta-Lactamases/genetics , beta-Lactamases/metabolism , beta-Lactams/chemistry , beta-Lactams/pharmacology
9.
Nucleic Acids Res ; 46(D1): D851-D860, 2018 Jan 04.
Article in English | MEDLINE | ID: mdl-29112715

ABSTRACT

The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) provides annotation for over 95 000 prokaryotic genomes that meet standards for sequence quality, completeness, and freedom from contamination. Genomes are annotated by a single Prokaryotic Genome Annotation Pipeline (PGAP) to provide users with a resource that is as consistent and accurate as possible. Notable recent changes include the development of a hierarchical evidence scheme, a new focus on curating annotation evidence sources, the addition and curation of protein profile hidden Markov models (HMMs), release of an updated pipeline (PGAP-4), and comprehensive re-annotation of RefSeq prokaryotic genomes. Antimicrobial resistance proteins have been reannotated comprehensively, improved structural annotation of insertion sequence transposases and selenoproteins is provided, curated complex domain architectures have given upgraded names to millions of multidomain proteins, and we introduce a new kind of annotation rule, BlastRules. Continual curation of supporting evidence and propagation of improved names onto RefSeq proteins ensure that the functional annotation of genomes is kept current. An increasing share of our annotation now derives from HMMs and other sets of annotation rules that are portable by nature, and available for download and for reuse by other investigators. RefSeq is found at https://www.ncbi.nlm.nih.gov/refseq/.


Subjects
Data Curation , Nucleic Acid Databases , Genome , Molecular Sequence Annotation , Prokaryotic Cells , Archaea/genetics , Bacteria/genetics , Protein Databases , Eukaryota/genetics , Forecasting , Humans , Sequence Homology , Software , Viruses/genetics
10.
Article in English | MEDLINE | ID: mdl-31427293

ABSTRACT

Antimicrobial resistance (AMR) is a major public health problem that requires publicly available tools for rapid analysis. To identify AMR genes in whole-genome sequences, the National Center for Biotechnology Information (NCBI) has produced AMRFinder, a tool that identifies AMR genes using a high-quality curated AMR gene reference database. The Bacterial Antimicrobial Resistance Reference Gene Database consists of up-to-date gene nomenclature, a set of hidden Markov models (HMMs), and a curated protein family hierarchy. Currently, it contains 4,579 antimicrobial resistance proteins and more than 560 HMMs. Here, we describe AMRFinder and its associated database. To assess the predictive ability of AMRFinder, we measured the consistency between predicted AMR genotypes from AMRFinder and resistance phenotypes of 6,242 isolates from the National Antimicrobial Resistance Monitoring System (NARMS). This included 5,425 Salmonella enterica, 770 Campylobacter spp., and 47 Escherichia coli isolates phenotypically tested against various antimicrobial agents. Of 87,679 susceptibility tests performed, 98.4% were consistent with predictions. To assess the accuracy of AMRFinder, we compared its gene symbol output with that of a 2017 version of ResFinder, another publicly available resistance gene detection system. Most gene calls were identical, but there were 1,229 gene symbol differences (8.8%) between them, with differences due to both algorithmic differences and database composition. AMRFinder missed 16 loci that ResFinder found, while ResFinder missed 216 loci that AMRFinder identified. Based on these results, AMRFinder appears to be a highly accurate AMR gene detection system.
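
The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is an editorial illustration, not part of the record: variable names are made up, and because the percentages in the abstract are rounded, the derived counts are approximations rather than reported values.

```python
# Isolate counts quoted in the abstract.
isolates = {
    "Salmonella enterica": 5425,
    "Campylobacter spp.": 770,
    "Escherichia coli": 47,
}
total_isolates = sum(isolates.values())  # should match the 6,242 isolates reported

# Susceptibility tests: 98.4% of 87,679 were consistent with predictions.
tests = 87_679
consistent_fraction = 0.984
approx_consistent = round(tests * consistent_fraction)
approx_inconsistent = tests - approx_consistent

# Gene symbol differences: 1,229 differences are stated to be 8.8% of calls,
# which implies roughly this many compared gene calls (an inferred estimate).
symbol_differences = 1_229
difference_fraction = 0.088
approx_calls_compared = round(symbol_differences / difference_fraction)

print(total_isolates, approx_consistent, approx_inconsistent, approx_calls_compared)
```

The isolate counts sum exactly to the stated total; the other two quantities are back-calculated estimates and would shift slightly if the unrounded percentages were known.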

11.
Environ Microbiol ; 20(5): 1677-1692, 2018 May.
Article in English | MEDLINE | ID: mdl-29473278

ABSTRACT

Bacterial floc formation plays a central role in the activated sludge (AS) process, which has been widely utilized for sewage and wastewater treatment. The formation of AS flocs has long been known to require exopolysaccharide biosynthesis. This study demonstrates an additional requirement for a PEP-CTERM protein in Zoogloea resiniphila, a dominant AS bacterium harboring a large exopolysaccharide biosynthesis gene cluster. Two members of a widespread family of high copy number-per-genome PEP-CTERM genes are transcriptionally regulated by the RpoN sigma factor and the accessory PrsK-PrsR two-component system, and at least one of these, pepA, must be expressed for Zoogloea to build the floc structures that allow gravitational sludge settling and recycling. Without PrsK or PrsR, Zoogloea cells were planktonic rather than flocculated and secreted exopolysaccharides were released into the growth broth in soluble form. Overexpression of PepA could circumvent the requirement of rpoN, prsK and prsR for the floc-forming phenotype by fixing the exopolysaccharides to bacterial cells. However, overexpression of PepA, which underwent post-translational modifications, could not rescue the long-rod morphology of the rpoN mutant. Consistently, PEP-CTERM genes and an exopolysaccharide biosynthesis gene cluster are present in the genomes of the floc-forming Nitrospira comammox and Mitsuaria strains as well as many other AS bacteria.


Subjects
Sewage/microbiology , Wastewater/microbiology , Zoogloea/physiology , Bacterial Proteins/metabolism , Flocculation , Sigma Factor/metabolism , Fluid Waste Disposal , Wastewater/chemistry
12.
J Antimicrob Chemother ; 73(10): 2625-2630, 2018 Oct 01.
Article in English | MEDLINE | ID: mdl-30053115

ABSTRACT

The initial report of the mcr-1 (mobile colistin resistance) gene has led to many reports of mcr-1 variants and other mcr genes from different bacterial species originating from human, animal and environmental samples in different geographical locations. Resistance gene nomenclature is complex and unfortunately problems such as different names being used for the same gene/protein or the same name being used for different genes/proteins are not uncommon. Registries exist for some families, such as bla (β-lactamase) genes, but there is as yet no agreed nomenclature scheme for mcr genes. The National Center for Biotechnology Information (NCBI) recently took over assigning bla allele numbers from the longstanding Lahey β-lactamase website and has agreed to do the same for mcr genes. Here, we propose a nomenclature scheme that we hope will be acceptable to researchers in this area and that will reduce future confusion.


Subjects
Alleles , Anti-Bacterial Agents/pharmacology , Bacteria/genetics , Colistin/pharmacology , Bacterial Drug Resistance/genetics , MDR Genes , Bacteria/drug effects , Escherichia coli/drug effects , Escherichia coli Proteins/genetics , Microbial Sensitivity Tests , Terminology as Topic , Whole Genome Sequencing , beta-Lactamases/genetics
13.
Nucleic Acids Res ; 44(D1): D733-45, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26553804

ABSTRACT

The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55,000 organisms (>4800 viruses, >40,000 prokaryotes and >10,000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.


Subjects
Genetic Databases , Genomics , Animals , Cattle , Gene Expression Profiling , Fungal Genome , Human Genome , Microbial Genome , Plant Genome , Viral Genome , Genomics/standards , Humans , Invertebrates/genetics , Mice , Molecular Sequence Annotation , Nematoda/genetics , Phylogeny , Long Noncoding RNA/genetics , Rats , Reference Standards , Protein Sequence Analysis , RNA Sequence Analysis , Vertebrates/genetics
14.
Nucleic Acids Res ; 43(Database issue): D213-21, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25428371

ABSTRACT

The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping of Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36,766 member database signatures integrated into 26,238 InterPro entries, an increase of over 3993 entries (5081 signatures), since 2012.


Subjects
Protein Databases , Proteins/classification , Bacteria/metabolism , Gene Ontology , Tertiary Protein Structure , Proteins/genetics , Protein Sequence Analysis , Software
15.
Nucleic Acids Res ; 42(14): e111, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24914053

ABSTRACT

Toward achieving rapid and large scale genome modification directly in a target organism, we have developed a new genome engineering strategy that uses a combination of bioinformatics aided design, large synthetic DNA and site-specific recombinases. Using Cre recombinase we swapped a target 126-kb segment of the Escherichia coli genome with a 72-kb synthetic DNA cassette, thereby effectively eliminating over 54 kb of genomic DNA from three non-contiguous regions in a single recombination event. We observed complete replacement of the native sequence with the modified synthetic sequence through the action of the Cre recombinase and no competition from homologous recombination. Because of the versatility and high-efficiency of the Cre-lox system, this method can be used in any organism where this system is functional as well as adapted to use with other highly precise genome engineering systems. Compared to present-day iterative approaches in genome engineering, we anticipate this method will greatly speed up the creation of reduced, modularized and optimized genomes through the integration of deletion analyses data, transcriptomics, synthetic biology and site-specific recombination.


Subjects
Genetic Engineering/methods , Genetic Recombination , Chromosome Deletion , DNA/biosynthesis , Escherichia coli/genetics , Bacterial Genome , Genomics/methods , Integrases/metabolism , Synthetic Biology/methods
16.
J Bacteriol ; 198(5): 808-15, 2015 Dec 28.
Article in English | MEDLINE | ID: mdl-26712937

ABSTRACT

For years, the S-layer glycoprotein (SLG), the sole component of many archaeal cell walls, was thought to be anchored to the cell surface by a C-terminal transmembrane segment. Recently, however, we demonstrated that the Haloferax volcanii SLG C terminus is removed by an archaeosortase (ArtA), a novel peptidase. SLG, which was previously shown to be lipid modified, contains a C-terminal tripartite structure, including a highly conserved proline-glycine-phenylalanine (PGF) motif. Here, we demonstrate that ArtA does not process an SLG variant where the PGF motif is replaced with a PFG motif (slg(G796F,F797G)). Furthermore, using radiolabeling, we show that SLG lipid modification requires the PGF motif and is ArtA dependent, lending confirmation to the use of a novel C-terminal lipid-mediated protein-anchoring mechanism by prokaryotes. Similar to the case for the ΔartA strain, the growth, cellular morphology, and cell wall of the slg(G796F,F797G) strain, in which modifications of additional H. volcanii ArtA substrates should not be altered, are adversely affected, demonstrating the importance of these posttranslational SLG modifications. Our data suggest that ArtA is either directly or indirectly involved in a novel proteolysis-coupled, covalent lipid-mediated anchoring mechanism. Given that archaeosortase homologs are encoded by a broad range of prokaryotes, it is likely that this anchoring mechanism is widely conserved.
IMPORTANCE: Prokaryotic proteins bound to cell surfaces through intercalation, covalent attachment, or protein-protein interactions play critical roles in essential cellular processes. Unfortunately, the molecular mechanisms that anchor proteins to archaeal cell surfaces remain poorly characterized. Here, using the archaeon H. volcanii as a model system, we report the first in vivo studies of a novel protein-anchoring pathway involving lipid modification of a peptidase-processed C terminus. Our findings not only yield important insights into poorly understood aspects of archaeal biology but also have important implications for key bacterial species, including those of the human microbiome. Additionally, insights may facilitate industrial applications, given that photosynthetic cyanobacteria encode uncharacterized homologs of this evolutionarily conserved enzyme, or may spur development of unique drug delivery systems.


Subjects
Archaeal Proteins/metabolism , Haloferax volcanii/metabolism , Lipids/chemistry , Membrane Glycoproteins/metabolism , Peptide Hydrolases/metabolism , Amino Acid Motifs , Archaeal Proteins/chemistry , Archaeal Proteins/genetics , Cell Membrane , Archaeal Gene Expression Regulation/physiology , Enzymologic Gene Expression Regulation/physiology , Glycine/chemistry , Haloferax volcanii/cytology , Haloferax volcanii/genetics , Lipid Metabolism , Membrane Glycoproteins/genetics , Phenylalanine/chemistry , Proline/chemistry
18.
Nucleic Acids Res ; 41(Database issue): D387-95, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23197656

ABSTRACT

TIGRFAMs, available online at http://www.jcvi.org/tigrfams, is a database of protein family definitions. Each entry features a seed alignment of trusted representative sequences, a hidden Markov model (HMM) built from that alignment, cutoff scores that let automated annotation pipelines decide which proteins are members, and annotations for transfer onto member proteins. Most TIGRFAMs models are designated equivalog, meaning they assign a specific name to proteins conserved in function from a common ancestral sequence. Models describing more functionally heterogeneous families are designated subfamily or domain, and assign less specific but more widely applicable annotations. The Genome Properties database, available at http://www.jcvi.org/genome-properties, specifies how computed evidence, including TIGRFAMs HMM results, should be used to judge whether an enzymatic pathway, a protein complex or another type of molecular subsystem is encoded in a genome. TIGRFAMs and Genome Properties content are developed in concert because subsystems reconstruction for large numbers of genomes guides selection of seed alignment sequences and cutoff values during protein family construction. Both databases specialize heavily in bacterial and archaeal subsystems. At present, 4284 models appear in TIGRFAMs, while 628 systems are described by Genome Properties. Content derives both from subsystem discovery work and from biocuration of the scientific literature.
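
How cutoff scores let a pipeline decide membership, as this abstract describes, can be sketched as follows. This is a minimal illustration under assumptions: the abstract only says "cutoff scores", so the trusted/noise pair used here, along with every class name, accession, protein identifier, and score, is hypothetical and made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FamilyModel:
    accession: str         # placeholder identifier, not a real database accession
    name: str              # annotation to transfer onto member proteins
    trusted_cutoff: float  # assumed: scores at or above this count as members
    noise_cutoff: float    # assumed: scores below this count as non-members


def classify_hit(model: FamilyModel, score: float) -> str:
    """Classify one HMM bit score against the model's cutoffs."""
    if score >= model.trusted_cutoff:
        return "member"
    if score < model.noise_cutoff:
        return "non-member"
    return "uncertain"  # between the cutoffs: leave for manual review


def annotate(hits: dict[str, float], model: FamilyModel) -> dict[str, str]:
    """Transfer the model's name only onto proteins scoring as members."""
    return {pid: model.name for pid, score in hits.items()
            if classify_hit(model, score) == "member"}


# Hypothetical model and scores, for illustration only.
model = FamilyModel("EXAMPLE0001", "example family protein", 250.0, 120.0)
hits = {"protA": 310.5, "protB": 180.0, "protC": 42.0}
names = annotate(hits, model)
```

Keeping an explicit "uncertain" band rather than a single threshold is a design choice made for this sketch; it avoids naming proteins whose scores fall between clear members and clear non-members.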


Subjects
Protein Databases , Proteins/classification , Proteins/genetics , Archaeal Genome , Bacterial Genome , Genomics/methods , Internet , Markov Chains , Molecular Sequence Annotation , Proteins/physiology , Sequence Alignment
19.
PLoS Genet ; 8(4): e1002626, 2012.
Article in English | MEDLINE | ID: mdl-22511878

ABSTRACT

Biofilms are dense microbial communities. Although widely distributed and medically important, how biofilm cells interact with one another is poorly understood. Recently, we described a novel process whereby myxobacterial biofilm cells exchange their outer membrane (OM) lipoproteins. For the first time we report here the identification of two host proteins, TraAB, required for transfer. These proteins are predicted to localize in the cell envelope; and TraA encodes a distant PA14 lectin-like domain, a cysteine-rich tandem repeat region, and a putative C-terminal protein sorting tag named MYXO-CTERM, while TraB encodes an OmpA-like domain. Importantly, TraAB are required in donors and recipients, suggesting bidirectional transfer. By use of a lipophilic fluorescent dye, we also discovered that OM lipids are exchanged. Similar to lipoproteins, dye transfer requires TraAB function, gliding motility and a structured biofilm. Importantly, OM exchange was found to regulate swarming and development behaviors, suggesting a new role in cell-cell communication. A working model proposes TraA is a cell surface receptor that mediates cell-cell adhesion for OM fusion, in which lipoproteins/lipids are transferred by lateral diffusion. We further hypothesize that cell contact-dependent exchange helps myxobacteria to coordinate their social behaviors.


Subjects
Bacterial Outer Membrane Proteins/genetics , Cell Communication , Cell Membrane , Lipid Metabolism , Myxococcus xanthus/genetics , Bacterial Outer Membrane Proteins/metabolism , Bacterial Proteins/genetics , Biofilms/growth & development , Cell Adhesion/genetics , Cell Communication/genetics , Cell Membrane/genetics , Cell Membrane/metabolism , Lipid Metabolism/genetics , Molecular Motor Proteins/genetics , Myxococcus xanthus/cytology , Protein Conformation , Protein Transport/genetics
20.
Mol Microbiol ; 88(6): 1164-75, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23651326

ABSTRACT

Cell surfaces are decorated by a variety of proteins that facilitate interactions with their environments and support cell stability. These secreted proteins are anchored to the cell by mechanisms that are diverse, and, in archaea, poorly understood. Recently published in silico data suggest that in some species a subset of secreted euryarchaeal proteins, which includes the S-layer glycoprotein, is processed and covalently linked to the cell membrane by enzymes referred to as archaeosortases. In silico work led to the proposal that an independent, sortase-like system for proteolysis-coupled, carboxy-terminal lipid modification exists in bacteria (exosortase) and archaea (archaeosortase). Here, we provide the first in vivo characterization of an archaeosortase in the haloarchaeal model organism Haloferax volcanii. Deletion of the artA gene (HVO_0915) resulted in multiple biological phenotypes: (a) poor growth, especially under low-salt conditions, (b) alterations in cell shape and the S-layer, (c) impaired motility, suppressors of which still exhibit poor growth, and (d) impaired conjugation. We studied one of the ArtA substrates, the S-layer glycoprotein, using detailed proteomic analysis. While the carboxy-terminal region of S-layer glycoproteins, consisting of a putative threonine-rich O-glycosylated region followed by a hydrophobic transmembrane helix, has been notoriously resistant to any proteomic peptide identification, we were able to identify two overlapping peptides from the transmembrane domain present in the ΔartA strain but not in the wild-type strain. This clearly shows that ArtA is involved in carboxy-terminal post-translational processing of the S-layer glycoprotein. As it is known from previous studies that a lipid is covalently attached to the carboxy-terminal region of the S-layer glycoprotein, our data strongly support the conclusion that archaeosortase functions analogously to sortase, mediating proteolysis-coupled, covalent cell surface attachment.


Subjects
Genetic Conjugation , Endopeptidases/metabolism , Haloferax volcanii/enzymology , Haloferax volcanii/physiology , Locomotion , Membrane Glycoproteins/metabolism , Endopeptidases/genetics , Gene Deletion , Haloferax volcanii/genetics , Haloferax volcanii/growth & development , Post-Translational Protein Processing