1.
Microbiol Mol Biol Rev ; 61(4): 393-410, 1997 Dec.
Article in English | MEDLINE | ID: mdl-9409145

ABSTRACT

The AraC/XylS family of prokaryotic positive transcriptional regulators includes more than 100 proteins and polypeptides derived from open reading frames translated from DNA sequences. Members of this family are widely distributed and have been found in the gamma subgroup of the proteobacteria, low- and high-G + C-content gram-positive bacteria, and cyanobacteria. These proteins are defined by a profile that can be accessed from PROSITE PS01124. Members of the family are about 300 amino acids long and have three main regulatory functions in common: carbon metabolism, stress response, and pathogenesis. Multiple alignments of the proteins of the family define a conserved stretch of 99 amino acids usually located at the C-terminal region of the regulator and connected to a nonconserved region via a linker. The conserved stretch contains all the elements required to bind DNA target sequences and to activate transcription from cognate promoters. Secondary analysis of the conserved region suggests that it contains two potential alpha-helix-turn-alpha-helix DNA binding motifs. The first, and better-fitting, motif is supported by biochemical data, whereas existing biochemical data neither support nor refute the proposal that the second region possesses this structure. The phylogenetic relationship suggests that members of the family have recruited the nonconserved domain(s) into a series of existing domains involved in DNA recognition and transcription stimulation and that this recruited domain governs the role that the regulator carries out. For some regulators, it has been demonstrated that the nonconserved region contains the dimerization domain. For the regulators involved in carbon metabolism, the effector binding determinants are also in this region. Most regulators belonging to the AraC/XylS family recognize multiple binding sites in the regulated promoters. One of the motifs usually overlaps or is adjacent to the -35 region of the cognate promoters. Footprinting assays have suggested that these regulators protect a stretch of up to 20 bp in the target promoters, and multiple alignments of binding sites for a number of regulators have shown that the proteins recognize short motifs within the protected region.


Subjects
Trans-Activators/classification , Trans-Activators/genetics , Amino Acid Sequence , Bacterial Proteins , DNA-Binding Proteins , Gene Expression , Genes, araC , Molecular Sequence Data , Phylogeny , Sequence Alignment , Trans-Activators/physiology
2.
Nucleic Acids Res ; 31(13): 3822-3, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12824428

ABSTRACT

NEWT is a new taxonomy portal to the SWISS-PROT protein sequence knowledgebase. It contains taxonomy data, which is updated daily, for the complete set of species represented in SWISS-PROT, as well as those stored at the NCBI. Users can navigate through the taxonomy tree and access corresponding SWISS-PROT protein entries. In addition, a manually curated selection of external links allows access to specific information on selected species. NEWT is available at http://www.ebi.ac.uk/newt/.


Subjects
Classification , Databases, Protein , Internet , Systems Integration , User-Computer Interface
3.
Nucleic Acids Res ; 29(1): 37-40, 2001 Jan 01.
Article in English | MEDLINE | ID: mdl-11125043

ABSTRACT

Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, literature references and links back to the relevant member database(s). Release 2.0 of InterPro (October 2000) contains over 3000 entries, representing families, domains, repeats and sites of post-translational modification encoded by a total of 6804 different regular expressions, profiles, fingerprints and Hidden Markov Models. Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (more than 1,000,000 hits from 462,500 proteins in SWISS-PROT and TrEMBL). The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. Questions can be emailed to interhelp@ebi.ac.uk.


Subjects
Databases, Factual , Proteins , Information Services , Internet , Protein Structure, Tertiary , Proteins/chemistry , Proteins/genetics
4.
Article in English | MEDLINE | ID: mdl-27055825

ABSTRACT

Ion channels are transmembrane proteins that selectively allow ions to flow across the plasma membrane and play key roles in diverse biological processes. A multitude of diseases, called channelopathies, such as epilepsies, muscle paralysis, pain syndromes, cardiac arrhythmias or hypoglycemia are due to ion channel mutations. A wide corpus of literature is available on ion channels, covering both their functions and their roles in disease. The research community needs to access this data in a user-friendly, yet systematic manner. However, extraction and integration of this increasing amount of data have proven difficult because of the lack of a standardized vocabulary that describes the properties of ion channels at the molecular level. To address this, we have developed Ion Channel ElectroPhysiology Ontology (ICEPO), an ontology that allows one to annotate the electrophysiological parameters of the voltage-gated class of ion channels. This ontology is based on a three-state model of ion channel gating describing the three conformations/states that an ion channel can adopt: closed, open and inactivated. This ontology supports the capture of voltage-gated ion channel electrophysiological data from the literature in a structured manner and thus enables other applications such as querying and reasoning tools. Here, we present ICEPO (ICEPO ftp site: ftp://ftp.nextprot.org/pub/current_release/controlled_vocabularies/), as well as examples of its use.


Subjects
Databases as Topic , Electrophysiology , Gene Ontology , Ion Channels/metabolism , Humans , Ion Channel Gating , Models, Biological , Molecular Sequence Annotation , Mutation/genetics
5.
J Mol Biol ; 289(3): 645-57, 1999 Jun 11.
Article in English | MEDLINE | ID: mdl-10356335

ABSTRACT

The availability of genome sequences, affordable mass spectrometers and high-resolution two-dimensional gels has made possible the identification of hundreds of proteins from many organisms by peptide mass fingerprinting. However, little attention has been paid to how information generated by these means can be utilised for detailed protein characterisation. Here we present an approach for the systematic characterisation of proteins using mass spectrometry and a software tool FindMod. This tool, available on the internet at http://www.expasy.ch/sprot/findmod.html, examines peptide mass fingerprinting data for mass differences between empirical and theoretical peptides. Where mass differences correspond to a post-translational modification, intelligent rules are applied to predict the amino acids in the peptide, if any, that might carry the modification. FindMod rules were constructed by examining 5153 incidences of post-translational modifications documented in the SWISS-PROT database, and for the 22 post-translational modifications currently considered (acetylation, amidation, biotinylation, C-mannosylation, deamidation, flavinylation, farnesylation, formylation, geranyl-geranylation, gamma-carboxyglutamic acids, hydroxylation, lipoylation, methylation, myristoylation, N-acyl diglyceride (tripalmitate), O-GlcNAc, palmitoylation, phosphorylation, pyridoxal phosphate, phospho-pantetheine, pyrrolidone carboxylic acid, sulphation) a total of 29 different rules were made. These consider which amino acids can carry a modification, whether the modification occurs on N-terminal, C-terminal or internal amino acids, and the type of organisms on which the modification can be found. We illustrate the utility of the approach with proteins from 2-D gels of Escherichia coli and sheep wool, where post-translational modifications predicted by FindMod were confirmed by MALDI post-source decay peptide fragmentation. As the approach is amenable to automation, it presents a potentially large-scale means of protein characterisation in proteome projects.
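
The mass-difference matching described in this abstract can be sketched in a few lines of Python. This is a minimal illustration only, not FindMod's implementation: the function name, the tolerance, and the four-entry rule table are assumptions made for the example. The mass shifts are standard monoisotopic values, but the residue sets are deliberately simplified and do not reproduce the 29 rules mentioned above.

```python
# Hypothetical, simplified rule table: modification -> (monoisotopic mass
# shift in Da, residues assumed able to carry it). NOT FindMod's real rules.
PTM_RULES = {
    "acetylation": (42.010565, set("K")),
    "methylation": (14.015650, set("KR")),
    "hydroxylation": (15.994915, set("PK")),
    "phosphorylation": (79.966331, set("STY")),
}


def predict_modifications(observed_mass, theoretical_mass, peptide, tolerance=0.02):
    """Return (modification, candidate positions) pairs whose mass shift
    matches the observed-minus-theoretical difference within `tolerance` Da."""
    delta = observed_mass - theoretical_mass
    hits = []
    for name, (shift, residues) in PTM_RULES.items():
        if abs(delta - shift) <= tolerance:
            # 1-based positions of residues that could carry the modification.
            positions = [i + 1 for i, aa in enumerate(peptide) if aa in residues]
            if positions:  # a match is only reported if a plausible residue exists
                hits.append((name, positions))
    return hits


if __name__ == "__main__":
    # A peptide observed about 79.966 Da heavier than predicted.
    print(predict_modifications(1079.966331, 1000.0, "AASPTK"))
```

Requiring a plausible carrier residue, rather than reporting every mass match, mirrors the abstract's point that rules restrict which amino acids "if any" might carry the modification.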


Subjects
Peroxidases , Protein Processing, Post-Translational , Software , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Acetylation , Amides/metabolism , Amino Acid Sequence , Amino Acid Substitution , Cysteine/metabolism , Escherichia coli/chemistry , Image Processing, Computer-Assisted , Keratins/metabolism , Lysine/metabolism , Methionine/analogs & derivatives , Methionine/metabolism , Methylation , Molecular Sequence Data , Oxidoreductases/metabolism , Peptide Elongation Factor Tu/metabolism , Peptide Mapping , Peroxiredoxins , Phenylalanine , Species Specificity , Tyrosine
6.
J Mol Biol ; 278(3): 599-608, 1998 May 08.
Article in English | MEDLINE | ID: mdl-9600841

ABSTRACT

Genome sequences are available for increasing numbers of organisms. The proteomes (protein complement expressed by the genome) of many such organisms are being studied with two-dimensional (2D) gel electrophoresis. Here we have investigated the application of short N-terminal and C-terminal sequence tags to the identification of proteins separated on 2D gels. The theoretical N and C termini of 15,519 proteins, representing all SWISS-PROT entries for the organisms Mycoplasma genitalium, Bacillus subtilis, Escherichia coli, Saccharomyces cerevisiae and human, were analysed. Sequence tags were found to be surprisingly specific, with N-terminal tags of four amino acid residues found to be unique for between 43% and 83% of proteins, and C-terminal tags of four amino acid residues unique for between 74% and 97% of proteins, depending on the species studied. Sequence tags of five amino acid residues were found to be even more specific. To utilise this specificity of sequence tags for protein identification, we created a world-wide web-accessible protein identification program, TagIdent (http://www.expasy.ch/www/tools.html), which matches sequence tags of up to six amino acid residues as well as estimated protein pI and mass against proteins in the SWISS-PROT database. We demonstrate the utility of this identification approach with sequence tags generated from 91 different E. coli proteins purified by 2D gel electrophoresis. Fifty-one proteins were unambiguously identified by virtue of their sequence tags and estimated pI and mass, and a further 11 proteins identified when sequence tags were combined with protein amino acid composition data. We conclude that the TagIdent identification approach is best suited to the identification of proteins from prokaryotes whose complete genome sequences are available. The approach is less well suited to proteins from eukaryotes, as many eukaryotic proteins are not amenable to sequencing via Edman degradation, and tag protein identification cannot be unambiguous unless an organism's complete sequence is available.
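
The tag-specificity measurement reported in this abstract (the share of proteins whose short terminal tag is unique within a proteome) can be sketched as below. This is an illustrative Python example under stated assumptions, not the TagIdent program: the function names and the three toy sequences are hypothetical, and the pI and mass filtering that TagIdent also applies is omitted.

```python
from collections import Counter


def terminal_tag(sequence, length=4, terminus="N"):
    """Return the N-terminal or C-terminal tag of the given length."""
    return sequence[:length] if terminus == "N" else sequence[-length:]


def unique_tag_fraction(proteins, length=4, terminus="N"):
    """Fraction of proteins whose terminal tag is shared with no other protein.

    `proteins` maps a protein name to its amino acid sequence.
    """
    tags = [terminal_tag(seq, length, terminus) for seq in proteins.values()]
    counts = Counter(tags)
    return sum(1 for tag in tags if counts[tag] == 1) / len(tags)


def identify_by_tag(proteins, tag, terminus="N"):
    """Return the names of all proteins whose terminus matches the tag."""
    return [name for name, seq in proteins.items()
            if terminal_tag(seq, len(tag), terminus) == tag]


if __name__ == "__main__":
    # Toy sequences invented for the example (not real SWISS-PROT entries).
    toy = {"P1": "MKTAYIAK", "P2": "MKTALLSG", "P3": "MSERVILA"}
    print(unique_tag_fraction(toy, 4, "N"))   # P1 and P2 share "MKTA"
    print(unique_tag_fraction(toy, 4, "C"))
    print(identify_by_tag(toy, "MSER"))
```

A tag that matches more than one protein, as "MKTA" does here, is the ambiguous case that the abstract resolves with additional evidence such as estimated pI, mass, or amino acid composition.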


Subjects
Amino Acid Sequence , Cysteine Endopeptidases/genetics , Databases, Factual , Multienzyme Complexes/genetics , Proteins/chemistry , Proteins/genetics , Sequence Tagged Sites , Bacillus subtilis/genetics , Electrophoresis, Gel, Two-Dimensional , Electrophoresis, Polyacrylamide Gel , Humans , Molecular Sequence Data , Mycoplasma/genetics , Peptide Library , Proteasome Endopeptidase Complex , Saccharomyces cerevisiae/genetics , Sequence Alignment , Sequence Homology, Amino Acid
7.
Trends Biotechnol ; 19(5): 178-81, 2001 May.
Article in English | MEDLINE | ID: mdl-11301130

ABSTRACT

The availability of the human genome sequence has enabled the exploration and exploitation of the human genome and proteome to begin. Research has now focussed on the annotation of the genome and in particular of the proteome. With expert annotation extracted from the literature by biologists as the foundation, it has been possible to expand into the areas of data mining and automatic annotation. With further development and integration of pattern recognition methods and the application of alignments clustering, proteome analysis can now be provided in a meaningful way. These various approaches have been integrated to attach, extract and combine as much relevant information as possible to the proteome. This resource should be valuable to users from both research and industry.


Subjects
Databases, Factual , Genome , Proteins/chemistry , Algorithms , Humans , Internet , Models, Statistical
8.
Protein Sci ; 7(8): 1829-35, 1998 Aug.
Article in English | MEDLINE | ID: mdl-10082381

ABSTRACT

Sequence analysis of the probable archaeal phosphoglycerate mutase resulted in the identification of a superfamily of metalloenzymes with similar metal-binding sites and predicted conserved structural fold. This superfamily unites alkaline phosphatase, N-acetylgalactosamine-4-sulfatase, and cerebroside sulfatase, enzymes with known three-dimensional structures, with phosphopentomutase, 2,3-bisphosphoglycerate-independent phosphoglycerate mutase, phosphoglycerol transferase, phosphonate monoesterase, streptomycin-6-phosphate phosphatase, alkaline phosphodiesterase/nucleotide pyrophosphatase PC-1, and several closely related sulfatases. In addition to the metal-binding motifs, all these enzymes contain a set of conserved amino acid residues that are likely to be required for the enzymatic activity. Mutational changes in the vicinity of these residues in several sulfatases cause mucopolysaccharidosis (Hunter, Maroteaux-Lamy, Morquio, and Sanfilippo syndromes) and metachromatic leucodystrophy.


Subjects
Alkaline Phosphatase/classification , Metalloendopeptidases/classification , Phosphoglycerate Mutase/classification , Phosphotransferases/classification , Sulfatases/classification , Amino Acid Sequence , Computer Simulation , Databases, Factual , Humans , Hydrogen-Ion Concentration , Models, Molecular , Molecular Sequence Data , Phylogeny , Sequence Homology, Amino Acid
9.
Protein Sci ; 3(5): 853-6, 1994 May.
Article in English | MEDLINE | ID: mdl-8061614

ABSTRACT

Two families of deaminases, one specific for cytidine, the other for deoxycytidylate, are shown to possess a novel zinc-binding motif, here designated ZBS. We have (1) identified the protein members of these 2 families, (2) carried out sequence analyses that allow specification of this zinc-binding motif, and (3) determined signature sequences that will allow identification of additional members of these families as their sequences become available.


Subjects
Cytidine Deaminase/chemistry , DCMP Deaminase/chemistry , Amino Acid Sequence , Animals , Bacillus/enzymology , Bacillus/genetics , Binding Sites , Caenorhabditis elegans/enzymology , Caenorhabditis elegans/genetics , Cytidine Deaminase/genetics , Cytidine Deaminase/metabolism , DCMP Deaminase/genetics , DCMP Deaminase/metabolism , Escherichia coli/enzymology , Escherichia coli/genetics , Humans , Molecular Sequence Data , Saccharomyces cerevisiae/enzymology , Saccharomyces cerevisiae/genetics , Sequence Homology, Amino Acid , Zinc/metabolism
10.
Pharmacogenetics ; 9(4): 421-34, 1999 Aug.
Article in English | MEDLINE | ID: mdl-10780262

ABSTRACT

As currently being performed with an increasing number of superfamilies, a standardized gene nomenclature system is proposed here, based on divergent evolution, using multiple alignment analysis of all 86 eukaryotic aldehyde dehydrogenase (ALDH) amino-acid sequences known at this time. The ALDHs represent a superfamily of NAD(P)(+)-dependent enzymes having similar primary structures that oxidize a wide spectrum of endogenous and exogenous aliphatic and aromatic aldehydes. To date, a total of 54 animal, 15 plant, 14 yeast, and three fungal ALDH genes or cDNAs have been sequenced. These ALDHs can be divided into a total of 18 families (comprising 37 subfamilies), and all nonhuman ALDH genes are named here after the established human ALDH genes, when possible. An ALDH protein from one gene family is defined as having approximately ≤40% amino-acid identity to that from another family. Two members of the same subfamily exhibit approximately ≥60% amino-acid identity and are expected to be located at the same subchromosomal site. For naming each gene, it is proposed that the root symbol 'ALDH' denoting 'aldehyde dehydrogenase' be followed by an Arabic number representing the family and, when needed, a letter designating the subfamily and an Arabic number denoting the individual gene within the subfamily; all letters are capitalized in all mammals except mouse and fruit fly, e.g. 'human ALDH3A1 (mouse, Drosophila Aldh3a1).' It is suggested that the Human Gene Nomenclature Guidelines (http://www.gene.ucl.ac.uk/nomenclature/guidelines.html) be used for all species other than mouse and Drosophila. Following these guidelines, the gene is italicized, whereas the corresponding cDNA, mRNA, protein or enzyme activity is written with upper-case letters and without italics, e.g. 'human, mouse or Drosophila ALDH3A1 cDNA, mRNA, or activity'. If an orthologous gene between species cannot be identified with certainty, sequential naming of these genes will be carried out in chronological order as they are reported to us. In addition, 20 human ALDH variant alleles that have been reported to date are listed herein and are recommended to be given numbers (or a number plus a capital letter) following an asterisk (e.g. 'ALDH3A2*2, ALDH2*4C'). It is anticipated that this eukaryotic ALDH gene nomenclature system will be extended to include bacterial genes within the next 2 years and that this nomenclature system will require updating on a regular basis; an ALDH Web site has been established for this purpose (http://www.uchsc.edu/sp/sp/alcdbase/aldhcov.html) and will serve as a medium for interaction amongst colleagues in this field.
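
The naming rules in this abstract can be expressed as a small Python sketch. It is an illustration under stated assumptions rather than an official tool: the function names are hypothetical, the thresholds are treated as exact although the abstract calls them approximate, and the label for identities between 40% and 60% (same family, different subfamilies) is an inference that the abstract does not state outright.

```python
# Species whose gene symbols use lower-case letters after the first, per the abstract.
LOWERCASE_SPECIES = {"mouse", "drosophila"}


def classify_relationship(percent_identity):
    """Classify two ALDH proteins by amino-acid identity.

    The 40% and 60% cut-offs come from the abstract; the middle
    category is an assumption made for this sketch.
    """
    if percent_identity <= 40:
        return "different families"
    if percent_identity >= 60:
        return "same subfamily"
    return "same family, different subfamilies"


def aldh_symbol(family, subfamily=None, gene=None, species="human"):
    """Build a gene symbol: root 'ALDH', family number, then (when needed)
    a subfamily letter and an individual gene number."""
    suffix = f"{subfamily}{gene}" if subfamily is not None and gene is not None else ""
    symbol = f"ALDH{family}{suffix}".upper()
    if species.lower() in LOWERCASE_SPECIES:
        return symbol.capitalize()  # e.g. ALDH3A1 -> Aldh3a1
    return symbol


if __name__ == "__main__":
    print(aldh_symbol(3, "A", 1))            # human-style symbol
    print(aldh_symbol(3, "A", 1, "mouse"))   # mouse-style symbol
    print(classify_relationship(72.5))
```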


Subjects
Aldehyde Dehydrogenase/genetics , Chromosome Mapping , Evolution, Molecular , Polymorphism, Genetic , Amino Acid Sequence , Animals , Eukaryotic Cells/enzymology , Humans , Mice , Molecular Sequence Data , Sequence Homology, Amino Acid , Terminology as Topic
11.
Pharmacogenetics ; 7(4): 255-69, 1997 Aug.
Article in English | MEDLINE | ID: mdl-9295054

ABSTRACT

This review represents an update of the nomenclature system for the UDP glucuronosyltransferase gene superfamily, which is based on divergent evolution. Since the previous review in 1991, sequences of many related UDP glycosyltransferases from lower organisms have appeared in the database, which expand our database considerably. At latest count, in animals, yeast, plants and bacteria there are 110 distinct cDNAs/genes whose protein products all contain a characteristic 'signature sequence' and, thus, are regarded as members of the same superfamily. Comparison of a relatedness tree of proteins leads to the definition of 33 families. It should be emphasized that at least six cloned UDP-GlcNAc N-acetylglucosaminyltransferases are not sufficiently homologous to be included as members of this superfamily and may represent an example of convergent evolution. For naming each gene, it is recommended that the root symbol UGT for human (Ugt for mouse and Drosophila), denoting 'UDP glycosyltransferase,' be followed by an Arabic number representing the family, a letter designating the subfamily, and an Arabic numeral denoting the individual gene within the family or subfamily, e.g. 'human UGT2B4' and 'mouse Ugt2b5'. We recommend the name 'UDP glycosyltransferase' because many of the proteins do not preferentially use UDP glucuronic acid, or their nucleotide sugar preference is unknown. Whereas the gene is italicized, the corresponding cDNA, transcript, protein and enzyme activity should be written with upper-case letters and without italics, e.g. 'human or mouse UGT1A1.' The UGT1 gene (spanning > 500 kb) contains at least 12 promoters/first exons, which can be spliced and joined with common exons 2 through 5, leading to different N-terminal halves but identical C-terminal halves of the gene products; in this scheme each first exon is regarded as a distinct gene (e.g. UGT1A1, UGT1A2, ... UGT1A12). When an orthologous gene between species cannot be identified with certainty, as occurs in the UGT2B subfamily, sequential naming of the genes is being carried out chronologically as they become characterized. We suggest that the Human Gene Nomenclature Guidelines (http://www.gene.ucl.ac.uk/nomenclature/guidelines.html) be used for all species other than the mouse and Drosophila. Thirty published human UGT1A1 mutant alleles responsible for clinical hyperbilirubinemias are listed herein, and given numbers following an asterisk (e.g. UGT1A1*30) consistent with the Human Gene Nomenclature Guidelines. It is anticipated that this UGT gene nomenclature system will require updating on a regular basis.
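
The symbol structure recommended in this abstract (root, family number, subfamily letter, gene number, optional allele number after an asterisk) lends itself to a simple parser. The Python sketch below is a hypothetical illustration: the regular expression, the function name, and the returned field names are assumptions made for the example, and only numeric allele designations of the kind shown in the abstract are handled.

```python
import re

# Root, family number, subfamily letter, gene number, optional "*allele".
UGT_SYMBOL = re.compile(r"^(UGT|Ugt)(\d+)([A-Za-z])(\d+)(?:\*(\d+))?$")


def parse_ugt_symbol(symbol):
    """Split a UGT gene symbol into its parts, or return None if it does
    not follow the pattern described in the abstract."""
    match = UGT_SYMBOL.match(symbol)
    if match is None:
        return None
    root, family, subfamily, gene, allele = match.groups()
    # An upper-case root goes with an upper-case subfamily letter, and a
    # capitalized root (mouse, Drosophila) with a lower-case letter.
    if (root == "UGT") != subfamily.isupper():
        return None
    return {
        "style": "upper-case" if root == "UGT" else "mouse/Drosophila",
        "family": int(family),
        "subfamily": subfamily.upper(),
        "gene": int(gene),
        "allele": int(allele) if allele else None,
    }


if __name__ == "__main__":
    for example in ("UGT2B4", "Ugt2b5", "UGT1A1*30", "UGT2b4"):
        print(example, parse_ugt_symbol(example))
```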


Subjects
Evolution, Molecular , Genes , Glucuronosyltransferase/genetics , Multigene Family , Terminology as Topic , Amino Acid Sequence , Animals , Glucuronosyltransferase/chemistry , Humans , Molecular Sequence Data , Sequence Alignment , Sequence Homology, Amino Acid
12.
FEBS Lett ; 242(2): 211-4, 1989 Jan 02.
Article in English | MEDLINE | ID: mdl-2914602

ABSTRACT

The primary sequence motif HExxH has been found in many zinc-dependent endopeptidases. We show that a larger signature comprising this sequence is common to most of the known zinc-dependent endopeptidases, and that the presence of the signature can be indicative of membership in the family. A search of the protein sequence databases for entries containing the signature retrieved several unexpected potential zinc endopeptidases.
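
A database scan for the basic HExxH motif mentioned in this abstract can be sketched with a regular expression. This Python example is a minimal illustration: the function name and the test sequence are invented, and it searches only for the five-residue core, since the abstract does not spell out the larger signature the authors actually used.

```python
import re

# H, E, any two residues, H. The lookahead lets overlapping hits be reported.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(sequence):
    """Return (1-based start position, matched residues) for every HExxH
    occurrence in a single-line, one-letter-code protein sequence."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(sequence.upper())]


if __name__ == "__main__":
    # Invented sequence fragment containing one HExxH occurrence.
    print(find_hexxh("AAHELGHAA"))
```

A five-residue pattern this short will also occur by chance in unrelated proteins, which is consistent with the abstract's use of a larger signature to indicate family membership.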


Subjects
Metalloendopeptidases , Amino Acid Sequence , Animals , Humans , Structure-Activity Relationship
13.
Biotechniques ; 6(6): 566-72, 1988 Jun.
Article in English | MEDLINE | ID: mdl-3273189

ABSTRACT

This paper describes a series of protein analyses using the molecular biology software package PC/GENE, which runs on an IBM or compatible microcomputer. A nucleic acid sequence was first edited and then translated into an amino acid sequence. The amino acid composition, isoelectric point, molecular weight, and other properties of the sequence were determined. Programs to predict secondary structure, alpha helix membrane associations, hydrophobic and hydrophilic regions, and surface and antigenic sites from the amino acid sequence were also used. A search was made in a data base for sequences containing a region similar to a region in the protein sequence. Sequence alignments and queries of data bases can also be performed.
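
The first steps of the workflow in this abstract, translating a nucleic acid sequence and computing amino acid composition, can be sketched in Python. This is not PC/GENE code: the function names are hypothetical, the standard genetic code is assumed, and translation simply starts at the first base and stops at the first stop codon.

```python
from collections import Counter
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA coding sequence from its first base, stopping at the
    first stop codon; trailing bases that do not fill a codon are ignored."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


def composition(protein):
    """Return the fractional amino acid composition of a protein sequence."""
    counts = Counter(protein)
    return {aa: counts[aa] / len(protein) for aa in sorted(counts)}


if __name__ == "__main__":
    peptide = translate("ATGGCCAAAAAATAA")
    print(peptide)
    print(composition(peptide))
```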


Subjects
Nucleic Acids/analysis , Proteins/analysis , Software , Amino Acid Sequence , Base Sequence , Biotechnology , Information Systems , Protein Conformation , Sequence Homology, Nucleic Acid
15.
Methods Inf Med ; 34(1-2): 75-8, 1995 Mar.
Article in English | MEDLINE | ID: mdl-9082141

ABSTRACT

The sharing of knowledge worldwide using hypermedia facilities and fast communication protocols (i.e., Mosaic and World Wide Web) provides a growth capacity with tremendous versatility and efficacy. The example of ExPASy, a molecular biology server developed at the University Hospital of Geneva, is striking. ExPASy provides hypermedia facilities to browse through several up-to-date biological and medical databases around the world and to link information from protein maps to genome information and diseases. Its extensive access is open through World Wide Web. Its concept could be extended to patient data including texts, laboratory data, relevant literature findings, sounds, images and movies. A new hypermedia culture is spreading very rapidly where the international fast transmission of documents is the central element. It is part of the emerging new "information society".


Subjects
Computer Communication Networks , Molecular Biology , Artificial Intelligence , Databases, Factual , Diffusion of Innovation , Humans , Technology Transfer