1.
J Ind Microbiol Biotechnol ; 50(1)2023 Feb 17.
Article in English | MEDLINE | ID: mdl-37656881

ABSTRACT

Biomanufacturing could contribute as much as $30 trillion to the global economy by 2030. However, the success of the growing bioeconomy depends on our ability to manufacture high-performing strains in a time- and cost-effective manner. The Design-Build-Test-Learn (DBTL) framework has proven to be an effective strain engineering approach. Significant improvements have been made in genome engineering, genotyping, and phenotyping throughput over the last couple of decades that have greatly accelerated the DBTL cycles. However, to achieve a radical reduction in strain development time and cost, we need to look at the strain engineering process through a lens of optimizing the whole cycle, as opposed to simply increasing throughput at each stage. We propose an approach that integrates all 4 stages of the DBTL cycle and takes advantage of the advances in computational design, high-throughput genome engineering, and phenotyping methods, as well as machine learning tools for making predictions about strain scale-up performance. In this perspective, we discuss the challenges of industrial strain engineering, outline the best approaches to overcoming these challenges, and showcase examples of successful strain engineering projects for production of heterologous proteins, amino acids, and small molecules, as well as improving tolerance, fitness, and de-risking the scale-up of industrial strains.

3.
Nat Commun ; 14(1): 241, 2023 01 16.
Article in English | MEDLINE | ID: mdl-36646716

ABSTRACT

Deep mutational scanning is a powerful approach to investigate a wide variety of research questions including protein function and stability. Here, we perform deep mutational scanning on three essential E. coli proteins (FabZ, LpxC and MurA) involved in cell envelope synthesis using high-throughput CRISPR genome editing, and study the effect of the mutations in their original genomic context. We use more than 17,000 variants of the proteins to interrogate protein function and the importance of individual amino acids in supporting viability. Additionally, we exploit these libraries to study resistance development against antimicrobial compounds that target the selected proteins. Among the three proteins studied, MurA seems to be the superior antimicrobial target due to its low mutational flexibility, which decreases the chance of acquiring resistance-conferring mutations that simultaneously preserve MurA function. Additionally, we rank anti-LpxC lead compounds for further development, guided by the number of resistance-conferring mutations against each compound. Our results show that deep mutational scanning studies can be used to guide drug development, which we hope will contribute towards the development of novel antimicrobial therapies.
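
To give a sense of how such variant libraries scale, the following minimal sketch enumerates every single amino-acid substitution of a protein. It assumes a library of single substitutions over the 20 standard amino acids, which the abstract does not specify; the function name and the three-residue toy sequence are hypothetical and are not taken from FabZ, LpxC or MurA.

```python
# Illustrative only: the library design (single substitutions) is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_substitutions(protein: str) -> list[str]:
    """Return labels such as 'M1A' for every single amino-acid substitution."""
    variants = []
    for position, wild_type in enumerate(protein, start=1):
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:  # skip the wild-type residue itself
                variants.append(f"{wild_type}{position}{mutant}")
    return variants


toy_sequence = "MKV"  # hypothetical toy sequence
library = single_substitutions(toy_sequence)
print(len(library))  # 19 substitutions per position
```

Under this assumption the library grows linearly with protein length, at 19 variants per residue.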


Subjects
Anti-Bacterial Agents , Escherichia coli Proteins , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/chemistry , Bacterial Proteins/metabolism , Escherichia coli/metabolism , Mutation , Escherichia coli Proteins/genetics , Escherichia coli Proteins/pharmacology
4.
Nucleic Acids Res ; 36(Database issue): D943-6, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17933772

ABSTRACT

The Generation Challenge Programme (GCP; www.generationcp.org) has developed an online resource documenting stress-responsive genes comparatively across plant species. This public resource is a compendium of protein families, phylogenetic trees, multiple sequence alignments (MSA) and associated experimental evidence. The central objective of this resource is to elucidate orthologous and paralogous relationships between plant genes that may be involved in response to environmental stress, mainly abiotic stresses such as water deficit ('drought'). The web-based graphical user interface (GUI) of the resource includes query and visualization tools that allow diverse searches and browsing of the underlying project database. The web interface can be accessed at http://dayhoff.generationcp.org.


Subjects
Agricultural Crops/genetics , Genetic Databases , Plant Genes , Agricultural Crops/metabolism , Dehydration , Environment , Gene Expression Profiling , Internet , Phylogeny , Plant Proteins/chemistry , Plant Proteins/classification , Sequence Alignment , User-Computer Interface
5.
PLoS Comput Biol ; 3(8): e160, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17708678

ABSTRACT

Function prediction by homology is widely used to provide preliminary functional annotations for genes for which experimental evidence of function is unavailable or limited. This approach has been shown to be prone to systematic error, including percolation of annotation errors through sequence databases. Phylogenomic analysis avoids these errors in function prediction but has been difficult to automate for high-throughput application. To address this limitation, we present a computationally efficient pipeline for phylogenomic classification of proteins. This pipeline uses the SCI-PHY (Subfamily Classification in Phylogenomics) algorithm for automatic subfamily identification, followed by subfamily hidden Markov model (HMM) construction. A simple and computationally efficient scoring scheme using family and subfamily HMMs enables classification of novel sequences to protein families and subfamilies. Sequences representing entirely novel subfamilies are differentiated from those that can be classified to subfamilies in the input training set using logistic regression. Subfamily HMM parameters are estimated using an information-sharing protocol, enabling subfamilies containing even a single sequence to benefit from conservation patterns defining the family as a whole or in related subfamilies. SCI-PHY subfamilies correspond closely to functional subtypes defined by experts and to conserved clades found by phylogenetic analysis. Extensive comparisons of subfamily and family HMM performances show that subfamily HMMs dramatically improve the separation between homologous and non-homologous proteins in sequence database searches. Subfamily HMMs also provide extremely high specificity of classification and can be used to predict entirely novel subtypes. The SCI-PHY Web server at http://phylogenomics.berkeley.edu/SCI-PHY/ allows users to upload a multiple sequence alignment for subfamily identification and subfamily HMM construction. Biologists wishing to provide their own subfamily definitions can do so. Source code is available on the Web page. The Berkeley Phylogenomics Group PhyloFacts resource contains pre-calculated subfamily predictions and subfamily HMMs for more than 40,000 protein families and domains at http://phylogenomics.berkeley.edu/phylofacts/.
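
The two-level decision described in this abstract (family membership, then known versus novel subfamily) might be sketched roughly as follows. This is not the SCI-PHY implementation: the scores are assumed to be precomputed HMM log-odds scores, and the feature (best subfamily score minus family score), the cutoffs, and the logistic coefficients are all hypothetical values chosen for illustration.

```python
import math


def classify(family_score, subfamily_scores,
             family_cutoff=0.0, intercept=-1.0, slope=0.5, known_cutoff=0.5):
    """Assign a sequence to a family and, when supported, to a subfamily.

    All scores are assumed to be precomputed HMM log-odds scores; the
    cutoffs and logistic coefficients are illustrative, not fitted values.
    """
    if family_score < family_cutoff:
        return ("not in family", None)
    # Pick the subfamily whose HMM scores the sequence highest.
    best = max(subfamily_scores, key=subfamily_scores.get)
    # Hypothetical feature: how much better the subfamily HMM fits than the family HMM.
    margin = subfamily_scores[best] - family_score
    # Logistic function turns the margin into a probability of a known subfamily.
    p_known = 1.0 / (1.0 + math.exp(-(intercept + slope * margin)))
    if p_known >= known_cutoff:
        return ("known subfamily", best)
    return ("novel subfamily", None)


print(classify(50.0, {"A": 60.0, "B": 40.0}))
print(classify(50.0, {"A": 45.0, "B": 40.0}))
```

In practice the logistic coefficients would be fitted on sequences with known subfamily membership rather than set by hand.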


Subjects
Algorithms , Artificial Intelligence , Automated Pattern Recognition/methods , Proteins/chemistry , Proteins/classification , Sequence Alignment/methods , Protein Sequence Analysis/methods , Amino Acid Sequence , Markov Chains , Molecular Sequence Data , Reproducibility of Results , Sensitivity and Specificity
6.
Nucleic Acids Res ; 35(Web Server issue): W27-32, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17488835

ABSTRACT

Phylogenomic analysis addresses the limitations of function prediction based on annotation transfer, and has been shown to enable the highest accuracy in prediction of protein molecular function. The Berkeley Phylogenomics Group provides a series of web servers for phylogenomic analysis: classification of sequences to pre-computed families and subfamilies using the PhyloFacts Phylogenomic Encyclopedia, FlowerPower clustering of proteins sharing the same domain architecture, MUSCLE multiple sequence alignment, SATCHMO simultaneous alignment and tree construction and SCI-PHY subfamily identification. The PhyloBuilder web server provides an integrated phylogenomic pipeline starting with a user-supplied protein sequence, proceeding to homolog identification, multiple alignment, phylogenetic tree construction, subfamily identification and structure prediction. The Berkeley Phylogenomics Group resources are available at http://phylogenomics.berkeley.edu.


Subjects
Computational Biology/methods , Phylogeny , Algorithms , Animals , Computers , Genetic Databases , Protein Databases , Humans , Internet , Genetic Models , Protein Conformation , Sequence Alignment , Protein Sequence Analysis , Software , User-Computer Interface
7.
BMC Evol Biol ; 7 Suppl 1: S12, 2007 Feb 08.
Article in English | MEDLINE | ID: mdl-17288570

ABSTRACT

BACKGROUND: Function prediction by transfer of annotation from the top database hit in a homology search has been shown to be prone to systematic error. Phylogenomic analysis reduces these errors by inferring protein function within the evolutionary context of the entire family. However, accuracy of function prediction for multi-domain proteins depends on all members having the same overall domain structure. By contrast, most common homolog detection methods are optimized for retrieving local homologs, and do not address this requirement. RESULTS: We present FlowerPower, a novel clustering algorithm designed for the identification of global homologs as a precursor to structural phylogenomic analysis. Similar to methods such as PSIBLAST, FlowerPower employs an iterative approach to clustering sequences. However, rather than using a single HMM or profile to expand the cluster, FlowerPower identifies subfamilies using the SCI-PHY algorithm and then selects and aligns new homologs using subfamily hidden Markov models. FlowerPower is shown to outperform BLAST, PSI-BLAST and the UCSC SAM-Target 2K methods at discrimination between proteins in the same domain architecture class and those having different overall domain structures. CONCLUSION: Structural phylogenomic analysis enables biologists to avoid the systematic errors associated with annotation transfer; clustering sequences based on sharing the same domain architecture is a critical first step in this process. FlowerPower is shown to consistently identify homologous sequences having the same domain architecture as the query. AVAILABILITY: FlowerPower is available as a webserver at http://phylogenomics.berkeley.edu/flowerpower/.


Subjects
Algorithms , Phylogeny , Tertiary Protein Structure , Proteins/physiology , Protein Sequence Analysis/methods , Animals , Cluster Analysis , Genetic Databases , Humans , Proteins/classification , Research Design , Sequence Alignment
8.
Genome Biol ; 7(9): R83, 2006.
Article in English | MEDLINE | ID: mdl-16973001

ABSTRACT

The Berkeley Phylogenomics Group presents PhyloFacts, a structural phylogenomic encyclopedia containing almost 10,000 'books' for protein families and domains, with pre-calculated structural, functional and evolutionary analyses. PhyloFacts enables biologists to avoid the systematic errors associated with function prediction by homology through the integration of a variety of experimental data and bioinformatics methods in an evolutionary framework. Users can submit sequences for classification to families and functional subfamilies. PhyloFacts is available as a worldwide web resource from http://phylogenomics.berkeley.edu/phylofacts.


Subjects
Protein Databases , Proteins , Animals , Molecular Evolution , Humans , Phylogeny , Tertiary Protein Structure , Proteins/chemistry , Proteins/classification , Proteins/genetics , Structure-Activity Relationship
9.
Plant Physiol ; 138(2): 611-23, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15955925

ABSTRACT

The tomato (Lycopersicon esculentum) Cf-9 resistance gene encodes the first characterized member of the plant receptor-like protein (RLP) family. Other RLPs such as CLAVATA2 and TOO MANY MOUTHS are known to regulate development. The domain structure of RLPs consists of extracellular leucine-rich repeats, a transmembrane helix, and a short cytoplasmic region. Here, we identify 90 RLPs in rice (Oryza sativa) and compare them with functionally characterized RLPs from different plant species and with 56 Arabidopsis (Arabidopsis thaliana) RLPs, including the downy mildew resistance protein RPP27. Many RLPs cluster into four distinct superclades, three of which include RLPs known to be involved in plant defense. Sequence comparisons reveal diagnostic amino acid residues that may specify different molecular functions in different RLP subtypes. This analysis of rice RLPs thus identified at least 73 candidate resistance genes and four genes potentially involved in development. Due to the synteny between rice and other Gramineae, this analysis should provide valuable tools for experimental studies in rice and other cereals.


Subjects
Arabidopsis/genetics , Oryza/genetics , Plant Proteins/genetics , Cell Surface Receptors/genetics , Amino Acid Sequence , Arabidopsis/chemistry , Arabidopsis Proteins/chemistry , Arabidopsis Proteins/genetics , Conserved Sequence , Plant Gene Expression Regulation , Plant Genes , Plant Genome , Molecular Sequence Data , Oryza/chemistry , Phylogeny , Plant Proteins/chemistry , Cell Surface Receptors/chemistry , Sequence Alignment , Amino Acid Sequence Homology
10.
Mol Cell Proteomics ; 4(8): 1072-84, 2005 Aug.
Article in English | MEDLINE | ID: mdl-15901827

ABSTRACT

We report an extensive proteome analysis of rice etioplasts, which were highly purified from dark-grown leaves by a novel protocol using Nycodenz density gradient centrifugation. Comparative protein profiling of different cell compartments from leaf tissue demonstrated the purity of the etioplast preparation by the absence of diagnostic marker proteins of other cell compartments. Systematic analysis of the etioplast proteome identified 240 unique proteins that provide new insights into heterotrophic plant metabolism and control of gene expression. They include several new proteins that were not previously known to localize to plastids. The etioplast proteins were compared with proteomes from Arabidopsis chloroplasts and plastids from tobacco Bright Yellow 2 cells. Together with computational structure analyses of proteins without functional annotations, this comparative proteome analysis revealed novel etioplast-specific proteins. These include components of the plastid gene expression machinery such as two RNA helicases, an RNase II-like hydrolytic exonuclease, and a site 2 protease-like metalloprotease, all of which were not known previously to localize to the plastid and are indicative of so far unknown regulatory mechanisms of plastid gene expression. All etioplast protein identifications and related data were integrated into a database that is freely available upon request.


Subjects
Plant Gene Expression Regulation , Oryza/chemistry , Plant Proteins/metabolism , Plastids/chemistry , Amino Acid Sequence , Arabidopsis/chemistry , Arabidopsis/genetics , Chloroplasts , Computational Biology , Two-Dimensional Gel Electrophoresis , Exonucleases/metabolism , Mass Spectrometry , Metalloproteases/metabolism , Molecular Sequence Data , Plant Proteins/analysis , Plant Proteins/chemistry , Plant Proteins/classification , Proteome/analysis , Amino Acid Sequence Homology , Signal Transduction
11.
Pac Symp Biocomput ; : 322-33, 2005.
Article in English | MEDLINE | ID: mdl-15759638

ABSTRACT

The limitations of homology-based methods for prediction of protein molecular function are well known; differences in domain structure, gene duplication events and errors in existing database annotations complicate this process. In this paper we present a method to detect and model protein subfamilies, which can be used in high-throughput, genome-scale phylogenomic inference of protein function. We demonstrate the method on a set of nine PFAM families, and show that subfamily HMMs provide greater separation of homologs and non-homologs than is possible with a single HMM for each family. We also show that subfamily HMMs can be used for functional classification with a very low expected error rate. The BETE method for identifying functional subfamilies is illustrated on a set of serotonin receptors.


Subjects
Genomics , Animals , Bayes Theorem , Biological Evolution , Nucleic Acid Databases , Protein Databases , Enzymes/genetics , Gene Duplication , Markov Chains , Genetic Models , Phylogeny , Proteins/chemistry , Proteins/genetics , Sequence Alignment
12.
Proc Natl Acad Sci U S A ; 102(6): 2087-92, 2005 Feb 08.
Article in English | MEDLINE | ID: mdl-15684089

ABSTRACT

During infection of Arabidopsis thaliana, the bacterium Pseudomonas syringae pv tomato delivers the effector protein AvrRpt2 into the plant cell cytosol. Within the plant cell, AvrRpt2 undergoes N-terminal processing and causes elimination of Arabidopsis RIN4. Previous work established that AvrRpt2 is a putative cysteine protease, and AvrRpt2 processing and RIN4 elimination require an intact predicted catalytic triad in AvrRpt2. In this work, proteolytic events that depend on AvrRpt2 activity were characterized. The amino acid sequence surrounding the processing site of AvrRpt2 and two related sequences from RIN4 triggered AvrRpt2-dependent proteolytic cleavage of a synthetic substrate, demonstrating that these sequences are cleavage recognition sites for AvrRpt2 activity. Processing-deficient AvrRpt2 mutants were identified and shown to retain their ability to eliminate wild-type RIN4. Single amino acid substitutions were made in each of the two RIN4 cleavage sites, and mutation of both sites resulted in cleavage-resistant RIN4. Growth of Pseudomonas expressing AvrRpt2 was significantly higher than that of catalytically inactive mutants on Arabidopsis rin4/rps2 mutant plants, suggesting there are additional protein targets of AvrRpt2 that account for the virulence activity of this effector. Bioinformatics analysis identified putative Arabidopsis proteins containing sequences similar to the proteolytic cleavage sites conserved in AvrRpt2 and RIN4. Several of these proteins were eliminated in an AvrRpt2-dependent manner in a transient in planta expression system. These results identify amino acids important for AvrRpt2 substrate recognition and cleavage as well as demonstrate AvrRpt2 protease activity eliminates multiple Arabidopsis proteins in a transient expression system.


Subjects
Bacterial Proteins/metabolism , Pseudomonas syringae/metabolism , Amino Acid Sequence , Arabidopsis/microbiology , Arabidopsis Proteins/metabolism , Bacterial Proteins/genetics , Carrier Proteins/metabolism , Intracellular Signaling Peptides and Proteins , Molecular Sequence Data , Sequence Alignment
13.
Proc Natl Acad Sci U S A ; 102(5): 1685-90, 2005 Feb 01.
Article in English | MEDLINE | ID: mdl-15668378

ABSTRACT

The Agrobacterium T-DNA transporter belongs to a growing class of evolutionarily conserved transporters, called type IV secretion systems (T4SSs). VirB4, 789 aa, is the largest T4SS component, providing a rich source of possible structural domains. Here, we use a variety of bioinformatics methods to predict that the C-terminal domain of VirB4 (including the Walker A and B nucleotide-binding motifs) is related by divergent evolution to the cytoplasmic domain of TrwB, the coupling protein required for conjugative transfer of plasmid R388 from Escherichia coli. This prediction is supported by detailed sequence and structure analyses showing conservation of functionally and structurally important residues between VirB4 and TrwB. The availability of a solved crystal structure for TrwB enables the construction of a comparative model for VirB4 and the prediction that, like TrwB, VirB4 forms a hexamer. These results lead to a model in which VirB4 acts as a docking site at the entrance of the T4SS channel and acts in concert with VirD4 and VirB11 to transport substrates (T-strand linked to VirD2 or proteins such as VirE2, VirE3, or VirF) through the T4SS.


Subjects
Bacterial Proteins/chemistry , Organic Anion Transporters/chemistry , Organic Anion Transporters/metabolism , Rhizobium/physiology , Amino Acid Sequence , Bacterial Proteins/metabolism , Macromolecular Substances/chemistry , Macromolecular Substances/metabolism , Molecular Models , Molecular Sequence Data , Peptide Fragments/chemistry , Peptide Fragments/metabolism , Secondary Protein Structure , Protein Transport , Sequence Alignment , Amino Acid Sequence Homology
14.
Curr Protoc Mol Biol ; Chapter 19: Unit 19.5, 2005 May.
Article in English | MEDLINE | ID: mdl-18265355

ABSTRACT

Prediction of molecular function of proteins has become an important task in the genomics era. A wide variety of sequence analysis tools are available to biologists for this task. We have selected one or two primary protocols for tasks such as domain detection, subcellular localization, and motif detection. We also present a strategy for integration of results from different protocols. All the resources needed for these protocols are accessible via publicly available Web servers and databases and require little or no computational expertise.


Subjects
Database Management Systems , Protein Sequence Analysis/methods , Amino Acid Motifs , Information Storage and Retrieval , Internet , Tertiary Protein Structure , Software , User-Computer Interface
15.
Curr Protoc Bioinformatics ; Chapter 6: Unit 6.9, 2005 Oct.
Article in English | MEDLINE | ID: mdl-18428751

ABSTRACT

With the explosion in sequence data, accurate prediction of protein function has become a vital task in prioritizing experimental investigation. While computationally efficient methods for homology-based function prediction have been developed to make this approach feasible in high-throughput mode, it is not without its dangers. Biological processes such as gene duplication, domain shuffling, and speciation produce families of related genes whose gene products can have vastly different molecular functions. Standard sequence-comparison approaches may not discriminate effectively among these candidate homologs, leading to errors in database annotations. In this unit, we describe phylogenomic approaches to reduce the error rate in function prediction. Phylogenomic inference of protein molecular function consists of a series of subtasks. Once a cluster of homologs is identified, a multiple sequence alignment and phylogenetic tree are constructed. Finally, the phylogenetic tree is overlaid with experimental data culled for the members of the family, and changes in biochemical function can be traced along the evolutionary tree.
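
The final step described above, overlaying experimental annotations on a phylogenetic tree, could be illustrated with the toy sketch below. It is an assumption-laden simplification rather than the protocol from the unit: the tree is a nested tuple of leaf names, the leaf names and annotations are hypothetical, and a function is transferred to a query only when every annotated member of the smallest informative clade agrees.

```python
def leaves(tree):
    """Return all leaf names under a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def clades_containing(tree, query):
    """Return the clades that contain the query, ordered smallest to largest."""
    if isinstance(tree, str):
        return []
    for child in tree:
        if query in leaves(child):
            return clades_containing(child, query) + [tree]
    return []


def infer_function(tree, query, annotations):
    """Transfer a function from the smallest clade whose annotated members agree."""
    for clade in clades_containing(tree, query):
        labels = {annotations[leaf] for leaf in leaves(clade)
                  if leaf != query and leaf in annotations}
        if len(labels) == 1:
            return labels.pop()
        if len(labels) > 1:
            return None  # conflicting evidence: make no prediction
    return None  # no annotated relatives found


# Hypothetical tree and annotations, for illustration only.
toy_tree = ((("q", "a1"), "a2"), ("b1", "b2"))
toy_annotations = {"a2": "kinase", "b1": "phosphatase", "b2": "phosphatase"}
print(infer_function(toy_tree, "q", toy_annotations))
```

Declining to predict when annotations conflict is one conservative design choice; a real analysis would also weigh branch lengths and the strength of the experimental evidence.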


Subjects
Algorithms , Molecular Evolution , Genetic Models , Proteins/genetics , Proteins/metabolism , Sequence Alignment/methods , DNA Sequence Analysis/methods , Base Sequence , Computer Simulation , DNA Mutational Analysis/methods , Molecular Sequence Data , Phylogeny , Nucleic Acid Sequence Homology
16.
Curr Protoc Protein Sci ; Chapter 2: 2.11.1-2.11.24, 2005 Sep.
Article in English | MEDLINE | ID: mdl-18429280

ABSTRACT

Prediction of molecular function of proteins has become an important task in the genomics era. A wide variety of sequence analysis tools are available to biologists for this task. We have selected one or two primary protocols for tasks such as domain detection, subcellular localization, and motif detection. We also present a strategy for integration of results from different protocols. All the resources needed for these protocols are accessible via publicly available Web servers and databases and require little or no computational expertise.


Subjects
Proteins/chemistry , Amino Acid Sequence , Protein Databases , Humans , Internet , Molecular Sequence Data , Secondary Protein Structure , Sequence Alignment , Amino Acid Sequence Homology , Subcellular Fractions/chemistry