1.
Bioinformatics; 38(1): 44-51, 2021 Dec 22.
Article in English | MEDLINE | ID: mdl-34415301

ABSTRACT

MOTIVATION: Accurate automatic annotation of protein function relies on both innovative models and robust datasets. Due to their importance in biological processes, the identification of DNA-binding proteins directly from protein sequence has been the focus of many studies. However, the datasets used to train and evaluate these methods have suffered from substantial flaws. We describe some of the weaknesses of the datasets used in previous DNA-binding protein literature and provide several new datasets addressing these problems. We suggest new evaluative benchmark tasks that more realistically assess real-world performance for protein annotation models. We propose a simple new model for the prediction of DNA-binding proteins and compare its performance on the improved datasets to two previously published models. In addition, we provide extensive tests showing how the best models predict across taxa. RESULTS: Our new gradient boosting model, which uses features derived from a published protein language model, outperforms the earlier models. Perhaps surprisingly, so does a baseline nearest neighbor model using BLAST percent identity. We evaluate the sensitivity of these models to perturbations of DNA-binding regions and control regions of protein sequences. The successful data-driven models learn to focus on DNA-binding regions. When predicting across taxa, the best models are highly accurate across species in the same kingdom and can provide some information when predicting across kingdoms. AVAILABILITY AND IMPLEMENTATION: The data and results for this article can be found at https://doi.org/10.5281/zenodo.5153906. The code for this article can be found at https://doi.org/10.5281/zenodo.5153683. The code, data and results can also be found at https://github.com/AZaitzeff/tools_for_dna_binding_proteins.


Subjects
DNA-Binding Proteins, DNA, DNA-Binding Proteins/genetics, Amino Acid Sequence, Molecular Sequence Annotation
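
The nearest neighbor baseline mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that percent-identity hits have already been computed (for example, parsed from BLAST tabular output) and that a query with no hit falls back to the negative class. The function name, the tuple layout, and the fallback rule are assumptions made for illustration, not details taken from the article.

```python
from typing import Dict, Iterable, List, Tuple


def nearest_neighbor_labels(
    queries: List[str],
    hits: Iterable[Tuple[str, str, float]],
    train_labels: Dict[str, int],
    default_label: int = 0,
) -> Dict[str, int]:
    """Give each query the label of its highest percent-identity training hit.

    hits: (query_id, training_id, percent_identity) tuples (hypothetical layout).
    train_labels: 1 = DNA-binding, 0 = not DNA-binding.
    default_label: label used when a query has no hit (an assumption).
    """
    best: Dict[str, Tuple[float, str]] = {}
    for query_id, training_id, pct_identity in hits:
        if training_id not in train_labels:
            continue  # skip hits that are not in the labeled training set
        if query_id not in best or pct_identity > best[query_id][0]:
            best[query_id] = (pct_identity, training_id)
    return {
        q: train_labels[best[q][1]] if q in best else default_label
        for q in queries
    }


# Hypothetical identifiers and scores, for illustration only.
train_labels = {"trainA": 1, "trainB": 0}
hits = [("q1", "trainA", 62.5), ("q1", "trainB", 40.0), ("q2", "trainB", 88.0)]
predictions = nearest_neighbor_labels(["q1", "q2", "q3"], hits, train_labels)
print(predictions)
```

Here `q1` inherits the label of `trainA`, `q2` that of `trainB`, and `q3`, which has no hit, receives the default label.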
2.
PLoS Biol; 12(2): e1001789, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24558347

ABSTRACT

Evolutionary adaptation to a constant environment is often accompanied by specialization and a reduction of fitness in other environments. We assayed the ability of the Lenski Escherichia coli populations to grow on a range of carbon sources after 50,000 generations of adaptation on glucose. Using direct measurements of growth rates, we demonstrated that declines in performance were much less widespread than suggested by previous results from Biolog assays of cellular respiration. Surprisingly, there were many performance increases on a variety of substrates. In addition to the now famous example of citrate, we observed several other novel gains of function for organic acids that the ancestral strain only marginally utilized. Quantitative growth data also showed that strains with a higher mutation rate exhibited significantly more declines, suggesting that most metabolic erosion was driven by mutation accumulation and not by physiological tradeoffs. These reductions in growth by mutator strains were ameliorated by growth at lower temperature, consistent with the hypothesis that this metabolic erosion is largely caused by destabilizing mutations to the associated enzymes. We further hypothesized that reductions in growth rate would be greatest for substrates used most differently from glucose, and we used flux balance analysis to formulate this question quantitatively. To our surprise, we found no significant relationship between decreases in growth and dissimilarity to glucose metabolism. Taken as a whole, these data suggest that in a single resource environment, specialization does not mainly result as an inevitable consequence of adaptive tradeoffs, but rather due to the gradual accumulation of disabling mutations in unused portions of the genome.


Subjects
Physiological Adaptation/genetics, Escherichia coli/genetics, Carbohydrate Metabolism, Culture Media, Escherichia coli/growth & development, Escherichia coli/metabolism, Molecular Evolution, Metabolic Networks and Pathways/genetics, Mutagenesis, Mutation, Oxygen Consumption
3.
PLoS Comput Biol; 9(6): e1003091, 2013.
Article in English | MEDLINE | ID: mdl-23818838

ABSTRACT

The most powerful genome-scale framework to model metabolism, flux balance analysis (FBA), is an evolutionary optimality model. It hypothesizes selection upon a proposed optimality criterion in order to predict the set of internal fluxes that would maximize fitness. Here we present a direct test of the optimality assumption underlying FBA by comparing the central metabolic fluxes predicted by multiple criteria to changes measurable by a 13C-labeling method for experimentally-evolved strains. We considered datasets for three Escherichia coli evolution experiments that varied in their length, consistency of environment, and initial optimality. For ten populations that were evolved for 50,000 generations in glucose minimal medium, we observed modest changes in relative fluxes that led to small, but significant decreases in optimality and increased the distance to the predicted optimal flux distribution. In contrast, seven populations evolved on the poor substrate lactate for 900 generations collectively became more optimal and had flux distributions that moved toward predictions. For three pairs of central metabolic knockouts evolved on glucose for 600-800 generations, there was a balance between cases where optimality and flux patterns moved toward or away from FBA predictions. Despite this variation in predictability of changes in central metabolism, two generalities emerged. First, improved growth largely derived from evolved increases in the rate of substrate use. Second, FBA predictions bore out well for the two experiments initiated with ancestors with relatively sub-optimal yield, whereas those begun already quite optimal tended to move somewhat away from predictions. These findings suggest that the tradeoff between rate and yield is surprisingly modest. The observed positive correlation between rate and yield when adaptation initiated further from the optimum resulted in the ability of FBA to use stoichiometric constraints to predict the evolution of metabolism despite selection for rate.


Subjects
Molecular Evolution, Metabolism, Carbon Isotopes/metabolism, Escherichia coli/genetics, Escherichia coli/growth & development, Escherichia coli/metabolism, Glucose/metabolism, Lactic Acid/metabolism
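
For readers unfamiliar with FBA, the optimality model tested in the abstract above is conventionally written as a linear program. The notation below is the textbook form and is an assumption here, not taken from the article: S is the stoichiometric matrix, v the vector of reaction fluxes, c the weights that define the chosen optimality criterion (for example, biomass yield), and v^min, v^max the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for each reaction } j .
\end{aligned}
```

The constraint S v = 0 expresses steady state: every internal metabolite is produced as fast as it is consumed. Changing c changes the optimality criterion, which is how multiple criteria can be compared against measured fluxes.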
4.
BMC Evol Biol; 12: 151, 2012 Aug 21.
Article in English | MEDLINE | ID: mdl-22909317

ABSTRACT

BACKGROUND: Specialization for ecological niches is a balance of evolutionary adaptation and its accompanying tradeoffs. Here we focus on the Lenski Long-Term Evolution Experiment, which has maintained cultures of Escherichia coli in the same defined seasonal environment for 50,000 generations. Over this time, much adaptation and specialization to the environment has occurred. The presence of citrate in the growth media selected one lineage to gain the novel ability to utilize citrate as a carbon source after 31,000 generations. Here we test whether other strains have specialized to rely on citrate after 50,000 generations. RESULTS: We show that in addition to the citrate-catabolizing strain, three other lineages evolving in parallel have acquired a dependence on citrate for optimal growth on glucose. None of these strains were stimulated indirectly by the sodium present in disodium citrate, nor exhibited even partial utilization of citrate as a carbon source. Instead, all three of these citrate-stimulated populations appear to rely on it as a chelator of iron. CONCLUSIONS: The strains we examine here have evolved specialization to their environment through apparent loss of function. Our results are most consistent with the accumulation of mutations in iron transport genes that were obviated by abundant citrate. The results present another example where a subtle decision in the design of an evolution experiment led to unexpected evolutionary outcomes.


Subjects
Physiological Adaptation/genetics, Citrates/metabolism, Escherichia coli/growth & development, Molecular Evolution, Glucose/metabolism, Carbon Isotopes/analysis, Chelating Agents/metabolism, Escherichia coli/genetics, Iron/metabolism, Sodium Citrate
5.
PLoS One; 17(3): e0265020, 2022.
Article in English | MEDLINE | ID: mdl-35286324

ABSTRACT

Engineered proteins generally must possess a stable structure in order to achieve their designed function. Stable designs, however, are astronomically rare within the space of all possible amino acid sequences. As a consequence, many designs must be tested computationally and experimentally in order to find stable ones, which is expensive in terms of time and resources. Here we use a high-throughput, low-fidelity assay to experimentally evaluate the stability of approximately 200,000 novel proteins. These include a wide range of sequence perturbations, providing a baseline for future work in the field. We build a neural network model that predicts protein stability given only sequences of amino acids, and compare its performance to the assayed values. We also report another network model that is able to generate the amino acid sequences of novel stable proteins given requested secondary sequences. Finally, we show that the predictive model, despite weaknesses including a noisy data set, can be used to substantially increase the stability of both expert-designed and model-generated proteins.


Subjects
Computer Neural Networks, Proteins, Amino Acid Sequence, Amino Acids, Protein Stability, Proteins/chemistry
6.
Science; 343(6177): 1366-9, 2014 Mar 21.
Article in English | MEDLINE | ID: mdl-24603152

ABSTRACT

Ecological opportunities promote population divergence into coexisting lineages. However, the genetic mechanisms that enable new lineages to exploit these opportunities are poorly understood except in cases of single mutations. We examined how two Escherichia coli lineages diverged from their common ancestor at the outset of a long-term coexistence. By sequencing genomes and reconstructing the genetic history of one lineage, we showed that three mutations together were sufficient to produce the frequency-dependent fitness effects that allowed this lineage to invade and stably coexist with the other. These mutations all affected regulatory genes and collectively caused substantial metabolic changes. Moreover, the particular derived alleles were critical for the initial divergence and invasion, indicating that the establishment of this polymorphism depended on specific epistatic interactions.


Subjects
Genetic Epistasis, Escherichia coli/genetics, Escherichia coli/physiology, Mutation, Genetic Polymorphism, Alleles, Escherichia coli/metabolism, Molecular Evolution, Bacterial Genes, Regulator Genes, Genetic Fitness, Genotype, Glucose/metabolism, Microbial Interactions
7.
Cell Rep; 7(4): 1104-15, 2014 May 22.
Article in English | MEDLINE | ID: mdl-24794435

ABSTRACT

The interspecies exchange of metabolites plays a key role in the spatiotemporal dynamics of microbial communities. This raises the question of whether ecosystem-level behavior of structured communities can be predicted using genome-scale metabolic models for multiple organisms. We developed a modeling framework that integrates dynamic flux balance analysis with diffusion on a lattice and applied it to engineered communities. First, we predicted and experimentally confirmed the species ratio to which a two-species mutualistic consortium converges and the equilibrium composition of a newly engineered three-member community. We next identified a specific spatial arrangement of colonies, which gives rise to what we term the "eclipse dilemma": does a competitor placed between a colony and its cross-feeding partner benefit or hurt growth of the original colony? Our experimentally validated finding that the net outcome is beneficial highlights the complex nature of metabolic interactions in microbial communities while at the same time demonstrating their predictability.


Subjects
Ecosystem, Microbiota/physiology, Biological Models, Spatial Behavior/physiology, Spatio-Temporal Analysis
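
The "diffusion on a lattice" half of the framework described in the abstract above can be pictured with a small sketch. It covers only one explicit diffusion update of a metabolite on a 2D grid with closed (no-flux) edges; the flux balance step that would consume and secrete metabolites in each grid cell is omitted. The function name, the `rate` parameter, and the boundary treatment are assumptions for illustration and are not taken from the article.

```python
from typing import List

Grid = List[List[float]]


def diffuse_step(grid: Grid, rate: float) -> Grid:
    """One explicit diffusion update on a 2D lattice with no-flux edges.

    Each cell exchanges rate * (neighbor - self) with its up/down/left/right
    neighbors. The exchange is symmetric, so total metabolite is conserved.
    Keep rate <= 0.25 so the explicit update stays stable.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    new_grid[r][c] += rate * (grid[nr][nc] - grid[r][c])
    return new_grid


# A single unit of metabolite placed in the middle of a 3x3 lattice.
lattice = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
after_one_step = diffuse_step(lattice, rate=0.1)
total = sum(sum(row) for row in after_one_step)
```

In a dynamic FBA loop, a step like this would alternate with a per-cell flux balance calculation that updates biomass and local metabolite amounts.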
8.
Nat Genet; 43(12): 1275-80, 2011 Nov 13.
Article in English | MEDLINE | ID: mdl-22081229

ABSTRACT

Bacterial pathogens evolve during the infection of their human host(1-8), but separating adaptive and neutral mutations remains challenging(9-11). Here we identify bacterial genes under adaptive evolution by tracking recurrent patterns of mutations in the same pathogenic strain during the infection of multiple individuals. We conducted a retrospective study of a Burkholderia dolosa outbreak among subjects with cystic fibrosis, sequencing the genomes of 112 isolates collected from 14 individuals over 16 years. We find that 17 bacterial genes acquired nonsynonymous mutations in multiple individuals, which indicates parallel adaptive evolution. Mutations in these genes affect important pathogenic phenotypes, including antibiotic resistance and bacterial membrane composition and implicate oxygen-dependent regulation as paramount in lung infections. Several genes have not previously been implicated in pathogenesis and may represent new therapeutic targets. The identification of parallel molecular evolution as a pathogen spreads among multiple individuals points to the key selection forces it experiences within human hosts.


Subjects
Burkholderia Infections/microbiology, Burkholderia/genetics, Molecular Evolution, Bacterial Genes, Biological Adaptation, Anti-Bacterial Agents/pharmacology, Bacteremia/microbiology, Burkholderia/drug effects, Burkholderia/pathogenicity, Burkholderia Infections/epidemiology, Ciprofloxacin/pharmacology, Bacterial Drug Resistance, Epidemics, Bacterial Genome, Host-Pathogen Interactions, Humans, Likelihood Functions, Lipopolysaccharides/genetics, Lung Diseases/microbiology, Phylogeny, Single Nucleotide Polymorphism, Retrospective Studies, Genetic Selection, Virulence Factors/genetics
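
The core bookkeeping behind the abstract above, finding genes that acquired nonsynonymous mutations in more than one individual, can be sketched as follows. The record layout, the gene and patient identifiers, and the threshold of two individuals are assumptions for illustration; the published analysis involves statistical testing that this sketch does not attempt.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def genes_mutated_in_multiple_individuals(
    mutations: Iterable[Tuple[str, str, bool]],
    min_individuals: int = 2,
) -> Dict[str, int]:
    """Count, per gene, the distinct individuals with a nonsynonymous mutation.

    mutations: (individual_id, gene, is_nonsynonymous) records (hypothetical).
    Returns only genes hit in at least min_individuals distinct individuals.
    """
    individuals_by_gene: Dict[str, Set[str]] = defaultdict(set)
    for individual_id, gene, is_nonsynonymous in mutations:
        if is_nonsynonymous:
            individuals_by_gene[gene].add(individual_id)
    return {
        gene: len(individuals)
        for gene, individuals in individuals_by_gene.items()
        if len(individuals) >= min_individuals
    }


# Hypothetical records, for illustration only.
mutations = [
    ("P1", "geneA", True),
    ("P2", "geneA", True),   # geneA: nonsynonymous in two individuals
    ("P1", "geneB", True),
    ("P1", "geneB", True),   # geneB: two hits, but in the same individual
    ("P2", "geneC", False),
    ("P3", "geneC", False),  # geneC: synonymous only, so it is ignored
]
parallel_genes = genes_mutated_in_multiple_individuals(mutations)
print(parallel_genes)
```

Counting distinct individuals rather than raw mutations keeps repeated hits within one patient from being mistaken for parallel evolution across patients.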