1.
Antonie Van Leeuwenhoek ; 113(2): 175-183, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31522373

ABSTRACT

Microbial communities are important regulators of many processes in all ecosystems. Understanding ecosystem processes requires at least an overview of the microorganisms involved. While in-depth identification of microbial species in environmental samples can be achieved by next-generation sequencing, profiling of whole microbial communities can be accomplished via less labour-intensive approaches. Automated ribosomal intergenic spacer analysis (ARISA) is of particular interest, as it is highly specific even at fine scales and widely applicable to environmental samples. Yet established protocols cannot compare prokaryotic and eukaryotic communities, because different primer sets are necessary. However, shifts in the eukaryote-to-prokaryote ratio can be a useful indicator of ecosystem processes such as decomposition or nutrient cycling. We propose a protocol to analyse prokaryotic and eukaryotic communities in a single reaction with one primer pair, targeting a region of variable length in the small ribosomal subunit (V4, which is about 180 bp shorter in prokaryotes than in eukaryotes) flanked by two highly conserved regions. Shifts in the prokaryote-to-eukaryote ratio between samples can be reliably detected by fragment length polymorphism analysis as well as by sequencing of this region. Together with established approaches such as ARISA or 16S and ITS rDNA sequencing, this can provide a more complex insight into microbial community shifts and ecosystem processes.


Subjects
DNA, Ribosomal/genetics; Sequence Analysis, DNA/methods; Ecosystem; Eukaryota/metabolism; High-Throughput Nucleotide Sequencing; Phylogeny; Polymerase Chain Reaction; Prokaryotic Cells/metabolism; RNA, Ribosomal, 16S/genetics
2.
Biochemistry (Mosc) ; 83(2): 129-139, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29618299

ABSTRACT

Many proteins must recognize specific DNA sequences in order to function. The number of recognition sites and their distribution along the DNA might be of biological importance. For example, the number of restriction sites is often reduced in prokaryotic and phage genomes to decrease the probability of DNA cleavage by restriction endonucleases. We call a sequence exceptional if its frequency in a genome differs significantly from the one predicted by some mathematical model. An exceptional sequence can be either under- or over-represented, depending on its frequency in comparison with the predicted one. Exceptional sequences can be considered biologically meaningful, for example, as targets of DNA-binding proteins or as parts of abundant repetitive elements. Several methods are used to predict the frequency of a short sequence in a genome from the actual frequencies of certain of its subsequences. The most popular are methods based on Markov chain models. However, no rigorous comparison of these methods has previously been performed. We compared three methods for the prediction of short sequence frequencies: the maximum-order Markov chain model-based method, the method that uses the geometric mean of extended Markovian estimates, and the method that utilizes frequencies of all subsequences, including discontiguous ones. We applied them to restriction sites in complete genomes of 2500 prokaryotic species and demonstrated that the results depend greatly on the method used: lists of the 5% most under-represented sites differed by up to 50%. The method designed by Burge and coauthors in 1992, which utilizes all subsequences of the sequence, showed a higher precision than the other two methods both on prokaryotic genomes and on randomly generated sequences after computational imitation of selective pressure. We propose this method as the first choice for detection of exceptional sequences in prokaryotic genomes.
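
As an illustration of the first of the three approaches named in the abstract, the maximum-order Markov chain estimate of a word's expected count can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, only one strand is scanned, and the estimate used is the commonly cited form E(w1..wk) = N(w1..wk-1) · N(w2..wk) / N(w2..wk-1), where N counts overlapping occurrences.

```python
def count_overlapping(seq: str, word: str) -> int:
    """Count occurrences of `word` in `seq`, allowing overlaps."""
    if not word or len(word) > len(seq):
        return 0
    return sum(
        1 for i in range(len(seq) - len(word) + 1) if seq.startswith(word, i)
    )


def expected_count_max_order(seq: str, word: str) -> float:
    """Maximum-order Markov estimate of the expected count of `word`.

    Assumed formula (hypothetical helper, single strand only):
    N(prefix) * N(suffix) / N(core), where prefix drops the last letter,
    suffix drops the first letter, and core drops both.
    """
    if len(word) < 3:
        raise ValueError("word must have at least 3 letters")
    prefix, suffix, core = word[:-1], word[1:], word[1:-1]
    core_count = count_overlapping(seq, core)
    if core_count == 0:
        return 0.0
    return (
        count_overlapping(seq, prefix) * count_overlapping(seq, suffix) / core_count
    )


def observed_to_expected(seq: str, word: str) -> float:
    """Ratio below 1 suggests under-representation, above 1 over-representation."""
    expected = expected_count_max_order(seq, word)
    if expected == 0:
        return float("nan")
    return count_overlapping(seq, word) / expected
```

For a restriction site such as GAATTC, a ratio well below 1 across a genome would flag the site as a candidate under-represented (exceptional) sequence under this particular model; assessing significance would need an additional statistical step not shown here.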


Subjects
Genome, Archaeal; Genome, Bacterial; Genomics/methods; Prokaryotic Cells/metabolism; Base Sequence; DNA-Binding Proteins/chemistry; DNA-Binding Proteins/genetics; Databases, Genetic; Markov Chains
3.
J Theor Biol ; 437: 222-224, 2018 01 21.
Article in English | MEDLINE | ID: mdl-29080779

ABSTRACT

A variety of evolutionary processes in biology can be viewed as settings where organisms 'catalyse' the formation of new types of organisms. One example, relevant to the origin of life, is where transient biological colonies (e.g. prokaryotes or protocells) give rise to new colonies via lateral gene transfer. In this short note, we describe and analyse a simple random process which models such settings. By applying theory from general birth-death processes, we describe how the survival of a population under catalytic diversification depends on the interplay of the catalysis rate and the initial population size. We also note how such a process can be viewed within the framework of 'self-sustaining autocatalytic networks'.


Subjects
Algorithms; Artificial Cells/metabolism; Gene Transfer, Horizontal; Models, Genetic; Prokaryotic Cells/metabolism; Computer Simulation; Evolution, Molecular; Genome, Bacterial/genetics; Markov Chains; Origin of Life
4.
Mol Biol Evol ; 35(1): 211-224, 2018 01 01.
Article in English | MEDLINE | ID: mdl-29106597

ABSTRACT

Prokaryotes evolved to thrive in an extremely diverse set of habitats, and their proteomes bear signatures of environmental conditions. Although correlations between amino acid usage and environmental temperature are well-documented, understanding of the mechanisms of thermal adaptation remains incomplete. Here, we couple the energetic costs of protein folding and protein homeostasis to build a microscopic model explaining both the overall amino acid composition and its temperature trends. Low biosynthesis costs lead to low diversity of physical interactions between amino acid residues, which in turn makes proteins less stable and drives up chaperone activity to maintain appropriate levels of folded, functional proteins. Assuming that the cost of chaperone activity is proportional to the fraction of unfolded client proteins, we simulated thermal adaptation of model proteins subject to minimization of the total cost of amino acid synthesis and chaperone activity. For the first time, we predicted both the proteome-average amino acid abundances and their temperature trends simultaneously, and found strong correlations between model predictions and 402 genomes of bacteria and archaea. The energetic constraint on protein evolution is more apparent in highly expressed proteins, selected by codon adaptation index. We found that in bacteria, highly expressed proteins are similar in composition to thermophilic ones, whereas in archaea no correlation between predicted expression level and thermostability was observed. At the same time, thermal adaptations of highly expressed proteins in bacteria and archaea are nearly identical, suggesting that universal energetic constraints prevail over the phylogenetic differences between these domains of life.


Subjects
Adaptation, Physiological/physiology; Prokaryotic Cells/metabolism; Proteostasis/physiology; Acclimatization/genetics; Acclimatization/physiology; Adaptation, Physiological/genetics; Amino Acids/genetics; Archaea/genetics; Archaea/metabolism; Bacteria/genetics; Bacteria/metabolism; Biological Evolution; Codon/metabolism; Computer Simulation; Evolution, Molecular; Hot Temperature; Phylogeny; Prokaryotic Cells/physiology; Proteome/genetics; Temperature
5.
J Math Biol ; 75(5): 1253-1283, 2017 11.
Article in English | MEDLINE | ID: mdl-28289838

ABSTRACT

This paper analyzes, in the context of a prokaryotic cell, the stochastic variability of the number of proteins when gene expression is controlled by an autoregulation scheme. The goal of this work is to estimate how efficiently the regulation limits the fluctuations of the number of copies of a given protein. The autoregulation considered in this paper relies mainly on a negative feedback: the proteins are repressors of their own gene expression. The efficiency of a production process without feedback control is compared to that of a production process with autoregulation of gene expression, assuming that both produce the same average number of proteins. The main characteristic used for the comparison is the standard deviation of the number of proteins at equilibrium. With a Markovian representation and a simple model of repression, we prove that, under a scaling regime, the repression mechanism follows a Hill repression scheme with a hyperbolic control. An explicit asymptotic expression of the variance of the number of proteins under this regulation mechanism is obtained. Simulations are used to study other aspects of autoregulation, such as the rate of convergence to equilibrium of the production process and the case where the control of the production process of proteins is achieved via the inhibition of mRNAs.


Subjects
Gene Expression Regulation; Models, Genetic; Feedback, Physiological; Homeostasis; Markov Chains; Mathematical Concepts; Prokaryotic Cells/metabolism; Protein Biosynthesis/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; Stochastic Processes
6.
BMC Evol Biol ; 16(1): 215, 2016 10 18.
Article in English | MEDLINE | ID: mdl-27756227

ABSTRACT

BACKGROUND: A defining feature of eukaryotic cells is the presence of various distinct membrane-bound compartments with different metabolic roles. Material exchange between most compartments occurs via a sophisticated vesicle trafficking system. This intricate cellular architecture of eukaryotes appears to have emerged suddenly, about 2 billion years ago, from much less complex ancestors. How the eukaryotic cell acquired its internal complexity is poorly understood, partly because no prokaryotic precursors have been found for many key factors involved in compartmentalization. One exception is the Cdc48 protein family, which consists of several distinct classical ATPases associated with various cellular activities (AAA+) proteins with two consecutive AAA domains. RESULTS: Here, we have classified the Cdc48 family through iterative use of hidden Markov models and tree building. We found only one type, Cdc48, in prokaryotes, although a set of eight diverged members that function at distinct subcellular compartments were retrieved from eukaryotes and were probably present in the last eukaryotic common ancestor (LECA). Pronounced changes in sequence and domain structure during the radiation into the LECA set are delineated. Moreover, our analysis brings to light lineage-specific losses and duplications that often reflect important biological changes. Remarkably, we also found evidence for internal duplications within the LECA set that probably occurred during the rise of the eukaryotic cell. CONCLUSIONS: Our analysis corroborates the idea that the diversification of the Cdc48 family is closely intertwined with the development of the compartments of the eukaryotic cell.


Subjects
Adenosine Triphosphatases/chemistry; Cell Cycle Proteins/chemistry; Eukaryotic Cells/metabolism; Evolution, Molecular; Adenosine Triphosphatases/genetics; Biological Evolution; Cell Cycle Proteins/genetics; Eukaryotic Cells/cytology; Eukaryotic Cells/ultrastructure; Markov Chains; Phylogeny; Prokaryotic Cells/cytology; Prokaryotic Cells/metabolism; Prokaryotic Cells/ultrastructure; Protein Domains; Valosin Containing Protein
7.
BMC Bioinformatics ; 17(1): 260, 2016 Jun 30.
Article in English | MEDLINE | ID: mdl-27363390

ABSTRACT

BACKGROUND: Comparative analysis of whole genome sequence data from closely related prokaryotic species or strains is becoming an increasingly important and accessible approach for addressing both fundamental and applied biological questions. While there are a number of excellent tools developed for performing this task, most scale poorly when faced with hundreds of genome sequences, and many require extensive manual curation. RESULTS: We have developed a de-novo genome analysis pipeline (DeNoGAP) for the automated, iterative and high-throughput analysis of data from comparative genomics projects involving hundreds of whole genome sequences. The pipeline is designed to perform reference-assisted and de novo gene prediction, homolog protein family assignment, ortholog prediction, functional annotation, and pan-genome analysis using a range of proven tools and databases. While most existing methods scale quadratically with the number of genomes since they rely on pairwise comparisons among predicted protein sequences, DeNoGAP scales linearly since the homology assignment is based on iteratively refined hidden Markov models. This iterative clustering strategy enables DeNoGAP to handle a very large number of genomes using minimal computational resources. Moreover, the modular structure of the pipeline permits easy updates as new analysis programs become available. CONCLUSION: DeNoGAP integrates bioinformatics tools and databases for comparative analysis of a large number of genomes. The pipeline offers tools and algorithms for annotation and analysis of completed and draft genome sequences. The pipeline was developed using Perl, BioPerl and SQLite on Ubuntu Linux version 12.04 LTS. Currently, the software package is accompanied by a script for automated installation of the necessary external programs on Ubuntu Linux; however, the pipeline should also be compatible with other Linux and Unix systems once the necessary external programs are installed.
DeNoGAP is freely available at https://sourceforge.net/projects/denogap/.


Subjects
Genome; Genomics/methods; Prokaryotic Cells/metabolism; Software; Algorithms; Amino Acid Sequence; Cluster Analysis; Computational Biology; Markov Chains; Molecular Sequence Annotation; Reproducibility of Results; Sequence Homology, Nucleic Acid
8.
Mol Phylogenet Evol ; 96: 102-111, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26724405

ABSTRACT

Traditional methods for sequence comparison and phylogeny reconstruction rely on pairwise and multiple sequence alignments. However, alignment cannot be directly applied to whole genome/proteome comparison and phylogenomic studies because of its high computational complexity. Hence alignment-free methods have become popular in recent years. Here we propose a fast alignment-free method for whole genome/proteome comparison and phylogeny reconstruction using a higher-order Markov model and chaos game representation. In the present method, we use the transition matrices of higher-order Markov models to characterize amino acid or DNA sequences for their comparison. The order of the Markov model is uniquely identified by maximizing the average Shannon entropy of conditional probability distributions. Using one-dimensional chaos game representation and a linked list, this method reduces the large memory and time consumption caused by the large-scale conditional probability distributions. To illustrate the effectiveness of our method, we employ it for fast phylogeny reconstruction based on genome/proteome sequences of two species data sets used in previously published papers. Our results demonstrate that the present method is useful and efficient. AVAILABILITY AND IMPLEMENTATION: The source code for our algorithm to compute the distance matrix, together with the genome/proteome sequences, can be downloaded from ftp://121.199.20.25/. The Phylip and EvolView software we used to construct phylogenetic trees is available from the respective websites.
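
The idea of characterizing a sequence by the transition probabilities of an order-m Markov model, and scoring each order by the average Shannon entropy of its conditional distributions, can be sketched as below. This is only an illustrative reading of the abstract: the function names are hypothetical, the average is taken unweighted over the contexts actually observed (an assumption), and the chaos game representation and linked-list storage mentioned in the abstract are not reproduced.

```python
from collections import Counter, defaultdict
from math import log2


def transition_probabilities(seq: str, order: int) -> dict:
    """Estimate P(next symbol | preceding `order` symbols) from one sequence."""
    counts = defaultdict(Counter)
    for i in range(len(seq) - order):
        counts[seq[i:i + order]][seq[i + order]] += 1
    probs = {}
    for context, nxt in counts.items():
        total = sum(nxt.values())
        probs[context] = {symbol: c / total for symbol, c in nxt.items()}
    return probs


def average_conditional_entropy(seq: str, order: int) -> float:
    """Unweighted mean Shannon entropy (bits) over observed contexts.

    Assumption: the averaging scheme is a guess, not taken from the paper.
    """
    probs = transition_probabilities(seq, order)
    if not probs:
        return 0.0
    entropies = [
        -sum(p * log2(p) for p in dist.values()) for dist in probs.values()
    ]
    return sum(entropies) / len(entropies)


def select_order(seq: str, max_order: int) -> int:
    """Pick the order in 1..max_order with the largest average entropy."""
    return max(
        range(1, max_order + 1),
        key=lambda m: average_conditional_entropy(seq, m),
    )
```

Two genomes could then be compared by a distance between their transition tables at the selected order; which distance to use is left open here because the abstract does not specify it.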


Subjects
Genome/genetics; Markov Chains; Nonlinear Dynamics; Phylogeny; Prokaryotic Cells/metabolism; Proteome/genetics; Algorithms; Sequence Alignment; Software
9.
BMC Genomics ; 14: 700, 2013 Oct 11.
Article in English | MEDLINE | ID: mdl-24118975

ABSTRACT

BACKGROUND: Insertion Sequences (ISs) and their non-autonomous derivatives (MITEs) are important components of prokaryotic genomes, inducing duplications, deletions, rearrangements or lateral gene transfers. Although ISs and MITEs are relatively simple and basic genetic elements, their detection remains a difficult task due to their remarkable sequence diversity. With the advent of high-throughput genome and metagenome sequencing technologies, the development of fast, reliable and sensitive methods for IS and MITE detection has become an important challenge. So far, almost all studies dealing with prokaryotic transposons have used classical BLAST-based detection methods against reference libraries. Here we introduce alternative detection methods that either take advantage of the structural properties of the elements (de novo methods) or use an additional library-based method relying on profile HMM searches. RESULTS: In this study, we have developed three different workflows dedicated to IS and MITE detection: the first two use de novo methods detecting either repeated sequences or the presence of Inverted Repeats; the third uses 28 in-house transposase alignment profiles with HMM search methods. We have compared the respective performances of each method using a reference dataset of 30 archaeal and 30 bacterial genomes in addition to simulated and real metagenomes. Compared to a BLAST-based method using ISFinder as library, de novo methods significantly improve IS and MITE detection. For example, in the 30 archaeal genomes, we discovered 30 new elements (+20%) in addition to the 141 multi-copy elements already detected by the BLAST approach. Many of the new elements correspond to ISs belonging to unknown or highly divergent families. The total number of MITEs has even doubled with the discovery of elements displaying very limited sequence similarities with their respective autonomous partners (mainly in the Inverted Repeats of the elements).
Concerning metagenomes, with the exception of short-read data (<300 bp), for which both techniques seem equally limited, profile HMM searches considerably improve the detection of transposase-encoding genes (up to +50%) while generating a low level of false positives compared to BLAST-based methods. CONCLUSION: Compared to classical BLAST-based methods, the sensitivity of the de novo and profile HMM methods developed in this study allows a better and more reliable detection of transposons in prokaryotic genomes and metagenomes. We believe that future studies involving IS and MITE identification in genomic data should combine at least one de novo and one library-based method, with optimal results obtained by running the two de novo methods in addition to a library-based search. For metagenomic data, profile HMM searches should be favored; a BLAST-based step is only useful for the final annotation into groups and families.


Subjects
Computational Biology/methods; DNA Transposable Elements/genetics; Markov Chains; Prokaryotic Cells/metabolism; Archaea/genetics; Bacteria/genetics; Base Sequence; Databases, Nucleic Acid; Genome, Archaeal/genetics; Genome, Bacterial/genetics; Inverted Repeat Sequences/genetics; Metagenome/genetics; Molecular Sequence Data; Reference Standards; Reproducibility of Results
10.
Proc Natl Acad Sci U S A ; 110(24): 10039-44, 2013 Jun 11.
Article in English | MEDLINE | ID: mdl-23630264

ABSTRACT

Contrary to the textbook portrayal of glycolysis as a single pathway conserved across all domains of life, not all sugar-consuming organisms use the canonical Embden-Meyerhof-Parnas (EMP) glycolytic pathway. Prokaryotic glucose metabolism is particularly diverse, including several alternative glycolytic pathways, the most common of which is the Entner-Doudoroff (ED) pathway. The prevalence of the ED pathway is puzzling as it produces only one ATP per glucose, half as much as the EMP pathway. We argue that the diversity of prokaryotic glucose metabolism may reflect a tradeoff between a pathway's energy (ATP) yield and the amount of enzymatic protein required to catalyze pathway flux. We introduce methods for analyzing pathways in terms of thermodynamics and kinetics and show that the ED pathway is expected to require several-fold less enzymatic protein to achieve the same glucose conversion rate as the EMP pathway. Through genomic analysis, we further show that prokaryotes use different glycolytic pathways depending on their energy supply. Specifically, energy-deprived anaerobes overwhelmingly rely upon the higher ATP yield of the EMP pathway, whereas the ED pathway is common among facultative anaerobes and even more common among aerobes. In addition to demonstrating how protein costs can explain the use of alternative metabolic strategies, this study illustrates a direct connection between an organism's environment and the thermodynamic and biochemical properties of the metabolic pathways it employs.


Subjects
Adenosine Triphosphate/biosynthesis; Bacterial Proteins/metabolism; Glucose/metabolism; Glycolysis; Metabolic Networks and Pathways; Aerobiosis; Algorithms; Anaerobiosis; Bacteria/classification; Bacteria/genetics; Bacteria/metabolism; Bacterial Proteins/genetics; Energy Metabolism; Escherichia coli/genetics; Escherichia coli/metabolism; Kinetics; Models, Biological; Phylogeny; Prokaryotic Cells/metabolism; Species Specificity; Thermodynamics
11.
PLoS One ; 8(3): e59484, 2013.
Article in English | MEDLINE | ID: mdl-23544073

ABSTRACT

BACKGROUND: Recent studies on genome assembly from short-read sequencing data reported the limitation of this technology to reconstruct the entire genome even at very high depth of coverage. We investigated the limitation from the perspective of information theory to evaluate the effect of repeats on short-read genome assembly using idealized (error-free) reads at different lengths. METHODOLOGY/PRINCIPAL FINDINGS: We define a metric H(k) to be the entropy of sequencing reads at a read length k and use the relative loss of entropy ΔH(k) to measure the impact of repeats on the reconstruction of a whole genome from sequences of length k. In our experiments, we found that entropy loss correlates well with de-novo assembly coverage of a genome, and a score of ΔH(k)>1% indicates a severe loss in genome reconstruction fidelity. The minimal read lengths to achieve ΔH(k)<1% are different for various organisms and are independent of the genome size. For example, in order to meet the threshold of ΔH(k)<1%, a read length of 60 bp is needed for the sequencing of the human genome (3.2×10^9 bp) and 320 bp for the sequencing of the fruit fly (1.8×10^8 bp). We also calculated the ΔH(k) scores for 2725 prokaryotic chromosomes and plasmids at several read lengths. Our results indicate that the levels of repeats in different genomes are diverse and the entropy of sequencing reads provides a measurement for the repeat structures. CONCLUSIONS/SIGNIFICANCE: The proposed entropy-based measurement, which can be calculated in seconds to minutes in most cases, provides a rapid quantitative evaluation of the limitation of idealized short-read genome sequencing. Moreover, the calculation can be parallelized to scale up to large eukaryotic genomes. This approach may be useful to tune the sequencing parameters to achieve better genome assemblies when a closely related genome is already available.
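
The abstract does not give the formula for H(k) or ΔH(k), so the following is a hedged sketch of one plausible reading rather than the published definition. It assumes that the idealized reads are all length-k substrings of a single strand, that H(k) is the Shannon entropy of the resulting read distribution, and that ΔH(k) is the loss relative to the maximum entropy reached when every read is unique. The function names are hypothetical.

```python
from collections import Counter
from math import log2


def read_entropy(genome: str, k: int) -> float:
    """Shannon entropy (bits) of all error-free length-k reads of `genome`."""
    n = len(genome) - k + 1
    if n <= 0:
        raise ValueError("k must not exceed the genome length")
    counts = Counter(genome[i:i + k] for i in range(n))
    return -sum((c / n) * log2(c / n) for c in counts.values())


def relative_entropy_loss(genome: str, k: int) -> float:
    """Assumed definition: (H_max - H(k)) / H_max with H_max = log2(#reads).

    Returns 0.0 when every read is unique, i.e. repeats of length k are absent.
    """
    n = len(genome) - k + 1
    h_max = log2(n) if n > 0 else 0.0
    if h_max == 0:
        return 0.0
    return (h_max - read_entropy(genome, k)) / h_max
```

On the toy string "ACGTACGT", reads of length 4 contain a repeated read and so lose entropy, while reads of length 5 are all distinct and the loss drops to zero, which mirrors the abstract's point that a longer read length can resolve repeats.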


Subjects
Entropy; Genome/genetics; Repetitive Sequences, Nucleic Acid/genetics; Sequence Analysis, DNA/methods; Animals; Bacteria/genetics; Base Pairing/genetics; Base Sequence; Chromosomes/genetics; Chromosomes, Artificial, Bacterial/genetics; Humans; Prokaryotic Cells/metabolism
12.
BMC Genomics ; 11: 628, 2010 Nov 11.
Article in English | MEDLINE | ID: mdl-21070645

ABSTRACT

BACKGROUND: A central tenet in biochemistry for over 50 years has held that microorganisms, plants and, more recently, certain apicomplexan parasites synthesize essential aromatic compounds via elaboration of a complete shikimic acid pathway, whereas metazoans lacking this pathway require a dietary source of these compounds. The large number of sequenced bacterial and archaeal genomes now available for comparative genomic analyses allows the fundamentals of this contention to be tested in prokaryotes. Using Hidden Markov Model profiles (HMM profiles) to identify all known enzymes of the pathway, we report the presence of genes encoding shikimate pathway enzymes in the hypothetical proteomes constructed from the genomes of 488 sequenced prokaryotes. RESULTS: Amongst free-living prokaryotes most Bacteria possess, as expected, genes encoding a complete shikimic acid pathway, whereas of the culturable Archaea, only one was found to have a complete complement of recognisable enzymes in its predicted proteome. It may be that in the Archaea, the primary amino-acid sequences of enzymes of the pathway are highly divergent and so are not detected by HMM profiles. Alternatively, structurally unrelated (non-orthologous) proteins might be performing the same biochemical functions as those encoded by recognized genes of the shikimate pathway. Most surprisingly, 30% of host-associated (mutualistic, commensal and pathogenic) bacteria likewise do not possess a complete shikimic acid pathway. Many of these microbes show some degree of genome reduction, suggesting that these host-associated bacteria might sequester essential aromatic compounds from a parasitised host, as a 'shared metabolic adaptation' in mutualistic symbiosis, or obtain them from other consorts having the complete biosynthetic pathway. The HMM results gave 84% agreement when compared against data in the highly curated BioCyc reference database of genomes and metabolic pathways.
CONCLUSIONS: These results challenge the conventional belief that the shikimic acid pathway is universal and essential in prokaryotes. The possibilities that non-orthologous enzymes catalyse reactions in this pathway (especially in the Archaea), or that there exist specific uptake mechanisms for the acquisition of shikimate intermediates or essential pathway products, warrant further examination to better understand the precise metabolic attributes of host-beneficial and pathogenic bacteria.


Subjects
Genes, Bacterial/genetics; Host-Pathogen Interactions/genetics; Metabolic Networks and Pathways/genetics; Shikimic Acid/metabolism; Archaea/genetics; Archaea/metabolism; Bacteria/enzymology; Bacteria/genetics; Databases, Genetic; Markov Chains; Prokaryotic Cells/metabolism; Proteome/genetics; Sequence Analysis, DNA; Shikimic Acid/chemistry; Templates, Genetic
13.
BMC Genomics ; 11: 491, 2010 Sep 09.
Article in English | MEDLINE | ID: mdl-20828396

ABSTRACT

BACKGROUND: Out-of-frame stop codons (OSCs) occur naturally in coding sequences of all organisms, providing a mechanism of early termination of translation in incorrect reading frame so that the metabolic cost associated with frameshift events can be reduced. Given such a functional significance, we expect statistically overrepresented OSCs in coding sequences as a result of a widespread selection. Accordingly, we examined available prokaryotic genomes to look for evidence of this selection. RESULTS: The complete genome sequences of 990 prokaryotes were obtained from NCBI GenBank. We found that low G+C content coding sequences contain significantly more OSCs and G+C content at specific codon positions were the principal determinants of OSC usage bias in the different reading frames. To investigate if there is overrepresentation of OSCs, we modeled the trinucleotide and hexanucleotide biases of the coding sequences using Markov models, and calculated the expected OSC frequencies for each organism using a Monte Carlo approach. More than 93% of 342 phylogenetically representative prokaryotic genomes contain excess OSCs. Interestingly the degree of OSC overrepresentation correlates positively with G+C content, which may represent a compensatory mechanism for the negative correlation of OSC frequency with G+C content. We extended the analysis using additional compositional bias models and showed that lower-order bias like codon usage and dipeptide bias could not explain the OSC overrepresentation. The degree of OSC overrepresentation was found to correlate negatively with the optimal growth temperature of the organism after correcting for the G+C% and AT skew of the coding sequence. CONCLUSIONS: The present study uses approaches with statistical rigor to show that OSC overrepresentation is a widespread phenomenon among prokaryotes. 
Our results support the hypothesis that OSCs carry functional significance and have been selected in the course of genome evolution to act against unintended frameshift occurrences. Some results also hint that OSC overrepresentation is a compensatory mechanism that makes up for the decrease in OSCs in high G+C organisms, thus revealing the interplay between two different determinants of OSC frequency.
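
Counting the observed out-of-frame stop codons in a coding sequence, the quantity the abstract compares against model expectations, can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes the standard stop codons TAA, TAG and TGA (some prokaryotes use a genetic code in which TGA is not a stop) and it does not reproduce the Markov-model and Monte Carlo step used in the study to derive expected frequencies.

```python
# Assumption: standard genetic code; adjust for organisms that reassign TGA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_stops_in_frame(cds: str, offset: int) -> int:
    """Count stop codons read in triplets starting at `offset` (0, 1 or 2)."""
    cds = cds.upper()
    return sum(
        1
        for i in range(offset, len(cds) - 2, 3)
        if cds[i:i + 3] in STOP_CODONS
    )


def out_of_frame_stops(cds: str) -> dict:
    """Stop codon counts in the +1 and +2 frames of a coding sequence."""
    return {
        "+1": count_stops_in_frame(cds, 1),
        "+2": count_stops_in_frame(cds, 2),
    }
```

For the toy sequence ATGATAGCTTGA, the in-frame reading ends at the terminal TGA, while a ribosome shifted into the +1 frame would meet a stop codon almost immediately, which is the early-termination effect the abstract describes.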


Subjects
Codon, Terminator/genetics; Frameshift Mutation/genetics; Peptides/genetics; Prokaryotic Cells/metabolism; Reading Frames/genetics; Selection, Genetic/genetics; Amino Acid Sequence; Bacteria/genetics; Bacteria/growth & development; Base Composition/genetics; Base Sequence; Bias; Computer Simulation; Genome, Bacterial/genetics; Markov Chains; Molecular Sequence Data; Monte Carlo Method; Open Reading Frames/genetics; Peptides/chemistry; Principal Component Analysis; Temperature
14.
J Math Biol ; 61(2): 231-251, 2010 Aug.
Article in English | MEDLINE | ID: mdl-19756606

ABSTRACT

A simple model of gene regulation in response to stochastically changing environmental conditions is developed and analyzed. The model consists of a differential equation driven by a continuous time 2-state Markov process. The density function of the resulting process converges to a beta distribution. We show that the moments converge to their stationary values exponentially in time. Simulations of a two-stage process where protein production depends on mRNA concentrations are also presented demonstrating that protein concentration tracks the environment whenever the rate of protein turnover is larger than the rate of environmental change. Single-celled organisms are therefore expected to have relatively high mRNA and protein turnover rates for genes that respond to environmental fluctuations.
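
A differential equation driven by a two-state continuous-time Markov process can be simulated exactly between switching events. The sketch below assumes a specific linear form, dx/dt = a·E(t) − b·x with the environment E(t) switching between 0 and 1 at rates k_on and k_off; this form and all parameter names are illustrative assumptions, not taken from the paper.

```python
import math
import random


def simulate(a, b, k_on, k_off, t_end, x0=0.0, seed=0):
    """Simulate dx/dt = a*E(t) - b*x with E(t) a two-state Markov process.

    Hypothetical parameters: `a` production rate while E = 1, `b` decay rate,
    `k_on` / `k_off` switching rates 0 -> 1 and 1 -> 0.
    Returns a list of (time, x, state) recorded at each switching event.
    """
    rng = random.Random(seed)
    t, x, state = 0.0, x0, 0
    trajectory = [(t, x, state)]
    while t < t_end:
        rate = k_on if state == 0 else k_off
        next_t = min(t + rng.expovariate(rate), t_end)
        target = a * state / b  # fixed point of the ODE in the current state
        # Exact solution of the linear ODE over the dwell time.
        x = target + (x - target) * math.exp(-b * (next_t - t))
        t = next_t
        if t < t_end:
            state = 1 - state  # the environment switches
        trajectory.append((t, x, state))
    return trajectory
```

Under this assumed form x stays between 0 and a/b, and sampling long runs gives an empirical distribution that can be compared with the beta limit mentioned in the abstract. Raising b relative to the switching rates makes x follow the environment more closely, in line with the abstract's remark about turnover rates.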


Subjects
Environment; Gene Expression Regulation/physiology; Markov Chains; Models, Genetic; Algorithms; Computer Simulation; Eukaryotic Cells/metabolism; Prokaryotic Cells/metabolism; Proteins/genetics; Proteins/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism
15.
PLoS One ; 4(12): e8113, 2009 Dec 02.
Article in English | MEDLINE | ID: mdl-19956556

ABSTRACT

BACKGROUND: DNA word frequencies, normalized for genomic AT content, are remarkably stable within prokaryotic genomes and are therefore said to reflect a "genomic signature." The genomic signatures can be used to phylogenetically classify organisms from arbitrary sampled DNA. Genomic signatures can also be used to search for horizontally transferred DNA or DNA regions subjected to special selection forces. Thus, the stability of the genomic signature can be used as a measure of genomic homogeneity. The factors associated with the stability of the genomic signatures are not known, and this motivated us to investigate further. We analyzed the intra-genomic variance of genomic signatures based on AT content normalization (0th-order Markov model) as well as genomic signatures normalized by smaller DNA words (1st- and 2nd-order Markov models) for 636 sequenced prokaryotic genomes. Regression models were fitted, with intra-genomic signature variance as the response variable, to a set of factors representing genomic properties such as genomic AT content, genome size, habitat, phylum, oxygen requirement, optimal growth temperature and oligonucleotide usage variance (OUV, a measure of oligonucleotide usage bias), measured as the variance between genomic tetranucleotide frequencies and Markov chain approximated tetranucleotide frequencies, as predictors. PRINCIPAL FINDINGS: Regression analysis revealed that OUV was the most important factor (p<0.001) determining intra-genomic homogeneity as measured using genomic signatures. This means that the less random the oligonucleotide usage is in the sense of higher OUV, the more homogeneous the genome is in terms of the genomic signature. The other factors influencing variance in the genomic signature (p<0.001) were genomic AT content, phylum and oxygen requirement.
CONCLUSIONS: Genomic homogeneity in prokaryotes is intimately linked to genomic GC content, oligonucleotide usage bias (OUV) and aerobiosis, while oligonucleotide usage bias (OUV) is associated with genomic GC content, aerobiosis and habitat.
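
As a rough illustration of the quantities named in this abstract, the sketch below compares observed tetranucleotide frequencies with those expected under a 0th-order Markov model (independent bases) and summarizes the gap as a single number. The function names and the exact formula (mean squared difference between observed and expected frequencies) are assumptions made for this example; the abstract does not give the authors' precise definition of OUV.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def word_frequencies(seq, k):
    """Relative frequency of every overlapping k-mer over A, C, G, T."""
    seq = seq.upper()
    words = (seq[i:i + k] for i in range(len(seq) - k + 1))
    # Skip words containing ambiguous symbols such as N.
    counts = Counter(w for w in words if set(w) <= set(BASES))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no unambiguous words of length k")
    all_words = ("".join(p) for p in product(BASES, repeat=k))
    return {w: counts[w] / total for w in all_words}


def zero_order_expected(seq, k):
    """Expected k-mer frequencies if bases were drawn independently."""
    base = word_frequencies(seq, 1)
    expected = {}
    for p in product(BASES, repeat=k):
        prob = 1.0
        for b in p:
            prob *= base[b]
        expected["".join(p)] = prob
    return expected


def ouv_like(seq, k=4):
    """Mean squared gap between observed and expected k-mer frequencies.

    Hypothetical stand-in for OUV; not the published definition.
    """
    observed = word_frequencies(seq, k)
    expected = zero_order_expected(seq, k)
    return sum((observed[w] - expected[w]) ** 2 for w in observed) / len(observed)
```

Under this definition, a sequence whose tetranucleotide usage matches what its base composition alone predicts scores near zero, while a strongly biased sequence scores higher. Extending the expectation to 1st- or 2nd-order Markov models would replace the product of base frequencies with products of conditional word frequencies.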


Subjects
Genome/genetics , Prokaryotic Cells/metabolism , Bacillus cereus/genetics , Bias , Escherichia coli/genetics , Bacterial Genome/genetics , Markov Chains , Genetic Models , Oligonucleotides/genetics , Regression Analysis
17.
Drug Discov Today ; 12(7-8): 319-26, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17395092

ABSTRACT

Sialic acids are structurally diverse molecules that have important roles in the physiological reactions and characteristics of prokaryotes and eukaryotes. These include the ability to mask epitopes on underlying glycan chains and to repulse negatively charged moieties. Here, we describe the metabolism and immunological relevance of sialic acids and outline how their properties have been exploited by the pharmaceutical industry to enhance the therapeutic properties of proteins such as asparaginase and darbepoetin alpha.


Subjects
Glycoproteins/chemistry , Pharmaceutical Preparations/chemistry , Sialic Acids/chemistry , Animals , Drug Design , Drug Industry/methods , Drug Industry/trends , Eukaryotic Cells/metabolism , Glycoproteins/therapeutic use , Humans , Molecular Structure , Prokaryotic Cells/metabolism , Sialic Acids/immunology , Sialic Acids/metabolism
18.
Comb Chem High Throughput Screen ; 9(7): 501-14, 2006 Aug.
Article in English | MEDLINE | ID: mdl-16925511

ABSTRACT

The discovery and development of novel drug candidates has changed dramatically over the last two decades. Older methods for identifying lead compounds are not suited to screening the large libraries generated by combinatorial chemistry techniques. High-throughput screening (HTS) has become irreplaceable, and hundreds of different approaches have been described. Assays based on purified components are complemented by whole-cell-based assays, in which reporter genes are used to monitor, directly or indirectly, the influence of a chemical on the metabolism of living cells. The most convenient and widely used reporters for real-time measurements are luciferases, light-emitting enzymes from evolutionarily distant organisms. Autofluorescent proteins have also been extensively employed, but have proved more suitable for end-point measurements, in situ applications (such as the localization of fusion proteins in specific subcellular compartments) or environmental studies on microbial populations. The trend toward miniaturization and the technical advances in detection and liquid handling systems will enable ultra-high-throughput screening (uHTS), with 100,000 compounds routinely screened each day. Here we show how similar approaches may also be applied to the search for new and potent antimicrobial agents.


Subjects
Anti-Infective Agents/pharmacology , Bacterial Proteins/genetics , Biological Assay/methods , Biosensing Techniques/methods , Bacterial Proteins/metabolism , Biological Assay/economics , Biosensing Techniques/economics , Combinatorial Chemistry Techniques/economics , Combinatorial Chemistry Techniques/methods , Cytological Techniques , Preclinical Drug Evaluation/methods , Reporter Genes , Luciferases/genetics , Luciferases/metabolism , Luminescence , Prokaryotic Cells/cytology , Prokaryotic Cells/metabolism
19.
PLoS Comput Biol ; 1(6): e60, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16292354

ABSTRACT

Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21-37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer "immunity" against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated.
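
The repeat-spacer layout described in this abstract (direct repeats of 21-37 bp separated by regularly sized spacers) can be illustrated with a deliberately naive scan. The sketch below is not the method used in the study, which relied on multiple sequence alignments and hidden Markov models of Cas proteins. It only looks for exact copies of a fixed-length word whose consecutive occurrences are separated by gaps within an assumed range; the function name, the default spacer bounds and the minimum copy count are all hypothetical choices for this example.

```python
from collections import defaultdict


def find_regular_repeats(seq, repeat_len=21, min_copies=3,
                         min_spacer=20, max_spacer=50):
    """Return (word, start_positions) for exact repeats with regular spacing.

    A word qualifies if it occurs at least `min_copies` times and every gap
    between consecutive copies lies within [min_spacer, max_spacer].
    The spacer bounds are illustrative assumptions, not values from the paper.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - repeat_len + 1):
        positions[seq[i:i + repeat_len]].append(i)

    hits = []
    for word, starts in positions.items():
        if len(starts) < min_copies:
            continue
        # Gap between the end of one copy and the start of the next.
        spacers = [b - a - repeat_len for a, b in zip(starts, starts[1:])]
        if all(min_spacer <= s <= max_spacer for s in spacers):
            hits.append((word, starts))
    return hits
```

Real CRISPR repeats tolerate mismatches and vary in length between loci, so an exact-match scan like this would miss many genuine arrays. When the true repeat is longer than `repeat_len`, it also reports several overlapping words for the same array.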


Subjects
Genome/genetics , Multigene Family/genetics , Prokaryotic Cells/metabolism , Proteins/classification , Proteins/genetics , Nucleic Acid Repetitive Sequences/genetics , Archaeal Genes/genetics , Bacterial Genes/genetics , Fungal Genes/genetics , Bacterial Genome/genetics , Haloarcula marismortui/classification , Haloarcula marismortui/genetics , Markov Chains , Oligonucleotide Array Sequence Analysis , Phylogeny , Yersinia pestis/classification , Yersinia pestis/genetics
20.
Article in English | MEDLINE | ID: mdl-12673384

ABSTRACT

Identifying promoters is very important for understanding gene regulatory relationships in an organism, and the computational identification of promoters has been a long-standing problem in computational biology. A new method is presented for predicting promoter regions in prokaryotic organisms. The method first predicts transcription units (TUs) and divides them into singlets, which contain only a single gene, and operons, which contain more than one gene. Based on these predicted TUs, a promoter is predicted for each TU using a hidden Markov model with explicit state duration density. Both the predicted TUs and the predicted promoters were satisfactory.


Subjects
Markov Chains , Genetic Promoter Regions/genetics , Genetic Transcription/genetics , Algorithms , Bacterial Genome , Leptospira interrogans/genetics , Genetic Models , Prokaryotic Cells/metabolism