ABSTRACT
Cellular processes require precise and specific gene regulation, in which continuous mRNA degradation is a major element. The mRNA degradation mechanisms should be able to degrade a wide range of different RNA substrates with high efficiency, but should at the same time be limited, to avoid killing the cell by elimination of all cellular RNA. RNase Y is a major endoribonuclease found in most Firmicutes, including Bacillus subtilis and Staphylococcus aureus. However, the molecular interactions that direct RNase Y to cleave the correct RNA molecules at the correct position remain unknown. In this work we have identified transcripts that are homologs in S. aureus and B. subtilis, and are RNase Y targets in both bacteria. Two such transcript pairs were used as models to show a functional overlap between the S. aureus and the B. subtilis RNase Y, which highlighted the importance of the nucleotide sequence of the RNA molecule itself in the RNase Y targeting process. Cleavage efficiency is driven by the primary nucleotide sequence immediately downstream of the cleavage site and base-pairing in a secondary structure a few nucleotides downstream. Cleavage positioning is roughly localised by the downstream secondary structure and fine-tuned by the nucleotide immediately upstream of the cleavage. The identified elements were sufficient for RNase Y-dependent cleavage, since the sequence elements from one of the model transcripts were able to convert an exogenous non-target transcript into a target for RNase Y.
Subjects
Bacillus subtilis , Bacterial Gene Expression Regulation , RNA Cleavage , RNA Stability , Bacterial RNA , Staphylococcus aureus , Staphylococcus aureus/genetics , Staphylococcus aureus/enzymology , Bacillus subtilis/genetics , Bacillus subtilis/enzymology , Bacillus subtilis/metabolism , Bacterial RNA/metabolism , Bacterial RNA/genetics , RNA Stability/genetics , Messenger RNA/genetics , Messenger RNA/metabolism , Bacterial Proteins/metabolism , Bacterial Proteins/genetics , Endoribonucleases/metabolism , Endoribonucleases/genetics , Nucleic Acid Conformation , Base Sequence
ABSTRACT
Mycobacterium tuberculosis, the bacterium responsible for human tuberculosis, has a genome encoding a remarkably high number of toxin-antitoxin systems of largely unknown function. We have recently shown that the M. tuberculosis genome encodes four members of the widespread MenAT family of nucleotidyltransferase toxin-antitoxin systems. In this study we characterize MenAT1, using tRNA sequencing to demonstrate MenT1 tRNA modification activity. MenT1 activity is blocked by MenA1, a short protein antitoxin unrelated to the MenA3 kinase. X-ray crystallographic analysis shows blockage of the conserved MenT fold by asymmetric binding of MenA1 across two MenT1 protomers, forming a heterotrimeric toxin-antitoxin complex. Finally, we also demonstrate tRNA modification by toxin MenT4, indicating conserved activity across the MenT family. Our study highlights variation in tRNA target preferences by MenT toxins, selective use of nucleotide substrates, and diverse modes of MenA antitoxin activity.
Subjects
Antitoxins , Mycobacterium tuberculosis , Biological Toxins , Humans , Antitoxins/genetics , Nucleotidyltransferases , Nucleotides , Transfer RNA/genetics
ABSTRACT
BACKGROUND: Oomycetes are fungal-like microorganisms evolutionarily distinct from true fungi, belonging to the Stramenopile lineage and comprising major plant pathogens. Both oomycetes and fungi express proteins able to interact with cellulose, a major component of plant and oomycete cell walls, through the presence of a carbohydrate-binding module belonging to family 1 (CBM1). Fungal CBM1-containing proteins were implicated in cellulose degradation whereas in oomycetes, the Cellulose Binding Elicitor Lectin (CBEL), a well-characterized CBM1-protein from Phytophthora parasitica, was implicated in cell wall integrity, adhesion to cellulosic substrates and induction of plant immunity. RESULTS: To extend our knowledge of CBM1-containing proteins in oomycetes, we conducted a comprehensive analysis of 60 fungal and 7 oomycete genomes, leading to the identification of 518 CBM1-containing proteins. In plant-interacting microorganisms, the largest number of CBM1-protein coding genes is expressed by necrotrophic and hemibiotrophic pathogens, whereas a strong reduction of these genes is observed in symbionts and biotrophs. In fungi, more than 70% of CBM1-containing proteins correspond to enzymatic proteins in which CBM1 is associated with a catalytic unit involved in cellulose degradation. In oomycetes, more than 90% of proteins are similar to CBEL, in which CBM1 is associated with a non-catalytic PAN/Apple domain, known to interact with specific carbohydrates or proteins. Other Stramenopile genomes, such as those of diatoms and brown algae, are devoid of CBM1 coding genes. A 3D structural model of the CBM1-PAN/Apple association was built, allowing the identification of amino acid residues interacting with cellulose and suggesting a putative interaction of the PAN/Apple domain with another type of glucan. By Surface Plasmon Resonance experiments, we showed that CBEL binds to glycoproteins through galactose or N-acetyl-galactosamine motifs.
CONCLUSIONS: This study provides insight into the evolution and biological roles of CBM1-containing proteins from oomycetes. We show that while CBM1s from fungi and oomycetes are similar, they team up with different protein domains, either in proteins implicated in the degradation of plant cell wall components in the case of fungi or in proteins involved in adhesion to polysaccharidic substrates in the case of oomycetes. This work highlights the unique role and evolution of CBM1 proteins in oomycetes within the Stramenopile lineage.
Subjects
Cellulose/metabolism , Fungi/genetics , Genome , Glycoproteins/genetics , Oomycetes/genetics , Proteins/genetics , Amino Acid Sequence , Base Sequence , Binding Sites , Cell Wall/chemistry , Cell Wall/metabolism , Fungi/metabolism , Glucans/metabolism , Glycoproteins/metabolism , Molecular Models , Molecular Sequence Data , Oomycetes/metabolism , Plants/microbiology , Protein Binding , Tertiary Protein Structure , Proteins/metabolism , DNA Sequence Analysis , Amino Acid Sequence Homology , Surface Plasmon Resonance
ABSTRACT
Genetic screens are powerful methods for the discovery of gene-phenotype associations. However, a systems biology approach to genetics must leverage the massive amount of "omics" data to enhance the power and speed of functional gene discovery in vivo. Thus far, few computational methods for gene function prediction have been rigorously tested for their performance on a genome-wide scale in vivo. In this work, we demonstrate that integrating genome-wide computational gene prioritization with large-scale genetic screening is a powerful tool for functional gene discovery. To discover genes involved in neural development in Drosophila, we extend our strategy for the prioritization of human candidate disease genes to functional prioritization in Drosophila. We then integrate this prioritization strategy with a large-scale genetic screen for interactors of the proneural transcription factor Atonal using genomic deficiencies and mutant and RNAi collections. Using the prioritized genes validated in our genetic screen, we describe a novel genetic interaction network for Atonal. Lastly, we prioritize the whole Drosophila genome and identify candidate gene associations for ten receptor-signaling pathways. This novel database of prioritized pathway candidates, as well as a web application for functional prioritization in Drosophila, called Endeavour-HighFly, and the Atonal network, are publicly available resources. A systems genetics approach that combines the power of computational predictions with in vivo genetic screens strongly enhances the process of gene function and gene-gene association discovery.
Subjects
Computational Biology/methods , Drosophila melanogaster/genetics , Animals , Genetic Databases , Drosophila Proteins/genetics , Drosophila melanogaster/metabolism , Genetic Techniques , Genetics , Genome , Immunohistochemistry , Genetic Models , Phenotype , Protein Interaction Mapping , RNA Interference , Signal Transduction
ABSTRACT
Xanthomonas campestris pv. campestris is an epiphytic bacterium that can become a vascular pathogen responsible for black rot disease of crucifers. To adapt gene expression in response to ever-changing habitats, phytopathogenic bacteria have evolved signal transduction regulatory pathways, such as extracytoplasmic function (ECF) σ factors. The alternative sigma factor σ(E), encoded by rpoE, is crucial for envelope stress response and plays a role in the pathogenicity of many bacterial species. Here, we combine different approaches to investigate the role and mechanism of σ(E)-dependent activation in X. campestris pv. campestris. We show that the rpoE gene is organized as a single transcription unit with the anti-σ gene rseA and the protease gene mucD and that rpoE transcription is autoregulated. rseA and mucD transcription is also controlled by a highly conserved σ(E)-dependent promoter within the σ(E) gene sequence. The σ(E)-mediated stress response is required for stationary-phase survival, resistance to cadmium, and adaptation to membrane-perturbing stresses (elevated temperature and ethanol). Using microarray technology, we started to define the σ(E) regulon of X. campestris pv. campestris. These genes encode proteins belonging to different classes, including periplasmic or membrane proteins, biosynthetic enzymes, classical heat shock proteins, and the heat stress σ factor σ(H). The consensus sequence for the predicted σ(E)-regulated promoter elements is GGAACTN(15-17)GTCNNA. Determination of the rpoH transcription start site revealed that rpoH was directly regulated by σ(E) under both normal and heat stress conditions. Finally, σ(E) activity is regulated by the putative regulated intramembrane proteolysis (RIP) proteases RseP and DegS, as previously described in many other bacteria. However, our data suggest that RseP and DegS are not only dedicated to RseA cleavage and that the proteolytic cascade of RseA could involve other proteases.
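The consensus quoted in the abstract, GGAACTN(15-17)GTCNNA, can be read as a pattern: a fixed GGAACT box, a spacer of 15 to 17 arbitrary bases, then GTC, two arbitrary bases, and an A. The following is a minimal sketch of scanning a sequence for that pattern. The function name and the demo sequence are hypothetical, only the given strand is scanned, and a hit is merely a candidate site, not evidence of σ(E)-dependent transcription.

```python
import re

# Consensus from the abstract: GGAACT N(15-17) GTC NN A.
# The lookahead wrapper lets overlapping candidate sites be reported too.
CONSENSUS = re.compile(r"(?=(GGAACT[ACGT]{15,17}GTC[ACGT]{2}A))")


def find_consensus_sites(seq):
    """Return (start, matched_text) for every consensus match in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in CONSENSUS.finditer(seq)]


# Made-up sequence for illustration only: a 16-nt spacer between the boxes.
demo = "TTT" + "GGAACT" + "A" * 16 + "GTCCGA" + "TTT"
sites = find_consensus_sites(demo)
```

To cover both strands, the same scan could be repeated on the reverse complement of the input.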
Subjects
Bacterial Gene Expression Regulation/physiology , Sigma Factor/metabolism , Xanthomonas campestris/metabolism , Base Sequence , Cadmium/pharmacology , Diamide/pharmacology , Gene Expression Profiling , Bacterial Gene Expression Regulation/drug effects , Hot Temperature , Multigene Family , Operon , Peptide Hydrolases/metabolism , Genetic Promoter Regions , Protein Array Analysis , Sigma Factor/genetics , Physiological Stress , Xanthomonas campestris/drug effects , Xanthomonas campestris/genetics
ABSTRACT
Membrane transporters constitute one of the largest functional categories of proteins in all organisms. In the yeast Saccharomyces cerevisiae, this represents about 300 proteins (approximately 5% of the proteome). We here present the Yeast Transport Protein database (YTPdb), a user-friendly collaborative resource dedicated to the precise classification and annotation of yeast transporters. YTPdb exploits an evolution of the MediaWiki web engine used for popular collaborative databases like Wikipedia, allowing every registered user to edit the data in a user-friendly manner. Proteins in YTPdb are classified on the basis of functional criteria such as subcellular location or their substrate compounds. These classifications are hierarchical, allowing queries to be performed at various levels, from highly specific (e.g. ammonium as a substrate or the vacuole as a location) to broader (e.g. cation as a substrate or inner membranes as a location). Other resources accessible for each transporter via YTPdb include post-translational modifications, K(m) values, a permanently updated bibliography, and a hierarchical classification into families. The YTPdb concept can be extrapolated to other organisms and could even be applied to other functional categories of proteins. YTPdb is accessible at http://homes.esat.kuleuven.be/ytpdb/.
Subjects
Protein Databases , Membrane Transport Proteins/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Internet , Post-Translational Protein Processing , Proteome/metabolism
ABSTRACT
SUMMARY: In recent years, the number of knowledge bases developed using Wiki technology has exploded. Unfortunately, alongside their numerous advantages, classical Wikis present a critical limitation: the invaluable knowledge they gather is represented as free text, which hinders their computational exploitation. This is in sharp contrast with the current practice for biological databases, where the data is made available in a structured way. Here, we present WikiOpener, an extension for the classical MediaWiki engine that augments Wiki pages by allowing on-the-fly querying and formatting of resources external to the Wiki. Those resources may provide data extracted from databases or DAS tracks, or even results returned by local or remote bioinformatics analysis tools. This also implies that structured data can be edited via dedicated forms. Hence, this generic resource combines the structure of biological databases with the flexibility of collaborative Wikis. AVAILABILITY: The source code and its documentation are freely available on the MediaWiki website: http://www.mediawiki.org/wiki/Extension:WikiOpener.
Subjects
Computational Biology/methods , Factual Databases , Knowledge Bases , Software , Internet
ABSTRACT
The search for feature enrichment is a widely used method to characterize a set of genes. While several tools have been designed for nominal features such as Gene Ontology annotations or KEGG Pathways, very little has been proposed to tackle numerical features such as the chromosomal positions of genes. For instance, microarray studies typically generate lists of genes that are differentially expressed in the sample subgroups under investigation, and when studying diseases caused by genome alterations, it is of great interest to delineate the chromosomal regions that are significantly enriched in these lists. In this article, we present a positional gene enrichment analysis method (PGE) for the identification of chromosomal regions that are significantly enriched in a given set of genes. The strength of our method relies on an original query optimization approach that makes it possible to consider virtually all possible chromosomal regions for enrichment, and on the multiple testing correction, which discriminates truly enriched regions from those that can occur by chance. We have developed a Web tool implementing this method applied to the human genome (http://www.esat.kuleuven.be/~bioiuser/pge). We validated PGE on published lists of differentially expressed genes. These analyses showed significant overrepresentation of known aberrant chromosomal regions.
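The idea of testing chromosomal regions for enrichment can be illustrated with a brute-force sketch. It assumes a hypergeometric tail test per region and a Bonferroni correction over all contiguous regions. Neither choice is stated in the abstract, and the query optimization that PGE relies on is not reproduced here. All names and the toy data are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): N genes in total, K of them in the region, n genes of interest."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def enriched_regions(ordered_genes, hits, alpha=0.05):
    """Test every contiguous run of genes along one chromosome (naive O(N^2) scan)."""
    N = len(ordered_genes)
    flags = [g in hits for g in ordered_genes]
    n = sum(flags)
    n_tests = N * (N + 1) // 2  # number of contiguous regions (Bonferroni factor)
    results = []
    for i in range(N):
        k = 0
        for j in range(i, N):
            k += flags[j]          # genes of interest inside region i..j
            K = j - i + 1          # all genes inside region i..j
            p = hypergeom_sf(k, N, K, n)
            if p * n_tests <= alpha:
                results.append((ordered_genes[i], ordered_genes[j], k, p))
    return results


# Toy chromosome: 20 genes in positional order, 5 adjacent genes of interest.
genes = [f"g{i}" for i in range(20)]
regions = enriched_regions(genes, hits={"g5", "g6", "g7", "g8", "g9"})
```

With this toy input only the run from g5 to g9 survives the correction, because every wider or narrower region has a larger corrected p-value.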
Subjects
Algorithms , Chromosome Mapping/methods , Gene Expression Profiling , Human Genome , Oligonucleotide Array Sequence Analysis , Down Syndrome/genetics , Humans , Internet , Chronic B-Cell Lymphocytic Leukemia/genetics , Neuroblastoma/genetics , Software
ABSTRACT
Endeavour (http://www.esat.kuleuven.be/endeavourweb; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes. Using a training set of genes known to be involved in a biological process of interest, our approach consists of (i) inferring several models (based on various genomic data sources), (ii) applying each model to the candidate genes to rank those candidates against the profile of the known genes and (iii) merging the several rankings into a global ranking of the candidate genes. In the present article, we describe the latest developments of Endeavour. First, we provide a web-based user interface, in addition to our Java client, to make Endeavour more universally accessible. Second, we support multiple species: in addition to Homo sapiens, we now provide gene prioritization for three major model organisms: Mus musculus, Rattus norvegicus and Caenorhabditis elegans. Third, Endeavour makes use of additional data sources and now includes numerous databases: ontologies and annotations, protein-protein interactions, cis-regulatory information, gene expression data sets, sequence information and text-mining data. We tested the novel version of Endeavour on 32 recent disease gene associations from the literature. Additionally, we describe a number of recent independent studies that made use of Endeavour to prioritize candidate genes for obesity and Type II diabetes, cleft lip and cleft palate, and pulmonary fibrosis.
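Steps (i) to (iii) can be sketched on toy data. This is an illustration under stated assumptions, not Endeavour's implementation: each data source is reduced to a gene-to-features mapping, a model is simply the feature frequencies among the training genes, and rankings are merged by averaging rank ratios. The abstract does not specify the scoring or the fusion method, and every name and data value below is hypothetical.

```python
from collections import Counter


def build_model(source, training):
    """(i) One model per data source: feature frequencies among the training genes."""
    counts = Counter(f for g in training for f in source.get(g, ()))
    return {f: c / len(training) for f, c in counts.items()}


def rank_candidates(source, model, candidates):
    """(ii) Rank candidates against the training profile; returns gene -> rank ratio."""
    scores = {g: sum(model.get(f, 0.0) for f in source.get(g, ())) for g in candidates}
    ordered = sorted(candidates, key=lambda g: (-scores[g], g))
    return {g: (i + 1) / len(ordered) for i, g in enumerate(ordered)}


def merge_rankings(rankings):
    """(iii) Global ranking by mean rank ratio (a stand-in for the real fusion step)."""
    genes = list(rankings[0])
    mean = {g: sum(r[g] for r in rankings) / len(rankings) for g in genes}
    return sorted(genes, key=lambda g: (mean[g], g))


# Hypothetical toy data: two sources, two training genes, three candidates.
sources = {
    "annotation": {"T1": {"heart"}, "T2": {"heart", "muscle"},
                   "C1": {"heart", "muscle"}, "C2": {"liver"}, "C3": {"muscle"}},
    "interaction": {"T1": {"P1"}, "T2": {"P1", "P2"},
                    "C1": {"P3"}, "C2": {"P2"}, "C3": {"P1", "P2"}},
}
training = ["T1", "T2"]
candidates = ["C1", "C2", "C3"]

rankings = [
    rank_candidates(src, build_model(src, training), candidates)
    for src in sources.values()
]
global_ranking = merge_rankings(rankings)
```

Here the two sources disagree (the annotation source prefers C1, the interaction source prefers C3), and the merged ranking places C3 first because it ranks well in both.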
Subjects
Genes , Genetic Predisposition to Disease , Software , Animals , Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Humans , Internet , Mice , Animal Models , Rats , Zebrafish/genetics
ABSTRACT
Molecular chaperones maintain cellular protein homeostasis by acting at almost every step in protein biogenesis pathways. The DnaK/HSP70 chaperone has been associated with almost every known essential chaperone function in bacteria. To act as a bona fide chaperone, DnaK strictly relies on essential co-chaperone partners known as the J-domain proteins (JDPs, DnaJ, Hsp40), which preselect substrate proteins for DnaK, confer its specific cellular localization, and stimulate both its weak ATPase activity and substrate transfer. Remarkably, genome sequencing has revealed the presence of multiple JDP/DnaK chaperone/co-chaperone pairs in a number of bacterial genomes, suggesting that certain pairs have evolved toward more specific functions. In this review, we have used representative sets of bacterial and phage genomes to explore the distribution of JDP/DnaK pairs. This analysis has revealed an unexpected reservoir of novel bacterial JDP co-chaperones with very diverse and unexplored functions, which are discussed.
Subjects
Escherichia coli Proteins/genetics , Escherichia coli/genetics , HSP40 Heat-Shock Proteins/genetics , HSP70 Heat-Shock Proteins/genetics , Protein Domains/genetics , Adenosine Triphosphatases/genetics , Bacteria/virology , Bacteriophages/genetics , Escherichia coli/virology , Humans , Metabolic Networks and Pathways/genetics , Molecular Chaperones/genetics , Protein Biosynthesis/genetics
ABSTRACT
BACKGROUND: The search for enriched features has become widely used to characterize a set of genes or proteins. A key aspect of this technique is its ability to identify correlations amongst heterogeneous data such as Gene Ontology annotations, gene expression data and genome location of genes. Despite the rapid growth of available data, very little has been proposed in terms of formalization and optimization. Additionally, current methods mainly ignore the structure of the data, which causes redundancy in the results. For example, when searching for enrichment in GO terms, genes can be annotated with multiple GO terms and should be propagated to the more general terms in the Gene Ontology. Consequently, the gene sets often overlap partially or totally, and this causes the reported enriched GO terms to be both numerous and redundant, overwhelming the researcher with non-pertinent information. This situation is not unique; it arises whenever some hierarchical clustering is performed (e.g. based on the gene expression profiles), the extreme case being when genes that are neighbors on the chromosomes are considered. RESULTS: We present a generic framework to efficiently identify the most pertinent over-represented features in a set of genes. We propose a formal representation of gene sets based on the theory of partially ordered sets (posets), and give a formal definition of target set pertinence. Algorithms and compact representations of target sets are provided for the generation and the evaluation of the pertinent target sets. The relevance of our method is illustrated through the search for enriched GO annotations in the proteins involved in a multiprotein complex. The results obtained demonstrate the gain in terms of pertinence (up to 64% redundancy removed), space requirements (up to 73% less storage) and efficiency (up to 98% fewer comparisons).
CONCLUSION: The generic framework presented in this article provides a formal approach to adequately represent available data and efficiently search for pertinent over-represented features in a set of genes or proteins. The formalism and the pertinence definition can be directly used by most of the methods and tools currently available for feature enrichment analysis.
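The redundancy described in the background can be made concrete with a small sketch. Annotations are propagated up a toy term hierarchy, and a term is then dropped when a more specific term covers exactly the same query genes. This filter is a simplified stand-in chosen for illustration. It is not the formal pertinence definition of the article, which the abstract does not spell out, and the term names, gene names and functions are hypothetical.

```python
def ancestors(term, parents):
    """All terms reachable by following parent links (the term itself excluded)."""
    seen, stack = set(), list(parents.get(term, ()))
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return seen


def propagate(direct, parents):
    """Map each term to the genes annotated with it or with any of its descendants."""
    term_genes = {}
    for gene, terms in direct.items():
        for t in terms:
            for u in {t} | ancestors(t, parents):
                term_genes.setdefault(u, set()).add(gene)
    return term_genes


def drop_redundant(term_genes, parents, query):
    """Keep a term unless a more specific term covers the same query genes."""
    hits = {t: g & query for t, g in term_genes.items() if g & query}
    keep = []
    for t in hits:
        redundant = any(
            t in ancestors(d, parents) and hits[d] == hits[t]
            for d in hits if d != t
        )
        if not redundant:
            keep.append(t)
    return sorted(keep)


# Toy hierarchy (child -> parents) and direct annotations (gene -> terms).
parents = {"a": ["root"], "a1": ["a"], "b": ["root"]}
direct = {"g1": ["a1"], "g2": ["a1"], "g3": ["b"], "g4": ["a"]}

term_genes = propagate(direct, parents)
pertinent = drop_redundant(term_genes, parents, query={"g1", "g2"})
```

After propagation, the terms a1, a and root all contain both query genes, so a naive enrichment report would list all three. The filter keeps only a1, the most specific of them.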
Subjects
Computational Biology/methods , Data Compression/methods , Database Management Systems , Gene Expression Profiling/methods , Automated Pattern Recognition , Algorithms , Artificial Intelligence , Cluster Analysis , Genetic Databases/statistics & numerical data , Protein Databases/statistics & numerical data , Efficiency , Gene Expression Profiling/statistics & numerical data , Information Theory , Proteins/classification , Proteins/genetics , Proteins/metabolism , Structure-Activity Relationship , Terminology as Topic , Work Simplification
ABSTRACT
The combination of sequencing and post-sequencing experimental approaches produces huge collections of data that are highly heterogeneous both in structure and in semantics. We propose a new strategy for the integration of such data. This strategy uses structured sets of sequences as a unified representation of biological information and defines a probabilistic measure of similarity between the sets. Sets can be composed of sequences that are known to have a biological relationship (e.g. proteins involved in a complex or a pathway) or that share similar values for a particular attribute (e.g. expression profile). We have developed a software tool, BlastSets, that implements this strategy. It exploits a database where the sets derived from diverse biological information can be deposited using a standard XML format. For a given query set, BlastSets returns target sets found in the database whose similarity to the query is statistically significant. The tool allowed us to automatically identify verified relationships between correlated expression profiles and biological pathways using publicly available data for Saccharomyces cerevisiae. It was also used to retrieve the members of a complex (ribosome) based on the mining of expression profiles. These first results validate the relevance of the strategy and demonstrate the promising potential of BlastSets.
Subjects
Computational Biology/methods , Sequence Analysis/methods , Software , Genetic Databases , Gene Expression Profiling , Genomics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Systems Integration
ABSTRACT
BACKGROUND: How to efficiently integrate the daily practice of molecular biologists, geneticists, and clinicians with the emerging computational strategies from systems biology is still very much an open question. DESCRIPTION: We built on the recent advances in Wiki-based technologies to develop a collaborative knowledge base and gene prioritization portal aimed at mapping genes and genomic regions, and untangling their relations with corresponding human phenotypes, congenital heart defects (CHDs). This portal is not only an evolving community repository of current knowledge on the genetic basis of CHDs, but also a collaborative environment for the study of candidate genes potentially implicated in CHDs - in particular by integrating recent strategies for the statistical prioritization of candidate genes. It thus serves and connects the broad community that is facing CHDs, ranging from the pediatric cardiologist and clinical geneticist to the basic investigator of cardiogenesis. CONCLUSIONS: This study describes the first specialized portal to collaboratively annotate and analyze gene-phenotype networks. Of broad interest to the biological community, we argue that such portals will play a significant role in systems biology studies of numerous complex biological processes. CHDWiki is accessible at http://www.esat.kuleuven.be/~bioiuser/chdwiki.