Results 1 - 15 of 15
1.
Sci Rep ; 7(1): 16692, 2017 Nov 30.
Article in English | MEDLINE | ID: mdl-29192227

ABSTRACT

Nonsense-mediated mRNA decay (NMD) is an essential eukaryotic process regulating transcript quality and abundance, and is involved in diverse processes including brain development and plant defenses. Although some of the NMD machinery is conserved between kingdoms, little is known about its evolution. Phosphorylation of the core NMD component UPF1 is critical for NMD and is regulated in mammals by the SURF complex (UPF1, SMG1 kinase, SMG8, SMG9 and eukaryotic release factors). However, since SMG1 is reportedly missing from the genomes of fungi and the plant Arabidopsis thaliana, it remains unclear how UPF1 is activated outside the metazoa. We used comparative genomics to determine the conservation of the NMD pathway across eukaryotic evolution. We show that SURF components are present in all major eukaryotic lineages, including fungi, suggesting that in addition to UPF1 and SMG1, SMG8 and SMG9 also existed in the last eukaryotic common ancestor, 1.8 billion years ago. However, despite the ancient origins of the SURF complex, we also found that SURF factors have been independently lost across the Eukarya, pointing to genetic buffering within the essential NMD pathway. We infer an ancient role for SURF in regulating UPF1, and the intriguing possibility of undiscovered NMD regulatory pathways.


Subjects
Eukaryota/genetics , Evolution, Molecular , Multienzyme Complexes/genetics , Nonsense Mediated mRNA Decay/genetics , Genomics/methods
2.
Plant Cell ; 29(11): 2786-2800, 2017 Nov.
Article in English | MEDLINE | ID: mdl-29070508

ABSTRACT

Gene and genome duplications have been rampant during the evolution of flowering plants. Unlike small-scale gene duplications, whole-genome duplications (WGDs) copy entire pathways or networks, and as such create the unique situation in which such duplicated pathways or networks could evolve novel functionality through the coordinated sub- or neofunctionalization of their constituent genes. Here, we describe a remarkable case of coordinated gene expression divergence following WGDs in Arabidopsis thaliana. We identified a set of 92 homoeologous gene pairs that all show a similar pattern of tissue-specific gene expression divergence following WGD, with one homoeolog showing predominant expression in aerial tissues and the other homoeolog showing biased expression in tip-growth tissues. We provide evidence that this pattern of gene expression divergence seems to involve genes with a role in cell polarity that likely function in the maintenance of cell wall integrity. Following WGD, many of these duplicated genes evolved separate functions through subfunctionalization in growth/development and stress response. Uncoupling these processes through genome duplications likely provided important adaptations with respect to growth and morphogenesis and defense against biotic and abiotic stress.


Subjects
Arabidopsis/genetics , Gene Duplication , Genes, Plant/genetics , Genetic Variation , Genome, Plant/genetics , Adaptation, Physiological/genetics , Arabidopsis/growth & development , Evolution, Molecular , Gene Expression Profiling , Gene Expression Regulation, Developmental , Gene Expression Regulation, Plant , Gene Ontology , Genes, Duplicate/genetics , Models, Genetic , Stress, Physiological
3.
Plant Physiol ; 171(3): 1704-19, 2016 Jul.
Article in English | MEDLINE | ID: mdl-27225899

ABSTRACT

The genes coding for the core metabolic enzymes of the photorespiratory pathway, which allows plants with C3-type photosynthesis to survive in an oxygen-rich atmosphere, have largely been discovered in genetic screens aimed at isolating mutants that are unviable under ambient air. As an exception, glycolate oxidase (GOX) mutants with a photorespiratory phenotype have not yet been described in C3 species. Using Arabidopsis (Arabidopsis thaliana) mutants lacking the peroxisomal CATALASE2 (cat2-2) that display stunted growth and cell death lesions under ambient air, we isolated a second-site loss-of-function mutation in GLYCOLATE OXIDASE1 (GOX1) that attenuated the photorespiratory phenotype of cat2-2. Interestingly, knocking out the nearly identical GOX2 in the cat2-2 background did not affect the photorespiratory phenotype, indicating that GOX1 and GOX2 play distinct metabolic roles. We further investigated their individual functions in single gox1-1 and gox2-1 mutants and revealed that their phenotypes can be modulated by environmental conditions that increase the metabolic flux through the photorespiratory pathway. High light negatively affected the photosynthetic performance and growth of both gox1-1 and gox2-1 mutants, but the negative consequences of severe photorespiration were more pronounced in the absence of GOX1, which was accompanied by a lesser ability to process glycolate. Taken together, our results point toward divergent functions of the two photorespiratory GOX isoforms in Arabidopsis and contribute to a better understanding of the photorespiratory pathway.


Subjects
Alcohol Oxidoreductases/metabolism , Arabidopsis Proteins/metabolism , Arabidopsis/physiology , Alcohol Oxidoreductases/genetics , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Cell Respiration , Evolution, Molecular , Glycolates/metabolism , Light , Metabolome/genetics , Mutation , Oxidation-Reduction , Phenotype , Photosynthesis
4.
Plant Cell ; 28(2): 326-44, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26744215

ABSTRACT

Gene duplication is an important mechanism for adding to genomic novelty. Hence, which genes undergo duplication and are preserved following duplication is an important question. It has been observed that gene duplicability, or the ability of genes to be retained following duplication, is a nonrandom process, with certain genes being more amenable to survive duplication events than others. Primarily, gene essentiality and the type of duplication (small-scale versus large-scale) have been shown in different species to influence the (long-term) survival of novel genes. However, an overarching view of "gene duplicability" is lacking, mainly due to the fact that previous studies usually focused on individual species and did not account for the influence of genomic context and the time of duplication. Here, we present a large-scale study in which we investigated duplicate retention for 9178 gene families shared between 37 flowering plant species, referred to as angiosperm core gene families. For most gene families, we observe a strikingly consistent pattern of gene duplicability across species, with gene families being either primarily single-copy or multicopy in all species. An intermediate class contains gene families that are often retained in duplicate for periods extending to tens of millions of years after whole-genome duplication, but ultimately appear to be largely restored to singleton status, suggesting that these genes may be dosage balance sensitive. The distinction between single-copy and multicopy gene families is reflected in their functional annotation, with single-copy genes being mainly involved in the maintenance of genome stability and organelle function and multicopy genes in signaling, transport, and metabolism. The intermediate class was overrepresented in regulatory genes, further suggesting that these represent putative dosage-balance-sensitive genes.


Subjects
Gene Dosage , Gene Duplication , Genes, Essential/genetics , Genome, Plant/genetics , Genomics , Magnoliopsida/genetics
5.
Appl Plant Sci ; 3(4)2015 Apr.
Article in English | MEDLINE | ID: mdl-25909041

ABSTRACT

PREMISE OF THE STUDY: Targeted sequencing using next-generation sequencing (NGS) platforms offers enormous potential for plant systematics by enabling economical acquisition of multilocus data sets that can resolve difficult phylogenetic problems. However, because discovery of single-copy nuclear (SCN) loci from NGS data requires both bioinformatics skills and access to high-performance computing resources, the application of NGS data has been limited. METHODS AND RESULTS: We developed MarkerMiner 1.0, a fully automated, open-access bioinformatic workflow and application for discovery of SCN loci in angiosperms. Our new tool identified as many as 1993 SCN loci from transcriptomic data sampled as part of four independent test cases representing marker development projects at different phylogenetic scales. CONCLUSIONS: MarkerMiner is an easy-to-use and effective tool for discovery of putative SCN loci. It can be run locally or via the Web, and its tabular and alignment outputs facilitate efficient downstream assessments of phylogenetic utility, locus selection, intron-exon boundary prediction, and primer or probe development.

6.
Proc Natl Acad Sci U S A ; 110(8): 2898-903, 2013 Feb 19.
Article in English | MEDLINE | ID: mdl-23382190

ABSTRACT

The importance of gene gain through duplication has long been appreciated. In contrast, the importance of gene loss has only recently attracted attention. Indeed, studies in organisms ranging from plants to worms and humans suggest that duplication of some genes might be better tolerated than that of others. Here we have undertaken a large-scale study to investigate the existence of duplication-resistant genes in the sequenced genomes of 20 flowering plants. We demonstrate that there is a large set of genes that is convergently restored to single-copy status following multiple genome-wide and smaller scale duplication events. We rule out the possibility that such a pattern could be explained by random gene loss only and therefore propose that there is selection pressure to preserve such genes as singletons. This is further substantiated by the observation that angiosperm single-copy genes do not comprise a random fraction of the genome, but instead are often involved in essential housekeeping functions that are highly conserved across all eukaryotes. Furthermore, single-copy genes are generally expressed more highly and in more tissues than non-single-copy genes, and they exhibit higher sequence conservation. Finally, we propose different hypotheses to explain their resistance against duplication.


Subjects
Gene Deletion , Gene Duplication , Magnoliopsida/genetics , Genes, Plant
7.
Curr Opin Plant Biol ; 15(2): 168-76, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22305522

ABSTRACT

Polyploidy or whole-genome duplication is a frequent phenomenon within the plant kingdom and has been associated with the occurrence of evolutionary novelty and increase in biological complexity. Because genome-wide duplication events duplicate whole molecular networks it is of interest to investigate how these networks evolve subsequent to such events. Although genome duplications are generally followed by massive gene loss, at least part of the network is usually retained in duplicate and can rewire to execute novel functions. Alternatively, the network can remain largely redundant and as such confer robustness against mutations. The increasing availability of high-throughput data makes it possible to study evolution following whole genome duplication events at the network level. Here we discuss how the use of 'omics' data in network analysis can provide novel insights on network redundancy and rewiring and conclude with some directions for future research.


Subjects
Gene Duplication/genetics , Genome, Plant/genetics , Evolution, Molecular , Polyploidy
8.
PLoS One ; 6(7): e20938, 2011.
Article in English | MEDLINE | ID: mdl-21779320

ABSTRACT

BACKGROUND: Microarrays are the main technology for large-scale transcriptional gene expression profiling, but the large bodies of data available in public databases are not useful due to their large heterogeneity. There are several initiatives that attempt to bundle these data into expression compendia, but such resources for bacterial organisms are scarce and limited to integration of experiments from the same platform or to indirect integration of per-experiment analysis results. METHODOLOGY/PRINCIPAL FINDINGS: We have constructed comprehensive organism-specific cross-platform expression compendia for three bacterial model organisms (Escherichia coli, Bacillus subtilis, and Salmonella enterica serovar Typhimurium) together with an access portal, dubbed COLOMBOS, that not only provides easy access to the compendia, but also includes a suite of tools for exploring, analyzing, and visualizing the data within these compendia. It is freely available at http://bioi.biw.kuleuven.be/colombos. The compendia are unique in directly combining expression information from different microarray platforms and experiments, and we illustrate the potential benefits of this direct integration with a case study: extending the known regulon of the Fur transcription factor of E. coli. The compendia also incorporate extensive annotations for both genes and experimental conditions; these heterogeneous data are functionally integrated in the COLOMBOS analysis tools to interactively browse and query the compendia not only for specific genes or experiments, but also metabolic pathways, transcriptional regulation mechanisms, experimental conditions, biological processes, etc. CONCLUSIONS/SIGNIFICANCE: We have created cross-platform expression compendia for several bacterial organisms and developed a complementary access portal, COLOMBOS, that also serves as a convenient expression analysis tool to extract useful biological information. This work is relevant to a large community of microbiologists by facilitating the use of publicly available microarray experiments to support their research.


Subjects
Bacillus subtilis/genetics , Databases, Genetic , Escherichia coli/genetics , Salmonella enterica/genetics , Oligonucleotide Array Sequence Analysis
9.
Bioinformatics ; 27(14): 1948-56, 2011 Jul 15.
Article in English | MEDLINE | ID: mdl-21593133

ABSTRACT

MOTIVATION: Query-based biclustering techniques allow interrogating a gene expression compendium with a given gene or gene list. They do so by searching for genes in the compendium that have a profile close to the average expression profile of the genes in this query-list. As it can often not be guaranteed that the genes in a long query-list will all be mutually coexpressed, it is advisable to use each gene separately as a query. This approach, however, leaves the user with a tedious post-processing of partially redundant biclustering results. The fact that for each query-gene multiple parameter settings need to be tested in order to detect the 'most optimal bicluster size' adds to the redundancy problem. RESULTS: To aid with this post-processing, we developed an ensemble approach to be used in combination with query-based biclustering. The method relies on a specifically designed consensus matrix in which the biclustering outcomes for multiple query-genes and for different possible parameter settings are merged in a statistically robust way. Clustering of this matrix results in distinct, non-redundant consensus biclusters that maximally reflect the information contained within the original query-based biclustering results. The usefulness of the developed approach is illustrated on a biological case study in Escherichia coli. AVAILABILITY AND IMPLEMENTATION: Compiled Matlab code is available from http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Information_DeSmet_2011/.


Subjects
Gene Expression Profiling/methods , Algorithms , Cluster Analysis , Escherichia coli/genetics , Gene Expression , Oligonucleotide Array Sequence Analysis/methods
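
The query step described in the abstract above (retrieving genes whose profile is close to the average expression profile of the query list) can be illustrated with a minimal sketch. This is not the authors' biclustering algorithm: the function names, the use of Pearson correlation as the closeness measure, and the fixed threshold are all assumptions made for illustration, and no condition selection is performed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def query_by_mean_profile(compendium, query_genes, threshold=0.8):
    """Return genes whose profile correlates with the query list's mean profile.

    compendium: dict mapping gene name -> list of expression values
                (hypothetical layout, one value per condition).
    query_genes: gene names used as the query list.
    threshold: assumed correlation cut-off, not taken from the source.
    """
    profiles = [compendium[g] for g in query_genes]
    n_conditions = len(profiles[0])
    # Average expression profile of the query genes, condition by condition.
    mean_profile = [
        sum(p[i] for p in profiles) / len(profiles) for i in range(n_conditions)
    ]
    return sorted(
        gene
        for gene, expr in compendium.items()
        if pearson(expr, mean_profile) >= threshold
    )
```

With a long query list whose genes are not mutually coexpressed, the mean profile becomes uninformative, which is the situation the abstract gives as the reason for querying each gene separately.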
10.
BMC Bioinformatics ; 12 Suppl 1: S37, 2011 Feb 15.
Article in English | MEDLINE | ID: mdl-21342568

ABSTRACT

BACKGROUND: With the availability of large scale expression compendia it is now possible to view one's own findings in the light of what is already available and retrieve genes with an expression profile similar to a set of genes of interest (i.e., a query or seed set) for a subset of conditions. To that end, a query-based strategy is needed that maximally exploits the coexpression behaviour of the seed genes to guide the biclustering, but that at the same time is robust against the presence of noisy genes in the seed set, as seed genes are often assumed, but not guaranteed, to be coexpressed in the queried compendium. Therefore, we developed ProBic, a query-based biclustering strategy based on Probabilistic Relational Models (PRMs) that exploits the use of prior distributions to extract the information contained within the seed set. RESULTS: We applied ProBic on a large scale Escherichia coli compendium to extend partially described regulons with potentially novel members. We compared ProBic's performance with previously published query-based biclustering algorithms, namely ISA and QDB, from the perspective of bicluster expression quality, robustness of the outcome against noisy seed sets and biological relevance. This comparison shows that ProBic is able to retrieve biologically relevant, high quality biclusters that retain their seed genes and that it is particularly strong in handling noisy seeds. CONCLUSIONS: ProBic is a query-based biclustering algorithm developed in a flexible framework, designed to detect biologically relevant, high quality biclusters that retain relevant seed genes even in the presence of noise or when dealing with low quality seed sets.


Subjects
Algorithms , Gene Expression Profiling/methods , Models, Statistical , Cluster Analysis , Databases, Genetic , Escherichia coli/genetics , Oligonucleotide Array Sequence Analysis , Regulon
11.
Nucleic Acids Res ; 39(2): e6, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21051340

ABSTRACT

Recognition of genomic binding sites by transcription factors can occur through base-specific recognition, or by recognition of variations within the structure of the DNA macromolecule. In this article, we investigate what information can be retrieved from local DNA structural properties that is relevant to transcription factor binding and that cannot be captured by the nucleotide sequence alone. More specifically, we explore the benefit of employing the structural characteristics of DNA to create binding-site models that encompass indirect recognition for the Escherichia coli model organism. We developed a novel methodology [Conditional Random fields of Smoothed Structural Data (CRoSSeD)], based on structural scales and conditional random fields to model and predict regulator binding sites. The value of relying on local structural-DNA properties is demonstrated by improved classifier performance on a large number of biological datasets, and by the detection of novel binding sites which could be validated by independent data sources, and which could not be identified using sequence data alone. We further show that the CRoSSeD-binding-site models can be related to the actual molecular mechanisms of the transcription factor DNA binding, and thus can not only be used for prediction of novel sites, but might also give valuable insights into unknown binding mechanisms of transcription factors.


Subjects
Escherichia coli/genetics , Models, Statistical , Regulatory Elements, Transcriptional , Transcription Factors/metabolism , Binding Sites , DNA, Bacterial/chemistry , DNA, Bacterial/metabolism , Probability , Regulon
12.
Nat Rev Microbiol ; 8(10): 717-29, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20805835

ABSTRACT

Network inference, which is the reconstruction of biological networks from high-throughput data, can provide valuable information about the regulation of gene expression in cells. However, it is an underdetermined problem, as the number of interactions that can be inferred exceeds the number of independent measurements. Different state-of-the-art tools for network inference use specific assumptions and simplifications to deal with underdetermination, and these influence the inferences. The outcome of network inference therefore varies between tools and can be highly complementary. Here we categorize the available tools according to the strategies that they use to deal with the problem of underdetermination. Such categorization allows an insight into why a certain tool is more appropriate for the specific research question or data set at hand.


Subjects
Gene Regulatory Networks/genetics , Models, Genetic , Bacteria/genetics , Bacteria/metabolism , Gene Expression Regulation
13.
BMC Syst Biol ; 3: 49, 2009 May 07.
Article in English | MEDLINE | ID: mdl-19422680

ABSTRACT

BACKGROUND: A myriad of methods to reverse-engineer transcriptional regulatory networks have been developed in recent years. Direct methods directly reconstruct a network of pairwise regulatory interactions while module-based methods predict a set of regulators for modules of coexpressed genes treated as a single unit. To date, there has been no systematic comparison of the relative strengths and weaknesses of both types of methods. RESULTS: We have compared a recently developed module-based algorithm, LeMoNe (Learning Module Networks), to a mutual information based direct algorithm, CLR (Context Likelihood of Relatedness), using benchmark expression data and databases of known transcriptional regulatory interactions for Escherichia coli and Saccharomyces cerevisiae. A global comparison using recall versus precision curves hides the topologically distinct nature of the inferred networks and is not informative about the specific subtasks for which each method is most suited. Analysis of the degree distributions and a regulator specific comparison show that CLR is 'regulator-centric', making true predictions for a higher number of regulators, while LeMoNe is 'target-centric', recovering a higher number of known targets for fewer regulators, with limited overlap in the predicted interactions between both methods. Detailed biological examples in E. coli and S. cerevisiae are used to illustrate these differences and to prove that each method is able to infer parts of the network where the other fails. Biological validation of the inferred networks cautions against over-interpreting recall and precision values computed using incomplete reference networks. CONCLUSION: Our results indicate that module-based and direct methods retrieve largely distinct parts of the underlying transcriptional regulatory networks. The choice of algorithm should therefore be based on the particular biological problem of interest and not on global metrics which cannot be transferred between organisms. The development of sound statistical methods for integrating the predictions of different reverse-engineering strategies emerges as an important challenge for future research.


Subjects
Algorithms , Gene Regulatory Networks , Transcription, Genetic , Cell Membrane/metabolism , Cell Respiration/genetics , Chemotaxis/genetics , Computational Biology , Escherichia coli/cytology , Escherichia coli/genetics , Escherichia coli/metabolism , Fatty Acids/metabolism , Flagella/genetics , Flagella/metabolism , Gene Expression Profiling , Membrane Lipids/metabolism , Methionine/metabolism , Nitrogen/metabolism , Reproducibility of Results , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Sensitivity and Specificity
14.
Ann N Y Acad Sci ; 1158: 36-43, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19348630

ABSTRACT

"Module networks" are a framework to learn gene regulatory networks from expression data using a probabilistic model in which coregulated genes share the same parameters and conditional distributions. We present a method to infer ensembles of such networks and an averaging procedure to extract the statistically most significant modules and their regulators. We show that the inferred probabilistic models extend beyond the dataset used to learn the models.


Subjects
Gene Expression , Gene Regulatory Networks , Models, Genetic , Algorithms , Computational Biology/methods , Computer Simulation , Gene Expression Profiling , Software
15.
Bioinformatics ; 25(4): 490-6, 2009 Feb 15.
Article in English | MEDLINE | ID: mdl-19136553

ABSTRACT

MOTIVATION: The solution of high-dimensional inference and prediction problems in computational biology is almost always a compromise between mathematical theory and practical constraints, such as limited computational resources. As time progresses, computational power increases but well-established inference methods often remain locked in their initial suboptimal solution. RESULTS: We revisit the approach of Segal et al. to infer regulatory modules and their condition-specific regulators from gene expression data. In contrast to their direct optimization-based solution, we use a more representative centroid-like solution extracted from an ensemble of possible statistical models to explain the data. The ensemble method automatically selects a subset of most informative genes and builds a quantitatively better model for them. Genes which cluster together in the majority of models produce functionally more coherent modules. Regulators which are consistently assigned to a module are more often supported by literature, but a single model always contains many regulator assignments not supported by the ensemble. Reliably detecting condition-specific or combinatorial regulation is particularly hard in a single optimum but can be achieved using ensemble averaging. AVAILABILITY: All software developed for this study is available from http://bioinformatics.psb.ugent.be/software.


Subjects
Computational Biology/methods , Gene Regulatory Networks , Software , Analysis, Cluster , Models, Genetic , Models, Statistical
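
The idea in the abstract above that genes which cluster together in the majority of models form more coherent modules can be sketched as a co-clustering frequency count over an ensemble. This is only an illustration under stated assumptions: the function names and the representation of a model as a list of gene sets are hypothetical, and it is not the software distributed by the authors.

```python
from collections import Counter
from itertools import combinations


def co_clustering_frequency(models):
    """Fraction of models in which each gene pair shares a module.

    models: list of clusterings; each clustering is a list of sets of
            gene names (hypothetical representation of one sampled model).
    """
    counts = Counter()
    for clustering in models:
        for module in clustering:
            # Count every unordered pair of genes placed in the same module.
            for gene_a, gene_b in combinations(sorted(module), 2):
                counts[(gene_a, gene_b)] += 1
    return {pair: count / len(models) for pair, count in counts.items()}


def majority_pairs(models):
    """Gene pairs that co-cluster in more than half of the models."""
    frequency = co_clustering_frequency(models)
    return sorted(pair for pair, value in frequency.items() if value > 0.5)
```

A consensus clustering would then group genes using these pairwise frequencies; the 0.5 cut-off simply encodes "the majority of models" and is an assumption of this sketch.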