ABSTRACT
BACKGROUND: Understanding transcriptional regulation through genome-wide microarray studies can help unravel complex relationships between genes. Attempts to standardize the annotation of microarray data include the Minimum Information About a Microarray Experiment (MIAME) recommendations, the MAGE-ML format for data interchange, and the use of controlled vocabularies or ontologies. The existing software systems for microarray data analysis implement the mentioned standards only partially and are often hard to use and extend. Integration of genomic annotation data and other sources of external knowledge using open standards is therefore a key requirement for future integrated analysis systems. RESULTS: The EMMA 2 software has been designed to resolve shortcomings with respect to full MAGE-ML and ontology support and makes use of modern data integration techniques. We present a software system that features comprehensive data analysis functions for spotted arrays and for the most common synthesized oligo arrays such as Agilent, Affymetrix and NimbleGen. The system is based on the full MAGE object model. Analysis functionality is based on R and Bioconductor packages and can make use of a compute cluster for distributed services. CONCLUSION: Our model-driven approach for automatically implementing a full MAGE object model provides high flexibility and compatibility. Data integration via SOAP-based web services is advantageous in a distributed client-server environment, as the collaborative analysis of microarray data is becoming increasingly relevant in international research consortia. The adequacy of the EMMA 2 software design and implementation has been demonstrated by its application in many distributed functional genomics projects. Its scalability makes the current architecture suited for extensions towards future transcriptomics methods based on high-throughput sequencing approaches, which have much higher computational requirements than microarrays.
Subjects
Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Software; Databases, Genetic; Genome; Internet; User-Computer Interface
ABSTRACT
BACKGROUND: Databases for sequence, annotation, or microarray experiment data are extremely beneficial to the research community, as they centrally gather information from experiments performed by different scientists. However, data from different sources develop their full capacities only when combined. The idea of a data warehouse directly addresses this problem and solves it by integrating all required data into one single database; hence there are already many data warehouses available to genetics. For the model legume Medicago truncatula, there is currently no such single data warehouse that integrates all freely available gene sequences, the corresponding gene expression data, and annotation information. Thus, we created the data warehouse TRUNCATULIX, an integrative database of Medicago truncatula sequence and expression data. RESULTS: The TRUNCATULIX data warehouse integrates five public databases for gene sequences and gene annotations, as well as a database for microarray expression data covering raw data, normalized datasets, and complete expression profiling experiments. It can be accessed via an AJAX-based web interface using a standard web browser. For the first time, users can now quickly search for specific genes and gene expression data in a huge database based on high-quality annotations. The results can be exported as Excel, HTML, or CSV files for further usage. CONCLUSION: The integration of sequence, annotation, and gene expression data from several Medicago truncatula databases in TRUNCATULIX provides the legume community with access to data and data mining capability not previously available. TRUNCATULIX is freely available at http://www.cebitec.uni-bielefeld.de/truncatulix/.
Subjects
Database Management Systems; Databases, Genetic; Medicago truncatula/genetics; Computational Biology/methods; User-Computer Interface
ABSTRACT
The great majority of terrestrial plants enter a beneficial arbuscular mycorrhiza (AM) or ectomycorrhiza (ECM) symbiosis with soil fungi. In the SPP 1084 "MolMyk: Molecular Basics of Mycorrhizal Symbioses", high-throughput EST sequencing was performed to obtain snapshots of the plant and fungal transcriptome in mycorrhizal roots and in extraradical hyphae. To focus activities, the interactions between Medicago truncatula and Glomus intraradices as well as Populus tremula and Amanita muscaria were selected as models for AM and ECM symbioses, respectively. Together, almost 20,000 expressed sequence tags (ESTs) were generated from different random and suppressive subtractive hybridization (SSH) cDNA libraries, providing a comprehensive overview of the mycorrhizal transcriptome. To automatically cluster and annotate EST sequences, the BioMake and SAMS software tools were developed. In connection with the eNorthern software SteN, plant genes with a predicted mycorrhiza-induced expression were identified. To support experimental transcriptome profiling, macro- and microarray tools have been constructed for the two model mycorrhizae, based either on PCR-amplified cDNAs or 70mer oligonucleotides. These arrays were used to profile the transcriptome of AM and ECM roots under different conditions, and the data obtained were uploaded to the ArrayLIMS and EMMA databases, which are designed to store and evaluate expression profiles from DNA arrays. Together, the EST and transcriptome databases can be mined to identify candidate genes for targeted functional studies.
Subjects
Computational Biology/methods; Expressed Sequence Tags; Mycorrhizae/genetics; Oligonucleotide Array Sequence Analysis/methods; Symbiosis/genetics; Transcription, Genetic/genetics
ABSTRACT
BACKGROUND: Expressed Sequence Tags (ESTs) are generally used to gain a first insight into gene activities from a species of interest. Subsequently, and typically based on a combination of EST and genome sequences, microarray-based expression analyses are performed for a variety of conditions. In some cases, a multitude of EST and microarray experiments are conducted for one species, covering different tissues, cell states, and cell types. Under these circumstances, the challenge arises to combine results derived from the different expression profiling strategies, with the goal of uncovering novel information on the basis of the integrated datasets. FINDINGS: Using our new analysis tool, MediPlEx (MEDIcago truncatula multiPLe EXpression analysis), expression data from EST experiments, oligonucleotide microarrays and Affymetrix GeneChips® can be combined and analyzed, leading to a novel approach to integrated transcriptome analysis. We have validated our tool via the identification of a set of well-characterized AM-specific and AM-induced marker genes, identified by MediPlEx on the basis of in silico and experimental gene expression profiles from roots colonized with AM fungi. CONCLUSIONS: MediPlEx offers an integrated analysis pipeline for different sets of expression data generated for the model legume Medicago truncatula. As expected, in silico and experimental gene expression data that cover the same biological condition correlate well. The collection of differentially expressed genes identified via MediPlEx provides a starting point for functional studies in plant mutants. MediPlEx can be used freely at http://www.cebitec.uni-bielefeld.de/mediplex.
ABSTRACT
In botryllid ascidians, allogeneic contacts between histoincompatible colonies lead to inflammatory rejection responses, which eventually separate the interacting colonies. In order to elucidate the molecular background of allogeneic rejection in the colonial ascidian Botryllus schlosseri, we performed microarray assays verified by qPCR, and employed bioinformatic analyses of the results, revealing disparate transcription profiles of the rejecting partners. While only minor expression changes were documented during rejection when both interacting genotypes were pooled together, analyses performed on each genotype separately portrayed disparate transcriptome responses. Allogeneic interacting genotypes that developed the morphological markers of rejection (points of rejection; PORs), termed 'rejected' genotypes, showed transcription inhibition of key functional gene groups, including protein biosynthesis, cell structure and motility, and stress response genes. In contrast, the allogeneic partners that did not show PORs, termed 'rejecting' genotypes, showed minor expression changes that were different from those of the 'rejected' genotypes. These data demonstrate that the observed morphological changes in the 'rejected' genotypes are not due to an active transcriptional response to the immune challenge but reflect transcription inhibition of response elements. Based on the morphological and molecular outcomes, we suggest that the 'rejected' colony activates an injurious self-destructive mechanism in order to disconnect itself from its histoincompatible neighboring colony.
Subjects
Gene Expression Profiling; Urochordata/immunology; Animals; Immunity, Innate; Polymerase Chain Reaction
ABSTRACT
In order to aid gene discovery and uncover genes responding to abiotic stressors in stress-tolerant brown algae of the genus Fucus, expressed sequence tags (ESTs) were studied in two species, Fucus serratus and Fucus vesiculosus. Clustering of over 12,000 ESTs from three libraries for heat shock/recovery and desiccation/rehydration resulted in identification of 2,503, 1,290, and 2,409 unigenes from heat-shocked F. serratus, desiccated F. serratus, and desiccated F. vesiculosus, respectively. Low overall annotation rates (18-31%) were strongly associated with the presence of long 3' untranslated regions in Fucus transcripts, as shown by analyses of predicted protein-coding sequence in annotated and nonannotated tentative consensus sequences. Posttranslational modification genes were overrepresented in the heat shock/recovery library, including many chaperones, the most abundant of which were a family of small heat shock protein transcripts, Hsp90 and Hsp70 members. Transcripts of LI818-like light-harvesting genes implicated in photoprotection were also expressed during heat shock in high light. The expression of several heat-shock-responsive genes was confirmed by quantitative reverse transcription polymerase chain reaction. However, candidate genes were notably absent from both desiccation/rehydration libraries, while the responses of the two species to desiccation were divergent, perhaps reflecting the species-specific physiological differences in stress tolerance previously established. Desiccation-tolerant F. vesiculosus overexpressed at least 17 ribosomal protein genes and two ubiquitin-ribosomal protein fusion genes, suggesting that ribosome function and/or biogenesis are important during cycles of rapid desiccation and rehydration in the intertidal zone and possibly indicate parallels with other poikilohydric organisms such as desiccation-tolerant bryophytes.
Subjects
Dehydration/genetics; Expressed Sequence Tags; Fucus/genetics; Gene Expression Regulation/genetics; Heat-Shock Proteins/genetics; Phylogeny; Stress, Physiological/genetics; Base Sequence; Cluster Analysis; DNA Primers/genetics; Fucus/metabolism; Gene Library; Heat-Shock Proteins/metabolism; Light-Harvesting Protein Complexes/genetics; Light-Harvesting Protein Complexes/metabolism; Likelihood Functions; Molecular Sequence Data; Portugal; Reverse Transcriptase Polymerase Chain Reaction; Sequence Analysis, DNA; Species Specificity; Untranslated Regions/genetics
ABSTRACT
DNA sequencing plays an increasingly important role in various fields of genetics. This includes sequencing of whole genomes, libraries of cDNA clones and probes of metagenome communities. The applied sequencing technologies evolve continuously. With the emergence of ultrafast sequencing technologies, a new era of DNA sequencing has recently started. Concurrently, the need for adapted bioinformatics tools arises. Since the ability to process current datasets efficiently is essential for modern genetics, a modular bioinformatics platform providing extensive sequence analysis methods is well suited to meeting the constantly growing requirements. The Sequence Analysis and Management System (SAMS) is a bioinformatics software platform with a database backend designed to support the computational analysis of (1) whole genome shotgun (WGS) bacterial genome sequencing, (2) cDNA sequencing by reading expressed sequence tags (ESTs), and (3) sequence data obtained by ultrafast sequencing. It provides extensive bioinformatics analysis of sequenced single reads, sequencing libraries and fragments of arbitrary DNA sequences, such as assembled contigs of metagenome reads. The system has been implemented to cope with several thousands of sequences, efficiently processing them and storing the results for further analysis. At project setup, SAMS automatically recognizes the data type.
Subjects
Genome, Bacterial; Information Management/methods; Information Storage and Retrieval/methods; Sequence Analysis, DNA/methods; Software; Cluster Analysis; Computational Biology; Databases, Genetic; Expressed Sequence Tags; Gene Library; Genomics; Oligonucleotide Array Sequence Analysis; User-Computer Interface
ABSTRACT
The arbuscular mycorrhizal (AM) association between terrestrial plants and soil fungi of the phylum Glomeromycota is the most widespread beneficial plant-microbe interaction on earth. In the course of the symbiosis, fungal hyphae colonise plant roots and supply limiting nutrients, in particular phosphorus, in exchange for carbon compounds. Owing to the obligate biotrophy of mycorrhizal fungi and the lack of genetic systems to study them, targeted molecular studies on AM symbioses proved to be difficult. With the emergence of plant genomics and the selection of suitable models, the application of untargeted expression profiling experiments became possible. In the model legume Medicago truncatula, high-throughput expressed sequence tag (EST) sequencing in conjunction with in silico and experimental transcriptome profiling provided transcriptional snapshots that together defined the global genetic program activated during AM. Owing to the asynchronous development of the symbiosis, several hundred genes found to be activated during the symbiosis cannot be easily correlated with symbiotic structures, but the expression analysis of selected genes has been extended to the cellular level to correlate gene expression with specific stages of AM development. These approaches identified marker genes for the AM symbiosis and provided the first insights into the molecular basis of gene expression regulation during AM.