ABSTRACT
MAIN CONCLUSION: Presented here is the first Echinochloa colona leaf transcriptome. Analysis of gene expression before and after herbicide treatment reveals that E. colona mounts a stress response upon exposure to herbicide. Herbicides are the most frequently used means of controlling weeds. For many herbicides, the target site is known; however, it is considerably less clear how plant gene expression changes in response to herbicide exposure. In this study, changes in gene expression in response to herbicide exposure were examined in imazamox-sensitive (S) and -resistant (R) junglerice (Echinochloa colona L.) biotypes. As no reference genome is available for this weed, a reference leaf transcriptome was generated. Messenger RNA was isolated from imazamox-treated and untreated R and S plants, and the resulting cDNA libraries were sequenced on an Illumina HiSeq2000. The transcriptome was assembled and annotated, and differential gene expression analysis was performed to identify transcripts that were upregulated or downregulated in response to herbicide exposure in both biotypes. Differentially expressed transcripts included transcription factors, protein-modifying enzymes, and enzymes involved in metabolism and signaling. A literature search revealed that members of the families represented in this analysis are known to be involved in abiotic stress responses in other plants, suggesting that imazamox exposure induced a stress response. A time course study examining a subset of transcripts showed that expression peaked within 4-12 h and then returned to untreated levels within 48 h of exposure. Plants from two additional biotypes showed a change in gene expression 4 h after herbicide exposure similar to that of the resistant and sensitive biotypes. This study shows that within 48 h junglerice mounts a stress response to imazamox exposure.
Subject(s)
Echinochloa/genetics , Herbicides/pharmacology , Imidazoles/pharmacology , Transcriptome/drug effects , Echinochloa/drug effects , Sequence Analysis, RNA , Stress, Physiological
ABSTRACT
Motivation: The availability of databases that identify allergenic proteins via a transparent, consensus-based scientific approach is of prime importance to support the safety review of genetically modified foods and feeds, and public safety in general. In recent years, screening for potential new allergen sequences has become more complex due to the exponential increase in genomic sequence information. To address these challenges, an international collaborative scientific group coordinated by the Health and Environmental Sciences Institute (HESI) was tasked with developing a contemporary, adaptable, high-throughput process to build the COMprehensive Protein Allergen REsource (COMPARE) database, a publicly accessible allergen sequence data resource, along with bioinformatics analytical tools, following the guidelines of the FAO/WHO and the CODEX Alimentarius Commission. Results: The COMPARE process is novel in that candidate sequences are identified via an automated keyword-based sorting algorithm and manual curation of the annotated sequence entries retrieved from public protein sequence databases on a yearly basis. The process is designed for continuous improvement, with updates transparently documented in each version. As a complementary approach, a yearly keyword-based search of literature databases is added to identify new allergen sequences that have not (yet) been submitted to protein databases. In addition, comments from the independent peer-review panel are posted on the website to increase the transparency of decision making. Finally, sequence comparison capabilities associated with the COMPARE database were developed to evaluate the potential allergenicity of proteins, based on the internationally recognized guidelines of the FAO/WHO and the CODEX Alimentarius Commission.
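The automated keyword-based sorting step can be pictured with a minimal sketch. The abstract does not give the actual COMPARE keyword list or record format, so the keywords, the `SequenceEntry` type, and the accession values below are hypothetical placeholders. The sketch only shows the general idea: entries whose annotation matches a keyword go forward to manual curation.

```python
from dataclasses import dataclass

# Hypothetical keyword list; the real COMPARE criteria are not stated in the abstract.
KEYWORDS = ("allergen", "allergy", "ige-binding")


@dataclass
class SequenceEntry:
    accession: str    # placeholder identifier, not a real database accession
    description: str  # free-text annotation retrieved with the sequence


def sort_entries(entries):
    """Split entries into candidates for manual curation and the rest."""
    candidates, others = [], []
    for entry in entries:
        text = entry.description.lower()
        if any(keyword in text for keyword in KEYWORDS):
            candidates.append(entry)
        else:
            others.append(entry)
    return candidates, others


entries = [
    SequenceEntry("X0001", "Major pollen allergen Bet v 1"),
    SequenceEntry("X0002", "Ribosomal protein L3"),
    SequenceEntry("X0003", "IgE-binding protein"),
]
candidates, others = sort_entries(entries)
print([e.accession for e in candidates])  # ['X0001', 'X0003']
```

A keyword match only nominates an entry. In the process described above, the decision to include a sequence rests with manual curation and the peer-review panel.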
ABSTRACT
This chapter outlines key considerations for constructing and implementing an EST database. Rather than presenting the technological details step by step, we emphasize how to design an EST database suited to the specific needs of an EST project and how to choose the most suitable tools. Using TBestDB as an example, we illustrate the essential factors to be considered for database construction and the steps for data population and annotation. This process employs technologies such as PostgreSQL, Perl, and PHP to build the database and interface, and tools such as AutoFACT for data processing and annotation. We discuss these in comparison to other available technologies and tools, and explain the reasons for our choices.
Subject(s)
Computational Biology/methods , Expressed Sequence Tags , Internet , Animals , Computers , Databases, Genetic/trends , Electronic Data Processing , Humans , Information Storage and Retrieval/methods , Programming Languages , Software
ABSTRACT
The TBestDB database contains approximately 370,000 clustered expressed sequence tag (EST) sequences from 49 organisms, covering a taxonomically broad range of poorly studied, mainly unicellular eukaryotes, and includes experimental information, consensus sequences, gene annotations and metabolic pathway predictions. Most of these ESTs have been generated by the Protist EST Program, a collaboration among six Canadian research groups. EST sequences are read from trace files up to a minimum quality cut-off, vector and linker sequence is masked, and the ESTs are clustered using phrap. The resulting consensus sequences are automatically annotated by using the AutoFACT program. The datasets are automatically checked for clustering errors due to chimerism and potential cross-contamination between organisms, and suspect data are flagged in or removed from the database. Access to data deposited in TBestDB by individual users can be restricted to those users for a limited period. With this first report on TBestDB, we open the database to the research community for free processing, annotation, interspecies comparisons and GenBank submission of EST data generated in individual laboratories. For instructions on submission to TBestDB, contact tbestdb@bch.umontreal.ca. The database can be queried at http://tbestdb.bcm.umontreal.ca/.
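The first two processing steps described above (reading a sequence up to a minimum quality cut-off, then masking vector sequence) can be sketched as follows. This is an illustration under stated assumptions, not the TBestDB implementation: the cut-off of 20, the exact-match masking, and the function names are all hypothetical, and the abstract does not say which tools perform these steps.

```python
def trim_at_quality(bases, quals, min_quality=20):
    """Keep bases up to (not including) the first base below min_quality.

    min_quality=20 is an assumed value; the abstract gives no threshold.
    """
    for i, q in enumerate(quals):
        if q < min_quality:
            return bases[:i]
    return bases


def mask_vector(seq, vector, mask_char="X"):
    """Replace each exact occurrence of `vector` in `seq` with mask characters.

    Exact string matching is a simplification; real masking tools align
    the read against the vector and tolerate mismatches.
    """
    if not vector:
        return seq
    return seq.replace(vector, mask_char * len(vector))


bases = "GAATTCACGTACGT"
quals = [30] * 12 + [10, 30]  # quality drops below the cut-off at index 12

trimmed = trim_at_quality(bases, quals)
masked = mask_vector(trimmed, "GAATTC")
print(trimmed)  # GAATTCACGTAC
print(masked)   # XXXXXXACGTAC
```

Masking keeps the read length and coordinates intact while stopping the masked bases from contributing to clustering, which is why the sketch substitutes characters instead of deleting them.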
Subject(s)
Databases, Nucleic Acid , Expressed Sequence Tags/chemistry , Animals , Base Sequence , Cluster Analysis , Consensus Sequence , Eukaryota/genetics , Fungi/genetics , Genomics , Internet , User-Computer Interface
ABSTRACT
BACKGROUND: Assignment of function to new molecular sequence data is an essential step in genomics projects. The usual process involves similarity searches of a given sequence against one or more databases, an arduous process for large datasets. RESULTS: We present AutoFACT, a fully automated and customizable annotation tool that assigns biologically informative functions to a sequence. Key features of this tool are that it (1) analyzes nucleotide and protein sequence data; (2) determines the most informative functional description by combining multiple BLAST reports from several user-selected databases; (3) assigns putative metabolic pathways, functional classes, enzyme classes, Gene Ontology terms and locus names; and (4) generates output in HTML, text and GFF formats for the user's convenience. We have compared AutoFACT to four well-established annotation pipelines. The error rate of functional annotation is estimated to be between 1% and 2%. Comparison of AutoFACT to the traditional top-BLAST-hit annotation method shows that our procedure increases the number of functionally informative annotations by approximately 50%. CONCLUSION: AutoFACT will serve as a useful annotation tool for smaller sequencing groups lacking dedicated bioinformatics staff. It is implemented in Perl and runs on Linux/UNIX platforms. AutoFACT is available at http://megasun.bch.umontreal.ca/Software/AutoFACT.htm.
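A small sketch helps show the difference between top-BLAST-hit annotation and choosing the most informative description across several reports. It is written in Python for brevity and is not AutoFACT's code or its actual decision rules. The list of uninformative terms, the E-value threshold, the database labels `db_a` and `db_b`, and the hit tuples are all assumptions made for the example.

```python
# Assumed list of terms that mark a description as uninformative.
UNINFORMATIVE_TERMS = ("hypothetical", "unknown", "unnamed", "predicted protein")


def is_informative(description):
    """Return True if the description contains none of the uninformative terms."""
    text = description.lower()
    return not any(term in text for term in UNINFORMATIVE_TERMS)


def pick_annotation(hits_by_db, db_order, max_evalue=1e-5):
    """Return (db, description) for the first significant, informative hit.

    Databases are visited in the user-selected priority order `db_order`;
    within a database, hits are assumed to be sorted best-first.
    max_evalue=1e-5 is an assumed significance threshold.
    """
    for db in db_order:
        for description, evalue in hits_by_db.get(db, []):
            if evalue <= max_evalue and is_informative(description):
                return db, description
    return None


# Hypothetical BLAST hits as (description, E-value) pairs per database.
hits = {
    "db_a": [("hypothetical protein", 1e-50), ("ATP synthase subunit beta", 1e-40)],
    "db_b": [("ATP synthase beta chain", 1e-45)],
}

top_hit = hits["db_a"][0][0]
chosen = pick_annotation(hits, ["db_a", "db_b"])
print(top_hit)  # hypothetical protein
print(chosen)   # ('db_a', 'ATP synthase subunit beta')
```

In this toy case the top hit carries no functional information, while scanning past it recovers a usable description. That is the kind of gain the comparison with the top-BLAST-hit method refers to.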