Identification of unannotated exons of low abundance transcripts in Drosophila melanogaster and cloning of a new serine protease gene upregulated upon injury.

Maia, Rafaela M; Valente, Valeria; Cunha, Marco A V; Sousa, Josane F; Araujo, Daniela D; Silva, Wilson A; Zago, Marco A; Dias-Neto, Emmanuel; Souza, Sandro J; Simpson, Andrew J G; Monesi, Nadia; Ramos, Ricardo G P; Espreafico, Enilza M; Paçó-Larson, Maria L.

BMC Genomics ; 8: 249, 2007 Jul 24.

Article in English | MEDLINE | ID: mdl-17650329

ABSTRACT

BACKGROUND: The sequencing of the D. melanogaster genome revealed an unexpectedly small number of genes (~14,000), indicating that mechanisms that generate transcript diversity must have played a major role in the evolution of complex metazoans. Among the most extensively used mechanisms that account for this diversity is alternative splicing. It is estimated that over 40% of Drosophila protein-coding genes contain one or more alternative exons. A recent transcription map of Drosophila embryogenesis indicates that 30% of the transcribed regions are unannotated, and that one third of these are estimated to be missed or alternative exons of previously characterized protein-coding genes. Therefore, the identification of the variety of expressed transcripts depends on experimental data for its final validation and is continuously being performed using different approaches. We applied the Open Reading Frame Expressed Sequence Tags (ORESTES) methodology, which is capable of generating cDNA data from the central portion of rare transcripts, to investigate the presence of hitherto unannotated regions of the Drosophila transcriptome. RESULTS: Bioinformatic analysis of 1,303 Drosophila ORESTES clusters identified 68 sequences derived from unannotated regions in the current Drosophila genome version (4.3). Of these, a set of 38 was analysed by polyA+ northern blot hybridization, validating 17 (50%) new exons of low abundance transcripts. For one of these ESTs, we obtained the cDNA encompassing the complete coding sequence of a new serine protease, named SP212. The SP212 gene is part of a serine protease gene cluster located in the chromosome region 88A12-B1. This cluster includes the predicted genes CG9631, CG9649 and CG31326, which were previously identified as up-regulated after immune challenges in genomic-scale microarray analysis. In agreement with the proposal that this locus is co-regulated in response to microbial infection, we show here that SP212 is also up-regulated upon injury. CONCLUSION: Using the ORESTES methodology we identified 17 novel exons from low abundance Drosophila transcripts, and through a PCR approach the complete CDS of one of these transcripts was defined. Our results show that computational identification and manual inspection are not sufficient to annotate a genome in the absence of experimentally derived data.

Subject(s)

Drosophila melanogaster/genetics; Exons; RNA, Messenger/analysis; Serine Endopeptidases/genetics; Wounds and Injuries/genetics; Algorithms; Amino Acid Sequence; Animals; Base Sequence; Cloning, Molecular; Computational Biology; Databases, Nucleic Acid; Drosophila melanogaster/microbiology; Gene Expression Regulation, Enzymologic; Infections/genetics; Molecular Sequence Data; Open Reading Frames; Phylogeny; Up-Regulation; Wounds and Injuries/microbiology

The use of Open Reading frame ESTs (ORESTES) for analysis of the honey bee transcriptome.

Nunes, Francis M F; Valente, Valeria; Sousa, Josane F; Cunha, Marco A V; Pinheiro, Daniel G; Maia, Rafaela M; Araujo, Daniela D; Costa, Maria C R; Martins, Waleska K; Carvalho, Alex F; Monesi, Nadia; Nascimento, Adriana M; Peixoto, Pablo M V; Silva, Maria F R; Ramos, Ricardo G P; Reis, Luis F L; Dias-Neto, Emmanuel; Souza, Sandro J; Simpson, Andrew J G; Zago, Marco A; Soares, Ademilson E E; Bitondi, Marcia M G; Espreafico, Enilza M; Espindola, Foued S; Paco-Larson, Maria L; Simoes, Zila L P; Hartfelder, Klaus; Silva, Wilson A.

BMC Genomics ; 5: 84, 2004 Nov 03.

Article in English | MEDLINE | ID: mdl-15527499

ABSTRACT

BACKGROUND: The ongoing efforts to sequence the honey bee genome require additional initiatives to define its transcriptome. Towards this end, we employed the Open Reading frame ESTs (ORESTES) strategy to generate profiles for the life cycle of Apis mellifera workers. RESULTS: Of the 5,021 ORESTES, 35.2% matched previously deposited Apis ESTs. The analysis of the remaining sequences defined a set of putative orthologs, most of which had their best-match hits with Anopheles and Drosophila genes. CAP3 assembly of the Apis ORESTES with the already existing 15,500 Apis ESTs generated 3,408 contigs. BLASTX comparison of these contigs with protein sets of organisms representing distinct phylogenetic clades revealed a total of 1,629 contigs that Apis mellifera shares with different taxa. Most (41%) represent genes that are common to all taxa, another 21% are shared between metazoans (Bilateria), and 16% are shared only within the Insecta clade. A set of 23 putative genes presented a best match with human genes, many of which encode factors related to cell signaling/signal transduction. 1,779 contigs (52%) did not match any known sequence. Applying a correction factor deduced from a parallel analysis performed with Drosophila melanogaster ORESTES, we estimate that approximately half of these no-match EST contigs (22%) should represent Apis-specific genes. CONCLUSIONS: The versatile and cost-efficient ORESTES approach produced minilibraries for honey bee life cycle stages. Such information on central gene regions contributes to genome annotation and also lends itself to cross-transcriptome comparisons to reveal evolutionary trends in insect genomes.

Subject(s)

Bees/genetics; Expressed Sequence Tags; Open Reading Frames/genetics; Transcription, Genetic/genetics; Animals; Anopheles/genetics; Caenorhabditis elegans; Classification; Cluster Analysis; Contig Mapping/statistics & numerical data; Drosophila melanogaster/genetics; Genes, Helminth/genetics; Genes, Insect/genetics; Genome; Genome, Fungal; Genome, Human; Genome, Protozoan; Humans

A transcript finishing initiative for closing gaps in the human transcriptome.

Sogayar, Mari Cleide; Camargo, Anamaria A; Bettoni, Fabiana; Carraro, Dirce Maria; Pires, Lilian C; Parmigiani, Raphael B; Ferreira, Elisa N; de Sá Moreira, Eloísa; do Rosário D de O Latorre, Maria; Simpson, Andrew J G; Cruz, Luciana Oliveira; Degaki, Theri Leica; Festa, Fernanda; Massirer, Katlin B; Sogayar, Mari C; Filho, Fernando Camargo; Camargo, Luiz Paulo; Cunha, Marco A V; De Souza, Sandro J; Faria, Milton; Giuliatti, Silvana; Kopp, Leonardo; de Oliveira, Paulo S L; Paiva, Paulo B; Pereira, Anderson A; Pinheiro, Daniel G; Puga, Renato D; S de Souza, Jorge Estefano; Albuquerque, Dulcineia M; Andrade, Luís E C; Baia, Gilson S; Briones, Marcelo R S; Cavaleiro-Luna, Ana M S; Cerutti, Janete M; Costa, Fernando F; Costanzi-Strauss, Eugenia; Espreafico, Enilza M; Ferrasi, Adriana C; Ferro, Emer S; Fortes, Maria A H Z; Furchi, Joelma R F; Giannella-Neto, Daniel; Goldman, Gustavo H; Goldman, Maria H S; Gruber, Arthur; Guimarães, Gustavo S; Hackel, Christine; Henrique-Silva, Flavio; Kimura, Edna T; Leoni, Suzana G.

Genome Res ; 14(7): 1413-23, 2004 Jul.

Article in English | MEDLINE | ID: mdl-15197164

ABSTRACT

We report the results of a transcript finishing initiative, undertaken for the purpose of identifying and characterizing novel human transcripts, in which RT-PCR was used to bridge gaps between paired EST clusters, mapped against the genomic sequence. Each pair of EST clusters selected for experimental validation was designated a transcript finishing unit (TFU). A total of 489 TFUs were selected for validation, and an overall efficiency of 43.1% was achieved. We generated a total of 59,975 bp of transcribed sequences organized into 432 exons, contributing to the definition of the structure of 211 human transcripts. The structure of several transcripts reported here was confirmed during the course of this project, through the generation of their corresponding full-length cDNA sequences. Nevertheless, for 21% of the validated TFUs, a full-length cDNA sequence is not yet available in public databases, and the structure of 69.2% of these TFUs was not correctly predicted by computer programs. The TF strategy provides a significant contribution to the definition of the complete catalog of human genes and transcripts, because it appears to be particularly useful for identification of low abundance transcripts expressed in a restricted set of tissues as well as for the delineation of gene boundaries and alternatively spliced isoforms.

Subject(s)

Software; Transcription, Genetic/genetics; Alternative Splicing/genetics; Cell Line; Cell Line, Tumor; Computational Biology/methods; Computational Biology/statistics & numerical data; Consensus Sequence/genetics; DNA, Neoplasm; Databases, Genetic/classification; Expressed Sequence Tags; Genes/genetics; Genome, Human; HeLa Cells/pathology; Humans; Molecular Sequence Data; Open Reading Frames/genetics; Software Design; Software Validation; U937 Cells/pathology
