Search | Biblioteca Virtual em Saúde

No-match ORESTES explored as tumor markers.

Mello, Barbara P; Abrantes, Eduardo F; Torres, César H; Machado-Lima, Ariane; Fonseca, Rogério da Silva; Carraro, Dirce M; Brentani, Ricardo R; Reis, Luiz F L; Brentani, Helena.

Nucleic Acids Res ; 37(8): 2607-17, 2009 May.

Article in English | MEDLINE | ID: mdl-19270067

ABSTRACT

Sequencing technologies and new bioinformatics tools have led to the complete sequencing of various genomes. However, the human transcriptome and its annotation remain incomplete. The Human Cancer Genome Project, using the ORESTES (open reading frame EST sequences) methodology, contributed to this objective by generating about 1.2 million expressed sequence tags. Approximately 30% of these sequences did not align to ESTs in the public databases and were considered no-match ORESTES. On the basis that a subset of these ESTs could represent new transcripts, we constructed a cDNA microarray and hybridized it against 12 different normal or tumor tissues. We identified 3421 transcribed regions not associated with annotated transcripts, representing 83.3% of the platform. The total number of differentially expressed sequences was 1007. Also, 28% of analyzed sequences could represent noncoding RNAs. Our data reinforce the view that the human genome is pervasively transcribed and point to molecular marker candidates for different cancers. To support these data, we confirmed by real-time PCR the differential expression of three out of eight potential tumor markers in prostate tissues. Lists of the 1007 differentially expressed sequences and the 291 potentially noncoding tumor markers are provided.

Subjects

Biomarkers, Tumor/biosynthesis , Expressed Sequence Tags , RNA, Untranslated/biosynthesis , Biomarkers, Tumor/genetics , Chromosome Mapping , Expressed Sequence Tags/chemistry , Gene Expression Profiling , Genome, Human , Genomics , Humans , Male , Oligonucleotide Array Sequence Analysis , Polymerase Chain Reaction , Prostatic Neoplasms/genetics , Prostatic Neoplasms/metabolism , RNA, Neoplasm/biosynthesis , Transcription, Genetic

Mining ORESTES no-match database: can we still contribute to cancer transcriptome?

Fonseca, Rogério da Silva; Carraro, Dirce Maria; Brentani, Helena.

Genet Mol Res ; 5(1): 24-32, 2006 Mar 31.

Article in English | MEDLINE | ID: mdl-16755494

ABSTRACT

The Human Cancer Genome Project generated about 1 million expressed sequence tags by the ORESTES method, principally with the aim of obtaining data from cancer. Of this total, 341,680 showed no similarity with sequences in the public transcript databases and are referred to as "no-match". Some of them represent low-abundance or difficult-to-detect human transcripts, but others represent genomic contamination or immature mRNA. We ran a bioinformatics pipeline to determine the novelty of ORESTES "no-match" datasets from prostate or breast tissues. We started with 14,908 clusters mapped on the human genome. A total of 2226 clusters originating from more than two libraries, or singletons with gaps upon genome alignment, were selected. Ninety-four clusters with canonical splice sites, the most stringent criterion for being considered a gene, were subjected to manual inspection of their genomic hits. Of the manually inspected clusters, 49.6% contained new sequences: 42.2% were probable low-expression alternative forms of characterized genes and 7.4% were unpredicted genes. RT-PCR followed by sequencing was performed to validate the largest spliced sequence from 8 clusters, confirming five sequences as true human transcript fragments. Some of them were differentially expressed between tumor and normal tissue in an in silico analysis. We conclude that after cleanup of the no-match dataset, about 939 new exons and 165 unpredicted genes remain that could complete the prostate or breast transcriptome.

Subjects

Breast Neoplasms/genetics , Expressed Sequence Tags , Open Reading Frames/genetics , Prostatic Neoplasms/genetics , Transcription, Genetic/genetics , Cluster Analysis , Databases, Genetic , Female , Genome, Human/genetics , Humans , Male , Reverse Transcriptase Polymerase Chain Reaction

Avaliação das ORESTES NO MATCH geradas pelo Projeto Genoma Humano do Câncer (LICR/FAPESP-HCGP) / Mining ORESTES no-match database

Fonseca, Rogério da Silva.

São Paulo; s.n; 2005. 44 p. illus., tab.

Thesis in Portuguese | Inca | ID: biblio-1118031

ABSTRACT

The Human Cancer Genome Project generated about 1 million Expressed Sequence Tags (ESTs) by the ORESTES methodology, principally aiming to obtain data on cancer. Of this total, 341,680 showed no similarity with sequences deposited in the public transcript databases and are therefore referred to as "no-match". Some of these sequences may represent low-abundance or hard-to-detect human transcripts. In this work, we created a bioinformatics pipeline to identify the no-match sequences most likely to represent human transcripts and applied it to 14,908 sequence clusters, each containing at least one no-match sequence derived from prostate or breast tissue. The selection criteria of the pipeline were: 1) use only sequences with high identity to the human genome; 2) clusters containing sequences from different libraries; 3) clusters containing spliced sequences; and 4) of these, only clusters whose sequences show canonical splice sites. A total of 2,226 clusters passed the first two criteria, and 8.4% of these had spliced sequences when aligned against the genome. Of this subset, 94 clusters had sequences with canonical splice sites and were subjected to manual inspection to confirm their alignments. Sixty of these clusters proved to contain new sequences, and eight of those, which aligned to ab initio predicted genes, were selected for experimental validation. RT-PCR followed by sequencing was performed to validate the largest spliced sequence of each selected cluster, confirming five sequences as true human transcript fragments. Two of them were differentially expressed between tumor and normal tissue in an in silico expression analysis.
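The four selection criteria listed in the abstract can be read as a chain of filters over clusters. The following Python sketch only illustrates that reading and is not the thesis's actual pipeline: the `Alignment` and `Cluster` types, the 0.95 identity cutoff, the restriction of "canonical" to GT-AG introns, and the library names are all assumptions made for this example.

```python
from dataclasses import dataclass, field

# Assumption: "canonical" is taken to mean the GT-AG donor/acceptor pair only.
CANONICAL_SITES = {("GT", "AG")}


@dataclass
class Alignment:
    """One EST aligned to the genome (hypothetical structure)."""
    library: str                                  # cDNA library of origin
    identity: float                               # fraction identity to the genome, 0..1
    introns: list = field(default_factory=list)   # (donor, acceptor) dinucleotides per gap


@dataclass
class Cluster:
    cluster_id: str
    alignments: list


def select_clusters(clusters, min_identity=0.95):
    """Return IDs of clusters passing all four criteria, in input order."""
    selected = []
    for cluster in clusters:
        # 1) keep only sequences with high identity to the genome
        kept = [a for a in cluster.alignments if a.identity >= min_identity]
        # 2) remaining sequences must come from different libraries
        if len({a.library for a in kept}) < 2:
            continue
        # 3) at least one remaining sequence must be spliced
        spliced = [a for a in kept if a.introns]
        if not spliced:
            continue
        # 4) at least one spliced sequence has only canonical splice sites
        if any(all(site in CANONICAL_SITES for site in a.introns) for a in spliced):
            selected.append(cluster.cluster_id)
    return selected


# Made-up example data; none of these values come from the study.
example = [
    Cluster("c1", [Alignment("prostate_1", 0.99, [("GT", "AG")]),
                   Alignment("breast_1", 0.98)]),
    Cluster("c2", [Alignment("prostate_1", 0.99, [("CT", "AC")]),
                   Alignment("breast_1", 0.98)]),      # non-canonical sites
    Cluster("c3", [Alignment("prostate_1", 0.99, [("GT", "AG")])]),  # one library
    Cluster("c4", [Alignment("prostate_1", 0.80, [("GT", "AG")]),
                   Alignment("breast_1", 0.99)]),      # spliced read fails identity
]

print(select_clusters(example))  # prints ['c1']
```

Applying the identity filter to individual sequences before the cluster-level checks follows the wording of criterion 1 ("use only sequences with high identity"). A cluster such as `c4` therefore drops out once its only spliced read is discarded.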

Subjects

Humans , Male , Female , Data Display , Human Genome Project , Computational Biology , Neoplasms , Computer Simulation , Database
