Results 1 - 7 of 7
1.
Nat Genet ; 38(10): 1151-8, 2006 Oct.
Article in English | MEDLINE | ID: mdl-16951679

ABSTRACT

Many animal and plant genomes are transcribed much more extensively than current annotations predict. However, the biological function of these unannotated transcribed regions is largely unknown. Approximately 7% and 23% of the detected transcribed nucleotides during D. melanogaster embryogenesis map to unannotated intergenic and intronic regions, respectively. Based on computational analysis of coordinated transcription, we conservatively estimate that 29% of all unannotated transcribed sequences function as missed or alternative exons of well-characterized protein-coding genes. We estimate that 15.6% of intergenic transcribed regions function as missed or alternative transcription start sites (TSS) used by 11.4% of the expressed protein-coding genes. Identification of P element mutations within or near newly identified 5' exons provides a strategy for mapping previously uncharacterized mutations to their respective genes. Collectively, these data indicate that at least 85% of the fly genome is transcribed and processed into mature transcripts representing at least 30% of the fly genome.


Subjects
Drosophila melanogaster/embryology , Drosophila melanogaster/genetics , Developmental Gene Expression Regulation , Genetic Transcription , Amino Acid Sequence , Animals , Intergenic DNA , Drosophila Proteins/genetics , Nonmammalian Embryo , Exons , Insect Genome , Molecular Sequence Data , Mutation , Oligonucleotide Array Sequence Analysis , Transcription Initiation Site
2.
Genome Res ; 17(6): 746-59, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17567994

ABSTRACT

This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.


Subjects
Chromosome Mapping , Exons , Human Genome , Genetic Promoter Regions , Quantitative Trait Loci , Genetic Transcription/physiology , Complementary DNA/genetics , Human Genome Project , Humans , Open Reading Frames
3.
Science ; 316(5830): 1484-8, 2007 Jun 08.
Article in English | MEDLINE | ID: mdl-17510325

ABSTRACT

Significant fractions of eukaryotic genomes give rise to RNA, much of which is unannotated and has reduced protein-coding potential. The genomic origins and the associations of human nuclear and cytosolic polyadenylated RNAs longer than 200 nucleotides (nt) and whole-cell RNAs less than 200 nt were investigated in this genome-wide study. Subcellular addresses for nucleotides present in detected RNAs were assigned, and their potential processing into short RNAs was investigated. Taken together, these observations suggest a novel role for some unannotated RNAs as primary transcripts for the production of short RNAs. Three potentially functional classes of RNAs have been identified, two of which are syntenically conserved and correlate with the expression state of protein-coding genes. These data support a highly interleaved organization of the human transcriptome.


Subjects
Human Genome , RNA Precursors/genetics , Messenger RNA/genetics , Messenger RNA/metabolism , RNA/genetics , Genetic Transcription , Animals , Tumor Cell Line , Cell Nucleus/metabolism , Cytosol/metabolism , Exons , Gene Expression , Genome , HeLa Cells , Humans , Mice , Genetic Promoter Regions , RNA/metabolism , RNA Precursors/metabolism , Synteny , Genetic Terminator Regions
4.
Genome Res ; 15(7): 987-97, 2005 Jul.
Article in English | MEDLINE | ID: mdl-15998911

ABSTRACT

Recently, we mapped the sites of transcription across approximately 30% of the human genome and elucidated the structures of several hundred novel transcripts. In this report, we describe a novel combination of techniques including the rapid amplification of cDNA ends (RACE) and tiling array technologies that was used to further characterize transcripts in the human transcriptome. This technical approach allows for several important pieces of information to be gathered about each array-detected transcribed region, including strand of origin, start and termination positions, and the exonic structures of spliced and unspliced coding and noncoding RNAs. In this report, the structures of transcripts from 14 transcribed loci, representing both known genes and unannotated transcripts taken from the several hundred randomly selected unannotated transcripts described in our previous work are represented as examples of the complex organization of the human transcriptome. As a consequence of this complexity, it is not unusual that a single base pair can be part of an intricate network of multiple isoforms of overlapping sense and antisense transcripts, the majority of which are unannotated. Some of these transcripts follow the canonical splicing rules, whereas others combine the exons of different genes or represent other types of noncanonical transcripts. These results have important implications concerning the correlation of genotypes to phenotypes, the regulation of complex interlaced transcriptional patterns, and the definition of a gene.


Subjects
Nucleic Acid Amplification Techniques , Oligonucleotide Array Sequence Analysis , Genetic Transcription , Cell Line , Gene Expression Profiling , Humans , Jurkat Cells , Genetic Models , Molecular Sequence Data , Nucleic Acid Amplification Techniques/methods , Oligonucleotide Array Sequence Analysis/methods , Protein Isoforms/genetics , Cultured Tumor Cells
5.
Science ; 308(5725): 1149-54, 2005 May 20.
Article in English | MEDLINE | ID: mdl-15790807

ABSTRACT

Sites of transcription of polyadenylated and nonpolyadenylated RNAs for 10 human chromosomes were mapped at 5-base pair resolution in eight cell lines. Unannotated, nonpolyadenylated transcripts comprise the major proportion of the transcriptional output of the human genome. Of all transcribed sequences, 19.4, 43.7, and 36.9% were observed to be polyadenylated, nonpolyadenylated, and bimorphic, respectively. Half of all transcribed sequences are found only in the nucleus and for the most part are unannotated. Overall, the transcribed portions of the human genome are predominantly composed of interlaced networks of both poly A+ and poly A- annotated transcripts and unannotated transcripts of unknown function. This organization has important implications for interpreting genotype-phenotype associations, regulation of gene expression, and the definition of a gene.


Subjects
Human Chromosomes/genetics , Human Genome , Messenger RNA/analysis , Genetic Transcription , Cell Line , Tumor Cell Line , Cell Nucleus/metabolism , Human Chromosome Pair 13/genetics , Human Chromosome Pair 14/genetics , Human Chromosome Pair 19/genetics , Human Chromosome Pair 20/genetics , Human Chromosome Pair 21/genetics , Human Chromosome Pair 22/genetics , Human Chromosome Pair 6/genetics , Human Chromosome Pair 7/genetics , Human X Chromosomes/genetics , Human Y Chromosomes/genetics , Computational Biology , Cytosol/metabolism , Complementary DNA , Intergenic DNA , Exons , Female , Humans , Introns , Male , Molecular Sequence Data , Nucleic Acid Amplification Techniques , Oligonucleotide Array Sequence Analysis , Physical Chromosome Mapping , RNA Splicing
6.
Genome Res ; 14(12): 2424-9, 2004 Dec.
Article in English | MEDLINE | ID: mdl-15574821

ABSTRACT

The completion of the mouse and other mammalian genome sequences will provide necessary, but not sufficient, knowledge for an understanding of much of mouse biology at the molecular level. As a requisite next step in this process, the genes in mouse and their structure must be elucidated. In particular, knowledge of the transcriptional start site of these genes will be necessary for further study of their regulatory regions. To assess the current state of mouse genome annotation to support this activity, we identified several hundred gene predictions in mouse with varying levels of supporting evidence and tested them using RACE-PCR. Modifications were made to the procedure allowing pooling of RNA samples, resulting in a scaleable procedure. The results illustrate potential errors or omissions in the current 5' end annotations in 58% of the genes detected. In testing experimentally unsupported gene predictions, we were able to identify 58 that are not usually annotated as genes but produced spliced transcripts (approximately 25% success rate). In addition, in many genes we were able to detect novel exons not predicted by any gene prediction algorithms. In 19.8% of the genes detected in this study, multiple transcript species were observed. These data show an urgent need to provide direct experimental validation of gene annotations. Moreover, these results show that direct validation using RACE-PCR can be an important component of genome-wide validation. This approach can be a useful tool in the ongoing efforts to increase the quality of gene annotations, especially transcriptional start sites, in complex genomes.


Subjects
Genes/genetics , Genome , Mice/genetics , Open Reading Frames/genetics , Transcription Initiation Site , Animals , Base Sequence , CpG Islands/genetics , DNA Primers , Complementary DNA/genetics , Exons/genetics , Molecular Sequence Data , Polymerase Chain Reaction/methods , DNA Sequence Analysis
7.
Science ; 302(5653): 2115-7, 2003 Dec 19.
Article in English | MEDLINE | ID: mdl-14684820

ABSTRACT

Gene enrichment strategies offer an alternative to sequencing large and repetitive genomes such as that of maize. We report the generation and analysis of nearly 100,000 undermethylated (or methylation filtration) maize sequences. Comparison with the rice genome reveals that methylation filtration results in a more comprehensive representation of maize genes than those that result from expressed sequence tags or transposon insertion sites sequences. About 7% of the repetitive DNA is unmethylated and thus selected in our libraries, but potentially active transposons and unmethylated organelle genomes can be identified. Reverse transcription polymerase chain reaction can be used to finish the maize transcriptome.


Subjects
DNA Methylation , Plant Genome , DNA Sequence Analysis/methods , Zea mays/genetics , Algorithms , Bacterial Artificial Chromosomes , Molecular Cloning , Computational Biology , Conserved Sequence , Contig Mapping , CpG Islands , DNA Transposable Elements , Chloroplast DNA/genetics , Complementary DNA , Mitochondrial DNA/genetics , Plant DNA/genetics , Nucleic Acid Databases , Escherichia coli/genetics , Exons , Expressed Sequence Tags , Plant Genes , Genomic Library , Oryza/genetics , Repetitive Nucleic Acid Sequences , Retroelements , Reverse Transcriptase Polymerase Chain Reaction