1.
Nat Genet ; 38(10): 1151-8, 2006 Oct.
Article in English | MEDLINE | ID: mdl-16951679

ABSTRACT

Many animal and plant genomes are transcribed much more extensively than current annotations predict. However, the biological function of these unannotated transcribed regions is largely unknown. Approximately 7% and 23% of the detected transcribed nucleotides during D. melanogaster embryogenesis map to unannotated intergenic and intronic regions, respectively. Based on computational analysis of coordinated transcription, we conservatively estimate that 29% of all unannotated transcribed sequences function as missed or alternative exons of well-characterized protein-coding genes. We estimate that 15.6% of intergenic transcribed regions function as missed or alternative transcription start sites (TSS) used by 11.4% of the expressed protein-coding genes. Identification of P element mutations within or near newly identified 5' exons provides a strategy for mapping previously uncharacterized mutations to their respective genes. Collectively, these data indicate that at least 85% of the fly genome is transcribed and processed into mature transcripts representing at least 30% of the fly genome.


Subject(s)
Drosophila melanogaster/embryology; Drosophila melanogaster/genetics; Gene Expression Regulation, Developmental; Transcription, Genetic; Amino Acid Sequence; Animals; DNA, Intergenic; Drosophila Proteins/genetics; Embryo, Nonmammalian; Exons; Genome, Insect; Molecular Sequence Data; Mutation; Oligonucleotide Array Sequence Analysis; Transcription Initiation Site
2.
Genome Res ; 17(6): 746-59, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17567994

ABSTRACT

This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.


Subject(s)
Chromosome Mapping; Exons; Genome, Human; Promoter Regions, Genetic; Quantitative Trait Loci; Transcription, Genetic/physiology; DNA, Complementary/genetics; Human Genome Project; Humans; Open Reading Frames
3.
Science ; 316(5830): 1484-8, 2007 Jun 08.
Article in English | MEDLINE | ID: mdl-17510325

ABSTRACT

Significant fractions of eukaryotic genomes give rise to RNA, much of which is unannotated and has reduced protein-coding potential. The genomic origins and the associations of human nuclear and cytosolic polyadenylated RNAs longer than 200 nucleotides (nt) and whole-cell RNAs less than 200 nt were investigated in this genome-wide study. Subcellular addresses for nucleotides present in detected RNAs were assigned, and their potential processing into short RNAs was investigated. Taken together, these observations suggest a novel role for some unannotated RNAs as primary transcripts for the production of short RNAs. Three potentially functional classes of RNAs have been identified, two of which are syntenically conserved and correlate with the expression state of protein-coding genes. These data support a highly interleaved organization of the human transcriptome.


Subject(s)
Genome, Human; RNA Precursors/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA/genetics; Transcription, Genetic; Animals; Cell Line, Tumor; Cell Nucleus/metabolism; Cytosol/metabolism; Exons; Gene Expression; Genome; HeLa Cells; Humans; Mice; Promoter Regions, Genetic; RNA/metabolism; RNA Precursors/metabolism; Synteny; Terminator Regions, Genetic
4.
Genome Res ; 15(7): 987-97, 2005 Jul.
Article in English | MEDLINE | ID: mdl-15998911

ABSTRACT

Recently, we mapped the sites of transcription across approximately 30% of the human genome and elucidated the structures of several hundred novel transcripts. In this report, we describe a novel combination of techniques including the rapid amplification of cDNA ends (RACE) and tiling array technologies that was used to further characterize transcripts in the human transcriptome. This technical approach allows for several important pieces of information to be gathered about each array-detected transcribed region, including strand of origin, start and termination positions, and the exonic structures of spliced and unspliced coding and noncoding RNAs. In this report, the structures of transcripts from 14 transcribed loci, representing both known genes and unannotated transcripts taken from the several hundred randomly selected unannotated transcripts described in our previous work are represented as examples of the complex organization of the human transcriptome. As a consequence of this complexity, it is not unusual that a single base pair can be part of an intricate network of multiple isoforms of overlapping sense and antisense transcripts, the majority of which are unannotated. Some of these transcripts follow the canonical splicing rules, whereas others combine the exons of different genes or represent other types of noncanonical transcripts. These results have important implications concerning the correlation of genotypes to phenotypes, the regulation of complex interlaced transcriptional patterns, and the definition of a gene.


Subject(s)
Nucleic Acid Amplification Techniques; Oligonucleotide Array Sequence Analysis; Transcription, Genetic; Cell Line; Gene Expression Profiling; Humans; Jurkat Cells; Models, Genetic; Molecular Sequence Data; Nucleic Acid Amplification Techniques/methods; Oligonucleotide Array Sequence Analysis/methods; Protein Isoforms/genetics; Tumor Cells, Cultured
5.
Science ; 308(5725): 1149-54, 2005 May 20.
Article in English | MEDLINE | ID: mdl-15790807

ABSTRACT

Sites of transcription of polyadenylated and nonpolyadenylated RNAs for 10 human chromosomes were mapped at 5-base pair resolution in eight cell lines. Unannotated, nonpolyadenylated transcripts comprise the major proportion of the transcriptional output of the human genome. Of all transcribed sequences, 19.4, 43.7, and 36.9% were observed to be polyadenylated, nonpolyadenylated, and bimorphic, respectively. Half of all transcribed sequences are found only in the nucleus and for the most part are unannotated. Overall, the transcribed portions of the human genome are predominantly composed of interlaced networks of both poly A+ and poly A- annotated transcripts and unannotated transcripts of unknown function. This organization has important implications for interpreting genotype-phenotype associations, regulation of gene expression, and the definition of a gene.


Subject(s)
Chromosomes, Human/genetics; Genome, Human; RNA, Messenger/analysis; Transcription, Genetic; Cell Line; Cell Line, Tumor; Cell Nucleus/metabolism; Chromosomes, Human, Pair 13/genetics; Chromosomes, Human, Pair 14/genetics; Chromosomes, Human, Pair 19/genetics; Chromosomes, Human, Pair 20/genetics; Chromosomes, Human, Pair 21/genetics; Chromosomes, Human, Pair 22/genetics; Chromosomes, Human, Pair 6/genetics; Chromosomes, Human, Pair 7/genetics; Chromosomes, Human, X/genetics; Chromosomes, Human, Y/genetics; Computational Biology; Cytosol/metabolism; DNA, Complementary; DNA, Intergenic; Exons; Female; Humans; Introns; Male; Molecular Sequence Data; Nucleic Acid Amplification Techniques; Oligonucleotide Array Sequence Analysis; Physical Chromosome Mapping; RNA Splicing
6.
Genome Res ; 14(12): 2424-9, 2004 Dec.
Article in English | MEDLINE | ID: mdl-15574821

ABSTRACT

The completion of the mouse and other mammalian genome sequences will provide necessary, but not sufficient, knowledge for an understanding of much of mouse biology at the molecular level. As a requisite next step in this process, the genes in mouse and their structure must be elucidated. In particular, knowledge of the transcriptional start site of these genes will be necessary for further study of their regulatory regions. To assess the current state of mouse genome annotation to support this activity, we identified several hundred gene predictions in mouse with varying levels of supporting evidence and tested them using RACE-PCR. Modifications were made to the procedure allowing pooling of RNA samples, resulting in a scaleable procedure. The results illustrate potential errors or omissions in the current 5' end annotations in 58% of the genes detected. In testing experimentally unsupported gene predictions, we were able to identify 58 that are not usually annotated as genes but produced spliced transcripts (approximately 25% success rate). In addition, in many genes we were able to detect novel exons not predicted by any gene prediction algorithms. In 19.8% of the genes detected in this study, multiple transcript species were observed. These data show an urgent need to provide direct experimental validation of gene annotations. Moreover, these results show that direct validation using RACE-PCR can be an important component of genome-wide validation. This approach can be a useful tool in the ongoing efforts to increase the quality of gene annotations, especially transcriptional start sites, in complex genomes.


Subject(s)
Genes/genetics; Genome; Mice/genetics; Open Reading Frames/genetics; Transcription Initiation Site; Animals; Base Sequence; CpG Islands/genetics; DNA Primers; DNA, Complementary/genetics; Exons/genetics; Molecular Sequence Data; Polymerase Chain Reaction/methods; Sequence Analysis, DNA
7.
Science ; 302(5653): 2115-7, 2003 Dec 19.
Article in English | MEDLINE | ID: mdl-14684820

ABSTRACT

Gene enrichment strategies offer an alternative to sequencing large and repetitive genomes such as that of maize. We report the generation and analysis of nearly 100,000 undermethylated (or methylation filtration) maize sequences. Comparison with the rice genome reveals that methylation filtration results in a more comprehensive representation of maize genes than those that result from expressed sequence tags or transposon insertion sites sequences. About 7% of the repetitive DNA is unmethylated and thus selected in our libraries, but potentially active transposons and unmethylated organelle genomes can be identified. Reverse transcription polymerase chain reaction can be used to finish the maize transcriptome.


Subject(s)
DNA Methylation; Genome, Plant; Sequence Analysis, DNA/methods; Zea mays/genetics; Algorithms; Chromosomes, Artificial, Bacterial; Cloning, Molecular; Computational Biology; Conserved Sequence; Contig Mapping; CpG Islands; DNA Transposable Elements; DNA, Chloroplast/genetics; DNA, Complementary; DNA, Mitochondrial/genetics; DNA, Plant/genetics; Databases, Nucleic Acid; Escherichia coli/genetics; Exons; Expressed Sequence Tags; Genes, Plant; Genomic Library; Oryza/genetics; Repetitive Sequences, Nucleic Acid; Retroelements; Reverse Transcriptase Polymerase Chain Reaction