1.
Nature ; 444(7118): 499-502, 2006 Nov 23.
Article in English | MEDLINE | ID: mdl-17086198

ABSTRACT

Identifying the sequences that direct the spatial and temporal expression of genes and defining their function in vivo remains a significant challenge in the annotation of vertebrate genomes. One major obstacle is the lack of experimentally validated training sets. In this study, we made use of extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements, and characterized the in vivo enhancer activity of a large group of non-coding elements in the human genome that are conserved in human-pufferfish, Takifugu (Fugu) rubripes, or ultraconserved in human-mouse-rat. We tested 167 of these extremely conserved sequences in a transgenic mouse enhancer assay. Here we report that 45% of these sequences functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. While directing expression in a broad range of anatomical structures in the embryo, the majority of the 75 enhancers directed expression to various regions of the developing nervous system. We identified sequence signatures enriched in a subset of these elements that targeted forebrain expression, and used these features to rank all approximately 3,100 non-coding elements in the human genome that are conserved between human and Fugu. The testing of the top predictions in transgenic mice resulted in a threefold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of human gene enhancers that have been characterized in vivo, and illustrate the utility of such training sets for a variety of biological applications, including decoding the regulatory vocabulary of the human genome.


Subjects
Enhancer Elements, Genetic; Genome, Human; Animals; Base Sequence; Chromosomes, Human, Pair 16; Conserved Sequence; Embryo, Mammalian/metabolism; Embryo, Nonmammalian; Gene Expression; Genomics/methods; Humans; Mice; Mice, Transgenic; Nervous System/embryology; Nervous System/metabolism; Prosencephalon/embryology; Prosencephalon/metabolism; Takifugu/genetics; Transcription Factors/genetics
2.
Nature ; 432(7020): 988-94, 2004 Dec 23.
Article in English | MEDLINE | ID: mdl-15616553

ABSTRACT

Human chromosome 16 features one of the highest levels of segmentally duplicated sequence among the human autosomes. We report here the 78,884,754 base pairs of finished chromosome 16 sequence, representing over 99.9% of its euchromatin. Manual annotation revealed 880 protein-coding genes confirmed by 1,670 aligned transcripts, 19 transfer RNA genes, 341 pseudogenes and three RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukaemia. Several large-scale structural polymorphisms spanning hundreds of kilobase pairs were identified and result in gene content differences among humans. Whereas the segmental duplications of chromosome 16 are enriched in the relatively gene-poor pericentromere of the p arm, some are involved in recent gene duplication and conversion events that are likely to have had an impact on the evolution of primates and human disease susceptibility.


Subjects
Chromosomes, Human, Pair 16/genetics; Gene Duplication; Physical Chromosome Mapping; Animals; Genes/genetics; Genomics; Heterochromatin/genetics; Humans; Molecular Sequence Data; Polymorphism, Genetic/genetics; Sequence Analysis, DNA; Synteny/genetics
3.
Nature ; 431(7006): 268-74, 2004 Sep 16.
Article in English | MEDLINE | ID: mdl-15372022

ABSTRACT

Chromosome 5 is one of the largest human chromosomes and contains numerous intrachromosomal duplications, yet it has one of the lowest gene densities. This is partially explained by numerous gene-poor regions that display a remarkable degree of noncoding conservation with non-mammalian vertebrates, suggesting that they are functionally constrained. In total, we compiled 177.7 million base pairs of highly accurate finished sequence containing 923 manually curated protein-coding genes including the protocadherin and interleukin gene families. We also completely sequenced versions of the large chromosome-5-specific internal duplications. These duplications are very recent evolutionary events and probably have a mechanistic role in human physiological variation, as deletions in these regions are the cause of debilitating disorders including spinal muscular atrophy.


Subjects
Chromosomes, Human, Pair 5/genetics; Sequence Analysis, DNA; Animals; Base Composition; Cadherins/genetics; Conserved Sequence/genetics; Gene Duplication; Genes/genetics; Genetic Diseases, Inborn/genetics; Genomics; Humans; Interleukins/genetics; Molecular Sequence Data; Muscular Atrophy, Spinal/genetics; Pan troglodytes/genetics; Physical Chromosome Mapping; Pseudogenes/genetics; Synteny/genetics; Vertebrates/genetics
4.
Nature ; 428(6982): 529-35, 2004 Apr 01.
Article in English | MEDLINE | ID: mdl-15057824

ABSTRACT

Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high G + C content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in mendelian disorders, including familial hypercholesterolaemia and insulin-resistant diabetes. Nearly one-quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.


Subjects
Chromosomes, Human, Pair 19/genetics; Genes/genetics; Physical Chromosome Mapping; Alternative Splicing/genetics; Animals; Base Composition; Conserved Sequence/genetics; CpG Islands/genetics; Evolution, Molecular; Gene Duplication; Genetics, Medical; Humans; Mice; Molecular Sequence Data; Multigene Family/genetics; Pseudogenes/genetics; Sequence Analysis, DNA
5.
Genome Res ; 16(7): 855-63, 2006 Jul.
Article in English | MEDLINE | ID: mdl-16769978

ABSTRACT

Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons. To address this problem, we identified evolutionarily conserved noncoding regions in primate, mammalian, and more distant comparisons using a uniform approach (Gumby) that facilitates unbiased assessment of the impact of evolutionary distance on predictive power. We benchmarked computational predictions against previously identified cis-regulatory elements at diverse genomic loci and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using an in vivo enhancer assay in transgenic mice. Human regulatory elements were identified with acceptable sensitivity (53%-80%) and true-positive rate (27%-67%) by comparison with one to five other eutherian mammals or six other simian primates. More distant comparisons (marsupial, avian, amphibian, and fish) failed to identify many of the empirically defined functional noncoding elements. Our results highlight the practical utility of close sequence comparisons, and the loss of sensitivity entailed by more distant comparisons. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole-genome comparative analysis that explains most of the observations from empirical benchmarking. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for in vivo testing at embryonic time points.
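The benchmarking described in this abstract can be illustrated with a minimal sketch. It assumes that predicted conserved regions and known regulatory elements are given as half-open coordinate intervals on one sequence, and that the abstract's "true-positive rate" means the fraction of predictions that overlap a known element (often called precision). The function names and the any-overlap criterion are illustrative assumptions, not the Gumby procedure.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one sequence


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def benchmark(predicted: List[Interval], known: List[Interval]) -> Tuple[float, float]:
    """Score predictions against known elements (hypothetical helper).

    sensitivity: fraction of known elements hit by at least one prediction.
    precision:   fraction of predictions that hit at least one known element
                 (assumed here to correspond to the "true-positive rate").
    """
    known_hit = sum(1 for k in known if any(overlaps(k, p) for p in predicted))
    pred_hit = sum(1 for p in predicted if any(overlaps(p, k) for k in known))
    sensitivity = known_hit / len(known) if known else 0.0
    precision = pred_hit / len(predicted) if predicted else 0.0
    return sensitivity, precision


if __name__ == "__main__":
    # Toy coordinates, invented purely for illustration.
    known = [(100, 200), (300, 400), (500, 600), (900, 1000)]
    predicted = [(150, 180), (350, 450), (700, 800)]
    print(benchmark(predicted, known))
```

With stricter criteria (for example, a minimum overlap length) both numbers would change, so the overlap rule should be stated alongside any reported figures.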


Subjects
Enhancer Elements, Genetic; Genome, Human; Regulatory Sequences, Nucleic Acid; Animals; Base Sequence; Chromosomes, Human, Pair 16; Computational Biology; Conserved Sequence; DNA/genetics; Evolution, Molecular; Eye Proteins/chemistry; Eye Proteins/genetics; Humans; Mice; Mice, Transgenic; Predictive Value of Tests; Protein Structure, Tertiary; Rats; Sensitivity and Specificity; Sequence Analysis, DNA; Transcription Factors/chemistry; Transcription Factors/genetics
6.
Hum Mol Genet ; 14(20): 3057-63, 2005 Oct 15.
Article in English | MEDLINE | ID: mdl-16155111

ABSTRACT

Our inability to associate distant regulatory elements with the genes they regulate has largely precluded their examination for sequence alterations contributing to human disease. One major obstacle is the large genomic space surrounding targeted genes in which such elements could potentially reside. In order to delineate gene regulatory boundaries, we used whole-genome human-mouse-chicken (HMC) and human-mouse-frog (HMF) multiple alignments to compile conserved blocks of synteny (CBSs), under the hypothesis that these blocks have been kept intact throughout evolution at least in part by the requirement of regulatory elements to stay linked to the genes they regulate. A total of 2116 and 1942 CBSs >200 kb were assembled for HMC and HMF, respectively, encompassing 1.53 and 0.86 Gb of human sequence. To support the existence of complex long-range regulatory domains within these CBSs, we analyzed the prevalence and distribution of chromosomal aberrations leading to position effects (disruption of a gene's regulatory environment), observing a clear bias not only for mapping onto CBS but also for longer CBS size. Our results provide an extensive data set characterizing the regulatory domains of genes and the conserved regulatory elements within them.


Subjects
Conserved Sequence/genetics; Genome, Human; Physical Chromosome Mapping/methods; Regulatory Sequences, Nucleic Acid/genetics; Synteny/genetics; Animals; Chickens/genetics; Chromosomes, Human/genetics; Evolution, Molecular; Humans; Mice; Ranidae/genetics; Sequence Deletion
7.
Genome Res ; 15(1): 1-18, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15632085

ABSTRACT

We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25-55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species--but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.


Subjects
Chromosomes/genetics; Drosophila/genetics; Evolution, Molecular; Genes, Insect/genetics; Genome; Sequence Analysis, DNA/methods; Animals; Chromosome Breakage/genetics; Chromosome Inversion/genetics; Chromosome Mapping/methods; Conserved Sequence/genetics; Drosophila melanogaster/genetics; Enhancer Elements, Genetic; Gene Rearrangement/genetics; Genetic Variation/genetics; Molecular Sequence Data; Predictive Value of Tests; Repetitive Sequences, Nucleic Acid/genetics
8.
Genome Res ; 13(1): 73-80, 2003 Jan.
Article in English | MEDLINE | ID: mdl-12529308

ABSTRACT

The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We investigated different strategies of alignment for the subsequent analysis of conservation of genomes that are effective for assemblies of different quality. These strategies were applied to the comparison of the working draft of the human genome with the Mouse Genome Sequencing Consortium assembly, as well as other intermediate mouse assemblies. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. We obtained such coverage while preserving specificity. With a view towards the end user, we developed a suite of tools and Web sites for automatically aligning and subsequently browsing and working with whole-genome comparisons. We describe the use of these tools to identify conserved non-coding regions between the human and mouse genomes, some of which have not been identified by other methods.


Subjects
Genome, Human; Genome; Research Design; Sequence Alignment/instrumentation; Sequence Alignment/methods; Algorithms; Animals; Chromosomes/genetics; Chromosomes, Human/genetics; Computer Communication Networks/instrumentation; Databases, Genetic; Humans; Internet/instrumentation; Mice; Software
9.
Bioinformatics ; 19 Suppl 1: i54-62, 2003.
Article in English | MEDLINE | ID: mdl-12855437

ABSTRACT

MOTIVATION: To compare entire genomes from different species, biologists increasingly need alignment methods that are efficient enough to handle long sequences, and accurate enough to correctly align the conserved biological features between distant species. The two main classes of pairwise alignments are global alignment, where one string is transformed into the other, and local alignment, where all locations of similarity between the two strings are returned. Global alignments are less prone to demonstrating false homology as each letter of one sequence is constrained to being aligned to only one letter of the other. Local alignments, on the other hand, can cope with rearrangements between non-syntenic, orthologous sequences by identifying similar regions in sequences; this, however, comes at the expense of a higher false positive rate due to the inability of local aligners to take into account overall conservation maps. RESULTS: In this paper we introduce the notion of glocal alignment, a combination of global and local methods, where one creates a map that transforms one sequence into the other while allowing for rearrangement events. We present Shuffle-LAGAN, a glocal alignment algorithm that is based on the CHAOS local alignment algorithm and the LAGAN global aligner, and is able to align long genomic sequences. To test Shuffle-LAGAN we split the mouse genome into BAC-sized pieces, and aligned these pieces to the human genome. We demonstrate that Shuffle-LAGAN compares favorably in terms of sensitivity and specificity with standard local and global aligners. From the alignments we conclude that about 9% of human/mouse homology may be attributed to small rearrangements, 63% of which are duplications.
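The contrast between global and local alignment drawn in this abstract can be made concrete with a short sketch. The code below computes only the optimal score, using the textbook Needleman-Wunsch (global) and Smith-Waterman (local) recurrences with a linear gap penalty. It is not Shuffle-LAGAN, CHAOS, or LAGAN; the function name and the scoring values are assumptions chosen for illustration.

```python
def alignment_score(a: str, b: str, local: bool,
                    match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Optimal pairwise alignment score (hypothetical helper).

    local=False: Needleman-Wunsch, every letter of both strings is aligned.
    local=True:  Smith-Waterman, best-scoring pair of substrings only.
    """
    # First row: global alignment pays for leading gaps, local does not.
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0  # best cell seen so far; only used in local mode
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                score = max(score, 0)  # a local alignment may restart anywhere
            curr.append(score)
        if local:
            best = max(best, max(curr))
        prev = curr
    return best if local else prev[-1]


if __name__ == "__main__":
    x, y = "TTTTACGTTTTT", "GGGACGGGG"  # toy sequences sharing only "ACG"
    print("global:", alignment_score(x, y, local=False))
    print("local: ", alignment_score(x, y, local=True))
```

On the toy input the local score reflects only the shared core, while the global score is dragged down by the unrelated flanks. That is the trade-off the abstract describes: global alignment constrains each letter to one partner, and local alignment tolerates rearranged context but can report spurious short matches.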


Subjects
Chromosome Mapping/methods; DNA/analysis; DNA/chemistry; Gene Expression Profiling/methods; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Algorithms; Animals; Base Sequence; DNA/genetics; Genome, Human; Humans; Mice; Molecular Sequence Data; Sequence Homology, Nucleic Acid
10.
Bioinformatics ; 20(5): 636-43, 2004 Mar 22.
Article in English | MEDLINE | ID: mdl-15033870

ABSTRACT

MOTIVATION: The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient these visualization algorithms must support the ability to accommodate consistently a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. RESULTS: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a framework based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. The combination of multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produces a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. AVAILABILITY: Phylo-VISTA is available at http://www-gsd.lbl.gov/phylovista. It requires an Internet browser with Java Plug-in 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu
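The abstract does not define the similarity measure that Phylo-VISTA plots. As a rough illustration of the general idea of viewing alignment similarity at a chosen resolution, the sketch below computes percent identity in a sliding window over a gapped pairwise alignment. The function name, the window-based measure, and the treatment of gap columns as mismatches are all assumptions, not the tool's documented behavior.

```python
from typing import List


def window_identity(aln_a: str, aln_b: str, window: int) -> List[float]:
    """Percent identity in each window of an aligned pair (hypothetical helper).

    aln_a and aln_b are rows of one alignment: same length, '-' marks a gap.
    Returns one value per window start, so a larger window gives a smoother,
    coarser profile.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned rows must have equal length")
    if window <= 0 or window > len(aln_a):
        raise ValueError("window must be between 1 and the alignment length")

    # 1 where the column is an identical, non-gap pair; 0 otherwise.
    matches = [1 if x == y and x != "-" else 0 for x, y in zip(aln_a, aln_b)]

    total = sum(matches[:window])
    profile = [100.0 * total / window]
    for i in range(window, len(matches)):
        total += matches[i] - matches[i - window]  # slide the window by one column
        profile.append(100.0 * total / window)
    return profile


if __name__ == "__main__":
    # Toy alignment rows, invented for illustration.
    print(window_identity("ACGT-ACG", "ACGTTACC", window=4))
```

Changing `window` is the simplest way to mimic viewing the same alignment at varying resolutions; a multi-species display would repeat the calculation for each pair or clade of interest.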


Subjects
Algorithms; Computer Graphics; Gene Expression Profiling/methods; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Software; User-Computer Interface; Animals; Base Sequence; Humans; Leukemia/genetics; Molecular Sequence Data; Phylogeny; Sequence Homology, Nucleic Acid
11.
Nature ; 420(6915): 520-62, 2002 Dec 05.
Article in English | MEDLINE | ID: mdl-12466850

ABSTRACT

The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.


Subjects
Chromosomes, Mammalian/genetics; Evolution, Molecular; Genome; Mice/genetics; Physical Chromosome Mapping; Animals; Base Composition; Conserved Sequence/genetics; CpG Islands/genetics; Gene Expression Regulation; Genes/genetics; Genetic Variation/genetics; Genome, Human; Genomics; Humans; Mice/classification; Mice, Knockout; Mice, Transgenic; Models, Animal; Multigene Family/genetics; Mutagenesis; Neoplasms/genetics; Proteome/genetics; Pseudogenes/genetics; Quantitative Trait Loci/genetics; RNA, Untranslated/genetics; Repetitive Sequences, Nucleic Acid/genetics; Selection, Genetic; Sequence Analysis, DNA; Sex Chromosomes/genetics; Species Specificity; Synteny