ABSTRACT
Chromosome 5 is one of the largest human chromosomes and contains numerous intrachromosomal duplications, yet it has one of the lowest gene densities. This is partially explained by numerous gene-poor regions that display a remarkable degree of noncoding conservation with non-mammalian vertebrates, suggesting that they are functionally constrained. In total, we compiled 177.7 million base pairs of highly accurate finished sequence containing 923 manually curated protein-coding genes including the protocadherin and interleukin gene families. We also completely sequenced versions of the large chromosome-5-specific internal duplications. These duplications are very recent evolutionary events and probably have a mechanistic role in human physiological variation, as deletions in these regions are the cause of debilitating disorders including spinal muscular atrophy.
Subject(s)
Chromosomes, Human, Pair 5/genetics; Sequence Analysis, DNA; Animals; Base Composition; Cadherins/genetics; Conserved Sequence/genetics; Gene Duplication; Genes/genetics; Genetic Diseases, Inborn/genetics; Genomics; Humans; Interleukins/genetics; Molecular Sequence Data; Muscular Atrophy, Spinal/genetics; Pan troglodytes/genetics; Physical Chromosome Mapping; Pseudogenes/genetics; Synteny/genetics; Vertebrates/genetics
ABSTRACT
Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high G + C content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in mendelian disorders, including familial hypercholesterolaemia and insulin-resistant diabetes. Nearly one-quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.
Subject(s)
Chromosomes, Human, Pair 19/genetics; Genes/genetics; Physical Chromosome Mapping; Alternative Splicing/genetics; Animals; Base Composition; Conserved Sequence/genetics; CpG Islands/genetics; Evolution, Molecular; Gene Duplication; Genetics, Medical; Humans; Mice; Molecular Sequence Data; Multigene Family/genetics; Pseudogenes/genetics; Sequence Analysis, DNA
ABSTRACT
The species complexity of microbial communities and challenges in culturing representative isolates make it difficult to obtain assembled genomes. Here we characterize and compare the metabolic capabilities of terrestrial and marine microbial communities using largely unassembled sequence data obtained by shotgun sequencing DNA isolated from the various environments. Quantitative gene content analysis reveals habitat-specific fingerprints that reflect known characteristics of the sampled environments. The identification of environment-specific genes through a gene-centric comparative analysis presents new opportunities for interpreting and diagnosing environments.