ABSTRACT
The reference sequence for each human chromosome provides the framework for understanding genome function, variation and evolution. Here we report the finished sequence and biological annotation of human chromosome 1. Chromosome 1 is gene-dense, with 3,141 genes and 991 pseudogenes, and many coding sequences overlap. Rearrangements and mutations of chromosome 1 are prevalent in cancer and many other diseases. Patterns of sequence variation reveal signals of recent selection in specific genes that may contribute to human fitness, and also in regions where no function is evident. Fine-scale recombination occurs in hotspots of varying intensity along the sequence, and is enriched near genes. These and other studies of human biology and disease encoded within chromosome 1 are made possible with the highly accurate annotated sequence, as part of the completed set of chromosome sequences that comprise the reference human genome.
Subjects
Human Chromosome Pair 1/genetics , Base Sequence , DNA Replication Timing , Disease , Gene Duplication , Genes/genetics , Genetic Variation/genetics , Genomics , Humans , Molecular Sequence Data , Open Reading Frames/genetics , Pseudogenes/genetics , Genetic Recombination/genetics , Genetic Selection , DNA Sequence Analysis
ABSTRACT
Chromosome 9 is highly structurally polymorphic. It contains the largest autosomal block of heterochromatin, which is heteromorphic in 6-8% of humans, whereas pericentric inversions occur in more than 1% of the population. The finished euchromatic sequence of chromosome 9 comprises 109,044,351 base pairs and represents >99.6% of the region. Analysis of the sequence reveals many intra- and interchromosomal duplications, including segmental duplications adjacent to both the centromere and the large heterochromatic block. We have annotated 1,149 genes, including genes implicated in male-to-female sex reversal, cancer and neurodegenerative disease, and 426 pseudogenes. The chromosome contains the largest interferon gene cluster in the human genome. There is also a region of exceptionally high gene and G + C content including genes paralogous to those in the major histocompatibility complex. We have also detected recently duplicated genes that exhibit different rates of sequence divergence, presumably reflecting natural selection.
Subjects
Human Chromosome Pair 9/genetics , Genes , Physical Chromosome Mapping , Base Composition , Euchromatin/genetics , Molecular Evolution , Female , Gene Duplication , Duplicate Genes/genetics , Genetic Variation/genetics , Medical Genetics , Genomics , Heterochromatin/genetics , Humans , Male , Neoplasms/genetics , Neurodegenerative Diseases/genetics , Pseudogenes/genetics , DNA Sequence Analysis , Sex Determination Processes
ABSTRACT
The finished sequence of human chromosome 10 comprises a total of 131,666,441 base pairs. It represents 99.4% of the euchromatic DNA and includes one megabase of heterochromatic sequence within the pericentromeric region of the short and long arm of the chromosome. Sequence annotation revealed 1,357 genes, of which 816 are protein coding, and 430 are pseudogenes. We observed widespread occurrence of overlapping coding genes (either strand) and identified 67 antisense transcripts. Our analysis suggests that both inter- and intrachromosomal segmental duplications have impacted on the gene count on chromosome 10. Multispecies comparative analysis indicated that we can readily annotate the protein-coding genes with current resources. We estimate that over 95% of all coding exons were identified in this study. Assessment of single base changes between the human chromosome 10 and chimpanzee sequence revealed nonsense mutations in only 21 coding genes with respect to the human sequence.
Subjects
Human Chromosome Pair 10/genetics , Genes , Physical Chromosome Mapping , Animals , Base Composition , Contig Mapping , CpG Islands/genetics , Molecular Evolution , Exons/genetics , Gene Duplication , Genetic Variation/genetics , Medical Genetics , Genomics , Humans , Pan troglodytes/genetics , Proteins/genetics , Pseudogenes/genetics , DNA Sequence Analysis
ABSTRACT
Chromosome 6 is a metacentric chromosome that constitutes about 6% of the human genome. The finished sequence comprises 166,880,988 base pairs, representing the largest chromosome sequenced so far. The entire sequence has been subjected to high-quality manual annotation, resulting in the evidence-supported identification of 1,557 genes and 633 pseudogenes. Here we report that at least 96% of the protein-coding genes have been identified, as assessed by multi-species comparative sequence analysis, and provide evidence for the presence of further, otherwise unsupported exons/genes. Among these are genes directly implicated in cancer, schizophrenia, autoimmunity and many other diseases. Chromosome 6 harbours the largest transfer RNA gene cluster in the genome; we show that this cluster co-localizes with a region of high transcriptional activity. Within the essential immune loci of the major histocompatibility complex, we find HLA-B to be the most polymorphic gene on chromosome 6 and in the human genome.
Subjects
Human Chromosome Pair 6/genetics , Genes/genetics , Physical Chromosome Mapping , Animals , Exons/genetics , Inborn Genetic Diseases/genetics , HLA-B Antigens/genetics , Humans , Pseudogenes/genetics , Transfer RNA/genetics , DNA Sequence Analysis
ABSTRACT
Chromosome 13 is the largest acrocentric human chromosome. It carries genes involved in cancer including the breast cancer type 2 (BRCA2) and retinoblastoma (RB1) genes, is frequently rearranged in B-cell chronic lymphocytic leukaemia, and contains the DAOA locus associated with bipolar disorder and schizophrenia. We describe completion and analysis of 95.5 megabases (Mb) of sequence from chromosome 13, which contains 633 genes and 296 pseudogenes. We estimate that more than 95.4% of the protein-coding genes of this chromosome have been identified, on the basis of comparison with other vertebrate genome sequences. Additionally, 105 putative non-coding RNA genes were found. Chromosome 13 has one of the lowest gene densities (6.5 genes per Mb) among human chromosomes, and contains a central region of 38 Mb where the gene density drops to only 3.1 genes per Mb.