ABSTRACT
COSMIC, the Catalogue Of Somatic Mutations In Cancer (https://cancer.sanger.ac.uk), is the most detailed and comprehensive resource for exploring the effect of somatic mutations in human cancer. The latest release, COSMIC v86 (August 2018), includes almost 6 million coding mutations across 1.4 million tumour samples, curated from over 26 000 publications. In addition to coding mutations, COSMIC covers all the genetic mechanisms by which somatic mutations promote cancer, including non-coding mutations, gene fusions, copy-number variants and drug-resistance mutations. COSMIC is primarily hand-curated, ensuring quality, accuracy and descriptive data capture. Building on our manual curation processes, we are introducing new initiatives that allow us to prioritize key genes and diseases, and to react more quickly and comprehensively to new findings in the literature. Alongside improvements to the public website and data-download systems, new functionality in COSMIC-3D allows exploration of mutations within three-dimensional protein structures, their protein structural and functional impacts, and implications for druggability. In parallel with COSMIC's deep and broad variant coverage, the Cancer Gene Census (CGC) describes a curated catalogue of genes driving every form of human cancer. Currently describing 719 genes, the CGC has recently introduced functional descriptions of how each gene drives disease, summarized into the 10 cancer Hallmarks.
Subjects
Databases, Nucleic Acid; Mutation; Neoplasms/genetics; Genes; Humans; Protein Conformation
ABSTRACT
BACKGROUND: Although the human genome sequence was declared complete in 2004, the sequence was interrupted by 341 gaps, of which 308 lay in an estimated 28 Mb of euchromatin. While these gaps constitute only approximately 1% of the sequence, knowledge of the full complement of human genes and regulatory elements is incomplete without their sequences. RESULTS: We have used a combination of conventional chromosome walking (aided by the availability of end sequences) in fosmid and bacterial artificial chromosome (BAC) libraries, whole chromosome shotgun sequencing, comparative genome analysis and long PCR to finish 8 of the 11 gaps in the initial chromosome 22 sequence. In addition, we have patched four regions of the initial sequence where the original clones were found to be deleted, or contained a deletion allele of a known gene, with a further 126 kb of new sequence. Over 1.018 Mb of new sequence has been generated to extend into and close the gaps, and we have annotated 16 new or extended gene structures and one pseudogene. CONCLUSION: Thus, we have made significant progress toward completing the sequence of the euchromatic regions of human chromosome 22 using a combination of detailed approaches. Our experience suggests that substantial work remains to close the outstanding gaps in the human genome sequence.
Subjects
Chromosomes, Human, Pair 22; Genome, Human; Sequence Analysis, DNA; Base Sequence; Chromosome Mapping; Chromosomes, Artificial, Bacterial/genetics; Humans
ABSTRACT
We generated high-resolution maps of histone H3 lysine 9/14 acetylation (H3ac), histone H4 lysine 5/8/12/16 acetylation (H4ac), and histone H3 lysine 4 mono-, di-, and trimethylation (H3K4me1, H3K4me2, H3K4me3, respectively) across the ENCODE regions. Studying each modification in five human cell lines including the ENCODE Consortium common cell lines GM06990 (lymphoblastoid) and HeLa-S3, as well as K562, HFL-1, and MOLT4, we identified clear patterns of histone modification profiles with respect to genomic features. H3K4me3, H3K4me2, and H3ac modifications are tightly associated with the transcriptional start sites (TSSs) of genes, while H3K4me1 and H4ac have more widespread distributions. TSSs show characteristic patterns in both the types of modification present and their positions relative to the TSS. These patterns differ between active and inactive genes, and in particular the state of H3K4me3 and H3ac modifications is highly predictive of gene activity. Away from TSSs, modification sites are enriched in H3K4me1 and relatively depleted in H3K4me3 and H3ac. Comparison between cell lines identified differences in the histone modification profiles associated with transcriptional differences between the cell lines. These results provide an overview of the functional relationship among histone modifications and gene expression in human cells.
Subjects
Genome, Human/physiology; Histones/metabolism; Protein Processing, Post-Translational/physiology; Transcription, Genetic/physiology; HeLa Cells; Humans; K562 Cells
ABSTRACT
Genomic microarrays have been used to assess DNA replication timing in a variety of eukaryotic organisms. A replication timing map of the human genome has already been published at 1 Mb resolution. Here we describe how the same method can be used to assess the replication timing of chromosome 6 at higher resolution using an array of overlapping tile path clones. We report the replication timing map of the whole of chromosome 6 in general, and the MHC region in particular. Positive correlations are observed between replication timing and a number of genomic features including GC content, repeat content and transcriptional activity.
Subjects
Chromosomes, Human, Pair 6/genetics; Chromosomes, Human, Pair 6/physiology; DNA Replication Timing; Oligonucleotide Array Sequence Analysis/methods; Cell Line; Chromosome Mapping; Cytosine/analysis; DNA/chemistry; DNA/genetics; DNA Repeat Expansion; Epigenesis, Genetic; G1 Phase/genetics; G1 Phase/physiology; Gene Expression Regulation; Guanine/analysis; Humans; Major Histocompatibility Complex/genetics; S Phase/genetics; S Phase/physiology; Transcription, Genetic
ABSTRACT
We report a second-generation gene annotation of human chromosome 22. Using expressed sequence databases, comparative sequence analysis, and experimental verification, we have extended genes, fused previously fragmented structures, and identified new genes. The total annotated exon length increased by 74% over our previously published annotation, and the annotation now includes 546 protein-coding genes and 234 pseudogenes. Thirty-two potential protein-coding annotations are partial copies of other genes, and may represent duplications on an evolutionary path to change or loss of function. We also identified 31 non-protein-coding transcripts, including 16 possible antisense RNAs. By extrapolation, we estimate the human genome contains 29,000-36,000 protein-coding genes, 21,300 pseudogenes, and 1,500 antisense RNAs. We suggest that our revised annotation criteria provide a paradigm for future annotation of the human genome.
Subjects
Chromosome Mapping/methods; Chromosomes, Human, Pair 22/genetics; Genes/genetics; Animals; Humans; Mice; Molecular Sequence Data
ABSTRACT
We have developed a directly quantitative method utilizing genomic clone DNA microarrays to assess the replication timing of sequences during the S phase of the cell cycle. The genomic resolution of the replication timing measurements is limited only by the genomic clone size and density. We demonstrate the power of this approach by constructing a genome-wide map of replication timing in human lymphoblastoid cells using an array with clones spaced at 1 Mb intervals and a high-resolution replication timing map of 22q with an array utilizing overlapping sequencing tile path clones. We show a positive correlation, both genome-wide and at a high resolution, between replication timing and a range of genome parameters including GC content, gene density and transcriptional activity.
Subjects
DNA Replication; Genome, Human; Oligonucleotide Array Sequence Analysis/methods; Base Composition; Cells, Cultured; Chromosomes, Human, Pair 22; Gene Expression; Humans; S Phase/genetics
ABSTRACT
We have developed a systematic approach to generating cDNA clones containing full-length open reading frames (ORFs), exploiting knowledge of gene structure from genomic sequence. Each ORF was amplified by PCR from a pool of primary cDNAs, cloned and confirmed by sequencing. We obtained clones representing 70% of genes on human chromosome 22, whereas searching available cDNA clone collections found at best 48% from a single collection and 60% for all collections combined.
Subjects
Cloning, Molecular/methods; Genome, Human; Genomics/methods; Open Reading Frames/genetics; Proteome/genetics; Chromosomes, Human, Pair 22/genetics; Computational Biology; DNA, Complementary/genetics; Databases, Genetic; Humans; Polymerase Chain Reaction; Polymorphism, Single Nucleotide/genetics; Research Design; Sequence Analysis, DNA
ABSTRACT
DNA sequence variants in specific genes or regions of the human genome are responsible for a variety of phenotypes such as disease risk or variable drug response. These variants can be investigated directly, or through their non-random associations with neighbouring markers (called linkage disequilibrium (LD)). Here we report measurement of LD along the complete sequence of human chromosome 22. Duplicate genotyping and analysis of 1,504 markers in Centre d'Etude du Polymorphisme Humain (CEPH) reference families at a median spacing of 15 kilobases (kb) reveals a highly variable pattern of LD along the chromosome, in which extensive regions of nearly complete LD up to 804 kb in length are interspersed with regions of little or no detectable LD. The LD patterns are replicated in a panel of unrelated UK Caucasians. There is a strong correlation between high LD and low recombination frequency in the extant genetic map, suggesting that historical and contemporary recombination rates are similar. This study demonstrates the feasibility of developing genome-wide maps of LD.
Subjects
Chromosome Mapping; Chromosomes, Human, Pair 22/genetics; Linkage Disequilibrium/genetics; Founder Effect; Gene Frequency; Haplotypes/genetics; Humans; Pedigree; Polymorphism, Genetic/genetics; Recombination, Genetic
ABSTRACT
We have constructed the first comprehensive microarray representing a human chromosome for analysis of DNA copy number variation. This chromosome 22 array covers 34.7 Mb, representing 1.1% of the genome, with an average resolution of 75 kb. To demonstrate the utility of the array, we have applied it to profile acral melanoma, dermatofibrosarcoma, DiGeorge syndrome and neurofibromatosis 2. We accurately diagnosed homozygous/heterozygous deletions, amplifications/gains, IGLV/IGLC locus instability, and breakpoints of an imbalanced translocation. We further identified the 14-3-3 eta isoform as a candidate tumor suppressor in glioblastoma. Two significant methodological advances in array construction were also developed and validated. The first is a strictly sequence-defined, repeat-free, and non-redundant strategy for array preparation. This approach allows an increase in array resolution and analysis of any locus, regardless of common repeats, genomic clone availability and sequence redundancy. In addition, we report that the application of phi29 DNA polymerase is advantageous in microarray preparation. A broad spectrum of issues in medical research and diagnostics can be approached using the array. Chromosome 22, a well-annotated and gene-rich autosome, contains numerous uncharacterized disease genes. It is therefore crucial to associate these genes with specific 22q-related conditions, and this array will be instrumental in reaching this goal. Furthermore, comprehensive epigenetic profiling of 22q-located genes and high-resolution analysis of replication timing across the entire chromosome can be studied using our array.