ABSTRACT
Somatic genome rearrangements are thought to play important roles in cancer development. We optimized a long-span paired-end-tag (PET) sequencing approach using 10-Kb genomic DNA inserts to study human genome structural variations (SVs). The use of a 10-Kb insert size allows the identification of breakpoints within repetitive or homology-containing regions of a few kilobases in size and results in a higher physical coverage compared with small insert libraries with the same sequencing effort. We have applied this approach to comprehensively characterize the SVs of 15 cancer and two noncancer genomes and used a filtering approach to strongly enrich for somatic SVs in the cancer genomes. Our analyses revealed that most inversions, deletions, and insertions are germ-line SVs, whereas tandem duplications, unpaired inversions, interchromosomal translocations, and complex rearrangements are over-represented among somatic rearrangements in cancer genomes. We demonstrate that the quantitative and connective nature of DNA-PET data is precise in delineating the genealogy of complex rearrangement events, we observe signatures that are compatible with breakage-fusion-bridge cycles, and we discover that large duplications are among the initial rearrangements that trigger genome instability for extensive amplification in epithelial cancers.
Subjects
Base Pairing/genetics , Breast Neoplasms/genetics , Chromosome Mapping/methods , Genome, Human/genetics , Genomic Structural Variation/genetics , Stomach Neoplasms/genetics , Cell Line, Tumor , Computational Biology , DNA/genetics , Female , Gene Rearrangement , Humans , Sequence Analysis, DNA
ABSTRACT
Genome rearrangements, a hallmark of cancer, can result in gene fusions with oncogenic properties. Using DNA paired-end-tag (DNA-PET) whole-genome sequencing, we analyzed 15 gastric cancers (GCs) from Southeast Asians. Rearrangements were enriched in open chromatin and shaped by chromatin structure. We identified seven rearrangement hot spots and 136 gene fusions. In three out of 100 GC cases, we found recurrent fusions between CLDN18, a tight junction gene, and ARHGAP26, a gene encoding a RHOA inhibitor. Epithelial cell lines expressing CLDN18-ARHGAP26 displayed a dramatic loss of epithelial phenotype and long protrusions indicative of epithelial-mesenchymal transition (EMT). Fusion-positive cell lines showed impaired barrier properties, reduced cell-cell and cell-extracellular matrix adhesion, retarded wound healing, and inhibition of RHOA. Gain of invasion was seen in cancer cell lines expressing the fusion. Thus, CLDN18-ARHGAP26 mediates epithelial disintegration, possibly leading to stomach H(+) leakage, and the fusion might contribute to invasiveness once a cell is transformed.
Subjects
Claudins/genetics , GTPase-Activating Proteins/genetics , Oncogene Proteins, Fusion/metabolism , Stomach Neoplasms/pathology , Amino Acid Sequence , Animals , Cell Adhesion , Cell Line, Tumor , Cell Movement , Cell Proliferation , Clathrin/pharmacology , Claudins/metabolism , Dogs , Endocytosis/drug effects , Epithelial Cells/cytology , Epithelial Cells/metabolism , Epithelial-Mesenchymal Transition , GTPase-Activating Proteins/metabolism , HeLa Cells , Humans , MCF-7 Cells , Madin Darby Canine Kidney Cells , Molecular Sequence Data , Oncogene Proteins, Fusion/genetics , Phenotype , Stomach Neoplasms/metabolism , rhoA GTP-Binding Protein/antagonists & inhibitors , rhoA GTP-Binding Protein/metabolism
ABSTRACT
Structural variations (SVs) contribute significantly to the variability of the human genome and extensive genomic rearrangements are a hallmark of cancer. While genomic DNA paired-end-tag (DNA-PET) sequencing is an attractive approach to identify genomic SVs, the current application of PET sequencing with short insert size DNA can be insufficient for the comprehensive mapping of SVs in low complexity and repeat-rich genomic regions. We employed a recently developed procedure to generate PET sequencing data using large DNA inserts of 10-20 kb and compared their characteristics with short insert (1 kb) libraries for their ability to identify SVs. Our results suggest that although short insert libraries bear an advantage in identifying small deletions, they do not provide significantly better breakpoint resolution. In contrast, large inserts are superior to short inserts in providing higher physical genome coverage for the same sequencing cost and achieve greater sensitivity, in practice, for the identification of several classes of SVs, such as copy number neutral and complex events. Furthermore, our results confirm that large insert libraries allow for the identification of SVs within repetitive sequences, which cannot be spanned by short inserts. This provides a key advantage in studying rearrangements in cancer, and we show how it can be used in a fusion-point-guided-concatenation algorithm to study focally amplified regions in cancer.