ABSTRACT
Large-scale chromatin features, such as replication time and accessibility, influence the rate of somatic and germline mutations at the megabase scale. This article reviews how local chromatin structures (e.g., DNA wrapped around nucleosomes, transcription factors bound to DNA) affect the mutation rate at a local scale. It dissects how the interaction of some mutagenic agents and/or DNA repair systems with these local structures influences the generation of mutations. We discuss how this local mutation rate variability affects our understanding of the evolution of the genomic sequence, and the study of the evolution of organisms and tumors.
Subject(s)
Chromatin/genetics , Genome, Human/genetics , Mutation/genetics , Chromosome Mapping/methods , DNA/chemistry , DNA Repair/genetics , Evolution, Molecular , Genomics , Germ-Line Mutation/genetics , Humans , Mutagenesis/genetics , Mutation Rate , Nucleosomes/genetics , Transcription Factors/genetics
ABSTRACT
Mutation rates along the genome are highly variable and influenced by several chromatin features. Here, we addressed how nucleosomes, the most pervasive chromatin structure in eukaryotes, affect the generation of mutations. We discovered that within nucleosomes, the somatic mutation rate across several tumor cohorts exhibits a strong 10 base pair (bp) periodicity. This periodic pattern tracks the alternation of the DNA minor groove facing toward and away from the histones. The strength and phase of the mutation rate periodicity are determined by the mutational processes active in tumors. We uncovered similar periodic patterns in the genetic variation among human and Arabidopsis populations, also detectable in their divergence from close species, indicating that the same principles underlie germline and somatic mutation rates. We propose that differential DNA damage and repair processes dependent on the minor groove orientation in nucleosome-bound DNA contribute to the 10-bp periodicity in AT/CG content in eukaryotic genomes.
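A periodicity such as the 10-bp pattern described above can be checked with a simple spectral measure. The following is a minimal sketch, not the authors' analysis: it assumes a per-position mutation-rate profile is already available, and the profile used here is synthetic. The function names `periodogram_power` and `dominant_period` are hypothetical.

```python
import cmath
import math


def periodogram_power(signal, period):
    """Return the DFT power of the mean-centered signal at one period (in positions)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    # Project the centered signal onto a complex sinusoid with the chosen period.
    coeff = sum(
        x * cmath.exp(-2j * math.pi * i / period) for i, x in enumerate(centered)
    )
    return abs(coeff) ** 2 / n


def dominant_period(signal, candidates):
    """Return the candidate period with the highest power."""
    return max(candidates, key=lambda p: periodogram_power(signal, p))


# Synthetic (made-up) profile: 140 positions with a 10-position oscillation.
profile = [1.0 + 0.3 * math.cos(2 * math.pi * i / 10) for i in range(140)]
best = dominant_period(profile, range(5, 21))
print(best)
```

On real data the profile would be noisy, so the power at 10 bp would normally be compared against shuffled or flanking-region controls rather than read off directly.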
Subject(s)
DNA/genetics , Germ-Line Mutation , Mutation Rate , Nucleosomes/genetics , Arabidopsis/genetics , DNA/chemistry , GC Rich Sequence , Genetic Variation , Nucleic Acid Conformation , Nucleosomes/chemistry
ABSTRACT
Somatic mutations in cancer genomes are caused by multiple mutational processes, each of which generates a characteristic mutational signature1. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium2 of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we characterized mutational signatures using 84,729,690 somatic mutations from 4,645 whole-genome and 19,184 exome sequences that encompass most types of cancer. We identified 49 single-base-substitution, 11 doublet-base-substitution, 4 clustered-base-substitution and 17 small insertion-and-deletion signatures. The substantial size of our dataset, compared with previous analyses3-15, enabled the discovery of new signatures, the separation of overlapping signatures and the decomposition of signatures into components that may represent associated, but distinct, DNA damage, repair and/or replication mechanisms. By estimating the contribution of each signature to the mutational catalogues of individual cancer genomes, we revealed associations of signatures to exogenous or endogenous exposures, as well as to defective DNA-maintenance processes. However, many signatures are of unknown cause. This analysis provides a systematic perspective on the repertoire of mutational processes that contribute to the development of human cancer.
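Single-base-substitution catalogues of the kind mentioned above are conventionally tabulated over 96 channels: six pyrimidine-centred substitution types, each in 16 flanking-base contexts. The abstract does not describe this step, so the following is only an illustrative sketch of that common convention. The function name `sbs_channel` and the example calls are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sbs_channel(context, alt):
    """Map a trinucleotide context and alternate base to a label such as 'A[C>T]G'."""
    if len(context) != 3 or alt not in COMPLEMENT:
        raise ValueError("expected a 3-base context and a single alternate base")
    ref = context[1]
    if ref == alt:
        raise ValueError("alternate base must differ from the reference base")
    if ref in "AG":
        # Purine reference: report the change on the opposite strand instead.
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
        ref = context[1]
    return f"{context[0]}[{ref}>{alt}]{context[2]}"


# Made-up calls: (reference trinucleotide, alternate base at the middle position).
calls = [("ACG", "T"), ("CGT", "A"), ("TTA", "C")]
catalogue = Counter(sbs_channel(context, alt) for context, alt in calls)
print(catalogue)
```

The first two calls describe the same change seen from opposite strands, so they fall into one channel.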
Subject(s)
Mutation/genetics , Neoplasms/genetics , Age Factors , Base Sequence , Exome/genetics , Genome, Human/genetics , Humans , Sequence Analysis, DNA
ABSTRACT
The discovery of drivers of cancer has traditionally focused on protein-coding genes1-4. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium5 of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers6,7, raise doubts about others and identify novel candidates, including point mutations in the 5' region of TP53, in the 3' untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.
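The abstract does not name the procedure used to combine significance levels across driver-discovery methods. As a hedged illustration of the general idea only, the sketch below uses Fisher's method, a classical technique that assumes the p-values are independent. That assumption is unlikely to hold for methods run on the same mutation data, so this is a stand-in and not the strategy the authors developed. The function names and example p-values are hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function of a chi-square distribution with even degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # Closed form for even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Combine independent p-values into one p-value with Fisher's method."""
    if not p_values or any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_df(statistic, 2 * len(p_values))


# Made-up p-values for one genomic element from three hypothetical methods.
combined = fisher_combine([0.01, 0.04, 0.20])
print(combined)
```

When the inputs are correlated, Fisher's method tends to overstate significance, which is one reason a correlation-aware combination is preferable in practice.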
Subject(s)
Genome, Human/genetics , Mutation/genetics , Neoplasms/genetics , DNA Breaks , Databases, Genetic , Gene Expression Regulation, Neoplastic , Genome-Wide Association Study , Humans , INDEL Mutation
ABSTRACT
Melanoma of the skin is a common cancer only in Europeans, whereas it arises in internal body surfaces (mucosal sites) and on the hands and feet (acral sites) in people throughout the world. Here we report analysis of whole-genome sequences from cutaneous, acral and mucosal subtypes of melanoma. The heavily mutated landscape of coding and non-coding mutations in cutaneous melanoma resolved novel signatures of mutagenesis attributable to ultraviolet radiation. However, acral and mucosal melanomas were dominated by structural changes and mutation signatures of unknown aetiology, not previously identified in melanoma. The number of genes affected by recurrent mutations disrupting non-coding sequences was similar to that affected by recurrent mutations to coding sequences. Significantly mutated genes included BRAF, CDKN2A, NRAS and TP53 in cutaneous melanoma, BRAF, NRAS and NF1 in acral melanoma and SF3B1 in mucosal melanoma. Mutations affecting the TERT promoter were the most frequent of all; however, neither they nor ATRX mutations, which correlate with alternative telomere lengthening, were associated with greater telomere length. Most melanomas had potentially actionable mutations, most in components of the mitogen-activated protein kinase and phosphoinositol kinase pathways. The whole-genome mutation landscape of melanoma reveals diverse carcinogenic processes across its subtypes, some unrelated to sun exposure, and extends potential involvement of the non-coding genome in its pathogenesis.
Subject(s)
Genome, Human/genetics , Melanoma/genetics , Mutation/genetics , DNA Helicases/genetics , GTP Phosphohydrolases/genetics , Genes, p16 , Humans , Melanoma/classification , Membrane Proteins/genetics , Mitogen-Activated Protein Kinases/genetics , Neurofibromatosis 1/genetics , Nuclear Proteins/genetics , Phosphoproteins/genetics , Proto-Oncogene Proteins B-raf/genetics , RNA Splicing Factors/genetics , Signal Transduction/drug effects , Telomerase/genetics , Telomere/genetics , Tumor Suppressor Protein p53/genetics , Ultraviolet Rays/adverse effects , X-linked Nuclear Protein
ABSTRACT
An abnormally high rate of UV-light related mutations appears at transcription factor binding sites (TFBS) across melanomas. The binding of transcription factors (TFs) to the DNA impairs the repair of UV-induced lesions and certain TFs have been shown to increase the rate of generation of these lesions at their binding sites. However, the precise contribution of these two elements to the increase in mutation rate at TFBS in these malignant cells is not understood. Here, exploiting nucleotide-resolution data, we computed the rate of formation and repair of UV-lesions within the binding sites of TFs of different families. We observed, at certain dipyrimidine positions within the binding site of TFs in the Tryptophan Cluster family, an increased rate of formation of UV-induced lesions, corroborating previous studies. Nevertheless, across most families of TFs, the observed increased mutation rate within the entire DNA region covered by the protein results from the decreased repair efficiency. While the rate of mutations across all TFBS does not agree with the amount of UV-induced lesions observed immediately after UV exposure, it strongly agrees with that observed after 48 h. This corroborates the determinant role of the impaired repair in the observed increase of mutation rate.
Subject(s)
DNA Damage , DNA Repair , DNA, Neoplasm/radiation effects , Melanoma/genetics , Mutagenesis , Skin Neoplasms/genetics , Transcription Factors/metabolism , Ultraviolet Rays/adverse effects , Binding Sites , Chromosome Mapping , DNA, Neoplasm/genetics , Humans , Mutation , Pyrimidine Dimers/genetics , Pyrimidine Dimers/metabolism , Whole Genome Sequencing
ABSTRACT
Twist1 is a basic helix-loop-helix transcription factor, essential during early development in mammals. While Twist1 induces epithelial-to-mesenchymal transition (EMT), here we show that Twist1 overexpression enhances nuclear and mitotic aberrations. This is accompanied by an increase in whole chromosomal copy number gains and losses, underscoring the role of Twist1 in inducing chromosomal instability (CIN) in colorectal cancer cells. Array comparative genomic hybridization (array CGH) analysis further shows sub-chromosomal deletions, consistent with an increased frequency of DNA double strand breaks (DSBs). Remarkably, Twist1 overexpression downmodulates key cell cycle checkpoint factors-Bub1, BubR1, Mad1 and Mad2-that regulate CIN. Mathematical simulations using the RACIPE tool show a negative correlation of Twist1 with E-cadherin and BubR1. Data analyses of gene expression profiles of patient samples from The Cancer Genome Atlas (TCGA) reveal a positive correlation between Twist1 and mesenchymal genes across cancers, whereas the correlation of TWIST1 with CIN and DSB genes is cancer subtype-specific. Taken together, these studies highlight the mechanistic involvement of Twist1 in the deregulation of factors that maintain genome stability during EMT in colorectal cancer cells. Twist1 overexpression enhances genome instability in the context of EMT that further contributes to cellular heterogeneity. In addition, these studies imply that Twist1 downmodulates nuclear lamins that further alter spatiotemporal organization of the cancer genome and epigenome. Notwithstanding their genetic background, colorectal cancer cells nevertheless maintain their overall ploidy, while the downstream effects of Twist1 enhance CIN and DNA damage enriching for sub-populations of aggressive cancer cells.
Subject(s)
Cadherins/genetics , Chromosomal Instability/genetics , Colorectal Neoplasms/genetics , Nuclear Proteins/genetics , Protein Serine-Threonine Kinases/genetics , Twist-Related Protein 1/genetics , Cell Cycle Proteins/genetics , Cell Line, Tumor , Colorectal Neoplasms/pathology , Comparative Genomic Hybridization , Epithelial-Mesenchymal Transition/genetics , Gene Expression Regulation, Neoplastic/genetics , Humans , Mad2 Proteins/genetics
ABSTRACT
Somatic mutations are the driving force of cancer genome evolution. The rate of somatic mutations appears to be greatly variable across the genome due to variations in chromatin organization, DNA accessibility and replication timing. However, other variables that may influence the mutation rate locally, such as DNA-binding proteins, remain unknown. Here we demonstrate that the rate of somatic mutations in melanomas is highly increased at active transcription factor binding sites and nucleosome-embedded DNA, compared to their flanking regions. Using recently available excision-repair sequencing (XR-seq) data, we show that the higher mutation rate at these sites is caused by a decrease in the levels of nucleotide excision repair (NER) activity. Our work demonstrates that DNA-bound proteins interfere with the NER machinery, which results in an increased rate of DNA mutations at the protein binding sites. This finding has important implications for our understanding of mutational and DNA repair processes and for the identification of cancer driver mutations.
Subject(s)
DNA Repair , DNA-Binding Proteins/metabolism , DNA/genetics , DNA/metabolism , Melanoma/genetics , Mutagenesis/genetics , Mutation Rate , Transcription Factors/metabolism , Binding Sites , DNA, Neoplasm/genetics , DNA, Neoplasm/metabolism , Gene Expression Regulation, Neoplastic/genetics , Genome, Human/genetics , Humans , Lung Neoplasms/genetics , Nucleosomes/genetics , Nucleosomes/metabolism , Promoter Regions, Genetic/genetics , Protein Binding
ABSTRACT
The novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causing COVID-19 has rapidly turned into a pandemic, infecting millions and causing 1 157 509 (as of 27 October 2020) deaths across the globe. In addition to studying the mode of transmission and evasion of host immune system, analysing the viral mutational landscape constitutes an area under active research. The latter is expected to impart knowledge on the emergence of different clades, subclades, viral protein functions and protein-protein and protein-RNA interactions during replication/transcription cycle of virus and response to host immune checkpoints. In this study, we have attempted to bring forth the viral genomic variants defining the major clade(s) as identified from samples collected from the state of Telangana, India. We further report a comprehensive draft of all genomic variations (including unique mutations) present in SARS-CoV-2 strain in the state of Telangana. Our results reveal the presence of two mutually exclusive subgroups defined by specific variants within the dominant clade present in the population. This work attempts to bridge the critical gap regarding the genomic landscape and associate mutations in SARS-CoV-2 from a highly infected southern region of India, which was lacking to date.
Subject(s)
COVID-19/virology , Genome, Viral , SARS-CoV-2/genetics , COVID-19/epidemiology , Genomics , Humans , India/epidemiology , Mutation , Phylogeny , SARS-CoV-2/isolation & purification , Sequence Analysis, RNA , Viral Nonstructural Proteins/genetics , Viral Proteins/genetics
ABSTRACT
BACKGROUND: Mutations in TP53 not only affect its tumour suppressor activity but also exert oncogenic gain-of-function activity. While the genome-wide mutant p53 binding sites have been identified in cancer cell lines, the chromatin accessibility landscape driven by mutant p53 in primary tumours is unknown. Here, we leveraged the chromatin accessibility data of primary tumours from The Cancer Genome Atlas (TCGA) to identify differentially accessible regions in mutant p53 tumours compared to wild-type p53 tumours, especially in breast and colon cancers. RESULTS: We identified 1587 lost and 984 gained accessible chromatin regions in breast, and 1143 lost and 640 gained regions in colon cancers. However, fewer than half of those regions in both cancer types contain sequence motifs for wild-type or mutant p53 binding. The remaining regions showed enrichment for master transcriptional regulators, such as FOX-family TFs and NF-kB in lost and SMAD and KLF TFs in gained regions of breast. In colon, ATF3 and FOS/JUN TFs were enriched in lost, and CDX family TFs and HNF4A in gained regions. By integrating the gene expression data, we identified known and novel target genes regulated by the mutant p53. CONCLUSION: This study reveals the direct and indirect mechanisms by which gain-of-function mutant p53 targets the chromatin and subsequent gene expression patterns in a tumour-type specific manner. This furthers our understanding of the impact of mutant p53 in cancer development.
Subject(s)
Breast Neoplasms/genetics , Chromatin/metabolism , Colonic Neoplasms/genetics , Gene Expression Regulation, Neoplastic , Tumor Suppressor Protein p53/genetics , Carcinogenesis/genetics , Datasets as Topic , Female , Gain of Function Mutation , Humans , Male
ABSTRACT
Purpose: To identify the mutation for Volkmann cataract (CTRCT8) at 1p36.33. Methods: The genes in the candidate region 1p36.33 were Sanger and parallel deep sequenced, and informative single nucleotide polymorphisms (SNPs) were identified for linkage analysis. Expression analysis with reverse transcription polymerase chain reaction (RT-PCR) of the candidate gene was performed using RNA from different human tissues. Quantitative reverse transcription polymerase chain reaction (qRT-PCR) analysis of the GNB1 gene was performed in affected and healthy individuals. Bioinformatic analysis of the linkage regions including the candidate gene was performed. Results: Linkage analysis of the 1p36.33 CCV locus applying new marker systems obtained with Sanger and deep sequencing reduced the candidate locus from 2.1 Mb to 0.389 Mb flanked by the markers STS-22AC and rs549772338 and resulted in a logarithm of the odds (LOD) score of Z = 21.67. The identified mutation, rs763295804, affects the donor splice site in the long non-coding RNA gene RP1-140A9.1 (ENSG00000231050). The gene including splice-site junctions is conserved in primates but not in other mammalian genomes, and two alternative transcripts were shown with RT-PCR. One of these transcripts represented a lens cell-specific transcript. Meta-analysis of the Cross-Linking-Immuno-Precipitation sequencing (CLIP-Seq) data suggested the RNA binding protein (RBP) eIF4AIII is an active counterpart for RP1-140A9.1, and several miRNA and transcription factor binding sites were predicted in the proximity of the mutation. ENCODE DNase I hypersensitivity and histone methylation and acetylation data suggest the genomic region may have regulatory functions. Conclusions: The mutation in RP1-140A9.1 suggests the long non-coding RNA as the candidate cataract gene associated with the autosomal dominant inherited congenital cataract from CCV. The mutation has the potential to destroy exon/intron splicing of both transcripts of RP1-140A9.1. Sanger and massive deep resequencing of the linkage region failed to identify alternative candidates, suggesting the mutation in RP1-140A9.1 is causative for the CCV phenotype.
Subject(s)
Cataract/congenital , Chromosomes, Human, Pair 1/chemistry , Mutation , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , Acetylation , Adult , Base Sequence , Binding Sites , Cataract/diagnosis , Cataract/genetics , Cataract/pathology , Eukaryotic Initiation Factor-4A/genetics , Eukaryotic Initiation Factor-4A/metabolism , Exons , Family , Female , Genes, Dominant , Genetic Loci , Genetic Markers , High-Throughput Nucleotide Sequencing , Histones/genetics , Histones/metabolism , Humans , Introns , Male , Methylation , Middle Aged , Pedigree , RNA Splice Sites , RNA Splicing , RNA, Long Noncoding/metabolism , RNA, Messenger/metabolism
ABSTRACT
Somatic mutations in the nuclear genome are required for tumor formation, but the functional consequences of somatic mitochondrial DNA (mtDNA) mutations are less understood. Here we identify somatic mtDNA mutations across 527 tumors and 14 cancer types, using an approach that takes advantage of evidence from both genomic and transcriptomic sequencing. We find that there is selective pressure against deleterious coding mutations, supporting that functional mitochondria are required in tumor cells, and also observe a strong mutational strand bias, compatible with endogenous replication-coupled errors as the major source of mutations. Interestingly, while allelic ratios in general were consistent in RNA compared to DNA, some mutations in tRNAs displayed strong allelic imbalances caused by accumulation of unprocessed tRNA precursors. The effect was explained by altered secondary structure, demonstrating that correct tRNA folding is a major determinant for processing of polycistronic mitochondrial transcripts. Additionally, the data suggest that tRNA clusters are preferably processed in the 3' to 5' direction. Our study gives insights into mtDNA function in cancer and answers questions regarding mitochondrial tRNA biogenesis that are difficult to address in controlled experimental systems.
Subject(s)
Mitochondria/genetics , Mutation , Neoplasms/genetics , Alleles , DNA, Mitochondrial , DNA, Neoplasm/genetics , Genome, Mitochondrial , Humans , RNA, Neoplasm , RNA, Transfer/genetics , Sequence Analysis, RNA
ABSTRACT
The function of many non-coding RNA genes and cis-regulatory elements of messenger RNA largely depends on the structure, which is in turn determined by their sequence. Single nucleotide polymorphisms (SNPs) and other mutations may disrupt the RNA structure, interfere with the molecular function and hence cause a phenotypic effect. RNAsnp is an efficient method to predict the effect of SNPs on local RNA secondary structure based on the RNA folding algorithms implemented in the Vienna RNA package. The SNP effects are quantified in terms of empirical P-values, which, for computational efficiency, are derived from extensive pre-computed tables of distributions of substitution effects as a function of gene length and GC content. Here, we present a web service that not only provides an interface for RNAsnp but also features a graphical output representation. In addition, the web server is connected to a local mirror of the UCSC genome browser database that enables the users to select the genomic sequences for analysis and visualize the results directly in the UCSC genome browser. The RNAsnp web server is freely available at: http://rth.dk/resources/rnasnp/.
Subject(s)
Polymorphism, Single Nucleotide , RNA/chemistry , Software , Algorithms , Computer Graphics , Internet , Nucleic Acid Conformation , RNA/genetics
ABSTRACT
Ribose methylations are the most abundant chemical modifications of ribosomal RNA and are critical for ribosome assembly and fidelity of translation. Many aspects of ribose methylations have been difficult to study due to lack of efficient mapping methods. Here, we present a sequencing-based method (RiboMeth-seq) and its application to yeast ribosomes, presently the best-studied eukaryotic model system. We demonstrate detection of the known as well as new modifications, reveal partial modifications and unexpected communication between modification events, and determine the order of modification at several sites during ribosome biogenesis. Surprisingly, the method also provides information on a subset of other modifications. Hence, RiboMeth-seq enables a detailed evaluation of the importance of RNA modifications in the cell's most sophisticated molecular machine. RiboMeth-seq can be adapted to other RNA classes, for example, mRNA, to reveal new biology involving RNA modifications.
Subject(s)
High-Throughput Nucleotide Sequencing , RNA/metabolism , Ribose/metabolism , Methylation
ABSTRACT
BACKGROUND: Serovars of Salmonella enterica, namely Typhi and Typhimurium, are reported to be the bacterial pathogens causing systemic infections like gastroenteritis and typhoid fever. To elucidate their role and importance in such infections, studies have mainly focused on the proteins of the Type III secretion system of Salmonella pathogenicity islands and of two-component signal transduction systems. However, the most indispensable of these virulence proteins and their hierarchical roles have not yet been studied extensively. RESULTS: We have adopted a theoretical approach to build an interactome comprising the proteins from the Salmonella pathogenicity islands (SPI) and two-component signal transduction systems. This interactome was then analyzed by using network parameters like centrality and k-core measures. An initial step to capture the fingerprint of the core network resulted in a set of proteins which are involved in the process of invasion and colonization, thereby becoming more important in the process of infection. These proteins pertained to the Inv, Org, Prg, Sip, Spa, Ssa and Sse operons along with chaperone protein SicA. Amongst them, SicA was identified as the most indispensable protein from different network parametric analyses. Subsequently, the gene expression levels of all these theoretically identified important proteins were confirmed by microarray data analysis. Finally, we have proposed a hierarchy of the proteins involved in the total infection process. This theoretical approach is the first of its kind to identify potential virulence determinants encoded by SPI as therapeutic targets for enteric infection. CONCLUSIONS: A set of responsible virulence proteins was identified and the expression level of their genes was validated by using independent, published microarray data. The result was a targeted set of proteins that could serve as sensitive predictors and form the foundation for a series of trials in the wet-lab setting. Understanding these regulatory and virulence proteins would provide insight into conditions which are encountered by this intracellular enteric pathogen during the course of infection. This would further contribute to identifying novel targets for antimicrobial agents.
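The k-core measure mentioned above can be illustrated on a toy graph. This is a minimal sketch of the standard peeling procedure, not the authors' pipeline: the k-core is the largest subgraph in which every node keeps at least k neighbours. The graph and the node labels are invented and do not represent any real protein interactions.

```python
def k_core(adjacency, k):
    """Return the node set of the k-core of an undirected graph."""
    # Work on a copy so the caller's adjacency map is left untouched.
    alive = {node: set(neighbors) for node, neighbors in adjacency.items()}
    changed = True
    while changed:
        changed = False
        # Repeatedly peel off nodes whose remaining degree has dropped below k.
        for node in [n for n, nbrs in alive.items() if len(nbrs) < k]:
            for neighbor in alive.pop(node):
                if neighbor in alive:
                    alive[neighbor].discard(node)
            changed = True
    return set(alive)


# Toy undirected graph: a triangle (A, B, C) with one pendant node (D).
toy_graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
core2 = k_core(toy_graph, 2)
print(sorted(core2))
```

Nodes that survive at high k sit in the densely connected centre of the network, which is the sense in which a core set of proteins can be singled out.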
Subject(s)
Bacterial Secretion Systems/genetics , Genomic Islands/physiology , Protein Interaction Mapping/methods , Salmonella/metabolism , Salmonella/pathogenicity , Signal Transduction/physiology , Bacterial Proteins/metabolism , Gene Regulatory Networks/genetics , Microarray Analysis , Molecular Chaperones/metabolism , Salmonella/genetics
ABSTRACT
Structural characteristics are essential for the functioning of many noncoding RNAs and cis-regulatory elements of mRNAs. SNPs may disrupt these structures, interfere with their molecular function, and hence cause a phenotypic effect. RNA folding algorithms can provide detailed insights into structural effects of SNPs. The global measures employed so far suffer from limited accuracy of folding programs on large RNAs and are computationally too demanding for genome-wide applications. Here, we present a strategy that focuses on the local regions of maximal structural change between mutant and wild-type. These local regions are approximated in a "screening mode" that is intended for genome-wide applications. Furthermore, localized regions are identified as those with maximal discrepancy. The mutation effects are quantified in terms of empirical P values. To this end, the RNAsnp software uses extensive precomputed tables of the distribution of SNP effects as a function of length and GC content. RNAsnp thus achieves both a noise reduction and a speed-up of several orders of magnitude over shuffling-based approaches. On a data set comprising 501 SNPs associated with human-inherited diseases, we predict 54 to have a significant local structural effect in the untranslated region of mRNAs.
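The idea of an empirical P value drawn from a precomputed background can be sketched generically. This is not RNAsnp's implementation or table format; it only shows how a sorted background of effect sizes, assumed here to be looked up by sequence length and GC content, turns an observed effect into a P value quickly. The function name and the background values are hypothetical.

```python
from bisect import bisect_left


def empirical_p_value(observed, background_sorted):
    """Return the share of background effects >= observed, with a +1 correction."""
    n = len(background_sorted)
    # bisect_left counts the background values strictly below the observed effect.
    at_least_as_large = n - bisect_left(background_sorted, observed)
    return (at_least_as_large + 1) / (n + 1)


# Made-up background effect sizes for one hypothetical (length, GC content) bin.
background = sorted([0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])
p = empirical_p_value(0.75, background)
print(p)
```

Because the background is sorted once in advance, each query costs only a binary search, which is the kind of saving over per-SNP shuffling that the abstract alludes to.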
Subject(s)
Polymorphism, Single Nucleotide , RNA/chemistry , RNA/genetics , Software , Algorithms , Genetic Association Studies , Humans , Nucleic Acid Conformation , RNA Folding
ABSTRACT
Induction of immunoproteasome (IP) expression in tumour cells can enhance antigen presentation and immunogenicity. Recently, the overexpression of IP genes has been associated with better prognosis and response to immune checkpoint blockade (ICB) therapies in melanoma. However, the extent of this association in other solid tumours and how that is influenced by tumour cell-intrinsic and cell-extrinsic factors remain unclear. Here, we address this by exploring the gene expression patterns from available bulk and single-cell transcriptomic data of primary tumours. We find that tumours with high-IP expression exhibit cytotoxic immune cell infiltration and upregulation of IFN-γ and TNF-α pathways in tumour cells. However, the association of IP expression with overall survival (TCGA cohort) and response to ICB therapy (non-TCGA cohorts) is tumour-type specific (better in non-small-cell lung, breast, bladder and thymus; and worse in glioma and renal) and is greatly influenced by pro- or antitumourigenic immune cell infiltration patterns. This emphasises the need for considering immune cell infiltration patterns, along with IP expression, as a prognostic biomarker to predict overall survival or response to ICB therapies in solid tumours, besides melanoma.
Subject(s)
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Melanoma , Humans , Prognosis , Melanoma/pathology , Gene Expression Profiling
ABSTRACT
Human papillomavirus (HPV) infections are the primary drivers of cervical cancers, and often HPV DNA gets integrated into the host genome. Although the oncogenic impact of HPV-encoded genes is relatively well known, the cis-regulatory effect of integrated HPV DNA on host chromatin structure and gene regulation remains less understood. We investigated genome-wide patterns of HPV integrations and associated host gene expression changes in the context of host chromatin states and topologically associating domains (TADs). HPV integrations were significantly enriched in active chromatin regions and depleted in inactive ones. Interestingly, regardless of chromatin state, genomic regions flanking HPV integrations showed transcriptional upregulation. Nevertheless, upregulation (both local and long-range) was mostly confined to TADs with integration and did not affect adjacent TADs. Few TADs showed recurrent integrations associated with overexpression of oncogenes within them (e.g. MYC, PVT1, TP63 and ERBB2) regardless of proximity. Hi-C and 4C-seq analyses in a cervical cancer cell line (HeLa) demonstrated chromatin looping interactions between integrated HPV and MYC/PVT1 regions (~500 kb apart), leading to allele-specific overexpression. Based on these, we propose that HPV integrations can trigger multimodal oncogenic activation to promote cancer progression.