ABSTRACT
Approximately 13% of the human genome, at certain motifs, has the potential to form noncanonical (non-B) DNA structures (e.g., G-quadruplexes, cruciforms, and Z-DNA), which regulate many cellular processes but also affect the activity of polymerases and helicases. Because sequencing technologies use these enzymes, they might possess increased errors at non-B structures. To evaluate this, we analyzed error rates, read depth, and base quality of Illumina, Pacific Biosciences (PacBio) HiFi, and Oxford Nanopore Technologies (ONT) sequencing at non-B motifs. All technologies showed altered sequencing success for most non-B motif types, although this could be owing to several factors, including structure formation, biased GC content, and the presence of homopolymers. Single-nucleotide mismatch errors had low biases in HiFi and ONT for all non-B motif types but were increased for G-quadruplexes and Z-DNA in all three technologies. Deletion errors were increased for all non-B types but Z-DNA in Illumina and HiFi, as well as only for G-quadruplexes in ONT. Insertion errors for non-B motifs were highly, moderately, and slightly elevated in Illumina, HiFi, and ONT, respectively. Additionally, we developed a probabilistic approach to determine the number of false positives at non-B motifs depending on sample size and variant frequency, and applied it to publicly available data sets (1000 Genomes, Simons Genome Diversity Project, and gnomAD). We conclude that elevated sequencing errors at non-B DNA motifs should be considered in low-read-depth studies (single-cell, ancient DNA, and pooled-sample population sequencing) and in scoring rare variants. Combining technologies should maximize sequencing accuracy in future studies of non-B DNA.
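The abstract does not spell out the probabilistic approach, so the following is only a minimal sketch of how an expected false-positive count could depend on sample size and variant frequency. It assumes (and this is an assumption, not the authors' model) that each sample independently produces the same erroneous call at a site with a fixed probability, and that a spurious variant is "observed" at a given frequency when at least that fraction of samples carry the error. The names `binom_tail`, `expected_false_positives`, and `per_sample_error` are hypothetical.

```python
from math import ceil, comb


def binom_tail(n, p, k_min):
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    below = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min))
    return 1.0 - below


def expected_false_positives(n_sites, n_samples, variant_freq, per_sample_error):
    """Expected number of sites where errors alone reach the given frequency.

    Assumption (illustrative only): errors are independent across samples and
    sites, and every site shares the same per-sample error probability.
    """
    # Smallest number of erroneous samples that reaches the target frequency;
    # the small epsilon guards against floating-point overshoot in the product.
    k_min = max(1, ceil(variant_freq * n_samples - 1e-9))
    return n_sites * binom_tail(n_samples, per_sample_error, k_min)
```

Under these assumptions the expected count falls steeply as the required frequency rises, which is consistent with the abstract's emphasis on rare variants as the class most exposed to error-driven false positives.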
Subjects
DNA, Z-Form , Nanopores , Humans , Nucleotide Motifs , Sequence Analysis, DNA , DNA/genetics , Base Composition , High-Throughput Nucleotide Sequencing
ABSTRACT
DNA polymerase stalling activates the ATR checkpoint kinase, which in turn suppresses fork collapse and breakage. Herein, we describe use of ATR inhibition (ATRi) as a means to identify genomic sites of problematic DNA replication in murine and human cells. Over 500 high-resolution ATR-dependent sites were ascertained using two distinct methods: replication protein A (RPA)-chromatin immunoprecipitation (ChIP) and breaks identified by TdT labeling (BrITL). The genomic feature most strongly associated with ATR dependence was repetitive DNA that exhibited high structure-forming potential. Repeats most reliant on ATR for stability included structure-forming microsatellites, inverted retroelement repeats, and quasi-palindromic AT-rich repeats. Notably, these distinct categories of repeats differed in the structures they formed and their ability to stimulate RPA accumulation and breakage, implying that the causes and character of replication fork collapse under ATR inhibition can vary in a DNA-structure-specific manner. Collectively, these studies identify key sources of endogenous replication stress that rely on ATR for stability.
Subjects
Ataxia Telangiectasia Mutated Proteins/antagonists & inhibitors , Ataxia Telangiectasia Mutated Proteins/genetics , DNA Replication/genetics , Microsatellite Repeats/genetics , Animals , Cell Cycle Proteins/genetics , Chromatin/genetics , Chromatin Immunoprecipitation/methods , DNA Breaks, Double-Stranded , DNA Damage/genetics , Female , Genomic Instability/genetics , Humans , Mice , Replication Protein A/genetics
ABSTRACT
Fragile sites are unstable genomic regions that are prone to breakage during stressed DNA replication. Several common fragile sites (CFS) contain A+T-rich regions including perfect [AT/TA] microsatellite repeats that may collapse into hairpins when in single-stranded DNA (ssDNA) form and coincide with chromosomal hotspots for breakage and rearrangements. While many factors contribute to CFS instability, evidence exists for replication stalling within [AT/TA] microsatellite repeats. Currently, it is unknown how stress causes replication stalling within [AT/TA] microsatellite repeats. To investigate this, we utilized FRET to characterize the structures of [AT/TA]25 sequences and also reconstituted lagging strand replication to characterize the progression of pol δ holoenzymes through A+T-rich sequences. The results indicate that [AT/TA]25 sequences adopt hairpins that are unwound by the major ssDNA-binding complex, RPA, and the progression of pol δ holoenzymes through A+T-rich sequences saturated with RPA is dependent on the template sequence and dNTP concentration. Importantly, the effects of RPA on the replication of [AT/TA]25 sequences are dependent on dNTP concentration, whereas the effects of RPA on the replication of A+T-rich, nonstructure-forming sequences are independent of dNTP concentration. Collectively, these results reveal complexities in lagging strand replication and provide novel insights into how [AT/TA] microsatellite repeats contribute to genome instability.
Subjects
DNA Polymerase III , DNA Replication , Humans , DNA Polymerase III/genetics , DNA Polymerase III/metabolism , DNA, Single-Stranded/genetics , Holoenzymes/genetics , Microsatellite Repeats , Nucleotides
ABSTRACT
Approximately 1% of the human genome has the ability to fold into G-quadruplexes (G4s), noncanonical strand-specific DNA structures forming at G-rich motifs. G4s regulate several key cellular processes (e.g., transcription) and have been hypothesized to participate in others (e.g., firing of replication origins). Moreover, G4s differ in their thermostability, and this may affect their function. Yet, G4s may also hinder replication, transcription, and translation and may increase genome instability and mutation rates. Therefore, depending on their genomic location, thermostability, and functionality, G4 loci might evolve under different selective pressures, which has never been investigated. Here we conducted the first genome-wide analysis of G4 distribution, thermostability, and selection. We found an overrepresentation, high thermostability, and purifying selection for G4s within genic components in which they are expected to be functional: promoters, CpG islands, and 5' and 3' UTRs. A similar pattern was observed for G4s within replication origins, enhancers, eQTLs, and TAD boundary regions, strongly suggesting their functionality. In contrast, G4s on the nontranscribed strand of exons were underrepresented, were unstable, and evolved neutrally. In general, G4s on the nontranscribed strand of genic components had lower density and were less stable than those on the transcribed strand, suggesting that the former are avoided at the RNA level. Across the genome, purifying selection was stronger at stable G4s. Our results suggest that purifying selection preserves the sequences of functional G4s, whereas nonfunctional G4s are too costly to be tolerated in the genome. Thus, G4s are emerging as fundamental, functional genomic elements.
ABSTRACT
Approximately 13% of the human genome can fold into non-canonical (non-B) DNA structures (e.g. G-quadruplexes, Z-DNA, etc.), which have been implicated in vital cellular processes. Non-B DNA also hinders replication, increasing errors and facilitating mutagenesis, yet its contribution to genome-wide variation in mutation rates remains unexplored. Here, we conducted a comprehensive analysis of nucleotide substitution frequencies at non-B DNA loci within noncoding, non-repetitive genome regions, their ±2 kb flanking regions, and 1-Megabase windows, using human-orangutan divergence and human single-nucleotide polymorphisms. Functional data analysis at single-base resolution demonstrated that substitution frequencies are usually elevated at non-B DNA, with patterns specific to each non-B DNA type. Mirror, direct and inverted repeats have higher substitution frequencies in spacers than in repeat arms, whereas G-quadruplexes, particularly stable ones, have higher substitution frequencies in loops than in stems. Several non-B DNA types also affect substitution frequencies in their flanking regions. Finally, non-B DNA explains more variation than any other predictor in multiple regression models for diversity or divergence at 1-Megabase scale. Thus, non-B DNA substantially contributes to variation in substitution frequencies at small and large scales. Our results highlight the role of non-B DNA in germline mutagenesis with implications to evolution and genetic diseases.
Subjects
DNA/chemistry , Genetic Variation , Genome, Human , Animals , Genetic Loci , Humans , Mutation Rate , Polymorphism, Single Nucleotide , Pongo pygmaeus
ABSTRACT
Incomplete and low-fidelity genome duplication contribute to genomic instability and cancer development. Difficult-to-Replicate Sequences, or DiToRS, are natural impediments in the genome that require specialized DNA polymerases and repair pathways to complete and maintain faithful DNA synthesis. DiToRS include non B-DNA secondary structures formed by repetitive sequences, for example within chromosomal fragile sites and telomeres, which inhibit DNA replication under endogenous stress conditions. Oncogene activation alters DNA replication dynamics and creates oncogenic replication stress, resulting in persistent activation of the DNA damage and replication stress responses, cell cycle arrest, and cell death. The response to oncogenic replication stress is highly complex and must be tightly regulated to prevent mutations and tumorigenesis. In this review, we summarize types of known DiToRS and the experimental evidence supporting replication inhibition, with a focus on the specialized DNA polymerases utilized to cope with these obstacles. In addition, we discuss different causes of oncogenic replication stress and its impact on DiToRS stability. We highlight recent findings regarding the regulation of DNA polymerases during oncogenic replication stress and the implications for cancer development.
Subjects
DNA Damage , DNA Replication , DNA-Directed DNA Polymerase/metabolism , Neoplasms/genetics , Animals , DNA-Directed DNA Polymerase/genetics , Humans , Oncogene Proteins/genetics , Oncogene Proteins/metabolism
ABSTRACT
Transcript variation has important implications for organismal function in health and disease. Most transcriptome studies focus on assessing variation in gene expression levels and isoform representation. Variation at the level of transcript sequence is caused by RNA editing and transcription errors, and leads to nongenetically encoded transcript variants, or RNA-DNA differences (RDDs). Such variation has been understudied, in part because its detection is obscured by reverse transcription (RT) and sequencing errors. It has only been evaluated for intertranscript base substitution differences. Here, we investigated transcript sequence variation for short tandem repeats (STRs). We developed the first maximum-likelihood estimator (MLE) to infer RT error and RDD rates, taking next generation sequencing error rates into account. Using the MLE, we empirically evaluated RT error and RDD rates for STRs in a large-scale DNA and RNA replicated sequencing experiment conducted in a primate species. The RT error rates increased exponentially with STR length and were biased toward expansions. The RDD rates were approximately 1 order of magnitude lower than the RT error rates. The RT error rates estimated with the MLE from a primate data set were concordant with those estimated with an independent method, barcoded RNA sequencing, from a Caenorhabditis elegans data set. Our results have important implications for medical genomics, as STR allelic variation is associated with >40 diseases. STR nonallelic transcript variation can also contribute to disease phenotype. The MLE and empirical rates presented here can be used to evaluate the probability of disease-associated transcripts arising due to RDD.
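The abstract does not give the form of the maximum-likelihood estimator, so the following is only a minimal sketch of the general idea of separating a reverse-transcription (RT) error rate from a known sequencing error rate. It assumes (hypothetically, and much more simply than the published estimator, which also infers RDD rates) that a read is error-free only when both RT and sequencing are error-free, so the probability of an erroneous read is 1 - (1 - r)(1 - s). The binomial MLE of that probability is the observed fraction of erroneous reads, and solving for r gives the estimate below. The function name and parameters are illustrative.

```python
def mle_rt_error_rate(variant_reads, total_reads, seq_error_rate):
    """Estimate the RT error rate r given a known sequencing error rate s.

    Assumed model (illustrative only): P(erroneous read) = 1 - (1 - r)(1 - s).
    The binomial MLE of that probability is variant_reads / total_reads;
    by invariance of the MLE, r follows by solving the model for r.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0.0 <= seq_error_rate < 1.0:
        raise ValueError("seq_error_rate must be in [0, 1)")
    observed = variant_reads / total_reads
    rt_rate = 1.0 - (1.0 - observed) / (1.0 - seq_error_rate)
    # If the observed rate is below the sequencing error rate, the
    # constrained estimate sits on the boundary r = 0.
    return max(0.0, rt_rate)
```

For example, `mle_rt_error_rate(109, 1000, 0.01)` attributes roughly a 10% error rate to RT once the assumed 1% sequencing error is accounted for.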
Subjects
DNA/genetics , Microsatellite Repeats , RNA/genetics , Reverse Transcription , Alleles , DNA Repair , Genetic Variation , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Sequence Analysis, RNA , Transcriptome
ABSTRACT
PrimPol is a recently identified polymerase involved in eukaryotic DNA damage tolerance, employed in both re-priming and translesion synthesis mechanisms to bypass nuclear and mitochondrial DNA lesions. In this report, we investigate how the enzymatic activities of human PrimPol are regulated. We show that, unlike other TLS polymerases, PrimPol is not stimulated by PCNA and does not interact with it in vivo. We identify that PrimPol interacts with both of the major single-strand binding proteins, RPA and mtSSB in vivo. Using NMR spectroscopy, we characterize the domains responsible for the PrimPol-RPA interaction, revealing that PrimPol binds directly to the N-terminal domain of RPA70. In contrast to the established role of SSBs in stimulating replicative polymerases, we find that SSBs significantly limit the primase and polymerase activities of PrimPol. To identify the requirement for this regulation, we employed two forward mutation assays to characterize PrimPol's replication fidelity. We find that PrimPol is a mutagenic polymerase, with a unique error specificity that is highly biased towards insertion-deletion errors. Given the error-prone disposition of PrimPol, we propose a mechanism whereby SSBs greatly restrict the contribution of this enzyme to DNA replication at stalled forks, thus reducing the mutagenic potential of PrimPol during genome replication.
Subjects
DNA Primase/metabolism , DNA-Binding Proteins/metabolism , DNA-Directed DNA Polymerase/metabolism , Mitochondrial Proteins/metabolism , Multifunctional Enzymes/metabolism , Replication Protein A/metabolism , DNA Primers/biosynthesis , DNA Replication , Humans , Mutagenesis , Proliferating Cell Nuclear Antigen/metabolism , Protein Interaction Domains and Motifs , Replication Protein A/chemistry
ABSTRACT
Interruptions of microsatellite sequences impact genome evolution and can alter disease manifestation. However, human polymorphism levels at interrupted microsatellites (iMSs) are not known at a genome-wide scale, and the pathways for gaining interruptions are poorly understood. Using the 1000 Genomes Phase-1 variant call set, we interrogated mono-, di-, tri-, and tetranucleotide repeats up to 10 units in length. We detected ~26,000-40,000 iMSs within each of four human population groups (African, European, East Asian, and American). We identified population-specific iMSs within exonic regions, and discovered that known disease-associated iMSs contain alleles present at differing frequencies among the populations. By analyzing longer microsatellites in primate genomes, we demonstrate that single interruptions result in a genome-wide average two- to six-fold reduction in microsatellite mutability, as compared with perfect microsatellites. Centrally located interruptions lowered mutability dramatically, by two to three orders of magnitude. Using a biochemical approach, we tested directly whether the mutability of a specific iMS is lower because of decreased DNA polymerase strand slippage errors. Modeling the adenomatous polyposis coli tumor suppressor gene sequence, we observed that a single base substitution interruption reduced strand slippage error rates five- to 50-fold, relative to a perfect repeat, during synthesis by DNA polymerases α, β, or η. Computationally, we demonstrate that iMSs arise primarily by base substitution mutations within individual human genomes. Our biochemical survey of human DNA polymerase α, β, δ, κ, and η error rates within certain microsatellites suggests that interruptions are created most frequently by low fidelity polymerases. Our combined computational and biochemical results demonstrate that iMSs are abundant in human genomes and are sources of population-specific genetic variation that may affect genome stability. The genome-wide identification of iMSs in human populations presented here has important implications for current models describing the impact of microsatellite polymorphisms on gene expression.
Subjects
Genomic Instability , Microsatellite Repeats/genetics , Polymorphism, Single Nucleotide/genetics , Primates/genetics , Alleles , Animals , Base Sequence , Gene Expression Regulation , Genome, Human , Humans , Population/genetics
ABSTRACT
DNA damage and repair are linked to cancer. DNA damage that is induced endogenously or from exogenous sources has the potential to result in mutations and genomic instability if not properly repaired, eventually leading to cancer. Inflammation is also linked to cancer. Reactive oxygen and nitrogen species (RONs) produced by inflammatory cells at sites of infection can induce DNA damage. RONs can also amplify inflammatory responses, leading to increased DNA damage. Here, we focus on the links between DNA damage, repair, and inflammation, as they relate to cancer. We examine the interplay between chronic inflammation, DNA damage and repair and review recent findings in this rapidly emerging field, including the links between DNA damage and the innate immune system, and the roles of inflammation in altering the microbiome, which subsequently leads to the induction of DNA damage in the colon. Mouse models of defective DNA repair and inflammatory control are extensively reviewed, including treatment of mouse models with pathogens, which leads to DNA damage. The roles of microRNAs in regulating inflammation and DNA repair are discussed. Importantly, DNA repair and inflammation are linked in many important ways, and in some cases balance each other to maintain homeostasis. The failure to repair DNA damage or to control inflammatory responses has the potential to lead to cancer.
Subjects
DNA Damage , DNA Repair , Inflammation/genetics , Inflammation/immunology , Neoplasms/genetics , Neoplasms/immunology , Animals , DNA/genetics , DNA/immunology , Gene Expression Regulation, Neoplastic , Humans , Immunity, Innate , Inflammation/complications , Inflammation/microbiology , MicroRNAs/genetics , MicroRNAs/immunology , Neoplasms/complications , Neoplasms/microbiology , Reactive Oxygen Species/immunology
ABSTRACT
Chromosomal common fragile sites (CFSs) are unstable genomic regions that break under replication stress and are involved in structural variation. They frequently are sites of chromosomal rearrangements in cancer and of viral integration. However, CFSs are undercharacterized at the molecular level and thus difficult to predict computationally. Newly available genome-wide profiling studies provide us with an unprecedented opportunity to associate CFSs with features of their local genomic contexts. Here, we contrasted the genomic landscape of cytogenetically defined aphidicolin-induced CFSs (aCFSs) to that of nonfragile sites, using multiple logistic regression. We also analyzed aCFS breakage frequencies as a function of their genomic landscape, using standard multiple regression. We show that local genomic features are effective predictors both of regions harboring aCFSs (explaining ~77% of the deviance in logistic regression models) and of aCFS breakage frequencies (explaining ~45% of the variance in standard regression models). In our optimal models (having highest explanatory power), aCFSs are predominantly located in G-negative chromosomal bands and away from centromeres, are enriched in Alu repeats, and have high DNA flexibility. In alternative models, CpG island density, transcription start site density, H3K4me1 coverage, and mononucleotide microsatellite coverage are significant predictors. Also, aCFSs have high fragility when colocated with evolutionarily conserved chromosomal breakpoints. Our models are predictive of the fragility of aCFSs mapped at a higher resolution. Importantly, the genomic features we identified here as significant predictors of fragility allow us to draw valuable inferences on the molecular mechanisms underlying aCFSs.
Subjects
Chromosomal Instability , Chromosome Fragile Sites , Genome, Human , Models, Genetic , Alu Elements , Animals , Aphidicolin/pharmacology , Centromere , Chromosome Breakage , Chromosomes, Human/drug effects , CpG Islands , Cytogenetic Analysis , Humans , Logistic Models , Mice , Microsatellite Repeats , Reproducibility of Results , Transcription Initiation Site
ABSTRACT
Microsatellites--tandem repeats of short DNA motifs--are abundant in the human genome and have high mutation rates. While microsatellite instability is implicated in numerous genetic diseases, the molecular processes involved in their emergence and disappearance are still not well understood. Microsatellites are hypothesized to follow a life cycle, wherein they are born and expand into adulthood, until their degradation and death. Here we identified microsatellite births/deaths in human, chimpanzee, and orangutan genomes, using macaque and marmoset as outgroups. We inferred mutations causing births/deaths based on parsimony, and investigated local genomic environments affecting them. We also studied birth/death patterns within transposable elements (Alus and L1s), coding regions, and disease-associated loci. We observed that substitutions were the predominant cause for births of short microsatellites, while insertions and deletions were important for births of longer microsatellites. Substitutions were the cause for deaths of microsatellites of virtually all lengths. AT-rich L1 sequences exhibited elevated frequency of births/deaths over their entire length, while GC-rich Alus only in their 3' poly(A) tails and middle A-stretches, with differences depending on transposable element integration timing. Births/deaths were strongly selected against in coding regions. Births/deaths occurred in genomic regions with high substitution rates, protomicrosatellite content, and L1 density, but low GC content and Alu density. The majority of the 17 disease-associated microsatellites examined are evolutionarily ancient (were acquired by the common ancestor of simians). Our genome-wide investigation of microsatellite life cycle has fundamental applications for predicting the susceptibility of birth/death of microsatellites, including many disease-causing loci.
Subjects
Alu Elements/genetics , Evolution, Molecular , Genome, Human/physiology , Microsatellite Repeats/genetics , Animals , Humans , Primates/genetics
ABSTRACT
Microsatellite DNA synthesis represents a significant component of human genome replication that must occur faithfully. However, yeast replicative DNA polymerases do not possess high fidelity for microsatellite synthesis. We hypothesized that the structural features of Y-family polymerases that facilitate accurate translesion synthesis may promote accurate microsatellite synthesis. We compared human polymerases κ (Pol κ) and η (Pol η) fidelities to that of replicative human polymerase δ holoenzyme (Pol δ4), using the in vitro HSV-tk assay. Relative polymerase accuracy for insertion/deletion (indel) errors within 2-3 unit repeats internal to the HSV-tk gene concurred with the literature: Pol δ4 >> Pol κ or Pol η. In contrast, relative polymerase accuracy for unit-based indel errors within [GT](10) and [TC](11) microsatellites was: Pol κ ≥ Pol δ4 > Pol η. The magnitude of difference was greatest between Pols κ and δ4 with the [GT] template. Biochemically, Pol κ displayed less synthesis termination within the [GT] allele than did Pol δ4. In dual polymerase reactions, Pol κ competed with either a stalled or moving Pol δ4, thereby reducing termination. Our results challenge the ideology that pol κ is error prone, and suggest that DNA polymerases with complementary biochemical properties can function cooperatively at repetitive sequences.
Subjects
DNA-Directed DNA Polymerase/metabolism , Microsatellite Instability , Microsatellite Repeats , Alleles , DNA/biosynthesis , DNA Damage , DNA Polymerase III/metabolism , Humans , INDEL Mutation
ABSTRACT
Microsatellite DNA sequences display allele length alterations or microsatellite instability (MSI) in tumor tissues, and MSI is used diagnostically for tumor detection and classification. We discuss the known types of tumor-specific MSI patterns and the relevant mechanisms underlying each pattern. Mutation rates of individual microsatellites vary greatly, and the intrinsic DNA features of motif size, sequence, and length contribute to this variation. MSI is used for detecting mismatch repair (MMR)-deficient tumors, which display an MSI-high phenotype due to genome-wide microsatellite destabilization. Because several pathways maintain microsatellite stability, tumors that have undergone other events associated with moderate genome instability may display diagnostic MSI only at specific di- or tetranucleotide markers. We summarize evidence for such alternative MSI forms (A-MSI) in sporadic cancers, also referred to as MSI-low and EMAST. While the existence of A-MSI is not disputed, there is disagreement about the origin and pathologic significance of this phenomenon. Although ambiguities due to PCR methods may be a source, evidence exists for other mechanisms to explain tumor-specific A-MSI. Some portion of A-MSI tumors may result from random mutational events arising during neoplastic cell evolution. However, this mechanism fails to explain the specificity of A-MSI for di- and tetranucleotide instability. We present evidence supporting the alternative argument that some A-MSI tumors arise by a distinct genetic pathway, and give examples of DNA metabolic pathways that, when altered, may be responsible for instability at specific microsatellite motifs. Finally, we suggest that A-MSI in tumors could be molecular signatures of environmental influences and DNA damage. Importantly, A-MSI occurs in several pre-neoplastic inflammatory states, including inflammatory bowel diseases, consistent with a role of oxidative stress in A-MSI. Understanding the biochemical basis of A-MSI tumor phenotypes will advance the development of new diagnostic tools and positively impact the clinical management of individual cancers.
Subjects
Microsatellite Instability , Neoplasms/genetics , DNA Mismatch Repair , Humans , Mutation , Phenotype
ABSTRACT
DNA polymerase eta (Pol η) is a Y-family polymerase and the product of the POLH gene. Autosomal recessive inheritance of POLH mutations is the cause of the xeroderma pigmentosum variant, a cancer predisposition syndrome. This review summarizes mounting evidence for expanded Pol η cellular functions in addition to DNA lesion bypass that are critical for maintaining genome stability. In vitro, Pol η displays efficient DNA synthesis through difficult-to-replicate sequences, catalyzes D-loop extensions, and utilizes RNA-DNA hybrid templates. Human Pol η is constitutively present at the replication fork. In response to replication stress, Pol η is upregulated at the transcriptional and protein levels, and post-translational modifications regulate its localization to chromatin. Numerous studies show that Pol η is required for efficient common fragile site replication and stability. Additionally, Pol η can be recruited to stalled replication forks through protein-protein interactions, suggesting a broader role in replication fork recovery. During somatic hypermutations, Pol η is recruited by mismatch repair proteins and is essential for VH gene A:T basepair mutagenesis. Within the global context of repeat-dense genomes, the recruitment of Pol η to perform specialized functions during replication could promote genome stability by interrupting pure repeat arrays with base substitutions. Alternatively, not engaging Pol η in genome duplication is costly, as the absence of Pol η leads to incomplete replication and increased chromosomal instability.
Subjects
DNA-Directed DNA Polymerase , Gene Duplication , Humans , DNA-Directed DNA Polymerase/genetics , DNA-Directed DNA Polymerase/metabolism , DNA/metabolism , Genomic Instability
ABSTRACT
Mutations of numerous genes involved in DNA replication, DNA repair, and DNA damage response (DDR) pathways lead to a variety of human diseases, including aging and cancer [...].
Subjects
DNA Damage , Neoplasms , Humans , DNA Damage/genetics , DNA Repair/genetics , Mutation , Neoplasms/genetics , DNA Replication/genetics
ABSTRACT
Difficult-to-Replicate Sequences (DiToRS) are natural impediments in the human genome that inhibit DNA replication under endogenous replication stress. Some of the most widely studied DiToRS are A+T-rich, high "flexibility regions," including long stretches of perfect [AT/TA] microsatellite repeats that have the potential to collapse into hairpin structures when in single-stranded DNA (ssDNA) form and are sites of recurrent structural variation and double-stranded DNA (dsDNA) breaks. Currently, it is unclear how these flexibility regions impact DNA replication, greatly limiting our fundamental understanding of human genome stability. To investigate replication through flexibility regions, we utilized FRET to characterize the effects of the major ssDNA-binding complex, RPA, on the structure of perfect [AT/TA]25 microsatellite repeats and also reconstituted human lagging strand replication to quantitatively characterize initial encounters of pol δ holoenzymes with A+T-rich DNA template sequences. The results indicate that [AT/TA]25 sequences adopt hairpin structures that are unwound by RPA, and that pol δ holoenzymes support dNTP incorporation through the [AT/TA]25 sequences as well as an A+T-rich, non-structure-forming sequence. Furthermore, the extent of dNTP incorporation is dependent on the sequence of the DNA template and the concentration of dNTPs. Importantly, the effects of RPA on the replication of [AT/TA]25 sequences are dependent on the concentration of dNTPs, whereas the effects of RPA on the replication of an A+T-rich, non-structure-forming sequence are independent of dNTP concentration. Collectively, these results reveal complexities in lagging strand replication and provide novel insights into how flexibility regions contribute to genome instability.
ABSTRACT
Common fragile sites (CFS) are chromosomal regions that exhibit instability during DNA replication stress. Although the mechanism of CFS expression has not been fully elucidated, one known feature is a severely delayed S-phase. We used an in vitro primer extension assay to examine the progression of DNA synthesis through various sequences within FRA16D by the replicative human DNA polymerases delta and alpha, and with human cell-free extracts. We found that specific cis-acting sequence elements perturb DNA elongation, causing inconsistent DNA synthesis rates between regions on the same strand and complementary strands. Pol delta was significantly inhibited in regions containing hairpins and microsatellites, [AT/TA](24) and [A/T](19-28), compared with a control region with minimal secondary structure. Pol delta processivity was enhanced by full length Werner Syndrome protein (WRN) and by WRN fragments containing either the helicase domain or DNA-binding C-terminal domain. In cell-free extracts, stalling was eliminated at smaller hairpins, but persisted in larger hairpins and microsatellites. Our data support a model whereby CFS expression during cellular stress is due to a combination of factors--density of specific DNA secondary-structures within a genomic region and asymmetric rates of strand synthesis.
Subjects
Chromosome Fragile Sites , DNA Polymerase III/metabolism , DNA Replication , RecQ Helicases/metabolism , Base Sequence , DNA/biosynthesis , DNA/chemistry , Exodeoxyribonucleases/metabolism , HeLa Cells , Humans
ABSTRACT
G-quadruplexes (G4s), a type of non-B DNA, play important roles in a wide range of molecular processes, including replication, transcription, and translation. Genome integrity relies on efficient and accurate DNA synthesis, and is compromised by various stressors, to which non-B DNA structures such as G4s can be particularly vulnerable. However, the impact of G4 structures on DNA polymerase fidelity is largely unknown. Using an in vitro forward mutation assay, we investigated the fidelity of human DNA polymerases delta (δ4, four-subunit), eta (η), and kappa (κ) during synthesis of G4 motifs representing those in the human genome. The motifs differ in sequence, topology, and stability, features that may affect DNA polymerase errors. Polymerase error rate hierarchy (δ4 < κ < η) is largely maintained during G4 synthesis. Importantly, we observed unique polymerase error signatures during synthesis of VEGF G4 motifs, stable G4s which form parallel topologies. These statistically significant errors occurred within, immediately flanking, and encompassing the G4 motif. For pol δ4, the errors were deletions, insertions and complex errors within the G4 or encompassing the G4 motif and surrounding sequence. For pol η, the errors occurred in 3' sequences flanking the G4 motif. For pol κ, the errors were frameshift mutations within G-tracts of the G4. Because these error signatures were not observed during synthesis of an antiparallel G4 and, to a lesser extent, a hybrid G4, we suggest that G4 topology and/or stability could influence polymerase fidelity. Using in silico analyses, we show that most polymerase errors are predicted to have minimal effects on predicted G4 stability. Our results provide a unique view of G4s not previously elucidated, showing that G4 motif heterogeneity differentially influences polymerase fidelity within the motif and flanking sequences. Thus, our study advances the understanding of how DNA polymerase errors contribute to G4 mutagenesis.