1.
Sci Total Environ ; 940: 173315, 2024 Aug 25.
Article En | MEDLINE | ID: mdl-38761955

The rapidly expanding use of wastewater for public health surveillance requires new strategies to protect privacy rights, while data are collected at increasingly discrete geospatial scales, i.e., city, neighborhood, campus, and building-level. Data collected at high geospatial resolution can inform on labile, short-lived biomarkers, thereby making wastewater-derived data both more actionable and more likely to cause privacy concerns and stigmatization of subpopulations. Additionally, data sharing restrictions among neighboring cities and communities can complicate efforts to balance public health protections with citizens' privacy. Here, we have created an encrypted framework that facilitates the sharing of sensitive population health data among entities that lack trust for one another (e.g., between adjacent municipalities with different governance of health monitoring and data sharing). We demonstrate the utility of this approach with two real-world cases. Our results show the feasibility of sharing encrypted data between two municipalities and a laboratory, while performing secure private computations for wastewater-based epidemiology (WBE) with high precision, fast speeds, and low data costs. This framework is amenable to other computations used by WBE researchers including population normalized mass loads, fecal indicator normalizations, and quality control measures. The Centers for Disease Control and Prevention's National Wastewater Surveillance System shows ∼8 % of the records attributed to collection before the wastewater treatment plant, illustrating an opportunity to further expand currently limited community-level sampling and public health surveillance through security and responsible data-sharing as outlined here.
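The abstract names population-normalized mass loads as one of the computations the framework could support. As a rough plaintext illustration of that quantity only (not of the encrypted protocol itself), the sketch below assumes the common definition of concentration times flow divided by population served. The function name, units, and sewershed numbers are hypothetical.

```python
def population_normalized_load(conc_ng_per_l, flow_l_per_day, population):
    """Return biomarker mass load in mg per day per 1,000 people.

    conc_ng_per_l: measured biomarker concentration (ng/L)
    flow_l_per_day: wastewater flow at the sampling point (L/day)
    population: number of people served by the sampled sewershed
    """
    if population <= 0:
        raise ValueError("population must be positive")
    total_mg_per_day = conc_ng_per_l * flow_l_per_day * 1e-6  # ng -> mg
    return total_mg_per_day / population * 1000


# Hypothetical sewershed: 250 ng/L, 40 million L/day, 100,000 residents.
load = population_normalized_load(250.0, 40e6, 100_000)
print(f"{load:.1f} mg/day per 1,000 people")
```

In a privacy-preserving setting the same arithmetic would be carried out on encrypted or otherwise protected inputs; how that is done is specific to the published framework and is not reproduced here.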


Information Dissemination , Wastewater , Privacy , Humans , Computer Security , Environmental Monitoring/methods , Wastewater-Based Epidemiological Monitoring
2.
Pac Symp Biocomput ; 28: 347-358, 2023.
Article En | MEDLINE | ID: mdl-36540990

Accurate prediction of TCR binding affinity to a target antigen is important for development of immunotherapy strategies. Recent computational methods were built on various deep neural networks and used the evolutionary-based distance matrix BLOSUM to embed amino acids of TCR and epitope sequences to numeric values. A pre-trained language model of amino acids is an alternative embedding method where each amino acid in a peptide is embedded as a continuous numeric vector. Little attention has yet been given to summarizing the amino-acid-wise embedding vectors into sequence-wise representations. In this paper, we propose PiTE, a two-step pipeline for TCR-epitope binding affinity prediction. First, we use an amino acid embedding model pre-trained on a large number of unlabeled TCR sequences and obtain a real-valued representation from a string representation of amino acid sequences. Second, we train a binding affinity prediction model that consists of two sequence encoders and a stack of linear layers predicting the affinity score of a given TCR and epitope pair. In particular, we explore various types of neural network architectures for the sequence encoders in the two-step binding affinity prediction pipeline. We show that our Transformer-like sequence encoder achieves state-of-the-art performance and significantly outperforms the others, perhaps due to the model's ability to capture contextual information between amino acids in each sequence. Our work highlights that an advanced sequence encoder on top of a pre-trained representation significantly improves the performance of TCR-epitope binding affinity prediction.


Computational Biology , Neural Networks, Computer , Humans , Epitopes , Computational Biology/methods , Amino Acids , Receptors, Antigen, T-Cell/genetics
3.
Front Immunol ; 13: 893247, 2022.
Article En | MEDLINE | ID: mdl-35874725

TCR-epitope pair binding is the key component for T cell regulation. The ability to predict whether a given pair binds is fundamental to understanding the underlying biology of the binding mechanism as well as developing T-cell mediated immunotherapy approaches. The advent of large-scale public databases containing TCR-epitope binding pairs enabled the recent development of computational prediction methods for TCR-epitope binding. However, the number of epitopes reported along with binding TCRs is far too small, resulting in poor out-of-sample performance for unseen epitopes. In order to address this issue, we present our model ATM-TCR which uses a multi-head self-attention mechanism to capture biological contextual information and improve generalization performance. Additionally, we present a novel application of the attention map from our model to improve out-of-sample performance by demonstrating on recent SARS-CoV-2 data.
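The abstract refers to a multi-head self-attention mechanism and to the attention map it produces. As a hedged illustration of the underlying building block only, the sketch below implements a single head of scaled dot-product self-attention in plain Python. It uses identity projections instead of learned weight matrices, and the toy embeddings are made up; it is not the ATM-TCR architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Single-head scaled dot-product self-attention.

    Queries, keys, and values are all the raw embeddings (identity
    projections). This is a simplification: trained models learn separate
    projection matrices for every head.
    Returns (outputs, attention_map).
    """
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    attention_map = []
    outputs = []
    for query in embeddings:
        # Similarity of this position to every position, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in embeddings]
        weights = softmax(scores)
        attention_map.append(weights)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(w * value[i] for w, value in zip(weights, embeddings))
                        for i in range(dim)])
    return outputs, attention_map


# Toy 2-dimensional embeddings for a three-residue peptide (made-up numbers).
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, attention_map = self_attention(toy)
for row in attention_map:
    print([round(w, 3) for w in row])
```

Each row of `attention_map` sums to 1 and shows how strongly one residue attends to every other residue. A multi-head model would run several such heads on different learned projections and concatenate their outputs.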


Epitopes, T-Lymphocyte , Receptors, Antigen, T-Cell , Computational Biology , Epitopes, T-Lymphocyte/metabolism , Humans , Protein Binding , Receptors, Antigen, T-Cell/metabolism , SARS-CoV-2
4.
Mol Biol Evol ; 39(4), 2022 04 10.
Article En | MEDLINE | ID: mdl-35446958

Because errors at the DNA level power pathogen evolution, a systematic understanding of the rate and molecular spectra of mutations could guide the avoidance and treatment of infectious diseases. We thus accumulated tens of thousands of spontaneous mutations in 768 repeatedly bottlenecked lineages of 18 strains from various geographical sites, temporal spread, and genetic backgrounds. Entailing over ∼1.36 million generations, the resultant data yield an average mutation rate of ∼0.0005 per genome per generation, with a significant within-species variation. This is one of the lowest bacterial mutation rates reported, giving direct support for a high genome stability in this pathogen resulting from high DNA-mismatch-repair efficiency and replication-machinery fidelity. Pathogenicity genes do not exhibit an accelerated mutation rate, and thus, elevated mutation rates may not be the major determinant for the diversification of toxin and secretion systems. Intriguingly, a low error rate at the transcript level is not observed, suggesting distinct fidelity of the replication and transcription machinery. This study urges more attention on the most basic evolutionary processes of even the best-known human pathogens and deepens the understanding of their genome evolution.


Salmonella enterica , Salmonella , Genome, Bacterial , Mutation , Mutation Rate , Salmonella/genetics , Salmonella enterica/genetics
5.
mBio ; 12(5): e0250321, 2021 10 26.
Article En | MEDLINE | ID: mdl-34634932

Encounters between DNA replication and transcription can cause genomic disruption, particularly when the two meet head-on. Whether these conflicts produce point mutations is debated. This paper presents detailed analyses of a large collection of mutations generated during mutation accumulation experiments with mismatch repair (MMR)-defective Escherichia coli. With MMR absent, mutations are primarily due to DNA replication errors. Overall, there were no differences in the frequencies of base pair substitutions or small indels (i.e., insertions and deletions of ≤4 bp) in the coding sequences or promoters of genes oriented codirectionally versus head-on to replication. Among a subset of highly expressed genes, there was a 2- to 3-fold bias for indels in genes oriented head-on to replication, but this difference was almost entirely due to the asymmetrical genomic locations of tRNA genes containing mononucleotide runs, which are hot spots for indels. No additional orientation bias in mutation frequencies occurred when MMR-defective strains were also defective for transcription-coupled repair (TCR). However, in contrast to other reports, loss of TCR slightly increased the overall mutation rate, meaning that TCR is antimutagenic. There was no orientation bias in mutation frequencies among the stress response genes that are regulated by RpoS or induced by DNA damage. Thus, biases in the locations of mutational targets can account for most, if not all, apparent biases in mutation frequencies between genes oriented head-on versus codirectional to replication. In addition, the data revealed a strong correlation of the frequency of base pair substitutions with gene length but no correlation with gene expression levels. IMPORTANCE Because DNA replication and transcription occur on the same DNA template, encounters between the two machines occur frequently. When these encounters are head-to-head, genomic disruption can occur. 
However, whether replication-transcription conflicts contribute to spontaneous mutations is debated. Analyzing in detail a large collection of mutations generated with mismatch repair-defective Escherichia coli strains, we found that across the genome there are no significant differences in mutation frequencies between genes oriented codirectionally and those oriented head-on to replication. Among a subset of highly expressed genes, there was a 2- to 3-fold bias for small insertions and deletions in head-on-oriented genes, but this difference was almost entirely due to the asymmetrical locations of tRNA genes containing mononucleotide runs, which are hot spots for these mutations. Thus, biases in the positions of mutational target sequences can account for most, if not all, apparent biases in mutation frequencies between genes oriented head-on and codirectionally to replication.


DNA Replication , Escherichia coli/genetics , Genome, Bacterial/genetics , Mutation , Transcription, Genetic , DNA Mismatch Repair , Frameshift Mutation , Mutation Rate , Point Mutation
6.
DNA Repair (Amst) ; 90: 102852, 2020 06.
Article En | MEDLINE | ID: mdl-32388005

When its DNA is damaged, Escherichia coli induces the SOS response, which consists of about 40 genes that encode activities to repair or tolerate the damage. Certain alleles of the major SOS-control genes, recA and lexA, cause constitutive expression of the response, resulting in an increase in spontaneous mutations. These mutations, historically called "untargeted", have been the subject of many previous studies. Here we re-examine SOS-induced mutagenesis using mutation accumulation followed by whole-genome sequencing (MA/WGS), which allows a detailed picture of the types of mutations induced as well as their sequence-specificity. Our results confirm previous findings that SOS expression specifically induces transversion base-pair substitutions, with rates averaging about 60-fold above wild-type levels. Surprisingly, the rates of G:C to C:G transversions, normally an extremely rare mutation, were induced an average of 160-fold above wild-type levels. The SOS-induced transversion showed strong sequence specificity, the most extreme of which was the G:C to C:G transversions, 60% of which occurred at the middle base of 5'GGC3'+5'GCC3' sites, although these sites represent only 8% of the G:C base pairs in the genome. SOS-induced transversions were also DNA strand-biased, occurring, on average, 2- to 4- times more often when the purine was on the leading-strand template and the pyrimidine on the lagging-strand template than in the opposite orientation. However, the strand bias was also sequence specific, and even of reverse orientation at some sites. By eliminating constraints on the mutations that can be recovered, the MA/WGS protocol revealed new complexities of SOS "untargeted" mutations.


Escherichia coli/genetics , Mutagenesis , Mutation , SOS Response, Genetics , DNA, Bacterial/metabolism , DNA-Directed DNA Polymerase/metabolism , Mutation Rate , Whole Genome Sequencing
7.
mBio ; 10(4), 2019 07 02.
Article En | MEDLINE | ID: mdl-31266871

Mutation accumulation experiments followed by whole-genome sequencing have revealed that, for several bacterial species, the rate of base-pair substitutions (BPSs) is not constant across the chromosome but varies in a wave-like pattern that is symmetrical about the origin of replication. The experiments reported here demonstrated that, in Escherichia coli, several interacting factors determine the wave. The origin is a major driver of BPS rates. When it is relocated, the BPS rates in a 1,000-kb region surrounding the new origin reproduce the pattern that surrounds the normal origin. However, the pattern across distant regions of the chromosome is unaltered and thus must be determined by other factors. Increasing the deoxynucleoside triphosphate (dNTP) concentration shifts the wave pattern away from the origin, supporting the hypothesis that fluctuations in dNTP pools coincident with replication firing contribute to the variations in the mutation rate. The nucleoid binding proteins (HU and Fis) and the terminus organizing protein (MatP) are also major factors. These proteins alter the three-dimensional structure of the DNA, and results suggest that mutation rates increase when highly structured DNA is replicated. Biases in error correction by proofreading and mismatch repair, both of which may be responsive to dNTP concentrations and DNA structure, also are major determinants of the wave pattern. These factors should apply to most bacterial and, possibly, eukaryotic genomes and suggest that different areas of the genome evolve at different rates. IMPORTANCE It has been found in several species of bacteria that the rate at which single base pairs are mutated is not constant across the genome but varies in a wave-like pattern that is symmetrical about the origin of replication. Using Escherichia coli as our model system, we show that this pattern is the result of several interconnected factors. 
First, the timing and progression of replication are important in determining the wave pattern. Second, the three-dimensional structure of the DNA is also a factor, and the results suggest that mutation rates increase when highly structured DNA is replicated. Finally, biases in error correction, which may be responsive both to the progression of DNA synthesis and to DNA structure, are major determinants of the wave pattern. These factors should apply to most bacterial and, possibly, eukaryotic genomes and suggest that different areas of the genome evolve at different rates.


Base Pairing , Chromosomes, Bacterial , Escherichia coli/genetics , Mutation Rate , Point Mutation , Replication Origin , Escherichia coli Proteins/metabolism , Nucleosides/metabolism , Spatial Analysis
8.
Methods Mol Biol ; 1802: 235-247, 2018.
Article En | MEDLINE | ID: mdl-29858814

Accurate typing of human leukocyte antigen (HLA) is essential for successful organ transplantation, and HLA genes are heavily associated with various diseases. Widely used typing assays often involve a set of specially designed primers or probes requiring additional experiments. With the maturing of high-throughput sequencing (HTS) technologies, whole genome sequencing (WGS) as well as other HTS assays are becoming more accessible even in clinical settings. We describe various computational methods capable of directly typing HLA genes using HTS data, including Kourami, our HLA assembler. Kourami is the first HLA assembler capable of discovering novel alleles. Kourami assembles full-length sequences across the peptide-binding regions of HLA genes. Here, we focus on how a user would use Kourami on a new sample. We demonstrate the application by using Kourami to type HLA alleles from a recently published WGS data set with validated HLA types.


Algorithms , Histocompatibility Testing/methods , Alleles , Base Sequence , Genome, Human , Humans , Sequence Alignment , Software , Whole Genome Sequencing
9.
Genetics ; 209(4): 1029-1042, 2018 08.
Article En | MEDLINE | ID: mdl-29907647

Mismatch repair (MMR) is a major contributor to replication fidelity, but its impact varies with sequence context and the nature of the mismatch. Mutation accumulation experiments followed by whole-genome sequencing of MMR-defective Escherichia coli strains yielded ≈30,000 base-pair substitutions (BPSs), revealing mutational patterns across the entire chromosome. The BPS spectrum was dominated by A:T to G:C transitions, which occurred predominantly at the center base of 5'NAC3'+5'GTN3' triplets. Surprisingly, growth on minimal medium or at low temperature attenuated these mutations. Mononucleotide runs were also hotspots for BPSs, and the rate at which these occurred increased with run length. Comparison with ≈2000 BPSs accumulated in MMR-proficient strains revealed that both kinds of hotspots appeared in the wild-type spectrum and so are likely to be sites of frequent replication errors. In MMR-defective strains transitions were strand biased, occurring twice as often when A and C rather than T and G were on the lagging-strand template. Loss of nucleotide diphosphate kinase increases the cellular concentration of dCTP, which resulted in increased rates of mutations due to misinsertion of C opposite A and T. In an mmr ndk double mutant strain, these mutations were more frequent when the template A and T were on the leading strand, suggesting that lagging-strand synthesis was more error-prone, or less well corrected by proofreading, than was leading strand synthesis.
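The hotspot notation 5'NAC3'+5'GTN3' describes one sequence context read from either strand: an A followed by C on one strand is a T preceded by G on the complementary strand. The sketch below is a minimal illustration of how such center bases could be located on a single reference strand. The function name and the demo sequence are hypothetical, and this is not the authors' analysis pipeline.

```python
import re

# Zero-width lookaheads let overlapping triplets be reported.
NAC = re.compile(r"(?=[ACGT]AC)")  # A at the center, followed by C
GTN = re.compile(r"(?=GT[ACGT])")  # T at the center, preceded by G


def hotspot_centers(seq):
    """Return 0-based positions of A:T base pairs in the NAC/GTN context."""
    seq = seq.upper()
    centers = {m.start() + 1 for m in NAC.finditer(seq)}
    centers |= {m.start() + 1 for m in GTN.finditer(seq)}
    return sorted(centers)


demo = "TACGGTAAC"  # made-up sequence
print(hotspot_centers(demo))  # [1, 5, 7]
```

Positions 1 and 7 are the A of TAC and AAC, and position 5 is the T of GTA, which corresponds to TAC on the opposite strand.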


Amino Acid Substitution , DNA Mismatch Repair , Escherichia coli/genetics , Whole Genome Sequencing/methods , DNA Replication , Genome, Bacterial , Point Mutation
10.
Genetics ; 209(4): 1043-1054, 2018 08.
Article En | MEDLINE | ID: mdl-29907648

When the DNA polymerase that replicates the Escherichia coli chromosome, DNA polymerase III, makes an error, there are two primary defenses against mutation: proofreading by the ϵ subunit of the holoenzyme and mismatch repair. In proofreading-deficient strains, mismatch repair is partially saturated and the cell's response to DNA damage, the SOS response, may be partially induced. To investigate the nature of replication errors, we used mutation accumulation experiments and whole-genome sequencing to determine mutation rates and mutational spectra across the entire chromosome of strains deficient in proofreading, mismatch repair, and the SOS response. We report that a proofreading-deficient strain has a mutation rate 4000-fold greater than wild-type strains. While the SOS response may be induced in these cells, it does not contribute to the mutational load. Inactivating mismatch repair in a proofreading-deficient strain increases the mutation rate another 1.5-fold. DNA polymerase has a bias for converting G:C to A:T base pairs, but proofreading reduces the impact of these mutations, helping to maintain the genomic G:C content. These findings give an unprecedented view of how polymerase and error-correction pathways work together to maintain E. coli's low mutation rate of 1 per 1000 generations.


DNA Replication , DNA, Bacterial/genetics , Escherichia coli/genetics , Whole Genome Sequencing/methods , DNA Damage , DNA Mismatch Repair , DNA Polymerase III/metabolism , Escherichia coli Proteins/metabolism , Mutation Rate , SOS Response, Genetics
11.
Res Microbiol ; 169(3): 145-156, 2018 Apr.
Article En | MEDLINE | ID: mdl-29454026

Experimental evolution studies have characterized the genetic strategies microbes utilize to adapt to their environments, mainly focusing on how microbes adapt to constant and/or defined environments. Using a system that incubates Escherichia coli in different complex media in long-term batch culture, we have focused on how heterogeneity and environment affect adaptive landscapes. In this system, there is no passaging of cells, and therefore genetic diversity is lost only through negative selection, without the experimentally imposed bottlenecking common in other platforms. In contrast with other experimental evolution systems, because of cycling of nutrients and waste products, this is a heterogeneous environment, where selective pressures change over time, similar to natural environments. We determined that incubation in each environment leads to different adaptations by observing the growth advantage in stationary phase (GASP) phenotype. Re-sequencing whole genomes of populations identified both mutant alleles in a conserved set of genes and differences in evolutionary trajectories between environments. Reconstructing identified mutations in the parental strain background confirmed the adaptive advantage of some alleles, but also identified a surprising number of neutral or even deleterious mutations. This result indicates that complex epistatic interactions may be under positive selection within these heterogeneous environments.


Adaptation, Biological , Batch Cell Culture Techniques , Culture Media , Escherichia coli/physiology , Nutritional Physiological Phenomena , Alleles , Epistasis, Genetic , Gene Frequency , Gene-Environment Interaction , Genetic Fitness , Genetic Variation , Mutation
12.
Genome Biol ; 19(1): 16, 2018 02 07.
Article En | MEDLINE | ID: mdl-29415772

Accurate typing of human leukocyte antigen (HLA) is important because HLA genes play important roles in immune responses and disease genesis. Previously available computational methods are database-matching approaches and their outputs are inherently limited by the completeness of already known types, making them unsuitable for discovery of novel alleles. We have developed a graph-guided assembly technique for classical HLA genes, which can construct allele sequences given high-coverage whole-genome sequencing data. Our method delivers highly accurate HLA typing, comparable to the current state-of-the-art methods. Using various data, we also demonstrate that our method can type novel alleles.


Alleles , HLA Antigens/genetics , Histocompatibility Testing/methods , Genome , Haplotypes , Humans , Terminology as Topic , Whole Genome Sequencing
13.
mSystems ; 2(2), 2017.
Article En | MEDLINE | ID: mdl-28289732

Experimental evolution of bacterial populations in the laboratory has led to identification of several themes, including parallel evolution of populations adapting to carbon starvation, heat stress, and pH stress. However, most of these experiments study growth in defined and/or constant environments. We hypothesized that while there would likely continue to be parallelism in more complex and changing environments, there would also be more variation in what types of mutations would benefit the cells. In order to test our hypothesis, we serially passaged Escherichia coli in a complex medium (Luria-Bertani broth) throughout the five phases of bacterial growth. This passaging scheme allowed cells to experience a wide variety of stresses, including nutrient limitation, oxidative stress, and pH variation, and therefore allowed them to adapt to several conditions. After every ~30 generations of growth, for a total of ~300 generations, we compared both the growth phenotypes and genotypes of aged populations to the parent population. After as few as 30 generations, populations exhibit changes in growth phenotype and accumulate potentially adaptive mutations. There were many genes with mutant alleles in different populations, indicating potential parallel evolution. We examined 8 of these alleles by constructing the point mutations in the parental genetic background and competed those cells with the parent population; five of these alleles were found to be adaptive. The variety and swiftness of adaptive mutations arising in the populations indicate that the cells are adapting to a complex set of stresses, while the parallel nature of several of the mutations indicates that this behavior may be generalized to bacterial evolution. IMPORTANCE With a growing body of work directed toward understanding the mechanisms of evolution using experimental systems, it is crucial to decipher what effects the experimental setup has on the outcome. 
If the goal of experimental laboratory evolution is to elucidate underlying evolutionary mechanisms and trends, these must be demonstrated in a variety of systems and environments. Here, we perform experimental evolution in a complex medium allowing the cells to transition through all five phases of growth, including death phase and long-term stationary phase. We show that the swiftness of selection and the specific targets of adaptive evolution are different in this system compared to others. We also observe parallel evolution where different mutations in the same genes are under positive natural selection. Together, these data show that while some outcomes of microbial evolution experiments may be generalizable, many outcomes will be environment or system specific.

14.
Front Microbiol ; 7: 852, 2016.
Article En | MEDLINE | ID: mdl-27375574

Diversity-generating retroelements (DGRs) are genetic cassettes that can produce massive protein sequence variation in prokaryotes. Presumably DGRs confer selective advantages to their hosts (bacteria or viruses) by generating variants of target genes (typically resulting in target proteins with altered ligand-binding specificity) through a specialized error-prone reverse transcription process. The only extensively studied DGR system is from the Bordetella phage BPP-1, although DGRs are predicted to exist in other species. Using bioinformatics analysis, we discovered that the DGR system associated with the Treponema denticola species (a human oral-associated periopathogen) is dynamic (with gains/losses of the system found in the isolates) and diverse (with multiple types found in isolated genomes and the human microbiota). The T. denticola DGR is found in only nine of the 17 sequenced T. denticola strains. Analysis of the DGR-associated template regions and reverse transcriptase gene sequences revealed two types of DGR systems in T. denticola: the ATCC35405-type shared by seven isolates including ATCC35405; and the SP32-type shared by two isolates (SP32 and SP33), suggesting multiple DGR acquisitions. We detected additional variants of the T. denticola DGR systems in the human microbiomes, and found that the SP32-type DGR is more abundant than the ATCC35405-type in the healthy human oral microbiome, although the latter is found in more sequenced isolates. This is the first comprehensive study to characterize the DGRs associated with T. denticola in individual genomes as well as human microbiomes, demonstrating the importance of utilizing both individual genomes and metagenomes for characterizing the elements, and for analyzing their diversity and distribution in human populations.

15.
Nucleic Acids Res ; 44(15): 7109-19, 2016 09 06.
Article En | MEDLINE | ID: mdl-27431326

A majority of large-scale bacterial genome rearrangements involve mobile genetic elements such as insertion sequence (IS) elements. Here we report novel insertions and excisions of IS elements and recombination between homologous IS elements identified in a large collection of Escherichia coli mutation accumulation lines by analysis of whole genome shotgun sequencing data. Based on 857 identified events (758 IS insertions, 98 recombinations and 1 excision), we estimate that the rate of IS insertion is 3.5 × 10⁻⁴ insertions per genome per generation and the rate of IS homologous recombination is 4.5 × 10⁻⁵ recombinations per genome per generation. These events are mostly contributed by the IS elements IS1, IS2, IS5 and IS186. Spatial analysis of new insertions suggests that transposition is biased to proximal insertions, and the length spectrum of IS-caused deletions is largely explained by local hopping. For any of the ISs studied there is no region of the circular genome that is favored or disfavored for new insertions but there are notable hotspots for deletions. Some elements have preferences for non-coding sequence or for the beginning and end of coding regions, largely explained by target site motifs. Interestingly, transposition and deletion rates remain constant across the wild-type and 12 mutant E. coli lines, each deficient in a distinct DNA repair pathway. Finally, we characterized the target sites of four IS families, confirming previous results and characterizing a highly specific pattern at IS186 target sites, 5'-GGGG(N6/N7)CCCC-3'. We also detected 48 long deletions not involving IS elements.
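The IS186 target-site pattern 5'-GGGG(N6/N7)CCCC-3' reported above translates directly into a regular expression: four Gs, six or seven arbitrary bases, then four Cs. The sketch below is a minimal illustration of scanning a sequence for that pattern. The function name and the demo sequence are hypothetical, and it is not the authors' detection method. Because the reverse complement of the pattern has the same form, scanning one strand is enough to find candidate sites.

```python
import re

# Lookahead with a capture group so overlapping candidate sites are reported.
IS186_TARGET = re.compile(r"(?=(GGGG[ACGT]{6,7}CCCC))")


def find_target_sites(seq):
    """Return (0-based start, matched site) for each candidate site in seq."""
    return [(m.start(), m.group(1)) for m in IS186_TARGET.finditer(seq.upper())]


demo = "TTGGGGACGTAGCCCCAA"  # made-up sequence with a six-base spacer
print(find_target_sites(demo))  # [(2, 'GGGGACGTAGCCCC')]
```

When both a six-base and a seven-base spacer would fit at the same start position, the greedy quantifier reports the longer site.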


DNA Transposable Elements/genetics , Escherichia coli/genetics , Genome, Bacterial/genetics , Mutagenesis, Insertional/genetics , Base Sequence , Evolution, Molecular
16.
Proc Natl Acad Sci U S A ; 113(18): E2498-505, 2016 May 03.
Article En | MEDLINE | ID: mdl-27091991

Although it is well known that microbial populations can respond adaptively to challenges from antibiotics, empirical difficulties in distinguishing the roles of de novo mutation and natural selection have left several issues unresolved. Here, we explore the mutational properties of Escherichia coli exposed to long-term sublethal levels of the antibiotic norfloxacin, using a mutation accumulation design combined with whole-genome sequencing of replicate lines. The genome-wide mutation rate significantly increases with norfloxacin concentration. This response is associated with enhanced expression of error-prone DNA polymerases and may also involve indirect effects of norfloxacin on DNA mismatch and oxidative-damage repair. Moreover, we find that acquisition of antibiotic resistance can be enhanced solely by accelerated mutagenesis, i.e., without direct involvement of selection. Our results suggest that antibiotics may generally enhance the mutation rates of target cells, thereby accelerating the rate of adaptation not only to the antibiotic itself but to additional challenges faced by invasive pathogens.


Escherichia coli/genetics , Genome, Bacterial/genetics , Genomic Instability/genetics , Mutagenesis/genetics , Mutation/genetics , Norfloxacin/administration & dosage , Anti-Bacterial Agents/administration & dosage , DNA Damage/genetics , DNA Repair/drug effects , DNA Repair/genetics , Dose-Response Relationship, Drug , Escherichia coli/drug effects , Evolution, Molecular , Genome, Bacterial/drug effects , Genomic Instability/drug effects , Mutagenesis/drug effects , Mutation/drug effects
17.
Proc Natl Acad Sci U S A ; 113(8): 2176-81, 2016 Feb 23.
Article En | MEDLINE | ID: mdl-26839411

The rate of cytosine deamination is much higher in single-stranded DNA (ssDNA) than in double-stranded DNA, and copying the resulting uracils causes C to T mutations. To study this phenomenon, the catalytic domain of APOBEC3G (A3G-CTD), an ssDNA-specific cytosine deaminase, was expressed in an Escherichia coli strain defective in uracil repair (ung mutant), and the mutations that accumulated over thousands of generations were determined by whole-genome sequencing. C:G to T:A transitions dominated, with significantly more cytosines mutated to thymine in the lagging-strand template (LGST) than in the leading-strand template (LDST). This strand bias was present in both repair-defective and repair-proficient cells and was strongest and highly significant in cells expressing A3G-CTD. These results show that the LGST is accessible to cellular cytosine deaminating agents, explains the well-known GC skew in microbial genomes, and suggests the APOBEC3 family of mutators may target the LGST in the human genome.


Escherichia coli/genetics , Escherichia coli/metabolism , APOBEC-3G Deaminase , Base Sequence , Cytidine Deaminase/genetics , Cytidine Deaminase/metabolism , Cytosine/metabolism , DNA/genetics , DNA/metabolism , DNA Repair/genetics , DNA Replication , DNA, Bacterial/genetics , DNA, Bacterial/metabolism , DNA, Single-Stranded/genetics , DNA, Single-Stranded/metabolism , Deamination , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Genes, Bacterial , Humans , Mutation , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Thymine/metabolism , Uracil/metabolism , Uracil-DNA Glycosidase/genetics , Uracil-DNA Glycosidase/metabolism
18.
Genome Announc ; 3(6), 2015 Nov 05.
Article En | MEDLINE | ID: mdl-26543129

Caedibacter varicaedens is a kappa killer endosymbiont bacterium of the ciliate Paramecium biaurelia. Here, we present the draft genome sequence of C. varicaedens.

19.
Proc Natl Acad Sci U S A ; 112(44): E5990-9, 2015 Nov 03.
Article En | MEDLINE | ID: mdl-26460006

A complete understanding of evolutionary processes requires that factors determining spontaneous mutation rates and spectra be identified and characterized. Using mutation accumulation followed by whole-genome sequencing, we found that the mutation rates of three widely diverged commensal Escherichia coli strains differ only by about 50%, suggesting that a rate of 1-2 × 10⁻³ mutations per generation per genome is common for this bacterium. Four major forces are postulated to contribute to spontaneous mutations: intrinsic DNA polymerase errors, endogenously induced DNA damage, DNA damage caused by exogenous agents, and the activities of error-prone polymerases. To determine the relative importance of these factors, we studied 11 strains, each defective for a major DNA repair pathway. The striking result was that only loss of the ability to prevent or repair oxidative DNA damage significantly impacted mutation rates or spectra. These results suggest that, with the exception of oxidative damage, endogenously induced DNA damage does not perturb the overall accuracy of DNA replication in normally growing cells and that repair pathways may exist primarily to defend against exogenously induced DNA damage. The thousands of mutations caused by oxidative damage recovered across the entire genome revealed strong local-sequence biases of these mutations. Specifically, we found that the identity of the 3' base can affect the mutability of a purine by oxidative damage by as much as eightfold.
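
As a reading aid, the per-genome rate range quoted in this abstract can be converted to an approximate per-nucleotide rate. The sketch below is illustrative only: the genome size of 4.6 Mb is an assumed round figure for E. coli and is not stated in the abstract (commensal strains differ in genome size), and the function name is hypothetical.

```python
def per_site_rate(per_genome_rate: float, genome_size_bp: float) -> float:
    """Convert mutations per genome per generation to mutations per site per generation."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return per_genome_rate / genome_size_bp


# ASSUMPTION: roughly 4.6 Mb, a typical E. coli genome size; not taken from the abstract.
ASSUMED_GENOME_SIZE_BP = 4.6e6

# Lower and upper bounds of the per-genome rate range given in the abstract.
low = per_site_rate(1e-3, ASSUMED_GENOME_SIZE_BP)
high = per_site_rate(2e-3, ASSUMED_GENOME_SIZE_BP)

print(f"per-site rate: {low:.2e} to {high:.2e} per nucleotide per generation")
```

Under the assumed genome size, the range works out to a few times 10⁻¹⁰ mutations per nucleotide per generation; a different genome size scales the result proportionally.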


Escherichia coli/genetics , Genes, Bacterial , Mutation , Alkylation , DNA Repair
20.
Mol Biol Evol ; 32(9): 2383-92, 2015 Sep.
Article En | MEDLINE | ID: mdl-25976352

Deinococcus bacteria are extremely resistant to radiation, oxidation, and desiccation. Resilience to these factors has been suggested to be due to enhanced damage prevention and repair mechanisms, as well as highly efficient antioxidant protection systems. Here, using mutation-accumulation experiments, we find that the GC-rich Deinococcus radiodurans has an overall background genomic mutation rate similar to that of E. coli, but differs in mutation spectrum, with the A/T to G/C mutation rate (based on a total count of 88 A:T → G:C transitions and 82 A:T → C:G transversions) per site per generation higher than that in the other direction (based on a total count of 157 G:C → A:T transitions and 33 G:C → T:A transversions). We propose that this unique spectrum is shaped mainly by the abundant uracil DNA glycosylases reducing G:C → A:T transitions, adenine methylation elevating A:T → C:G transversions, and absence of cytosine methylation decreasing G:C → A:T transitions. As opposed to the greater than 100× elevation of the mutation rate in MMR⁻ (DNA mismatch repair-deficient) strains of most other organisms, MMR⁻ D. radiodurans only exhibits a 4-fold elevation, raising the possibility that other DNA repair mechanisms compensate for a relatively low-efficiency DNA MMR pathway. As D. radiodurans has plentiful insertion sequence (IS) elements in the genome and the activities of IS elements are rarely directly explored, we also estimated the insertion (transposition) rate of the IS elements to be 2.50 × 10⁻³ per genome per generation in the wild-type strain; knocking out MMR did not elevate the IS element insertion rate in this organism.
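
The raw counts in this abstract (170 A/T to G/C events versus 190 in the other direction) may look at odds with the statement that the A/T to G/C rate is higher. The difference comes from normalizing by the number of sites available to mutate in a GC-rich genome. The sketch below illustrates that normalization; the GC content of 0.67 is an assumed approximate value for D. radiodurans that is not given in the abstract, and the function name is hypothetical.

```python
def per_site_bias(at_to_gc_count: int, gc_to_at_count: int, gc_content: float) -> float:
    """Ratio of the per-site A/T->G/C rate to the per-site G/C->A/T rate.

    Both counts come from the same experiment, so the number of lines,
    generations, and the genome length cancel out of the ratio; only the
    fractions of A/T and G/C sites remain.
    """
    if not 0.0 < gc_content < 1.0:
        raise ValueError("gc_content must be strictly between 0 and 1")
    at_rate = at_to_gc_count / (1.0 - gc_content)  # events per unit of A/T sites
    gc_rate = gc_to_at_count / gc_content          # events per unit of G/C sites
    return at_rate / gc_rate


# Counts taken from the abstract.
AT_TO_GC = 88 + 82    # A:T -> G:C transitions plus A:T -> C:G transversions
GC_TO_AT = 157 + 33   # G:C -> A:T transitions plus G:C -> T:A transversions

# ASSUMPTION: approximate genome-wide GC fraction; not stated in the abstract.
ASSUMED_GC_CONTENT = 0.67

bias = per_site_bias(AT_TO_GC, GC_TO_AT, ASSUMED_GC_CONTENT)
print(f"per-site A/T->G/C vs G/C->A/T rate ratio: {bias:.2f}")
```

With the assumed GC content, the per-site A/T to G/C rate comes out roughly 1.8 times the reverse rate, even though the raw A/T to G/C count is the smaller of the two. The exact ratio depends on the true base composition of the sequenced sites.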


DNA, Bacterial/genetics , Deinococcus/genetics , Bacterial Proteins/genetics , DNA Damage , DNA Methylation , DNA Repair , Deinococcus/enzymology , Genes, Bacterial , Genetic Drift , Mutagenesis, Insertional , Mutation Rate , Plasmids/genetics , Point Mutation , Radiation Tolerance , Uracil-DNA Glycosidase/genetics
...