1.
PLoS Comput Biol ; 20(5): e1012067, 2024 May.
Article in English | MEDLINE | ID: mdl-38709825

ABSTRACT

Chromosome conformation capture (3C) technologies reveal the incredible complexity of genome organization. Maps of increasing size, depth, and resolution are now used to probe genome architecture across cell states, types, and organisms. Larger datasets add challenges at each step of computational analysis, from storage and memory constraints to researchers' time; however, analysis tools that meet these increased resource demands have not kept pace. Furthermore, existing tools offer limited support for customizing analysis for specific use cases or new biology. Here we introduce cooltools (https://github.com/open2c/cooltools), a suite of computational tools that enables flexible, scalable, and reproducible analysis of high-resolution contact frequency data. Cooltools leverages the widely adopted cooler format, which handles storage and access for high-resolution datasets. Cooltools provides a paired command line interface (CLI) and Python application programming interface (API), which respectively facilitate workflows on high-performance computing clusters and in interactive analysis environments. In short, cooltools enables the effective use of the latest and largest genome folding datasets.


Subject(s)
Computational Biology , Software , Computational Biology/methods , Programming Languages , Genomics/methods , Genome/genetics , Chromosome Mapping/methods , Humans
2.
Mol Cell ; 84(8): 1422-1441.e14, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38521067

ABSTRACT

The topological state of chromosomes determines their mechanical properties, dynamics, and function. Recent work indicated that interphase chromosomes are largely free of entanglements. Here, we use Hi-C, polymer simulations, and multi-contact 3C and find that, by contrast, mitotic chromosomes are self-entangled. We explore how a mitotic self-entangled state is converted into an unentangled interphase state during mitotic exit. Most mitotic entanglements are removed during anaphase/telophase, with remaining ones removed during early G1, in a topoisomerase-II-dependent process. Polymer models suggest a two-stage disentanglement pathway: first, decondensation of mitotic chromosomes with remaining condensin loops produces entropic forces that bias topoisomerase II activity toward decatenation. At the second stage, the loops are released, and the formation of new entanglements is prevented by lower topoisomerase II activity, allowing the establishment of unentangled and territorial G1 chromosomes. When mitotic entanglements are not removed in experiments and models, a normal interphase state cannot be acquired.


Subject(s)
Chromosomes , DNA Topoisomerases, Type II , DNA Topoisomerases, Type II/genetics , Chromosomes/genetics , Mitosis/genetics , Interphase/genetics , Polymers
3.
Nat Genet ; 56(5): 900-912, 2024 May.
Article in English | MEDLINE | ID: mdl-38388848

ABSTRACT

Whole chromosome and arm-level copy number alterations occur at high frequencies in tumors, but their selective advantages, if any, are poorly understood. Here, utilizing unbiased whole chromosome genetic screens combined with in vitro evolution to generate arm- and subarm-level events, we iteratively selected the fittest karyotypes from aneuploidized human renal and mammary epithelial cells. Proliferation-based karyotype selection in these epithelial lines modeled tissue-specific tumor aneuploidy patterns in patient cohorts in the absence of driver mutations. Hi-C-based translocation mapping revealed that arm-level events usually emerged in multiples of two via centromeric translocations and occurred more frequently in tetraploids than diploids, contributing to the increased diversity in evolving tetraploid populations. Isogenic clonal lineages enabled elucidation of pro-tumorigenic mechanisms associated with common copy number alterations, revealing Notch signaling potentiation as a driver of 1q gain in breast cancer. We propose that intrinsic, tissue-specific proliferative effects underlie tumor copy number patterns in cancer.


Subject(s)
Aneuploidy , Humans , Female , Breast Neoplasms/genetics , Breast Neoplasms/pathology , DNA Copy Number Variations , Neoplasms/genetics , Neoplasms/pathology , Translocation, Genetic , Evolution, Molecular , Cell Proliferation/genetics , Receptors, Notch/genetics , Receptors, Notch/metabolism , Organ Specificity/genetics , Epithelial Cells/metabolism , Epithelial Cells/pathology
4.
bioRxiv ; 2024 Jan 07.
Article in English | MEDLINE | ID: mdl-38260419

ABSTRACT

The expression of a precise mRNA transcriptome is crucial for establishing cell identity and function, with dozens of alternative isoforms produced for a single gene sequence. The regulation of mRNA isoform usage occurs by the coordination of co-transcriptional mRNA processing mechanisms across a gene. Decisions involved in mRNA initiation and termination underlie the largest extent of mRNA isoform diversity, but little is known about any relationships between decisions at both ends of mRNA molecules. Here, we systematically profile the joint usage of mRNA transcription start sites (TSSs) and polyadenylation sites (PASs) across tissues and species. Using both short- and long-read RNA-seq data, we observe that mRNAs preferentially using upstream TSSs also tend to use upstream PASs, and congruently, the usage of downstream sites is similarly paired. This observation suggests that mRNA 5' end choice may directly influence mRNA 3' ends. Our results suggest a novel "Positional Initiation-Termination Axis" (PITA), in which the usage of alternative terminal sites is coupled based on the order in which they appear in the genome. PITA isoforms are more likely to encode alternative protein domains and use conserved sites. PITA is strongly associated with the length of genomic features, such that PITA is enriched in longer genes with more area devoted to regions that regulate alternative 5' or 3' ends. Strikingly, we found that PITA genes are more likely than non-PITA genes to have multiple, overlapping chromatin structural domains related to pairing of ordinally coupled start and end sites. In turn, PITA coupling is also associated with fast RNA Polymerase II (RNAPII) trafficking across these long gene regions. Our findings indicate that a combination of spatial and kinetic mechanisms couples transcription initiation and mRNA 3' end decisions based on ordinal position to define the expression of mRNA isoforms.

5.
bioRxiv ; 2023 Feb 15.
Article in English | MEDLINE | ID: mdl-36824968

ABSTRACT

The field of 3D genome organization produces large amounts of sequencing data from Hi-C and a rapidly expanding set of other chromosome conformation protocols (3C+). Massive and heterogeneous 3C+ data require high-performance and flexible processing of sequenced reads into contact pairs. To meet these challenges, we present pairtools, a flexible suite of tools for contact extraction from sequencing data. Pairtools provides modular command-line interface (CLI) tools that can be flexibly chained into data processing pipelines. Pairtools provides both crucial core tools and auxiliary tools for building feature-rich 3C+ pipelines, including contact pair manipulation, filtration, and quality control. Benchmarking pairtools against popular 3C+ data pipelines shows advantages of pairtools for high-performance and flexible 3C+ analysis. Finally, pairtools provides protocol-specific tools for multi-way contacts, haplotype-resolved contacts, and single-cell Hi-C. The combination of CLI tools and tight integration with Python data analysis libraries makes pairtools a versatile foundation for a broad range of 3C+ pipelines.

6.
Nat Struct Mol Biol ; 29(12): 1239-1251, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36482254

ABSTRACT

Cohesin-mediated loop extrusion has been shown to be blocked at specific cis-elements, including CTCF sites, producing patterns of loops and domain boundaries along chromosomes. Here we explore such cis-elements and their role in gene regulation. We find that transcription termination sites of active genes form cohesin- and RNA polymerase II-dependent domain boundaries that do not accumulate cohesin. At these sites, cohesin is first stalled and then rapidly unloaded. Start sites of transcriptionally active genes form cohesin-bound boundaries, as shown before, but are cohesin-independent. Together with cohesin loading, possibly at enhancers, these sites create a pattern of cohesin traffic that guides enhancer-promoter interactions. Disrupting this traffic pattern, by removing CTCF, renders cells sensitive to knockout of genes involved in transcription initiation, such as the SAGA complexes, and RNA processing, such as DEAD/H-Box RNA helicases. Without CTCF, these factors are less efficiently recruited to active promoters.


Subject(s)
Chromatin , Chromosomal Proteins, Non-Histone , CCCTC-Binding Factor/genetics , Chromosomal Proteins, Non-Histone/metabolism , Cell Cycle Proteins/metabolism , Cohesins
7.
Nature ; 606(7915): 812-819, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35676475

ABSTRACT

DNA replication occurs through an intricately regulated series of molecular events and is fundamental for genome stability [1,2]. At present, it is unknown how the locations of replication origins are determined in the human genome. Here we dissect the role of topologically associating domains (TADs) [3-6], subTADs [7] and loops [8] in the positioning of replication initiation zones (IZs). We stratify TADs and subTADs by the presence of corner-dots indicative of loops and the orientation of CTCF motifs. We find that high-efficiency, early replicating IZs localize to boundaries between adjacent corner-dot TADs anchored by high-density arrays of divergently and convergently oriented CTCF motifs. By contrast, low-efficiency IZs localize to weaker dotless boundaries. Following ablation of cohesin-mediated loop extrusion during G1, high-efficiency IZs become diffuse and delocalized at boundaries with complex CTCF motif orientations. Moreover, G1 knockdown of the cohesin unloading factor WAPL results in gained long-range loops and narrowed localization of IZs at the same boundaries. Finally, targeted deletion or insertion of specific boundaries causes local replication timing shifts consistent with IZ loss or gain, respectively. Our data support a model in which cohesin-mediated loop extrusion and stalling at a subset of genetically encoded TAD and subTAD boundaries is an essential determinant of the locations of replication origins in human S phase.


Subject(s)
Cell Cycle Proteins , Chromatin , Chromosomal Proteins, Non-Histone , Replication Origin , Cell Cycle Proteins/metabolism , Chromatin/genetics , Chromosomal Proteins, Non-Histone/metabolism , DNA Replication , Humans , Replication Origin/genetics , S Phase , Cohesins
8.
Nat Methods ; 18(9): 1046-1055, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34480151

ABSTRACT

Chromosome conformation capture (3C) assays are used to map chromatin interactions genome-wide. Chromatin interaction maps provide insights into the spatial organization of chromosomes and the mechanisms by which they fold. Hi-C and Micro-C are widely used 3C protocols that differ in key experimental parameters, including cross-linking chemistry and chromatin fragmentation strategy. To understand how the choice of experimental protocol determines the ability to detect and quantify aspects of chromosome folding, we have performed a systematic evaluation of 3C experimental parameters. We identified optimal protocol variants for either loop or compartment detection, optimizing fragment size and cross-linking chemistry. We used this knowledge to develop a greatly improved Hi-C protocol (Hi-C 3.0) that can detect both loops and compartments relatively effectively. In addition to providing benchmarked protocols, this work produced ultra-deep chromatin interaction maps using Micro-C, conventional Hi-C and Hi-C 3.0 for key cell lines used by the 4D Nucleome project.


Subject(s)
Chromatin/chemistry , Chromosomes, Human/chemistry , Cross-Linking Reagents/chemistry , Genetic Techniques , Cell Line , Chromatin/metabolism , Databases, Factual , Human Embryonic Stem Cells/cytology , Human Embryonic Stem Cells/physiology , Humans
9.
Nat Genet ; 53(3): 367-378, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33574602

ABSTRACT

Nuclear compartmentalization of active and inactive chromatin is thought to occur through microphase separation mediated by interactions between loci of similar type. The nature and dynamics of these interactions are not known. We developed liquid chromatin Hi-C to map the stability of associations between loci. Before fixation and Hi-C, chromosomes are fragmented, which removes strong polymeric constraint, enabling detection of intrinsic locus-locus interaction stabilities. Compartmentalization is stable when fragments are larger than 10-25 kb. Fragmentation of chromatin into pieces smaller than 6 kb leads to gradual loss of genome organization. Lamin-associated domains are most stable, whereas interactions for speckle- and polycomb-associated loci are more dynamic. Cohesin-mediated loops dissolve after fragmentation. Liquid chromatin Hi-C provides a genome-wide view of chromosome interaction dynamics.


Subject(s)
Chromatin/chemistry , Chromatin/metabolism , Chromosomes, Human/chemistry , Cell Compartmentation , Cell Cycle Proteins/metabolism , Cell Nucleus/chemistry , Cell Nucleus/genetics , Chromatin/genetics , Chromatin Assembly and Disassembly , Chromosomal Proteins, Non-Histone/metabolism , Chromosomes, Human/metabolism , Half-Life , Humans , K562 Cells , Kinetics , Cohesins
10.
JCI Insight ; 6(3), 2021 Feb 08.
Article in English | MEDLINE | ID: mdl-33351783

ABSTRACT

The cohesin complex plays an essential role in chromosome maintenance and transcriptional regulation. Recurrent somatic mutations in the cohesin complex are frequent genetic drivers in cancer, including myelodysplastic syndromes (MDS) and acute myeloid leukemia (AML). Here, using genetic dependency screens of stromal antigen 2-mutant (STAG2-mutant) AML, we identified DNA damage repair and replication as genetic dependencies in cohesin-mutant cells. We demonstrated increased levels of DNA damage and sensitivity of cohesin-mutant cells to poly(ADP-ribose) polymerase (PARP) inhibition. We developed a mouse model of MDS in which Stag2 mutations arose as clonal secondary lesions in the background of clonal hematopoiesis driven by tet methylcytosine dioxygenase 2 (Tet2) mutations and demonstrated selective depletion of cohesin-mutant cells with PARP inhibition in vivo. Finally, we demonstrated a shift from STAG2- to STAG1-containing cohesin complexes in cohesin-mutant cells, which was associated with longer DNA loop extrusion, more intermixing of chromatin compartments, and increased interaction with PARP and replication protein A complex. Our findings inform the biology and therapeutic opportunities for cohesin-mutant malignancies.


Subject(s)
Cell Cycle Proteins/genetics , Cell Cycle Proteins/metabolism , Chromosomal Proteins, Non-Histone/genetics , Chromosomal Proteins, Non-Histone/metabolism , DNA Repair/genetics , Leukemia, Myeloid, Acute/genetics , Leukemia, Myeloid, Acute/metabolism , Mutation , Myelodysplastic Syndromes/genetics , Myelodysplastic Syndromes/metabolism , Animals , Cell Line, Tumor , Chromatin/genetics , Chromatin/metabolism , DNA Damage , Disease Models, Animal , Female , Humans , K562 Cells , Leukemia, Myeloid, Acute/drug therapy , Male , Mice , Mice, Inbred C57BL , Mice, Inbred NOD , Mice, Mutant Strains , Mice, SCID , Mice, Transgenic , Myelodysplastic Syndromes/drug therapy , Nuclear Proteins/genetics , Phthalazines/pharmacology , Poly(ADP-ribose) Polymerase Inhibitors/pharmacology , U937 Cells , Xenograft Model Antitumor Assays , Cohesins
11.
EMBO J ; 39(21): e99520, 2020 Nov 02.
Article in English | MEDLINE | ID: mdl-32935369

ABSTRACT

Vertebrate genomes replicate according to a precise temporal program strongly correlated with their organization into A/B compartments. The molecular mechanisms underlying the establishment of early-replicating domains remain largely unknown. We defined two minimal cis-element modules containing a strong replication origin and chromatin modifier binding sites capable of shifting a targeted mid-late-replicating region to earlier replication. The two origins overlap with a constitutive or a silent tissue-specific promoter. When inserted side-by-side, these modules advance replication timing over a 250 kb region through cooperation with one endogenous origin located 30 kb away. Moreover, when inserted at two chromosomal sites separated by 30 kb, these two modules come into close physical proximity and form an early-replicating domain establishing more contacts with active A compartments. The synergy depends on the presence of the active promoter/origin. Our results show that clustering of strong origins located at active promoters can establish early-replicating domains.


Subject(s)
DNA Replication Timing , DNA Replication , Promoter Regions, Genetic , Actins/genetics , Binding Sites , Chromatin , Chromosomes , Cluster Analysis , Epigenomics , Humans , Replication Origin , beta-Globins/genetics
12.
Mol Cell ; 78(3): 554-565.e7, 2020 May 07.
Article in English | MEDLINE | ID: mdl-32213324

ABSTRACT

Over the past decade, 3C-related methods have provided remarkable insights into chromosome folding in vivo. To overcome the limited resolution of prior studies, we extend a recently developed Hi-C variant, Micro-C, to map chromosome architecture at nucleosome resolution in human ESCs and fibroblasts. Micro-C robustly captures known features of chromosome folding including compartment organization, topologically associating domains, and interactions between CTCF binding sites. In addition, Micro-C provides a detailed map of nucleosome positions and localizes contact domain boundaries with nucleosomal precision. Compared to Hi-C, Micro-C exhibits an order of magnitude greater dynamic range, allowing the identification of ∼20,000 additional loops in each cell type. Many newly identified peaks are localized along extrusion stripes and form transitive grids, consistent with their anchors being pause sites impeding cohesin-dependent loop extrusion. Our analyses comprise the highest-resolution maps of chromosome folding in human cells to date, providing a valuable resource for studies of chromosome organization.


Subject(s)
Chromosomes, Human/ultrastructure , Animals , CCCTC-Binding Factor/metabolism , Cells, Cultured , Chromatin/chemistry , Chromosomes, Mammalian/ultrastructure , Embryonic Stem Cells/cytology , Fibroblasts/cytology , Humans , Male , Mammals/genetics , Nucleosomes/metabolism , Nucleosomes/ultrastructure , Signal-To-Noise Ratio
13.
Nat Cell Biol ; 21(11): 1393-1402, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31685986

ABSTRACT

Chromosome folding is modulated as cells progress through the cell cycle. During mitosis, condensins fold chromosomes into helical loop arrays. In interphase, the cohesin complex generates loops and topologically associating domains (TADs), while a separate process of compartmentalization drives segregation of active and inactive chromatin. We used synchronized cell cultures to determine how the mitotic chromosome conformation transforms into the interphase state. Using high-throughput chromosome conformation capture (Hi-C) analysis, chromatin binding assays and immunofluorescence, we show that, by telophase, condensin-mediated loops are lost and a transient folding intermediate is formed that is devoid of most loops. By cytokinesis, cohesin-mediated CTCF-CTCF loops and the positions of TADs emerge. Compartment boundaries are also established early, but long-range compartmentalization is a slow process and proceeds for hours after cells enter G1. Our results reveal the kinetics and order of events by which the interphase chromosome state is formed and identify telophase as a critical transition between condensin- and cohesin-driven chromosome folding.


Subject(s)
Adenosine Triphosphatases/genetics , Cell Cycle Proteins/genetics , Chromatin/metabolism , Chromosomal Proteins, Non-Histone/genetics , DNA-Binding Proteins/genetics , Multiprotein Complexes/genetics , Telophase , Adenosine Triphosphatases/metabolism , Cell Compartmentation/genetics , Cell Cycle Proteins/metabolism , Cell Line, Transformed , Chromatin/ultrastructure , Chromosomal Proteins, Non-Histone/metabolism , Chromosome Mapping , Cytokinesis/genetics , DNA-Binding Proteins/metabolism , Gene Expression , HeLa Cells , Humans , Interphase , Multiprotein Complexes/metabolism , S Phase , Cohesins
14.
Mol Biol Cell ; 30(21): 2626-2638, 2019 Oct 01.
Article in English | MEDLINE | ID: mdl-31433728

ABSTRACT

Mammalian cells express two oligosaccharyltransferase complexes, STT3A and STT3B, that have distinct roles in N-linked glycosylation. The STT3A complex interacts directly with the protein translocation channel to mediate glycosylation of proteins using an N-terminal-to-C-terminal scanning mechanism. N-linked glycosylation of proteins in budding yeast has been assumed to be a cotranslational reaction. We have compared glycosylation of several glycoproteins in yeast and mammalian cells. Prosaposin, a cysteine-rich protein that contains STT3A-dependent glycosylation sites, is poorly glycosylated in yeast cells and STT3A-deficient human cells. In contrast, a protein with extreme C-terminal glycosylation sites was efficiently glycosylated in yeast by a posttranslocational mechanism. Posttranslocational glycosylation was also observed for carboxypeptidase Y-derived reporter proteins that contain closely spaced acceptor sites. A comparison of two recent protein structures indicates that the yeast OST is unable to interact with the yeast heptameric Sec complex via an evolutionarily conserved interface due to occupation of the OST binding site by the Sec63 protein. The efficiency of glycosylation in yeast is not enhanced for proteins that are translocated by the Sec61 or Ssh1 translocation channels instead of the Sec complex. We conclude that N-linked glycosylation and protein translocation are not directly coupled in yeast cells.


Subject(s)
Asparagine/metabolism , Endoplasmic Reticulum/metabolism , Glycoproteins/metabolism , Hexosyltransferases/metabolism , Membrane Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Glycoproteins/genetics , Glycosylation , HEK293 Cells , Heat-Shock Proteins/genetics , Heat-Shock Proteins/metabolism , Hexosyltransferases/genetics , Humans , Membrane Proteins/genetics , Membrane Transport Proteins/genetics , Membrane Transport Proteins/metabolism , Protein Binding , Protein Transport , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism
15.
J Cell Biol ; 218(8): 2782-2796, 2019 Aug 05.
Article in English | MEDLINE | ID: mdl-31296534

ABSTRACT

Human cells express two oligosaccharyltransferase complexes (STT3A and STT3B) with partially overlapping functions. The STT3A complex interacts directly with the protein translocation channel to mediate cotranslational glycosylation, while the STT3B complex can catalyze posttranslocational glycosylation. We used a quantitative glycoproteomics procedure to compare glycosylation of roughly 1,000 acceptor sites in wild type and mutant cells. Analysis of site occupancy data disclosed several new classes of STT3A-dependent acceptor sites including those with suboptimal flanking sequences and sites located within cysteine-rich protein domains. Acceptor sites located in short loops of multi-spanning membrane proteins represent a new class of STT3B-dependent site. Remarkably, the lumenal ER chaperone GRP94 was hyperglycosylated in STT3A-deficient cells, bearing glycans on five silent sites in addition to the normal glycosylation site. GRP94 was also hyperglycosylated in wild-type cells treated with ER stress inducers including thapsigargin, dithiothreitol, and NGI-1.


Subject(s)
Glycoproteins/metabolism , Hexosyltransferases/metabolism , Membrane Proteins/metabolism , Proteomics , Glycosylation , HEK293 Cells , HSP70 Heat-Shock Proteins/metabolism , HeLa Cells , Humans
16.
J Mol Biol ; 430(8): 1098-1115, 2018 Apr 13.
Article in English | MEDLINE | ID: mdl-29466705

ABSTRACT

The fitness effects of synonymous mutations can provide insights into biological and evolutionary mechanisms. We analyzed the experimental fitness effects of all single-nucleotide mutations, including synonymous substitutions, at the beginning of the influenza A virus hemagglutinin (HA) gene. Many synonymous substitutions were deleterious both in bulk competition and for individually isolated clones. Investigating protein and RNA levels of a subset of individually expressed HA variants revealed that multiple biochemical properties contribute to the observed experimental fitness effects. Our results indicate that a structural element in the HA segment viral RNA may influence fitness. Examination of naturally evolved sequences in human hosts indicates a preference for the unfolded state of this structural element compared to that found in swine hosts. Our overall results reveal that synonymous mutations may have greater fitness consequences than indicated by simple models of sequence conservation, and we discuss the implications of this finding for commonly used evolutionary tests and analyses.


Subject(s)
Genetic Fitness , Hemagglutinin Glycoproteins, Influenza Virus/chemistry , Hemagglutinin Glycoproteins, Influenza Virus/genetics , Influenza A Virus, H1N1 Subtype/growth & development , Silent Mutation , Amino Acid Substitution , Animals , Dogs , Evolution, Molecular , HEK293 Cells , Humans , Influenza A Virus, H1N1 Subtype/genetics , Madin Darby Canine Kidney Cells , Models, Molecular , Phylogeny , RNA Folding , Swine , Virus Replication
17.
Mol Biol Evol ; 35(1): 211-224, 2018 Jan 01.
Article in English | MEDLINE | ID: mdl-29106597

ABSTRACT

Prokaryotes evolved to thrive in an extremely diverse set of habitats, and their proteomes bear signatures of environmental conditions. Although correlations between amino acid usage and environmental temperature are well-documented, understanding of the mechanisms of thermal adaptation remains incomplete. Here, we couple the energetic costs of protein folding and protein homeostasis to build a microscopic model explaining both the overall amino acid composition and its temperature trends. Low biosynthesis costs lead to low diversity of physical interactions between amino acid residues, which in turn makes proteins less stable and drives up chaperone activity to maintain appropriate levels of folded, functional proteins. Assuming that the cost of chaperone activity is proportional to the fraction of unfolded client proteins, we simulated thermal adaptation of model proteins subject to minimization of the total cost of amino acid synthesis and chaperone activity. For the first time, we predicted both the proteome-average amino acid abundances and their temperature trends simultaneously, and found strong correlations between model predictions and 402 genomes of bacteria and archaea. The energetic constraint on protein evolution is more apparent in highly expressed proteins, selected by codon adaptation index. We found that in bacteria, highly expressed proteins are similar in composition to thermophilic ones, whereas in archaea no correlation between predicted expression level and thermostability was observed. At the same time, thermal adaptations of highly expressed proteins in bacteria and archaea are nearly identical, suggesting that universal energetic constraints prevail over the phylogenetic differences between these domains of life.


Subject(s)
Adaptation, Physiological/physiology , Prokaryotic Cells/metabolism , Proteostasis/physiology , Acclimatization/genetics , Acclimatization/physiology , Adaptation, Physiological/genetics , Amino Acids/genetics , Archaea/genetics , Archaea/metabolism , Bacteria/genetics , Bacteria/metabolism , Biological Evolution , Codon/metabolism , Computer Simulation , Evolution, Molecular , Hot Temperature , Phylogeny , Prokaryotic Cells/physiology , Proteome/genetics , Temperature
18.
Nat Commun ; 8: 14614, 2017 Mar 06.
Article in English | MEDLINE | ID: mdl-28262665

ABSTRACT

Sequence divergence of orthologous proteins enables adaptation to environmental stresses and promotes evolution of novel functions. Limits on evolution imposed by constraints on sequence and structure were explored using a model TIM barrel protein, indole-3-glycerol phosphate synthase (IGPS). Fitness effects of point mutations in three phylogenetically divergent IGPS proteins during adaptation to temperature stress were probed by auxotrophic complementation of yeast with prokaryotic, thermophilic IGPS. Analysis of beneficial mutations pointed to an unexpected, long-range allosteric pathway towards the active site of the protein. Significant correlations between the fitness landscapes of distant orthologues implicate both sequence and structure as primary forces in defining the TIM barrel fitness landscape and suggest that fitness landscapes can be translocated in sequence space. Exploration of fitness landscapes in the context of a protein fold provides a strategy for elucidating the sequence-structure-fitness relationships in other common motifs.


Subject(s)
Indole-3-Glycerol-Phosphate Synthase/chemistry , Mutation , Sulfolobus solfataricus/chemistry , Thermotoga maritima/chemistry , Thermus thermophilus/chemistry , Amino Acid Sequence , Binding Sites , Cloning, Molecular , Evolution, Molecular , Gene Expression , Genetic Vectors/chemistry , Genetic Vectors/metabolism , Indole-3-Glycerol-Phosphate Synthase/genetics , Indole-3-Glycerol-Phosphate Synthase/metabolism , Kinetics , Models, Molecular , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Structural Homology, Protein , Substrate Specificity , Sulfolobus solfataricus/enzymology , Thermodynamics , Thermotoga maritima/enzymology , Thermus thermophilus/enzymology
19.
J Chem Phys ; 143(5): 055101, 2015 Aug 07.
Article in English | MEDLINE | ID: mdl-26254668

ABSTRACT

Evolution of proteins in bacteria and archaea living in different conditions leads to significant correlations between amino acid usage and environmental temperature. The origins of these correlations are poorly understood, and an important question of protein theory, physics-based prediction of types of amino acids overrepresented in highly thermostable proteins, remains largely unsolved. Here, we extend the random energy model of protein folding by weighting the interaction energies of amino acids by their frequencies in protein sequences and predict the energy gap of proteins designed to fold well at elevated temperatures. To test the model, we present a novel scalable algorithm for simultaneous energy calculation for many sequences in many structures, targeting massively parallel computing architectures such as graphics processing units (GPUs). The energy calculation is performed by multiplying two matrices, one representing the complete set of sequences, and the other describing the contact maps of all structural templates. An implementation of the algorithm for the CUDA platform is available at http://www.github.com/kzeldovich/galeprot and calculates protein folding energies over 250 times faster than a single central processing unit. Analysis of amino acid usage in 64-mer cubic lattice proteins designed to fold well at different temperatures demonstrates excellent agreement between theoretical and simulated values of energy gap. The theoretical predictions of temperature trends of amino acid frequencies are significantly correlated with bioinformatics data on 191 bacteria and archaea, and highlight protein folding constraints as a fundamental selection pressure during thermal adaptation in biological evolution.
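
The matrix formulation described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' CUDA implementation: the function names, the array layouts, and the use of a generic symmetric contact potential are assumptions made for illustration. Each sequence becomes a row of pairwise interaction energies over all residue pairs (i < j), each structural template becomes a column of 0/1 contacts over the same pairs, and a single matrix product then gives the energy of every sequence in every structure.

```python
import numpy as np


def sequence_matrix(seqs, potential):
    """Rows = sequences, columns = residue pairs (i < j).

    seqs      : (S, L) integer array of amino acid indices (hypothetical encoding)
    potential : (A, A) symmetric matrix of pairwise interaction energies
    """
    i, j = np.triu_indices(seqs.shape[1], k=1)
    # Entry [s, p] is the interaction energy of the two residues in pair p of sequence s.
    return potential[seqs[:, i], seqs[:, j]]


def contact_matrix(contact_maps):
    """Rows = residue pairs (i < j), columns = structural templates.

    contact_maps : (T, L, L) array of 0/1 contacts, one map per template
    """
    i, j = np.triu_indices(contact_maps.shape[1], k=1)
    # Entry [p, t] is 1 if pair p is in contact in template t, else 0.
    return contact_maps[:, i, j].T.astype(float)


def folding_energies(seqs, contact_maps, potential):
    """Energy of every sequence in every structure as one matrix product, shape (S, T)."""
    return sequence_matrix(seqs, potential) @ contact_matrix(contact_maps)
```

Calling `folding_energies(seqs, contact_maps, potential)` returns an array whose entry `[s, t]` is the sum of pair energies over the contacts of template `t` for sequence `s`. Because the work reduces to one dense product, the same layout maps naturally onto GPU matrix-multiplication routines, which is the property the abstract highlights.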


Subject(s)
Adaptation, Physiological , Models, Molecular , Proteins/metabolism , Temperature , Algorithms , Evolution, Molecular , Protein Folding , Proteins/chemistry , Time Factors
20.
Mol Biol Evol ; 32(6): 1519-32, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25713211

ABSTRACT

Influenza A virus (IAV) has a segmented genome that allows for the exchange of genome segments between different strains. This reassortment accelerates evolution by breaking linkage, helping IAV cross species barriers to potentially create highly virulent strains. Challenges associated with monitoring the process of reassortment in molecular detail have limited our understanding of its evolutionary implications. We applied a novel deep sequencing approach with quantitative analysis to assess the in vitro temporal evolution of genomic reassortment in IAV. The combination of H1N1 and H3N2 strains reproducibly generated a new H1N2 strain with the hemagglutinin and nucleoprotein segments originating from H1N1 and the remaining six segments from H3N2. By deep sequencing the entire viral genome, we monitored the evolution of reassortment, quantifying the relative abundance of all IAV genome segments from the two parent strains over time and measuring the selection coefficients of the reassorting segments. Additionally, we observed several mutations coemerging with reassortment that were not found during passaging of pure parental IAV strains. Our results demonstrate how reassortment of the segmented genome can accelerate viral evolution in IAV, potentially enabled by the emergence of a small number of individual mutations.
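
The abstract does not say how the selection coefficients were estimated. As a hedged illustration only, the sketch below uses a standard population-genetics estimator rather than the authors' method: when one variant of a segment competes against another, the log-odds of its frequency, ln(p / (1 - p)), changes approximately linearly in time, and the slope is the selection coefficient per unit time. The function name, the `eps` clipping parameter, and the time unit (for example, passages) are assumptions.

```python
import numpy as np


def selection_coefficient(times, freqs, eps=1e-9):
    """Estimate a selection coefficient as the slope of log-odds versus time.

    times : sampling times (e.g. passage numbers; hypothetical unit)
    freqs : relative abundance of the focal segment variant at each time, in [0, 1]
    eps   : clipping bound that keeps the logarithm finite at frequencies of 0 or 1
    """
    t = np.asarray(times, dtype=float)
    p = np.clip(np.asarray(freqs, dtype=float), eps, 1.0 - eps)
    log_odds = np.log(p / (1.0 - p))
    # Least-squares straight-line fit; the slope is the estimate of s.
    slope, _intercept = np.polyfit(t, log_odds, 1)
    return slope
```

A positive value means the focal variant is gaining relative abundance against the competing parental segment, and a negative value means it is being lost. With noisy sequencing data, a weighted fit or a likelihood-based estimator would usually be preferable to this plain least-squares slope.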


Subject(s)
Alphainfluenzavirus/genetics , Genome, Viral , Reassortant Viruses/genetics , Selection, Genetic , Animals , Computational Biology , Dogs , Evolution, Molecular , Gene Frequency , Genotype , Hemagglutinin Glycoproteins, Influenza Virus/genetics , High-Throughput Nucleotide Sequencing , Influenza A Virus, H1N1 Subtype/genetics , Influenza A Virus, H1N2 Subtype/genetics , Influenza A Virus, H3N2 Subtype/genetics , Limit of Detection , Madin Darby Canine Kidney Cells , Nucleoproteins/genetics , Sequence Analysis, RNA