Results 1 - 20 of 24
1.
PLoS Comput Biol ; 20(5): e1012164, 2024 May 29.
Article in English | MEDLINE | ID: mdl-38809952

ABSTRACT

The field of 3D genome organization produces large amounts of sequencing data from Hi-C and a rapidly-expanding set of other chromosome conformation protocols (3C+). Massive and heterogeneous 3C+ data require high-performance and flexible processing of sequenced reads into contact pairs. To meet these challenges, we present pairtools, a flexible suite of tools for contact extraction from sequencing data. Pairtools provides modular command-line interface (CLI) tools that can be flexibly chained into data processing pipelines. The core operations provided by pairtools are parsing of .sam alignments into Hi-C pairs, sorting, and removal of PCR duplicates. In addition, pairtools provides auxiliary tools for building feature-rich 3C+ pipelines, including contact pair manipulation, filtration, and quality control. Benchmarking pairtools against popular 3C+ data pipelines shows advantages of pairtools for high-performance and flexible 3C+ analysis. Finally, pairtools provides protocol-specific tools for restriction-based protocols, haplotype-resolved contacts, and single-cell Hi-C. The combination of CLI tools and tight integration with Python data analysis libraries makes pairtools a versatile foundation for a broad range of 3C+ pipelines.
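
The two core operations named in the abstract above, sorting contact pairs and removing PCR duplicates, can be illustrated with a toy sketch. This is not pairtools code and does not use its API: the `Pair` record and the `flip`, `sort_pairs`, and `dedup` helpers are hypothetical names, and the duplicate criterion (same chromosomes and strands, positions within `max_mismatch` bp on both sides) is an assumption made for illustration.

```python
from typing import List, NamedTuple


class Pair(NamedTuple):
    """One contact: two aligned read ends (hypothetical record layout)."""
    chrom1: str
    pos1: int
    strand1: str
    chrom2: str
    pos2: int
    strand2: str


def flip(p: Pair) -> Pair:
    """Order the two sides so that side 1 never comes after side 2."""
    if (p.chrom1, p.pos1) > (p.chrom2, p.pos2):
        return Pair(p.chrom2, p.pos2, p.strand2, p.chrom1, p.pos1, p.strand1)
    return p


def sort_pairs(pairs: List[Pair]) -> List[Pair]:
    """Flip every pair, then sort by chromosome pair and positions."""
    return sorted(
        (flip(p) for p in pairs),
        key=lambda p: (p.chrom1, p.chrom2, p.pos1, p.pos2),
    )


def dedup(sorted_pairs: List[Pair], max_mismatch: int = 0) -> List[Pair]:
    """Drop pairs that match an already kept pair within max_mismatch bp.

    Assumes the input was produced by sort_pairs, so candidates for a
    match sit at the end of the kept list.
    """
    kept: List[Pair] = []
    for p in sorted_pairs:
        is_dup = False
        for q in reversed(kept):
            # Stop once we leave the chromosome pair or the pos1 window.
            if (q.chrom1, q.chrom2) != (p.chrom1, p.chrom2):
                break
            if p.pos1 - q.pos1 > max_mismatch:
                break
            if (q.strand1 == p.strand1 and q.strand2 == p.strand2
                    and abs(q.pos2 - p.pos2) <= max_mismatch):
                is_dup = True
                break
        if not is_dup:
            kept.append(p)
    return kept
```

Sorting first is what makes the backward scan cheap: any potential duplicate of a pair is adjacent to it in sort order, so the scan can stop as soon as the first position falls outside the tolerance window.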

2.
PLoS Comput Biol ; 20(5): e1012067, 2024 May.
Article in English | MEDLINE | ID: mdl-38709825

ABSTRACT

Chromosome conformation capture (3C) technologies reveal the incredible complexity of genome organization. Maps of increasing size, depth, and resolution are now used to probe genome architecture across cell states, types, and organisms. Larger datasets add challenges at each step of computational analysis, from storage and memory constraints to researchers' time; however, analysis tools that meet these increased resource demands have not kept pace. Furthermore, existing tools offer limited support for customizing analysis for specific use cases or new biology. Here we introduce cooltools (https://github.com/open2c/cooltools), a suite of computational tools that enables flexible, scalable, and reproducible analysis of high-resolution contact frequency data. Cooltools leverages the widely-adopted cooler format which handles storage and access for high-resolution datasets. Cooltools provides a paired command line interface (CLI) and Python application programming interface (API), which respectively facilitate workflows on high-performance computing clusters and in interactive analysis environments. In short, cooltools enables the effective use of the latest and largest genome folding datasets.


Subject(s)
Computational Biology; Software; Computational Biology/methods; Programming Languages; Genomics/methods; Genome/genetics; Chromosome Mapping/methods; Humans
3.
Mol Cell ; 84(8): 1422-1441.e14, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38521067

ABSTRACT

The topological state of chromosomes determines their mechanical properties, dynamics, and function. Recent work indicated that interphase chromosomes are largely free of entanglements. Here, we use Hi-C, polymer simulations, and multi-contact 3C and find that, by contrast, mitotic chromosomes are self-entangled. We explore how a mitotic self-entangled state is converted into an unentangled interphase state during mitotic exit. Most mitotic entanglements are removed during anaphase/telophase, with remaining ones removed during early G1, in a topoisomerase-II-dependent process. Polymer models suggest a two-stage disentanglement pathway: first, decondensation of mitotic chromosomes with remaining condensin loops produces entropic forces that bias topoisomerase II activity toward decatenation. At the second stage, the loops are released, and the formation of new entanglements is prevented by lower topoisomerase II activity, allowing the establishment of unentangled and territorial G1 chromosomes. When mitotic entanglements are not removed in experiments and models, a normal interphase state cannot be acquired.


Subject(s)
Chromosomes; DNA Topoisomerases, Type II; DNA Topoisomerases, Type II/genetics; Chromosomes/genetics; Mitosis/genetics; Interphase/genetics; Polymers
4.
Nat Genet ; 56(5): 900-912, 2024 May.
Article in English | MEDLINE | ID: mdl-38388848

ABSTRACT

Whole chromosome and arm-level copy number alterations occur at high frequencies in tumors, but their selective advantages, if any, are poorly understood. Here, utilizing unbiased whole chromosome genetic screens combined with in vitro evolution to generate arm- and subarm-level events, we iteratively selected the fittest karyotypes from aneuploidized human renal and mammary epithelial cells. Proliferation-based karyotype selection in these epithelial lines modeled tissue-specific tumor aneuploidy patterns in patient cohorts in the absence of driver mutations. Hi-C-based translocation mapping revealed that arm-level events usually emerged in multiples of two via centromeric translocations and occurred more frequently in tetraploids than diploids, contributing to the increased diversity in evolving tetraploid populations. Isogenic clonal lineages enabled elucidation of pro-tumorigenic mechanisms associated with common copy number alterations, revealing Notch signaling potentiation as a driver of 1q gain in breast cancer. We propose that intrinsic, tissue-specific proliferative effects underlie tumor copy number patterns in cancer.


Subject(s)
Aneuploidy; Humans; Female; Breast Neoplasms/genetics; Breast Neoplasms/pathology; DNA Copy Number Variations; Neoplasms/genetics; Neoplasms/pathology; Translocation, Genetic; Evolution, Molecular; Cell Proliferation/genetics; Receptors, Notch/genetics; Receptors, Notch/metabolism; Organ Specificity/genetics; Epithelial Cells/metabolism; Epithelial Cells/pathology
5.
bioRxiv ; 2024 Jan 07.
Article in English | MEDLINE | ID: mdl-38260419

ABSTRACT

The expression of a precise mRNA transcriptome is crucial for establishing cell identity and function, with dozens of alternative isoforms produced for a single gene sequence. The regulation of mRNA isoform usage occurs by the coordination of co-transcriptional mRNA processing mechanisms across a gene. Decisions involved in mRNA initiation and termination underlie the largest extent of mRNA isoform diversity, but little is known about any relationships between decisions at both ends of mRNA molecules. Here, we systematically profile the joint usage of mRNA transcription start sites (TSSs) and polyadenylation sites (PASs) across tissues and species. Using both short and long read RNA-seq data, we observe that mRNAs preferentially using upstream TSSs also tend to use upstream PASs, and congruently, the usage of downstream sites is similarly paired. This observation suggests that mRNA 5' end choice may directly influence mRNA 3' ends. Our results suggest a novel "Positional Initiation-Termination Axis" (PITA), in which the usage of alternative terminal sites is coupled based on the order in which they appear in the genome. PITA isoforms are more likely to encode alternative protein domains and use conserved sites. PITA is strongly associated with the length of genomic features, such that PITA is enriched in longer genes with more area devoted to regions that regulate alternative 5' or 3' ends. Strikingly, we found that PITA genes are more likely than non-PITA genes to have multiple, overlapping chromatin structural domains related to pairing of ordinally coupled start and end sites. In turn, PITA coupling is also associated with fast RNA Polymerase II (RNAPII) trafficking across these long gene regions. Our findings indicate that a combination of spatial and kinetic mechanisms couples transcription initiation and mRNA 3' end decisions based on ordinal position to define the expression of mRNA isoforms.

6.
bioRxiv ; 2023 Feb 15.
Article in English | MEDLINE | ID: mdl-36824968

ABSTRACT

The field of 3D genome organization produces large amounts of sequencing data from Hi-C and a rapidly-expanding set of other chromosome conformation protocols (3C+). Massive and heterogeneous 3C+ data require high-performance and flexible processing of sequenced reads into contact pairs. To meet these challenges, we present pairtools, a flexible suite of tools for contact extraction from sequencing data. Pairtools provides modular command-line interface (CLI) tools that can be flexibly chained into data processing pipelines. Pairtools provides both crucial core tools and auxiliary tools for building feature-rich 3C+ pipelines, including contact pair manipulation, filtration, and quality control. Benchmarking pairtools against popular 3C+ data pipelines shows advantages of pairtools for high-performance and flexible 3C+ analysis. Finally, pairtools provides protocol-specific tools for multi-way contacts, haplotype-resolved contacts, and single-cell Hi-C. The combination of CLI tools and tight integration with Python data analysis libraries makes pairtools a versatile foundation for a broad range of 3C+ pipelines.

7.
Nat Struct Mol Biol ; 29(12): 1239-1251, 2022 12.
Article in English | MEDLINE | ID: mdl-36482254

ABSTRACT

Cohesin-mediated loop extrusion has been shown to be blocked at specific cis-elements, including CTCF sites, producing patterns of loops and domain boundaries along chromosomes. Here we explore such cis-elements and their role in gene regulation. We find that transcription termination sites of active genes form cohesin- and RNA polymerase II-dependent domain boundaries that do not accumulate cohesin. At these sites, cohesin is first stalled and then rapidly unloaded. Start sites of transcriptionally active genes form cohesin-bound boundaries, as shown before, but are cohesin-independent. Together with cohesin loading, possibly at enhancers, these sites create a pattern of cohesin traffic that guides enhancer-promoter interactions. Disrupting this traffic pattern, by removing CTCF, renders cells sensitive to knockout of genes involved in transcription initiation, such as the SAGA complexes, and RNA processing, such as DEAD/H-Box RNA helicases. Without CTCF, these factors are less efficiently recruited to active promoters.


Subject(s)
Chromatin; Chromosomal Proteins, Non-Histone; CCCTC-Binding Factor/genetics; Chromosomal Proteins, Non-Histone/metabolism; Cell Cycle Proteins/metabolism; Cohesins
8.
Nature ; 606(7915): 812-819, 2022 06.
Article in English | MEDLINE | ID: mdl-35676475

ABSTRACT

DNA replication occurs through an intricately regulated series of molecular events and is fundamental for genome stability. At present, it is unknown how the locations of replication origins are determined in the human genome. Here we dissect the role of topologically associating domains (TADs), subTADs and loops in the positioning of replication initiation zones (IZs). We stratify TADs and subTADs by the presence of corner-dots indicative of loops and the orientation of CTCF motifs. We find that high-efficiency, early replicating IZs localize to boundaries between adjacent corner-dot TADs anchored by high-density arrays of divergently and convergently oriented CTCF motifs. By contrast, low-efficiency IZs localize to weaker dotless boundaries. Following ablation of cohesin-mediated loop extrusion during G1, high-efficiency IZs become diffuse and delocalized at boundaries with complex CTCF motif orientations. Moreover, G1 knockdown of the cohesin unloading factor WAPL results in gained long-range loops and narrowed localization of IZs at the same boundaries. Finally, targeted deletion or insertion of specific boundaries causes local replication timing shifts consistent with IZ loss or gain, respectively. Our data support a model in which cohesin-mediated loop extrusion and stalling at a subset of genetically encoded TAD and subTAD boundaries is an essential determinant of the locations of replication origins in human S phase.


Subject(s)
Cell Cycle Proteins; Chromatin; Chromosomal Proteins, Non-Histone; Replication Origin; Cell Cycle Proteins/metabolism; Chromatin/genetics; Chromosomal Proteins, Non-Histone/metabolism; DNA Replication; Humans; Replication Origin/genetics; S Phase; Cohesins
9.
Nat Methods ; 18(9): 1046-1055, 2021 09.
Article in English | MEDLINE | ID: mdl-34480151

ABSTRACT

Chromosome conformation capture (3C) assays are used to map chromatin interactions genome-wide. Chromatin interaction maps provide insights into the spatial organization of chromosomes and the mechanisms by which they fold. Hi-C and Micro-C are widely used 3C protocols that differ in key experimental parameters including cross-linking chemistry and chromatin fragmentation strategy. To understand how the choice of experimental protocol determines the ability to detect and quantify aspects of chromosome folding we have performed a systematic evaluation of 3C experimental parameters. We identified optimal protocol variants for either loop or compartment detection, optimizing fragment size and cross-linking chemistry. We used this knowledge to develop a greatly improved Hi-C protocol (Hi-C 3.0) that can detect both loops and compartments relatively effectively. In addition to providing benchmarked protocols, this work produced ultra-deep chromatin interaction maps using Micro-C, conventional Hi-C and Hi-C 3.0 for key cell lines used by the 4D Nucleome project.


Subject(s)
Chromatin/chemistry; Chromosomes, Human/chemistry; Cross-Linking Reagents/chemistry; Genetic Techniques; Cell Line; Chromatin/metabolism; Databases, Factual; Human Embryonic Stem Cells/cytology; Human Embryonic Stem Cells/physiology; Humans
10.
Nat Genet ; 53(3): 367-378, 2021 03.
Article in English | MEDLINE | ID: mdl-33574602

ABSTRACT

Nuclear compartmentalization of active and inactive chromatin is thought to occur through microphase separation mediated by interactions between loci of similar type. The nature and dynamics of these interactions are not known. We developed liquid chromatin Hi-C to map the stability of associations between loci. Before fixation and Hi-C, chromosomes are fragmented, which removes strong polymeric constraint, enabling detection of intrinsic locus-locus interaction stabilities. Compartmentalization is stable when fragments are larger than 10-25 kb. Fragmentation of chromatin into pieces smaller than 6 kb leads to gradual loss of genome organization. Lamin-associated domains are most stable, whereas interactions for speckle- and polycomb-associated loci are more dynamic. Cohesin-mediated loops dissolve after fragmentation. Liquid chromatin Hi-C provides a genome-wide view of chromosome interaction dynamics.


Subject(s)
Chromatin/chemistry; Chromatin/metabolism; Chromosomes, Human/chemistry; Cell Compartmentation; Cell Cycle Proteins/metabolism; Cell Nucleus/chemistry; Cell Nucleus/genetics; Chromatin/genetics; Chromatin Assembly and Disassembly; Chromosomal Proteins, Non-Histone/metabolism; Chromosomes, Human/metabolism; Half-Life; Humans; K562 Cells; Kinetics; Cohesins
11.
JCI Insight ; 6(3), 2021 02 08.
Article in English | MEDLINE | ID: mdl-33351783

ABSTRACT

The cohesin complex plays an essential role in chromosome maintenance and transcriptional regulation. Recurrent somatic mutations in the cohesin complex are frequent genetic drivers in cancer, including myelodysplastic syndromes (MDS) and acute myeloid leukemia (AML). Here, using genetic dependency screens of stromal antigen 2-mutant (STAG2-mutant) AML, we identified DNA damage repair and replication as genetic dependencies in cohesin-mutant cells. We demonstrated increased levels of DNA damage and sensitivity of cohesin-mutant cells to poly(ADP-ribose) polymerase (PARP) inhibition. We developed a mouse model of MDS in which Stag2 mutations arose as clonal secondary lesions in the background of clonal hematopoiesis driven by tet methylcytosine dioxygenase 2 (Tet2) mutations and demonstrated selective depletion of cohesin-mutant cells with PARP inhibition in vivo. Finally, we demonstrated a shift from STAG2- to STAG1-containing cohesin complexes in cohesin-mutant cells, which was associated with longer DNA loop extrusion, more intermixing of chromatin compartments, and increased interaction with PARP and replication protein A complex. Our findings inform the biology and therapeutic opportunities for cohesin-mutant malignancies.


Subject(s)
Cell Cycle Proteins/genetics; Cell Cycle Proteins/metabolism; Chromosomal Proteins, Non-Histone/genetics; Chromosomal Proteins, Non-Histone/metabolism; DNA Repair/genetics; Leukemia, Myeloid, Acute/genetics; Leukemia, Myeloid, Acute/metabolism; Mutation; Myelodysplastic Syndromes/genetics; Myelodysplastic Syndromes/metabolism; Animals; Cell Line, Tumor; Chromatin/genetics; Chromatin/metabolism; DNA Damage; Disease Models, Animal; Female; Humans; K562 Cells; Leukemia, Myeloid, Acute/drug therapy; Male; Mice; Mice, Inbred C57BL; Mice, Inbred NOD; Mice, Mutant Strains; Mice, SCID; Mice, Transgenic; Myelodysplastic Syndromes/drug therapy; Nuclear Proteins/genetics; Phthalazines/pharmacology; Poly(ADP-ribose) Polymerase Inhibitors/pharmacology; U937 Cells; Xenograft Model Antitumor Assays; Cohesins
12.
EMBO J ; 39(21): e99520, 2020 11 02.
Article in English | MEDLINE | ID: mdl-32935369

ABSTRACT

Vertebrate genomes replicate according to a precise temporal program strongly correlated with their organization into A/B compartments. Until now, the molecular mechanisms underlying the establishment of early-replicating domains remain largely unknown. We defined two minimal cis-element modules containing a strong replication origin and chromatin modifier binding sites capable of shifting a targeted mid-late-replicating region for earlier replication. The two origins overlap with a constitutive or a silent tissue-specific promoter. When inserted side-by-side, these modules advance replication timing over a 250 kb region through the cooperation with one endogenous origin located 30 kb away. Moreover, when inserted at two chromosomal sites separated by 30 kb, these two modules come into close physical proximity and form an early-replicating domain establishing more contacts with active A compartments. The synergy depends on the presence of the active promoter/origin. Our results show that clustering of strong origins located at active promoters can establish early-replicating domains.


Subject(s)
DNA Replication Timing; DNA Replication; Promoter Regions, Genetic; Actins/genetics; Binding Sites; Chromatin; Chromosomes; Cluster Analysis; Epigenomics; Humans; Replication Origin; beta-Globins/genetics
13.
Mol Cell ; 78(3): 554-565.e7, 2020 05 07.
Article in English | MEDLINE | ID: mdl-32213324

ABSTRACT

Over the past decade, 3C-related methods have provided remarkable insights into chromosome folding in vivo. To overcome the limited resolution of prior studies, we extend a recently developed Hi-C variant, Micro-C, to map chromosome architecture at nucleosome resolution in human ESCs and fibroblasts. Micro-C robustly captures known features of chromosome folding including compartment organization, topologically associating domains, and interactions between CTCF binding sites. In addition, Micro-C provides a detailed map of nucleosome positions and localizes contact domain boundaries with nucleosomal precision. Compared to Hi-C, Micro-C exhibits an order of magnitude greater dynamic range, allowing the identification of ∼20,000 additional loops in each cell type. Many newly identified peaks are localized along extrusion stripes and form transitive grids, consistent with their anchors being pause sites impeding cohesin-dependent loop extrusion. Our analyses comprise the highest-resolution maps of chromosome folding in human cells to date, providing a valuable resource for studies of chromosome organization.


Subject(s)
Chromosomes, Human/ultrastructure; Animals; CCCTC-Binding Factor/metabolism; Cells, Cultured; Chromatin/chemistry; Chromosomes, Mammalian/ultrastructure; Embryonic Stem Cells/cytology; Fibroblasts/cytology; Humans; Male; Mammals/genetics; Nucleosomes/metabolism; Nucleosomes/ultrastructure; Signal-To-Noise Ratio
14.
Nat Cell Biol ; 21(11): 1393-1402, 2019 11.
Article in English | MEDLINE | ID: mdl-31685986

ABSTRACT

Chromosome folding is modulated as cells progress through the cell cycle. During mitosis, condensins fold chromosomes into helical loop arrays. In interphase, the cohesin complex generates loops and topologically associating domains (TADs), while a separate process of compartmentalization drives segregation of active and inactive chromatin. We used synchronized cell cultures to determine how the mitotic chromosome conformation transforms into the interphase state. Using high-throughput chromosome conformation capture (Hi-C) analysis, chromatin binding assays and immunofluorescence, we show that, by telophase, condensin-mediated loops are lost and a transient folding intermediate is formed that is devoid of most loops. By cytokinesis, cohesin-mediated CTCF-CTCF loops and the positions of TADs emerge. Compartment boundaries are also established early, but long-range compartmentalization is a slow process and proceeds for hours after cells enter G1. Our results reveal the kinetics and order of events by which the interphase chromosome state is formed and identify telophase as a critical transition between condensin- and cohesin-driven chromosome folding.


Subject(s)
Adenosine Triphosphatases/genetics; Cell Cycle Proteins/genetics; Chromatin/metabolism; Chromosomal Proteins, Non-Histone/genetics; DNA-Binding Proteins/genetics; Multiprotein Complexes/genetics; Telophase; Adenosine Triphosphatases/metabolism; Cell Compartmentation/genetics; Cell Cycle Proteins/metabolism; Cell Line, Transformed; Chromatin/ultrastructure; Chromosomal Proteins, Non-Histone/metabolism; Chromosome Mapping; Cytokinesis/genetics; DNA-Binding Proteins/metabolism; Gene Expression; HeLa Cells; Humans; Interphase; Multiprotein Complexes/metabolism; S Phase; Cohesins
15.
Mol Biol Cell ; 30(21): 2626-2638, 2019 10 01.
Article in English | MEDLINE | ID: mdl-31433728

ABSTRACT

Mammalian cells express two oligosaccharyltransferase complexes, STT3A and STT3B, that have distinct roles in N-linked glycosylation. The STT3A complex interacts directly with the protein translocation channel to mediate glycosylation of proteins using an N-terminal-to-C-terminal scanning mechanism. N-linked glycosylation of proteins in budding yeast has been assumed to be a cotranslational reaction. We have compared glycosylation of several glycoproteins in yeast and mammalian cells. Prosaposin, a cysteine-rich protein that contains STT3A-dependent glycosylation sites, is poorly glycosylated in yeast cells and STT3A-deficient human cells. In contrast, a protein with extreme C-terminal glycosylation sites was efficiently glycosylated in yeast by a posttranslocational mechanism. Posttranslocational glycosylation was also observed for carboxypeptidase Y-derived reporter proteins that contain closely spaced acceptor sites. A comparison of two recent protein structures indicates that the yeast OST is unable to interact with the yeast heptameric Sec complex via an evolutionarily conserved interface due to occupation of the OST binding site by the Sec63 protein. The efficiency of glycosylation in yeast is not enhanced for proteins that are translocated by the Sec61 or Ssh1 translocation channels instead of the Sec complex. We conclude that N-linked glycosylation and protein translocation are not directly coupled in yeast cells.


Subject(s)
Asparagine/metabolism; Endoplasmic Reticulum/metabolism; Glycoproteins/metabolism; Hexosyltransferases/metabolism; Membrane Proteins/metabolism; Saccharomyces cerevisiae/metabolism; Glycoproteins/genetics; Glycosylation; HEK293 Cells; Heat-Shock Proteins/genetics; Heat-Shock Proteins/metabolism; Hexosyltransferases/genetics; Humans; Membrane Proteins/genetics; Membrane Transport Proteins/genetics; Membrane Transport Proteins/metabolism; Protein Binding; Protein Transport; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism
16.
J Cell Biol ; 218(8): 2782-2796, 2019 08 05.
Article in English | MEDLINE | ID: mdl-31296534

ABSTRACT

Human cells express two oligosaccharyltransferase complexes (STT3A and STT3B) with partially overlapping functions. The STT3A complex interacts directly with the protein translocation channel to mediate cotranslational glycosylation, while the STT3B complex can catalyze posttranslocational glycosylation. We used a quantitative glycoproteomics procedure to compare glycosylation of roughly 1,000 acceptor sites in wild type and mutant cells. Analysis of site occupancy data disclosed several new classes of STT3A-dependent acceptor sites including those with suboptimal flanking sequences and sites located within cysteine-rich protein domains. Acceptor sites located in short loops of multi-spanning membrane proteins represent a new class of STT3B-dependent site. Remarkably, the lumenal ER chaperone GRP94 was hyperglycosylated in STT3A-deficient cells, bearing glycans on five silent sites in addition to the normal glycosylation site. GRP94 was also hyperglycosylated in wild-type cells treated with ER stress inducers including thapsigargin, dithiothreitol, and NGI-1.


Subject(s)
Glycoproteins/metabolism; Hexosyltransferases/metabolism; Membrane Proteins/metabolism; Proteomics; Glycosylation; HEK293 Cells; HSP70 Heat-Shock Proteins/metabolism; HeLa Cells; Humans
17.
J Mol Biol ; 430(8): 1098-1115, 2018 04 13.
Article in English | MEDLINE | ID: mdl-29466705

ABSTRACT

The fitness effects of synonymous mutations can provide insights into biological and evolutionary mechanisms. We analyzed the experimental fitness effects of all single-nucleotide mutations, including synonymous substitutions, at the beginning of the influenza A virus hemagglutinin (HA) gene. Many synonymous substitutions were deleterious both in bulk competition and for individually isolated clones. Investigating protein and RNA levels of a subset of individually expressed HA variants revealed that multiple biochemical properties contribute to the observed experimental fitness effects. Our results indicate that a structural element in the HA segment viral RNA may influence fitness. Examination of naturally evolved sequences in human hosts indicates a preference for the unfolded state of this structural element compared to that found in swine hosts. Our overall results reveal that synonymous mutations may have greater fitness consequences than indicated by simple models of sequence conservation, and we discuss the implications of this finding for commonly used evolutionary tests and analyses.


Subject(s)
Genetic Fitness; Hemagglutinin Glycoproteins, Influenza Virus/chemistry; Hemagglutinin Glycoproteins, Influenza Virus/genetics; Influenza A Virus, H1N1 Subtype/growth & development; Silent Mutation; Amino Acid Substitution; Animals; Dogs; Evolution, Molecular; HEK293 Cells; Humans; Influenza A Virus, H1N1 Subtype/genetics; Madin Darby Canine Kidney Cells; Models, Molecular; Phylogeny; RNA Folding; Swine; Virus Replication
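
The abstract above analyzes all single-nucleotide mutations in a coding region and separates synonymous from non-synonymous substitutions. A minimal sketch of that enumeration, assuming the standard genetic code and an in-frame DNA sequence, could look as follows. The function name `classify_mutations` is hypothetical, and the example sequence in the usage note is made up (it is not taken from the HA gene); the fitness measurements themselves are experimental and are not modeled here.

```python
from itertools import product
from typing import List, Tuple

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def classify_mutations(cds: str) -> List[Tuple[int, str, str, str]]:
    """List every single-nucleotide substitution in an in-frame sequence.

    Returns tuples (position, reference base, alternative base, kind),
    where kind is "synonymous" or "nonsynonymous".
    """
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    results = []
    for i, ref in enumerate(cds):
        offset = i % 3                      # position within the codon
        codon = cds[i - offset:i - offset + 3]
        for alt in BASES:
            if alt == ref:
                continue
            mutant = codon[:offset] + alt + codon[offset + 1:]
            same = CODON_TABLE[mutant] == CODON_TABLE[codon]
            kind = "synonymous" if same else "nonsynonymous"
            results.append((i, ref, alt, kind))
    return results
```

For the made-up sequence `"ATGGCT"` (Met-Ala), `classify_mutations` returns 18 substitutions, of which only the three changes at the third position of the alanine codon are synonymous.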
18.
Mol Biol Evol ; 35(1): 211-224, 2018 01 01.
Article in English | MEDLINE | ID: mdl-29106597

ABSTRACT

Prokaryotes evolved to thrive in an extremely diverse set of habitats, and their proteomes bear signatures of environmental conditions. Although correlations between amino acid usage and environmental temperature are well-documented, understanding of the mechanisms of thermal adaptation remains incomplete. Here, we couple the energetic costs of protein folding and protein homeostasis to build a microscopic model explaining both the overall amino acid composition and its temperature trends. Low biosynthesis costs lead to low diversity of physical interactions between amino acid residues, which in turn makes proteins less stable and drives up chaperone activity to maintain appropriate levels of folded, functional proteins. Assuming that the cost of chaperone activity is proportional to the fraction of unfolded client proteins, we simulated thermal adaptation of model proteins subject to minimization of the total cost of amino acid synthesis and chaperone activity. For the first time, we predicted both the proteome-average amino acid abundances and their temperature trends simultaneously, and found strong correlations between model predictions and 402 genomes of bacteria and archaea. The energetic constraint on protein evolution is more apparent in highly expressed proteins, selected by codon adaptation index. We found that in bacteria, highly expressed proteins are similar in composition to thermophilic ones, whereas in archaea no correlation between predicted expression level and thermostability was observed. At the same time, thermal adaptations of highly expressed proteins in bacteria and archaea are nearly identical, suggesting that universal energetic constraints prevail over the phylogenetic differences between these domains of life.


Subject(s)
Adaptation, Physiological/physiology; Prokaryotic Cells/metabolism; Proteostasis/physiology; Acclimatization/genetics; Acclimatization/physiology; Adaptation, Physiological/genetics; Amino Acids/genetics; Archaea/genetics; Archaea/metabolism; Bacteria/genetics; Bacteria/metabolism; Biological Evolution; Codon/metabolism; Computer Simulation; Evolution, Molecular; Hot Temperature; Phylogeny; Prokaryotic Cells/physiology; Proteome/genetics; Temperature
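
The abstract above assumes a chaperone cost proportional to the fraction of unfolded client proteins and minimizes the sum of that cost and the amino acid synthesis cost. The sketch below shows one way such a total cost could be written down. It assumes simple two-state folding thermodynamics, which the abstract does not specify; the function names, the `chaperone_weight` parameter, and the units are all illustrative assumptions rather than the authors' model.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def unfolded_fraction(delta_g: float, temperature: float) -> float:
    """Two-state estimate of the unfolded fraction.

    delta_g is the folding stability G_unfolded - G_folded in kJ/mol
    (positive means the folded state is favored); temperature is in K.
    """
    return 1.0 / (1.0 + math.exp(delta_g / (R * temperature)))


def total_cost(synthesis_cost: float, delta_g: float, temperature: float,
               chaperone_weight: float = 1.0) -> float:
    """Synthesis cost plus a chaperone cost proportional to unfolding.

    chaperone_weight is a hypothetical proportionality constant.
    """
    return synthesis_cost + chaperone_weight * unfolded_fraction(
        delta_g, temperature
    )
```

Under these assumptions, a cheaper but less stable composition lowers `synthesis_cost` while raising the chaperone term, and at fixed stability the chaperone term grows with temperature, which is the trade-off the abstract describes.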
19.
Nat Commun ; 8: 14614, 2017 03 06.
Article in English | MEDLINE | ID: mdl-28262665

ABSTRACT

Sequence divergence of orthologous proteins enables adaptation to environmental stresses and promotes evolution of novel functions. Limits on evolution imposed by constraints on sequence and structure were explored using a model TIM barrel protein, indole-3-glycerol phosphate synthase (IGPS). Fitness effects of point mutations in three phylogenetically divergent IGPS proteins during adaptation to temperature stress were probed by auxotrophic complementation of yeast with prokaryotic, thermophilic IGPS. Analysis of beneficial mutations pointed to an unexpected, long-range allosteric pathway towards the active site of the protein. Significant correlations between the fitness landscapes of distant orthologues implicate both sequence and structure as primary forces in defining the TIM barrel fitness landscape and suggest that fitness landscapes can be translocated in sequence space. Exploration of fitness landscapes in the context of a protein fold provides a strategy for elucidating the sequence-structure-fitness relationships in other common motifs.


Subject(s)
Indole-3-Glycerol-Phosphate Synthase/chemistry; Mutation; Sulfolobus solfataricus/chemistry; Thermotoga maritima/chemistry; Thermus thermophilus/chemistry; Amino Acid Sequence; Binding Sites; Cloning, Molecular; Evolution, Molecular; Gene Expression; Genetic Vectors/chemistry; Genetic Vectors/metabolism; Indole-3-Glycerol-Phosphate Synthase/genetics; Indole-3-Glycerol-Phosphate Synthase/metabolism; Kinetics; Models, Molecular; Protein Binding; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Protein Interaction Domains and Motifs; Recombinant Proteins/chemistry; Recombinant Proteins/genetics; Recombinant Proteins/metabolism; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Structural Homology, Protein; Substrate Specificity; Sulfolobus solfataricus/enzymology; Thermodynamics; Thermotoga maritima/enzymology; Thermus thermophilus/enzymology
20.
J Chem Phys ; 143(5): 055101, 2015 Aug 07.
Article in English | MEDLINE | ID: mdl-26254668

ABSTRACT

Evolution of proteins in bacteria and archaea living in different conditions leads to significant correlations between amino acid usage and environmental temperature. The origins of these correlations are poorly understood, and an important question of protein theory, physics-based prediction of types of amino acids overrepresented in highly thermostable proteins, remains largely unsolved. Here, we extend the random energy model of protein folding by weighting the interaction energies of amino acids by their frequencies in protein sequences and predict the energy gap of proteins designed to fold well at elevated temperatures. To test the model, we present a novel scalable algorithm for simultaneous energy calculation for many sequences in many structures, targeting massively parallel computing architectures such as graphics processing units. The energy calculation is performed by multiplying two matrices, one representing the complete set of sequences, and the other describing the contact maps of all structural templates. An implementation of the algorithm for the CUDA platform is available at http://www.github.com/kzeldovich/galeprot and calculates protein folding energies over 250 times faster than a single central processing unit. Analysis of amino acid usage in 64-mer cubic lattice proteins designed to fold well at different temperatures demonstrates an excellent agreement between theoretical and simulated values of energy gap. The theoretical predictions of temperature trends of amino acid frequencies are significantly correlated with bioinformatics data on 191 bacteria and archaea, and highlight protein folding constraints as a fundamental selection pressure during thermal adaptation in biological evolution.


Subject(s)
Adaptation, Physiological; Models, Molecular; Proteins/metabolism; Temperature; Algorithms; Evolution, Molecular; Protein Folding; Proteins/chemistry; Time Factors
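
The abstract above describes computing the energies of many sequences in many structures as a product of two matrices, one built from the sequences and one from the contact maps. The following is a small CPU sketch of one plausible formulation, not the galeprot CUDA implementation: each sequence becomes a row of pairwise interaction energies over all residue pairs, each structure becomes a 0/1 column over the same pairs, and the product gives every sequence-structure energy at once. The two-letter `HP_POTENTIAL` and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

# Toy two-letter potential (an assumption): only H-H contacts are favorable.
HP_POTENTIAL: Dict[Tuple[str, str], int] = {
    ("H", "H"): -1, ("H", "P"): 0, ("P", "H"): 0, ("P", "P"): 0,
}


def pair_energy_row(seq: str, potential) -> List[int]:
    """Interaction energy of every residue pair (i < j) in one sequence."""
    return [potential[(seq[i], seq[j])]
            for i, j in combinations(range(len(seq)), 2)]


def contact_column(contacts: Sequence[Tuple[int, int]], n: int) -> List[int]:
    """0/1 indicator over residue pairs (i < j) for one structure."""
    in_contact = {tuple(sorted(c)) for c in contacts}
    return [1 if pair in in_contact else 0
            for pair in combinations(range(n), 2)]


def matmul(a: List[List[int]], b: List[List[int]]) -> List[List[int]]:
    """Plain matrix product; a GPU library would replace this step."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def energies(sequences: List[str], structures, potential) -> List[List[int]]:
    """energies[s][k] is the energy of sequence s in structure k."""
    n = len(sequences[0])
    seq_matrix = [pair_energy_row(s, potential) for s in sequences]
    columns = [contact_column(c, n) for c in structures]
    contact_matrix = [list(row) for row in zip(*columns)]  # pairs x structures
    return matmul(seq_matrix, contact_matrix)
```

Casting the calculation as a single dense matrix product is what makes it a natural fit for massively parallel hardware, since matrix multiplication is among the most heavily optimized GPU operations.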