Results 1 - 20 of 34
1.
Trends Immunol ; 43(6): 449-458, 2022 06.
Article in English | MEDLINE | ID: mdl-35490134

ABSTRACT

Several viruses hide in the genome of their host. To complete their replication cycle, they need to integrate in the form of a provirus and express their genes. In vertebrates, integrated viruses can be silenced by chromatin, implying that some specific mechanisms exist to detect non-self genes. The known mechanisms depend on sequence features of retroelements, but the fluctuations of virus expression suggest that other determinants also exist. Here we review the mechanisms allowing chromatin to silence integrated viruses and propose that DNA repair may help flag them as 'non-self' shortly after their genomic insertion.


Subject(s)
Chromatin; Virus Integration; Animals; Chromatin/genetics; Gene Silencing; Humans; Proviruses/genetics; Virus Integration/genetics
2.
Cell ; 143(2): 212-24, 2010 Oct 15.
Article in English | MEDLINE | ID: mdl-20888037

ABSTRACT

Chromatin is important for the regulation of transcription and other functions, yet the diversity of chromatin composition and the distribution along chromosomes are still poorly characterized. By integrative analysis of genome-wide binding maps of 53 broadly selected chromatin components in Drosophila cells, we show that the genome is segmented into five principal chromatin types that are defined by unique yet overlapping combinations of proteins and form domains that can extend over > 100 kb. We identify a repressive chromatin type that covers about half of the genome and lacks classic heterochromatin markers. Furthermore, transcriptionally active euchromatin consists of two types that differ in molecular organization and H3K36 methylation and regulate distinct classes of genes. Finally, we provide evidence that the different chromatin types help to target DNA-binding factors to specific genomic regions. These results provide a global view of chromatin diversity and domain organization in a metazoan cell.


Subject(s)
Chromatin/classification; DNA-Binding Proteins/analysis; Drosophila Proteins/analysis; Drosophila melanogaster/genetics; Animals; Cell Line; Chromatin/metabolism; DNA-Binding Proteins/metabolism; Drosophila Proteins/metabolism; Drosophila melanogaster/metabolism; Euchromatin/metabolism; Heterochromatin/metabolism; Histones/metabolism; Principal Component Analysis
3.
Nature ; 569(7756): 345-354, 2019 05.
Article in English | MEDLINE | ID: mdl-31092938

ABSTRACT

How cells adopt different identities has long fascinated biologists. Signal transduction in response to environmental cues results in the activation of transcription factors that determine the gene-expression program characteristic of each cell type. Technological advances in the study of 3D chromatin folding are bringing the role of genome conformation in transcriptional regulation to the fore. Characterizing this role of genome architecture has profound implications, not only for differentiation and development but also for diseases including developmental malformations and cancer. Here we review recent studies indicating that the interplay between transcription and genome conformation is a driving force for cell-fate decisions.


Subject(s)
Cell Differentiation/genetics; Cells/cytology; Cells/metabolism; Genome; Transcription Factors/metabolism; Animals; Chromatin Assembly and Disassembly/genetics; Chromosome Positioning; Gene Expression Regulation; Genome/genetics; Humans; Organ Specificity/genetics
4.
Mol Cell ; 67(4): 550-565.e5, 2017 Aug 17.
Article in English | MEDLINE | ID: mdl-28803780

ABSTRACT

DNA methylation is an essential epigenetic mark in mammals that has to be re-established after each round of DNA replication. The protein UHRF1 is essential for this process; it has been proposed that the protein targets newly replicated DNA by cooperatively binding hemi-methylated DNA and H3K9me2/3, but this model leaves a number of questions unanswered. Here, we present evidence for a direct recruitment of UHRF1 by the replication machinery via DNA ligase 1 (LIG1). A histone H3K9-like mimic within LIG1 is methylated by G9a and GLP and, compared with H3K9me2/3, more avidly binds UHRF1. Interaction with methylated LIG1 promotes the recruitment of UHRF1 to DNA replication sites and is required for DNA methylation maintenance. These results further elucidate the function of UHRF1, identify a non-histone target of G9a and GLP, and provide an example of a histone mimic that coordinates DNA replication and DNA methylation maintenance.


Subject(s)
CCAAT-Enhancer-Binding Proteins/metabolism; DNA Ligase ATP/metabolism; DNA Methylation; DNA Replication; DNA/biosynthesis; Epigenesis, Genetic; Histocompatibility Antigens/metabolism; Histone-Lysine N-Methyltransferase/metabolism; Protein Processing, Post-Translational; Animals; CCAAT-Enhancer-Binding Proteins/chemistry; CCAAT-Enhancer-Binding Proteins/genetics; DNA/genetics; DNA Ligase ATP/chemistry; DNA Ligase ATP/genetics; Embryonic Stem Cells/enzymology; HEK293 Cells; HeLa Cells; Histocompatibility Antigens/chemistry; Histocompatibility Antigens/genetics; Histone-Lysine N-Methyltransferase/chemistry; Histone-Lysine N-Methyltransferase/genetics; Histones/metabolism; Humans; Lysine; Methylation; Mice; Models, Molecular; Molecular Mimicry; Mutation; Protein Binding; Protein Conformation; Structure-Activity Relationship; Transfection; Tudor Domain; Ubiquitin-Protein Ligases
5.
Bioessays ; 44(10): e2200105, 2022 10.
Article in English | MEDLINE | ID: mdl-36028473

ABSTRACT

The spatial organization of genomes is becoming increasingly understood. In mammals, where it is most investigated, this organization ties in with transcription, so an important research objective is to understand whether gene activity is a cause or a consequence of genome folding in space. In this regard, the phenomena of X-chromosome inactivation and reactivation open a unique window of investigation because of the singularities of the inactive X chromosome. Here we focus on the cause-consequence nexus between genome conformation and transcription and explain how recent results about the structural changes associated with inactivation and reactivation of the X chromosome shed light on this problem.


Subject(s)
X Chromosome Inactivation; X Chromosome; Animals; Genome/genetics; Mammals/genetics; X Chromosome Inactivation/genetics
6.
PLoS Genet ; 15(4): e1008079, 2019 04.
Article in English | MEDLINE | ID: mdl-30969963

ABSTRACT

Characterizing the fitness landscape, a representation of fitness for a large set of genotypes, is key to understanding how genetic information is interpreted to create functional organisms. Here we determined the evolutionarily-relevant segment of the fitness landscape of His3, a gene coding for an enzyme in the histidine synthesis pathway, focusing on combinations of amino acid states found at orthologous sites of extant species. Just 15% of amino acids found in yeast His3 orthologues were always neutral while the impact on fitness of the remaining 85% depended on the genetic background. Furthermore, at 67% of sites, amino acid replacements were under sign epistasis, having both strongly positive and negative effect in different genetic backgrounds. 46% of sites were under reciprocal sign epistasis. The fitness impact of amino acid replacements was influenced by only a few genetic backgrounds but involved interaction of multiple sites, shaping a rugged fitness landscape in which many of the shortest paths between highly fit genotypes are inaccessible.


Subject(s)
Evolution, Molecular; Fungal Proteins/genetics; Fungal Proteins/metabolism; Genetic Fitness; Yeasts/genetics; Yeasts/metabolism; Amino Acid Sequence; Amino Acid Substitution; Amino Acids/genetics; Amino Acids/metabolism; Epistasis, Genetic; Fungal Proteins/chemistry; Genes, Fungal; Genotype; Hydro-Lyases/chemistry; Hydro-Lyases/genetics; Hydro-Lyases/metabolism; Models, Genetic; Models, Molecular; Phylogeny; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/chemistry; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism
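
The distinction drawn in the abstract above between always-neutral replacements and replacements under sign epistasis can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis: the function name `classify_replacement`, the category labels, and the `threshold` separating a "strong" effect from noise are all assumptions introduced here.

```python
def classify_replacement(effects, threshold=0.1):
    """Classify one amino acid replacement from its fitness effects.

    `effects` holds the change in fitness caused by the same replacement
    in several genetic backgrounds. `threshold` is a hypothetical cutoff
    below which an effect is treated as neutral.
    """
    has_positive = any(e > threshold for e in effects)
    has_negative = any(e < -threshold for e in effects)
    if has_positive and has_negative:
        # Beneficial in some backgrounds, deleterious in others.
        return "sign epistasis"
    if has_positive or has_negative:
        # Non-neutral somewhere, but never changes sign.
        return "same-sign effect"
    return "neutral"


if __name__ == "__main__":
    print(classify_replacement([0.3, -0.4, 0.0]))
    print(classify_replacement([0.02, -0.05]))
```

Reciprocal sign epistasis involves pairs of replacements and would need the fitness of all four genotype combinations, which this sketch does not cover.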
7.
Mol Cell ; 49(4): 759-71, 2013 Feb 21.
Article in English | MEDLINE | ID: mdl-23438860

ABSTRACT

Chromatin governs gene regulation and genome maintenance, yet a substantial fraction of the chromatin proteome is still unexplored. Moreover, a global model of the chromatin protein network is lacking. By screening >100 candidates we identify 42 Drosophila proteins that were not previously associated with chromatin, which all display specific genomic binding patterns. Bayesian network modeling of the binding profiles of these and 70 known chromatin components yields a detailed blueprint of the in vivo chromatin protein network. We demonstrate functional compartmentalization of this network, and predict functions for most of the previously unknown chromatin proteins, including roles in DNA replication and repair, and gene activation and repression.


Subject(s)
Chromatin/metabolism; Drosophila Proteins/metabolism; Drosophila melanogaster/metabolism; Nuclear Proteins/metabolism; Animals; Bayes Theorem; Binding Sites; Cell Line; Chromosomes, Insect/metabolism; DNA Repair; DNA Replication; Drosophila Proteins/genetics; Drosophila Proteins/physiology; Drosophila melanogaster/genetics; Gene Expression Regulation; Models, Biological; Molecular Sequence Annotation; Nuclear Proteins/genetics; Nuclear Proteins/physiology; Principal Component Analysis; Protein Binding; Protein Interaction Mapping; Protein Interaction Maps; Protein Processing, Post-Translational
8.
Genome Res ; 27(7): 1153-1161, 2017 07.
Article in English | MEDLINE | ID: mdl-28420691

ABSTRACT

Housekeeping genes of animal genomes cluster in the same chromosomal regions. It has long been suggested that this organization contributes to their steady expression across all the tissues of the organism. Here, we show that the activity of Drosophila housekeeping gene promoters depends on the expression of their neighbors. By measuring the expression of ∼85,000 reporters integrated in Kc167 cells, we identified the best predictors of expression as chromosomal contacts with the promoters and terminators of active genes. Surprisingly, the chromatin composition at the insertion site and the contacts with enhancers were less informative. These results are substantiated by the existence of genomic "paradoxical" domains, rich in euchromatic features and enhancers, but where the reporters are expressed at low level, concomitant with a deficit of interactions with promoters and terminators. This indicates that the proper function of housekeeping genes relies not on contacts with long distance enhancers but on spatial clustering. Overall, our results suggest that spatial proximity between genes increases their expression and that the linear architecture of the Drosophila genome contributes to this effect.


Subject(s)
Gene Expression Regulation/physiology; Genes, Essential/physiology; Multigene Family/physiology; Animals; Cell Line; Drosophila melanogaster
9.
Nucleic Acids Res ; 46(8): e49, 2018 05 04.
Article in English | MEDLINE | ID: mdl-29394371

ABSTRACT

The three-dimensional conformation of genomes is an essential component of their biological activity. The advent of the Hi-C technology enabled an unprecedented progress in our understanding of genome structures. However, Hi-C is subject to systematic biases that can compromise downstream analyses. Several strategies have been proposed to remove those biases, but the issue of abnormal karyotypes received little attention. Many experiments are performed in cancer cell lines, which typically harbor large-scale copy number variations that create visible defects on the raw Hi-C maps. The consequences of these widespread artifacts on the normalized maps are mostly unexplored. We observed that current normalization methods are not robust to the presence of large-scale copy number variations, potentially obscuring biological differences and enhancing batch effects. To address this issue, we developed an alternative approach designed to take into account chromosomal abnormalities. The method, called OneD, increases reproducibility among replicates of Hi-C samples with abnormal karyotype, outperforming previous methods significantly. On normal karyotypes, OneD fared equally well as state-of-the-art methods, making it a safe choice for Hi-C normalization. OneD is fast and scales well in terms of computing resources for resolutions up to 5 kb.


Subject(s)
Abnormal Karyotype; Animals; Base Composition; Bias; Cell Line; Chromosome Aberrations; Computational Biology/methods; Computational Biology/statistics & numerical data; Computer Simulation; DNA Copy Number Variations; Genetic Techniques; Humans; Markov Chains; Mice; Models, Statistical; Reproducibility of Results
10.
J Virol ; 92(10), 2018 05 15.
Article in English | MEDLINE | ID: mdl-29343578

ABSTRACT

Upon HIV-1 infection, a reservoir of latently infected resting T cells prevents the eradication of the virus from patients. To achieve complete depletion, the existing virus-suppressing antiretroviral therapy must be combined with drugs that reactivate the dormant viruses. We previously described a novel chemical scaffold compound, MMQO (8-methoxy-6-methylquinolin-4-ol), that is able to reactivate viral transcription in several models of HIV latency, including J-Lat cells, through an unknown mechanism. MMQO potentiates the activity of known latency-reversing agents (LRAs) or "shock" drugs, such as protein kinase C (PKC) agonists or histone deacetylase (HDAC) inhibitors. Here, we demonstrate that MMQO activates HIV-1 independently of the Tat transactivator. Gene expression microarrays in Jurkat cells indicated that MMQO treatment results in robust immunosuppression, diminishes expression of c-Myc, and causes the dysregulation of acetylation-sensitive genes. These hallmarks indicated that MMQO mimics acetylated lysines of core histones and might function as a bromodomain and extraterminal domain protein family inhibitor (BETi). MMQO functionally mimics the effects of JQ1, a well-known BETi. We confirmed that MMQO interacts with the BET family protein BRD4. Utilizing MMQO and JQ1, we demonstrate how the inhibition of BRD4 targets a subset of latently integrated barcoded proviruses distinct from those targeted by HDAC inhibitors or PKC pathway agonists. Thus, the quinoline-based compound MMQO represents a new class of BET bromodomain inhibitors that, due to its minimalistic structure, holds promise for further optimization for increased affinity and specificity for distinct bromodomain family members and could potentially be of use against a variety of diseases, including HIV infection.
IMPORTANCE: The suggested "shock and kill" therapy aims to eradicate the latent functional proportion of HIV-1 proviruses in a patient. However, to this day, clinical studies investigating the "shocking" element of this strategy have proven it to be considerably more difficult than anticipated. While the proportion of intracellular viral RNA production and general plasma viral load have been shown to increase upon a shock regimen, the global viral reservoir remains unaffected, highlighting both the inefficiency of the treatments used and the gap in our understanding of viral reactivation in vivo. Utilizing a new BRD4 inhibitor and barcoded HIV-1 minigenomes, we demonstrate that PKC pathway activators and HDAC and bromodomain inhibitors all target different subsets of proviral integration. Considering the fundamental differences of these compounds and the synergies displayed between them, we propose that the field should concentrate on investigating the development of combinatory shock cocktail therapies for improved reservoir reactivation.


Subject(s)
HIV Infections/drug therapy; Nuclear Proteins/antagonists & inhibitors; Quinolines/pharmacology; Transcription Factors/antagonists & inhibitors; Virus Activation/drug effects; Virus Latency/drug effects; Azepines/pharmacology; CD4-Positive T-Lymphocytes/virology; Cell Cycle Proteins; Gene Expression Regulation, Viral/drug effects; HEK293 Cells; HIV-1/metabolism; HeLa Cells; Histone Deacetylase Inhibitors/pharmacology; Humans; Jurkat Cells; Protein Domains/drug effects; Proto-Oncogene Proteins c-myc/biosynthesis; Proviruses/genetics; Triazoles/pharmacology; Viral Load/drug effects; Virus Integration/drug effects
11.
PLoS Comput Biol ; 13(7): e1005665, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28723903

ABSTRACT

The sequence of a genome is insufficient to understand all genomic processes carried out in the cell nucleus. To achieve this, the knowledge of its three-dimensional architecture is necessary. Advances in genomic technologies and the development of new analytical methods, such as Chromosome Conformation Capture (3C) and its derivatives, provide unprecedented insights in the spatial organization of genomes. Here we present TADbit, a computational framework to analyze and model the chromatin fiber in three dimensions. Our package takes as input the sequencing reads of 3C-based experiments and performs the following main tasks: (i) pre-process the reads, (ii) map the reads to a reference genome, (iii) filter and normalize the interaction data, (iv) analyze the resulting interaction matrices, (v) build 3D models of selected genomic domains, and (vi) analyze the resulting models to characterize their structural properties. To illustrate the use of TADbit, we automatically modeled 50 genomic domains from the fly genome revealing differential structural features of the previously defined chromatin colors, establishing a link between the conformation of the genome and the local chromatin composition. TADbit provides three-dimensional models built from 3C-based experiments, which are ready for visualization and for characterizing their relation to gene expression and epigenetic states. TADbit is an open-source Python library available for download from https://github.com/3DGenomes/tadbit.


Subject(s)
Chromatin/genetics; Chromatin/ultrastructure; Computational Biology/methods; Drosophila melanogaster/genetics; Genome, Insect/genetics; Imaging, Three-Dimensional/methods; Software; Algorithms; Animals
12.
Bioinformatics ; 32(19): 2896-902, 2016 10 01.
Article in English | MEDLINE | ID: mdl-27288492

ABSTRACT

MOTIVATION: Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the standard method to investigate chromatin protein composition. As the number of community-available ChIP-seq profiles increases, it becomes more common to use data from different sources, which makes joint analysis challenging. Issues such as lack of reproducibility, heterogeneous quality and conflicts between replicates become evident when comparing datasets, especially when they are produced by different laboratories. RESULTS: Here, we present Zerone, a ChIP-seq discretizer with built-in quality control. Zerone is powered by a Hidden Markov Model with zero-inflated negative multinomial emissions, which allows it to merge several replicates into a single discretized profile. To identify low quality or irreproducible data, we trained a Support Vector Machine and integrated it as part of the discretization process. The result is a classifier reaching 95% accuracy in detecting low quality profiles. We also introduce a graphical representation to compare discretization quality and we show that Zerone achieves outstanding accuracy. Finally, on current hardware, Zerone discretizes a ChIP-seq experiment on mammalian genomes in about 5 min using less than 700 MB of memory. AVAILABILITY AND IMPLEMENTATION: Zerone is available as a command line tool and as an R package. The C source code and R scripts can be downloaded from https://github.com/nanakiksc/zerone. The information to reproduce the benchmark and the figures is stored in a public Docker image that can be downloaded from https://hub.docker.com/r/nanakiksc/zerone/. CONTACT: guillaume.filion@gmail.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Chromatin Immunoprecipitation; Animals; DNA Replication; Genome; Quality Control; Reproducibility of Results; Sequence Analysis, DNA
13.
Bioinformatics ; 31(12): 1913-9, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-25638815

ABSTRACT

MOTIVATION: The increasing throughput of sequencing technologies offers new applications and challenges for computational biology. In many of those applications, sequencing errors need to be corrected. This is particularly important when sequencing reads from an unknown reference such as random DNA barcodes. In this case, error correction can be done by performing a pairwise comparison of all the barcodes, which is a computationally complex problem. RESULTS: Here, we address this challenge and describe an exact algorithm to determine which pairs of sequences lie within a given Levenshtein distance. For error correction or redundancy reduction purposes, matched pairs are then merged into clusters of similar sequences. The efficiency of starcode is attributable to the poucet search, a novel implementation of the Needleman-Wunsch algorithm performed on the nodes of a trie. On the task of matching random barcodes, starcode outperforms sequence clustering algorithms in both speed and precision. AVAILABILITY AND IMPLEMENTATION: The C source code is available at http://github.com/gui11aume/starcode.


Subject(s)
Algorithms; Cluster Analysis; Computational Biology/methods; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Software; Humans
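
The task described in the abstract above (find all pairs of sequences within a given Levenshtein distance, then merge matched pairs into clusters) can be illustrated with a minimal Python sketch. It is not starcode's poucet search, which runs on a trie. It is a naive all-pairs comparison using the textbook dynamic-programming edit distance, followed by a union-find merge, and it only shows what is being computed. The names `levenshtein`, `pairs_within` and `cluster` are hypothetical and chosen for this illustration.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Textbook dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def pairs_within(seqs, max_dist):
    """Index pairs (i, j), i < j, whose sequences lie within max_dist."""
    return [(i, j) for i, j in combinations(range(len(seqs)), 2)
            if levenshtein(seqs[i], seqs[j]) <= max_dist]


def cluster(seqs, max_dist):
    """Merge matched pairs into clusters with a simple union-find."""
    parent = list(range(len(seqs)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in pairs_within(seqs, max_dist):
        parent[find(i)] = find(j)

    groups = {}
    for i, s in enumerate(seqs):
        groups.setdefault(find(i), []).append(s)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    barcodes = ["ACGTACGT", "ACGTACGA", "TTTTGGGG", "TTTGGGG", "CCCCCCCC"]
    print(cluster(barcodes, max_dist=1))
```

The all-pairs loop is quadratic in the number of sequences, which is the cost the abstract refers to when it calls the pairwise comparison a computationally complex problem.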
14.
Genome Res ; 20(2): 190-200, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20007327

ABSTRACT

In eukaryotes, many chromatin proteins together regulate gene expression. Chromatin proteins often direct the genomic binding pattern of other chromatin proteins, for example, by recruitment or competition mechanisms. The network of such targeting interactions in chromatin is complex and still poorly understood. Based on genome-wide binding maps, we constructed a Bayesian network model of the targeting interactions among a broad set of 43 chromatin components in Drosophila cells. This model predicts many novel functional relationships. For example, we found that the homologous proteins HP1 and HP1C each target the heterochromatin protein HP3 to distinct sets of genes in a competitive manner. We also discovered a central role for the remodeling factor Brahma in the targeting of several DNA-binding factors, including GAGA factor, JRA, and SU(VAR)3-7. Our network model provides a global view of the targeting interplay among dozens of chromatin components.


Subject(s)
Chromatin/metabolism; DNA-Binding Proteins/metabolism; Drosophila melanogaster/genetics; Gene Regulatory Networks; Metabolic Networks and Pathways; Animals; Bayes Theorem; Models, Biological; Protein Interaction Mapping
15.
Genome Biol ; 23(1): 93, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35414014

ABSTRACT

BACKGROUND: Biases of DNA repair can shape the nucleotide landscape of genomes at evolutionary timescales. The molecular mechanisms of those biases are still poorly understood because it is difficult to isolate the contributions of DNA repair from those of DNA damage. RESULTS: Here, we develop a genome-wide assay whereby the same DNA lesion is repaired in different genomic contexts. We insert thousands of barcoded transposons carrying a reporter of DNA mismatch repair in the genome of mouse embryonic stem cells. Upon inducing a double-strand break between tandem repeats, a mismatch is generated if the break is repaired through single-strand annealing. The resolution of the mismatch showed a 60-80% bias in favor of the strand with the longest 3' flap. The location of the lesion in the genome and the type of mismatch had little influence on the bias. Instead, we observe a complete reversal of the bias when the longest 3' flap is moved to the opposite strand by changing the position of the double-strand break in the reporter. CONCLUSIONS: These results suggest that the processing of the double-strand break has a major influence on the repair of mismatches during a single-strand annealing.


Subject(s)
DNA Breaks, Double-Stranded; DNA Repair; Animals; DNA; DNA Damage; Mice
16.
Nat Commun ; 12(1): 3499, 2021 06 09.
Article in English | MEDLINE | ID: mdl-34108480

ABSTRACT

A hallmark of chromosome organization is the partition into transcriptionally active A and repressed B compartments, and into topologically associating domains (TADs). Both structures were regarded to be absent from the inactive mouse X chromosome, but to be re-established with transcriptional reactivation and chromatin opening during X-reactivation. Here, we combine a tailor-made mouse iPSC reprogramming system and high-resolution Hi-C to produce a time course combining gene reactivation, chromatin opening and chromosome topology during X-reactivation. Contrary to previous observations, we observe A/B-like compartments on the inactive X harbouring multiple subcompartments. While partial X-reactivation initiates within a compartment rich in X-inactivation escapees, it then occurs rapidly along the chromosome, concomitant with downregulation of Xist. Importantly, we find that TAD formation precedes transcription and initiates from Xist-poor compartments. Here, we show that TAD formation and transcriptional reactivation are causally independent during X-reactivation while establishing Xist as a common denominator.


Subject(s)
Transcription, Genetic; X Chromosome Inactivation/genetics; X Chromosome/metabolism; Animals; Cellular Reprogramming/genetics; Chromatin Assembly and Disassembly; Induced Pluripotent Stem Cells/cytology; Induced Pluripotent Stem Cells/metabolism; Mice; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism; Sex Chromatin/genetics; Sex Chromatin/metabolism; X Chromosome/genetics
17.
Front Genet ; 11: 572, 2020.
Article in English | MEDLINE | ID: mdl-32670351

ABSTRACT

The increasing throughput of DNA sequencing technologies creates a need for faster algorithms. The fate of most reads is to be mapped to a reference sequence, typically a genome. Modern mappers rely on heuristics to gain speed at a reasonable cost for accuracy. In the seeding heuristic, short matches between the reads and the genome are used to narrow the search to a set of candidate locations. Several seeding variants used in modern mappers show good empirical performance but they are difficult to calibrate or to optimize for lack of theoretical results. Here we develop a theory to estimate the probability that the correct location of a read is filtered out during seeding, resulting in mapping errors. We describe the properties of simple exact seeds, skip seeds and MEM seeds (Maximal Exact Match seeds). The main innovation of this work is to use concepts from analytic combinatorics to represent reads as abstract sequences, and to specify their generative function to estimate the probabilities of interest. We provide several algorithms, which together give a workable solution for the problem of calibrating seeding heuristics for short reads. We also provide a C implementation of these algorithms in a library called Sesame. These results can improve current mapping algorithms and lay the foundation of a general strategy to tackle sequence alignment problems. The Sesame library is open source and available for download at https://github.com/gui11aume/sesame.
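
The quantity at stake for simple exact seeds, the probability that a read contains no error-free stretch long enough to serve as a seed, can be sketched with a short Python recurrence. This is not the Sesame library's API, and it does not use the analytic combinatorics developed in the paper. It is a plain Markov-chain computation under assumptions made here: sequencing errors are independent, occur at a uniform rate, and a seed is any run of at least `seed_len` error-free nucleotides. The function name `prob_no_exact_seed` is hypothetical.

```python
def prob_no_exact_seed(read_len: int, seed_len: int, error_rate: float) -> float:
    """Probability that a read has no error-free run of >= seed_len bases.

    Assumes independent errors at a uniform per-base rate (an assumption
    of this sketch, not a statement about any particular mapper).
    """
    if seed_len < 1:
        raise ValueError("seed_len must be at least 1")
    p = error_rate
    q = 1.0 - error_rate
    # state[r]: probability that no seed was seen so far and the current
    # error-free run has length r (with r < seed_len).
    state = [0.0] * seed_len
    state[0] = 1.0
    for _ in range(read_len):
        nxt = [0.0] * seed_len
        nxt[0] = p * sum(state)            # an error resets the run
        for r in range(seed_len - 1):
            nxt[r + 1] = q * state[r]      # a correct base extends the run
        # A run reaching seed_len means a seed exists: that mass is dropped.
        state = nxt
    return sum(state)


if __name__ == "__main__":
    # Example: 50-nt read, 19-nt seeds, 1% error rate.
    print(prob_no_exact_seed(50, 19, 0.01))
```

A read with no such stretch cannot be seeded at its true location, so this probability is a simple proxy for the seeding failures discussed in the abstract. It ignores off-target matches elsewhere in the genome.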

18.
Mol Cell Biol ; 26(1): 169-81, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16354688

ABSTRACT

In vertebrates, densely methylated DNA is associated with inactive transcription. Actors in this process include proteins of the MBD family that can recognize methylated CpGs and repress transcription. Kaiso, a structurally unrelated protein, has also been shown to bind methylated CGCGs through its three Krüppel-like C2H2 zinc fingers. The human genome contains two uncharacterized proteins, ZBTB4 and ZBTB38, that contain Kaiso-like zinc fingers. We report that ZBTB4 and ZBTB38 bind methylated DNA in vitro and in vivo. Unlike Kaiso, they can bind single methylated CpGs. When transfected in mouse cells, the proteins colocalize with foci of heavily methylated satellite DNA and become delocalized upon loss of DNA methylation. Chromatin immunoprecipitation suggests that both of these proteins specifically bind to the methylated allele of the H19/Igf2 differentially methylated region. ZBTB4 and ZBTB38 repress the transcription of methylated templates in transfection assays. The two genes have distinct tissue-specific expression patterns, but both are highly expressed in the brain. Our results reveal the existence of a family of Kaiso-like proteins that bind methylated CpGs. Like proteins of the MBD family, they are able to repress transcription in a methyl-dependent manner, yet their tissue-specific expression pattern suggests nonoverlapping functions.


Subject(s)
DNA Methylation; DNA-Binding Proteins/metabolism; Repressor Proteins/metabolism; Zinc Fingers; Amino Acid Sequence; Animals; Brain/metabolism; Brain Chemistry; CpG Islands; DNA/metabolism; DNA-Binding Proteins/analysis; DNA-Binding Proteins/genetics; Gene Expression Regulation; Humans; Mice; Molecular Sequence Data; Phylogeny; Repressor Proteins/analysis; Repressor Proteins/genetics; Transcription Factors/analysis; Transcription Factors/metabolism; Transcription, Genetic
19.
Nat Commun ; 10(1): 4059, 2019 09 06.
Article in English | MEDLINE | ID: mdl-31492853

ABSTRACT

HIV-1 recurrently targets active genes and integrates in the proximity of the nuclear pore compartment in CD4+ T cells. However, the genomic features of these genes and the relevance of their transcriptional activity for HIV-1 integration have so far remained unclear. Here we show that recurrently targeted genes are proximal to super-enhancer genomic elements and that they cluster in specific spatial compartments of the T cell nucleus. We further show that these gene clusters acquire their location during the activation of T cells. The clustering of these genes along with their transcriptional activity are the major determinants of HIV-1 integration in T cells. Our results provide evidence of the relevance of the spatial compartmentalization of the genome for HIV-1 integration, thus further strengthening the role of nuclear architecture in viral infection.


Subject(s)
CD4-Positive T-Lymphocytes/metabolism; Cell Nucleus/genetics; Enhancer Elements, Genetic; HIV-1/genetics; Virus Integration/genetics; Base Sequence; CD4-Positive T-Lymphocytes/virology; Cell Nucleus/metabolism; Cell Nucleus/virology; Chromatin/genetics; Chromatin/virology; HIV Infections/genetics; HIV Infections/immunology; HIV Infections/virology; HIV-1/physiology; Humans; Nuclear Pore/genetics; Nuclear Pore/virology; Promoter Regions, Genetic/genetics; Transcription, Genetic
20.
Nat Commun ; 9(1): 1740, 2018 04 30.
Article in English | MEDLINE | ID: mdl-29712907

ABSTRACT

All organisms regulate transcription of their genes. To understand this process, a complete understanding of how transcription factors find their targets in cellular nuclei is essential. The DNA sequence and other variables are known to influence this binding, but the distribution of transcription factor binding patterns remains mostly unexplained in metazoan genomes. Here, we investigate the role of chromosome conformation in the trajectories of transcription factors. Using molecular dynamics simulations, we uncover the principles of their diffusion on chromatin. Chromosome contacts play a conflicting role: at low density they enhance transcription factor traffic, but at high density they lower it by volume exclusion. Consistently, we observe that in human cells, highly occupied targets, where protein binding is promiscuous, are found at sites engaged in chromosome loops within uncompacted chromatin. In summary, we provide a framework for understanding the search trajectories of transcription factors, highlighting the key contribution of genome conformation.


Subject(s)
Chromatin/chemistry; Genome, Human; Transcription Factors/metabolism; Transcription, Genetic; Cell Line, Transformed; Chromatin/ultrastructure; Humans; Lymphocytes/cytology; Lymphocytes/metabolism; Models, Genetic; Molecular Dynamics Simulation; Transcription Factors/genetics