Results 1 - 20 of 34
1.
Trends Immunol ; 43(6): 449-458, 2022 06.
Article in English | MEDLINE | ID: mdl-35490134

ABSTRACT

Several viruses hide in the genome of their host. To complete their replication cycle, they need to integrate in the form of a provirus and express their genes. In vertebrates, integrated viruses can be silenced by chromatin, implying that some specific mechanisms exist to detect non-self genes. The known mechanisms depend on sequence features of retroelements, but the fluctuations of virus expression suggest that other determinants also exist. Here we review the mechanisms allowing chromatin to silence integrated viruses and propose that DNA repair may help flag them as 'non-self' shortly after their genomic insertion.


Subjects
Chromatin, Virus Integration, Animals, Chromatin/genetics, Gene Silencing, Humans, Proviruses/genetics, Virus Integration/genetics
2.
Cell ; 143(2): 212-24, 2010 Oct 15.
Article in English | MEDLINE | ID: mdl-20888037

ABSTRACT

Chromatin is important for the regulation of transcription and other functions, yet the diversity of chromatin composition and the distribution along chromosomes are still poorly characterized. By integrative analysis of genome-wide binding maps of 53 broadly selected chromatin components in Drosophila cells, we show that the genome is segmented into five principal chromatin types that are defined by unique yet overlapping combinations of proteins and form domains that can extend over > 100 kb. We identify a repressive chromatin type that covers about half of the genome and lacks classic heterochromatin markers. Furthermore, transcriptionally active euchromatin consists of two types that differ in molecular organization and H3K36 methylation and regulate distinct classes of genes. Finally, we provide evidence that the different chromatin types help to target DNA-binding factors to specific genomic regions. These results provide a global view of chromatin diversity and domain organization in a metazoan cell.


Subjects
Chromatin/classification, DNA-Binding Proteins/analysis, Drosophila Proteins/analysis, Drosophila melanogaster/genetics, Animals, Cell Line, Chromatin/metabolism, DNA-Binding Proteins/metabolism, Drosophila Proteins/metabolism, Drosophila melanogaster/metabolism, Euchromatin/metabolism, Heterochromatin/metabolism, Histones/metabolism, Principal Component Analysis
3.
Nature ; 569(7756): 345-354, 2019 05.
Article in English | MEDLINE | ID: mdl-31092938

ABSTRACT

How cells adopt different identities has long fascinated biologists. Signal transduction in response to environmental cues results in the activation of transcription factors that determine the gene-expression program characteristic of each cell type. Technological advances in the study of 3D chromatin folding are bringing the role of genome conformation in transcriptional regulation to the fore. Characterizing this role of genome architecture has profound implications, not only for differentiation and development but also for diseases including developmental malformations and cancer. Here we review recent studies indicating that the interplay between transcription and genome conformation is a driving force for cell-fate decisions.


Subjects
Cell Differentiation/genetics, Cells/cytology, Cells/metabolism, Genome, Transcription Factors/metabolism, Animals, Chromatin Assembly and Disassembly/genetics, Chromosome Positioning, Gene Expression Regulation, Genome/genetics, Humans, Organ Specificity/genetics
4.
Mol Cell ; 67(4): 550-565.e5, 2017 Aug 17.
Article in English | MEDLINE | ID: mdl-28803780

ABSTRACT

DNA methylation is an essential epigenetic mark in mammals that has to be re-established after each round of DNA replication. The protein UHRF1 is essential for this process; it has been proposed that the protein targets newly replicated DNA by cooperatively binding hemi-methylated DNA and H3K9me2/3, but this model leaves a number of questions unanswered. Here, we present evidence for a direct recruitment of UHRF1 by the replication machinery via DNA ligase 1 (LIG1). A histone H3K9-like mimic within LIG1 is methylated by G9a and GLP and, compared with H3K9me2/3, more avidly binds UHRF1. Interaction with methylated LIG1 promotes the recruitment of UHRF1 to DNA replication sites and is required for DNA methylation maintenance. These results further elucidate the function of UHRF1, identify a non-histone target of G9a and GLP, and provide an example of a histone mimic that coordinates DNA replication and DNA methylation maintenance.


Subjects
CCAAT-Enhancer-Binding Proteins/metabolism, ATP-Dependent DNA Ligase/metabolism, DNA Methylation, DNA Replication, DNA/biosynthesis, Genetic Epigenesis, Histocompatibility Antigens/metabolism, Histone-Lysine N-Methyltransferase/metabolism, Post-Translational Protein Processing, Animals, CCAAT-Enhancer-Binding Proteins/chemistry, CCAAT-Enhancer-Binding Proteins/genetics, DNA/genetics, ATP-Dependent DNA Ligase/chemistry, ATP-Dependent DNA Ligase/genetics, Embryonic Stem Cells/enzymology, HEK293 Cells, HeLa Cells, Histocompatibility Antigens/chemistry, Histocompatibility Antigens/genetics, Histone-Lysine N-Methyltransferase/chemistry, Histone-Lysine N-Methyltransferase/genetics, Histones/metabolism, Humans, Lysine, Methylation, Mice, Molecular Models, Molecular Mimicry, Mutation, Protein Binding, Protein Conformation, Structure-Activity Relationship, Transfection, Tudor Domain, Ubiquitin-Protein Ligases
5.
Bioessays ; 44(10): e2200105, 2022 10.
Article in English | MEDLINE | ID: mdl-36028473

ABSTRACT

The spatial organization of genomes is becoming increasingly understood. In mammals, where it is most investigated, this organization ties in with transcription, so an important research objective is to understand whether gene activity is a cause or a consequence of genome folding in space. In this regard, the phenomena of X-chromosome inactivation and reactivation open a unique window of investigation because of the singularities of the inactive X chromosome. Here we focus on the cause-consequence nexus between genome conformation and transcription and explain how recent results about the structural changes associated with inactivation and reactivation of the X chromosome shed light on this problem.


Subjects
X Chromosome Inactivation, X Chromosome, Animals, Genome/genetics, Mammals/genetics, X Chromosome Inactivation/genetics
6.
PLoS Genet ; 15(4): e1008079, 2019 04.
Article in English | MEDLINE | ID: mdl-30969963

ABSTRACT

Characterizing the fitness landscape, a representation of fitness for a large set of genotypes, is key to understanding how genetic information is interpreted to create functional organisms. Here we determined the evolutionarily-relevant segment of the fitness landscape of His3, a gene coding for an enzyme in the histidine synthesis pathway, focusing on combinations of amino acid states found at orthologous sites of extant species. Just 15% of amino acids found in yeast His3 orthologues were always neutral while the impact on fitness of the remaining 85% depended on the genetic background. Furthermore, at 67% of sites, amino acid replacements were under sign epistasis, having both strongly positive and negative effect in different genetic backgrounds. 46% of sites were under reciprocal sign epistasis. The fitness impact of amino acid replacements was influenced by only a few genetic backgrounds but involved interaction of multiple sites, shaping a rugged fitness landscape in which many of the shortest paths between highly fit genotypes are inaccessible.


Subjects
Molecular Evolution, Fungal Proteins/genetics, Fungal Proteins/metabolism, Genetic Fitness, Yeasts/genetics, Yeasts/metabolism, Amino Acid Sequence, Amino Acid Substitution, Amino Acids/genetics, Amino Acids/metabolism, Genetic Epistasis, Fungal Proteins/chemistry, Fungal Genes, Genotype, Hydro-Lyases/chemistry, Hydro-Lyases/genetics, Hydro-Lyases/metabolism, Genetic Models, Molecular Models, Phylogeny, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae Proteins/chemistry, Saccharomyces cerevisiae Proteins/genetics, Saccharomyces cerevisiae Proteins/metabolism
7.
Mol Cell ; 49(4): 759-71, 2013 Feb 21.
Article in English | MEDLINE | ID: mdl-23438860

ABSTRACT

Chromatin governs gene regulation and genome maintenance, yet a substantial fraction of the chromatin proteome is still unexplored. Moreover, a global model of the chromatin protein network is lacking. By screening >100 candidates we identify 42 Drosophila proteins that were not previously associated with chromatin, which all display specific genomic binding patterns. Bayesian network modeling of the binding profiles of these and 70 known chromatin components yields a detailed blueprint of the in vivo chromatin protein network. We demonstrate functional compartmentalization of this network, and predict functions for most of the previously unknown chromatin proteins, including roles in DNA replication and repair, and gene activation and repression.


Subjects
Chromatin/metabolism, Drosophila Proteins/metabolism, Drosophila melanogaster/metabolism, Nuclear Proteins/metabolism, Animals, Bayes Theorem, Binding Sites, Cell Line, Insect Chromosomes/metabolism, DNA Repair, DNA Replication, Drosophila Proteins/genetics, Drosophila Proteins/physiology, Drosophila melanogaster/genetics, Gene Expression Regulation, Biological Models, Molecular Sequence Annotation, Nuclear Proteins/genetics, Nuclear Proteins/physiology, Principal Component Analysis, Protein Binding, Protein Interaction Mapping, Protein Interaction Maps, Post-Translational Protein Processing
8.
Genome Res ; 27(7): 1153-1161, 2017 07.
Article in English | MEDLINE | ID: mdl-28420691

ABSTRACT

Housekeeping genes of animal genomes cluster in the same chromosomal regions. It has long been suggested that this organization contributes to their steady expression across all the tissues of the organism. Here, we show that the activity of Drosophila housekeeping gene promoters depends on the expression of their neighbors. By measuring the expression of ∼85,000 reporters integrated in Kc167 cells, we identified the best predictors of expression as chromosomal contacts with the promoters and terminators of active genes. Surprisingly, the chromatin composition at the insertion site and the contacts with enhancers were less informative. These results are substantiated by the existence of genomic "paradoxical" domains, rich in euchromatic features and enhancers, but where the reporters are expressed at low level, concomitant with a deficit of interactions with promoters and terminators. This indicates that the proper function of housekeeping genes relies not on contacts with long distance enhancers but on spatial clustering. Overall, our results suggest that spatial proximity between genes increases their expression and that the linear architecture of the Drosophila genome contributes to this effect.


Subjects
Gene Expression Regulation/physiology, Essential Genes/physiology, Multigene Family/physiology, Animals, Cell Line, Drosophila melanogaster
9.
Nucleic Acids Res ; 46(8): e49, 2018 05 04.
Article in English | MEDLINE | ID: mdl-29394371

ABSTRACT

The three-dimensional conformation of genomes is an essential component of their biological activity. The advent of the Hi-C technology enabled an unprecedented progress in our understanding of genome structures. However, Hi-C is subject to systematic biases that can compromise downstream analyses. Several strategies have been proposed to remove those biases, but the issue of abnormal karyotypes received little attention. Many experiments are performed in cancer cell lines, which typically harbor large-scale copy number variations that create visible defects on the raw Hi-C maps. The consequences of these widespread artifacts on the normalized maps are mostly unexplored. We observed that current normalization methods are not robust to the presence of large-scale copy number variations, potentially obscuring biological differences and enhancing batch effects. To address this issue, we developed an alternative approach designed to take into account chromosomal abnormalities. The method, called OneD, increases reproducibility among replicates of Hi-C samples with abnormal karyotype, outperforming previous methods significantly. On normal karyotypes, OneD fared equally well as state-of-the-art methods, making it a safe choice for Hi-C normalization. OneD is fast and scales well in terms of computing resources for resolutions up to 5 kb.


Subjects
Abnormal Karyotype, Animals, Base Composition, Bias, Cell Line, Chromosome Aberrations, Computational Biology/methods, Computational Biology/statistics & numerical data, Computer Simulation, DNA Copy Number Variations, Genetic Techniques, Humans, Markov Chains, Mice, Statistical Models, Reproducibility of Results
10.
J Virol ; 92(10), 2018 05 15.
Article in English | MEDLINE | ID: mdl-29343578

ABSTRACT

Upon HIV-1 infection, a reservoir of latently infected resting T cells prevents the eradication of the virus from patients. To achieve complete depletion, the existing virus-suppressing antiretroviral therapy must be combined with drugs that reactivate the dormant viruses. We previously described a novel chemical scaffold compound, MMQO (8-methoxy-6-methylquinolin-4-ol), that is able to reactivate viral transcription in several models of HIV latency, including J-Lat cells, through an unknown mechanism. MMQO potentiates the activity of known latency-reversing agents (LRAs) or "shock" drugs, such as protein kinase C (PKC) agonists or histone deacetylase (HDAC) inhibitors. Here, we demonstrate that MMQO activates HIV-1 independently of the Tat transactivator. Gene expression microarrays in Jurkat cells indicated that MMQO treatment results in robust immunosuppression, diminishes expression of c-Myc, and causes the dysregulation of acetylation-sensitive genes. These hallmarks indicated that MMQO mimics acetylated lysines of core histones and might function as a bromodomain and extraterminal domain protein family inhibitor (BETi). MMQO functionally mimics the effects of JQ1, a well-known BETi. We confirmed that MMQO interacts with the BET family protein BRD4. Utilizing MMQO and JQ1, we demonstrate how the inhibition of BRD4 targets a subset of latently integrated barcoded proviruses distinct from those targeted by HDAC inhibitors or PKC pathway agonists. Thus, the quinoline-based compound MMQO represents a new class of BET bromodomain inhibitors that, due to its minimalistic structure, holds promise for further optimization for increased affinity and specificity for distinct bromodomain family members and could potentially be of use against a variety of diseases, including HIV infection.
IMPORTANCE: The suggested "shock and kill" therapy aims to eradicate the latent functional proportion of HIV-1 proviruses in a patient. However, to this day, clinical studies investigating the "shocking" element of this strategy have proven it to be considerably more difficult than anticipated. While the proportion of intracellular viral RNA production and general plasma viral load have been shown to increase upon a shock regimen, the global viral reservoir remains unaffected, highlighting both the inefficiency of the treatments used and the gap in our understanding of viral reactivation in vivo. Utilizing a new BRD4 inhibitor and barcoded HIV-1 minigenomes, we demonstrate that PKC pathway activators and HDAC and bromodomain inhibitors all target different subsets of proviral integration. Considering the fundamental differences of these compounds and the synergies displayed between them, we propose that the field should concentrate on investigating the development of combinatory shock cocktail therapies for improved reservoir reactivation.


Subjects
HIV Infections/drug therapy, Nuclear Proteins/antagonists & inhibitors, Quinolines/pharmacology, Transcription Factors/antagonists & inhibitors, Virus Activation/drug effects, Virus Latency/drug effects, Azepines/pharmacology, CD4-Positive T-Lymphocytes/virology, Cell Cycle Proteins, Viral Gene Expression Regulation/drug effects, HEK293 Cells, HIV-1/metabolism, HeLa Cells, Histone Deacetylase Inhibitors/pharmacology, Humans, Jurkat Cells, Protein Domains/drug effects, Proto-Oncogene Proteins c-myc/biosynthesis, Proviruses/genetics, Triazoles/pharmacology, Viral Load/drug effects, Virus Integration/drug effects
11.
PLoS Comput Biol ; 13(7): e1005665, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28723903

ABSTRACT

The sequence of a genome is insufficient to understand all genomic processes carried out in the cell nucleus. To achieve this, the knowledge of its three-dimensional architecture is necessary. Advances in genomic technologies and the development of new analytical methods, such as Chromosome Conformation Capture (3C) and its derivatives, provide unprecedented insights in the spatial organization of genomes. Here we present TADbit, a computational framework to analyze and model the chromatin fiber in three dimensions. Our package takes as input the sequencing reads of 3C-based experiments and performs the following main tasks: (i) pre-process the reads, (ii) map the reads to a reference genome, (iii) filter and normalize the interaction data, (iv) analyze the resulting interaction matrices, (v) build 3D models of selected genomic domains, and (vi) analyze the resulting models to characterize their structural properties. To illustrate the use of TADbit, we automatically modeled 50 genomic domains from the fly genome revealing differential structural features of the previously defined chromatin colors, establishing a link between the conformation of the genome and the local chromatin composition. TADbit provides three-dimensional models built from 3C-based experiments, which are ready for visualization and for characterizing their relation to gene expression and epigenetic states. TADbit is an open-source Python library available for download from https://github.com/3DGenomes/tadbit.


Subjects
Chromatin/genetics, Chromatin/ultrastructure, Computational Biology/methods, Drosophila melanogaster/genetics, Insect Genome/genetics, Three-Dimensional Imaging/methods, Software, Algorithms, Animals
12.
Bioinformatics ; 32(19): 2896-902, 2016 10 01.
Article in English | MEDLINE | ID: mdl-27288492

ABSTRACT

MOTIVATION: Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the standard method to investigate chromatin protein composition. As the number of community-available ChIP-seq profiles increases, it becomes more common to use data from different sources, which makes joint analysis challenging. Issues such as lack of reproducibility, heterogeneous quality and conflicts between replicates become evident when comparing datasets, especially when they are produced by different laboratories. RESULTS: Here, we present Zerone, a ChIP-seq discretizer with built-in quality control. Zerone is powered by a Hidden Markov Model with zero-inflated negative multinomial emissions, which allows it to merge several replicates into a single discretized profile. To identify low quality or irreproducible data, we trained a Support Vector Machine and integrated it as part of the discretization process. The result is a classifier reaching 95% accuracy in detecting low quality profiles. We also introduce a graphical representation to compare discretization quality and we show that Zerone achieves outstanding accuracy. Finally, on current hardware, Zerone discretizes a ChIP-seq experiment on mammalian genomes in about 5 min using less than 700 MB of memory. AVAILABILITY AND IMPLEMENTATION: Zerone is available as a command line tool and as an R package. The C source code and R scripts can be downloaded from https://github.com/nanakiksc/zerone. The information to reproduce the benchmark and the figures is stored in a public Docker image that can be downloaded from https://hub.docker.com/r/nanakiksc/zerone/. CONTACT: guillaume.filion@gmail.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Chromatin Immunoprecipitation, Animals, DNA Replication, Genome, Quality Control, Reproducibility of Results, DNA Sequence Analysis
13.
Bioinformatics ; 31(12): 1913-9, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-25638815

ABSTRACT

MOTIVATION: The increasing throughput of sequencing technologies offers new applications and challenges for computational biology. In many of those applications, sequencing errors need to be corrected. This is particularly important when sequencing reads from an unknown reference such as random DNA barcodes. In this case, error correction can be done by performing a pairwise comparison of all the barcodes, which is a computationally complex problem. RESULTS: Here, we address this challenge and describe an exact algorithm to determine which pairs of sequences lie within a given Levenshtein distance. For error correction or redundancy reduction purposes, matched pairs are then merged into clusters of similar sequences. The efficiency of starcode is attributable to the poucet search, a novel implementation of the Needleman-Wunsch algorithm performed on the nodes of a trie. On the task of matching random barcodes, starcode outperforms sequence clustering algorithms in both speed and precision. AVAILABILITY AND IMPLEMENTATION: The C source code is available at http://github.com/gui11aume/starcode.
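
The task described in this abstract, finding all pairs of sequences within a given Levenshtein distance and merging matched pairs into clusters, can be illustrated with a naive sketch. This is not starcode's poucet search (which runs Needleman-Wunsch on the nodes of a trie); it is a brute-force all-pairs comparison followed by union-find merging, and the function names `levenshtein`, `pairs_within` and `cluster` are hypothetical names chosen for illustration.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance (substitutions, insertions, deletions) by row-wise DP."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def pairs_within(seqs, max_dist):
    """Brute-force O(n^2) search for pairs within max_dist (illustration only)."""
    return [(s, t) for s, t in combinations(seqs, 2)
            if levenshtein(s, t) <= max_dist]


def cluster(seqs, max_dist):
    """Merge matched pairs into clusters with a simple union-find."""
    parent = {s: s for s in seqs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, t in pairs_within(seqs, max_dist):
        parent[find(s)] = find(t)

    groups = {}
    for s in parent:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    barcodes = ["ACGT", "ACGA", "TTTT", "TTTA", "GGGG"]
    print(cluster(barcodes, max_dist=1))
```

The all-pairs loop is quadratic in the number of barcodes, which is the cost the abstract calls a computationally complex problem; the sketch only shows what is being computed, not how to compute it efficiently.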


Subjects
Algorithms, Cluster Analysis, Computational Biology/methods, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods, Software, Humans
14.
Genome Res ; 20(2): 190-200, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20007327

ABSTRACT

In eukaryotes, many chromatin proteins together regulate gene expression. Chromatin proteins often direct the genomic binding pattern of other chromatin proteins, for example, by recruitment or competition mechanisms. The network of such targeting interactions in chromatin is complex and still poorly understood. Based on genome-wide binding maps, we constructed a Bayesian network model of the targeting interactions among a broad set of 43 chromatin components in Drosophila cells. This model predicts many novel functional relationships. For example, we found that the homologous proteins HP1 and HP1C each target the heterochromatin protein HP3 to distinct sets of genes in a competitive manner. We also discovered a central role for the remodeling factor Brahma in the targeting of several DNA-binding factors, including GAGA factor, JRA, and SU(VAR)3-7. Our network model provides a global view of the targeting interplay among dozens of chromatin components.


Subjects
Chromatin/metabolism, DNA-Binding Proteins/metabolism, Drosophila melanogaster/genetics, Gene Regulatory Networks, Metabolic Networks and Pathways, Animals, Bayes Theorem, Biological Models, Protein Interaction Mapping
15.
Genome Biol ; 23(1): 93, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35414014

ABSTRACT

BACKGROUND: Biases of DNA repair can shape the nucleotide landscape of genomes at evolutionary timescales. The molecular mechanisms of those biases are still poorly understood because it is difficult to isolate the contributions of DNA repair from those of DNA damage. RESULTS: Here, we develop a genome-wide assay whereby the same DNA lesion is repaired in different genomic contexts. We insert thousands of barcoded transposons carrying a reporter of DNA mismatch repair in the genome of mouse embryonic stem cells. Upon inducing a double-strand break between tandem repeats, a mismatch is generated if the break is repaired through single-strand annealing. The resolution of the mismatch showed a 60-80% bias in favor of the strand with the longest 3' flap. The location of the lesion in the genome and the type of mismatch had little influence on the bias. Instead, we observe a complete reversal of the bias when the longest 3' flap is moved to the opposite strand by changing the position of the double-strand break in the reporter. CONCLUSIONS: These results suggest that the processing of the double-strand break has a major influence on the repair of mismatches during a single-strand annealing.


Subjects
Double-Stranded DNA Breaks, DNA Repair, Animals, DNA, DNA Damage, Mice
16.
Nat Commun ; 12(1): 3499, 2021 06 09.
Article in English | MEDLINE | ID: mdl-34108480

ABSTRACT

A hallmark of chromosome organization is the partition into transcriptionally active A and repressed B compartments, and into topologically associating domains (TADs). Both structures were regarded to be absent from the inactive mouse X chromosome, but to be re-established with transcriptional reactivation and chromatin opening during X-reactivation. Here, we combine a tailor-made mouse iPSC reprogramming system and high-resolution Hi-C to produce a time course combining gene reactivation, chromatin opening and chromosome topology during X-reactivation. Contrary to previous observations, we observe A/B-like compartments on the inactive X harbouring multiple subcompartments. While partial X-reactivation initiates within a compartment rich in X-inactivation escapees, it then occurs rapidly along the chromosome, concomitant with downregulation of Xist. Importantly, we find that TAD formation precedes transcription and initiates from Xist-poor compartments. Here, we show that TAD formation and transcriptional reactivation are causally independent during X-reactivation while establishing Xist as a common denominator.


Subjects
Genetic Transcription, X Chromosome Inactivation/genetics, X Chromosome/metabolism, Animals, Cellular Reprogramming/genetics, Chromatin Assembly and Disassembly, Induced Pluripotent Stem Cells/cytology, Induced Pluripotent Stem Cells/metabolism, Mice, Long Noncoding RNA/genetics, Long Noncoding RNA/metabolism, Sex Chromatin/genetics, Sex Chromatin/metabolism, X Chromosome/genetics
17.
Front Genet ; 11: 572, 2020.
Article in English | MEDLINE | ID: mdl-32670351

ABSTRACT

The increasing throughput of DNA sequencing technologies creates a need for faster algorithms. The fate of most reads is to be mapped to a reference sequence, typically a genome. Modern mappers rely on heuristics to gain speed at a reasonable cost for accuracy. In the seeding heuristic, short matches between the reads and the genome are used to narrow the search to a set of candidate locations. Several seeding variants used in modern mappers show good empirical performance but they are difficult to calibrate or to optimize for lack of theoretical results. Here we develop a theory to estimate the probability that the correct location of a read is filtered out during seeding, resulting in mapping errors. We describe the properties of simple exact seeds, skip seeds and MEM seeds (Maximal Exact Match seeds). The main innovation of this work is to use concepts from analytic combinatorics to represent reads as abstract sequences, and to specify their generative function to estimate the probabilities of interest. We provide several algorithms, which together give a workable solution for the problem of calibrating seeding heuristics for short reads. We also provide a C implementation of these algorithms in a library called Sesame. These results can improve current mapping algorithms and lay the foundation of a general strategy to tackle sequence alignment problems. The Sesame library is open source and available for download at https://github.com/gui11aume/sesame.
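
For the simplest case named in this abstract, exact seeds, the quantity of interest can be sketched with an elementary dynamic program. This is not the Sesame library's API or the analytic-combinatorics method of the paper; it assumes independent errors at a fixed per-base rate, ignores off-target matches, and treats seeding as failing exactly when the read contains no error-free stretch of at least `seed_len` bases. The function name `prob_no_exact_seed` and its parameters are hypothetical.

```python
def prob_no_exact_seed(read_len: int, seed_len: int, error_rate: float) -> float:
    """Probability that a read has no error-free run of >= seed_len bases.

    Assumption (not from the source): each base is wrong independently
    with probability error_rate.
    """
    if seed_len < 1:
        raise ValueError("seed_len must be at least 1")
    # state[r] = probability that no seed has been seen so far and the
    # current trailing run of correct bases has length r (0 <= r < seed_len).
    state = [0.0] * seed_len
    state[0] = 1.0
    for _ in range(read_len):
        nxt = [0.0] * seed_len
        for r, p in enumerate(state):
            nxt[0] += p * error_rate            # an error resets the run
            if r + 1 < seed_len:
                nxt[r + 1] += p * (1.0 - error_rate)
            # otherwise the run reaches seed_len: a seed exists, mass is dropped
        state = nxt
    return sum(state)


if __name__ == "__main__":
    # Hypothetical settings: 50-base reads, 20-base seeds, 1% error rate.
    print(prob_no_exact_seed(50, 20, 0.01))
```

Under these assumptions the function gives the chance that the true location is filtered out for a given read length, seed length and error rate, which is the kind of number needed to calibrate a seeding heuristic.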

18.
Mol Cell Biol ; 26(1): 169-81, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16354688

ABSTRACT

In vertebrates, densely methylated DNA is associated with inactive transcription. Actors in this process include proteins of the MBD family that can recognize methylated CpGs and repress transcription. Kaiso, a structurally unrelated protein, has also been shown to bind methylated CGCGs through its three Krüppel-like C2H2 zinc fingers. The human genome contains two uncharacterized proteins, ZBTB4 and ZBTB38, that contain Kaiso-like zinc fingers. We report that ZBTB4 and ZBTB38 bind methylated DNA in vitro and in vivo. Unlike Kaiso, they can bind single methylated CpGs. When transfected in mouse cells, the proteins colocalize with foci of heavily methylated satellite DNA and become delocalized upon loss of DNA methylation. Chromatin immunoprecipitation suggests that both of these proteins specifically bind to the methylated allele of the H19/Igf2 differentially methylated region. ZBTB4 and ZBTB38 repress the transcription of methylated templates in transfection assays. The two genes have distinct tissue-specific expression patterns, but both are highly expressed in the brain. Our results reveal the existence of a family of Kaiso-like proteins that bind methylated CpGs. Like proteins of the MBD family, they are able to repress transcription in a methyl-dependent manner, yet their tissue-specific expression pattern suggests nonoverlapping functions.


Subjects
DNA Methylation, DNA-Binding Proteins/metabolism, Repressor Proteins/metabolism, Zinc Fingers, Amino Acid Sequence, Animals, Brain/metabolism, Brain Chemistry, CpG Islands, DNA/metabolism, DNA-Binding Proteins/analysis, DNA-Binding Proteins/genetics, Gene Expression Regulation, Humans, Mice, Molecular Sequence Data, Phylogeny, Repressor Proteins/analysis, Repressor Proteins/genetics, Transcription Factors/analysis, Transcription Factors/metabolism, Genetic Transcription
19.
Nat Commun ; 10(1): 4059, 2019 09 06.
Article in English | MEDLINE | ID: mdl-31492853

ABSTRACT

HIV-1 recurrently targets active genes and integrates in the proximity of the nuclear pore compartment in CD4+ T cells. However, the genomic features of these genes and the relevance of their transcriptional activity for HIV-1 integration have so far remained unclear. Here we show that recurrently targeted genes are proximal to super-enhancer genomic elements and that they cluster in specific spatial compartments of the T cell nucleus. We further show that these gene clusters acquire their location during the activation of T cells. The clustering of these genes along with their transcriptional activity are the major determinants of HIV-1 integration in T cells. Our results provide evidence of the relevance of the spatial compartmentalization of the genome for HIV-1 integration, thus further strengthening the role of nuclear architecture in viral infection.


Subjects
CD4-Positive T-Lymphocytes/metabolism, Cell Nucleus/genetics, Genetic Enhancer Elements, HIV-1/genetics, Virus Integration/genetics, Base Sequence, CD4-Positive T-Lymphocytes/virology, Cell Nucleus/metabolism, Cell Nucleus/virology, Chromatin/genetics, Chromatin/virology, HIV Infections/genetics, HIV Infections/immunology, HIV Infections/virology, HIV-1/physiology, Humans, Nuclear Pore/genetics, Nuclear Pore/virology, Genetic Promoter Regions/genetics, Genetic Transcription
20.
Nat Commun ; 9(1): 1740, 2018 04 30.
Article in English | MEDLINE | ID: mdl-29712907

ABSTRACT

All organisms regulate transcription of their genes. To understand this process, a complete understanding of how transcription factors find their targets in cellular nuclei is essential. The DNA sequence and other variables are known to influence this binding, but the distribution of transcription factor binding patterns remains mostly unexplained in metazoan genomes. Here, we investigate the role of chromosome conformation in the trajectories of transcription factors. Using molecular dynamics simulations, we uncover the principles of their diffusion on chromatin. Chromosome contacts play a conflicting role: at low density they enhance transcription factor traffic, but at high density they lower it by volume exclusion. Consistently, we observe that in human cells, highly occupied targets, where protein binding is promiscuous, are found at sites engaged in chromosome loops within uncompacted chromatin. In summary, we provide a framework for understanding the search trajectories of transcription factors, highlighting the key contribution of genome conformation.


Subjects
Chromatin/chemistry, Human Genome, Transcription Factors/metabolism, Genetic Transcription, Transformed Cell Line, Chromatin/ultrastructure, Humans, Lymphocytes/cytology, Lymphocytes/metabolism, Genetic Models, Molecular Dynamics Simulation, Transcription Factors/genetics