ABSTRACT
The ubiquitous family of dimeric transcription factors AP-1 is made up of Fos and Jun family proteins. It has long been thought to operate principally at gene promoters, and how it controls transcription is still poorly understood. The Fos family protein Fra-1 is overexpressed in triple-negative breast cancers (TNBCs), where it contributes to tumor aggressiveness. To address its transcriptional actions in TNBCs, we combined transcriptomics, ChIP-seq, machine learning and NG Capture-C. Additionally, we studied its Fos family kin Fra-2, which is also expressed in TNBCs, albeit at much lower levels. Consistent with their pleiotropic effects, Fra-1 and Fra-2 up- and downregulate, individually, together or redundantly, many genes associated with a wide range of biological processes. Target gene regulation is principally due to binding of Fra-1 and Fra-2 at regulatory elements located far from cognate promoters, where Fra-1 modulates the recruitment of the transcriptional co-regulator p300/CBP and where differences in AP-1 variant motif recognition can underlie preferential Fra-1 or Fra-2 binding. Our work also shows no major role for Fra-1 in chromatin architecture control at target gene loci, but suggests collaboration between Fra-1-bound and -unbound enhancers within chromatin hubs that sometimes include promoters for other Fra-1-regulated genes. These findings challenge the view that AP-1 operates principally at gene promoters.
Subjects
Enhancer Elements, Genetic , Gene Expression Regulation, Neoplastic , Proto-Oncogene Proteins c-fos/metabolism , Triple Negative Breast Neoplasms/genetics , Binding Sites , Cell Line, Tumor , Chromatin/chemistry , Chromatin/metabolism , Epigenesis, Genetic , Fos-Related Antigen-2/metabolism , Humans , Nucleotide Motifs , Promoter Regions, Genetic , Proto-Oncogene Proteins c-fos/physiology , Transcription Factor AP-1/metabolism , Triple Negative Breast Neoplasms/metabolism , p300-CBP Transcription Factors/metabolism
ABSTRACT
Gene expression is orchestrated by distinct regulatory regions to ensure a wide variety of cell types and functions. A challenge is to identify which regulatory regions are active, what their associated features are, and how they work together in each cell type. Several approaches have tackled this problem by modeling gene expression based on epigenetic marks, with the ultimate goal of identifying driving regions and associated genomic variations that are clinically relevant, in particular in precision medicine. However, these models rely on experimental data, which are limited to specific samples (often even to cell lines) and cannot be generated for all regulators and all patients. In addition, we show here that, although these approaches are accurate in predicting gene expression, inferring TF combinations from this type of model is not straightforward. Furthermore, these methods are not designed to capture regulation instructions present at the sequence level, before the binding of regulators or the opening of the chromatin. Here, we probe sequence-level instructions for gene expression and develop a method to explain mRNA levels based solely on nucleotide features. Our method positions nucleotide composition as a critical component of gene expression. Moreover, our approach, which can rank regulatory regions according to their contribution, unveils a strong influence of the gene body sequence, in particular introns. We further provide evidence that the contribution of nucleotide content can be linked to co-regulations associated with genome 3D architecture and to associations of genes within topologically associated domains.
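As a rough illustration of what "nucleotide features" of a region might look like, the sketch below computes mononucleotide and dinucleotide frequencies for a DNA sequence. The abstract does not specify the exact features or the model used, so the function name `nucleotide_features` and the choice of mono- and dinucleotide frequencies are assumptions made for illustration only.

```python
from collections import Counter

BASES = "ACGT"


def nucleotide_features(seq):
    """Return mono- and dinucleotide frequencies of a DNA sequence.

    Hypothetical helper: the feature set is an assumption, not the
    authors' published definition. Positions with non-ACGT symbols
    (e.g. N) are ignored.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    features = {b: (mono[b] / n_mono if n_mono else 0.0) for b in BASES}
    for a in BASES:
        for b in BASES:
            features[a + b] = di[a + b] / n_di if n_di else 0.0
    return features
```

Feature vectors of this kind, computed separately for each region of a gene (for example promoter, introns, exons), could then be passed to any regression model relating sequence composition to expression.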
Subjects
Base Composition , Gene Expression Regulation , Regulatory Sequences, Nucleic Acid , Computational Biology , DNA Copy Number Variations , Enhancer Elements, Genetic , Genome, Human , Humans , Models, Genetic , Neoplasms/genetics , Neoplasms/metabolism , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Quantitative Trait Loci , RNA, Messenger/chemistry , RNA, Messenger/genetics , RNA, Messenger/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
Overlapping genes exist in all domains of life and are much more abundant than expected upon their first discovery in the late 1970s. Assuming that the reference gene is read in frame +0, an overlapping gene can be encoded in two reading frames in the sense strand, denoted by +1 and +2, and in three reading frames in the opposite strand, denoted by -0, -1, and -2. This motivated numerous researchers to study the constraints induced by the genetic code on the various overlapping frames, mostly based on information theory. Our focus in this paper is on the constraints induced on two overlapping genes in terms of amino acids, as well as polypeptides. We show that simple linear constraints bind the amino-acid composition of two proteins encoded by overlapping genes. Novel constraints are revealed when polypeptides are considered, and not just single amino acids. For example, in double-coding sequences with an overlapping reading frame -2, each Tyrosine (denoted as Tyr or Y) in the overlapping frame overlaps a Tyrosine in the reference frame +0 (and reciprocally), whereas specific words (e.g. YY) never occur. We thus distinguish between null constraints (YY = 0 in frame -2) and non-null constraints (Y in frame +0 ⇔ Y in frame -2). Our equivalence-based constraints are symmetrical and thus enable the characterization of the joint composition of overlapping proteins. We describe several formal frameworks and a graph algorithm to characterize and compute these constraints. As expected, the degrees of freedom left by these constraints vary drastically among the different overlapping frames. Interestingly, the biological meaning of constraints induced on two overlapping proteins (hydropathy, forbidden di-peptides, expected overlap length) is also specific to the reading frame.
We study the combinatorics of these constraints for overlapping polypeptides of length n, pointing out that, (i) except for frame -2, non-null constraints are deduced from the amino-acid (length = 1) constraints and (ii) null constraints are deduced from the di-peptide (length = 2) constraints. These results yield support for understanding the mechanisms and evolution of overlapping genes, and for developing novel overlapping gene detection methods.
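The Tyrosine example can be checked by brute force over the standard genetic code. The sketch below is not the authors' graph algorithm. It assumes that frame -2 is the opposite-strand frame whose codons cover the last base of one reference codon plus the first two bases of the next; this labeling is an assumption, since the abstract does not define the offsets. All function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_aa(prev_nt, codon):
    """Amino acid read on the opposite strand across prev_nt and the
    first two bases of the reference codon (assumed frame -2)."""
    return CODON_TABLE[revcomp(prev_nt + codon[:2])]


def tyr_equivalence_holds():
    """Non-null constraint: without stops, Y in +0 iff Y in the overlap."""
    for prev_nt, codon in product(BASES, CODON_TABLE):
        sense, anti = CODON_TABLE[codon], antisense_aa(prev_nt, codon)
        if "*" in (sense, anti):
            continue  # a stop in either frame means not double-coding
        if (sense == "Y") != (anti == "Y"):
            return False
    return True


def yy_is_forbidden():
    """Null constraint: any reference YY forces a stop in the overlap."""
    tyr = [c for c, aa in CODON_TABLE.items() if aa == "Y"]
    return all(antisense_aa(c1[2], c2) == "*" for c1, c2 in product(tyr, repeat=2))
```

Both checks rest on the same observation: the dinucleotide TA that starts every Tyr codon is its own reverse complement, so the overlapping codon also starts with TA and must be either Tyr or a stop.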
Subjects
Amino Acid Sequence/genetics , Genes, Overlapping , Open Reading Frames , Proteins/genetics , Algorithms , Animals , Biological Evolution , Humans
ABSTRACT
BACKGROUND: Nuclear workers from French contracting companies have received higher doses than workers from Electricité de France (EDF) or Commissariat à l'Energie Atomique (CEA). METHODS: A cohort of 9,815 workers in 11 contracting companies, monitored for exposure to ionizing radiation between 1967 and 2000, was followed up for a median duration of 12.5 years. Standardized mortality ratios (SMRs) were computed. RESULTS: Between 1968 and 2002, 250 deaths occurred. Our study demonstrated a clear healthy worker effect (HWE), with mortality about half that expected from national mortality statistics (SMR = 0.54, 95% CI = [0.47-0.61]). The HWE was weaker for all cancers (SMR = 0.65) than for non-cancer deaths (SMR = 0.46). The analysis by cancer site showed no excess compared with the general population. Significant trends were observed according to the level of exposure to ionizing radiation for deaths from cancer, deaths from digestive cancer and deaths from respiratory cancer. CONCLUSIONS: The mortality of nuclear workers from contracting companies is very low compared to French national mortality.
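
An SMR is the ratio of observed to expected deaths. The sketch below shows one common way to compute it with an approximate 95% confidence interval, using Byar's approximation to the Poisson limits. The abstract does not state which interval method was used, and it does not report the expected number of deaths; the value of `expected` here is back-calculated from the reported SMR of 0.54 and is an assumption for illustration only.

```python
import math


def smr_with_ci(observed, expected, z=1.96):
    """SMR = observed / expected, with Byar's approximate Poisson CI.

    Hypothetical helper; requires observed > 0. The CI method is an
    assumption, not necessarily the one used in the study.
    """
    smr = observed / expected
    lower_count = observed * (
        1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
    ) ** 3
    upper_count = (observed + 1) * (
        1 - 1 / (9 * (observed + 1)) + z / (3 * math.sqrt(observed + 1))
    ) ** 3
    return smr, lower_count / expected, upper_count / expected


observed_deaths = 250                    # reported in the abstract
expected_deaths = observed_deaths / 0.54  # assumption: derived from SMR = 0.54
smr, ci_low, ci_high = smr_with_ci(observed_deaths, expected_deaths)
print(f"SMR = {smr:.2f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

Under these assumptions the interval falls close to the reported [0.47-0.61]; small differences are expected because the expected count is only reconstructed from rounded figures.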