ABSTRACT
Posterior fossa group A (PFA) ependymoma is a lethal brain cancer diagnosed in infants and young children. The lack of driver events in the PFA linear genome led us to search its 3D genome for characteristic features. Here, we reconstructed 3D genomes from diverse childhood tumor types and uncovered a global topology in PFA that is highly reminiscent of stem and progenitor cells in a variety of human tissues. Remarkable features exclusively present in PFA are type B ultra long-range interactions in PFAs (TULIPs), regions separated by great distances along the linear genome that interact with each other in the 3D nuclear space with surprising strength. TULIPs occur in all PFA samples and recur at predictable genomic coordinates, and their formation is induced by expression of EZHIP. The universality of TULIPs across PFA samples suggests a conservation of molecular principles that could be exploited therapeutically.
Subjects
Ependymoma , Ependymoma/genetics , Humans , Infratentorial Neoplasms/genetics , Infratentorial Neoplasms/pathology , Human Genome , Infant , Brain Neoplasms/genetics , Brain Neoplasms/pathology , Child , Male , Female
ABSTRACT
Polycomb Repressive Complex 2 (PRC2)-mediated histone H3K27 tri-methylation (H3K27me3) recruits canonical PRC1 (cPRC1) to maintain heterochromatin. In early development, polycomb-regulated genes are connected through long-range 3D interactions which resolve upon differentiation. Here, we report that polycomb looping is controlled by H3K27me3 spreading and regulates target gene silencing and cell fate specification. Using glioma-derived H3 Lys-27-Met (H3K27M) mutations as tools to restrict H3K27me3 deposition, we show that H3K27me3 confinement concentrates the chromatin pool of cPRC1, resulting in heightened 3D interactions mirroring chromatin architecture of pluripotency, and stringent gene repression that maintains cells in progenitor states to facilitate tumor development. Conversely, H3K27me3 spread in pluripotent stem cells, following neural differentiation or loss of the H3K36 methyltransferase NSD1, dilutes cPRC1 concentration and dissolves polycomb loops. These results identify the regulatory principles and disease implications of polycomb looping and nominate histone modification-guided distribution of reader complexes as an important mechanism for nuclear compartment organization. Highlights: The confinement of H3K27me3 at PRC2 nucleation sites without its spreading correlates with increased 3D chromatin interactions. The H3K27M oncohistone concentrates canonical PRC1 that anchors chromatin loop interactions in gliomas, silencing developmental programs. Stem and progenitor cells require factors promoting H3K27me3 confinement, including H3K36me2, to maintain cPRC1 loop architecture. The cPRC1-H3K27me3 interaction is a targetable driver of aberrant self-renewal in tumor cells.
ABSTRACT
Canonical (H3.1/H3.2) and noncanonical (H3.3) histone 3 K27M-mutant gliomas have unique spatiotemporal distributions, partner alterations and molecular profiles. The contribution of the cell of origin to these differences has been challenging to uncouple from the oncogenic reprogramming induced by the mutation. Here, we perform an integrated analysis of 116 tumors, including single-cell transcriptome and chromatin accessibility, 3D chromatin architecture and epigenomic profiles, and show that K27M-mutant gliomas faithfully maintain chromatin configuration at developmental genes consistent with anatomically distinct oligodendrocyte precursor cells (OPCs). H3.3K27M thalamic gliomas map to prosomere 2-derived lineages. In turn, H3.1K27M ACVR1-mutant pontine gliomas uniformly mirror early ventral NKX6-1+/SHH-dependent brainstem OPCs, whereas H3.3K27M gliomas frequently resemble dorsal PAX3+/BMP-dependent progenitors. Our data suggest a context-specific vulnerability in H3.1K27M-mutant SHH-dependent ventral OPCs, which rely on acquisition of ACVR1 mutations to drive aberrant BMP signaling required for oncogenesis. The unifying action of K27M mutations is to restrict H3K27me3 at PRC2 landing sites, whereas other epigenetic changes are mainly contingent on the cell of origin chromatin state and cycling rate.
Subjects
Chromatin , Epigenomics , Cell Lineage/genetics , Brain
ABSTRACT
BACKGROUND: Juvenile Pilocytic Astrocytomas (JPAs) are one of the most common pediatric brain tumors, and they are driven by aberrant activation of the mitogen-activated protein kinase (MAPK) signaling pathway. RAF-fusions are the most common genetic alterations identified in JPAs, with the prototypical KIAA1549-BRAF fusion leading to loss of BRAF's auto-inhibitory domain and subsequent constitutive kinase activation. JPAs are highly vascular and show pervasive immune infiltration, which can lead to low tumor cell purity in clinical samples. This can result in gene fusions that are difficult to detect with conventional omics approaches including RNA-Seq. METHODS: To this effect, we applied RNA-Seq as well as linked-read whole-genome sequencing and in situ Hi-C as new approaches to detect and characterize low-frequency gene fusions at the genomic, transcriptomic and spatial level. RESULTS: Integration of these datasets allowed the identification and detailed characterization of two novel BRAF fusion partners, PTPRZ1 and TOP2B, in addition to the canonical fusion with partner KIAA1549. Additionally, our Hi-C datasets enabled investigations of 3D genome architecture in JPAs which showed a high level of correlation in 3D compartment annotations between JPAs compared to other pediatric tumors, and high similarity to normal adult astrocytes. We detected interactions between BRAF and its fusion partners exclusively in tumor samples containing BRAF fusions. CONCLUSIONS: We demonstrate the power of integrating multi-omic datasets to identify low frequency fusions and characterize the JPA genome at high resolution. We suggest that linked-reads and Hi-C could be used in clinic for the detection and characterization of JPAs.
Subjects
Astrocytoma , Brain Neoplasms , Child , Adult , Humans , Multiomics , Proto-Oncogene Proteins B-raf/genetics , Oncogene Fusion Proteins/genetics , Astrocytoma/pathology , Brain Neoplasms/pathology , Class 5 Receptor-Like Protein Tyrosine Phosphatases
ABSTRACT
Histone H3.3 glycine 34 to arginine/valine (G34R/V) mutations drive deadly gliomas and show exquisite regional and temporal specificity, suggesting a developmental context permissive to their effects. Here we show that 50% of G34R/V tumors (n = 95) bear activating PDGFRA mutations that display strong selection pressure at recurrence. Although considered gliomas, G34R/V tumors actually arise in GSX2/DLX-expressing interneuron progenitors, where G34R/V mutations impair neuronal differentiation. The lineage of origin may facilitate PDGFRA co-option through a chromatin loop connecting PDGFRA to GSX2 regulatory elements, promoting PDGFRA overexpression and mutation. At the single-cell level, G34R/V tumors harbor dual neuronal/astroglial identity and lack oligodendroglial programs, actively repressed by GSX2/DLX-mediated cell fate specification. G34R/V may become dispensable for tumor maintenance, whereas mutant-PDGFRA is potently oncogenic. Collectively, our results open novel research avenues in deadly tumors. G34R/V gliomas are neuronal malignancies where interneuron progenitors are stalled in differentiation by G34R/V mutations and malignant gliogenesis is promoted by co-option of a potentially targetable pathway, PDGFRA signaling.
Subjects
Brain Neoplasms/genetics , Carcinogenesis/genetics , Glioma/genetics , Histones/genetics , Interneurons/metabolism , Mutation/genetics , Neural Stem Cells/metabolism , Platelet-Derived Growth Factor alpha Receptor/genetics , Animals , Astrocytes/metabolism , Astrocytes/pathology , Brain Neoplasms/pathology , Carcinogenesis/pathology , Cell Lineage , Cellular Reprogramming/genetics , Chromatin/metabolism , Mammalian Embryo/metabolism , Genetic Epigenesis , Neoplastic Gene Expression Regulation , Gene Silencing , Glioma/pathology , Histones/metabolism , Lysine/metabolism , Inbred C57BL Mice , Biological Models , Neoplasm Grading , Oligodendroglia/metabolism , Genetic Promoter Regions/genetics , Prosencephalon/embryology , Platelet-Derived Growth Factor alpha Receptor/metabolism , Genetic Transcription , Transcriptome/genetics
ABSTRACT
The PAQosome is an 11-subunit chaperone involved in the biogenesis of several human protein complexes. We show that ASDURF, a recently discovered upstream open reading frame (uORF) in the 5' UTR of ASNSD1 mRNA, encodes the 12th subunit of the PAQosome. ASDURF displays significant structural homology to β-prefoldins and assembles with the five known subunits of the prefoldin-like module of the PAQosome to form a heterohexameric prefoldin-like complex. A model of the PAQosome prefoldin-like module is presented. The data presented here provide an example of a eukaryotic uORF-encoded polypeptide whose function is not limited to cis-acting translational regulation of downstream coding sequence and highlight the importance of including alternative ORF products in proteomic studies.
Subjects
Molecular Chaperones , Proteomics , Humans , Molecular Chaperones/genetics , Open Reading Frames
ABSTRACT
Cells are highly asymmetrical, a feature that relies on the sorting of molecular constituents, including proteins, lipids, and nucleic acids, to distinct subcellular locales. The localization of RNA molecules is an important layer of gene regulation required to modulate localized cellular activities, although its global prevalence remains unclear. We combine biochemical cell fractionation with RNA-sequencing (CeFra-seq) analysis to assess the prevalence and conservation of RNA asymmetric distribution on a transcriptome-wide scale in Drosophila and human cells. This approach reveals that the majority (~80%) of cellular RNA species are asymmetrically distributed, whether considering coding or noncoding transcript populations, in patterns that are broadly conserved evolutionarily. Notably, a large number of Drosophila and human long noncoding RNAs and circular RNAs display enriched levels within specific cytoplasmic compartments, suggesting that these RNAs fulfill extra-nuclear functions. Moreover, fraction-specific mRNA populations exhibit distinctive sequence characteristics. Comparative analysis of mRNA fractionation profiles with that of their encoded proteins reveals a general lack of correlation in subcellular distribution, marked by strong cases of asymmetry. However, coincident distribution profiles are observed for mRNA/protein pairs related to a variety of functional protein modules, suggesting complex regulatory inputs of RNA localization to cellular organization.
Subjects
Messenger RNA/genetics , Untranslated RNA/genetics , Animals , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Drosophila melanogaster , Hep G2 Cells , Humans , Protein Transport , RNA Transport , Double-Stranded RNA/genetics , Double-Stranded RNA/metabolism , Messenger RNA/metabolism , Untranslated RNA/metabolism , Species Specificity
ABSTRACT
In CRISPR-Cas9 genome editing, the underlying principles for selecting guide RNA (gRNA) sequences that would ensure efficient target site modification remain poorly understood. Here we show that target sites harbouring multiple protospacer adjacent motifs (PAMs) are refractory to Cas9-mediated repair in situ. Thus we refine which substrates should be avoided in gRNA design, implicating PAM density as a novel sequence-specific feature that inhibits in vivo Cas9-driven DNA modification.
Subjects
CRISPR-Cas Systems , DNA Cleavage , Nucleotide Motifs , RNA Editing , Kinetoplastida Guide RNA/chemistry , Northern Blotting , Western Blotting , CRISPR-Associated Proteins , Electrophoretic Mobility Shift Assay , Genome , HEK293 Cells , HeLa Cells , Humans , MCF-7 Cells , RNA/metabolism
ABSTRACT
BACKGROUND: CpG methylation variation is involved in human trait formation and disease susceptibility. Analyses within populations have been biased towards CpG-dense regions through the application of targeted arrays. We generate whole-genome bisulfite sequencing data for approximately 30 adipose and blood samples from monozygotic and dizygotic twins for the characterization of non-genetic and genetic effects at single-site resolution. RESULTS: Purely invariable CpGs display a bimodal distribution with enrichment of unmethylated CpGs and depletion of fully methylated CpGs in promoter and enhancer regions. Population-variable CpGs account for approximately 15-20 % of total CpGs per tissue, are enriched in enhancer-associated regions and depleted in promoters, and single nucleotide polymorphisms at CpGs are a frequent confounder of extreme methylation variation. Differential methylation is primarily non-genetic in origin, with non-shared environment accounting for most of the variance. These non-genetic effects are mainly tissue-specific. Tobacco smoking is associated with differential methylation in blood with no evidence of this exposure impacting cell counts. Opposite to non-genetic effects, genetic effects of CpG methylation are shared across tissues and thus limit inter-tissue epigenetic drift. CpH methylation is rare, and shows similar characteristics of variation patterns as CpGs. CONCLUSIONS: Our study highlights the utility of low pass whole-genome bisulfite sequencing in identifying methylome variation beyond promoter regions, and suggests that targeting the population dynamic methylome of tissues requires assessment of understudied intergenic CpGs distal to gene promoters to reveal the full extent of inter-individual variation.
Subjects
DNA Methylation , Gene-Environment Interaction , Genetic Variation , Human Genome , Adipose Tissue/metabolism , Blood/metabolism , CpG Islands , Female , Humans , Smoking/genetics , Dizygotic Twins , Monozygotic Twins
ABSTRACT
Recent releases of genome three-dimensional (3D) structures have the potential to transform our understanding of genomes. Nonetheless, storage technology and visualization tools need to evolve to offer the scientific community fast and convenient access to these data. We introduce simultaneously a database system to store and query 3D genomic data (3DBG), and a 3D genome browser to visualize and explore 3D genome structures (3DGB). We benchmark 3DBG against state-of-the-art systems and demonstrate that it is faster than previous solutions and, importantly, scales gracefully with the size of the data. We also illustrate the usefulness of our 3D genome Web browser to explore human genome structures. The 3D genome browser is available at http://3dgb.cs.mcgill.ca/.
Subjects
Genetic Databases , Genomics , Computer Graphics , Genes , Retinoblastoma Genes , Human Genome , Humans , Internet , Molecular Models , Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND: Although genetic or epigenetic alterations have been shown to affect the three-dimensional organization of genomes, the utility of chromatin conformation in the classification of human disease has never been addressed. RESULTS: Here, we explore whether chromatin conformation can be used to classify human leukemia. We map the conformation of the HOXA gene cluster in a panel of cell lines with 5C chromosome conformation capture technology, and use the data to train and test a support vector machine classifier named 3D-SP. We show that 3D-SP is able to accurately distinguish leukemias expressing MLL-fusion proteins from those expressing only wild-type MLL, and that it can also classify leukemia subtypes according to MLL fusion partner, based solely on 5C data. CONCLUSIONS: Our study provides the first proof-of-principle demonstration that chromatin conformation contains the information value necessary for classification of leukemia subtypes.
Subjects
Chromatin/genetics , Homeodomain Proteins/genetics , Leukemia/genetics , Tumor Cell Line , Chromatin/chemistry , Chromatin Assembly and Disassembly , Homeodomain Proteins/chemistry , Humans , Leukemia/diagnosis
ABSTRACT
Three-dimensional genome organization is an important higher order transcription regulation mechanism that can be studied with the chromosome conformation capture techniques. Here, we combined chromatin organization analysis by chromosome conformation capture-carbon copy, computational modeling and epigenomics to achieve the first integrated view, through time, of a connection between chromatin state and its architecture. We used this approach to examine the chromatin dynamics of the HoxA cluster in a human myeloid leukemia cell line at various stages of differentiation. We found that cellular differentiation involves a transient activation of the 5'-end HoxA genes coinciding with a loss of contacts throughout the cluster, and by specific silencing at the 3'-end with H3K27 methylation. The 3D modeling of the data revealed an extensive reorganization of the cluster between the two previously reported topologically associated domains in differentiated cells. Our results support a model whereby silencing by polycomb group proteins and reconfiguration of CTCF interactions at a topologically associated domain boundary participate in changing the HoxA cluster topology, which compartmentalizes the genes following differentiation.
Subjects
Cell Differentiation/genetics , Chromatin/chemistry , Homeodomain Proteins/genetics , Multigene Family , Binding Sites , CCCTC-Binding Factor , Tumor Cell Line , Chromatin/metabolism , Gene Expression Regulation , Histones/metabolism , Humans , Infant , Insulator Elements , Macrophages/cytology , Macrophages/metabolism , Male , Repressor Proteins/metabolism , Transcriptional Activation
ABSTRACT
The Ccs3 locus on mouse chromosome 3 regulates differential susceptibility of A/J (A, susceptible) and C57BL/6J (B6, resistant) mouse strains to chemically-induced colorectal cancer (CRC). Here, we report the high-resolution positional mapping of the gene underlying the Ccs3 effect. Using phenotype/genotype correlation in a series of 33 AcB/BcA recombinant congenic mouse strains, as well as in groups of backcross populations bearing unique recombinant chromosomes for the interval, and in subcongenic strains, we have delineated the maximum size of the Ccs3 physical interval to a ~2.15 Mb segment. This interval contains 12 annotated transcripts. Sequencing of positional candidates in A and B6 identified many either low-priority coding changes or non-protein coding variants. We found a unique copy number variant (CNV) in intron 15 of the Nfkb1 gene. The CNV consists of two copies of a 54 bp sequence immediately adjacent to the exon 15 splice site, while only one copy is found in CRC-susceptible A. The Nfkb1 protein (p105/p50) expression is much reduced in A tumors compared to normal A colonic epithelium as analyzed by immunohistochemistry. Studies in primary macrophages from A and B6 mice demonstrate a marked differential activation of the NfκB pathway by lipopolysaccharide (kinetics of stimulation and maximum levels of phosphorylated IκBα), with a more robust activation being associated with resistance to CRC. NfκB has been previously implicated in regulating homeostasis and inflammatory response in the intestinal mucosa. The interval contains another positional candidate, Slc39a8, that is differentially expressed in A vs B6 colons, and that has recently been associated with CRC tumor aggressiveness in humans.
Subjects
Carcinogens/toxicity , Chromosome Mapping , Mammalian Chromosomes/genetics , Colorectal Neoplasms/chemically induced , Colorectal Neoplasms/genetics , Genetic Loci/genetics , Genetic Predisposition to Disease/genetics , Animals , Base Sequence , Colorectal Neoplasms/pathology , Neoplastic Gene Expression Regulation/drug effects , Neoplastic Gene Expression Regulation/genetics , Humans , Genetic Hybridization , Inbreeding , Intestinal Mucosa/drug effects , Intestinal Mucosa/metabolism , Intestinal Mucosa/pathology , Mice , Molecular Sequence Data , NF-kappa B p50 Subunit/metabolism , DNA Sequence Analysis , Signal Transduction/drug effects , Signal Transduction/genetics , Species Specificity
ABSTRACT
BACKGROUND: Long-range interactions between regulatory DNA elements such as enhancers, insulators and promoters play an important role in regulating transcription. As chromatin contacts have been found throughout the human genome and in different cell types, spatial transcriptional control is now viewed as a general mechanism of gene expression regulation. Chromosome Conformation Capture Carbon Copy (5C) and its variant Hi-C are techniques used to measure the interaction frequency (IF) between specific regions of the genome. Our goal is to use the IF data generated by these experiments to computationally model and analyze three-dimensional chromatin organization. RESULTS: We formulate a probabilistic model linking 5C/Hi-C data to physical distances and describe a Markov chain Monte Carlo (MCMC) approach called MCMC5C to generate a representative sample from the posterior distribution over structures from IF data. Structures produced from parallel MCMC runs on the same dataset demonstrate that our MCMC method mixes quickly and is able to sample from the posterior distribution of structures and find subclasses of structures. Structural properties (base looping, condensation, and local density) were defined and their distribution measured across the ensembles of structures generated. We applied these methods to a biological model of human myelomonocyte cellular differentiation and identified distinct chromatin conformation signatures (CCSs) corresponding to each of the cellular states. We also demonstrate the ability of our method to run on Hi-C data and produce a model of human chromosome 14 at 1Mb resolution that is consistent with previously observed structural properties as measured by 3D-FISH. CONCLUSIONS: We believe that tools like MCMC5C are essential for the reliable analysis of data from the 3C-derived techniques such as 5C and Hi-C. 
By integrating complex, high-dimensional and noisy datasets into an easy to interpret ensemble of three-dimensional conformations, MCMC5C allows researchers to reliably interpret the result of their assay and contrast conformations under different conditions. AVAILABILITY: http://Dostielab.biochem.mcgill.ca.
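The sampling idea in this abstract can be illustrated with a toy Metropolis sampler. This is only a sketch under stated assumptions and is not the MCMC5C implementation: the inverse relation between interaction frequency and distance, the Gaussian noise model, the noise level `sigma`, the proposal step size, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def log_likelihood(coords, freqs, sigma=0.1):
    """Gaussian log-likelihood (up to a constant) of observed IFs.

    Assumption for this sketch: expected IF = 1 / distance.
    """
    total = 0.0
    for (i, j), observed in freqs.items():
        expected = 1.0 / max(distance(coords[i], coords[j]), 1e-6)
        total -= (observed - expected) ** 2 / (2 * sigma ** 2)
    return total


def metropolis(freqs, n_points, n_steps=2000, step=0.1, seed=0):
    """Sample 3D structures; returns (sampled structures, their log-likelihoods)."""
    rng = random.Random(seed)
    coords = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n_points)]
    current = log_likelihood(coords, freqs)
    samples, trace = [], []
    for t in range(n_steps):
        k = rng.randrange(n_points)          # perturb one point at a time
        old = coords[k][:]
        coords[k] = [c + rng.gauss(0, step) for c in old]
        proposed = log_likelihood(coords, freqs)
        # Metropolis rule: accept with probability min(1, exp(proposed - current)).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            current = proposed
        else:
            coords[k] = old                  # reject and restore
        if t % 50 == 0:                      # thin the chain
            samples.append([p[:] for p in coords])
            trace.append(current)
    return samples, trace
```

Running several chains with different `seed` values and comparing the resulting ensembles is one simple way to check mixing, in the spirit of the parallel runs the abstract mentions.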
Subjects
Chromatin/chemistry , Human Genome , Biological Models , Nucleic Acid Regulatory Sequences , Tumor Cell Line , Human Chromosome Pair 14 , Computer Simulation , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Humans , Markov Chains , Monte Carlo Method
ABSTRACT
Spatial chromatin organization is emerging as an important mechanism to regulate the expression of genes. However, very little is known about genome architecture at high-resolution in vivo. Here, we mapped the three-dimensional organization of the human Hox clusters with chromosome conformation capture (3C) technology. We show that computational modeling of 3C data sets can identify candidate regulatory proteins of chromatin architecture and gene expression. Hox genes encode evolutionarily conserved master regulators of development whose strict control has fascinated biologists for over 25 years. Proper transcriptional silencing is key to Hox function since premature expression can lead to developmental defects or human disease. We now show that the HoxA cluster is organized into multiple chromatin loops that are dependent on transcription activity. Long-range contacts were found in all four silent clusters but looping patterns were specific to each cluster. In contrast to the Drosophila homeotic bithorax complex (BX-C), we found that Polycomb proteins are only modestly required for human cluster looping and silencing. However, computational three-dimensional Hox cluster modeling identified the insulator-binding protein CTCF as a likely candidate mediating DNA loops in all clusters. Our data suggest that Hox cluster looping may represent an evolutionarily conserved structural mechanism of transcription regulation.
Subjects
Chromatin/chemistry , Gene Silencing , Homeodomain Proteins/genetics , Multigene Family , CCCTC-Binding Factor , Tumor Cell Line , Humans , Male , Molecular Models , Repressor Proteins/chemistry , Repressor Proteins/physiology , Genetic Transcription , Young Adult
ABSTRACT
In this article, we undertake a study of the evolution of human papillomaviruses (HPV), whose potential to cause cervical cancer is well known. First, we found that the existing HPV groups are monophyletic and that taxa with a high risk of carcinogenicity are usually clustered together. Then, we present a new algorithm for analyzing the information content of multiple sequence alignments in relation to epidemiologic carcinogenicity data to identify regions that would warrant additional experimental analyses. The new algorithm is based on a sliding window procedure and a p-value computation to identify genomic regions that are specific to HPVs causing disease. Examination of the genomes of 83 HPVs allowed us to identify specific regions that might be influenced by insertions, by deletions, or simply by mutations, and that may be of interest for further analyses. Supplementary Material is provided (see online Supplementary Material at www.libertonline.com ).
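A sliding window scan with a per-window p-value can be sketched as follows. The abstract does not state which statistic or test the authors use, so everything here is an assumption for illustration: the statistic (mean between-group minus mean within-group Hamming distance), the label-permutation p-value, and all names.

```python
import random
from itertools import combinations


def hamming(a, b):
    """Number of mismatching columns between two aligned fragments."""
    return sum(x != y for x, y in zip(a, b))


def separation(seqs, labels, start, width):
    """Mean between-group minus mean within-group distance in one window."""
    between, within = [], []
    for i, j in combinations(range(len(seqs)), 2):
        d = hamming(seqs[i][start:start + width], seqs[j][start:start + width])
        (within if labels[i] == labels[j] else between).append(d)
    if not between or not within:
        return 0.0
    return sum(between) / len(between) - sum(within) / len(within)


def window_p_values(seqs, labels, width, n_perm=200, seed=0):
    """Return (start, statistic, permutation p-value) for every window."""
    rng = random.Random(seed)
    results = []
    for start in range(len(seqs[0]) - width + 1):
        observed = separation(seqs, labels, start, width)
        shuffled = list(labels)
        hits = 0
        for _ in range(n_perm):
            rng.shuffle(shuffled)            # break the label/sequence link
            if separation(seqs, shuffled, start, width) >= observed:
                hits += 1
        # Add-one correction keeps the estimated p-value above zero.
        results.append((start, observed, (hits + 1) / (n_perm + 1)))
    return results
```

Windows with small p-values are those where sequences of the same carcinogenicity class resemble each other more than sequences of different classes; with many windows, a multiple-testing correction would be needed before drawing conclusions.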
Subjects
Viral Cell Transformation/genetics , Viral Genome , Papillomaviridae/genetics , Papillomaviridae/pathogenicity , Papillomavirus Infections/genetics , Algorithms , Base Sequence , Female , Human Genome , Humans , Genetic Models , Molecular Sequence Data , Papillomaviridae/classification , Papillomavirus Infections/virology , Phylogeny , Sequence Alignment , Uterine Cervical Neoplasms/genetics , Uterine Cervical Neoplasms/virology
ABSTRACT
BACKGROUND: Leishmania and other members of the Trypanosomatidae family diverged early on in eukaryotic evolution and consequently display unique cellular properties. Their apparent lack of transcriptional regulation is compensated by complex post-transcriptional control mechanisms, including the processing of polycistronic transcripts by means of coupled trans-splicing and polyadenylation. Trans-splicing signals are often U-rich polypyrimidine (poly(Y)) tracts, which precede AG splice acceptor sites. However, as opposed to higher eukaryotes there is no consensus polyadenylation signal in trypanosomatid mRNAs. RESULTS: We refined a previously reported method to target 5' splice junctions by incorporating the pyrimidine content of query sequences into a scoring function. We also investigated a novel approach for predicting polyadenylation (poly(A)) sites in-silico, by comparing query sequences to polyadenylated expressed sequence tags (ESTs) using position-specific scanning matrices (PSSMs). An additional analysis of the distribution of putative splice junction to poly(A) distances helped to increase prediction rates by limiting the scanning range. These methods were able to simplify splice junction prediction without loss of precision and to increase polyadenylation site prediction from 22% to 47% within 100 nucleotides. CONCLUSION: We propose a simplified trans-splicing prediction tool and a novel poly(A) prediction tool based on comparative sequence analysis. We discuss the impact of certain regions surrounding the poly(A) sites on prediction rates and contemplate correlating biological mechanisms. This work aims to sharpen the identification of potentially functional untranslated regions (UTRs) in a large-scale, comparative genomics framework.
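The idea of folding pyrimidine content into a splice-site score can be illustrated with a minimal sketch. It is not the authors' scoring function: ranking candidate AG acceptor sites purely by the pyrimidine fraction of a fixed upstream window, the window length `upstream`, and the function names are all assumptions for illustration.

```python
def pyrimidine_fraction(seq):
    """Fraction of C/T/U bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return sum(base in "CTU" for base in seq) / len(seq) if seq else 0.0


def score_ag_sites(seq, upstream=20):
    """Rank AG dinucleotides by upstream pyrimidine content, best first.

    Returns a list of (position_of_AG, pyrimidine_fraction) tuples.
    Sites with fewer than `upstream` preceding bases are skipped.
    """
    seq = seq.upper()
    scored = []
    for i in range(upstream, len(seq) - 1):
        if seq[i:i + 2] == "AG":
            scored.append((i, pyrimidine_fraction(seq[i - upstream:i])))
    return sorted(scored, key=lambda site: site[1], reverse=True)
```

An AG preceded by a poly(Y) tract receives a score close to 1, while an AG embedded in purine-rich sequence scores close to 0; a practical tool would combine this with other evidence such as distance constraints.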
Subjects
Expressed Sequence Tags , Leishmania/genetics , Genetic Models , Messenger RNA/genetics , Sequence Alignment/methods , RNA Sequence Analysis/methods , Animals , Base Sequence , Computer Simulation , Molecular Sequence Data , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
We have performed a survey of soluble human protein complexes containing components of the transcription and RNA processing machineries using protein affinity purification coupled to mass spectrometry. Thirty-two tagged polypeptides yielded a network of 805 high-confidence interactions. Remarkably, the network is significantly enriched in proteins that regulate the formation of protein complexes, including a number of previously uncharacterized proteins for which we have inferred functions. The RNA polymerase II (RNAP II)-associated proteins (RPAPs) are physically and functionally associated with RNAP II, forming an interface between the enzyme and chaperone/scaffolding proteins. BCDIN3 is the 7SK snRNA methylphosphate capping enzyme (MePCE) present in an snRNP complex containing both RNA processing and transcription factors, including the elongation factor P-TEFb. Our results define a high-density protein interaction network for the mammalian transcription machinery and uncover multiple regulatory factors that target the transcription machinery.
Subjects
Nucleotidyltransferases/metabolism , Amino Acid Sequence , Carrier Proteins/chemistry , Carrier Proteins/metabolism , Cell Line , Humans , In Vitro Techniques , Macromolecular Substances , Molecular Sequence Data , Nucleotidyltransferases/chemistry , Nucleotidyltransferases/genetics , Protein Interaction Mapping , RNA Interference , RNA Polymerase II/chemistry , RNA Polymerase II/metabolism , Post-Transcriptional RNA Processing , Small Nuclear Ribonucleoproteins/chemistry , Small Nuclear Ribonucleoproteins/metabolism , Genetic Transcription
ABSTRACT
Given a multiple alignment of orthologous DNA sequences and a phylogenetic tree for these sequences, we investigate the problem of reconstructing the most likely scenario of insertions and deletions capable of explaining the gaps observed in the alignment. This problem, which we call the Indel Maximum Likelihood Problem (IMLP), is an important step toward the reconstruction of ancestral genomic sequences, and is important for studying evolutionary processes, genome function, adaptation and convergence. We solve the IMLP using a new type of tree hidden Markov model whose states correspond to single-base evolutionary scenarios and where transitions model dependencies between neighboring columns. The standard Viterbi and Forward-backward algorithms are optimized to produce the most likely ancestral reconstruction and to compute the level of confidence associated with specific regions of the reconstruction. A heuristic is presented to make the method practical for large data sets, while retaining an extremely high degree of accuracy. The methods are illustrated on a 1-Mb alignment of the CFTR regions from 12 mammals.
Subjects
Molecular Evolution , Genome , Mammals/genetics , Genetic Models , Algorithms , Animals , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Markov Chains
ABSTRACT
Orphan nuclear receptor ERRalpha (NR3B1) is recognized as a key regulator of mitochondrial biogenesis, but it is not known whether ERRalpha and other ERR isoforms play a broader role in cardiac energetics and function. We used genome-wide location analysis and expression profiling to appraise the role of ERRalpha and gamma (NR3B3) in the adult heart. Our data indicate that the two receptors, acting as nonobligatory heterodimers, target a common set of promoters involved in the uptake of energy substrates, production and transport of ATP across the mitochondrial membranes, and intracellular fuel sensing, as well as Ca(2+) handling and contractile work. Motif-finding algorithms assisted by functional studies indicated that ERR target promoters are enriched for NRF-1, CREB, and STAT3 binding sites. Our study thus reveals that the ERRs orchestrate a comprehensive cardiac transcriptional program and further suggests that modulation of ERR activities could be used to manage cardiomyopathies.