ABSTRACT
Pancreatic ductal adenocarcinoma (PDAC) is a lethal disease with limited effective treatment options, underscoring the importance of uncovering novel drug targets. Here, we target cleavage and polyadenylation specificity factor 3 (CPSF3), the 3' endonuclease that catalyzes mRNA cleavage during polyadenylation and histone mRNA processing. We find that CPSF3 is highly expressed in PDAC and is associated with poor prognosis. CPSF3 knockdown blocks PDAC cell proliferation and colony formation in vitro and tumor growth in vivo. Chemical inhibition of CPSF3 by the small molecule JTE-607 also attenuates PDAC cell proliferation and colony formation, while it has no effect on the proliferation of nontransformed immortalized control pancreatic cells. Mechanistically, JTE-607 induces transcriptional readthrough in replication-dependent histones, reduces core histone expression, destabilizes chromatin structure, and arrests cells in the S-phase of the cell cycle. Therefore, CPSF3 represents a potential therapeutic target for the treatment of PDAC.
Subjects
Histones , Pancreatic Neoplasms , Humans , Cell Line, Tumor , Cell Proliferation , Gene Expression Regulation, Neoplastic , Histones/genetics , Pancreatic Neoplasms/drug therapy , Pancreatic Neoplasms/genetics , Pancreatic Neoplasms/metabolism , Polyadenylation , RNA, Messenger/genetics , RNA, Messenger/metabolism
ABSTRACT
MeCP2 is associated with Rett syndrome (RTT), MECP2 duplication syndrome, and a number of conditions with isolated features of these diseases, including autism, intellectual disability, and motor dysfunction. MeCP2 is known to broadly bind methylated DNA, but the precise molecular mechanism driving disease pathogenesis remains to be determined. Using proximity-dependent biotinylation (BioID), we identified a transcription factor 20 (TCF20) complex that interacts with MeCP2 at the chromatin interface. Importantly, RTT-causing mutations in MECP2 disrupt this interaction. TCF20 and MeCP2 are highly coexpressed in neurons and coregulate the expression of key neuronal genes. Reducing Tcf20 partially rescued the behavioral deficits caused by MECP2 overexpression, demonstrating a functional relationship between MeCP2 and TCF20 in MECP2 duplication syndrome pathogenesis. We identified a patient exhibiting RTT-like neurological features with a missense mutation in the PHF14 subunit of the TCF20 complex that abolishes the MeCP2-PHF14-TCF20 interaction. Our data demonstrate the critical role of the MeCP2-TCF20 complex for brain function.
Subjects
Methyl-CpG-Binding Protein 2/metabolism , Multiprotein Complexes/metabolism , Neurodevelopmental Disorders/etiology , Neurodevelopmental Disorders/metabolism , Transcription Factors/metabolism , Alleles , Animals , Biomarkers , Brain/metabolism , Disease Models, Animal , Disease Susceptibility , Methyl-CpG-Binding Protein 2/genetics , Mice , Mice, Knockout , Mice, Transgenic , Models, Biological , Mutation , Neurons/metabolism , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Protein Binding , Synapses/metabolism , Transcription Factors/genetics
ABSTRACT
PURPOSE: Gliomas and their surrounding microenvironment constantly interact to promote tumorigenicity, yet the underlying posttranscriptional regulatory mechanisms that govern this interplay are poorly understood. METHODS: Utilizing our established PAC-seq approach and PolyAMiner bioinformatic analysis pipeline, we deciphered the NUDT21-mediated differential APA dynamics in glioma cells. RESULTS: We identified LAMC1 as a critical NUDT21 alternative polyadenylation (APA) target, common in several core glioma-driving signaling pathways. qRT-PCR analysis confirmed that NUDT21-knockdown in glioma cells results in the preferred usage of the proximal polyA signal (PAS) of LAMC1. Functional studies revealed that NUDT21-knockdown-induced 3'UTR shortening of LAMC1 is sufficient to cause translational gain, as LAMC1 protein is upregulated in these cells compared to their respective controls. We demonstrate that 3'UTR shortening of LAMC1 after NUDT21 knockdown removes binding sites for miR-124/506, thereby relieving potent miRNA-based repression of LAMC1 expression. Remarkably, we report that the knockdown of NUDT21 significantly promoted glioma cell migration and that co-depletion of LAMC1 with NUDT21 abolished this effect. Lastly, we observed that LAMC1 3'UTR shortening predicts poor prognosis of low-grade glioma patients from The Cancer Genome Atlas. CONCLUSION: This study identifies NUDT21 as a core alternative polyadenylation factor that regulates the tumor microenvironment through differential APA and loss of miR-124/506 inhibition of LAMC1. Knockdown of NUDT21 in GBM cells mediates 3'UTR shortening of LAMC1, contributing to an increase in LAMC1, increased glioma cell migration/invasion, and a poor prognosis.
Subjects
Cleavage And Polyadenylation Specificity Factor , Glioma , MicroRNAs , Humans , 3' Untranslated Regions , Glioma/genetics , MicroRNAs/metabolism , Polyadenylation , Signal Transduction , Tumor Microenvironment , Cleavage And Polyadenylation Specificity Factor/metabolism
ABSTRACT
Mutations in MeCP2 result in a crippling neurological disease, but we lack a lucid picture of MeCP2's molecular role. Individual transcriptomic studies yield inconsistent differentially expressed genes. To overcome these issues, we demonstrate a methodology to analyze all modern public data. We obtained relevant raw public transcriptomic data from GEO and ENA, then homogeneously processed it (QC, alignment to reference, differential expression analysis). We present a web portal to interactively access the mouse data, and we discovered a commonly perturbed core set of genes that transcends the limitations of any individual study. We then found functionally distinct, consistently up- and downregulated subsets within these genes and some bias to their location. We present this common core of genes as well as focused cores for up, down, cell fraction models, and some tissues. We observed enrichment for this mouse core in other species MeCP2 models and observed overlap with ASD models. By integrating and examining transcriptomic data at scale, we have uncovered the true picture of this dysregulation. The vast scale of these data enables us to analyze signal-to-noise, evaluate a molecular signature in an unbiased manner, and demonstrate a framework for future disease focused informatics work.
Subjects
Rett Syndrome , Mice , Animals , Rett Syndrome/genetics , Transcriptome , Methyl-CpG-Binding Protein 2/genetics , Methyl-CpG-Binding Protein 2/metabolism , Gene Expression Profiling , Mutation , Disease Models, Animal
ABSTRACT
During messenger RNA (mRNA) maturation, the 3' end of the pre-mRNA is cleaved and a poly(A) sequence is added; this step is an important determinant of mRNA stability and cellular function. Roughly 60%-70% of human genes have two or more polyadenylation sites and can be cleaved at different sites, generating mRNA transcripts of varying lengths. This phenomenon is termed alternative cleavage and polyadenylation (APA), and it plays a role in key biological processes such as gene regulation, cell proliferation, and senescence, as well as in various human diseases. Loss of regulatory microRNA binding sites and interactions with RNA-binding proteins leading to APA have been widely investigated in human diseases. However, the functions of the core APA machinery and related factors during disease conditions remain largely unknown. In this review, we discuss the roles of the polyadenylation machinery in relation to brain disease, cardiac failure, pulmonary fibrosis, cancer, infectious conditions, and other human diseases. Collectively, we believe this review will be a useful avenue for understanding the emerging role of APA in the pathobiology of various human diseases.
Subjects
Polyadenylation , RNA Stability , 3' Untranslated Regions , Humans , RNA Stability/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism
ABSTRACT
Alternative polyadenylation (APA) regulates gene expression through cleavage and addition of a poly(A) sequence at different polyadenylation sites (PAS) in the 3'UTR, thus generating transcript isoforms of different lengths. Cleavage stimulating factor 64 (CstF64) is an APA regulator that plays a role in PAS selection and determines the length of the 3'UTR. CstF64 favors the use of the proximal PAS, resulting in 3'UTR shortening, which enhances protein expression by increasing the stability of the target genes. The aim of this study is to investigate the role of CstF64 in cardiac fibrosis, a key event leading to heart failure (HF). We determined the expression of CstF64 and key profibrotic genes, and assessed their 3'UTR changes by calculating distal PAS (dPAS) usage, in left ventricular (LV) tissues and cardiac fibroblasts from HF patients. CstF64 was upregulated in HF LV tissues and cardiac fibroblasts, along with increased expression of fibrosis-associated genes such as COL1A and FN1 and significant shortening of their 3'UTRs. In addition, HF cardiac fibroblasts showed increased transforming growth factor β receptor 1 (TGFβR1) expression, consistent with significant shortening of the TGFβR1 3'UTR. Upon knockdown of CstF64 in HF fibroblasts, downregulation of profibrotic genes, corresponding to lengthening of their 3'UTRs, was observed. Our findings suggest an important role of CstF64 in myofibroblast activation and promotion of cardiac fibrosis during HF through APA. Therefore, targeting CstF64-mediated RNA processing in human HF could provide a new therapeutic strategy for limiting fibrotic remodeling.
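As a minimal sketch of the distal PAS (dPAS) usage calculation mentioned above: the abstract does not give the exact formula, so this assumes the common definition of distal signal divided by the summed proximal and distal signal. The function name and the read counts are hypothetical.

```python
def dpas_usage(proximal_reads: float, distal_reads: float) -> float:
    """Fraction of a gene's poly(A) site signal that comes from the distal site.

    Assumed definition (not taken from the abstract):
        dPAS usage = distal / (proximal + distal)
    """
    total = proximal_reads + distal_reads
    if total <= 0:
        raise ValueError("no reads support either poly(A) site")
    return distal_reads / total


# Hypothetical read counts for one gene in two conditions.
control = dpas_usage(proximal_reads=40, distal_reads=60)
failing = dpas_usage(proximal_reads=70, distal_reads=30)

# A negative shift means less distal site usage, i.e. 3'UTR shortening.
shift = failing - control
```

Under this definition a drop in dPAS usage between conditions corresponds to 3'UTR shortening, and a rise corresponds to lengthening.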
ABSTRACT
Almost 70% of human genes undergo alternative polyadenylation (APA) and generate mRNA transcripts with varying lengths, typically of the 3' untranslated regions (UTR). APA plays an important role in development and cellular differentiation, and its dysregulation can cause neuropsychiatric diseases and increase cancer severity. Increasing awareness of APA's role in human health and disease has propelled the development of several 3' sequencing (3'Seq) techniques that allow for precise identification of APA sites. However, despite the recent data explosion, there are no robust computational tools designed specifically to analyze 3'Seq data. Analytical approaches that have been used to analyze these data predominantly compare proximal and distal site usage. With about 50% of human genes having more than two APA isoforms, current methods fail to capture the entirety of APA changes and do not account for non-proximal to non-distal changes. Addressing these key challenges, this study demonstrates PolyA-miner, an algorithm to accurately detect and assess differential alternative polyadenylation specifically from 3'Seq data. Genes are abstracted as APA matrices, and differential APA usage is inferred using iterative consensus non-negative matrix factorization (NMF) based clustering. PolyA-miner accounts for all non-proximal to non-distal APA switches using vector projections and reflects precise gene-level 3'UTR changes. It can also effectively identify novel APA sites that are otherwise undetected when using reference-based approaches. Evaluation on multiple datasets (first-generation MicroArray Quality Control (MAQC) brain and Universal Human Reference (UHR) PolyA-seq data, recent glioblastoma cell line NUDT21 knockdown Poly(A)-ClickSeq (PAC-seq) data, and our own mouse hippocampal and human stem cell-derived neuron PAC-seq data) strongly supports the value and protocol-independent applicability of PolyA-miner. Strikingly, in the glioblastoma cell line data, PolyA-miner identified more than twice as many genes with APA changes as initially reported. With the emerging importance of APA in human development and disease, PolyA-miner can significantly improve data analysis and help decode the underlying APA dynamics.
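The idea of scoring a gene with more than two APA sites through a vector projection can be illustrated as follows. This is a simplified sketch, not PolyA-miner's actual implementation: it assumes sites are ordered proximal to distal, uses the site rank as a coordinate, and projects the change in site usage onto that axis. All names and counts are hypothetical.

```python
import math


def usage(counts):
    """Convert per-site read counts (ordered proximal -> distal) to proportions."""
    total = sum(counts)
    if total == 0:
        raise ValueError("gene has no poly(A) site reads")
    return [c / total for c in counts]


def lengthening_score(control_counts, treated_counts):
    """Scalar projection of the usage change onto a proximal -> distal axis.

    Positive values indicate a shift toward more distal sites (lengthening),
    negative values a shift toward more proximal sites (shortening).
    """
    n = len(control_counts)
    if n < 2 or n != len(treated_counts):
        raise ValueError("need the same two or more sites in both conditions")
    # Centered site ranks define the axis, e.g. [-1.5, -0.5, 0.5, 1.5] for 4 sites.
    axis = [i - (n - 1) / 2 for i in range(n)]
    norm = math.sqrt(sum(a * a for a in axis))
    delta = [t - c for c, t in zip(usage(control_counts), usage(treated_counts))]
    return sum(d * a for d, a in zip(delta, axis)) / norm


# Four-site gene: reads move from the second site to the third site only,
# so the most proximal and most distal sites are unchanged.
score = lengthening_score([25, 35, 15, 25], [25, 15, 35, 25])
```

In this example a comparison restricted to the most proximal and most distal sites would see no change, while the projection is positive because usage moved from an inner site to a more distal inner site.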
Subjects
Algorithms , Polyadenylation , RNA-Seq/methods , 3' Untranslated Regions , Animals , Humans , Mice , RNA-Seq/standards , Reference Standards , Software
ABSTRACT
Splicing regulation is an important step of post-transcriptional gene regulation. It is a highly dynamic process orchestrated by RNA-binding proteins (RBPs). RBP dysfunction and global splicing dysregulation have been implicated in many human diseases, but the in vivo functions of most RBPs and the splicing outcome upon their loss remain largely unexplored. Here we report that constitutive deletion of Rbm17, which encodes an RBP with a putative role in splicing, causes early embryonic lethality in mice and that its loss in Purkinje neurons leads to rapid degeneration. Transcriptome profiling of Rbm17-deficient and control neurons and subsequent splicing analyses using CrypSplice, a new computational method that we developed, revealed that more than half of RBM17-dependent splicing changes are cryptic. Importantly, RBM17 represses cryptic splicing of genes that likely contribute to motor coordination and cell survival. This finding prompted us to re-analyze published datasets from a recent report on TDP-43, an RBP implicated in amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), as it was demonstrated that TDP-43 represses cryptic exon splicing to promote cell survival. We uncovered a large number of TDP-43-dependent splicing defects that were not previously discovered, revealing that TDP-43 extensively regulates cryptic splicing. Moreover, we found a significant overlap in genes that undergo both RBM17- and TDP-43-dependent cryptic splicing repression, many of which are associated with survival. We propose that repression of cryptic splicing by RBPs is critical for neuronal health and survival. CrypSplice is available at www.liuzlab.org/CrypSplice.
Subjects
Amyotrophic Lateral Sclerosis/genetics , DNA-Binding Proteins/genetics , Frontotemporal Dementia/genetics , Nerve Degeneration/genetics , Nerve Tissue Proteins/genetics , RNA Splicing Factors/genetics , Amyotrophic Lateral Sclerosis/physiopathology , Animals , Computational Biology/methods , Disease Models, Animal , Exons/genetics , Frontotemporal Dementia/physiopathology , Gene Expression Regulation, Developmental , Humans , Mice , Nerve Degeneration/pathology , Nerve Tissue Proteins/biosynthesis , Purkinje Cells/metabolism , Purkinje Cells/pathology , RNA Splicing/genetics , RNA Splicing Factors/biosynthesis , RNA-Binding Proteins/biosynthesis , RNA-Binding Proteins/genetics
ABSTRACT
Conventionally, overall gene expression measurements from microarrays are used to infer gene networks, but it is challenging to account for splicing isoforms. High-throughput RNA sequencing has made splice variant profiling practical. However, its true merit in quantifying splicing isoforms and isoform-specific exon expression is not well explored in inferring gene networks. This study demonstrates SpliceNet, a method to infer isoform-specific co-expression networks from exon-level RNA-Seq data, using large dimensional trace. It goes beyond differentially expressed genes and infers splicing isoform network changes between normal and diseased samples. It eases the sample size bottleneck; evaluations on simulated data and lung cancer-specific ERBB2 and MAPK signaling pathways, with varying numbers of samples, evince its merit in handling datasets with a high exon-to-sample-size ratio. Inferred network rewiring of well-established Bcl-x and EGFR centered networks from lung adenocarcinoma expression data is in good agreement with the literature. Gene-level evaluations demonstrate a substantial performance advantage of SpliceNet over canonical correlation analysis, a method that is currently applied to exon-level RNA-Seq data. SpliceNet can also be applied to exon array data. SpliceNet is distributed as an R package available at http://www.jjwanglab.org/SpliceNet.
Subjects
Alternative Splicing , Gene Expression Profiling/methods , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing/methods , Neoplasms/genetics , Protein Isoforms/genetics , Sequence Analysis, RNA/methods , Carcinoma, Non-Small-Cell Lung/genetics , Humans , Lung Neoplasms/genetics , Protein Isoforms/metabolism , Signal Transduction , Software
ABSTRACT
Decreasing pH due to anthropogenic CO2 inputs, called ocean acidification (OA), can make coastal environments unfavorable for oysters. This is a serious socioeconomic issue for China, which supplies >70% of the world's edible oysters. Here, we present an iTRAQ-based protein profiling approach for the detection and quantification of proteome changes under OA in the early life stage of a commercially important oyster, Crassostrea hongkongensis. Availability of the complete genome sequence for the Pacific oyster (Crassostrea gigas) enabled us to confidently quantify over 1500 proteins in larval oysters. Over 7% of the proteome was altered in response to OA at pH 7.6 (NBS scale). Analysis of differentially expressed proteins and their associated functional pathways showed an upregulation of proteins involved in calcification, metabolic processes, and oxidative stress, each of which may be important in physiological adaptation of this species to OA. The downregulation of cytoskeletal and signal transduction proteins, on the other hand, might have impaired cellular dynamics and organelle development under OA. However, there were no significant detrimental effects on developmental processes such as metamorphic success. Implications of the differentially expressed proteins and metabolic pathways in the development of OA resistance in oyster larvae are discussed. The MS proteomics data have been deposited to the ProteomeXchange with the identifier PXD002138 (http://proteomecentral.proteomexchange.org/dataset/PXD002138).
Subjects
Adaptation, Physiological/genetics , Crassostrea/physiology , Proteomics , Animals , Crassostrea/genetics , Crassostrea/metabolism , Larva/metabolism , Proteome
ABSTRACT
MOTIVATION: Inferring gene-regulatory networks is crucial for decoding various complex mechanisms in biological systems. Synthesis of a fully functional transcription factor/protein from DNA involves a series of reactions, leading to a delay in gene regulation. The complexity increases with the dynamic delay induced by other small molecules involved in gene regulation and by the noisy cellular environment. The dynamic delay in gene regulation is quite evident in high-temporal live cell lineage-imaging data. Although a number of gene-network-inference methods have been proposed, most of them ignore the associated dynamic time delay. RESULTS: Here, we propose DDGni (dynamic delay gene-network inference), a novel gene-network-inference algorithm based on the gapped local alignment of gene-expression profiles. The local alignment can detect short-term gene regulations that are usually overlooked by traditional correlation and mutual information-based methods. DDGni uses 'gaps' to handle the dynamic delay and non-uniform sampling frequency in high-temporal data, such as live cell imaging data. Our algorithm is evaluated on synthetic and yeast cell cycle data, and on Caenorhabditis elegans live cell imaging data, against other prominent methods. The area under the curve of our method is significantly higher than that of other methods on all three datasets. AVAILABILITY: The program, datasets and supplementary files are available at http://www.jjwanglab.org/DDGni/.
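The general idea of a gapped local alignment between two expression profiles can be sketched with a Smith-Waterman style score over discretized trends. This is an illustration of the technique only, not the DDGni implementation; the trend encoding, the scoring values, and the example profiles are all assumptions.

```python
def trend_symbols(profile, eps=0.1):
    """Encode successive changes as +1 (up), -1 (down) or 0 (flat)."""
    symbols = []
    for prev, cur in zip(profile, profile[1:]):
        diff = cur - prev
        symbols.append(1 if diff > eps else -1 if diff < -eps else 0)
    return symbols


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman score of the best local alignment, with gaps allowed."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            table[i][j] = max(
                0,                         # start a new local alignment
                table[i - 1][j - 1] + step,  # align the two symbols
                table[i - 1][j] + gap,     # gap in profile b
                table[i][j - 1] + gap,     # gap in profile a
            )
            best = max(best, table[i][j])
    return best


# Hypothetical profiles: the target repeats the regulator's rise and fall
# two time points later.
regulator = [1.0, 2.0, 3.0, 2.0, 1.0, 1.0, 1.0]
target = [1.0, 1.0, 1.0, 2.0, 3.0, 2.0, 1.0]
score = local_alignment_score(trend_symbols(regulator), trend_symbols(target))
```

Because the alignment is local, the shared rise-and-fall segment is scored even though it occurs at different time points in the two profiles, which is the property that lets this kind of method tolerate a regulatory delay.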
Subjects
Algorithms , Gene Expression Profiling/methods , Gene Regulatory Networks , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Gene Expression Regulation , Transcription Factors/metabolism , Yeasts/genetics , Yeasts/metabolism
ABSTRACT
Inferring gene regulatory networks from gene expression data at the whole-genome level is still an arduous challenge, especially in higher organisms where the number of genes is large but the number of experimental samples is small. It is reported that the accuracy of current methods at genome scale drops significantly from Escherichia coli to Saccharomyces cerevisiae due to the increase in the number of genes. This limits the applicability of current methods to more complex genomes, such as human and mouse. The least absolute shrinkage and selection operator (LASSO) is widely used for gene regulatory network inference from gene expression profiles. However, the accuracy of LASSO on large genomes is not satisfactory. In this study, we apply two extended models of LASSO, the L0 and L1/2 regularization models, to infer gene regulatory networks from both high-throughput gene expression data and transcription factor binding data in mouse embryonic stem cells (mESCs). We find that both the L0 and L1/2 regularization models significantly outperform LASSO in network inference. Incorporating interactions between transcription factors and their targets remarkably improved the prediction accuracy. The current study demonstrates the efficiency and applicability of these two models for gene regulatory network inference from integrative omics data in large genomes. The application of the two models will help biologists study the gene regulation of higher model organisms at a genome-wide scale.
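The three penalties compared above can be written in one common form. This is a sketch in generic regression notation rather than the paper's own: y (expression of one target gene across samples), X (expression of candidate regulators), the coefficient vector β, and the tuning parameter λ are assumed symbols.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 \;+\; \lambda\, P(\beta),
\qquad
P(\beta) \;=\;
\begin{cases}
\sum_j \lvert \beta_j \rvert & \text{LASSO } (L_1) \\[4pt]
\sum_j \lvert \beta_j \rvert^{1/2} & L_{1/2} \\[4pt]
\sum_j \mathbf{1}\!\left[\beta_j \neq 0\right] & L_0
\end{cases}
```

Moving from L1 toward L0 penalizes the number of nonzero coefficients more directly than their size, which generally yields sparser sets of predicted regulators per target gene.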
Subjects
Gene Expression Profiling/methods , Gene Regulatory Networks , Algorithms , Chromatin Immunoprecipitation/methods , Genome , Models, Genetic
ABSTRACT
Alternative polyadenylation (APA) is a key post-transcriptional regulatory mechanism; yet, its regulation and impact on human diseases remain understudied. Existing bulk RNA sequencing (RNA-seq)-based APA methods predominantly rely on predefined annotations, severely impacting their ability to decode novel tissue- and disease-specific APA changes. Furthermore, they only account for the most proximal and distal cleavage and polyadenylation sites (C/PASs). Deconvoluting overlapping C/PASs and the inherent noisy 3' UTR coverage in bulk RNA-seq data pose additional challenges. To overcome these limitations, we introduce PolyAMiner-Bulk, an attention-based deep learning algorithm that accurately recapitulates C/PAS sequence grammar, resolves overlapping C/PASs, captures non-proximal-to-distal APA changes, and generates visualizations to illustrate APA dynamics. Evaluation on multiple datasets strongly evinces the performance merit of PolyAMiner-Bulk, accurately identifying more APA changes compared with other methods. With the growing importance of APA and the abundance of bulk RNA-seq data, PolyAMiner-Bulk establishes a robust paradigm of APA analysis.
Subjects
Deep Learning , Polyadenylation , Humans , Polyadenylation/genetics , RNA-Seq , RNA , Sequence Analysis, RNA/methods , Algorithms
ABSTRACT
BACKGROUND: Regulation of the thermogenic response by brown adipose tissue (BAT) is an important component of energy homeostasis with implications for the treatment of obesity and diabetes. Our preliminary analyses of RNA-Seq data uncovered many nodes representing epigenetic modifiers that are altered in BAT in response to chronic thermogenic activation. Thus, we hypothesized that chronic thermogenic activation broadly alters epigenetic modifications of DNA and histones in BAT. RESULTS: Motivated to understand how BAT function is regulated epigenetically, we developed a novel method for the first-ever unbiased top-down proteomic quantitation of histone modifications in BAT and validated our results with a multi-omic approach. To test our hypothesis, wildtype male C57BL/6J mice were housed under chronic conditions of thermoneutral temperature (TN, 28°C), mild cold/room temperature (RT, 22°C), or severe cold (SC, 8°C), and BAT was analyzed for DNA methylation and histone modifications. Methylation of promoters and intragenic regions in genomic DNA decreases in response to chronic cold exposure. Integration of DNA methylation and RNA expression datasets suggests a role for epigenetic modification of DNA in regulation of gene expression in response to cold. In response to cold housing, we observe increased bulk acetylation of histones H3.2 and H4, increased histone H3.2 proteoforms with di- and trimethylation of lysine 9 (K9me2 and K9me3), and increased histone H4 proteoforms with acetylation of lysine 16 (K16ac) in BAT. CONCLUSIONS: Our results reveal global epigenetically regulated transcriptional "on" and "off" signals in murine BAT in response to varying degrees of chronic cold stimuli and establish a novel methodology to quantitatively study histones in BAT, allowing for direct comparisons to decipher mechanistic changes during the thermogenic response. Additionally, we make histone PTM and proteoform quantitation, RNA splicing, RRBS, and transcriptional footprint datasets available as a resource for future research.
Subjects
Adipose Tissue, Brown , Cold-Shock Response , DNA Methylation , Epigenesis, Genetic , Histones , Mice, Inbred C57BL , Animals , Adipose Tissue, Brown/metabolism , Mice , Male , Histones/metabolism , Histone Code , Thermogenesis , Cold Temperature
ABSTRACT
Matrin-3 (MATR3) is an RNA-binding protein implicated in neurodegenerative and neurodevelopmental diseases. However, little is known regarding the role of MATR3 in cryptic splicing within the context of functional genes and how disease-associated variants impact this function. We show that loss of MATR3 leads to cryptic exon inclusion in many transcripts. We reveal that the ALS-linked S85C pathogenic variant reduces MATR3 solubility but does not impair RNA binding. In parallel, we report a novel neurodevelopmental disease-associated M548T variant, located in the RRM2 domain, which reduces protein solubility and impairs the RNA binding and cryptic splicing repression functions of MATR3. Altogether, our research identifies cryptic events within functional genes and demonstrates how disease-associated variants impact the cryptic splicing repression function of MATR3.
Subjects
Amyotrophic Lateral Sclerosis , Humans , Amyotrophic Lateral Sclerosis/genetics , Exons/genetics , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , RNA , Nuclear Matrix-Associated Proteins/genetics
ABSTRACT
Cellular responses to the steroid hormones estrogen (E2) and progesterone (P4) are governed by the transcriptional output of their cognate receptors. However, the feed-forward mechanisms that shape cell-type-specific transcriptional fulcrums for steroid receptors are unidentified. Herein, we found that a common feed-forward mechanism between GREB1 and steroid receptors regulates the differential effect of GREB1 on steroid hormones in a physiological or pathological context. In physiological (receptive) endometrium, GREB1 controls P4 responses in uterine stroma, affecting endometrial receptivity and decidualization, while not affecting E2-mediated epithelial proliferation. Mechanistically, progesterone-induced GREB1 physically interacts with the progesterone receptor, acting as a cofactor in a positive feedback mechanism to regulate P4-responsive genes. Conversely, in endometrial pathology (endometriosis), E2-induced GREB1 modulates E2-dependent gene expression to promote the growth of endometriotic lesions in mice. This differential action of GREB1, exerted by a common feed-forward mechanism with steroid receptors, advances our understanding of the mechanisms that underlie cell- and tissue-specific steroid hormone actions.
Subjects
Endometriosis , Neoplasm Proteins , Receptors, Steroid , Animals , Female , Humans , Mice , Endometriosis/genetics , Endometriosis/metabolism , Endometrium/metabolism , Estrogens/metabolism , Neoplasm Proteins/metabolism , Progesterone/metabolism , Receptors, Progesterone/genetics , Receptors, Progesterone/metabolism , Receptors, Steroid/genetics , Receptors, Steroid/metabolism , Steroids/metabolism
ABSTRACT
More than half of human genes exercise alternative polyadenylation (APA) and generate mRNA transcripts with varying 3' untranslated regions (UTR). However, current computational approaches for identifying cleavage and polyadenylation sites (C/PASs) and quantifying 3'UTR length changes from bulk RNA-seq data fail to unravel tissue- and disease-specific APA dynamics. Here, we developed a next-generation bioinformatics algorithm and application, PolyAMiner-Bulk, that utilizes an attention-based machine learning architecture and an improved vector projection-based engine to infer differential APA dynamics accurately. When applied to earlier studies, PolyAMiner-Bulk accurately identified more than twice the number of APA changes in an RBM17 knockdown bulk RNA-seq dataset compared to current generation tools. Moreover, on a separate dataset, PolyAMiner-Bulk revealed novel APA dynamics and pathways in scleroderma pathology and identified differential APA in a gene that was identified as being involved in scleroderma pathogenesis in an independent study. Lastly, we used PolyAMiner-Bulk to analyze the RNA-seq data of post-mortem prefrontal cortexes from the ROSMAP data consortium and unraveled novel APA dynamics in Alzheimer's Disease. Our method, PolyAMiner-Bulk, creates a paradigm for future alternative polyadenylation analysis from bulk RNA-seq data.
ABSTRACT
Alternative polyadenylation (APA) creates distinct transcripts from the same gene by cleaving the pre-mRNA at poly(A) sites that can lie within the 3' untranslated region (3'UTR), introns, or exons. Most studies focus on APA within the 3'UTR; however, here, we show that CPSF6 insufficiency alters protein levels and causes a developmental syndrome by deregulating APA throughout the transcript. In neonatal humans and zebrafish larvae, CPSF6 insufficiency shifts poly(A) site usage between the 3'UTR and internal sites in a pathway-specific manner. Genes associated with neuronal function undergo mostly intronic APA, reducing their expression, while genes associated with heart and skeletal function mostly undergo 3'UTR APA and are up-regulated. This suggests that, under healthy conditions, cells toggle between internal and 3'UTR APA to modulate protein expression.
Subjects
Polyadenylation , Zebrafish , Animals , Humans , Infant, Newborn , 3' Untranslated Regions , Exons , Introns/genetics , Zebrafish/genetics , Embryo, Nonmammalian
ABSTRACT
MOTIVATION: The interaction between a transcription factor (TF) and its transcription factor binding site (TFBS) is essential for gene regulation. A mutation in either the TF or the TFBS may weaken their interaction and thus result in abnormalities. To maintain this vital interaction, a mutation in one of the interacting partners might be compensated by a corresponding mutation in its binding partner during the course of evolution. Confirming this co-evolutionary relationship will guide us in designing protein sequences to target a specific DNA sequence, in predicting TFBSs for poorly studied proteins, or even in correcting and rescuing disease mutations in clinical applications. RESULTS: Based on six publicly available, experimentally validated TF-TFBS binding datasets for the basic Helix-Loop-Helix (bHLH) family, Homeo family, High-Mobility Group (HMG) family and Transient Receptor Potential channels (TRP) family, we showed that the evolution of the TFs and that of their TFBSs are significantly correlated across eukaryotes. We further developed a mutual information-based method to identify co-evolved protein residues and DNA bases. This research sheds light on the dynamic relationship between TF and TFBS during their evolution. The same principle and strategy can be applied to co-evolutionary studies on protein-DNA interactions in other protein families. AVAILABILITY: All the datasets, scripts and other related files have been made freely available at: http://jjwanglab.org/co-evo. CONTACT: junwen@uw.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
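The core quantity behind a mutual information-based co-evolution screen can be sketched as follows. This is a generic illustration, not the authors' method: it computes the mutual information between one protein alignment column and one binding-site column across paired TF-TFBS observations. The function name and the toy columns are hypothetical.

```python
import math
from collections import Counter


def mutual_information(residues, bases):
    """Mutual information (in bits) between paired symbols.

    residues[i] is the amino acid at one TF position and bases[i] is the
    nucleotide at one TFBS position for the same TF-TFBS pair i.
    """
    if len(residues) != len(bases) or not residues:
        raise ValueError("need two equally long, non-empty columns")
    n = len(residues)
    count_x = Counter(residues)
    count_y = Counter(bases)
    count_xy = Counter(zip(residues, bases))
    mi = 0.0
    for (x, y), count in count_xy.items():
        joint = count / n
        # joint / (p_x * p_y), written with raw counts to limit rounding error
        mi += joint * math.log2(count * n / (count_x[x] * count_y[y]))
    return mi


# Toy columns: the base tracks the residue in the first case and is
# independent of it in the second.
residue_column = list("RRRRKKKK")
mi_linked = mutual_information(residue_column, list("GGGGAAAA"))
mi_unlinked = mutual_information(residue_column, list("GAGAGAGA"))
```

A residue-base pair that changes together across species scores high, while an independent pair scores near zero; real analyses also need a correction for shared phylogeny and small sample sizes, which this sketch omits.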
Subjects
Evolution, Molecular , Regulatory Elements, Transcriptional , Transcription Factors/metabolism , Base Sequence , Binding Sites , DNA/chemistry , DNA/metabolism , Humans , Transcription Factors/chemistry , Transcription Factors/genetics
ABSTRACT
Recent genome-wide association studies corroborate classical research on developmental programming indicating that obesity is primarily a neurodevelopmental disease strongly influenced by nutrition during critical ontogenic windows. Epigenetic mechanisms regulate neurodevelopment; however, little is known about their role in establishing and maintaining the brain's energy balance circuitry. We generated neuron and glia methylomes and transcriptomes from male and female mouse hypothalamic arcuate nucleus, a key site for energy balance regulation, at time points spanning the closure of an established critical window for developmental programming of obesity risk. We find that postnatal epigenetic maturation is markedly cell type and sex specific and occurs in genomic regions enriched for heritability of body mass index in humans. Our results offer a potential explanation for both the limited ontogenic windows for and sex differences in sensitivity to developmental programming of obesity and provide a rich resource for epigenetic analyses of developmental programming of energy balance.