ABSTRACT
PURPOSE: More than half of familial cutaneous melanomas have an unknown genetic predisposition. This study aims to characterize a novel melanoma susceptibility gene. METHODS: We performed exome and targeted sequencing in melanoma-prone families lacking variants in known melanoma susceptibility genes. We analyzed the expression of the candidate gene DENND5A in melanoma samples in relation to pigmentation and UV signature. Functional studies were carried out using microscopy and a zebrafish model. RESULTS: We identified a novel DENND5A truncating variant that segregated with melanoma in a Swedish family and 2 additional rare DENND5A variants, 1 of which segregated with the disease in an American family. We found that DENND5A is significantly enriched in pigmented melanoma tissue. Our functional studies show that loss of DENND5A function leads to a decrease in melanin content in vitro and pigmentation defects in vivo. Mechanistically, when DENND5A harbors the truncating variant or is suppressed, it loses its interaction with SNX1 and its ability to transport the SNX1-associated vesicles from melanosomes. Consequently, untethered SNX1-premelanosome protein and redundant tyrosinase are redirected to lysosomal degradation by default, causing a decrease in melanin content. CONCLUSION: Our findings provide evidence of a physiological role of DENND5A in the skin context and link its variants to melanoma susceptibility.
Subjects
Guanine Nucleotide Exchange Factors/genetics; Melanoma; Skin Neoplasms; Animals; Genetic Predisposition to Disease; Humans; Melanoma/genetics; Melanosomes; Monophenol Monooxygenase/metabolism; Skin Neoplasms/genetics; Sorting Nexins; Exome Sequencing; Zebrafish/genetics
ABSTRACT
Epigenetic changes are frequently observed in cancer. However, their role in establishing or sustaining the malignant state has been difficult to determine due to the lack of experimental tools that enable resetting of epigenetic abnormalities. To address this, we applied induced pluripotent stem cell (iPSC) reprogramming techniques to invoke widespread epigenetic resetting of glioblastoma (GBM)-derived neural stem (GNS) cells. GBM iPSCs (GiPSCs) were subsequently redifferentiated to the neural lineage to assess the impact of cancer-specific epigenetic abnormalities on tumorigenicity. GiPSCs and their differentiating derivatives display widespread resetting of common GBM-associated changes, such as DNA hypermethylation of promoter regions of the cell motility regulator TES (testis-derived transcript), the tumor suppressor cyclin-dependent kinase inhibitor 1C (CDKN1C; p57KIP2), and many polycomb-repressive complex 2 (PRC2) target genes (e.g., SFRP2). Surprisingly, despite such global epigenetic reconfiguration, GiPSC-derived neural progenitors remained highly malignant upon xenotransplantation. Only when GiPSCs were directed to nonneural cell types did we observe sustained expression of reactivated tumor suppressors and reduced infiltrative behavior. These data suggest that imposing an epigenome associated with an alternative developmental lineage can suppress malignant behavior. However, in the context of the neural lineage, widespread resetting of GBM-associated epigenetic abnormalities is not sufficient to override the cancer genome.
Subjects
Cellular Reprogramming/genetics; DNA Methylation; Epigenesis, Genetic; Glioblastoma/pathology; Neural Stem Cells/cytology; Animals; Cell Differentiation; Cell Line, Tumor; Cell Lineage; Cell Transformation, Neoplastic/genetics; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Glioblastoma/genetics; Humans; Mice; Mice, Inbred NOD; Pluripotent Stem Cells/cytology; Transplantation, Heterologous
ABSTRACT
Sox2 is a master transcriptional regulator of embryonic development. In this study, we determined the protein interactome of Sox2 in the chromatin and nucleoplasm of mouse embryonic stem (mES) cells. Apart from canonical interactions with pluripotency-regulating transcription factors, we identified interactions with several chromatin modulators, including members of the heterochromatin protein 1 (HP1) family, suggesting a role for Sox2 in chromatin-mediated transcriptional repression. Sox2 was also found to interact with RNA binding proteins (RBPs), including proteins involved in RNA processing. RNA immunoprecipitation followed by sequencing revealed that Sox2 associates with different messenger RNAs, as well as small nucleolar RNA Snord34 and the non-coding RNA 7SK. 7SK has been shown to regulate transcription at gene regulatory regions, which could suggest a functional interaction with Sox2 for chromatin recruitment. Nevertheless, we found no evidence of Sox2 modulating recruitment of 7SK to chromatin when examining 7SK chromatin occupancy by Chromatin Isolation by RNA Purification (ChIRP) in Sox2 depleted mES cells. In addition, knockdown of 7SK in mES cells did not lead to any change in Sox2 occupancy at 7SK-regulated genes. Thus, our results show that Sox2 extensively interacts with RBPs, and suggest that Sox2 and 7SK co-exist in a ribonucleoprotein complex whose function is not to regulate chromatin recruitment, but could rather regulate other processes in the nucleoplasm.
Subjects
Mouse Embryonic Stem Cells/metabolism; SOXB1 Transcription Factors/metabolism; Animals; Cell Line; Chromatin/metabolism; Gene Knockdown Techniques; Mice; RNA-Binding Proteins/metabolism; SOXB1 Transcription Factors/genetics
ABSTRACT
Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies.
Subjects
Fungal Proteins/genetics; Genome, Fungal; Malassezia/genetics; Molecular Sequence Annotation/methods; Proteogenomics/methods; Genes, Fungal; Genome, Mitochondrial; Peptides/genetics; Protein Domains; Sequence Analysis, RNA
ABSTRACT
We applied a targeted sequencing approach to identify germline mutations conferring a moderately to highly increased risk of cutaneous and uveal melanoma. Ninety-two high-risk melanoma patients were screened for inherited variation in 120 melanoma candidate genes. Observed gene variants were filtered based on frequency in reference populations, cosegregation with melanoma in families and predicted functional effect. Several novel or rare genetic variants in genes involved in DNA damage response, cell-cycle regulation and transcriptional control were identified in melanoma patients. Among identified genetic alterations was an extremely rare variant (minor allele frequency of 0.00008) in the BRIP1 gene that was found to cosegregate with the melanoma phenotype. We also found a rare nonsense variant in the BRCA2 gene (rs11571833), previously associated with cancer susceptibility but not with melanoma, which showed weak association with melanoma susceptibility in the Swedish population. Our results add to the growing knowledge about genetic factors associated with melanoma susceptibility and also emphasize the role of DNA damage response as an important factor in melanoma etiology. © 2016 Wiley Periodicals, Inc.
Subjects
BRCA2 Protein/genetics; DNA Damage/genetics; DNA-Binding Proteins/genetics; Genetic Predisposition to Disease; Germ-Line Mutation/genetics; Melanoma/genetics; RNA Helicases/genetics; Adult; Aged; Aged, 80 and over; Case-Control Studies; DNA, Neoplasm/analysis; DNA, Neoplasm/genetics; Fanconi Anemia Complementation Group Proteins; Female; Follow-Up Studies; Humans; Male; Melanoma/pathology; Middle Aged; Pedigree; Prognosis
ABSTRACT
We evaluated 25 protocol variants of 14 independent computational methods for exon identification, transcript reconstruction and expression-level quantification from RNA-seq data. Our results show that most algorithms are able to identify discrete transcript components with high success rates but that assembly of complete isoform structures poses a major challenge even when all constituent elements are identified. Expression-level estimates also varied widely across methods, even when based on similar transcript models. Consequently, the complexity of higher eukaryotic genomes imposes severe limitations on transcript recall and splice product discrimination that are likely to remain limiting factors for the analysis of current-generation RNA-seq data.
Subjects
Computational Biology/methods; RNA Splicing; Sequence Analysis, RNA/methods; Algorithms; Animals; Caenorhabditis elegans; Drosophila melanogaster; Exons; Gene Expression Profiling; Genome; Humans; Introns; RNA Splice Sites; RNA, Messenger/metabolism; Software
ABSTRACT
High-throughput RNA sequencing is an increasingly accessible method for studying gene structure and activity on a genome-wide scale. A critical step in RNA-seq data analysis is the alignment of partial transcript reads to a reference genome sequence. To assess the performance of current mapping software, we invited developers of RNA-seq aligners to process four large human and mouse RNA-seq data sets. In total, we compared 26 mapping protocols based on 11 programs and pipelines and found major performance differences between methods on numerous benchmarks, including alignment yield, basewise accuracy, mismatch and gap placement, exon junction discovery and suitability of alignments for transcript reconstruction. We observed concordant results on real and simulated RNA-seq data, confirming the relevance of the metrics employed. Future developments in RNA-seq alignment methods would benefit from improved placement of multimapped reads, balanced utilization of existing gene annotation and a reduced false discovery rate for splice junctions.
Subjects
RNA Splicing; Sequence Alignment/methods; Sequence Analysis, RNA/methods; Animals; Chromosome Mapping/methods; Computational Biology/methods; Exons; False Positive Reactions; High-Throughput Nucleotide Sequencing/methods; Humans; K562 Cells; Mice; RNA, Messenger/metabolism; Reproducibility of Results; Software
ABSTRACT
Mammalian promoters can be separated into two classes, conserved TATA box-enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.
Subjects
Evolution, Molecular; Promoter Regions, Genetic; 3' Untranslated Regions; Animals; Base Sequence; DNA; Genome; Proteome; TATA Box
ABSTRACT
Genome-wide association studies identified noncoding SNPs associated with type 2 diabetes and obesity in linkage disequilibrium (LD) blocks encompassing HHEX-IDE and introns of CDKAL1 and FTO [Sladek R, et al. (2007) Nature 445:881-885; Steinthorsdottir V, et al. (2007) Nat. Genet 39:770-775; Frayling TM, et al. (2007) Science 316:889-894]. We show that these LD blocks contain highly conserved noncoding elements and overlap with the genomic regulatory blocks of the transcription factor genes HHEX, SOX4, and IRX3. We report that human highly conserved noncoding elements in LD with the risk SNPs drive expression in endoderm or pancreas in transgenic mice and zebrafish. Both HHEX and SOX4 have recently been implicated in pancreas development and the regulation of insulin secretion, but IRX3 had no prior association with pancreatic function or development. Knockdown of its orthologue in zebrafish, irx3a, increased the number of pancreatic ghrelin-producing epsilon cells and decreased the number of insulin-producing beta-cells and glucagon-producing alpha-cells, thereby suggesting a direct link of pancreatic IRX3 function to both obesity and type 2 diabetes.
Subjects
Diabetes Mellitus, Type 2/genetics; Gene Expression Regulation; Homeodomain Proteins/genetics; Obesity/genetics; Polymorphism, Single Nucleotide; SOXC Transcription Factors/genetics; Transcription Factors/genetics; Animals; Conserved Sequence; Diabetes Mellitus, Type 2/complications; Diabetes Mellitus, Type 2/epidemiology; Genes, Reporter; Genome-Wide Association Study; Homeostasis; Humans; Insulin/metabolism; Insulin Secretion; Mice; Mice, Transgenic/genetics; Pancreas/physiology; Risk Factors; Zebrafish/genetics
ABSTRACT
Regulatory elements can affect specific genes from megabase distances, often from within or beyond unrelated neighbouring genes. The task of computational charting of regulatory inputs in the genome can be approached from several directions. Typically, computational identification of putative regulatory elements for a gene of interest requires tools that will aid in estimating the extent of the (potentially vast) genomic region around the gene that is likely to contain regulatory elements, as well as tools for the identification and characterization of individual elements. Conversely, starting from a putative regulatory element or a regulatory variation in a non-coding region, one often wants to associate the regulatory element with the correct target gene(s). The design of tools for these purposes relies on the remarkably high level of sequence conservation of thousands of regulatory enhancers, their strong tendency to cluster around their target genes, as well as a constrained range of functional categories of the corresponding target genes, many of which are developmental regulators. Additional evolutionary information, such as conservation of synteny, and a growing body of functional genomic and epigenomic data are being rapidly added to established and emerging tools for studying developmental regulation and cross-species conservation to provide new functional insights into the roles of these regions. In this article, we give an overview of the functionality available in general purpose and new/specialized web tools for the above tasks, and discuss current and future developments in the field.
Subjects
Computational Biology/methods; Gene Expression Regulation; Internet; Animals; Conserved Sequence; DNA, Intergenic; Databases, Genetic; Humans
ABSTRACT
Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers). The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation.
Subjects
Genetic Variation/genetics; Polymorphism, Single Nucleotide/genetics; Regulatory Elements, Transcriptional/genetics; Sequence Analysis, DNA/methods; Software; Transcription Factors/genetics; Algorithms; Binding Sites; Internet; Protein Binding
ABSTRACT
RNAdb is a comprehensive database of mammalian non-protein-coding RNAs (ncRNAs). There is increasing recognition that ncRNAs play important regulatory roles in multicellular organisms, and there is an expanding rate of discovery of novel ncRNAs as well as an increasing allocation of function. In this update to RNAdb, we provide nucleotide sequences and annotations for tens of thousands of non-housekeeping ncRNAs, including a wide range of mammalian microRNAs, small nucleolar RNAs and larger mRNA-like ncRNAs. Some of these have documented functions and/or expression patterns, but the majority remain of unclear significance, and include PIWI-interacting RNAs, ncRNAs identified from the latest rounds of large-scale cDNA sequencing projects, putative antisense transcripts, as well as ncRNAs predicted on the basis of structural features and alignments. Improvements to the database comprise not only new and updated ncRNA datasets, but also provision of microarray-based expression data and closer interface with more specialized ncRNA resources such as miRBase and snoRNA-LBME-db. To access RNAdb, visit http://research.imb.uq.edu.au/RNAdb.
Subjects
Databases, Nucleic Acid; Mammals/genetics; RNA, Untranslated/chemistry; Animals; Base Sequence; Gene Expression; Internet; RNA, Untranslated/metabolism; User-Computer Interface
ABSTRACT
Mammalian genomes harbor a larger than expected number of complex loci, in which multiple genes are coupled by shared transcribed regions in antisense orientation and/or by bidirectional core promoters. To determine the incidence, functional significance, and evolutionary context of mammalian complex loci, we identified and characterized 5,248 cis-antisense pairs, 1,638 bidirectional promoters, and 1,153 chains of multiple cis-antisense and/or bidirectionally promoted pairs from 36,606 mouse transcriptional units (TUs), along with 6,141 cis-antisense pairs, 2,113 bidirectional promoters, and 1,480 chains from 42,887 human TUs. In both human and mouse, 25% of TUs resided in cis-antisense pairs, only 17% of which were conserved between the two organisms, indicating frequent species specificity of antisense gene arrangements. A sampling approach indicated that over 40% of all TUs might actually be in cis-antisense pairs, and that only a minority of these arrangements are likely to be conserved between human and mouse. Bidirectional promoters were characterized by variable transcriptional start sites and an identifiable midpoint at which overall sequence composition changed strand and the direction of transcriptional initiation switched. In microarray data covering a wide range of mouse tissues, genes in cis-antisense and bidirectionally promoted arrangement showed a higher probability of being coordinately expressed than random pairs of genes. In a case study on homeotic loci, we observed extensive transcription of nonconserved sequences on the noncoding strand, implying that the presence rather than the sequence of these transcripts is of functional importance. Complex loci are ubiquitous, host numerous nonconserved gene structures and lineage-specific exonification events, and may have a cis-regulatory impact on the member genes.
Subjects
Chromosome Mapping; Genome; Mice; Animals; Mice/genetics; Base Pairing; DNA Primers; Genome, Human; Promoter Regions, Genetic; Reverse Transcriptase Polymerase Chain Reaction; Humans
ABSTRACT
The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified. To pursue the complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has been continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To accomplish accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs and developed a Web-based annotation interface for simplifying the annotation procedures to reduce manual annotation errors. Automated coding sequence and function prediction was followed with manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Out of 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing to our knowledge the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species.
Subjects
DNA, Complementary/genetics; Databases, Genetic; Mice/genetics; Transcription, Genetic; Animals; Automation; DNA, Complementary/chemistry; Genome
ABSTRACT
In recent years, there have been increasing numbers of transcripts identified that do not encode proteins, many of which are developmentally regulated and appear to have regulatory functions. Here, we describe the construction of a comprehensive mammalian noncoding RNA database (RNAdb) which contains over 800 unique experimentally studied non-coding RNAs (ncRNAs), including many associated with diseases and/or developmental processes. The database is available at http://research.imb.uq.edu.au/RNAdb and is searchable by many criteria. It includes microRNAs and snoRNAs, but not infrastructural RNAs, such as rRNAs and tRNAs, which are catalogued elsewhere. The database also includes over 1100 putative antisense ncRNAs and almost 20,000 putative ncRNAs identified in high-quality murine and human cDNA libraries, with more to be added in the near future. Many of these RNAs are large, and many are spliced, some alternatively. The database will be useful as a foundation for the emerging field of RNomics and the characterization of the roles of ncRNAs in mammalian gene expression and regulation.
Subjects
Databases, Nucleic Acid; Mammals/genetics; RNA, Untranslated/chemistry; Animals; Humans; Mice; User-Computer Interface
ABSTRACT
BACKGROUND: Germline genetic variants are an important cause of dilated cardiomyopathy (DCM). However, recent sequencing studies have revealed rare variants in DCM-associated genes also in individuals without known heart disease. In this study, we investigate variant prevalence and genotype-phenotype correlations in Swedish DCM patients, and compare their genetic variants to those detected in reference cohorts. METHODS AND RESULTS: We sequenced the coding regions of 41 DCM-associated genes in 176 unrelated patients with idiopathic DCM and found 102 protein-altering variants with an allele frequency of <0.04% in reference cohorts; the majority were missense variants not previously described in DCM. Fifty-five (31%) patients had one variant, and 24 (14%) patients had two or more variants in the analysed genes. Detection of genetic variants in any gene, and in LMNA, MYH7 or TTN alone, was associated with early onset disease and reduced transplant-free survival. As expected, nonsense and frameshift variants were more common in DCM patients than in healthy individuals of the reference cohort 1000 Genomes Europeans. Surprisingly however, the prevalence, conservation and pathogenicity scores, and localization of missense variants were similar in DCM patients and healthy reference individuals. CONCLUSION: To our knowledge, this is the first study to identify correlations between genotype and prognosis when sequencing a large number of genes in unselected DCM patients. The similar distribution of missense variants in DCM patients and healthy reference individuals questions the pathogenic role of many variants, and suggests that results from genetic testing of DCM patients should be interpreted with caution.
Subjects
Cardiomyopathy, Dilated/genetics; Adolescent; Adult; Aged; Cardiomyopathy, Dilated/mortality; Cardiomyopathy, Dilated/therapy; Case-Control Studies; Female; Gene Frequency; Genome-Wide Association Study; Humans; Male; Middle Aged; Mutation, Missense; Survival Analysis; Sweden; Young Adult
ABSTRACT
BACKGROUND: Over 4 million single nucleotide polymorphisms (SNPs) are currently reported to exist within the human genome. Only a small fraction of these SNPs alter gene function or expression, and therefore might be associated with a cell phenotype. These functional SNPs are consequently important in understanding human health. Information related to functional SNPs in candidate disease genes is critical for cost-effective genetic association studies, which attempt to understand the genetics of complex diseases such as diabetes and Alzheimer's disease. Robust methods for the identification of functional SNPs are therefore crucial. We report one such experimental approach. RESULTS: Sequences conserved between the mouse and human genomes, within 5 kilobases of the 5-prime end of 176 GPCR genes, were screened for SNPs. Sequences flanking these SNPs were scored for transcription factor binding sites. Allelic pairs resulting in a significant score difference were predicted to influence the binding of transcription factors (TFs). Ten such SNPs were selected for electrophoretic mobility shift assays (EMSA), and 7 of them exhibited a reproducible shift. The full-length promoter regions containing 4 of the 7 SNPs were cloned into a luciferase-based plasmid reporter system. Two of the 4 SNPs exhibited differential promoter activity in several human cell lines. CONCLUSIONS: We propose a method for effective selection of functional, regulatory SNPs that are located in evolutionarily conserved 5-prime flanking regions (5'-FR) of human genes and influence the activity of the transcriptional regulatory region. Some SNPs behave differently in different cell types.
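The allelic scoring step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the position weight matrix `TOY_PWM`, the function names, and the example sequences are all hypothetical, and the abstract does not state which matrices or what significance threshold were used. The sketch scores every motif-width window that overlaps the SNP for each allele and reports the difference between the best scores.

```python
# Hypothetical 4-bp position weight matrix (log-odds-like scores) for an
# invented motif "TGAC"; real analyses would use curated TF matrices.
TOY_PWM = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
]


def snp_window_scores(flank5, allele, flank3, pwm):
    """Score every PWM-width window that overlaps the SNP position."""
    width = len(pwm)
    # Keep only the flanking bases that can share a window with the SNP.
    left = flank5[-(width - 1):] if width > 1 else ""
    right = flank3[:width - 1]
    seq = left + allele + right
    return [
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    ]


def allelic_score_difference(flank5, ref, alt, flank3, pwm):
    """Best-window score of the reference allele minus that of the alternate."""
    ref_best = max(snp_window_scores(flank5, ref, flank3, pwm))
    alt_best = max(snp_window_scores(flank5, alt, flank3, pwm))
    return ref_best - alt_best


if __name__ == "__main__":
    # The G allele completes the toy motif TGAC; the C allele disrupts it.
    print(allelic_score_difference("AAAGGT", "G", "C", "ACATTT", TOY_PWM))
```

A large absolute difference flags an allelic pair as a candidate for altered binding; how large counts as significant is left open here, as it is in the abstract.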
Subjects
Gene Expression Regulation; Genome, Human; Polymorphism, Single Nucleotide; Sequence Analysis, DNA/methods; Algorithms; Alleles; Animals; Binding Sites; Cell Line; Chromosome Mapping; Conserved Sequence; Evolution, Molecular; Humans; Luciferases/metabolism; Mice; Models, Genetic; Molecular Sequence Data; Oligonucleotides/chemistry; Phenotype; Plasmids/metabolism; Polymorphism, Restriction Fragment Length; Promoter Regions, Genetic; Protein Binding; Transcription Factors/metabolism; Transcription, Genetic
ABSTRACT
BACKGROUND: Evolutionarily conserved sequences within or adjoining orthologous genes often serve as critical cis-regulatory regions. Recent studies have identified long, non-coding genomic regions that are perfectly conserved between human and mouse, termed ultra-conserved regions (UCRs). Here, we focus on UCRs that cluster around genes involved in early vertebrate development; genes conserved over 450 million years of vertebrate evolution. RESULTS: Based on a high resolution detection procedure, our UCR set enables novel insights into vertebrate genome organization and regulation of developmentally important genes. We find that the genomic positions of deeply conserved UCRs are strongly associated with the locations of genes encoding key regulators of development, with particularly strong positional correlation to transcription factor-encoding genes. Of particular importance is the observation that most UCRs are clustered into arrays that span hundreds of kilobases around their presumptive target genes. Such a hallmark signature is present around several uncharacterized human genes predicted to encode developmentally important DNA-binding proteins. CONCLUSION: The genomic organization of UCRs, combined with previous findings, suggests that UCRs act as essential long-range modulators of gene expression. The exceptional sequence conservation and clustered structure suggests that UCR-mediated molecular events involve greater complexity than traditional DNA binding by transcription factors. The high-resolution UCR collection presented here provides a wealth of target sequences for future experimental studies to determine the nature of the biochemical mechanisms involved in the preservation of arrays of nearly identical non-coding sequences over the course of vertebrate evolution.
Subjects
Gene Expression Regulation, Developmental; Genes, Developmental; Genome; Oligonucleotide Array Sequence Analysis/methods; Animals; Cluster Analysis; Conserved Sequence; DNA/metabolism; DNA-Binding Proteins/genetics; Evolution, Molecular; Gene Expression Regulation; Humans; Molecular Sequence Data; Multigene Family; Protein Binding; Vertebrates/genetics
ABSTRACT
BACKGROUND: Pluripotency is characterized by a unique transcriptional state, in which lineage-specification genes are poised for transcription upon exposure to appropriate stimuli, via a bivalency mechanism involving the simultaneous presence of activating and repressive methylation marks at promoter-associated histones. Recent evidence suggests that other mechanisms, such as RNA polymerase II pausing, might be operational in this process, but their regulation remains poorly understood. RESULTS: Here we identify the non-coding snRNA 7SK as a multifaceted regulator of transcription in embryonic stem cells. We find that 7SK represses a specific cohort of transcriptionally poised genes with bivalent or activating chromatin marks in these cells, suggesting a novel poising mechanism independent of Polycomb activity. Genome-wide analysis shows that 7SK also prevents transcription downstream of polyadenylation sites at several active genes, indicating that 7SK is required for normal transcriptional termination or control of 3'-UTR length. In addition, 7SK suppresses divergent upstream antisense transcription at more than 2,600 loci, including many that encode divergent long non-coding RNAs, a finding that implicates the 7SK snRNA in the control of transcriptional bidirectionality. CONCLUSIONS: Our study indicates that a single non-coding RNA, the snRNA 7SK, is a gatekeeper of transcriptional termination and bidirectional transcription in embryonic stem cells and mediates transcriptional poising through a mechanism independent of chromatin bivalency.
Subjects
Embryonic Stem Cells/metabolism; Gene Expression Regulation, Developmental; Genome; RNA Polymerase II/genetics; RNA, Small Nuclear/genetics; Transcription Termination, Genetic; 3' Untranslated Regions; Animals; Binding Sites; Chromatin/chemistry; Chromatin/metabolism; Embryo, Mammalian; Embryonic Stem Cells/cytology; Genetic Loci; Histones/genetics; Histones/metabolism; Mice; Polyadenylation; Promoter Regions, Genetic; RNA Polymerase II/metabolism; RNA, Small Interfering/genetics; RNA, Small Interfering/metabolism; RNA, Small Nuclear/antagonists & inhibitors; RNA, Small Nuclear/metabolism
ABSTRACT
Glioblastoma multiforme (GBM) is the most common primary brain cancer in adults and there are few effective treatments. GBMs contain cells with molecular and cellular characteristics of neural stem cells that drive tumour growth. Here we compare responses of human glioblastoma-derived neural stem (GNS) cells and genetically normal neural stem (NS) cells to a panel of 160 small molecule kinase inhibitors. We used live-cell imaging and high content image analysis tools and identified JNJ-10198409 (J101) as an agent that induces mitotic arrest at prometaphase in GNS cells but not NS cells. Antibody microarrays and kinase profiling suggested that J101 responses are triggered by suppression of the active phosphorylated form of polo-like kinase 1 (Plk1) (phospho T210), with resultant spindle defects and arrest at prometaphase. We found that potent and specific Plk1 inhibitors already in clinical development (BI 2536, BI 6727 and GSK 461364) phenocopied J101 and were selective against GNS cells. Using a porcine brain endothelial cell blood-brain barrier model we also observed that these compounds exhibited greater blood-brain barrier permeability in vitro than J101. Our analysis of mouse mutant NS cells (INK4a/ARF(-/-), or p53(-/-)), as well as the acute genetic deletion of p53 from a conditional p53 floxed NS cell line, suggests that the sensitivity of GNS cells to BI 2536 or J101 may be explained by the lack of a p53-mediated compensatory pathway. Together these data indicate that GBM stem cells are acutely susceptible to proliferative disruption by Plk1 inhibitors and that such agents may have immediate therapeutic value.