ABSTRACT
Combinatorial interactions among transcription factors are critical to directing tissue-specific gene expression. To build a global atlas of these combinations, we have screened for physical interactions among the majority of human and mouse DNA-binding transcription factors (TFs). The complete networks contain 762 human and 877 mouse interactions. Analysis of the networks reveals that highly connected TFs are broadly expressed across tissues, and that roughly half of the measured interactions are conserved between mouse and human. The data highlight the importance of TF combinations for determining cell fate, and they lead to the identification of a SMAD3/FLI1 complex expressed during development of immunity. The availability of large TF combinatorial networks in both human and mouse will provide many opportunities to study gene regulation, tissue differentiation, and mammalian evolution.
Subjects
Gene Expression Regulation; Gene Regulatory Networks; Transcription Factors/metabolism; Animals; Cell Differentiation; Evolution, Molecular; Humans; Mice; Monocytes/cytology; Organ Specificity; Smad3 Protein/metabolism; Trans-Activators/metabolism
ABSTRACT
BACKGROUND: Colorectal cancer (CRC) is a heterogeneous disease, with subtypes that have different clinical behaviours and subsequent prognoses. There is a growing body of evidence suggesting that right-sided colorectal cancer (RCC) and left-sided colorectal cancer (LCC) also differ in treatment success and patient outcomes. Biomarkers that differentiate between RCC and LCC are not well-established. Here, we apply random forest (RF) machine learning methods to identify genomic or microbial biomarkers that differentiate RCC and LCC. METHODS: RNA-seq expression data for 58,677 coding and non-coding human genes and count data for 28,557 human unmapped reads were obtained from 308 patient CRC tumour samples. We created three RF models for datasets of human genes-only, microbes-only, and genes-and-microbes combined. We used a permutation test to identify features of significant importance. Finally, we used differential expression (DE) and paired Wilcoxon-rank sum tests to associate features with a particular side. RESULTS: RF model accuracy scores were 90%, 70%, and 87% with area under curve (AUC) of 0.9, 0.76, and 0.89 for the human genomic, microbial, and combined feature sets, respectively. 15 features were identified as significant in the model of genes-only, 54 microbes in the model of microbes-only, and 28 genes and 18 microbes in the model with genes-and-microbes combined. PRAC1 expression was the most important feature for differentiating RCC and LCC in the genes-only model, with HOXB13, SPAG16, HOXC4, and RNLS also playing a role. Ruminococcus gnavus and Clostridium acetireducens were the most important in the microbial-only model. MYOM3, HOXC4, Coprococcus eutactus, PRAC1, lncRNA AC012531.25, Ruminococcus gnavus, RNLS, HOXC6, SPAG16 and Fusobacterium nucleatum were most important in the combined model. CONCLUSIONS: Many of the identified genes and microbes among all models have previously established associations with CRC. However, the ability of RF models to account for inter-feature relationships within the underlying decision trees may yield a more sensitive and biologically interconnected set of genomic and microbial biomarkers.
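The permutation test mentioned in the METHODS can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how importance was scored or permuted, so the sketch swaps in a simple stand-in score (the absolute difference between group means) in place of random forest importance. The function names and the toy values are hypothetical.

```python
import random
from statistics import mean


def importance(values, labels):
    """Stand-in importance score (assumption): absolute difference between
    the mean value in RCC samples and the mean value in LCC samples."""
    right = [v for v, s in zip(values, labels) if s == "RCC"]
    left = [v for v, s in zip(values, labels) if s == "LCC"]
    return abs(mean(right) - mean(left))


def permutation_p_value(values, labels, n_permutations=2000, seed=0):
    """Estimate how often shuffled side labels score at least as high
    as the real labels do."""
    rng = random.Random(seed)
    observed = importance(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any link between feature and side
        if importance(values, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical toy values, not data from the study.
labels = ["RCC"] * 6 + ["LCC"] * 6
informative = [9.1, 8.7, 9.4, 8.9, 9.0, 9.3, 2.1, 1.8, 2.4, 2.0, 1.9, 2.2]
uninformative = [5.0, 5.2, 4.9, 5.1, 5.0, 5.1, 5.1, 4.9, 5.0, 5.2, 5.0, 5.1]

print(permutation_p_value(informative, labels))    # small p-value
print(permutation_p_value(uninformative, labels))  # p-value near 1
```

A feature whose score is rarely matched by shuffled labels would be flagged as significantly important; with a random forest, the same loop would refit or rescore the model on each shuffle instead of recomputing a difference of means.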
Subjects
Colorectal Neoplasms; Microbiota; Colorectal Neoplasms/genetics; Humans; Random Forest; Machine Learning; Microbiota/genetics; Genetic Markers; Male; Female; Adult; Middle Aged; Aged; Aged, 80 and over
ABSTRACT
Long non-coding RNAs (lncRNAs) are largely heterogeneous and functionally uncharacterized. Here, using FANTOM5 cap analysis of gene expression (CAGE) data, we integrate multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5' ends and expression profiles across 1,829 samples from the major human primary cell types and tissues. Genomic and epigenomic classification of these lncRNAs reveals that most intergenic lncRNAs originate from enhancers rather than from promoters. Incorporating genetic and expression data, we show that lncRNAs overlapping trait-associated single nucleotide polymorphisms are specifically expressed in cell types relevant to the traits, implicating these lncRNAs in multiple diseases. We further demonstrate that lncRNAs overlapping expression quantitative trait loci (eQTL)-associated single nucleotide polymorphisms of messenger RNAs are co-expressed with the corresponding messenger RNAs, suggesting their potential roles in transcriptional regulation. Combining these findings with conservation data, we identify 19,175 potentially functional lncRNAs in the human genome.
Subjects
Databases, Genetic; RNA, Long Noncoding/chemistry; RNA, Long Noncoding/genetics; Transcriptome/genetics; Cells, Cultured; Conserved Sequence/genetics; Datasets as Topic; Enhancer Elements, Genetic/genetics; Epigenesis, Genetic; Gene Expression Profiling; Gene Expression Regulation; Genome, Human/genetics; Genome-Wide Association Study; Genomics; Humans; Internet; Molecular Sequence Annotation; Organ Specificity/genetics; Polymorphism, Single Nucleotide; Promoter Regions, Genetic/genetics; Quantitative Trait Loci/genetics; RNA Stability; RNA, Messenger/genetics
ABSTRACT
Circular RNAs (circRNAs) are an evolutionarily conserved form of noncoding RNA with covalently closed loop structures. Initial studies established a functional role for circRNAs as potent microRNA sponges and many other studies have focussed solely on this. However, the biological functions of most circRNAs are still undetermined and other functional roles are gaining traction. These include protein sponges and regulators, and coding for proteins with an alternative mechanism of translation, potentially opening up a whole new transcriptome. The first step to gaining insight into circRNA function is accurate identification and various software platforms have been developed. Specialized detection software has now evolved into whole bioinformatics pipelines that can be used for detection, de novo identification, functional prediction, and validation of circRNAs. However, few cardiovascular circRNA studies have utilized these tools. This review summarizes current knowledge of circRNA biogenesis, bioinformatic detection tools and the emerging role of circRNAs in cardiovascular disease.
Subjects
Cardiovascular Diseases/genetics; Computational Biology/methods; RNA, Circular/genetics; Gene Expression Regulation; Humans; MicroRNAs/genetics; Software
ABSTRACT
BACKGROUND: Long noncoding RNAs (lncRNAs) have been implicated in the pathogenesis of cardiovascular diseases. We aimed to identify novel lncRNAs associated with the early response to ischemia in the heart. METHODS AND RESULTS: RNA sequencing data were generated from 81 paired left ventricle samples collected before and after a period of ischemia from patients undergoing cardiopulmonary bypass. Novel lncRNAs were validated with Oxford Nanopore Technologies long-read sequencing. Gene modules associated with an early ischemic response were identified and the subcellular location of selected lncRNAs was determined with RNAscope. A total of 2446 mRNAs, 270 annotated lncRNAs and one novel lncRNA differed in response to ischemia (adjusted p < 0.001, absolute fold change >1.2). The novel lncRNA belonged to a gene module of highly correlated genes that also included 39 annotated lncRNAs. This module was associated with ischemia (Pearson correlation coefficient = -0.69, p = 1 × 10⁻²³) and activation of cell death pathways (p < 6 × 10⁻⁹). A further nine novel cardiac lncRNAs were identified, one of which overlapped five cis-eQTL eSNPs for the gene RWD Domain-Containing Sumoylation Enhancer (RWDD3) and was itself correlated with RWDD3 expression (Pearson correlation coefficient = -0.2, p = 0.002). CONCLUSION: We have identified 10 novel lncRNAs, one of which was associated with myocardial ischemia and may have potential as a novel therapeutic target or early marker for myocardial dysfunction.
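The selection thresholds quoted above (adjusted p < 0.001, absolute fold change > 1.2) can be sketched as a small filter. The abstract does not name the multiple-testing correction, so the Benjamini-Hochberg procedure used here is an assumption, as are the function names, the record layout and the toy values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_transcripts(records, max_adj_p=0.001, min_abs_fold=1.2):
    """Keep transcript names that pass both thresholds.

    'fold' is assumed to be a positive post/pre ratio; a ratio below 1 is
    inverted so that down-regulation counts as an absolute fold change.
    """
    adj = benjamini_hochberg([r["p"] for r in records])
    kept = []
    for record, q in zip(records, adj):
        fold = record["fold"]
        abs_fold = fold if fold >= 1 else 1 / fold
        if q < max_adj_p and abs_fold > min_abs_fold:
            kept.append(record["name"])
    return kept


# Hypothetical toy records, not results from the study.
records = [
    {"name": "geneA", "p": 1e-6, "fold": 1.8},
    {"name": "geneB", "p": 2e-6, "fold": 1.1},   # fold change too small
    {"name": "geneC", "p": 1e-5, "fold": 0.6},   # down-regulated, passes
    {"name": "geneD", "p": 0.04, "fold": 2.5},   # adjusted p too large
]
print(select_transcripts(records))
```

Requiring both thresholds guards against transcripts that are statistically significant but barely change, and against large changes that are not reproducible across the paired samples.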
Subjects
Myocardial Ischemia/genetics; Myocardial Ischemia/metabolism; RNA, Long Noncoding/metabolism; Databases, Genetic; Gene Expression Regulation; Gene Regulatory Networks; Heart Ventricles/metabolism; High-Throughput Nucleotide Sequencing; Humans; Myocardium/metabolism; RNA, Messenger/metabolism; Sequence Analysis, RNA
ABSTRACT
Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly 'housekeeping', whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.
Subjects
Atlases as Topic; Molecular Sequence Annotation; Promoter Regions, Genetic/genetics; Transcriptome/genetics; Animals; Cell Line; Cells, Cultured; Cluster Analysis; Conserved Sequence/genetics; Gene Expression Regulation/genetics; Gene Regulatory Networks/genetics; Genes, Essential/genetics; Genome/genetics; Humans; Mice; Open Reading Frames/genetics; Organ Specificity; RNA, Messenger/analysis; RNA, Messenger/genetics; Transcription Factors/metabolism; Transcription Initiation Site; Transcription, Genetic/genetics
ABSTRACT
BACKGROUND: Tuberculosis is a life-threatening infectious disease caused by Mycobacterium tuberculosis (M.tb). M.tb subverts host immune responses to build a favourable niche and survive inside host macrophages. Macrophages can control or eliminate the infection if they acquire appropriate functional phenotypes. Transcriptional regulation is a key process that governs the activation and maintenance of these phenotypes. Among the factors orchestrating transcriptional regulation during M.tb infection, transcriptional enhancers remain unexplored. RESULTS: We analysed transcribed enhancers in M.tb-infected mouse bone marrow-derived macrophages. We established a link between known M.tb-responsive transcription factors and transcriptional activation of enhancers and their target genes. Our data suggest that enhancers might drive macrophage response via transcriptional activation of key immune genes, such as Tnf, Tnfrsf1b, Irg1, Hilpda, Ccl3, and Ccl4. We report enhancers acquiring transcription de novo upon infection. Finally, we link highly transcriptionally induced enhancers to activation of genes with previously unappreciated roles in M.tb infection, such as Fbxl3, Tapt1, Edn1, and Hivep1. CONCLUSIONS: Our findings suggest the importance of macrophage host transcriptional enhancers during M.tb infection. Our study extends current knowledge of the regulation of macrophage responses to M.tb infection and provides a basis for future functional studies on enhancer-gene interactions in this process.
Subjects
Enhancer Elements, Genetic; Gene Expression Regulation; Macrophages/immunology; Mycobacterium tuberculosis/physiology; Animals; Binding Sites; Macrophages/metabolism; Macrophages/microbiology; Mice, Inbred BALB C; Transcription Factors/metabolism; Transcription, Genetic
ABSTRACT
Mammalian diversification has coincided with a rapid proliferation of various types of noncoding RNAs, including members of both snRNAs and snoRNAs. The significance of this expansion however remains obscure. While some ncRNA copy-number expansions have been linked to functionally tractable effects, such events may equally likely be neutral, perhaps as a result of random retrotransposition. Hindering progress in our understanding of such observations is the difficulty in establishing function for the diverse features that have been identified in our own genome. Projects such as ENCODE and FANTOM have revealed a hidden world of genomic expression patterns, as well as a host of other potential indicators of biological function. However, such projects have been criticized, particularly from practitioners in the field of molecular evolution, where many suspect these data provide limited insight into biological function. The molecular evolution community has largely taken a skeptical view, thus it is important to establish tests of function. We use a range of data, including data drawn from ENCODE and FANTOM, to examine the case for function for the recent copy number expansion in mammals of six evolutionarily ancient RNA families involved in splicing and rRNA maturation. We use several criteria to assess evidence for function: conservation of sequence and structure, genomic synteny, evidence for transposition, and evidence for species-specific expression. Applying these criteria, we find that only a minority of loci show strong evidence for function and that, for the majority, we cannot reject the null hypothesis of no function.
Subjects
DNA Transposable Elements; Gene Dosage; Gene Expression; Mammals/genetics; RNA, Small Nuclear; Animals; Databases, Genetic; Evolution, Molecular; Genomics; Multigene Family; RNA Splicing
ABSTRACT
Mammalian genomes contain a number of duplicated genes, and sequence identity between these duplicates can be maintained by purifying selection. However, between-duplicate recombination can also maintain sequence identity between copies, resulting in a pattern known as concerted evolution where within-genome repeats are more similar to each other than to orthologous repeats in related species. Here we investigated the tandemly-repeated keratin-associated protein 1 (KAP1) gene family, KRTAP1, which encodes proteins that are important components of hair and wool in mammals. Comparison of eutherian mammal KRTAP1 gene repeats within and between species shows a strong pattern of concerted evolution. However, in striking contrast to the coding regions of these genes, we find that the flanking regions have a divergent pattern of evolution. This contrast in evolutionary pattern transitions abruptly near the start and stop codons of the KRTAP1 genes. We reveal that this difference in evolutionary patterns is not explained by conventional purifying selection, nor is it likely a consequence of codon adaptation or reverse transcription of KRTAP1-n mRNA. Instead, the evidence suggests that these contrasting patterns result from short-tract gene conversion events that are biased to the KRTAP1 coding region by selection and/or differential sequence divergence. This work demonstrates the power that gene conversion has to finely shape the evolution of repetitive genes, and provides another distinctive pattern of contrasting evolutionary outcomes that results from gene conversion. A greater emphasis on exploring the evolution of multi-gene eukaryotic families will reveal how common different contrasting evolutionary patterns are in gene duplicates.
Subjects
Evolution, Molecular; Keratins/genetics; Mammals/genetics; Open Reading Frames/genetics; Animals; Base Sequence; Codon/genetics; DNA, Intergenic/genetics; Gene Conversion; Keratins/metabolism; Phylogeny; Polymorphism, Genetic; RNA, Messenger/genetics; RNA, Messenger/metabolism; Selection, Genetic; Sheep/genetics; Tandem Repeat Sequences/genetics
ABSTRACT
BACKGROUND: Post-surgical staging is the mainstay of prognostic stratification for colorectal cancer (CRC). Here, we compare TNM staging to consensus molecular subtyping (CMS) and assess the value of subtyping in addition to stratification by TNM. METHODS: Three hundred and eight treatment-naïve colorectal tumours were accessed from our institutional tissue bank. CMS typing was carried out using tumour gene-expression data. Post-surgical TNM-staging and CMS were analysed with respect to clinicopathologic variables and patient outcome. RESULTS: CMS alone was not associated with survival, while TNM stage significantly explained mortality. Addition of CMS to TNM-stratified tumours showed a prognostic effect in stage 2 tumours; CMS3 tumours had a significantly lower overall survival (P = 0.006). Stage 2 patients with a good prognosis showed immune activation and up-regulation of tumour suppressor genes. CONCLUSIONS: Although stratification using CMS does not outperform TNM staging as a prognostic indicator, gene-expression based subtyping shows promise for improved prognostication in stage 2 CRC.
Subjects
Biomarkers, Tumor; Colorectal Neoplasms/diagnosis; Colorectal Neoplasms/etiology; Adult; Aged; Aged, 80 and over; Biomarkers, Tumor/genetics; Colorectal Neoplasms/mortality; Colorectal Neoplasms/surgery; Computational Biology/methods; Female; Gene Expression Profiling/methods; Humans; Kaplan-Meier Estimate; Male; Middle Aged; Neoplasm Staging; Postoperative Care; Prognosis; Proportional Hazards Models; Recurrence; Transcriptome
ABSTRACT
Transcription factors (TFs) play a pivotal role in transcriptional regulation, making them crucial for cell survival and important biological functions. For the regulation of transcription, interactions between TFs and other regulatory proteins known as transcription co-factors (TcoFs) are essential in forming the necessary protein complexes. Although TcoFs themselves do not bind DNA directly, their indirect influence on transcriptional regulation and initiation has been shown to be significant, with the functionality of TFs strongly influenced by the presence of TcoFs. In the TcoF-DB v2 database, we collect information on TcoFs. In this article, we describe updates and improvements implemented in TcoF-DB v2. TcoF-DB v2 provides several new features that enable exploration of the roles of TcoFs. The content of the database has significantly expanded, and is enriched with information from Gene Ontology, biological pathways, diseases and molecular signatures. TcoF-DB v2 now includes many more TFs, has substantially increased the number of human TcoFs to 958, and now includes information on mouse (418 new TcoFs). TcoF-DB v2 enables the exploration of information on TcoFs and allows investigations into their influence on transcriptional regulation in humans and mice. TcoF-DB v2 can be accessed at http://tcofdb.org/.
Subjects
Carrier Proteins; Databases, Genetic; Gene Expression Regulation; Transcription Factors; Animals; Carrier Proteins/metabolism; Humans; Mice; Protein Binding; Transcription Factors/metabolism
ABSTRACT
Noncoding RNAs (ncRNAs), particularly microRNAs (miRNAs) and long ncRNAs (lncRNAs), are important players in diseases and are emerging as novel drug targets. Thus, unraveling the relationships between ncRNAs and other biomedical entities in cells is critical for better understanding ncRNA roles, which may eventually help develop their use in medicine. To support ncRNA research and facilitate retrieval of relevant information regarding miRNAs and lncRNAs from the plethora of published ncRNA-related research, we developed DES-ncRNA ( www.cbrc.kaust.edu.sa/des_ncrna ). DES-ncRNA is a knowledgebase containing text- and data-mined information from public scientific literature and other public resources. Exploration of mined information is enabled through terms and pairs of terms from 19 topic-specific dictionaries including, for example, antibiotics, toxins, drugs, enzymes, mutations, pathways, human genes and proteins, drug indications and side effects, and diseases. DES-ncRNA contains approximately 878,000 associations of terms from these dictionaries, of which 36,222 concern miRNAs and 5,373 concern lncRNAs. We provide users with several ways to explore information regarding ncRNAs, including controlled generation of association networks as well as hypothesis generation. We show an example of how DES-ncRNA can aid research on Alzheimer disease and suggest a potential therapeutic role for Fasudil. DES-ncRNA is a powerful tool that can be used on its own or as a complement to the existing resources, to support research in human ncRNA. To our knowledge, this is the only knowledgebase dedicated to human miRNAs and lncRNAs derived primarily through literature mining that enables exploration of a broad spectrum of associated biomedical entities, not paralleled by any other resource.
Subjects
Data Mining; Knowledge Bases; MicroRNAs/genetics; RNA, Long Noncoding/genetics; Software; 1-(5-Isoquinolinesulfonyl)-2-Methylpiperazine/analogs & derivatives; 1-(5-Isoquinolinesulfonyl)-2-Methylpiperazine/therapeutic use; Alzheimer Disease/drug therapy; Alzheimer Disease/genetics; Dictionaries as Topic; Disease Progression; Gene Ontology; Humans; MicroRNAs/metabolism; RNA, Long Noncoding/metabolism
ABSTRACT
Basic leucine zipper transcription factor Batf2 is poorly described, whereas Batf and Batf3 have been shown to play essential roles in dendritic cell, T cell, and B cell development and regulation. Batf2 was drastically induced in IFN-γ-activated classical macrophages (M1) compared with unstimulated or IL-4-activated alternative macrophages (M2). Batf2 knockdown experiments from IFN-γ-activated macrophages and subsequent expression profiling demonstrated important roles for regulation of immune responses, inducing inflammatory and host-protective genes Tnf, Ccl5, and Nos2. Mycobacterium tuberculosis (Beijing strain HN878)-infected macrophages further induced Batf2 and augmented host-protective Batf2-dependent genes, particularly in M1, whose mechanism was suggested to be mediated through both TLR2 and TLR4 by LPS and heat-killed HN878 (HKTB) stimulation experiments. Irf1 binding motif was enriched in the promoters of Batf2-regulated genes. Coimmunoprecipitation study demonstrated Batf2 association with Irf1. Furthermore, Irf1 knockdown showed downregulation of IFN-γ- or LPS/HKTB-activated host-protective genes Tnf, Ccl5, Il12b, and Nos2. Conclusively, Batf2 is an activation marker gene for M1 involved in gene regulation of IFN-γ-activated classical macrophages, as well as LPS/HKTB-induced macrophage stimulation, possibly by Batf2/Irf1 gene induction. Taken together, these results underline the role of Batf2/Irf1 in inducing inflammatory responses in M. tuberculosis infection.
Subjects
Basic-Leucine Zipper Transcription Factors/genetics; Interferon Regulatory Factor-1/genetics; Macrophages/immunology; Macrophages/metabolism; Mycobacterium Infections/genetics; Mycobacterium Infections/immunology; Mycobacterium/immunology; Animals; Basic-Leucine Zipper Transcription Factors/metabolism; Cluster Analysis; Disease Models, Animal; Gene Expression; Gene Expression Profiling; Gene Expression Regulation/drug effects; Gene Knockdown Techniques; Interferon Regulatory Factor-1/metabolism; Interferon-gamma/pharmacology; Lipopolysaccharides/immunology; Macrophage Activation/immunology; Male; Mice; Mycobacterium Infections/metabolism; Nitric Oxide Synthase Type II/genetics; Nitric Oxide Synthase Type II/metabolism; Protein Binding; Tumor Necrosis Factors/genetics; Tumor Necrosis Factors/metabolism
ABSTRACT
Classically or alternatively activated macrophages (M1 and M2, respectively) play distinct and important roles in microbicidal activity, regulation of inflammation and tissue homeostasis. Despite this, their transcriptional regulatory dynamics are poorly understood. Using promoter-level expression profiling by non-biased deepCAGE, we have studied the transcriptional dynamics of classically and alternatively activated macrophages. Transcription factor (TF) binding motif activity analysis revealed four motifs, NFKB1_REL_RELA, IRF1,2, IRF7 and TBP, that are commonly activated but have distinct activity dynamics in M1 and M2 activation. We observe matching changes in the expression profiles of the corresponding TFs and show that only a restricted set of TFs change expression. There is an overall drastic and transient up-regulation in M1 and a weaker and more sustained up-regulation in M2. Novel TFs, such as Thap6 and Maff (M1) and Hivep1, Nfil3 and Prdm1 (M2), among others, were suggested to be involved in the activation processes. Additionally, 52 (M1) and 67 (M2) novel differentially expressed genes and, for the first time, several differentially expressed long non-coding RNA (lncRNA) transcriptome markers were identified. In conclusion, the finding of novel motifs, TFs and protein-coding and lncRNA genes is an important step towards fully understanding the transcriptional machinery of macrophage activation.
Subjects
Gene Expression Regulation; Macrophage Activation/genetics; Macrophages/metabolism; Transcriptome; Animals; Cells, Cultured; DNA/chemistry; Gene Expression Profiling; High-Throughput Nucleotide Sequencing; Interferon-gamma/pharmacology; Interleukin-13/pharmacology; Interleukin-4/pharmacology; Macrophages/drug effects; Male; Mice, Inbred BALB C; Nucleotide Motifs; Promoter Regions, Genetic; Sequence Analysis, DNA; Transcription Factors/metabolism
ABSTRACT
The initiation and regulation of transcription in eukaryotes is complex and involves a large number of transcription factors (TFs), which are known to bind to the regulatory regions of eukaryotic DNA. Apart from TF-DNA binding, protein-protein interaction involving TFs is an essential component of the machinery facilitating transcriptional regulation. We consider proteins that interact with TFs in the context of transcription regulation, but do not themselves bind DNA, to be transcription co-factors (TcoFs). The influence of TcoFs on transcriptional regulation and initiation, although indirect, has been shown to be significant, with the functionality of TFs strongly influenced by the presence of TcoFs. While the role of TFs and their interaction with regulatory DNA regions has been well studied, the association between TFs and TcoFs has so far been given less attention. Here, we present a resource that comprises a collection of human TFs and the TcoFs with which they interact. Other proteins that have a proven interaction with a TF, but are not considered TcoFs, are also included. Our database contains 157 high-confidence TcoFs and an additional 379 hypothetical TcoFs. These have been identified and classified according to the type of available evidence for their involvement in transcriptional regulation and their presence in the cell nucleus. We have divided TcoFs into four groups, one of which contains high-confidence TcoFs, while the three others contain TcoFs that are hypothetical to different extents. We have developed the Dragon Database for Human Transcription Co-Factors and Transcription Factor Interacting Proteins (TcoF-DB). A web-based interface for this resource can be freely accessed at http://cbrc.kaust.edu.sa/tcof/ and http://apps.sanbi.ac.za/tcof/.
Subjects
Databases, Protein; Transcription Factors/metabolism; Transcription, Genetic; Carrier Proteins/metabolism; Humans; Protein Interaction Mapping
ABSTRACT
Prostate cancer (PC) is one of the most commonly diagnosed cancers in men. PC is relatively difficult to diagnose due to a lack of clear early symptoms. Extensive research has produced a large amount of data on PC. Several hundred genes are implicated in different stages of PC, which may help in developing diagnostic methods or even cures. In spite of this accumulated information, effective diagnostics and treatments remain elusive. We have developed the Dragon Database of Genes associated with Prostate Cancer (DDPC) as an integrated knowledgebase of genes experimentally verified as implicated in PC. DDPC is distinct from other databases in that (i) it provides pre-compiled biomedical text-mining information on PC, which would otherwise require tedious computational analyses, (ii) it integrates data on molecular interactions, pathways, gene ontologies, gene regulation at the molecular level, predicted transcription factor binding sites on promoters of PC-implicated genes and transcription factors that correspond to these binding sites and (iii) it contains DrugBank data on drugs associated with PC. We believe this resource will serve as a source of useful information for research on PC. DDPC is freely accessible for academic and non-profit users via http://apps.sanbi.ac.za/ddpc/ and http://cbrc.kaust.edu.sa/ddpc/.
Subjects
Databases, Genetic; Genes, Neoplasm; Prostatic Neoplasms/genetics; Data Mining; Humans; Knowledge Bases; Male
ABSTRACT
The identification of functional processes taking place in microbiome communities augments traditional microbiome taxonomic studies, giving a more complete picture of interactions taking place within the community. While there are applications that perform functional annotation on metagenomes or metatranscriptomes, very few of these are able to link taxonomic identity to function, and many are limited by their input types or the databases used. Here we present MetaFunc, a workflow which takes RNA sequences as input reads, and from these (1) identifies species present in the microbiome sample and (2) provides gene ontology annotations associated with the species identified. In addition, MetaFunc allows for host gene analysis, mapping the reads to a host genome, and separating these reads, prior to microbiome analyses. Differential abundance analysis for microbe taxonomies, and differential gene expression analysis and gene set enrichment analysis may then be carried out through the pipeline. A final correlation analysis between microbial species and host genes can also be performed. Finally, MetaFunc builds an R Shiny application that allows users to view and interact with the microbiome results. In this paper, we show how MetaFunc can be applied to metatranscriptomic datasets of colorectal cancer.
ABSTRACT
Colorectal cancer (CRC) is a leading cause of morbidity and mortality worldwide. The majority of CRC deaths are caused by tumor metastasis, even following treatment. There is strong evidence for epigenetic changes, such as DNA methylation, accompanying CRC metastasis and poorer patient survival. Earlier detection and a better understanding of molecular drivers for CRC metastasis are of critical clinical importance. Here, we identify a signature of advanced CRC metastasis by performing whole genome-scale DNA methylation and full transcriptome analyses of paired primary cancers and liver metastases from CRC patients. We observed striking methylation differences between primary and metastatic pairs. A subset of loci showed coordinated methylation-expression changes, suggesting these are potentially epigenetic drivers that control the expression of critical genes in the metastatic cascade. The identification of CRC epigenomic markers of metastasis has the potential to enable better outcome prediction and lead to the discovery of new therapeutic targets.
ABSTRACT
Advances in RNA sequencing (RNA-Seq) have facilitated transcriptomic analysis of plasma for the discovery of new diagnostic and prognostic markers for disease. We aimed to develop a short-read RNA-Seq protocol to detect mRNAs, long non-coding RNAs (lncRNAs) and circular RNAs (circRNAs) in plasma for the discovery of novel markers for coronary artery disease (CAD) and heart failure (HF). Circulating cell-free RNA from 59 patients with stable CAD (half of whom developed HF within 3 years) and 30 controls was sequenced to a median depth of 10⁸ paired reads per sample. We identified fragments from 3986 mRNAs, 164 lncRNAs, 405 putative novel lncRNAs and 227 circRNAs in plasma. Circulating levels of 160 mRNAs, 10 lncRNAs and 2 putative novel lncRNAs were altered in patients compared with controls (absolute fold change >1.2, p < 0.01 adjusted for multiple comparisons). The most differentially abundant transcripts were enriched in mRNAs encoded by the mitochondrial genome. We did not detect any differences in the plasma RNA profile between patients who developed HF and those who did not. In summary, we show that mRNAs, lncRNAs and circRNAs can be reliably detected in plasma by deep RNA-Seq. Multiple coding and non-coding transcripts were altered in association with CAD, including several mitochondrial mRNAs, which may indicate underlying myocardial ischaemia and oxidative stress. If validated, circulating levels of these transcripts could potentially be used to help identify asymptomatic individuals with established CAD prior to an acute coronary event.