Results 1 - 12 of 12
1.
Int J Mol Sci ; 22(20)2021 Oct 15.
Article in English | MEDLINE | ID: mdl-34681774

ABSTRACT

Genetic interactions (GIs), such as the synthetic lethal interaction, are promising therapeutic targets in precision medicine. However, despite extensive efforts to characterize GIs by large-scale perturbation screening, considerable false positives have been reported in multiple studies. We propose a new computational approach for improved precision in GI identification by applying constraints that consider actual biological phenomena. In this study, GIs were characterized by assessing mutation, loss of function, and expression profiles in the DEPMAP database. The expression profiles were used to exclude loss-of-function data for nonexpressed genes in GI characterization. More importantly, the characterized GIs were refined based on Kyoto Encyclopedia of Genes and Genomes (KEGG) or protein-protein interaction (PPI) networks, under the assumption that genes genetically interacting with a certain mutated gene are adjacent in the networks. As a result, the initial GIs characterized with CRISPR and RNAi screenings were refined to 65 and 23 GIs based on KEGG networks and to 183 and 142 GIs based on PPI networks. The evaluation of refined GIs showed improved precision with respect to known synthetic lethal interactions. The refining process also yielded a synthetic partner network (SPN) for each mutated gene, which provides insight into therapeutic strategies for the mutated genes; specifically, exploring the SPN of mutated BRAF revealed ELAVL1 as a potential target for treating BRAF-mutated cancer, as validated by previous research. We expect that this work will advance cancer therapeutic research.


Subjects
Gene Regulatory Networks/physiology , Neoplasms/genetics , Protein Interaction Maps/genetics , Cell Line, Tumor , Computational Biology/methods , Epistasis, Genetic/physiology , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Genes, Neoplasm/genetics , Humans , Loss of Function Mutation , Mutation , Transcriptome
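
The network-based refinement described in the abstract above can be illustrated with a minimal Python sketch. It assumes that "adjacent" means a direct neighbor in an undirected network; the function names and the toy gene pairs are hypothetical and are not taken from the study's code or data.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (gene_a, gene_b) edges."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def refine_interactions(candidates, adjacency):
    """Keep candidate (mutated_gene, partner) pairs whose partner is a
    direct network neighbor of the mutated gene (assumed definition)."""
    return [
        (mutated, partner)
        for mutated, partner in candidates
        if partner in adjacency.get(mutated, set())
    ]


# Hypothetical toy network and candidate list, for illustration only.
toy_edges = [("BRAF", "MAP2K1"), ("MAP2K1", "MAPK1"), ("TP53", "MDM2")]
toy_candidates = [("BRAF", "MAP2K1"), ("BRAF", "MDM2"), ("TP53", "MDM2")]

refined = refine_interactions(toy_candidates, build_adjacency(toy_edges))
print(refined)
```

In this toy case the pair (BRAF, MDM2) is dropped because the two genes share no edge, which mirrors the idea of discarding candidates that are not supported by the network.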
2.
Bioinformatics ; 37(11): 1544-1553, 2021 Jul 12.
Article in English | MEDLINE | ID: mdl-31070735

ABSTRACT

BACKGROUND: MicroRNAs, small noncoding RNAs, are conserved in many species, and they are key regulators that mediate post-transcriptional gene silencing. Since biologists cannot perform experiments for each of the target genes of thousands of microRNAs in numerous specific conditions, prediction of microRNA target genes has been extensively investigated. A general framework is a two-step process of selecting target candidates based on sequence and binding energy features and then predicting targets based on the negative correlation between microRNAs and their targets. However, few methods are designed for target prediction using time-series gene expression data. RESULTS: In this article, we propose a new pipeline, mirTime, that predicts microRNA targets by integrating sequence features and time-series expression profiles in a specific experimental condition. The most important feature of mirTime is that it uses a Gaussian process regression model to estimate expression values at unobserved or unpaired time points. In experiments with two datasets in different experimental conditions and cell types, condition-specific target modules reported in the original papers were successfully predicted with our pipeline. The context specificity of target modules was assessed with three (correlation-based, target gene-based and network-based) evaluation criteria. mirTime showed better performance than existing expression-based microRNA target prediction methods in all three criteria. AVAILABILITY AND IMPLEMENTATION: mirTime is available at https://github.com/mirTime/mirtime. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
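
The idea of using Gaussian process regression to align unpaired time points can be sketched as follows. This is not the mirTime implementation: the RBF kernel, the length scale, the noise term, and all expression values are assumptions chosen for illustration. The sketch estimates a microRNA profile at the time points where an mRNA was measured and then computes their correlation.

```python
import numpy as np


def rbf_kernel(x1, x2, length_scale=1.0):
    """Squared-exponential (RBF) kernel between two 1-D arrays of time points."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_predict(t_obs, y_obs, t_new, length_scale=1.0, noise=1e-6):
    """Posterior mean of a GP (constant mean, RBF kernel) at the times t_new."""
    t_obs = np.asarray(t_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    t_new = np.asarray(t_new, dtype=float)
    mean = y_obs.mean()
    k_obs = rbf_kernel(t_obs, t_obs, length_scale) + noise * np.eye(len(t_obs))
    k_new = rbf_kernel(t_new, t_obs, length_scale)
    weights = np.linalg.solve(k_obs, y_obs - mean)
    return mean + k_new @ weights


# Hypothetical profiles measured at different (unpaired) time points, in hours.
mirna_times, mirna_expr = [0, 2, 4, 8], [1.0, 2.0, 3.5, 3.0]
mrna_times, mrna_expr = [0, 3, 6, 8], [5.0, 3.5, 2.0, 2.5]

# Estimate the microRNA profile at the mRNA time points, then correlate.
mirna_at_mrna_times = gp_predict(mirna_times, mirna_expr, mrna_times, length_scale=2.0)
correlation = np.corrcoef(mirna_at_mrna_times, mrna_expr)[0, 1]
print(mirna_at_mrna_times, correlation)
```

A strongly negative correlation would be consistent with the second step of the general framework described in the abstract; how such a score is thresholded is left open here.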

3.
Sci Rep ; 10(1): 18582, 2020 10 29.
Article in English | MEDLINE | ID: mdl-33122739

ABSTRACT

Human pluripotent stem cells (hPSCs) have promising therapeutic applications due to their infinite capacity for self-renewal and pluripotency. Genomic stability is imperative for the clinical use of hPSCs; however, copy number variation (CNV), especially recurrent CNV at 20q11.21, may contribute to the genomic instability of hPSCs. Furthermore, the effects of CNVs in hPSCs at the whole-transcriptome scale are poorly understood. This study aimed to examine the functional in vivo and in vitro effects of frequently detected CNVs at 20q11.21 during early-stage differentiation of hPSCs. Comprehensive transcriptome profiling of abnormal hPSCs revealed that the differential gene expression patterns had a negative effect on differentiation potential. Transcriptional heterogeneity identified by single-cell RNA sequencing (scRNA-seq) of embryoid bodies from two different isogenic lines of hPSCs revealed alterations in differentiated cell distributions compared with those of normal cells. RNA-seq analysis of 22 teratomas identified several differentially expressed lineage-specific markers in hPSCs with CNVs, consistent with the histological results of the altered ecto/meso/endodermal ratio due to CNVs. Our results suggest that CNV amplification contributes to cell proliferation, apoptosis, and cell fate specification. This work shows the functional consequences of recurrent genetic abnormalities and thereby provides evidence to support the development of cell-based applications.


Subjects
Biomarkers, Tumor/genetics , Cell Differentiation , Chromosome Aberrations , Chromosomes, Human, Pair 20/genetics , DNA Copy Number Variations , Pluripotent Stem Cells/pathology , Teratoma/pathology , Animals , Biomarkers, Tumor/metabolism , Cells, Cultured , Gene Expression Regulation, Neoplastic , Humans , In Vitro Techniques , Mice , Mice, Inbred NOD , Mice, SCID , Pluripotent Stem Cells/metabolism , Sequence Analysis, RNA , Teratoma/genetics , Teratoma/metabolism , Transcriptome
4.
Curr Biol ; 30(15): 2887-2900.e7, 2020 08 03.
Article in English | MEDLINE | ID: mdl-32531282

ABSTRACT

Cambium drives the lateral growth of stems and roots, contributing to diverse plant growth forms. The root crop is one of the outstanding examples of cambium-driven growth. To understand its molecular basis, we used radish to generate a compendium of root-tissue- and stage-specific transcriptomes from two contrasting inbred lines during root growth. Expression patterns of key cambium regulators and hormone signaling components were validated. Clustering and gene ontology (GO) enrichment analyses of radish datasets followed by a comparative analysis against the newly established Arabidopsis early cambium data revealed evolutionarily conserved stress-response transcription factors that may intimately control the cambium. Indeed, an in vivo network consisting of selected stress-response and cambium regulators indicated ERF-1 as a potential key checkpoint of cambial activities, explaining how cambium-driven growth is altered in response to environmental changes. The findings here provide valuable information about dynamic gene expression changes during cambium-driven root growth and have implications with regard to future engineering schemes, leading to better crop yields.


Subjects
Arabidopsis/growth & development , Arabidopsis/genetics , Cambium/genetics , Cambium/physiology , DNA-Binding Proteins/genetics , DNA-Binding Proteins/physiology , Gene Expression Regulation, Developmental/genetics , Gene Expression Regulation, Developmental/physiology , Gene Regulatory Networks/genetics , Gene Regulatory Networks/physiology , Genes, Plant/genetics , Genes, Plant/physiology , Plant Development/genetics , Plant Development/physiology , Plant Growth Regulators/physiology , Plant Physiological Phenomena/genetics , Plant Proteins/genetics , Plant Proteins/physiology , Plant Roots/growth & development , Raphanus/growth & development , Raphanus/genetics , Transcriptome/genetics , Arabidopsis Proteins , Environment , Transcriptome/physiology
5.
BMC Med Genomics ; 13(Suppl 3): 27, 2020 02 24.
Article in English | MEDLINE | ID: mdl-32093698

ABSTRACT

BACKGROUND: In cancer, mutations in DNA methylation modification genes play crucial roles in genome-wide epigenetic modifications, which lead to the activation or suppression of important genes, including tumor suppressor genes. Mutations in the epigenetic modifiers could affect enzyme activity, which would result in differences in genome-wide methylation profiles and in the activation of downstream genes. Therefore, we investigated the effect of mutations in DNA methylation modification genes such as DNMT1, DNMT3A, MBD1, MBD4, TET1, TET2 and TET3 through a pan-cancer analysis. METHODS: First, we investigated the effect of mutations in DNA methylation modification genes on genome-wide methylation profiles. We collected 3,644 samples that have both mRNA and methylation data from 12 major cancer types in The Cancer Genome Atlas (TCGA). The samples were divided into two groups according to the mutational signature. Differentially methylated regions (DMRs) that overlapped with promoter regions were selected using minfi, and differentially expressed genes (DEGs) were identified using EBSeq. By integrating the DMR and DEG results, we constructed comprehensive DNA methylome profiles on a pan-cancer scale. Second, we investigated the effect of DNA methylation in the promoter regions on downstream genes by comparing the two groups of samples in 11 cancer types. To investigate the effects of promoter methylation on downstream gene activation, we performed clustering analysis of DEGs. Among the DEGs, we selected highly correlated gene sets that had differentially methylated promoter regions using graph-based sub-network clustering methods. RESULTS: We chose an up-regulated DEG cluster with hypomethylated promoters in acute myeloid leukemia (LAML) and a down-regulated DEG cluster with hypermethylated promoters in colon adenocarcinoma (COAD). To rule out the effects of gene regulation by transcription factors (TFs), DEGs whose promoters were bound by differentially expressed TFs were not included in the gene set affected by DNA methylation modifiers. Consequently, we identified 54 up-regulated DEGs with hypomethylated promoter DMRs in LAML and 45 down-regulated DEGs with hypermethylated promoter DMRs in COAD. CONCLUSIONS: Our study on DNA methylation modification genes in mutated vs. non-mutated groups could provide useful insight into the epigenetic regulation of DEGs in cancer.


Subjects
DNA Methylation/genetics , DNA, Neoplasm/metabolism , Gene Expression Regulation, Neoplastic , Neoplasms/genetics , Epigenesis, Genetic , Epigenome , Genome, Human , Humans , Mutation , Promoter Regions, Genetic
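
The integration of DMR and DEG results, together with the TF-based exclusion step, can be sketched in Python. This is an illustrative simplification, not the study's pipeline (which relied on minfi and EBSeq): the input dictionaries, the function name, and the placeholder gene and TF names are all hypothetical.

```python
def methylation_driven_genes(deg_direction, promoter_dmr, tf_targets, de_tfs):
    """Select DEGs whose expression change matches their promoter methylation.

    deg_direction: gene -> "up" or "down"
    promoter_dmr:  gene -> "hypo" or "hyper" (promoter-overlapping DMRs only)
    tf_targets:    TF -> set of genes whose promoters the TF binds
    de_tfs:        set of differentially expressed TFs
    """
    # Genes that a differentially expressed TF could explain are excluded.
    tf_explained = set()
    for tf in de_tfs:
        tf_explained |= tf_targets.get(tf, set())

    # Hypomethylated promoters pair with up-regulation, hypermethylated with down.
    consistent = {("up", "hypo"), ("down", "hyper")}
    return sorted(
        gene
        for gene, direction in deg_direction.items()
        if (direction, promoter_dmr.get(gene)) in consistent
        and gene not in tf_explained
    )


# Hypothetical placeholder inputs.
deg_direction = {"GENE_A": "up", "GENE_B": "up", "GENE_C": "down", "GENE_D": "down"}
promoter_dmr = {"GENE_A": "hypo", "GENE_B": "hypo", "GENE_C": "hyper", "GENE_D": "hypo"}
tf_targets = {"TF_1": {"GENE_B"}}

selected = methylation_driven_genes(deg_direction, promoter_dmr, tf_targets, {"TF_1"})
print(selected)
```

Here GENE_B is dropped because a differentially expressed TF binds its promoter, and GENE_D is dropped because its down-regulation does not match a hypomethylated promoter.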
6.
BMC Genomics ; 20(Suppl 11): 949, 2019 Dec 20.
Article in English | MEDLINE | ID: mdl-31856731

ABSTRACT

BACKGROUND: Recently, a number of studies have been conducted to investigate how plants respond to stress at the cellular and molecular levels by measuring gene expression profiles over time. As a result, a set of time-series gene expression data for the stress response is available in databases. With the data, an integrated analysis of multiple stresses is possible, which identifies stress-responsive genes with higher specificity because considering multiple stresses can capture the effect of interference between stresses. To analyze such data, a machine learning model needs to be built. RESULTS: In this study, we developed StressGenePred, a neural network-based machine learning method, to integrate time-series transcriptome data of multiple stress types. StressGenePred is designed to detect single stress-specific biomarker genes by using a simple feature embedding method, a twin neural network model, and Confident Multiple Choice Learning (CMCL) loss. The twin neural network model consists of a biomarker gene discovery model and a stress type prediction model that share the same logical layer to reduce training complexity. The CMCL loss is used to make the twin model select biomarker genes that respond specifically to a single stress. In experiments using Arabidopsis gene expression data for four major environmental stresses (heat, cold, salt, and drought), StressGenePred classified the types of stress more accurately than the limma feature embedding method and the support vector machine and random forest classification methods. In addition, StressGenePred discovered known stress-related genes with higher specificity than the Fisher method. CONCLUSIONS: StressGenePred is a machine learning method for identifying stress-related genes and predicting stress types for an integrated analysis of multiple stress time-series transcriptome data. This method can be applied to other phenotype-gene association studies.


Subjects
Arabidopsis/genetics , Genes, Plant/genetics , Models, Biological , Neural Networks, Computer , Stress, Physiological/genetics , Computational Biology , Gene Expression Profiling , Genetic Association Studies , Machine Learning , Phenotype , Transcriptome
7.
BMC Bioinformatics ; 20(Suppl 16): 588, 2019 Dec 02.
Article in English | MEDLINE | ID: mdl-31787073

ABSTRACT

BACKGROUND: Integrated analysis that uses multiple sample gene expression data measured under the same stress can detect stress response genes more accurately than analysis of individual sample data. However, the integrated analysis is challenging since experimental conditions (strength of stress and the number of time points) are heterogeneous across multiple samples. RESULTS: HTRgene is a computational method to perform the integrated analysis of multiple heterogeneous time-series data measured under the same stress condition. The goal of HTRgene is to identify "response order preserving DEGs", defined as genes that are not only differentially expressed but whose response order is also preserved across multiple samples. The utility of HTRgene was demonstrated using 28 and 24 time-series sample gene expression data measured under cold and heat stress in Arabidopsis. HTRgene analysis successfully reproduced known biological mechanisms of cold and heat stress in Arabidopsis. Also, HTRgene showed higher accuracy in detecting the documented stress response genes than existing tools. CONCLUSIONS: HTRgene, a method to find the ordering of response time of genes that are commonly observed among multiple time-series samples, successfully integrated multiple heterogeneous time-series gene expression datasets. It can be applied to many research problems related to the integration of time series data analysis.


Subjects
Algorithms , Arabidopsis/genetics , Arabidopsis/physiology , Cold Temperature , Computational Biology/methods , Genes, Plant , Heat-Shock Response/genetics , Signal Transduction/genetics , Databases, Genetic , Gene Expression Profiling , Gene Expression Regulation, Plant , Gene Regulatory Networks , Time Factors , Transcription Factors/metabolism
8.
Front Plant Sci ; 10: 698, 2019.
Article in English | MEDLINE | ID: mdl-31258543

ABSTRACT

Transcription factors (TFs) have a significant influence on the state of a cell by regulating multiple downstream genes. Thus, experimental and computational biologists have made great efforts to construct TF gene networks for regulatory interactions between TFs and their target genes. Now, an important research question is how to utilize TF networks to investigate the response of a plant to stress at the transcription control level using time-series transcriptome data. In this article, we present a new computational network, PropaNet, to investigate dynamics of TF networks from time-series transcriptome data using two state-of-the-art network analysis techniques, influence maximization and network propagation. PropaNet uses the influence maximization technique to produce a list of TFs ranked by how well each TF explains the differentially expressed genes (DEGs) at each time point. Then, a network propagation technique is used to select a group of TFs that explains DEGs best as a whole. For the analysis of Arabidopsis time series datasets from AtGenExpress, we used PlantRegMap as a template TF network and performed PropaNet analysis to investigate transcriptional dynamics of Arabidopsis under cold and heat stress. The time-varying TF networks showed that Arabidopsis responded to cold and heat stress quite differently. For cold stress, bHLH and bZIP type TFs were the first responding TFs and the cold signal influenced histone variants, various genes involved in cell architecture, osmosis and restructuring of cells. However, the consequences for plants under heat stress were up-regulation of genes related to accelerating differentiation and starting re-differentiation. In terms of energy metabolism, plants under heat stress show elevated metabolic processes, resulting in an exhausted state. We believe that PropaNet will be useful for the construction of condition-specific time-varying TF networks for time-series data analysis in response to stress. PropaNet is available at http://biohealth.snu.ac.kr/software/PropaNet.
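
Network propagation, one of the two techniques named in the abstract above, is commonly realized as a random walk with restart. The sketch below shows that generic technique on a toy undirected network; it is not PropaNet's actual procedure, and the node names, restart probability, and iteration count are assumptions.

```python
def random_walk_with_restart(adjacency, seed_scores, restart=0.5, iterations=100):
    """Propagate seed scores over a network by random walk with restart.

    adjacency:   node -> list of neighbor nodes (every neighbor must be a key)
    seed_scores: node -> non-negative starting score (e.g. a DEG-based weight)
    """
    nodes = sorted(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one seed node must have a positive score")
    start = {n: seed_scores.get(n, 0.0) / total for n in nodes}

    scores = dict(start)
    for _ in range(iterations):
        updated = {n: restart * start[n] for n in nodes}
        for n in nodes:
            flow = (1.0 - restart) * scores[n]
            neighbors = adjacency[n]
            if neighbors:
                # Split the outgoing flow evenly among neighbors.
                for m in neighbors:
                    updated[m] += flow / len(neighbors)
            else:
                # Isolated node: return its flow to the seed distribution.
                for m in nodes:
                    updated[m] += flow * start[m]
        scores = updated
    return scores


# Hypothetical four-node chain: TF_A - TF_B - TF_C - TF_D, seeded at TF_A.
toy_network = {
    "TF_A": ["TF_B"],
    "TF_B": ["TF_A", "TF_C"],
    "TF_C": ["TF_B", "TF_D"],
    "TF_D": ["TF_C"],
}
scores = random_walk_with_restart(toy_network, {"TF_A": 1.0})
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Nodes closer to the seed receive higher scores, which is the property that lets propagation scores rank candidate regulators by their proximity to the observed signal.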

9.
Front Plant Sci ; 8: 1044, 2017.
Article in English | MEDLINE | ID: mdl-28663756

ABSTRACT

This study was designed to investigate at the molecular level how a transgenic version of rice "Nipponbare" obtained a drought-resistant phenotype. Using multi-omics sequencing data, we compared wild-type rice (WT) and a transgenic version (erf71) that had obtained a drought-resistant phenotype by overexpressing OsERF71, a member of the AP2/ERF transcription factor (TF) family. A comprehensive bioinformatics analysis pipeline, including TF networks and a cascade tree, was developed for the analysis of multi-omics data. The results of the analysis showed that the presence of OsERF71 at the source of the network controlled global gene expression levels in a specific manner to make erf71 survive longer than WT. Our analysis of the time-series transcriptome data suggests that erf71 diverted more energy to survival-critical mechanisms related to translation, oxidative response, and DNA replication, while further suppressing energy-consuming mechanisms, such as photosynthesis. To support this hypothesis further, we measured the net photosynthesis level under physiological conditions, which confirmed the further suppression of photosynthesis in erf71. In summary, our work presents a comprehensive snapshot of transcriptional modification in transgenic rice and shows how this induced the plants to acquire a drought-resistant phenotype.

10.
Bioinformatics ; 33(23): 3827-3835, 2017 Dec 01.
Article in English | MEDLINE | ID: mdl-28096084

ABSTRACT

MOTIVATION: Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting three-dimensional data, i.e. gene-time-condition. The computational complexity of analyzing such data is very high compared to the already difficult NP-hard two-dimensional biclustering problem. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression patterns in two sample conditions. RESULTS: We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools, and only TimesVector successfully detected clusters with differential expression patterns across conditions. AVAILABILITY AND IMPLEMENTATION: The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/. CONTACT: sunkim.bioinfo@snu.ac.kr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Cluster Analysis , Gene Expression Profiling/methods , Phenotype , Transcriptome , High-Throughput Nucleotide Sequencing , Oligonucleotide Array Sequence Analysis , Reproducibility of Results , Software , Time Factors
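
The "time-condition concatenated vectors" from step (i) of the abstract above can be illustrated with a small Python sketch. It covers only the concatenation and a cosine-similarity comparison; the similarity measure, the data layout, and the toy values are assumptions, and the actual dimension reduction and clustering used by TimesVector are not reproduced here.

```python
import math


def concatenate_conditions(expression):
    """Flatten gene -> {condition: [values over time]} into one vector per gene.

    Conditions are concatenated in sorted order so vectors are comparable.
    """
    return {
        gene: [value for condition in sorted(series) for value in series[condition]]
        for gene, series in expression.items()
    }


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical expression values for three genes, two conditions, three time points.
expression = {
    "gene1": {"control": [1, 2, 3], "treated": [3, 2, 1]},
    "gene2": {"control": [2, 4, 6], "treated": [6, 4, 2]},
    "gene3": {"control": [1, 2, 3], "treated": [1, 2, 3]},
}
vectors = concatenate_conditions(expression)
same_pattern = cosine_similarity(vectors["gene1"], vectors["gene2"])
different_pattern = cosine_similarity(vectors["gene1"], vectors["gene3"])
print(same_pattern, different_pattern)
```

gene1 and gene2 share the same shape across both conditions, so their concatenated vectors are more similar to each other than to gene3, whose treated profile differs.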
11.
BMC Med Genomics ; 9 Suppl 1: 33, 2016 08 12.
Article in English | MEDLINE | ID: mdl-27534535

ABSTRACT

BACKGROUND: The multifunctional transcription factor (TF) gene EWS/EWSR1 is involved in various cellular processes such as transcription regulation, noncoding RNA regulation, splicing regulation, genotoxic stress response, and cancer generation. The role of a TF gene can be effectively studied by measuring genome-wide gene expression, i.e., the transcriptome, in an animal model of Ews/Ewsr1 knockout (KO). However, when a TF gene has complex multiple functions, conventional approaches such as differentially expressed gene (DEG) analysis do not successfully characterize the role of the EWS gene. In this regard, network-based analyses that consider associations among genes are the most promising approach. METHODS: Networks are constructed and used to show associations among biological entities at various levels; thus, different networks represent associations at different levels. In this paper, we report contributions on both the computational and biological sides. RESULTS: The contribution on the computational side is a novel computational framework that combines miRNA-gene network and protein-protein interaction network information to characterize the multifunctional role of the EWS gene. On the biological side, we report that EWS regulates the G-protein Gnai1 in the spinal cord of Ews/Ewsr1 KO mice, using the analysis method that integrates the two biological networks. Neighbor proteins of Gnai1, the G-protein complex subunits Gnb1, Gnb2 and Gnb4, were also down-regulated at the gene expression level. Interestingly, up-regulated genes, such as Rgs1 and Rgs19, are linked to the inhibition of Gnai1 activities. We further verified the altered expression of Gnai1 by qRT-PCR in Ews/Ewsr1 KO mice. CONCLUSIONS: Our integrated analysis of the miRNA-transcriptome network and the PPI network, combined with qRT-PCR, verifies that Gnai1 function is impaired in the spinal cord of Ews/Ewsr1 KO mice.


Subjects
Calmodulin-Binding Proteins/deficiency , Calmodulin-Binding Proteins/genetics , Computational Biology , GTP-Binding Protein alpha Subunits, Gi-Go/metabolism , MicroRNAs/genetics , Protein Interaction Mapping , RNA-Binding Proteins/genetics , Spinal Cord/metabolism , Animals , Gene Expression Profiling , Gene Ontology , Gene Regulatory Networks , Mice , Mice, Knockout , RNA, Messenger/genetics , RNA-Binding Protein EWS , Sequence Analysis, RNA
12.
BMC Syst Biol ; 10(Suppl 4): 115, 2016 Dec 23.
Article in English | MEDLINE | ID: mdl-28155667

ABSTRACT

MOTIVATION: Drought tolerance is an important trait related to growth and yield in crops. Until now, drought-related research has focused on coding genes. However, non-coding RNAs also respond significantly to environmental stimuli such as drought stress. Unfortunately, characterizing the role of siRNAs under drought stress is difficult since a large number of heterogeneous siRNA species are expressed under drought stress and non-coding RNAs have very weak evolutionary conservation. Thus, to characterize the role of siRNAs, we need a well-designed biological and bioinformatics strategy. In this paper, to characterize the function of siRNAs, we developed and used a bioinformatics pipeline that includes a genomic-location-based clustering technique and an evolutionary conservation tool. RESULTS: By comparing the wild type Nipponbare and two drought-resistant rice varieties, we found that 21 nt and 24 nt siRNAs are significantly expressed in the three rice plants but at different time points under a short-term (0, 1, and 6 hrs) drought treatment. siRNAs were up-regulated in the wild type at an early stage, while the up-regulation was delayed in the two drought-tolerant plants. Genes targeted by up-regulated siRNAs were related to oxidation reduction and proteolysis, which are well known to be associated with water deficit phenotypes. More interestingly, we found that siRNAs were located in intronic regions as clusters and were highly conserved evolutionarily among monocot grass plants. In summary, we show that siRNAs are important respondents to drought stress and regulate genes related to drought tolerance in water deficit conditions.


Subjects
Computational Biology/methods , Droughts , Evolution, Molecular , Oryza/genetics , Oryza/physiology , RNA, Small Interfering/genetics , Stress, Physiological/genetics , Cluster Analysis , Conserved Sequence , Nucleotide Motifs , RNA, Messenger/genetics , RNA, Messenger/metabolism
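
A genomic-location-based clustering step of the kind mentioned in the abstract above can be sketched as a simple gap-based merge of read coordinates. The abstract does not specify the actual procedure, so the gap threshold, the function name, and the coordinates below are all hypothetical.

```python
def cluster_by_location(reads, max_gap=100):
    """Group reads into clusters of nearby genomic positions.

    reads:   iterable of (chromosome, start, end) tuples
    max_gap: largest distance between a read's start and the current
             cluster's end for the read to join that cluster (assumed value)
    Returns a list of (chromosome, start, end, read_count) tuples.
    """
    clusters = []
    for chrom, start, end in sorted(reads):
        if clusters and clusters[-1][0] == chrom and start - clusters[-1][2] <= max_gap:
            # Extend the current cluster and count the read.
            _, c_start, c_end, count = clusters[-1]
            clusters[-1] = (chrom, c_start, max(c_end, end), count + 1)
        else:
            clusters.append((chrom, start, end, 1))
    return clusters


# Hypothetical siRNA read coordinates (21 nt and 24 nt reads).
toy_reads = [
    ("chr1", 100, 121),
    ("chr1", 150, 174),
    ("chr1", 1000, 1021),
    ("chr2", 120, 144),
]
clusters = cluster_by_location(toy_reads)
print(clusters)
```

The first two chr1 reads fall within the gap threshold and form one cluster, while the distant chr1 read and the chr2 read each start their own.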