1.
Cell ; 147(6): 1283-94, 2011 Dec 09.
Article in English | MEDLINE | ID: mdl-22153073

ABSTRACT

Key regulatory genes, suppressed by Polycomb and H3K27me3, become active during normal differentiation and induced reprogramming. Using the well-characterized enhancer/promoter pair of MYOD1 as a model, we have identified a critical role for enhancers in reprogramming. We observed an unexpected nucleosome-depleted region (NDR) at the H3K4me1-enriched enhancer at which transcriptional regulators initially bind, leading to subsequent changes in the chromatin at the cognate promoter. Exogenous Myod1 activates its own transcription by binding first at the enhancer, leading to an NDR and transcription-permissive chromatin at the associated MYOD1 promoter. Exogenous OCT4 also binds first to the permissive MYOD1 enhancer but has a different effect on the cognate promoter, where the monovalent H3K27me3 marks are converted to the bivalent state characteristic of stem cells. Genome-wide, a high percentage of Polycomb targets are associated with putative enhancers in permissive states, suggesting that they may provide a widespread avenue for the initiation of cell-fate reprogramming.


Subject(s)
Enhancer Elements, Genetic , Repressor Proteins/metabolism , Animals , Cell Line , Epigenomics , Fibroblasts/metabolism , Humans , Mice , MyoD Protein/genetics , Nucleosomes/metabolism , Octamer Transcription Factor-3/metabolism , Polycomb-Group Proteins , Promoter Regions, Genetic
2.
Int J Mol Sci ; 23(1)2022 Jan 01.
Article in English | MEDLINE | ID: mdl-35008908

ABSTRACT

Synthesis of the major biological methyl donor, S-adenosylmethionine (adoMet), occurs mainly in the liver. Methionine adenosyltransferase 1A (MAT1A) and glycine N-methyltransferase (GNMT) are two key enzymes involved in the functional implications of that variation. We collected RNA-seq data from 42 paired hepatocellular carcinoma (HCC) and adjacent normal liver tissue samples from The Cancer Genome Atlas (TCGA). No mutation was found in MAT1A or GNMT RNA in the 42 HCC patients. A total of 11,799 genes were annotated in the RNA-seq data, and their expression levels were used to investigate the phenotypes of low MAT1A and low GNMT by Gene Set Enrichment Analysis (GSEA). The REACTOME_TRANSLATION gene set was enriched and visualized in a heatmap along with corresponding differences in gene expression between low MAT1A versus high MAT1A and low GNMT versus high GNMT. We identified 43 genes of the REACTOME_TRANSLATION gene set that are powerful prognostic factors in HCC. The significant predictor genes belonged to eukaryotic translation initiation (EIF3B, EIF3K), eukaryotic translation elongation (EEF1D), and ribosomal proteins (RPs). Cell models expressing various levels of MAT1A and GNMT showed that simultaneously restoring the expression of MAT1A and GNMT decreased cell proliferation, invasion, and expression of the REACTOME_TRANSLATION gene EEF1D, consistent with a better prognosis in human HCC. We demonstrated that downregulation of or defects in the MAT1A and GNMT genes can enrich the protein translation process, which may account for poor HCC prognosis. This is the first study to demonstrate that MAT1A and GNMT, the two key enzymes involved in the methionine cycle, can attenuate ribosomal translation. We propose a potential novel mechanism by which diminished GNMT and MAT1A expression may confer poor prognosis in HCC.


Subject(s)
Carcinoma, Hepatocellular/genetics , Down-Regulation/genetics , Gene Expression Regulation, Neoplastic , Glycine N-Methyltransferase/genetics , Liver Neoplasms/genetics , Methionine Adenosyltransferase/genetics , Methionine/metabolism , Protein Biosynthesis , Base Sequence , Carcinoma, Hepatocellular/pathology , Cell Line, Tumor , Cell Proliferation/genetics , DNA Methylation/genetics , Eukaryotic Initiation Factor-3/metabolism , Glycine N-Methyltransferase/metabolism , Humans , Kaplan-Meier Estimate , Liver Neoplasms/pathology , Methionine Adenosyltransferase/metabolism , Neoplasm Invasiveness , Peptide Elongation Factor 1/metabolism , Promoter Regions, Genetic/genetics , Protein Biosynthesis/genetics , Survival Analysis
3.
Mol Genet Genomics ; 296(6): 1323-1335, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34609588

ABSTRACT

Sex form is one of the most important characteristics in papaya cultivation, in which the hermaphrodite is the preferred form. Self-pollination of H*-TSS No.7, an inbred line derived from a rare X chromosome mutant SR*, produced all-hermaphrodite progeny. The recessive lethal allele controlling the all-hermaphrodite phenomenon was proposed to be the recessive Germination suppressor (gs) locus. This study employed next-generation sequencing technology and genome comparison to identify the candidate Gs gene. One specific gene, monodehydroascorbate reductase 4 (MDAR4), harboring a unique polymorphic 3 bp deletion in H*-TSS No.7, was identified. MDAR4 is known to be involved in the hydrogen peroxide (H2O2) scavenging pathway and is associated with seed germination. Furthermore, MDAR4 showed higher expression in imbibed seeds than in dry seeds, indicating a potential role in seed germination. This may be the first report providing evidence that MDAR4 is the candidate for the Gs locus in H*-TSS No.7. In addition, Gs allele-specific markers were developed, which should facilitate the breeding of all-hermaphrodite lines.


Subject(s)
Carica/genetics , Chromosomes, Plant/genetics , Hermaphroditic Organisms/genetics , NADH, NADPH Oxidoreductases/genetics , Genome, Plant/genetics , Germination/genetics , Hydrogen Peroxide/metabolism , Pollination/genetics , Pollination/physiology , Seeds/growth & development , Sequence Deletion/genetics
4.
Int J Mol Sci ; 22(17)2021 Aug 30.
Article in English | MEDLINE | ID: mdl-34502300

ABSTRACT

Folate depletion causes chromosomal instability by increasing DNA strand breakage, uracil misincorporation, and defective repair. Folate mediated one-carbon metabolism has been suggested to play a key role in the carcinogenesis and progression of hepatocellular carcinoma (HCC) through influencing DNA integrity. Methylenetetrahydrofolate reductase (MTHFR) is the enzyme catalyzing the irreversible conversion of 5,10-methylenetetrahydrofolate to 5-methyltetrahydrofolate that can control folate cofactor distributions and modulate the partitioning of intracellular one-carbon moieties. The association between MTHFR polymorphisms and HCC risk is inconsistent and remains controversial in populational studies. We aimed to establish an in vitro cell model of liver origin to elucidate the interactions between MTHFR function, folate status, and chromosome stability. In the present study, we (1) examined MTHFR expression in HCC patients; (2) established cell models of liver origin with stabilized inhibition of MTHFR using small hairpin RNA delivered by a lentiviral vector, and (3) investigated the impacts of reduced MTHFR and folate status on cell cycle, methyl group homeostasis, nucleotide biosynthesis, and DNA stability, all of which are pathways involved in DNA integrity and repair and are critical in human tumorigenesis. By analyzing the TCGA/GTEx datasets available within GEPIA2, we discovered that HCC cancer patients with higher MTHFR had a worse survival rate. The shRNA of MTHFR (shMTHFR) resulted in decreased MTHFR gene expression, MTHFR protein, and enzymatic activity in human hepatoma cell HepG2. shMTHFR tended to decrease intracellular S-adenosylmethionine (SAM) contents but folate depletion similarly decreased SAM in wildtype (WT), negative control (Neg), and shMTHFR cells, indicating that in cells of liver origin, shMTHFR does not exacerbate the methyl group supply in folate depletion. 
shMTHFR caused cells to accumulate in G2/M, and the cell population in G2/M was inversely correlated with MTHFR gene level (r = -0.81, p < 0.0001), MTHFR protein expression (r = -0.8; p = 0.01), and MTHFR enzyme activity (r = -0.842; p = 0.005). Folate depletion resulted in G2/M cell cycle arrest in WT and Neg but not in shMTHFR cells, indicating that shMTHFR does not exacerbate folate depletion-induced G2/M cell cycle arrest. In addition, shMTHFR promoted the expression and nuclear translocation of the thymidylate synthesis enzyme complex SHMT1/DHFR/TYMS and assisted folate-dependent de novo nucleotide biosynthesis under folate restriction. Finally, under folate deficiency, shMTHFR promoted nuclear MLH1/p53 expression and further reduced micronuclei formation and DNA uracil misincorporation. In conclusion, shMTHFR in HepG2 induces cell cycle arrest in G2/M that may promote nucleotide supply and assist cell defense against folate depletion-induced chromosome segregation and uracil misincorporation in the DNA. This study provided insight into the significant impact of MTHFR function on chromosome stability of hepatic tissues. Data from the present study may shed light on the potential regulatory mechanism by which MTHFR modulates the risk for hepatic malignancies.


Subject(s)
Carcinoma, Hepatocellular/pathology , Chromosome Segregation , DNA, Neoplasm/genetics , Folic Acid/metabolism , Methylenetetrahydrofolate Reductase (NADPH2)/antagonists & inhibitors , Uracil/metabolism , Apoptosis , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/metabolism , Cell Proliferation , Chromosomal Instability , DNA, Neoplasm/metabolism , Gene Expression Regulation, Neoplastic , Humans , Liver Neoplasms/genetics , Liver Neoplasms/metabolism , Liver Neoplasms/pathology , Methylenetetrahydrofolate Reductase (NADPH2)/genetics , Methylenetetrahydrofolate Reductase (NADPH2)/metabolism , Polymorphism, Genetic , Prognosis , Survival Rate , Tumor Cells, Cultured
5.
Bioinformatics ; 35(17): 3127-3132, 2019 09 01.
Article in English | MEDLINE | ID: mdl-30668638

ABSTRACT

MOTIVATION: In recent years, several experimental studies have revealed that the microRNAs (miRNAs) in serum, plasma, exosome and whole blood are dysregulated in various types of diseases, indicating that the circulating miRNAs may serve as potential noninvasive biomarkers for disease diagnosis and prognosis. However, no database has been constructed to integrate the large-scale circulating miRNA profiles, explore the functional pathways involved and predict the potential biomarkers using feature selection between the disease conditions. Although there have been several studies attempting to generate a circulating miRNA database, they have not yet integrated the large-scale circulating miRNA profiles or provided the biomarker-selection function using machine learning methods. RESULTS: To fill this gap, we constructed the Circulating MicroRNA Expression Profiling (CMEP) database for integrating, analyzing and visualizing the large-scale expression profiles of phenotype-specific circulating miRNAs. The CMEP database contains massive datasets that were manually curated from NCBI GEO and the exRNA Atlas, including 66 datasets, 228 subsets and 10 419 samples. CMEP provides differential expression analysis of circulating miRNAs and KEGG functional pathway enrichment analysis. Furthermore, to support noninvasive biomarker discovery, we implemented several feature-selection methods, including ridge regression, lasso regression, support vector machine and random forests. Finally, we implemented a user-friendly web interface to improve the user experience and to visualize the data and results of CMEP. AVAILABILITY AND IMPLEMENTATION: CMEP is accessible at http://syslab5.nchu.edu.tw/CMEP.


Subject(s)
Databases, Factual , Biomarkers , Circulating MicroRNA , Exosomes , Gene Expression Profiling
6.
Nucleic Acids Res ; 46(15): e89, 2018 09 06.
Article in English | MEDLINE | ID: mdl-29897492

ABSTRACT

The detection of tumor-derived cell-free DNA in plasma is one of the most promising directions in cancer diagnosis. The major challenge in such an approach is how to identify the tiny amount of tumor DNA among the total cell-free DNA in blood. Here we propose an ultrasensitive cancer detection method, termed 'CancerDetector', using the DNA methylation profiles of cell-free DNAs. The key to our method is to probabilistically model the joint methylation states of multiple adjacent CpG sites on an individual sequencing read, in order to exploit the pervasive nature of DNA methylation for signal amplification. Therefore, CancerDetector can sensitively identify a trace amount of tumor cfDNAs in plasma, at the level of individual reads. We evaluated CancerDetector on simulated data and showed a high concordance between the predicted and true tumor fractions. Testing CancerDetector on real plasma data demonstrated its high sensitivity and specificity in detecting tumor cfDNAs. In addition, the predicted tumor fraction showed great consistency with tumor size and survival outcome. Note that all of these tests were performed on sequencing data at low to medium coverage (1× to 10×). Therefore, CancerDetector holds great potential to detect cancer early and cost-effectively.
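
As a rough illustration of the read-level idea described in this abstract, the following Python sketch scores each read under a tumor model and a normal model of methylation, then picks the tumor fraction that maximizes the mixture likelihood over all reads. It is a simplified sketch and not the CancerDetector implementation: the independent per-CpG Bernoulli model, the single shared rates `tumor_rate` and `normal_rate`, the grid search, and all function names are assumptions made for the example.

```python
import math


def read_likelihood(states, rate):
    """Probability of the binary CpG states on one read, given a methylation rate.

    Assumption: CpG sites on the read are treated as independent Bernoulli draws.
    """
    lik = 1.0
    for methylated in states:
        lik *= rate if methylated else (1.0 - rate)
    return lik


def estimate_tumor_fraction(reads, tumor_rate, normal_rate, grid=101):
    """Grid-search the tumor fraction that maximizes the mixture log-likelihood.

    Rates must lie strictly between 0 and 1 so every read has non-zero likelihood.
    """
    best_theta, best_ll = 0.0, -math.inf
    for i in range(grid):
        theta = i / (grid - 1)
        ll = 0.0
        for states in reads:
            mix = (theta * read_likelihood(states, tumor_rate)
                   + (1.0 - theta) * read_likelihood(states, normal_rate))
            ll += math.log(mix)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta


# Hypothetical data: 2 fully methylated reads and 8 unmethylated reads, 4 CpGs each.
reads = [[1, 1, 1, 1]] * 2 + [[0, 0, 0, 0]] * 8
theta = estimate_tumor_fraction(reads, tumor_rate=0.9, normal_rate=0.1)
```

Because four concordant CpG states on one read multiply into a large likelihood ratio, even a few tumor-like reads shift the estimate, which is the signal amplification the abstract refers to.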


Subject(s)
Algorithms , Cell-Free Nucleic Acids/genetics , Computational Biology/methods , DNA Methylation , Neoplasms/diagnosis , Cell-Free Nucleic Acids/chemistry , CpG Islands/genetics , DNA, Neoplasm/chemistry , DNA, Neoplasm/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Neoplasms/blood , Neoplasms/genetics , ROC Curve , Reproducibility of Results
7.
BMC Genomics ; 19(1): 966, 2018 Dec 27.
Article in English | MEDLINE | ID: mdl-30587128

ABSTRACT

BACKGROUND: Abiotic and biotic stresses severely affect the growth and reproduction of plants and crops. Determining the critical molecular mechanisms and cellular processes in response to stresses will provide biological insight for addressing both climate change and food crises. RNA sequencing (RNA-Seq) is a revolutionary tool that has been used extensively in plant stress research. However, no existing large-scale RNA-Seq database has been designed to provide information on the stress-specific differentially expressed transcripts that occur across diverse plant species and various stresses. RESULTS: We have constructed a comprehensive database, the plant stress RNA-Seq nexus (PSRN), which includes 12 plant species, 26 plant-stress RNA-Seq datasets, and 937 samples. All samples are assigned to 133 stress-specific subsets, which are combined into 254 subset pairs, each a comparison between two selected subsets, for stress-specific differentially expressed transcript identification. CONCLUSIONS: PSRN is an open resource for intuitive data exploration, providing expression profiles of coding-transcript/lncRNA and identifying which transcripts are differentially expressed between different stress-specific subsets, in order to support researchers in generating new biological insights and hypotheses in molecular breeding or evolution. PSRN is freely available at http://syslab5.nchu.edu.tw/PSRN .


Subject(s)
Databases, Genetic , Plant Cells/metabolism , Stress, Physiological , Transcriptome , Internet Access , RNA, Plant/metabolism , User-Computer Interface
8.
BMC Genomics ; 19(Suppl 1): 958, 2018 01 19.
Article in English | MEDLINE | ID: mdl-29363420

ABSTRACT

BACKGROUND: Emerging evidence has experimentally confirmed the tissue-specific expression of circular RNAs (circRNAs). Global identification of human tissue-specific circRNAs is crucial for functional studies and facilitates the discovery of circRNAs as potential diagnostic biomarkers. RESULTS: In this study, circRNA back-splicing junctions were identified from 465 publicly available transcriptome sequencing samples. The number of reads aligned to these identified junctions was normalized by the read length and sequencing depth of each sample. We generated 66 models representing enriched circRNAs among human tissue transcriptomes using a biclustering algorithm. The result provides thousands of newly identified human tissue-specific circRNAs. CONCLUSIONS: This result suggests that the expression of circRNAs is not the product of random splicing errors but serves molecular functional roles. We also identified circRNAs enriched within the circulating system, which, along with the identified tissue-specific circRNAs, can serve as potential diagnostic biomarkers.


Subject(s)
Algorithms , Biomarkers/metabolism , Gene Expression Regulation , High-Throughput Nucleotide Sequencing/methods , RNA/genetics , Transcriptome , Brain/metabolism , Cluster Analysis , Humans , Organ Specificity , RNA, Circular
9.
Nucleic Acids Res ; 44(D1): D944-51, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26602695

ABSTRACT

The genome-wide transcriptome profiling of cancerous and normal tissue samples can provide insights into the molecular mechanisms of cancer initiation and progression. RNA Sequencing (RNA-Seq) is a revolutionary tool that has been used extensively in cancer research. However, no existing RNA-Seq database provides all of the following features: (i) large-scale and comprehensive data archives and analyses, including coding-transcript profiling, long non-coding RNA (lncRNA) profiling and coexpression networks; (ii) phenotype-oriented data organization and searching and (iii) the visualization of expression profiles, differential expression and regulatory networks. We have constructed the first public database that meets these criteria, the Cancer RNA-Seq Nexus (CRN, http://syslab4.nchu.edu.tw/CRN). CRN has a user-friendly web interface designed to facilitate cancer research and personalized medicine. It is an open resource for intuitive data exploration, providing coding-transcript/lncRNA expression profiles to support researchers generating new hypotheses in cancer research and personalized medicine.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Gene Regulatory Networks , Neoplasms/genetics , Humans , Neoplasms/metabolism , Phenotype , RNA, Long Noncoding/metabolism , RNA, Messenger/metabolism
10.
Nucleic Acids Res ; 44(D1): D209-15, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26450965

ABSTRACT

Circular RNAs (circRNAs) represent a new type of regulatory noncoding RNA that only recently has been identified and cataloged. Emerging evidence indicates that circRNAs exert a new layer of post-transcriptional regulation of gene expression. In this study, we utilized transcriptome sequencing datasets to systematically identify the expression of circRNAs (including known and newly identified ones by our pipeline) in 464 RNA-seq samples, and then constructed the CircNet database (http://circnet.mbc.nctu.edu.tw/) that provides the following resources: (i) novel circRNAs, (ii) integrated miRNA-target networks, (iii) expression profiles of circRNA isoforms, (iv) genomic annotations of circRNA isoforms (e.g. 282 948 exon positions), and (v) sequences of circRNA isoforms. The CircNet database is to our knowledge the first public database that provides tissue-specific circRNA expression profiles and circRNA-miRNA-gene regulatory networks. It not only extends the most up to date catalog of circRNAs but also provides a thorough expression analysis of both previously reported and novel circRNAs. Furthermore, it generates an integrated regulatory network that illustrates the regulation between circRNAs, miRNAs and genes.


Subject(s)
Databases, Nucleic Acid , RNA/metabolism , Gene Expression Profiling , Gene Regulatory Networks , Humans , RNA/chemistry , RNA, Circular , Sequence Analysis, RNA
11.
Nucleic Acids Res ; 44(D1): D239-47, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26590260

ABSTRACT

MicroRNAs (miRNAs) are small non-coding RNAs of approximately 22 nucleotides, which negatively regulate the gene expression at the post-transcriptional level. This study describes an update of the miRTarBase (http://miRTarBase.mbc.nctu.edu.tw/) that provides information about experimentally validated miRNA-target interactions (MTIs). The latest update of the miRTarBase expanded it to identify systematically Argonaute-miRNA-RNA interactions from 138 crosslinking and immunoprecipitation sequencing (CLIP-seq) data sets that were generated by 21 independent studies. The database contains 4966 articles, 7439 strongly validated MTIs (using reporter assays or western blots) and 348 007 MTIs from CLIP-seq. The number of MTIs in the miRTarBase has increased around 7-fold since the 2014 miRTarBase update. The miRNA and gene expression profiles from The Cancer Genome Atlas (TCGA) are integrated to provide an effective overview of this exponential growth in the miRNA experimental data. These improvements make the miRTarBase one of the more comprehensively annotated, experimentally validated miRNA-target interactions databases and motivate additional miRNA research efforts.


Subject(s)
Databases, Nucleic Acid , MicroRNAs/metabolism , RNA, Messenger/metabolism , Disease/genetics , Gene Expression Profiling , Humans , RNA, Messenger/chemistry , Sequence Analysis, RNA
12.
BMC Genomics ; 18(1): 61, 2017 01 10.
Article in English | MEDLINE | ID: mdl-28068916

ABSTRACT

BACKGROUND: Transcription factors (TFs) often interact with one another to form TF complexes that bind DNA and regulate gene expression. Many databases are created to describe known TF complexes identified by either mammalian two-hybrid experiments or data mining. Lately, a wealth of ChIP-seq data on human TFs under different experiment conditions are available, making it possible to investigate condition-specific (cell type and/or physiologic state) TF complexes and their target genes. RESULTS: Here, we developed a systematic pipeline to infer Condition-Specific Targets of human TF-TF complexes (called the CST pipeline) by integrating ChIP-seq data and TF motifs. In total, we predicted 2,392 TF complexes and 13,504 high-confidence or 127,994 low-confidence regulatory interactions amongst TF complexes and their target genes. We validated our predictions by (i) comparing predicted TF complexes to external TF complex databases, (ii) validating selected target genes of TF complexes using ChIP-qPCR and RT-PCR experiments, and (iii) analysing target genes of select TF complexes using gene ontology enrichment to demonstrate the accuracy of our work. Finally, the predicted results above were integrated and employed to construct a CST database. CONCLUSIONS: We built up a methodology to construct the CST database, which contributes to the analysis of transcriptional regulation and the identification of novel TF-TF complex formation in a certain condition. This database also allows users to visualize condition-specific TF regulatory networks through a user-friendly web interface.


Subject(s)
Chromatin Immunoprecipitation , Computational Biology , Sequence Analysis, DNA , Transcription Factors/metabolism , Databases, Genetic , Gene Ontology , Humans , Nucleotide Motifs , Transcription, Genetic
13.
BMC Biotechnol ; 17(1): 36, 2017 04 11.
Article in English | MEDLINE | ID: mdl-28399854

ABSTRACT

BACKGROUND: N-Butanol has favorable characteristics for use as either an alternative fuel or platform chemical. Bio-based n-butanol production using microbes is an emerging technology that requires further development. Although bio-industrial microbes such as Escherichia coli have been engineered to produce n-butanol, reactive oxygen species (ROS)-mediated toxicity may limit productivity. Previously, we showed that outer-membrane-targeted tilapia metallothionein (OmpC-TMT) is more effective as an ROS scavenger than human and mouse metallothioneins to reduce oxidative stress in the host cell. RESULTS: The host strain (BUT1-DE) containing the clostridial n-butanol pathway displayed a decreased growth rate and limited n-butanol productivity, likely due to ROS accumulation. The clostridial n-butanol pathway was co-engineered with inducible OmpC-TMT in E. coli (BUT3-DE) for simultaneous ROS removal, and its effect on n-butanol productivity was examined. The ROS scavenging ability of cells overexpressing OmpC-TMT was examined and showed an approximately twofold increase in capacity. The modified strain improved n-butanol productivity to 320 mg/L, whereas the control strain produced only 95.1 mg/L. Transcriptomic analysis revealed three major KEGG pathways that were significantly differentially expressed in the BUT3-DE strain compared with their expression in the BUT1-DE strain, including genes involved in oxidative phosphorylation, fructose and mannose metabolism and glycolysis/gluconeogenesis. CONCLUSIONS: These results indicate that OmpC-TMT can increase n-butanol production by scavenging ROS. The transcriptomic analysis suggested that n-butanol causes quinone malfunction, resulting in oxidative-phosphorylation-related nuo operon downregulation, which would diminish the ability to convert NADH to NAD+ and generate proton motive force. 
However, fructose and mannose metabolism-related genes (fucA, srlE and srlA) were upregulated, and glycolysis/gluconeogenesis-related genes (pfkB, pgm) were downregulated, which further assisted in regulating NADH/NAD+ redox and preventing additional ATP depletion. These results indicated that more NADH and ATP were required in the n-butanol synthetic pathway. Our study demonstrates a potential approach to increase the robustness of microorganisms and the production of toxic chemicals through the ability to reduce oxidative stress.


Subject(s)
1-Butanol/metabolism , Clostridium/enzymology , Escherichia coli/physiology , Metallothionein/metabolism , Porins/metabolism , Tilapia/metabolism , 1-Butanol/isolation & purification , Animals , Cell Membrane/metabolism , Clostridium/genetics , Gene Expression Regulation, Bacterial/physiology , Genetic Enhancement/methods , Metallothionein/genetics , Porins/genetics , Protein Engineering/methods , Signal Transduction/genetics , Tilapia/genetics
14.
Methods ; 93: 110-8, 2016 Jan 15.
Article in English | MEDLINE | ID: mdl-26238263

ABSTRACT

In past decades, the experimental determination of protein functions was expensive and time-consuming, so numerous computational methods were developed to speed up and guide the process. However, most of these methods predict protein functions at the gene level and do not consider the fact that protein isoforms (translated from alternatively spliced transcripts), not genes, are the actual function carriers. Now, high-throughput RNA-seq technology is providing unprecedented opportunities to unravel protein functions at the isoform level. In this article, we review recent progress in the high-resolution functional annotations of protein isoforms, focusing on two methods developed by the authors. Both methods can integrate multiple RNA-seq datasets for comprehensively characterizing functions of protein isoforms.


Subject(s)
Cell Physiological Phenomena/physiology , Databases, Genetic , Protein Isoforms/physiology , Animals , Forecasting , Humans , RNA/physiology
15.
Nucleic Acids Res ; 43(2): 1268-82, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25567984

ABSTRACT

FOXP3 is a lineage-specific transcription factor that is required for regulatory T cell development and function. In this study, we determined the crystal structure of the FOXP3 forkhead domain bound to DNA. The structure reveals that FOXP3 can form a stable domain-swapped dimer to bridge DNA in the absence of cofactors, suggesting that FOXP3 may play a role in long-range gene interactions. To test this hypothesis, we used circular chromosome conformation capture coupled with high throughput sequencing (4C-seq) to analyze FOXP3-dependent genomic contacts around a known FOXP3-bound locus, Ptpn22. Our studies reveal that FOXP3 induces significant changes in the chromatin contacts between the Ptpn22 locus and other Foxp3-regulated genes, reflecting a mechanism by which FOXP3 reorganizes the genome architecture to coordinate the expression of its target genes. Our results suggest that FOXP3 mediates long-range chromatin interactions as part of its mechanisms to regulate specific gene expression in regulatory T cells.


Subject(s)
Chromosomes/chemistry , DNA/chemistry , Forkhead Transcription Factors/chemistry , Animals , DNA/metabolism , Forkhead Transcription Factors/metabolism , Gene Expression Regulation , Humans , Mice, Inbred C57BL , Models, Molecular , Protein Multimerization , Protein Structure, Tertiary , Protein Tyrosine Phosphatase, Non-Receptor Type 22/genetics
16.
BMC Genomics ; 17(1): 632, 2016 08 12.
Article in English | MEDLINE | ID: mdl-27519564

ABSTRACT

BACKGROUND: Chromatin immunoprecipitation followed by massively parallel DNA sequencing (ChIP-seq) or microarray hybridization (ChIP-chip) has been widely used to determine the genomic occupation of transcription factors (TFs). We have previously developed a probabilistic method, called TIP (Target Identification from Profiles), to identify TF target genes using ChIP-seq/ChIP-chip data. To achieve high specificity, TIP applies a conservative method to estimate significance of target genes, with the trade-off being a relatively low sensitivity of target gene identification compared to other methods. Additionally, TIP's output does not render binding-peak locations or intensity, information highly useful for visualization and general experimental biological use, while the variability of ChIP-seq/ChIP-chip file formats has made input into TIP more difficult than desired. DESCRIPTION: To improve upon these facets, here we present a refined TIP with key extensions. First, it implements a Gaussian mixture model for p-value estimation, increasing target gene identification sensitivity and more accurately capturing the shape of TF binding profile distributions. Second, it enables the incorporation of TF binding-peak data by identifying their locations in significant target gene promoter regions and quantifies their strengths. Finally, for full ease of implementation we have incorporated it into a web server ( http://syslab3.nchu.edu.tw/iTAR/ ) that enables flexibility of input file format, can be used across multiple species and genome assembly versions, and is freely available for public use. The web server additionally performs GO enrichment analysis for the identified target genes to reveal the potential function of the corresponding TF. CONCLUSIONS: The iTAR web server provides a user-friendly interface and supports target gene identification in seven species, ranging from yeast to human. 
To facilitate investigating the quality of ChIP-seq/ChIP-chip data, the web server generates the chart of the characteristic binding profiles and the density plot of normalized regulatory scores. The iTAR web server is a useful tool in identifying TF target genes from ChIP-seq/ChIP-chip data and discovering biological insights.
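
The Gaussian mixture step mentioned in this abstract can be sketched in Python as follows. The abstract does not say how the mixture is fitted or which component defines the null distribution, so this is only a minimal sketch under stated assumptions: two components, a plain EM fit started from the lower and upper halves of the sorted scores, and a p-value taken as the upper-tail probability under the lower-mean ("background") component. The function names and the example scores are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def fit_two_gaussians(scores, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, sds); component 0 starts on the lower half of the data.
    """
    xs = sorted(scores)
    half = len(xs) // 2
    halves = [xs[:half], xs[half:]]
    means = [mean(h) for h in halves]
    sds = [max(pstdev(h), 1e-3) for h in halves]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            dens = [weights[k] * NormalDist(means[k], sds[k]).pdf(x) for k in range(2)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and standard deviation per component.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(var ** 0.5, 1e-3)
    return weights, means, sds


def background_p_value(score, means, sds):
    """Upper-tail probability of `score` under the lower-mean component."""
    k = 0 if means[0] <= means[1] else 1
    return 1.0 - NormalDist(means[k], sds[k]).cdf(score)


# Hypothetical regulatory scores: ten background genes and three strong targets.
scores = [-1.0, -0.8, -0.5, -0.2, 0.0, 0.1, 0.3, 0.5, 0.7, 1.0, 5.0, 5.5, 6.0]
weights, means, sds = fit_two_gaussians(scores)
p_target = background_p_value(5.5, means, sds)
```

Letting the background component's variance be learned from the data, rather than assuming a single normal distribution over all scores, is one way such a model could follow the shape of the score distribution more closely.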


Subject(s)
Chromatin Immunoprecipitation , STAT3 Transcription Factor/metabolism , User-Computer Interface , Algorithms , HeLa Cells , High-Throughput Nucleotide Sequencing , Humans , Internet , Promoter Regions, Genetic , STAT3 Transcription Factor/genetics , Sequence Analysis, DNA
17.
Cancer Cell ; 13(1): 48-57, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18167339

ABSTRACT

We investigated whether microRNA expression profiles can predict clinical outcome of NSCLC patients. Using real-time RT-PCR, we obtained microRNA expressions in 112 NSCLC patients, which were divided into the training and testing sets. Using Cox regression and risk-score analysis, we identified a five-microRNA signature for the prediction of treatment outcome of NSCLC in the training set. This microRNA signature was validated by the testing set and an independent cohort. Patients with high-risk scores in their microRNA signatures had poor overall and disease-free survivals compared to the low-risk-score patients. This microRNA signature is an independent predictor of the cancer relapse and survival of NSCLC patients.
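
The risk-score analysis described in this abstract can be illustrated with a short Python sketch. A risk score of this kind is typically a weighted sum of expression values, with weights taken from Cox regression coefficients. The microRNA names, coefficient values, patient data, and the median cutoff used here are all hypothetical placeholders, since the abstract does not list the five microRNAs, their coefficients, or the cutoff.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of expression values using Cox regression coefficients."""
    return sum(coefficients[name] * expression[name] for name in coefficients)


def stratify(patients, coefficients):
    """Score every patient and split into high/low risk at the median score.

    Assumption: the median is used as the cutoff; other cutoffs are possible.
    """
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, cutoff, groups


# Hypothetical two-marker signature: a positive coefficient marks a risk microRNA,
# a negative coefficient marks a protective one.
coefficients = {"miR-A": 0.5, "miR-B": -0.3}
patients = {
    "p1": {"miR-A": 2.0, "miR-B": 1.0},
    "p2": {"miR-A": 0.0, "miR-B": 2.0},
    "p3": {"miR-A": 4.0, "miR-B": 0.0},
    "p4": {"miR-A": 1.0, "miR-B": 1.0},
}
scores, cutoff, groups = stratify(patients, coefficients)
```

In a design like the one described, the coefficients and cutoff would be fixed on the training set and then applied unchanged to the testing set and the independent cohort.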


Subject(s)
Carcinoma, Non-Small-Cell Lung/genetics , Gene Expression Regulation, Neoplastic , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , MicroRNAs/genetics , Neoplasm Recurrence, Local/genetics , Neoplasm Recurrence, Local/pathology , Aged , Carcinoma, Non-Small-Cell Lung/classification , Carcinoma, Non-Small-Cell Lung/pathology , Cohort Studies , Disease-Free Survival , Female , Humans , Kaplan-Meier Estimate , Lung Neoplasms/classification , Lung Neoplasms/pathology , Male , Neoplasm Invasiveness , Neoplasm Staging , Prognosis , Regression Analysis , Reproducibility of Results
18.
Nucleic Acids Res ; 42(Database issue): D178-83, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24302579

ABSTRACT

Gene expression profiling has been extensively used in the past decades, resulting in an enormous amount of expression data available in public databases. These data sets are informative in elucidating transcriptional regulation of genes underlying various biological and clinical conditions. However, it is usually difficult to identify transcription factors (TFs) responsible for gene expression changes directly from their own expression, as TF activity is often regulated at the posttranscriptional level. In recent years, technical advances have made it possible to systematically determine the target genes of TFs by ChIP-seq experiments. To identify the regulatory programs underlying gene expression profiles, we constructed a database of phenotype-specific regulatory programs (DPRP, http://syslab.nchu.edu.tw/DPRP/) derived from the integrative analysis of TF binding data and gene expression data. DPRP provides three methods: the Fisher's Exact Test, the Kolmogorov-Smirnov test and the BASE algorithm to facilitate the application of gene expression data for generating new hypotheses on transcriptional regulatory programs in biological and clinical studies.
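
As an illustration of the first of the three methods named above, the following is a minimal sketch of a one-sided Fisher's exact test for enrichment of a TF's target genes among differentially expressed genes. It is not DPRP's implementation, which the abstract does not describe; the function name, its parameters, and the gene counts are hypothetical.

```python
from math import comb


def fisher_enrichment_pvalue(overlap, n_targets, n_changed, n_genes):
    """One-sided Fisher's exact test p-value, P(X >= overlap).

    X is the number of TF targets among n_changed differentially expressed
    genes drawn from n_genes genes, of which n_targets are TF targets
    (hypergeometric distribution).
    """
    upper = min(n_targets, n_changed)
    total = comb(n_genes, n_changed)
    tail = sum(
        comb(n_targets, k) * comb(n_genes - n_targets, n_changed - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 genes, 4 TF targets, 3 changed genes, 2 of them targets.
p_value = fisher_enrichment_pvalue(overlap=2, n_targets=4, n_changed=3, n_genes=10)
```

A small p-value would suggest that the TF's targets are over-represented among the changed genes, which is the kind of hypothesis the database is meant to help generate.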


Subject(s)
Databases, Genetic , Gene Expression Profiling , Gene Regulatory Networks , Transcription Factors/metabolism , Algorithms , Binding Sites , Humans , Internet , Phenotype
19.
Nucleic Acids Res ; 42(6): e39, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24369432

ABSTRACT

Alternative transcript processing is an important mechanism for generating functional diversity in genes. However, little is known about the precise functions of individual isoforms. In fact, proteins (translated from transcript isoforms), not genes, are the function carriers. By integrating multiple human RNA-seq data sets, we carried out the first systematic prediction of isoform functions, enabling high-resolution functional annotation of human transcriptome. Unlike gene function prediction, isoform function prediction faces a unique challenge: the lack of the training data--all known functional annotations are at the gene level. To address this challenge, we modelled the gene-isoform relationships as multiple instance data and developed a novel label propagation method to predict functions. Our method achieved an average area under the receiver operating characteristic curve of 0.67 and assigned functions to 15 572 isoforms. Interestingly, we observed that different functions have different sensitivities to alternative isoform processing, and that the function diversity of isoforms from the same gene is positively correlated with their tissue expression diversity. Finally, we surveyed the literature to validate our predictions for a number of apoptotic genes. Strikingly, for the famous 'TP53' gene, we not only accurately identified the apoptosis regulation function of its five isoforms, but also correctly predicted the precise direction of the regulation.


Subject(s)
Gene Expression Profiling , Molecular Sequence Annotation , Protein Isoforms/physiology , Sequence Analysis, RNA , Apoptosis , Gene Regulatory Networks , Humans , Protein Isoforms/genetics , Protein Isoforms/metabolism , RNA Isoforms/metabolism
20.
Nucleic Acids Res ; 42(Web Server issue): W137-46, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24895436

ABSTRACT

The DiseaseConnect (http://disease-connect.org) is a web server for analysis and visualization of a comprehensive knowledge on mechanism-based disease connectivity. The traditional disease classification system groups diseases with similar clinical symptoms and phenotypic traits. Thus, diseases with entirely different pathologies could be grouped together, leading to a similar treatment design. Such problems could be avoided if diseases were classified based on their molecular mechanisms. Connecting diseases with similar pathological mechanisms could inspire novel strategies on the effective repositioning of existing drugs and therapies. Although there have been several studies attempting to generate disease connectivity networks, they have not yet utilized the enormous and rapidly growing public repositories of disease-related omics data and literature, two primary resources capable of providing insights into disease connections at an unprecedented level of detail. Our DiseaseConnect, the first public web server, integrates comprehensive omics and literature data, including a large amount of gene expression data, Genome-Wide Association Studies catalog, and text-mined knowledge, to discover disease-disease connectivity via common molecular mechanisms. Moreover, the clinical comorbidity data and a comprehensive compilation of known drug-disease relationships are additionally utilized for advancing the understanding of the disease landscape and for facilitating the mechanism-based development of new drug treatments.


Subject(s)
Disease/genetics , Software , Comorbidity , Drug Therapy , Gene Expression , Humans , Internet , MicroRNAs/metabolism , Polymorphism, Single Nucleotide