ABSTRACT
Cell-free DNA (cfDNA) in blood, viewed as a surrogate for tumor biopsy, has many clinical applications, including diagnosing cancer, guiding cancer treatment and monitoring treatment response. All these applications depend on an indispensable, yet underdeveloped task: detecting somatic mutations from cfDNA. The task is challenging because of the low tumor fraction in cfDNA. Recently, we developed the computational method cfSNV, the first method that comprehensively considers the properties of cfDNA for the sensitive detection of mutations from cfDNA. cfSNV vastly outperformed the conventional methods that were developed primarily for calling mutations from solid tumor tissues. cfSNV can accurately detect mutations in cfDNA even with medium-coverage (e.g., ≥200×) sequencing, which makes whole-exome sequencing (WES) of cfDNA a viable option for various clinical utilities. Here, we present a user-friendly cfSNV package that exhibits fast computation and convenient user options. We also built a Docker image of it, which is designed to enable researchers and clinicians with a limited computational background to easily carry out analyses on both high-performance computing platforms and local computers. Mutation calling from a standard preprocessed WES dataset (~250× and ~70 million base pair target size) can be carried out in 3 h on a server with eight virtual CPUs and 32 GB of random access memory.
Subjects
Cell-Free Nucleic Acids; Neoplasms; Humans; Cell-Free Nucleic Acids/genetics; Neoplasms/diagnosis; Neoplasms/genetics; Mutation; Software; High-Throughput Nucleotide Sequencing/methods
ABSTRACT
Early cancer detection by cell-free DNA faces multiple challenges: low fraction of tumor cell-free DNA, molecular heterogeneity of cancer, and sample sizes that are not sufficient to reflect diverse patient populations. Here, we develop a cancer detection approach to address these challenges. It consists of an assay, cfMethyl-Seq, for cost-effective sequencing of the cell-free DNA methylome (with > 12-fold enrichment over whole genome bisulfite sequencing in CpG islands), and a computational method to extract methylation information and diagnose patients. Applying our approach to 408 colon, liver, lung, and stomach cancer patients and controls, at 97.9% specificity we achieve 80.7% and 74.5% sensitivity in detecting all-stage and early-stage cancer, and 89.1% and 85.0% accuracy for locating tissue-of-origin of all-stage and early-stage cancer, respectively. Our approach cost-effectively retains methylome profiles of cancer abnormalities, allowing us to learn new features and expand to other cancer types as training cohorts grow.
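The reported operating point (sensitivity at a fixed specificity) can be illustrated with a minimal sketch. It assumes each sample has a single classifier score where higher means more cancer-like; the function name and the toy scores are hypothetical and are not the authors' data or code.

```python
import math


def sensitivity_at_specificity(control_scores, cancer_scores, target_specificity):
    """Choose the lowest threshold that keeps specificity >= target,
    then report the sensitivity obtained at that threshold.

    A sample is called positive when its score is strictly above the threshold.
    """
    controls = sorted(control_scores)
    # Number of controls that must score at or below the threshold.
    # The small epsilon guards against floating-point overshoot in the product.
    n_needed = math.ceil(target_specificity * len(controls) - 1e-9)
    threshold = controls[n_needed - 1] if n_needed > 0 else float("-inf")
    specificity = sum(s <= threshold for s in controls) / len(controls)
    sensitivity = sum(s > threshold for s in cancer_scores) / len(cancer_scores)
    return threshold, specificity, sensitivity


# Toy example with made-up scores (hypothetical values).
thr, spec, sens = sensitivity_at_specificity(
    control_scores=[0.1, 0.2, 0.3, 0.9],
    cancer_scores=[0.5, 0.8, 0.25, 0.95],
    target_specificity=0.75,
)
print(thr, spec, sens)
```

Fixing the threshold on controls first and only then counting detected cancers mirrors how a "sensitivity at X% specificity" figure is usually read; the abstract does not describe the exact procedure used.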
Subjects
Cell-Free Nucleic Acids; Stomach Neoplasms; Cell-Free Nucleic Acids/genetics; Cost-Benefit Analysis; Early Detection of Cancer; Epigenome; Humans; Stomach Neoplasms/diagnosis; Stomach Neoplasms/genetics
ABSTRACT
Synthesis of the major biological methyl donor, S-adenosylmethionine (adoMet), occurs mainly in the liver. Methionine adenosyltransferase 1A (MAT1A) and glycine N-methyltransferase (GNMT) are two key enzymes involved in the functional implications of that variation. We collected RNA-seq data for 42 pairs of hepatocellular carcinoma (HCC) and adjacent normal liver tissue from The Cancer Genome Atlas (TCGA). No mutations were found in MAT1A or GNMT RNA in the 42 HCC patients. A total of 11,799 genes were annotated in the RNA-Seq data, and their expression levels were used to investigate the phenotypes of low MAT1A and low GNMT by Gene Set Enrichment Analysis (GSEA). The REACTOME_TRANSLATION gene set was enriched and visualized in a heatmap along with corresponding differences in gene expression between low MAT1A versus high MAT1A and low GNMT versus high GNMT. We identified 43 genes of the REACTOME_TRANSLATION gene set that are powerful prognostic factors in HCC. The significantly predictive genes were involved in eukaryotic translation initiation (EIF3B, EIF3K), eukaryotic translation elongation (EEF1D), and ribosomal proteins (RPs). Cell models expressing various levels of MAT1A and GNMT showed that simultaneously restoring the expression of MAT1A and GNMT decreased cell proliferation and invasion, as well as expression of the REACTOME_TRANSLATION gene EEF1D, consistent with a better prognosis in human HCC. We demonstrated that downregulation of or defects in the MAT1A and GNMT genes can enrich the protein translation process, which may account for poor HCC prognosis. This is the first study to demonstrate that MAT1A and GNMT, the 2 key enzymes involved in the methionine cycle, could attenuate ribosomal translation. We propose a potential novel mechanism by which diminished GNMT and MAT1A expression may confer poor prognosis for HCC.
Subjects
Carcinoma, Hepatocellular/genetics; Down-Regulation/genetics; Gene Expression Regulation, Neoplastic; Glycine N-Methyltransferase/genetics; Liver Neoplasms/genetics; Methionine Adenosyltransferase/genetics; Methionine/metabolism; Protein Biosynthesis; Base Sequence; Carcinoma, Hepatocellular/pathology; Cell Line, Tumor; Cell Proliferation/genetics; DNA Methylation/genetics; Eukaryotic Initiation Factor-3/metabolism; Glycine N-Methyltransferase/metabolism; Humans; Kaplan-Meier Estimate; Liver Neoplasms/pathology; Methionine Adenosyltransferase/metabolism; Neoplasm Invasiveness; Peptide Elongation Factor 1/metabolism; Promoter Regions, Genetic/genetics; Protein Biosynthesis/genetics; Survival Analysis
ABSTRACT
Sex form is one of the most important characteristics in papaya cultivation, in which the hermaphrodite is the preferred form. Self-pollination of H*-TSS No.7, an inbred line derived from a rare X chromosome mutant SR*, produced all-hermaphrodite progeny. The recessive lethal allele controlling the all-hermaphrodite phenomenon was proposed to be the recessive Germination suppressor (gs) locus. This study employed next-generation sequencing technology and genome comparison to identify the candidate Gs gene. One specific gene, monodehydroascorbate reductase 4 (MDAR4), harboring a unique polymorphic 3 bp deletion in H*-TSS No.7, was identified. MDAR4 is known to be involved in the hydrogen peroxide (H2O2) scavenging pathway and is associated with seed germination. Furthermore, MDAR4 showed higher expression in imbibed seeds than in dry seeds, indicating its potential role in seed germination. This is possibly the first report providing evidence that MDAR4 is the candidate for the Gs locus in H*-TSS No.7. In addition, Gs allele-specific markers were developed, which should facilitate the breeding of all-hermaphrodite lines.
Subjects
Carica/genetics; Chromosomes, Plant/genetics; Hermaphroditic Organisms/genetics; NADH, NADPH Oxidoreductases/genetics; Genome, Plant/genetics; Germination/genetics; Hydrogen Peroxide/metabolism; Pollination/genetics; Pollination/physiology; Seeds/growth & development; Sequence Deletion/genetics
ABSTRACT
Folate depletion causes chromosomal instability by increasing DNA strand breakage, uracil misincorporation, and defective repair. Folate-mediated one-carbon metabolism has been suggested to play a key role in the carcinogenesis and progression of hepatocellular carcinoma (HCC) through influencing DNA integrity. Methylenetetrahydrofolate reductase (MTHFR) is the enzyme catalyzing the irreversible conversion of 5,10-methylenetetrahydrofolate to 5-methyltetrahydrofolate that can control folate cofactor distributions and modulate the partitioning of intracellular one-carbon moieties. The association between MTHFR polymorphisms and HCC risk is inconsistent and remains controversial in population studies. We aimed to establish an in vitro cell model of liver origin to elucidate the interactions between MTHFR function, folate status, and chromosome stability. In the present study, we (1) examined MTHFR expression in HCC patients; (2) established cell models of liver origin with stabilized inhibition of MTHFR using small hairpin RNA delivered by a lentiviral vector; and (3) investigated the impacts of reduced MTHFR and folate status on cell cycle, methyl group homeostasis, nucleotide biosynthesis, and DNA stability, all of which are pathways involved in DNA integrity and repair and are critical in human tumorigenesis. By analyzing the TCGA/GTEx datasets available within GEPIA2, we discovered that HCC patients with higher MTHFR had a worse survival rate. The shRNA of MTHFR (shMTHFR) resulted in decreased MTHFR gene expression, MTHFR protein, and enzymatic activity in the human hepatoma cell line HepG2. shMTHFR tended to decrease intracellular S-adenosylmethionine (SAM) contents, but folate depletion similarly decreased SAM in wildtype (WT), negative control (Neg), and shMTHFR cells, indicating that in cells of liver origin, shMTHFR does not exacerbate the loss of methyl group supply under folate depletion.
shMTHFR caused cell accumulation in G2/M, and the cell population in G2/M was inversely correlated with MTHFR gene level (r = -0.81, p < 0.0001), MTHFR protein expression (r = -0.8; p = 0.01), and MTHFR enzyme activity (r = -0.842; p = 0.005). Folate depletion resulted in G2/M cell cycle arrest in WT and Neg but not in shMTHFR cells, indicating that shMTHFR does not exacerbate folate depletion-induced G2/M cell cycle arrest. In addition, shMTHFR promoted the expression and translocation of the nuclear thymidine synthetic enzyme complex SHMT1/DHFR/TYMS and assisted folate-dependent de novo nucleotide biosynthesis under folate restriction. Finally, shMTHFR promoted nuclear MLH1/p53 expression under folate deficiency and further reduced micronuclei formation and DNA uracil misincorporation under folate deficiency. In conclusion, shMTHFR in HepG2 induces cell cycle arrest in G2/M that may promote nucleotide supply and assist cell defense against folate depletion-induced chromosome segregation and uracil misincorporation in the DNA. This study provides insight into the significant impact of MTHFR function on chromosome stability of hepatic tissues. Data from the present study may shed light on the potential regulatory mechanism by which MTHFR modulates the risk for hepatic malignancies.
Subjects
Carcinoma, Hepatocellular/pathology; Chromosome Segregation; DNA, Neoplasm/genetics; Folic Acid/metabolism; Methylenetetrahydrofolate Reductase (NADPH2)/antagonists & inhibitors; Uracil/metabolism; Apoptosis; Carcinoma, Hepatocellular/genetics; Carcinoma, Hepatocellular/metabolism; Cell Proliferation; Chromosomal Instability; DNA, Neoplasm/metabolism; Gene Expression Regulation, Neoplastic; Humans; Liver Neoplasms/genetics; Liver Neoplasms/metabolism; Liver Neoplasms/pathology; Methylenetetrahydrofolate Reductase (NADPH2)/genetics; Methylenetetrahydrofolate Reductase (NADPH2)/metabolism; Polymorphism, Genetic; Prognosis; Survival Rate; Tumor Cells, Cultured
ABSTRACT
Aberrant elevated Src activity is related to lung cancer growth and metastasis. Therefore, the development of potent small molecule inhibitors to target Src kinase is a potential therapeutic strategy for lung cancer. This study aimed to develop a computational model for the in silico screening of Src inhibitors and then assess the suppressive effect of candidate compounds on cellular functions. A 3D-quantitative structure-activity relationship (QSAR) pharmacophore model consisting of two hydrogen bond acceptors and two hydrophobic regions was constructed by using 28 structurally diverse compounds with IC50 values spanning four orders of magnitude. A National Cancer Institute (NCI) compound dataset was employed for virtual screening by applying the pharmacophore model and molecular docking. Candidate compounds were chosen from the top 20% of scored hits. Among these compounds, the suppressive effects of 30 compounds available in the NCI on Src phosphorylation were validated by using an enzyme-linked immunosorbent assay. Among these compounds, SJG-136, a pyrrolobenzodiazepine dimer, showed a significant inhibitory effect against Src activity in a dose-dependent manner. Further investigations showed that SJG-136 can inhibit lung cancer cell proliferation, clonogenicity, invasion and migration in vitro and tumour growth in vivo. Furthermore, SJG-136 also had an inhibitory effect on Src-related signaling pathways, including the FAK, paxillin, p130Cas, PI3K, AKT, and MEK pathways. In conclusion, we have established a pharmacophore-based virtual screening approach to identify novel Src inhibitors that can inhibit lung cancer cell growth and motility through suppressing Src-related pathways. These findings may contribute to the development of targeted drugs for lung cancer treatment, such as lead compounds.
ABSTRACT
RNA-Sequencing (RNA-Seq), the most commonly used sequencing application, is not only a method for measuring gene expression but also an excellent medium for detecting important sequence variants such as single nucleotide variants (SNVs), insertions/deletions (indels), or fusion transcripts. The Cancer Genome Atlas (TCGA) contains genomic data from a variety of cancer types and also provides the raw data generated by the TCGA consortium. p53 mutation is among the top 10 somatic mutations associated with hepatocellular carcinoma (HCC). The aim of the present study was to analyze concordantly different gene profiles and a priori defined gene sets based on p53 mutation status in HCC using RNA-Seq data. In the study, expression profiles of 11 799 genes in 42 paired tumor and adjacent normal tissues were collected, processed, and further stratified by p53 status (mutated versus normal). Furthermore, we used a knowledge-based approach, Gene Set Enrichment Analysis (GSEA), to compare normal and p53-mutated gene expression profiles. The statistical significance (nominal P value) of the enrichment score (ES) was calculated. The ranked gene list that reflects differential expression between p53 wild-type and mutant genotypes was then mapped to metabolic processes using KEGG, an encyclopedia of genes and genomes, to assign functional meanings. These approaches enable us to identify pathways and potential target genes/pathways that are highly expressed in p53-mutated HCC. Our analysis revealed that 2 genes, hexokinase 2 (HK2) and enolase 1 (ENO1), stood out as red pixels in the heatmap. To further explore the role of these genes in HCC, overall survival was plotted by the Kaplan-Meier method for HK2 and ENO1, which revealed that patients with HCC with high HK2 and ENO1 expression have a poor prognosis. These results suggested that these glycolysis genes are associated with mutated p53 in HCC, which may contribute to poor prognosis.
In this proof-of-concept study, we proposed an approach for identifying novel potential therapeutic targets in human HCC with mutated p53. These approaches can take advantage of the massive next-generation sequencing (NGS) data generated worldwide and make more out of it by exploring new potential therapeutic targets.
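To make the enrichment score (ES) concrete, here is a minimal sketch of the standard GSEA running-sum statistic: walking down the ranked gene list, the sum rises at genes in the set (weighted by their ranking score) and falls at genes outside it, and the ES is the largest deviation from zero. The function name, gene list, and scores below are illustrative assumptions, not the study's data or pipeline.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Weighted Kolmogorov-Smirnov-like running sum used by GSEA.

    ranked_genes: gene names sorted from most up- to most down-regulated.
    scores: ranking metric for each gene, in the same order.
    gene_set: the a priori defined set of genes to test.
    p: weight exponent (p=1 is the commonly used weighting).
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the list without covering it")

    # Normalizer so that all hit increments sum to 1.
    hit_norm = sum(abs(s) ** p for s, hit in zip(scores, in_set) if hit)

    running, best = 0.0, 0.0
    for score, hit in zip(scores, in_set):
        if hit:
            running += abs(score) ** p / hit_norm
        else:
            running -= 1.0 / n_miss
        # Keep the signed value with the largest absolute deviation.
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list (made-up scores); the set sits at the top, so ES is positive.
genes = ["HK2", "ENO1", "A", "B", "C", "D"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
print(enrichment_score(genes, metric, {"HK2", "ENO1"}))
```

A nominal P value is typically obtained by recomputing the ES after permuting sample labels or gene labels many times; that step is omitted here for brevity.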
ABSTRACT
We aimed to investigate the association of gut microbiota with disease activity, inflammatory parameters, and autoantibody profile in rheumatoid arthritis (RA). A total of 138 RA patients and 21 healthy controls (HC) were enrolled. Fecal samples were collected for bacterial DNA extraction and 16S ribosomal RNA (rRNA) sequencing, followed by analyses of gut microbiota composition. Serum levels of tumor necrosis factor (TNF)-α, interleukin (IL)-6, and IL-17A were determined by ELISA. Our results indicated that RA patients had a lower diversity index, which reflects both evenness and richness of gut microbiota, compared to HC. The alpha-diversity was lower in anti-citrullinated peptide antibody (ACPA)-positive patients than in HC. The phylum Verrucomicrobiae and genus Akkermansia were more abundant in patients compared to HC. There was an increased relative abundance of Enterobacteriaceae as well as Klebsiella, and a lower abundance of Bifidobacterium, in patients with high levels of TNF-α or IL-17A compared to those who had low levels of these cytokines. In addition, ACPA-positive patients had higher proportions of Blautia, Akkermansia, and Clostridiales than ACPA-negative patients. Gut dysbiosis in RA patients manifested as an altered microbial composition that was associated with inflammatory parameters and ACPA seropositivity. These findings support the involvement of gut microbiota in RA pathogenesis.
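The abstract does not name the diversity index used. As an illustration only, the Shannon index is one widely used measure that combines richness (how many taxa are present) and evenness (how equally they are represented); the sketch below assumes that index, and the counts are made up.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_evenness(counts):
    """Evenness J = H / ln(richness); defined as 0 when fewer than two taxa are present."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0
    return shannon_index(counts) / math.log(richness)


# Hypothetical read counts per taxon for two samples.
even_sample = [25, 25, 25, 25]      # four taxa, perfectly even
skewed_sample = [97, 1, 1, 1]       # four taxa, dominated by one

print(shannon_index(even_sample), pielou_evenness(even_sample))
print(shannon_index(skewed_sample), pielou_evenness(skewed_sample))
```

Both samples have the same richness, but the skewed sample scores lower because one taxon dominates, which is the sense in which such an index reflects evenness as well as richness.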
ABSTRACT
MOTIVATION: In recent years, several experimental studies have revealed that the microRNAs (miRNAs) in serum, plasma, exosome and whole blood are dysregulated in various types of diseases, indicating that the circulating miRNAs may serve as potential noninvasive biomarkers for disease diagnosis and prognosis. However, no database has been constructed to integrate the large-scale circulating miRNA profiles, explore the functional pathways involved and predict the potential biomarkers using feature selection between the disease conditions. Although there have been several studies attempting to generate a circulating miRNA database, they have not yet integrated the large-scale circulating miRNA profiles or provided the biomarker-selection function using machine learning methods. RESULTS: To fill this gap, we constructed the Circulating MicroRNA Expression Profiling (CMEP) database for integrating, analyzing and visualizing the large-scale expression profiles of phenotype-specific circulating miRNAs. The CMEP database contains massive datasets that were manually curated from NCBI GEO and the exRNA Atlas, including 66 datasets, 228 subsets and 10 419 samples. The CMEP provides the differential expression circulating miRNAs analysis and the KEGG functional pathway enrichment analysis. Furthermore, to provide the function of noninvasive biomarker discovery, we implemented several feature-selection methods, including ridge regression, lasso regression, support vector machine and random forests. Finally, we implemented a user-friendly web interface to improve the user experience and to visualize the data and results of CMEP. AVAILABILITY AND IMPLEMENTATION: CMEP is accessible at http://syslab5.nchu.edu.tw/CMEP.
Subjects
Databases, Factual; Biomarkers; Circulating MicroRNA; Exosomes; Gene Expression Profiling
ABSTRACT
BACKGROUND: Abiotic and biotic stresses severely affect the growth and reproduction of plants and crops. Determining the critical molecular mechanisms and cellular processes in response to stresses will provide biological insight for addressing both climate change and food crises. RNA sequencing (RNA-Seq) is a revolutionary tool that has been used extensively in plant stress research. However, no existing large-scale RNA-Seq database has been designed to provide information on the stress-specific differentially expressed transcripts that occur across diverse plant species and various stresses. RESULTS: We have constructed a comprehensive database, the plant stress RNA-Seq nexus (PSRN), which includes 12 plant species, 26 plant-stress RNA-Seq datasets, and 937 samples. All samples are assigned to 133 stress-specific subsets, which are constructed into 254 subset pairs, a comparison between selected two subsets, for stress-specific differentially expressed transcript identification. CONCLUSIONS: PSRN is an open resource for intuitive data exploration, providing expression profiles of coding-transcript/lncRNA and identifying which transcripts are differentially expressed between different stress-specific subsets, in order to support researchers generating new biological insights and hypotheses in molecular breeding or evolution. PSRN is freely available at http://syslab5.nchu.edu.tw/PSRN .
Subjects
Databases, Genetic; Plant Cells/metabolism; Stress, Physiological; Transcriptome; Internet Access; RNA, Plant/metabolism; User-Computer Interface
ABSTRACT
The detection of tumor-derived cell-free DNA in plasma is one of the most promising directions in cancer diagnosis. The major challenge in such an approach is how to identify the tiny amount of tumor DNA among the total cell-free DNA in blood. Here we propose an ultrasensitive cancer detection method, termed 'CancerDetector', using the DNA methylation profiles of cell-free DNA. The key to our method is to probabilistically model the joint methylation states of multiple adjacent CpG sites on an individual sequencing read, in order to exploit the pervasive nature of DNA methylation for signal amplification. Therefore, CancerDetector can sensitively identify a trace amount of tumor cfDNA in plasma, at the level of individual reads. We evaluated CancerDetector on simulated data and showed a high concordance between the predicted and true tumor fractions. Testing CancerDetector on real plasma data demonstrated its high sensitivity and specificity in detecting tumor cfDNA. In addition, the predicted tumor fraction showed great consistency with tumor size and survival outcome. Note that all of these tests were performed on sequencing data at low to medium coverage (1× to 10×). Therefore, CancerDetector holds great potential to detect cancer early and cost-effectively.
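The read-level idea can be sketched as a two-component mixture. This is a simplified stand-in and not the published CancerDetector model: it assumes a single fixed methylation level per class (tumor or normal), treats CpG sites on a read as independent, and estimates the tumor fraction by a grid search over the mixture likelihood. All names and numbers are hypothetical.

```python
import math


def read_likelihood(read_states, methylation_level):
    """P(read | class): product over the read's CpG sites, where each site is
    methylated (1) with probability methylation_level, else unmethylated (0)."""
    likelihood = 1.0
    for state in read_states:
        likelihood *= methylation_level if state == 1 else 1.0 - methylation_level
    return likelihood


def estimate_tumor_fraction(reads, tumor_level, normal_level, grid_size=1000):
    """Grid-search maximum-likelihood estimate of the tumor fraction theta in
    the mixture theta * P(read | tumor) + (1 - theta) * P(read | normal)."""
    per_read = [
        (read_likelihood(r, tumor_level), read_likelihood(r, normal_level))
        for r in reads
    ]
    best_theta, best_loglik = 0.0, float("-inf")
    for i in range(grid_size + 1):
        theta = i / grid_size
        loglik = 0.0
        for p_tumor, p_normal in per_read:
            mix = theta * p_tumor + (1.0 - theta) * p_normal
            loglik += math.log(mix) if mix > 0 else float("-inf")
        if loglik > best_loglik:
            best_theta, best_loglik = theta, loglik
    return best_theta


# Toy marker region that is hypermethylated in tumor (made-up levels and reads).
reads = [[1, 1, 1, 1]] * 2 + [[0, 0, 0, 0]] * 8
print(estimate_tumor_fraction(reads, tumor_level=0.9, normal_level=0.1))
```

Because several adjacent CpGs on one read are multiplied together, a fully methylated read is far more likely under the tumor class than any single CpG would suggest, which is the signal amplification the abstract refers to.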
Subjects
Algorithms; Cell-Free Nucleic Acids/genetics; Computational Biology/methods; DNA Methylation; Neoplasms/diagnosis; Cell-Free Nucleic Acids/chemistry; CpG Islands/genetics; DNA, Neoplasm/chemistry; DNA, Neoplasm/genetics; High-Throughput Nucleotide Sequencing/methods; Humans; Neoplasms/blood; Neoplasms/genetics; ROC Curve; Reproducibility of Results
ABSTRACT
BACKGROUND: Emerging experimental evidence has confirmed the tissue-specific expression of circular RNAs (circRNAs). Global identification of human tissue-specific circRNAs is crucial for functional studies and facilitates the discovery of circRNAs as potential diagnostic biomarkers. RESULTS: In this study, circRNA back-splicing junctions were identified from 465 publicly available transcriptome sequencing samples. The number of reads aligned to these identified junctions was normalized by the read length and sequencing depth for each sample. We generated 66 models representing enriched circRNAs among human tissue transcriptomes using a biclustering algorithm. The result provides thousands of newly identified human tissue-specific circRNAs. CONCLUSIONS: This result suggests that the expression of circRNAs does not result from random splicing errors but serves functional molecular roles. We also identified circRNAs enriched within the circulatory system, which, along with the identified tissue-specific circRNAs, can serve as potential diagnostic biomarkers.
Subjects
Algorithms; Biomarkers/metabolism; Gene Expression Regulation; High-Throughput Nucleotide Sequencing/methods; RNA/genetics; Transcriptome; Brain/metabolism; Cluster Analysis; Humans; Organ Specificity; RNA, Circular
ABSTRACT
BACKGROUND: N-Butanol has favorable characteristics for use as either an alternative fuel or platform chemical. Bio-based n-butanol production using microbes is an emerging technology that requires further development. Although bio-industrial microbes such as Escherichia coli have been engineered to produce n-butanol, reactive oxygen species (ROS)-mediated toxicity may limit productivity. Previously, we showed that outer-membrane-targeted tilapia metallothionein (OmpC-TMT) is more effective as an ROS scavenger than human and mouse metallothioneins in reducing oxidative stress in the host cell. RESULTS: The host strain (BUT1-DE) containing the clostridial n-butanol pathway displayed a decreased growth rate and limited n-butanol productivity, likely due to ROS accumulation. The clostridial n-butanol pathway was co-engineered with inducible OmpC-TMT in E. coli (BUT3-DE) for simultaneous ROS removal, and its effect on n-butanol productivity was examined. The ROS scavenging ability of cells overexpressing OmpC-TMT was examined and showed an approximately twofold increase in capacity. The modified strain improved n-butanol productivity to 320 mg/L, whereas the control strain produced only 95.1 mg/L. Transcriptomic analysis revealed three major KEGG pathways that were significantly differentially expressed in the BUT3-DE strain compared with their expression in the BUT1-DE strain, including genes involved in oxidative phosphorylation, fructose and mannose metabolism and glycolysis/gluconeogenesis. CONCLUSIONS: These results indicate that OmpC-TMT can increase n-butanol production by scavenging ROS. The transcriptomic analysis suggested that n-butanol causes quinone malfunction, resulting in oxidative-phosphorylation-related nuo operon downregulation, which would diminish the ability to convert NADH to NAD+ and generate proton motive force.
However, fructose and mannose metabolism-related genes (fucA, srlE and srlA) were upregulated, and glycolysis/gluconeogenesis-related genes (pfkB, pgm) were downregulated, which further assisted in regulating NADH/NAD+ redox and preventing additional ATP depletion. These results indicated that more NADH and ATP were required in the n-butanol synthetic pathway. Our study demonstrates a potential approach to increase the robustness of microorganisms and the production of toxic chemicals through the ability to reduce oxidative stress.
Subjects
1-Butanol/metabolism; Clostridium/enzymology; Escherichia coli/physiology; Metallothionein/metabolism; Porins/metabolism; Tilapia/metabolism; 1-Butanol/isolation & purification; Animals; Cell Membrane/metabolism; Clostridium/genetics; Gene Expression Regulation, Bacterial/physiology; Genetic Enhancement/methods; Metallothionein/genetics; Porins/genetics; Protein Engineering/methods; Signal Transduction/genetics; Tilapia/genetics
ABSTRACT
BACKGROUND: Transcription factors (TFs) often interact with one another to form TF complexes that bind DNA and regulate gene expression. Many databases are created to describe known TF complexes identified by either mammalian two-hybrid experiments or data mining. Lately, a wealth of ChIP-seq data on human TFs under different experiment conditions are available, making it possible to investigate condition-specific (cell type and/or physiologic state) TF complexes and their target genes. RESULTS: Here, we developed a systematic pipeline to infer Condition-Specific Targets of human TF-TF complexes (called the CST pipeline) by integrating ChIP-seq data and TF motifs. In total, we predicted 2,392 TF complexes and 13,504 high-confidence or 127,994 low-confidence regulatory interactions amongst TF complexes and their target genes. We validated our predictions by (i) comparing predicted TF complexes to external TF complex databases, (ii) validating selected target genes of TF complexes using ChIP-qPCR and RT-PCR experiments, and (iii) analysing target genes of select TF complexes using gene ontology enrichment to demonstrate the accuracy of our work. Finally, the predicted results above were integrated and employed to construct a CST database. CONCLUSIONS: We built up a methodology to construct the CST database, which contributes to the analysis of transcriptional regulation and the identification of novel TF-TF complex formation in a certain condition. This database also allows users to visualize condition-specific TF regulatory networks through a user-friendly web interface.
Subjects
Chromatin Immunoprecipitation; Computational Biology; Sequence Analysis, DNA; Transcription Factors/metabolism; Databases, Genetic; Gene Ontology; Humans; Nucleotide Motifs; Transcription, Genetic
ABSTRACT
MicroRNAs (miRNAs) are known to play critical roles in plant development and stress-response regulation, and they frequently display multi-targeting characteristics. The control of defined rice phenotypes occurs through multiple genes; however, evidence demonstrating the relationship between agronomic traits and miRNA expression profiles is lacking. In this study, we investigated eight yield-related traits in 187 local rice cultivars and profiled the expression levels of 193 miRNAs in these cultivars using microarray analyses. By integrating the miRBase database, the rice annotation project database, and the miRanda and psRNATarget web servers, we constructed a database (RiceATM) that can be employed to investigate the association between rice agronomic traits and miRNA expression. The functions of this platform include phenotype selection, sample grouping, microarray data pretreatment, statistical analysis and target gene predictions. To demonstrate the utility of RiceATM, we used the database to identify four miRNAs associated with the heading date and validated their expression trends in the cultivars with early or late heading date by real-time PCR. RiceATM is a useful tool for researchers seeking to characterize the role of certain miRNAs for a specific phenotype and discover potential biomarkers for breeding or functional studies. Database URL: http://syslab3.nchu.edu.tw/rice/.
Subjects
Crops, Agricultural; Databases, Genetic; Gene Expression Regulation, Plant; Oryza; Quantitative Trait Loci; RNA, Plant; Software; Crops, Agricultural/genetics; Crops, Agricultural/metabolism; Genome, Plant; MicroRNAs/biosynthesis; MicroRNAs/genetics; Molecular Sequence Annotation; Oryza/genetics; Oryza/metabolism; RNA, Plant/biosynthesis; RNA, Plant/genetics
ABSTRACT
BACKGROUND: Chromatin immunoprecipitation followed by massively parallel DNA sequencing (ChIP-seq) or microarray hybridization (ChIP-chip) has been widely used to determine the genomic occupation of transcription factors (TFs). We have previously developed a probabilistic method, called TIP (Target Identification from Profiles), to identify TF target genes using ChIP-seq/ChIP-chip data. To achieve high specificity, TIP applies a conservative method to estimate significance of target genes, with the trade-off being a relatively low sensitivity of target gene identification compared to other methods. Additionally, TIP's output does not render binding-peak locations or intensity, information highly useful for visualization and general experimental biological use, while the variability of ChIP-seq/ChIP-chip file formats has made input into TIP more difficult than desired. DESCRIPTION: To improve upon these facets, here we present a refined TIP with key extensions. First, it implements a Gaussian mixture model for p-value estimation, increasing target gene identification sensitivity and more accurately capturing the shape of TF binding profile distributions. Second, it enables the incorporation of TF binding-peak data by identifying their locations in significant target gene promoter regions and quantifying their strengths. Finally, for full ease of implementation we have incorporated it into a web server ( http://syslab3.nchu.edu.tw/iTAR/ ) that enables flexibility of input file format, can be used across multiple species and genome assembly versions, and is freely available for public use. The web server additionally performs GO enrichment analysis for the identified target genes to reveal the potential function of the corresponding TF. CONCLUSIONS: The iTAR web server provides a user-friendly interface and supports target gene identification in seven species, ranging from yeast to human.
To facilitate investigating the quality of ChIP-seq/ChIP-chip data, the web server generates the chart of the characteristic binding profiles and the density plot of normalized regulatory scores. The iTAR web server is a useful tool in identifying TF target genes from ChIP-seq/ChIP-chip data and discovering biological insights.
Subjects
Chromatin Immunoprecipitation; STAT3 Transcription Factor/metabolism; User-Computer Interface; Algorithms; HeLa Cells; High-Throughput Nucleotide Sequencing; Humans; Internet; Promoter Regions, Genetic; STAT3 Transcription Factor/genetics; Sequence Analysis, DNA
ABSTRACT
The post-genomic era has resulted in the accumulation of high-throughput cancer data from a vast array of genomic technologies including next-generation sequencing and microarray. As such, the large amounts of germline variant and somatic mutation data that have been generated from GWAS and sequencing projects, respectively, show great promise in providing a systems-level view of these genetic aberrations. In this study, we analyze publicly available GWAS, somatic mutation, and drug target data derived from large databanks using a network-based approach that incorporates directed edge information under a randomized network hypothesis testing procedure. We show that these three classes of disease-associated nodes exhibit non-random topological characteristics in the context of a functional interactome. Specifically, we show that drug targets tend to lie upstream of somatic mutations and disease susceptibility germline variants. In addition, we introduce a new approach to measuring hierarchy between drug targets, somatic mutants, and disease susceptibility genes by utilizing directionality and path length information. Overall, our results provide new insight into the intrinsic relationships between these node classes that broaden our understanding of cancer. In addition, our results align with current knowledge on the therapeutic actionability of GWAS and somatic mutant nodes, while demonstrating relationships between node classes from a global network perspective.
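The question of whether one node class lies upstream of another in a directed network can be illustrated with a small permutation test. This sketch uses a deliberately simple variant, shuffling which nodes are labelled as the upstream class rather than rewiring the network, and a simple reachability statistic; both choices are assumptions for illustration and not the procedure used in the study. The toy graph and node names are hypothetical.

```python
import random
from collections import deque


def reachable_from(graph, sources):
    """All nodes reachable from the sources along directed edges (BFS).
    The sources themselves are included in the result."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def upstream_fraction(graph, upstream_set, downstream_set):
    """Fraction of downstream_set nodes reachable from upstream_set."""
    reached = reachable_from(graph, upstream_set)
    return len(reached & set(downstream_set)) / len(downstream_set)


def permutation_p_value(graph, upstream_set, downstream_set, n_perm=1000, seed=0):
    """Empirical p-value: how often a random node set of the same size is at
    least as 'upstream' of downstream_set as the observed set."""
    rng = random.Random(seed)
    nodes = sorted(set(graph) | {v for targets in graph.values() for v in targets})
    observed = upstream_fraction(graph, upstream_set, downstream_set)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        random_set = rng.sample(nodes, len(upstream_set))
        if upstream_fraction(graph, random_set, downstream_set) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)


# Toy directed interactome: drug targets (D*) point toward mutated genes (M*).
toy_graph = {"D1": ["M1"], "D2": ["M2"], "M1": ["G1"], "M2": [], "G1": [], "X": []}
print(permutation_p_value(toy_graph, {"D1", "D2"}, {"M1", "M2"}))
```

Using edge direction (rather than undirected distance) is what lets the test distinguish "A is upstream of B" from "A and B are merely close", which is the kind of hierarchy the abstract describes.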
Subjects
Genes, Neoplasm; Neoplasms/genetics; Databases, Genetic; Drug Delivery Systems; Gene Regulatory Networks; Genome-Wide Association Study; Humans; Mutation/genetics
ABSTRACT
The genome-wide transcriptome profiling of cancerous and normal tissue samples can provide insights into the molecular mechanisms of cancer initiation and progression. RNA Sequencing (RNA-Seq) is a revolutionary tool that has been used extensively in cancer research. However, no existing RNA-Seq database provides all of the following features: (i) large-scale and comprehensive data archives and analyses, including coding-transcript profiling, long non-coding RNA (lncRNA) profiling and coexpression networks; (ii) phenotype-oriented data organization and searching and (iii) the visualization of expression profiles, differential expression and regulatory networks. We have constructed the first public database that meets these criteria, the Cancer RNA-Seq Nexus (CRN, http://syslab4.nchu.edu.tw/CRN). CRN has a user-friendly web interface designed to facilitate cancer research and personalized medicine. It is an open resource for intuitive data exploration, providing coding-transcript/lncRNA expression profiles to support researchers generating new hypotheses in cancer research and personalized medicine.