ABSTRACT
By analyzing gene expression data in glioblastoma in combination with matched microRNA profiles, we have uncovered a posttranscriptional regulation layer of surprising magnitude, comprising more than 248,000 microRNA (miR)-mediated interactions. These include ~7,000 genes whose transcripts act as miR "sponges" and 148 genes that act through alternative, nonsponge interactions. Biochemical analyses in cell lines confirmed that this network regulates established drivers of tumor initiation and subtype implementation, including PTEN, PDGFRA, RB1, VEGFA, STAT3, and RUNX1, suggesting that these interactions mediate crosstalk between canonical oncogenic pathways. siRNA silencing of 13 miR-mediated PTEN regulators, whose locus deletions are predictive of PTEN expression variability, was sufficient to downregulate PTEN in a 3'UTR-dependent manner and to increase tumor cell growth rates. Thus, miR-mediated interactions provide a mechanistic, experimentally validated rationale for the loss of PTEN expression in a large number of glioma samples with an intact PTEN locus.
Subjects
Gene Expression Regulation, Neoplastic , Glioblastoma/genetics , Glioblastoma/metabolism , MicroRNAs/metabolism , Humans , Multivariate Analysis , Oncogenes , PTEN Phosphohydrolase/genetics , RNA Interference
ABSTRACT
LIN28 is a bipartite RNA-binding protein that post-transcriptionally inhibits the biogenesis of let-7 microRNAs to regulate development and influence disease states. However, the mechanisms of let-7 suppression remain poorly understood because LIN28 recognition depends on coordinated targeting by both the zinc knuckle domain (ZKD), which binds a GGAG-like element in the precursor, and the cold shock domain (CSD), whose binding sites have not been systematically characterized. By leveraging single-nucleotide-resolution mapping of LIN28 binding sites in vivo, we determined that the CSD recognizes a (U)GAU motif. This motif partitions the let-7 microRNAs into two subclasses: precursors with both CSD and ZKD binding sites (CSD+) and precursors with ZKD but no CSD binding sites (CSD-). In vivo, LIN28 recognition of CSD+ precursors, and their subsequent 3' uridylation and degradation, is more efficient, leading to their stronger suppression in LIN28-activated cells and cancers. Thus, CSD binding sites amplify the regulatory effects of LIN28.
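The CSD+/CSD- partition described above can be illustrated with a minimal motif scan. This is only a sketch under strong assumptions: motifs are matched as exact strings anywhere in the sequence, with no positional or structural constraints, and the function name and example sequences are hypothetical rather than taken from the study.

```python
import re

# Assumption: the "GGAG-like element" is reduced to an exact GGAG match.
ZKD_MOTIF = re.compile(r"GGAG")
# (U)GAU has an optional leading U, so any occurrence contains the core GAU.
CSD_MOTIF = re.compile(r"GAU")


def classify_precursor(sequence: str) -> str:
    """Label a precursor sequence as CSD+, CSD-, or lacking a ZKD site."""
    rna = sequence.upper().replace("T", "U")  # accept DNA-style input
    has_zkd = ZKD_MOTIF.search(rna) is not None
    has_csd = CSD_MOTIF.search(rna) is not None
    if has_zkd and has_csd:
        return "CSD+"
    if has_zkd:
        return "CSD-"
    return "no ZKD site"


if __name__ == "__main__":
    # Hypothetical toy sequences, not real let-7 precursors.
    for seq in ("GGGAUUACCGGAGAU", "CCCGGAGCCC", "ACGUACGU"):
        print(seq, classify_precursor(seq))
```

A realistic classifier would also need to consider where each motif sits within the precursor, which this sketch ignores.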
Subjects
MicroRNAs/metabolism , RNA-Binding Proteins/metabolism , Animals , Base Sequence , Embryonic Stem Cells , Hep G2 Cells , Humans , K562 Cells , Mice , MicroRNAs/genetics , Models, Molecular , Nucleic Acid Conformation , Protein Domains , Protein Structure, Tertiary , RNA Precursors/metabolism , RNA-Binding Proteins/genetics
ABSTRACT
Negative Pressure Wound Therapy (NPWT) is a commonly employed clinical strategy for wound healing, yet its early-stage mechanisms remain poorly understood. To address this knowledge gap and overcome the limitations of human trials, we establish an NPWT C57BL/6JNarl mouse model to investigate the molecular mechanisms involved in NPWT. In this study, we investigate the intricate molecular mechanisms through which NPWT expedites wound healing. Our focus is on NPWT's modulation of inflammatory immune responses and the concurrent orchestration of multiple signal transduction pathways, resulting in shortened coagulation time and reduced inflammation. Notably, we observe a significant rise in dickkopf-related protein 1 (DKK-1) concentration during NPWT, promoting the differentiation of Hair Follicle Stem Cells (HFSCs) into epidermal cells, expediting wound closure. Under negative pressure, macrophages express and release DKK-1 cytokines, crucial for stimulating HFSC differentiation, as validated in animal experiments and in vitro studies. Our findings illuminate the inflammatory dynamics under NPWT, revealing potential signal transduction pathways. The proposed framework, involving early hemostasis, balanced inflammation, and macrophage-mediated DKK-1 induction, provides a novel perspective on enhancing wound healing during NPWT. Furthermore, these insights lay the groundwork for future pharmacological advancements in managing extensive wounds, opening avenues for targeted therapeutic interventions in wound care.
Subjects
Negative-Pressure Wound Therapy , Humans , Mice , Animals , Negative-Pressure Wound Therapy/methods , Disease Models, Animal , Mice, Inbred C57BL , Wound Healing , Inflammation/therapy
ABSTRACT
MOTIVATION: Microbiota analyses have important implications for health and science. These analyses make use of 16S/18S rRNA gene sequencing to identify taxa and predict species diversity. However, most available tools for analyzing microbiota data require adept programming skills and in-depth statistical knowledge for proper implementation. While long-read amplicon sequencing can lead to more accurate taxa predictions and is quickly becoming more common, practitioners have no easily accessible tools with which to perform their analyses. RESULTS: We present MOCHI, a GUI tool for microbiota amplicon sequencing analysis. MOCHI preprocesses sequences, assigns taxonomy, identifies different abundant species and predicts species diversity and function. It takes as input either a taxonomic count table or FASTQ files of partial 16S/18S rRNA or full-length 16S rRNA gene sequences. It performs analyses in real time and visualizes data in both tabular and graphical formats. AVAILABILITY AND IMPLEMENTATION: MOCHI can be installed to run locally or accessed as a web tool at https://mochi.life.nctu.edu.tw. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
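As an illustration of how species diversity can be estimated from a taxonomic count table, the sketch below computes the Shannon diversity index for one sample. The abstract does not state which diversity measures MOCHI reports, so the choice of the Shannon index, the function name, and the example counts are all assumptions made for illustration.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical read counts per taxon for a single sample.
    sample = [120, 80, 40, 10, 0]
    print(f"Shannon index: {shannon_index(sample):.3f}")
```

Two equally abundant taxa give H = ln 2, and a sample dominated by a single taxon gives H = 0, which is a quick sanity check for any implementation.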
Subjects
High-Throughput Nucleotide Sequencing , Microbiota , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA , Microbiota/genetics , Phylogeny
ABSTRACT
BACKGROUND: Racial disparities in cancer outcomes are increasingly recognized, but comprehensive analyses, including molecular studies, are limited. The objective of the current study was to perform a pan-cancer clinical and epigenetic molecular analysis of outcomes in African American (AA) and European American (EA) patients. METHODS: Cross-platform analyses using cancer databases (the Surveillance, Epidemiology, and End Results program database and the National Cancer Data Base) and a molecular database (The Cancer Genome Ancestry Atlas) were performed to evaluate clinical and epigenetic molecular differences between AA and EA patients based on genetic ancestry. RESULTS: In the primary pan-cancer survival analysis using the Surveillance, Epidemiology, and End Results database (2,045,839 patients; 87.5% EA and 12.5% AA), AA patients had higher mortality rates for 28 of 42 cancer types analyzed (hazard ratio, >1.0). AAs continued to have higher mortality in 13 cancer types after adjustment for socioeconomic variables using the National Cancer Database (5,150,023 patients; 11.6% AA and 88.4% EA). Then, molecular features of 5,283 tumors were analyzed in patients who had genetic ancestry data available (87.2% EA and 12.8% AA). Genes were identified with altered DNA methylation along with increased microRNA expression levels unique to AA patients that are associated with cancer drug resistance. Increased miRNAs (miR-15a, miR-17, miR-130-3p, miR-181a) were noted in common among AAs with breast, kidney, thyroid, or prostate carcinomas. CONCLUSIONS: The current results identified epigenetic features in AA patients who have cancer that may contribute to higher mortality rates compared with EA patients who have cancer. Therefore, a focus on molecular signatures unique to AAs may identify actionable molecular abnormalities.
Subjects
Black or African American/genetics , Epigenesis, Genetic/genetics , Health Status Disparities , MicroRNAs/genetics , Neoplasms/genetics , White People/genetics , Black or African American/statistics & numerical data , Aged , Female , Humans , Incidence , Male , Middle Aged , Neoplasms/epidemiology , Neoplasms/ethnology , SEER Program/statistics & numerical data , Survival Analysis , United States/epidemiology , White People/statistics & numerical data
ABSTRACT
microRNAs (miRNAs) play key roles in cancer, but their propensity to couple their targets as competing endogenous RNAs (ceRNAs) has only recently emerged. Multiple models have studied ceRNA regulation, but these models did not account for the effects of co-regulation by miRNAs with many targets. We modeled ceRNA regulation and simulated its effects using established parameters for miRNA/mRNA interaction kinetics while accounting for co-regulation by multiple miRNAs with many targets. Our simulations suggested that co-regulation by many miRNA species is more likely to produce physiologically relevant context-independent couplings. To test this, we studied the overlap of inferred ceRNA networks from four tumor contexts, our proposed pan-cancer ceRNA interactome (PCI). PCI was composed of interactions between genes that were co-regulated by nearly three times as many miRNAs as other inferred ceRNA interactions. Evidence from expression-profiling datasets suggested that PCI interactions are predictive of gene expression in 12 independent tumor and non-tumor contexts. Biochemical assays confirmed ceRNA couplings for two PCI subnetworks, including oncogenes CCND1, HIF1A and HMGA2, and tumor suppressors PTEN, RB1 and TP53. Our results suggest that PCI is enriched for interactions that are coupled by many miRNA species and are more likely to be context independent.
Subjects
Gene Expression Regulation, Neoplastic , MicroRNAs/metabolism , Neoplasms/genetics , RNA, Neoplasm/metabolism , Humans , Neoplasms/metabolism
ABSTRACT
We introduce a method for simultaneous prediction of microRNA-target interactions and their mediated competitive endogenous RNA (ceRNA) interactions. Using high-throughput validation assays in breast cancer cell lines, we show that our integrative approach significantly improves on microRNA-target prediction accuracy as assessed by both mRNA and protein level measurements. Our biochemical assays support nearly 500 microRNA-target interactions with evidence for regulation in breast cancer tumors. Moreover, these assays constitute the most extensive validation platform for computationally inferred networks of microRNA-target interactions in breast cancer tumors, providing a useful benchmark to ascertain future improvements.
Subjects
Computational Biology/methods , Epistasis, Genetic , Gene Regulatory Networks , MicroRNAs/genetics , RNA Interference , RNA, Messenger/genetics , 3' Untranslated Regions , Algorithms , Binding Sites , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Cell Line, Tumor , Cluster Analysis , Estrogen Receptor alpha/genetics , Estrogen Receptor alpha/metabolism , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , MicroRNAs/chemistry , RNA, Messenger/chemistry
ABSTRACT
BACKGROUND: MicroRNAs (miRNAs) play multiple roles in tumor biology. Interestingly, reports from multiple groups suggest that miRNA targets may be coupled through competitive stoichiometric sequestration. Specifically, computational models predicted and experimental assays confirmed that miRNA activity is dependent on miRNA target abundance, and consequently, changes in the abundance of some miRNA targets lead to changes to the regulation and abundance of their other targets. The resulting indirect regulatory influence between miRNA targets resembles competition and has been dubbed competitive endogenous RNA (ceRNA). Recent studies have questioned the physiological relevance of ceRNA interactions, our ability to accurately predict these interactions, and the number of genes that are impacted by ceRNA interactions in specific cellular contexts. RESULTS: To address these concerns, we reverse engineered ceRNA networks (ceRNETs) in breast and prostate adenocarcinomas using context-specific TCGA profiles, and tested whether ceRNA interactions can predict the effects of RNAi-mediated gene silencing perturbations in PC3 and MCF7 cells. Our results, based on tests of thousands of inferred ceRNA interactions that are predicted to alter hundreds of cancer genes in each of the two tumor contexts, confirmed statistically significant effects for half of the predicted targets. CONCLUSIONS: Our results suggest that the expression of a significant fraction of cancer genes may be regulated by ceRNA interactions in each of the two tumor contexts.
Subjects
Gene Regulatory Networks , High-Throughput Nucleotide Sequencing , Sequence Analysis, RNA , Databases, Genetic , Humans , MCF-7 Cells , MicroRNAs/genetics
ABSTRACT
BACKGROUND: Catheter-associated urinary tract infections (CAUTIs) increase clinical burdens. Identifying high-risk patients is crucial. We aimed to develop and externally validate an explainable, prognostic prediction model of CAUTIs among hospitalized individuals receiving urinary catheterization. METHODS: A retrospective cohort paradigm was applied for model development and validation using data from two hospitals; data from a third hospital were used for external validation. Machine learning algorithms were applied for predictive modeling. We evaluated calibration, clinical utility, and discrimination ability on the validation set to choose the best model. The best model was then assessed for explainability. RESULTS: We included 122,417 instances from 20-to-75-year-old subjects. Fourteen predictors were selected from 20 candidates. The best model was the random forest (RF) for prediction within 6 days. It detected 97.63% (95% confidence interval [CI]: ±0.06%) of CAUTI-positive cases, and 97.36% (95% CI: ±0.07%) of individuals who were predicted to be CAUTI negative were true negatives. Among those predicted to be CAUTI positive, we expected 22.85% (95% CI: ±0.07%) to truly be high-risk individuals. We provide a web-based application and a paper-based nomogram for using this model. CONCLUSIONS: Our prediction model accurately detected most CAUTI-positive cases, while most predicted negative individuals were correctly ruled out.
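The three percentages reported above correspond to sensitivity, negative predictive value (NPV), and positive predictive value (PPV). The sketch below shows how these quantities follow from a confusion matrix; the counts are hypothetical and are not the study's data, and the function name is an assumption.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return sensitivity, NPV, and PPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives that are detected
        "npv": tn / (tn + fn),          # share of predicted negatives that are truly negative
        "ppv": tp / (tp + fp),          # share of predicted positives that are truly positive
    }


if __name__ == "__main__":
    # Hypothetical counts for a rare outcome (100 positives among 10,000 instances).
    metrics = screening_metrics(tp=90, fp=300, tn=9600, fn=10)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

With a rare outcome, high sensitivity and NPV can coexist with a modest PPV, because even a small false-positive rate yields many false positives relative to the few true positives.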
ABSTRACT
BACKGROUND: RNA profiling technologies at single-cell resolutions, including single-cell and single-nuclei RNA sequencing (scRNA-seq and snRNA-seq, scnRNA-seq for short), can help characterize the composition of tissues and reveal cells that influence key functions in both healthy and disease tissues. However, the use of these technologies is operationally challenging because of high costs and stringent sample-collection requirements. Computational deconvolution methods that infer the composition of bulk-profiled samples using scnRNA-seq-characterized cell types can broaden scnRNA-seq applications, but their effectiveness remains controversial. RESULTS: We produced the first systematic evaluation of deconvolution methods on datasets with either known or scnRNA-seq-estimated compositions. Our analyses revealed biases that are common to scnRNA-seq 10X Genomics assays and illustrated the importance of accurate and properly controlled data preprocessing and method selection and optimization. Moreover, our results suggested that concurrent RNA-seq and scnRNA-seq profiles can help improve the accuracy of both scnRNA-seq preprocessing and the deconvolution methods that employ them. Indeed, our proposed method, Single-cell RNA Quantity Informed Deconvolution (SQUID), which combines RNA-seq transformation and dampened weighted least-squares deconvolution approaches, consistently outperformed other methods in predicting the composition of cell mixtures and tissue samples. CONCLUSIONS: We showed that analysis of concurrent RNA-seq and scnRNA-seq profiles with SQUID can produce accurate cell-type abundance estimates and that this accuracy improvement was necessary for identifying outcomes-predictive cancer cell subclones in pediatric acute myeloid leukemia and neuroblastoma datasets. These results suggest that deconvolution accuracy improvements are vital to enabling its applications in the life sciences.
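To make the idea of deconvolution concrete, the sketch below estimates the proportions of two cell types in a bulk profile by ordinary least squares. It is deliberately simpler than the method named in the abstract: it applies no RNA-seq transformation and no dampened weighting, it handles only two cell types, and the signature values and function name are hypothetical.

```python
def deconvolve_two_types(sig_a, sig_b, bulk):
    """Least-squares fit of bulk ~ p_a * sig_a + p_b * sig_b, returned as fractions."""
    # Entries of the 2x2 normal equations (S^T S) p = S^T y.
    saa = sum(a * a for a in sig_a)
    sbb = sum(b * b for b in sig_b)
    sab = sum(a * b for a, b in zip(sig_a, sig_b))
    ya = sum(a * y for a, y in zip(sig_a, bulk))
    yb = sum(b * y for b, y in zip(sig_b, bulk))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("signatures are collinear; proportions are not identifiable")
    # Solve by Cramer's rule, then clip negatives and normalize to sum to 1.
    p_a = max((ya * sbb - yb * sab) / det, 0.0)
    p_b = max((yb * saa - ya * sab) / det, 0.0)
    total = p_a + p_b
    if total == 0:
        raise ValueError("both estimated proportions are zero")
    return p_a / total, p_b / total


if __name__ == "__main__":
    # Hypothetical per-gene signature expression for two cell types.
    type_a = [10.0, 0.0, 5.0]
    type_b = [0.0, 8.0, 5.0]
    # Bulk sample simulated as a 30% / 70% mixture.
    mixture = [0.3 * a + 0.7 * b for a, b in zip(type_a, type_b)]
    print(deconvolve_two_types(type_a, type_b, mixture))
```

Clipping and renormalizing is a crude stand-in for a proper non-negativity constraint; it is used here only to keep the example self-contained.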
Subjects
Gene Expression Profiling , Transcriptome , Child , Humans , RNA-Seq , Gene Expression Profiling/methods , RNA, Small Interfering , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
ABSTRACT
Most of the transcribed human genome codes for noncoding RNAs (ncRNAs), and long noncoding RNAs (lncRNAs) make for the lion's share of the human ncRNA space. Despite growing interest in lncRNAs, because there are so many of them, and because of their tissue specialization and, often, lower abundance, their catalog remains incomplete and there are multiple ongoing efforts to improve it. Consequently, the number of human lncRNA genes may be lower than 10,000 or higher than 200,000. A key open challenge for lncRNA research, now that so many lncRNA species have been identified, is the characterization of lncRNA function and the interpretation of the roles of genetic and epigenetic alterations at their loci. After all, the most important human genes to catalog and study are those that contribute to important cellular functions, that affect development or cell differentiation, and whose dysregulation may play a role in the genesis and progression of human diseases. Multiple efforts have used screens based on RNA-mediated interference (RNAi), antisense oligonucleotides (ASOs), and CRISPR to identify the consequences of lncRNA dysregulation and predict lncRNA function in select contexts, but these approaches have unresolved scalability and accuracy challenges. Instead, as was the case for better-studied ncRNAs in the past, researchers often focus on characterizing lncRNA interactions and investigating their effects on genes and pathways with known functions. Here, we focus most of our review on computational methods to identify lncRNA interactions and to predict the effects of their alterations and dysregulation on human disease pathways.
Subjects
RNA, Long Noncoding/genetics , Genome, Human , Humans , RNA, Untranslated
ABSTRACT
PURPOSE: To show that intrinsic radiosensitivity varies greatly for protons and carbon (C) ions in addition to photons, and that DNA repair capacity remains important in governing this variability. METHODS: We measured or obtained from the literature clonogenic survival data for a number of human cancer cell lines exposed to photons, protons (9.9 keV/µm), and C-ions (13.3-77.1 keV/µm). We characterized their intrinsic radiosensitivity by the dose for 10% or 50% survival (D10% or D50%), and quantified the variability at each radiation quality by the coefficient of variation (COV) in D10% and D50%. We also treated cells with DNA repair inhibitors prior to irradiation to assess how DNA repair capacity affects their variability. RESULTS: We found no statistically significant differences in the COVs of D10% or D50% between any of the radiation qualities investigated. The same was true regardless of whether the cells were treated with DNA repair inhibitors, or whether they were stratified into histologic subsets. Even within histologic subsets, we found remarkable differences in radiosensitivity for high LET C-ions that were often greater than the variations in RBE, with brain cancer cells varying in D10% (D50%) up to 100% (131%) for 77.1 keV/µm C-ions, and non-small cell lung cancer and pancreatic cancer cell lines varying up to 55% (76%) and 51% (78%), respectively, for 60.5 keV/µm C-ions. The cell lines with modulated DNA repair capacity had greater variability in intrinsic radiosensitivity across all radiation qualities. CONCLUSIONS: Even for cell lines of the same histologic type, there are remarkable variations in intrinsic radiosensitivity, and these variations do not differ significantly between photon, proton or C-ion radiation. The importance of DNA repair capacity in governing the variability in intrinsic radiosensitivity is not significantly diminished for higher LET radiation.
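The variability measure used above, the coefficient of variation (COV), is the standard deviation divided by the mean. The sketch below computes it for a set of D10% values. The dose values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; COV is undefined")
    return statistics.stdev(values) / mean


if __name__ == "__main__":
    # Hypothetical D10% doses (Gy) for several cell lines at one radiation quality.
    d10_photons = [4.1, 5.3, 6.8, 3.9, 7.2]
    print(f"COV of D10%: {coefficient_of_variation(d10_photons):.3f}")
```

Because the COV is dimensionless, it allows variability to be compared across radiation qualities whose absolute doses differ.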
Subjects
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Carbon , Cell Line , Cell Survival , Humans , Protons , Radiation Tolerance , Relative Biological Effectiveness
ABSTRACT
Existing compendia of non-coding RNA (ncRNA) are incomplete, in part because they are derived almost exclusively from small and polyadenylated RNAs. Here we present a more comprehensive atlas of the human transcriptome, which includes small and polyA RNA as well as total RNA from 300 human tissues and cell lines. We report thousands of previously uncharacterized RNAs, increasing the number of documented ncRNAs by approximately 8%. To infer functional regulation by known and newly characterized ncRNAs, we exploited pre-mRNA abundance estimates from total RNA sequencing, revealing 316 microRNAs and 3,310 long non-coding RNAs with multiple lines of evidence for roles in regulating protein-coding genes and pathways. Our study both refines and expands the current catalog of human ncRNAs and their regulatory interactions. All data, analyses and results are available for download and interrogation in the R2 web portal, serving as a basis for future exploration of RNA biology and function.
Subjects
MicroRNAs , RNA, Long Noncoding , Humans , MicroRNAs/genetics , RNA, Long Noncoding/genetics , RNA, Messenger , RNA, Untranslated/genetics , Transcriptome/genetics
ABSTRACT
Long intergenic non-coding RNAs (lincRNAs) are emerging as integral components of signaling pathways in various cancer types. In neuroblastoma, only a handful of lincRNAs are known as upstream regulators or downstream effectors of oncogenes. Here, we exploit RNA sequencing data of primary neuroblastoma tumors, neuroblast precursor cells, neuroblastoma cell lines and various cellular perturbation model systems to define the neuroblastoma lincRNome and map lincRNAs up- and downstream of neuroblastoma driver genes MYCN, ALK and PHOX2B. Each of these driver genes controls the expression of a particular subset of lincRNAs, several of which are associated with poor survival and are differentially expressed in neuroblastoma tumors compared to neuroblasts. By integrating RNA sequencing data from both primary tumor tissue and cancer cell lines, we demonstrate that several of these lincRNAs are expressed in stromal cells. Deconvolution of primary tumor gene expression data revealed a strong association between stromal cell composition and driver gene status, resulting in differential expression of these lincRNAs. We also explored lincRNAs that putatively act upstream of neuroblastoma driver genes, either as presumed modulators of driver gene activity, or as modulators of effectors regulating driver gene expression. This analysis revealed strong associations between the neuroblastoma lincRNAs MIAT and MEG3 and MYCN and PHOX2B activity or expression. Together, our results provide a comprehensive catalogue of the neuroblastoma lincRNome, highlighting lincRNAs up- and downstream of key neuroblastoma driver genes. This catalogue forms a solid basis for further functional validation of candidate neuroblastoma lincRNAs.
Subjects
Neuroblastoma/genetics , RNA, Long Noncoding/genetics , Cell Line, Tumor , Gene Drive Technology/methods , Gene Expression Profiling/methods , Humans , Neural Stem Cells/physiology , Sequence Analysis, RNA/methods , Signal Transduction/genetics , Transcription Factors/genetics
ABSTRACT
A correction to this article has been published and is linked from the HTML and PDF versions of this paper. The error has been fixed in the paper.
ABSTRACT
Prediction of protein subcellular localization (PSL) is important for genome annotation, protein function prediction, and drug discovery. Many computational approaches for PSL prediction based on protein sequences have been proposed in recent years for Gram-negative bacteria. We present PSLDoc, a method based on gapped-dipeptides and probabilistic latent semantic analysis (PLSA) to solve this problem. A protein is considered as a term string composed of gapped-dipeptides, which are defined as any two residues separated by one or more positions. The weighting scheme of gapped-dipeptides is calculated according to a position specific score matrix, which includes sequence evolutionary information. Then, PLSA is applied for feature reduction, and reduced vectors are input to five one-versus-rest support vector machine classifiers. The localization site with the highest probability is assigned as the final prediction. It has been reported that there is a strong correlation between sequence homology and subcellular localization (Nair and Rost, Protein Sci 2002;11:2836-2847; Yu et al., Proteins 2006;64:643-651). To properly evaluate the performance of PSLDoc, a target protein can be classified into low- or high-homology data sets. PSLDoc's overall accuracy of low- and high-homology data sets reaches 86.84% and 98.21%, respectively, and it compares favorably with that of CELLO II (Yu et al., Proteins 2006;64:643-651). In addition, we set a confidence threshold to achieve a high precision at specified levels of recall rates. When the confidence threshold is set at 0.7, PSLDoc achieves 97.89% in precision which is considerably better than that of PSORTb v.2.0 (Gardy et al., Bioinformatics 2005;21:617-623). Our approach demonstrates that the specific feature representation for proteins can be successfully applied to the prediction of protein subcellular localization and improves prediction accuracy. Moreover, because of the generality of the representation, our method can be extended to eukaryotic proteomes in the future. The web server of PSLDoc is publicly available at http://bio-cluster.iis.sinica.edu.tw/~bioapp/PSLDoc/.
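The gapped-dipeptide representation described above can be sketched as a simple counting step. This illustration uses raw occurrence counts only; it does not reproduce the position specific score matrix weighting or the PLSA reduction, and the function name, the `max_gap` parameter, and the toy sequence are assumptions. Here a gap of g means that g residues lie between the two residues of the pair.

```python
from collections import Counter


def gapped_dipeptide_counts(sequence: str, max_gap: int = 3) -> Counter:
    """Count (first residue, gap, second residue) terms for gaps 1..max_gap."""
    seq = sequence.upper()
    counts = Counter()
    for gap in range(1, max_gap + 1):
        # The pair is seq[i] and seq[i + gap + 1], with `gap` residues between them.
        for i in range(len(seq) - gap - 1):
            counts[(seq[i], gap, seq[i + gap + 1])] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical toy peptide, not a real protein.
    for term, n in sorted(gapped_dipeptide_counts("MKVLAAGIV", max_gap=2).items()):
        print(term, n)
```

Treating each (residue, gap, residue) triple as a term is what lets a protein be handled like a document of terms, which is the form that latent semantic methods such as PLSA operate on.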
Subjects
Dipeptides/metabolism , Proteins/metabolism , Subcellular Fractions/metabolism , Probability , Proteins/chemistry
ABSTRACT
Long noncoding RNAs (lncRNAs) are commonly dysregulated in tumors, but only a handful are known to play pathophysiological roles in cancer. We inferred lncRNAs that dysregulate cancer pathways, oncogenes, and tumor suppressors (cancer genes) by modeling their effects on the activity of transcription factors, RNA-binding proteins, and microRNAs in 5,185 TCGA tumors and 1,019 ENCODE assays. Our predictions included hundreds of candidate onco- and tumor-suppressor lncRNAs (cancer lncRNAs) whose somatic alterations account for the dysregulation of dozens of cancer genes and pathways in each of 14 tumor contexts. To demonstrate proof of concept, we showed that perturbations targeting OIP5-AS1 (an inferred tumor suppressor) and TUG1 and WT1-AS (inferred onco-lncRNAs) dysregulated cancer genes and altered proliferation of breast and gynecologic cancer cells. Our analysis indicates that, although most lncRNAs are dysregulated in a tumor-specific manner, some, including OIP5-AS1, TUG1, NEAT1, MEG3, and TSIX, synergistically dysregulate cancer pathways in multiple tumor contexts.
Subjects
Gene Expression Regulation, Neoplastic , Neoplasms/genetics , RNA, Long Noncoding/genetics , Cell Line , Cell Line, Tumor , Gene Regulatory Networks , Genes, Tumor Suppressor , Humans , Oncogenes
ABSTRACT
We analyzed molecular data on 2,579 tumors from The Cancer Genome Atlas (TCGA) of four gynecological types plus breast. Our aims were to identify shared and unique molecular features, clinically significant subtypes, and potential therapeutic targets. We found 61 somatic copy-number alterations (SCNAs) and 46 significantly mutated genes (SMGs). Eleven SCNAs and 11 SMGs had not been identified in previous TCGA studies of the individual tumor types. We found functionally significant estrogen receptor-regulated long non-coding RNAs (lncRNAs) and gene/lncRNA interaction networks. Pathway analysis identified subtypes with high leukocyte infiltration, raising potential implications for immunotherapy. Using 16 key molecular features, we identified five prognostic subtypes and developed a decision tree that classified patients into the subtypes based on just six features that are assessable in clinical laboratories.
Subjects
Breast Neoplasms/genetics , DNA Copy Number Variations , Gene Regulatory Networks , Genital Neoplasms, Female/genetics , Mutation , Databases, Genetic , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Genetic Predisposition to Disease , Humans , Organ Specificity , Prognosis , RNA, Long Noncoding/genetics , Receptors, Estrogen/genetics
ABSTRACT
This integrated, multiplatform PanCancer Atlas study co-mapped and identified distinguishing molecular features of squamous cell carcinomas (SCCs) from five sites associated with smoking and/or human papillomavirus (HPV). SCCs harbor 3q, 5p, and other recurrent chromosomal copy-number alterations (CNAs), DNA mutations, and/or aberrant methylation of genes and microRNAs, which are correlated with the expression of multi-gene programs linked to squamous cell stemness, epithelial-to-mesenchymal differentiation, growth, genomic integrity, oxidative damage, death, and inflammation. Low-CNA SCCs tended to be HPV(+) and display hypermethylation with repression of TET1 demethylase and FANCF, previously linked to predisposition to SCC, or harbor mutations affecting CASP8, RAS-MAPK pathways, chromatin modifiers, and immunoregulatory molecules. We uncovered hypomethylation of the alternative promoter that drives expression of the ΔNp63 oncogene and embedded miR944. Co-expression of immune checkpoint, T-regulatory, and Myeloid suppressor cells signatures may explain reduced efficacy of immune therapy. These findings support possibilities for molecular classification and therapeutic approaches.
Subjects
Carcinoma, Squamous Cell/classification , Gene Expression Regulation, Neoplastic , Metabolic Networks and Pathways , Carcinoma, Squamous Cell/genetics , Carcinoma, Squamous Cell/immunology , Carcinoma, Squamous Cell/metabolism , Cell Line, Tumor , DNA Methylation , Epithelial-Mesenchymal Transition , Genomics/methods , Humans , Polymorphism, Genetic
ABSTRACT
BACKGROUND: Protein subcellular localization is crucial for genome annotation, protein function prediction, and drug discovery. Determination of subcellular localization using experimental approaches is time-consuming; thus, computational approaches become highly desirable. Extensive studies of localization prediction have led to the development of several methods including composition-based and homology-based methods. However, their performance might be significantly degraded if homologous sequences are not detected. Moreover, methods that integrate various features could suffer from the problem of low coverage in high-throughput proteomic analyses due to the lack of information to characterize unknown proteins. RESULTS: We propose a hybrid prediction method for Gram-negative bacteria that combines a one-versus-one support vector machine (SVM) model and a structural homology approach. The SVM model comprises a number of binary classifiers, in which biological features derived from Gram-negative bacteria translocation pathways are incorporated. In the structural homology approach, we employ secondary structure alignment for structural similarity comparison and assign the known localization of the top-ranked protein as the predicted localization of a query protein. The hybrid method achieves overall accuracy of 93.7% and 93.2% using ten-fold cross-validation on the benchmark data sets. In the assessment of the evaluation data sets, our method also attains a prediction accuracy of 84.0%, especially when testing on sequences with a low level of homology to the training data. A three-way data split procedure is also incorporated to prevent overestimation of the predictive performance. In addition, we show that the prediction accuracy should be approximately 85% for non-redundant data sets of sequence identity less than 30%. CONCLUSION: Our results demonstrate that biological features derived from Gram-negative bacteria translocation pathways yield a significant improvement. The biological features are interpretable and can be applied in advanced analyses and experimental designs. Moreover, combining the SVM model with the structural homology approach further improves the overall accuracy, which suggests that structural conservation could be a useful indicator for inferring localization in addition to sequence homology. The proposed method can be used in large-scale analyses of proteomes.
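A one-versus-one scheme trains one binary classifier per pair of classes and lets the pairwise winners vote. The sketch below shows only the voting step. The pairwise decision function is a hypothetical stand-in for trained SVMs, and the class names, toy scores, and function names are assumptions rather than details from the study.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(classes, pairwise_decide, x):
    """Return the class that wins the most pairwise contests for input x."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, x)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical stand-in for trained binary classifiers: each class has a
# one-dimensional "center", and the closer center wins the pairwise contest.
CENTERS = {"cytoplasm": 0.0, "periplasm": 1.0, "outer membrane": 2.0}


def toy_decide(a, b, x):
    return a if abs(x - CENTERS[a]) <= abs(x - CENTERS[b]) else b


if __name__ == "__main__":
    print(one_vs_one_predict(list(CENTERS), toy_decide, 1.1))
```

For k classes this requires k(k - 1)/2 binary classifiers, for example 10 classifiers for 5 localization sites.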