Results 1 - 20 of 1,064
1.
Biomed Res Int ; 2021: 8171236, 2021.
Article in English | MEDLINE | ID: mdl-34812409

ABSTRACT

OBJECTIVE: This study set out to explore differentially expressed miRs in PD using GEO data and to provide diagnostic indicators for clinical practice. METHODS: In this study, differentially expressed miRs were screened through the Gene Expression Omnibus (GEO) database; 68 PD patients treated in our hospital from May 2017 to March 2018 were collected as the research group (RG), and 50 normal subjects who underwent physical examination in our hospital during the same period were collected as the control group (CG). Quantitative real-time polymerase chain reaction (qRT-PCR) was used to detect the expression and diagnostic value of miR-374a-5p in the serum of patients. The potential target genes of miR-374a-5p were predicted, and Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses were carried out. RESULTS: GEO2R analysis revealed that 193 miRs were differentially expressed, of which 78 were highly expressed and 115 were poorly expressed. The miR-374a-5p expression in the serum of the RG was markedly reduced and had diagnostic value. The Targetscan and miRDB online websites were used to predict its target genes, yielding 415 common target genes. miR-374a-5p may participate in 27 functional pathways and 8 signal pathways. CONCLUSION: miR-374a-5p has low expression in PD and is expected to be a potential diagnostic indicator.


Subject(s)
MicroRNAs/genetics , Parkinson Disease/genetics , Case-Control Studies , Computational Biology , Databases, Nucleic Acid , Gene Ontology , Genetic Markers , Humans , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Parkinson Disease/diagnosis , Signal Transduction/genetics
2.
Comput Math Methods Med ; 2021: 7471516, 2021.
Article in English | MEDLINE | ID: mdl-34394707

ABSTRACT

High-throughput data make it possible to study expression levels of thousands of genes simultaneously under a particular condition. However, only a few of the genes are discriminatively expressed. Identifying these biomarkers precisely is important for disease diagnosis, prognosis, and therapy. Many studies have used pathway information to identify the biomarkers. However, most of these studies incorporate only the group information, while the pathway structural information is ignored. In this paper, we propose a Bayesian gene selection method with network-constrained regularization, which can incorporate the pathway structural information as priors to perform gene selection. All the priors are conjugate; thus, the parameters can be estimated effectively through Gibbs sampling. We present the application of our method to 6 microarray datasets, comparing it with Bayesian Lasso, Bayesian Elastic Net, and Bayesian Fused Lasso. The results show that our method performs better than the other Bayesian methods and that pathway structural information can improve the result.


Subject(s)
Bayes Theorem , Gene Regulatory Networks , Genetic Markers , Biomarkers, Tumor/genetics , Computational Biology , Computer Simulation , Databases, Genetic/statistics & numerical data , Female , Gene Expression Profiling , Genetic Predisposition to Disease , Humans , Male , Models, Genetic , Neoplasms/genetics , Oligonucleotide Array Sequence Analysis/statistics & numerical data
3.
Comput Math Methods Med ; 2021: 5584684, 2021.
Article in English | MEDLINE | ID: mdl-34122617

ABSTRACT

In view of the challenges of the group Lasso penalty methods for multicancer microarray data analysis, e.g., dividing genes into groups in advance and biological interpretability, we propose a robust adaptive multinomial regression with sparse group Lasso penalty (RAMRSGL) model. By adopting the overlapping clustering strategy, affinity propagation clustering is employed to obtain each cancer gene subtype, which explores the group structure of each cancer subtype and merges the groups of all subtypes. In addition, the data-driven weights based on noise are added to the sparse group Lasso penalty, combining with the multinomial log-likelihood function to perform multiclassification and adaptive group gene selection simultaneously. The experimental results on acute leukemia data verify the effectiveness of the proposed method.


Subject(s)
Algorithms , Neoplasms/classification , Neoplasms/genetics , Cluster Analysis , Computational Biology , Databases, Genetic/statistics & numerical data , Humans , Leukemia/classification , Leukemia/genetics , Likelihood Functions , Models, Genetic , Multigene Family , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Oncogenes , Regression Analysis
4.
Comput Math Methods Med ; 2021: 5556992, 2021.
Article in English | MEDLINE | ID: mdl-33986823

ABSTRACT

Ensemble learning combines multiple learners to perform combinatorial learning, which has advantages of good flexibility and higher generalization performance. To achieve higher quality cancer classification, in this study, the fast correlation-based feature selection (FCBF) method was used to preprocess the data to eliminate irrelevant and redundant features. Then, the classification was carried out in the stacking ensemble learner. A library for support vector machine (LIBSVM), K-nearest neighbor (KNN), decision tree C4.5 (C4.5), and random forest (RF) were used as the primary learners of the stacking ensemble. Given the imbalanced characteristics of cancer gene expression data, the embedding cost-sensitive naive Bayes was used as the metalearner of the stacking ensemble, which was represented as CSNB stacking. The proposed CSNB stacking method was applied to nine cancer datasets to further verify the classification performance of the model. Compared with other classification methods, such as single classifier algorithms and ensemble algorithms, the experimental results showed the effectiveness and robustness of the proposed method in processing different types of cancer data. This method may therefore help guide cancer diagnosis and research.


Subject(s)
Algorithms , Machine Learning , Neoplasms/classification , Bayes Theorem , Computational Biology , Databases, Genetic/statistics & numerical data , Decision Trees , Female , Gene Expression Regulation, Neoplastic , Humans , Male , Neoplasms/genetics , Neural Networks, Computer , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Oncogenes , ROC Curve , Support Vector Machine
5.
Nucleic Acids Res ; 49(D1): D1502-D1506, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33211879

ABSTRACT

ArrayExpress (https://www.ebi.ac.uk/arrayexpress) is an archive of functional genomics data at EMBL-EBI, established in 2002, initially as an archive for publication-related microarray data and was later extended to accept sequencing-based data. Over the last decade an increasing share of biological experiments involve multiple technologies assaying different biological modalities, such as epigenetics, and RNA and protein expression, and thus the BioStudies database (https://www.ebi.ac.uk/biostudies) was established to deal with such multimodal data. Its central concept is a study, which typically is associated with a publication. BioStudies stores metadata describing the study, provides links to the relevant databases, such as European Nucleotide Archive (ENA), as well as hosts the types of data for which specialized databases do not exist. With BioStudies now fully functional, we are able to further harmonize the archival data infrastructure at EMBL-EBI, and ArrayExpress is being migrated to BioStudies. In future, all functional genomics data will be archived at BioStudies. The process will be seamless for the users, who will continue to submit data using the online tool Annotare and will be able to query and download data largely in the same manner as before. Nevertheless, some technical aspects, particularly programmatic access, will change. This update guides the users through these changes.


Subject(s)
Databases, Genetic , Epigenesis, Genetic , Genomics/methods , High-Throughput Nucleotide Sequencing/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Animals , Cell Line , DNA Methylation , Gene Expression Profiling , Humans , Internet , Metadata , Organ Specificity , Plants/genetics , Single-Cell Analysis , Software
6.
Clin Chem ; 66(7): 934-945, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32613237

ABSTRACT

BACKGROUND: We translated a multigene expression index to predict sensitivity to endocrine therapy for Stage II-III breast cancer (SET2,3) to hybridization-based expression assays of formalin-fixed paraffin-embedded (FFPE) tissue sections. Here we report the technical validity with FFPE samples, including preanalytical and analytical performance. METHODS: We calibrated SET2,3 from microarrays (Affymetrix U133A) of frozen samples to hybridization-based assays of FFPE tissue, using bead-based QuantiGene Plex (QGP) and slide-based NanoString (NS). The following preanalytical and analytical conditions were tested in controlled studies: replicates within and between frozen and fixed samples, age of paraffin blocks, homogenization of fixed sections versus extracted RNA, core biopsy versus surgically resected tumor, technical replicates, precision over 20 weeks, limiting dilution, linear range, and analytical sensitivity. Lin's concordance correlation coefficient (CCC) was used to measure concordance between measurements. RESULTS: SET2,3 index was calibrated to use with QGP (CCC 0.94) and NS (CCC 0.93) technical platforms, and was validated in two cohorts of older fixed samples using QGP (CCC 0.72, 0.85) and NS (CCC 0.78, 0.78). QGP assay was concordant using direct homogenization of fixed sections versus purified RNA (CCC 0.97) and between core and surgical sample types (CCC 0.90), with 100% accuracy in technical replicates, 1-9% coefficient of variation over 20 weekly tests, linear range 3.0-11.5 (log2 counts), and analytical sensitivity ≥2.0 (log2 counts). CONCLUSIONS: Measurement of the novel SET2,3 assay was technically valid from fixed tumor sections of biopsy or resection samples using simple, inexpensive, hybridization methods, without the need for RNA purification.


Subject(s)
Breast Neoplasms/genetics , Gene Expression Profiling/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , RNA, Messenger/analysis , Aurora Kinase A/genetics , Breast Neoplasms/drug therapy , Breast Neoplasms/pathology , Cohort Studies , Estrogen Receptor alpha/genetics , Estrogens/therapeutic use , Humans , Paraffin Embedding , Receptor, ErbB-2/genetics , Receptors, Progesterone/genetics , Reproducibility of Results , Tissue Fixation
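
Lin's concordance correlation coefficient (CCC), which the study above uses to compare measurements between platforms, can be illustrated with a minimal sketch. It assumes the population (1/n) form of the variances and covariance; the function name and the sample values are hypothetical and are not taken from the study.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    # Population (1/n) moments; an assumption of this sketch.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Unlike Pearson's r, the denominator penalizes a shift between the means.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical index values from two platforms (illustrative numbers only).
frozen = [1.0, 2.0, 3.0, 4.0]
fixed = [2.0, 3.0, 4.0, 5.0]
identical_ccc = concordance_correlation(frozen, frozen)
shifted_ccc = concordance_correlation(frozen, fixed)
```

In this toy example the two series are perfectly correlated, yet the constant offset between them pulls the CCC below 1, which is why the coefficient suits agreement studies better than plain correlation.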
7.
BMC Cancer ; 20(1): 490, 2020 Jun 02.
Article in English | MEDLINE | ID: mdl-32487193

ABSTRACT

BACKGROUND: Stomach cancer (SC) is a type of cancer that is derived from the stomach mucous membrane. As there are non-specific symptoms or no noticeable symptoms at the early stage, newly diagnosed SC cases have usually reached an advanced stage and are thus difficult to cure. Therefore, in this study, we aimed to develop an integrated database of SC. METHODS: SC-related genes were identified through literature mining and by analyzing the publicly available microarray datasets. Using the RNA-seq, miRNA-seq and clinical data downloaded from The Cancer Genome Atlas (TCGA), the Kaplan-Meier (KM) survival curves for all the SC-related genes were generated and analyzed. The miRNAs (miRanda, miRTarget2, PicTar, PITA and TargetScan databases), SC-related miRNAs (HMDD and miR2Disease databases), single nucleotide polymorphisms (SNPs, dbSNP database), and SC-related SNPs (ClinVar database) were also retrieved from the indicated databases. Moreover, gene-disease (OMIM and GAD databases), copy number variation (CNV, DGV database), methylation (PubMeth database), drug (WebGestalt database), and transcription factor (TF, TRANSFAC database) analyses were performed for the differentially expressed genes (DEGs). RESULTS: In total, 9990 SC-related genes (including 8347 up-regulated genes and 1643 down-regulated genes) were identified, among which 65 genes were further confirmed as SC-related genes by performing enrichment analysis. In addition, 457 miRNAs, 20 SC-related miRNAs, 1570 SNPs, 108 SC-related SNPs, 419 TFs, 44,605 CNVs, 3404 drug-associated genes, 63 genes with methylation, and KM survival curves of 20,264 genes were obtained. By integrating these datasets, an integrated database of stomach cancer, designated as SCDb (available at http://www.stomachcancerdb.org/), was established. CONCLUSIONS: As a comprehensive resource for human SC, the SCDb database will be very useful for performing SC-related research in future, and will thus promote the understanding of the pathogenesis of SC.


Subject(s)
Computational Biology/methods , Databases, Genetic/statistics & numerical data , Datasets as Topic , Gene Expression Regulation, Neoplastic , Stomach Neoplasms/genetics , Computational Biology/statistics & numerical data , Gene Regulatory Networks , Humans , Kaplan-Meier Estimate , MicroRNAs/metabolism , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Polymorphism, Single Nucleotide , RNA-Seq/statistics & numerical data , Stomach Neoplasms/mortality , Stomach Neoplasms/pathology
8.
J Bioinform Comput Biol ; 18(1): 2050002, 2020 02.
Article in English | MEDLINE | ID: mdl-32336254

ABSTRACT

Gene set analysis aims to identify differentially expressed or co-expressed genes within a biological pathway between two experimental conditions, so that it can eventually reveal biological processes and pathways involved in disease development. In the last few decades, various statistical and computational methods have been proposed to improve the statistical power of gene set analysis. In recent years, much attention has been paid to differentially co-expressed genes, since they can be disease-related genes even without a significant difference in average expression levels between two conditions. In this paper, we propose a new statistical method to identify differentially co-expressed genes from microarray gene expression data. The proposed method first estimates co-expression levels of paired genes using covariance regularization by thresholding, and then evaluates the significance of the difference in covariance estimates between two conditions. Through extensive simulation studies, we demonstrated that the proposed method is more powerful than the existing mainstream methods for detecting co-expressed genes. We also applied it to various microarray gene expression datasets related to mutant p53 transcriptional activity, and to epithelium and stroma in breast cancer.


Subject(s)
Breast Neoplasms/genetics , Computational Biology/methods , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Breast Neoplasms/pathology , Computer Simulation , Female , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation, Neoplastic , Humans , Mutation , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Tumor Suppressor Protein p53/genetics
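
The first step described in the abstract above, covariance regularization by thresholding, can be sketched as follows. This is only an illustration under stated assumptions: it uses simple hard thresholding of the sample covariance matrix with a hand-picked threshold, and it stops at the raw difference between the two conditions. The abstract does not specify the threshold rule or the significance test, so neither is reproduced here; all function names and data are hypothetical.

```python
def sample_covariance(rows):
    """Sample covariance matrix; rows are samples, columns are genes."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def hard_threshold(cov, t):
    """Keep the diagonal; zero off-diagonal entries with absolute value below t."""
    p = len(cov)
    return [[cov[i][j] if i == j or abs(cov[i][j]) >= t else 0.0
             for j in range(p)] for i in range(p)]


def covariance_difference(rows_a, rows_b, t):
    """Entry-wise difference of the thresholded covariances of two conditions."""
    a = hard_threshold(sample_covariance(rows_a), t)
    b = hard_threshold(sample_covariance(rows_b), t)
    p = len(a)
    return [[a[i][j] - b[i][j] for j in range(p)] for i in range(p)]


# Toy data: genes 0 and 1 move together in condition A but not in condition B.
rows_a = [[1.0, 1.0, 5.0], [2.0, 2.0, 5.1], [3.0, 3.0, 4.9], [4.0, 4.0, 5.0]]
rows_b = [[1.0, 2.0, 5.0], [2.0, 4.0, 5.0], [3.0, 1.0, 5.0], [4.0, 3.0, 5.0]]
diff = covariance_difference(rows_a, rows_b, t=0.5)
```

Here `diff[0][1]` is large because the gene pair is co-expressed in only one condition, while small noisy covariances such as the pair (0, 2) are thresholded to zero in both conditions and contribute nothing.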
9.
PLoS One ; 15(4): e0231000, 2020.
Article in English | MEDLINE | ID: mdl-32287265

ABSTRACT

Myotonic dystrophy type 1 (DM1) is a rare genetic disorder, characterised by muscular dystrophy, myotonia, and other symptoms. DM1 is caused by the expansion of a CTG repeat in the 3'-untranslated region of DMPK. Longer CTG expansions are associated with greater symptom severity and earlier age at onset. The primary mechanism of pathogenesis is thought to be mediated by a gain of function of the CUG-containing RNA, that leads to trans-dysregulation of RNA metabolism of many other genes. Specifically, the alternative splicing (AS) and alternative polyadenylation (APA) of many genes is known to be disrupted. In the context of clinical trials of emerging DM1 treatments, it is important to be able to objectively quantify treatment efficacy at the level of molecular biomarkers. We show how previously described candidate mRNA biomarkers can be used to model an effective reduction in CTG length, using modern high-dimensional statistics (machine learning), and a blood and muscle mRNA microarray dataset. We show how this model could be used to detect treatment effects in the context of a clinical trial.


Subject(s)
Myotonic Dystrophy/genetics , Myotonic Dystrophy/therapy , RNA, Messenger/genetics , Alternative Splicing , Biostatistics , Clinical Trials as Topic/methods , Clinical Trials as Topic/statistics & numerical data , Databases, Nucleic Acid/statistics & numerical data , Genetic Markers , Humans , Least-Squares Analysis , Machine Learning , Models, Genetic , Muscles/metabolism , Myotonic Dystrophy/metabolism , Myotonin-Protein Kinase/genetics , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Polyadenylation , RNA, Messenger/metabolism , Treatment Outcome , Trinucleotide Repeat Expansion
10.
BMC Res Notes ; 13(1): 92, 2020 Feb 24.
Article in English | MEDLINE | ID: mdl-32093752

ABSTRACT

OBJECTIVE: The biological interpretation of gene expression measurements is a challenging task. While ordination methods are routinely used to identify clusters of samples or co-expressed genes, these methods do not take sample or gene annotations into account. We aim to provide a tool that allows users of all backgrounds to assess and visualize the intrinsic correlation structure of complex annotated gene expression data and discover the covariates that jointly affect expression patterns. RESULTS: The Bioconductor package covRNA provides a convenient and fast interface for testing and visualizing complex relationships between sample and gene covariates mediated by gene expression data in an entirely unsupervised setting. The relationships between sample and gene covariates are tested by statistical permutation tests and visualized by ordination. The methods are inspired by the fourthcorner and RLQ analyses used in ecological research for the analysis of species abundance data, which we modified to make them suitable for the distributional characteristics of both RNA-Seq read counts and microarray intensities, and to provide a high-performance parallelized implementation for the analysis of large-scale gene expression data on multi-core computational systems. covRNA provides additional modules for unsupervised gene filtering and plotting functions to ensure a smooth and coherent analysis workflow.


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Software , Humans , Multivariate Analysis , Oligonucleotide Array Sequence Analysis/methods , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Reproducibility of Results
11.
J Comput Biol ; 27(9): 1384-1396, 2020 09.
Article in English | MEDLINE | ID: mdl-32031874

ABSTRACT

One of the main methods to analyze gene expression data is biclustering, an unsupervised technique that consists of selecting subgroups of genes that are co-expressed under subgroups of experimental conditions. A large number of biclustering algorithms have been developed to classify gene expression data. These algorithms can output a large number of overlapping biclusters, whose visualization still requires further study. We present VisBicluster, a web-based interactive visualization tool for displaying biclustering results. The developed visualization technique consists of laying out the generated biclusters in a two-dimensional matrix where each bicluster is represented as a column and each overlap between a set of biclusters is represented as a row. A search interface is provided for the user to query the matrix of bicluster intersections and visualize the results matching the queries. Our tool supports many interactive features such as sorting, zooming, and details-on-demand. We demonstrated the usefulness of VisBicluster with biclustering results from real and synthetic datasets. In addition, we performed a user study with 14 participants to illustrate the clarity and simplicity of overlap representation with our tool.


Subject(s)
Computational Biology , Gene Expression Profiling/statistics & numerical data , Gene Expression/genetics , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Algorithms , Cluster Analysis , Computer Graphics , Humans , User-Computer Interface
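
The matrix layout described in the abstract above (one column per bicluster, one row per overlap between a set of biclusters) can be sketched as a small data structure. This is a guess at one reasonable construction, not the tool's actual algorithm: it assumes each bicluster is reduced to its set of genes (conditions are ignored) and that an overlap is the group of genes shared by exactly the same set of two or more biclusters. All names are hypothetical.

```python
from collections import defaultdict


def overlap_matrix(biclusters):
    """Build a bicluster-overlap matrix.

    biclusters: dict mapping a bicluster name to a set of gene identifiers.
    Returns (columns, matrix, shared): column names, one boolean row per
    overlap, and the genes belonging to each overlap row.
    """
    membership = defaultdict(set)
    for name, genes in biclusters.items():
        for gene in genes:
            membership[gene].add(name)

    # Group genes by the exact set of biclusters that contain them.
    groups = defaultdict(set)
    for gene, names in membership.items():
        if len(names) >= 2:  # only genes shared by at least two biclusters
            groups[frozenset(names)].add(gene)

    columns = sorted(biclusters)
    rows = sorted(groups, key=lambda s: (len(s), sorted(s)))
    matrix = [[name in row for name in columns] for row in rows]
    shared = [sorted(groups[row]) for row in rows]
    return columns, matrix, shared


# Hypothetical biclustering result with three overlapping biclusters.
example = {
    "B1": {"g1", "g2", "g3"},
    "B2": {"g2", "g3", "g4"},
    "B3": {"g3", "g5"},
}
columns, matrix, shared = overlap_matrix(example)
```

In this example the first row marks the overlap of B1 and B2 (gene g2) and the second marks the overlap of all three biclusters (gene g3); sorting or filtering the rows then corresponds to the kind of query a search interface could expose.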
12.
Med Sci Monit ; 26: e920261, 2020 Feb 14.
Article in English | MEDLINE | ID: mdl-32058995

ABSTRACT

BACKGROUND: Gastric adenocarcinoma accounts for 95% of all gastric malignant tumors. The purpose of this research was to identify differentially expressed genes (DEGs) of gastric adenocarcinoma by use of bioinformatics methods. MATERIAL AND METHODS: The gene microarray datasets GSE103236, GSE79973, and GSE29998 were imported from the GEO database, containing 70 gastric adenocarcinoma samples and 68 matched normal samples. Gene ontology (GO) and KEGG analyses were applied to the screened DEGs; Cytoscape software was used to construct protein-protein interaction (PPI) networks and to perform module analysis of the DEGs. UALCAN was used for prognostic analysis. RESULTS: We identified 2909 upregulated DEGs (uDEGs) and 7106 downregulated DEGs (dDEGs) of gastric adenocarcinoma. The GO analysis showed uDEGs were enriched in skeletal system development, cell adhesion, and biological adhesion. KEGG pathway analysis showed uDEGs were enriched in ECM-receptor interaction, focal adhesion, and cytokine-cytokine receptor interaction. The top 10 hub genes - COL1A1, COL3A1, COL1A2, BGN, COL5A2, THBS2, TIMP1, SPP1, PDGFRB, and COL4A1 - were distinguished from the PPI network. These 10 hub genes were shown to be significantly upregulated in gastric adenocarcinoma tissues in GEPIA. Prognostic analysis of the 10 hub genes via UALCAN showed that the upregulated expression of COL3A1, COL1A2, BGN, and THBS2 significantly reduced the survival time of gastric adenocarcinoma patients. Module analysis revealed that gastric adenocarcinoma was related to 2 pathways: focal adhesion signaling and ECM-receptor interaction. CONCLUSIONS: This research distinguished hub genes and relevant signal pathways, which contributes to our understanding of the molecular mechanisms, and could be used as diagnostic indicators and therapeutic biomarkers for gastric adenocarcinoma.


Subject(s)
Adenocarcinoma/genetics , Biomarkers, Tumor/genetics , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Stomach Neoplasms/genetics , Adenocarcinoma/mortality , Adenocarcinoma/pathology , Computational Biology , Datasets as Topic , Gastric Mucosa/pathology , Gene Expression Profiling , Humans , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Prognosis , Protein Interaction Mapping , Protein Interaction Maps/genetics , Signal Transduction/genetics , Stomach Neoplasms/mortality , Stomach Neoplasms/pathology , Survival Analysis , Time Factors
13.
Cancer Med ; 9(4): 1419-1429, 2020 02.
Article in English | MEDLINE | ID: mdl-31893575

ABSTRACT

Early identification of metastatic or recurrent colorectal cancer (CRC) patients who will be sensitive to FOLFOX (5-FU, leucovorin and oxaliplatin) therapy is very important. We performed microarray meta-analysis to identify differentially expressed genes (DEGs) between FOLFOX responders and nonresponders in metastatic or recurrent CRC patients, and found that the expression levels of WASHC4, HELZ, ERN1, RPS6KB1, and APPBP2 were downregulated, while the expression levels of IRF7, EML3, LYPLA2, DRAP1, RNH1, PKP3, TSPAN17, LSS, MLKL, PPP1R7, GCDH, C19ORF24, and CCDC124 were upregulated in FOLFOX responders compared with nonresponders. Subsequent functional annotation showed that DEGs were significantly enriched in autophagy, ErbB signaling pathway, mitophagy, endocytosis, FoxO signaling pathway, apoptosis, and antifolate resistance pathways. Based on those candidate genes, several machine learning algorithms were applied to the training set, then performances of models were assessed via the cross validation method. Candidate models with the best tuning parameters were applied to the test set and the final model showed satisfactory performance. In addition, we also reported that MLKL and CCDC124 gene expression were independent prognostic factors for metastatic CRC patients undergoing FOLFOX therapy.


Subject(s)
Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Biomarkers, Tumor/genetics , Colorectal Neoplasms/drug therapy , Machine Learning , Neoplasm Recurrence, Local/drug therapy , Cell Cycle Proteins/genetics , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Datasets as Topic , Fluorouracil/therapeutic use , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation, Neoplastic , Humans , Intracellular Signaling Peptides and Proteins/genetics , Leucovorin/therapeutic use , Neoplasm Recurrence, Local/genetics , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Organoplatinum Compounds/therapeutic use , Prognosis , Protein Kinases/genetics , Response Evaluation Criteria in Solid Tumors
14.
Cancer Med ; 9(3): 1242-1253, 2020 02.
Article in English | MEDLINE | ID: mdl-31856408

ABSTRACT

Most high-grade serous ovarian cancer (HGSOC) patients develop resistance to platinum-based chemotherapy and recur. Many biomarkers related to the survival and prognosis of drug-resistant patients have been identified by mining databases; however, the predictive effect of single-gene biomarkers is not specific and sensitive enough. The present study aimed to develop a novel prognostic gene signature of platinum-based resistance for patients with HGSOC. The gene expression profiles were obtained from the Gene Expression Omnibus and The Cancer Genome Atlas databases. A total of 269 differentially expressed genes (DEGs) associated with platinum resistance were identified (P < .05, fold change >1.5). Functional analysis revealed that these DEGs were mainly involved in the apoptosis process and the PI3K-Akt pathway. Furthermore, we established a seven-gene signature that was significantly associated with overall survival (OS) in the test series. Compared with the low-risk score group, patients with a high-risk score suffered poorer OS (P < .001). The area under the curve (AUC) was found to be 0.710, which means the risk score had a certain accuracy in predicting OS in HGSOC (AUC > 0.7). Surprisingly, the risk score was identified as an independent prognostic indicator for HGSOC (P < .001). Subgroup analyses suggested that the risk score had a greater prognostic value for patients with grade 3-4, stage III-IV, venous invasion and objective response. In conclusion, we developed a seven-gene signature relating to platinum resistance, which can predict survival for HGSOC and provide novel insights into the understanding of platinum resistance mechanisms and the identification of HGSOC patients with poor prognosis.


Subject(s)
Antineoplastic Combined Chemotherapy Protocols/pharmacology , Biomarkers, Tumor/genetics , Cystadenocarcinoma, Serous/drug therapy , Drug Resistance, Neoplasm/genetics , Organoplatinum Compounds/pharmacology , Ovarian Neoplasms/drug therapy , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Computational Biology , Cystadenocarcinoma, Serous/genetics , Cystadenocarcinoma, Serous/mortality , Cystadenocarcinoma, Serous/pathology , Datasets as Topic , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Models, Genetic , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Organoplatinum Compounds/therapeutic use , Ovarian Neoplasms/genetics , Ovarian Neoplasms/mortality , Ovarian Neoplasms/pathology , Phosphatidylinositol 3-Kinases/metabolism , Prognosis , Progression-Free Survival , RNA, Messenger , ROC Curve , Transcriptome/genetics
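
The area under the ROC curve reported for the risk score above can be illustrated with a minimal sketch. It assumes a plain binary outcome and computes the AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one (ties count one half). Survival studies usually use time-dependent ROC analysis that accounts for censoring, which this sketch does not attempt; the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    # Each correctly ordered pair scores 1, each tie 0.5, each inversion 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores; label 1 marks patients with the event of interest.
risk_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
events = [1, 0, 1, 0, 0]
auc = roc_auc(risk_scores, events)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value near 0.7 indicates moderate discrimination.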
15.
Clin Transl Sci ; 13(1): 169-178, 2020 01.
Article in English | MEDLINE | ID: mdl-31794148

ABSTRACT

As an extremely prevalent disease worldwide, allergic rhinitis (AR) is a condition characterized by chronic inflammation of the nasal mucosa. To identify the finer molecular mechanisms associated with the AR susceptibility genes, differentially expressed genes (DEGs) in AR were investigated. The DEG expression and clinical data of the GSE19187 data set were used for weighted gene co-expression network analysis (WGCNA). After the modules related to AR had been screened, the genes in the module were extracted for Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, whereby the genes enriched in the KEGG pathway were regarded as the pathway-genes. The DEGs in patients with AR were subsequently screened out from GSE19187, and the sensitive genes were identified in GSE18574 in connection with the allergen challenge. Two kinds of genes were compared with the pathway-genes in order to screen the AR susceptibility genes. Receiver operating characteristic (ROC) curve was plotted to evaluate the capability of the susceptibility genes to distinguish the AR state. Based on the WGCNA in the GSE19187 data set, 10 co-expression network modules were identified. The correlation analyses revealed that the yellow module was positively correlated with the disease state of AR. A total of 89 genes were found to be involved in the enrichment of the yellow module pathway. Four genes (CST1, SH2D1B, DPP4, and SLC5A5) were upregulated in AR and sensitive to allergen challenge, whose potentials were further confirmed by ROC curve. Taken together, CST1, SH2D1B, DPP4, and SLC5A5 are susceptibility genes to AR.


Subject(s)
Gene Regulatory Networks/immunology , Genetic Predisposition to Disease , Rhinitis, Allergic/genetics , Biomarkers/analysis , Computational Biology/methods , Datasets as Topic , Dipeptidyl Peptidase 4/analysis , Dipeptidyl Peptidase 4/genetics , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation/immunology , Humans , Nasal Mucosa/immunology , Nasal Mucosa/pathology , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Predictive Value of Tests , ROC Curve , Rhinitis, Allergic/epidemiology , Rhinitis, Allergic/immunology , Rhinitis, Allergic/pathology , Risk Assessment/methods , Salivary Cystatins/analysis , Salivary Cystatins/genetics , Symporters/analysis , Symporters/genetics , Transcription Factors/analysis , Transcription Factors/genetics
16.
Cancer Med ; 9(1): 335-349, 2020 01.
Article in English | MEDLINE | ID: mdl-31743579

ABSTRACT

Gastric cancer (GC) remains an important malignancy worldwide with poor prognosis. Long noncoding RNAs (lncRNAs) can markedly affect cancer progression. Moreover, lncRNAs have been proposed as diagnostic or prognostic biomarkers of GC. Therefore, the current study aimed to explore lncRNA-based prognostic biomarkers for GC. LncRNA expression profiles were first downloaded from the Gene Expression Omnibus (GEO) database. After re-annotation of lncRNAs, a univariate Cox analysis identified 177 prognostic lncRNA probes in the training set GSE62254 (n = 225). Multivariate Cox analysis of each lncRNA with clinical characteristics as covariates identified a total of 46 prognostic lncRNA probes. Robust likelihood-based survival and least absolute shrinkage and selection operator (LASSO) models were used to establish a 6-lncRNA signature with prognostic value. Receiver operating characteristic (ROC) curve analyses were employed to compare survival prediction in terms of specificity and sensitivity. Patients with high-risk scores exhibited a significantly worse overall survival (OS) than patients with low-risk scores (log-rank test, P < .0001), and the area under the ROC curve (AUC) for 5-year survival was 0.77. A nomogram and forest plot were constructed to compare the clinical characteristics and risk scores by a multivariable Cox regression analysis, which suggested that the 6-lncRNA signature is an independent prognostic factor for patients. Single-sample GSEA (ssGSEA) was used to determine the relationships between the 6-lncRNA signature and biological functions. The internal validation set GSE62254 (n = 75) and the external validation set GSE57303 (n = 70) were successfully used to validate the robustness of the 6-lncRNA signature. In conclusion, the 6-lncRNA signature can effectively predict the prognosis of GC patients.
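Signatures of this kind are commonly turned into a per-patient risk score by summing coefficient-weighted expression values and then splitting patients into high- and low-risk groups. The sketch below assumes that construction and a median cutoff; neither is stated in the abstract. The lncRNA names, coefficients, and expression values are hypothetical, and three placeholder lncRNAs stand in for the six in the signature.

```python
from statistics import median

# Hypothetical signature coefficients (assumption: not from the study)
coefficients = {"lnc_A": 0.42, "lnc_B": -0.31, "lnc_C": 0.18}


def risk_score(expression, coefs):
    """Linear predictor: sum of coefficient * expression over signature lncRNAs."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


# Hypothetical expression profiles for four patients
patients = {
    "P1": {"lnc_A": 2.0, "lnc_B": 1.0, "lnc_C": 0.5},
    "P2": {"lnc_A": 0.5, "lnc_B": 2.0, "lnc_C": 1.0},
    "P3": {"lnc_A": 1.5, "lnc_B": 0.5, "lnc_C": 2.0},
    "P4": {"lnc_A": 0.2, "lnc_B": 1.5, "lnc_C": 0.1},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}

# Assumed cutoff: the cohort median of the risk scores
cutoff = median(scores.values())
groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
print(groups)
```

The resulting groups are what a log-rank test or Kaplan-Meier comparison would then be run on.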


Subject(s)
Biomarkers, Tumor/metabolism , Nomograms , RNA, Long Noncoding/metabolism , Stomach Neoplasms/mortality , Datasets as Topic , Disease-Free Survival , Female , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation, Neoplastic , Humans , Kaplan-Meier Estimate , Likelihood Functions , Male , Middle Aged , Neoplasm Staging , Oligonucleotide Array Sequence Analysis/statistics & numerical data , ROC Curve , Stomach Neoplasms/genetics , Stomach Neoplasms/pathology
17.
Future Oncol ; 16(3): 4461-4473, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31854204

ABSTRACT

Currently, the prognostic effects of leukemia inhibitory factor (LIF) and LIF receptor (LIFR) in pancreatic adenocarcinoma (PAAD) are not clear. In the present study, we utilized large datasets from four public databases to investigate the expression of LIF and LIFR and their clinical significance in PAAD. Eight cohorts containing 1278 cases of PAAD were identified, and the analysis suggested that LIF was highly expressed while LIFR was expressed at low levels in PAAD tissues compared with adjacent or normal tissues. Kaplan-Meier curves and univariate and multivariate Cox proportional hazards regression analyses indicated that high LIF expression was associated with shorter overall survival (adjusted hazard ratio = 1.641, 95% CI: 1.399-1.925, p < 0.001), whereas high LIFR expression was associated with longer overall survival (adjusted hazard ratio = 0.653, 95% CI: 0.517-0.826, p < 0.001).
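The reported hazard ratios and confidence intervals can be sanity-checked with a short calculation. The sketch below assumes the intervals are Wald intervals, symmetric on the log scale (a common convention for Cox models, but not confirmed by the abstract), recovers the standard error of the log hazard ratio from the interval width, and derives an approximate two-sided p-value.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.96):
    """Recover SE(log HR), the Wald z statistic, and a two-sided p-value
    from a hazard ratio and its 95% confidence interval.

    Assumes the interval is exp(log HR +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values as reported in the abstract
reported = [("LIF", 1.641, 1.399, 1.925), ("LIFR", 0.653, 0.517, 0.826)]

for name, hr, lower, upper in reported:
    se, z, p = wald_from_ci(hr, lower, upper)
    print(f"{name}: SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

Under this assumption both p-values fall below 0.001, which agrees with the values given in the abstract.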


Subject(s)
Adenocarcinoma/genetics , Biomarkers, Tumor/genetics , Leukemia Inhibitory Factor Receptor alpha Subunit/genetics , Leukemia Inhibitory Factor/genetics , Pancreatic Neoplasms/genetics , Adenocarcinoma/mortality , Adenocarcinoma/pathology , Aged , Cohort Studies , Datasets as Topic , Down-Regulation , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Kaplan-Meier Estimate , Male , Middle Aged , Neoplasm Staging , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Pancreas/pathology , Pancreatic Neoplasms/mortality , Pancreatic Neoplasms/pathology , Prognosis , Up-Regulation , Pancreatic Neoplasms
18.
J Bioinform Comput Biol ; 17(5): 1940010, 2019 10.
Article in English | MEDLINE | ID: mdl-31856670

ABSTRACT

Gene set analysis is a quantitative approach for generating biological insight from gene expression datasets. The abundance of gene set analysis methods speaks to their popularity, but raises the question of the extent to which results are affected by the choice of method. Our systematic analysis of 13 popular methods using 6 different datasets, of both DNA microarray and RNA-Seq origin, shows that this choice matters a great deal. We observed that the overall number of gene sets reported by each method differed by up to 2 orders of magnitude, and that some methods were biased toward reporting large gene sets. Furthermore, there was substantial disagreement between the 20 most statistically significant gene sets reported by the methods. This was also observed when expanding to the 100 most statistically significant reported gene sets. For different datasets of the same phenotype/condition, the top 20 and top 100 most significant results also showed little to no agreement even when using the same method. GAGE, PAGE, and ORA were the only methods able to achieve relatively high reproducibility when comparing the 20 and 100 most statistically significant gene sets. Biological validation on a juvenile idiopathic arthritis (JIA) dataset showed wide variation in the relevance of the top 20 and top 100 most significant gene sets to the known biology of the disease, where GAGE predicted the most relevant gene sets, followed by GSEA, ORA, and PAGE.
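One simple way to quantify the kind of disagreement described here is the Jaccard index between the top-k gene sets reported by two methods. The sketch below uses that measure as an assumption; the abstract does not specify which agreement metric the authors used, and the ranked lists are invented placeholders.

```python
def top_k_overlap(ranked_a, ranked_b, k):
    """Jaccard index between the top-k entries of two ranked gene-set lists."""
    top_a, top_b = set(ranked_a[:k]), set(ranked_b[:k])
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 0.0


# Hypothetical ranked outputs of two gene set analysis methods
method_1 = ["GS1", "GS2", "GS3", "GS4", "GS5"]
method_2 = ["GS3", "GS9", "GS1", "GS7", "GS8"]

overlap = top_k_overlap(method_1, method_2, k=3)
print(overlap)
```

A value of 1 means the two top-k lists contain the same gene sets; a value of 0 means they share none.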


Subject(s)
Databases, Genetic , Gene Expression Profiling/statistics & numerical data , Arthritis, Juvenile/genetics , Gene Expression Profiling/methods , Gene Expression Profiling/standards , Humans , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Phenotype , Psoriasis/genetics , Reproducibility of Results
19.
Genes (Basel) ; 10(11), 2019 11 14.
Article in English | MEDLINE | ID: mdl-31739607

ABSTRACT

Although there have been several analyses for identifying cancer-associated pathways based on gene expression data, most of these are single-pathway analyses and thus do not consider correlations between pathways. In this paper, we propose a hierarchical structural component model for pathway analysis of gene expression data (HisCoM-PAGE), which accounts for the hierarchical structure of genes and pathways, as well as the correlations among pathways. Specifically, HisCoM-PAGE focuses on the survival phenotype and identifies its associated pathways. Moreover, its application to real pancreatic cancer data demonstrated that HisCoM-PAGE could successfully identify pathways associated with pancreatic cancer prognosis. Simulation studies comparing the performance of HisCoM-PAGE with competing methods such as Gene Set Enrichment Analysis (GSEA), Global Test, and Wald-type Test showed HisCoM-PAGE to have the highest power to detect causal pathways in most simulation scenarios.


Subject(s)
Carcinoma, Pancreatic Ductal/genetics , Data Analysis , Gene Expression Regulation, Neoplastic , Models, Genetic , Pancreatic Neoplasms/genetics , Aged , Algorithms , Carcinoma, Pancreatic Ductal/mortality , Computer Simulation , Databases, Genetic/statistics & numerical data , Datasets as Topic , Feasibility Studies , Female , Gene Regulatory Networks , Humans , Male , Middle Aged , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Pancreatic Neoplasms/mortality , Prognosis , RNA-Seq/statistics & numerical data , Republic of Korea/epidemiology , Survival Analysis
20.
PLoS One ; 14(11): e0224446, 2019.
Article in English | MEDLINE | ID: mdl-31730620

ABSTRACT

Cancer is one of the leading causes of death worldwide. Many believe that genomic data will enable us to better predict the survival time of cancer patients, which will lead to better, more personalized treatment options and patient care. As standard survival prediction models cope poorly with the high dimensionality of such gene expression data, many projects use dimensionality reduction techniques to overcome this hurdle. We introduce a novel methodology, inspired by topic modeling from the natural language domain, to derive expressive features from the high-dimensional gene expression data. There, a document is represented as a mixture over a relatively small number of topics, where each topic corresponds to a distribution over the words; here, to accommodate the heterogeneity of a patient's cancer, we represent each patient (≈ document) as a mixture over cancer-topics, where each cancer-topic is a mixture over gene expression values (≈ words). This required some extensions to the standard LDA model (e.g., to accommodate the real-valued expression values), leading to our novel discretized Latent Dirichlet Allocation (dLDA) procedure. After using this dLDA to learn these cancer-topics, we can then express each patient as a distribution over a small number of cancer-topics, then use this low-dimensional "distribution vector" as input to a learning algorithm; here, we ran the recent survival prediction algorithm, MTLR, on this representation of the cancer dataset. We initially focus on the METABRIC dataset, which describes each of n = 1,981 breast cancer patients using r = 49,576 gene expression values from microarrays. Our results show that our approach (dLDA followed by MTLR) provides survival estimates that are more accurate than standard models, in terms of the standard Concordance measure. We then validate this "dLDA+MTLR" approach by running it on the n = 883 Pan-kidney (KIPAN) dataset, over r = 15,529 gene expression values (here using the mRNAseq modality), and find that it again achieves excellent results. In both cases, we also show that the resulting model is calibrated, using the recent "D-calibrated" measure. These successes, in two different cancer types and expression modalities, demonstrate the generality and the effectiveness of this approach. The dLDA+MTLR source code is available at https://github.com/nitsanluke/GE-LDA-Survival.
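The Concordance measure mentioned in this abstract can be illustrated with a minimal version of Harrell's concordance index for right-censored data. This sketch is not taken from the authors' repository; the survival times, event indicators, and risk scores are invented, and it assumes each patient is summarized by a single risk score where higher means earlier expected death.

```python
def concordance_index(times, events, risks):
    """Harrell's C: among comparable pairs, the fraction in which the patient
    who died earlier was assigned the higher risk (ties in risk count 0.5).

    A pair (i, j) is comparable when patient i has an observed event
    and times[i] < times[j]. Tied survival times are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort: follow-up time, event flag (1 = death, 0 = censored), risk
times = [5, 8, 12, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.1, 0.2]

c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.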


Subject(s)
Gene Expression Regulation, Neoplastic , Models, Biological , Natural Language Processing , Neoplasms/mortality , Datasets as Topic , Gene Expression Profiling/statistics & numerical data , Humans , Kaplan-Meier Estimate , Neoplasms/genetics , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Prognosis