Results 1 - 20 of 1,064
1.
Biomed Res Int ; 2021: 8171236, 2021.
Article in English | MEDLINE | ID: mdl-34812409

ABSTRACT

OBJECTIVE: This study set out to explore differentially expressed miRs in Parkinson's disease (PD) using GEO data and to provide diagnostic indicators for clinical practice. METHODS: Differentially expressed miRs were screened through the Gene Expression Omnibus (GEO) database. Sixty-eight PD patients treated in our hospital from May 2017 to March 2018 were enrolled as the research group (RG), and 50 normal subjects who underwent physical examination in our hospital during the same period were enrolled as the control group (CG). Quantitative real-time polymerase chain reaction (qRT-PCR) was used to measure the expression of miR-374a-5p in patient serum and to assess its diagnostic value. The potential target genes of miR-374a-5p were predicted, and Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses were carried out. RESULTS: GEO2R analysis revealed that 193 miRs were differentially expressed, of which 78 were highly expressed and 115 were poorly expressed. The miR-374a-5p expression in the serum of the RG was markedly reduced and had diagnostic value. The TargetScan and miRDB online tools were used to predict its target genes, yielding 415 common target genes. miR-374a-5p may participate in 27 functional pathways and 8 signal pathways. CONCLUSION: miR-374a-5p has low expression in PD and is expected to be a potential diagnostic indicator.


Subjects
MicroRNAs/genetics , Parkinson Disease/genetics , Case-Control Studies , Computational Biology , Databases, Nucleic Acid , Gene Ontology , Genetic Markers , Humans , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Parkinson Disease/diagnosis , Signal Transduction/genetics
2.
Comput Math Methods Med ; 2021: 7471516, 2021.
Article in English | MEDLINE | ID: mdl-34394707

ABSTRACT

High-throughput data make it possible to study the expression levels of thousands of genes simultaneously under a particular condition. However, only a few of the genes are discriminatively expressed. Identifying these biomarkers precisely is significant for disease diagnosis, prognosis, and therapy. Many studies have used pathway information to identify the biomarkers. However, most of these studies incorporate only the group information, while the pathway structural information is ignored. In this paper, we propose a Bayesian gene selection method with network-constrained regularization, which can incorporate the pathway structural information as priors to perform gene selection. All the priors are conjugate; thus, the parameters can be estimated effectively through Gibbs sampling. We present the application of our method to 6 microarray datasets, comparing it with Bayesian Lasso, Bayesian Elastic Net, and Bayesian Fused Lasso. The results show that our method performs better than the other Bayesian methods and that pathway structural information can improve the result.


Subjects
Bayes Theorem , Gene Regulatory Networks , Genetic Markers , Biomarkers, Tumor/genetics , Computational Biology , Computer Simulation , Databases, Genetic/statistics & numerical data , Female , Gene Expression Profiling , Genetic Predisposition to Disease , Humans , Male , Models, Genetic , Neoplasms/genetics , Oligonucleotide Array Sequence Analysis/statistics & numerical data
3.
Comput Math Methods Med ; 2021: 5584684, 2021.
Article in English | MEDLINE | ID: mdl-34122617

ABSTRACT

In view of the challenges of the group Lasso penalty methods for multicancer microarray data analysis, e.g., dividing genes into groups in advance and biological interpretability, we propose a robust adaptive multinomial regression with sparse group Lasso penalty (RAMRSGL) model. By adopting the overlapping clustering strategy, affinity propagation clustering is employed to obtain each cancer gene subtype, which explores the group structure of each cancer subtype and merges the groups of all subtypes. In addition, the data-driven weights based on noise are added to the sparse group Lasso penalty, combining with the multinomial log-likelihood function to perform multiclassification and adaptive group gene selection simultaneously. The experimental results on acute leukemia data verify the effectiveness of the proposed method.


Subjects
Algorithms , Neoplasms/classification , Neoplasms/genetics , Cluster Analysis , Computational Biology , Databases, Genetic/statistics & numerical data , Humans , Leukemia/classification , Leukemia/genetics , Likelihood Functions , Models, Genetic , Multigene Family , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Oncogenes , Regression Analysis
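
To illustrate the kind of penalty named in the abstract above, the following Python sketch evaluates a weighted sparse group Lasso penalty for a coefficient vector. It is a minimal sketch under stated assumptions, not the RAMRSGL implementation: the function name, the mixing parameter `alpha`, and the default weights (square root of group size) are hypothetical choices, whereas the abstract describes data-driven weights based on noise. Groups are given as lists of indices and may overlap.

```python
import math


def sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5, weights=None):
    """Evaluate lam * (alpha * ||beta||_1 + (1 - alpha) * sum_g w_g * ||beta_g||_2).

    beta    : list of coefficients
    groups  : list of index lists (groups may overlap)
    weights : one weight per group; defaults to sqrt(group size), which is
              an assumption made for this sketch only
    """
    if weights is None:
        weights = [math.sqrt(len(g)) for g in groups]
    if len(weights) != len(groups):
        raise ValueError("need exactly one weight per group")

    # Lasso part: encourages sparsity of individual coefficients.
    l1_term = sum(abs(b) for b in beta)

    # Group part: encourages whole groups of coefficients to vanish together.
    group_term = sum(
        w * math.sqrt(sum(beta[j] ** 2 for j in g))
        for w, g in zip(weights, groups)
    )
    return lam * (alpha * l1_term + (1.0 - alpha) * group_term)


if __name__ == "__main__":
    coefficients = [3.0, 4.0, 0.0]
    gene_groups = [[0, 1], [2]]
    print(sparse_group_lasso_penalty(coefficients, gene_groups,
                                     lam=1.0, alpha=0.5, weights=[1.0, 1.0]))
```

Setting `alpha=1.0` reduces the expression to a plain Lasso penalty and `alpha=0.0` to a plain group Lasso penalty, which is a quick way to sanity-check the two terms separately.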
4.
Comput Math Methods Med ; 2021: 5556992, 2021.
Article in English | MEDLINE | ID: mdl-33986823

ABSTRACT

Ensemble learning combines multiple learners to perform combinatorial learning, which has advantages of good flexibility and higher generalization performance. To achieve higher quality cancer classification, in this study, the fast correlation-based feature selection (FCBF) method was used to preprocess the data to eliminate irrelevant and redundant features. Then, the classification was carried out in the stacking ensemble learner. A library for support vector machine (LIBSVM), K-nearest neighbor (KNN), decision tree C4.5 (C4.5), and random forest (RF) were used as the primary learners of the stacking ensemble. Given the imbalanced characteristics of cancer gene expression data, the embedding cost-sensitive naive Bayes was used as the metalearner of the stacking ensemble, which was represented as CSNB stacking. The proposed CSNB stacking method was applied to nine cancer datasets to further verify the classification performance of the model. Compared with other classification methods, such as single classifier algorithms and ensemble algorithms, the experimental results showed the effectiveness and robustness of the proposed method in processing different types of cancer data. This method may therefore help guide cancer diagnosis and research.


Subjects
Algorithms , Machine Learning , Neoplasms/classification , Bayes Theorem , Computational Biology , Databases, Genetic/statistics & numerical data , Decision Trees , Female , Gene Expression Regulation, Neoplastic , Humans , Male , Neoplasms/genetics , Neural Networks, Computer , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Oncogenes , ROC Curve , Support Vector Machine
5.
Nucleic Acids Res ; 49(D1): D1502-D1506, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33211879

ABSTRACT

ArrayExpress (https://www.ebi.ac.uk/arrayexpress) is an archive of functional genomics data at EMBL-EBI, established in 2002, initially as an archive for publication-related microarray data and was later extended to accept sequencing-based data. Over the last decade an increasing share of biological experiments involve multiple technologies assaying different biological modalities, such as epigenetics, and RNA and protein expression, and thus the BioStudies database (https://www.ebi.ac.uk/biostudies) was established to deal with such multimodal data. Its central concept is a study, which typically is associated with a publication. BioStudies stores metadata describing the study, provides links to the relevant databases, such as European Nucleotide Archive (ENA), as well as hosts the types of data for which specialized databases do not exist. With BioStudies now fully functional, we are able to further harmonize the archival data infrastructure at EMBL-EBI, and ArrayExpress is being migrated to BioStudies. In future, all functional genomics data will be archived at BioStudies. The process will be seamless for the users, who will continue to submit data using the online tool Annotare and will be able to query and download data largely in the same manner as before. Nevertheless, some technical aspects, particularly programmatic access, will change. This update guides the users through these changes.


Subjects
Databases, Genetic , Epigenesis, Genetic , Genomics/methods , High-Throughput Nucleotide Sequencing/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Animals , Cell Line , DNA Methylation , Gene Expression Profiling , Humans , Internet , Metadata , Organ Specificity , Plants/genetics , Single-Cell Analysis , Software
6.
Clin Chem ; 66(7): 934-945, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32613237

ABSTRACT

BACKGROUND: We translated a multigene expression index to predict sensitivity to endocrine therapy for Stage II-III breast cancer (SET2,3) to hybridization-based expression assays of formalin-fixed paraffin-embedded (FFPE) tissue sections. Here we report the technical validity with FFPE samples, including preanalytical and analytical performance. METHODS: We calibrated SET2,3 from microarrays (Affymetrix U133A) of frozen samples to hybridization-based assays of FFPE tissue, using bead-based QuantiGene Plex (QGP) and slide-based NanoString (NS). The following preanalytical and analytical conditions were tested in controlled studies: replicates within and between frozen and fixed samples, age of paraffin blocks, homogenization of fixed sections versus extracted RNA, core biopsy versus surgically resected tumor, technical replicates, precision over 20 weeks, limiting dilution, linear range, and analytical sensitivity. Lin's concordance correlation coefficient (CCC) was used to measure concordance between measurements. RESULTS: SET2,3 index was calibrated to use with QGP (CCC 0.94) and NS (CCC 0.93) technical platforms, and was validated in two cohorts of older fixed samples using QGP (CCC 0.72, 0.85) and NS (CCC 0.78, 0.78). QGP assay was concordant using direct homogenization of fixed sections versus purified RNA (CCC 0.97) and between core and surgical sample types (CCC 0.90), with 100% accuracy in technical replicates, 1-9% coefficient of variation over 20 weekly tests, linear range 3.0-11.5 (log2 counts), and analytical sensitivity ≥2.0 (log2 counts). CONCLUSIONS: Measurement of the novel SET2,3 assay was technically valid from fixed tumor sections of biopsy or resection samples using simple, inexpensive, hybridization methods, without the need for RNA purification.


Subjects
Breast Neoplasms/genetics , Gene Expression Profiling/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , RNA, Messenger/analysis , Aurora Kinase A/genetics , Breast Neoplasms/drug therapy , Breast Neoplasms/pathology , Cohort Studies , Estrogen Receptor alpha/genetics , Estrogens/therapeutic use , Humans , Paraffin Embedding , Receptor, ErbB-2/genetics , Receptors, Progesterone/genetics , Reproducibility of Results , Tissue Fixation
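
The abstract above reports agreement between platforms using Lin's concordance correlation coefficient (CCC). As a hedged illustration of that statistic, the sketch below computes the CCC for two paired lists of measurements using the standard definition, 2·s_xy / (s_x² + s_y² + (mean_x − mean_y)²), with 1/n moments. The function name and the example values are invented for illustration and are not data from the study.

```python
def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Uses population (1/n) variances and covariance, as in Lin's original
    definition. Returns a value between -1 and 1.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined when both series are the same constant")
    return 2 * cov_xy / denominator


if __name__ == "__main__":
    # Hypothetical paired index values from two assay platforms.
    platform_a = [1.0, 2.0, 3.0]
    platform_b = [2.0, 3.0, 4.0]
    print(lins_ccc(platform_a, platform_b))
```

Unlike the Pearson correlation, the CCC penalizes a systematic shift between the two series: the example pair is perfectly correlated but offset by one unit, so its CCC is well below 1.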
7.
BMC Cancer ; 20(1): 490, 2020 Jun 02.
Article in English | MEDLINE | ID: mdl-32487193

ABSTRACT

BACKGROUND: Stomach cancer (SC) is a type of cancer derived from the stomach mucous membrane. As there are non-specific symptoms or no noticeable symptoms at the early stage, newly diagnosed SC cases have usually reached an advanced stage and are thus difficult to cure. Therefore, in this study, we aimed to develop an integrated database of SC. METHODS: SC-related genes were identified through literature mining and by analyzing the publicly available microarray datasets. Using the RNA-seq, miRNA-seq and clinical data downloaded from The Cancer Genome Atlas (TCGA), the Kaplan-Meier (KM) survival curves for all the SC-related genes were generated and analyzed. The miRNAs (miRanda, miRTarget2, PicTar, PITA and TargetScan databases), SC-related miRNAs (HMDD and miR2Disease databases), single nucleotide polymorphisms (SNPs, dbSNP database), and SC-related SNPs (ClinVar database) were also retrieved from the indicated databases. Moreover, gene-disease (OMIM and GAD databases), copy number variation (CNV, DGV database), methylation (PubMeth database), drug (WebGestalt database), and transcription factor (TF, TRANSFAC database) analyses were performed for the differentially expressed genes (DEGs). RESULTS: In total, 9990 SC-related genes (including 8347 up-regulated genes and 1643 down-regulated genes) were identified, among which 65 genes were further confirmed as SC-related genes by enrichment analysis. In addition, 457 miRNAs, 20 SC-related miRNAs, 1570 SNPs, 108 SC-related SNPs, 419 TFs, 44,605 CNVs, 3404 drug-associated genes, 63 genes with methylation, and KM survival curves of 20,264 genes were obtained. By integrating these datasets, an integrated database of stomach cancer, designated SCDb (available at http://www.stomachcancerdb.org/), was established. CONCLUSIONS: As a comprehensive resource for human SC, the SCDb database will be very useful for SC-related research in the future, and will thus promote the understanding of the pathogenesis of SC.


Subjects
Computational Biology/methods , Databases, Genetic/statistics & numerical data , Datasets as Topic , Gene Expression Regulation, Neoplastic , Stomach Neoplasms/genetics , Computational Biology/statistics & numerical data , Gene Regulatory Networks , Humans , Kaplan-Meier Estimate , MicroRNAs/metabolism , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Polymorphism, Single Nucleotide , RNA-Seq/statistics & numerical data , Stomach Neoplasms/mortality , Stomach Neoplasms/pathology
8.
J Bioinform Comput Biol ; 18(1): 2050002, 2020 02.
Article in English | MEDLINE | ID: mdl-32336254

ABSTRACT

Gene set analysis aims to identify differentially expressed or co-expressed genes within a biological pathway between two experimental conditions, so that it can eventually reveal biological processes and pathways involved in disease development. In the last few decades, various statistical and computational methods have been proposed to improve statistical power of gene set analysis. In recent years, much attention has been paid to differentially co-expressed genes since they can be potentially disease-related genes without significant difference in average expression levels between two conditions. In this paper, we propose a new statistical method to identify differentially co-expressed genes from microarray gene expression data. The proposed method first estimates co-expression levels of paired genes using covariance regularization by thresholding, and then significance of difference in covariance estimation between two conditions is evaluated. We demonstrated that the proposed method is more powerful than the existing main-stream methods to detect co-expressed genes through extensive simulation studies. Also, we applied it to various microarray gene expression datasets related with mutant p53 transcriptional activity, and epithelium and stroma breast cancer.


Subjects
Breast Neoplasms/genetics , Computational Biology/methods , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Breast Neoplasms/pathology , Computer Simulation , Female , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation, Neoplastic , Humans , Mutation , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Tumor Suppressor Protein p53/genetics
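
The abstract above estimates pairwise co-expression by covariance regularization with thresholding. The sketch below shows one common form of that idea, hard thresholding of the sample covariance matrix, in which small off-diagonal entries are set to zero. It is an assumption-laden illustration only: the abstract does not state the thresholding rule, how the threshold is chosen, or the subsequent significance test, and the function names and toy data here are hypothetical.

```python
def sample_covariance(rows):
    """Unbiased sample covariance matrix.

    rows: list of samples, each a list of expression values (one per gene).
    """
    n = len(rows)
    p = len(rows[0])
    if n < 2:
        raise ValueError("need at least two samples")
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            s = sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            cov[i][j] = cov[j][i] = s
    return cov


def hard_threshold(cov, t):
    """Zero out off-diagonal entries whose absolute value is below t.

    Diagonal entries (variances) are always kept.
    """
    p = len(cov)
    return [
        [cov[i][j] if i == j or abs(cov[i][j]) >= t else 0.0 for j in range(p)]
        for i in range(p)
    ]


if __name__ == "__main__":
    # Toy data: 4 samples x 3 genes; gene 1 is exactly twice gene 0.
    expression = [[1, 2, 1], [2, 4, 0], [3, 6, 1], [4, 8, 1]]
    cov = sample_covariance(expression)
    for row in hard_threshold(cov, t=0.5):
        print(row)
```

In a two-condition setting, one would compute such a regularized estimate per condition and then compare the two matrices; that comparison step is left out here because the abstract gives no details about it.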
9.
PLoS One ; 15(4): e0231000, 2020.
Article in English | MEDLINE | ID: mdl-32287265

ABSTRACT

Myotonic dystrophy type 1 (DM1) is a rare genetic disorder, characterised by muscular dystrophy, myotonia, and other symptoms. DM1 is caused by the expansion of a CTG repeat in the 3'-untranslated region of DMPK. Longer CTG expansions are associated with greater symptom severity and earlier age at onset. The primary mechanism of pathogenesis is thought to be mediated by a gain of function of the CUG-containing RNA, that leads to trans-dysregulation of RNA metabolism of many other genes. Specifically, the alternative splicing (AS) and alternative polyadenylation (APA) of many genes is known to be disrupted. In the context of clinical trials of emerging DM1 treatments, it is important to be able to objectively quantify treatment efficacy at the level of molecular biomarkers. We show how previously described candidate mRNA biomarkers can be used to model an effective reduction in CTG length, using modern high-dimensional statistics (machine learning), and a blood and muscle mRNA microarray dataset. We show how this model could be used to detect treatment effects in the context of a clinical trial.


Subjects
Myotonic Dystrophy/genetics , Myotonic Dystrophy/therapy , RNA, Messenger/genetics , Alternative Splicing , Biostatistics , Clinical Trials as Topic/methods , Clinical Trials as Topic/statistics & numerical data , Databases, Nucleic Acid/statistics & numerical data , Genetic Markers , Humans , Least-Squares Analysis , Machine Learning , Models, Genetic , Muscles/metabolism , Myotonic Dystrophy/metabolism , Myotonin-Protein Kinase/genetics , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Polyadenylation , RNA, Messenger/metabolism , Treatment Outcome , Trinucleotide Repeat Expansion
10.
BMC Res Notes ; 13(1): 92, 2020 Feb 24.
Article in English | MEDLINE | ID: mdl-32093752

ABSTRACT

OBJECTIVE: The biological interpretation of gene expression measurements is a challenging task. While ordination methods are routinely used to identify clusters of samples or co-expressed genes, these methods do not take sample or gene annotations into account. We aim to provide a tool that allows users of all backgrounds to assess and visualize the intrinsic correlation structure of complex annotated gene expression data and discover the covariates that jointly affect expression patterns. RESULTS: The Bioconductor package covRNA provides a convenient and fast interface for testing and visualizing complex relationships between sample and gene covariates mediated by gene expression data in an entirely unsupervised setting. The relationships between sample and gene covariates are tested by statistical permutation tests and visualized by ordination. The methods are inspired by the fourthcorner and RLQ analyses used in ecological research for the analysis of species abundance data, that we modified to make them suitable for the distributional characteristics of both, RNA-Seq read counts and microarray intensities, and to provide a high-performance parallelized implementation for the analysis of large-scale gene expression data on multi-core computational systems. CovRNA provides additional modules for unsupervised gene filtering and plotting functions to ensure a smooth and coherent analysis workflow.


Subjects
Computational Biology/methods , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Software , Humans , Multivariate Analysis , Oligonucleotide Array Sequence Analysis/methods , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Reproducibility of Results
11.
Med Sci Monit ; 26: e920261, 2020 Feb 14.
Article in English | MEDLINE | ID: mdl-32058995

ABSTRACT

BACKGROUND Gastric adenocarcinoma accounts for 95% of all gastric malignant tumors. The purpose of this research was to identify differentially expressed genes (DEGs) of gastric adenocarcinoma by use of bioinformatics methods. MATERIAL AND METHODS The gene microarray datasets of GSE103236, GSE79973, and GSE29998 were imported from the GEO database, containing 70 gastric adenocarcinoma samples and 68 matched normal samples. Gene ontology (GO) and KEGG analysis were applied to screened DEGs; Cytoscape software was used for constructing protein-protein interaction (PPI) networks and to perform module analysis of the DEGs. UALCAN was used for prognostic analysis. RESULTS We identified 2909 upregulated DEGs (uDEGs) and 7106 downregulated DEGs (dDEGs) of gastric adenocarcinoma. The GO analysis showed uDEGs were enriched in skeletal system development, cell adhesion, and biological adhesion. KEGG pathway analysis showed uDEGs were enriched in ECM-receptor interaction, focal adhesion, and Cytokine-cytokine receptor interaction. The top 10 hub genes - COL1A1, COL3A1, COL1A2, BGN, COL5A2, THBS2, TIMP1, SPP1, PDGFRB, and COL4A1 - were distinguished from the PPI network. These 10 hub genes were shown to be significantly upregulated in gastric adenocarcinoma tissues in GEPIA. Prognostic analysis of the 10 hub genes via UALCAN showed that the upregulated expression of COL3A1, COL1A2, BGN, and THBS2 significantly reduced the survival time of gastric adenocarcinoma patients. Module analysis revealed that gastric adenocarcinoma was related to 2 pathways: including focal adhesion signaling and ECM-receptor interaction. CONCLUSIONS This research distinguished hub genes and relevant signal pathways, which contributes to our understanding of the molecular mechanisms, and could be used as diagnostic indicators and therapeutic biomarkers for gastric adenocarcinoma.


Subjects
Adenocarcinoma/genetics , Biomarkers, Tumor/genetics , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Stomach Neoplasms/genetics , Adenocarcinoma/mortality , Adenocarcinoma/pathology , Computational Biology , Datasets as Topic , Gastric Mucosa/pathology , Gene Expression Profiling , Humans , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Prognosis , Protein Interaction Mapping , Protein Interaction Maps/genetics , Signal Transduction/genetics , Stomach Neoplasms/mortality , Stomach Neoplasms/pathology , Survival Analysis , Time Factors
12.
J Comput Biol ; 27(9): 1384-1396, 2020 09.
Article in English | MEDLINE | ID: mdl-32031874

ABSTRACT

One of the main methods for analyzing gene expression data is biclustering, an unsupervised technique that consists of selecting subgroups of genes that are co-expressed under subgroups of experimental conditions. A large number of biclustering algorithms have been developed to classify gene expression data. These algorithms can output a large number of overlapping biclusters, whose visualization still requires deeper study. We present VisBicluster, a web-based interactive visualization tool for displaying biclustering results. The visualization technique lays out the generated biclusters in a two-dimensional matrix in which each bicluster is represented as a column and each overlap between a set of biclusters is represented as a row. A search interface allows the user to query the matrix of bicluster intersections and visualize the results matching the queries. Our tool supports many interactive features such as sorting, zooming, and details-on-demand. We demonstrated the usefulness of VisBicluster with biclustering results from real and synthetic datasets. In addition, we performed a user study with 14 participants to illustrate the clarity and simplicity of the overlap representation in our tool.


Subjects
Computational Biology , Gene Expression Profiling/statistics & numerical data , Gene Expression/genetics , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Algorithms , Cluster Analysis , Computer Graphics , Humans , User-Computer Interface
13.
Cancer Med ; 9(4): 1419-1429, 2020 02.
Article in English | MEDLINE | ID: mdl-31893575

ABSTRACT

Early identification of metastatic or recurrent colorectal cancer (CRC) patients who will be sensitive to FOLFOX (5-FU, leucovorin and oxaliplatin) therapy is very important. We performed microarray meta-analysis to identify differentially expressed genes (DEGs) between FOLFOX responders and nonresponders in metastatic or recurrent CRC patients, and found that the expression levels of WASHC4, HELZ, ERN1, RPS6KB1, and APPBP2 were downregulated, while the expression levels of IRF7, EML3, LYPLA2, DRAP1, RNH1, PKP3, TSPAN17, LSS, MLKL, PPP1R7, GCDH, C19ORF24, and CCDC124 were upregulated in FOLFOX responders compared with nonresponders. Subsequent functional annotation showed that DEGs were significantly enriched in autophagy, ErbB signaling pathway, mitophagy, endocytosis, FoxO signaling pathway, apoptosis, and antifolate resistance pathways. Based on those candidate genes, several machine learning algorithms were applied to the training set, then performances of models were assessed via the cross validation method. Candidate models with the best tuning parameters were applied to the test set and the final model showed satisfactory performance. In addition, we also reported that MLKL and CCDC124 gene expression were independent prognostic factors for metastatic CRC patients undergoing FOLFOX therapy.


Subjects
Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Biomarkers, Tumor/genetics , Colorectal Neoplasms/drug therapy , Machine Learning , Neoplasm Recurrence, Local/drug therapy , Cell Cycle Proteins/genetics , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Datasets as Topic , Fluorouracil/therapeutic use , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation, Neoplastic , Humans , Intracellular Signaling Peptides and Proteins/genetics , Leucovorin/therapeutic use , Neoplasm Recurrence, Local/genetics , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Organoplatinum Compounds/therapeutic use , Prognosis , Protein Kinases/genetics , Response Evaluation Criteria in Solid Tumors
14.
Future Oncol ; 16(3): 4461-4473, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31854204

ABSTRACT

Currently, the prognostic effects of leukemia inhibitory factor (LIF) and LIF receptor (LIFR) in pancreatic adenocarcinoma (PAAD) are not clear. In the present study, we utilized the large datasets from four public databases to investigate the expression of LIF and LIFR and their clinical significance in PAAD. Eight cohorts containing 1278 cases with PAAD were identified and the analysis results suggested that LIF was highly expressed while LIFR was lowly expressed in PAAD tissues compared with adjacent or normal tissues. Kaplan-Meier plot curves and univariate and multivariate Cox proportional hazards regression analyses indicated high LIF expression was associated with shorter overall survival (adjusted hazard ratio = 1.641, 95% CI: 1.399-1.925, p < 0.001) whereas high LIFR expression was associated with longer overall survival (adjusted hazard ratio = 0.653, 95% CI: 0.517-0.826, p < 0.001).


Subjects
Adenocarcinoma/genetics , Biomarkers, Tumor/genetics , Leukemia Inhibitory Factor Receptor alpha Subunit/genetics , Leukemia Inhibitory Factor/genetics , Pancreatic Neoplasms/genetics , Adenocarcinoma/mortality , Adenocarcinoma/pathology , Aged , Cohort Studies , Datasets as Topic , Down-Regulation , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Kaplan-Meier Estimate , Male , Middle Aged , Neoplasm Staging , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Pancreas/pathology , Pancreatic Neoplasms/mortality , Pancreatic Neoplasms/pathology , Prognosis , Up-Regulation , Pancreatic Neoplasms
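
The survival comparison in the abstract above relies on Kaplan-Meier curves. As a minimal sketch of how such a curve is estimated, the Python function below applies the product-limit formula to follow-up times with an event indicator (1 for death, 0 for censoring). The function name and the toy follow-up data are hypothetical and unrelated to the cohorts in the study; subjects censored at an event time are treated as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate of the survival function.

    times  : follow-up time for each subject
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []

    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up in months; the 0 entries are censored subjects.
    follow_up = [1, 2, 2, 3, 4]
    observed = [1, 1, 0, 1, 0]
    for t, s in kaplan_meier(follow_up, observed):
        print(t, round(s, 3))
```

Comparing high- and low-expression groups then amounts to computing one such curve per group; the hazard ratios quoted in the abstract come from Cox regression, which this sketch does not cover.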
15.
Cancer Med ; 9(3): 1242-1253, 2020 02.
Article in English | MEDLINE | ID: mdl-31856408

ABSTRACT

Most high-grade serous ovarian cancer (HGSOC) patients develop resistance to platinum-based chemotherapy and recur. Many biomarkers related to the survival and prognosis of drug-resistant patients have been identified by mining databases; however, the predictive performance of single-gene biomarkers is not specific or sensitive enough. The present study aimed to develop a novel prognostic gene signature of platinum resistance for patients with HGSOC. The gene expression profiles were obtained from the Gene Expression Omnibus and The Cancer Genome Atlas databases. A total of 269 differentially expressed genes (DEGs) associated with platinum resistance were identified (P < .05, fold change >1.5). Functional analysis revealed that these DEGs were mainly involved in the apoptotic process and the PI3K-Akt pathway. Furthermore, we established a seven-gene signature that was significantly associated with overall survival (OS) in the test series. Compared with the low-risk score group, patients with a high-risk score had poorer OS (P < .001). The area under the curve (AUC) was 0.710, indicating that the risk score had a certain accuracy in predicting OS in HGSOC (AUC > 0.7). Surprisingly, the risk score was identified as an independent prognostic indicator for HGSOC (P < .001). Subgroup analyses suggested that the risk score had a greater prognostic value for patients with grade 3-4, stage III-IV, venous invasion and objective response. In conclusion, we developed a seven-gene signature relating to platinum resistance, which can predict survival in HGSOC and provide novel insights into platinum resistance mechanisms and the identification of HGSOC patients with poor prognosis.


Subjects
Antineoplastic Combined Chemotherapy Protocols/pharmacology , Biomarkers, Tumor/genetics , Cystadenocarcinoma, Serous/drug therapy , Drug Resistance, Neoplasm/genetics , Organoplatinum Compounds/pharmacology , Ovarian Neoplasms/drug therapy , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Computational Biology , Cystadenocarcinoma, Serous/genetics , Cystadenocarcinoma, Serous/mortality , Cystadenocarcinoma, Serous/pathology , Datasets as Topic , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Models, Genetic , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Organoplatinum Compounds/therapeutic use , Ovarian Neoplasms/genetics , Ovarian Neoplasms/mortality , Ovarian Neoplasms/pathology , Phosphatidylinositol 3-Kinases/metabolism , Prognosis , Progression-Free Survival , RNA, Messenger , ROC Curve , Transcriptome/genetics
16.
Clin Transl Sci ; 13(1): 169-178, 2020 01.
Article in English | MEDLINE | ID: mdl-31794148

ABSTRACT

As an extremely prevalent disease worldwide, allergic rhinitis (AR) is a condition characterized by chronic inflammation of the nasal mucosa. To identify the finer molecular mechanisms associated with the AR susceptibility genes, differentially expressed genes (DEGs) in AR were investigated. The DEG expression and clinical data of the GSE19187 data set were used for weighted gene co-expression network analysis (WGCNA). After the modules related to AR had been screened, the genes in the module were extracted for Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, whereby the genes enriched in the KEGG pathway were regarded as the pathway-genes. The DEGs in patients with AR were subsequently screened out from GSE19187, and the sensitive genes were identified in GSE18574 in connection with the allergen challenge. Two kinds of genes were compared with the pathway-genes in order to screen the AR susceptibility genes. Receiver operating characteristic (ROC) curve was plotted to evaluate the capability of the susceptibility genes to distinguish the AR state. Based on the WGCNA in the GSE19187 data set, 10 co-expression network modules were identified. The correlation analyses revealed that the yellow module was positively correlated with the disease state of AR. A total of 89 genes were found to be involved in the enrichment of the yellow module pathway. Four genes (CST1, SH2D1B, DPP4, and SLC5A5) were upregulated in AR and sensitive to allergen challenge, whose potentials were further confirmed by ROC curve. Taken together, CST1, SH2D1B, DPP4, and SLC5A5 are susceptibility genes to AR.


Subjects
Gene Regulatory Networks/immunology ; Genetic Predisposition to Disease ; Rhinitis, Allergic/genetics ; Biomarkers/analysis ; Computational Biology/methods ; Datasets as Topic ; Dipeptidyl Peptidase 4/analysis ; Dipeptidyl Peptidase 4/genetics ; Gene Expression Profiling/statistics & numerical data ; Gene Expression Regulation/immunology ; Humans ; Nasal Mucosa/immunology ; Nasal Mucosa/pathology ; Oligonucleotide Array Sequence Analysis/statistics & numerical data ; Predictive Value of Tests ; ROC Curve ; Rhinitis, Allergic/epidemiology ; Rhinitis, Allergic/immunology ; Rhinitis, Allergic/pathology ; Risk Assessment/methods ; Salivary Cystatins/analysis ; Salivary Cystatins/genetics ; Symporters/analysis ; Symporters/genetics ; Transcription Factors/analysis ; Transcription Factors/genetics
17.
Cancer Med ; 9(1): 335-349, 2020 01.
Article in English | MEDLINE | ID: mdl-31743579

ABSTRACT

Gastric cancer (GC) remains an important malignancy worldwide with poor prognosis. Long noncoding RNAs (lncRNAs) can markedly affect cancer progression. Moreover, lncRNAs have been proposed as diagnostic or prognostic biomarkers of GC. Therefore, the current study aimed to explore lncRNA-based prognostic biomarkers for GC. LncRNA expression profiles were first downloaded from the Gene Expression Omnibus (GEO) database. After re-annotation of lncRNAs, a univariate Cox analysis identified 177 prognostic lncRNA probes in the training set GSE62254 (n = 225). Multivariate Cox analysis of each lncRNA with clinical characteristics as covariates identified a total of 46 prognostic lncRNA probes. Robust likelihood-based survival and least absolute shrinkage and selection operator (LASSO) models were used to establish a 6-lncRNA signature with prognostic value. Receiver operating characteristic (ROC) curve analyses were employed to compare survival prediction in terms of specificity and sensitivity. Patients with high-risk scores exhibited a significantly worse overall survival (OS) than patients with low-risk scores (log-rank test P-value <.0001), and the area under the ROC curve (AUC) for 5-year survival was 0.77. A nomogram and forest plot were constructed to compare the clinical characteristics and risk scores by a multivariable Cox regression analysis, which suggested that the 6-lncRNA signature is an independent predictor of patient prognosis. Single-sample GSEA (ssGSEA) was used to determine the relationships between the 6-lncRNA signature and biological functions. The internal validation set GSE62254 (n = 75) and the external validation set GSE57303 (n = 70) were successfully used to validate the robustness of the 6-lncRNA signature. In conclusion, the 6-lncRNA signature can effectively predict the prognosis of GC patients.
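As a rough illustration of how a multi-gene signature is usually turned into high-risk and low-risk groups, the sketch below computes a linear risk score (sum of coefficient times expression) and splits patients at the median. The abstract does not state the coefficients or the cutoff rule, so the lncRNA names, coefficient values, patient data, and the median cutoff are all assumptions made for this example.

```python
from statistics import median


def risk_scores(patients, coefficients):
    """Linear risk score per patient: sum of coefficient * expression."""
    return [sum(coefficients[g] * p[g] for g in coefficients) for p in patients]


def split_by_median(scores):
    """Label each score 'high' if it exceeds the cohort median, else 'low'."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical two-gene signature and expression values (not study data).
coefficients = {"lnc_A": 0.5, "lnc_B": -0.25}
patients = [
    {"lnc_A": 2.0, "lnc_B": 4.0},
    {"lnc_A": 4.0, "lnc_B": 0.0},
    {"lnc_A": 1.0, "lnc_B": 2.0},
    {"lnc_A": 6.0, "lnc_B": 4.0},
]
scores = risk_scores(patients, coefficients)
groups = split_by_median(scores)
print(list(zip(scores, groups)))
```

The two groups would then be compared with a survival test such as the log-rank test, which is outside the scope of this sketch.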


Subjects
Biomarkers, Tumor/metabolism ; Nomograms ; RNA, Long Noncoding/metabolism ; Stomach Neoplasms/mortality ; Datasets as Topic ; Disease-Free Survival ; Female ; Gene Expression Profiling/statistics & numerical data ; Gene Expression Regulation, Neoplastic ; Humans ; Kaplan-Meier Estimate ; Likelihood Functions ; Male ; Middle Aged ; Neoplasm Staging ; Oligonucleotide Array Sequence Analysis/statistics & numerical data ; ROC Curve ; Stomach Neoplasms/genetics ; Stomach Neoplasms/pathology
18.
J Bioinform Comput Biol ; 17(5): 1940010, 2019 10.
Article in English | MEDLINE | ID: mdl-31856670

ABSTRACT

Gene set analysis is a quantitative approach for generating biological insight from gene expression datasets. The abundance of gene set analysis methods speaks to their popularity, but raises the question of the extent to which results are affected by the choice of method. Our systematic analysis of 13 popular methods using 6 different datasets, from both DNA microarray and RNA-Seq origin, shows that this choice matters a great deal. We observed that the overall number of gene sets reported by each method differed by up to 2 orders of magnitude, and there was a bias toward reporting large gene sets with some methods. Furthermore, there was substantial disagreement between the 20 most statistically significant gene sets reported by the methods. This was also observed when expanding to the 100 most statistically significant reported gene sets. For different datasets of the same phenotype/condition, the top 20 and top 100 most significant results also showed little to no agreement even when using the same method. GAGE, PAGE, and ORA were the only methods able to achieve relatively high reproducibility when comparing the 20 and 100 most statistically significant gene sets. Biological validation on a juvenile idiopathic arthritis (JIA) dataset showed wide variation in terms of the relevance of the top 20 and top 100 most significant gene sets to known biology of the disease, where GAGE predicted the most relevant gene sets, followed by GSEA, ORA, and PAGE.
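One simple way to quantify the kind of disagreement described here is to compare the top-k gene sets reported by each pair of methods. The sketch below uses the Jaccard index for that comparison. The abstract does not say which agreement measure the authors used, so the Jaccard index, the method names, and the gene set identifiers are assumptions for illustration only.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_top_k_agreement(rankings, k):
    """Jaccard index of the top-k gene sets for every pair of methods.

    rankings: dict mapping method name -> gene sets ordered by significance
    """
    return {
        (m1, m2): jaccard(rankings[m1][:k], rankings[m2][:k])
        for m1, m2 in combinations(sorted(rankings), 2)
    }


# Hypothetical ranked outputs of three methods (not study data).
rankings = {
    "method_A": ["s1", "s2", "s3", "s4"],
    "method_B": ["s2", "s1", "s5", "s6"],
    "method_C": ["s7", "s8", "s9", "s1"],
}
agreement = pairwise_top_k_agreement(rankings, k=3)
for pair, value in agreement.items():
    print(pair, value)
```

The same function applies to the top 20 or top 100 lists discussed in the abstract by changing `k`.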


Subjects
Databases, Genetic ; Gene Expression Profiling/statistics & numerical data ; Arthritis, Juvenile/genetics ; Gene Expression Profiling/methods ; Gene Expression Profiling/standards ; Humans ; Oligonucleotide Array Sequence Analysis/statistics & numerical data ; Phenotype ; Psoriasis/genetics ; Reproducibility of Results
19.
PLoS One ; 14(11): e0224446, 2019.
Article in English | MEDLINE | ID: mdl-31730620

ABSTRACT

Cancer is one of the leading causes of death worldwide. Many believe that genomic data will enable us to better predict the survival time of these patients, which will lead to better, more personalized treatment options and patient care. As standard survival prediction models have a hard time coping with the high dimensionality of such gene expression data, many projects use dimensionality reduction techniques to overcome this hurdle. We introduce a novel methodology, inspired by topic modeling from the natural language domain, to derive expressive features from the high-dimensional gene expression data. There, a document is represented as a mixture over a relatively small number of topics, where each topic corresponds to a distribution over the words; here, to accommodate the heterogeneity of a patient's cancer, we represent each patient (≈ document) as a mixture over cancer-topics, where each cancer-topic is a mixture over gene expression values (≈ words). This required some extensions to the standard LDA model (e.g., to accommodate the real-valued expression values), leading to our novel discretized Latent Dirichlet Allocation (dLDA) procedure. After using this dLDA to learn these cancer-topics, we can then express each patient as a distribution over a small number of cancer-topics, then use this low-dimensional "distribution vector" as input to a learning algorithm; here, we ran the recent survival prediction algorithm, MTLR, on this representation of the cancer dataset. We initially focus on the METABRIC dataset, which describes each of n = 1,981 breast cancer patients using r = 49,576 gene expression values from microarrays. Our results show that our approach (dLDA followed by MTLR) provides survival estimates that are more accurate than standard models, in terms of the standard Concordance measure. We then validate this "dLDA+MTLR" approach by running it on the n = 883 Pan-kidney (KIPAN) dataset, over r = 15,529 gene expression values (here using the mRNAseq modality), and find that it again achieves excellent results. In both cases, we also show that the resulting model is calibrated, using the recent "D-calibrated" measure. These successes, in two different cancer types and expression modalities, demonstrate the generality and the effectiveness of this approach. The dLDA+MTLR source code is available at https://github.com/nitsanluke/GE-LDA-Survival.
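The Concordance measure used to score the survival predictions can be sketched as follows. This is a minimal version of Harrell's concordance index for right-censored data and assumes each patient is summarized by a single risk score, which is a simplification: MTLR itself outputs a survival distribution per patient. The function name and the toy data are hypothetical, and this is not the code from the linked repository.

```python
def concordance_index(times, events, risk):
    """Harrell's concordance index for right-censored survival data.

    times:  observed follow-up time per patient
    events: 1 if death was observed, 0 if the patient was censored
    risk:   predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data for four patients (not study data).
times = [5, 8, 12, 20]
events = [1, 0, 1, 1]
risk = [0.9, 0.4, 0.1, 0.6]
c_index = concordance_index(times, events, risk)
print(f"C-index = {c_index:.2f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ranking of patients by survival time.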


Subjects
Gene Expression Regulation, Neoplastic ; Models, Biological ; Natural Language Processing ; Neoplasms/mortality ; Datasets as Topic ; Gene Expression Profiling/statistics & numerical data ; Humans ; Kaplan-Meier Estimate ; Neoplasms/genetics ; Oligonucleotide Array Sequence Analysis/statistics & numerical data ; Prognosis
20.
PLoS One ; 14(11): e0224750, 2019.
Article in English | MEDLINE | ID: mdl-31730674

ABSTRACT

Chronic obstructive pulmonary disease (COPD) was classified by the Centers for Disease Control and Prevention in 2014 as the 3rd leading cause of death in the United States (US). The main cause of COPD is exposure to tobacco smoke and air pollutants. Problems associated with COPD include under-diagnosis of the disease and an increase in the number of smokers worldwide. The goal of our study is to identify disease variability in the gene expression profiles of COPD subjects compared to controls, by reanalyzing pre-existing, publicly available microarray expression datasets. Our inclusion criteria selected microarray datasets that reported the smoking status, age and sex of blood donors. The datasets used Affymetrix and Agilent microarray platforms (7 datasets, 1,262 samples). We re-analyzed the curated raw microarray expression data using R packages, and used Box-Cox power transformations to normalize datasets. To identify significant differentially expressed genes we used generalized least squares models with disease state, age, sex, smoking status and study as effects that also included binary interactions, followed by likelihood ratio tests (LRT). We found 3,315 statistically significant (Storey-adjusted q-value <0.05) differentially expressed genes with respect to disease state (COPD or control). We further filtered these genes for biological effect (using results from LRT q-value <0.05 and model estimates' 10% two-tailed quantiles of mean differences between COPD and control), to identify 679 genes. Through analysis of disease, sex, age, and also smoking status and disease interactions we identified differentially expressed genes involved in a variety of immune responses and cell processes in COPD. We also trained a logistic regression model using the common array genes as features, which enabled prediction of disease status with 81.7% accuracy. Our results show potential for improving the diagnosis of COPD through blood and highlight novel gene expression disease signatures.
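The Box-Cox power transformation mentioned in this abstract has a simple closed form: (x^λ - 1) / λ for λ ≠ 0, and ln(x) for λ = 0. The sketch below applies it for a given λ. The abstract does not say how λ was chosen (the authors worked in R), so here λ is simply passed in, and the intensity values are hypothetical.

```python
import math


def box_cox(values, lam):
    """Box-Cox power transform of strictly positive values.

    Returns (x**lam - 1) / lam for lam != 0, and log(x) for lam == 0.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


# Hypothetical raw intensities (not study data); lam = 0.5 is arbitrary.
raw = [1.0, 4.0, 9.0]
transformed = box_cox(raw, 0.5)
print(transformed)
```

In practice λ is usually estimated from the data, for example by maximum likelihood, rather than fixed in advance.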


Subjects
Data Mining ; Pulmonary Disease, Chronic Obstructive/epidemiology ; Transcriptome/genetics ; Age Factors ; Air Pollutants/adverse effects ; Biomarkers/metabolism ; Datasets as Topic ; Down-Regulation ; Female ; Gene Expression Profiling/statistics & numerical data ; Humans ; Logistic Models ; Machine Learning ; Male ; Models, Genetic ; Oligonucleotide Array Sequence Analysis/statistics & numerical data ; Pulmonary Disease, Chronic Obstructive/diagnosis ; Pulmonary Disease, Chronic Obstructive/etiology ; Pulmonary Disease, Chronic Obstructive/genetics ; Risk Assessment/methods ; Risk Factors ; Sex Factors ; Smoking/adverse effects ; Smoking/epidemiology ; United States/epidemiology ; Up-Regulation