1.
Nucleic Acids Res ; 52(9): 4761-4783, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38619038

ABSTRACT

Single-cell RNA sequencing (scRNA-Seq) is a recent technology that allows for the measurement of the expression of all genes in each individual cell contained in a sample. Information at the single-cell level has been shown to be extremely useful in many areas. However, performing single-cell experiments is expensive. Although cellular deconvolution cannot provide the same comprehensive information as single-cell experiments, it can extract cell-type information from bulk RNA data, and therefore it allows researchers to conduct studies at cell-type resolution from existing bulk datasets. For these reasons, a great effort has been made to develop such methods for cellular deconvolution. The large number of methods available, the requirement of coding skills, inadequate documentation, and lack of performance assessment all make it extremely difficult for life scientists to choose a suitable method for their experiment. This paper aims to fill this gap by providing a comprehensive review of 53 deconvolution methods regarding their methodology, applications, performance, and outstanding challenges. More importantly, the article presents a benchmarking of all these 53 methods using 283 cell types from 30 tissues of 63 individuals. We also provide an R package named DeconBenchmark that allows readers to execute and benchmark the reviewed methods (https://github.com/tinnlab/DeconBenchmark).


Subjects
Single-Cell Analysis , Software , Single-Cell Analysis/methods , Humans , Sequence Analysis, RNA/methods , Animals , RNA-Seq/methods , Benchmarking , Algorithms , Gene Expression Profiling/methods
2.
J Transl Med ; 21(1): 377, 2023 06 10.
Article in English | MEDLINE | ID: mdl-37301958

ABSTRACT

AIMS: Long-COVID occurs after SARS-CoV-2 infection and results in diverse, prolonged symptoms. The present study aimed to unveil potential mechanisms, and to inform prognosis and treatment. METHODS: Plasma proteome from Long-COVID outpatients was analyzed in comparison to matched acutely ill COVID-19 (mild and severe) inpatients and healthy control subjects. The expression of 3072 protein biomarkers was determined with proximity extension assays and then deconvoluted with multiple bioinformatics tools into both cell types and signaling mechanisms, as well as organ specificity. RESULTS: Compared to age- and sex-matched acutely ill COVID-19 inpatients and healthy control subjects, Long-COVID outpatients showed natural killer cell redistribution with a dominant resting phenotype, as opposed to active, and neutrophils that formed extracellular traps. This potential resetting of cell phenotypes was reflected in prospective vascular events mediated by both angiopoietin-1 (ANGPT1) and vascular-endothelial growth factor-A (VEGFA). Several markers (ANGPT1, VEGFA, CCR7, CD56, citrullinated histone 3, elastase) were validated by serological methods in additional patient cohorts. Signaling of transforming growth factor-β1 with probable connections to elevated EP/p300 suggested vascular inflammation and tumor necrosis factor-α driven pathways. In addition, a vascular proliferative state associated with hypoxia inducible factor 1 pathway suggested progression from acute COVID-19 to Long-COVID. The vasculo-proliferative process predicted in Long-COVID might contribute to changes in the organ-specific proteome reflective of neurologic and cardiometabolic dysfunction. CONCLUSIONS: Taken together, our findings point to a vasculo-proliferative process in Long-COVID that is likely initiated by prior hypoxia (localized or systemic) and/or stimulatory factors (e.g., cytokines, chemokines, growth factors, angiotensin). Analyses of the plasma proteome, used as a surrogate for cellular signaling, unveiled potential organ-specific prognostic biomarkers and therapeutic targets.


Subjects
COVID-19 , Humans , Proteome , SARS-CoV-2 , Post-Acute COVID-19 Syndrome , Prospective Studies , Brain , Biomarkers
3.
Nucleic Acids Res ; 49(W1): W114-W124, 2021 07 02.
Article in English | MEDLINE | ID: mdl-34037798

ABSTRACT

In molecular biology and genetics, there is a large gap between the ease of data collection and our ability to extract knowledge from these data. Contributing to this gap is the fact that living organisms are complex systems whose emerging phenotypes are the results of multiple complex interactions taking place on various pathways. This demands powerful yet user-friendly pathway analysis tools to translate the now abundant high-throughput data into a better understanding of the underlying biological phenomena. Here we introduce Consensus Pathway Analysis (CPA), a web-based platform that allows researchers to (i) perform pathway analysis using eight established methods (GSEA, GSA, FGSEA, PADOG, Impact Analysis, ORA/Webgestalt, KS-test, Wilcox-test), (ii) perform meta-analysis of multiple datasets, (iii) combine methods and datasets to accurately identify the impacted pathways underlying the studied condition and (iv) interactively explore impacted pathways, and browse relationships between pathways and genes. The platform supports three types of input: (i) a list of differentially expressed genes, (ii) genes and fold changes and (iii) an expression matrix. It also allows users to import data from NCBI GEO. The CPA platform currently supports the analysis of multiple organisms using KEGG and Gene Ontology, and it is freely available at http://cpa.tinnguyen-lab.com.


Subjects
Gene Expression , Software , Alzheimer Disease/genetics , Datasets as Topic , Gene Ontology , Humans , Internet
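
Entry 3 lists over-representation analysis (ORA) among the methods offered by CPA. As a rough illustration of the idea only, and not of CPA's own implementation, ORA is commonly framed as a one-sided hypergeometric test: given a gene universe, a pathway, and a list of differentially expressed (DE) genes, how likely is an overlap at least this large by chance? The function name and all numbers below are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_de, n_overlap):
    """One-sided hypergeometric p-value: P(overlap >= n_overlap).

    n_universe: number of genes measured in the experiment
    n_pathway:  number of those genes annotated to the pathway
    n_de:       number of differentially expressed genes
    n_overlap:  number of DE genes that fall in the pathway
    """
    max_overlap = min(n_pathway, n_de)
    total = comb(n_universe, n_de)  # all ways to draw the DE list
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_de - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 genes, a 100-gene pathway, 500 DE genes,
# 10 of which belong to the pathway.
print(ora_p_value(20000, 100, 500, 10))
```

In practice the resulting p-values would still need multiple-testing correction across pathways, which this sketch leaves out.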
4.
Bioinformatics ; 37(17): 2691-2698, 2021 Sep 09.
Article in English | MEDLINE | ID: mdl-33693506

ABSTRACT

MOTIVATION: COVID-19 has several distinct clinical phases: a viral replication phase, an inflammatory phase and, in some patients, a hyper-inflammatory phase. High mortality is associated with patients developing cytokine storm syndrome. Treatment of hyper-inflammation in these patients using existing approved therapies with proven safety profiles could address the immediate need to reduce mortality. RESULTS: We analyzed the changes in the gene expression, pathways and putative mechanisms induced by SARS-CoV-2 in NHBE and A549 cells, as well as COVID-19 lung versus their respective controls. We used these changes to identify FDA-approved drugs that could be repurposed to help COVID-19 patients with severe symptoms related to hyper-inflammation. We identified methylprednisolone (MP) as a potential leading therapy. The results were then confirmed in five independent validation datasets including Vero E6 cells, lung and intestinal organoids, as well as additional patient lung samples versus their respective controls. Finally, the efficacy of MP was validated in an independent clinical study. Thirty-day all-cause mortality occurred at a significantly lower rate in the MP-treated group compared to the control group (16.6% versus 29.6%, P = 0.027). Clinical results confirmed the in silico prediction that MP could improve outcomes in severe cases of COVID-19. A low number needed to treat (NNT = 5) suggests MP may be more efficacious than dexamethasone or hydrocortisone. AVAILABILITY AND IMPLEMENTATION: iPathwayGuide is available at https://advaitabio.com/ipathwayguide/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

5.
Bioinformatics ; 36(2): 487-495, 2020 01 15.
Article in English | MEDLINE | ID: mdl-31329248

ABSTRACT

MOTIVATION: Recent advances in biomedical research have made massive amounts of transcriptomic data available in public repositories from different sources. Due to the heterogeneity present in the individual experiments, identifying reproducible biomarkers for a given disease from multiple independent studies has become a major challenge. The widely used meta-analysis approaches, such as Fisher's method, Stouffer's method, minP and maxP, have at least two major limitations: (i) they are sensitive to outliers, and (ii) they perform only one statistical test for each individual study, and hence do not fully utilize the potential sample size to gain statistical power. RESULTS: Here, we propose a gene-level meta-analysis framework that overcomes these limitations and identifies a gene signature that is reliable and reproducible across multiple independent studies of a given disease. The approach provides a comprehensive global signature that can be used to understand the underlying biological phenomena, and a smaller test signature that can be used to classify future samples of a given disease. We demonstrate the utility of the framework by constructing disease signatures for influenza and Alzheimer's disease using nine datasets including 1108 individuals. These signatures are then validated on 12 independent datasets including 912 individuals. The results indicate that the proposed approach performs better than the majority of the existing meta-analysis approaches in terms of both sensitivity and specificity. The proposed signatures could be further used in diagnosis, prognosis and identification of therapeutic targets. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Gene Expression Profiling , Transcriptome , Biomarkers , Humans , Sample Size , Sensitivity and Specificity
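
Entry 5 names four classical ways of combining per-study p-values: Fisher's method, Stouffer's method, minP and maxP. The sketch below shows their textbook forms, assuming independent one-sided p-values strictly between 0 and 1. It illustrates the baselines the abstract criticizes, not the authors' proposed framework, and the function names are hypothetical.

```python
from math import exp, factorial, log, sqrt
from statistics import NormalDist


def fisher(pvals):
    """Fisher: -2 * sum(ln p) follows chi-square with 2k degrees of freedom."""
    k = len(pvals)
    half = -sum(log(p) for p in pvals)  # half of the chi-square statistic
    # Closed-form chi-square survival function for even degrees of freedom.
    return exp(-half) * sum(half**i / factorial(i) for i in range(k))


def stouffer(pvals):
    """Stouffer: sum of z-scores divided by sqrt(k) is standard normal."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1 - p) for p in pvals) / sqrt(len(pvals))
    return 1 - nd.cdf(z)


def min_p(pvals):
    """minP: the smallest of k uniform p-values follows Beta(1, k)."""
    return 1 - (1 - min(pvals)) ** len(pvals)


def max_p(pvals):
    """maxP: the largest of k uniform p-values follows Beta(k, 1)."""
    return max(pvals) ** len(pvals)


# One extreme study among otherwise unremarkable ones (made-up values).
studies = [1e-10, 0.6, 0.7, 0.8]
for combine in (fisher, stouffer, min_p, max_p):
    print(combine.__name__, combine(studies))
```

Running the loop on a list with a single extreme value is a quick way to see the outlier sensitivity that the abstract describes for some of these methods.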
6.
Platelets ; 32(1): 130-137, 2021 Jan 02.
Article in English | MEDLINE | ID: mdl-32892687

ABSTRACT

Coronavirus disease 2019 (COVID-19) is a highly transmittable viral infection caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). SARS-CoV-2 utilizes the metallocarboxyl peptidase angiotensin-converting enzyme 2 (ACE2) to gain entry into human cells. Activation of several proteases facilitates the interaction of viral spike proteins (S1) and the ACE2 receptor. This leads to cleavage of host ACE2 receptors. ACE2 activity counterbalances the angiotensin II effect; its loss may lead to elevated angiotensin II levels with modulation of platelet function, size and activity. COVID-19 disease encompasses a spectrum of systemic involvement far beyond respiratory failure alone. Several features of this disease, including the etiology of acute kidney injury (AKI) and the hypercoagulable state, remain poorly understood. Here, we show that there is a high incidence of AKI (81%) in critically ill adults with COVID-19 in the setting of elevated D-dimer, ferritin, C-reactive protein (CRP) and lactate dehydrogenase (LDH) levels. Strikingly, there were unique features of platelets in these patients, including larger, more granular platelets and a higher mean platelet volume (MPV). There was a significant correlation between measured D-dimer levels and MPV, but a negative correlation between MPV and glomerular filtration rate (GFR) in the critically ill cohort. Our data suggest that activated platelets may play a role in renal failure and possibly hypercoagulability status in COVID-19 patients.


Subjects
Acute Kidney Injury/etiology , Angiotensin II/metabolism , Angiotensin-Converting Enzyme 2/metabolism , Blood Platelets/pathology , COVID-19/complications , Pandemics , Receptors, Virus/metabolism , SARS-CoV-2 , Thrombocytopenia/etiology , Thrombophilia/etiology , Acute Kidney Injury/blood , Acute Kidney Injury/physiopathology , Adult , Aged , Aged, 80 and over , COVID-19/blood , COVID-19/epidemiology , Comorbidity , Diabetes Mellitus/epidemiology , Female , Fibrin Fibrinogen Degradation Products/analysis , Glomerular Filtration Rate , Humans , Hypertension/epidemiology , Male , Mean Platelet Volume , Middle Aged , Renin-Angiotensin System/physiology , Thrombophilia/blood , Young Adult
7.
Genome Res ; 27(12): 2025-2039, 2017 12.
Article in English | MEDLINE | ID: mdl-29066617

ABSTRACT

Advances in high-throughput technologies allow for measurements of many types of omics data, yet the meaningful integration of several different data types remains a significant challenge. Another important and difficult problem is the discovery of molecular disease subtypes characterized by relevant clinical differences, such as survival. Here we present a novel approach, called perturbation clustering for data integration and disease subtyping (PINS), which is able to address both challenges. The framework has been validated on thousands of cancer samples, using gene expression, DNA methylation, noncoding microRNA, and copy number variation data available from the Gene Expression Omnibus, the Broad Institute, The Cancer Genome Atlas (TCGA), and the European Genome-Phenome Archive. This simultaneous subtyping approach accurately identifies known cancer subtypes and novel subgroups of patients with significantly different survival profiles. The results were obtained from genome-scale molecular data without any other type of prior knowledge. The approach is sufficiently general to replace existing unsupervised clustering approaches outside the scope of bio-medical research, with the additional ability to integrate multiple types of data.


Subjects
Data Interpretation, Statistical , Disease/classification , Algorithms , Cluster Analysis , DNA Methylation , Female , Gene Expression , Genetic Diseases, Inborn/classification , Humans , Male , MicroRNAs , RNA, Messenger
8.
Brief Bioinform ; 19(5): 737-753, 2018 09 28.
Article in English | MEDLINE | ID: mdl-28334228

ABSTRACT

DNA methylation is an important epigenetic mechanism that plays a crucial role in cellular regulatory systems. Recent advancements in sequencing technologies now enable us to generate high-throughput methylation data and to measure methylation up to single-base resolution. This wealth of data does not come without challenges, and one of the key challenges in DNA methylation studies is to identify the significant differences in the methylation levels of the base pairs across distinct biological conditions. Several computational methods have been developed to identify differential methylation using bisulfite sequencing data; however, there is no clear consensus among existing approaches. A comprehensive survey of these approaches would be of great benefit to potential users and researchers to get a complete picture of the available resources. In this article, we present a detailed survey of 22 such approaches focusing on their underlying statistical models, primary features, key advantages and major limitations. Importantly, the intrinsic drawbacks of the approaches pointed out in this survey could potentially be addressed by future research.


Subjects
DNA Methylation , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Computational Biology/methods , CpG Islands , Epigenesis, Genetic , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Logistic Models , Markov Chains , Sequence Analysis, DNA/statistics & numerical data , Sulfites
9.
Bioinformatics ; 35(19): 3672-3678, 2019 10 01.
Article in English | MEDLINE | ID: mdl-30840053

ABSTRACT

MOTIVATION: Drug repurposing is a potential alternative to the classical drug discovery pipeline. Repurposing involves finding novel indications for already approved drugs. In this work, we present a novel machine learning-based method for drug repurposing. This method explores the anti-similarity between drugs and a disease to uncover new uses for the drugs. More specifically, our proposed method takes into account three sources of information: (i) large-scale gene expression profiles corresponding to human cell lines treated with small molecules, (ii) gene expression profile of a human disease and (iii) the known relationship between Food and Drug Administration (FDA)-approved drugs and diseases. Using these data, our proposed method learns a similarity metric through a supervised machine learning-based algorithm such that a disease and its associated FDA-approved drugs have smaller distance than the other disease-drug pairs. RESULTS: We validated our framework by showing that the proposed method incorporating distance metric learning technique can retrieve FDA-approved drugs for their approved indications. Once validated, we used our approach to identify a few strong candidates for repurposing. AVAILABILITY AND IMPLEMENTATION: The R scripts are available on demand from the authors. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Drug Repositioning , Algorithms , Computational Biology , Drug Discovery , Humans , Machine Learning , Pharmaceutical Preparations
10.
Bioinformatics ; 35(16): 2843-2846, 2019 08 15.
Article in English | MEDLINE | ID: mdl-30590381

ABSTRACT

SUMMARY: Since cancer is a heterogeneous disease, tumor subtyping is crucial for improved treatment and prognosis. We have developed a subtype discovery tool, called PINSPlus, that is: (i) robust against noise and unstable quantitative assays, (ii) able to integrate multiple types of omics data in a single analysis and (iii) dramatically superior to established approaches in identifying known subtypes and novel subgroups with significant survival differences. Our validation on 12,158 samples from 44 datasets shows that PINSPlus vastly outperforms other approaches. The software is easy-to-use and can partition hundreds of patients in a few minutes on a personal computer. AVAILABILITY AND IMPLEMENTATION: The package is available at https://cran.r-project.org/package=PINSPlus. Data and R script used in this manuscript are available at https://bioinformatics.cse.unr.edu/software/PINSPlus/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genomics , Neoplasms , Humans , Software
11.
Int J Mol Sci ; 21(2), 2020 Jan 17.
Article in English | MEDLINE | ID: mdl-31963593

ABSTRACT

The human placenta maintains pregnancy and supports the developing fetus by providing nutrition, gas-waste exchange, hormonal regulation, and an immunological barrier from the maternal immune system. The villous syncytiotrophoblast carries out most of these functions and provides the interface between the maternal and fetal circulatory systems. The syncytiotrophoblast is generated by the biochemical and morphological differentiation of underlying cytotrophoblast progenitor cells. The dysfunction of the villous trophoblast development is implicated in placenta-mediated pregnancy complications. Herein, we describe gene modules and clusters involved in the dynamic differentiation of villous cytotrophoblasts into the syncytiotrophoblast. During this process, the immune defense functions are first established, followed by structural and metabolic changes, and then by peptide hormone synthesis. We describe key transcription regulatory molecules that regulate gene modules involved in placental functions. Based on transcriptomic evidence, we infer how villous trophoblast differentiation and functions are dysregulated in preterm preeclampsia, a life-threatening placenta-mediated obstetrical syndrome for the mother and fetus. In conclusion, we uncover the blueprint for villous trophoblast development and its impairment in preterm preeclampsia, which may aid in the future development of non-invasive biomarkers for placental functions and early identification of women at risk for preterm preeclampsia as well as other placenta-mediated pregnancy complications.


Subjects
Cell Differentiation , Gene Expression Regulation , Genetic Markers , Placenta/pathology , Pre-Eclampsia/genetics , Pre-Eclampsia/pathology , Transcriptome , Trophoblasts/pathology , Female , Humans , Placenta/metabolism , Pregnancy , Trophoblasts/metabolism
12.
Bioinformatics ; 34(16): 2817-2825, 2018 08 15.
Article in English | MEDLINE | ID: mdl-29534151

ABSTRACT

Motivation: Identification of novel therapeutic effects for existing US Food and Drug Administration (FDA)-approved drugs, drug repurposing, is an approach aimed to dramatically shorten the drug discovery process, which is costly, slow and risky. Several computational approaches use transcriptional data to find potential repurposing candidates. The main hypothesis of such approaches is that if gene expression signature of a particular drug is opposite to the gene expression signature of a disease, that drug may have a potential therapeutic effect on the disease. However, this may not be optimal since it fails to consider the different roles of genes and their dependencies at the system level. Results: We propose a systems biology approach to discover novel therapeutic roles for established drugs that addresses some of the issues in the current approaches. To do so, we use publicly available drug and disease data to build a drug-disease network by considering all interactions between drug targets and disease-related genes in the context of all known signaling pathways. This network is integrated with gene-expression measurements to identify drugs with new desired therapeutic effects based on a system-level analysis method. We compare the proposed approach with the drug repurposing approach proposed by Sirota et al. on four human diseases: idiopathic pulmonary fibrosis, non-small cell lung cancer, prostate cancer and breast cancer. We evaluate the proposed approach based on its ability to re-discover drugs that are already FDA-approved for a given disease. Availability and implementation: The R package DrugDiseaseNet is under review for publication in Bioconductor and is available at https://github.com/azampvd/DrugDiseaseNet. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Drug Repositioning , Neoplasms/drug therapy , Systems Biology , Drug Discovery/methods , Drug Repositioning/methods , Humans , Transcriptome
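
Entry 12 describes the baseline hypothesis behind many transcriptional repurposing approaches: a drug whose expression signature is opposite to a disease signature may have a therapeutic effect. One simple way to make that concrete, assumed here purely for illustration, is a Pearson correlation over the genes shared by both signatures, where strongly negative values flag candidates. This is neither the network-based method the abstract proposes nor the scoring used by Sirota et al.; the gene names and fold changes are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def reversal_score(drug_sig, disease_sig):
    """Correlate log fold changes over shared genes.

    Both arguments map gene symbol -> log fold change. A score near -1
    means the drug pushes expression in the opposite direction to the
    disease; a score near +1 means it mimics the disease.
    """
    shared = sorted(set(drug_sig) & set(disease_sig))
    if len(shared) < 2:
        raise ValueError("need at least two shared genes")
    return pearson(
        [drug_sig[g] for g in shared],
        [disease_sig[g] for g in shared],
    )


# Hypothetical signatures.
disease = {"GENE_A": 2.0, "GENE_B": -1.0, "GENE_C": 0.5, "GENE_D": 1.2}
drug = {"GENE_A": -1.8, "GENE_B": 0.9, "GENE_C": -0.4, "GENE_E": 3.0}
print(reversal_score(drug, disease))
```

As the abstract points out, a gene-by-gene score of this kind treats every gene alike and ignores pathway structure, which is the limitation the network-based approach is meant to address.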
13.
Bioinformatics ; 34(9): 1441-1447, 2018 05 01.
Article in English | MEDLINE | ID: mdl-29220513

ABSTRACT

Motivation: Epigenetic mechanisms are known to play a major role in breast cancer. However, the role of 5-hydroxymethylcytosine (5hmC) remains understudied. We hypothesize that 5hmC mediates redox regulation of gene expression in an aggressive subtype known as triple negative breast cancer (TNBC). To address this, our objective was to highlight genes that may be the target of this process by identifying redox-regulated, antioxidant-sensitive, gene-localized 5hmC changes associated with mRNA changes in TNBC cells. Results: We proceeded to develop an approach to integrate novel Pvu-sequencing and RNA-sequencing data. The result of our approach to merge genome-wide, high-throughput TNBC cell line datasets to identify significant, concordant 5hmC and mRNA changes in response to antioxidant treatment produced a gene set with relevance to cancer stem cell function. Moreover, we have established a method that will be useful for continued research of 5hmC in TNBC cells and tissue samples. Availability and implementation: Data are available at Gene Expression Omnibus (GEO) under accession number GSE103850. Contact: bollig@karmanos.org.


Subjects
5-Methylcytosine/analogs & derivatives , Epigenesis, Genetic , Gene Expression Regulation, Neoplastic , Triple Negative Breast Neoplasms/genetics , Cell Line, Tumor , Computational Biology , Female , Gene Expression Profiling , Humans , Sequence Analysis, DNA , Sequence Analysis, RNA
14.
Bioinformatics ; 33(13): 1987-1994, 2017 Jul 01.
Article in English | MEDLINE | ID: mdl-28200075

ABSTRACT

MOTIVATION: The ultimate goal of any experiment is to understand the biological phenomena underlying the condition investigated. This process often results in a gene network through which a certain biological mechanism is explained. Such networks have been proven to be extremely useful for the prediction of mechanisms of action of drugs or the responses of an organism to a specific impact (e.g. a disease, a treatment, etc.). Here, we introduce an approach able to build a network that captures the putative mechanisms at play in the given condition, by using datasets from multiple experiments studying the same phenotype. This method takes advantage of known interactions extracted from multiple sources such as protein-protein interactions and curated biological pathways. Based on such prior knowledge, we overcome the drawbacks of snap-shot data by considering the possible effects of each gene on its neighbors. RESULTS: We show the effectiveness of this approach in three different case studies and validate the results in two ways considering the identified genes and interactions between them. We compare our findings with the results of two widely-used methods in the same category as well as the classical approach of selecting differentially expressed (DE) genes in an investigated condition. The results show that 'neighbor-net' analysis is able to report biological mechanisms that are significantly relevant to the given diseases in all three case studies, and performs better compared to all reference methods using both validation approaches. AVAILABILITY AND IMPLEMENTATION: The proposed method is implemented in R and will be available as a Bioconductor package. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Gene Regulatory Networks , Software , Algorithms , Gene Expression Regulation , Humans , Metabolic Networks and Pathways , Phenotype , Protein Interaction Maps
15.
Nucleic Acids Res ; 44(11): 5034-44, 2016 06 20.
Article in English | MEDLINE | ID: mdl-27193997

ABSTRACT

The goal of pathway analysis is to identify the pathways that are significantly impacted when a biological system is perturbed, e.g. by a disease or drug. Current methods treat pathways as independent entities. However, many signals are constantly sent from one pathway to another, essentially linking all pathways into a global, system-wide complex. In this work, we propose a set of three pathway analysis methods based on the impact analysis that perform a system-level analysis by considering all signals between pathways, as well as their overlaps. Briefly, the global system is modeled in two ways: (i) considering the inter-pathway interaction exchange for each individual pathway, and (ii) combining all individual pathways to form a global, system-wide graph. The third analysis method is a hybrid of these two models. The new methods were compared with DAVID, GSEA, GSA, PathNet, Crosstalk and SPIA on 23 GEO data sets involving 19 tissues investigated in 12 conditions. The results show that both the ranking and the P-values of the target pathways are substantially improved when the analysis considers the system-wide dependencies and interactions between pathways.


Subjects
Computational Biology/methods , Gene Expression Profiling , Gene Regulatory Networks , Metabolic Networks and Pathways , Signal Transduction , Algorithms , Gene Expression Profiling/methods , Humans , Reproducibility of Results
16.
Bioinformatics ; 32(3): 409-16, 2016 Feb 01.
Article in English | MEDLINE | ID: mdl-26471455

ABSTRACT

MOTIVATION: The accumulation of high-throughput data in public repositories creates a pressing need for integrative analysis of multiple datasets from independent experiments. However, study heterogeneity, study bias, outliers and the lack of power of available methods present real challenges in integrating genomic data. One practical drawback of many P-value-based meta-analysis methods, including Fisher's, Stouffer's, minP and maxP, is that they are sensitive to outliers. Another drawback is that, because they perform just one statistical test for each individual experiment, they may not fully exploit the potentially large number of samples within each study. RESULTS: We propose a novel bi-level meta-analysis approach that employs the additive method and the Central Limit Theorem within each individual experiment and also across multiple experiments. We prove that the bi-level framework is robust against bias, less sensitive to outliers than other methods, and more sensitive to small changes in signal. For comparative analysis, we demonstrate that the intra-experiment analysis has more power than the equivalent statistical test performed on a single large experiment. For pathway analysis, we compare the proposed framework versus classical meta-analysis approaches (Fisher's, Stouffer's and the additive method) as well as against a dedicated pathway meta-analysis package (MetaPath), using 1252 samples from 21 datasets related to three human diseases, acute myeloid leukemia (9 datasets), type II diabetes (5 datasets) and Alzheimer's disease (7 datasets). Our framework outperforms its competitors in correctly identifying pathways relevant to the phenotypes. The framework is sufficiently general to be applied to any type of statistical meta-analysis. AVAILABILITY AND IMPLEMENTATION: The R scripts are available on demand from the authors. CONTACT: sorin@wayne.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Alzheimer Disease/genetics , Data Interpretation, Statistical , Diabetes Mellitus, Type 2/genetics , Gene Expression Profiling/methods , Leukemia, Myeloid, Acute/genetics , Meta-Analysis as Topic , Signal Transduction , Case-Control Studies , Computational Biology/methods , Gene Regulatory Networks , Genome, Human , Genomics/methods , Humans
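
Entry 16 combines p-values with the additive method and the Central Limit Theorem. A minimal sketch of that combination step follows, under the assumption that the input p-values are independent and uniform on [0, 1] under the null: their mean then has expectation 1/2 and variance 1/(12n), so for moderate n it can be compared against a normal distribution. The function name `add_clt` is hypothetical, and the bi-level scheme from the abstract (applying the step within each experiment and then across experiments) is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist, fmean


def add_clt(pvals):
    """Combine p-values via the normal approximation to their mean.

    Under the null each p-value is Uniform(0, 1), so the mean of n of
    them is approximately Normal(0.5, 1 / (12 * n)). A small mean gives
    a small combined p-value. The approximation is rough for very
    small n.
    """
    n = len(pvals)
    null_mean = NormalDist(mu=0.5, sigma=sqrt(1 / (12 * n)))
    return null_mean.cdf(fmean(pvals))


# Made-up inputs: consistent weak signal versus a single outlier.
consistent = [0.04, 0.08, 0.03, 0.10, 0.06]
one_outlier = [1e-10, 0.6, 0.7, 0.8]
print(add_clt(consistent))
print(add_clt(one_outlier))
```

Because the statistic is a mean rather than a product or an extreme value, one very small p-value among several large ones moves the result only slightly, which is consistent with the robustness to outliers claimed in the abstract.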
17.
Proc IEEE Inst Electr Electron Eng ; 105(3): 482-495, 2017 Mar.
Article in English | MEDLINE | ID: mdl-30337764

ABSTRACT

A crucial step in the understanding of any phenotype is the correct identification of the signaling pathways that are significantly impacted in that phenotype. However, most current pathway analysis methods produce both false positives and false negatives in certain circumstances. We hypothesized that such incorrect results are due to the fact that the existing methods fail to distinguish between the primary dis-regulation of a given gene itself and the effects of signaling coming from upstream. Furthermore, a modern whole-genome experiment performed with a next-generation technology spends a great deal of effort to measure the entire set of 30,000-100,000 transcripts in the genome. This is followed by the selection of a few hundred differentially expressed genes, a step that literally discards more than 99% of the collected data. We also hypothesized that such a drastic filtering could discard many genes that play crucial roles in the phenotype. We propose a novel topology-based pathway analysis method that identifies significantly impacted pathways using the entire set of measurements, thus allowing the full use of the data provided by NGS techniques. The results obtained on 24 real data sets involving 12 different human diseases, as well as on 8 yeast knock-out data sets, show that the proposed method yields significant improvements with respect to the state-of-the-art methods: SPIA, GSEA and GSA. AVAILABILITY: Primary dis-regulation analysis is implemented in R and included in the ROntoTools Bioconductor package (versions ≥ 2.0.0). https://www.bioconductor.org/packages/release/bioc/html/ROntoTools.html.

18.
Proc IEEE Inst Electr Electron Eng ; 105(3): 496-515, 2017 Mar.
Article in English | MEDLINE | ID: mdl-29706661

ABSTRACT

Identifying the pathways and mechanisms that are significantly impacted in a given phenotype is challenging. Issues include patient heterogeneity and noise. Many experiments do not have a large enough sample size to achieve the statistical power necessary to identify significantly impacted pathways. Meta-analysis based on combining p-values from individual experiments has been used to improve power. However, all classical meta-analysis approaches work under the assumption that the p-values produced by experiment-level statistical tests follow a uniform distribution under the null hypothesis. Here we show that this assumption does not hold for three mainstream pathway analysis methods, and significant bias is likely to affect many, if not all such meta-analysis studies. We introduce DANUBE, a novel and unbiased approach to combine statistics computed from individual studies. Our framework uses control samples to construct empirical null distributions, from which empirical p-values of individual studies are calculated and combined using either a Central Limit Theorem approach or the additive method. We assess the performance of DANUBE using four different pathway analysis methods. DANUBE is compared with five meta-analysis approaches, as well as with a pathway analysis approach that employs multiple datasets (MetaPath). The 25 approaches have been tested on 16 different datasets related to two human diseases, Alzheimer's disease (7 datasets) and acute myeloid leukemia (9 datasets). We demonstrate that DANUBE overcomes bias in order to consistently identify relevant pathways. We also show how the framework improves results in more general cases, compared to classical meta-analysis performed with common experiment-level statistical tests such as Wilcoxon and t-test.
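
Entry 18 replaces nominal p-values with empirical ones computed against a null distribution built from control samples. The following sketch shows the general idea only, assuming the null is available as a list of p-values produced by running the same pathway analysis on control-only comparisons. The function name, the add-one correction and the numbers are assumptions for illustration and are not taken from DANUBE.

```python
def empirical_p_value(nominal_p, null_pvals):
    """Rank a nominal p-value against an empirical null.

    Returns the fraction of null p-values that are at least as small as
    the observed one. The +1 in numerator and denominator is a common
    convention that keeps the result strictly above zero.
    """
    at_least_as_extreme = sum(1 for q in null_pvals if q <= nominal_p)
    return (at_least_as_extreme + 1) / (len(null_pvals) + 1)


# Hypothetical null for one pathway, skewed toward small values: the
# method often reports small p-values even when nothing differs.
biased_null = [0.01, 0.02, 0.03, 0.04, 0.20, 0.35, 0.50, 0.70, 0.90]
print(empirical_p_value(0.05, biased_null))
```

With a null skewed toward zero, a nominal p-value of 0.05 is no longer notable, which is the kind of bias the abstract argues affects classical meta-analysis when the uniformity assumption fails. The empirical p-values could then be combined across studies with a method such as the additive approach.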

19.
Genome Res ; 23(11): 1885-93, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23934932

ABSTRACT

Identifying the pathways that are significantly impacted in a given condition is a crucial step in understanding the underlying biological phenomena. All approaches currently available for this purpose calculate a P-value that aims to quantify the significance of the involvement of each pathway in the given phenotype. These P-values were previously thought to be independent. Here we show that this is not the case, and that many pathways can considerably affect each other's P-values through a "crosstalk" phenomenon. Although it is intuitive that various pathways could influence each other, the presence and extent of this phenomenon have not been rigorously studied and, most importantly, there is no currently available technique able to quantify the amount of such crosstalk. Here, we show that all three major categories of pathway analysis methods (enrichment analysis, functional class scoring, and topology-based methods) are severely influenced by crosstalk phenomena. Using real pathways and data, we show that in some cases pathways with significant P-values are not biologically meaningful, and that some biologically meaningful pathways with nonsignificant P-values become statistically significant when the crosstalk effects of other pathways are removed. We describe a technique able to detect, quantify, and correct crosstalk effects, as well as identify independent functional modules. We assessed this novel approach on data from four experiments involving three phenotypes and two species. This method is expected to allow a better understanding of individual experiment results, as well as a more refined definition of the existing signaling pathways for specific phenotypes.
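
The crosstalk effect described in this abstract can be illustrated with a toy over-representation calculation in Python. This is a simplified illustration only and not the correction technique proposed in the paper: the gene names, the pathway contents, and the approach of simply discarding shared genes are all hypothetical. It shows how a pathway can appear significant solely because of genes it shares with another pathway.

```python
import math


def ora_p_value(pathway, de_genes, universe_size):
    """Upper-tail hypergeometric p-value: probability of seeing at least the
    observed number of differentially expressed (DE) genes in the pathway."""
    K = len(pathway)             # genes in the pathway
    n = len(de_genes)            # DE genes in the experiment
    k = len(pathway & de_genes)  # observed overlap
    denom = math.comb(universe_size, n)
    tail = sum(
        math.comb(K, i) * math.comb(universe_size - K, n - i)
        for i in range(k, min(K, n) + 1)
    )
    return tail / denom


# Hypothetical data: pathway A overlaps with pathway B on g1, g2 and g3,
# and those shared genes are exactly the DE genes.
pathway_a = {"g1", "g2", "g3", "g4", "g5"}
pathway_b = {"g1", "g2", "g3", "g6", "g7", "g8"}
de_genes = {"g1", "g2", "g3"}
universe_size = 100

p_full = ora_p_value(pathway_a, de_genes, universe_size)
# Discard the genes A shares with B to see what A contributes on its own.
p_without_b = ora_p_value(pathway_a - pathway_b, de_genes, universe_size)
```

In this toy setting pathway A looks highly significant with the shared genes included, and loses all significance once they are discarded, which is the kind of dependence between pathway P-values that the abstract refers to.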


Subjects
Computational Biology/methods , Metabolic Networks and Pathways , Signal Transduction , Adipose Tissue, White/metabolism , Animals , Cervical Ripening , Cervix Uteri/metabolism , Female , Gene Expression , Humans , Mice , Models, Biological , Phenotype , Pregnancy , Species Specificity
20.
Bioinformatics ; 30(21): 3036-43, 2014 Nov 01.
Article in English | MEDLINE | ID: mdl-25028721

ABSTRACT

MOTIVATION: Oncogenes are known drivers of cancer phenotypes and targets of molecular therapies; however, the complex and diverse signaling mechanisms regulated by oncogenes and potential routes to targeted therapy resistance remain to be fully understood. To this end, we present an approach to infer regulatory mechanisms downstream of the HER2 driver oncogene in SUM-225 metastatic breast cancer cells from dynamic gene expression patterns using a succession of analytical techniques, including a novel MP grammars method to mathematically model putative regulatory interactions among sets of clustered genes. RESULTS: Our method highlighted regulatory interactions previously identified in the cell line and a novel finding that the HER2 oncogene, as opposed to the proto-oncogene, upregulates expression of the E2F2 transcription factor. By targeted gene knockdown we show the significance of this, demonstrating that cancer cell-matrix adhesion and outgrowth were markedly inhibited when E2F2 levels were reduced. This validates, in this context, that upregulation of E2F2 represents a key intermediate event in a HER2 oncogene-directed gene expression-based signaling circuit. This work demonstrates how predictive modeling of longitudinal gene expression data combined with multiple systems-level analyses can be used to accurately predict downstream signaling pathways. Here, our integrated method was applied to reveal insights as to how the HER2 oncogene drives a specific cancer cell phenotype, but it is adaptable to investigate other oncogenes and model systems. AVAILABILITY AND IMPLEMENTATION: Accessibility of various tools is listed in methods; the Log-Gain Stoichiometric Stepwise algorithm is accessible at http://www.cbmc.it/software/Software.php.


Subjects
Breast Neoplasms/genetics , E2F2 Transcription Factor/physiology , Gene Expression Regulation, Neoplastic , Genes, erbB-2 , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Cell Adhesion , Cell Line, Tumor , Cell-Matrix Junctions/metabolism , E2F2 Transcription Factor/genetics , E2F2 Transcription Factor/metabolism , Female , Gene Knockdown Techniques , Humans , Models, Genetic , Proto-Oncogene Mas , Signal Transduction/genetics , Transcription, Genetic , Transcriptome , Up-Regulation