Results 1 - 9 of 9
1.
PLoS Comput Biol ; 16(2): e1007652, 2020 02.
Article in English | MEDLINE | ID: mdl-32069277

ABSTRACT

English Wikipedia, containing more than five million articles, has approximately eleven thousand web pages devoted to proteins or genes, most of which were generated by the Gene Wiki project. These pages contain information about interactions between proteins and their functional relationships. At the same time, they are interconnected with other Wikipedia pages describing biological functions, diseases, drugs and other topics curated by independent, uncoordinated collective efforts. Therefore, Wikipedia contains a directed network of protein functional relations or physical interactions embedded into the global network of the encyclopedia terms, which defines hidden (indirect) functional proximity between proteins. We applied the recently developed reduced Google Matrix (REGOMAX) algorithm in order to extract the network of hidden functional connections between proteins in Wikipedia. In this network we discovered tight communities which reflect areas of interest in molecular biology or medicine and can be considered as definitions of biological functions shaped by collective intelligence. Moreover, by comparing two snapshots of the Wikipedia graph (from years 2013 and 2017), we studied the evolution of the network of direct and hidden protein connections. We concluded that the hidden connections are more dynamic than the direct ones and that the size of the hidden interaction communities grows with time. We recapitulate the results of Wikipedia protein community analysis and annotation in the form of an interactive online map, which can serve as a portal to the Gene Wiki project.
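
As a rough illustration of the idea (not the authors' implementation), the reduced Google matrix for a chosen subset of nodes r, with all remaining nodes s, is usually written G_R = G_rr + G_rs (1 - G_ss)^(-1) G_sr. The sketch below builds a Google matrix for a small made-up link graph and evaluates G_R with a plain iterative series. The graph, the damping factor 0.85, and all function names are assumptions made for this example.

```python
def google_matrix(links, n, alpha=0.85):
    """Column-stochastic matrix: G[i][j] is the probability of moving j -> i."""
    out = [[] for _ in range(n)]
    for src, dst in links:
        out[src].append(dst)
    G = [[(1.0 - alpha) / n] * n for _ in range(n)]
    for j in range(n):
        if out[j]:
            for i in out[j]:
                G[i][j] += alpha / len(out[j])
        else:  # dangling node: spread its weight uniformly
            for i in range(n):
                G[i][j] += alpha / n
    return G


def pagerank(G, iters=500):
    """Stationary vector of G by power iteration."""
    n = len(G)
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [sum(G[i][j] * p[j] for j in range(n)) for i in range(n)]
    return p


def reduced_google_matrix(G, subset, iters=500):
    """G_R = G_rr + G_rs (1 - G_ss)^(-1) G_sr for the nodes in `subset`."""
    r = list(subset)
    rset = set(r)
    s = [k for k in range(len(G)) if k not in rset]

    def block(rows, cols):
        return [[G[i][j] for j in cols] for i in rows]

    Grr, Grs, Gsr, Gss = block(r, r), block(r, s), block(s, r), block(s, s)
    # X converges to (1 - G_ss)^(-1) G_sr through the fixed point X = G_sr + G_ss X.
    X = [row[:] for row in Gsr]
    for _ in range(iters):
        X = [[Gsr[a][c] + sum(Gss[a][b] * X[b][c] for b in range(len(s)))
              for c in range(len(r))] for a in range(len(s))]
    return [[Grr[a][c] + sum(Grs[a][b] * X[b][c] for b in range(len(s)))
             for c in range(len(r))] for a in range(len(r))]


# Toy directed graph with six pages; pages 0, 1 and 2 play the role of "proteins".
links = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 1), (5, 0)]
G = google_matrix(links, 6)
GR = reduced_google_matrix(G, [0, 1, 2])
```

In this toy graph G_R is again column-stochastic, and its stationary vector matches the global PageRank restricted to the three selected nodes and renormalised, which is what allows indirect paths through the rest of the graph to appear as effective links between the selected nodes.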


Subjects
Biological Phenomena, Computational Biology/methods, Protein Interaction Mapping, Proteins/chemistry, Search Engine, Algorithms, Cluster Analysis, Genetic Databases, Internet, Markov Chains, Probability
2.
Nat Commun ; 10(1): 4808, 2019 10 22.
Article in English | MEDLINE | ID: mdl-31641119

ABSTRACT

The lack of integrated resources depicting the complexity of the innate immune response in cancer represents a bottleneck for high-throughput data interpretation. To address this challenge, we perform a systematic manual literature mining of molecular mechanisms governing the innate immune response in cancer and represent it as a signalling network map. The cell-type specific signalling maps of macrophages, dendritic cells, myeloid-derived suppressor cells and natural killer cells are constructed and integrated into a comprehensive meta-map of the innate immune response in cancer. The meta-map contains 1466 chemical species as nodes connected by 1084 biochemical reactions, and it is supported by information from 820 articles. The resource helps to interpret single cell RNA-Seq data from macrophages and natural killer cells in metastatic melanoma that reveal different anti- or pro-tumor sub-populations within each cell type. Here, we report a new open source analytic platform that supports data visualisation and interpretation of tumour microenvironment activity in cancer.


Subjects
Innate Immunity, Neoplasms/immunology, Dendritic Cells/immunology, Humans, Natural Killer Cells/immunology, Macrophages/immunology, Neoplasms/genetics, Signal Transduction, Tumor Microenvironment
3.
BMC Med Genomics ; 12(1): 132, 2019 09 18.
Article in English | MEDLINE | ID: mdl-31533822

ABSTRACT

BACKGROUND: The amount of publicly available cancer-related "omics" data is constantly growing and can potentially be used to gain insights into the tumour biology of new cancer patients, their diagnosis and suitable treatment options. However, the integration of different datasets is not straightforward and requires specialized approaches to deal with heterogeneity at technical and biological levels. METHODS: Here we present a method that can overcome technical biases, predict clinically relevant outcomes and identify tumour-related biological processes in patients using previously collected large discovery datasets. The approach is based on independent component analysis (ICA), an unsupervised method of signal deconvolution. We developed parallel consensus ICA that robustly decomposes transcriptomics datasets into expression profiles with minimal mutual dependency. RESULTS: By applying the method to a small cohort of primary melanoma and control samples combined with a large discovery melanoma dataset, we demonstrate that our method distinguishes cell-type specific signals from technical biases and allows prediction of clinically relevant patient characteristics. We showed the potential of the method to predict cancer subtypes and estimate the activity of key tumour-related processes such as immune response, angiogenesis and cell proliferation. An ICA-based risk score was proposed and its connection to patient survival was validated with an independent cohort of patients. Additionally, through integration of components identified for mRNA and miRNA data, the proposed method helped deduce biological functions of miRNAs, which would otherwise not be possible. CONCLUSIONS: We present a method that can be used to map new transcriptomic data from cancer patient samples onto large discovery datasets. The method corrects technical biases, helps characterize the activity of biological processes or cell types in the new samples and provides a prognosis of patient survival.


Subjects
Computational Biology/methods, Melanoma/genetics, MicroRNAs/metabolism, Transcriptome, Genetic Databases, Female, Humans, Male, Melanoma/mortality, Melanoma/pathology, MicroRNAs/genetics, Principal Component Analysis, Survival Analysis
4.
Int J Mol Sci ; 20(18)2019 Sep 07.
Article in English | MEDLINE | ID: mdl-31500324

ABSTRACT

Independent component analysis (ICA) is a matrix factorization approach where the signals captured by each individual matrix factor are optimized to become as mutually independent as possible. Initially suggested for solving blind source separation problems in various fields, ICA was shown to be successful in analyzing functional magnetic resonance imaging (fMRI) and other types of biomedical data. In the last twenty years, ICA has become a part of the standard machine learning toolbox, together with other matrix factorization methods such as principal component analysis (PCA) and non-negative matrix factorization (NMF). Here, we review a number of recent works where ICA was shown to be a useful tool for unraveling the complexity of cancer biology from the analysis of different types of omics data, mainly collected for tumor samples. Such works highlight the use of ICA in dimensionality reduction, deconvolution, data pre-processing, meta-analysis, and other tasks, applied to different data types (transcriptome, methylome, proteome, single-cell data). We particularly focus on the technical aspects of ICA application in omics studies such as using different protocols, determining the optimal number of components, assessing and improving reproducibility of the ICA results, and comparison with other popular matrix factorization techniques. We discuss the emerging ICA applications to the integrative analysis of multi-level omics datasets and introduce a conceptual view on ICA as a tool for defining functional subsystems of a complex biological system and their interactions under various conditions. Our review is accompanied by a Jupyter notebook which illustrates the discussed concepts and provides a practical tool for applying ICA to the analysis of cancer omics datasets.


Subjects
Computational Biology/methods, Neoplasms/genetics, Neoplasms/metabolism, Algorithms, Data Curation, Factual Databases, Humans, Machine Learning, Magnetic Resonance Imaging, Neoplasms/diagnostic imaging, Principal Component Analysis
5.
Nat Immunol ; 19(8): 885-897, 2018 08.
Article in English | MEDLINE | ID: mdl-30013147

ABSTRACT

The functions and transcriptional profiles of dendritic cells (DCs) result from the interplay between ontogeny and tissue imprinting. How tumors shape human DCs is unknown. Here we used RNA-based next-generation sequencing to systematically analyze the transcriptomes of plasmacytoid pre-DCs (pDCs), cell populations enriched for type 1 conventional DCs (cDC1s), type 2 conventional DCs (cDC2s), CD14+ DCs and monocytes-macrophages from human primary luminal breast cancer (LBC) and triple-negative breast cancer (TNBC). By comparing tumor tissue with non-invaded tissue from the same patient, we found that 85% of the genes upregulated in DCs in LBC were specific to each DC subset. However, all DC subsets in TNBC commonly showed enrichment for the interferon pathway, but those in LBC did not. Finally, we defined transcriptional signatures specific for tumor DC subsets with a prognostic effect on their respective breast-cancer subtype. We conclude that the adjustment of DCs to the tumor microenvironment is subset specific and can be used to predict disease outcome. Our work also provides a resource for the identification of potential targets and biomarkers that might improve antitumor therapies.


Subjects
Dendritic Cells/physiology, Human Mammary Glands/physiology, Triple Negative Breast Neoplasms/genetics, Tumor Biomarkers, Cell Differentiation, Cell Movement, Female, Flow Cytometry, Gene Regulatory Networks, High-Throughput Nucleotide Sequencing, Humans, Interferons/genetics, Prognosis, Transcriptome, Triple Negative Breast Neoplasms/diagnosis, Tumor Microenvironment
6.
J Infect Dis ; 217(11): 1690-1698, 2018 05 05.
Article in English | MEDLINE | ID: mdl-29490079

ABSTRACT

Background: Early detection of severe dengue can improve patient care and survival. To date, no reliable single-gene biomarker exists. We hypothesized that robust multigene signatures exist. Methods: We performed a prospective study on Cambodian dengue patients aged 4 to 22 years. Peripheral blood mononuclear cells (PBMCs) were obtained at hospital admission. We analyzed 42 transcriptomic profiles of patients with secondary dengue infected with dengue serotype 1. Our novel signature discovery approach controls the number of included genes and captures nonlinear relationships between transcript concentrations and severity. We evaluated the signature on secondary cases infected with different serotypes using 2 datasets: 22 PBMC samples from additional patients in our cohort and 32 whole blood samples from an independent cohort. Results: We identified an 18-gene signature for detecting severe dengue in patients with secondary infection upon hospital admission with a sensitivity of 0.93 (95% confidence interval [CI], .82-.98), specificity of 0.67 (95% CI, .53-.80), and area under the receiver operating characteristic curve (AUC) of 0.86 (95% CI, .75-.97). At validation, the signature had empirical AUCs of 0.85 (95% CI, .69-1.00) and 0.83 (95% CI, .68-.98) for the PBMCs and whole blood datasets, respectively. Conclusions: The signature could detect severe dengue in secondary-infected patients upon hospital admission. Its genes offer new insights into the pathogenesis of severe dengue.
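
For readers unfamiliar with the reported metrics, the following minimal sketch shows how sensitivity, specificity and the area under the ROC curve are computed from labels and classifier scores. The labels, scores and the 0.5 threshold are invented toy values and do not come from the study; the function names are likewise assumptions.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions (1 = severe)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: three severe (1) and four non-severe (0) cases.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(labels, predictions)
area = auc(labels, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises performance across all thresholds, which is why abstracts such as this one typically report all three.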


Subjects
RNA/blood, Severe Dengue/blood, Severe Dengue/diagnosis, Adolescent, Adult, Child, Preschool Child, Coinfection/blood, Coinfection/diagnosis, Coinfection/virology, Dengue Virus/genetics, Female, Genetic Markers/genetics, Hospitalization, Hospitals, Humans, Mononuclear Leukocytes/virology, Male, Prospective Studies, ROC Curve, Sensitivity and Specificity, Serogroup, Transcriptome/genetics, Young Adult
7.
BMC Genomics ; 18(1): 712, 2017 Sep 11.
Article in English | MEDLINE | ID: mdl-28893186

ABSTRACT

BACKGROUND: Independent Component Analysis (ICA) is a method that models gene expression data as an action of a set of statistically independent hidden factors. The output of ICA depends on a fundamental parameter: the number of components (factors) to compute. The optimal choice of this parameter, related to determining the effective data dimension, remains an open question in the application of blind source separation techniques to transcriptomic data. RESULTS: Here we address the question of optimizing the number of statistically independent components in the analysis of transcriptomic data for reproducibility of the components in multiple runs of ICA (within the same or within varying effective dimensions) and in multiple independent datasets. To this end, we introduce ranking of independent components based on their stability in multiple ICA computation runs and define a distinguished number of components (Most Stable Transcriptome Dimension, MSTD) corresponding to the point of the qualitative change of the stability profile. Based on a large body of data, we demonstrate that a sufficient number of dimensions is required for biological interpretability of the ICA decomposition and that the most stable components with ranks below MSTD are more likely to be reproduced in independent studies than the less stable ones. At the same time, we show that a transcriptomics dataset can be reduced to a relatively high number of dimensions without losing the interpretability of ICA, even though higher dimensions give rise to components driven by small gene sets. CONCLUSIONS: We suggest a protocol of ICA application to transcriptomics data with a possibility of prioritizing components with respect to their reproducibility that strengthens the biological interpretation. Computing too few components (far fewer than MSTD) is not optimal for interpretability of the results. The components ranked within the MSTD range are more likely to be reproduced in independent studies.
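
The stability-based ranking can be pictured with a simplified sketch. It assumes that the components from several ICA runs are already available as plain lists of numbers, and it scores each component of a reference run by its best absolute Pearson correlation with the components of every other run. This is a stand-in for whatever stability measure the study actually uses, it does not locate the MSTD itself, and the toy vectors and function names are invented for the example.

```python
import math


def pearson(u, v):
    """Pearson correlation between two equal-length numeric sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def rank_by_stability(reference_run, other_runs):
    """Return (stability, component_index) pairs, most stable first.

    Stability is the mean, over the other runs, of the best absolute
    correlation with any component of that run. The absolute value is
    used because the sign of an independent component is arbitrary.
    """
    scored = []
    for idx, comp in enumerate(reference_run):
        best = [max(abs(pearson(comp, other)) for other in run) for run in other_runs]
        scored.append((sum(best) / len(best), idx))
    return sorted(scored, reverse=True)


# Toy data: two components in the reference run, one additional run.
reference_run = [[1, 2, 3, 4, 5], [1, -1, 1, -1, 1]]
other_runs = [[[-1, -2, -3, -4, -5.2], [1, 0, -1, 0, 1]]]
ranking = rank_by_stability(reference_run, other_runs)
```

In the toy data the first component reappears (with flipped sign) in the second run and therefore ranks first, while the second component has no close counterpart and receives a low score.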


Subjects
Gene Expression Profiling, Neoplasms/genetics, Reproducibility of Results, Statistics as Topic
8.
PLoS Comput Biol ; 13(3): e1005432, 2017 03.
Article in English | MEDLINE | ID: mdl-28306714

ABSTRACT

The ability to build in-depth cell signaling networks from vast experimental data is a key objective of computational biology. The spleen tyrosine kinase (Syk) protein, a well-characterized key player in immune cell signaling, was, surprisingly, first shown by our group to exhibit an onco-suppressive function in mammary epithelial cells, a finding corroborated by many other studies, but the molecular mechanisms of this function remain largely unresolved. Based on existing proteomic data, we report here the generation of an interaction-based network of signaling pathways controlled by Syk in breast cancer cells. Pathway enrichment of the Syk targets previously identified by quantitative phospho-proteomics indicated that Syk is engaged in cell adhesion, motility, growth and death. Using the components and interactions of these pathways, we bootstrapped the reconstruction of a comprehensive network covering Syk signaling in breast cancer cells. To generate in silico hypotheses on Syk signaling propagation, we developed a method to rank paths between Syk and its targets. We first annotated the network according to experimental datasets. We then combined shortest path computation with random walk processes to estimate the importance of individual interactions and selected biologically relevant pathways in the network. Molecular and cell biology experiments allowed us to distinguish candidate mechanisms that underlie the impact of Syk on the regulation of cortactin and ezrin, both involved in actin-mediated cell adhesion and motility. The Syk network was further completed with the results of our biological validation experiments. The resulting Syk signaling sub-networks can be explored via an online visualization platform.
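
One way to picture the combination of shortest paths and random walks is sketched below. It enumerates all shortest paths between a source and a target, estimates how often a random walk with restart visits each node, and ranks the paths by the product of the walk's flow along their edges. The scoring rule, the restart probability of 0.3, the toy graph and all names are assumptions for illustration; the abstract does not specify how the authors combine the two computations.

```python
from collections import deque


def all_shortest_paths(graph, source, target):
    """Enumerate every shortest path source -> target using BFS predecessor lists."""
    dist, preds = {source: 0}, {source: []}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                preds[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                preds[v].append(u)
    if target not in dist:
        return []

    def unwind(node):
        if node == source:
            return [[source]]
        return [path + [node] for prev in preds[node] for path in unwind(prev)]

    return unwind(target)


def walk_with_restart(graph, source, restart=0.3, iters=200):
    """Visit probabilities of a random walk that restarts at `source`."""
    nodes = set(graph) | {v for vs in graph.values() for v in vs}
    p = {n: 1.0 if n == source else 0.0 for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        nxt[source] += restart
        for u in nodes:
            targets = graph.get(u, [])
            if targets:
                for v in targets:
                    nxt[v] += (1.0 - restart) * p[u] / len(targets)
            else:  # dead end: the walker goes back to the source
                nxt[source] += (1.0 - restart) * p[u]
        p = nxt
    return p


def rank_paths(graph, source, target):
    """Shortest paths ordered by the product of random-walk flow along their edges."""
    p = walk_with_restart(graph, source)

    def score(path):
        total = 1.0
        for u in path[:-1]:
            total *= p[u] / len(graph[u])
        return total

    return sorted(all_shortest_paths(graph, source, target), key=score, reverse=True)


# Toy network: S is the source, T the target, a/b/c are hypothetical intermediates.
graph = {"S": ["a", "b"], "a": ["T"], "b": ["T", "c"], "c": ["T"], "T": []}
ranked = rank_paths(graph, "S", "T")
```

In the toy network both S-a-T and S-b-T are shortest paths, but the walk's flow through b is split between two outgoing edges, so S-a-T is ranked first.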


Subjects
Breast Neoplasms/metabolism, Neoplastic Gene Expression Regulation, Biological Models, Neoplasm Proteins/metabolism, Signal Transduction, Syk Kinase/metabolism, Tumor Cell Line, Computer Simulation, Female, Gene Expression Profiling/methods, Humans, MCF-7 Cells
9.
BMC Syst Biol ; 9: 46, 2015 Aug 14.
Article in English | MEDLINE | ID: mdl-26271256

ABSTRACT

BACKGROUND: Visualization and analysis of molecular profiling data together with biological networks can provide new mechanistic insights into biological functions. Currently, it is possible to visualize high-throughput data on top of pre-defined network layouts, but these layouts are not always suited to a given data analysis task. A network layout based simultaneously on the network structure and the associated multidimensional data might be advantageous for data visualization and analysis in some cases. RESULTS: We developed a Cytoscape app that constructs biological network layouts based on the data from molecular profiles imported as values of node attributes. DeDaL is a Cytoscape 3 app, which uses linear and non-linear algorithms of dimension reduction to produce data-driven network layouts based on multidimensional data (typically gene expression). DeDaL implements several data pre-processing and layout post-processing steps such as continuous morphing between two arbitrary network layouts and aligning one network layout with respect to another one by rotating and mirroring. The combination of all these functionalities facilitates the creation of insightful network layouts representing both structural network features and correlation patterns in multivariate data. We demonstrate the added value of applying DeDaL in several practical applications, including an example of a large protein-protein interaction network. CONCLUSIONS: DeDaL is a convenient tool for applying data dimensionality reduction methods and for designing insightful data displays based on data-driven layouts of biological networks, built within the Cytoscape environment. DeDaL is freely available for download at http://bioinfo-out.curie.fr/projects/dedal/.
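
The two layout post-processing steps named in the abstract can be illustrated with a small sketch. It assumes a layout is a dictionary mapping node names to 2D coordinates, treats morphing as linear interpolation, and treats alignment as a least-squares fit over rotations with an optional mirror flip. These are generic techniques chosen for the example, not a description of DeDaL's code, and every name and coordinate is hypothetical.

```python
import math


def morph(layout_a, layout_b, t):
    """Linear interpolation between two layouts: t=0 gives A, t=1 gives B."""
    return {n: ((1 - t) * xa + t * layout_b[n][0], (1 - t) * ya + t * layout_b[n][1])
            for n, (xa, ya) in layout_a.items()}


def _center(layout):
    """Shift a layout so that its centroid sits at the origin."""
    cx = sum(x for x, _ in layout.values()) / len(layout)
    cy = sum(y for _, y in layout.values()) / len(layout)
    return {n: (x - cx, y - cy) for n, (x, y) in layout.items()}


def _best_rotation(a, b):
    """Rotate centered layout `a` onto centered `b`; return (rotated, squared error)."""
    num = sum(a[n][0] * b[n][1] - a[n][1] * b[n][0] for n in a)
    den = sum(a[n][0] * b[n][0] + a[n][1] * b[n][1] for n in a)
    theta = math.atan2(num, den)  # least-squares optimal angle in 2D
    c, s = math.cos(theta), math.sin(theta)
    rotated = {n: (c * x - s * y, s * x + c * y) for n, (x, y) in a.items()}
    err = sum((rotated[n][0] - b[n][0]) ** 2 + (rotated[n][1] - b[n][1]) ** 2 for n in a)
    return rotated, err


def align(layout_a, layout_b):
    """Align A to B by rotation, trying a mirrored copy too; both are centered first."""
    a, b = _center(layout_a), _center(layout_b)
    mirrored = {n: (-x, y) for n, (x, y) in a.items()}
    return min(_best_rotation(a, b), _best_rotation(mirrored, b), key=lambda r: r[1])[0]


# Toy layouts for three hypothetical nodes.
layout_b = {"n1": (0.0, 0.0), "n2": (2.0, 0.0), "n3": (0.0, 1.0)}
layout_a = {"n1": (5.0, 3.0), "n2": (5.0, 1.0), "n3": (4.0, 3.0)}  # mirrored, shifted copy of B
halfway = morph(layout_a, layout_b, 0.5)
aligned = align(layout_a, layout_b)
```

Aligning two layouts before morphing between them avoids frames in which the whole network appears to spin or flip, so that the remaining motion reflects genuine differences between the layouts.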


Subjects
Computational Biology/methods, Software, Algorithms, Breast Neoplasms/genetics, Breast Neoplasms/pathology, Computer Graphics, Genetic Databases, Gene Expression Profiling, Humans, Linear Models, Nonlinear Dynamics, Organ Specificity, RNA Sequence Analysis, Signal Transduction