Search | BVS Regional Portal

A2Sign: Agnostic Algorithms for Signatures - a universal method for identifying molecular signatures from transcriptomic datasets prior to cell-type deconvolution.

Boldina, Galina; Fogel, Paul; Rocher, Corinne; Bettembourg, Charles; Luta, George; Augé, Franck.

Bioinformatics ; 38(4): 1015-1021, 2022 Jan 27.

Article in English | MEDLINE | ID: mdl-34788798

ABSTRACT

MOTIVATION: Molecular signatures are critical for inferring the proportions of cell types from bulk transcriptomics data. However, the identification of these signatures is based on a methodology that relies on prior biological knowledge of the cell types being studied. When working with lesser-known biological material, a data-driven approach is required to uncover the underlying classes and generate ad hoc signatures from healthy or pathogenic tissue. RESULTS: We present a new approach, A2Sign: Agnostic Algorithms for Signatures, based on a non-negative tensor factorization (NTF) strategy that allows us to identify cell-type-specific molecular signatures, greatly reduce collinearities and also account for inter-individual variability. We propose a global framework that can be applied to uncover molecular signatures for cell-type deconvolution in arbitrary tissues using bulk transcriptome data. We also present two new molecular signatures for deconvolution of up to 16 immune cell types using microarray or RNA-seq data. AVAILABILITY AND IMPLEMENTATION: All steps of our analysis were implemented in annotated Python notebooks (https://github.com/paulfogel/A2SIGN). To perform NTF, we used the NMTF package, which can be installed with pip. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
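
As a rough illustration of the factorization idea in this abstract, the sketch below runs plain non-negative matrix factorization (NMF) with Lee-Seung multiplicative updates on a toy genes-by-samples matrix. It is a simplified two-dimensional stand-in and should not be read as the authors' NTF procedure or as the API of the NMTF package; the function name `nmf`, its parameters and the toy data are all hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (genes x samples) as W @ H.

    W (genes x rank) plays the role of signatures and H (rank x samples)
    the role of per-sample weights. Illustrative only.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Lee-Seung multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data: 3 hidden "cell types" mixed into 8 bulk samples over 20 genes.
rng = np.random.default_rng(1)
true_W = rng.random((20, 3))
true_H = rng.random((3, 8))
V = true_W @ true_H

W, H = nmf(V, rank=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Because the updates only multiply by non-negative ratios, the factors stay non-negative throughout, which is what makes the columns of `W` interpretable as additive signatures.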

Subjects

Algorithms; Transcriptome; Gene Expression Profiling; RNA-Seq; Exome Sequencing

ABHD11, a new diacylglycerol lipase involved in weight gain regulation.

Escoubet, Johanna; Kenigsberg, Mireille; Derock, Murielle; Yaligara, Veeranagouda; Bock, Marie-Dominique; Roche, Sandrine; Massey, Florence; de Foucauld, Hélène; Bettembourg, Charles; Olivier, Anne; Berthemy, Antoine; Capdevielle, Joël; Legoux, Richard; Perret, Eric; Buzy, Armelle; Chardenot, Pascale; Destelle, Valérie; Leroy, Aurélie; Cahours, Christophe; Teixeira, Sandrine; Juvet, Patrick; Gauthier, Pascal; Leguet, Michaël; Rocheteau-Beaujouan, Laurence; Chatoux, Marie-Agnès; Deshayes, Willy; Clement, Margerie; Kabiri, Mostafa; Orsini, Cécile; Mikol, Vincent; Didier, Michel; Guillemot, Jean-Claude.

PLoS One ; 15(6): e0234780, 2020.

Article in English | MEDLINE | ID: mdl-32579589

ABSTRACT

The obesity epidemic continues to spread, and obesity rates are increasing worldwide. In addition to public health efforts to reduce obesity, there is a need to better understand the underlying biology to enable more effective treatment and the discovery of new pharmacological agents. Abhydrolase domain-containing protein 11 (ABHD11) is a serine hydrolase enzyme, localized in mitochondria, that can synthesize the endocannabinoid 2-arachidonoyl glycerol (2AG) in vitro. In vivo preclinical studies demonstrated that ABHD11 knockout (KO) mice have a similar 2AG level to WT mice and exhibit a lean metabolic phenotype. Such mice resist weight gain in diet-induced obesity (DIO) studies and display normal biochemical plasma parameters. Metabolic and transcriptomic analyses of serum and tissues of ABHD11 KO mice from DIO studies show a modulation in bile salts associated with reduced intestinal fat absorption. These data suggest that modulating the ABHD11 signaling pathway could be of therapeutic value for the treatment of metabolic disorders.

Subjects

Serine Proteases/metabolism; Weight Gain; Animals; Feces/enzymology; Gene Expression Profiling; Gene Expression Regulation, Enzymologic; Gene Knockout Techniques; Humans; MCF-7 Cells; Mice; Mitochondria/metabolism; Serine Proteases/deficiency; Serine Proteases/genetics; Signal Transduction

Optimal Threshold Determination for Interpreting Semantic Similarity and Particularity: Application to the Comparison of Gene Sets and Metabolic Pathways Using GO and ChEBI.

Bettembourg, Charles; Diot, Christian; Dameron, Olivier.

PLoS One ; 10(7): e0133579, 2015.

Article in English | MEDLINE | ID: mdl-26230274

ABSTRACT

BACKGROUND: The analysis of gene annotations referencing back to Gene Ontology plays an important role in the interpretation of high-throughput experimental results. This analysis typically involves semantic similarity and particularity measures that quantify the importance of the Gene Ontology annotations. However, there is currently no sound method supporting the interpretation of the similarity and particularity values in order to determine whether two genes are similar or whether one gene has some significant particular function. Interpretation is frequently based either on an implicit threshold, or an arbitrary one (typically 0.5). Here we investigate a method for determining thresholds supporting the interpretation of the results of a semantic comparison. RESULTS: We propose a method for determining the optimal similarity threshold by minimizing the proportions of false-positive and false-negative similarity matches. We compared the distributions of the similarity values of pairs of similar genes and pairs of non-similar genes. These comparisons were performed separately for all three branches of the Gene Ontology. In all situations, we found overlap between the similar and the non-similar distributions, indicating that some similar genes had a similarity value lower than the similarity value of some non-similar genes. We then extended this method to the semantic particularity measure and to a similarity measure applied to the ChEBI ontology. Thresholds were evaluated over the whole HomoloGene database. For each group of homologous genes, we computed all the similarity and particularity values between pairs of genes. Finally, we focused on the PPAR multigene family to show that the similarity and particularity patterns obtained with our thresholds were better at discriminating orthologs and paralogs than those obtained using default thresholds. CONCLUSION: We developed a method for determining optimal semantic similarity and particularity thresholds. We applied this method to the GO and ChEBI ontologies. Qualitative analysis using the thresholds on the PPAR multigene family yielded biologically relevant patterns.
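
The threshold selection described in this abstract can be sketched as a simple scan over candidate cut-offs. The example below is a minimal illustration, assuming that the quantity to minimize is the sum of the false-negative and false-positive proportions (the abstract does not state how the two are combined); the function name `optimal_threshold` and the score lists are hypothetical.

```python
def optimal_threshold(similar_scores, non_similar_scores):
    """Return the threshold minimizing FN rate + FP rate.

    A pair is called "similar" when its score is >= threshold.
    """
    candidates = sorted(set(similar_scores) | set(non_similar_scores))
    best_threshold, best_cost = None, float("inf")
    for t in candidates:
        # Similar pairs scored below t are false negatives.
        fn_rate = sum(s < t for s in similar_scores) / len(similar_scores)
        # Non-similar pairs scored at or above t are false positives.
        fp_rate = sum(s >= t for s in non_similar_scores) / len(non_similar_scores)
        cost = fn_rate + fp_rate
        if cost < best_cost:
            best_threshold, best_cost = t, cost
    return best_threshold, best_cost

# Hypothetical similarity values with overlapping distributions.
similar = [0.55, 0.62, 0.70, 0.81, 0.90]
non_similar = [0.10, 0.25, 0.40, 0.58, 0.60]
threshold, cost = optimal_threshold(similar, non_similar)
```

On this toy data the selected threshold lies above the arbitrary 0.5 mentioned in the abstract, because two non-similar pairs score between 0.5 and 0.62.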

Subjects

Metabolic Networks and Pathways/genetics; Algorithms; Computational Biology/methods; Gene Ontology; Humans; Molecular Sequence Annotation/methods; Multigene Family/genetics; Peroxisome Proliferator-Activated Receptors/genetics; Semantics

Semantic particularity measure for functional characterization of gene sets using gene ontology.

Bettembourg, Charles; Diot, Christian; Dameron, Olivier.

PLoS One ; 9(1): e86525, 2014.

Article in English | MEDLINE | ID: mdl-24489737

ABSTRACT

BACKGROUND: Genetic and genomic data analyses are outputting large sets of genes. Functional comparison of these gene sets is a key part of the analysis, as it identifies their shared functions, and the functions that distinguish each set. The Gene Ontology (GO) initiative provides a unified reference for analyzing the molecular functions, biological processes and cellular components of genes. Numerous semantic similarity measures have been developed to systematically quantify the weight of the GO terms shared by two genes. We studied how gene set comparisons can be improved by considering gene set particularity in addition to gene set similarity. RESULTS: We propose a new approach to compute gene set particularities based on the information conveyed by GO terms. A GO term's informativeness can be computed using either its information content based on the term frequency in a corpus, or a function of the term's distance to the root. We defined the semantic particularity of a set of GO terms Sg1 compared to another set of GO terms Sg2. We combined our particularity measure with a similarity measure to compare gene sets. We demonstrated that the combination of semantic similarity and semantic particularity measures was able to identify genes with particular functions from among similar genes. This differentiation was not recognized using only a semantic similarity measure. CONCLUSION: Semantic particularity should be used in conjunction with semantic similarity to perform functional analysis of GO-annotated gene sets. The principle is generalizable to other ontologies.
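
The frequency-based informativeness mentioned in this abstract is commonly written as IC(t) = -log p(t), where p(t) is the relative frequency of term t in an annotation corpus. The sketch below computes it for a toy hierarchy. It does not implement the particularity measure itself, whose formula is not given in the abstract; the function name `information_content` and the counts are hypothetical.

```python
import math

def information_content(term_counts):
    """Map each term to IC(t) = -log(p(t)), with p(t) its relative frequency.

    `term_counts` is assumed to already include annotations inherited
    from descendant terms, so the root has the largest count.
    """
    total = max(term_counts.values())  # count at the root term
    return {term: -math.log(count / total) for term, count in term_counts.items()}

# Hypothetical annotation counts for a tiny three-term hierarchy.
counts = {"root": 1000, "metabolic process": 250, "lipid metabolic process": 10}
ic = information_content(counts)
```

Rarer, more specific terms receive a higher information content, while the root, which annotates everything, carries none.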

Subjects

Databases, Genetic; Gene Ontology; Genes; Semantics; Animals; Aquaporins/metabolism; Biological Transport; Genes, Fungal; Humans; Karyopherins/genetics; Rats; Saccharomyces cerevisiae/genetics; Sequence Homology, Nucleic Acid; Tryptophan/metabolism

Measuring the evolution of ontology complexity: the gene ontology case study.

Dameron, Olivier; Bettembourg, Charles; Le Meur, Nolwenn.

PLoS One ; 8(10): e75993, 2013.

Article in English | MEDLINE | ID: mdl-24146805

ABSTRACT

Ontologies support automatic sharing, combination and analysis of life sciences data. They undergo regular curation and enrichment. We studied the impact of an ontology's evolution on its structural complexity. As a case study we used the sixty monthly releases between January 2008 and December 2012 of the Gene Ontology and its three independent branches, i.e. biological processes (BP), cellular components (CC) and molecular functions (MF). For each case, we measured complexity by computing metrics related to the size, the node connectivity and the hierarchical structure. The number of classes and relations increased monotonically for each branch, with different growth rates. BP and CC had similar connectivity, superior to that of MF. Connectivity increased monotonically for BP, decreased for CC and remained stable for MF, with a marked increase for the three branches in November and December 2012. Hierarchy-related measures showed that CC and MF had similar proportions of leaves, average depths and average heights. BP had a lower proportion of leaves, and a higher average depth and average height. For BP and MF, the late 2012 increase of connectivity resulted in an increase of the average depth and average height and a decrease of the proportion of leaves, indicating that a major enrichment effort of the intermediate-level hierarchy occurred. The variation of the number of classes and relations in an ontology does not provide enough information about the evolution of its complexity. However, connectivity and hierarchy-related metrics revealed different patterns of values as well as of evolution for the three branches of the Gene Ontology. CC was similar to BP in terms of connectivity, and similar to MF in terms of hierarchy. Overall, BP complexity increased, CC was refined with the addition of leaves providing a finer level of annotations but decreasing slightly its complexity, and MF complexity remained stable.
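
To make the kinds of metrics named in this abstract concrete, the sketch below computes size, leaf proportion and average depth on a toy hierarchy. The abstract does not give the exact definitions used in the study, so this is only one plausible reading: depth is assumed here to be the shortest distance from the root, and the function name `hierarchy_metrics` and the toy graph are hypothetical.

```python
from collections import deque

def hierarchy_metrics(children, root):
    """Compute simple structural metrics for an is-a hierarchy.

    `children` maps each class to the list of its direct subclasses.
    Depth is taken here as the shortest distance from the root.
    """
    nodes = set(children) | {c for subs in children.values() for c in subs}
    n_relations = sum(len(subs) for subs in children.values())
    leaves = [n for n in nodes if not children.get(n)]

    # Breadth-first search gives the shortest depth of every class.
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)

    return {
        "classes": len(nodes),
        "relations": n_relations,
        "leaf_proportion": len(leaves) / len(nodes),
        "average_depth": sum(depth.values()) / len(depth),
    }

# Toy hierarchy in which class D has two parents (A and B).
toy = {"root": ["A", "B"], "A": ["C", "D"], "B": ["D"], "C": [], "D": []}
metrics = hierarchy_metrics(toy, "root")
```

Running the same function on successive releases of an ontology would give a time series of each metric, which is the kind of comparison the abstract describes.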

Subjects

Computational Biology/history; Gene Ontology/trends; Vocabulary, Controlled/history; Gene Ontology/statistics & numerical data; History, 21st Century; Humans; Time Factors

The duplicated genes database: identification and functional annotation of co-localised duplicated genes across genomes.

Ouedraogo, Marion; Bettembourg, Charles; Bretaudeau, Anthony; Sallou, Olivier; Diot, Christian; Demeure, Olivier; Lecerf, Frédéric.

PLoS One ; 7(11): e50653, 2012.

Article in English | MEDLINE | ID: mdl-23209799

ABSTRACT

BACKGROUND: There has been a surge in studies linking genome structure and gene expression, with special focus on duplicated genes. Although initially duplicated from the same sequence, duplicated genes can diverge strongly over evolution and take on different functions or regulated expression. However, information on the function and expression of duplicated genes remains sparse. Identifying groups of duplicated genes in different genomes and characterizing their expression and function would therefore be of great interest to the research community. The 'Duplicated Genes Database' (DGD) was developed for this purpose. METHODOLOGY: Nine species were included in the DGD. For each species, BLAST analyses were conducted on peptide sequences corresponding to the genes mapped on the same chromosome. Groups of duplicated genes were defined based on these pairwise BLAST comparisons and the genomic location of the genes. For each group, Pearson correlations between gene expression data and semantic similarities between functional GO annotations were also computed when the relevant information was available. CONCLUSIONS: The Duplicated Genes Database provides a list of co-localised and duplicated genes for several species with the available gene co-expression level and semantic similarity value of functional annotation. Adding these data to the groups of duplicated genes provides biological information that can prove useful to gene expression analyses. The Duplicated Genes Database can be freely accessed through the DGD website at http://dgd.genouest.org.
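
The co-expression value mentioned in this abstract is a Pearson correlation between the expression profiles of two genes. The sketch below shows that computation on made-up data; the function name `pearson` and the expression values are hypothetical and are not taken from the DGD.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Hypothetical expression levels of two duplicated genes across five samples.
gene_a = [2.0, 4.0, 6.0, 8.0, 10.0]
gene_b = [1.1, 2.0, 2.9, 4.2, 5.0]
r = pearson(gene_a, gene_b)
```

A value of `r` close to 1 would indicate that the two duplicated genes are strongly co-expressed across the sampled conditions.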

Subjects

Databases, Genetic; Genes, Duplicate/genetics; Internet

GO2PUB: Querying PubMed with semantic expansion of gene ontology terms.

Bettembourg, Charles; Diot, Christian; Burgun, Anita; Dameron, Olivier.

J Biomed Semantics ; 3(1): 7, 2012 Sep 07.

Article in English | MEDLINE | ID: mdl-22958570

ABSTRACT

BACKGROUND: With the development of high throughput methods of gene analyses, there is a growing need for mining tools to retrieve relevant articles in PubMed. As PubMed grows, literature searches become more complex and time-consuming. Automated search tools with good precision and recall are necessary. We developed GO2PUB to automatically enrich PubMed queries with gene names, symbols and synonyms annotated by a GO term of interest or one of its descendants. RESULTS: GO2PUB enriches PubMed queries based on selected GO terms and keywords. It processes the result and displays the PMID, title, authors, abstract and bibliographic references of the articles. Gene names, symbols and synonyms that have been generated as extra keywords from the GO terms are also highlighted. GO2PUB is based on a semantic expansion of PubMed queries using the semantic inheritance between terms through the GO graph. Two experts manually assessed the relevance of GO2PUB, GoPubMed and PubMed on three queries about lipid metabolism. Experts' agreement was high (kappa = 0.88). GO2PUB returned 69% of the relevant articles, GoPubMed 40% and PubMed 29%. GO2PUB and GoPubMed have 17% of their results in common, corresponding to 24% of the total number of relevant results. 70% of the articles returned by more than one tool were relevant. 36% of the relevant articles were returned only by GO2PUB, 17% only by GoPubMed and 14% only by PubMed. To determine whether these results can be generalized, we generated twenty queries based on random GO terms with a granularity similar to those of the first three queries and compared the proportions of GO2PUB and GoPubMed results. These were 77% and 40%, respectively, for the first queries, and 70% and 38% for the random queries. The two experts also assessed the relevance of seven of the twenty queries (the three related to lipid metabolism and four related to other domains). Expert agreement was high (0.93 and 0.8). The performance of GO2PUB and GoPubMed was similar to that on the first queries. CONCLUSIONS: We demonstrated that the use of genes annotated by either GO terms of interest or a descendant of these GO terms yields some relevant articles ignored by other tools. The comparison of GO2PUB, based on semantic expansion, with GoPubMed, based on text mining techniques, showed that the two tools are complementary. The analysis of the randomly-generated queries suggests that the results obtained about lipid metabolism can be generalized to other biological processes. GO2PUB is available at http://go2pub.genouest.org.
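
The inter-expert agreement reported in this abstract is a kappa statistic. The sketch below shows how such a value can be computed, assuming Cohen's kappa for two raters (the abstract does not name the variant); the function name `cohen_kappa` and the relevance judgments are hypothetical and do not reproduce the study's data.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labeling the same items."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    # Agreement expected by chance from each rater's label frequencies.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical relevance judgments (1 = relevant, 0 = not) for ten articles.
expert_1 = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
expert_2 = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
kappa = cohen_kappa(expert_1, expert_2)
```

Kappa corrects the raw agreement rate for the agreement two raters would reach by chance, which is why it is lower than the simple proportion of matching judgments.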
