1.
Bioinformatics ; 39(4), 2023 04 03.
Article in English | MEDLINE | ID: mdl-37010521

ABSTRACT

MOTIVATION: Human traits are typically represented in both the biomedical literature and large population studies as descriptive text strings. Whilst a number of ontologies exist, none of these perfectly represent the entire human phenome and exposome. Mapping trait names across large datasets is therefore time-consuming and challenging. Recent developments in language modelling have created new methods for semantic representation of words and phrases, and these methods offer new opportunities to map human trait names in the form of words and short phrases, both to ontologies and to each other. Here, we present a comparison between a range of established and more recent language modelling approaches for the task of mapping trait names from UK Biobank to the Experimental Factor Ontology (EFO), and also explore how they compare to each other in direct trait-to-trait mapping. RESULTS: In our analyses of 1191 traits from UK Biobank with manual EFO mappings, the BioSentVec model performed best at predicting these, matching 40.3% of the manual mappings correctly. The BlueBERT-EFO model (finetuned on EFO) performed nearly as well (38.8% of traits matching the manual mapping). In contrast, Levenshtein edit distance only mapped 22% of traits correctly. Pairwise mapping of traits to each other demonstrated that many of the models can accurately group similar traits based on their semantic similarity. AVAILABILITY AND IMPLEMENTATION: Our code is available at https://github.com/MRCIEU/vectology.
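The Levenshtein baseline mentioned in the results can be illustrated with a minimal sketch: each trait name is mapped to the ontology label with the smallest edit distance, and the share of traits that agree with a manual mapping is reported. The label list and the manual mappings below are hypothetical toy data, not taken from UK Biobank or EFO, and the function names are illustrative rather than those of the vectology code.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def nearest_label(trait: str, labels: list[str]) -> str:
    """Return the label closest to the trait name (case-insensitive)."""
    trait = trait.lower()
    return min(labels, key=lambda label: levenshtein(trait, label.lower()))


def top1_accuracy(manual: dict[str, str], labels: list[str]) -> float:
    """Fraction of traits whose nearest label equals the manual mapping."""
    hits = sum(nearest_label(t, labels) == target for t, target in manual.items())
    return hits / len(manual)


# Hypothetical toy data (assumption, for illustration only).
ONTOLOGY_LABELS = ["body mass index", "asthma", "myocardial infarction"]
MANUAL_MAPPINGS = {
    "Body mass index (BMI)": "body mass index",
    "Asthma": "asthma",
    "Heart attack": "myocardial infarction",
}

if __name__ == "__main__":
    print(top1_accuracy(MANUAL_MAPPINGS, ONTOLOGY_LABELS))
```

In this toy set, "Heart attack" shares few characters with "myocardial infarction", so a purely string-based distance misses a mapping that a semantic model could in principle recover.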


Subjects
Biological Ontologies, Semantics, Humans, Language, Natural Language Processing, Phenotype
2.
Bioinformatics ; 33(2): 272-279, 2017 01 15.
Article in English | MEDLINE | ID: mdl-27663502

ABSTRACT

MOTIVATION: LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously. RESULTS: In this manuscript, we describe LD Hub - a centralized database of summary-level GWAS results for 173 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub; and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies. 
AVAILABILITY AND IMPLEMENTATION: The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/ CONTACT: jie.zheng@bristol.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
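The abstract does not spell out the regression itself. As a rough sketch, assuming the commonly cited LD score regression relationship E[chi2_j] = (N * h2 / M) * l_j + N * a + 1 (l_j the LD score of SNP j, N the sample size, M the number of SNPs, a a confounding term), SNP heritability can be read off the slope of chi-square statistics regressed on LD scores. The code below is a simplified, unweighted illustration with made-up noise-free numbers; the actual method uses weighted regression and a block jackknife for standard errors, and nothing here reflects the LD Hub pipeline's code.

```python
def ols_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def snp_heritability(ld_scores: list[float], chi2: list[float],
                     n_samples: int, n_snps: int) -> tuple[float, float]:
    """Return (h2 estimate, intercept) from an unweighted LD score regression.

    Assumes the slope equals N * h2 / M, so h2 = slope * M / N.
    """
    slope, intercept = ols_line(ld_scores, chi2)
    return slope * n_snps / n_samples, intercept


# Hypothetical, noise-free toy values (assumption): N = 50,000 samples,
# M = 1,000,000 SNPs, true h2 = 0.4, no confounding (intercept 1).
N_SAMPLES, N_SNPS = 50_000, 1_000_000
LD_SCORES = [10.0, 50.0, 100.0, 200.0]
CHI2 = [1.0 + 0.02 * l for l in LD_SCORES]  # slope N*h2/M = 0.02

if __name__ == "__main__":
    print(snp_heritability(LD_SCORES, CHI2, N_SAMPLES, N_SNPS))
```

An intercept well above 1 would, under this model, point to confounding such as population stratification rather than polygenic signal.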


Subjects
Nucleic Acid Databases, Inborn Genetic Diseases/genetics, Genome-Wide Association Study/methods, Linkage Disequilibrium, Phenotype, Single Nucleotide Polymorphism, Female, Genetic Predisposition to Disease, Humans, Male, Sample Size, Software
3.
Gigascience ; 7(8), 2018 08 01.
Article in English | MEDLINE | ID: mdl-30165448

ABSTRACT

Background: Identifying phenotypic correlations between complex traits and diseases can provide useful etiological insights. Restricted access to much individual-level phenotype data makes it difficult to estimate large-scale phenotypic correlation across the human phenome. Two state-of-the-art methods, metaCCA and LD score regression, provide an alternative approach to estimate phenotypic correlation using only genome-wide association study (GWAS) summary results. Results: Here, we present an integrated R toolkit, PhenoSpD, to use LD score regression to estimate phenotypic correlations using GWAS summary statistics and to utilize the estimated phenotypic correlations to inform correction of multiple testing for complex human traits using the spectral decomposition of matrices (SpD). The simulations suggest that it is possible to identify nonindependence of phenotypes using samples with partial overlap; as overlap decreases, the estimated phenotypic correlations will attenuate toward zero and multiple testing correction will be more stringent than in perfectly overlapping samples. Also, in contrast to LD score regression, metaCCA will provide approximate genetic correlations rather than phenotypic correlation, which limits its application for multiple testing correction. In a case study, PhenoSpD using UK Biobank GWAS results suggested 399.6 independent tests among 487 human traits, which is close to the 352.4 independent tests estimated using true phenotypic correlation. We further applied PhenoSpD to an estimated 5,618 pair-wise phenotypic correlations among 107 metabolites using GWAS summary statistics from Kettunen's publication and PhenoSpD suggested the equivalent of 33.5 independent tests for these metabolites. Conclusions: PhenoSpD extends the use of summary-level results, providing a simple and conservative way to reduce dimensionality for complex human traits using GWAS summary statistics. 
This is particularly valuable in the age of large-scale biobank and consortia studies, where GWAS results are much more accessible than individual-level data.
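To make the idea of "independent tests" concrete, here is a minimal sketch of one widely used spectral-decomposition estimator, Nyholt's formula Meff = 1 + (M - 1) * (1 - Var(lambda) / M), where lambda are the eigenvalues of an M x M phenotypic correlation matrix. Whether PhenoSpD reports exactly this variant is not stated in the abstract, so treat the choice as an assumption; the function names are hypothetical. The sketch avoids an eigen solver by using two identities for a correlation matrix R: the eigenvalues sum to M, and their squares sum to the sum of squared entries of R.

```python
def effective_tests_nyholt(corr: list[list[float]]) -> float:
    """Effective number of independent tests for a correlation matrix.

    Uses the sample variance of the eigenvalues, computed from
    sum(lambda) = M and sum(lambda^2) = sum of squared matrix entries.
    """
    m = len(corr)
    if m == 1:
        return 1.0
    sum_sq = sum(r * r for row in corr for r in row)
    var_lambda = (sum_sq - m) / (m - 1)  # eigenvalue mean is exactly 1
    return 1 + (m - 1) * (1 - var_lambda / m)


def corrected_alpha(corr: list[list[float]], alpha: float = 0.05) -> float:
    """Bonferroni-style threshold using the effective number of tests."""
    return alpha / effective_tests_nyholt(corr)


if __name__ == "__main__":
    # Two traits correlated at 0.5 (toy values, for illustration only).
    toy = [[1.0, 0.5], [0.5, 1.0]]
    print(effective_tests_nyholt(toy), corrected_alpha(toy))
```

Uncorrelated traits give Meff = M and perfectly correlated traits give Meff = 1, which matches the abstract's point that correlations attenuated toward zero lead to a more stringent correction.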


Subjects
Genome-Wide Association Study/methods, Single Nucleotide Polymorphism, Software, Humans, Linkage Disequilibrium, Genetic Models
4.
Nat Commun ; 9(1): 2897, 2018 07 24.
Article in English | MEDLINE | ID: mdl-30042390

ABSTRACT

The cellular and molecular basis of stromal cell recruitment, activation and crosstalk in carcinomas is poorly understood, limiting the development of targeted anti-stromal therapies. In mouse models of triple negative breast cancer (TNBC), Hedgehog ligand produced by neoplastic cells reprograms cancer-associated fibroblasts (CAFs) to provide a supportive niche for the acquisition of a chemo-resistant, cancer stem cell (CSC) phenotype via FGF5 expression and production of fibrillar collagen. Stromal treatment of patient-derived xenografts with smoothened inhibitors (SMOi) downregulates CSC marker expression and sensitizes tumors to docetaxel, leading to markedly improved survival and reduced metastatic burden. In the phase I clinical trial EDALINE, 3 of 12 patients with metastatic TNBC derived clinical benefit from combination therapy with the SMOi Sonidegib and docetaxel chemotherapy, with one patient experiencing a complete response. These studies identify Hedgehog signaling to CAFs as a novel mediator of CSC plasticity and an exciting new therapeutic target in TNBC.


Subjects
Antineoplastic Combined Chemotherapy Protocols/therapeutic use, Neoplasm Drug Resistance/drug effects, Neoplastic Stem Cells/drug effects, Triple Negative Breast Neoplasms/drug therapy, Adult, Aged, Anilides/administration & dosage, Animals, Biphenyl Compounds/administration & dosage, Tumor Cell Line, Docetaxel/administration & dosage, Female, Humans, Inbred NOD Mice, Knockout Mice, SCID Mice, Middle Aged, Neoplastic Stem Cells/metabolism, Pyridines/administration & dosage, Treatment Outcome, Triple Negative Breast Neoplasms/genetics, Triple Negative Breast Neoplasms/metabolism, Xenograft Model Antitumor Assays