1.
Bioinformatics ; 39(4)2023 04 03.
Article in English | MEDLINE | ID: mdl-37010521

ABSTRACT

MOTIVATION: Human traits are typically represented in both the biomedical literature and large population studies as descriptive text strings. Whilst a number of ontologies exist, none of these perfectly represent the entire human phenome and exposome. Mapping trait names across large datasets is therefore time-consuming and challenging. Recent developments in language modelling have created new methods for semantic representation of words and phrases, and these methods offer new opportunities to map human trait names in the form of words and short phrases, both to ontologies and to each other. Here, we present a comparison between a range of established and more recent language modelling approaches for the task of mapping trait names from UK Biobank to the Experimental Factor Ontology (EFO), and also explore how they compare to each other in direct trait-to-trait mapping. RESULTS: In our analyses of 1191 traits from UK Biobank with manual EFO mappings, the BioSentVec model performed best at predicting these, matching 40.3% of the manual mappings correctly. The BlueBERT-EFO model (finetuned on EFO) performed nearly as well (38.8% of traits matching the manual mapping). In contrast, Levenshtein edit distance only mapped 22% of traits correctly. Pairwise mapping of traits to each other demonstrated that many of the models can accurately group similar traits based on their semantic similarity. AVAILABILITY AND IMPLEMENTATION: Our code is available at https://github.com/MRCIEU/vectology.
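As a rough illustration of the Levenshtein baseline mentioned in the abstract, the sketch below maps a trait name to the closest label in a small candidate list by edit distance. It is a minimal sketch, not the authors' pipeline: the function names `levenshtein` and `map_trait`, the lowercasing step, and the three example labels are assumptions made for illustration and are not taken from EFO or from the vectology repository.

```python
from __future__ import annotations


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def map_trait(trait: str, candidate_labels: list[str]) -> tuple[str, int]:
    """Return the candidate label closest to `trait`, with its edit distance.

    Lowercasing and stripping are illustrative normalisation choices (assumption).
    Ties are broken alphabetically by label because tuples are compared in order.
    """
    query = trait.lower().strip()
    scored = [(levenshtein(query, label.lower()), label) for label in candidate_labels]
    distance, best_label = min(scored)
    return best_label, distance


# Hypothetical candidate labels, for illustration only.
EXAMPLE_LABELS = ["body mass index", "body height", "asthma"]
```

Because edit distance compares characters rather than meaning, a trait worded differently from its ontology label tends to score poorly under this approach, which is the kind of gap the embedding-based models compared in the abstract are meant to close.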


Subject(s)
Biological Ontologies, Semantics, Humans, Language, Natural Language Processing, Phenotype
2.
Bioinformatics ; 33(2): 272-279, 2017 01 15.
Article in English | MEDLINE | ID: mdl-27663502

ABSTRACT

MOTIVATION: LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously. RESULTS: In this manuscript, we describe LD Hub - a centralized database of summary-level GWAS results for 173 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub; and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies. 
AVAILABILITY AND IMPLEMENTATION: The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/ CONTACT: jie.zheng@bristol.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
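The idea behind LD score regression can be sketched in a few lines. Under the standard model, the expected association statistic of SNP j is E[chi2_j] = 1 + N*a + (N*h2/M)*l_j, where l_j is the SNP's LD score, N the sample size, M the number of SNPs, h2 the SNP heritability, and a a confounding term. Regressing chi-square statistics on LD scores therefore gives h2 from the slope. The code below is a simplified, unweighted ordinary least squares sketch under that assumption; it is not the LD Hub pipeline, which relies on the ldsc software with regression weights and jackknife standard errors. The function names are hypothetical.

```python
from __future__ import annotations


def ols_fit(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def snp_heritability(
    ld_scores: list[float],
    chi2: list[float],
    n_samples: int,
    n_snps: int,
) -> tuple[float, float]:
    """Estimate SNP heritability from the regression slope.

    slope = N * h2 / M, so h2 = slope * M / N. The intercept estimates 1 + N*a
    and is returned for inspection. Unweighted fit: a simplification (assumption).
    """
    slope, intercept = ols_fit(ld_scores, chi2)
    h2 = slope * n_snps / n_samples
    return h2, intercept


def simulate_chi2(ld_scores: list[float], h2: float, n_samples: int, n_snps: int) -> list[float]:
    """Noise-free synthetic chi-square values from the model (toy data, no confounding)."""
    return [1.0 + n_samples * h2 / n_snps * score for score in ld_scores]
```

With noise-free synthetic input from `simulate_chi2`, `snp_heritability` recovers the heritability used to generate the data; real GWAS summary statistics are noisy and call for the weighting used by dedicated software.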


Subject(s)
Nucleic Acid Databases, Inborn Genetic Diseases/genetics, Genome-Wide Association Study/methods, Linkage Disequilibrium, Phenotype, Single Nucleotide Polymorphism, Female, Genetic Predisposition to Disease, Humans, Male, Sample Size, Software
3.
Gigascience ; 7(8)2018 08 01.
Article in English | MEDLINE | ID: mdl-30165448

ABSTRACT

Background: Identifying phenotypic correlations between complex traits and diseases can provide useful etiological insights. Restricted access to much individual-level phenotype data makes it difficult to estimate large-scale phenotypic correlation across the human phenome. Two state-of-the-art methods, metaCCA and LD score regression, provide an alternative approach to estimate phenotypic correlation using only genome-wide association study (GWAS) summary results. Results: Here, we present an integrated R toolkit, PhenoSpD, to use LD score regression to estimate phenotypic correlations using GWAS summary statistics and to utilize the estimated phenotypic correlations to inform correction of multiple testing for complex human traits using the spectral decomposition of matrices (SpD). The simulations suggest that it is possible to identify nonindependence of phenotypes using samples with partial overlap; as overlap decreases, the estimated phenotypic correlations will attenuate toward zero and multiple testing correction will be more stringent than in perfectly overlapping samples. Also, in contrast to LD score regression, metaCCA will provide approximate genetic correlations rather than phenotypic correlation, which limits its application for multiple testing correction. In a case study, PhenoSpD using UK Biobank GWAS results suggested 399.6 independent tests among 487 human traits, which is close to the 352.4 independent tests estimated using true phenotypic correlation. We further applied PhenoSpD to an estimated 5,618 pair-wise phenotypic correlations among 107 metabolites using GWAS summary statistics from Kettunen's publication and PhenoSpD suggested the equivalent of 33.5 independent tests for these metabolites. Conclusions: PhenoSpD extends the use of summary-level results, providing a simple and conservative way to reduce dimensionality for complex human traits using GWAS summary statistics. 
This is particularly valuable in the age of large-scale biobank and consortia studies, where GWAS results are much more accessible than individual-level data.
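To make the notion of "independent tests" concrete, the sketch below computes an effective number of tests from a phenotypic correlation matrix using one widely used spectral decomposition estimator (Nyholt, 2004): Meff = 1 + (M - 1) * (1 - Var(lambda) / M), where lambda are the eigenvalues of the M x M correlation matrix. Whether PhenoSpD applies exactly this variant is an assumption here, and the function names are hypothetical. For a symmetric correlation matrix with unit diagonal, the eigenvalues sum to M and their squares sum to the sum of squared matrix entries, so the eigenvalue variance can be obtained without an eigen solver.

```python
from __future__ import annotations


def eigenvalue_variance(corr: list[list[float]]) -> float:
    """Sample variance of the eigenvalues of a symmetric correlation matrix.

    Uses trace identities instead of an explicit decomposition:
    sum(lambda) = trace(R) = M (unit diagonal), and
    sum(lambda^2) = trace(R @ R) = sum of squared entries of R.
    Hence sum((lambda - 1)^2) = sum of squared entries - M.
    """
    m = len(corr)
    sum_sq_entries = sum(value * value for row in corr for value in row)
    return (sum_sq_entries - m) / (m - 1)


def effective_tests_nyholt(corr: list[list[float]]) -> float:
    """Effective number of independent tests, Nyholt (2004) estimator."""
    m = len(corr)
    if m == 1:
        return 1.0
    return 1.0 + (m - 1) * (1.0 - eigenvalue_variance(corr) / m)


def corrected_threshold(alpha: float, effective_tests: float) -> float:
    """Bonferroni-style per-test threshold using the effective number of tests."""
    return alpha / effective_tests
```

Two limiting cases show the behaviour: fully uncorrelated traits (identity matrix) give Meff equal to the number of traits, while perfectly correlated traits give Meff equal to 1. Correlations attenuated toward zero therefore push Meff upward, consistent with the abstract's remark that partial sample overlap makes the correction more stringent.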


Subject(s)
Genome-Wide Association Study/methods, Single Nucleotide Polymorphism, Software, Humans, Linkage Disequilibrium, Genetic Models
4.
Nat Commun ; 9(1): 2897, 2018 07 24.
Article in English | MEDLINE | ID: mdl-30042390

ABSTRACT

The cellular and molecular basis of stromal cell recruitment, activation and crosstalk in carcinomas is poorly understood, limiting the development of targeted anti-stromal therapies. In mouse models of triple negative breast cancer (TNBC), Hedgehog ligand produced by neoplastic cells reprograms cancer-associated fibroblasts (CAFs) to provide a supportive niche for the acquisition of a chemo-resistant, cancer stem cell (CSC) phenotype via FGF5 expression and production of fibrillar collagen. Stromal treatment of patient-derived xenografts with smoothened inhibitors (SMOi) downregulates CSC markers expression and sensitizes tumors to docetaxel, leading to markedly improved survival and reduced metastatic burden. In the phase I clinical trial EDALINE, 3 of 12 patients with metastatic TNBC derived clinical benefit from combination therapy with the SMOi Sonidegib and docetaxel chemotherapy, with one patient experiencing a complete response. These studies identify Hedgehog signaling to CAFs as a novel mediator of CSC plasticity and an exciting new therapeutic target in TNBC.


Subject(s)
Antineoplastic Combined Chemotherapy Protocols/therapeutic use, Neoplasm Drug Resistance/drug effects, Neoplastic Stem Cells/drug effects, Triple Negative Breast Neoplasms/drug therapy, Adult, Aged, Anilides/administration & dosage, Animals, Biphenyl Compounds/administration & dosage, Tumor Cell Line, Docetaxel/administration & dosage, Female, Humans, Inbred NOD Mice, Knockout Mice, SCID Mice, Middle Aged, Neoplastic Stem Cells/metabolism, Pyridines/administration & dosage, Treatment Outcome, Triple Negative Breast Neoplasms/genetics, Triple Negative Breast Neoplasms/metabolism, Xenograft Model Antitumor Assays