Results 1 - 20 of 44
1.
Cell ; 184(8): 2068-2083.e11, 2021 04 15.
Article in English | MEDLINE | ID: mdl-33861964

ABSTRACT

Understanding population health disparities is an essential component of equitable precision health efforts. Epidemiology research often relies on definitions of race and ethnicity, but these population labels may not adequately capture disease burdens and environmental factors impacting specific sub-populations. Here, we propose a framework for repurposing data from electronic health records (EHRs) in concert with genomic data to explore the demographic ties that can impact disease burdens. Using data from a diverse biobank in New York City, we identified 17 communities sharing recent genetic ancestry. We observed 1,177 health outcomes that were statistically associated with a specific group and demonstrated significant differences in the segregation of genetic variants contributing to Mendelian diseases. We also demonstrated that fine-scale population structure can impact the prediction of complex disease risk within groups. This work reinforces the utility of linking genomic data to EHRs and provides a framework toward fine-scale monitoring of population health.


Subject(s)
Ethnicity/genetics , Population Health , Databases, Genetic , Electronic Health Records , Genomics , Humans , Self Report
2.
Am J Hum Genet ; 109(4): 669-679, 2022 04 07.
Article in English | MEDLINE | ID: mdl-35263625

ABSTRACT

One mechanism by which genetic factors influence complex traits and diseases is altering gene expression. Direct measurement of gene expression in relevant tissues is rarely tenable; however, genetically regulated gene expression (GReX) can be estimated using prediction models derived from large multi-omic datasets. These approaches have led to the discovery of many gene-trait associations, but whether models derived from predominantly European ancestry (EA) reference panels can map novel associations in ancestrally diverse populations remains unclear. We applied PrediXcan to impute GReX in 51,520 ancestrally diverse Population Architecture using Genomics and Epidemiology (PAGE) participants (35% African American, 45% Hispanic/Latino, 10% Asian, and 7% Hawaiian) across 25 key cardiometabolic traits and relevant tissues to identify 102 novel associations. We then compared associations in PAGE to those in a random subset of 50,000 White British participants from UK Biobank (UKBB50k) for height and body mass index (BMI). We identified 517 associations across 47 tissues in PAGE but not UKBB50k, demonstrating the importance of diverse samples in identifying trait-associated GReX. We observed that variants used in PrediXcan models were either more or less differentiated across continental-level populations than matched-control variants depending on the specific population reflecting sampling bias. Additionally, variants from identified genes specific to either PAGE or UKBB50k analyses were more ancestrally differentiated than those in genes detected in both analyses, underlining the value of population-specific discoveries. This suggests that while EA-derived transcriptome imputation models can identify new associations in non-EA populations, models derived from closely matched reference panels may yield further insights. Our findings call for more diversity in reference datasets of tissue-specific gene expression.


Subject(s)
Cardiovascular Diseases , Genome-Wide Association Study , Genetic Predisposition to Disease , Humans , Life Style , Polymorphism, Single Nucleotide , Transcriptome
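
Note on record 2: the abstract describes estimating genetically regulated gene expression (GReX) from prediction models. The following is a minimal sketch of that idea, assuming the model for one gene in one tissue reduces to per-variant weights applied to allele dosages (0 to 2). The variant IDs, weights, and dosages are made up for illustration and are not taken from PrediXcan or PAGE; `impute_grex` is a hypothetical helper, not part of any published package.

```python
import numpy as np

# Hypothetical per-variant weights for one gene in one tissue (illustrative values only).
weights = {"rsA": 0.30, "rsB": -0.15, "rsC": 0.05}

def impute_grex(dosages, variant_ids, weights):
    """Predicted expression = sum over model variants of weight * dosage.

    dosages: array of shape (n_individuals, n_variants), entries in [0, 2].
    Variants that have no weight in the model contribute nothing.
    """
    w = np.array([weights.get(v, 0.0) for v in variant_ids])
    return np.asarray(dosages, dtype=float) @ w

variant_ids = ["rsA", "rsB", "rsC", "rsD"]  # rsD is not in the model
dosages = np.array([[0, 1, 2, 1],
                    [2, 0, 0, 2],
                    [1, 1, 1, 0]])
grex = impute_grex(dosages, variant_ids, weights)  # one predicted value per individual
```

The imputed values can then be tested for association with a trait in place of measured expression, which is the step the abstract applies across traits and tissues.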
3.
Nature ; 570(7762): 514-518, 2019 06.
Article in English | MEDLINE | ID: mdl-31217584

ABSTRACT

Genome-wide association studies (GWAS) have laid the foundation for investigations into the biology of complex traits, drug development and clinical guidelines. However, the majority of discovery efforts are based on data from populations of European ancestry [1-3]. In light of the differential genetic architecture that is known to exist between populations, bias in representation can exacerbate existing disease and healthcare disparities. Critical variants may be missed if they have a low frequency or are completely absent in European populations, especially as the field shifts its attention towards rare variants, which are more likely to be population-specific [4-10]. Additionally, effect sizes and the risk prediction scores derived from them in one population may not accurately extrapolate to other populations [11,12]. Here we demonstrate the value of diverse, multi-ethnic participants in large-scale genomic studies. The Population Architecture using Genomics and Epidemiology (PAGE) study conducted a GWAS of 26 clinical and behavioural phenotypes in 49,839 non-European individuals. Using strategies tailored for analysis of multi-ethnic and admixed populations, we describe a framework for analysing diverse populations, identify 27 novel loci and 38 secondary signals at known loci, as well as replicate 1,444 GWAS catalogue associations across these traits. Our data show evidence of effect-size heterogeneity across ancestries for published GWAS associations, substantial benefits for fine-mapping using diverse cohorts and insights into clinical implications. In the United States, where minority populations have a disproportionately higher burden of chronic conditions [13], the lack of representation of diverse populations in genetic research will result in inequitable access to precision medicine for those with the highest burden of disease. We strongly advocate for continued, large genome-wide efforts in diverse populations to maximize genetic discovery and reduce health disparities.


Subject(s)
Asian People/genetics , Black People/genetics , Genome-Wide Association Study/methods , Hispanic or Latino/genetics , Minority Groups , Multifactorial Inheritance/genetics , Women's Health , Body Height/genetics , Cohort Studies , Female , Genetics, Medical/methods , Health Equity/trends , Health Status Disparities , Humans , Male , United States
4.
PLoS Genet ; 16(3): e1008684, 2020 03.
Article in English | MEDLINE | ID: mdl-32226016

ABSTRACT

Lipid levels are important markers for the development of cardio-metabolic diseases. Although hundreds of associated loci have been identified through genetic association studies, the contribution of genetic factors to variation in lipids is not fully understood, particularly in U.S. minority groups. We performed genome-wide association analyses for four lipid traits in over 45,000 ancestrally diverse participants from the Population Architecture using Genomics and Epidemiology (PAGE) Study, followed by a meta-analysis with several European ancestry studies. We identified nine novel lipid loci, five of which showed evidence of replication in independent studies. Furthermore, we discovered one novel gene in a PrediXcan analysis, minority-specific independent signals at eight previously reported loci, and potential functional variants at two known loci through fine-mapping. Systematic examination of known lipid loci revealed smaller effect estimates in African American and Hispanic ancestry populations than those in Europeans, and better performance of polygenic risk scores based on minority-specific effect estimates. Our findings provide new insight into the genetic architecture of lipid traits and highlight the importance of conducting genetic studies in diverse populations in the era of precision medicine.


Subject(s)
Lipids/blood , Lipids/genetics , Racial Groups/genetics , Databases, Genetic , Female , Genome-Wide Association Study/methods , Genotype , Humans , Lipids/analysis , Male , Metagenomics/methods , Minority Groups , Multifactorial Inheritance/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , United States/epidemiology
5.
Bioinformatics ; 37(19): 3372-3373, 2021 Oct 11.
Article in English | MEDLINE | ID: mdl-33774671

ABSTRACT

SUMMARY: Finding informative predictive features in high-dimensional biological case-control datasets is challenging. The Extreme Pseudo-Sampling (EPS) algorithm offers a solution to the challenge of feature selection via a combination of deep learning and linear regression models. First, using a variational autoencoder, it generates complex latent representations for the samples. Second, it classifies the latent representations of cases and controls via logistic regression. Third, it generates new samples (pseudo-samples) around the extreme cases and controls in the regression model. Finally, it trains a new regression model over the upsampled space. The most significant variables in this regression are selected. We present an open-source implementation of the algorithm that is easy to set up, use and customize. Our package enhances the original algorithm by providing new features and customizability for data preparation, model training and classification functionalities. We believe the new features will enable the adoption of the algorithm for a diverse range of datasets. AVAILABILITY AND IMPLEMENTATION: The software package for Python is available online at https://github.com/roohy/eps. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

6.
Sensors (Basel) ; 21(17)2021 Aug 28.
Article in English | MEDLINE | ID: mdl-34502692

ABSTRACT

Many approaches to time series classification rely on machine learning methods. However, there is growing interest in going beyond black box prediction models to understand discriminatory features of the time series and their associations with outcomes. One promising method is time-series shapelets (TSS), which identifies maximally discriminative subsequences of time series. For example, in environmental health applications TSS could be used to identify short-term patterns in exposure time series (shapelets) associated with adverse health outcomes. Identification of candidate shapelets in TSS is computationally intensive. The original TSS algorithm used exhaustive search. Subsequent algorithms introduced efficiencies by trimming/aggregating the set of candidates or training candidates from initialized values, but these approaches have limitations. In this paper, we introduce Wavelet-TSS (W-TSS), a novel intelligent method for identifying candidate shapelets in TSS using wavelet transformation discovery. We tested W-TSS on two datasets: (1) a synthetic example used in previous TSS studies and (2) a panel study relating exposures from residential air pollution sensors to symptoms in participants with asthma. Compared to previous TSS algorithms, W-TSS was more computationally efficient, more accurate, and was able to discover more discriminative shapelets. W-TSS does not require pre-specification of shapelet length.


Subject(s)
Air Pollution , Algorithms , Humans , Machine Learning , Research Design
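
Note on record 6: the abstract refers to shapelets as discriminative subsequences of a time series. The sketch below shows only the distance that shapelet methods commonly use to score a candidate against a series, namely the smallest Euclidean distance over all sliding windows of the candidate's length. It does not implement the wavelet-based candidate generation of W-TSS; the function name and the example values are illustrative assumptions.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Return (min distance, start index) of the best-matching window.

    The distance between a series and a shapelet is taken as the minimum
    Euclidean distance between the shapelet and every contiguous
    subsequence of the series that has the same length.
    """
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    if shapelet.size == 0 or shapelet.size > series.size:
        raise ValueError("shapelet must be non-empty and no longer than the series")
    windows = np.lib.stride_tricks.sliding_window_view(series, shapelet.size)
    dists = np.sqrt(((windows - shapelet) ** 2).sum(axis=1))
    best = int(np.argmin(dists))
    return float(dists[best]), best

# Illustrative exposure series: one contains a short spike, the other is flat.
spike = [1.0, 3.0, 1.0]
d_match, start = shapelet_distance([0, 0, 1, 3, 1, 0, 0], spike)
d_flat, _ = shapelet_distance([0, 0, 0, 0], spike)
```

A candidate is useful when this distance separates the outcome groups well, for example when it is small for series linked to symptoms and large for the rest.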
7.
Hum Mol Genet ; 27(16): 2940-2953, 2018 08 15.
Article in English | MEDLINE | ID: mdl-29878111

ABSTRACT

C-reactive protein (CRP) is a circulating biomarker indicative of systemic inflammation. We aimed to evaluate genetic associations with CRP levels among non-European-ancestry populations through discovery, fine-mapping and conditional analyses. A total of 30 503 non-European-ancestry participants from 6 studies participating in the Population Architecture using Genomics and Epidemiology study had serum high-sensitivity CRP measurements and ∼200 000 single nucleotide polymorphisms (SNPs) genotyped on the Metabochip. We evaluated the association between each SNP and log-transformed CRP levels using multivariate linear regression, with additive genetic models adjusted for age, sex, the first four principal components of genetic ancestry, and study-specific factors. Differential linkage disequilibrium patterns between race/ethnicity groups were used to fine-map regions associated with CRP levels. Conditional analyses evaluated for multiple independent signals within genetic regions. One hundred and sixty-three unique variants in 12 loci in overall or race/ethnicity-stratified Metabochip-wide scans reached a Bonferroni-corrected P-value <2.5E-7. Three loci have no (HACL1, OLFML2B) or only limited (PLA2G6) previous associations with CRP levels. Six loci had different top hits in race/ethnicity-specific versus overall analyses. Fine-mapping refined the signal in six loci, particularly in HNF1A. Conditional analyses provided evidence for secondary signals in LEPR, IL1RN and HNF1A, and for multiple independent signals in CRP and APOE. We identified novel variants and loci associated with CRP levels, generalized known CRP associations to a multiethnic study population, refined association signals at several loci and found evidence for multiple independent signals at several well-known loci. This study demonstrates the benefit of conducting inclusive genetic association studies in large multiethnic populations.


Subject(s)
C-Reactive Protein/genetics , Genome-Wide Association Study , Metagenomics , Molecular Epidemiology/methods , Carbon-Carbon Lyases , Enoyl-CoA Hydratase/genetics , Female , Glycoproteins/genetics , Group VI Phospholipases A2/genetics , Humans , Linkage Disequilibrium , Male , Polymorphism, Single Nucleotide , White People/genetics
8.
Neuroimage ; 124(Pt B): 1196-1201, 2016 Jan 01.
Article in English | MEDLINE | ID: mdl-26087378

ABSTRACT

In this paper, we describe an instance of the Northwestern University Schizophrenia Data and Software Tool (NUSDAST), a schizophrenia-related dataset hosted at XNAT Central, and the SchizConnect data portal used for accessing and sharing the dataset. NUSDAST was built and extended upon existing, standard schemas available for data sharing on XNAT Central (http://central.xnat.org/). With the creation of SchizConnect, we were able to link NUSDAST to other neuroimaging data sources and create a powerful, federated neuroimaging resource.


Subject(s)
Databases, Factual , Information Dissemination , Schizophrenia/pathology , Adult , Female , Genotype , Humans , Internet , Longitudinal Studies , Magnetic Resonance Imaging , Male , Neuroimaging , Schizophrenia/genetics , Schizophrenic Psychology
9.
Neuroimage ; 124(Pt B): 1155-1167, 2016 Jan 01.
Article in English | MEDLINE | ID: mdl-26142271

ABSTRACT

SchizConnect (www.schizconnect.org) is built to address the issues of multiple data repositories in schizophrenia neuroimaging studies. It includes a level of mediation--translating across data sources--so that the user can place one query, e.g. for diffusion images from male individuals with schizophrenia, and find out from across participating data sources how many datasets there are, as well as downloading the imaging and related data. The current version handles the Data Usage Agreements across different studies, as well as interpreting database-specific terminologies into a common framework. New data repositories can also be mediated to bring immediate access to existing datasets. Compared with centralized, upload data sharing models, SchizConnect is a unique, virtual database with a focus on schizophrenia and related disorders that can mediate live data as information is being updated at each data source. It is our hope that SchizConnect can facilitate testing new hypotheses through aggregated datasets, promoting discovery related to the mechanisms underlying schizophrenic dysfunction.


Subject(s)
Databases, Factual , Datasets as Topic , Information Dissemination/methods , Neuroimaging , Schizophrenia/pathology , Adolescent , Adult , Aged , Child , Database Management Systems , Female , Humans , Internet , Male , Middle Aged , Terminology as Topic , User-Computer Interface , Young Adult
10.
PLoS Genet ; 9(1): e1003087, 2013.
Article in English | MEDLINE | ID: mdl-23382687

ABSTRACT

Using a phenome-wide association study (PheWAS) approach, we comprehensively tested genetic variants for association with phenotypes available for 70,061 study participants in the Population Architecture using Genomics and Epidemiology (PAGE) network. Our aim was to better characterize the genetic architecture of complex traits and identify novel pleiotropic relationships. This PheWAS drew on five population-based studies representing four major racial/ethnic groups (European Americans (EA), African Americans (AA), Hispanics/Mexican-Americans, and Asian/Pacific Islanders) in PAGE, each site with measurements for multiple traits, associated laboratory measures, and intermediate biomarkers. A total of 83 single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) were genotyped across two or more PAGE study sites. Comprehensive tests of association, stratified by race/ethnicity, were performed, encompassing 4,706 phenotypes mapped to 105 phenotype-classes, and association results were compared across study sites. A total of 111 PheWAS results had significant associations for two or more PAGE study sites with consistent direction of effect with a significance threshold of p<0.01 for the same racial/ethnic group, SNP, and phenotype-class. Among results identified for SNPs previously associated with phenotypes such as lipid traits, type 2 diabetes, and body mass index, 52 replicated previously published genotype-phenotype associations, 26 represented phenotypes closely related to previously known genotype-phenotype associations, and 33 represented potentially novel genotype-phenotype associations with pleiotropic effects. The majority of the potentially novel results were for single PheWAS phenotype-classes, for example, for CDKN2A/B rs1333049 (previously associated with type 2 diabetes in EA) a PheWAS association was identified for hemoglobin levels in AA. Of note, however, GALNT2 rs2144300 (previously associated with high-density lipoprotein cholesterol levels in EA) had multiple potentially novel PheWAS associations, with hypertension related phenotypes in AA and with serum calcium levels and coronary artery disease phenotypes in EA. PheWAS identifies associations for hypothesis generation and exploration of the genetic architecture of complex traits.


Subject(s)
Genetic Association Studies , Genetic Pleiotropy , Genetic Predisposition to Disease , Genome-Wide Association Study , Calcium/blood , Coronary Artery Disease/genetics , Cyclin-Dependent Kinase Inhibitor p16/genetics , Ethnicity/genetics , Gene Regulatory Networks , Genomics , Hemoglobins/genetics , Humans , Hypertension/genetics , N-Acetylgalactosaminyltransferases , Phenotype , Polymorphism, Single Nucleotide/genetics , Polypeptide N-acetylgalactosaminyltransferase
11.
Gut ; 63(5): 800-7, 2014 May.
Article in English | MEDLINE | ID: mdl-23935004

ABSTRACT

OBJECTIVE: Genome-wide association studies have identified a large number of single nucleotide polymorphisms (SNPs) associated with a wide array of cancer sites. Several of these variants demonstrate associations with multiple cancers, suggesting pleiotropic effects and shared biological mechanisms across some cancers. We hypothesised that SNPs previously associated with other cancers may additionally be associated with colorectal cancer. In a large-scale study, we examined 171 SNPs previously associated with 18 different cancers for their associations with colorectal cancer. DESIGN: We examined 13 338 colorectal cancer cases and 40 967 controls from three consortia: Population Architecture using Genomics and Epidemiology (PAGE), Genetic Epidemiology of Colorectal Cancer (GECCO), and the Colon Cancer Family Registry (CCFR). Study-specific logistic regression results, adjusted for age, sex, principal components of genetic ancestry, and/or study specific factors (as relevant) were combined using fixed-effect meta-analyses to evaluate the association between each SNP and colorectal cancer risk. A Bonferroni-corrected p value of 2.92×10^-4 was used to determine statistical significance of the associations. RESULTS: Two correlated SNPs (rs10090154 and rs4242382) in Region 1 of chromosome 8q24, a prostate cancer susceptibility region, demonstrated statistically significant associations with colorectal cancer risk. The most significant association was observed with rs4242382 (meta-analysis OR=1.12; 95% CI 1.07 to 1.18; p=1.74×10^-5), which also demonstrated similar associations across racial/ethnic populations and anatomical sub-sites. CONCLUSIONS: This is the first study to clearly demonstrate Region 1 of chromosome 8q24 as a susceptibility locus for colorectal cancer; thus, adding colorectal cancer to the list of cancer sites linked to this particular multicancer risk region at 8q24.


Subject(s)
Colorectal Neoplasms/genetics , Genetic Pleiotropy , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Aged , Chromosomes, Human, Pair 8 , Female , Genetic Markers , Genome-Wide Association Study , Genotyping Techniques , Humans , Logistic Models , Male , Middle Aged , Principal Component Analysis , Registries , Risk Factors
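
Note on record 11: the Bonferroni-corrected threshold quoted in the abstract can be reproduced with simple arithmetic, assuming a family-wise significance level of 0.05 divided across the 171 SNPs tested. The 0.05 level is an assumption here, since the abstract reports only the resulting threshold.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

threshold = bonferroni_threshold(0.05, 171)  # alpha = 0.05 is an assumption
top_hit_p = 1.74e-5                          # p value reported for rs4242382
is_significant = top_hit_p < threshold
```

Under that assumption the threshold comes to roughly 2.92×10^-4, in line with the value given in the abstract, and the reported p value for rs4242382 falls below it.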
12.
Commun Biol ; 7(1): 400, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38565955

ABSTRACT

Unlocking the full dimensionality of single-cell RNA sequencing data (scRNAseq) is the next frontier to a richer, fuller understanding of cell biology. We introduce q-diffusion, a framework for capturing the coexpression structure of an entire library of genes, improving on state-of-the-art analysis tools. The method is demonstrated via three case studies. In the first, q-diffusion helps gain statistical significance for differential effects on patient outcomes when analyzing the CALGB/SWOG 80405 randomized phase III clinical trial, suggesting precision guidance for the treatment of metastatic colorectal cancer. Secondly, q-diffusion is benchmarked against existing scRNAseq classification methods using an in vitro PBMC dataset, in which the proposed method discriminates IFN-γ stimulation more accurately. The same case study demonstrates improvements in unsupervised cell clustering with the recent Tabula Sapiens human atlas. Finally, a local distributional segmentation approach for spatial scRNAseq, driven by q-diffusion, yields interpretable structures of human cortical tissue.


Subject(s)
Leukocytes, Mononuclear , Single-Cell Analysis , Humans , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Cluster Analysis
13.
Am J Epidemiol ; 178(5): 780-90, 2013 Sep 01.
Article in English | MEDLINE | ID: mdl-23820787

ABSTRACT

Common obesity risk variants have been associated with macronutrient intake; however, these associations' generalizability across populations has not been demonstrated. We investigated the associations between 6 obesity risk variants in (or near) the NEGR1, TMEM18, BDNF, FTO, MC4R, and KCTD15 genes and macronutrient intake (carbohydrate, protein, ethanol, and fat) in 3 Population Architecture using Genomics and Epidemiology (PAGE) studies: the Multiethnic Cohort Study (1993-2006) (n = 19,529), the Atherosclerosis Risk in Communities Study (1987-1989) (n = 11,114), and the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) Study, which accesses data from the Third National Health and Nutrition Examination Survey (1991-1994) (n = 6,347). We used linear regression, with adjustment for age, sex, and ethnicity, to estimate the associations between obesity risk genotypes and macronutrient intake. A fixed-effects meta-analysis model showed that the FTO rs8050136 A allele (n = 36,973) was positively associated with percentage of calories derived from fat (βmeta = 0.2244 (standard error, 0.0548); P = 4 × 10^-5) and inversely associated with percentage of calories derived from carbohydrate (βmeta = -0.2796 (standard error, 0.0709); P = 8 × 10^-5). In the Multiethnic Cohort Study, percentage of calories from fat assessed at baseline was a partial mediator of the rs8050136 effect on body mass index (weight (kg)/height (m)^2) obtained at 10 years of follow-up (mediation of effect = 0.0823 kg/m^2, 95% confidence interval: 0.0559, 0.1128). Our data provide additional evidence that the association of FTO with obesity is partially mediated by dietary intake.


Subject(s)
Dietary Fats/administration & dosage , Energy Intake , Ethnicity/genetics , Obesity/genetics , Proteins/genetics , Racial Groups/genetics , Adult , Aged , Alpha-Ketoglutarate-Dependent Dioxygenase FTO , Diet , Female , Genotype , Humans , Male , Middle Aged , Obesity/ethnology , Polymorphism, Single Nucleotide , Risk Factors
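
Note on record 13: the abstract reports pooled effect estimates with standard errors and P values from a fixed-effects meta-analysis. The sketch below shows inverse-variance pooling and a two-sided Wald test under a normal approximation. The abstract does not state how its P values were computed, so the Wald test is an assumption; the two (estimate, standard error) pairs are the ones quoted in the abstract, and the function names are illustrative.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    return pooled, math.sqrt(1.0 / total)

def wald_p_value(beta, se):
    """Two-sided P value for H0: beta = 0, using a standard normal reference."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))

# Pooled estimates quoted in the abstract for FTO rs8050136.
p_fat = wald_p_value(0.2244, 0.0548)    # percentage of calories from fat
p_carb = wald_p_value(-0.2796, 0.0709)  # percentage of calories from carbohydrate
```

Under this assumption both values round to the P values reported in the abstract (4 × 10^-5 and 8 × 10^-5). `fixed_effect_meta` would take the study-specific estimates, which the abstract does not list.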
14.
Ann Hum Genet ; 77(5): 416-25, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23808484

ABSTRACT

Numerous common genetic variants that influence plasma high-density lipoprotein cholesterol, low-density lipoprotein cholesterol (LDL-C), and triglyceride distributions have been identified via genome-wide association studies (GWAS). However, whether or not these associations are age-dependent has largely been overlooked. We conducted an association study and meta-analysis in more than 22,000 European Americans between 49 previously identified GWAS variants and the three lipid traits, stratified by age (males: <50 or ≥50 years of age; females: pre- or postmenopausal). For each variant, a test of heterogeneity was performed between the two age strata and significant Phet values were used as evidence of age-specific genetic effects. We identified seven associations in females and eight in males that displayed suggestive heterogeneity by age (Phet < 0.05). The association between rs174547 (FADS1) and LDL-C in males displayed the most evidence for heterogeneity between age groups (Phet = 1.74E-03, I^2 = 89.8), with a significant association in older males (P = 1.39E-06) but not younger males (P = 0.99). However, none of the suggestive modifying effects survived adjustment for multiple testing, highlighting the challenges of identifying modifiers of modest SNP-trait associations despite large sample sizes.


Subject(s)
Genome-Wide Association Study , Lipids/blood , Quantitative Trait Loci , Quantitative Trait, Heritable , Adult , Aged , Delta-5 Fatty Acid Desaturase , Female , Genetic Association Studies , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide , Risk Factors , White People/genetics
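
Note on record 14: the abstract reports a heterogeneity P value and an I^2 statistic for a comparison of two age strata. The sketch below assumes the test was Cochran's Q with one degree of freedom, which the abstract does not name. For two strata, Q reduces to the squared difference of the estimates divided by the sum of their variances. The function names and the inputs are illustrative.

```python
import math

def heterogeneity_two_strata(beta1, se1, beta2, se2):
    """Cochran's Q, I^2 (percent) and P value for two stratum estimates (df = 1)."""
    q = (beta1 - beta2) ** 2 / (se1 ** 2 + se2 ** 2)
    i2 = max(0.0, (q - 1.0) / q) * 100.0 if q > 0 else 0.0
    # Chi-square survival function with 1 degree of freedom.
    p_het = math.erfc(math.sqrt(q / 2.0))
    return q, i2, p_het

def q_from_i2(i2_percent, df=1):
    """Invert I^2 = (Q - df) / Q to recover Q from a reported I^2 below 100."""
    return df / (1.0 - i2_percent / 100.0)

# Back-calculation from the I^2 reported for rs174547 (FADS1) and LDL-C in males.
q_reported = q_from_i2(89.8)
p_from_i2 = math.erfc(math.sqrt(q_reported / 2.0))
```

Under these assumptions an I^2 of 89.8 corresponds to Q of about 9.8 and a P value close to the reported Phet of 1.74E-03, so the two reported statistics are consistent with each other.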
15.
Hum Genet ; 132(12): 1427-31, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24100633

ABSTRACT

Genome-wide association studies (GWAS) have identified many variants that influence high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, and/or triglycerides. However, environmental modifiers, such as smoking, of these known genotype-phenotype associations are just recently emerging in the literature. We have tested for interactions between smoking and 49 GWAS-identified variants in over 41,000 racially/ethnically diverse samples with lipid levels from the Population Architecture Using Genomics and Epidemiology (PAGE) study. Despite their biological plausibility, we were unable to detect significant SNP × smoking interactions.


Subject(s)
Ethnicity/genetics , Gene-Environment Interaction , Genome-Wide Association Study/statistics & numerical data , Lipid Metabolism/genetics , Polymorphism, Single Nucleotide , Smoking/genetics , Cholesterol, HDL/metabolism , Cholesterol, LDL/metabolism , Cohort Studies , Female , Gene Frequency , Genetics, Population , Humans , Male , Prevalence , Smoking/epidemiology , Smoking/ethnology , Smoking/metabolism , Triglycerides/metabolism , Young Adult
16.
BMC Med Genet ; 14: 98, 2013 Sep 25.
Article in English | MEDLINE | ID: mdl-24063630

ABSTRACT

BACKGROUND: Multiple genome-wide association studies (GWAS) within European populations have implicated common genetic variants associated with insulin and glucose concentrations. In contrast, few studies have been conducted within minority groups, which carry the highest burden of impaired glucose homeostasis and type 2 diabetes in the U.S. METHODS: As part of the Population Architecture using Genomics and Epidemiology (PAGE) Consortium, we investigated the association of up to 10 GWAS-identified single nucleotide polymorphisms (SNPs) in 8 genetic regions with glucose or insulin concentrations in up to 36,579 non-diabetic subjects including 23,323 European Americans (EA) and 7,526 African Americans (AA), 3,140 Hispanics, 1,779 American Indians (AI), and 811 Asians. We estimated the association between each SNP and fasting glucose or log-transformed fasting insulin, followed by meta-analysis to combine results across PAGE sites. RESULTS: Overall, our results show that 9/9 GWAS SNPs are associated with glucose in EA (p = 0.04 to 9 × 10^-15), versus 3/9 in AA (p = 0.03 to 6 × 10^-5), 3/4 SNPs in Hispanics, 2/4 SNPs in AI, and 1/2 SNPs in Asians. For insulin we observed a significant association with rs780094/GCKR in EA, Hispanics and AI only. CONCLUSIONS: Generalization of results across multiple racial/ethnic groups helps confirm the relevance of some of these loci for glucose and insulin metabolism. Lack of association in non-EA groups may be due to insufficient power, or to unique patterns of linkage disequilibrium.


Subject(s)
Blood Glucose/analysis , Genome-Wide Association Study , Insulin/genetics , Adaptor Proteins, Signal Transducing/genetics , Adult , Black or African American/genetics , Aged , Alleles , Asian People/genetics , Diabetes Mellitus, Type 2/epidemiology , Diabetes Mellitus, Type 2/ethnology , Diabetes Mellitus, Type 2/genetics , Female , Gene Frequency , Genetic Loci , Genomics , Hispanic or Latino/genetics , Humans , Indians, North American/genetics , Insulin/blood , Male , Middle Aged , Polymorphism, Single Nucleotide , Transcription Factor 7-Like 2 Protein/genetics , White People/genetics
17.
BMC Genet ; 14: 33, 2013 May 01.
Article in English | MEDLINE | ID: mdl-23634756

ABSTRACT

BACKGROUND: High-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and triglyceride (TG) levels are influenced by both genes and the environment. Genome-wide association studies (GWAS) have identified ~100 common genetic variants associated with HDL-C, LDL-C, and/or TG levels, mostly in populations of European descent, but little is known about the modifiers of these associations. Here, we investigated whether GWAS-identified SNPs for lipid traits exhibited heterogeneity by sex in the Population Architecture using Genomics and Epidemiology (PAGE) study. RESULTS: A sex-stratified meta-analysis was performed for 49 GWAS-identified SNPs for fasting HDL-C, LDL-C, and ln(TG) levels among adults self-identified as European American (n = 25,013). Heterogeneity by sex was established when p(het) < 0.001. There was evidence for heterogeneity by sex for two SNPs for ln(TG) in the APOA1/C3/A4/A5/BUD13 gene cluster: rs28927680 (p(het) = 7.4 × 10^-7) and rs3135506 (p(het) = 4.3 × 10^-4), one SNP in PLTP for HDL levels (rs7679; p(het) = 9.9 × 10^-4), and one in HMGCR for LDL levels (rs12654264; p(het) = 3.1 × 10^-5). We replicated heterogeneity by sex in five of seventeen loci previously reported by genome-wide studies (binomial p = 0.0009). We also present results for other racial/ethnic groups in the supplementary materials, to provide a resource for future meta-analyses. CONCLUSIONS: We provide further evidence for sex-specific effects of SNPs in the APOA1/C3/A4/A5/BUD13 gene cluster, PLTP, and HMGCR on fasting triglyceride levels in European Americans from the PAGE study. Our findings emphasize the need for considering context-specific effects when interpreting genetic associations emerging from GWAS, and also highlight the difficulties in replicating interaction effects across studies and across racial/ethnic groups.


Subject(s)
Genome, Human , Lipids/genetics , Female , Genetic Heterogeneity , Genome-Wide Association Study , Humans , Male , Polymorphism, Single Nucleotide , Population Groups/genetics
18.
Front Neuroinform ; 17: 1216443, 2023.
Article in English | MEDLINE | ID: mdl-37554248

ABSTRACT

Background: Despite the efforts of the neuroscience community, there are many published neuroimaging studies with data that are still not findable or accessible. Users face significant challenges in reusing neuroimaging data due to the lack of provenance metadata, such as experimental protocols, study instruments, and details about the study participants, which is also required for interoperability. To implement the FAIR guidelines for neuroimaging data, we have developed an iterative ontology engineering process and used it to create the NeuroBridge ontology. The NeuroBridge ontology is a computable model of provenance terms for implementing FAIR principles. Together with an international effort to annotate full-text articles with ontology terms, the ontology enables users to locate relevant neuroimaging datasets. Methods: Building on our previous work in metadata modeling, and in concert with an initial annotation of a representative corpus, we modeled diagnosis terms (e.g., schizophrenia, alcohol usage disorder), magnetic resonance imaging (MRI) scan types (T1-weighted, task-based, etc.), clinical symptom assessments (PANSS, AUDIT), and a variety of other assessments. We used the feedback of the annotation team to identify missing metadata terms, which were added to the NeuroBridge ontology, and we restructured the ontology to support both the final annotation of the corpus of neuroimaging articles by a second, independent set of annotators and the functionalities of the NeuroBridge search portal for neuroimaging datasets. Results: The NeuroBridge ontology consists of 660 classes, 49 properties, and 3,200 axioms. The ontology includes mappings to existing ontologies, enabling the NeuroBridge ontology to be interoperable with other domain-specific terminological systems. Using the ontology, we annotated 186 neuroimaging full-text articles describing the participant types, scanning, and clinical and cognitive assessments.
Conclusion: The NeuroBridge ontology is the first computable metadata model that represents the types of data available in recent neuroimaging studies in schizophrenia and substance use disorders research; it can be extended to include more granular terms as needed. This metadata ontology is expected to form the computational foundation both to help investigators make their data FAIR-compliant and to support users in conducting reproducible neuroimaging research.

19.
Pac Symp Biocomput ; 28: 121-132, 2023.
Article in English | MEDLINE | ID: mdl-36540970

ABSTRACT

Groups of distantly related individuals who share a short segment of their genome identical-by-descent (IBD) can provide insights about rare traits and diseases in massive biobanks using IBD mapping. Clustering algorithms play an important role in finding these groups accurately and at scale. We set out to analyze the fitness of commonly used, fast and scalable clustering algorithms for IBD mapping applications. We designed a realistic benchmark for local IBD graphs and utilized it to compare the statistical power of clustering algorithms via simulating 2.3 million clusters across 850 experiments. We found the Infomap and Markov Clustering (MCL) community detection methods to have high statistical power in most of the scenarios. They yield a 30% increase in power compared to the current state-of-the-art approach, with a runtime three orders of magnitude lower. We also found that standard clustering metrics, such as modularity, cannot predict the statistical power of algorithms in IBD mapping applications. We extend our findings to real datasets by analyzing the Population Architecture using Genomics and Epidemiology (PAGE) Study dataset with 51,000 samples and 2 million shared segments on Chromosome 1, resulting in the extraction of 39 million local IBD clusters. We demonstrate the power of our approach by recovering signals of rare genetic variation in the Whole-Exome Sequence data of 200,000 individuals in the UK Biobank. We provide an efficient implementation to enable clustering at scale for IBD mapping for various populations and scenarios. Supplementary Information: The code, along with supplementary methods and figures, is available at https://github.com/roohy/localIBDClustering.
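To make the notion of a local IBD graph concrete, here is a minimal sketch. It assumes IBD segments are given as (sample_a, sample_b, start, end) tuples and builds the graph of samples that share a segment covering one genomic position. The grouping step uses plain connected components as a simple stand-in; it is not Infomap or MCL, and it is not the authors' implementation. All names and the toy data are hypothetical.

```python
from collections import defaultdict


def local_ibd_graph(segments, position):
    """Adjacency sets for samples sharing an IBD segment that covers `position`."""
    graph = defaultdict(set)
    for sample_a, sample_b, start, end in segments:
        if start <= position <= end:
            graph[sample_a].add(sample_b)
            graph[sample_b].add(sample_a)
    return graph


def connected_components(graph):
    """Group samples into clusters by depth-first search (baseline stand-in)."""
    seen, components = set(), []
    for node in sorted(graph):
        if node in seen:
            continue
        seen.add(node)
        stack, members = [node], []
        while stack:
            current = stack.pop()
            members.append(current)
            for neighbor in graph[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(members))
    return components


# Toy pairwise IBD segments: (sample_a, sample_b, start_bp, end_bp).
SEGMENTS = [
    ("s1", "s2", 100, 500),
    ("s2", "s3", 300, 900),
    ("s4", "s5", 50, 400),
    ("s1", "s5", 600, 800),
]

if __name__ == "__main__":
    for locus in (350, 700):
        print(locus, connected_components(local_ibd_graph(SEGMENTS, locus)))
```

Connected components merge any samples linked by a chain of pairwise segments, so a single spurious edge can join two unrelated groups. That sensitivity is one reason to prefer community detection methods such as those compared in the abstract.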


Subject(s)
Algorithms , Computational Biology , Humans , Genomics , Cluster Analysis
20.
Front Neuroinform ; 17: 1215261, 2023.
Article in English | MEDLINE | ID: mdl-37720825

ABSTRACT

Introduction: Open science initiatives have enabled sharing of large amounts of already collected data. However, significant gaps remain regarding how to find appropriate data, including underutilized data that exist in the long tail of science. We demonstrate the NeuroBridge prototype and its ability to search PubMed Central full-text papers for information relevant to neuroimaging data collected from schizophrenia and addiction studies. Methods: The NeuroBridge architecture contained the following components: (1) Extensible ontology for modeling study metadata: subject population, imaging techniques, and relevant behavioral, cognitive, or clinical data. Details are described in the companion paper in this special issue; (2) A natural-language-based document processor that leveraged pre-trained deep-learning models on a small-sample document corpus to establish efficient representations for each article as a collection of machine-recognized ontological terms; (3) Integrated search using ontology-driven similarity to query PubMed Central and NeuroQuery, which provides fMRI activation maps along with PubMed source articles. Results: The NeuroBridge prototype contains a corpus of 356 papers from 2018 to 2021 describing schizophrenia and addiction neuroimaging studies, of which 186 were annotated with the NeuroBridge ontology. The search portal on the NeuroBridge website https://neurobridges.org/ provides an interactive Query Builder, where the user builds queries by selecting NeuroBridge ontology terms to preserve the ontology tree structure. For each returned entry, links to the PubMed abstract as well as to the PMC full-text article, if available, are presented. For each of the returned articles, we provide a list of clinical assessments described in the Methods section of the article. Articles returned from NeuroQuery based on the same search are also presented.
Conclusion: The NeuroBridge prototype combines ontology-based search with natural-language text-mining approaches to demonstrate that papers relevant to a user's research question can be identified. The NeuroBridge prototype takes a first step toward identifying potential neuroimaging data described in full-text papers. Toward the overall goal of discovering "enough data of the right kind," ongoing work includes validating the document processor with a larger corpus, extending the ontology to include detailed imaging data, and extracting information regarding data availability from the returned publications and incorporating XNAT-based neuroimaging databases to enhance data accessibility.
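The abstract describes articles represented as collections of ontology terms and queries built from terms in the ontology tree. The sketch below illustrates that general idea under simplifying assumptions: a query term matches an article if the article carries that term or any of its descendants, and articles are ranked by the fraction of query terms matched. The term names, article identifiers, and scoring rule are all hypothetical; this is not the NeuroBridge ontology, its similarity measure, or its document processor.

```python
# Hypothetical toy ontology, stored as child term -> parent term.
PARENT = {
    "T1WeightedImaging": "MRI",
    "TaskBasedFMRI": "MRI",
    "Schizophrenia": "Diagnosis",
    "AlcoholUseDisorder": "Diagnosis",
}


def expand(term):
    """Return the term together with all of its descendants in the tree."""
    result = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in PARENT.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def score(query_terms, article_terms):
    """Fraction of query terms matched by the article, counting descendants."""
    matched = sum(1 for term in query_terms if expand(term) & article_terms)
    return matched / len(query_terms)


def search(query_terms, corpus):
    """Rank articles by score (ties broken by identifier); drop non-matches."""
    scored = [(score(query_terms, terms), article) for article, terms in corpus.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(article, value) for value, article in scored if value > 0]


# Hypothetical annotated corpus: article identifier -> set of ontology terms.
CORPUS = {
    "article-A": {"T1WeightedImaging", "Schizophrenia"},
    "article-B": {"TaskBasedFMRI", "AlcoholUseDisorder"},
    "article-C": {"PANSS"},
}

if __name__ == "__main__":
    print(search(["MRI", "Schizophrenia"], CORPUS))
```

Expanding a query term to its descendants is what lets a broad selection such as an imaging modality retrieve articles annotated only with a more specific scan type.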
