Results 1 - 20 of 49
1.
Nature ; 590(7845): 290-299, 2021 02.
Article in English | MEDLINE | ID: mdl-33568819

ABSTRACT

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes) [1]. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Genomics , National Heart, Lung, and Blood Institute (U.S.) , Precision Medicine , Cytochrome P-450 CYP2D6/genetics , Haplotypes/genetics , Heterozygote , Humans , INDEL Mutation , Loss of Function Mutation , Mutagenesis , Phenotype , Polymorphism, Single Nucleotide , Population Density , Precision Medicine/standards , Quality Control , Sample Size , United States , Whole Genome Sequencing/standards
2.
Am J Hum Genet ; 109(6): 1175-1181, 2022 06 02.
Article in English | MEDLINE | ID: mdl-35504290

ABSTRACT

Current publicly available tools that allow rapid exploration of linkage disequilibrium (LD) between markers (e.g., HaploReg and LDlink) are based on whole-genome sequence (WGS) data from 2,504 individuals in the 1000 Genomes Project. Here, we present TOP-LD, an online tool to explore LD inferred with high-coverage (∼30×) WGS data from 15,578 individuals in the NHLBI Trans-Omics for Precision Medicine (TOPMed) program. TOP-LD provides a significant upgrade compared to current LD tools, as the TOPMed WGS data provide a more comprehensive representation of genetic variation than the 1000 Genomes data, particularly for rare variants and in the specific populations that we analyzed. For example, TOP-LD encompasses LD information for 150.3, 62.2, and 36.7 million variants for European, African, and East Asian ancestral samples, respectively, offering 2.6- to 9.1-fold increase in variant coverage compared to HaploReg 4.0 or LDlink. In addition, TOP-LD includes tens of thousands of structural variants (SVs). We demonstrate the value of TOP-LD in fine-mapping at the GGT1 locus associated with gamma glutamyltransferase in the African ancestry participants in UK Biobank. Beyond fine-mapping, TOP-LD can facilitate a wide range of applications that are based on summary statistics and estimates of LD. TOP-LD is freely available online.
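
As background on the quantity such tools report: LD between two biallelic variants is commonly summarized as r², the squared correlation between their allele counts. The following Python sketch computes that statistic from unphased genotype dosages (0, 1 or 2 copies of the alternate allele). The function name `ld_r2` and the toy inputs are illustrative assumptions; the abstract does not say how TOP-LD itself computes or stores LD, and a genotype-based r² is only one of several estimators in use.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two genotype dosage vectors.

    Each vector holds one value per individual (0, 1 or 2 alternate alleles).
    Returns NaN when either variant is monomorphic in the sample.
    """
    if len(dosages_a) != len(dosages_b) or len(dosages_a) < 2:
        raise ValueError("need two equally long vectors with at least 2 samples")

    n = len(dosages_a)
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n

    # Sums of squares and cross-products around the means.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)

    if var_a == 0 or var_b == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return (cov * cov) / (var_a * var_b)


if __name__ == "__main__":
    print(ld_r2([0, 1, 2, 1], [0, 1, 2, 1]))  # identical genotype patterns
    print(ld_r2([0, 0, 2, 2], [0, 2, 0, 2]))  # uncorrelated genotype patterns
```

Because r² depends on the sample it is estimated in, the same pair of variants can show different values in different ancestry groups, which is why the reference panel behind an LD tool matters.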


Subject(s)
Genome-Wide Association Study , Precision Medicine , Asian People , Humans , Linkage Disequilibrium/genetics , Polymorphism, Single Nucleotide/genetics , Whole Genome Sequencing
3.
Nat Methods ; 19(12): 1599-1611, 2022 12.
Article in English | MEDLINE | ID: mdl-36303018

ABSTRACT

Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to study RV association. However, existing methods have limited ability in analyzing the noncoding genome. We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, to automatically annotate a whole-genome sequencing study and perform flexible noncoding RV association analysis, including gene-centric analysis and fixed window-based and dynamic window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.


Subject(s)
Genome-Wide Association Study , Genome , Humans , Genome-Wide Association Study/methods , Whole Genome Sequencing/methods , Phenotype , Genetic Variation
4.
Nature ; 570(7762): 514-518, 2019 06.
Article in English | MEDLINE | ID: mdl-31217584

ABSTRACT

Genome-wide association studies (GWAS) have laid the foundation for investigations into the biology of complex traits, drug development and clinical guidelines. However, the majority of discovery efforts are based on data from populations of European ancestry [1-3]. In light of the differential genetic architecture that is known to exist between populations, bias in representation can exacerbate existing disease and healthcare disparities. Critical variants may be missed if they have a low frequency or are completely absent in European populations, especially as the field shifts its attention towards rare variants, which are more likely to be population-specific [4-10]. Additionally, effect sizes and the risk prediction scores derived from them in one population may not accurately extrapolate to other populations [11,12]. Here we demonstrate the value of diverse, multi-ethnic participants in large-scale genomic studies. The Population Architecture using Genomics and Epidemiology (PAGE) study conducted a GWAS of 26 clinical and behavioural phenotypes in 49,839 non-European individuals. Using strategies tailored for analysis of multi-ethnic and admixed populations, we describe a framework for analysing diverse populations, identify 27 novel loci and 38 secondary signals at known loci, as well as replicate 1,444 GWAS catalogue associations across these traits. Our data show evidence of effect-size heterogeneity across ancestries for published GWAS associations, substantial benefits for fine-mapping using diverse cohorts and insights into clinical implications. In the United States, where minority populations have a disproportionately higher burden of chronic conditions [13], the lack of representation of diverse populations in genetic research will result in inequitable access to precision medicine for those with the highest burden of disease. We strongly advocate for continued, large genome-wide efforts in diverse populations to maximize genetic discovery and reduce health disparities.


Subject(s)
Asian People/genetics , Black People/genetics , Genome-Wide Association Study/methods , Hispanic or Latino/genetics , Minority Groups , Multifactorial Inheritance/genetics , Women's Health , Body Height/genetics , Cohort Studies , Female , Genetics, Medical/methods , Health Equity/trends , Health Status Disparities , Humans , Male , United States
5.
Am J Hum Genet ; 108(10): 1836-1851, 2021 10 07.
Article in English | MEDLINE | ID: mdl-34582791

ABSTRACT

Many common and rare variants associated with hematologic traits have been discovered through imputation on large-scale reference panels. However, the majority of genome-wide association studies (GWASs) have been conducted in Europeans, and determining causal variants has proved challenging. We performed a GWAS of total leukocyte, neutrophil, lymphocyte, monocyte, eosinophil, and basophil counts generated from 109,563,748 variants in the autosomes and the X chromosome in the Trans-Omics for Precision Medicine (TOPMed) program, which included data from 61,802 individuals of diverse ancestry. We discovered and replicated 7 leukocyte trait associations, including (1) the association between a chromosome X, pseudo-autosomal region (PAR), noncoding variant located between cytokine receptor genes (CSF2RA and CRLF2) and lower eosinophil count; and (2) associations between single variants found predominantly among African Americans at the S1PR3 (9q22.1) and HBB (11p15.4) loci and monocyte and lymphocyte counts, respectively. We further provide evidence indicating that the newly discovered eosinophil-lowering chromosome X PAR variant might be associated with reduced susceptibility to common allergic diseases such as atopic dermatitis and asthma. Additionally, we found a burden of very rare FLT3 (13q12.2) variants associated with monocyte counts. Together, these results emphasize the utility of whole-genome sequencing in diverse samples in identifying associations missed by European-ancestry-driven GWASs.


Subject(s)
Asthma/epidemiology , Biomarkers/metabolism , Dermatitis, Atopic/epidemiology , Leukocytes/pathology , Polymorphism, Single Nucleotide , Pulmonary Disease, Chronic Obstructive/epidemiology , Quantitative Trait Loci , Asthma/genetics , Asthma/metabolism , Asthma/pathology , Dermatitis, Atopic/genetics , Dermatitis, Atopic/metabolism , Dermatitis, Atopic/pathology , Genetic Predisposition to Disease , Genome, Human , Genome-Wide Association Study , Humans , National Heart, Lung, and Blood Institute (U.S.) , Phenotype , Prognosis , Proteome/analysis , Proteome/metabolism , Pulmonary Disease, Chronic Obstructive/genetics , Pulmonary Disease, Chronic Obstructive/metabolism , Pulmonary Disease, Chronic Obstructive/pathology , United Kingdom/epidemiology , United States/epidemiology , Whole Genome Sequencing
6.
Hum Mol Genet ; 30(23): 2362-2369, 2021 11 16.
Article in English | MEDLINE | ID: mdl-34270706

ABSTRACT

Numerous genome-wide association studies (GWASs) have been conducted for the identification of genetic variants involved with human height. The vast majority of these studies, however, have been conducted in populations of European ancestry. Here, we report the first GWAS of adult height in the Taiwan Biobank using a discovery sample of 14 571 individuals and an independent replication sample of 20 506 individuals. From our analysis, we generalize to the Taiwanese population genome-wide significant associations with height and 18 previously identified genes in European and non-Taiwanese East Asian populations. We also identify and replicate, at the genome-wide significance level, associated variants for height in four novel genes at two loci that have not previously been reported: RASA2 on chromosome 3 and NABP2, RNF41 and SLC39A5 at 12q13.3 on chromosome 12. RASA2 and RNF41 are strong candidates for having a role in height with copy number and loss of function variants in RASA2 previously found to be associated with short stature disorders, and decreased expression of the RNF41 gene resulting in insulin resistance in skeletal muscle. The results from our analysis of the Taiwan Biobank underscore the potential for the identification of novel genetic discoveries in underrepresented worldwide populations, even for traits, such as height, that have been extensively investigated in large-scale studies of European ancestry populations.


Subject(s)
Biological Specimen Banks , Body Height/genetics , Cation Transport Proteins/genetics , Genome-Wide Association Study , Ubiquitin-Protein Ligases/genetics , ras GTPase-Activating Proteins/genetics , Adult , Alleles , Female , Genetic Association Studies , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Taiwan
7.
PLoS Genet ; 15(4): e1007739, 2019 04.
Article in English | MEDLINE | ID: mdl-30990817

ABSTRACT

Sleep disordered breathing (SDB)-related overnight hypoxemia is associated with cardiometabolic disease and other comorbidities. Understanding the genetic bases for variations in nocturnal hypoxemia may help understand mechanisms influencing oxygenation and SDB-related mortality. We conducted genome-wide association tests across 10 cohorts and 4 populations to identify genetic variants associated with three correlated measures of overnight oxyhemoglobin saturation: average and minimum oxyhemoglobin saturation during sleep and the percent of sleep with oxyhemoglobin saturation under 90%. The discovery sample consisted of 8,326 individuals. Variants with p < 1 × 10⁻⁶ were analyzed in a replication group of 14,410 individuals. We identified 3 significantly associated regions, including 2 regions in multi-ethnic analyses (2q12, 10q22). SNPs in the 2q12 region associated with minimum SpO2 (rs78136548 p = 2.70 × 10⁻¹⁰). SNPs at 10q22 were associated with all three traits including average SpO2 (rs72805692 p = 4.58 × 10⁻⁸). SNPs in both regions were associated in over 20,000 individuals and are supported by prior associations or functional evidence. Four additional significant regions were detected in secondary sex-stratified and combined discovery and replication analyses, including a region overlapping Reelin, a known marker of respiratory complex neurons. These are the first genome-wide significant findings reported for oxyhemoglobin saturation during sleep, a phenotype of high clinical interest. Our replicated associations with HK1 and IL18R1 suggest that variants in inflammatory pathways, such as the biologically-plausible NLRP3 inflammasome, may contribute to nocturnal hypoxemia.


Subject(s)
Hexokinase/genetics , Interleukin-18 Receptor alpha Subunit/genetics , Oxyhemoglobins/metabolism , Sleep/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Cell Adhesion Molecules, Neuronal/genetics , Computational Biology , Extracellular Matrix Proteins/genetics , Female , Gene Regulatory Networks , Genetic Variation , Genome-Wide Association Study , Humans , Hypoxia/blood , Hypoxia/genetics , Male , Middle Aged , NLR Family, Pyrin Domain-Containing 3 Protein/genetics , Nerve Tissue Proteins/genetics , Oxygen/blood , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Reelin Protein , Serine Endopeptidases/genetics , Sleep Apnea Syndromes/blood , Sleep Apnea Syndromes/genetics , Young Adult
8.
Hum Mol Genet ; 28(4): 675-687, 2019 02 15.
Article in English | MEDLINE | ID: mdl-30403821

ABSTRACT

Obstructive sleep apnea (OSA) is a common disorder associated with increased risk of cardiovascular disease and mortality. Its prevalence and severity vary across ancestral background. Although OSA traits are heritable, few genetic associations have been identified. To identify genetic regions associated with OSA and improve statistical power, we applied admixture mapping on three primary OSA traits [the apnea hypopnea index (AHI), overnight average oxyhemoglobin saturation (SaO2) and percentage time SaO2 < 90%] and a secondary trait (respiratory event duration) in a Hispanic/Latino American population study of 11 575 individuals with significant variation in ancestral background. Linear mixed models were performed using previously inferred African, European and Amerindian local genetic ancestry markers. Global African ancestry was associated with a lower AHI, higher SaO2 and shorter event duration. Admixture mapping analysis of the primary OSA traits identified local African ancestry at the chromosomal region 2q37 as genome-wide significantly associated with AHI (P < 5.7 × 10⁻⁵), and European and Amerindian ancestries at 18q21 suggestively associated with both AHI and percentage time SaO2 < 90% (P < 10⁻³). Follow-up joint ancestry-SNP association analyses identified novel variants in ferrochelatase (FECH), significantly associated with AHI and percentage time SaO2 < 90% after adjusting for multiple tests (P < 8 × 10⁻⁶). These signals contributed to the admixture mapping associations and were replicated in independent cohorts. In this first admixture mapping study of OSA, novel associations with variants in the iron/heme metabolism pathway suggest a role for iron in influencing respiratory traits underlying OSA.


Subject(s)
Ferrochelatase/genetics , Genome-Wide Association Study , Sleep Apnea, Obstructive/genetics , Aged , Chromosome Mapping , Female , Genotype , Hispanic or Latino/genetics , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide/genetics , Polysomnography , Sleep Apnea, Obstructive/diagnostic imaging , Sleep Apnea, Obstructive/physiopathology , White People/genetics
9.
Bioinformatics ; 35(24): 5346-5348, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31329242

ABSTRACT

SUMMARY: The Genomic Data Storage (GDS) format provides efficient storage and retrieval of genotypes measured by microarrays and sequencing. We developed GENESIS to perform various single- and aggregate-variant association tests using genotype data stored in GDS format. GENESIS implements highly flexible mixed models, allowing for different link functions, multiple variance components and phenotypic heteroskedasticity. GENESIS integrates cohesively with other R/Bioconductor packages to build a complete genomic analysis workflow entirely within the R environment. AVAILABILITY AND IMPLEMENTATION: https://bioconductor.org/packages/GENESIS; vignettes included. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics , Software , Genetic Testing , Genome , Sequence Analysis
10.
Hum Mol Genet ; 26(10): 1966-1978, 2017 05 15.
Article in English | MEDLINE | ID: mdl-28334935

ABSTRACT

Genetic variants contribute to normal variation of iron-related traits and may also cause clinical syndromes of iron deficiency or excess. Iron overload and deficiency can adversely affect human health. For example, elevated iron storage is associated with increased diabetes risk, although mechanisms are still being investigated. We conducted the first genome-wide association study of serum iron, total iron binding capacity (TIBC), transferrin saturation, and ferritin in a Hispanic/Latino cohort, the Hispanic Community Health Study/Study of Latinos (>12 000 participants) and also assessed the generalization of previously known loci to this population. We then evaluated whether iron-associated variants were associated with diabetes and glycemic traits. We found evidence for a novel association between TIBC and a variant near the gene for protein phosphatase 1, regulatory subunit 3B (PPP1R3B; rs4841132, β = -0.116, P = 7.44 × 10⁻⁸). The effect strengthened when iron deficient individuals were excluded (β = -0.121, P = 4.78 × 10⁻⁹). Ten of sixteen variants previously associated with iron traits generalized to HCHS/SOL, including variants at the transferrin (TF), hemochromatosis (HFE), fatty acid desaturase 2 (FADS2)/myelin regulatory factor (MYRF), transmembrane protease, serine 6 (TMPRSS6), transferrin receptor (TFR2), N-acetyltransferase 2 (arylamine N-acetyltransferase) (NAT2), ABO blood group (ABO), and GRB2 associated binding protein 3 (GAB3) loci. In examining iron variant associations with glucose homeostasis, an iron-raising variant of TMPRSS6 was associated with lower HbA1c levels (P = 8.66 × 10⁻¹⁰). This association was attenuated upon adjustment for iron measures. In contrast, the iron-raising allele of PPP1R3B was associated with higher levels of fasting glucose (P = 7.70 × 10⁻⁷) and fasting insulin (P = 4.79 × 10⁻⁶), but these associations were not attenuated upon adjustment for TIBC, so iron is not likely a mediator. These results provide new genetic information on iron traits and their connection with glucose homeostasis.


Subject(s)
Glucose/genetics , Glucose/metabolism , Iron/metabolism , Adult , Anemia, Iron-Deficiency/blood , Antigens, CD , Blood Glucose/metabolism , Diabetes Mellitus/genetics , Diabetes Mellitus/metabolism , Fasting , Female , Ferritins/analysis , Ferritins/blood , Ferritins/metabolism , Genetic Association Studies/methods , Genetic Variation/genetics , Genome-Wide Association Study , Genomics , Hemochromatosis/genetics , Hispanic or Latino/genetics , Hospitals, Community/methods , Humans , Insulin/metabolism , Iron/blood , Male , Membrane Proteins/genetics , Membrane Proteins/metabolism , Middle Aged , Phenotype , Polymorphism, Single Nucleotide/genetics , Receptors, Transferrin/genetics , Risk Factors , Serine Endopeptidases/genetics , Serine Endopeptidases/metabolism , Transferrin/analysis , Transferrin/metabolism
11.
Am J Hum Genet ; 98(1): 127-48, 2016 Jan 07.
Article in English | MEDLINE | ID: mdl-26748516

ABSTRACT

Genealogical inference from genetic data is essential for a variety of applications in human genetics. In genome-wide and sequencing association studies, for example, accurate inference on both recent genetic relatedness, such as family structure, and more distant genetic relatedness, such as population structure, is necessary for protection against spurious associations. Distinguishing familial relatedness from population structure with genotype data, however, is difficult because both manifest as genetic similarity through the sharing of alleles. Existing approaches for inference on recent genetic relatedness have limitations in the presence of population structure, where they either (1) make strong and simplifying assumptions about population structure, which are often untenable, or (2) require correct specification of and appropriate reference population panels for the ancestries in the sample, which might be unknown or not well defined. Here, we propose PC-Relate, a model-free approach for estimating commonly used measures of recent genetic relatedness, such as kinship coefficients and IBD sharing probabilities, in the presence of unspecified structure. PC-Relate uses principal components calculated from genome-screen data to partition genetic correlations among sampled individuals due to the sharing of recent ancestors and more distant common ancestry into two separate components, without requiring specification of the ancestral populations or reference population panels. In simulation studies with population structure, including admixture, we demonstrate that PC-Relate provides accurate estimates of genetic relatedness and improved relationship classification over widely used approaches. We further demonstrate the utility of PC-Relate in applications to three ancestrally diverse samples that vary in both size and genealogical complexity.
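
The core idea, removing the ancestry-driven part of each genotype before correlating individuals, can be sketched in a few lines of Python. The sketch assumes the individual-specific expected allele frequencies have already been estimated (in PC-Relate they come from regressing genotypes on principal components; here they are simply passed in as `mu_i` and `mu_j`). The function name and the toy data are hypothetical, and the formula is the commonly cited form of the PC-Relate kinship estimator rather than something stated in the abstract.

```python
from math import sqrt


def kinship_estimate(g_i, g_j, mu_i, mu_j):
    """Kinship coefficient estimate for individuals i and j.

    g_i, g_j   : genotype dosages (0, 1, 2) at each variant
    mu_i, mu_j : individual-specific expected allele frequencies at each
                 variant (strictly between 0 and 1), assumed precomputed
    """
    if not (len(g_i) == len(g_j) == len(mu_i) == len(mu_j)):
        raise ValueError("all inputs must have one entry per variant")

    numerator = 0.0
    denominator = 0.0
    for gi, gj, mi, mj in zip(g_i, g_j, mu_i, mu_j):
        # Residual genotype after subtracting the ancestry-based expectation.
        numerator += (gi - 2.0 * mi) * (gj - 2.0 * mj)
        # Scale by the expected binomial standard deviations.
        denominator += sqrt(mi * (1.0 - mi) * mj * (1.0 - mj))

    if denominator == 0.0:
        raise ValueError("allele frequencies must not all be 0 or 1")
    return numerator / (4.0 * denominator)


if __name__ == "__main__":
    freqs = [0.5, 0.5, 0.5, 0.5]
    person = [0, 2, 1, 1]
    # An individual compared with themselves (toy data, not real genotypes).
    print(kinship_estimate(person, person, freqs, freqs))
```

With real data the sums run over many thousands of variants, and pairs of unrelated individuals yield estimates near zero regardless of whether they share continental ancestry, which is the property the abstract highlights.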


Subject(s)
Models, Genetic , Humans , Los Angeles , Mexican Americans/genetics
12.
Am J Hum Genet ; 98(4): 653-66, 2016 Apr 07.
Article in English | MEDLINE | ID: mdl-27018471

ABSTRACT

Linear mixed models (LMMs) are widely used in genome-wide association studies (GWASs) to account for population structure and relatedness, for both continuous and binary traits. Motivated by the failure of LMMs to control type I errors in a GWAS of asthma, a binary trait, we show that LMMs are generally inappropriate for analyzing binary traits when population stratification leads to violation of the LMM's constant-residual variance assumption. To overcome this problem, we develop a computationally efficient logistic mixed model approach for genome-wide analysis of binary traits, the generalized linear mixed model association test (GMMAT). This approach fits a logistic mixed model once per GWAS and performs score tests under the null hypothesis of no association between a binary trait and individual genetic variants. We show in simulation studies and real data analysis that GMMAT effectively controls for population structure and relatedness when analyzing binary traits in a wide variety of study designs.
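
The "fit the null model once, then run a cheap score test per variant" pattern can be illustrated with a deliberately simplified Python sketch. It uses an intercept-only logistic null model with no covariates and no random effects, so it is not GMMAT itself: the mixed-model variance components that make GMMAT robust to relatedness are omitted. All function names and inputs are hypothetical.

```python
from math import erfc, sqrt


def fit_null(y):
    """Fit the intercept-only logistic null model once for the whole scan.

    y is a list of 0/1 case-control labels. Returns the fitted case
    probability and the per-sample residuals (y - p_hat).
    """
    p_hat = sum(y) / len(y)
    if not 0.0 < p_hat < 1.0:
        raise ValueError("need both cases and controls")
    return p_hat, [yi - p_hat for yi in y]


def score_test(g, p_hat, residuals):
    """Score test for one variant given the pre-fitted null model.

    g holds genotype dosages (0, 1, 2). Returns (statistic, p_value), where
    the statistic is compared with a chi-square distribution with 1 df.
    """
    n = len(g)
    g_mean = sum(g) / n
    score = sum(gi * ri for gi, ri in zip(g, residuals))
    variance = p_hat * (1.0 - p_hat) * sum((gi - g_mean) ** 2 for gi in g)
    if variance == 0.0:
        return 0.0, 1.0  # monomorphic variant carries no information
    statistic = score * score / variance
    # Upper tail of chi-square(1) expressed through the normal tail.
    return statistic, erfc(sqrt(statistic / 2.0))


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    p_hat, resid = fit_null(labels)        # done once
    for dosages in ([0, 0, 2, 2], [0, 2, 0, 2]):
        print(score_test(dosages, p_hat, resid))  # repeated per variant
```

The design point is that the expensive step (`fit_null`) does not depend on the variant, so its cost is paid once per trait rather than once per tested variant.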


Subject(s)
Genetic Association Studies/methods , Genetics, Population/methods , Linear Models , Phenotype , Asthma/genetics , Case-Control Studies , Central America , Computer Simulation , Genotyping Techniques , Humans , Logistic Models , Models, Genetic , Phylogeography , Polymorphism, Single Nucleotide , South America
13.
Am J Hum Genet ; 98(2): 229-42, 2016 Feb 04.
Article in English | MEDLINE | ID: mdl-26805783

ABSTRACT

Platelets play an essential role in hemostasis and thrombosis. We performed a genome-wide association study of platelet count in 12,491 participants of the Hispanic Community Health Study/Study of Latinos by using a mixed-model method that accounts for admixture and family relationships. We discovered and replicated associations with five genes (ACTN1, ETV7, GABBR1-MOG, MEF2C, and ZBTB9-BAK1). Our strongest association was with Amerindian-specific variant rs117672662 (p value = 1.16 × 10⁻²⁸) in ACTN1, a gene implicated in congenital macrothrombocytopenia. rs117672662 exhibited allelic differences in transcriptional activity and protein binding in hematopoietic cells. Our results underscore the value of diverse populations to extend insights into the allelic architecture of complex traits.


Subject(s)
Genetic Association Studies/methods , Genetic Loci , Hispanic or Latino/genetics , Platelet Count , Actinin/genetics , Adolescent , Adult , Aged , Alleles , Gene Frequency , Genotype , Genotyping Techniques , Humans , MEF2 Transcription Factors/genetics , Membrane Proteins/genetics , Middle Aged , Phenotype , Polymorphism, Single Nucleotide , Receptors, GABA-B/genetics , Young Adult
14.
Am J Hum Genet ; 98(1): 165-84, 2016 Jan 07.
Article in English | MEDLINE | ID: mdl-26748518

ABSTRACT

US Hispanic/Latino individuals are diverse in genetic ancestry, culture, and environmental exposures. Here, we characterized and controlled for this diversity in genome-wide association studies (GWASs) for the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). We simultaneously estimated population-structure principal components (PCs) robust to familial relatedness and pairwise kinship coefficients (KCs) robust to population structure, admixture, and Hardy-Weinberg departures. The PCs revealed substantial genetic differentiation within and among six self-identified background groups (Cuban, Dominican, Puerto Rican, Mexican, and Central and South American). To control for variation among groups, we developed a multi-dimensional clustering method to define a "genetic-analysis group" variable that retains many properties of self-identified background while achieving substantially greater genetic homogeneity within groups and including participants with non-specific self-identification. In GWASs of 22 biomedical traits, we used a linear mixed model (LMM) including pairwise empirical KCs to account for familial relatedness, PCs for ancestry, and genetic-analysis groups for additional group-associated effects. Including the genetic-analysis group as a covariate accounted for significant trait variation in 8 of 22 traits, even after we fit 20 PCs. Additionally, genetic-analysis groups had significant heterogeneity of residual variance for 20 of 22 traits, and modeling this heteroscedasticity within the LMM reduced genomic inflation for 19 traits. Furthermore, fitting an LMM that utilized a genetic-analysis group rather than a self-identified background group achieved higher power to detect previously reported associations. We expect that the methods applied here will be useful in other studies with multiple ethnic groups, admixture, and relatedness.
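
For readers unfamiliar with the term, genomic inflation is usually quantified by the factor λ: the median of the observed 1-df chi-square association statistics divided by the median expected under the null hypothesis. The Python sketch below shows that calculation; the function name and the example statistics are illustrative assumptions, and the abstract does not state exactly how the authors computed inflation.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom, obtained as
# the square of the 75th percentile of a standard normal (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(chi2_stats):
    """Genomic inflation factor (lambda) for a set of 1-df chi-square statistics.

    Values near 1 suggest well-calibrated tests; values clearly above 1
    suggest unmodeled structure, relatedness, or variance heterogeneity.
    """
    stats = list(chi2_stats)
    if not stats:
        raise ValueError("need at least one test statistic")
    return median(stats) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Toy statistics only; a real scan would supply millions of values.
    print(genomic_inflation([0.1, 0.3, 0.5, 0.9, 2.4]))
```

Comparing λ before and after a modeling change (for example, allowing group-specific residual variances) is a simple way to check whether the change improved calibration.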


Subject(s)
Genetic Variation , Hispanic or Latino/genetics , Genome-Wide Association Study , Humans , United States
15.
Am J Respir Cell Mol Biol ; 58(3): 391-401, 2018 03.
Article in English | MEDLINE | ID: mdl-29077507

ABSTRACT

Obstructive sleep apnea (OSA) is a common heritable disorder displaying marked sexual dimorphism in disease prevalence and progression. Previous genetic association studies have identified a few genetic loci associated with OSA and related quantitative traits, but they have only focused on single ethnic groups, and a large proportion of the heritability remains unexplained. The apnea-hypopnea index (AHI) is a commonly used quantitative measure characterizing OSA severity. Because OSA differs by sex, and the pathophysiology of obstructive events differs in rapid eye movement (REM) and non-REM (NREM) sleep, we hypothesized that additional genetic association signals would be identified by analyzing the NREM/REM-specific AHI and by conducting sex-specific analyses in multiethnic samples. We performed genome-wide association tests for up to 19,733 participants of African, Asian, European, and Hispanic/Latino American ancestry in 7 studies. We identified rs12936587 on chromosome 17 as a possible quantitative trait locus for NREM AHI in men (N = 6,737; P = 1.7 × 10⁻⁸) but not in women (P = 0.77). The association with NREM AHI was replicated in a physiological research study (N = 67; P = 0.047). This locus, which overlaps the RAI1 gene and encompasses the genes PEMT1, SREBF1, and RASD1, was previously reported to be associated with coronary artery disease and lipid metabolism and has been implicated in Potocki-Lupski syndrome and Smith-Magenis syndrome, which are characterized by abnormal sleep phenotypes. We also identified gene-by-sex interactions in suggestive association regions, indicating that genetic variants for AHI may vary by sex, consistent with the clinical observations of strong sexual dimorphism.


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci/genetics , Sleep Apnea, Obstructive/genetics , Sleep, REM/physiology , Transcription Factors/genetics , Adult , Aged , Female , Humans , Male , Middle Aged , Phosphatidylethanolamine N-Methyltransferase/genetics , Sex Characteristics , Sterol Regulatory Element Binding Protein 1/genetics , Trans-Activators , ras Proteins/genetics
16.
Hum Mol Genet ; 25(4): 807-16, 2016 Feb 15.
Article in English | MEDLINE | ID: mdl-26662797

ABSTRACT

Dental caries is the most common chronic disease worldwide, and exhibits profound disparities in the USA with racial and ethnic minorities experiencing disproportionate disease burden. Though heritable, the specific genes influencing risk of dental caries remain largely unknown. Therefore, we performed genome-wide association scans (GWASs) for dental caries in a population-based cohort of 12 000 Hispanic/Latino participants aged 18-74 years from the HCHS/SOL. Intra-oral examinations were used to generate two common indices of dental caries experience which were tested for association with 27.7 M genotyped or imputed single-nucleotide polymorphisms separately in the six ancestry groups. A mixed-models approach was used, which adjusted for age, sex, recruitment site, five principal components of ancestry and additional features of the sampling design. Meta-analyses were used to combine GWAS results across ancestry groups. Heritability estimates ranged from 20-53% in the six ancestry groups. The most significant association observed via meta-analysis for both phenotypes was in the region of the NAMPT gene (rs190395159; P-value = 6 × 10⁻¹⁰), which is involved in many biological processes including periodontal healing. Another significant association was observed for rs72626594 (P-value = 3 × 10⁻⁸) downstream of BMP7, a tooth development gene. Other associations were observed in genes lacking known or plausible roles in dental caries. In conclusion, this was the largest GWAS of dental caries to date and the first to target Hispanic/Latino populations. Understanding the factors influencing dental caries susceptibility may lead to improvements in prediction, prevention and disease management, which may ultimately reduce the disparities in oral health across racial, ethnic and socioeconomic strata.
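
The abstract does not say which meta-analysis method combined the per-group results. One widely used choice is a fixed-effect, inverse-variance-weighted combination of the per-group effect estimates, sketched below in Python under that assumption. The function name and the example numbers are hypothetical and are not taken from the study.

```python
from math import erfc, sqrt


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance meta-analysis for one variant.

    betas           : per-group effect estimates
    standard_errors : matching per-group standard errors (all > 0)
    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")

    # More precise groups (smaller standard error) receive larger weights.
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)

    z = pooled_beta / pooled_se
    two_sided_p = erfc(abs(z) / sqrt(2.0))  # normal approximation
    return pooled_beta, pooled_se, two_sided_p


if __name__ == "__main__":
    # Made-up estimates for two ancestry groups.
    print(inverse_variance_meta([0.1, 0.3], [0.1, 0.1]))
```

A fixed-effect model assumes the variant has the same true effect in every group; when that is doubtful, random-effects or heterogeneity-aware methods are the usual alternatives.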


Subject(s)
Dental Caries/ethnology , Dental Caries/genetics , Hispanic or Latino/genetics , Adult , Aged , Community Health Centers , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide
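The abstract above says that meta-analyses combined the GWAS results across ancestry groups but does not name the method. As a rough illustration only, the sketch below uses fixed-effect inverse-variance weighting, one common way to pool per-group effect estimates for a single variant. The function name and its inputs are hypothetical and are not taken from the study.

```python
import math


def inverse_variance_meta(betas, ses):
    """Pool per-group effect estimates with fixed-effect inverse-variance weights.

    betas: effect estimates for one variant, one per ancestry group (assumed input).
    ses:   the matching standard errors.
    Returns the pooled effect, its standard error, the z statistic and a
    two-sided p-value.
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each group is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p
```

Groups with smaller standard errors receive more weight, so the pooled estimate is dominated by the best-powered ancestry groups.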
17.
Bioinformatics ; 33(15): 2251-2257, 2017 Aug 01.
Article in English | MEDLINE | ID: mdl-28334390

ABSTRACT

MOTIVATION: Whole-genome sequencing (WGS) data are being generated at an unprecedented rate. Analysis of WGS data requires a flexible data format to store the different types of DNA variation. Variant call format (VCF) is a general text-based format developed to store variant genotypes and their annotations. However, VCF files are large and data retrieval is relatively slow. Here we introduce a new WGS variant data format, implemented in the R/Bioconductor package 'SeqArray', for storing variant calls in an array-oriented manner. It provides the same capabilities as VCF, but with multiple high-compression options and data access using high-performance parallel computing. RESULTS: Benchmarks using 1000 Genomes Phase 3 data show file sizes of 14.0 Gb (VCF), 12.3 Gb (BCF, binary VCF), 3.5 Gb (BGT) and 2.6 Gb (SeqArray), respectively. Reading genotypes in the SeqArray package is two to three times faster than with the htslib C library using BCF files. For the allele frequency calculation, the implementation in the SeqArray package is over 5 times faster than PLINK v1.9 with VCF and BCF files, and over 16 times faster than vcftools. When used in conjunction with R/Bioconductor packages, the SeqArray package provides users with a flexible, feature-rich, high-performance programming environment for analysis of WGS variant data. AVAILABILITY AND IMPLEMENTATION: http://www.bioconductor.org/packages/SeqArray. CONTACT: zhengx@u.washington.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Data Compression/methods , Genetic Variation , Software , Whole Genome Sequencing/methods , Genome, Human , Genomics/methods , Humans
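The allele frequency calculation used as a benchmark in the abstract above can be illustrated with a minimal sketch. This is plain Python and not the SeqArray API. It assumes diploid, biallelic variants stored as a variant-by-sample array of alternate-allele counts (0, 1 or 2), with None marking a missing call. The function name is hypothetical.

```python
def allele_frequencies(genotypes):
    """Return the alternate-allele frequency for each variant.

    genotypes: list of rows, one per variant; each entry is the number of
    alternate alleles carried by a sample (0, 1 or 2), or None if the call
    is missing. Diploid, biallelic variants are assumed.
    """
    freqs = []
    for row in genotypes:
        called = [g for g in row if g is not None]
        if not called:
            # No sample was called at this variant, so the frequency is undefined.
            freqs.append(None)
            continue
        # Each called diploid sample contributes two alleles to the denominator.
        freqs.append(sum(called) / (2 * len(called)))
    return freqs
```

An array-oriented layout suits this kind of calculation because each variant's genotypes can be read as one contiguous slice rather than parsed from text records.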
18.
Eur Respir J ; 49(5)2017 05.
Article in English | MEDLINE | ID: mdl-28461288

ABSTRACT

Puerto Ricans are disproportionately affected with asthma in the USA. In this study, we aim to identify genetic variants that confer susceptibility to asthma in Puerto Ricans. We conducted a meta-analysis of genome-wide association studies (GWAS) of asthma in Puerto Ricans, including participants from: the Genetics of Asthma in Latino Americans (GALA) I-II, the Hartford-Puerto Rico Study and the Hispanic Community Health Study. Moreover, we examined whether susceptibility loci identified in previous meta-analyses of GWAS are associated with asthma in Puerto Ricans. The only locus to achieve genome-wide significance was chromosome 17q21, as evidenced by our top single nucleotide polymorphism (SNP), rs907092 (OR 0.71, p=1.2×10⁻¹²) at IKZF3. Similar to results in non-Puerto Ricans, SNPs in genes in the same linkage disequilibrium block as IKZF3 (e.g. ZPBP2, ORMDL3 and GSDMB) were significantly associated with asthma in Puerto Ricans. With regard to results from a meta-analysis in Europeans, we replicated findings for rs2305480 at GSDMB, but not for SNPs in any other genes. On the other hand, we replicated results from a meta-analysis of North American populations for SNPs at IL1RL1, TSLP and GSDMB but not for IL33. Our findings suggest that common variants on chromosome 17q21 have the greatest effects on asthma in Puerto Ricans.


Subject(s)
Asthma/genetics , Genome-Wide Association Study , Hispanic or Latino/genetics , Polymorphism, Single Nucleotide , Adolescent , Adult , Asthma/ethnology , Child , Chromosomes, Human, Pair 17/genetics , Female , Genetic Predisposition to Disease , Humans , Linkage Disequilibrium , Logistic Models , Male , Middle Aged , Puerto Rico/epidemiology , Young Adult
19.
Am J Respir Crit Care Med ; 194(7): 886-897, 2016 Oct 01.
Article in English | MEDLINE | ID: mdl-26977737

ABSTRACT

RATIONALE: Obstructive sleep apnea is a common disorder associated with increased risk for cardiovascular disease, diabetes, and premature mortality. Although there is strong clinical and epidemiologic evidence supporting the importance of genetic factors in influencing obstructive sleep apnea, its genetic basis is still largely unknown. Prior genetic studies focused on traits defined using the apnea-hypopnea index, which contains limited information on potentially important genetically determined physiologic factors, such as propensity for hypoxemia and respiratory arousability. OBJECTIVES: To define novel genetic risk loci for obstructive sleep apnea, we conducted genome-wide association studies of quantitative traits in Hispanic/Latino Americans from three cohorts. METHODS: Genome-wide data from as many as 12,558 participants in the Hispanic Community Health Study/Study of Latinos, Multi-Ethnic Study of Atherosclerosis, and Starr County Health Studies population-based cohorts were meta-analyzed for association with the apnea-hypopnea index, average oxygen saturation during sleep, and average respiratory event duration. MEASUREMENTS AND MAIN RESULTS: Two novel loci were identified at genome-level significance (rs11691765, GPR83, P = 1.90 × 10⁻⁸ for the apnea-hypopnea index, and rs35424364, C6ORF183/CCDC162P, P = 4.88 × 10⁻⁸ for respiratory event duration) and seven additional loci were identified with suggestive significance (P < 5 × 10⁻⁷). Secondary sex-stratified analyses also identified one significant and several suggestive associations. Multiple loci overlapped genes with biologic plausibility. CONCLUSIONS: These are the first genome-level significant findings reported for obstructive sleep apnea-related physiologic traits in any population. These findings identify novel associations in inflammatory, hypoxia signaling, and sleep pathways.

20.
Genet Epidemiol ; 39(4): 276-93, 2015 May.
Article in English | MEDLINE | ID: mdl-25810074

ABSTRACT

Population structure inference with genetic data has been motivated by a variety of applications in population genetics and genetic association studies. Several approaches have been proposed for the identification of genetic ancestry differences in samples where study participants are assumed to be unrelated, including principal components analysis (PCA), multidimensional scaling (MDS), and model-based methods for proportional ancestry estimation. Many genetic studies, however, include individuals with some degree of relatedness, and existing methods for inferring genetic ancestry fail in related samples. We present a method, PC-AiR, for robust population structure inference in the presence of known or cryptic relatedness. PC-AiR utilizes genome-screen data and an efficient algorithm to identify a diverse subset of unrelated individuals that is representative of all ancestries in the sample. The PC-AiR method directly performs PCA on the identified ancestry representative subset and then predicts components of variation for all remaining individuals based on genetic similarities. In simulation studies and in applications to real data from Phase III of the HapMap Project, we demonstrate that PC-AiR provides a substantial improvement over existing approaches for population structure inference in related samples. We also demonstrate significant efficiency gains, where a single axis of variation from PC-AiR provides better prediction of ancestry in a variety of structure settings than using 10 (or more) components of variation from widely used PCA and MDS approaches. Finally, we illustrate that PC-AiR can provide improved population stratification correction over existing methods in genetic association studies with population structure and relatedness.


Subject(s)
Algorithms , Genetic Association Studies , Genetics, Population , Population Groups/genetics , Chromosome Mapping , Computer Simulation , HapMap Project , Humans , Models, Genetic , Polymorphism, Single Nucleotide/genetics , Principal Component Analysis
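The two-stage idea described in the PC-AiR abstract above (PCA on an unrelated subset, then prediction for everyone else) can be sketched roughly as follows. This is a simplified illustration, not the published algorithm. It assumes the unrelated, ancestry-representative subset is already given as a list of indices, it computes only the first component by power iteration, and it predicts scores for the remaining individuals by projecting them onto the SNP loadings, which is an assumption about how the prediction step might work. All function names are hypothetical.

```python
import math
import random


def top_component(rows, iterations=200, seed=0):
    """Leading principal axis of already-centered rows, found by power iteration."""
    n_features = len(rows[0])
    # A seeded random start makes it very unlikely to begin orthogonal to the top axis.
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
    for _ in range(iterations):
        # Apply X^T X without forming it: first X v, then X^T (X v).
        scores = [sum(x * w for x, w in zip(row, v)) for row in rows]
        nxt = [sum(s * row[j] for s, row in zip(scores, rows))
               for j in range(n_features)]
        norm = math.sqrt(sum(c * c for c in nxt))
        if norm == 0.0:
            break
        v = [c / norm for c in nxt]
    return v


def pcair_like_scores(genotypes, unrelated_idx):
    """First-component scores for every individual.

    genotypes:     list of rows, one per individual, of allele counts (0, 1, 2).
    unrelated_idx: indices of the unrelated reference subset (assumed given).
    """
    n_snps = len(genotypes[0])
    ref = [genotypes[i] for i in unrelated_idx]
    # SNP means come from the unrelated subset only, so relatives do not bias them.
    means = [sum(row[j] for row in ref) / len(ref) for j in range(n_snps)]

    def center(row):
        return [g - m for g, m in zip(row, means)]

    loadings = top_component([center(row) for row in ref])
    # Everyone, related or not, is projected onto the same loadings.
    return [sum(x * w for x, w in zip(center(row), loadings)) for row in genotypes]
```

Because the loadings are estimated without the relatives, family clusters cannot define an axis of variation, which is the failure mode the abstract attributes to standard PCA in related samples.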