Results 1 - 20 of 29
1.
Thorax ; 2021 Apr 22.
Article in English | MEDLINE | ID: mdl-33888571

ABSTRACT

Most genome-wide association studies of obesity and body mass index (BMI) have so far assumed an additive mode of inheritance in their analysis, although association testing supports a recessive effect for some of the established loci, for example, rs1421085 in FTO. In two whole-genome sequencing (WGS) studies of children with asthma and their parents (892 Costa Rican trios and 286 North American trios), we discovered an association between a locus (rs9292139) in LOC102724122 and BMI that reaches genome-wide significance under a recessive model in the combined analysis. As the association does not achieve significance under an additive model, our finding illustrates the benefits of the recessive model in WGS analyses.
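The difference between the additive and recessive models mentioned in this abstract comes down to how a genotype is coded before testing. The following is a minimal sketch of the two standard codings, not the authors' analysis code; the function name `code_genotype` and the example counts are hypothetical.

```python
def code_genotype(minor_allele_count, model):
    """Map a minor-allele count (0, 1 or 2) to a model-specific score."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "additive":
        # Each copy of the minor allele contributes equally.
        return minor_allele_count
    if model == "recessive":
        # Only carriers of two copies are scored as exposed.
        return 1 if minor_allele_count == 2 else 0
    raise ValueError("model must be 'additive' or 'recessive'")


# Hypothetical minor-allele counts for five individuals.
counts = [0, 1, 2, 1, 2]
additive_scores = [code_genotype(c, "additive") for c in counts]
recessive_scores = [code_genotype(c, "recessive") for c in counts]
```

Under the recessive coding, heterozygous carriers are grouped with non-carriers, which is why a signal driven by homozygotes can be diluted under the additive coding.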

2.
Bioinformatics ; 2020 Dec 26.
Article in English | MEDLINE | ID: mdl-33367522

ABSTRACT

MOTIVATION: Analysis of rare variants in family-based studies remains a challenge. Transmission-based approaches provide robustness against population stratification, but the evaluation of the significance of test statistics based on asymptotic theory can be imprecise. In addition, power will depend heavily on the choice of the test statistic and on the underlying genetic architecture of the locus, which will be generally unknown. RESULTS: In our proposed framework, we utilize the FBAT haplotype algorithm to obtain the conditional offspring genotype distribution under the null hypothesis given the sufficient statistic. Based on this conditional offspring genotype distribution, the significance of virtually any association test statistic can be evaluated based on simulations or exact computations, without the need for asymptotic approximations. Besides standard linear burden-type statistics, this enables our approach to also evaluate other test statistics such as SKATs, higher criticism approaches, and maximum-single-variant-statistics, where asymptotic theory might be involved or does not provide accurate approximations for rare variant data. Based on the p-values, combined test statistics such as the aggregated Cauchy association test (ACAT) can also be utilized. In simulation studies, we show that our framework outperforms existing approaches for family-based studies in several scenarios. We also applied our methodology to a TOPMed whole-genome sequencing dataset with 897 asthmatic trios from Costa Rica. AVAILABILITY: FBAT software is available at https://sites.google.com/view/fbatwebpage. Simulation code is available at https://github.com/julianhecker/FBAT_rare_variant_test_simulations. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
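The aggregated Cauchy association test (ACAT) mentioned in this abstract combines p-values by transforming each one to a Cauchy-distributed quantity. Below is a minimal sketch of the standard Cauchy combination formula, assuming equal weights by default; it is not taken from the FBAT software, and the function name `cauchy_combination` is hypothetical.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination (ACAT) formula."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        # Assumption for this sketch: equal weights.
        weights = [1.0] * len(p_values)
    total_weight = sum(weights)
    # Weighted average of Cauchy-transformed p-values.
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    ) / total_weight
    # Tail probability of a standard Cauchy variable at the statistic.
    return 0.5 - math.atan(statistic) / math.pi


# Illustrative inputs only: one small and one unremarkable p-value.
combined_p = cauchy_combination([0.01, 0.5])
```

With a single input the formula returns that p-value unchanged, which is a quick sanity check on the implementation.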

3.
Genet Epidemiol ; 2020 Sep 14.
Article in English | MEDLINE | ID: mdl-32929743

ABSTRACT

locStra is an R package for the analysis of regional and global population stratification in whole-genome sequencing (WGS) studies, where regional stratification refers to the substructure defined by the loci in a particular region on the genome. Population substructure can be assessed based on the genetic covariance matrix, the genomic relationship matrix, and the unweighted/weighted genetic Jaccard similarity matrix. Using a sliding window approach, the regional similarity matrices are compared with the global ones, based on user-defined window sizes and metrics, for example, the correlation between regional and global eigenvectors. An algorithm for the specification of the window size is provided. As the implementation fully exploits sparse matrix algebra and is written in C++, the analysis is highly efficient. Even on single cores, for realistic study sizes (several thousand subjects, several million rare variants per subject), the runtime for the genome-wide computation of all regional similarity matrices typically does not exceed one hour, enabling an unprecedented investigation of regional stratification across the entire genome. The package is applied to three WGS studies, illustrating the varying patterns of regional substructure across the genome and its beneficial effects on association testing.
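As an illustration of the unweighted Jaccard similarity mentioned in this abstract, the sketch below treats each subject as the set of rare variants they carry and computes pairwise overlap. This is a plain-Python sketch of the general Jaccard index, not the locStra implementation (which the abstract describes as sparse-matrix C++ code); the function names, the convention for empty sets, and the example data are assumptions.

```python
def jaccard_similarity(carried_a, carried_b):
    """Jaccard index of two collections of carried variant identifiers."""
    a, b = set(carried_a), set(carried_b)
    union = a | b
    if not union:
        # Assumed convention for this sketch: two empty sets score 0.
        return 0.0
    return len(a & b) / len(union)


def jaccard_matrix(carriers):
    """Pairwise Jaccard similarity matrix for a list of subjects."""
    n = len(carriers)
    return [
        [jaccard_similarity(carriers[i], carriers[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical variant indices carried by three subjects.
subjects = [{1, 2, 3}, {2, 3, 4}, {9}]
similarity = jaccard_matrix(subjects)
```

A regional analysis in the spirit of the abstract would restrict each subject's set to the variants inside one window and compare the resulting matrix with the genome-wide one.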

5.
Eur Respir J ; 2020 Aug 27.
Article in English | MEDLINE | ID: mdl-32855217

ABSTRACT

BACKGROUND: Most children diagnosed with asthma suffer from respiratory symptoms such as cough, dyspnea, and wheezing, which are also important markers of overall respiratory function. A decade of genome-wide association studies (GWAS) have investigated the genetic susceptibility of asthma diagnosis itself, but few have focused on important respiratory symptoms that characterise childhood asthma. METHOD: Using whole-genome sequencing (WGS) data for 894 asthmatic trios from a Costa Rican cohort, we performed family-based association tests (FBATs) to assess the association between genetic variants and multiple asthma-relevant respiratory phenotypes: cough, phlegm, wheezing, exertional dyspnea, and exertional chest tightness. We tested whether genome-wide significant associations replicated in two additional studies: 1) 286 WGS trios from the Childhood Asthma Management Program (CAMP), and 2) 2691 African American (AA) current or former smokers from the COPDGene study. RESULTS: In the 894 Costa Rican trios, we identified a genome-wide significant association between exertional dyspnea and single nucleotide polymorphism (SNP) rs10165869, located on chromosome 2q37.3, with a p value of 3.49×10⁻⁹ that was replicated in the CAMP cohort (p=0.0222) with the same direction of association (combined p=5.54×10⁻¹⁰), but was not associated in the AA subjects from COPDGene. We also found suggestive evidence of a link between SNP rs10165869 and the atypical chemokine receptor 3 (ACKR3) for the biological interpretation. CONCLUSION: We identified and replicated a novel association between exertional dyspnea and SNP rs10165869 in childhood asthma, which encourages the discovery of variants associated with respiratory symptoms in various airway diseases.

6.
Sci Rep ; 10(1): 5029, 2020 03 19.
Article in English | MEDLINE | ID: mdl-32193444

ABSTRACT

With the advent of whole-genome sequencing (WGS) studies, family-based designs enable sex-specific analysis approaches that can be applied to only affected individuals; tests using family-based designs are attractive because they are completely robust against the effects of population substructure. These advantages make family-based association tests (FBATs) that use siblings as well as parents especially suited for the analysis of late-onset diseases such as Alzheimer's Disease (AD). However, the application of FBATs to assess sex-specific effects can require additional filtering steps, as sensitivity to sequencing errors is amplified in this type of analysis. Here, we illustrate the implementation of robust analysis approaches and additional filtering steps that can minimize the chances of false-positive findings due to sex-specific sequencing errors. We apply this approach to two family-based AD datasets and identify four novel loci (GRID1, RIOK3, MCPH1, ZBTB7C) showing sex-specific association with AD risk. Following stringent quality control filtering, the strongest candidate is ZBTB7C (P_inter = 1.83 × 10⁻⁷), in which the minor allele of rs1944572 confers increased risk for AD in females and protection in males. ZBTB7C encodes the Zinc Finger and BTB Domain Containing 7C, a transcriptional repressor of membrane metalloproteases (MMP). Members of this MMP family were implicated in AD neuropathology.

7.
Transl Psychiatry ; 10(1): 57, 2020 02 04.
Article in English | MEDLINE | ID: mdl-32066727

ABSTRACT

Bipolar disorder (BD) is a highly heritable neuropsychiatric disease characterized by recurrent episodes of depression and mania. Research suggests that the cumulative impact of common alleles explains 25-38% of phenotypic variance, and that rare variants may contribute to BD susceptibility. To identify rare, high-penetrance susceptibility variants for BD, whole-exome sequencing (WES) was performed in three affected individuals from each of 27 multiply affected families from Spain and Germany. WES identified 378 rare, non-synonymous, and potentially functional variants. These spanned 368 genes, and were carried by all three affected members in at least one family. Eight of the 368 genes harbored rare variants that were implicated in at least two independent families. In an extended segregation analysis involving additional family members, five of these eight genes harbored variants showing full or nearly full cosegregation with BD. These included the brain-expressed genes RGS12 and NCKAP5, which were considered the most promising BD candidates on the basis of independent evidence. Gene enrichment analysis for all 368 genes revealed significant enrichment for four pathways, including genes reported in de novo studies of autism (padj < 0.006) and schizophrenia (padj = 0.015). These results suggest a possible genetic overlap with BD for autism and schizophrenia at the rare-sequence-variant level. The present study implicates novel candidate genes for BD development, and may contribute to an improved understanding of the biological basis of this common and often devastating disease.

8.
Cancer Epidemiol Biomarkers Prev ; 29(2): 427-433, 2020 02.
Article in English | MEDLINE | ID: mdl-31748258

ABSTRACT

BACKGROUND: Obesity is a major risk factor for esophageal adenocarcinoma (EA) and its precursor Barrett's esophagus (BE). Research suggests that individuals with high genetic risk to obesity have a higher BE/EA risk. To facilitate understanding of biological factors that lead to progression from BE to EA, the present study investigated the shared genetic background of BE/EA and obesity-related traits. METHODS: Cross-trait linkage disequilibrium score regression was applied to summary statistics from genome-wide association meta-analyses on BE/EA and on obesity traits. Body mass index (BMI) was used as a proxy for general obesity, and waist-to-hip ratio (WHR) for abdominal obesity. For single marker analyses, all genome-wide significant risk alleles for BMI and WHR were compared with summary statistics of the BE/EA meta-analyses. RESULTS: Sex-combined analyses revealed a significant genetic correlation between BMI and BE/EA (rg = 0.13, P = 2 × 10⁻⁴) and an rg of 0.12 between WHR and BE/EA (P = 1 × 10⁻²). Sex-specific analyses revealed a pronounced genetic correlation between BMI and EA in females (rg = 0.17, P = 1.2 × 10⁻³), and WHR and EA in males (rg = 0.18, P = 1.51 × 10⁻²). On the single marker level, significant enrichment of concordant effects was observed for BMI and BE/EA risk variants (P = 8.45 × 10⁻³) and WHR and BE/EA risk variants (P = 2 × 10⁻²). CONCLUSIONS: Our study provides evidence for sex-specific genetic correlations that might reflect specific biological mechanisms. The data demonstrate that shared genetic factors are particularly relevant in progression from BE to EA. IMPACT: Our study quantifies the genetic correlation between BE/EA and obesity. Further research is now warranted to elucidate these effects and to understand the shared pathophysiology.

9.
Genet Epidemiol ; 44(2): 139-147, 2020 03.
Article in English | MEDLINE | ID: mdl-31713269

ABSTRACT

In the analysis of current life science datasets, we often encounter scenarios in which the application of asymptotic theory to hypothesis testing can be problematic. Besides improved asymptotic results, permutation/simulation-based tests are a general approach to address this issue. However, these randomized tests can impose a massive computational burden, for example, in scenarios in which large numbers of statistical tests are computed, and the specified significance level is very small. Stopping rules aim to assess significance with the smallest possible number of draws while controlling the probabilities of errors due to statistical uncertainty. In this communication, we derive a general stopping rule, QUICK-STOP, based on the sequential testing theory that is easy to implement, controls the error probabilities rigorously, and is nearly optimal in terms of expected draws. In a simulation study, we show that our approach outperforms current stopping approaches for general randomized tests by a factor of 10 and does not impose an additional computational burden. We illustrate our approach by applying our stopping rule to a single-variant analysis of a whole-genome sequencing study for lung function.


Subjects
Computer Simulation; Confidence Intervals; Genome, Human; Genome-Wide Association Study; Humans; Models, Genetic; Numerical Analysis, Computer-Assisted; Pliability; Probability; Pulmonary Disease, Chronic Obstructive/genetics
10.
Mol Psychiatry ; 2019 Nov 11.
Article in English | MEDLINE | ID: mdl-31712720

ABSTRACT

Panic disorder (PD) has a lifetime prevalence of 2-4% and heritability estimates of 40%. The contributory genetic variants remain largely unknown, with few and inconsistent loci having been reported. The present report describes the largest genome-wide association study (GWAS) of PD to date comprising genome-wide genotype data of 2248 clinically well-characterized PD patients and 7992 ethnically matched controls. The samples originated from four European countries (Denmark, Estonia, Germany, and Sweden). Standard GWAS quality control procedures were conducted on each individual dataset, and imputation was performed using the 1000 Genomes Project reference panel. A meta-analysis was then performed using the Ricopili pipeline. No genome-wide significant locus was identified. Leave-one-out analyses generated highly significant polygenic risk scores (PRS) (explained variance of up to 2.6%). Linkage disequilibrium (LD) score regression analysis of the GWAS data showed that the estimated heritability for PD was 28.0-34.2%. After correction for multiple testing, a significant genetic correlation was found between PD and major depressive disorder, depressive symptoms, and neuroticism. A total of 255 single-nucleotide polymorphisms (SNPs) with p < 1 × 10⁻⁴ were followed up in an independent sample of 2408 PD patients and 228,470 controls from Denmark, Iceland and the Netherlands. In the combined analysis, SNP rs144783209 showed the strongest association with PD (p_comb = 3.10 × 10⁻⁷). Sign tests revealed a significant enrichment of SNPs with a discovery p-value of <0.0001 in the combined follow up cohort (p = 0.048). The present integrative analysis represents a major step towards the elucidation of the genetic susceptibility to PD.

11.
Chest ; 156(6): 1068-1079, 2019 12.
Article in English | MEDLINE | ID: mdl-31557467

ABSTRACT

BACKGROUND: Asthma is a common respiratory disorder with a highly heterogeneous nature that remains poorly understood. The objective was to use whole genome sequencing (WGS) data to identify regions of common genetic variation contributing to lung function in individuals with a diagnosis of asthma. METHODS: WGS data were generated for 1,053 individuals from trios and extended pedigrees participating in the family-based Genetic Epidemiology of Asthma in Costa Rica study. Asthma affection status was defined through a physician's diagnosis of asthma, and most participants with asthma also had airway hyperresponsiveness (AHR) to methacholine. Family-based association tests for single variants were performed to assess the associations with lung function phenotypes. RESULTS: A genome-wide significant association was identified between baseline FEV1/FVC ratio and a single-nucleotide polymorphism in the top hit cysteine-rich secretory protein LCCL domain-containing 2 (CRISPLD2) (rs12051168; P = 3.6 × 10⁻⁸ in the unadjusted model) that retained suggestive significance in the covariate-adjusted model (P = 5.6 × 10⁻⁶). Rs12051168 was also nominally associated with other related phenotypes: baseline FEV1 (P = 3.3 × 10⁻³), postbronchodilator (PB) FEV1 (P = 7.3 × 10⁻³), and PB FEV1/FVC ratio (P = 2.7 × 10⁻³). The identified baseline FEV1/FVC ratio and rs12051168 association was meta-analyzed and replicated in three independent cohorts in which most participants with asthma also had confirmed AHR (combined weighted z-score P = .015) but not in cohorts without information about AHR. CONCLUSIONS: These findings suggest that using specific asthma characteristics, such as AHR, can help identify more genetically homogeneous asthma subgroups with genotype-phenotype associations that may not be observed in all children with asthma. CRISPLD2 also may be important for baseline lung function in individuals with asthma who also may have AHR.
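The "combined weighted z-score" replication result in this abstract refers to a meta-analysis of per-cohort test statistics. The sketch below shows one common form, a weighted Stouffer combination of two-sided p-values with square-root-of-sample-size weights. The abstract does not state the weights or software used, so the weighting scheme, the function name `weighted_z_meta`, and the example inputs are all assumptions.

```python
import math
from statistics import NormalDist


def weighted_z_meta(p_values, directions, sample_sizes):
    """Combine two-sided p-values with a weighted Stouffer z-score.

    directions holds +1 or -1 for the sign of each cohort's effect.
    """
    normal = NormalDist()
    if any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    # Convert each two-sided p-value to a signed z-score.
    z_scores = [d * normal.inv_cdf(1.0 - p / 2.0)
                for p, d in zip(p_values, directions)]
    # Assumed weighting for this sketch: square root of the sample size.
    weights = [math.sqrt(n) for n in sample_sizes]
    combined_z = (sum(w * z for w, z in zip(weights, z_scores))
                  / math.sqrt(sum(w * w for w in weights)))
    combined_p = 2.0 * (1.0 - normal.cdf(abs(combined_z)))
    return combined_z, combined_p


# Illustrative inputs only: three cohorts with the same effect direction.
z_meta, p_meta = weighted_z_meta([0.04, 0.20, 0.10], [1, 1, 1], [300, 150, 250])
```

Because the z-scores are signed, cohorts with opposite effect directions cancel rather than reinforce each other, which is what makes the statistic a test of consistent replication.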


Subjects
Asthma/genetics; Asthma/physiopathology; Cell Adhesion Molecules/genetics; Forced Expiratory Volume/genetics; Interferon Regulatory Factors/genetics; Vital Capacity/genetics; Whole Genome Sequencing; Adolescent; Adult; Child; Child, Preschool; Costa Rica; Female; Humans; Male; Middle Aged; Respiratory Physiological Phenomena/genetics; Young Adult
12.
Genet Epidemiol ; 43(8): 1046-1055, 2019 12.
Article in English | MEDLINE | ID: mdl-31429121

ABSTRACT

False-positive rates in genome-wide association analyses are affected by population stratification, and if stratification is not correctly adjusted for, the statistical analysis can also produce false-negative findings. Various approaches have therefore been proposed to adjust for such problems in genome-wide association studies. However, in spite of the importance of the issue, few studies have addressed it in genome-wide single nucleotide polymorphism (SNP)-by-environment interaction studies. In this report, we illustrate which scenarios can lead to false-positive findings in association mapping and describe an approach to maintaining the overall type-1 error rate.


Subjects
Gene-Environment Interaction; Polymorphism, Single Nucleotide; Aged; Aged, 80 and over; Genetics, Population; Genome-Wide Association Study; Humans; Middle Aged; Pulmonary Disease, Chronic Obstructive/genetics
13.
Am J Respir Crit Care Med ; 200(6): 677-690, 2019 09 15.
Article in English | MEDLINE | ID: mdl-30908940

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is a common and progressive disease that is influenced by both genetic and environmental factors. For many years, knowledge of the genetic basis of COPD was limited to Mendelian syndromes, such as alpha-1 antitrypsin deficiency and cutis laxa, caused by rare genetic variants. Over the past decade, the proliferation of genome-wide association studies, the accessibility of whole-genome sequencing, and the development of novel methods for analyzing genetic variation data have led to a substantial increase in the understanding of genetic variants that play a role in COPD susceptibility and COPD-related phenotypes. COPDGene (Genetic Epidemiology of COPD), a multicenter, longitudinal study of over 10,000 current and former cigarette smokers, has been pivotal to these breakthroughs in understanding the genetic basis of COPD. To date, over 20 genetic loci have been convincingly associated with COPD affection status, with additional loci demonstrating association with COPD-related phenotypes such as emphysema, chronic bronchitis, and hypoxemia. In this review, we discuss the contributions of the COPDGene study to the discovery of these genetic associations as well as the ongoing genetic investigations of COPD subtypes, protein biomarkers, and post-genome-wide association study analysis.


Subjects
Genetic Predisposition to Disease; Pulmonary Disease, Chronic Obstructive/genetics; Pulmonary Disease, Chronic Obstructive/physiopathology; Aged; Aged, 80 and over; Biomarkers; Female; Genome-Wide Association Study; Humans; Longitudinal Studies; Male; Middle Aged; Phenotype; Polymorphism, Single Nucleotide; Risk Assessment
14.
Genet Epidemiol ; 43(3): 300-317, 2019 04.
Article in English | MEDLINE | ID: mdl-30609057

ABSTRACT

The transmission disequilibrium test (TDT) is the gold standard for testing the association between a genetic variant and disease in samples consisting of affected individuals and their parents. In practice, more complex pedigree structures, that is siblings with no parents, or three-generational pedigrees with possibly missing genotypes, are common. There are several generalizations of the TDT that are suitable for use with arbitrary pedigree structures. We consider three such frequently used generalizations, family-based association test, pedigree disequilibrium test, and generalized disequilibrium test, that have accompanying software and compare them regarding validity and power in the single variant setting. We use simulations to study the effects of population admixture, populations whose genotypes are not in Hardy-Weinberg equilibrium (HWE), different pedigree structures, and the presence of linkage. Whereas our results show that some TDT generalizations can have a substantially increased Type 1 error, these tests are often used in substantive research without caveats about the validity of their Type 1 error. For the association analysis of rare variants in sequencing studies, region-based extensions of the TDT generalizations, that rely on the postulated robustness of the single variant tests, have been proposed. We discuss the implications of our results for these region-based extensions.
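For readers unfamiliar with the classical TDT that this abstract takes as its starting point, the statistic for a biallelic variant is a McNemar-type comparison of how often heterozygous parents transmit versus do not transmit a given allele to affected offspring. The sketch below is a minimal illustration of that basic trio-based statistic only; it does not implement any of the pedigree generalizations compared in the paper, and the function name `tdt` and the example counts are hypothetical.

```python
import math


def tdt(transmitted, not_transmitted):
    """Classical TDT for one biallelic variant.

    transmitted / not_transmitted count how often heterozygous parents
    passed on or did not pass on the allele of interest to an affected child.
    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    informative = transmitted + not_transmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi_square = (transmitted - not_transmitted) ** 2 / informative
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Illustrative counts only: 60 transmissions versus 40 non-transmissions.
chi_square, p_value = tdt(60, 40)
```

Only heterozygous parents contribute, which is the source of the test's robustness to population stratification in the trio setting.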


Subjects
Genetic Association Studies; Linkage Disequilibrium/genetics; Computer Simulation; Family; Female; Gene Frequency/genetics; Genetic Linkage; Humans; Male; Models, Genetic; Parents; Pedigree; Software
15.
PLoS One ; 13(10): e0205895, 2018.
Article in English | MEDLINE | ID: mdl-30379966

ABSTRACT

Bipolar disorder (BD) is a major psychiatric illness affecting around 1% of the global population. BD is characterized by recurrent manic and depressive episodes, and has an estimated heritability of around 70%. Research has identified the first BD susceptibility genes. However, the underlying pathways and regulatory networks remain largely unknown. Research suggests that the cumulative impact of common alleles with small effects explains only around 25-38% of the phenotypic variance for BD. A plausible hypothesis therefore is that rare, high penetrance variants may contribute to BD risk. The present study investigated the role of rare, nonsynonymous, and potentially functional variants via whole exome sequencing in 15 BD cases from two large, multiply affected families from Cuba. The high prevalence of BD in these pedigrees renders them promising in terms of the identification of genetic risk variants with large effect sizes. In addition, SNP array data were used to calculate polygenic risk scores for affected and unaffected family members. After correction for multiple testing, no significant increase in polygenic risk scores for common, BD-associated genetic variants was found in BD cases compared to healthy relatives. Exome sequencing identified a total of 17 rare and potentially damaging variants in 17 genes. The identified variants were shared by all investigated BD cases in the respective pedigree. The most promising variant was located in the gene SERPING1 (p.L349F), which has been reported previously as a genome-wide significant risk gene for schizophrenia. The present data suggest novel candidate genes for BD susceptibility, and may facilitate the discovery of disease-relevant pathways and regulatory networks.


Subjects
Bipolar Disorder/genetics; Complement C1 Inhibitor Protein/genetics; Exome; Gene Regulatory Networks; Genetic Predisposition to Disease; Polymorphism, Single Nucleotide; Alleles; Bipolar Disorder/diagnosis; Bipolar Disorder/physiopathology; Cuba; Family; Female; Gene Expression; Genome-Wide Association Study; Humans; Male; Pedigree; Penetrance; Risk; Whole Exome Sequencing
16.
Front Psychiatry ; 9: 207, 2018.
Article in English | MEDLINE | ID: mdl-29904359

ABSTRACT

Bipolar disorder (BD) is a common, highly heritable neuropsychiatric disease characterized by recurrent episodes of mania and depression. Lithium is the best-established long-term treatment for BD, even though individual response is highly variable. Evidence suggests that some of this variability has a genetic basis. This is supported by the largest genome-wide association study (GWAS) of lithium response to date conducted by the International Consortium on Lithium Genetics (ConLiGen). Recently, we performed the first genome-wide analysis of the involvement of miRNAs in BD and identified nine BD-associated miRNAs. However, it is unknown whether these miRNAs are also associated with lithium response in BD. In the present study, we therefore tested whether common variants at these nine candidate miRNAs contribute to the variance in lithium response in BD. Furthermore, we systematically analyzed whether any other miRNA in the genome is implicated in the response to lithium. For this purpose, we performed gene-based tests for all known miRNA coding genes in the ConLiGen GWAS dataset (n = 2,563 patients) using a set-based testing approach adapted from the versatile gene-based test for GWAS (VEGAS2). In the candidate approach, miR-499a showed a nominally significant association with lithium response, providing some evidence for involvement in both development and treatment of BD. In the genome-wide miRNA analysis, 71 miRNAs showed nominally significant associations with the dichotomous phenotype and 106 with the continuous trait for treatment response. A total of 15 miRNAs revealed nominal significance in both phenotypes with miR-633 showing the strongest association with the continuous trait (p = 9.80E-04) and miR-607 with the dichotomous phenotype (p = 5.79E-04). No association between miRNAs and treatment response to lithium in BD in either of the tested conditions withstood multiple testing correction. 
Given the limited power of our study, the investigation of miRNAs in larger GWAS samples of BD and lithium response is warranted.

17.
Biostatistics ; 19(3): 295-306, 2018 07 01.
Article in English | MEDLINE | ID: mdl-28968646

ABSTRACT

To quantify polygenic effects, i.e. undetected genetic effects, in large-scale association studies, we propose a generalized estimating equation (GEE) based estimation framework. We develop a marginal model for single-variant association test statistics of complex diseases that generalizes existing approaches such as LD Score regression and that is applicable to population-based designs, to family-based designs or to arbitrary combinations of both. We extend the standard GEE approach so that the parameters of the proposed marginal model can be estimated based on working-correlation/linkage-disequilibrium (LD) matrices from external reference panels. Our method achieves substantial efficiency gains over standard approaches, while it is robust against misspecification of the LD structure, i.e. the LD structure of the reference panel can differ substantially from the true LD structure in the study population. In simulation studies and in applications to population-based and family-based studies, we illustrate the features of the proposed GEE framework. Our results suggest that our approach can be up to 100% more efficient than existing methodology.


Subjects
Biostatistics/methods; Genome-Wide Association Study/methods; Linkage Disequilibrium; Models, Statistical; Computer Simulation; Humans; Mental Disorders/genetics; Regression Analysis
18.
Genet Epidemiol ; 42(1): 123-126, 2018 02.
Article in English | MEDLINE | ID: mdl-29159827

ABSTRACT

For family-based association studies, Horvath et al. proposed an algorithm for the association analysis between haplotypes and arbitrary phenotypes when the phase of the haplotypes is unknown, that is, genotype data is given. Their approach to haplotype analysis maintains the original features of the TDT/FBAT-approach, that is, complete robustness against genetic confounding and misspecification of the phenotype. The algorithm has been implemented in the FBAT and PBAT software package and has been used in numerous substantive manuscripts. Here, we propose a simplification of the original algorithm that maintains the original approach but reduces the computational burden of the approach substantially and gives valuable insights regarding the conditional distribution. With the modified algorithm, the application to whole-genome sequencing (WGS) studies becomes feasible; for example, in sliding window approaches or spatial-clustering approaches. The reduction of the computational burden that our modification provides is especially dramatic when both parental genotypes are missing. For example, for eight variants and 441 nuclear families with mostly offspring-only families, in a WGS study at the APOE locus, the running time decreased from approximately 21 hr for the original algorithm to 0.11 sec after our modification.


Subjects
Algorithms; Haplotypes; Nuclear Family; Phenotype; Apolipoproteins E/genetics; Cluster Analysis; Female; Humans; Male; Models, Genetic; Time Factors; Whole Genome Sequencing
19.
Transl Psychiatry ; 7(12): 1273, 2017 12 11.
Article in English | MEDLINE | ID: mdl-29225345

ABSTRACT

Bipolar disorder (BPD) and major depressive disorder (MDD) are primary major mood disorders. Recent studies suggest that they share certain psychopathological features and common risk genes, but unraveling the full genetic architecture underlying the risk of major mood disorders remains an important scientific task. The public genome-wide association study (GWAS) data sets offer the opportunity to examine this topic by utilizing large amounts of combined genetic data, which should ultimately allow a better understanding of the onset and development of these illnesses. Genome-wide meta-analysis was performed by combining two GWAS data sets on BPD and MDD (19,637 cases and 18,083 controls), followed by replication analyses for the loci of interest in independent 12,364 cases and 76,633 controls from additional samples that were not included in the two GWAS data sets. The single-nucleotide polymorphism (SNP) rs10791889 at 11q13.2 was significant in both discovery and replication samples. When combining all samples, this SNP and multiple other SNPs at 2q11.2 (rs717454), 8q21.3 (rs10103191), and 11q13.2 (rs2167457) exhibited genome-wide significant association with major mood disorders. The SNPs in 2q11.2 and 8q21.3 were novel risk SNPs that were not previously reported, and SNPs at 11q13.2 were in high LD with potential BPD risk SNPs implicated in a previous GWAS. The genome-wide significant loci at 2q11.2 and 11q13.2 exhibited strong effects on the mRNA expression of certain nearby genes in cerebellum. In conclusion, we have identified several novel loci associated with major mood disorders, adding further support for shared genetic risk between BPD and MDD. Our study highlights the necessity and importance of mining public data sets to explore risk genes for complex diseases such as mood disorders.


Subjects
Bipolar Disorder/genetics; Depressive Disorder, Major/genetics; Mood Disorders/genetics; Polymorphism, Single Nucleotide; Female; Genetic Predisposition to Disease; Genome-Wide Association Study; Humans; Male
20.
Genet Epidemiol ; 41(4): 332-340, 2017 05.
Article in English | MEDLINE | ID: mdl-28318110

ABSTRACT

For the association analysis of whole-genome sequencing (WGS) studies, we propose an efficient and fast spatial-clustering algorithm. Compared to existing analysis approaches for WGS data, that define the tested regions either by sliding or consecutive windows of fixed sizes along variants, a meaningful grouping of nearby variants into consecutive regions has the advantage that, compared to sliding window approaches, the number of tested regions is likely to be smaller. In comparison to consecutive, fixed-window approaches, our approach is likely to group nearby variants together. Given existing biological evidence that disease-associated mutations tend to physically cluster in specific regions along the chromosome, the identification of meaningful groups of nearby located variants could thus lead to a potential power gain for association analysis. Our algorithm defines consecutive genomic regions based on the physical positions of the variants, assuming an inhomogeneous Poisson process and groups together nearby variants. As parameters are estimated locally, the algorithm takes the differing variant density along the chromosome into account and provides locally optimal partitioning of variants into consecutive regions. An R-implementation of the algorithm is provided. We discuss the theoretical advances of our algorithm compared to existing, window-based approaches and show the performance and advantage of our introduced algorithm in a simulation study and by an application to Alzheimer's disease WGS data. Our analysis identifies a region in the ITGB3 gene that potentially harbors disease susceptibility loci for Alzheimer's disease. The region-based association signal of ITGB3 replicates in an independent data set and achieves formally genome-wide significance. Software Implementation: An implementation of the algorithm in R is available at: https://github.com/heidefier/cluster_wgs_data.


Subjects
Genome-Wide Association Study; Genome; Sequence Analysis, DNA; Algorithms; Alzheimer Disease/genetics; Cluster Analysis; Computer Simulation; Genomics; Humans; Models, Genetic; Software