ABSTRACT
It is well known that inbreeding increases the risk of recessive monogenic diseases, but it is less certain whether it contributes to the etiology of complex diseases such as schizophrenia. One way to estimate the effects of inbreeding is to examine the association between disease diagnosis and genome-wide autozygosity estimated using runs of homozygosity (ROH) in genome-wide single nucleotide polymorphism arrays. Using data for schizophrenia from the Psychiatric Genomics Consortium (n = 21,868), Keller et al. (2012) estimated that the odds of developing schizophrenia increased by approximately 17% for every additional percent of the genome that is autozygous (β = 16.1, CI(β) = [6.93, 25.7], Z = 3.44, p = 0.0006). Here we describe replication results from 22 independent schizophrenia case-control datasets from the Psychiatric Genomics Consortium (n = 39,830). Using the same ROH calling thresholds and procedures as Keller et al. (2012), we were unable to replicate the significant association between ROH burden and schizophrenia in the independent PGC phase II data, although the effect was in the predicted direction, and the combined (original + replication) dataset yielded an attenuated but significant relationship between F_ROH and schizophrenia (β = 4.86, CI(β) = [0.90, 8.83], Z = 2.40, p = 0.02). Since Keller et al. (2012), several studies have reported inconsistent associations of ROH burden with complex traits, particularly in case-control data. These conflicting results might suggest that the effects of autozygosity are confounded by various factors, such as socioeconomic status, education, urbanicity, and religiosity, which may be associated with both real inbreeding and the outcome measures of interest.
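The link between the reported slopes and the "approximately 17%" figure can be checked with a short calculation. The sketch below assumes that the slope is a log-odds coefficient per unit of the ROH-based autozygosity fraction measured as a proportion from 0 to 1, so that one additional percent of the genome corresponds to a change of 0.01. That scaling is an assumption inferred from the numbers in the abstract, and the function names and the autosome length in the example are hypothetical.

```python
import math


def froh(roh_segments_bp, autosome_length_bp):
    """Fraction of the autosome covered by ROH segments.

    roh_segments_bp: iterable of (start, end) base-pair coordinates,
    assumed non-overlapping (hypothetical input format).
    """
    covered = sum(end - start for start, end in roh_segments_bp)
    return covered / autosome_length_bp


def odds_ratio_per_percent(beta):
    """Odds ratio for a 0.01 increase in the autozygosity fraction.

    Assumes beta is the log-odds slope per unit fraction (0 to 1).
    """
    return math.exp(beta * 0.01)


# Slopes reported in the abstract.
or_original = odds_ratio_per_percent(16.1)   # Keller et al. (2012)
or_combined = odds_ratio_per_percent(4.86)   # original + replication

# Hypothetical individual: 3 Mb of ROH on a 300 Mb toy autosome.
example_froh = froh([(0, 1_000_000), (5_000_000, 7_000_000)], 300_000_000)
```

Under this assumption the original slope corresponds to an odds ratio of about 1.17 per additional percent autozygosity, matching the figure quoted above, while the combined estimate corresponds to about 1.05.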
Subjects
Consanguinity, Genome-Wide Association Study, Schizophrenia/genetics, Female, Human Genome, Genomics, Homozygote, Humans, Male, Single Nucleotide Polymorphism, Schizophrenia/epidemiology, Schizophrenia/pathology
ABSTRACT
Heritability is a fundamental parameter in genetics. Traditional estimates based on family or twin studies can be biased due to shared environmental or non-additive genetic variance. Alternatively, those based on genotyped or imputed variants typically underestimate narrow-sense heritability contributed by rare or otherwise poorly tagged causal variants. Identical-by-descent (IBD) segments of the genome share all variants between pairs of chromosomes except new mutations that have arisen since the last common ancestor. Therefore, relating phenotypic similarity to degree of IBD sharing among classically unrelated individuals is an appealing approach to estimating the near full additive genetic variance while possibly avoiding biases that can occur when modeling close relatives. We applied an IBD-based approach (GREML-IBD) to estimate heritability in unrelated individuals using phenotypic simulation with thousands of whole-genome sequences across a range of stratification, polygenicity levels, and the minor allele frequencies of causal variants (CVs). In simulations, the IBD-based approach produced unbiased heritability estimates, even when CVs were extremely rare, although precision was low. However, population stratification and non-genetic familial environmental effects shared across generations led to strong biases in IBD-based heritability. We used data on two traits in ~120,000 people from the UK Biobank to demonstrate that, depending on the trait and possible confounding environmental effects, GREML-IBD can be applied to very large genetic datasets to infer the contribution of very rare variants lost using other methods. However, we observed apparent biases in these real data, suggesting that more work may be required to understand and mitigate factors that influence IBD-based heritability estimates.
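The idea of relating phenotypic similarity to pairwise IBD sharing can be illustrated with Haseman-Elston regression, a simpler moment-based stand-in for the REML fitting that GREML-IBD uses. This is a sketch, not the authors' method: the function name, the input layout (a symmetric matrix of pairwise IBD sharing), and the toy data are all hypothetical.

```python
import numpy as np


def he_regression(phenotype, ibd_sharing):
    """Haseman-Elston regression slope.

    Regresses cross-products of standardized phenotypes on pairwise
    IBD sharing; the slope is a moment-based heritability estimate.
    ibd_sharing is assumed to be a symmetric n x n matrix.
    """
    y = np.asarray(phenotype, dtype=float)
    sharing = np.asarray(ibd_sharing, dtype=float)
    z = (y - y.mean()) / y.std()            # mean 0, variance 1
    i, j = np.triu_indices(len(z), k=1)     # each unordered pair once
    slope, _intercept = np.polyfit(sharing[i, j], z[i] * z[j], 1)
    return slope


# Hypothetical toy data: 4 individuals with small pairwise IBD sharing.
toy_phenotype = [1.0, 2.0, 0.5, 3.0]
toy_ibd = np.array([
    [1.00, 0.01, 0.02, 0.00],
    [0.01, 1.00, 0.03, 0.01],
    [0.02, 0.03, 1.00, 0.05],
    [0.00, 0.01, 0.05, 1.00],
])
estimate = he_regression(toy_phenotype, toy_ibd)
```

With only four individuals the estimate is meaningless; the toy data only show the expected input shapes. The sketch also makes no attempt to handle the stratification and shared-environment biases described above.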
Subjects
Human Chromosomes, Gene Frequency, Human Genome, Haplotypes, Humans, Phenotype, Single Nucleotide Polymorphism
ABSTRACT
Multiple methods have been developed to estimate narrow-sense heritability, h², using single nucleotide polymorphisms (SNPs) in unrelated individuals. However, a comprehensive evaluation of these methods has not yet been performed, leading to confusion and discrepancy in the literature. We present the most thorough and realistic comparison of these methods to date. We used thousands of real whole-genome sequences to simulate phenotypes under varying genetic architectures and confounding variables, and we used array, imputed, or whole-genome sequence SNPs to obtain 'SNP-heritability' estimates. We show that SNP-heritability can be highly sensitive to assumptions about the frequencies, effect sizes, and levels of linkage disequilibrium of underlying causal variants, but that methods that bin SNPs according to minor allele frequency and linkage disequilibrium are less sensitive to these assumptions across a wide range of genetic architectures and possible confounding factors. These findings provide guidance for best practices and proper interpretation of published estimates.
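Binning SNPs by minor allele frequency (MAF) can be sketched as building one genetic relationship matrix per MAF bin. The example below covers only that step, under stated assumptions: genotypes are coded as 0/1/2 allele counts in an individuals-by-SNPs array, SNPs are standardized with sample allele frequencies, and binning by linkage disequilibrium is omitted. The function name and bin edges are hypothetical, and the step of fitting the matrices as separate variance components is not shown.

```python
import numpy as np


def maf_binned_grms(genotypes, maf_bins):
    """One genetic relationship matrix (GRM) per MAF bin.

    genotypes: individuals x SNPs array of 0/1/2 allele counts.
    maf_bins: list of (low, high] MAF intervals with low >= 0, so
    monomorphic SNPs (MAF = 0) never fall into a bin.
    Returns {(low, high): (grm, n_snps)}; empty bins are skipped.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0                 # sample allele frequency
    maf = np.minimum(p, 1.0 - p)
    grms = {}
    for low, high in maf_bins:
        in_bin = (maf > low) & (maf <= high)
        n_snps = int(in_bin.sum())
        if n_snps == 0:
            continue
        p_bin = p[in_bin]
        # Standardize each SNP to mean 0 and expected variance 1.
        z = (g[:, in_bin] - 2.0 * p_bin) / np.sqrt(2.0 * p_bin * (1.0 - p_bin))
        grms[(low, high)] = (z @ z.T / n_snps, n_snps)
    return grms


# Hypothetical toy data: 4 individuals, 4 SNPs.
toy_genotypes = np.array([
    [0, 1, 2, 0],
    [1, 1, 0, 0],
    [2, 0, 1, 0],
    [1, 2, 1, 1],
])
toy_grms = maf_binned_grms(toy_genotypes, [(0.0, 0.2), (0.2, 0.5)])
```

Keeping the bins as separate matrices lets each frequency class carry its own variance component, which is what makes binned methods less dependent on a single assumed relationship between allele frequency and effect size.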
Subjects
Genome/genetics, Heritable Quantitative Trait, Gene Frequency/genetics, Genome-Wide Association Study/methods, Genotype, Humans, Linkage Disequilibrium, Genetic Models, Multifactorial Inheritance/genetics, Phenotype, Single Nucleotide Polymorphism/genetics
ABSTRACT
Identical by descent (IBD) segments are used to understand a number of fundamental issues in genetics. IBD segments are typically detected using long stretches of identical alleles between haplotypes in phased, whole-genome SNP data. Phase or SNP call errors in genomic data can degrade the accuracy of IBD detection and lead to false-positive or false-negative calls and to under- or overextension of true IBD segments. Furthermore, the number of comparisons increases quadratically with sample size, requiring high computational efficiency. We developed a new IBD segment detection program, FISHR (Find IBD Shared Haplotypes Rapidly), in an attempt to accurately detect IBD segments and to better estimate their endpoints using an algorithm that is fast enough to be deployed on very large whole-genome SNP data sets. We compared the performance of FISHR to three leading IBD segment detection programs: GERMLINE, Refined IBD, and HaploScore. Using simulated and real genomic sequence data, we show that FISHR is slightly more accurate than all three programs at detecting long (>3 cM) IBD segments but slightly less accurate than Refined IBD at detecting short (~1 cM) IBD segments. More importantly, FISHR outperforms all three programs in determining the true endpoints of IBD segments, which is crucial for several applications of IBD information. FISHR takes two to three times longer than GERMLINE to run, whereas both GERMLINE and FISHR are orders of magnitude faster than Refined IBD and HaploScore. Overall, FISHR provides accurate IBD detection in unrelated individuals and is computationally efficient enough to be used on large SNP data sets of more than 60,000 individuals.
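The basic detection idea described above, long stretches of identical alleles between two phased haplotypes, can be sketched as a naive scan. This is not FISHR's algorithm or that of any of the compared programs: it tolerates no phasing or genotyping errors and compares a single pair of haplotypes. The function names and the toy haplotypes are hypothetical.

```python
def pair_count(n_samples):
    """Number of distinct pairs of individuals (grows quadratically)."""
    return n_samples * (n_samples - 1) // 2


def matching_segments(hap_a, hap_b, positions_cm, min_length_cm):
    """Candidate IBD segments between two phased haplotypes.

    Scans for maximal runs of identical alleles and keeps those whose
    genetic length (in centimorgans) is at least min_length_cm.
    Returns a list of (start_cm, end_cm) tuples.
    """
    segments = []
    start = None
    idx = -1
    for idx, (a, b) in enumerate(zip(hap_a, hap_b)):
        if a == b:
            if start is None:
                start = idx                  # a new run begins here
        elif start is not None:
            segments.append((positions_cm[start], positions_cm[idx - 1]))
            start = None
    if start is not None:                    # run extends to the last SNP
        segments.append((positions_cm[start], positions_cm[idx]))
    return [(s, e) for s, e in segments if e - s >= min_length_cm]


# Hypothetical toy haplotypes with one mismatch at the fifth SNP.
toy_a = [0, 1, 1, 0, 1, 0, 0, 1]
toy_b = [0, 1, 1, 0, 0, 0, 0, 1]
toy_cm = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 6.0]
long_segments = matching_segments(toy_a, toy_b, toy_cm, 3.0)
pairs_at_60k = pair_count(60_000)
```

In this sketch a single mismatching allele splits a run, which shows why call and phase errors cause true segments to be broken up or truncated. For 60,000 individuals, `pair_count` gives roughly 1.8 billion pairs, which is why an exhaustive scan like this one would not scale.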