ABSTRACT
Rare copy-number variants (rCNVs) include deletions and duplications that occur infrequently in the global human population and can confer substantial risk for disease. In this study, we aimed to quantify the properties of haploinsufficiency (i.e., deletion intolerance) and triplosensitivity (i.e., duplication intolerance) throughout the human genome. We harmonized and meta-analyzed rCNVs from nearly one million individuals to construct a genome-wide catalog of dosage sensitivity across 54 disorders, which defined 163 dosage sensitive segments associated with at least one disorder. These segments were typically gene dense and often harbored dominant dosage sensitive driver genes, which we were able to prioritize using statistical fine-mapping. Finally, we designed an ensemble machine-learning model to predict probabilities of dosage sensitivity (pHaplo & pTriplo) for all autosomal genes, which identified 2,987 haploinsufficient and 1,559 triplosensitive genes, including 648 that were uniquely triplosensitive. This dosage sensitivity resource will provide broad utility for human disease research and clinical genetics.
Subjects
DNA Copy Number Variations, Human Genome, DNA Copy Number Variations/genetics, Gene Dosage, Haploinsufficiency/genetics, Humans
ABSTRACT
Allele-specific methylation (ASM) is an epigenetic modification whereby one parental allele becomes methylated and the other unmethylated at a specific locus. ASM is most often driven by the presence of nearby heterozygous variants that influence methylation, but also occurs somatically in the context of genomic imprinting. In this study, we investigate ASM using publicly available single-cell reduced representation bisulfite sequencing (scRRBS) data on 608 B cells sampled from six healthy B cell samples and 1,230 cells from 11 chronic lymphocytic leukemia (CLL) samples. We developed a likelihood-based criterion to test whether a CpG exhibited ASM, based on the distributions of methylated and unmethylated reads both within and across cells. Applying our likelihood ratio test, 65,998 CpG sites exhibited ASM in healthy B cell samples according to a Bonferroni criterion (p < 8.4 × 10⁻⁹), and 32,862 CpG sites exhibited ASM in CLL samples (p < 8.5 × 10⁻⁹). We also called ASM at the sample level. To evaluate the accuracy of our method, we called heterozygous variants from the scRRBS data, which enabled variant-based calls of ASM within each cell. Comparing sample-level ASM calls to the variant-based measures of ASM, we observed a positive predictive value of 76%-100% across samples. We observed high concordance of ASM across samples and an overrepresentation of ASM in previously reported imprinted genes and genes with imprinting binding motifs. Our study demonstrates that single-cell bisulfite sequencing is a potentially powerful tool to investigate ASM, especially as studies expand to increase the number of samples and cells sequenced.
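The Bonferroni criterion quoted in this abstract divides a family-wise error rate by the number of tests performed. The following is a minimal Python sketch of that arithmetic, assuming a family-wise alpha of 0.05 (the abstract does not state the alpha used, and the function names are illustrative only). It backs out roughly how many CpG sites the reported thresholds would correspond to under that assumption.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_number_of_tests(alpha, threshold):
    """Back out the number of tests implied by a reported threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


# ASSUMPTION: alpha = 0.05 is not stated in the abstract.
# The two thresholds are the ones reported in the abstract.
ALPHA = 0.05
healthy_tests = implied_number_of_tests(ALPHA, 8.4e-9)
cll_tests = implied_number_of_tests(ALPHA, 8.5e-9)

print(f"healthy B cells: ~{healthy_tests / 1e6:.2f} million tests")
print(f"CLL samples:     ~{cll_tests / 1e6:.2f} million tests")
```

Under the assumed alpha, both thresholds correspond to roughly six million tested sites per group.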
Subjects
DNA Methylation, B-Cell Chronic Lymphocytic Leukemia, Sulfites, Humans, DNA Methylation/genetics, Alleles, B-Cell Chronic Lymphocytic Leukemia/genetics, Likelihood Functions, Genomic Imprinting/genetics, CpG Islands/genetics
ABSTRACT
Large-scale, multi-ethnic whole-genome sequencing (WGS) studies, such as the National Human Genome Research Institute Genome Sequencing Program's Centers for Common Disease Genomics (CCDG), play an important role in increasing diversity for genetic research. Before performing association analyses, assessing Hardy-Weinberg equilibrium (HWE) is a crucial step in quality control procedures to remove low quality variants and ensure valid downstream analyses. Diverse WGS studies contain ancestrally heterogeneous samples; however, commonly used HWE methods assume that the samples are homogeneous. Therefore, directly applying these to the whole dataset can yield statistically invalid results. To account for this heterogeneity, HWE can be tested on subsets of samples that have genetically homogeneous ancestries and the results aggregated at each variant. To facilitate valid HWE subset testing, we developed a semi-supervised learning approach that predicts homogeneous ancestries based on the genotype. This method provides a convenient tool for estimating HWE in the presence of population structure and missing self-reported race and ethnicities in diverse WGS studies. In addition, assessing HWE within the homogeneous ancestries provides reliable HWE estimates that will directly benefit downstream analyses, including association analyses in WGS studies. We applied our proposed method on the CCDG dataset, predicting homogeneous genetic ancestry groups for 60,545 multi-ethnic WGS samples to assess HWE within each group.
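The reason pooled HWE testing fails on ancestrally heterogeneous samples can be illustrated with a small Python sketch. It uses a standard 1-degree-of-freedom chi-square HWE test, not the authors' semi-supervised method, and the genotype counts and function name are made up for illustration. Two groups that are each in HWE but differ in allele frequency show a strong apparent deviation once pooled.

```python
import math


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical counts: each group is in HWE, with allele frequencies 0.9 and 0.1.
group_1 = (810, 180, 10)
group_2 = (10, 180, 810)
pooled = tuple(a + b for a, b in zip(group_1, group_2))

p_group_1 = hwe_chisq_pvalue(*group_1)
p_group_2 = hwe_chisq_pvalue(*group_2)
p_pooled = hwe_chisq_pvalue(*pooled)

print("group 1:", p_group_1)
print("group 2:", p_group_2)
print("pooled: ", p_pooled)
```

Testing within each group leaves this variant untouched, whereas the pooled test would flag it for removal because of the heterozygote deficit created by mixing the two groups. How per-group results are aggregated at each variant is not specified in the abstract and is left out of the sketch.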
Subjects
Supervised Machine Learning, Whole Genome Sequencing, Humans, Whole Genome Sequencing/methods, Human Genome, Population Genetics/methods, Ethnicity/genetics, Genome-Wide Association Study/methods, Single Nucleotide Polymorphism, Genotype
ABSTRACT
Both trio and population designs are popular study designs for identifying risk genetic variants in genome-wide association studies (GWASs). The trio design, as a family-based design, is robust to confounding due to population structure, whereas the population design is often more powerful due to larger sample sizes. Here, we propose KnockoffHybrid, a knockoff-based statistical method for hybrid analysis of both the trio and population designs. KnockoffHybrid provides a unified framework that brings together the advantages of both designs and produces powerful hybrid analysis while controlling the false discovery rate (FDR) in the presence of linkage disequilibrium and population structure. Furthermore, KnockoffHybrid has the flexibility to leverage different types of summary statistics for hybrid analyses, including expression quantitative trait loci (eQTL) and GWAS summary statistics. We demonstrate in simulations that KnockoffHybrid offers power gains over non-hybrid methods for the trio and population designs with the same number of cases while controlling the FDR with complex correlation among variants and population structure among subjects. In hybrid analyses of three trio cohorts for autism spectrum disorders (ASDs) from the Autism Speaks MSSNG, Autism Sequencing Consortium, and Autism Genome Project with GWAS summary statistics from the iPSYCH project and eQTL summary statistics from the MetaBrain project, KnockoffHybrid outperforms conventional methods by replicating several known risk genes for ASDs and identifying additional associations with variants in other genes, including the PRAME family genes involved in axon guidance and which may act as common targets for human speech/language evolution and related disorders.
Subjects
Autism Spectrum Disorder, Genome-Wide Association Study, Linkage Disequilibrium, Quantitative Trait Loci, Genome-Wide Association Study/methods, Humans, Autism Spectrum Disorder/genetics, Genetic Predisposition to Disease, Single Nucleotide Polymorphism, Computer Simulation, Genetic Models
ABSTRACT
We present LDAK-GBAT, a tool for gene-based association testing using summary statistics from genome-wide association studies that is computationally efficient, produces well-calibrated p values, and is significantly more powerful than existing tools. LDAK-GBAT takes approximately 30 min to analyze imputed data (2.9M common, genic SNPs), requiring less than 10 Gb memory. It shows good control of type 1 error given an appropriate reference panel. Across 109 phenotypes (82 from the UK Biobank, 18 from the Million Veteran Program, and nine from the Psychiatric Genetics Consortium), LDAK-GBAT finds on average 19% (SE: 1%) more significant genes than the existing tool sumFREGAT-ACAT, with even greater gains in comparison with MAGMA, GCTA-fastBAT, sumFREGAT-SKAT-O, and sumFREGAT-PCA.
Subjects
Genetic Testing, Genome-Wide Association Study, Phenotype, Single Nucleotide Polymorphism/genetics
ABSTRACT
Multimorbidity is a rising public health challenge with important implications for health management and policy. The most common multimorbidity pattern is the combination of cardiometabolic and osteoarticular diseases. Here, we study the genetic underpinning of the comorbidity between type 2 diabetes and osteoarthritis. We find genome-wide genetic correlation between the two diseases and robust evidence for association-signal colocalization at 18 genomic regions. We integrate multi-omics and functional information to resolve the colocalizing signals and identify high-confidence effector genes, including FTO and IRX3, which provide proof-of-concept insights into the epidemiologic link between obesity and both diseases. We find enrichment for lipid metabolism and skeletal formation pathways for signals underpinning the knee and hip osteoarthritis comorbidities with type 2 diabetes, respectively. Causal inference analysis identifies complex effects of tissue-specific gene expression on comorbidity outcomes. Our findings provide insights into the biological basis for the type 2 diabetes-osteoarthritis disease co-occurrence.
Subjects
Type 2 Diabetes Mellitus, Osteoarthritis, Humans, Type 2 Diabetes Mellitus/complications, Type 2 Diabetes Mellitus/genetics, Comorbidity, Osteoarthritis/epidemiology, Osteoarthritis/genetics, Obesity/complications, Obesity/epidemiology, Obesity/genetics, Causality, Genome-Wide Association Study, Mendelian Randomization Analysis, Single Nucleotide Polymorphism, Alpha-Ketoglutarate-Dependent Dioxygenase FTO/genetics
ABSTRACT
Some rare genetic disorders, such as retinitis pigmentosa or Alport syndrome, are caused by the co-inheritance of DNA variants at two different genetic loci (digenic inheritance). To capture the effects of these disease-causing variants and their possible interactive effects, various statistical methods have been developed in human genetics. Analogous developments have taken place in the field of machine learning, particularly for the field that is now called Big Data. In the past, these two areas have grown independently and have started to converge only in recent years. We discuss an overview of each of the two fields, paying special attention to machine learning methods for uncovering the combined effects of pairs of variants on human disease.
Subjects
Inheritance Patterns, Multifactorial Inheritance, Humans, Inheritance Patterns/genetics, Machine Learning, Mutation, Pedigree
ABSTRACT
Transcriptome-wide association studies (TWASs) are a powerful approach to identify genes whose expression is associated with complex disease risk. However, non-causal genes can exhibit association signals due to confounding by linkage disequilibrium (LD) patterns and eQTL pleiotropy at genomic risk regions, which necessitates fine-mapping of TWAS signals. Here, we present MA-FOCUS, a multi-ancestry framework for the improved identification of genes underlying traits of interest. We demonstrate that by leveraging differences in ancestry-specific patterns of LD and eQTL signals, MA-FOCUS consistently outperforms single-ancestry fine-mapping approaches with equivalent total sample sizes across multiple metrics. We perform TWASs for 15 blood traits using genome-wide summary statistics (average n_EA = 511k, n_AA = 13k) and lymphoblastoid cell line eQTL data from cohorts of primarily European and African continental ancestries. We recapitulate evidence demonstrating shared genetic architectures for eQTL and blood traits between the two ancestry groups and observe that gene-level effects correlate 20% more strongly across ancestries than SNP-level effects. Lastly, we perform fine-mapping using MA-FOCUS and find evidence that genes at TWAS risk regions are more likely to be shared across ancestries than they are to be ancestry specific. Using multiple lines of evidence to validate our findings, we find that gene sets produced by MA-FOCUS are more enriched in hematopoietic categories than alternative approaches (p = 2.36 × 10⁻¹⁵). Our work demonstrates that including and appropriately accounting for genetic diversity can drive more profound insights into the genetic architecture of complex traits.
Subjects
Genome-Wide Association Study, Transcriptome, Humans, Linkage Disequilibrium, Multifactorial Inheritance/genetics, Single Nucleotide Polymorphism/genetics, Transcriptome/genetics
ABSTRACT
Despite the growing number of genome-wide association studies (GWASs), it remains unclear to what extent gene-by-gene and gene-by-environment interactions influence complex traits in humans. The magnitude of genetic interactions in complex traits has been difficult to quantify because GWASs are generally underpowered to detect individual interactions of small effect. Here, we develop a method to test for genetic interactions that aggregates information across all trait-associated loci. Specifically, we test whether SNPs in regions of European ancestry shared between European American and admixed African American individuals have the same causal effect sizes. We hypothesize that in African Americans, the presence of genetic interactions will drive the causal effect sizes of SNPs in regions of European ancestry to be more similar to those of SNPs in regions of African ancestry. We apply our method to two traits: gene expression in 296 African Americans and 482 European Americans in the Multi-Ethnic Study of Atherosclerosis (MESA) and low-density lipoprotein cholesterol (LDL-C) in 74K African Americans and 296K European Americans in the Million Veteran Program (MVP). We find significant evidence for genetic interactions in our analysis of gene expression; for LDL-C, we observe a similar point estimate, although this is not significant, most likely due to lower statistical power. These results suggest that gene-by-gene or gene-by-environment interactions modify the effect sizes of causal variants in human complex traits.
Subjects
Genome-Wide Association Study, Multifactorial Inheritance, LDL Cholesterol, Gene Expression, Humans, Multifactorial Inheritance/genetics, Single Nucleotide Polymorphism/genetics, White People/genetics
ABSTRACT
Hypoxia-inducible factor prolyl hydroxylase inhibitors (HIF-PHIs) are currently under clinical development for treating anemia in chronic kidney disease (CKD), but it is important to monitor their cardiovascular safety. Genetic variants can be used as predictors to help inform the potential risk of adverse effects associated with drug treatments. We therefore aimed to use human genetics to help assess the risk of adverse cardiovascular events associated with therapeutically altered EPO levels to help inform clinical trials studying the safety of HIF-PHIs. By performing a genome-wide association meta-analysis of EPO (n = 6,127), we identified a cis-EPO variant (rs1617640) lying in the EPO promoter region. We validated this variant as most likely causal in controlling EPO levels by using genetic and functional approaches, including single-base gene editing. Using this variant as a partial predictor for therapeutic modulation of EPO and large genome-wide association data in Mendelian randomization tests, we found no evidence (at p < 0.05) that genetically predicted long-term rises in endogenous EPO, equivalent to a 2.2-unit increase, increased risk of coronary artery disease (CAD, OR [95% CI] = 1.01 [0.93, 1.07]), myocardial infarction (MI, OR [95% CI] = 0.99 [0.87, 1.15]), or stroke (OR [95% CI] = 0.97 [0.87, 1.07]). We could exclude increased odds of 1.15 for cardiovascular disease for a 2.2-unit EPO increase. A combination of genetic and functional studies provides a powerful approach to investigate the potential therapeutic profile of EPO-increasing therapies for treating anemia in CKD.
Subjects
Anemia, Coronary Artery Disease, Myocardial Infarction, Chronic Renal Insufficiency, Anemia/drug therapy, Anemia/genetics, Coronary Artery Disease/genetics, Genome-Wide Association Study, Humans, Mendelian Randomization Analysis, Myocardial Infarction/genetics, Chronic Renal Insufficiency/genetics
ABSTRACT
BACKGROUND: Approximately 95% of samples analyzed in univariate genome-wide association studies (GWAS) are of European ancestry. This bias toward European ancestry populations in association screening also exists for other analyses and methods, which are often developed and tested on European ancestry data only. However, existing data in non-European populations, which are often of modest sample size, could benefit from innovative approaches, as recently illustrated in the context of polygenic risk scores. METHODS: Here, we extend and assess the potential limitations and gains of our multi-trait GWAS pipeline, JASS (Joint Analysis of Summary Statistics), for the analysis of non-European ancestries. To this end, we conducted the joint GWAS of 19 hematological traits and glycemic traits across five ancestries (European (EUR), admixed American (AMR), African (AFR), East Asian (EAS), and South-East Asian (SAS)). RESULTS: We detected 367 new genome-wide significant associations in non-European populations (15 in admixed American (AMR), 72 in African (AFR) and 280 in East Asian (EAS)). New associations detected represent 5%, 17% and 13% of associations in the AFR, AMR and EAS populations, respectively. Overall, multi-trait testing increases the replication of European associated loci in non-European ancestry by 15%. Pleiotropic effects were highly similar at significant loci across ancestries (e.g. the mean correlation between multi-trait genetic effects of EUR and EAS ancestries was 0.88). For hematological traits, strong discrepancies in multi-trait genetic effects are tied to known evolutionary divergences, such as the ACKR1 locus, which is adaptive against P. vivax-induced malaria. CONCLUSIONS: Multi-trait GWAS can be a valuable tool to narrow the genetic knowledge gap between European and non-European populations.
Subjects
Asian People, Black People, Genome-Wide Association Study, Humans, Asian People/genetics, Black People/genetics, Genetic Predisposition to Disease, Genome-Wide Association Study/methods, Phenotype, Single Nucleotide Polymorphism, European People/genetics
ABSTRACT
The integration of genomic data into health systems offers opportunities to identify genomic factors underlying the continuum of rare and common disease. We applied a population-scale haplotype association approach based on identity-by-descent (IBD) in a large multi-ethnic biobank to a spectrum of disease outcomes derived from electronic health records (EHRs) and uncovered a risk locus for liver disease. We used genome sequencing and in silico approaches to fine-map the signal to a non-coding variant (c.2784-12T>C) in the gene ABCB4. In vitro analysis confirmed the variant disrupted splicing of the ABCB4 pre-mRNA. Four of five homozygotes had evidence of advanced liver disease, and there was a significant association with liver disease among heterozygotes, suggesting the variant is linked to increased risk of liver disease in an allele dose-dependent manner. Population-level screening revealed the variant to be at a carrier rate of 1.95% in Puerto Rican individuals, likely as the result of a Puerto Rican founder effect. This work demonstrates that integrating EHR and genomic data at a population scale can facilitate strategies for understanding the continuum of genomic risk for common diseases, particularly in populations underrepresented in genomic medicine.
Subjects
Delivery of Health Care/organization & administration, Genetic Predisposition to Disease, Liver Diseases/genetics, ATP-Binding Cassette Transporter Subfamily B/genetics, Electronic Health Records, Haplotypes, Heterozygote, Hispanic or Latino/genetics, Homozygote, Humans, Puerto Rico
ABSTRACT
Microbiome scientists critically need modern tools to explore and analyze microbial evolution. Often this involves studying the evolution of microbial genomes as a whole. However, different genes in a single genome can be subject to different evolutionary pressures, which can result in distinct gene-level evolutionary histories. To address this challenge, we propose to treat estimated gene-level phylogenies as data objects, and present an interactive method for the analysis of a collection of gene phylogenies. We use a local linear approximation of phylogenetic tree space to visualize estimated gene trees as points in low-dimensional Euclidean space, and address important practical limitations of existing related approaches, allowing an intuitive visualization of complex data objects. We demonstrate the utility of our proposed approach through microbial data analyses, including by identifying outlying gene histories in strains of Prevotella, and by contrasting Streptococcus phylogenies estimated using different gene sets. Our method is available as an open-source R package, and assists with estimating, visualizing, and interacting with a collection of bacterial gene phylogenies.
ABSTRACT
BACKGROUND: Human endogenous retroviruses (HERVs) are sequences in the human genome that originated from infections with ancient retroviruses during our evolution. Previous studies have linked HERVs to neurodegenerative diseases, but defining their role in aetiology has been challenging. Here, we used a retrotranscriptome-wide association study (rTWAS) approach to assess the relationships between genetic risk for neurodegenerative diseases and HERV expression in the brain, calculated with genomic precision. METHODS: We analysed genetic association statistics pertaining to Alzheimer's disease, amyotrophic lateral sclerosis, multiple sclerosis, and Parkinson's disease, using HERV expression models calculated from 792 cortical samples. Robust risk factors were considered those that survived multiple testing correction in the primary analysis, which were also significant in conditional and joint analyses, and that had a posterior inclusion probability above 0.5 in fine-mapping analyses. RESULTS: The primary analysis identified 12 HERV expression signatures associated with neurodegenerative disease susceptibility. We found one HERV expression signature robustly associated with amyotrophic lateral sclerosis on chromosome 12q14 (MER61_12q14.2) and one robustly associated with multiple sclerosis on chromosome 1p36 (ERVLE_1p36.32a). A co-expression analysis suggested that these HERVs are involved in homophilic cell adhesion via plasma membrane adhesion molecules. CONCLUSIONS: We found HERV expression profiles robustly associated with amyotrophic lateral sclerosis and multiple sclerosis susceptibility, highlighting novel risk mechanisms underlying neurodegenerative disease, and offering potential new targets for therapeutic intervention.
ABSTRACT
BACKGROUND: Genome-wide tests, including genome-wide association studies (GWAS) of germ-line genetic variants, driver tests of cancer somatic mutations, and transcriptome-wide association tests of RNAseq data, carry a high multiple testing burden. This burden can be overcome by enrolling larger cohorts or alleviated by using prior biological knowledge to favor some hypotheses over others. Here we compare these two methods in terms of their abilities to boost the power of hypothesis testing. RESULTS: We provide a quantitative estimate for progress in cohort sizes and present a theoretical analysis of the power of oracular hard priors: priors that select a subset of hypotheses for testing, with an oracular guarantee that all true positives are within the tested subset. This theory demonstrates that for GWAS, strong priors that limit testing to 100-1000 genes provide less power than typical annual 20-40% increases in cohort sizes. Furthermore, non-oracular priors that exclude even a small fraction of true positives from the tested set can perform worse than not using a prior at all. CONCLUSION: Our results provide a theoretical explanation for the continued dominance of simple, unbiased univariate hypothesis tests for GWAS: if a statistical question can be answered by larger cohort sizes, it should be answered by larger cohort sizes rather than by more complicated biased methods involving priors. We suggest that priors are better suited for non-statistical aspects of biology, such as pathway structure and causality, that are not yet easily captured by standard hypothesis tests.
Subjects
Genome-Wide Association Study, Single Nucleotide Polymorphism, Humans, Population Density, Transcriptome
ABSTRACT
This review is about statistical genetics, an interdisciplinary topic between statistical physics and population biology. The focus is on the phase of quasi-linkage equilibrium (QLE). Our goals here are to clarify under which conditions the QLE phase can be expected to hold in population biology and how the stability of the QLE phase is lost. The QLE state, which has many similarities to a thermal equilibrium state in statistical mechanics, was discovered by M. Kimura for a two-locus two-allele model, and was extended and generalized to the global genome scale by Neher & Shraiman (2011). What we will refer to as the Kimura-Neher-Shraiman theory describes a population evolving due to mutations, recombination, natural selection and possibly genetic drift. A QLE phase exists at sufficiently high recombination rate (r) and/or mutation rate (µ) with respect to selection strength. We show how in QLE it is possible to infer the epistatic parameters of the fitness function from the knowledge of the (dynamical) distribution of genotypes in a population. We further consider the breakdown of the QLE regime for high enough selection strength. We review recent results for the selection-mutation and selection-recombination dynamics. Finally, we identify and characterize a new phase, which we call the non-random coexistence, where variability persists in the population without either fixating or disappearing.
Subjects
Genetic Models, Genetic Selection, Linkage Disequilibrium, Mutation, Genotype, Population Genetics
ABSTRACT
By reviewing previous CpG-related studies, we consider that the transcription regulation of about half of the human genes, mostly housekeeping (HK) genes, involves CpG islands (CGIs), their methylation states, CpG spacing and other chromosomal parameters. However, the precise CGI definition and positioning of CGIs within gene structures, as well as specific CGI-associated regulatory mechanisms, all remain to be explained at individual gene and gene-family levels, together with consideration of species and lineage specificity. Although previous studies have already classified CGIs into high-CpG (HCGI), intermediate-CpG (ICGI) and low-CpG (LCGI) densities based on CpG density variation, the correlation between CGI density and gene expression regulation, such as co-regulation of CGIs and TATA box on HK genes, remains to be elucidated. First, this study introduces a problem-solving protocol for human-genome annotation, which is based on a combination of GTEx, JBLA and Gene Ontology (GO) analysis. Next, we discuss why CGI-associated genes are most likely regulated by HCGI and tend to be HK genes; the HCGI/TATA± and LCGI/TATA± combinations show different GO enrichment, whereas the ICGI/TATA± combination is less characteristic based on GO enrichment analysis. Finally, we demonstrate that the Hadoop MapReduce-based MR-JBLA algorithm is more efficient than the original JBLA in k-mer counting and CGI-associated gene analysis.
Subjects
CpG Islands, Essential Genes, Molecular Sequence Annotation/methods, Software, DNA Methylation, Humans, TATA Box
ABSTRACT
PURPOSE: The congenital Long QT Syndrome (LQTS) and Brugada Syndrome (BrS) are Mendelian autosomal dominant diseases that frequently precipitate fatal cardiac arrhythmias. Incomplete penetrance is a barrier to clinical management of heterozygotes harboring variants in the major implicated disease genes KCNQ1, KCNH2, and SCN5A. We apply and evaluate a Bayesian penetrance estimation strategy that accounts for this phenomenon. METHODS: We generated Bayesian penetrance models for KCNQ1-LQT1 and SCN5A-LQT3 using variant-specific features and clinical data from the literature, international arrhythmia genetic centers, and population controls. We analyzed the distribution of posterior penetrance estimates across 4 genotype-phenotype relationships and compared continuous estimates with ClinVar annotations. Posterior estimates were mapped onto protein structure. RESULTS: Bayesian penetrance estimates of KCNQ1-LQT1 and SCN5A-LQT3 are empirically equivalent to 10 and 5 clinically phenotyped heterozygotes, respectively. Posterior penetrance estimates were bimodal for KCNQ1-LQT1 and KCNH2-LQT2, with a higher fraction of missense variants with high penetrance among KCNQ1 variants. There was a wide distribution of variant penetrance estimates among identical ClinVar categories. Structural mapping revealed heterogeneity among "hot spot" regions and featured high penetrance estimates for KCNQ1 variants in contact with calmodulin and the S6 domain. CONCLUSIONS: Bayesian penetrance estimates provide a continuous framework for variant interpretation.
Subjects
Channelopathies, KCNQ1 Potassium Channel, Humans, KCNQ1 Potassium Channel/genetics, Mutation, Penetrance, Bayes Theorem, Channelopathies/genetics, Cardiac Arrhythmias/genetics
ABSTRACT
False discovery rates are routinely controlled by application of the Benjamini-Hochberg step-up procedure to a set of p-values. A method is demonstrated for representing the values so obtained (the BH-FDRs) on a quantile-quantile (Q-Q) plot of the p-values transformed to the negative-logarithmic scale. Recognition of this connection between the BH-FDR and the Q-Q plot facilitates both understanding of the meaning of the BH-FDR and interpretation of the BH-FDR in a particular data set.
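The link between the BH-FDR and the Q-Q plot can be made concrete with a short Python sketch. This is a generic illustration of the Benjamini-Hochberg step-up adjustment, not the paper's own code; the function names are illustrative, and the expected quantile rank/m is one plotting convention among several. With that convention, the vertical gap between a Q-Q point and the diagonal equals the negative log10 of the unadjusted BH quantity p_(i) · m / i.

```python
import math


def bh_adjust(pvalues):
    """Benjamini-Hochberg step-up adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def qq_points(pvalues):
    """(expected, observed) -log10 coordinates, assuming expected quantile rank/m."""
    m = len(pvalues)
    return [
        (-math.log10(rank / m), -math.log10(p))
        for rank, p in enumerate(sorted(pvalues), start=1)
    ]


# Made-up p-values, for illustration only.
pvals = [0.01, 0.04, 0.03, 0.005]
fdr = bh_adjust(pvals)
points = qq_points(pvals)

for (expected, observed), p in zip(points, sorted(pvals)):
    gap = observed - expected  # equals -log10(p * m / rank)
    print(f"p={p:<6} expected={expected:.3f} observed={observed:.3f} gap={gap:.3f}")
print("BH-adjusted:", fdr)
```

A point lying g units above the diagonal therefore has an unadjusted BH value of 10^(-g); the reported BH-FDR is the running minimum of these values taken from the largest p-value downward.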
ABSTRACT
BACKGROUND: Organisms in the wild can acquire disease- and stress-resistance traits that outstrip the programs endogenous to humans. Finding the molecular basis of such natural resistance characters is a key goal of evolutionary genetics. Standard statistical-genetic methods toward this end can perform poorly in organismal systems that lack high rates of meiotic recombination, like Caenorhabditis worms. RESULTS: Here we discovered unique ER stress resistance in a wild Kenyan C. elegans isolate, which in inter-strain crosses was passed by hermaphrodite mothers to hybrid offspring. We developed an unbiased version of the reciprocal hemizygosity test, RH-seq, to explore the genetics of this parent-of-origin-dependent phenotype. Among top-scoring gene candidates from a partial-coverage RH-seq screen, we focused on the neuronally-expressed, cuticlin-like gene cutl-24 for validation. In gene-disruption and controlled crossing experiments, we found that cutl-24 was required in Kenyan hermaphrodite mothers for ER stress tolerance in their inter-strain hybrid offspring; cutl-24 was also a contributor to the trait in purebred backgrounds. CONCLUSIONS: These data establish the Kenyan strain allele of cutl-24 as a determinant of a natural stress-resistant state, and they set a precedent for the dissection of natural trait diversity in invertebrate animals without the need for a panel of meiotic recombinants.