Results 1 - 20 of 149
1.
Nat Genet ; 56(5): 838-845, 2024 May.
Article in English | MEDLINE | ID: mdl-38741015

ABSTRACT

Autoimmune and inflammatory diseases are polygenic disorders of the immune system. Many genomic loci harbor risk alleles for several diseases, but the limited resolution of genetic mapping prevents determining whether the same allele is responsible, indicating a shared underlying mechanism. Here, using a collection of 129,058 cases and controls across 6 diseases, we show that ~40% of overlapping associations are due to the same allele. We improve fine-mapping resolution for shared alleles twofold by combining cases and controls across diseases, allowing us to identify more expression quantitative trait loci driven by the shared alleles. The patterns indicate widespread sharing of pathogenic mechanisms but not a single global autoimmune mechanism. Our approach can be applied to any set of traits and is particularly valuable as sample collections become depleted.


Subject(s)
Alleles , Autoimmune Diseases , Chromosome Mapping , Genetic Predisposition to Disease , Quantitative Trait Loci , Humans , Autoimmune Diseases/genetics , Polymorphism, Single Nucleotide , Genome-Wide Association Study , Case-Control Studies , Multifactorial Inheritance/genetics
2.
bioRxiv ; 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38496508

ABSTRACT

Whether neurodegenerative diseases linked to misfolding of the same protein share genetic risk drivers or whether different protein-aggregation pathologies in neurodegeneration are mechanistically related remains uncertain. Conventional genetic analyses are underpowered to address these questions. Through careful selection of patients based on protein aggregation phenotype (rather than clinical diagnosis) we can increase statistical power to detect associated variants in a targeted set of genes that modify proteotoxicities. Genetic modifiers of alpha-synuclein (αS) and beta-amyloid (Aβ) cytotoxicity in yeast are enriched in risk factors for Parkinson's disease (PD) and Alzheimer's disease (AD), respectively. Here, along with known AD/PD risk genes, we deeply sequenced exomes of 430 αS/Aβ modifier genes in patients across alpha-synucleinopathies (PD, Lewy body dementia and multiple system atrophy). Beyond known PD genes GBA1 and LRRK2, rare variants in AD genes (CD33, CR1 and PSEN2) and Aβ toxicity modifiers involved in RhoA/actin cytoskeleton regulation (ARHGEF1, ARHGEF28, MICAL3, PASK, PKN2, PSEN2) were shared risk factors across synucleinopathies. Actin pathology occurred in iPSC synucleinopathy models and RhoA downregulation exacerbated αS pathology. Even in sporadic PD, the expression of these genes was altered across CNS cell types. Genome-wide CRISPR screens revealed the essentiality of PSEN2 in both human cortical and dopaminergic neurons, and PSEN2 mutation carriers exhibited diffuse brainstem and cortical synucleinopathy independent of AD pathology. PSEN2 contributes to a common-risk signal in PD GWAS and regulates αS expression in neurons. Our results identify convergent mechanisms across synucleinopathies, some shared with AD.

3.
bioRxiv ; 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38405764

ABSTRACT

Genomics for rare disease diagnosis has advanced at a rapid pace due to our ability to perform "N-of-1" analyses on individual patients. The increasing sizes of ultra-rare, "N-of-1" disease cohorts internationally now enable cohort-wide analyses for new discoveries, but well-calibrated statistical genetics approaches for jointly analyzing these patients are still under development. The Undiagnosed Diseases Network (UDN) brings multiple clinical, research and experimental centers under the same umbrella across the United States to facilitate and scale N-of-1 analyses. Here, we present the first joint analysis of whole genome sequencing data of UDN patients across the network. We apply existing and introduce new, well-calibrated statistical methods for prioritizing disease genes with de novo recurrence and compound heterozygosity. We also detect pathways enriched with candidate and known diagnostic genes. Our computational analysis, coupled with a systematic clinical review, recapitulated known diagnoses and revealed new disease associations. We make our gene-level findings and variant-level information across the cohort available in a public-facing browser (https://dbmi-bgm.github.io/udn-browser/). These results show that N-of-1 efforts should be supplemented by a joint genomic analysis across cohorts.

4.
Genome Biol ; 25(1): 39, 2024 Jan 31.
Article in English | MEDLINE | ID: mdl-38297326

ABSTRACT

Expansions of tandem repeats (TRs) cause approximately 60 monogenic diseases. We expect that the discovery of additional pathogenic repeat expansions will narrow the diagnostic gap in many diseases. A growing number of TR expansions are being identified, and interpreting them is a challenge. We present RExPRT (Repeat EXpansion Pathogenicity pRediction Tool), a machine learning tool for distinguishing pathogenic from benign TR expansions. Our results demonstrate that an ensemble approach classifies TRs with an average precision of 93% and recall of 83%. RExPRT's high precision will be valuable in large-scale discovery studies, which require prioritization of candidate loci for follow-up studies.


Subject(s)
Machine Learning , Tandem Repeat Sequences , Virulence
5.
Blood ; 143(11): 1032-1044, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38096369

ABSTRACT

Extreme disease phenotypes can provide key insights into the pathophysiology of common conditions, but studying such cases is challenging due to their rarity and the limited statistical power of existing methods. Herein, we used a novel approach to pathway-based mutational burden testing, the rare variant trend test (RVTT), to investigate genetic risk factors for an extreme form of sepsis-induced coagulopathy, infectious purpura fulminans (PF). In addition to prospective patient sample collection, we electronically screened over 10.4 million medical records from 4 large hospital systems and identified historical cases of PF for which archived specimens were available to perform germline whole-exome sequencing. We found a significantly increased burden of low-frequency, putatively function-altering variants in the complement system in patients with PF compared with unselected patients with sepsis (P = .01). A multivariable logistic regression analysis found that the number of complement system variants per patient was independently associated with PF after controlling for age, sex, and disease acuity (P = .01). Functional characterization of PF-associated variants in the immunomodulatory complement receptors CR3 and CR4 revealed that they result in partial or complete loss of anti-inflammatory CR3 function and/or gain of proinflammatory CR4 function. Taken together, these findings suggest that inherited defects in CR3 and CR4 predispose to the maladaptive hyperinflammation that characterizes severe sepsis with coagulopathy.


Subject(s)
Purpura Fulminans , Sepsis , Humans , Purpura Fulminans/genetics , Prospective Studies , Receptors, Complement
6.
medRxiv ; 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38106023

ABSTRACT

The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp) (because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results). We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb (due to alternative splicing)) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common Super enhancer SNPs)-even though these quantities are widely assumed to be equal. 
We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.
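
The gap between SNP-heritability and the sum of causal effect-size variances reported above can be illustrated with a short variance decomposition. This is a sketch under simplifying assumptions that are not taken from the abstract: genotypes $x_i$ standardized to mean 0 and variance 1, $r_{ij}$ the LD correlation between SNPs $i$ and $j$, random causal effects $\beta_i$ with mean zero and independent of genotypes, and a phenotype with unit variance. It is not LDSPEC's actual model.

```latex
\begin{align}
h^2_{\mathrm{SNP}}
  &= \operatorname{Var}\Big(\sum_i x_i \beta_i\Big)
   = \sum_{i,j} \mathbb{E}[x_i x_j]\,\mathbb{E}[\beta_i \beta_j] \\
  &= \underbrace{\sum_i \mathbb{E}[\beta_i^2]}_{\text{sum of causal effect-size variances}}
   \;+\; \underbrace{\sum_{i \neq j} r_{ij}\,\mathbb{E}[\beta_i \beta_j]}_{\text{cross terms}}
\end{align}
```

If effects are independent, the cross terms vanish and the two quantities coincide. If SNP pairs in positive LD ($r_{ij} > 0$) tend to have negatively correlated effects ($\mathbb{E}[\beta_i \beta_j] < 0$), the cross terms are negative. Under this sketch, the reported ratio of 0.87 would correspond to cross terms summing to about $-0.13$ times the sum of causal effect-size variances.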

7.
Nat Genet ; 55(12): 2235-2242, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38036792

ABSTRACT

De novo mutations occur at substantially different rates depending on genomic location, sequence context and DNA strand. The success of methods to estimate selection intensity, infer demographic history and map rare disease genes, depends strongly on assumptions about the local mutation rate. Here we present Roulette, a genome-wide mutation rate model at basepair resolution that incorporates known determinants of local mutation rate. Roulette is shown to be more accurate than existing models. We use Roulette to refine the estimates of population growth within Europe by incorporating the full range of human mutation rates. The analysis of significant deviations from the model predictions revealed a tenfold increase in mutation rate in nearly all genes transcribed by polymerase III (Pol III), suggesting a new mutagenic mechanism. We also detected an elevated mutation rate within transcription factor binding sites restricted to sites actively used in testis and residing in promoters.


Subject(s)
Mutagens , Mutation Rate , RNA Polymerase III , Transcription, Genetic , Humans , Male , DNA/genetics , Mutagenesis , Mutation , Nucleotidyltransferases , Promoter Regions, Genetic/genetics , Transcription, Genetic/genetics , RNA Polymerase III/metabolism
8.
Genetics ; 224(3), 2023 Jul 06.
Article in English | MEDLINE | ID: mdl-36967220

ABSTRACT

Recurrent mutation produces multiple copies of the same allele which may be co-segregating in a population. Yet, most analyses of allele-frequency or site-frequency spectra assume that all observed copies of an allele trace back to a single mutation. We develop a sampling theory for the number of latent mutations in the ancestry of a rare variant, specifically a variant observed in relatively small count in a large sample. Our results follow from the statistical independence of low-count mutations, which we show to hold for the standard neutral coalescent or diffusion model of population genetics as well as for more general coalescent trees. For populations of constant size, these counts are distributed like the number of alleles in the Ewens sampling formula. We develop a Poisson sampling model for populations of varying size and illustrate it using new results for site-frequency spectra in an exponentially growing population. We apply our model to a large data set of human SNPs and use it to explain dramatic differences in site-frequency spectra across the range of mutation rates in the human genome.


Subject(s)
Genetics, Population , Models, Genetic , Humans , Mutation , Gene Frequency , Mutation Rate , Alleles
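
The abstract of entry 8 refers to the distribution of the number of alleles in the Ewens sampling formula. As a minimal sketch of that standard distribution (not the authors' code), the snippet below computes P(K = k) for a sample of size `n` and scaled mutation rate `theta`, using unsigned Stirling numbers of the first kind. The function names are hypothetical, and how `n` and `theta` map onto the paper's quantities is not stated in the abstract and is left as an assumption.

```python
def ewens_num_alleles_pmf(n, theta):
    """Return {k: P(K = k)} for the number of distinct alleles K
    in a sample of size n under the Ewens sampling formula."""
    # Unsigned Stirling numbers of the first kind, built row by row:
    # c(m, k) = c(m-1, k-1) + (m-1) * c(m-1, k), with c(0, 0) = 1.
    row = [1]
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            same_k = row[k] if k < m else 0
            new[k] = row[k - 1] + (m - 1) * same_k
        row = new
    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = 1.0
    for i in range(n):
        rising *= theta + i
    return {k: row[k] * theta ** k / rising for k in range(1, n + 1)}


def expected_num_alleles(n, theta):
    """Closed-form mean of K: sum over i = 0..n-1 of theta / (theta + i)."""
    return sum(theta / (theta + i) for i in range(n))


if __name__ == "__main__":
    pmf = ewens_num_alleles_pmf(5, 0.5)  # illustrative values only
    for k, p in pmf.items():
        print(k, round(p, 4))
    print("mean:", expected_num_alleles(5, 0.5))
```

The closed-form mean gives a quick check on the probabilities: the mean computed from the returned distribution should match `expected_num_alleles`.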
9.
medRxiv ; 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-36032980

ABSTRACT

A multitude of demographic, health, and genetic factors are associated with the risk of developing severe COVID-19 following infection by SARS-CoV-2. There is a need to perform studies across human societies and to investigate the full spectrum of genetic variation of the virus. Using data from 869 COVID-19 patients in Bahrain between March 2020 and March 2021, we analyzed paired viral sequencing and non-genetic host data to understand host and viral determinants of severe COVID-19. We estimated the effects of demographic variables specific to the Bahrain population and found that the impact of health factors is largely consistent with other populations. To extend beyond the common variants of concern in the Spike protein analyzed by previous studies, we used a viral burden approach and detected a protective effect of low-frequency missense viral mutations in the RNA-dependent RNA polymerase (Pol) gene on disease severity. Our results contribute to the survey of severe COVID-19 in diverse populations and highlight the benefits of studying rare viral mutations.

10.
Nucleic Acids Res ; 51(D1): D1300-D1311, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350676

ABSTRACT

Large biobank-scale whole genome sequencing (WGS) studies are rapidly identifying a multitude of coding and non-coding variants. They provide an unprecedented resource for illuminating the genetic basis of human diseases. Variant functional annotations play a critical role in WGS analysis, result interpretation, and prioritization of disease- or trait-associated causal variants. Existing functional annotation databases have limited scope to perform online queries and functionally annotate the genotype data of large biobank-scale WGS studies. We develop the Functional Annotation of Variants Online Resources (FAVOR) to meet these pressing needs. FAVOR provides a comprehensive multi-faceted variant functional annotation online portal that summarizes and visualizes findings of all possible nine billion single nucleotide variants (SNVs) across the genome. It allows for rapid variant-, gene- and region-level queries of variant functional annotations. FAVOR integrates variant functional information from multiple sources to describe the functional characteristics of variants and facilitates prioritizing plausible causal variants influencing human phenotypes. Furthermore, we provide a scalable annotation tool, FAVORannotator, to functionally annotate large-scale WGS studies and efficiently store the genotype and their variant functional annotation data in a single file using the annotated Genomic Data Structure (aGDS) format, making downstream analysis more convenient. FAVOR and FAVORannotator are available at https://favor.genohub.org.


Subject(s)
Genome, Human , Software , Humans , Molecular Sequence Annotation , Genomics , Genotype , Genetic Variation
11.
Res Sq ; 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38168385

ABSTRACT

The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp) (because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results). We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb (due to alternative splicing)) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common Super enhancer SNPs)-even though these quantities are widely assumed to be equal. 
We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.

12.
Elife ; 11, 2022 12 14.
Article in English | MEDLINE | ID: mdl-36515579

ABSTRACT

The genetic basis of most traits is highly polygenic and dominated by non-coding alleles. It is widely assumed that such alleles exert small regulatory effects on the expression of cis-linked genes. However, despite the availability of gene expression and epigenomic datasets, few variant-to-gene links have emerged. It is unclear whether these sparse results are due to limitations in available data and methods, or to deficiencies in the underlying assumed model. To better distinguish between these possibilities, we identified 220 gene-trait pairs in which protein-coding variants influence a complex trait or its Mendelian cognate. Despite the presence of expression quantitative trait loci near most GWAS associations, by applying a gene-based approach we found limited evidence that the baseline expression of trait-related genes explains GWAS associations, whether using colocalization methods (8% of genes implicated), transcription-wide association (2% of genes implicated), or a combination of regulatory annotations and distance (4% of genes implicated). These results contradict the hypothesis that most complex trait-associated variants coincide with homeostatic expression QTLs, suggesting that better models are needed. The field must confront this deficit and pursue this 'missing regulation.'


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Humans , Genome-Wide Association Study/methods , Phenotype , Multifactorial Inheritance/genetics , Epigenomics , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease
13.
PLoS Genet ; 18(12): e1010557, 2022 12.
Article in English | MEDLINE | ID: mdl-36574455

ABSTRACT

Genetic association studies of many heritable traits resulting from physiological testing often have modest sample sizes due to the cost and burden of the required phenotyping. This reduces statistical power and limits discovery of multiple genetic associations. We present a strategy to leverage pleiotropy between traits both to discover new loci and to provide mechanistic hypotheses of the underlying pathophysiology. Specifically, we combine a colocalization test with a locus-level test of pleiotropy. In simulations, we show that this approach is highly selective for identifying true pleiotropy driven by the same causative variant, thereby improving the chance of replicating the associations in underpowered validation cohorts and leading to higher interpretability. Here, as an exemplar, we use Obstructive Sleep Apnea (OSA), a common disorder diagnosed using overnight multi-channel physiological testing. We leverage pleiotropy with relevant cellular and cardio-metabolic phenotypes and gene expression traits to map new risk loci in an underpowered OSA GWAS. We identify several pleiotropic loci harboring suggestive associations to OSA and genome-wide significant associations to other traits, and show that their OSA association replicates in independent cohorts of diverse ancestries. By investigating pleiotropic loci, our strategy allows us to propose new hypotheses about OSA pathobiology across many physiological layers. For example, we identify and replicate the pleiotropy across plateletcrit, OSA and an eQTL of DNA primase subunit 1 (PRIM1) in immune cells. We find suggestive links between OSA, a measure of lung function (FEV1/FVC), and an eQTL of matrix metallopeptidase 15 (MMP15) in lung tissue. We also link a previously known genome-wide significant peak for OSA in the hexokinase 1 (HK1) locus to hematocrit and other red blood cell related traits.
Thus, the analysis of pleiotropic associations has the potential to assemble diverse phenotypes into a chain of mechanistic hypotheses that provide insight into the pathogenesis of complex human diseases.


Subject(s)
Genome-Wide Association Study , Sleep Apnea, Obstructive , Humans , Genome-Wide Association Study/methods , Phenotype , Genetic Association Studies , Sleep , Genetic Pleiotropy , Polymorphism, Single Nucleotide , DNA Primase
14.
Cell ; 185(16): 3041-3055.e25, 2022 08 04.
Article in English | MEDLINE | ID: mdl-35917817

ABSTRACT

Rare copy-number variants (rCNVs) include deletions and duplications that occur infrequently in the global human population and can confer substantial risk for disease. In this study, we aimed to quantify the properties of haploinsufficiency (i.e., deletion intolerance) and triplosensitivity (i.e., duplication intolerance) throughout the human genome. We harmonized and meta-analyzed rCNVs from nearly one million individuals to construct a genome-wide catalog of dosage sensitivity across 54 disorders, which defined 163 dosage sensitive segments associated with at least one disorder. These segments were typically gene dense and often harbored dominant dosage sensitive driver genes, which we were able to prioritize using statistical fine-mapping. Finally, we designed an ensemble machine-learning model to predict probabilities of dosage sensitivity (pHaplo & pTriplo) for all autosomal genes, which identified 2,987 haploinsufficient and 1,559 triplosensitive genes, including 648 that were uniquely triplosensitive. This dosage sensitivity resource will provide broad utility for human disease research and clinical genetics.


Subject(s)
DNA Copy Number Variations , Genome, Human , DNA Copy Number Variations/genetics , Gene Dosage , Haploinsufficiency/genetics , Humans
15.
Cell ; 185(12): 2035-2056.e33, 2022 06 09.
Article in English | MEDLINE | ID: mdl-35688132

ABSTRACT

Alpha-synuclein (αS) is a conformationally plastic protein that reversibly binds to cellular membranes. It aggregates and is genetically linked to Parkinson's disease (PD). Here, we show that αS directly modulates processing bodies (P-bodies), membraneless organelles that function in mRNA turnover and storage. The N terminus of αS, but not other synucleins, dictates mutually exclusive binding either to cellular membranes or to P-bodies in the cytosol. αS associates with multiple decapping proteins in close proximity on the Edc4 scaffold. As αS pathologically accumulates, aberrant interaction with Edc4 occurs at the expense of physiologic decapping-module interactions. mRNA decay kinetics within PD-relevant pathways are correspondingly disrupted in PD patient neurons and brain. Genetic modulation of P-body components alters αS toxicity, and human genetic analysis lends support to the disease-relevance of these interactions. Beyond revealing an unexpected aspect of αS function and pathology, our data highlight the versatility of conformationally plastic proteins with high intrinsic disorder.


Subject(s)
Parkinson Disease , alpha-Synuclein , Humans , Parkinson Disease/metabolism , Processing Bodies , RNA Stability , alpha-Synuclein/genetics , alpha-Synuclein/metabolism
16.
Science ; 376(6589): eabg5601, 2022 04 08.
Article in English | MEDLINE | ID: mdl-35389777

ABSTRACT

We established a genome-wide compendium of somatic mutation events in 3949 whole cancer genomes representing 19 tumor types. Protein-coding events captured well-established drivers. Noncoding events near tissue-specific genes, such as ALB in the liver or KLK3 in the prostate, characterized localized passenger mutation patterns and may reflect tumor-cell-of-origin imprinting. Noncoding events in regulatory promoter and enhancer regions frequently involved cancer-relevant genes such as BCL6, FGFR2, RAD51B, SMC6, TERT, and XBP1 and represent possible drivers. Unlike most noncoding regulatory events, XBP1 mutations primarily accumulated outside the gene's promoter, and we validated their effect on gene expression using CRISPR-interference screening and luciferase reporter assays. Broadly, our study provides a blueprint for capturing mutation events across the entire genome to guide advances in biological discovery, therapies, and diagnostics.


Subject(s)
Neoplasms , Promoter Regions, Genetic , DNA Mutational Analysis , Gene Expression Regulation, Neoplastic , Humans , Male , Mutation , Neoplasms/genetics , Neoplasms/pathology , Oncogenes , Regulatory Sequences, Nucleic Acid , X-Box Binding Protein 1
17.
Am J Hum Genet ; 109(2): 195-209, 2022 02 03.
Article in English | MEDLINE | ID: mdl-35032432

ABSTRACT

Whole-genome sequencing resolves many clinical cases where standard diagnostic methods have failed. However, at least half of these cases remain unresolved after whole-genome sequencing. Structural variants (SVs; genomic variants larger than 50 base pairs) of uncertain significance are the genetic cause of a portion of these unresolved cases. As sequencing methods using long or linked reads become more accessible and SV detection algorithms improve, clinicians and researchers are gaining access to thousands of reliable SVs of unknown disease relevance. Methods to predict the pathogenicity of these SVs are required to realize the full diagnostic potential of long-read sequencing. To address this emerging need, we developed StrVCTVRE to distinguish pathogenic SVs from benign SVs that overlap exons. In a random forest classifier, we integrated features that capture gene importance, coding region, conservation, expression, and exon structure. We found that features such as expression and conservation are important but are absent from SV classification guidelines. We leveraged multiple resources to construct a size-matched training set of rare, putatively benign and pathogenic SVs. StrVCTVRE performs accurately across a wide SV size range on independent test sets, which will allow clinicians and researchers to eliminate about half of SVs from consideration while retaining a 90% sensitivity. We anticipate clinicians and researchers will use StrVCTVRE to prioritize SVs in probands where no SV is immediately compelling, empowering deeper investigation into novel SVs to resolve cases and understand new mechanisms of disease. StrVCTVRE runs rapidly and is publicly available.


Subject(s)
Algorithms , Genome, Human , Genomic Structural Variation , Software , Supervised Machine Learning , Datasets as Topic , Exons , Genomics/methods , Humans , ROC Curve , Whole Genome Sequencing/statistics & numerical data
18.
Am J Hum Genet ; 109(1): 33-49, 2022 01 06.
Article in English | MEDLINE | ID: mdl-34951958

ABSTRACT

The identification of genes that evolve under recessive natural selection is a long-standing goal of population genetics research that has important applications to the discovery of genes associated with disease. We found that commonly used methods to evaluate selective constraint at the gene level are highly sensitive to genes under heterozygous selection but ubiquitously fail to detect recessively evolving genes. Additionally, more sophisticated likelihood-based methods designed to detect recessivity similarly lack power for a human gene of realistic length from current population sample sizes. However, extensive simulations suggested that recessive genes may be detectable in aggregate. Here, we offer a method informed by population genetics simulations designed to detect recessive purifying selection in gene sets. Applying this to empirical gene sets produced significant enrichments for strong recessive selection in genes previously inferred to be under recessive selection in a consanguineous cohort and in genes involved in autosomal recessive monogenic disorders.


Subject(s)
Gene Frequency , Genes, Recessive , Genetics, Population , Selection, Genetic , Algorithms , Alleles , Genes, Dominant , Genetic Predisposition to Disease , Genetic Variation , Genetics, Population/methods , Genomics/methods , Genotype , Humans , Inheritance Patterns , Likelihood Functions , Models, Genetic , Mutation , United Kingdom
19.
Front Genet ; 12: 763363, 2021.
Article in English | MEDLINE | ID: mdl-34868244

ABSTRACT

Numerous studies have found evidence that GWAS loci experience negative selection, which increases in intensity with the effect size of identified variants. However, there is also accumulating evidence that this selection is not entirely mediated by the focal trait and contains a substantial pleiotropic component. Understanding how selective constraint shapes phenotypic variation requires advancing models capable of balancing these and other components of selection, as well as empirical analyses capable of inferring this balance and how it is generated by the underlying biology. We first review the classic theory connecting phenotypic selection to selection at individual loci as well as approaches and findings from recent analyses of negative selection in GWAS data. We then discuss geometric theories of pleiotropic selection with the potential to guide future modeling efforts. Recent findings revealing the nature of pleiotropic genetic variation provide clues to which genetic relationships are important and should be incorporated into analyses of selection, while findings that effect sizes vary between populations indicate that GWAS measurements could be misleading if effect sizes have also changed throughout human history.
