ABSTRACT
Alpha-synuclein (αS) is a conformationally plastic protein that reversibly binds to cellular membranes. It aggregates and is genetically linked to Parkinson's disease (PD). Here, we show that αS directly modulates processing bodies (P-bodies), membraneless organelles that function in mRNA turnover and storage. The N terminus of αS, but not other synucleins, dictates mutually exclusive binding either to cellular membranes or to P-bodies in the cytosol. αS associates with multiple decapping proteins in close proximity on the Edc4 scaffold. As αS pathologically accumulates, aberrant interaction with Edc4 occurs at the expense of physiologic decapping-module interactions. mRNA decay kinetics within PD-relevant pathways are correspondingly disrupted in PD patient neurons and brain. Genetic modulation of P-body components alters αS toxicity, and human genetic analysis lends support to the disease-relevance of these interactions. Beyond revealing an unexpected aspect of αS function and pathology, our data highlight the versatility of conformationally plastic proteins with high intrinsic disorder.
Subject(s)
Parkinson Disease , alpha-Synuclein , Humans , Parkinson Disease/metabolism , Processing Bodies , RNA Stability , alpha-Synuclein/genetics , alpha-Synuclein/metabolism
ABSTRACT
Rare copy-number variants (rCNVs) include deletions and duplications that occur infrequently in the global human population and can confer substantial risk for disease. In this study, we aimed to quantify the properties of haploinsufficiency (i.e., deletion intolerance) and triplosensitivity (i.e., duplication intolerance) throughout the human genome. We harmonized and meta-analyzed rCNVs from nearly one million individuals to construct a genome-wide catalog of dosage sensitivity across 54 disorders, which defined 163 dosage sensitive segments associated with at least one disorder. These segments were typically gene dense and often harbored dominant dosage sensitive driver genes, which we were able to prioritize using statistical fine-mapping. Finally, we designed an ensemble machine-learning model to predict probabilities of dosage sensitivity (pHaplo & pTriplo) for all autosomal genes, which identified 2,987 haploinsufficient and 1,559 triplosensitive genes, including 648 that were uniquely triplosensitive. This dosage sensitivity resource will provide broad utility for human disease research and clinical genetics.
Subject(s)
DNA Copy Number Variations , Human Genome , DNA Copy Number Variations/genetics , Gene Dosage , Haploinsufficiency/genetics , Humans
ABSTRACT
How disease-associated mutations impair protein activities in the context of biological networks remains mostly undetermined. Although a few renowned alleles are well characterized, functional information is missing for over 100,000 disease-associated variants. Here we functionally profile several thousand missense mutations across a spectrum of Mendelian disorders using various interaction assays. The majority of disease-associated alleles exhibit wild-type chaperone binding profiles, suggesting they preserve protein folding or stability. While common variants from healthy individuals rarely affect interactions, two-thirds of disease-associated alleles perturb protein-protein interactions, with half corresponding to "edgetic" alleles affecting only a subset of interactions while leaving most other interactions unperturbed. With transcription factors, many alleles that leave protein-protein interactions intact affect DNA binding. Different mutations in the same gene leading to different interaction profiles often result in distinct disease phenotypes. Thus disease-associated alleles that perturb distinct protein activities rather than grossly affecting folding and stability are relatively widespread.
Subject(s)
Disease/genetics , Missense Mutation , Protein Interaction Maps , Proteins/genetics , Proteins/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Genome-Wide Association Study , Humans , Open Reading Frames , Protein Folding , Protein Stability
ABSTRACT
Mitochondrial DNA (mtDNA) has an important yet often overlooked role in health and disease. Constraint models quantify the removal of deleterious variation from the population by selection and represent powerful tools for identifying genetic variation that underlies human phenotypes [1-4]. However, nuclear constraint models are not applicable to mtDNA, owing to its distinct features. Here we describe the development of a mitochondrial genome constraint model and its application to the Genome Aggregation Database (gnomAD), a large-scale population dataset that reports mtDNA variation across 56,434 human participants [5]. Specifically, we analyse constraint by comparing the observed variation in gnomAD to that expected under neutrality, which was calculated using a mtDNA mutational model and observed maximum heteroplasmy-level data. Our results highlight strong depletion of expected variation, which suggests that many deleterious mtDNA variants remain undetected. To aid their discovery, we compute constraint metrics for every mitochondrial protein, tRNA and rRNA gene, which revealed a range of intolerance to variation. We further characterize the most constrained regions within genes through regional constraint and identify the most constrained sites within the entire mitochondrial genome through local constraint, which showed enrichment of pathogenic variation. Constraint also clustered in three-dimensional structures, which provided insight into functionally important domains and their disease relevance. Notably, we identify constraint at often overlooked sites, including in rRNA and noncoding regions. Last, we demonstrate that these metrics can improve the discovery of deleterious variation that underlies rare and common phenotypes.
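The observed-versus-expected comparison described in this abstract can be illustrated with a minimal sketch. The function name, the per-variant inputs, and the toy numbers are all assumptions made for illustration; the published model derives its expectations from a mutational model and is considerably more detailed.

```python
def oe_ratio(observed, expected):
    """Return sum(observed) / sum(expected) for one gene or region.

    observed: per-variant observed maximum heteroplasmy levels (0..1)
    expected: per-variant values expected under neutrality (hypothetical)
    """
    total_expected = sum(expected)
    if total_expected <= 0:
        raise ValueError("expected variation must be positive")
    return sum(observed) / total_expected


# Hypothetical toy gene with four possible variants.
observed = [0.0, 0.10, 0.0, 0.30]
expected = [0.25, 0.25, 0.25, 0.25]
ratio = oe_ratio(observed, expected)  # values well below 1 suggest constraint
```

A ratio near 1 would indicate that variation is tolerated about as often as neutrality predicts, while a ratio well below 1 would indicate depletion.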
ABSTRACT
Despite years of active research into the role of DNA repair and replication in mutagenesis, surprisingly little is known about the origin of spontaneous human mutation in the germ line. With the advent of high-throughput sequencing, genome-scale data have revealed statistical properties of mutagenesis in humans. These properties include variation of the mutation rate and spectrum along the genome at different scales in relation to epigenomic features and dependency on parental age. Moreover, mutations originating in mothers are less frequent than those originating in fathers and have a distinct genomic distribution. Statistical analyses that interpret these patterns in the context of known biochemistry can provide mechanistic models of mutagenesis in humans.
Subject(s)
Human Genome , Genomics/methods , Germ Cells/metabolism , Mutagenesis , Mutation Rate , Mutation , Humans
ABSTRACT
Extreme disease phenotypes can provide key insights into the pathophysiology of common conditions, but studying such cases is challenging due to their rarity and the limited statistical power of existing methods. Herein, we used a novel approach to pathway-based mutational burden testing, the rare variant trend test (RVTT), to investigate genetic risk factors for an extreme form of sepsis-induced coagulopathy, infectious purpura fulminans (PF). In addition to prospective patient sample collection, we electronically screened over 10.4 million medical records from 4 large hospital systems and identified historical cases of PF for which archived specimens were available to perform germline whole-exome sequencing. We found a significantly increased burden of low-frequency, putatively function-altering variants in the complement system in patients with PF compared with unselected patients with sepsis (P = .01). A multivariable logistic regression analysis found that the number of complement system variants per patient was independently associated with PF after controlling for age, sex, and disease acuity (P = .01). Functional characterization of PF-associated variants in the immunomodulatory complement receptors CR3 and CR4 revealed that they result in partial or complete loss of anti-inflammatory CR3 function and/or gain of proinflammatory CR4 function. Taken together, these findings suggest that inherited defects in CR3 and CR4 predispose to the maladaptive hyperinflammation that characterizes severe sepsis with coagulopathy.
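To give a feel for what a trend-style burden test measures, the sketch below implements a generic one-sided Cochran-Armitage trend test on the number of pathway variants carried per patient. It is used here only as a stand-in: the abstract does not specify how RVTT selects variants or computes significance, and the counts are hypothetical.

```python
import math


def trend_test(cases, totals, scores=None):
    """One-sided Cochran-Armitage trend test.

    cases[k] / totals[k]: affected and total patients carrying k qualifying
    variants (k is used as the score unless `scores` is given).
    Returns (z, p); a small p means the case fraction rises with the score.
    """
    if scores is None:
        scores = list(range(len(totals)))
    n_total = sum(totals)
    case_rate = sum(cases) / n_total
    # Score-weighted excess of cases over the no-trend expectation.
    t_stat = sum(s * (r - n * case_rate)
                 for s, r, n in zip(scores, cases, totals))
    weighted_sq = sum(n * s * s for s, n in zip(scores, totals))
    weighted = sum(n * s for s, n in zip(scores, totals))
    variance = case_rate * (1 - case_rate) * (weighted_sq - weighted ** 2 / n_total)
    z = t_stat / math.sqrt(variance)
    p = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return z, p


# Hypothetical counts: patients with 0, 1, or 2+ complement variants.
z, p = trend_test(cases=[5, 10, 15], totals=[20, 20, 20])
```

With equal case fractions in every group the statistic is zero and the one-sided p-value is 0.5, which is a convenient sanity check.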
Subject(s)
Purpura Fulminans , Sepsis , Humans , Purpura Fulminans/genetics , Prospective Studies , Complement Receptors
ABSTRACT
The identification of genes that evolve under recessive natural selection is a long-standing goal of population genetics research that has important applications to the discovery of genes associated with disease. We found that commonly used methods to evaluate selective constraint at the gene level are highly sensitive to genes under heterozygous selection but ubiquitously fail to detect recessively evolving genes. Additionally, more sophisticated likelihood-based methods designed to detect recessivity similarly lack power for a human gene of realistic length from current population sample sizes. However, extensive simulations suggested that recessive genes may be detectable in aggregate. Here, we offer a method informed by population genetics simulations designed to detect recessive purifying selection in gene sets. Applying this to empirical gene sets produced significant enrichments for strong recessive selection in genes previously inferred to be under recessive selection in a consanguineous cohort and in genes involved in autosomal recessive monogenic disorders.
Subject(s)
Gene Frequency , Recessive Genes , Population Genetics , Genetic Selection , Algorithms , Alleles , Dominant Genes , Genetic Predisposition to Disease , Genetic Variation , Population Genetics/methods , Genomics/methods , Genotype , Humans , Inheritance Patterns , Likelihood Functions , Genetic Models , Mutation , United Kingdom
ABSTRACT
Whole-genome sequencing resolves many clinical cases where standard diagnostic methods have failed. However, at least half of these cases remain unresolved after whole-genome sequencing. Structural variants (SVs; genomic variants larger than 50 base pairs) of uncertain significance are the genetic cause of a portion of these unresolved cases. As sequencing methods using long or linked reads become more accessible and SV detection algorithms improve, clinicians and researchers are gaining access to thousands of reliable SVs of unknown disease relevance. Methods to predict the pathogenicity of these SVs are required to realize the full diagnostic potential of long-read sequencing. To address this emerging need, we developed StrVCTVRE to distinguish pathogenic SVs from benign SVs that overlap exons. In a random forest classifier, we integrated features that capture gene importance, coding region, conservation, expression, and exon structure. We found that features such as expression and conservation are important but are absent from SV classification guidelines. We leveraged multiple resources to construct a size-matched training set of rare, putatively benign and pathogenic SVs. StrVCTVRE performs accurately across a wide SV size range on independent test sets, which will allow clinicians and researchers to eliminate about half of SVs from consideration while retaining a 90% sensitivity. We anticipate clinicians and researchers will use StrVCTVRE to prioritize SVs in probands where no SV is immediately compelling, empowering deeper investigation into novel SVs to resolve cases and understand new mechanisms of disease. StrVCTVRE runs rapidly and is publicly available.
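The trade-off mentioned in this abstract (discarding about half of the candidate SVs while keeping 90% sensitivity) can be sketched as a simple score cutoff. The scores below are hypothetical and the helper names are assumptions; this does not reproduce StrVCTVRE's classifier or its evaluation.

```python
import math


def threshold_for_sensitivity(pathogenic_scores, sensitivity=0.90):
    """Largest cutoff that still keeps at least `sensitivity` of pathogenic SVs."""
    ranked = sorted(pathogenic_scores, reverse=True)
    keep = math.ceil(sensitivity * len(ranked))
    return ranked[keep - 1]


def fraction_eliminated(all_scores, cutoff):
    """Share of SVs scoring below the cutoff, i.e. removed from consideration."""
    return sum(1 for s in all_scores if s < cutoff) / len(all_scores)


# Hypothetical classifier scores (higher = more likely pathogenic).
pathogenic = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.20]
benign = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.15, 0.25, 0.35]

cutoff = threshold_for_sensitivity(pathogenic, sensitivity=0.90)
eliminated = fraction_eliminated(pathogenic + benign, cutoff)
```

In practice the cutoff would be chosen on a held-out set rather than on the same variants it is applied to.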
Subject(s)
Algorithms , Human Genome , Genomic Structural Variation , Software , Supervised Machine Learning , Datasets as Topic , Exons , Genomics/methods , Humans , ROC Curve , Whole Genome Sequencing/statistics & numerical data
ABSTRACT
Large biobank-scale whole genome sequencing (WGS) studies are rapidly identifying a multitude of coding and non-coding variants. They provide an unprecedented resource for illuminating the genetic basis of human diseases. Variant functional annotations play a critical role in WGS analysis, result interpretation, and prioritization of disease- or trait-associated causal variants. Existing functional annotation databases have limited scope to perform online queries and functionally annotate the genotype data of large biobank-scale WGS studies. We develop the Functional Annotation of Variants Online Resources (FAVOR) to meet these pressing needs. FAVOR provides a comprehensive multi-faceted variant functional annotation online portal that summarizes and visualizes findings of all possible nine billion single nucleotide variants (SNVs) across the genome. It allows for rapid variant-, gene- and region-level queries of variant functional annotations. FAVOR integrates variant functional information from multiple sources to describe the functional characteristics of variants and facilitates prioritizing plausible causal variants influencing human phenotypes. Furthermore, we provide a scalable annotation tool, FAVORannotator, to functionally annotate large-scale WGS studies and efficiently store the genotype and their variant functional annotation data in a single file using the annotated Genomic Data Structure (aGDS) format, making downstream analysis more convenient. FAVOR and FAVORannotator are available at https://favor.genohub.org.
Subject(s)
Human Genome , Software , Humans , Molecular Sequence Annotation , Genomics , Genotype , Genetic Variation
ABSTRACT
Genetic association studies of many heritable traits resulting from physiological testing often have modest sample sizes due to the cost and burden of the required phenotyping. This reduces statistical power and limits discovery of multiple genetic associations. We present a strategy to leverage pleiotropy between traits to both discover new loci and to provide mechanistic hypotheses of the underlying pathophysiology. Specifically, we combine a colocalization test with a locus-level test of pleiotropy. In simulations, we show that this approach is highly selective for identifying true pleiotropy driven by the same causative variant, thereby improving the chance of replicating the associations in underpowered validation cohorts and leading to higher interpretability. Here, as an exemplar, we use Obstructive Sleep Apnea (OSA), a common disorder diagnosed using overnight multi-channel physiological testing. We leverage pleiotropy with relevant cellular and cardio-metabolic phenotypes and gene expression traits to map new risk loci in an underpowered OSA GWAS. We identify several pleiotropic loci harboring suggestive associations to OSA and genome-wide significant associations to other traits, and show that their OSA association replicates in independent cohorts of diverse ancestries. By investigating pleiotropic loci, our strategy allows us to propose new hypotheses about OSA pathobiology across many physiological layers. For example, we identify and replicate pleiotropy across plateletcrit, OSA and an eQTL of DNA primase subunit 1 (PRIM1) in immune cells. We find suggestive links between OSA, a measure of lung function (FEV1/FVC), and an eQTL of matrix metallopeptidase 15 (MMP15) in lung tissue. We also link a previously known genome-wide significant peak for OSA in the hexokinase 1 (HK1) locus to hematocrit and other red blood cell related traits.
Thus, the analysis of pleiotropic associations has the potential to assemble diverse phenotypes into a chain of mechanistic hypotheses that provide insight into the pathogenesis of complex human diseases.
Subject(s)
Genome-Wide Association Study , Obstructive Sleep Apnea , Humans , Genome-Wide Association Study/methods , Phenotype , Genetic Association Studies , Sleep , Genetic Pleiotropy , Single Nucleotide Polymorphism , DNA Primase
ABSTRACT
Genomic deletions provide a powerful loss-of-function model in noncoding regions to assess the role of purifying selection on genetic variation. Regulatory element function is characterized by nonuniform tissue and cell type activity, necessarily linking the study of fitness consequences from regulatory variants to their corresponding cellular activity. We generated a callset of deletions from genomes in the Alzheimer's Disease Neuroimaging Initiative (ADNI) and used deletions from The 1000 Genomes Project Consortium (1000GP) in order to examine whether purifying selection preserves noncoding sites of chromatin accessibility marked by DNase I hypersensitivity (DHS), histone modification (enhancer, transcribed, Polycomb-repressed, heterochromatin), and chromatin loop anchors. To examine this in a cellular activity-aware manner, we developed a statistical method, pleiotropy ratio score (PlyRS), which calculates a correlation-adjusted count of "cellular pleiotropy" for each noncoding base pair by analyzing shared regulatory annotations across tissues and cell types. By comparing real deletion PlyRS values to simulations in a length-matched framework and by using genomic covariates in analyses, we found that purifying selection acts to preserve both DHS and enhancer noncoding sites. However, we did not find evidence of purifying selection for noncoding transcribed, Polycomb-repressed, or heterochromatin sites beyond that of the noncoding background. Additionally, we found evidence that purifying selection is acting on chromatin loop integrity by preserving colocalized CTCF binding sites. At regions of DHS, enhancer, and CTCF within chromatin loop anchors, we found evidence that both sites of activity specific to a particular tissue or cell type and sites of cellularly pleiotropic activity are preserved by selection.
Subject(s)
Chromatin , Genomics , Binding Sites , Chromatin/genetics , Humans , Polycomb-Group Proteins/metabolism
ABSTRACT
The rate at which plants grow is a major functional trait in plant ecology. However, little is known about its evolution in natural populations. Here, we investigate evolutionary and environmental factors shaping variation in the growth rate of Arabidopsis thaliana. We used plant diameter as a proxy to monitor plant growth over time in environments that mimicked latitudinal differences in the intensity of natural light radiation, across a set of 278 genotypes sampled within four broad regions, including an outgroup set of genotypes from China. A field experiment conducted under natural conditions confirmed the ecological relevance of the observed variation. All genotypes markedly expanded their rosette diameter when the light supply was decreased, demonstrating that environmental plasticity is a predominant source of variation to adapt plant size to prevailing light conditions. Yet, we detected significant levels of genetic variation both in growth rate and growth plasticity. Genome-wide association studies revealed that only 2 single nucleotide polymorphisms associate with genetic variation for growth above Bonferroni confidence levels. However, marginally associated variants were significantly enriched among genes with an annotated role in growth and stress reactions. Polygenic scores computed from marginally associated variants confirmed the polygenic basis of growth variation. For both light regimes, phenotypic divergence between the most distantly related population (China) and the various regions in Europe is smaller than the variation observed within Europe, indicating that the evolution of growth rate is likely to be constrained by stabilizing selection. We observed that Spanish genotypes, however, reach a significantly larger size than Northern European genotypes. 
Tests of adaptive divergence and analysis of the individual burden of deleterious mutations reveal that adaptive processes have played a more important role in shaping regional differences in rosette growth than maladaptive evolution.
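The two GWAS ideas used in this abstract, a Bonferroni significance cutoff and a polygenic score built from marginally associated variants, can be sketched as follows. All p-values, effect sizes, the number of tests, and the marginal cutoff are hypothetical values chosen for illustration, not figures from the study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def polygenic_score(dosages, effects):
    """Sum of allele dosages (0, 1, or 2) weighted by per-variant effects."""
    return sum(d * b for d, b in zip(dosages, effects))


# Hypothetical GWAS output: (p-value, effect size) per SNP.
snps = [(1e-9, 0.40), (3e-5, 0.10), (2e-4, -0.05), (0.30, 0.01)]
n_tests = 1_000_000            # assumed number of tested SNPs
marginal_cutoff = 1e-4         # assumed, not taken from the source

cutoff = bonferroni_threshold(0.05, n_tests)
significant = [s for s in snps if s[0] < cutoff]
marginal = [s for s in snps if s[0] < marginal_cutoff]

# Score one hypothetical genotype using only the marginally associated SNPs.
dosages = [2, 1]
score = polygenic_score(dosages, [effect for _, effect in marginal])
```

The sketch shows why few variants pass the strict cutoff while a larger, marginally associated set can still carry predictive signal in aggregate.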
Subject(s)
Physiological Adaptation/genetics , Arabidopsis/genetics , Multifactorial Inheritance/genetics , Genetic Selection , Acclimatization/genetics , Arabidopsis/growth & development , China , Europe , Genetic Variation/genetics , Population Genetics , Genotype , Phenotype , Plant Development/genetics
ABSTRACT
In complex trait genetics, the ability to predict phenotype from genotype is the ultimate measure of our understanding of genetic architecture underlying the heritability of a trait. A complete understanding of the genetic basis of a trait should allow for predictive methods with accuracies approaching the trait's heritability. The highly polygenic nature of quantitative traits and most common phenotypes has motivated the development of statistical strategies focused on combining myriad individually non-significant genetic effects. Now that predictive accuracies are improving, there is a growing interest in the practical utility of such methods for predicting risk of common diseases responsive to early therapeutic intervention. However, existing methods require individual-level genotypes or depend on accurately specifying the genetic architecture underlying each disease to be predicted. Here, we propose a polygenic risk prediction method that does not require explicitly modeling any underlying genetic architecture. We start with summary statistics in the form of SNP effect sizes from a large GWAS cohort. We then remove the correlation structure across summary statistics arising due to linkage disequilibrium and apply a piecewise linear interpolation on conditional mean effects. In both simulated and real datasets, this new non-parametric shrinkage (NPS) method can reliably allow for linkage disequilibrium in summary statistics of 5 million dense genome-wide markers and consistently improves prediction accuracy. We show that NPS improves the identification of groups at high risk for breast cancer, type 2 diabetes, inflammatory bowel disease, and coronary heart disease, all of which have available early intervention or prevention treatments.
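The piecewise linear interpolation step mentioned in this abstract can be sketched as a shrinkage curve that maps an estimated effect size to a conditional mean effect. The knot values below are hypothetical, the function names are assumptions, and the decorrelation of summary statistics that precedes this step in NPS is not shown.

```python
from bisect import bisect_right


def piecewise_linear(x, knots_x, knots_y):
    """Linearly interpolate y at x; clamp outside the knot range."""
    if x <= knots_x[0]:
        return knots_y[0]
    if x >= knots_x[-1]:
        return knots_y[-1]
    i = bisect_right(knots_x, x)          # knots_x[i-1] <= x < knots_x[i]
    x0, x1 = knots_x[i - 1], knots_x[i]
    y0, y1 = knots_y[i - 1], knots_y[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical knots: |estimated effect| -> conditional mean |true effect|.
KNOTS_X = [0.0, 0.01, 0.02, 0.05]
KNOTS_Y = [0.0, 0.001, 0.008, 0.04]


def shrink(effect):
    """Shrink the magnitude along the curve and keep the original sign."""
    magnitude = piecewise_linear(abs(effect), KNOTS_X, KNOTS_Y)
    return magnitude if effect >= 0 else -magnitude


shrunken = [shrink(b) for b in (0.015, -0.015, 1.0)]
```

Small estimated effects are pulled strongly toward zero while large ones are retained, without assuming any parametric distribution of effect sizes.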
Subject(s)
Multifactorial Inheritance/genetics , Aged , Cohort Studies , Type 2 Diabetes Mellitus/genetics , Female , Genome-Wide Association Study/methods , Genotype , Humans , Linkage Disequilibrium/genetics , Male , Middle Aged , Genetic Models , Phenotype , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics
ABSTRACT
During range expansion, edge populations are expected to face increased genetic drift, which in turn can alter and potentially compromise adaptive dynamics, preventing the removal of deleterious mutations and slowing down adaptation. Here, we contrast populations of the European subspecies Arabidopsis lyrata ssp. petraea, which expanded its Northern range after the last glaciation. We document a sharp decline in effective population size in the range-edge population and observe that nonsynonymous variants segregate at higher frequencies. We detect a 4.9% excess of derived nonsynonymous variants per individual in the range-edge population, suggesting an increase of the genomic burden of deleterious mutations. Inference of the fitness effects of mutations and modeling of allele frequencies under the explicit demographic history of each population predicts a depletion of rare deleterious variants in the range-edge population, but an enrichment for fixed ones, consistent with the bottleneck effect. However, the demographic history of the range-edge population predicts a small net decrease in per-individual fitness. Consistent with this prediction, the range-edge population is not impaired in its growth and survival measured in a common garden experiment. We further observe that the allelic diversity at the self-incompatibility locus, which ensures strict outcrossing and evolves under negative frequency-dependent selection, has remained unchanged. Genomic footprints indicative of selective sweeps are broader in the Northern population but not less frequent. We conclude that the outcrossing species A. lyrata ssp. petraea shows a strong resilience to the effect of range expansion.
Subject(s)
Arabidopsis/genetics , Genetic Load , Plant Dispersal , Gene Flow , Recessive Genes , Genetic Fitness , Plant Genome , Population Dynamics , Genetic Selection
ABSTRACT
Sleep disordered breathing (SDB)-related overnight hypoxemia is associated with cardiometabolic disease and other comorbidities. Understanding the genetic bases for variations in nocturnal hypoxemia may help clarify the mechanisms influencing oxygenation and SDB-related mortality. We conducted genome-wide association tests across 10 cohorts and 4 populations to identify genetic variants associated with three correlated measures of overnight oxyhemoglobin saturation: average and minimum oxyhemoglobin saturation during sleep and the percent of sleep with oxyhemoglobin saturation under 90%. The discovery sample consisted of 8,326 individuals. Variants with p < 1 × 10^-6 were analyzed in a replication group of 14,410 individuals. We identified 3 significantly associated regions, including 2 regions in multi-ethnic analyses (2q12, 10q22). SNPs in the 2q12 region were associated with minimum SpO2 (rs78136548, p = 2.70 × 10^-10). SNPs at 10q22 were associated with all three traits, including average SpO2 (rs72805692, p = 4.58 × 10^-8). SNPs in both regions were associated in over 20,000 individuals and are supported by prior associations or functional evidence. Four additional significant regions were detected in secondary sex-stratified and combined discovery and replication analyses, including a region overlapping Reelin, a known marker of respiratory complex neurons. These are the first genome-wide significant findings reported for oxyhemoglobin saturation during sleep, a phenotype of high clinical interest. Our replicated associations with HK1 and IL18R1 suggest that variants in inflammatory pathways, such as the biologically-plausible NLRP3 inflammasome, may contribute to nocturnal hypoxemia.
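The three saturation phenotypes named in this abstract can be computed from an overnight recording as in the minimal sketch below. It assumes evenly spaced samples restricted to sleep; the function name and the readings are hypothetical.

```python
def summarize_spo2(samples, threshold=90.0):
    """Return (average, minimum, percent of samples below threshold)."""
    if not samples:
        raise ValueError("need at least one saturation sample")
    average = sum(samples) / len(samples)
    minimum = min(samples)
    below = sum(1 for s in samples if s < threshold)
    return average, minimum, 100.0 * below / len(samples)


# Hypothetical overnight readings (% saturation) at equal time intervals.
avg, low, pct_under_90 = summarize_spo2([96, 94, 88, 92])
```

Because the samples are equally spaced, the share of samples under 90% stands in for the percent of sleep time spent under 90%.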
Subject(s)
Hexokinase/genetics , Interleukin-18 Receptor alpha Subunit/genetics , Oxyhemoglobins/metabolism , Sleep/genetics , Adolescent , Adult , Aged , Aged 80 and over , Neuronal Cell Adhesion Molecules/genetics , Computational Biology , Extracellular Matrix Proteins/genetics , Female , Gene Regulatory Networks , Genetic Variation , Genome-Wide Association Study , Humans , Hypoxia/blood , Hypoxia/genetics , Male , Middle Aged , NLR Family Pyrin Domain-Containing 3 Protein/genetics , Nerve Tissue Proteins/genetics , Oxygen/blood , Single Nucleotide Polymorphism , Quantitative Trait Loci , Reelin Protein , Serine Endopeptidases/genetics , Sleep Apnea Syndromes/blood , Sleep Apnea Syndromes/genetics , Young Adult
ABSTRACT
Identifying genetic variation in bacteria that has been shaped by ecological differences remains an important challenge. For recombining bacteria, the sign and strength of linkage provide a unique lens into ongoing selection. We show that derived alleles <300 bp apart in Neisseria gonorrhoeae exhibit more coupling linkage than repulsion linkage, a pattern that cannot be explained by limited recombination or neutrality as these couplings are significantly stronger for nonsynonymous alleles than synonymous alleles. This general pattern is driven by a small fraction of highly diverse genes, many of which exhibit evidence of interspecies horizontal gene transfer and an excess of intermediate frequency alleles. Extensive simulations show that two distinct forms of positive selection can create these patterns of genetic variation: directional selection on horizontally transferred alleles or balancing selection that maintains distinct haplotypes in the presence of recombination. Our results establish a framework for identifying patterns of selection in fine-scale haplotype structure that indicate specific ecological processes in species that recombine with distantly related lineages or possess coexisting adaptive haplotypes.
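The distinction between coupling and repulsion linkage in this abstract comes down to the sign of the linkage disequilibrium coefficient D between derived alleles at two sites. The sketch below computes it from phased haplotypes; the data and function names are hypothetical, and the distance filter (<300 bp) and the nonsynonymous versus synonymous comparison are not shown.

```python
def linkage_d(haplotypes):
    """D = f(both derived) - f(derived at site 1) * f(derived at site 2).

    haplotypes: list of (a, b) pairs, 1 = derived allele, 0 = ancestral.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    return p_ab - p_a * p_b


def linkage_sign(haplotypes):
    """Classify a site pair by the sign of D."""
    d = linkage_d(haplotypes)
    if d > 0:
        return "coupling"    # derived alleles tend to share haplotypes
    if d < 0:
        return "repulsion"   # derived alleles tend to sit on different haplotypes
    return "none"


coupled = [(1, 1), (1, 1), (0, 0), (0, 0)]
repelled = [(1, 0), (0, 1), (1, 0), (0, 1)]
```

Tallying these signs over many nearby site pairs gives the excess of coupling over repulsion that the abstract reports.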
Subject(s)
Genetic Variation , Neisseria gonorrhoeae/genetics , DNA Sequence Analysis/methods , Molecular Evolution , Gene Frequency , Horizontal Gene Transfer , Haplotypes , Linkage Disequilibrium , Genetic Recombination , Genetic Selection
ABSTRACT
Obstructive sleep apnea (OSA) is a common disorder associated with increased risk of cardiovascular disease and mortality. Its prevalence and severity vary across ancestral background. Although OSA traits are heritable, few genetic associations have been identified. To identify genetic regions associated with OSA and improve statistical power, we applied admixture mapping on three primary OSA traits [the apnea hypopnea index (AHI), overnight average oxyhemoglobin saturation (SaO2) and percentage time SaO2 < 90%] and a secondary trait (respiratory event duration) in a Hispanic/Latino American population study of 11 575 individuals with significant variation in ancestral background. Linear mixed models were fitted using previously inferred African, European and Amerindian local genetic ancestry markers. Global African ancestry was associated with a lower AHI, higher SaO2 and shorter event duration. Admixture mapping analysis of the primary OSA traits identified local African ancestry at the chromosomal region 2q37 as genome-wide significantly associated with AHI (P < 5.7 × 10^-5), and European and Amerindian ancestries at 18q21 suggestively associated with both AHI and percentage time SaO2 < 90% (P < 10^-3). Follow-up joint ancestry-SNP association analyses identified novel variants in ferrochelatase (FECH), significantly associated with AHI and percentage time SaO2 < 90% after adjusting for multiple tests (P < 8 × 10^-6). These signals contributed to the admixture mapping associations and were replicated in independent cohorts. In this first admixture mapping study of OSA, novel associations with variants in the iron/heme metabolism pathway suggest a role for iron in influencing respiratory traits underlying OSA.
Subject(s)
Ferrochelatase/genetics , Genome-Wide Association Study , Obstructive Sleep Apnea/genetics , Aged , Chromosome Mapping , Female , Genotype , Hispanic or Latino/genetics , Humans , Male , Middle Aged , Single Nucleotide Polymorphism/genetics , Polysomnography , Obstructive Sleep Apnea/diagnostic imaging , Obstructive Sleep Apnea/physiopathology , White People/genetics
ABSTRACT
PURPOSE: Genomic sequencing has become an increasingly powerful and relevant tool to be leveraged for the discovery of genetic aberrations underlying rare, Mendelian conditions. Although the computational tools incorporated into diagnostic workflows for this task are continually evolving and improving, we nevertheless sought to investigate commonalities across sequencing processing workflows to reveal consensus and standard practice tools and highlight exploratory analyses where technical and theoretical method improvements would be most impactful. METHODS: We collected details regarding the computational approaches used by a genetic testing laboratory and 11 clinical research sites in the United States participating in the Undiagnosed Diseases Network via meetings with bioinformaticians, online survey forms, and analyses of internal protocols. RESULTS: We found that tools for processing genomic sequencing data can be grouped into four distinct categories. Whereas well-established practices exist for initial variant calling and quality control steps, there is substantial divergence across sites in later stages for variant prioritization and multimodal data integration, demonstrating a diversity of approaches for solving the most mysterious undiagnosed cases. CONCLUSION: The largest differences across diagnostic workflows suggest that advances in structural variant detection, noncoding variant interpretation, and integration of additional biomedical data may be especially promising for solving chronically undiagnosed cases.
Subject(s)
Genomics , Undiagnosed Diseases , Computational Biology , Genetic Testing , Genome , Humans , Software , Workflow
ABSTRACT
Patterns of amino acid conservation have served as a tool for understanding protein evolution. The same principles have also found broad application in human genomics, driven by the need to interpret the pathogenic potential of variants in patients. Here we performed a systematic comparative genomics analysis of human disease-causing missense variants. We found that an appreciable fraction of disease-causing alleles are fixed in the genomes of other species, suggesting a role for genomic context. We developed a model of genetic interactions that predicts most of these to be simple pairwise compensations. Functional testing of this model on two known human disease genes revealed discrete cis amino acid residues that, although benign on their own, could rescue the human mutations in vivo. This approach was also applied to ab initio gene discovery to support the identification of a de novo disease driver in BTG2 that is subject to protective cis-modification in more than 50 species. Finally, on the basis of our data and models, we developed a computational tool to predict candidate residues subject to compensation. Taken together, our data highlight the importance of cis-genomic context as a contributor to protein evolution; they provide an insight into the complexity of allele effect on phenotype; and they are likely to assist methods for predicting allele pathogenicity.