ABSTRACT
How the genetic composition of a population changes through stochastic processes, such as genetic drift, in combination with deterministic processes, such as selection, is critical to understanding how phenotypes vary in space and time. Here, we show how evolutionary forces affecting selection, including recombination and effective population size, drive genomic patterns of allele-specific expression (ASE). Integrating tissue-specific genotypic and transcriptomic data from 1500 individuals from two different cohorts, we demonstrate that ASE is less often observed in regions of low recombination, and loci in high or normal recombination regions are more efficient at using ASE to underexpress harmful mutations. By tracking genetic ancestry, we discriminate between ASE variability due to past demographic effects, including subsequent bottlenecks, versus local environment. We observe that ASE is not randomly distributed along the genome and that population parameters influencing the efficacy of natural selection alter ASE levels genome wide.
Subjects
Genetic Variation; Selection, Genetic; Alleles; Genetic Drift; Humans; Recombination, Genetic
ABSTRACT
BACKGROUND: Uromodulin, the most abundant protein excreted in normal urine, plays major roles in kidney physiology and disease. The mechanisms regulating the urinary excretion of uromodulin remain essentially unknown. METHODS: We conducted a meta-analysis of genome-wide association studies for raw (uUMOD) and indexed to creatinine (uUCR) urinary levels of uromodulin in 29,315 individuals of European ancestry from 13 cohorts. We tested the distribution of candidate genes in kidney segments and investigated the effects of keratin-40 (KRT40) on uromodulin processing. RESULTS: Two genome-wide significant signals were identified for uUMOD: a novel locus (P = 1.24E-08) over the KRT40 gene coding for KRT40, a type 1 keratin expressed in the kidney, and the UMOD-PDILT locus (P = 2.17E-88), with two independent sets of single nucleotide polymorphisms spread over UMOD and PDILT. Two genome-wide significant signals for uUCR were identified at the UMOD-PDILT locus and at the novel WDR72 locus previously associated with kidney function. The effect sizes for rs8067385, the index single nucleotide polymorphism in the KRT40 locus, were similar for both uUMOD and uUCR. KRT40 colocalized with uromodulin, and modulating its expression in thick ascending limb (TAL) cells affected uromodulin processing and excretion. CONCLUSIONS: Common variants in KRT40, WDR72, UMOD, and PDILT associate with the levels of uromodulin in urine. The expression of KRT40 affects uromodulin processing in TAL cells. These results, although limited by lack of replication, provide insights into the biology of uromodulin, the role of keratins in the kidney, and the influence of the UMOD-PDILT locus on kidney function.
Subjects
Genome-Wide Association Study; Kidney; Creatinine; Humans; Polymorphism, Single Nucleotide; Protein Disulfide-Isomerases/genetics; Uromodulin/genetics
ABSTRACT
Age-related clonal hematopoiesis (ARCH) is characterized by age-associated accumulation of somatic mutations in hematopoietic stem cells (HSCs) or their pluripotent descendants. HSCs harboring driver mutations will be positively selected and cells carrying these mutations will rise in frequency. While ARCH is a known risk factor for blood malignancies, such as Acute Myeloid Leukemia (AML), why some people who harbor ARCH driver mutations do not progress to AML remains unclear. Here, we model the interaction of positive and negative selection in deeply sequenced blood samples from individuals who subsequently progressed to AML, compared to healthy controls, using deep learning and population genetics. Our modeling allows us to discriminate amongst evolutionary classes with high accuracy and captures signatures of purifying selection in most individuals. Purifying selection, acting on benign or mildly damaging passenger mutations, appears to play a critical role in preventing disease-predisposing clones from rising to dominance and is associated with longer disease-free survival. Through exploring a range of evolutionary models, we show how different classes of selection shape clonal dynamics and health outcomes thus enabling us to better identify individuals at a high risk of malignancy.
Subjects
Clonal Evolution; Clonal Hematopoiesis/genetics; Hematopoietic Stem Cells/metabolism; Leukemia, Myeloid/genetics; Mutation; Acute Disease; Adult; Aged; Deep Learning; Genetics, Population/methods; Genetics, Population/statistics & numerical data; Hematopoietic Stem Cells/cytology; Humans; Kaplan-Meier Estimate; Leukemia, Myeloid/pathology; Middle Aged; Models, Genetic; Outcome Assessment, Health Care/methods; Outcome Assessment, Health Care/statistics & numerical data
ABSTRACT
AIMS/HYPOTHESIS: Type 2 diabetes increases the risk of cardiovascular and renal complications, but early risk prediction could lead to timely intervention and better outcomes. Genetic information can be used to enable early detection of risk. METHODS: We developed a multi-polygenic risk score (multiPRS) that combines ten weighted PRSs (10 wPRS) composed of 598 SNPs associated with main risk factors and outcomes of type 2 diabetes, derived from summary statistics data of genome-wide association studies. The 10 wPRS, first principal component of ethnicity, sex, age at onset and diabetes duration were included in one logistic regression model to predict micro- and macrovascular outcomes in 4098 participants in the ADVANCE study and 17,604 individuals with type 2 diabetes in the UK Biobank study. RESULTS: The model showed a similar predictive performance for cardiovascular and renal complications in different cohorts. It identified the top 30% of ADVANCE participants with a mean of 3.1-fold increased risk of major micro- and macrovascular events (p = 6.3 × 10⁻²¹ and p = 9.6 × 10⁻³¹, respectively) and a 4.4-fold (p = 6.8 × 10⁻³³) higher risk of cardiovascular death. While in ADVANCE overall, combined intensive blood pressure and glucose control decreased cardiovascular death by 24%, the model identified a high-risk group in whom it decreased the mortality rate by 47%, and a low-risk group in whom it had no discernible effect. High-risk individuals had the greatest absolute risk reduction with a number needed to treat of 12 to prevent one cardiovascular death over 5 years. CONCLUSIONS/INTERPRETATION: This novel multiPRS model stratified individuals with type 2 diabetes according to risk of complications and helped to target earlier those who would receive greater benefit from intensive therapy.
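Illustrative note: a weighted PRS is conventionally computed as the sum, over SNPs, of the effect-allele dosage multiplied by a per-SNP weight taken from GWAS summary statistics. The sketch below shows only that generic calculation. The function name, SNP identifiers, and weights are hypothetical and are not taken from the study, and the abstract does not describe how the authors implemented their scores.

```python
from typing import Mapping


def weighted_prs(dosages: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Return the sum of dosage * weight over SNPs present in both mappings.

    dosages: effect-allele counts per SNP (0, 1 or 2, or imputed fractions).
    weights: per-SNP effect sizes, e.g. log odds ratios from summary statistics.
    SNPs that have a weight but no genotype are skipped (an assumption of this sketch).
    """
    return sum(w * dosages[snp] for snp, w in weights.items() if snp in dosages)


# Hypothetical SNP identifiers and weights, for illustration only.
weights = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 0.125}
dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = weighted_prs(dosages, weights)
print(score)
```

In a model such as the one described, several scores of this kind would then enter a logistic regression as predictors alongside the clinical covariates.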
Subjects
Diabetes Complications; Diabetes Mellitus, Type 2; Multifactorial Inheritance; Blood Glucose; Blood Pressure/genetics; Diabetes Complications/complications; Diabetes Mellitus, Type 2/complications; Diabetes Mellitus, Type 2/genetics; Genome-Wide Association Study; Humans; Risk Factors
ABSTRACT
The binding of PRDM9 to chromatin is a key step in the induction of DNA double-strand breaks associated with meiotic recombination hotspots; it is normally expressed solely in germ cells. We interrogated 1879 cancer samples in 39 different cancer types and found that PRDM9 is unexpectedly expressed in 20% of these tumors even after stringent gene homology correction. The expression levels of PRDM9 in tumors are significantly higher than those found in healthy neighboring tissues and in healthy nongerm tissue databases. Recurrently mutated regions located within 5 Mb of the PRDM9 loci, as well as differentially expressed genes in meiotic pathways, correlate with PRDM9 expression. In samples with aberrant PRDM9 expression, structural variant breakpoints frequently neighbor the DNA motif recognized by PRDM9, and there is an enrichment of structural variants at sites of known meiotic PRDM9 activity. This study is the first to provide evidence of an association between aberrant expression of the meiosis-specific gene PRDM9 and genomic instability in cancer.
Subjects
Gene Expression Regulation, Developmental; Histone-Lysine N-Methyltransferase/genetics; Mutation Rate; Neoplasms/genetics; Chromosome Breakpoints; Genomic Instability; Histone-Lysine N-Methyltransferase/metabolism; Humans
ABSTRACT
Uncovering the interaction between genomes and the environment is a principal challenge of modern genomics and preventive medicine. While theoretical models are well defined, little is known of the G × E interactions in humans. We used an integrative approach to comprehensively assess the interactions between 1.6 million data points, encompassing a range of environmental exposures, health, and gene expression levels, coupled with whole-genome genetic variation. From ~1000 individuals of a founder population in Quebec, we reveal a substantial impact of the environment on the transcriptome and clinical endophenotypes, overpowering that of genetic ancestry. Air pollution impacts gene expression and pathways affecting cardio-metabolic and respiratory traits, when controlling for genetic ancestry. Finally, we capture four expression quantitative trait loci that interact with the environment (air pollution). Our findings demonstrate how the local environment directly affects disease risk phenotypes and that genetic variation, including less common variants, can modulate an individual's response to environmental challenges.
Subjects
Gene-Environment Interaction; Adult; Aged; Air Pollution; Environmental Exposure; France/ethnology; Gene Expression; Gene Flow; Humans; Middle Aged; Penetrance; Polymorphism, Genetic; Quantitative Trait Loci; Quebec; Transcriptome
ABSTRACT
Humans have colonized the planet through a series of range expansions, which deeply impacted genetic diversity in newly settled areas and potentially increased the frequency of deleterious mutations on expanding wave fronts. To test this prediction, we studied the genomic diversity of French Canadians who colonized Quebec in the 17th century. We used historical information and records from ~4000 ascending genealogies to select individuals whose ancestors lived mostly on the colonizing wave front and individuals whose ancestors remained in the core of the settlement. Comparison of exomic diversity reveals that: (i) both new and low-frequency variants are significantly more deleterious in front than in core individuals, (ii) equally deleterious mutations are at higher frequencies in front individuals, and (iii) front individuals are two times more likely to be homozygous for rare very deleterious mutations present in Europeans. These differences have emerged in the past six to nine generations and cannot be explained by differential inbreeding, but are consistent with relaxed selection mainly due to higher rates of genetic drift on the wave front. Demographic inference and modeling of the evolution of rare variants suggest lower effective size on the front, and lead to an estimation of selection coefficients that increase with conservation scores. Even though range expansions have had a relatively limited impact on the overall fitness of French Canadians, they could explain the higher prevalence of recessive genetic diseases in recently settled regions of Quebec.
Subjects
Genetics, Population; Models, Genetic; Selection, Genetic; Algorithms; Alleles; Biological Evolution; Computer Simulation; Demography; Evolution, Molecular; Gene Frequency; Gene Ontology; Genetic Fitness; Genetic Variation; Humans; Mutation; Polymorphism, Single Nucleotide; Quebec
ABSTRACT
Perturbations of γ-aminobutyric acid (GABA) neurotransmission in the human prefrontal cortex have been implicated in the pathogenesis of schizophrenia (SCZ), but the mechanisms are unclear. NKCC1 (SLC12A2) is a Cl(-)-importing cation-Cl(-) cotransporter that contributes to the maintenance of depolarizing GABA activity in immature neurons, and variation in SLC12A2 has been shown to increase the risk for schizophrenia via alterations of NKCC1 mRNA expression. However, no disease-causing mutations or functional variants in NKCC1 have been identified in human patients with SCZ. Here, by sequencing three large French-Canadian (FC) patient cohorts of SCZ, autism spectrum disorders (ASD), and intellectual disability (ID), we identified a novel heterozygous NKCC1 missense variant (p.Y199C) in SCZ. This variant is located in an evolutionarily conserved residue in the critical N-terminal regulatory domain and exhibits high predicted pathogenicity. No NKCC1 variants were detected in ASD or ID, and no KCC3 variants were identified in any of the three neurodevelopmental disorder cohorts. Functional experiments show Y199C is a gain-of-function variant, increasing Cl(-)-dependent and bumetanide-sensitive NKCC1 activity even in conditions in which the transporter is normally functionally silent (hypotonicity). These data are the first to describe a functional missense variant in SLC12A2 in human SCZ, and suggest that genetically encoded dysregulation of NKCC1 may be a risk factor for, or contribute to the pathogenesis of, human SCZ.
Subjects
Mutation, Missense; Schizophrenia/genetics; Solute Carrier Family 12, Member 2/genetics; Animals; Autism Spectrum Disorder/genetics; Bumetanide/pharmacology; Cohort Studies; Intellectual Disability/genetics; Membrane Potentials/drug effects; Membrane Potentials/physiology; Oocytes; Quebec; Sodium Potassium Chloride Symporter Inhibitors/pharmacology; Solute Carrier Family 12, Member 2/metabolism; Xenopus
ABSTRACT
BACKGROUND AND OBJECTIVES: The urinary excretion of uromodulin is influenced by common variants in the UMOD gene, and it may be related to NaCl retention and hypertension. Levels of uromodulin are also dependent on renal function, but other determinants remain unknown. DESIGN, SETTING, PARTICIPANTS, & MEASUREMENTS: We tested associations between the urinary excretion of uromodulin; medical history and medication; serum and urinary levels of electrolytes, glucose, and uric acid; and the genotype at the UMOD/Protein Disulfide Isomerase-Like, Testis Expressed locus (rs4293393 and rs12446492). We analyzed 943 participants from the CARTaGENE Cohort, a random sample from the Canadian population of 20,004 individuals. Participants with available genotyping were obtained from a substudy addressing associations between common variants and cardiovascular disease in paired participants with high and low Framingham risk scores and vascular rigidity indexes. RESULTS: The population studied was 54±9 years old, with 51% women and eGFR of 9±14 ml/min per 1.73 m². Uromodulin excretion was 25 (11-42) mg/g creatinine. Using linear regression, it was independently higher among patients with higher eGFR, the TT genotype of rs4293393, and the TT genotype of rs12446492. The fractional excretions of urate and sodium showed a strong positive correlation with uromodulin, likely linked to the extracellular volume status. The presence of glycosuria and the use of uricosuric drugs, which both increased the fractional excretion of urate, were independently associated with a lower uromodulin excretion, suggesting novel interactions between uric acid and uromodulin excretion. CONCLUSIONS: In this large cohort, the excretion of uromodulin correlates with clinical, genetic, and urinary factors. The strongest associations were between uric acid, sodium, and uromodulin excretions and are likely linked to the extracellular volume status.
Subjects
Uromodulin/urine; Aged; Creatinine/urine; Cross-Sectional Studies; Female; Glomerular Filtration Rate; Humans; Male; Middle Aged; Polymorphism, Single Nucleotide; Uromodulin/genetics
ABSTRACT
Mutations in the mitochondrial genome are associated with multiple diseases and biological processes; however, little is known about the extent of sequence variation in the mitochondrial transcriptome. By ultra-deeply sequencing mitochondrial RNA (>6000×) from the whole blood of ~1000 individuals from the CARTaGENE project, we identified remarkable levels of sequence variation within and across individuals, as well as sites that show consistent patterns of posttranscriptional modification. Using a genome-wide association study, we find that posttranscriptional modification of functionally important sites in mitochondrial transfer RNAs (tRNAs) is under strong genetic control, largely driven by a missense mutation in MRPP3 that explains ~22% of the variance. These results reveal a major nuclear genetic determinant of posttranscriptional modification in mitochondria and suggest that tRNA posttranscriptional modification may affect cellular energy production.
Subjects
Genetic Variation; Genome, Mitochondrial; RNA, Transfer/genetics; RNA/genetics; Ribonuclease P/genetics; Adult; Aged; Base Sequence; DNA, Mitochondrial/chemistry; DNA, Mitochondrial/genetics; Female; Genome-Wide Association Study; High-Throughput Nucleotide Sequencing; Humans; Male; Methylation; Middle Aged; Mutation, Missense; Polymorphism, Single Nucleotide; RNA/chemistry; RNA/metabolism; RNA Processing, Post-Transcriptional; RNA, Mitochondrial; RNA, Transfer/chemistry; RNA, Transfer/metabolism; Ribonuclease P/metabolism; Sequence Analysis, DNA; Sequence Analysis, RNA; Transcriptome
ABSTRACT
Sickle cell disease (SCD) is a congenital blood disease, affecting predominantly children from sub-Saharan Africa, but also populations world-wide. Although the causal mutation of SCD is known, the sources of clinical variability of SCD remain poorly understood, with only a few highly heritable traits associated with SCD having been identified. Phenotypic heterogeneity in the clinical expression of SCD is problematic for follow-up (FU), management, and treatment of patients. Here we used the joint analysis of gene expression and whole genome genotyping data to identify the genetic regulatory effects contributing to gene expression variation among groups of patients exhibiting clinical variability, as well as unaffected siblings, in Benin, West Africa. We characterized and replicated patterns of whole blood gene expression variation within and between SCD patients at entry to clinic, as well as in follow-up programs. We present a global map of genes involved in the disease through analysis of whole blood sampled from the cohort. Genome-wide association mapping of gene expression revealed 390 peak genome-wide significant expression SNPs (eSNPs) and 6 significant eSNP-by-clinical status interaction effects. The strong modulation of the transcriptome implicates pathways affecting core circulating cell functions and shows how genotypic regulatory variation likely contributes to the clinical variation observed in SCD.
ABSTRACT
Whole-exome or gene targeted resequencing in hundreds to thousands of individuals has shown that the majority of genetic variants are at low frequency in human populations. Rare variants are enriched for functional mutations and are expected to explain an important fraction of the genetic etiology of human disease, therefore having a potential medical interest. In this work, we analyze the whole-exome sequences of French-Canadian individuals, a founder population with a unique demographic history that includes an original population bottleneck less than 20 generations ago, followed by a demographic explosion, and the whole exomes of French individuals sampled from France. We show that in less than 20 generations of genetic isolation from the French population, the genetic pool of French-Canadians shows reduced levels of diversity, higher homozygosity, and an excess of rare variants with low variant sharing with Europeans. Furthermore, the French-Canadian population contains a larger proportion of putatively damaging functional variants, which could partially explain the increased incidence of genetic disease in the province. Our results highlight the impact of population demography on genetic fitness and the contribution of rare variants to the human genetic variation landscape, emphasizing the need for deep cataloguing of genetic variants by resequencing worldwide human populations in order to truly assess disease risk.
Subjects
Disease Susceptibility; Exome/genetics; Mutation; Sequence Analysis, DNA/methods; Canada; Demography; France; Gene Frequency; Genetics, Population; Humans; Polymorphism, Single Nucleotide; White People/genetics
ABSTRACT
One of the most rapidly evolving genes in humans, PRDM9, is a key determinant of the distribution of meiotic recombination events. Mutations in this meiotic-specific gene have previously been associated with male infertility in humans and recent studies suggest that PRDM9 may be involved in pathological genomic rearrangements. In studying genomes from families with children affected by B-cell precursor acute lymphoblastic leukemia (B-ALL), we characterized meiotic recombination patterns within a family with two siblings having hyperdiploid childhood B-ALL and observed unusual localization of maternal recombination events. The mother of the family carries a rare PRDM9 allele, potentially explaining the unusual patterns found. From exomes sequenced in 44 additional parents of children affected with B-ALL, we discovered a substantial and significant excess of rare allelic forms of PRDM9. The rare PRDM9 alleles are transmitted to the affected children in half the cases; nonetheless there remains a significant excess of rare alleles among patients relative to controls. We successfully replicated this latter observation in an independent cohort of 50 children with B-ALL, where we found an excess of rare PRDM9 alleles in aneuploid and infant B-ALL patients. PRDM9 variability in humans is thought to influence genomic instability, and these data support a potential role for PRDM9 variation in risk of acquiring aneuploidies or genomic rearrangements associated with childhood leukemogenesis.
Subjects
Alleles; Histone-Lysine N-Methyltransferase/genetics; Leukemia, Biphenotypic, Acute/genetics; Leukemia, Biphenotypic, Acute/pathology; Adolescent; Child; Child, Preschool; Cohort Studies; Crossing Over, Genetic; Exome; Female; Gene Frequency; Gene Rearrangement; Genomic Instability; Histone-Lysine N-Methyltransferase/metabolism; Humans; Infant; Male; Meiosis; Microarray Analysis; Mutation; Pedigree; Polymorphism, Single Nucleotide; Recombination, Genetic; Sequence Analysis, DNA; Translocation, Genetic
ABSTRACT
The host mechanisms responsible for protection against malaria remain poorly understood, with only a few protective genetic effects mapped in humans. Here, we characterize a host-specific genome-wide signature in whole-blood transcriptomes of Plasmodium falciparum-infected West African children and report a demonstration of genotype-by-infection interactions in vivo. Several associations involve transcripts sensitive to infection and implicate the complement system, antigen processing and presentation, and T-cell activation (i.e., SLC39A8, C3AR1, FCGR3B, RAD21, RETN, LRRC25, SLC3A2, and TAPBP), including one association that validated a genome-wide association candidate gene (SCO1), implicating binding variation within a noncoding regulatory element. Gene expression profiles in mice infected with Plasmodium chabaudi revealed and validated similar responses and highlighted specific pathways and genes that are likely important responders in both hosts. These results suggest that host variation and its interplay with infection affect children's ability to cope with infection and suggest a polygenic model mounted at the transcriptional level for susceptibility.