ABSTRACT
The manganese (Mn) export protein SLC30A10 is essential for Mn excretion via the liver and intestines. Patients with SLC30A10 deficiency develop Mn excess, dystonia, liver disease, and polycythemia. Recent genome-wide association studies revealed a link between the SLC30A10 variant T95I and markers of liver disease. The in vivo relevance of this variant has yet to be investigated. Using in vitro and in vivo models, we explore the impact of the T95I variant on SLC30A10 function. While SLC30A10 I95 expressed at lower levels than T95 in transfected cell lines, both T95 and I95 variants protected cells similarly from Mn-induced toxicity. Adeno-associated virus 8-mediated expression of T95 or I95 SLC30A10 using the liver-specific thyroxine binding globulin promoter normalized liver Mn levels in mice with hepatocyte Slc30a10 deficiency. Furthermore, adeno-associated virus-mediated expression of T95 or I95 SLC30A10 normalized red blood cell parameters and body weights and attenuated Mn levels and differential gene expression in livers and brains of mice with whole body Slc30a10 deficiency. While our in vivo data do not indicate that the T95I variant significantly compromises SLC30A10 function, they do reinforce the notion that the liver is a key site of SLC30A10 function. They also support the idea that restoration of hepatic SLC30A10 expression is sufficient to attenuate phenotypes in SLC30A10 deficiency.
Subject(s)
Amino Acid Substitution , Cation Transport Proteins , Dependovirus , Liver , Manganese , Mutation , Animals , Mice , Body Weight , Brain/metabolism , Cation Transport Proteins/deficiency , Cation Transport Proteins/genetics , Cation Transport Proteins/metabolism , Cell Line , Dependovirus/genetics , Erythrocytes , Genome-Wide Association Study , Hepatocytes/metabolism , Liver/cytology , Liver/metabolism , Liver Diseases/genetics , Liver Diseases/metabolism , Manganese/metabolism , Manganese Poisoning/metabolism , Phenotype , Promoter Regions, Genetic , Thyroxine-Binding Globulin/genetics
ABSTRACT
A single gene can produce multiple transcripts with distinct molecular functions. Rare-variant association tests often aggregate all coding variants across individual genes, without accounting for the variants' presence or consequence in resulting transcript isoforms. To evaluate the utility of transcript-aware variant sets, rare predicted loss-of-function (pLOF) variants were aggregated for 17,035 protein-coding genes using 55,558 distinct transcript-specific variant sets. These sets were tested for their association with 728 circulating proteins and 188 quantitative phenotypes across 406,921 individuals in the UK Biobank. The transcript-specific approach resulted in larger estimated effects of pLOF variants decreasing serum cis-protein levels compared to the gene-based approach (p_binom ≤ 2 × 10⁻¹⁶). Additionally, 251 quantitative trait associations were identified as being significant using the transcript-specific approach but not the gene-based approach, including PCSK5 transcript ENST00000376752 and standing height (transcript-specific statistic, P = 1.3 × 10⁻¹⁶, effect = 0.7 SD decrease; gene-based statistic, P = 0.02, effect = 0.05 SD decrease) and LDLR transcript ENST00000252444 and apolipoprotein B (transcript-specific statistic, P = 5.7 × 10⁻²⁰, effect = 1.0 SD increase; gene-based statistic, P = 3.0 × 10⁻⁴, effect = 0.2 SD increase). This approach demonstrates the importance of considering the effect of pLOFs on specific transcript isoforms when performing rare-variant association studies.
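As a rough illustration of the contrast drawn in this abstract between gene-based and transcript-specific aggregation, the Python sketch below groups a few made-up pLOF variants both ways and scores one individual as a carrier or non-carrier of each set. The variant records, the labels GENE1, TX_A and TX_B, and all function names are hypothetical and are not taken from the study; the actual annotation and association-testing pipeline is not described in the abstract.

```python
from collections import defaultdict

# Hypothetical annotated variants (illustrative only): each pLOF variant lists
# its gene and the transcripts in which it is predicted to be loss-of-function.
VARIANTS = [
    {"id": "var1", "gene": "GENE1", "plof_transcripts": ["TX_A", "TX_B"]},
    {"id": "var2", "gene": "GENE1", "plof_transcripts": ["TX_A"]},
    {"id": "var3", "gene": "GENE1", "plof_transcripts": ["TX_B"]},
]


def gene_based_sets(variants):
    """Aggregate every pLOF variant of a gene into a single set."""
    sets = defaultdict(set)
    for v in variants:
        sets[v["gene"]].add(v["id"])
    return dict(sets)


def transcript_specific_sets(variants):
    """Aggregate pLOF variants separately for each (gene, transcript) pair."""
    sets = defaultdict(set)
    for v in variants:
        for tx in v["plof_transcripts"]:
            sets[(v["gene"], tx)].add(v["id"])
    return dict(sets)


def carrier_burden(genotypes, variant_set):
    """Return 1 if the individual carries any variant in the set, else 0."""
    return int(any(genotypes.get(vid, 0) > 0 for vid in variant_set))


gene_sets = gene_based_sets(VARIANTS)
tx_sets = transcript_specific_sets(VARIANTS)

# An individual carrying only var3 counts as a carrier for the gene and for
# TX_B, but not for TX_A, where var3 is not predicted to be loss-of-function.
person = {"var3": 1}
print(carrier_burden(person, gene_sets["GENE1"]))
print(carrier_burden(person, tx_sets[("GENE1", "TX_A")]))
print(carrier_burden(person, tx_sets[("GENE1", "TX_B")]))
```

Under this toy setup, the gene-level set mixes carriers whose variants affect different isoforms, which is one way a transcript-specific signal could be diluted in a gene-based test.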
Subject(s)
Biological Specimen Banks , UK Biobank , Humans , Computational Biology , Phenotype , Protein Isoforms/genetics
ABSTRACT
The Pharma Proteomics Project is a precompetitive biopharmaceutical consortium characterizing the plasma proteomic profiles of 54,219 UK Biobank participants. Here we provide a detailed summary of this initiative, including technical and biological validations, insights into proteomic disease signatures, and prediction modelling for various demographic and health indicators. We present comprehensive protein quantitative trait locus (pQTL) mapping of 2,923 proteins that identifies 14,287 primary genetic associations, of which 81% are previously undescribed, alongside ancestry-specific pQTL mapping in non-European individuals. The study provides an updated characterization of the genetic architecture of the plasma proteome, contextualized with projected pQTL discovery rates as sample sizes and proteomic assay coverages increase over time. We offer extensive insights into trans pQTLs across multiple biological domains, highlight genetic influences on ligand-receptor interactions and pathway perturbations across a diverse collection of cytokines and complement networks, and illustrate long-range epistatic effects of ABO blood group and FUT2 secretor status on proteins with gastrointestinal tissue-enriched expression. We demonstrate the utility of these data for drug discovery by extending the genetic proxied effects of protein targets, such as PCSK9, on additional endpoints, and disentangle specific genes and proteins perturbed at loci associated with COVID-19 susceptibility. This public-private partnership provides the scientific community with an open-access proteomics resource of considerable breadth and depth to help elucidate the biological mechanisms underlying proteo-genomic discoveries and accelerate the development of biomarkers, predictive models and therapeutics.
Subject(s)
Biological Specimen Banks , Blood Proteins , Databases, Factual , Genomics , Health , Proteome , Proteomics , Humans , ABO Blood-Group System/genetics , Blood Proteins/analysis , Blood Proteins/genetics , COVID-19/genetics , Drug Discovery , Epistasis, Genetic , Fucosyltransferases/metabolism , Genetic Predisposition to Disease , Plasma/chemistry , Proprotein Convertase 9/metabolism , Proteome/analysis , Proteome/genetics , Public-Private Sector Partnerships , Quantitative Trait Loci , United Kingdom , Galactoside 2-alpha-L-fucosyltransferase
ABSTRACT
BACKGROUND: Whole-exome sequencing (WES) is an effective tool for diagnosis in patients who remain undiagnosed despite a comprehensive clinical work-up. While WES is being used increasingly in pediatrics and oncology, it remains underutilized in non-oncological adult medicine, including in patients with liver disease, in part based on the faulty premise that adults are unlikely to harbor rare genetic variants with large effect size. Here, we aim to assess the burden of rare genetic variants underlying liver disease in adults at two major tertiary referral academic medical centers. METHODS: WES analysis paired with comprehensive clinical evaluation was performed in fifty-two adult patients with liver disease of unknown etiology evaluated at two US tertiary academic health care centers. FINDINGS: Exome analysis uncovered a definitive or presumed diagnosis in 33% of patients (17/52), providing insight into their disease pathogenesis, with most of these patients (12/17) not having a known family history of liver disease. Our data show that over two-thirds of undiagnosed liver disease patients attaining a genetic diagnosis were being evaluated for cholestasis or hepatic steatosis of unknown etiology. INTERPRETATION: This study reveals an underappreciated incidence and spectrum of genetic diseases presenting in adulthood and underscores the clinical value of incorporating exome sequencing in the evaluation and management of adults with liver disease of unknown etiology. FUNDING: S.V. is supported by the NIH/NIDDK (K08 DK113109 and R01 DK131033-01A1) and the Doris Duke Charitable Foundation Grant #2019081. This work was supported in part by NIH-funded Yale Liver Center, P30 DK34989.
Subject(s)
Fatty Liver , Liver Diseases , Humans , Adult , Child , Exome Sequencing , Liver Diseases/diagnosis , Liver Diseases/genetics , Liver Diseases/therapy , Fatty Liver/genetics , Exome/genetics
ABSTRACT
Human genetics research has discovered thousands of proteins associated with complex and rare diseases. Genome-wide association studies (GWAS) and studies of Mendelian disease have resulted in an increased understanding of the role of gene function and regulation in human conditions. Although the application of human genetics has been explored primarily as a method to identify potential drug targets and support their relevance to disease in humans, there is increasing interest in using genetic data to identify potential safety liabilities of modulating a given target. Human genetic variants can be used as a model to anticipate the effect of lifelong modulation of therapeutic targets and identify the potential risk for on-target adverse events. This approach is particularly useful for non-clinical safety evaluation of novel therapeutics that lack pharmacologically relevant animal models and can contribute to the intrinsic safety profile of a drug target. This Review illustrates applications of human genetics to safety studies during drug discovery and development, including assessing the potential for on- and off-target associated adverse events, carcinogenicity risk assessment, and guiding translational safety study designs and monitoring strategies. A summary of available human genetic resources and recommended best practices is provided. The challenges and future perspectives of translating human genetic information to identify risks for potential drug effects in preclinical and clinical development are discussed.
Subject(s)
Genome-Wide Association Study , Human Genetics , Animals , Humans
ABSTRACT
Identifying genetic variants associated with lower waist-to-hip ratio can reveal new therapeutic targets for abdominal obesity. We use exome sequences from 362,679 individuals to identify genes associated with waist-to-hip ratio adjusted for BMI (WHRadjBMI), a surrogate for abdominal fat that is causally linked to type 2 diabetes and coronary heart disease. Predicted loss-of-function (pLOF) variants in INHBE associate with lower WHRadjBMI, and this association replicates in data from AMP-T2D-GENES. INHBE encodes a secreted protein, the hepatokine activin E. In vitro characterization of the most common INHBE pLOF variant in our study indicates an in-frame deletion resulting in a 90% reduction in secreted protein levels. We detect associations with lower WHRadjBMI for variants in ACVR1C, encoding an activin receptor, further highlighting the involvement of activins in regulating fat distribution. These findings highlight activin E as a potential therapeutic target for abdominal obesity, a phenotype linked to cardiometabolic disease.
Subject(s)
Diabetes Mellitus, Type 2 , Inhibin-beta Subunits/genetics , Activin Receptors, Type I/genetics , Body Mass Index , Diabetes Mellitus, Type 2/genetics , Humans , Obesity/genetics , Obesity, Abdominal/genetics , Waist-Hip Ratio
ABSTRACT
The age of menopause is associated with fertility and disease risk, and its genetic control is of great interest. We use whole-exome sequences from 132,370 women in the UK Biobank to test for associations between rare damaging variants and age at natural menopause. Rare damaging variants in five genes are significantly associated with menopause: CHEK2 (p = 3.3 × 10⁻⁵¹), DCLRE1A (p = 8.4 × 10⁻¹³), and HELB (p = 5.7 × 10⁻⁷) with later menopause and TOP3A (p = 7.6 × 10⁻⁸) and CLPB (p = 8.1 × 10⁻⁷) with earlier menopause. Two additional genes are suggestive: RAD54L (p = 2.4 × 10⁻⁶) with later menopause and HROB (p = 2.9 × 10⁻⁶) with earlier menopause. In a follow-up analysis of repeated questionnaires in women who were initially premenopausal, CHEK2, TOP3A, and RAD54L genotypes are associated with subsequent menopause. Consistent with previous genome-wide association studies (GWASs), six of the seven genes are involved in the DNA damage repair pathway. Phenome-wide scans across 398,569 men and women revealed that in addition to known associations with cancers and blood cell counts, rare variants in CHEK2 are also associated with increased risk for uterine fibroids, polycystic ovary syndrome, and prostate hypertrophy; these associations are not shared with higher-penetrance breast cancer genes. Causal mediation analysis suggests that approximately 8% of the breast cancer risk conferred by CHEK2 pathogenic variants after menopause is mediated through delayed menopause.
ABSTRACT
Sequencing of large cohorts offers an unprecedented opportunity to identify rare genetic variants and to find novel contributors to human disease. We used gene-based collapsing tests to identify genes associated with glucose, HbA1c and type 2 diabetes (T2D) diagnosis in 379,066 exome-sequenced participants in the UK Biobank. We identified associations for variants in GCK, HNF1A and PDX1, which are known to be involved in Mendelian forms of diabetes. Notably, we uncovered novel associations for GIGYF1, a gene not previously implicated by human genetics in diabetes. GIGYF1 predicted loss of function (pLOF) variants associated with increased levels of glucose (0.77 mmol/L increase, p = 4.42 × 10⁻¹²) and HbA1c (4.33 mmol/mol, p = 1.28 × 10⁻¹⁴) as well as T2D diagnosis (OR = 4.15, p = 6.14 × 10⁻¹¹). Multiple rare variants contributed to these associations, including singleton variants. GIGYF1 pLOF also associated with decreased cholesterol levels as well as an increased risk of hypothyroidism. The association of GIGYF1 pLOF with T2D diagnosis replicated in an independent cohort from the Geisinger Health System. In addition, a common variant association for glucose and T2D was identified at the GIGYF1 locus. Our results highlight the role of GIGYF1 in regulating insulin signaling and protecting from diabetes.
Subject(s)
Carrier Proteins/genetics , Diabetes Mellitus, Type 2/genetics , Genetic Variation , Cholesterol/metabolism , Exome , Female , Genetic Predisposition to Disease , Genetic Testing , Genome, Human , Genome-Wide Association Study , Glucose/metabolism , Hepatocyte Nuclear Factor 1-alpha/genetics , Homeodomain Proteins/genetics , Humans , Hypothyroidism/genetics , Male , Mutation, Missense , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Trans-Activators/genetics , United Kingdom , Exome Sequencing
ABSTRACT
Novel modalities such as PROTAC and RNAi have the ability to inadvertently alter the abundance of endogenous proteins. Currently available in vitro secondary pharmacology assays, which evaluate off-target binding or activity of small molecules, do not fully assess the off-target effects of PROTAC and are not applicable to RNAi. To address this gap, we developed a proteomics-based platform to comprehensively evaluate the abundance of off-target proteins. First, we selected off-target proteins using genetics and pharmacology evidence. This process yielded 2813 proteins, which we refer to as the "selected off-target proteome" (SOTP). An iterative algorithm was then used to identify four human cell lines out of 932. The four cell lines collectively expressed ~80% of the SOTP based on transcriptome data. Second, we used mass spectrometry to quantify the intracellular and extracellular proteins from the selected cell lines. Among over 10,000 quantifiable proteins identified, 1828 were part of the predefined SOTP. The SOTP was designed to be easily modified or expanded, owing to the rational selection process developed and the label-free LC-MS/MS approach chosen. This versatility inherent to our platform is essential to design fit-for-purpose studies that can address the dynamic questions faced in investigative toxicology.
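The abstract does not say which iterative algorithm was used to choose the four cell lines. One plausible reading is a greedy coverage search, sketched below in Python purely as an assumption: at each step it adds the cell line that expresses the most target proteins not yet covered. The cell-line names, protein labels, and the function `greedy_cell_line_selection` are invented for illustration.

```python
def greedy_cell_line_selection(expressed, targets, k):
    """Greedily pick up to k cell lines that together cover the most targets.

    expressed: dict mapping cell-line name -> set of expressed target proteins
    targets:   non-empty set of proteins to cover (e.g. an off-target panel)
    Returns the chosen cell lines and the fraction of targets covered.
    """
    uncovered = set(targets)
    chosen = []
    for _ in range(k):
        # Ties are broken alphabetically because max() keeps the first maximum.
        best = max(sorted(expressed), key=lambda name: len(expressed[name] & uncovered))
        gain = expressed[best] & uncovered
        if not gain:
            break  # no remaining cell line adds coverage
        chosen.append(best)
        uncovered -= gain
    coverage = 1 - len(uncovered) / len(targets)
    return chosen, coverage


# Toy data (hypothetical): four cell lines and a six-protein target panel.
expressed = {
    "line_A": {"P1", "P2", "P3"},
    "line_B": {"P3", "P4"},
    "line_C": {"P5"},
    "line_D": {"P1"},
}
targets = {"P1", "P2", "P3", "P4", "P5", "P6"}

chosen, coverage = greedy_cell_line_selection(expressed, targets, k=2)
print(chosen, round(coverage, 2))
```

A greedy search of this kind is not guaranteed to find the best possible combination, but it is a common and inexpensive heuristic for coverage problems of this shape.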
Subject(s)
Biomarkers, Tumor/metabolism , Gene Expression Regulation, Neoplastic , Neoplasms/metabolism , Neoplasms/pathology , Proteome/analysis , Proteome/metabolism , Cell Proliferation , Humans , Tumor Cells, Cultured
ABSTRACT
Understanding mechanisms of hepatocellular damage may lead to new treatments for liver disease, and genome-wide association studies (GWAS) of alanine aminotransferase (ALT) and aspartate aminotransferase (AST) serum activities have proven useful for investigating liver biology. Here we report 100 loci associating with both enzymes, using GWAS across 411,048 subjects in the UK Biobank. The rare missense variant SLC30A10 Thr95Ile (rs188273166) associates with the largest elevation of both enzymes, and this association replicates in the DiscovEHR study. SLC30A10 excretes manganese from the liver to the bile duct, and rare homozygous loss of function causes the syndrome hypermanganesemia with dystonia-1 (HMNDYT1) which involves cirrhosis. Consistent with hematological symptoms of hypermanganesemia, SLC30A10 Thr95Ile carriers have increased hematocrit and risk of iron deficiency anemia. Carriers also have increased risk of extrahepatic bile duct cancer. These results suggest that genetic variation in SLC30A10 adversely affects more individuals than patients with diagnosed HMNDYT1.
Subject(s)
Alanine Transaminase/blood , Aspartate Aminotransferases/blood , Cation Transport Proteins/genetics , Genome-Wide Association Study , Manganese/blood , Mutation/genetics , Cation Transport Proteins/metabolism , Gene Expression Regulation , Genetic Linkage , Genetic Loci , Genome, Human , HeLa Cells , Hematocrit , Heterozygote , Homeostasis , Humans , Liver/pathology , Manganese/metabolism , Molecular Sequence Annotation , Phenotype , Reproducibility of Results
ABSTRACT
Hereditary transthyretin-mediated (hATTR) amyloidosis is an underdiagnosed, progressively debilitating disease caused by mutations in the transthyretin (TTR) gene. V122I, a common pathogenic TTR mutation, is found in 3-4% of individuals of African ancestry in the United States and has been associated with cardiomyopathy and heart failure. To better understand the phenotypic consequences of carrying V122I, we conducted a phenome-wide association study scanning 427 ICD diagnosis codes in UK Biobank participants of African ancestry (n = 6062). Significant associations were tested for replication in the Penn Medicine Biobank (n = 5737) and the Million Veteran Program (n = 82,382). V122I was significantly associated with polyneuropathy in the UK Biobank (odds ratio [OR] = 6.4, 95% confidence interval [CI] 2.6-15.6, p = 4.2 × 10⁻⁵), which was replicated in the Penn Medicine Biobank (OR = 1.6, 95% CI 1.2-2.4, p = 6.0 × 10⁻³) and Million Veteran Program (OR = 1.5, 95% CI 1.2-1.8, p = 1.8 × 10⁻⁴). Polyneuropathy prevalence among V122I carriers was 2.1%, 9.0%, and 4.8% in the UK Biobank, Penn Medicine Biobank, and Million Veteran Program, respectively. The cumulative incidence of common hATTR amyloidosis manifestations (carpal tunnel syndrome, polyneuropathy, cardiomyopathy, heart failure) was significantly enriched in V122I carriers compared with non-carriers (HR = 2.8, 95% CI 1.7-4.5, p = 2.6 × 10⁻⁵) in the UK Biobank, with 37.4% of V122I carriers having at least one of these manifestations by age 75. Our findings show that V122I carriers are at increased risk of polyneuropathy. These results also emphasize the underdiagnosis of disease in V122I carriers with a significant proportion of subjects showing phenotypic changes consistent with hATTR amyloidosis. Greater understanding of the manifestations associated with V122I is critical for earlier diagnosis and treatment.
Subject(s)
Amyloid Neuropathies, Familial/diagnosis , Cardiomyopathies/diagnosis , Heart Failure/diagnosis , Polyneuropathies/diagnosis , Prealbumin/genetics , Adult , Aged , Amino Acid Substitution , Amyloid Neuropathies, Familial/complications , Amyloid Neuropathies, Familial/ethnology , Amyloid Neuropathies, Familial/genetics , Biological Specimen Banks , Black People , Cardiomyopathies/complications , Cardiomyopathies/ethnology , Cardiomyopathies/genetics , Female , Gene Expression , Heart Failure/complications , Heart Failure/ethnology , Heart Failure/genetics , Heterozygote , Humans , Male , Middle Aged , Mutation , Phenotype , Polyneuropathies/complications , Polyneuropathies/ethnology , Polyneuropathies/genetics , Prevalence , United Kingdom/epidemiology
ABSTRACT
Romosozumab (EVENITY™ [romosozumab-aqqg in the US]) is a humanized monoclonal antibody that inhibits sclerostin and has been approved in several countries for the treatment of osteoporosis in postmenopausal women at high risk of fracture. Sclerostin is expressed in bone and aortic vascular smooth muscle (AVSM). Its function in AVSM is unclear but it has been proposed to inhibit vascular calcification, atheroprogression, and inflammation. An increased incidence of positively adjudicated serious cardiovascular adverse events driven by an increase in myocardial infarction and stroke was observed in romosozumab-treated subjects in a clinical trial comparing alendronate with romosozumab (ARCH; NCT01631214) but not in a placebo-controlled trial (FRAME; NCT01575834). To investigate the effects of sclerostin inhibition with sclerostin antibody on the cardiovascular system, a comprehensive nonclinical toxicology package with additional cardiovascular studies was conducted. Although pharmacodynamic effects were observed in the bone, there were no functional, morphological, or transcriptional effects on the cardiovascular system in animal models in the presence or absence of atherosclerosis. These nonclinical studies did not identify evidence that proves the association between sclerostin inhibition and adverse cardiovascular function, increased cardiovascular calcification, and atheroprogression.
Subject(s)
Adaptor Proteins, Signal Transducing/antagonists & inhibitors , Antibodies, Monoclonal/pharmacology , Bone Density Conservation Agents/pharmacology , Cardiovascular System/drug effects , Animals , Antibodies, Monoclonal/therapeutic use , Bone Density Conservation Agents/therapeutic use , Drug Evaluation, Preclinical , Female , Fractures, Bone/prevention & control , Humans , Macaca fascicularis , Male , Mice, Inbred C57BL , Mice, Knockout, ApoE , Osteoporosis/drug therapy , Rats, Sprague-Dawley , Risk
ABSTRACT
In the original version of this article, there were errors in the labelling of the colours in the key of Figure 2, whereby the labeling of the third and fourth of the four colours was reversed. This has been corrected in both the PDF and HTML versions of the article.
ABSTRACT
Only a small fraction of early drug programs progress to the market, due to safety and efficacy failures, despite extensive efforts to predict safety. Characterizing the effect of natural variation in the genes encoding drug targets should present a powerful approach to predict side effects arising from drugging particular proteins. In this retrospective analysis, we report a correlation between the organ systems affected by genetic variation in drug targets and the organ systems in which side effects are observed. Across 1819 drugs and 21 phenotype categories analyzed, drug side effects are more likely to occur in organ systems where there is genetic evidence of a link between the drug target and a phenotype involving that organ system, compared to when there is no such genetic evidence (30.0 vs 19.2%; OR = 1.80). This result suggests that human genetic data should be used to predict safety issues associated with drug targets.
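The reported odds ratio can be sanity-checked from the two percentages given in the abstract. The short Python sketch below assumes the OR is the plain, unadjusted ratio of odds implied by 30.0% and 19.2%; the study itself may have estimated it differently (for example with a regression model), so this is only a consistency check, and the function name `odds_ratio` is illustrative.

```python
def odds_ratio(p_with_evidence, p_without_evidence):
    """Unadjusted odds ratio computed from two proportions (each between 0 and 1)."""
    odds_with = p_with_evidence / (1 - p_with_evidence)
    odds_without = p_without_evidence / (1 - p_without_evidence)
    return odds_with / odds_without


# Proportions quoted in the abstract: side effects observed in 30.0% of cases
# with supporting genetic evidence versus 19.2% without.
estimate = odds_ratio(0.300, 0.192)
print(f"{estimate:.2f}")  # agrees with the reported OR of 1.80 after rounding
```

Here the odds are 0.300/0.700 ≈ 0.429 and 0.192/0.808 ≈ 0.238, and their ratio is about 1.80.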
Subject(s)
Clinical Trials as Topic , Drug-Related Side Effects and Adverse Reactions/genetics , Genetic Association Studies , Genetic Predisposition to Disease , Humans , Phenotype , Regression Analysis , Retrospective Studies
ABSTRACT
Safety-related drug failures remain a major challenge for the pharmaceutical industry. One approach to ensuring drug safety involves assessing small molecule drug specificity by examining the ability of a drug candidate to interact with a panel of "off-target" proteins, referred to as secondary pharmacology screening. Information from human genetics and pharmacology can be used to select proteins associated with adverse effects for such screening. In an analysis of marketed drugs, we found a clear relationship between the genetic and pharmacological phenotypes of a drug's off-target proteins and the observed drug side effects. In addition to using this phenotypic information for the selection of secondary pharmacology screens, we also show that it can be used to help identify drug off-target protein interactions responsible for drug-related adverse events. We anticipate that this phenotype-driven approach to secondary pharmacology screening will help to reduce safety-related drug failures due to drug off-target protein interactions.
Subject(s)
Biomarkers, Pharmacological/analysis , Drug Evaluation, Preclinical/methods , Drug-Related Side Effects and Adverse Reactions/genetics , Pharmacology/methods , Proteins/genetics , Humans , Models, Theoretical , Neural Networks, Computer , Phenotype
ABSTRACT
Searching for novel sequence variants associated with cholesterol levels is of particular interest due to the causative role of non-HDL cholesterol levels in cardiovascular disease. Through whole-genome sequencing of 15,220 Icelanders and imputation of the variants identified, we discovered a rare missense variant in NR1H4 (R436H) associating with lower levels of total cholesterol (effect = -0.47 standard deviations or -0.55 mmol L⁻¹, p = 4.21 × 10⁻¹⁰, N = 150,211). Importantly, NR1H4 R436H also associates with lower levels of non-HDL cholesterol and, consistent with this, protects against coronary artery disease. NR1H4 encodes FXR, which regulates bile acid homeostasis; however, we do not detect a significant association between R436H and biological markers of liver function. Transcriptional profiling of hepatocytes carrying R436H shows that it is not a loss-of-function variant. Rather, we observe changes in gene expression compatible with effects on lipids. These findings highlight the role of FXR in regulation of cholesterol levels in humans.
ABSTRACT
Cancer risk assessment of therapeutics is plagued by poor translatability of rodent models of carcinogenesis. In order to overcome this fundamental limitation, new approaches are needed that enable us to evaluate cancer risk directly in humans and human-based cellular models. Our enhanced understanding of the mechanisms of carcinogenesis and the influence of human genome sequence variation on cancer risk motivates us to re-evaluate how we assess the carcinogenic risk of therapeutics. This review will highlight new opportunities for applying this knowledge to the development of a battery of human-based in vitro models and biomarkers for assessing cancer risk of novel therapeutics.
Subject(s)
Carcinogens/toxicity , Drug-Related Side Effects and Adverse Reactions/prevention & control , Neoplasms/prevention & control , Pharmacovigilance , Biomarkers, Pharmacological/analysis , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Genetic Predisposition to Disease , Humans
ABSTRACT
Understanding of sequence diversity is the cornerstone of analysis of genetic disorders, population genetics, and evolutionary biology. Here, we present an update of our sequencing set to 15,220 Icelanders who we sequenced to an average genome-wide coverage of 34X. We identified 39,020,168 autosomal variants passing GATK filters: 31,079,378 SNPs and 7,940,790 indels. Calling de novo mutations (DNMs) is a formidable challenge given the high false positive rate in sequencing datasets relative to the mutation rate. Here we addressed this issue by using segregation of alleles in three-generation families. Using this transmission assay, we controlled the false positive rate and identified 108,778 high quality DNMs. Furthermore, we used our extended family structure and read pair tracing of DNMs to a panel of phased SNPs, to determine the parent of origin of 42,961 DNMs.
Subject(s)
Genome, Human , Humans , INDEL Mutation , Iceland , Polymorphism, Single Nucleotide
ABSTRACT
The characterization of mutational processes that generate sequence diversity in the human genome is of paramount importance both to medical genetics and to evolutionary studies. To understand how the age and sex of transmitting parents affect de novo mutations, here we sequence 1,548 Icelanders, their parents, and, for a subset of 225, at least one child, to 35× genome-wide coverage. We find 108,778 de novo mutations, both single nucleotide polymorphisms and indels, and determine the parent of origin of 42,961. The number of de novo mutations from mothers increases by 0.37 per year of age (95% CI 0.32-0.43), a quarter of the 1.51 per year from fathers (95% CI 1.45-1.57). The number of clustered mutations increases faster with the mother's age than with the father's, and the genomic span of maternal de novo mutation clusters is greater than that of paternal ones. The types of de novo mutation from mothers change substantially with age, with a 0.26% (95% CI 0.19-0.33%) decrease in cytosine-phosphate-guanine to thymine-phosphate-guanine (CpG>TpG) de novo mutations and a 0.33% (95% CI 0.28-0.38%) increase in C>G de novo mutations per year, respectively. Remarkably, these age-related changes are not distributed uniformly across the genome. A striking example is a 20 megabase region on chromosome 8p, with a maternal C>G mutation rate that is up to 50-fold greater than the rest of the genome. The age-related accumulation of maternal non-crossover gene conversions also mostly occurs within these regions. Increased sequence diversity and linkage disequilibrium of C>G variants within regions affected by excess maternal mutations indicate that the underlying mutational process has persisted in humans for thousands of years. Moreover, the regional excess of C>G variation in humans is largely shared by chimpanzees, less by gorillas, and is almost absent from orangutans. 
This demonstrates that sequence diversity in humans results from evolving interactions between age, sex, mutation type, and genomic location.