Search | VHL Search Portal

1.

Bedside Back to Bench: Building Bridges between Basic and Clinical Genomic Research.

Manolio, Teri A; Fowler, Douglas M; Starita, Lea M; Haendel, Melissa A; MacArthur, Daniel G; Biesecker, Leslie G; Worthey, Elizabeth; Chisholm, Rex L; Green, Eric D; Jacob, Howard J; McLeod, Howard L; Roden, Dan; Rodriguez, Laura Lyman; Williams, Marc S; Cooper, Gregory M; Cox, Nancy J; Herman, Gail E; Kingsmore, Stephen; Lo, Cecilia; Lutz, Cathleen; MacRae, Calum A; Nussbaum, Robert L; Ordovas, Jose M; Ramos, Erin M; Robinson, Peter N; Rubinstein, Wendy S; Seidman, Christine; Stranger, Barbara E; Wang, Haoyi; Westerfield, Monte; Bult, Carol.

Cell ; 169(1): 6-12, 2017 03 23.

Article in English | MEDLINE | ID: mdl-28340351

ABSTRACT

Genome sequencing has revolutionized the diagnosis of genetic diseases. Close collaborations between basic scientists and clinical genomicists are now needed to link genetic variants with disease causation. To facilitate such collaborations, we recommend prioritizing clinically relevant genes for functional studies, developing reference variant-phenotype databases, adopting phenotype description standards, and promoting data sharing.

Subject(s)

Biomedical Research , Genomics , Animals , DNA Mutational Analysis , Databases, Genetic , Disease/genetics , Human Genome Project , Humans , Information Dissemination , Models, Animal

2.

Sequencing and characterizing short tandem repeats in the human genome.

Tanudisastro, Hope A; Deveson, Ira W; Dashnow, Harriet; MacArthur, Daniel G.

Nat Rev Genet ; 25(7): 460-475, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38366034

ABSTRACT

Short tandem repeats (STRs) are highly polymorphic sequences throughout the human genome that are composed of repeated copies of a 1-6-bp motif. Over 1 million variable STR loci are known, some of which regulate gene expression and influence complex traits, such as height. Moreover, variants in at least 60 STR loci cause genetic disorders, including Huntington disease and fragile X syndrome. Accurately identifying and genotyping STR variants is challenging, in particular mapping short reads to repetitive regions and inferring expanded repeat lengths. Recent advances in sequencing technology and computational tools for STR genotyping from sequencing data promise to help overcome this challenge and solve genetically unresolved cases and the 'missing heritability' of polygenic traits. Here, we compare STR genotyping methods, analytical tools and their applications to understand the effect of STR variation on health and disease. We identify emergent opportunities to refine genotyping and quality-control approaches as well as to integrate STRs into variant-calling workflows and large cohort analyses.

Subject(s)

Genome, Human , Microsatellite Repeats , Humans , Microsatellite Repeats/genetics , Sequence Analysis, DNA/methods , Genotyping Techniques/methods , High-Throughput Nucleotide Sequencing/methods , Genotype

3.

A genomic mutational constraint map using variation in 76,156 human genomes.

Chen, Siwei; Francioli, Laurent C; Goodrich, Julia K; Collins, Ryan L; Kanai, Masahiro; Wang, Qingbo; Alföldi, Jessica; Watts, Nicholas A; Vittal, Christopher; Gauthier, Laura D; Poterba, Timothy; Wilson, Michael W; Tarasova, Yekaterina; Phu, William; Grant, Riley; Yohannes, Mary T; Koenig, Zan; Farjoun, Yossi; Banks, Eric; Donnelly, Stacey; Gabriel, Stacey; Gupta, Namrata; Ferriera, Steven; Tolonen, Charlotte; Novod, Sam; Bergelson, Louis; Roazen, David; Ruano-Rubio, Valentin; Covarrubias, Miguel; Llanwarne, Christopher; Petrillo, Nikelle; Wade, Gordon; Jeandet, Thibault; Munshi, Ruchi; Tibbetts, Kathleen; O'Donnell-Luria, Anne; Solomonson, Matthew; Seed, Cotton; Martin, Alicia R; Talkowski, Michael E; Rehm, Heidi L; Daly, Mark J; Tiao, Grace; Neale, Benjamin M; MacArthur, Daniel G; Karczewski, Konrad J.

Nature ; 625(7993): 92-100, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38057664

ABSTRACT

The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders [1-4], but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.

Subject(s)

Genome, Human , Genomics , Models, Genetic , Mutation , Humans , Access to Information , Databases, Genetic , Datasets as Topic , Gene Frequency , Genome, Human/genetics , Mutation/genetics , Selection, Genetic

4.

Single-cell genomics meets human genetics.

Cuomo, Anna S E; Nathan, Aparna; Raychaudhuri, Soumya; MacArthur, Daniel G; Powell, Joseph E.

Nat Rev Genet ; 24(8): 535-549, 2023 08.

Article in English | MEDLINE | ID: mdl-37085594

ABSTRACT

Single-cell genomic technologies are revealing the cellular composition, identities and states in tissues at unprecedented resolution. They have now scaled to the point that it is possible to query samples at the population level, across thousands of individuals. Combining single-cell information with genotype data at this scale provides opportunities to link genetic variation to the cellular processes underpinning key aspects of human biology and disease. This strategy has potential implications for disease diagnosis, risk prediction and development of therapeutic solutions. But, effectively integrating large-scale single-cell genomic data, genetic variation and additional phenotypic data will require advances in data generation and analysis methods. As single-cell genetics begins to emerge as a field in its own right, we review its current state and the challenges and opportunities ahead.

Subject(s)

Genome , Genomics , Humans , Genomics/methods , Genotype , Human Genetics

5.

Genome-wide generation and systematic phenotyping of knockout mice reveals new roles for many genes.

White, Jacqueline K; Gerdin, Anna-Karin; Karp, Natasha A; Ryder, Ed; Buljan, Marija; Bussell, James N; Salisbury, Jennifer; Clare, Simon; Ingham, Neil J; Podrini, Christine; Houghton, Richard; Estabel, Jeanne; Bottomley, Joanna R; Melvin, David G; Sunter, David; Adams, Niels C; Tannahill, David; Logan, Darren W; Macarthur, Daniel G; Flint, Jonathan; Mahajan, Vinit B; Tsang, Stephen H; Smyth, Ian; Watt, Fiona M; Skarnes, William C; Dougan, Gordon; Adams, David J; Ramirez-Solis, Ramiro; Bradley, Allan; Steel, Karen P.

Cell ; 154(2): 452-64, 2013 Jul 18.

Article in English | MEDLINE | ID: mdl-23870131

ABSTRACT

Mutations in whole organisms are powerful ways of interrogating gene function in a realistic context. We describe a program, the Sanger Institute Mouse Genetics Project, that provides a step toward the aim of knocking out all genes and screening each line for a broad range of traits. We found that hitherto unpublished genes were as likely to reveal phenotypes as known genes, suggesting that novel genes represent a rich resource for investigating the molecular basis of disease. We found many unexpected phenotypes detected only because we screened for them, emphasizing the value of screening all mutants for a wide range of traits. Haploinsufficiency and pleiotropy were both surprisingly common. Forty-two percent of genes were essential for viability, and these were less likely to have a paralog and more likely to contribute to a protein complex than other genes. Phenotypic data and more than 900 mutants are openly available for further analysis.

Subject(s)

Genetic Techniques , Mice, Knockout , Phenotype , Animals , Disease/genetics , Disease Models, Animal , Female , Genes, Essential , Genome-Wide Association Study , Male , Mice

6.

Transcriptome variation in human tissues revealed by long-read sequencing.

Glinos, Dafni A; Garborcauskas, Garrett; Hoffman, Paul; Ehsan, Nava; Jiang, Lihua; Gokden, Alper; Dai, Xiaoguang; Aguet, François; Brown, Kathleen L; Garimella, Kiran; Bowers, Tera; Costello, Maura; Ardlie, Kristin; Jian, Ruiqi; Tucker, Nathan R; Ellinor, Patrick T; Harrington, Eoghan D; Tang, Hua; Snyder, Michael; Juul, Sissel; Mohammadi, Pejman; MacArthur, Daniel G; Lappalainen, Tuuli; Cummings, Beryl B.

Nature ; 608(7922): 353-359, 2022 08.

Article in English | MEDLINE | ID: mdl-35922509

ABSTRACT

Regulation of transcript structure generates transcript diversity and plays an important role in human disease [1-7]. The advent of long-read sequencing technologies offers the opportunity to study the role of genetic variation in transcript structure [8-16]. In this Article, we present a large human long-read RNA-seq dataset using the Oxford Nanopore Technologies platform from 88 samples from Genotype-Tissue Expression (GTEx) tissues and cell lines, complementing the GTEx resource. We identified just over 70,000 novel transcripts for annotated genes, and validated the protein expression of 10% of novel transcripts. We developed a new computational package, LORALS, to analyse the genetic effects of rare and common variants on the transcriptome by allele-specific analysis of long reads. We characterized allele-specific expression and transcript structure events, providing new insights into the specific transcript alterations caused by common and rare genetic variants and highlighting the resolution gained from long-read data. We were able to perturb the transcript structure upon knockdown of PTBP1, an RNA binding protein that mediates splicing, thereby finding genetic regulatory effects that are modified by the cellular environment. Finally, we used this dataset to enhance variant interpretation and study rare variants leading to aberrant splicing patterns.

Subject(s)

Alleles , Gene Expression Profiling , Organ Specificity , RNA-Seq , Transcriptome , Alternative Splicing/genetics , Cell Line , Datasets as Topic , Genotype , Heterogeneous-Nuclear Ribonucleoproteins/deficiency , Heterogeneous-Nuclear Ribonucleoproteins/genetics , Humans , Organ Specificity/genetics , Polypyrimidine Tract-Binding Protein/deficiency , Polypyrimidine Tract-Binding Protein/genetics , Reproducibility of Results , Transcriptome/genetics

7.

Advanced variant classification framework reduces the false positive rate of predicted loss-of-function variants in population sequencing data.

Singer-Berk, Moriel; Gudmundsson, Sanna; Baxter, Samantha; Seaby, Eleanor G; England, Eleina; Wood, Jordan C; Son, Rachel G; Watts, Nicholas A; Karczewski, Konrad J; Harrison, Steven M; MacArthur, Daniel G; Rehm, Heidi L; O'Donnell-Luria, Anne.

Am J Hum Genet ; 110(9): 1496-1508, 2023 09 07.

Article in English | MEDLINE | ID: mdl-37633279

ABSTRACT

Predicted loss of function (pLoF) variants are often highly deleterious and play an important role in disease biology, but many pLoF variants may not result in loss of function (LoF). Here we present a framework that advances interpretation of pLoF variants in research and clinical settings by considering three categories of LoF evasion: (1) predicted rescue by secondary sequence properties, (2) uncertain biological relevance, and (3) potential technical artifacts. We also provide recommendations on adjustments to ACMG/AMP guidelines' PVS1 criterion. Applying this framework to all high-confidence pLoF variants in 22 genes associated with autosomal-recessive disease from the Genome Aggregation Database (gnomAD v.2.1.1) revealed predicted LoF evasion or potential artifacts in 27.3% (304/1,113) of variants. The major reasons were location in the last exon, in a homopolymer repeat, in a low proportion expressed across transcripts (pext) scored region, or the presence of cryptic in-frame splice rescues. Variants predicted to evade LoF or to be potential artifacts were enriched for ClinVar benign variants. PVS1 was downgraded in 99.4% (162/163) of pLoF variants predicted as likely not LoF/not LoF, with 17.2% (28/163) downgraded as a result of our framework, adding to previous guidelines. Variant pathogenicity was affected (mostly from likely pathogenic to VUS) in 20 (71.4%) of these 28 variants. This framework guides assessment of pLoF variants beyond standard annotation pipelines and substantially reduces false positive rates, which is key to ensure accurate LoF variant prediction in both a research and clinical setting.
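The percentages quoted in this abstract follow directly from the counts printed beside them. As a quick sanity check (not part of the original record), the short Python sketch below recomputes each one. The dictionary labels and the `percent` helper are illustrative names chosen here; only the numbers come from the abstract.

```python
# Recompute the percentages quoted in the abstract from their stated counts.
# Labels and the helper name are illustrative; the counts are from the abstract.
reported = {
    "predicted LoF evasion or potential artifact": (304, 1113, 27.3),
    "PVS1 downgraded": (162, 163, 99.4),
    "downgraded as a result of the framework": (28, 163, 17.2),
    "variant pathogenicity affected": (20, 28, 71.4),
}


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


for label, (num, den, quoted) in reported.items():
    print(f"{label}: {num}/{den} = {percent(num, den)}% (abstract: {quoted}%)")
```

Each recomputed value should agree with the figure quoted in the abstract to one decimal place.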

Subject(s)

Inheritance Patterns , Humans , Exons , Uncertainty

8.

The penetrance of rare variants in cardiomyopathy-associated genes: A cross-sectional approach to estimating penetrance for secondary findings.

McGurk, Kathryn A; Zhang, Xiaolei; Theotokis, Pantazis; Thomson, Kate; Harper, Andrew; Buchan, Rachel J; Mazaika, Erica; Ormondroyd, Elizabeth; Wright, William T; Macaya, Daniela; Pua, Chee Jian; Funke, Birgit; MacArthur, Daniel G; Prasad, Sanjay K; Cook, Stuart A; Allouba, Mona; Aguib, Yasmine; Yacoub, Magdi H; O'Regan, Declan P; Barton, Paul J R; Watkins, Hugh; Bottolo, Leonardo; Ware, James S.

Am J Hum Genet ; 110(9): 1482-1495, 2023 09 07.

Article in English | MEDLINE | ID: mdl-37652022

ABSTRACT

Understanding the penetrance of pathogenic variants identified as secondary findings (SFs) is of paramount importance with the growing availability of genetic testing. We estimated penetrance through large-scale analyses of individuals referred for diagnostic sequencing for hypertrophic cardiomyopathy (HCM; 10,400 affected individuals, 1,332 variants) and dilated cardiomyopathy (DCM; 2,564 affected individuals, 663 variants), using a cross-sectional approach comparing allele frequencies against reference populations (293,226 participants from UK Biobank and gnomAD). We generated updated prevalence estimates for HCM (1:543) and DCM (1:220). In aggregate, the penetrance by late adulthood of rare, pathogenic variants (23% for HCM, 35% for DCM) and likely pathogenic variants (7% for HCM, 10% for DCM) was substantial for dominant cardiomyopathy (CM). Penetrance was significantly higher for variant subgroups annotated as loss of function or ultra-rare and for males compared to females for variants in HCM-associated genes. We estimated variant-specific penetrance for 316 recurrent variants most likely to be identified as SFs (found in 51% of HCM- and 17% of DCM-affected individuals). 49 variants were observed at least ten times (14% of affected individuals) in HCM-associated genes. Median penetrance was 14.6% (±14.4% SD). We explore estimates of penetrance by age, sex, and ancestry and simulate the impact of including future cohorts. This dataset reports penetrance of individual variants at scale and will inform the management of individuals undergoing genetic screening for SFs. While most variants had low penetrance and the costs and harms of screening are unclear, some individuals with highly penetrant variants may benefit from SFs.
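The abstract describes estimating penetrance by comparing how often a variant is seen in affected individuals with how often it is seen in reference populations. One way such a cross-sectional estimate can be formed is with Bayes' theorem, P(disease | variant) = P(disease) × P(variant | disease) / P(variant). The Python sketch below illustrates that calculation only; it is an assumption about the general approach, not a reproduction of the authors' exact procedure. The function name and the two carrier frequencies are hypothetical, and only the HCM prevalence of 1:543 is taken from the abstract.

```python
# Sketch of a Bayes-style cross-sectional penetrance estimate.
# Assumption: penetrance = prevalence * freq_in_cases / freq_in_reference.
# This illustrates the idea only; it is not the authors' published code.


def estimate_penetrance(prevalence: float,
                        freq_in_cases: float,
                        freq_in_reference: float) -> float:
    """Estimate P(disease | variant) from cross-sectional frequencies.

    prevalence        -- P(disease) in the population
    freq_in_cases     -- P(variant | disease), carrier frequency among cases
    freq_in_reference -- P(variant), carrier frequency in the reference set
    """
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    if freq_in_reference <= 0:
        raise ValueError("reference frequency must be positive")
    estimate = prevalence * freq_in_cases / freq_in_reference
    # Sampling noise can push the raw ratio above 1; cap it at 1.
    return min(1.0, estimate)


# HCM prevalence from the abstract (1:543); both frequencies are made up.
hcm_prevalence = 1 / 543
example = estimate_penetrance(hcm_prevalence,
                              freq_in_cases=0.005,
                              freq_in_reference=0.00004)
print(f"Illustrative penetrance estimate: {example:.1%}")
```

A real analysis would also need confidence intervals, since the reference-population frequency of a rare variant is itself estimated from only a handful of carriers.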

Subject(s)

Cardiomyopathies , Cardiomyopathy, Dilated , Cardiomyopathy, Hypertrophic , Female , Male , Humans , Adult , Penetrance , Cardiomyopathies/genetics , Cardiomyopathy, Dilated/genetics , Gene Frequency

9.

Systematic evaluation of genome sequencing for the diagnostic assessment of autism spectrum disorder and fetal structural anomalies.

Lowther, Chelsea; Valkanas, Elise; Giordano, Jessica L; Wang, Harold Z; Currall, Benjamin B; O'Keefe, Kathryn; Pierce-Hoffman, Emma; Kurtas, Nehir E; Whelan, Christopher W; Hao, Stephanie P; Weisburd, Ben; Jalili, Vahid; Fu, Jack; Wong, Isaac; Collins, Ryan L; Zhao, Xuefang; Austin-Tse, Christina A; Evangelista, Emily; Lemire, Gabrielle; Aggarwal, Vimla S; Lucente, Diane; Gauthier, Laura D; Tolonen, Charlotte; Sahakian, Nareh; Stevens, Christine; An, Joon-Yong; Dong, Shan; Norton, Mary E; MacKenzie, Tippi C; Devlin, Bernie; Gilmore, Kelly; Powell, Bradford C; Brandt, Alicia; Vetrini, Francesco; DiVito, Michelle; Sanders, Stephan J; MacArthur, Daniel G; Hodge, Jennelle C; O'Donnell-Luria, Anne; Rehm, Heidi L; Vora, Neeta L; Levy, Brynn; Brand, Harrison; Wapner, Ronald J; Talkowski, Michael E.

Am J Hum Genet ; 110(9): 1454-1469, 2023 09 07.

Article in English | MEDLINE | ID: mdl-37595579

ABSTRACT

Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.

Subject(s)

Autism Spectrum Disorder , Female , Pregnancy , Humans , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Pregnancy Trimester, First , Ultrasonography, Prenatal , Chromosome Mapping , Exome

10.

Discordant calls across genotype discovery approaches elucidate variants with systematic errors.

Atkinson, Elizabeth G; Artomov, Mykyta; Loboda, Alexander A; Rehm, Heidi L; MacArthur, Daniel G; Karczewski, Konrad J; Neale, Benjamin M; Daly, Mark J.

Genome Res ; 33(6): 999-1005, 2023 06.

Article in English | MEDLINE | ID: mdl-37253541

ABSTRACT

Large-scale high-throughput sequencing data sets have been transformative for informing clinical variant interpretation and for use as reference panels for statistical and population genetic efforts. Although such resources are often treated as ground truth, we find that in widely used reference data sets such as the Genome Aggregation Database (gnomAD), some variants pass gold-standard filters, yet are systematically different in their genotype calls across genotype discovery approaches. The inclusion of such discordant sites in study designs involving multiple genotype discovery strategies could bias results and lead to false-positive hits in association studies owing to technological artifacts rather than a true relationship to the phenotype. Here, we describe this phenomenon of discordant genotype calls across genotype discovery approaches, characterize the error mode of wrong calls, provide a list of discordant sites identified in gnomAD that should be treated with caution in analyses, and present a metric and machine learning classifier trained on gnomAD data to identify likely discordant variants in other data sets. We find that different genotype discovery approaches have different sets of variants at which this problem occurs, but there are characteristic variant features that can be used to predict discordant behavior. Discordant sites are largely shared across ancestry groups, although different populations are powered for the discovery of different variants. We find that the most common error mode is that of a variant being heterozygous for one approach and homozygous for the other, with heterozygous in the genomes and homozygous reference in the exomes making up the majority of miscalls.

Subject(s)

Exome , Genetics, Population , Genotype , Heterozygote , Phenotype , Polymorphism, Single Nucleotide

11.

Evaluating drug targets through human loss-of-function genetic variation.

Minikel, Eric Vallabh; Karczewski, Konrad J; Martin, Hilary C; Cummings, Beryl B; Whiffin, Nicola; Rhodes, Daniel; Alföldi, Jessica; Trembath, Richard C; van Heel, David A; Daly, Mark J; Schreiber, Stuart L; MacArthur, Daniel G.

Nature ; 581(7809): 459-464, 2020 05.

Article in English | MEDLINE | ID: mdl-32461653

ABSTRACT

Naturally occurring human genetic variants that are predicted to inactivate protein-coding genes provide an in vivo model of human gene inactivation that complements knockout studies in cells and model organisms. Here we report three key findings regarding the assessment of candidate drug targets using human loss-of-function variants. First, even essential genes, in which loss-of-function variants are not tolerated, can be highly successful as targets of inhibitory drugs. Second, in most genes, loss-of-function variants are sufficiently rare that genotype-based ascertainment of homozygous or compound heterozygous 'knockout' humans will await sample sizes that are approximately 1,000 times those presently available, unless recruitment focuses on consanguineous individuals. Third, automated variant annotation and filtering are powerful, but manual curation remains crucial for removing artefacts, and is a prerequisite for recall-by-genotype efforts. Our results provide a roadmap for human knockout studies and should guide the interpretation of loss-of-function variants in drug development.

Subject(s)

Genes, Essential/drug effects , Genes, Essential/genetics , Loss of Function Mutation/genetics , Molecular Targeted Therapy , Artifacts , Automation , Consanguinity , Exons/genetics , Gain of Function Mutation/genetics , Gene Frequency , Gene Knockdown Techniques , Heterozygote , Homozygote , Humans , Huntingtin Protein/genetics , Leucine-Rich Repeat Serine-Threonine Protein Kinase-2/genetics , Neurodegenerative Diseases/genetics , Prion Proteins/genetics , Reproducibility of Results , Sample Size , tau Proteins/genetics

12.

Transcript expression-aware annotation improves rare variant interpretation.

Cummings, Beryl B; Karczewski, Konrad J; Kosmicki, Jack A; Seaby, Eleanor G; Watts, Nicholas A; Singer-Berk, Moriel; Mudge, Jonathan M; Karjalainen, Juha; Satterstrom, F Kyle; O'Donnell-Luria, Anne H; Poterba, Timothy; Seed, Cotton; Solomonson, Matthew; Alföldi, Jessica; Daly, Mark J; MacArthur, Daniel G.

Nature ; 581(7809): 452-458, 2020 05.

Article in English | MEDLINE | ID: mdl-32461655

ABSTRACT

The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD) [1], we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the 'proportion expressed across transcripts', which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project [2] and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases. Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.

Subject(s)

Disease/genetics , Haploinsufficiency/genetics , Loss of Function Mutation/genetics , Molecular Sequence Annotation , Transcription, Genetic , Transcriptome/genetics , Autism Spectrum Disorder/genetics , Datasets as Topic , Developmental Disabilities/genetics , Exons/genetics , Female , Genotype , Humans , Intellectual Disability/genetics , Male , Molecular Sequence Annotation/standards , Poisson Distribution , RNA, Messenger/analysis , RNA, Messenger/genetics , Rare Diseases/diagnosis , Rare Diseases/genetics , Reproducibility of Results , Exome Sequencing

13.

A brief history of human disease genetics.

Claussnitzer, Melina; Cho, Judy H; Collins, Rory; Cox, Nancy J; Dermitzakis, Emmanouil T; Hurles, Matthew E; Kathiresan, Sekar; Kenny, Eimear E; Lindgren, Cecilia M; MacArthur, Daniel G; North, Kathryn N; Plon, Sharon E; Rehm, Heidi L; Risch, Neil; Rotimi, Charles N; Shendure, Jay; Soranzo, Nicole; McCarthy, Mark I.

Nature ; 577(7789): 179-189, 2020 01.

Article in English | MEDLINE | ID: mdl-31915397

ABSTRACT

A primary goal of human genetics is to identify DNA sequence variants that influence biomedical traits, particularly those related to the onset and progression of human disease. Over the past 25 years, progress in realizing this objective has been transformed by advances in technology, foundational genomic resources and analytical tools, and by access to vast amounts of genotype and phenotype data. Genetic discoveries have substantially improved our understanding of the mechanisms responsible for many rare and common diseases and driven development of novel preventative and therapeutic strategies. Medical innovation will increasingly focus on delivering care tailored to individual patterns of genetic predisposition.

Subject(s)

Genetic Variation , Animals , Genetic Testing , Genomics , Genotype , Humans , Phenotype , Rare Diseases/genetics

14.

A structural variation reference for medical and population genetics.

Collins, Ryan L; Brand, Harrison; Karczewski, Konrad J; Zhao, Xuefang; Alföldi, Jessica; Francioli, Laurent C; Khera, Amit V; Lowther, Chelsea; Gauthier, Laura D; Wang, Harold; Watts, Nicholas A; Solomonson, Matthew; O'Donnell-Luria, Anne; Baumann, Alexander; Munshi, Ruchi; Walker, Mark; Whelan, Christopher W; Huang, Yongqing; Brookings, Ted; Sharpe, Ted; Stone, Matthew R; Valkanas, Elise; Fu, Jack; Tiao, Grace; Laricchia, Kristen M; Ruano-Rubio, Valentin; Stevens, Christine; Gupta, Namrata; Cusick, Caroline; Margolin, Lauren; Taylor, Kent D; Lin, Henry J; Rich, Stephen S; Post, Wendy S; Chen, Yii-Der Ida; Rotter, Jerome I; Nusbaum, Chad; Philippakis, Anthony; Lander, Eric; Gabriel, Stacey; Neale, Benjamin M; Kathiresan, Sekar; Daly, Mark J; Banks, Eric; MacArthur, Daniel G; Talkowski, Michael E.

Nature ; 581(7809): 444-451, 2020 05.

Article in English | MEDLINE | ID: mdl-32461652

ABSTRACT

Structural variants (SVs) rearrange large segments of DNA [1] and can have profound consequences in evolution and human disease [2,3]. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) [4] have become integral in the interpretation of single-nucleotide variants (SNVs) [5]. However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage [6]. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings [7]. This SV resource is freely distributed via the gnomAD browser [8] and will have broad utility in population genetics, disease-association studies, and diagnostic screening.

Subject(s)

Disease/genetics , Genetic Variation , Genetics, Medical/standards , Genetics, Population/standards , Genome, Human/genetics , Female , Genetic Testing , Genotyping Techniques , Humans , Male , Middle Aged , Mutation , Polymorphism, Single Nucleotide/genetics , Racial Groups/genetics , Reference Standards , Selection, Genetic , Whole Genome Sequencing

15.

A minimal role for synonymous variation in human disease.

Dhindsa, Ryan S; Wang, Quanli; Vitsios, Dimitrios; Burren, Oliver S; Hu, Fengyuan; DiCarlo, James E; Kruglyak, Leonid; MacArthur, Daniel G; Hurles, Matthew E; Petrovski, Slavé.

Am J Hum Genet ; 109(12): 2105-2109, 2022 12 01.

Article in English | MEDLINE | ID: mdl-36459978

ABSTRACT

Synonymous mutations change the DNA sequence of a gene without affecting the amino acid sequence of the encoded protein. Although some synonymous mutations can affect RNA splicing, translational efficiency, and mRNA stability, studies in human genetics, mutagenesis screens, and other experiments and evolutionary analyses have repeatedly shown that most synonymous variants are neutral or only weakly deleterious, with some notable exceptions. Based on a recent study in yeast, there have been claims that synonymous mutations could be as important as nonsynonymous mutations in causing disease, assuming the yeast findings hold up and translate to humans. Here, we argue that there is insufficient evidence to overturn the large, coherent body of knowledge establishing the predominant neutrality of synonymous variants in the human genome.

Subject(s)

Biological Evolution , Saccharomyces cerevisiae , Humans , Mutation/genetics , Amino Acid Sequence , Genome, Human/genetics

16.

Mitochondrial DNA variation across 56,434 individuals in gnomAD.

Laricchia, Kristen M; Lake, Nicole J; Watts, Nicholas A; Shand, Megan; Haessly, Andrea; Gauthier, Laura; Benjamin, David; Banks, Eric; Soto, Jose; Garimella, Kiran; Emery, James; Rehm, Heidi L; MacArthur, Daniel G; Tiao, Grace; Lek, Monkol; Mootha, Vamsi K; Calvo, Sarah E.

Genome Res ; 32(3): 569-582, 2022 03.

Article in English | MEDLINE | ID: mdl-35074858

ABSTRACT

Genomic databases of allele frequency are extremely helpful for evaluating clinical variants of unknown significance; however, until now, databases such as the Genome Aggregation Database (gnomAD) have focused on nuclear DNA and have ignored the mitochondrial genome (mtDNA). Here, we present a pipeline to call mtDNA variants that addresses three technical challenges: (1) detecting homoplasmic and heteroplasmic variants, present, respectively, in all or a fraction of mtDNA molecules; (2) circular mtDNA genome; and (3) misalignment of nuclear sequences of mitochondrial origin (NUMTs). We observed that mtDNA copy number per cell varied across gnomAD cohorts and influenced the fraction of NUMT-derived false-positive variant calls, which can account for the majority of putative heteroplasmies. To avoid false positives, we excluded contaminated samples, cell lines, and samples prone to NUMT misalignment due to few mtDNA copies. Furthermore, we report variants with heteroplasmy ≥10%. We applied this pipeline to 56,434 whole-genome sequences in the gnomAD v3.1 database that includes individuals of European (58%), African (25%), Latino (10%), and Asian (5%) ancestry. Our gnomAD v3.1 release contains population frequencies for 10,850 unique mtDNA variants at more than half of all mtDNA bases. Importantly, we report frequencies within each nuclear ancestral population and mitochondrial haplogroup. Homoplasmic variants account for most variant calls (98%) and unique variants (85%). We observed that 1/250 individuals carry a pathogenic mtDNA variant with heteroplasmy above 10%. These mtDNA population allele frequencies are freely accessible and will aid in diagnostic interpretation and research studies.

Subject(s)

DNA, Mitochondrial , Genome, Mitochondrial , Cell Nucleus/genetics , DNA, Mitochondrial/genetics , Gene Frequency , Genome , Humans , Mitochondria/genetics , Sequence Analysis, DNA

17.

Author Correction: A genomic mutational constraint map using variation in 76,156 human genomes.

Chen, Siwei; Francioli, Laurent C; Goodrich, Julia K; Collins, Ryan L; Kanai, Masahiro; Wang, Qingbo; Alföldi, Jessica; Watts, Nicholas A; Vittal, Christopher; Gauthier, Laura D; Poterba, Timothy; Wilson, Michael W; Tarasova, Yekaterina; Phu, William; Grant, Riley; Yohannes, Mary T; Koenig, Zan; Farjoun, Yossi; Banks, Eric; Donnelly, Stacey; Gabriel, Stacey; Gupta, Namrata; Ferriera, Steven; Tolonen, Charlotte; Novod, Sam; Bergelson, Louis; Roazen, David; Ruano-Rubio, Valentin; Covarrubias, Miguel; Llanwarne, Christopher; Petrillo, Nikelle; Wade, Gordon; Jeandet, Thibault; Munshi, Ruchi; Tibbetts, Kathleen; O'Donnell-Luria, Anne; Solomonson, Matthew; Seed, Cotton; Martin, Alicia R; Talkowski, Michael E; Rehm, Heidi L; Daly, Mark J; Tiao, Grace; Neale, Benjamin M; MacArthur, Daniel G; Karczewski, Konrad J.

Nature ; 626(7997): E1, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38225470

18.

Corrigendum: Landscape of X chromosome inactivation across human tissues.

Tukiainen, Taru; Villani, Alexandra-Chloé; Yen, Angela; Rivas, Manuel A; Marshall, Jamie L; Satija, Rahul; Aguirre, Matt; Gauthier, Laura; Fleharty, Mark; Kirby, Andrew; Cummings, Beryl B; Castel, Stephane E; Karczewski, Konrad J; Aguet, François; Byrnes, Andrea; Consortium, GTEx; Lappalainen, Tuuli; Regev, Aviv; Ardlie, Kristin G; Hacohen, Nir; MacArthur, Daniel G.

Nature ; 555(7695): 274, 2018 03 07.

Article in English | MEDLINE | ID: mdl-29517003

ABSTRACT

This corrects the article DOI: 10.1038/nature24265.

19.

Landscape of X chromosome inactivation across human tissues.

Tukiainen, Taru; Villani, Alexandra-Chloé; Yen, Angela; Rivas, Manuel A; Marshall, Jamie L; Satija, Rahul; Aguirre, Matt; Gauthier, Laura; Fleharty, Mark; Kirby, Andrew; Cummings, Beryl B; Castel, Stephane E; Karczewski, Konrad J; Aguet, François; Byrnes, Andrea; Lappalainen, Tuuli; Regev, Aviv; Ardlie, Kristin G; Hacohen, Nir; MacArthur, Daniel G.

Nature ; 550(7675): 244-248, 2017 10 11.

Article in English | MEDLINE | ID: mdl-29022598

ABSTRACT

X chromosome inactivation (XCI) silences transcription from one of the two X chromosomes in female mammalian cells to balance expression dosage between XX females and XY males. XCI is, however, incomplete in humans: up to one-third of X-chromosomal genes are expressed from both the active and inactive X chromosomes (Xa and Xi, respectively) in female cells, with the degree of 'escape' from inactivation varying between genes and individuals. The extent to which XCI is shared between cells and tissues remains poorly characterized, as does the degree to which incomplete XCI manifests as detectable sex differences in gene expression and phenotypic traits. Here we describe a systematic survey of XCI, integrating over 5,500 transcriptomes from 449 individuals spanning 29 tissues from GTEx (v6p release) and 940 single-cell transcriptomes, combined with genomic sequence data. We show that XCI at 683 X-chromosomal genes is generally uniform across human tissues, but identify examples of heterogeneity between tissues, individuals and cells. We show that incomplete XCI affects at least 23% of X-chromosomal genes, identify seven genes that escape XCI with support from multiple lines of evidence and demonstrate that escape from XCI results in sex biases in gene expression, establishing incomplete XCI as a mechanism that is likely to introduce phenotypic diversity. Overall, this updated catalogue of XCI across human tissues helps to increase our understanding of the extent and impact of the incompleteness in the maintenance of XCI.

Subject(s)

Organ Specificity/genetics , Single-Cell Analysis , X Chromosome Inactivation/genetics , Chromosomes, Human, X/genetics , Female , Genes, X-Linked/genetics , Genome, Human/genetics , Genomics , Humans , Male , Phenotype , Sequence Analysis, RNA , Transcriptome/genetics

20.

Human knockouts and phenotypic analysis in a cohort with a high rate of consanguinity.

Saleheen, Danish; Natarajan, Pradeep; Armean, Irina M; Zhao, Wei; Rasheed, Asif; Khetarpal, Sumeet A; Won, Hong-Hee; Karczewski, Konrad J; O'Donnell-Luria, Anne H; Samocha, Kaitlin E; Weisburd, Benjamin; Gupta, Namrata; Zaidi, Mozzam; Samuel, Maria; Imran, Atif; Abbas, Shahid; Majeed, Faisal; Ishaq, Madiha; Akhtar, Saba; Trindade, Kevin; Mucksavage, Megan; Qamar, Nadeem; Zaman, Khan Shah; Yaqoob, Zia; Saghir, Tahir; Rizvi, Syed Nadeem Hasan; Memon, Anis; Hayyat Mallick, Nadeem; Ishaq, Mohammad; Rasheed, Syed Zahed; Memon, Fazal-Ur-Rehman; Mahmood, Khalid; Ahmed, Naveeduddin; Do, Ron; Krauss, Ronald M; MacArthur, Daniel G; Gabriel, Stacey; Lander, Eric S; Daly, Mark J; Frossard, Philippe; Danesh, John; Rader, Daniel J; Kathiresan, Sekar.

Nature ; 544(7649): 235-239, 2017 04 12.

Article in English | MEDLINE | ID: mdl-28406212

ABSTRACT

A major goal of biomedicine is to understand the function of every gene in the human genome. Loss-of-function mutations can disrupt both copies of a given gene in humans and phenotypic analysis of such 'human knockouts' can provide insight into gene function. Consanguineous unions are more likely to result in offspring carrying homozygous loss-of-function mutations. In Pakistan, consanguinity rates are notably high. Here we sequence the protein-coding regions of 10,503 adult participants in the Pakistan Risk of Myocardial Infarction Study (PROMIS), designed to understand the determinants of cardiometabolic diseases in individuals from South Asia. We identified individuals carrying homozygous predicted loss-of-function (pLoF) mutations, and performed phenotypic analysis involving more than 200 biochemical and disease traits. We enumerated 49,138 rare (<1% minor allele frequency) pLoF mutations. These pLoF mutations are estimated to knock out 1,317 genes, each in at least one participant. Homozygosity for pLoF mutations at PLA2G7 was associated with absent enzymatic activity of soluble lipoprotein-associated phospholipase A2; at CYP2F1, with higher plasma interleukin-8 concentrations; at TREH, with lower concentrations of apoB-containing lipoprotein subfractions; at either A3GALT2 or NRG4, with markedly reduced plasma insulin C-peptide concentrations; and at SLC9A3R1, with mediators of calcium and phosphate signalling. Heterozygous deficiency of APOC3 has been shown to protect against coronary heart disease; we identified APOC3 homozygous pLoF carriers in our cohort. We recruited these human knockouts and challenged them with an oral fat load. Compared with family members lacking the mutation, individuals with APOC3 knocked out displayed marked blunting of the usual post-prandial rise in plasma triglycerides. Overall, these observations provide a roadmap for a 'human knockout project', a systematic effort to understand the phenotypic consequences of complete disruption of genes in humans.

Subject(s)

Consanguinity , DNA Mutational Analysis , Gene Deletion , Genes/genetics , Genetic Association Studies/methods , Homozygote , Phenotype , 1-Alkyl-2-acetylglycerophosphocholine Esterase/deficiency , 1-Alkyl-2-acetylglycerophosphocholine Esterase/genetics , Apolipoprotein C-III/deficiency , Apolipoprotein C-III/genetics , Cohort Studies , Coronary Disease/blood , Coronary Disease/genetics , Cytochrome P450 Family 2/genetics , Dietary Fats/pharmacology , Exome/genetics , Fasting/blood , Female , Gene Frequency , Humans , Interleukin-8/blood , Male , Middle Aged , Myocardial Infarction/blood , Myocardial Infarction/genetics , Neuregulins/genetics , Pakistan , Pedigree , Phosphoproteins/genetics , Postprandial Period , RNA Splice Sites/genetics , Reverse Genetics/methods , Sodium-Hydrogen Exchangers/genetics , Triglycerides/blood
