1.
Am J Hum Genet ; 110(12): 2068-2076, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38000370

ABSTRACT

DNA sample contamination is a major issue in clinical and research applications of whole-genome and -exome sequencing. Even modest levels of contamination can substantially affect the overall quality of variant calls and lead to widespread genotyping errors. Currently, popular tools for estimating the contamination level use short-read data (BAM/CRAM files), which are expensive to store and manipulate and often not retained or shared widely. We propose a metric to estimate DNA sample contamination from variant-level whole-genome and -exome sequence data called CHARR, contamination from homozygous alternate reference reads, which leverages the infiltration of reference reads within homozygous alternate variant calls. CHARR uses a small proportion of variant-level genotype information and thus can be computed from single-sample gVCFs or callsets in VCF or BCF formats, as well as efficiently stored variant calls in Hail VariantDataset format. Our results demonstrate that CHARR accurately recapitulates results from existing tools with substantially reduced costs, improving the accuracy and efficiency of downstream analyses of ultra-large whole-genome and exome sequencing datasets.


Subject(s)
DNA , Trout , Humans , Animals , Sequence Analysis, DNA/methods , Genotype , Homozygote , High-Throughput Nucleotide Sequencing/methods , Software
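The idea described in the abstract above, reference reads appearing within homozygous alternate calls, can be illustrated with a minimal sketch. This is not the published CHARR formula: it only averages the raw reference-read fraction across homozygous alternate sites, and the published metric may apply further adjustments (for example for allele frequency) that are not reproduced here. The names `HomAltCall`, `ref_read_fraction`, and the `min_depth` threshold are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class HomAltCall:
    """Read counts at one homozygous-alternate genotype call (hypothetical type)."""
    ref_reads: int
    alt_reads: int


def ref_read_fraction(calls: Iterable[HomAltCall], min_depth: int = 10) -> float:
    """Mean fraction of reference reads across homozygous-alternate calls.

    In an uncontaminated sample this fraction should be close to zero, so a
    raised value hints at contamination. The depth filter is an assumption
    made for this sketch, not a parameter taken from the paper.
    """
    fractions = []
    for call in calls:
        depth = call.ref_reads + call.alt_reads
        if depth < min_depth:
            continue  # skip low-coverage sites, where the fraction is noisy
        fractions.append(call.ref_reads / depth)
    if not fractions:
        raise ValueError("no homozygous-alternate calls with sufficient depth")
    return sum(fractions) / len(fractions)


# Illustrative read counts, not real data; the last call is filtered out.
example_calls = [
    HomAltCall(1, 19),
    HomAltCall(0, 20),
    HomAltCall(2, 18),
    HomAltCall(1, 3),
]
example_estimate = ref_read_fraction(example_calls)
```

Because only per-site read counts are needed, a calculation of this kind can work from variant-level files rather than from the underlying reads, which is the cost advantage the abstract emphasizes.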
2.
bioRxiv ; 2023 Jun 28.
Article in English | MEDLINE | ID: mdl-37425834

ABSTRACT

DNA sample contamination is a major issue in clinical and research applications of whole genome and exome sequencing. Even modest levels of contamination can substantially affect the overall quality of variant calls and lead to widespread genotyping errors. Currently, popular tools for estimating the contamination level use short-read data (BAM/CRAM files), which are expensive to store and manipulate and often not retained or shared widely. We propose a new metric to estimate DNA sample contamination from variant-level whole genome and exome sequence data, CHARR, Contamination from Homozygous Alternate Reference Reads, which leverages the infiltration of reference reads within homozygous alternate variant calls. CHARR uses a small proportion of variant-level genotype information and thus can be computed from single-sample gVCFs or callsets in VCF or BCF formats, as well as efficiently stored variant calls in Hail VDS format. Our results demonstrate that CHARR accurately recapitulates results from existing tools with substantially reduced costs, improving the accuracy and efficiency of downstream analyses of ultra-large whole genome and exome sequencing datasets.

3.
Nat Genet ; 54(9): 1320-1331, 2022 09.
Article in English | MEDLINE | ID: mdl-35982160

ABSTRACT

Some individuals with autism spectrum disorder (ASD) carry functional mutations rarely observed in the general population. We explored the genes disrupted by these variants from joint analysis of protein-truncating variants (PTVs), missense variants and copy number variants (CNVs) in a cohort of 63,237 individuals. We discovered 72 genes associated with ASD at false discovery rate (FDR) ≤ 0.001 (185 at FDR ≤ 0.05). De novo PTVs, damaging missense variants and CNVs represented 57.5%, 21.1% and 8.44% of association evidence, while CNVs conferred greatest relative risk. Meta-analysis with cohorts ascertained for developmental delay (DD) (n = 91,605) yielded 373 genes associated with ASD/DD at FDR ≤ 0.001 (664 at FDR ≤ 0.05), some of which differed in relative frequency of mutation between ASD and DD cohorts. The DD-associated genes were enriched in transcriptomes of progenitor and immature neuronal cells, whereas genes showing stronger evidence in ASD were more enriched in maturing neurons and overlapped with schizophrenia-associated genes, emphasizing that these neuropsychiatric disorders may share common pathways to risk.


Subject(s)
Autism Spectrum Disorder , Autistic Disorder , Autism Spectrum Disorder/genetics , Autistic Disorder/genetics , DNA Copy Number Variations/genetics , Genetic Predisposition to Disease , Humans , Mutation
4.
Nat Genet ; 54(9): 1275-1283, 2022 09.
Article in English | MEDLINE | ID: mdl-36038634

ABSTRACT

Genome-wide association studies (GWASs) have identified hundreds of loci associated with Crohn's disease (CD). However, as with all complex diseases, robust identification of the genes dysregulated by noncoding variants typically driving GWAS discoveries has been challenging. Here, to complement GWASs and better define actionable biological targets, we analyzed sequence data from more than 30,000 patients with CD and 80,000 population controls. We directly implicate ten genes in general onset CD for the first time to our knowledge via association to coding variation, four of which lie within established CD GWAS loci. In nine instances, a single coding variant is significantly associated, and in the tenth, ATG4C, we see additionally a significantly increased burden of very rare coding variants in CD cases. In addition to reiterating the central role of innate and adaptive immune cells as well as autophagy in CD pathogenesis, these newly associated genes highlight the emerging role of mesenchymal cells in the development and maintenance of intestinal inflammation.


Subject(s)
Crohn Disease , Crohn Disease/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide/genetics
5.
Am J Med Genet B Neuropsychiatr Genet ; 177(8): 736-745, 2018 12.
Article in English | MEDLINE | ID: mdl-30421579

ABSTRACT

Protein homeostasis is tightly regulated by the ubiquitin proteasome pathway. Disruption of this pathway gives rise to a host of neurological disorders. Through whole exome sequencing (WES) in families with neurodevelopmental disorders, we identified mutations in PSMD12, a core component of the proteasome, underlying a neurodevelopmental disorder with intellectual disability (ID) and features of autism spectrum disorder (ASD). We performed WES on six affected siblings from a multiplex family with ID and autistic features, the affected father, and two unaffected mothers, and a trio from a simplex family with one affected child with ID and periventricular nodular heterotopia. We identified an inherited heterozygous nonsense mutation in PSMD12 (NM_002816: c.367C>T: p.R123X) in the multiplex family and a de novo nonsense mutation in the same gene (NM_002816: c.601C>T: p.R201X) in the simplex family. PSMD12 encodes a non-ATPase regulatory subunit of the 26S proteasome. We confirm the association of PSMD12 with ID, present the first cases of inherited PSMD12 mutation, and demonstrate the heterogeneity of phenotypes associated with PSMD12 mutations.


Subject(s)
Intellectual Disability/genetics , Proteasome Endopeptidase Complex/genetics , Adolescent , Adult , Autism Spectrum Disorder/genetics , Autistic Disorder/genetics , Child , Child, Preschool , Family , Female , Genetic Predisposition to Disease , Haploinsufficiency/genetics , Humans , Male , Mutation , Neurodevelopmental Disorders/genetics , Pedigree , Proteasome Endopeptidase Complex/metabolism , Siblings , Exome Sequencing
6.
Nature ; 539(7628): 242-247, 2016 11 10.
Article in English | MEDLINE | ID: mdl-27830782

ABSTRACT

Sensory stimuli drive the maturation and function of the mammalian nervous system in part through the activation of gene expression networks that regulate synapse development and plasticity. These networks have primarily been studied in mice, and it is not known whether there are species- or clade-specific activity-regulated genes that control features of brain development and function. Here we use transcriptional profiling of human fetal brain cultures to identify an activity-dependent secreted factor, Osteocrin (OSTN), that is induced by membrane depolarization of human but not mouse neurons. We find that OSTN has been repurposed in primates through the evolutionary acquisition of DNA regulatory elements that bind the activity-regulated transcription factor MEF2. In addition, we demonstrate that OSTN is expressed in primate neocortex and restricts activity-dependent dendritic growth in human neurons. These findings suggest that, in response to sensory input, OSTN regulates features of neuronal structure and function that are unique to primates.


Subject(s)
Evolution, Molecular , Muscle Proteins/metabolism , Neocortex/metabolism , Neurons/metabolism , Transcription Factors/metabolism , Transcriptome , Animals , Base Sequence , Bone and Bones/metabolism , Dendrites/metabolism , Enhancer Elements, Genetic/genetics , Female , Humans , MEF2 Transcription Factors/metabolism , Macaca mulatta , Male , Mice , Molecular Sequence Data , Muscle Proteins/genetics , Muscles/metabolism , Neocortex/cytology , Neurons/cytology , Organ Specificity , Species Specificity , Transcription Factors/genetics
7.
J Crohns Colitis ; 8(8): 845-51, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24461721

ABSTRACT

BACKGROUND AND AIMS: More than 80% of Crohn's disease (CD) patients will require surgery. Surgery is not curative and rates of re-operation are high. Identification of genetic variants associated with repeat surgery would allow risk stratification of patients who may benefit from early aggressive therapy and/or post-operative prophylactic treatment. METHODS: CD patients who had at least one CD-related bowel resection were identified from the Prospective Registry in IBD Study at Massachusetts General Hospital (PRISM). The primary outcome was surgical recurrence. Covariates and potential interactions were assessed using the Cox proportional hazard model. Kaplan-Meier curves for time to surgical recurrence were developed for each genetic variant and analyzed with the log-rank test. RESULTS: 194 patients were identified who had at least 1 resection. Of these, 69 had two or more resections. Clinical predictors for repeat surgery were stricturing (HR 4.18, p=0.022) and penetrating behavior (HR 3.97, p=0.024). Smoking cessation was protective for repeat surgery (HR 0.45, p=0.018). SMAD3 homozygosity for the risk allele was also independently associated with increased risk of repeat surgery (HR 4.04, p=0.001). NOD2 was not associated with increased risk of surgical recurrence. CONCLUSION: Stricturing and penetrating behavior were associated with increased risk of surgical recurrence, while smoking cessation was associated with a decreased risk. A novel association between SMAD3 and increased risk of repeat operation and shorter time to repeat surgery was observed. This finding is of particular interest as SMAD3 may represent a new therapeutic target specifically for prevention of post-surgical disease recurrence.


Subject(s)
Crohn Disease/genetics , Polymorphism, Single Nucleotide/genetics , Reoperation/statistics & numerical data , Smad3 Protein/genetics , Adolescent , Adult , Aged , Child , Child, Preschool , Crohn Disease/surgery , Female , Genotyping Techniques , Humans , Kaplan-Meier Estimate , Male , Middle Aged , Registries , Risk Factors , Smoking Cessation/statistics & numerical data , Young Adult
8.
Neuron ; 77(2): 259-73, 2013 Jan 23.
Article in English | MEDLINE | ID: mdl-23352163

ABSTRACT

Despite significant heritability of autism spectrum disorders (ASDs), their extreme genetic heterogeneity has proven challenging for gene discovery. Studies of primarily simplex families have implicated de novo copy number changes and point mutations, but are not optimally designed to identify inherited risk alleles. We apply whole-exome sequencing (WES) to ASD families enriched for inherited causes due to consanguinity and find familial ASD associated with biallelic mutations in disease genes (AMT, PEX7, SYNE1, VPS13B, PAH, and POMGNT1). At least some of these genes show biallelic mutations in nonconsanguineous families as well. These mutations are often only partially disabling or present atypically, with patients lacking diagnostic features of the Mendelian disorders with which these genes are classically associated. Our study shows the utility of WES for identifying specific genetic conditions not clinically suspected and the importance of partial loss of gene function in ASDs.


Subject(s)
Autistic Disorder/diagnosis , Autistic Disorder/genetics , Exome/genetics , Genome-Wide Association Study/methods , Adolescent , Animals , Cells, Cultured , Child , Child, Preschool , Cohort Studies , Female , Humans , Male , Pedigree , Rats , Sequence Analysis, DNA/methods , Young Adult
9.
Am J Hum Genet ; 91(3): 541-7, 2012 Sep 07.
Article in English | MEDLINE | ID: mdl-22958903

ABSTRACT

Whole-exome sequencing (WES), which analyzes the coding sequence of most annotated genes in the human genome, is an ideal approach to studying fully penetrant autosomal-recessive diseases, and it has been very powerful in identifying disease-causing mutations even when enrollment of affected individuals is limited by reduced survival. In this study, we combined WES with homozygosity analysis of consanguineous pedigrees, which are informative even when a single affected individual is available, to identify genetic mutations responsible for Walker-Warburg syndrome (WWS), a genetically heterogeneous autosomal-recessive disorder that severely affects the development of the brain, eyes, and muscle. Mutations in seven genes are known to cause WWS and explain 50%-60% of cases, but multiple additional genes are expected to be mutated because unexplained cases show suggestive linkage to diverse loci. Using WES in consanguineous WWS-affected families, we found multiple deleterious mutations in GTDC2 (also known as AGO61). GTDC2's predicted role as an uncharacterized glycosyltransferase is consistent with the function of other genes that are known to be mutated in WWS and that are involved in the glycosylation of the transmembrane receptor dystroglycan. Therefore, to explore the role of GTDC2 loss of function during development, we used morpholino-mediated knockdown of its zebrafish ortholog, gtdc2. We found that gtdc2 knockdown in zebrafish replicates all WWS features (hydrocephalus, ocular defects, and muscular dystrophy), strongly suggesting that GTDC2 mutations cause WWS.


Subject(s)
Glycosyltransferases/genetics , Walker-Warburg Syndrome/genetics , Exome , Humans , Mutation
10.
PLoS Genet ; 8(4): e1002635, 2012.
Article in English | MEDLINE | ID: mdl-22511880

ABSTRACT

Although autism has a clear genetic component, the high genetic heterogeneity of the disorder has been a challenge for the identification of causative genes. We used homozygosity analysis to identify probands from nonconsanguineous families that showed evidence of distant shared ancestry, suggesting potentially recessive mutations. Whole-exome sequencing of 16 probands revealed validated homozygous, potentially pathogenic recessive mutations that segregated perfectly with disease in 4/16 families. The candidate genes (UBE3B, CLTCL1, NCKAP5L, ZNF18) encode proteins involved in proteolysis, GTPase-mediated signaling, cytoskeletal organization, and other pathways. Furthermore, neuronal depolarization regulated the transcription of these genes, suggesting potential activity-dependent roles in neurons. We present a multidimensional strategy for filtering whole-exome sequence data to find candidate recessive mutations in autism, which may have broader applicability to other complex, heterogeneous disorders.


Subject(s)
Autistic Disorder/genetics , Exons , Genes, Recessive , Mutation , Neurons , Adaptor Proteins, Signal Transducing/genetics , Clathrin Heavy Chains/genetics , Exons/genetics , Genome, Human , Genotype , High-Throughput Nucleotide Sequencing , Homozygote , Humans , Kruppel-Like Transcription Factors/genetics , Neurons/metabolism , Neurons/physiology , Oncogene Proteins/genetics , Transcription, Genetic , Ubiquitin-Protein Ligases/genetics
11.
Nat Genet ; 42(4): 332-7, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20228799

ABSTRACT

Ulcerative colitis is a chronic, relapsing inflammatory condition of the gastrointestinal tract with a complex genetic and environmental etiology. In an effort to identify genetic variation underlying ulcerative colitis risk, we present two distinct genome-wide association studies of ulcerative colitis and their joint analysis with a previously published scan, comprising, in aggregate, 2,693 individuals with ulcerative colitis and 6,791 control subjects. Fifty-nine SNPs from 14 independent loci attained an association significance of P < 10⁻⁵. Seven of these loci exceeded genome-wide significance (P < 5 × 10⁻⁸). After testing an independent cohort of 2,009 cases of ulcerative colitis and 1,580 controls, we identified 13 loci that were significantly associated with ulcerative colitis (P < 5 × 10⁻⁸), including the immunoglobulin receptor gene FCGR2A, 5p15, 2p16 and ORMDL3 (orosomucoid1-like 3). We confirmed association with 14 previously identified ulcerative colitis susceptibility loci, and an analysis of acknowledged Crohn's disease loci showed that roughly half of the known Crohn's disease associations are shared with ulcerative colitis. These data implicate approximately 30 loci in ulcerative colitis, thereby providing insight into disease pathogenesis.


Subject(s)
Colitis, Ulcerative/genetics , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Membrane Proteins/genetics , Meta-Analysis as Topic , Receptors, IgG/genetics
12.
Inflamm Bowel Dis ; 15(10): 1508-14, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19322901

ABSTRACT

BACKGROUND: Early-onset disease is frequently examined in genetic studies because it is presumed to contain a more severe subset of patients under a higher influence of genetic effects. In light of the dramatic success of Crohn's disease (CD) gene discovery efforts, we aimed to characterize the contribution of established common risk variants to pediatric CD. METHODS: Using 35 confirmed CD risk alleles, we genotyped 384 parent-child trios (mean age of onset 11.7 years) along with 321 healthy controls. We performed association tests on the independent pediatric cohort and compared results to those previously published (1). We also computed a weighted CD genetic risk score for each affected person. RESULTS: Six variants not previously validated in children (at 5q33, 1q24, 7p12, 12q12, 8q24, and 1q32) were significantly associated with pediatric CD (P < 0.03). We detected no significant association between risk score and age at onset through age 30. This analysis illustrates that the genetic effect of established CD risk variants is similar in early and later onset CD. CONCLUSIONS: These results motivate joint analyses of genome-wide association data in early and late onset cohorts and suggest that, rather than established risk variants, independent variants or environmental exposures should be sought as modulators of age of onset.


Subject(s)
Crohn Disease/genetics , Genetic Markers/genetics , Genetic Variation/genetics , Adolescent , Adult , Age of Onset , Case-Control Studies , Child , Chromosomes, Human, Pair 1/genetics , Chromosomes, Human, Pair 12/genetics , Chromosomes, Human, Pair 5/genetics , Chromosomes, Human, Pair 7/genetics , Cohort Studies , Colitis, Ulcerative/genetics , Female , Genotype , Humans , Male , Phenotype , Prognosis , Time Factors , Young Adult
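The abstract above mentions a weighted genetic risk score but does not say how the weights were chosen. A common convention, assumed here purely for illustration, weights each variant's risk-allele count by the natural log of its odds ratio. The function name, variant labels, and odds ratios below are hypothetical and are not taken from the study.

```python
import math
from typing import Mapping


def weighted_risk_score(
    risk_allele_counts: Mapping[str, int],
    odds_ratios: Mapping[str, float],
) -> float:
    """Sum of risk-allele counts (0, 1 or 2) weighted by log odds ratio.

    The log(OR) weighting is an assumption of this sketch; the source
    abstract does not specify its weighting scheme.
    """
    score = 0.0
    for variant, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"risk-allele count for {variant} must be 0, 1 or 2")
        score += count * math.log(odds_ratios[variant])
    return score


# Made-up genotype and odds ratios for one person, for illustration only.
example_counts = {"variant_a": 2, "variant_b": 0, "variant_c": 1}
example_odds_ratios = {"variant_a": 1.5, "variant_b": 1.2, "variant_c": 2.0}
example_score = weighted_risk_score(example_counts, example_odds_ratios)
```

A per-person score of this kind can then be compared against age at onset, which is the relationship the abstract reports testing.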
13.
PLoS Genet ; 4(4): e1000024, 2008 Apr 25.
Article in English | MEDLINE | ID: mdl-18437207

ABSTRACT

The major histocompatibility complex (MHC) is one of the most extensively studied regions in the human genome because of the association of variants at this locus with autoimmune, infectious, and inflammatory diseases. However, identification of causal variants within the MHC for the majority of these diseases has remained difficult due to the great variability and extensive linkage disequilibrium (LD) that exists among alleles throughout this locus, coupled with inadequate study design whereby only a limited subset of about 20 from a total of approximately 250 genes have been studied in small cohorts of predominantly European origin. We have performed a review and pooled analysis of the past 30 years of research on the role of the MHC in six genetically complex disease traits - multiple sclerosis (MS), type 1 diabetes (T1D), systemic lupus erythematosus (SLE), ulcerative colitis (UC), Crohn's disease (CD), and rheumatoid arthritis (RA) - in order to consolidate and evaluate the current literature regarding MHC genetics in these common autoimmune and inflammatory diseases. We corroborate established MHC disease associations and identify predisposing variants that previously have not been appreciated. Furthermore, we find a number of interesting commonalities and differences across diseases that implicate both general and disease-specific pathogenetic mechanisms in autoimmunity.


Subject(s)
Autoimmunity/genetics , Major Histocompatibility Complex , Alleles , Arthritis, Rheumatoid/genetics , Arthritis, Rheumatoid/immunology , Colitis, Ulcerative/genetics , Colitis, Ulcerative/immunology , Crohn Disease/genetics , Crohn Disease/immunology , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 1/immunology , Genetic Predisposition to Disease , Haplotypes , Humans , Linkage Disequilibrium , Lupus Erythematosus, Systemic/genetics , Lupus Erythematosus, Systemic/immunology , Multiple Sclerosis/genetics , Multiple Sclerosis/immunology
14.
PLoS Genet ; 3(11): e192, 2007 Nov.
Article in English | MEDLINE | ID: mdl-17997607

ABSTRACT

The association of the major histocompatibility complex (MHC) with SLE is well established yet the causal variants arising from this region remain to be identified, largely due to inadequate study design and the strong linkage disequilibrium demonstrated by genes across this locus. The majority of studies thus far have identified strong association with classical class II alleles, in particular HLA-DRB1*0301 and HLA-DRB1*1501. Additional associations have been reported with class III alleles; specifically, complement C4 null alleles and a tumor necrosis factor promoter SNP (TNF-308G/A). However, the relative effects of these class II and class III variants have not been determined. We have thus used a family-based approach to map association signals across the MHC class II and class III regions in a cohort of 314 complete United Kingdom Caucasian SLE trios by typing tagging SNPs together with classical typing of the HLA-DRB1 locus. Using TDT and conditional regression analyses, we have demonstrated the presence of two distinct and independent association signals in SLE: HLA-DRB1*0301 (nominal p = 4.9 × 10⁻⁸, permuted p < 0.0001, OR = 2.3) and the T allele of SNP rs419788 (nominal p = 4.3 × 10⁻⁸, permuted p < 0.0001, OR = 2.0) in intron 6 of the class III region gene SKIV2L. Assessment of genotypic risk demonstrates a likely dominant model of inheritance for HLA-DRB1*0301, while rs419788-T confers susceptibility in an additive manner. Furthermore, by comparing transmitted and untransmitted parental chromosomes, we have delimited our class II signal to a 180 kb region encompassing the alleles HLA-DRB1*0301-HLA-DQA1*0501-HLA-DQB1*0201 alone. Our class III signal importantly excludes independent association at the TNF promoter polymorphism, TNF-308G/A, in our SLE cohort and provides a potentially novel locus for future genetic and functional studies.


Subject(s)
Genetic Predisposition to Disease , Lupus Erythematosus, Systemic/genetics , Major Histocompatibility Complex/genetics , Alleles , Black People/genetics , Case-Control Studies , Cohort Studies , Family , Female , Gene Frequency , Genetic Markers , HLA-DR Antigens/genetics , HLA-DRB1 Chains , Haplotypes , Histocompatibility Antigens Class II/genetics , Humans , Linkage Disequilibrium/genetics , Lupus Erythematosus, Systemic/epidemiology , Male , Pedigree , Phenotype , Polymorphism, Single Nucleotide/genetics , Regression Analysis , United Kingdom/epidemiology , United States
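The abstract above relies on the transmission disequilibrium test (TDT) in parent-offspring trios. As a rough sketch of the classic form of that test, the statistic compares how often heterozygous parents transmit an allele versus how often they do not, and is referred to a chi-square distribution with one degree of freedom. The function name and the counts below are hypothetical; the study's own counts and its conditional regression analysis are not reproduced here.

```python
def tdt_statistic(transmitted: int, untransmitted: int) -> float:
    """Classic TDT chi-square statistic, (b - c)^2 / (b + c).

    b: times heterozygous parents transmitted the allele to an affected child.
    c: times heterozygous parents did not transmit it.
    """
    if transmitted < 0 or untransmitted < 0:
        raise ValueError("counts must be non-negative")
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - untransmitted) ** 2 / total


# Illustrative counts only, not data from the study.
example_chi_square = tdt_statistic(transmitted=60, untransmitted=30)
```

Only heterozygous parents contribute to the counts, which is why a family-based design of this kind is robust to population stratification.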