Results 1 - 20 of 142
1.
Cell ; 180(3): 568-584.e23, 2020 02 06.
Article in English | MEDLINE | ID: mdl-31981491

ABSTRACT

We present the largest exome sequencing study of autism spectrum disorder (ASD) to date (n = 35,584 total samples, 11,986 with ASD). Using an enhanced analytical framework to integrate de novo and case-control rare variation, we identify 102 risk genes at a false discovery rate of 0.1 or less. Of these genes, 49 show higher frequencies of disruptive de novo variants in individuals ascertained to have severe neurodevelopmental delay, whereas 53 show higher frequencies in individuals ascertained to have ASD; comparing ASD cases with mutations in these groups reveals phenotypic differences. Expressed early in brain development, most risk genes have roles in regulation of gene expression or neuronal communication (i.e., mutations effect neurodevelopmental and neurophysiological changes), and 13 fall within loci recurrently hit by copy number variants. In cells from the human cortex, expression of risk genes is enriched in excitatory and inhibitory neuronal lineages, consistent with multiple paths to an excitatory-inhibitory imbalance underlying ASD.


Subject(s)
Autistic Disorder/genetics , Cerebral Cortex/growth & development , Exome Sequencing/methods , Gene Expression Regulation, Developmental , Neurobiology/methods , Case-Control Studies , Cell Lineage , Cohort Studies , Exome , Female , Gene Frequency , Genetic Predisposition to Disease , Humans , Male , Mutation, Missense , Neurons/metabolism , Phenotype , Sex Factors , Single-Cell Analysis/methods
2.
Cell ; 155(5): 997-1007, 2013 Nov 21.
Article in English | MEDLINE | ID: mdl-24267886

ABSTRACT

Autism spectrum disorder (ASD) is a complex developmental syndrome of unknown etiology. Recent studies employing exome- and genome-wide sequencing have identified nine high-confidence ASD (hcASD) genes. Working from the hypothesis that ASD-associated mutations in these biologically pleiotropic genes will disrupt intersecting developmental processes to contribute to a common phenotype, we have attempted to identify time periods, brain regions, and cell types in which these genes converge. We have constructed coexpression networks based on the hcASD "seed" genes, leveraging a rich expression data set encompassing multiple human brain regions across human development and into adulthood. By assessing enrichment of an independent set of probable ASD (pASD) genes, derived from the same sequencing studies, we demonstrate a key point of convergence in midfetal layer 5/6 cortical projection neurons. This approach informs when, where, and in what cell types mutations in these specific genes may be productively studied to clarify ASD pathophysiology.


Subject(s)
Brain/metabolism , Child Development Disorders, Pervasive/genetics , Child Development Disorders, Pervasive/physiopathology , Animals , Brain/embryology , Brain/growth & development , Brain/pathology , Child Development Disorders, Pervasive/pathology , Exome , Female , Fetus/metabolism , Fetus/pathology , Gene Expression Profiling , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Male , Mice , Mutation , Neurons/metabolism , Prefrontal Cortex/metabolism , Sequence Analysis, DNA
3.
Am J Hum Genet ; 110(9): 1454-1469, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37595579

ABSTRACT

Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.


Subject(s)
Autism Spectrum Disorder , Female , Pregnancy , Humans , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Pregnancy Trimester, First , Ultrasonography, Prenatal , Chromosome Mapping , Exome
4.
Am J Hum Genet ; 108(4): 597-607, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33675682

ABSTRACT

Each human genome includes de novo mutations that arose during gametogenesis. While these germline mutations represent a fundamental source of new genetic diversity, they can also create deleterious alleles that impact fitness. Whereas the rate and patterns of point mutations in the human germline are now well understood, far less is known about the frequency and features that impact de novo structural variants (dnSVs). We report a family-based study of germline mutations among 9,599 human genomes from 33 multigenerational CEPH-Utah families and 2,384 families from the Simons Foundation Autism Research Initiative. We find that de novo structural mutations detected by alignment-based, short-read WGS occur at an overall rate of at least 0.160 events per genome in unaffected individuals, and we observe a significantly higher rate (0.206 per genome) in ASD-affected individuals. In both probands and unaffected samples, nearly 73% of de novo structural mutations arose in paternal gametes, and we predict most de novo structural mutations to be caused by mutational mechanisms that do not require sequence homology. After multiple testing correction, we did not observe a statistically significant correlation between parental age and the rate of de novo structural variation in offspring. These results highlight that a spectrum of mutational mechanisms contribute to germline structural mutations and that these mechanisms most likely have markedly different rates and selective pressures than those leading to point mutations.


Subject(s)
Family , Genome, Human/genetics , Germ Cells , Germ-Line Mutation/genetics , Mutation Rate , Aging/genetics , Autistic Disorder/genetics , Bias , DNA Copy Number Variations/genetics , DNA Mutational Analysis , Female , Humans , Male , Paternal Age , Point Mutation/genetics
5.
Genome Res ; 31(10): 1807-1818, 2021 10.
Article in English | MEDLINE | ID: mdl-33837133

ABSTRACT

When assessed over a large number of samples, bulk RNA sequencing provides reliable data for gene expression at the tissue level. Single-cell RNA sequencing (scRNA-seq) deepens those analyses by evaluating gene expression at the cellular level. Both data types lend insights into disease etiology. With current technologies, scRNA-seq data are known to be noisy. Constrained by costs, scRNA-seq data are typically generated from a relatively small number of subjects, which limits their utility for some analyses, such as identification of gene expression quantitative trait loci (eQTLs). To address these issues while maintaining the unique advantages of each data type, we develop a Bayesian method (bMIND) to integrate bulk and scRNA-seq data. With a prior derived from scRNA-seq data, we propose to estimate sample-level cell type-specific (CTS) expression from bulk expression data. The CTS expression enables large-scale sample-level downstream analyses, such as detection of CTS differentially expressed genes (DEGs) and eQTLs. Through simulations, we show that bMIND improves the accuracy of sample-level CTS expression estimates and increases the power to discover CTS DEGs when compared to existing methods. To further our understanding of two complex phenotypes, autism spectrum disorder and Alzheimer's disease, we apply bMIND to gene expression data of relevant brain tissue to identify CTS DEGs. Our results complement findings for CTS DEGs obtained from snRNA-seq studies, replicating certain DEGs in specific cell types while nominating other novel genes for those cell types. Finally, we calculate CTS eQTLs for 11 brain regions by analyzing Genotype-Tissue Expression Project data, creating a new resource for biological insights.


Subject(s)
Autism Spectrum Disorder , Single-Cell Analysis , Autism Spectrum Disorder/genetics , Bayes Theorem , Gene Expression , Gene Expression Profiling/methods , Humans , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
6.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34459489

ABSTRACT

In genome-wide association studies (GWAS), it has become commonplace to test millions of single-nucleotide polymorphisms (SNPs) for phenotypic association. Gene-based testing can improve power to detect weak signal by reducing multiple testing and pooling signal strength. While such tests account for linkage disequilibrium (LD) structure of SNP alleles within each gene, current approaches do not capture LD of SNPs falling in different nearby genes, which can induce correlation of gene-based test statistics. We introduce an algorithm to account for this correlation. When a gene's test statistic is independent of others, it is assessed separately; when test statistics for nearby genes are strongly correlated, their SNPs are agglomerated and tested as a locus. To provide insight into SNPs and genes driving association within loci, we develop an interactive visualization tool to explore localized signal. We demonstrate our approach in the context of weakly powered GWAS for autism spectrum disorder, which is contrasted to more highly powered GWAS for schizophrenia and educational attainment. To increase power for these analyses, especially those for autism, we use adaptive P-value thresholding, guided by high-dimensional metadata modeled with gradient boosted trees, highlighting when and how it can be most useful. Notably our workflow is based on summary statistics.


Subject(s)
Algorithms , Computational Biology/methods , Genetic Predisposition to Disease , Genetic Testing/standards , Genome-Wide Association Study/methods , Genome-Wide Association Study/standards , Alleles , Chromosome Mapping , Databases, Genetic , Genetic Testing/methods , Humans , Linkage Disequilibrium , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci
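
The agglomeration rule described in the abstract above (test a gene alone when its statistic is independent of its neighbours, merge genes into a locus when their statistics are strongly correlated) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `agglomerate_genes`, the use of adjacent-pair correlations only, and the 0.8 threshold are all assumptions made for illustration.

```python
from typing import List, Sequence


def agglomerate_genes(genes: Sequence[str],
                      adjacent_corr: Sequence[float],
                      threshold: float = 0.8) -> List[List[str]]:
    """Group position-ordered genes into loci (hypothetical helper).

    adjacent_corr[i] is an assumed, precomputed correlation between the
    gene-based test statistics of genes[i] and genes[i + 1].
    """
    if len(genes) == 0:
        return []
    if len(adjacent_corr) != len(genes) - 1:
        raise ValueError("need one correlation per adjacent gene pair")

    loci = [[genes[0]]]
    for gene, corr in zip(genes[1:], adjacent_corr):
        if abs(corr) >= threshold:
            # Strongly correlated with the previous gene: same locus.
            loci[-1].append(gene)
        else:
            # Effectively independent: start a new unit to be tested alone.
            loci.append([gene])
    return loci
```

Each returned group would then be tested as a single unit; how the pooled SNPs are actually tested is left to the paper.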
7.
J Neurol Neurosurg Psychiatry ; 94(8): 638-642, 2023 08.
Article in English | MEDLINE | ID: mdl-37100590

ABSTRACT

BACKGROUND: Risk for Tourette disorder, and chronic motor or vocal tic disorders (referenced here inclusively as CTD), arises from a combination of genetic and environmental factors. While multiple studies have demonstrated the importance of direct additive genetic variation for CTD risk, little is known about the role of cross-generational transmission of genetic risk, such as maternal effect, which is not transmitted via the inherited parental genomes. Here, we partition sources of variation on CTD risk into direct additive genetic effect (narrow-sense heritability) and maternal effect. METHODS: The study population consists of 2 522 677 individuals from the Swedish Medical Birth Register, who were born in Sweden between 1 January 1973 and 31 December 2000, and followed for a diagnosis of CTD through 31 December 2013. We used generalised linear mixed models to partition the liability of CTD into: direct additive genetic effect, genetic maternal effect and environmental maternal effect. RESULTS: We identified 6227 (0.2%) individuals in the birth cohort with a CTD diagnosis. A study of half-siblings showed that maternal half-siblings had twice the risk of developing a CTD compared with paternal ones. We estimated 60.7% direct additive genetic effect (95% credible interval, 58.5% to 62.4%), 4.8% genetic maternal effect (95% credible interval, 4.4% to 5.1%) and 0.5% environmental maternal effect (95% credible interval, 0.2% to 7%). CONCLUSIONS: Our results demonstrate genetic maternal effect contributes to the risk of CTD. Failure to account for maternal effect results in an incomplete understanding of the genetic risk architecture of CTD, as the risk for CTD is impacted by maternal effect which is above and beyond the risk from transmitted genetic effect.


Subject(s)
Tic Disorders , Tourette Syndrome , Humans , Tourette Syndrome/genetics , Maternal Inheritance , Tic Disorders/epidemiology , Tic Disorders/genetics , Family , Risk Factors , Sweden/epidemiology
8.
Proc Natl Acad Sci U S A ; 117(26): 15028-15035, 2020 06 30.
Article in English | MEDLINE | ID: mdl-32522875

ABSTRACT

To correct for a large number of hypothesis tests, most researchers rely on simple multiple testing corrections. Yet, new methodologies of selective inference could potentially improve power while retaining statistical guarantees, especially those that enable exploration of test statistics using auxiliary information (covariates) to weight hypothesis tests for association. We explore one such method, adaptive P-value thresholding (AdaPT), in the framework of genome-wide association studies (GWAS) and gene expression/coexpression studies, with particular emphasis on schizophrenia (SCZ). Selected SCZ GWAS association P values play the role of the primary data for AdaPT; single-nucleotide polymorphisms (SNPs) are selected because they are gene expression quantitative trait loci (eQTLs). This natural pairing of SNPs and genes allows us to map the following covariate values to these pairs: GWAS statistics from genetically correlated bipolar disorder, the effect size of SNP genotypes on gene expression, and gene-gene coexpression, captured by subnetwork (module) membership. In all, 24 covariates per SNP/gene pair were included in the AdaPT analysis using flexible gradient boosted trees. We demonstrate a substantial increase in power to detect SCZ associations using gene expression information from the developing human prefrontal cortex. We interpret these results in light of recent theories about the polygenic nature of SCZ. Importantly, our entire process for identifying enrichment and creating features with independent complementary data sources can be implemented in many different high-throughput settings to ultimately improve power.


Subject(s)
Bipolar Disorder/genetics , Schizophrenia/genetics , Algorithms , Genetic Predisposition to Disease , Genome-Wide Association Study , Genotype , Humans , Multifactorial Inheritance , Polymorphism, Single Nucleotide , Quantitative Trait Loci
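
The abstract above contrasts AdaPT with "simple multiple testing corrections" without naming one. As a point of reference only, the following is a minimal sketch of one widely used correction of that kind, the Benjamini-Hochberg step-up procedure. It is not AdaPT and uses no covariates; the function name `benjamini_hochberg` and the default `alpha` are illustrative assumptions.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return one rejection flag per p-value, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Find the largest rank k with p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

Every test is held to the same rank-based threshold here. Covariate-guided methods such as the one in the abstract instead let auxiliary information shape the threshold per hypothesis.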
9.
Mol Psychiatry ; 26(10): 5797-5811, 2021 10.
Article in English | MEDLINE | ID: mdl-34112972

ABSTRACT

Psychotic symptoms, defined as the occurrence of delusions or hallucinations, are frequent in Alzheimer disease (AD with psychosis, AD + P). AD + P affects ~50% of individuals with AD, identifies a subgroup with poor outcomes, and is associated with a greater degree of cognitive impairment and depressive symptoms, compared to subjects without psychosis (AD - P). Although the estimated heritability of AD + P is 61%, genetic sources of risk are unknown. We report a genome-wide meta-analysis of 12,317 AD subjects, 5445 AD + P. Results showed common genetic variation accounted for a significant portion of heritability. Two loci, one in ENPP6 (rs9994623, O.R. (95%CI) 1.16 (1.10, 1.22), p = 1.26 × 10⁻⁸) and one spanning the 3'-UTR of an alternatively spliced transcript of SUMF1 (rs201109606, O.R. 0.65 (0.56-0.76), p = 3.24 × 10⁻⁸), had genome-wide significant associations with AD + P. Gene-based analysis identified a significant association with APOE, due to the APOE risk haplotype ε4. AD + P demonstrated negative genetic correlations with cognitive and educational attainment and positive genetic correlation with depressive symptoms. We previously observed a negative genetic correlation with schizophrenia; instead, we now found a stronger negative correlation with the related phenotype of bipolar disorder. Analysis of polygenic risk scores supported this genetic correlation and documented a positive genetic correlation with risk variation for AD, beyond the effect of ε4. We also document a small set of SNPs likely to affect risk for AD + P and AD or schizophrenia. These findings provide the first unbiased identification of the association of psychosis in AD with common genetic variation and provide insights into its genetic architecture.


Subject(s)
Alzheimer Disease , Psychotic Disorders , Schizophrenia , Alzheimer Disease/genetics , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Hallucinations , Humans , Oxidoreductases Acting on Sulfur Group Donors , Polymorphism, Single Nucleotide/genetics , Psychotic Disorders/genetics , Schizophrenia/genetics
10.
Proc Natl Acad Sci U S A ; 116(2): 466-471, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30587579

ABSTRACT

Motivated by the dynamics of development, in which cells of recognizable types, or pure cell types, transition into other types over time, we propose a method of semisoft clustering that can classify both pure and intermediate cell types from data on gene expression from individual cells. Called semisoft clustering with pure cells (SOUP), this algorithm reveals the clustering structure for both pure cells and transitional cells with soft memberships. SOUP involves a two-step process: Identify the set of pure cells and then estimate a membership matrix. To find pure cells, SOUP uses the special block structure in the expression similarity matrix. Once pure cells are identified, they provide the key information from which the membership matrix can be computed. By modeling cells as a continuous mixture of K discrete types we obtain more parsimonious results than obtained with standard clustering algorithms. Moreover, using soft membership estimates of cell type cluster centers leads to better estimates of developmental trajectories. The strong performance of SOUP is documented via simulation studies, which show its robustness to violations of modeling assumptions. The advantages of SOUP are illustrated by analyses of two independent datasets of gene expression from a large number of cells from fetal brain.


Subject(s)
Algorithms , Cell Differentiation , Cell Proliferation , Electronic Data Processing , Models, Biological , Animals , Humans
11.
Am J Hum Genet ; 102(6): 1169-1184, 2018 06 07.
Article in English | MEDLINE | ID: mdl-29805045

ABSTRACT

Causal genes and variants within genome-wide association study (GWAS) loci can be identified by integrating GWAS statistics with expression quantitative trait loci (eQTL) and determining which variants underlie both GWAS and eQTL signals. Most analyses, however, consider only the marginal eQTL signal, rather than dissect this signal into multiple conditionally independent signals for each gene. Here we show that analyzing conditional eQTL signatures, which could be important under specific cellular or temporal contexts, leads to improved fine mapping of GWAS associations. Using genotypes and gene expression levels from post-mortem human brain samples (n = 467) reported by the CommonMind Consortium (CMC), we find that conditional eQTL are widespread; 63% of genes with primary eQTL also have conditional eQTL. In addition, genomic features associated with conditional eQTL are consistent with context-specific (e.g., tissue-, cell type-, or developmental time point-specific) regulation of gene expression. Integrating the 2014 Psychiatric Genomics Consortium schizophrenia (SCZ) GWAS and CMC primary and conditional eQTL data reveals 40 loci with strong evidence for co-localization (posterior probability > 0.8), including six loci with co-localization of conditional eQTL. Our co-localization analyses support previously reported genes, identify novel genes associated with schizophrenia risk, and provide specific hypotheses for their functional follow-up.


Subject(s)
Genome-Wide Association Study , Prefrontal Cortex/pathology , Quantitative Trait Loci/genetics , Schizophrenia/genetics , Cells, Cultured , Epigenesis, Genetic , Genome, Human , Humans
12.
Bioinformatics ; 36(3): 782-788, 2020 02 01.
Article in English | MEDLINE | ID: mdl-31400192

ABSTRACT

MOTIVATION: Patterns of gene expression, quantified at the level of tissue or cells, can inform on etiology of disease. There are now rich resources for tissue-level (bulk) gene expression data, which have been collected from thousands of subjects, and resources involving single-cell RNA-sequencing (scRNA-seq) data are expanding rapidly. The latter yields cell type information, although the data can be noisy and typically are derived from a small number of subjects. RESULTS: Complementing these approaches, we develop a method to estimate subject- and cell-type-specific (CTS) gene expression from tissue using an empirical Bayes method that borrows information across multiple measurements of the same tissue per subject (e.g. multiple regions of the brain). Analyzing expression data from multiple brain regions from the Genotype-Tissue Expression project (GTEx) reveals CTS expression, which then permits downstream analyses, such as identification of CTS expression Quantitative Trait Loci (eQTL). AVAILABILITY AND IMPLEMENTATION: We implement this method as an R package MIND, hosted on https://github.com/randel/MIND. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Expression Profiling , Software , Bayes Theorem , Sequence Analysis, RNA , Single-Cell Analysis
13.
Annu Rev Genomics Hum Genet ; 18: 167-187, 2017 08 31.
Article in English | MEDLINE | ID: mdl-28426285

ABSTRACT

The etiology of autism spectrum disorder (ASD) is complex, involving both genetic and environmental contributions to individual and population-level liability. Early researchers hypothesized that ASD arises from polygenic inheritance, but later results, such as the identification of mutations in certain genes that are responsible for syndromes associated with ASD, led others to propose that de novo mutations of major effect would account for most cases. This yin and yang of monogenic causes and polygenic inheritance continues to this day. The development of genome-wide genotyping and sequencing techniques has resulted in remarkable advances in our understanding of the genetic architecture of risk for ASD. The combined research findings provide solid evidence that ASD is a complex polygenic disorder. Rare de novo and inherited variations act within the context of a common-variant genetic load, and this load accounts for the largest portion of ASD liability.


Subject(s)
Autism Spectrum Disorder/genetics , Genetic Predisposition to Disease , Mutation , Polymorphism, Genetic , Autism Spectrum Disorder/etiology , Female , Humans , Male
14.
Hum Brain Mapp ; 41(15): 4187-4199, 2020 10 15.
Article in English | MEDLINE | ID: mdl-32652852

ABSTRACT

Pioneering studies have shown that individual correlation measures from resting-state functional magnetic resonance imaging studies can identify another scan from that same individual. This method is known as "connectotyping" or functional connectome "fingerprinting." We analyzed a unique dataset of individuals aged 12-30 years (N = 140) who had two distinct resting state scans on the same day and again 12-18 months later to assess the sensitivity and specificity of fingerprinting accuracy across different time scales (same day, ~1.5 years apart) and developmental periods (youths, adults). Sensitivity and specificity to identify one's own scan was high (average AUC = 0.94), although it was significantly higher on the same day (average AUC = 0.97) than 1.5 years later (average AUC = 0.91). Accuracy in youths (average AUC = 0.93) was not significantly different from adults (average AUC = 0.96). Multiple statistical methods revealed select connections from the Frontoparietal, Default, and Dorsal Attention networks enhanced the ability to identify an individual. Identification of these features generalized across datasets and improved fingerprinting accuracy in a longitudinal replication data set (N = 208). These results provide a framework for understanding the sensitivity and specificity of fingerprinting accuracy in adolescents and adults at multiple time scales. Importantly, distinct features of one's "fingerprint" contribute to one's uniqueness, suggesting that cognitive and default networks play a primary role in the individualization of one's connectome.


Subject(s)
Brain/physiology , Connectome , Default Mode Network/physiology , Human Development/physiology , Nerve Net/physiology , Adolescent , Adult , Brain/diagnostic imaging , Child , Connectome/standards , Default Mode Network/diagnostic imaging , Female , Humans , Individuality , Longitudinal Studies , Magnetic Resonance Imaging , Male , Nerve Net/diagnostic imaging , Sensitivity and Specificity , Young Adult
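
The fingerprinting idea in the abstract above, matching a scan to another scan from the same individual by correlation, can be sketched as follows. This shows a simple best-match identification rate, not the AUC analysis the study reports. The function name `identification_accuracy` and the layout (one row per subject, one column per connectivity feature, rows in the same subject order in both sessions) are assumptions.

```python
import numpy as np


def identification_accuracy(session_a: np.ndarray, session_b: np.ndarray) -> float:
    """Fraction of subjects whose session-A scan best matches their own session-B scan.

    Both arrays are (n_subjects, n_features); row i is assumed to belong
    to the same subject in both sessions.
    """
    n = session_a.shape[0]
    # Pearson correlation between every scan in A (rows) and every scan in B (columns).
    corr = np.corrcoef(session_a, session_b)[:n, n:]
    # For each A scan, pick the B scan with the highest correlation.
    best_match = corr.argmax(axis=1)
    return float(np.mean(best_match == np.arange(n)))
```

A usage example on synthetic data would add small independent noise to a shared per-subject pattern for each session and pass the two noisy arrays to the function.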
15.
Mol Psychiatry ; 24(11): 1685-1695, 2019 11.
Article in English | MEDLINE | ID: mdl-29740122

ABSTRACT

Transcription at enhancers is a widespread phenomenon which produces so-called enhancer RNA (eRNA) and occurs in an activity-dependent manner. However, the role of eRNA and its utility in exploring disease-associated changes in enhancer function, and the downstream coding transcripts that they regulate, is not well established. We used transcriptomic and epigenomic data to interrogate the relationship of eRNA transcription to disease status and how genetic variants alter enhancer transcriptional activity in the human brain. We combined RNA-seq data from 537 postmortem brain samples from the CommonMind Consortium with cap analysis of gene expression and enhancer identification, using the assay for transposase-accessible chromatin followed by sequencing (ATACseq). We find 118 differentially transcribed eRNAs in schizophrenia and identify schizophrenia-associated gene/eRNA co-expression modules. Perturbations of a key module are associated with the polygenic risk scores. Furthermore, we identify genetic variants affecting expression of 927 enhancers, which we refer to as enhancer expression quantitative loci or eeQTLs. Enhancer expression patterns are consistent across studies, including differentially expressed eRNAs and eeQTLs. Combining eeQTLs with a genome-wide association study of schizophrenia identifies a genetic variant that alters enhancer function and expression of its target gene, GOLPH3L. Our novel approach to analyzing enhancer transcription is adaptable to other large-scale, non-poly-A depleted, RNA-seq studies.


Subject(s)
Enhancer Elements, Genetic/genetics , Schizophrenia/genetics , Schizophrenia/metabolism , Adult , Case-Control Studies , Chromatin/genetics , Female , Gene Expression Profiling/methods , Gene Expression Regulation/genetics , Genome-Wide Association Study/methods , Humans , Male , Middle Aged , Phosphoproteins/genetics , Phosphoproteins/metabolism , Prefrontal Cortex , Promoter Regions, Genetic/genetics , Quantitative Trait Loci/genetics , RNA/genetics , RNA, Untranslated/genetics , Transcription, Genetic/genetics
16.
Soc Psychiatry Psychiatr Epidemiol ; 55(10): 1383-1393, 2020 Oct.
Article in English | MEDLINE | ID: mdl-31907560

ABSTRACT

PURPOSE: The EGOS study (Epidemiology and Genetics of Obsessive-compulsive disorder and chronic tic disorders in Sweden) is a large-scale, epidemiological, prospective cohort that is used to identify genetic and environmental risk factors in the etiology of obsessive-compulsive disorder (OCD) and chronic tic disorders (CTD). METHODS: Individuals born between January 1954 and December 1998 with at least two diagnoses of OCD or CTD at different timepoints in the National Patient Register (NPR), and followed between January 1997 and December 2012, represent the EGOS source population (n = 20,374). The Swedish Multi-Generation Registry (MGR) is then used to define family relatedness for all cases, and additional phenotypic and demographic data are added to the resultant database. To create an epidemiologically valid subset of the source cohort that also includes biospecimens and additional phenotyping, we contact cases from within the source population. To date, 6832 invitations have been sent out and 1853 (27%) have elected to participate in the EGOS biospecimen collection. RESULTS: To date, 1608 biological samples have been collected, of which 1249 are genotyped and 832 supplementary Obsessive-Compulsive Inventory-Revised (OCI-R) and/or Florida Obsessive-Compulsive Inventory (FOCI) have been completed by individuals with OCD and/or CTD, age 16-64 years. DNA samples are genotyped using Infinium Global Screening Array and will undergo whole-exome sequencing in the future. Detailed information is available for each individual through linkage to the Swedish national registers, e.g., identification of additional psychiatric diagnoses, medical diagnoses, birth-related variables, and relevant demographic and social data. CONCLUSION: EGOS benefits from a genetically homogeneous sample with epidemiological ascertainment, minimizing the risk of confounding due to population stratification or ascertainment bias. In addition, this study is built upon clinical diagnoses of OCD and CTD in specialized psychiatric care, which reduces further biases and case misclassification.


Subject(s)
Obsessive-Compulsive Disorder , Tic Disorders , Tourette Syndrome , Humans , Obsessive-Compulsive Disorder/diagnosis , Obsessive-Compulsive Disorder/epidemiology , Obsessive-Compulsive Disorder/genetics , Prospective Studies , Sweden/epidemiology , Tic Disorders/diagnosis , Tic Disorders/epidemiology , Tic Disorders/genetics
17.
Am J Hum Genet ; 98(5): 857-868, 2016 05 05.
Article in English | MEDLINE | ID: mdl-27087321

ABSTRACT

One goal of human genetics is to understand the genetic basis of disease, a challenge for diseases of complex inheritance because risk alleles are few relative to the vast set of benign variants. Risk variants are often sought by association studies in which allele frequencies in case subjects are contrasted with those from population-based samples used as control subjects. In an ideal world we would know population-level allele frequencies, releasing researchers to focus on case subjects. We argue this ideal is possible, at least theoretically, and we outline a path to achieving it in reality. If such a resource were to exist, it would yield ample savings and would facilitate the effective use of data repositories by removing administrative and technical barriers. We call this concept the Universal Control Repository Network (UNICORN), a means to perform association analyses without necessitating direct access to individual-level control data. Our approach to UNICORN uses existing genetic resources and various statistical tools to analyze these data, including hierarchical clustering with spectral analysis of ancestry; and empirical Bayesian analysis along with Gaussian spatial processes to estimate ancestry-specific allele frequencies. We demonstrate our approach using tens of thousands of control subjects from studies of Crohn disease, showing how it controls false positives, provides power similar to that achieved when all control data are directly accessible, and enhances power when control data are limiting or even imperfectly matched ancestrally. These results highlight how UNICORN can enable reliable, powerful, and convenient genetic association analyses without access to the individual-level data.


Subject(s)
Disease/genetics , Genetic Predisposition to Disease , Genetics, Population , Heredity/genetics , Bayes Theorem , Case-Control Studies , Gene Frequency , Genetic Linkage , Genotype , Humans , Polymorphism, Single Nucleotide/genetics , Software
20.
Nature ; 485(7397): 237-41, 2012 Apr 04.
Article in English | MEDLINE | ID: mdl-22495306

ABSTRACT

Multiple studies have confirmed the contribution of rare de novo copy number variations to the risk for autism spectrum disorders. But whereas de novo single nucleotide variants have been identified in affected individuals, their contribution to risk has yet to be clarified. Specifically, the frequency and distribution of these mutations have not been well characterized in matched unaffected controls, and such data are vital to the interpretation of de novo coding mutations observed in probands. Here we show, using whole-exome sequencing of 928 individuals, including 200 phenotypically discordant sibling pairs, that highly disruptive (nonsense and splice-site) de novo mutations in brain-expressed genes are associated with autism spectrum disorders and carry large effects. On the basis of mutation rates in unaffected individuals, we demonstrate that multiple independent de novo single nucleotide variants in the same gene among unrelated probands reliably identify risk alleles, providing a clear path forward for gene discovery. Among a total of 279 identified de novo coding mutations, there is a single instance in probands, and none in siblings, in which two independent nonsense variants disrupt the same gene, SCN2A (sodium channel, voltage-gated, type II, α subunit), a result that is highly unlikely by chance.


Subject(s)
Autistic Disorder/genetics , Exome/genetics , Exons/genetics , Genetic Predisposition to Disease/genetics , Mutation/genetics , Nerve Tissue Proteins/genetics , Sodium Channels/genetics , Alleles , Codon, Nonsense/genetics , Genetic Heterogeneity , Humans , NAV1.2 Voltage-Gated Sodium Channel , RNA Splice Sites/genetics , Siblings