Results 1 - 20 of 32
1.
Cell ; 187(5): 1059-1075, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38428388

ABSTRACT

Human genetics has emerged as one of the most dynamic areas of biology, with a broadening societal impact. In this review, we discuss recent achievements, ongoing efforts, and future challenges in the field. Advances in technology, statistical methods, and the growing scale of research efforts have all provided many insights into the processes that have given rise to the current patterns of genetic variation. Vast maps of genetic associations with human traits and diseases have allowed characterization of their genetic architecture. Finally, studies of molecular and cellular effects of genetic variants have provided insights into biological processes underlying disease. Many outstanding questions remain, but the field is well poised for groundbreaking discoveries as it increases the use of genetic data to understand both the history of our species and its applications to improve human health.


Subjects
Human Genetics , Humans , Genetic Variation , Multifactorial Inheritance , Phenotype
2.
bioRxiv ; 2023 Oct 16.
Article in English | MEDLINE | ID: mdl-37745605

ABSTRACT

Alternative splicing (AS) is pervasive in human genes, yet the specific function of most AS events remains unknown. It is widely assumed that the primary function of AS is to diversify the proteome; however, AS can also influence gene expression levels by producing transcripts rapidly degraded by nonsense-mediated decay (NMD). Currently, there are no precise estimates for how often the coupling of AS and NMD (AS-NMD) impacts gene expression levels because rapidly degraded NMD transcripts are challenging to capture. To better understand the impact of AS on gene expression levels, we analyzed population-scale genomic data in lymphoblastoid cell lines across eight molecular assays that capture gene regulation before, during, and after transcription and cytoplasmic decay. Sequencing nascent mRNA transcripts revealed frequent aberrant splicing of human introns, which results in remarkably high levels of mRNA transcripts subject to NMD. We estimate that ~15% of all protein-coding transcripts are degraded by NMD, and this estimate increases to nearly half of all transcripts for lowly expressed genes with many introns. Leveraging genetic variation across cell lines, we find that GWAS trait-associated loci explained by AS are similarly likely to associate with NMD-induced expression level differences as with differences in protein isoform usage. Additionally, we used the splice-switching drug risdiplam to perturb AS at hundreds of genes, finding that ~3/4 of the splicing perturbations induce NMD. Thus, we conclude that AS-NMD substantially impacts the expression levels of most human genes. Our work further suggests that much of the molecular impact of AS is mediated by changes in protein expression levels rather than diversification of the proteome.

3.
Nat Genet ; 55(3): 461-470, 2023 03.
Article in English | MEDLINE | ID: mdl-36797366

ABSTRACT

Obesity-associated morbidity is exacerbated by abdominal obesity, which can be measured as the waist-to-hip ratio adjusted for the body mass index (WHRadjBMI). Here we identify genes associated with obesity and WHRadjBMI and characterize allele-sensitive enhancers that are predicted to regulate WHRadjBMI genes in women. We found that several waist-to-hip ratio-associated variants map within primate-specific Alu retrotransposons harboring a DNA motif associated with adipocyte differentiation. This suggests that a genetic component of adipose distribution in humans may involve co-option of retrotransposons as adipose enhancers. We evaluated the role of the strongest female WHRadjBMI-associated gene, SNX10, in adipose biology. We determined that it is required for human adipocyte differentiation and function and participates in diet-induced adipose expansion in female mice, but not males. Our data identify genes and regulatory mechanisms that underlie female-specific adipose distribution and mediate metabolic dysfunction in women.


Subjects
Obesity , Retroelements , Humans , Female , Animals , Mice , Obesity/genetics , Obesity/metabolism , Adiposity/genetics , Body Mass Index , Waist-Hip Ratio , Adipose Tissue/metabolism , Sorting Nexins/genetics , Sorting Nexins/metabolism
4.
Mol Cell ; 82(24): 4681-4699.e8, 2022 12 15.
Article in English | MEDLINE | ID: mdl-36435176

ABSTRACT

Long introns with short exons in vertebrate genes are thought to require spliceosome assembly across exons (exon definition), rather than introns, thereby requiring transcription of an exon to splice an upstream intron. Here, we developed CoLa-seq (co-transcriptional lariat sequencing) to investigate the timing and determinants of co-transcriptional splicing genome wide. Unexpectedly, 90% of all introns, including long introns, can splice before transcription of a downstream exon, indicating that exon definition is not obligatory for most human introns. Still, splicing timing varies dramatically across introns, and various genetic elements determine this variation. Strong U2AF2 binding to the polypyrimidine tract predicts early splicing, explaining exon definition-independent splicing. Together, our findings question the essentiality of exon definition and reveal features beyond intron and exon length that are determinative for splicing timing.


Subjects
Alternative Splicing , RNA Splicing , Humans , Base Sequence , Introns/genetics , Exons/genetics
5.
Nature ; 608(7923): 569-577, 2022 08.
Article in English | MEDLINE | ID: mdl-35922514

ABSTRACT

A major challenge in human genetics is to identify the molecular mechanisms of trait-associated and disease-associated variants. To achieve this, quantitative trait locus (QTL) mapping of genetic variants with intermediate molecular phenotypes such as gene expression and splicing has been widely adopted [1,2]. However, despite successes, the molecular basis for a considerable fraction of trait-associated and disease-associated variants remains unclear [3,4]. Here we show that ADAR-mediated adenosine-to-inosine RNA editing, a post-transcriptional event vital for suppressing cellular double-stranded RNA (dsRNA)-mediated innate immune interferon responses [5-11], is an important potential mechanism underlying genetic variants associated with common inflammatory diseases. We identified and characterized 30,319 cis-RNA editing QTLs (edQTLs) across 49 human tissues. These edQTLs were significantly enriched in genome-wide association study signals for autoimmune and immune-mediated diseases. Colocalization analysis of edQTLs with disease risk loci further pinpointed key, putatively immunogenic dsRNAs formed by expected inverted repeat Alu elements as well as unexpected, highly over-represented cis-natural antisense transcripts. Furthermore, inflammatory disease risk variants, in aggregate, were associated with reduced editing of nearby dsRNAs and induced interferon responses in inflammatory diseases. This unique directional effect agrees with the established mechanism that lack of RNA editing by ADAR1 leads to the specific activation of the dsRNA sensor MDA5 and subsequent interferon responses and inflammation [7-9]. Our findings implicate cellular dsRNA editing and sensing as a previously underappreciated mechanism of common inflammatory diseases.


Subjects
Adenosine Deaminase , Genetic Predisposition to Disease , Immune System Diseases , Inflammation , RNA Editing , Double-Stranded RNA , Adenosine/metabolism , Adenosine Deaminase/genetics , Adenosine Deaminase/metabolism , Alu Elements/genetics , Autoimmune Diseases/genetics , Autoimmune Diseases/immunology , Autoimmune Diseases/pathology , Genome-Wide Association Study , Humans , Immune System Diseases/genetics , Immune System Diseases/immunology , Immune System Diseases/pathology , Innate Immunity , Inflammation/genetics , Inflammation/immunology , Inflammation/pathology , Inosine/metabolism , Interferon-Induced Helicase IFIH1/metabolism , Interferons/genetics , Interferons/immunology , Quantitative Trait Loci/genetics , RNA Editing/genetics , Double-Stranded RNA/genetics , RNA-Binding Proteins/metabolism
7.
Genome Biol ; 23(1): 103, 2022 04 21.
Article in English | MEDLINE | ID: mdl-35449021

ABSTRACT

Recent progress in deep learning has greatly improved the prediction of RNA splicing from DNA sequence. Here, we present Pangolin, a deep learning model to predict splice site strength in multiple tissues. Pangolin outperforms state-of-the-art methods for predicting RNA splicing on a variety of prediction tasks. Pangolin improves prediction of the impact of genetic variants on RNA splicing, including common, rare, and lineage-specific genetic variation. In addition, Pangolin identifies loss-of-function mutations with high accuracy and recall, particularly for mutations that are not missense or nonsense, demonstrating remarkable potential for identifying pathogenic variants.


Subjects
Pangolins , RNA Splicing , Animals , Base Sequence , Mutation , RNA Splice Sites
8.
Genome Biol ; 22(1): 291, 2021 10 14.
Article in English | MEDLINE | ID: mdl-34649612

ABSTRACT

BACKGROUND: Alternative cleavage and polyadenylation (APA), an RNA processing event, occurs in over 70% of human protein-coding genes. APA results in mRNA transcripts with distinct 3' ends. Most APA occurs within 3' UTRs, which harbor regulatory elements that can impact mRNA stability, translation, and localization. RESULTS: APA can be profiled using a number of established computational tools that infer polyadenylation sites from standard, short-read RNA-seq datasets. Here, we benchmarked a number of such tools (TAPAS, QAPA, DaPars2, GETUTR, and APATrap) against 3'-Seq, a specialized RNA-seq protocol that enriches for reads at the 3' ends of genes, and Iso-Seq, a Pacific Biosciences (PacBio) single-molecule full-length RNA-seq method, in their ability to identify polyadenylation sites and quantify polyadenylation site usage. We demonstrate that 3'-Seq and Iso-Seq are able to identify and quantify the usage of polyadenylation sites more reliably than computational tools that take short-read RNA-seq as input. However, we find that running one such tool, QAPA, with a set of polyadenylation site annotations derived from small quantities of 3'-Seq or Iso-Seq can reliably quantify variation in APA across conditions, such as across genotypes, as demonstrated by the successful mapping of alternative polyadenylation quantitative trait loci (apaQTL). CONCLUSIONS: We envisage that our analyses will shed light on the advantages of studying APA with more specialized sequencing protocols, such as 3'-Seq or Iso-Seq, and the limitations of studying APA with short-read RNA-seq. We provide a computational pipeline to aid in the identification of polyadenylation sites and quantification of polyadenylation site usages using Iso-Seq data as input.


Subjects
Polyadenylation , RNA-Seq , Software , Benchmarking , Cell Line , Human Genome , Humans
9.
Genome Biol ; 22(1): 122, 2021 04 29.
Article in English | MEDLINE | ID: mdl-33926512

ABSTRACT

BACKGROUND: The vast majority of trait-associated variants identified using genome-wide association studies (GWAS) are noncoding, and therefore assumed to impact gene regulation. However, the majority of trait-associated loci are unexplained by regulatory quantitative trait loci (QTLs). RESULTS: We perform a comprehensive characterization of the putative mechanisms by which GWAS loci impact human immune traits. By harmonizing four major immune QTL studies, we identify 26,271 expression QTLs (eQTLs) and 23,121 splicing QTLs (sQTLs) spanning 18 immune cell types. Our colocalization analyses between QTLs and trait-associated loci from 72 GWAS reveal that genetic effects on RNA expression and splicing in immune cells colocalize with 40.4% of GWAS loci for immune-related traits, in many cases increasing the fraction of colocalized loci by twofold compared to previous studies. Notably, we find that the largest contributors of this increase are splicing QTLs, which colocalize on average with 14% of all GWAS loci that do not colocalize with eQTLs. By contrast, we find that cell type-specific eQTLs, and eQTLs with small effect sizes contribute very few new colocalizations. To investigate the 60% of GWAS loci that remain unexplained, we collect H3K27ac CUT&Tag data from rheumatoid arthritis and healthy controls, and find large-scale differences between immune cells from the different disease contexts, including at regions overlapping unexplained GWAS loci. CONCLUSION: Altogether, our work supports RNA splicing as an important mediator of genetic effects on immune traits, and suggests that we must expand our study of regulatory processes in disease contexts to improve functional interpretation of as yet unexplained GWAS loci.


Subjects
Gene Expression Regulation , Genetic Association Studies , Genetic Variation , Immunity/genetics , Quantitative Trait Loci , Heritable Quantitative Trait , Rheumatoid Arthritis/etiology , Rheumatoid Arthritis/metabolism , Rheumatoid Arthritis/pathology , Chromosome Mapping , Nucleic Acid Databases , Disease Susceptibility , Gene Expression Profiling , Genetic Association Studies/methods , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Histones/metabolism , Humans , Immunomodulation/genetics , Organ Specificity , Transcriptome
10.
Genome Res ; 31(4): 698-712, 2021 04.
Article in English | MEDLINE | ID: mdl-33741686

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) technology is poised to replace bulk cell RNA sequencing for many biological and medical applications as it allows users to measure gene expression levels in a cell type-specific manner. However, data produced by scRNA-seq often exhibit batch effects that can be specific to a cell type, to a sample, or to an experiment, which prevent integration or comparisons across multiple experiments. Here, we present Dmatch, a method that leverages an external expression atlas of human primary cells and kernel density matching to align multiple scRNA-seq experiments for downstream biological analysis. Dmatch facilitates alignment of scRNA-seq data sets with cell types that may overlap only partially and thus allows integration of multiple distinct scRNA-seq experiments to extract biological insights. In simulation, Dmatch compares favorably to other alignment methods, both in terms of reducing sample-specific clustering and in terms of avoiding overcorrection. When applied to scRNA-seq data collected from clinical samples in a healthy individual and five autoimmune disease patients, Dmatch enabled cell type-specific differential gene expression comparisons across biopsy sites and disease conditions and uncovered a shared population of pro-inflammatory monocytes across biopsy sites in RA patients. We further show that Dmatch increases the number of eQTLs mapped from population scRNA-seq data. Dmatch is fast, scalable, and improves the utility of scRNA-seq for several important applications. Dmatch is freely available online.


Subjects
RNA-Seq/methods , Single-Cell Analysis/methods , Cluster Analysis , Gene Expression Profiling , Humans
11.
Bioinformatics ; 36(17): 4609-4615, 2020 11 01.
Article in English | MEDLINE | ID: mdl-32315392

ABSTRACT

MOTIVATION: Next-generation sequencing is rapidly improving diagnostic rates in rare Mendelian diseases, but even with whole genome or whole exome sequencing, the majority of cases remain unsolved. Increasingly, RNA sequencing is being used to solve many cases that evade diagnosis through sequencing alone. Specifically, the detection of aberrant splicing in many rare disease patients suggests that identifying RNA splicing outliers is particularly useful for determining causal Mendelian disease genes. However, there is as yet a paucity of statistical methodologies to detect splicing outliers. RESULTS: We developed LeafCutterMD, a new statistical framework that significantly improves the previously published LeafCutter in the context of detecting outlier splicing events. Through simulations and analysis of real patient data, we demonstrate that LeafCutterMD has better power than the state-of-the-art methodology while controlling false-positive rates. When applied to a cohort of disease-affected probands from the Mayo Clinic Center for Individualized Medicine, LeafCutterMD recovered all aberrantly spliced genes that had previously been identified by manual curation efforts. AVAILABILITY AND IMPLEMENTATION: The source code for this method is available under the open-source Apache 2.0 license in the latest release of the LeafCutter software package available online at http://davidaknowles.github.io/leafcutter. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome , Rare Diseases , Algorithms , High-Throughput Nucleotide Sequencing , Humans , RNA Splicing , Rare Diseases/diagnosis , Rare Diseases/genetics , RNA Sequence Analysis , Software
12.
Quant Biol ; 8(1): 78-94, 2020 Mar.
Article in English | MEDLINE | ID: mdl-32274259

ABSTRACT

BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) is a rapidly evolving technology that enables measurement of gene expression levels at an unprecedented resolution. Despite the explosive growth in the number of cells that can be assayed by a single experiment, scRNA-seq still has several limitations, including high rates of dropouts, which result in a large number of genes having zero read count in the scRNA-seq data, and complicate downstream analyses. METHODS: To overcome this problem, we treat zeros as missing values and develop nonparametric deep learning methods for imputation. Specifically, our LATE (Learning with AuToEncoder) method trains an autoencoder with random initial values of the parameters, whereas our TRANSLATE (TRANSfer learning with LATE) method further allows for the use of a reference gene expression data set to provide LATE with an initial set of parameter estimates. RESULTS: On both simulated and real data, LATE and TRANSLATE outperform existing scRNA-seq imputation methods, achieving lower mean squared error in most cases, recovering nonlinear gene-gene relationships, and better separating cell types. They are also highly scalable and can efficiently process over 1 million cells in just a few hours on a GPU. CONCLUSIONS: We demonstrate that our nonparametric approach to imputation based on autoencoders is powerful and highly efficient.
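
The idea of treating zeros as missing values can be illustrated with a reconstruction loss that is evaluated only on nonzero entries. The sketch below is not the authors' implementation: the function name `masked_mse` and the NumPy formulation are assumptions made purely for illustration, and the abstract does not specify the exact loss used by LATE or TRANSLATE.

```python
import numpy as np

def masked_mse(observed, reconstructed):
    """Mean squared error computed over the nonzero entries of `observed` only.

    Hypothetical helper for illustration: zeros in `observed` are treated as
    missing values, so they contribute nothing to the loss.
    """
    observed = np.asarray(observed, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    mask = observed != 0  # True where a count was actually measured
    if not mask.any():
        return 0.0  # nothing observed, so there is no error to measure
    diff = (observed - reconstructed)[mask]
    return float(np.mean(diff ** 2))
```

Under such a loss, an autoencoder is free to predict nonzero values at dropout positions, and those predictions serve as the imputed expression levels.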

13.
Neuron ; 105(2): 293-309.e5, 2020 01 22.
Article in English | MEDLINE | ID: mdl-31901304

ABSTRACT

The molecular mechanisms that govern the maturation of oligodendrocyte lineage cells remain unclear. Emerging studies have shown that N6-methyladenosine (m6A), the most common internal RNA modification of mammalian mRNA, plays a critical role in various developmental processes. Here, we demonstrate that oligodendrocyte lineage progression is accompanied by dynamic changes in m6A modification on numerous transcripts. In vivo conditional inactivation of an essential m6A writer component, METTL14, results in decreased oligodendrocyte numbers and CNS hypomyelination, although oligodendrocyte precursor cell (OPC) numbers are normal. In vitro Mettl14 ablation disrupts postmitotic oligodendrocyte maturation and has distinct effects on OPC and oligodendrocyte transcriptomes. Moreover, the loss of Mettl14 in oligodendrocyte lineage cells causes aberrant splicing of myriad RNA transcripts, including those that encode the essential paranodal component neurofascin 155 (NF155). Together, our findings indicate that dynamic RNA methylation plays an important regulatory role in oligodendrocyte development and CNS myelination.


Subjects
Adenosine/analogs & derivatives , Cell Differentiation/physiology , Methyltransferases/physiology , Myelin Sheath/physiology , Oligodendroglia/cytology , Oligodendroglia/physiology , Messenger RNA/metabolism , Adenosine/metabolism , Animals , Cell Adhesion Molecules/metabolism , Cell Count , Cell Lineage , Cultured Cells , Female , Male , Methylation , Methyltransferases/genetics , Methyltransferases/metabolism , Mice , Transgenic Mice , Nerve Growth Factors/metabolism , Oligodendrocyte Precursor Cells/physiology
14.
Methods Mol Biol ; 2082: 51-62, 2020.
Article in English | MEDLINE | ID: mdl-31849007

ABSTRACT

Most complex traits, including diseases, have a large genetic component. Identifying the genetic variants and genes underlying phenotypic variation remains one of the most important objectives of current biomedical research. Unlike Mendelian or familial diseases, which are usually caused by mutations in the coding regions of individual genes, complex diseases are thought to result from the cumulative effects of a large number of variants, the vast majority of which are noncoding. Therefore, to discern the genetic underpinnings of a complex trait, we must first understand the impact of noncoding variation, which presumably affects gene regulation. In this chapter, we outline the recent progress made and methods used to discover putative regulatory regions associated with complex traits. We will specifically focus on mapping splicing quantitative trait loci (sQTL) using Yoruba samples from GEUVADIS as a motivating example.


Subjects
Chromosome Mapping , Quantitative Trait Loci , RNA Splicing , Alternative Splicing , Computational Biology/methods , Gene Expression , Genetic Variation , Humans , Multifactorial Inheritance , Heritable Quantitative Trait , Software
15.
Cell ; 177(4): 1022-1034.e6, 2019 05 02.
Article in English | MEDLINE | ID: mdl-31051098

ABSTRACT

Early genome-wide association studies (GWASs) led to the surprising discovery that, for typical complex traits, most of the heritability is due to huge numbers of common variants with tiny effect sizes. Previously, we argued that new models are needed to understand these patterns. Here, we provide a formal model in which genetic contributions to complex traits are partitioned into direct effects from core genes and indirect effects from peripheral genes acting in trans. We propose that most heritability is driven by weak trans-eQTL SNPs, whose effects are mediated through peripheral genes to impact the expression of core genes. In particular, if the core genes for a trait tend to be co-regulated, then the effects of peripheral variation can be amplified such that nearly all of the genetic variance is driven by weak trans effects. Thus, our model proposes a framework for understanding key features of the architecture of complex traits.


Subjects
Gene Expression Regulation/genetics , Heredity/genetics , Multifactorial Inheritance/genetics , Genetic Databases , Gene Expression/genetics , Gene Expression Profiling/methods , Genetic Variation/genetics , Genome-Wide Association Study , Humans , Theoretical Models , Phenotype , Genetic Polymorphism/genetics , Quantitative Trait Loci/genetics
16.
PLoS Genet ; 15(4): e1008045, 2019 04.
Article in English | MEDLINE | ID: mdl-31002671

ABSTRACT

Quantification of gene expression levels at the single cell level has revealed that gene expression can vary substantially even across a population of homogeneous cells. However, it is currently unclear what genomic features control variation in gene expression levels, and whether common genetic variants may impact gene expression variation. Here, we take a genome-wide approach to identify expression variance quantitative trait loci (vQTLs). To this end, we generated single cell RNA-seq (scRNA-seq) data from induced pluripotent stem cells (iPSCs) derived from 53 Yoruba individuals. We collected data for a median of 95 cells per individual and a total of 5,447 single cells, and identified 235 mean expression QTLs (eQTLs) at 10% FDR, of which 79% replicate in bulk RNA-seq data from the same individuals. We further identified 5 vQTLs at 10% FDR, but demonstrate that these can also be explained as effects on mean expression. Our study suggests that dispersion QTLs (dQTLs) which could alter the variance of expression independently of the mean can have larger fold changes, but explain less phenotypic variance than eQTLs. We estimate 4,015 individuals as a lower bound to achieve 80% power to detect the strongest dQTLs in iPSCs. These results will guide the design of future studies on understanding the genetic control of gene expression variance.


Subjects
Induced Pluripotent Stem Cells/metabolism , Quantitative Trait Loci , Black People/genetics , Cell Line , Computer Simulation , Gene Expression Profiling , Genetic Variation , Genome-Wide Association Study , Humans , Genetic Models , Nigeria , Phenotype , RNA Sequence Analysis , Single-Cell Analysis
17.
Nat Commun ; 10(1): 994, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30824768

ABSTRACT

Genome-wide association studies (GWAS) have identified over 41 susceptibility loci associated with Parkinson's Disease (PD) but identifying putative causal genes and the underlying mechanisms remains challenging. Here, we leverage large-scale transcriptomic datasets to prioritize genes that are likely to affect PD by using a transcriptome-wide association study (TWAS) approach. Using this approach, we identify 66 gene associations whose predicted expression or splicing levels in dorsolateral prefrontal cortex (DLPFC) and peripheral monocytes are significantly associated with PD risk. We uncover many novel genes associated with PD but also novel mechanisms for known associations such as MAPT, for which we find that variation in exon 3 splicing explains the common genetic association. Genes identified in our analyses belong to the same or related pathways including lysosomal and innate immune function. Overall, our study provides a strong foundation for further mechanistic studies that will elucidate the molecular drivers of PD.


Subjects
Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Parkinson Disease/genetics , Transcriptome/genetics , Genetic Databases , Gene Expression Regulation , Genotype , Humans , Innate Immunity , Parkinson Disease/immunology , Single Nucleotide Polymorphism , Risk Factors
18.
Cell ; 176(3): 535-548.e24, 2019 01 24.
Article in English | MEDLINE | ID: mdl-30661751

ABSTRACT

The splicing of pre-mRNAs into mature transcripts is remarkable for its precision, but the mechanisms by which the cellular machinery achieves such specificity are incompletely understood. Here, we describe a deep neural network that accurately predicts splice junctions from an arbitrary pre-mRNA transcript sequence, enabling precise prediction of noncoding genetic variants that cause cryptic splicing. Synonymous and intronic mutations with predicted splice-altering consequence validate at a high rate on RNA-seq and are strongly deleterious in the human population. De novo mutations with predicted splice-altering consequence are significantly enriched in patients with autism and intellectual disability compared to healthy controls and validate against RNA-seq in 21 out of 28 of these patients. We estimate that 9%-11% of pathogenic mutations in patients with rare genetic disorders are caused by this previously underappreciated class of disease variation.


Subjects
Forecasting/methods , RNA Precursors/genetics , RNA Splicing/genetics , Algorithms , Alternative Splicing/genetics , Autistic Disorder/genetics , Deep Learning , Exons/genetics , Humans , Intellectual Disability/genetics , Introns/genetics , Computer Neural Networks , RNA Precursors/metabolism , RNA Splice Sites/genetics , RNA Splice Sites/physiology
19.
Cell ; 176(3): 663-675.e19, 2019 01 24.
Article in English | MEDLINE | ID: mdl-30661756

ABSTRACT

In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number of tandem repeats) mapping to this portion of the genome. We identify SVs affecting coding and noncoding regulatory loci improving annotation and interpretation of functional variation. These data provide the framework to construct a canonical human reference and a resource for developing advanced representations capable of capturing allelic diversity.


Subjects
Gene Frequency/genetics , Human Genome/genetics , Genomic Structural Variation/genetics , Alleles , Euchromatin/genetics , Genomics/methods , Humans , Minisatellite Repeats/genetics , DNA Sequence Analysis/methods
20.
Nat Genet ; 50(11): 1584-1592, 2018 11.
Article in English | MEDLINE | ID: mdl-30297968

ABSTRACT

Here we use deep sequencing to identify sources of variation in mRNA splicing in the dorsolateral prefrontal cortex (DLPFC) of 450 subjects from two aging cohorts. Hundreds of aberrant pre-mRNA splicing events are reproducibly associated with Alzheimer's disease. We also generate a catalog of splicing quantitative trait loci (sQTL) effects: splicing of 3,006 genes is influenced by genetic variation. We report that altered splicing is the mechanism for the effects of the PICALM, CLU and PTK2B susceptibility alleles. Furthermore, we performed a transcriptome-wide association study and identified 21 genes with significant associations with Alzheimer's disease, many of which are found in known loci, whereas 8 are in novel loci. These results highlight the convergence of old and new genes associated with Alzheimer's disease in autophagy-lysosomal-related pathways. Overall, this study of the transcriptome of the aging brain provides evidence that dysregulation of mRNA splicing is a feature of Alzheimer's disease and is, in some cases, genetically driven.


Subjects
Aging/genetics , Alternative Splicing/genetics , Alzheimer Disease/genetics , Brain/metabolism , Gene Expression Profiling/methods , Aged , Aged 80 and over , Aging/metabolism , Alzheimer Disease/metabolism , Alzheimer Disease/pathology , Chromosome Mapping/methods , Cohort Studies , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Male , Quantitative Trait Loci/genetics , RNA Splicing/genetics , Systems Biology/methods , Systems Integration , Transcriptome/genetics