Results 1 - 20 of 32
1.
Cell ; 187(5): 1059-1075, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38428388

ABSTRACT

Human genetics has emerged as one of the most dynamic areas of biology, with a broadening societal impact. In this review, we discuss recent achievements, ongoing efforts, and future challenges in the field. Advances in technology, statistical methods, and the growing scale of research efforts have all provided many insights into the processes that have given rise to the current patterns of genetic variation. Vast maps of genetic associations with human traits and diseases have allowed characterization of their genetic architecture. Finally, studies of molecular and cellular effects of genetic variants have provided insights into biological processes underlying disease. Many outstanding questions remain, but the field is well poised for groundbreaking discoveries as it increases the use of genetic data to understand both the history of our species and its applications to improve human health.


Subject(s)
Human Genetics, Humans, Genetic Variation, Multifactorial Inheritance, Phenotype
2.
Cell ; 177(4): 1022-1034.e6, 2019 05 02.
Article in English | MEDLINE | ID: mdl-31051098

ABSTRACT

Early genome-wide association studies (GWASs) led to the surprising discovery that, for typical complex traits, most of the heritability is due to huge numbers of common variants with tiny effect sizes. Previously, we argued that new models are needed to understand these patterns. Here, we provide a formal model in which genetic contributions to complex traits are partitioned into direct effects from core genes and indirect effects from peripheral genes acting in trans. We propose that most heritability is driven by weak trans-eQTL SNPs, whose effects are mediated through peripheral genes to impact the expression of core genes. In particular, if the core genes for a trait tend to be co-regulated, then the effects of peripheral variation can be amplified such that nearly all of the genetic variance is driven by weak trans effects. Thus, our model proposes a framework for understanding key features of the architecture of complex traits.


Subject(s)
Gene Expression Regulation/genetics, Heredity/genetics, Multifactorial Inheritance/genetics, Genetic Databases, Gene Expression/genetics, Gene Expression Profiling/methods, Genetic Variation/genetics, Genome-Wide Association Study, Humans, Theoretical Models, Phenotype, Genetic Polymorphism/genetics, Quantitative Trait Loci/genetics
3.
Cell ; 176(3): 535-548.e24, 2019 01 24.
Article in English | MEDLINE | ID: mdl-30661751

ABSTRACT

The splicing of pre-mRNAs into mature transcripts is remarkable for its precision, but the mechanisms by which the cellular machinery achieves such specificity are incompletely understood. Here, we describe a deep neural network that accurately predicts splice junctions from an arbitrary pre-mRNA transcript sequence, enabling precise prediction of noncoding genetic variants that cause cryptic splicing. Synonymous and intronic mutations with predicted splice-altering consequence validate at a high rate on RNA-seq and are strongly deleterious in the human population. De novo mutations with predicted splice-altering consequence are significantly enriched in patients with autism and intellectual disability compared to healthy controls and validate against RNA-seq in 21 out of 28 of these patients. We estimate that 9%-11% of pathogenic mutations in patients with rare genetic disorders are caused by this previously underappreciated class of disease variation.


Subject(s)
Forecasting/methods, RNA Precursors/genetics, RNA Splicing/genetics, Algorithms, Alternative Splicing/genetics, Autistic Disorder/genetics, Deep Learning, Exons/genetics, Humans, Intellectual Disability/genetics, Introns/genetics, Computer Neural Networks, RNA Precursors/metabolism, RNA Splice Sites/genetics, RNA Splice Sites/physiology
4.
Cell ; 176(3): 663-675.e19, 2019 01 24.
Article in English | MEDLINE | ID: mdl-30661756

ABSTRACT

In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number of tandem repeats) mapping to this portion of the genome. We identify SVs affecting coding and noncoding regulatory loci improving annotation and interpretation of functional variation. These data provide the framework to construct a canonical human reference and a resource for developing advanced representations capable of capturing allelic diversity.


Subject(s)
Gene Frequency/genetics, Human Genome/genetics, Genomic Structural Variation/genetics, Alleles, Euchromatin/genetics, Genomics/methods, Humans, Minisatellite Repeats/genetics, DNA Sequence Analysis/methods
5.
Cell ; 169(7): 1177-1186, 2017 Jun 15.
Article in English | MEDLINE | ID: mdl-28622505

ABSTRACT

A central goal of genetics is to understand the links between genetic variation and disease. Intuitively, one might expect disease-causing variants to cluster into key pathways that drive disease etiology. But for complex traits, association signals tend to be spread across most of the genome, including near many genes without an obvious connection to disease. We propose that gene regulatory networks are sufficiently interconnected such that all genes expressed in disease-relevant cells are liable to affect the functions of core disease-related genes and that most heritability can be explained by effects on genes outside core pathways. We refer to this hypothesis as an "omnigenic" model.


Subject(s)
Disease/genetics, Multifactorial Inheritance, Animals, Inborn Genetic Diseases/genetics, Genome-Wide Association Study, Genomics, Humans, Single Nucleotide Polymorphism
6.
Mol Cell ; 82(24): 4681-4699.e8, 2022 12 15.
Article in English | MEDLINE | ID: mdl-36435176

ABSTRACT

Long introns with short exons in vertebrate genes are thought to require spliceosome assembly across exons (exon definition), rather than introns, thereby requiring transcription of an exon to splice an upstream intron. Here, we developed CoLa-seq (co-transcriptional lariat sequencing) to investigate the timing and determinants of co-transcriptional splicing genome wide. Unexpectedly, 90% of all introns, including long introns, can splice before transcription of a downstream exon, indicating that exon definition is not obligatory for most human introns. Still, splicing timing varies dramatically across introns, and various genetic elements determine this variation. Strong U2AF2 binding to the polypyrimidine tract predicts early splicing, explaining exon definition-independent splicing. Together, our findings question the essentiality of exon definition and reveal features beyond intron and exon length that are determinative for splicing timing.


Subject(s)
Alternative Splicing, RNA Splicing, Humans, Base Sequence, Introns/genetics, Exons/genetics
7.
Nature ; 608(7923): 569-577, 2022 08.
Article in English | MEDLINE | ID: mdl-35922514

ABSTRACT

A major challenge in human genetics is to identify the molecular mechanisms of trait-associated and disease-associated variants. To achieve this, quantitative trait locus (QTL) mapping of genetic variants with intermediate molecular phenotypes such as gene expression and splicing has been widely adopted [1,2]. However, despite successes, the molecular basis for a considerable fraction of trait-associated and disease-associated variants remains unclear [3,4]. Here we show that ADAR-mediated adenosine-to-inosine RNA editing, a post-transcriptional event vital for suppressing cellular double-stranded RNA (dsRNA)-mediated innate immune interferon responses [5-11], is an important potential mechanism underlying genetic variants associated with common inflammatory diseases. We identified and characterized 30,319 cis-RNA editing QTLs (edQTLs) across 49 human tissues. These edQTLs were significantly enriched in genome-wide association study signals for autoimmune and immune-mediated diseases. Colocalization analysis of edQTLs with disease risk loci further pinpointed key, putatively immunogenic dsRNAs formed by expected inverted repeat Alu elements as well as unexpected, highly over-represented cis-natural antisense transcripts. Furthermore, inflammatory disease risk variants, in aggregate, were associated with reduced editing of nearby dsRNAs and induced interferon responses in inflammatory diseases. This unique directional effect agrees with the established mechanism that lack of RNA editing by ADAR1 leads to the specific activation of the dsRNA sensor MDA5 and subsequent interferon responses and inflammation [7-9]. Our findings implicate cellular dsRNA editing and sensing as a previously underappreciated mechanism of common inflammatory diseases.


Subject(s)
Adenosine Deaminase, Genetic Predisposition to Disease, Immune System Diseases, Inflammation, RNA Editing, Double-Stranded RNA, Adenosine/metabolism, Adenosine Deaminase/genetics, Adenosine Deaminase/metabolism, Alu Elements/genetics, Autoimmune Diseases/genetics, Autoimmune Diseases/immunology, Autoimmune Diseases/pathology, Genome-Wide Association Study, Humans, Immune System Diseases/genetics, Immune System Diseases/immunology, Immune System Diseases/pathology, Innate Immunity, Inflammation/genetics, Inflammation/immunology, Inflammation/pathology, Inosine/metabolism, Interferon-Induced Helicase IFIH1/metabolism, Interferons/genetics, Interferons/immunology, Quantitative Trait Loci/genetics, RNA Editing/genetics, Double-Stranded RNA/genetics, RNA-Binding Proteins/metabolism
8.
Genome Res ; 31(4): 698-712, 2021 04.
Article in English | MEDLINE | ID: mdl-33741686

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) technology is poised to replace bulk cell RNA sequencing for many biological and medical applications as it allows users to measure gene expression levels in a cell type-specific manner. However, data produced by scRNA-seq often exhibit batch effects that can be specific to a cell type, to a sample, or to an experiment, which prevent integration or comparisons across multiple experiments. Here, we present Dmatch, a method that leverages an external expression atlas of human primary cells and kernel density matching to align multiple scRNA-seq experiments for downstream biological analysis. Dmatch facilitates alignment of scRNA-seq data sets with cell types that may overlap only partially and thus allows integration of multiple distinct scRNA-seq experiments to extract biological insights. In simulation, Dmatch compares favorably to other alignment methods, both in terms of reducing sample-specific clustering and in terms of avoiding overcorrection. When applied to scRNA-seq data collected from clinical samples in a healthy individual and five autoimmune disease patients, Dmatch enabled cell type-specific differential gene expression comparisons across biopsy sites and disease conditions and uncovered a shared population of pro-inflammatory monocytes across biopsy sites in RA patients. We further show that Dmatch increases the number of eQTLs mapped from population scRNA-seq data. Dmatch is fast, scalable, and improves the utility of scRNA-seq for several important applications. Dmatch is freely available online.


Subject(s)
RNA-Seq/methods, Single-Cell Analysis/methods, Cluster Analysis, Gene Expression Profiling, Humans
9.
PLoS Genet ; 15(4): e1008045, 2019 04.
Article in English | MEDLINE | ID: mdl-31002671

ABSTRACT

Quantification of gene expression levels at the single cell level has revealed that gene expression can vary substantially even across a population of homogeneous cells. However, it is currently unclear what genomic features control variation in gene expression levels, and whether common genetic variants may impact gene expression variation. Here, we take a genome-wide approach to identify expression variance quantitative trait loci (vQTLs). To this end, we generated single cell RNA-seq (scRNA-seq) data from induced pluripotent stem cells (iPSCs) derived from 53 Yoruba individuals. We collected data for a median of 95 cells per individual and a total of 5,447 single cells, and identified 235 mean expression QTLs (eQTLs) at 10% FDR, of which 79% replicate in bulk RNA-seq data from the same individuals. We further identified 5 vQTLs at 10% FDR, but demonstrate that these can also be explained as effects on mean expression. Our study suggests that dispersion QTLs (dQTLs) which could alter the variance of expression independently of the mean can have larger fold changes, but explain less phenotypic variance than eQTLs. We estimate 4,015 individuals as a lower bound to achieve 80% power to detect the strongest dQTLs in iPSCs. These results will guide the design of future studies on understanding the genetic control of gene expression variance.


Subject(s)
Induced Pluripotent Stem Cells/metabolism, Quantitative Trait Loci, Black People/genetics, Cell Line, Computer Simulation, Gene Expression Profiling, Genetic Variation, Genome-Wide Association Study, Humans, Genetic Models, Nigeria, Phenotype, RNA Sequence Analysis, Single-Cell Analysis
10.
Genome Res ; 28(1): 122-131, 2018 01.
Article in English | MEDLINE | ID: mdl-29208628

ABSTRACT

Induced pluripotent stem cells (iPSCs) are an essential tool for studying cellular differentiation and cell types that are otherwise difficult to access. We investigated the use of iPSCs and iPSC-derived cells to study the impact of genetic variation on gene regulation across different cell types and as models for studies of complex disease. To do so, we established a panel of iPSCs from 58 well-studied Yoruba lymphoblastoid cell lines (LCLs); 14 of these lines were further differentiated into cardiomyocytes. We characterized regulatory variation across individuals and cell types by measuring gene expression levels, chromatin accessibility, and DNA methylation. Our analysis focused on a comparison of inter-individual regulatory variation across cell types. While most cell-type-specific regulatory quantitative trait loci (QTLs) lie in chromatin that is open only in the affected cell types, we found that 20% of cell-type-specific regulatory QTLs are in shared open chromatin. This observation motivated us to develop a deep neural network to predict open chromatin regions from DNA sequence alone. Using this approach, we were able to use the sequences of segregating haplotypes to predict the effects of common SNPs on cell-type-specific chromatin accessibility.


Subject(s)
Cell Differentiation, Chromatin Assembly and Disassembly, Chromatin/metabolism, DNA Methylation, Genetic Loci, Induced Pluripotent Stem Cells/metabolism, Cardiac Myocytes/metabolism, Cell Line, Chromatin/genetics, Humans, Induced Pluripotent Stem Cells/cytology, Cardiac Myocytes/cytology
11.
Bioinformatics ; 36(17): 4609-4615, 2020 11 01.
Article in English | MEDLINE | ID: mdl-32315392

ABSTRACT

MOTIVATION: Next-generation sequencing is rapidly improving diagnostic rates in rare Mendelian diseases, but even with whole genome or whole exome sequencing, the majority of cases remain unsolved. Increasingly, RNA sequencing is being used to solve many cases that evade diagnosis through sequencing alone. Specifically, the detection of aberrant splicing in many rare disease patients suggests that identifying RNA splicing outliers is particularly useful for determining causal Mendelian disease genes. However, there is as yet a paucity of statistical methodologies to detect splicing outliers. RESULTS: We developed LeafCutterMD, a new statistical framework that significantly improves the previously published LeafCutter in the context of detecting outlier splicing events. Through simulations and analysis of real patient data, we demonstrate that LeafCutterMD has better power than the state-of-the-art methodology while controlling false-positive rates. When applied to a cohort of disease-affected probands from the Mayo Clinic Center for Individualized Medicine, LeafCutterMD recovered all aberrantly spliced genes that had previously been identified by manual curation efforts. AVAILABILITY AND IMPLEMENTATION: The source code for this method is available under the open-source Apache 2.0 license in the latest release of the LeafCutter software package available online at http://davidaknowles.github.io/leafcutter. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome, Rare Diseases, Algorithms, High-Throughput Nucleotide Sequencing, Humans, RNA Splicing, Rare Diseases/diagnosis, Rare Diseases/genetics, RNA Sequence Analysis, Software
12.
Nature ; 513(7518): 375-381, 2014 Sep 18.
Article in English | MEDLINE | ID: mdl-25186727

ABSTRACT

Cichlid fishes are famous for large, diverse and replicated adaptive radiations in the Great Lakes of East Africa. To understand the molecular mechanisms underlying cichlid phenotypic diversity, we sequenced the genomes and transcriptomes of five lineages of African cichlids: the Nile tilapia (Oreochromis niloticus), an ancestral lineage with low diversity; and four members of the East African lineage: Neolamprologus brichardi/pulcher (older radiation, Lake Tanganyika), Metriaclima zebra (recent radiation, Lake Malawi), Pundamilia nyererei (very recent radiation, Lake Victoria), and Astatotilapia burtoni (riverine species around Lake Tanganyika). We found an excess of gene duplications in the East African lineage compared to tilapia and other teleosts, an abundance of non-coding element divergence, accelerated coding sequence evolution, expression divergence associated with transposable element insertions, and regulation by novel microRNAs. In addition, we analysed sequence data from sixty individuals representing six closely related species from Lake Victoria, and show genome-wide diversifying selection on coding and regulatory variants, some of which were recruited from ancient polymorphisms. We conclude that a number of molecular mechanisms shaped East African cichlid genomes, and that amassing of standing variation during periods of relaxed purifying selection may have been important in facilitating subsequent evolutionary diversification.


Subject(s)
Cichlids/classification, Cichlids/genetics, Molecular Evolution, Genetic Speciation, Genome/genetics, Eastern Africa, Animals, DNA Transposable Elements/genetics, Gene Duplication/genetics, Gene Expression Regulation/genetics, Genomics, Lakes, MicroRNAs/genetics, Phylogeny, Genetic Polymorphism/genetics
13.
Genome Res ; 25(1): 1-13, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25524026

ABSTRACT

Ninety-four percent of mammalian protein-coding exons exceed 51 nucleotides (nt) in length. The paucity of micro-exons (≤ 51 nt) suggests that their recognition and correct processing by the splicing machinery present greater challenges than for longer exons. Yet, because thousands of human genes harbor processed micro-exons, specialized mechanisms may be in place to promote their splicing. Here, we survey deep genomic data sets to define 13,085 micro-exons and to study their splicing mechanisms and molecular functions. More than 60% of annotated human micro-exons exhibit a high level of sequence conservation, an indicator of functionality. While most human micro-exons require splicing-enhancing genomic features to be processed, the splicing of hundreds of micro-exons is enhanced by the adjacent binding of splice factors in the introns of pre-messenger RNAs. Notably, splicing of a significant number of micro-exons was found to be facilitated by the binding of RBFOX proteins, which promote their inclusion in the brain, muscle, and heart. Our analyses suggest that accurate regulation of micro-exon inclusion by RBFOX proteins and PTBP1 plays an important role in the maintenance of tissue-specific protein-protein interactions.


Subject(s)
Alternative Splicing, Exons, Heterogeneous-Nuclear Ribonucleoproteins/metabolism, Polypyrimidine Tract-Binding Protein/metabolism, RNA-Binding Proteins/metabolism, Animals, Brain/metabolism, Chromosome Mapping, Conserved Sequence, Gene Expression Regulation, Genomics, Heterogeneous-Nuclear Ribonucleoproteins/genetics, Humans, Introns, Mice, Nucleotides/genetics, Polypyrimidine Tract-Binding Protein/genetics, Protein Interaction Domains and Motifs, RNA Splicing Factors, Messenger RNA, RNA-Binding Proteins/genetics
14.
Nat Methods ; 12(6): 519-22, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25915121

ABSTRACT

The simultaneous sequencing of a single cell's genome and transcriptome offers a powerful means to dissect genetic variation and its effect on gene expression. Here we describe G&T-seq, a method for separating and sequencing genomic DNA and full-length mRNA from single cells. By applying G&T-seq to over 220 single cells from mice and humans, we discovered cellular properties that could not be inferred from DNA or RNA sequencing alone.


Subject(s)
DNA/genetics, Genomics/methods, Nucleic Acid Amplification Techniques/methods, Messenger RNA/genetics, Animals, Tumor Cell Line, Humans, Mice
15.
Bioinformatics ; 29(2): 160-5, 2013 Jan 15.
Article in English | MEDLINE | ID: mdl-23162087

ABSTRACT

MOTIVATION: The ready availability of next-generation sequencing has led to a situation where it is easy to produce very fragmentary genome assemblies. We present a pipeline, SWiPS (Scaffolding With Protein Sequences), that uses orthologous proteins to improve low quality genome assemblies. The protein sequences are used as guides to scaffold existing contigs, while simultaneously allowing the gene structure to be predicted by homology. RESULTS: SWiPS does not depend on a high N50 or on whole proteins being encoded on a single contig. We tested our algorithm on simulated next-generation data from Ciona intestinalis, real next-generation data from Drosophila melanogaster, a complex genome assembly of Homo sapiens and the low coverage Sanger sequence assembly of Callorhinchus milii. The improvements in N50 are of the order of ∼20% for the C. intestinalis and H. sapiens assemblies, which is significant, considering the large size of intergenic regions in these eukaryotes. Using the CEGMA pipeline to assess the gene space represented in the genome assemblies, the number of genes retrieved increased by >110% for C. milii and from 20 to 40% for C. intestinalis. The scaffold error rates are low: 85-90% of scaffolds are fully correct, and >95% of local contig joins are correct. AVAILABILITY: SWiPS is available freely for download at http://www.well.ox.ac.uk/∼yli142/swips.html. CONTACT: yang.li@well.ox.ac.uk or copley@well.ox.ac.uk


Subject(s)
Algorithms, Genomics/methods, Amino Acid Sequence Homology, Animals, Ciona intestinalis, Contig Mapping, Drosophila melanogaster/genetics, Fishes/genetics, High-Throughput Nucleotide Sequencing, Humans, Proteins/genetics, DNA Sequence Analysis
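
The N50 statistic cited in the SWiPS abstract can be illustrated with a minimal sketch. It assumes the common definition of N50: the largest length L such that contigs or scaffolds of length at least L together cover at least half of the total assembly length. The function name `n50` and the example lengths are illustrative only and are not taken from SWiPS.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumed definition: sort lengths from longest to shortest and return
    the length at which the running total first reaches half of the
    total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (in bp), for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

Under this definition, joining contigs into longer scaffolds raises N50 even when the total assembled length is unchanged, which is why it is a natural measure for a scaffolding tool.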
16.
bioRxiv ; 2023 Oct 16.
Article in English | MEDLINE | ID: mdl-37745605

ABSTRACT

Alternative splicing (AS) is pervasive in human genes, yet the specific function of most AS events remains unknown. It is widely assumed that the primary function of AS is to diversify the proteome; however, AS can also influence gene expression levels by producing transcripts rapidly degraded by nonsense-mediated decay (NMD). Currently, there are no precise estimates for how often the coupling of AS and NMD (AS-NMD) impacts gene expression levels because rapidly degraded NMD transcripts are challenging to capture. To better understand the impact of AS on gene expression levels, we analyzed population-scale genomic data in lymphoblastoid cell lines across eight molecular assays that capture gene regulation before, during, and after transcription and cytoplasmic decay. Sequencing nascent mRNA transcripts revealed frequent aberrant splicing of human introns, which results in remarkably high levels of mRNA transcripts subject to NMD. We estimate that ~15% of all protein-coding transcripts are degraded by NMD, and this estimate increases to nearly half of all transcripts for lowly expressed genes with many introns. Leveraging genetic variation across cell lines, we find that GWAS trait-associated loci explained by AS are similarly likely to associate with NMD-induced expression level differences as with differences in protein isoform usage. Additionally, we used the splice-switching drug risdiplam to perturb AS at hundreds of genes, finding that ~3/4 of the splicing perturbations induce NMD. Thus, we conclude that AS-NMD substantially impacts the expression levels of most human genes. Our work further suggests that much of the molecular impact of AS is mediated by changes in protein expression levels rather than diversification of the proteome.

17.
Nat Genet ; 55(3): 461-470, 2023 03.
Article in English | MEDLINE | ID: mdl-36797366

ABSTRACT

Obesity-associated morbidity is exacerbated by abdominal obesity, which can be measured as the waist-to-hip ratio adjusted for the body mass index (WHRadjBMI). Here we identify genes associated with obesity and WHRadjBMI and characterize allele-sensitive enhancers that are predicted to regulate WHRadjBMI genes in women. We found that several waist-to-hip ratio-associated variants map within primate-specific Alu retrotransposons harboring a DNA motif associated with adipocyte differentiation. This suggests that a genetic component of adipose distribution in humans may involve co-option of retrotransposons as adipose enhancers. We evaluated the role of the strongest female WHRadjBMI-associated gene, SNX10, in adipose biology. We determined that it is required for human adipocyte differentiation and function and participates in diet-induced adipose expansion in female mice, but not males. Our data identify genes and regulatory mechanisms that underlie female-specific adipose distribution and mediate metabolic dysfunction in women.


Subject(s)
Obesity, Retroelements, Humans, Female, Animals, Mice, Obesity/genetics, Obesity/metabolism, Adiposity/genetics, Body Mass Index, Waist-Hip Ratio, Adipose Tissue/metabolism, Sorting Nexins/genetics, Sorting Nexins/metabolism
18.
Genome Biol ; 23(1): 103, 2022 04 21.
Article in English | MEDLINE | ID: mdl-35449021

ABSTRACT

Recent progress in deep learning has greatly improved the prediction of RNA splicing from DNA sequence. Here, we present Pangolin, a deep learning model to predict splice site strength in multiple tissues. Pangolin outperforms state-of-the-art methods for predicting RNA splicing on a variety of prediction tasks. Pangolin improves prediction of the impact of genetic variants on RNA splicing, including common, rare, and lineage-specific genetic variation. In addition, Pangolin identifies loss-of-function mutations with high accuracy and recall, particularly for mutations that are not missense or nonsense, demonstrating remarkable potential for identifying pathogenic variants.


Subject(s)
Pangolins, RNA Splicing, Animals, Base Sequence, Mutation, RNA Splice Sites
19.
Genome Biol ; 22(1): 291, 2021 10 14.
Article in English | MEDLINE | ID: mdl-34649612

ABSTRACT

BACKGROUND: Alternative cleavage and polyadenylation (APA), an RNA processing event, occurs in over 70% of human protein-coding genes. APA results in mRNA transcripts with distinct 3' ends. Most APA occurs within 3' UTRs, which harbor regulatory elements that can impact mRNA stability, translation, and localization. RESULTS: APA can be profiled using a number of established computational tools that infer polyadenylation sites from standard, short-read RNA-seq datasets. Here, we benchmarked a number of such tools (TAPAS, QAPA, DaPars2, GETUTR, and APATrap) against 3'-Seq, a specialized RNA-seq protocol that enriches for reads at the 3' ends of genes, and Iso-Seq, a Pacific Biosciences (PacBio) single-molecule full-length RNA-seq method, in their ability to identify polyadenylation sites and quantify polyadenylation site usage. We demonstrate that 3'-Seq and Iso-Seq are able to identify and quantify the usage of polyadenylation sites more reliably than computational tools that take short-read RNA-seq as input. However, we find that running one such tool, QAPA, with a set of polyadenylation site annotations derived from small quantities of 3'-Seq or Iso-Seq can reliably quantify variation in APA across conditions, such as across genotypes, as demonstrated by the successful mapping of alternative polyadenylation quantitative trait loci (apaQTL). CONCLUSIONS: We envisage that our analyses will shed light on the advantages of studying APA with more specialized sequencing protocols, such as 3'-Seq or Iso-Seq, and the limitations of studying APA with short-read RNA-seq. We provide a computational pipeline to aid in the identification of polyadenylation sites and quantification of polyadenylation site usage using Iso-Seq data as input.


Subject(s)
Polyadenylation, RNA-Seq, Software, Benchmarking, Cell Line, Human Genome, Humans
20.
Genome Biol ; 22(1): 122, 2021 04 29.
Article in English | MEDLINE | ID: mdl-33926512

ABSTRACT

BACKGROUND: The vast majority of trait-associated variants identified using genome-wide association studies (GWAS) are noncoding, and therefore assumed to impact gene regulation. However, the majority of trait-associated loci are unexplained by regulatory quantitative trait loci (QTLs). RESULTS: We perform a comprehensive characterization of the putative mechanisms by which GWAS loci impact human immune traits. By harmonizing four major immune QTL studies, we identify 26,271 expression QTLs (eQTLs) and 23,121 splicing QTLs (sQTLs) spanning 18 immune cell types. Our colocalization analyses between QTLs and trait-associated loci from 72 GWAS reveal that genetic effects on RNA expression and splicing in immune cells colocalize with 40.4% of GWAS loci for immune-related traits, in many cases increasing the fraction of colocalized loci twofold compared to previous studies. Notably, we find that the largest contributors to this increase are splicing QTLs, which colocalize on average with 14% of all GWAS loci that do not colocalize with eQTLs. By contrast, we find that cell type-specific eQTLs and eQTLs with small effect sizes contribute very few new colocalizations. To investigate the 60% of GWAS loci that remain unexplained, we collect H3K27ac CUT&Tag data from rheumatoid arthritis and healthy controls, and find large-scale differences between immune cells from the different disease contexts, including at regions overlapping unexplained GWAS loci. CONCLUSION: Altogether, our work supports RNA splicing as an important mediator of genetic effects on immune traits, and suggests that we must expand our study of regulatory processes in disease contexts to improve functional interpretation of as yet unexplained GWAS loci.


Subject(s)
Gene Expression Regulation, Genetic Association Studies, Genetic Variation, Immunity/genetics, Quantitative Trait Loci, Heritable Quantitative Trait, Rheumatoid Arthritis/etiology, Rheumatoid Arthritis/metabolism, Rheumatoid Arthritis/pathology, Chromosome Mapping, Nucleic Acid Databases, Disease Susceptibility, Gene Expression Profiling, Genetic Association Studies/methods, Genetic Predisposition to Disease, Genome-Wide Association Study/methods, Histones/metabolism, Humans, Immunomodulation/genetics, Organ Specificity, Transcriptome