1.
bioRxiv ; 2023 Nov 08.
Article in English | MEDLINE | ID: mdl-37965206

ABSTRACT

Genetic variation influencing gene expression and splicing is a key source of phenotypic diversity. Though invaluable, studies investigating these links in humans have been strongly biased toward participants of European ancestries, diminishing generalizability and hindering evolutionary research. To address these limitations, we developed MAGE, an open-access RNA-seq data set of lymphoblastoid cell lines from 731 individuals from the 1000 Genomes Project spread across 5 continental groups and 26 populations. Most variation in gene expression (92%) and splicing (95%) was distributed within versus between populations, mirroring variation in DNA sequence. We mapped associations between genetic variants and expression and splicing of nearby genes (cis-eQTLs and cis-sQTLs, respectively), identifying >15,000 putatively causal eQTLs and >16,000 putatively causal sQTLs that are enriched for relevant epigenomic signatures. These include 1,310 eQTLs and 1,657 sQTLs that are largely private to previously underrepresented populations. Our data further indicate that the magnitude and direction of causal eQTL effects are highly consistent across populations and that apparent "population-specific" effects observed in previous studies were largely driven by low resolution or additional independent eQTLs of the same genes that were not detected. Together, our study expands understanding of gene expression diversity across human populations and provides an inclusive resource for studying the evolution and function of human genomes.
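
The within-versus-between-population split reported in this abstract can be illustrated with a one-way sum-of-squares decomposition. The sketch below is only an illustration under stated assumptions: the abstract does not say how the authors computed the 92% and 95% figures, and the function name `variance_partition`, the population labels, and the expression values are all hypothetical.

```python
from statistics import fmean


def variance_partition(groups):
    """Split total sum of squares into within- and between-group fractions.

    `groups` maps a (hypothetical) population label to a list of expression
    values for a single gene. Returns (within_fraction, between_fraction).
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = fmean(all_values)

    # Between-group term: squared distance of each group mean from the
    # grand mean, weighted by group size.
    ss_between = sum(
        len(values) * (fmean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Within-group term: squared distance of each value from its own group mean.
    ss_within = sum(
        (v - fmean(values)) ** 2
        for values in groups.values()
        for v in values
    )

    ss_total = ss_between + ss_within
    if ss_total == 0:
        raise ValueError("all values are identical; fractions are undefined")
    return ss_within / ss_total, ss_between / ss_total


# Toy data, invented purely for illustration (not from MAGE).
example = {"popA": [1.0, 2.0, 3.0], "popB": [2.0, 3.0, 4.0]}
within, between = variance_partition(example)
print(f"within: {within:.3f}, between: {between:.3f}")
```

In this toy case most of the spread sits inside each group rather than between the group means, which is the qualitative pattern the abstract describes.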

2.
Cell ; 186(7): 1493-1511.e40, 2023 03 30.
Article in English | MEDLINE | ID: mdl-37001506

ABSTRACT

Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.


Subject(s)
Epigenome; Quantitative Trait Loci; Genome-Wide Association Study; Genomics; Phenotype; Polymorphism, Single Nucleotide
3.
Front Cell Dev Biol ; 10: 1033695, 2022.
Article in English | MEDLINE | ID: mdl-36467401

ABSTRACT

The small GTPase family is well-studied in cancer and cellular physiology. With 162 annotated human genes, the family has a broad expression throughout cells of the body. Members of the family have multiple exons that require splicing. Yet, the role of splicing within the family has been underexplored. We have studied the splicing dynamics of small GTPases across 41,671 samples by integrating Nanopore and Illumina sequencing techniques. Within this work, we have made several discoveries. (1) Using the GTEx long read data of 92 samples, each small GTPase gene averages two transcripts, with 83 genes (51%) expressing two or more isoforms. (2) Cross-tissue analysis of GTEx from 17,382 samples shows 41 genes (25%) expressing two or more protein-coding isoforms. These include protein-changing transcripts in genes such as RHOA, RAB37, RAB40C, RAB4B, RAB5C, RHOC, RAB1A, RAN, RHEB, RAC1, and KRAS. (3) The isolation and library technique of the RNAseq influences the abundance of nonsense-mediated decay and retained intron transcripts of small GTPases, which are observed more often in genes than appreciated. (4) Analysis of 16,243 samples of "Blood PAXgene" identified seven genes (3.7%; RHOA, RAB40C, RAB4B, RAB37, RAB5B, RAB5C, RHOC) with two or more transcripts expressed as the major isoform (75% of the total gene), suggesting a role of genetics in altering splicing. (5) Rare (ARL6, RAB23, ARL13B, HRAS, NRAS) and common variants (GEM, RHOC, MRAS, RAB5B, RERG, ARL16) can influence splicing and have an impact on phenotypes and diseases. (6) Multiple genes (RAB9A, RAP2C, ARL4A, RAB3A, RAB26, RAB3C, RASL10A, RAB40B, and HRAS) have sex differences in transcript expression. (7) Several exons are included or excluded for small GTPase genes (RASEF, KRAS, RAC1, RHEB, ARL4A, RHOA, RAB30, RHOBTB1, ARL16, RAP1A) in one or more forms of cancer. (8) Ten transcripts are altered in hypoxia (SAR1B, IFT27, ARL14, RAB11A, RAB10, RAB38, RAN, RIT1, RAB9A) with RHOA identified to have a transient 3'UTR RNA base editing at a conserved site found in all of its transcripts. Overall, we show a remarkable and dynamic role of splicing within the small GTPase family that requires future explorations.

5.
Genome Biol ; 21(1): 235, 2020 09 11.
Article in English | MEDLINE | ID: mdl-32912314

ABSTRACT

Genetic regulation of gene expression, revealed by expression quantitative trait loci (eQTLs), exhibits complex patterns of tissue-specific effects. Characterization of these patterns may allow us to better understand mechanisms of gene regulation and disease etiology. We develop a constrained matrix factorization model, sn-spMF, to learn patterns of tissue-sharing and apply it to 49 human tissues from the Genotype-Tissue Expression (GTEx) project. The learned factors reflect tissues with known biological similarity and identify transcription factors that may mediate tissue-specific effects. sn-spMF, available at https://github.com/heyuan7676/ts_eQTLs, can be applied to learn biologically interpretable patterns of eQTL tissue-specificity and generate testable mechanistic hypotheses.


Subject(s)
Gene Expression Regulation; Models, Genetic; Quantitative Trait Loci; Transcription Factors/metabolism; Humans
6.
Nature ; 583(7818): 720-728, 2020 07.
Article in English | MEDLINE | ID: mdl-32728244

ABSTRACT

Transcription factors are DNA-binding proteins that have key roles in gene regulation. Genome-wide occupancy maps of transcriptional regulators are important for understanding gene regulation and its effects on diverse biological processes. However, only a minority of the more than 1,600 transcription factors encoded in the human genome has been assayed. Here we present, as part of the ENCODE (Encyclopedia of DNA Elements) project, data and analyses from chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experiments using the human HepG2 cell line for 208 chromatin-associated proteins (CAPs). These comprise 171 transcription factors and 37 transcriptional cofactors and chromatin regulator proteins, and represent nearly one-quarter of CAPs expressed in HepG2 cells. The binding profiles of these CAPs form major groups associated predominantly with promoters or enhancers, or with both. We confirm and expand the current catalogue of DNA sequence motifs for transcription factors, and describe motifs that correspond to other transcription factors that are co-enriched with the primary ChIP target. For example, FOX family motifs are enriched in ChIP-seq peaks of 37 other CAPs. We show that motif content and occupancy patterns can distinguish between promoters and enhancers. This catalogue reveals high-occupancy target regions at which many CAPs associate, although each contains motifs for only a minority of the numerous associated transcription factors. These analyses provide a more complete overview of the gene regulatory networks that define this cell type, and demonstrate the usefulness of the large-scale production efforts of the ENCODE Consortium.


Subject(s)
Chromatin Immunoprecipitation Sequencing; Chromatin/genetics; Chromatin/metabolism; DNA-Binding Proteins/metabolism; Molecular Sequence Annotation; Regulatory Sequences, Nucleic Acid/genetics; Datasets as Topic; Enhancer Elements, Genetic/genetics; Hep G2 Cells; Humans; Nucleotide Motifs/genetics; Promoter Regions, Genetic/genetics; Protein Binding; Transcription Factors/metabolism
7.
Nature ; 583(7818): 699-710, 2020 07.
Article in English | MEDLINE | ID: mdl-32728249

ABSTRACT

The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9% and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.


Subject(s)
DNA/genetics; Databases, Genetic; Genome/genetics; Genomics; Molecular Sequence Annotation; Registries; Regulatory Sequences, Nucleic Acid/genetics; Animals; Chromatin/genetics; Chromatin/metabolism; DNA/chemistry; DNA Footprinting; DNA Methylation/genetics; DNA Replication Timing; Deoxyribonuclease I/metabolism; Genome, Human; Histones/metabolism; Humans; Mice; Mice, Transgenic; RNA-Binding Proteins/genetics; Transcription, Genetic/genetics; Transposases/metabolism
8.
Biol Sex Differ ; 11(1): 28, 2020 05 12.
Article in English | MEDLINE | ID: mdl-32398044

ABSTRACT

BACKGROUND: The commonly used laboratory rat, Rattus norvegicus, is unique in having multiple Sry gene copies found on the Y chromosome, with different copies encoding amino acid variations that influence the resulting protein function. It is not clear which Sry genes are expressed at the onset of testis differentiation or how their expression correlates with that of other genes in testis-determination pathways. METHODS: Here, two independent E11-E14 developmental RNAseq datasets show that multiple Sry genes are expressed at E12-E13. RESULTS: The identified copies expressed during testis initiation include Sry4A, Sry1, and Sry3C, which are conserved in every strain of Rattus norvegicus with genomes sequenced to date. CONCLUSIONS: This work represents a first step in defining the complex environment of rat testis differentiation that can open the door for generating sex reversal model systems using embryo manipulation techniques that have been available in the mouse but not the rat.


Subject(s)
Genes, sry; Testis/growth & development; Animals; Gene Expression Regulation, Developmental; Male; Rats, Sprague-Dawley; Transcription, Genetic
9.
J Am Soc Nephrol ; 29(5): 1525-1535, 2018 05.
Article in English | MEDLINE | ID: mdl-29476007

ABSTRACT

BACKGROUND: Interpreting genetic variants is one of the greatest challenges impeding analysis of rapidly increasing volumes of genomic data from patients. For example, SHROOM3 is an associated risk gene for CKD, yet causative mechanism(s) of SHROOM3 allele(s) are unknown. METHODS: We used our analytic pipeline that integrates genetic, computational, biochemical, CRISPR/Cas9 editing, molecular, and physiologic data to characterize coding and noncoding variants to study the human SHROOM3 risk locus for CKD. RESULTS: We identified a novel SHROOM3 transcriptional start site, which results in a shorter isoform lacking the PDZ domain and is regulated by a common noncoding sequence variant associated with CKD (rs17319721, allele frequency: 0.35). This variant disrupted allele binding to the transcription factor TCF7L2 in podocyte cell nuclear extracts and altered transcription levels of SHROOM3 in cultured cells, potentially through the loss of repressive looping between rs17319721 and the novel start site. Although common variant mechanisms are of high utility, sequencing is beginning to identify rare variants involved in disease; therefore, we used our biophysical tools to analyze an average of 112,849 individual human genome sequences for rare SHROOM3 missense variants, revealing 35 high-effect variants. The high-effect alleles include a coding variant (P1244L) previously associated with CKD (P=0.01, odds ratio=7.95; 95% CI, 1.53 to 41.46) that we find to be present in East Asian individuals at an allele frequency of 0.0027. We determined that P1244L attenuates the interaction of SHROOM3 with 14-3-3, suggesting alterations to the Hippo pathway, a known mediator of CKD. CONCLUSIONS: These data demonstrate multiple new SHROOM3-dependent genetic/molecular mechanisms that likely affect CKD.


Subject(s)
Microfilament Proteins/genetics; Renal Insufficiency, Chronic/genetics; Alleles; Animals; Cell Nucleus; Gene Frequency; Genetic Loci; HEK293 Cells; Humans; Mice; Mutation, Missense; Podocytes; Protein Isoforms/genetics; Transcription Factor 7-Like 2 Protein/genetics; Transcription, Genetic; Zebrafish
...