Results 1 - 20 of 138
1.
bioRxiv ; 2024 May 05.
Article in English | MEDLINE | ID: mdl-38746472

ABSTRACT

The regulatory mechanisms underlying the response to pro-inflammatory cytokines during myocarditis are poorly understood. Here, we use iPSC-derived cardiovascular progenitor cells (CVPCs) to model the response to interferon gamma (IFN-γ) during myocarditis. We generate RNA-seq and ATAC-seq data for four CVPC lines that were treated with IFN-γ and compare them with paired untreated controls. Transcriptional differences after treatment show that IFN-γ initiates an innate immune cell-like response in the vascular cardiac endothelium. IFN-γ treatment also shifts the CVPC transcriptome towards the adult coronary artery and aorta profiles and expands the relative endothelial cell population in all four CVPC lines. Analysis of the accessible chromatin shows that IFN-γ is a potent chromatin remodeler and establishes an IRF-STAT immune cell-like regulatory network. Our findings reveal insights into the endothelial-specific protective mechanisms during myocarditis.

2.
bioRxiv ; 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38645112

ABSTRACT

Most GWAS loci are presumed to affect gene regulation; however, only ∼43% colocalize with expression quantitative trait loci (eQTLs). To address this colocalization gap, we identify eQTLs, chromatin accessibility QTLs (caQTLs), and histone acetylation QTLs (haQTLs) using molecular samples from three early developmental (EDev) tissues. Through colocalization, we annotate 586 GWAS loci for 17 traits by QTL complexity, QTL phenotype, and QTL temporal specificity. We show that GWAS loci are highly enriched for colocalization with complex QTL modules that affect multiple elements (genes and/or peaks). We also demonstrate that caQTLs and haQTLs capture regulatory variations not associated with eQTLs and explain ∼49% of the functionally annotated GWAS loci. Additionally, we show that EDev-unique QTLs are strongly depleted for colocalizing with GWAS loci. By conducting one of the largest multi-omic QTL studies to date, we demonstrate that many GWAS loci exhibit phenotypic complexity and are therefore missed by traditional eQTL analyses.

3.
Nat Commun ; 15(1): 1664, 2024 Feb 23.
Article in English | MEDLINE | ID: mdl-38395976

ABSTRACT

Stem cells exist in vitro in a spectrum of interconvertible pluripotent states. Analyzing hundreds of hiPSCs derived from different individuals, we show the proportions of these pluripotent states vary considerably across lines. We discover 13 gene network modules (GNMs) and 13 regulatory network modules (RNMs), which are highly correlated with each other, suggesting that the coordinated co-accessibility of regulatory elements in the RNMs likely underlies the coordinated expression of genes in the GNMs. Epigenetic analyses reveal that regulatory networks underlying self-renewal and pluripotency are more complex than previously realized. Genetic analyses identify thousands of regulatory variants that overlap predicted transcription factor binding sites and are associated with chromatin accessibility in the hiPSCs. We show that the master regulator of pluripotency, the NANOG-OCT4 complex, and its associated network are significantly enriched for regulatory variants with large effects, suggesting that they play a role in the varying cellular proportions of pluripotency states between hiPSCs. Our work bins tens of thousands of regulatory elements in hiPSCs into discrete regulatory networks, shows that pluripotency and self-renewal processes have a surprising level of regulatory complexity, and suggests that genetic factors may contribute to cell state transitions in human iPSC lines.


Subjects
Induced Pluripotent Stem Cells , Humans , Induced Pluripotent Stem Cells/metabolism , Gene Regulatory Networks , Chromatin/genetics , Cell Differentiation/genetics , Octamer Transcription Factor-3/genetics
4.
Dev Cell ; 58(21): 2206-2216.e5, 2023 Nov 06.
Article in English | MEDLINE | ID: mdl-37848026

ABSTRACT

Transcriptional enhancers direct precise gene expression patterns during development and harbor the majority of variants associated with phenotypic diversity, evolutionary adaptations, and disease. Pinpointing which enhancer variants contribute to changes in gene expression and phenotypes is a major challenge. Here, we find that suboptimal or low-affinity binding sites are necessary for precise gene expression during heart development. Single-nucleotide variants (SNVs) can optimize the affinity of ETS binding sites, causing gain-of-function (GOF) gene expression, cell migration defects, and phenotypes as severe as extra beating hearts in the marine chordate Ciona robusta. In human induced pluripotent stem cell (iPSC)-derived cardiomyocytes, a SNV within a human GATA4 enhancer increases ETS binding affinity and causes GOF enhancer activity. The prevalence of suboptimal-affinity sites within enhancers creates a vulnerability whereby affinity-optimizing SNVs can lead to GOF gene expression, changes in cellular identity, and organismal-level phenotypes that could contribute to the evolution of novel traits or diseases.


Subjects
Enhancer Elements, Genetic , Induced Pluripotent Stem Cells , Humans , Enhancer Elements, Genetic/genetics , Myocytes, Cardiac/metabolism , Binding Sites , Nucleotides
5.
Nat Commun ; 14(1): 6928, 2023 Oct 30.
Article in English | MEDLINE | ID: mdl-37903777

ABSTRACT

The impact of genetic regulatory variation active in early pancreatic development on adult pancreatic disease and traits is not well understood. Here, we generate a panel of 107 fetal-like iPSC-derived pancreatic progenitor cells (iPSC-PPCs) from whole genome-sequenced individuals and identify 4065 genes and 4016 isoforms whose expression and/or alternative splicing are affected by regulatory variation. We integrate eQTLs identified in adult islets and whole pancreas samples, which reveal 1805 eQTL associations that are unique to the fetal-like iPSC-PPCs and 1043 eQTLs that exhibit regulatory plasticity across the fetal-like and adult pancreas tissues. Colocalization with GWAS risk loci for pancreatic diseases and traits shows that some putative causal regulatory variants are active only in the fetal-like iPSC-PPCs and likely influence disease by modulating expression of disease-associated genes in early development, while others with regulatory plasticity likely exert their effects in both the fetal and adult pancreas by modulating expression of different disease genes in the two developmental stages.


Subjects
Diabetes Mellitus , Quantitative Trait Loci , Adult , Humans , Quantitative Trait Loci/genetics , Genome-Wide Association Study , Pancreas , Base Sequence , Diabetes Mellitus/genetics , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease
6.
Cell Genom ; 3(7): 100360, 2023 Jul 12.
Article in English | MEDLINE | ID: mdl-37492100

ABSTRACT

For the past few years, researchers in the Human Pangenome Reference Consortium (HPRC) have been working to catalog almost all human genomic diversity. Frazer and Schork preview an article recently published in Nature, "A draft human pangenome reference," which represents the initial release of 47 fully phased diploid assemblies of genomes of individuals with diverse ancestries.

8.
bioRxiv ; 2023 Sep 19.
Article in English | MEDLINE | ID: mdl-37292794

ABSTRACT

Stem cells exist in vitro in a spectrum of interconvertible pluripotent states. Analyzing hundreds of hiPSCs derived from different individuals, we show the proportions of these pluripotent states vary considerably across lines. We discovered 13 gene network modules (GNMs) and 13 regulatory network modules (RNMs), which were highly correlated with each other, suggesting that the coordinated co-accessibility of regulatory elements in the RNMs likely underlay the coordinated expression of genes in the GNMs. Epigenetic analyses revealed that regulatory networks underlying self-renewal and pluripotency have a surprising level of complexity. Genetic analyses identified thousands of regulatory variants that overlapped predicted transcription factor binding sites and were associated with chromatin accessibility in the hiPSCs. We show that the master regulator of pluripotency, the NANOG-OCT4 complex, and its associated network were significantly enriched for regulatory variants with large effects, suggesting that they may play a role in the varying cellular proportions of pluripotency states between hiPSCs. Our work captures the coordinated activity of tens of thousands of regulatory elements in hiPSCs and bins these elements into discrete functionally characterized regulatory networks, shows that regulatory elements in pluripotency networks harbor variants with large effects, and provides a rich resource for future pluripotent stem cell research.

9.
Nat Commun ; 14(1): 1132, 2023 Feb 28.
Article in English | MEDLINE | ID: mdl-36854752

ABSTRACT

The causal variants and genes underlying thousands of cardiac GWAS signals have yet to be identified. Here, we leverage spatiotemporal information on 966 RNA-seq cardiac samples and perform an expression quantitative trait locus (eQTL) analysis detecting eQTLs considering both eGenes and eIsoforms. We identify 2,578 eQTLs associated with a specific developmental stage, tissue and/or cell type. Colocalization between eQTL and GWAS signals of five cardiac traits identified variants with high posterior probabilities for being causal in 210 GWAS loci. Pulse pressure GWAS loci are enriched for colocalization with fetal- and smooth muscle-eQTLs; pulse rate with adult- and cardiac muscle-eQTLs; and atrial fibrillation with cardiac muscle-eQTLs. Fine mapping identifies 79 credible sets with five or fewer SNPs, of which 15 were associated with spatiotemporal eQTLs. Our study shows that many cardiac GWAS variants impact traits and disease in a developmental stage-, tissue- and/or cell type-specific fashion.


Subjects
Atrial Fibrillation , Heart , Humans , Myocardium , Atrial Fibrillation/genetics , Blood Pressure , Fetus
10.
Exp Eye Res ; 225: 109248, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36108770

ABSTRACT

Genomic studies in age-related macular degeneration (AMD) have identified genetic variants that account for the majority of AMD risk. An important next step is to understand the functional consequences and downstream effects of the identified AMD-associated genetic variants. Instrumental for this next step are 'omics' technologies, which enable high-throughput characterization and quantification of biological molecules, and subsequent integration of genomics with these omics datasets, a field referred to as systems genomics. Single cell sequencing studies of the retina and choroid demonstrated that the majority of candidate AMD genes identified through genomic studies are expressed in non-neuronal cells, such as the retinal pigment epithelium (RPE), glia, myeloid and choroidal cells, highlighting that many different retinal and choroidal cell types contribute to the pathogenesis of AMD. Expression quantitative trait locus (eQTL) studies in retinal tissue have identified putative causal genes by demonstrating a genetic overlap between gene regulation and AMD risk. Linking genetic data to complement measurements in the systemic circulation has aided in understanding the effect of AMD-associated genetic variants in the complement system, and suggests that protein QTL (pQTL) studies in plasma or serum samples may aid in understanding the effect of genetic variants and pinpointing causal genes in AMD. A recent epigenomic study fine-mapped AMD causal variants by determining regulatory regions in RPE cells differentiated from induced pluripotent stem cells (iPSC-RPE). Another approach that is being employed to pinpoint causal AMD genes is to produce synthetic DNA assemblons representing risk and protective haplotypes, which are then delivered to cellular or animal model systems. Pinpointing causal genes and understanding disease mechanisms is crucial for the next step towards clinical translation.
Clinical trials targeting proteins encoded by the AMD-associated genomic loci C3, CFB, CFI, CFH, and ARMS2/HTRA1 are currently ongoing, and a phase III clinical trial for C3 inhibition recently showed a modest reduction of lesion growth in geographic atrophy. The EYERISK consortium recently developed a genetic test for AMD that allows genotyping of common and rare variants in AMD-associated genes. Polygenic risk scores (PRS) were applied to quantify AMD genetic risk, and may aid in predicting AMD progression. In conclusion, genomic studies represent a turning point in our exploration of AMD. The results of those studies now serve as a driving force for several clinical trials. Expanding to omics and systems genomics will further decipher function and causality from the associations that have been reported, and will enable the development of therapies that will lessen the burden of AMD.


Subjects
Macular Degeneration , Humans , Macular Degeneration/genetics , Macular Degeneration/metabolism , Retinal Pigment Epithelium/metabolism , Complement System Proteins/metabolism , Choroid/metabolism , Proteins/genetics , Genomics , Polymorphism, Single Nucleotide , Complement Factor H/genetics , Complement Factor H/metabolism , High-Temperature Requirement A Serine Peptidase 1/genetics
12.
Adv Funct Mater ; 32(8), 2022 Feb 16.
Article in English | MEDLINE | ID: mdl-35603230

ABSTRACT

We report innovative scalable, vertical, ultra-sharp nanowire arrays that are individually addressable to enable long-term, native recordings of intracellular potentials. Stable amplitudes of intracellular potentials from 3D tissue-like networks of neurons and cardiomyocytes are obtained. Individual electrical addressability is necessary for high-fidelity intracellular electrophysiological recordings. This study paves the way toward predictive, high-throughput, and low-cost electrophysiological drug screening platforms.

13.
PLoS Comput Biol ; 18(2): e1009918, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35226669

ABSTRACT

Reactivation of fetal-specific genes and isoforms occurs during heart failure. However, the underlying molecular mechanisms and the extent to which the fetal program switch occurs remain unclear. Limitations hindering transcriptome-wide analyses of alternative splicing differences (i.e. isoform switching) in cardiovascular system (CVS) tissues between fetal, healthy adult and heart failure have included both cellular heterogeneity across bulk RNA-seq samples and limited availability of fetal tissue for research. To overcome these limitations, we have deconvoluted the cellular compositions of 996 RNA-seq samples representing heart failure, healthy adult (heart and arteria), and fetal-like (iPSC-derived cardiovascular progenitor cells) CVS tissues. Comparison of the expression profiles revealed that reactivation of fetal-specific RNA-binding proteins (RBPs), and the accompanying re-expression of 1,523 fetal-specific isoforms, contribute to the transcriptome differences between heart failure and healthy adult heart. Of note, isoforms for 20 different RBPs were among those that reverted in heart failure to the fetal-like expression pattern. We determined that, compared with adult-specific isoforms, fetal-specific isoforms encode proteins that tend to have more functions, are more likely to harbor RBP binding sites, have canonical sequences at their splice sites, and contain typical upstream polypyrimidine tracts. Our study suggests that compared with healthy adult, fetal cardiac tissue requires stricter transcriptional regulation, and that during heart failure reversion to this stricter transcriptional regulation occurs. Furthermore, we provide a resource of cardiac developmental stage-specific and heart failure-associated genes and isoforms, which are largely unexplored and can be exploited to investigate novel therapeutics for heart failure.


Subjects
Heart Failure , Adult , Alternative Splicing/genetics , Fetus/metabolism , Heart Failure/genetics , Humans , Protein Isoforms/genetics , Protein Isoforms/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism
14.
Cell Genom ; 2(12): 100214, 2022 Dec 14.
Article in English | MEDLINE | ID: mdl-36778047

ABSTRACT

We combined functional genomics and human genetics to investigate processes that affect type 1 diabetes (T1D) risk by mediating beta cell survival in response to proinflammatory cytokines. We mapped 38,931 cytokine-responsive candidate cis-regulatory elements (cCREs) in beta cells using ATAC-seq and snATAC-seq and linked them to target genes using co-accessibility and HiChIP. Using a genome-wide CRISPR screen in EndoC-βH1 cells, we identified 867 genes affecting cytokine-induced survival, and genes promoting survival and up-regulated in cytokines were enriched at T1D risk loci. Using SNP-SELEX, we identified 2,229 variants in cytokine-responsive cCREs altering transcription factor (TF) binding, and variants altering binding of TFs regulating stress, inflammation, and apoptosis were enriched for T1D risk. At the 16p13 locus, a fine-mapped T1D variant altering TF binding in a cytokine-induced cCRE interacted with SOCS1, which promoted survival in cytokine exposure. Our findings reveal processes and genes acting in beta cells during inflammation that modulate T1D risk.

15.
Cell Rep ; 37(7): 110020, 2021 Nov 16.
Article in English | MEDLINE | ID: mdl-34762851

ABSTRACT

Variability in SARS-CoV-2 susceptibility and COVID-19 disease severity between individuals is partly due to genetic factors. Here, we identify 4 genomic loci with suggestive associations for SARS-CoV-2 susceptibility and 19 for COVID-19 disease severity. Four of these 23 loci likely have an ethnicity-specific component. Genome-wide association study (GWAS) signals in 11 loci colocalize with expression quantitative trait loci (eQTLs) associated with the expression of 20 genes in 62 tissues/cell types (range: 1-43 tissues/gene), including lung, brain, heart, muscle, and skin as well as the digestive system and immune system. We perform genetic fine mapping to compute 99% credible SNP sets, which identify 10 GWAS loci that have eight or fewer SNPs in the credible set, including three loci with a single likely causal SNP. Our study suggests that the diverse symptoms and disease severity of COVID-19 observed between individuals are associated with variants across the genome, affecting gene expression levels in a wide variety of tissue types.


Subjects
COVID-19/genetics , SARS-CoV-2/genetics , Chromosome Mapping/methods , Computational Biology/methods , Databases, Genetic , Ethnicity/genetics , Gene Expression/genetics , Gene Expression Profiling/methods , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genome-Wide Association Study/methods , Humans , Organ Specificity/genetics , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , SARS-CoV-2/pathogenicity , Severity of Illness Index , Transcriptome/genetics
16.
PLoS Genet ; 17(10): e1009848, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34662339

ABSTRACT

Patients with inherited retinal dystrophies (IRDs) were recruited from two understudied populations, Mexico and Pakistan, as well as a third, well-studied population of European Americans, to define the genetic architecture of IRD by performing whole-genome sequencing (WGS). Whole-genome analysis was performed on 409 individuals from 108 unrelated pedigrees with IRDs. All patients underwent an ophthalmic evaluation to establish the retinal phenotype. Although the 108 pedigrees in this study had previously been examined for mutations in known IRD genes using a wide range of methodologies including targeted gene(s) or mutation(s) screening, linkage analysis and exome sequencing, the gene mutations responsible for IRD in these 108 pedigrees were not determined. WGS was performed on these pedigrees using Illumina X10 at a minimum of 30X depth. The sequence reads were mapped against hg19 followed by variant calling using GATK. The genome variants were annotated using SnpEff, PolyPhen2, and CADD score; the structural variants (SVs) were called using GenomeSTRiP and LUMPY. We identified potential causative sequence alterations in 61 pedigrees (57%), including 39 novel and 54 reported variants in IRD genes. For 57 of these pedigrees the observed genotype was consistent with the initial clinical diagnosis; the remaining 4 had the clinical diagnosis reclassified based on our findings. In seven pedigrees (12%) we observed atypical causal variants, i.e. unexpected genotype(s), including 4 pedigrees with causal variants in more than one IRD gene within all affected family members, one pedigree with intrafamilial genetic heterogeneity (different affected family members carrying causal variants in different IRD genes), one pedigree carrying a dominant causative variant present in pseudo-recessive form due to consanguinity and one pedigree with a de novo variant in the affected family member. Combined atypical and large structural variants contributed to about 20% of cases.
Among the novel mutations, 75% were detected in Mexican and 50% found in European American pedigrees and have not been reported in any other population while only 20% were detected in Pakistani pedigrees and were not previously reported. The remaining novel IRD causative variants were listed in gnomAD but were found to be very rare and population specific. Mutations in known IRD associated genes contributed to pathology in 63% Mexican, 60% Pakistani and 45% European American pedigrees analyzed. Overall, contribution of known IRD gene variants to disease pathology in these three populations was similar to that observed in other populations worldwide. This study revealed a spectrum of mutations contributing to IRD in three populations, identified a large proportion of novel potentially causative variants that are specific to the corresponding population or not reported in gnomAD and shed light on the genetic architecture of IRD in these diverse global populations.


Subjects
Ethnicity/genetics , Retinal Degeneration/genetics , Consanguinity , DNA Mutational Analysis/methods , Exome/genetics , Eye Proteins/genetics , Female , Genetic Association Studies/methods , Genetic Linkage/genetics , Genotype , Humans , Male , Mexico , Mutation/genetics , Pakistan , Pedigree , Retina/pathology , Exome Sequencing/methods , Whole Genome Sequencing/methods
17.
medRxiv ; 2021 May 12.
Article in English | MEDLINE | ID: mdl-34013287

ABSTRACT

Variability in SARS-CoV-2 susceptibility and COVID-19 disease severity between individuals is partly due to genetic factors. Here, we applied colocalization to compare summary statistics for 16 GWASs from the COVID-19 Host Genetics Initiative to investigate similarities and differences in their genetic signals. We identified 9 loci associated with susceptibility (one with two independent GWAS signals; one with an ethnicity-specific signal), 14 associated with severity (one with two independent GWAS signals; two with ethnicity-specific signals) and one harboring two discrepant GWAS signals (one for susceptibility; one for severity). Utilizing colocalization we also identified 45 GTEx tissues that had eQTL(s) for 18 genes strongly associated with GWAS signals in eleven loci (1-4 genes per locus). Some of these genes showed tissue-specific altered expression and others showed altered expression in up to 41 different tissue types. Our study provides insights into the complex molecular mechanisms underlying inherited predispositions to COVID-19-disease phenotypes.

18.
Nature ; 595(7869): 735-740, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34040254

ABSTRACT

The functional engagement between an enhancer and its target promoter ensures precise gene transcription. Understanding the basis of promoter choice by enhancers has important implications for health and disease. Here we report that functional loss of a preferred promoter can release its partner enhancer to loop to and activate an alternative promoter (or alternative promoters) in the neighbourhood. We refer to this target-switching process as 'enhancer release and retargeting'. Genetic deletion, motif perturbation or mutation, and dCas9-mediated CTCF tethering reveal that promoter choice by an enhancer can be determined by the binding of CTCF at promoters, in a cohesin-dependent manner, consistent with a model of 'enhancer scanning' inside the contact domain. Promoter-associated CTCF shows a lower affinity than that at chromatin domain boundaries and often lacks a preferred motif orientation or a partnering CTCF at the cognate enhancer, suggesting properties distinct from boundary CTCF. Analyses of cancer mutations, data from the GTEx project and risk loci from genome-wide association studies, together with a focused CRISPR interference screen, reveal that enhancer release and retargeting represents an overlooked mechanism that underlies the activation of disease-susceptibility genes, as exemplified by a risk locus for Parkinson's disease (NUCKS1-RAB7L1) and three loci associated with cancer (CLPTM1L-TERT, ZCCHC7-PAX5 and PVT1-MYC).


Subjects
CCCTC-Binding Factor/genetics , Enhancer Elements, Genetic , Genetic Predisposition to Disease , Promoter Regions, Genetic , CRISPR-Cas Systems , Cell Cycle Proteins/genetics , Cells, Cultured , Chromatin , Chromosomal Proteins, Non-Histone/genetics , Gene Deletion , Gene Expression Regulation, Neoplastic , Genome-Wide Association Study , Humans , MCF-7 Cells , Neoplasms/genetics , Neural Stem Cells , Oncogenes , Parkinson Disease/genetics , Cohesins
19.
Nat Genet ; 53(3): 313-321, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33664507

ABSTRACT

Induced pluripotent stem cells (iPSCs) are an established cellular system to study the impact of genetic variants in derived cell types and developmental contexts. However, in their pluripotent state, the disease impact of genetic variants is less well known. Here, we integrate data from 1,367 human iPSC lines to comprehensively map common and rare regulatory variants in human pluripotent cells. Using this population-scale resource, we report hundreds of new colocalization events for human traits specific to iPSCs, and find increased power to identify rare regulatory variants compared with somatic tissues. Finally, we demonstrate how iPSCs enable the identification of causal genes for rare diseases.


Subjects
Genetic Variation , Induced Pluripotent Stem Cells/physiology , Quantitative Trait Loci , Bardet-Biedl Syndrome/genetics , Calcium Channels/genetics , Cell Line , Cerebellar Ataxia/genetics , DNA Methylation , Gene Expression , Humans , Induced Pluripotent Stem Cells/cytology , Polymorphism, Single Nucleotide , Proteins/genetics , Rare Diseases/genetics , Regulatory Sequences, Nucleic Acid , Sequence Analysis, RNA , Whole Genome Sequencing
20.
Nature ; 591(7848): 147-151, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33505025

ABSTRACT

Many sequence variants have been linked to complex human traits and diseases, but deciphering their biological functions remains challenging, as most of them reside in noncoding DNA. Here we have systematically assessed the binding of 270 human transcription factors to 95,886 noncoding variants in the human genome using an ultra-high-throughput multiplex protein-DNA binding assay, termed single-nucleotide polymorphism evaluation by systematic evolution of ligands by exponential enrichment (SNP-SELEX). The resulting 828 million measurements of transcription factor-DNA interactions enable estimation of the relative affinity of these transcription factors to each variant in vitro and evaluation of the current methods to predict the effects of noncoding variants on transcription factor binding. We show that the position weight matrices of most transcription factors lack sufficient predictive power, whereas the support vector machine combined with the gapped k-mer representation shows much improved performance, when assessed on results from independent SNP-SELEX experiments involving a new set of 61,020 sequence variants. We report highly predictive models for 94 human transcription factors and demonstrate their utility in genome-wide association studies and understanding of the molecular pathways involved in diverse human traits and diseases.


Subjects
Polymorphism, Single Nucleotide/genetics , SELEX Aptamer Technique , Support Vector Machine , Transcription Factors/metabolism , Binding Sites/genetics , Disease/genetics , Genome, Human/genetics , Humans , Ligands , Protein Binding