Results 1 - 14 of 14
1.
Cell ; 159(1): 188-199, 2014 Sep 25.
Article in English | MEDLINE | ID: mdl-25259926

ABSTRACT

Intermolecular RNA-RNA interactions are used by many noncoding RNAs (ncRNAs) to achieve their diverse functions. To identify these contacts, we developed a method based on RNA antisense purification to systematically map RNA-RNA interactions (RAP-RNA) and applied it to investigate two ncRNAs implicated in RNA processing: U1 small nuclear RNA, a component of the spliceosome, and Malat1, a large ncRNA that localizes to nuclear speckles. U1 and Malat1 interact with nascent transcripts through distinct targeting mechanisms. Using differential crosslinking, we confirmed that U1 directly hybridizes to 5' splice sites and 5' splice site motifs throughout introns and found that Malat1 interacts with pre-mRNAs indirectly through protein intermediates. Interactions with nascent pre-mRNAs cause U1 and Malat1 to localize proximally to chromatin at active genes, demonstrating that ncRNAs can use RNA-RNA interactions to target specific pre-mRNAs and genomic sites. RAP-RNA is sensitive to lower abundance RNAs as well, making it generally applicable for investigating ncRNAs.


Subject(s)
Genetic Techniques; RNA, Messenger/metabolism; Animals; Base Sequence; Cross-Linking Reagents/metabolism; Mice; Molecular Sequence Data; Nucleotide Motifs; RNA Splice Sites; RNA, Long Noncoding/chemistry; RNA, Long Noncoding/metabolism; RNA, Messenger/chemistry; RNA, Small Nuclear/metabolism; RNA, Untranslated/chemistry; RNA, Untranslated/metabolism
2.
Cell ; 152(4): 703-13, 2013 Feb 14.
Article in English | MEDLINE | ID: mdl-23415221

ABSTRACT

Although several hundred regions of the human genome harbor signals of positive natural selection, few of the relevant adaptive traits and variants have been elucidated. Using full-genome sequence variation from the 1000 Genomes (1000G) Project and the composite of multiple signals (CMS) test, we investigated 412 candidate signals and leveraged functional annotation, protein structure modeling, epigenetics, and association studies to identify and extensively annotate candidate causal variants. The resulting catalog provides a tractable list for experimental follow-up; it includes 35 high-scoring nonsynonymous variants, 59 variants associated with expression levels of a nearby coding gene or lincRNA, and numerous variants associated with susceptibility to infectious disease and other phenotypes. We experimentally characterized one candidate nonsynonymous variant in Toll-like receptor 5 (TLR5) and show that it leads to altered NF-κB signaling in response to bacterial flagellin.


Subject(s)
Genetic Techniques; Genome, Human; Genome-Wide Association Study; Mutation; Animals; Bacteria/metabolism; Flagellin/metabolism; HapMap Project; Humans; NF-kappa B/metabolism; Quantitative Trait Loci; Regulatory Elements, Transcriptional; Signal Transduction; Toll-Like Receptor 5/genetics; Toll-Like Receptor 5/metabolism
3.
Nature ; 607(7917): 176-184, 2022 07.
Article in English | MEDLINE | ID: mdl-35594906

ABSTRACT

Gene regulation in the human genome is controlled by distal enhancers that activate specific nearby promoters [1]. A proposed model for this specificity is that promoters have sequence-encoded preferences for certain enhancers, for example, mediated by interacting sets of transcription factors or cofactors [2]. This 'biochemical compatibility' model has been supported by observations at individual human promoters and by genome-wide measurements in Drosophila [3-9]. However, the degree to which human enhancers and promoters are intrinsically compatible has not yet been systematically measured, and how their activities combine to control RNA expression remains unclear. Here we designed a high-throughput reporter assay called enhancer × promoter self-transcribing active regulatory region sequencing (ExP STARR-seq) and applied it to examine the combinatorial compatibilities of 1,000 enhancer and 1,000 promoter sequences in human K562 cells. We identify simple rules for enhancer-promoter compatibility, whereby most enhancers activate all promoters by similar amounts, and intrinsic enhancer and promoter activities multiplicatively combine to determine RNA output (R² = 0.82). In addition, two classes of enhancers and promoters show subtle preferential effects. Promoters of housekeeping genes contain built-in activating motifs for factors such as GABPA and YY1, which decrease the responsiveness of promoters to distal enhancers. Promoters of variably expressed genes lack these motifs and show stronger responsiveness to enhancers. Together, this systematic assessment of enhancer-promoter compatibility suggests a multiplicative model tuned by enhancer and promoter class to control gene transcription in the human genome.
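
The multiplicative rule described in this abstract can be illustrated with a minimal sketch. It assumes a small, fully observed matrix of reporter expression values and fits one intrinsic activity per enhancer and per promoter by averaging in log space. The function names (`fit_multiplicative`, `r_squared`) and the toy numbers are hypothetical and are not taken from the paper; the authors' actual fitting procedure may differ.

```python
import math

def fit_multiplicative(expr):
    """Fit expr[i][j] ~ enhancer[i] * promoter[j] (hypothetical helper).

    In log space the model is additive, so row and column means give the
    per-enhancer and per-promoter terms. Only the products are identifiable;
    the split of the overall scale between the two factors is arbitrary.
    """
    logs = [[math.log(v) for v in row] for row in expr]
    n_enh, n_pro = len(logs), len(logs[0])
    grand = sum(sum(row) for row in logs) / (n_enh * n_pro)
    enh = [sum(row) / n_pro - grand / 2 for row in logs]
    pro = [sum(logs[i][j] for i in range(n_enh)) / n_enh - grand / 2
           for j in range(n_pro)]
    return [math.exp(a) for a in enh], [math.exp(b) for b in pro]

def r_squared(expr, enh, pro):
    """Coefficient of determination of the fit, computed on log expression."""
    obs = [math.log(v) for row in expr for v in row]
    pred = [math.log(e * p) for e in enh for p in pro]  # same row-major order
    mean = sum(obs) / len(obs)
    ss_res = sum((o - q) ** 2 for o, q in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# Toy data (made up): two enhancers x two promoters, exactly multiplicative.
expr = [[2.0, 6.0], [5.0, 15.0]]
enh, pro = fit_multiplicative(expr)
fit_quality = r_squared(expr, enh, pro)
```

On data that are exactly multiplicative, as in the toy matrix, the fitted products reproduce the observations; on real measurements the R² would quantify how much of the variance the multiplicative model explains.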


Subject(s)
Enhancer Elements, Genetic; Promoter Regions, Genetic; Enhancer Elements, Genetic/genetics; Humans; Promoter Regions, Genetic/genetics; RNA/biosynthesis; RNA/genetics; Transcription Factors/metabolism
4.
Annu Rev Genet ; 47: 97-120, 2013.
Article in English | MEDLINE | ID: mdl-24274750

ABSTRACT

The past fifty years have seen the development and application of numerous statistical methods to identify genomic regions that appear to be shaped by natural selection. These methods have been used to investigate the macro- and microevolution of a broad range of organisms, including humans. Here, we provide a comprehensive outline of these methods, explaining their conceptual motivations and statistical interpretations. We highlight areas of recent and future development in evolutionary genomics methods and discuss ongoing challenges for researchers employing such tests. In particular, we emphasize the importance of functional follow-up studies to characterize putative selected alleles and the use of selection scans as hypothesis-generating tools for investigating evolutionary histories.


Subject(s)
Genomics; Selection, Genetic/genetics; Adaptation, Physiological/genetics; Alleles; Amino Acid Substitution; Animals; Evolution, Molecular; Forecasting; Gene Frequency; Genetics, Population/methods; Genotyping Techniques; Humans; Linkage Disequilibrium; Models, Genetic; Multifactorial Inheritance/genetics; Mutation; Mutation Rate; Phenotype; Sequence Analysis, DNA
5.
Proc Natl Acad Sci U S A ; 115(30): E7222-E7230, 2018 07 24.
Article in English | MEDLINE | ID: mdl-29987030

ABSTRACT

Gene expression is controlled by sequence-specific transcription factors (TFs), which bind to regulatory sequences in DNA. TF binding occurs in nucleosome-depleted regions of DNA (NDRs), which generally encompass regions with lengths similar to those protected by nucleosomes. However, less is known about where within these regions specific TFs tend to be found. Here, we characterize the positional bias of inferred binding sites for 103 TFs within ∼500,000 NDRs across 47 cell types. We find that distinct classes of TFs display different binding preferences: Some tend to have binding sites toward the edges, some toward the center, and some at other positions within the NDR. These patterns are highly consistent across cell types, suggesting that they may reflect TF-specific intrinsic structural or functional characteristics. In particular, TF classes with binding sites at NDR edges are enriched for those known to interact with histones and chromatin remodelers, whereas TFs with central enrichment interact with other TFs and cofactors such as p300. Our results suggest distinct regiospecific binding patterns and functions of TF classes within enhancers.


Subject(s)
Gene Expression Regulation/physiology; Response Elements/physiology; Transcription Factors/metabolism; Humans; Jurkat Cells; Transcription Factors/genetics; U937 Cells
6.
Proc Natl Acad Sci U S A ; 114(7): E1291-E1300, 2017 02 14.
Article in English | MEDLINE | ID: mdl-28137873

ABSTRACT

Enhancers regulate gene expression through the binding of sequence-specific transcription factors (TFs) to cognate motifs. Various features influence TF binding and enhancer function, including the chromatin state of the genomic locus, the affinities of the binding site, the activity of the bound TFs, and interactions among TFs. However, the precise nature and relative contributions of these features remain unclear. Here, we used massively parallel reporter assays (MPRAs) involving 32,115 natural and synthetic enhancers, together with high-throughput in vivo binding assays, to systematically dissect the contribution of each of these features to the binding and activity of genomic regulatory elements that contain motifs for PPARγ, a TF that serves as a key regulator of adipogenesis. We show that distinct sets of features govern PPARγ binding vs. enhancer activity. PPARγ binding is largely governed by the affinity of the specific motif site and higher-order features of the larger genomic locus, such as chromatin accessibility. In contrast, the enhancer activity of PPARγ binding sites depends on varying contributions from dozens of TFs in the immediate vicinity, including interactions between combinations of these TFs. Different pairs of motifs follow different interaction rules, including subadditive, additive, and superadditive interactions among specific classes of TFs, with both spatially constrained and flexible grammars. Our results provide a paradigm for the systematic characterization of the genomic features underlying regulatory elements, applicable to the design of synthetic regulatory elements or the interpretation of human genetic variation.


Subject(s)
Enhancer Elements, Genetic/genetics; Gene Expression Regulation; Genomics/methods; Transcription Factors/metabolism; 3T3-L1 Cells; Animals; Binding Sites/genetics; Mice; Mutation; Nucleotide Motifs/genetics; PPAR gamma/metabolism; Protein Binding
7.
PLoS Genet ; 7(4): e1001383, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21533027

ABSTRACT

The Plasmodium falciparum parasite's ability to adapt to environmental pressures, such as the human immune system and antimalarial drugs, makes malaria an enduring burden to public health. Understanding the genetic basis of these adaptations is critical to intervening successfully against malaria. To that end, we created a high-density genotyping array that assays over 17,000 single nucleotide polymorphisms (∼ 1 SNP/kb), and applied it to 57 culture-adapted parasites from three continents. We characterized genome-wide genetic diversity within and between populations and identified numerous loci with signals of natural selection, suggesting their role in recent adaptation. In addition, we performed a genome-wide association study (GWAS), searching for loci correlated with resistance to thirteen antimalarials; we detected both known and novel resistance loci, including a new halofantrine resistance locus, PF10_0355. Through functional testing we demonstrated that PF10_0355 overexpression decreases sensitivity to halofantrine, mefloquine, and lumefantrine, but not to structurally unrelated antimalarials, and that increased gene copy number mediates resistance. Our GWAS and follow-on functional validation demonstrate the potential of genome-wide studies to elucidate functionally important loci in the malaria parasite genome.


Subject(s)
Antimalarials/pharmacology; Drug Resistance/genetics; Genetic Loci; Plasmodium falciparum/genetics; Ethanolamines/pharmacology; Fluorenes/pharmacology; Gene Dosage; Gene Expression; Genetic Association Studies; Genetic Variation; Genotype; Haplotypes; Linkage Disequilibrium; Lumefantrine; Malaria, Falciparum/parasitology; Malaria, Falciparum/prevention & control; Mefloquine/pharmacology; Phenanthrenes/pharmacology; Plasmodium falciparum/drug effects; Polymorphism, Single Nucleotide; Selection, Genetic
8.
PLoS Genet ; 6(10): e1001169, 2010 Oct 21.
Article in English | MEDLINE | ID: mdl-20975951

ABSTRACT

In human cells, DNA double-strand breaks are repaired primarily by the non-homologous end joining (NHEJ) pathway. Given their critical nature, we expected NHEJ proteins to be evolutionarily conserved, with relatively little sequence change over time. Here, we report that while critical domains of these proteins are conserved as expected, the sequence of NHEJ proteins has also been shaped by recurrent positive selection, leading to rapid sequence evolution in other protein domains. In order to characterize the molecular evolution of the human NHEJ pathway, we generated large simian primate sequence datasets for NHEJ genes. Codon-based models of gene evolution yielded statistical support for the recurrent positive selection of five NHEJ genes during primate evolution: XRCC4, NBS1, Artemis, POLλ, and CtIP. Analysis of human polymorphism data using the composite of multiple signals (CMS) test revealed that XRCC4 has also been subjected to positive selection in modern humans. Crystal structures are available for XRCC4, Nbs1, and Polλ; and residues under positive selection fall exclusively on the surfaces of these proteins. Despite the positive selection of such residues, biochemical experiments with variants of one positively selected site in Nbs1 confirm that functions necessary for DNA repair and checkpoint signaling have been conserved. However, many viruses interact with the proteins of the NHEJ pathway as part of their infectious lifecycle. We propose that an ongoing evolutionary arms race between viruses and NHEJ genes may be driving the surprisingly rapid evolution of these critical genes.


Subject(s)
DNA Repair/genetics; Evolution, Molecular; Primates/genetics; Recombination, Genetic/genetics; Adaptation, Physiological/genetics; Amino Acid Sequence; Animals; Binding Sites/genetics; Carrier Proteins/chemistry; Carrier Proteins/genetics; Carrier Proteins/metabolism; Cell Cycle Proteins/chemistry; Cell Cycle Proteins/genetics; Cell Cycle Proteins/metabolism; DNA Breaks, Double-Stranded; DNA Polymerase beta/chemistry; DNA Polymerase beta/genetics; DNA Polymerase beta/metabolism; DNA-Binding Proteins/chemistry; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Endodeoxyribonucleases; Endonucleases; Humans; Models, Molecular; Molecular Sequence Data; Nuclear Proteins/chemistry; Nuclear Proteins/genetics; Nuclear Proteins/metabolism; Phylogeny; Primates/classification; Protein Binding; Protein Structure, Tertiary; Selection, Genetic; Sequence Homology, Amino Acid; Signal Transduction
9.
Nat Genet ; 51(12): 1664-1669, 2019 12.
Article in English | MEDLINE | ID: mdl-31784727

ABSTRACT

Enhancer elements in the human genome control how genes are expressed in specific cell types and harbor thousands of genetic variants that influence risk for common diseases [1-4]. Yet, we still do not know how enhancers regulate specific genes, and we lack general rules to predict enhancer-gene connections across cell types [5,6]. We developed an experimental approach, CRISPRi-FlowFISH, to perturb enhancers in the genome, and we applied it to test >3,500 potential enhancer-gene connections for 30 genes. We found that a simple activity-by-contact model substantially outperformed previous methods at predicting the complex connections in our CRISPR dataset. This activity-by-contact model allows us to construct genome-wide maps of enhancer-gene connections in a given cell type, on the basis of chromatin state measurements. Together, CRISPRi-FlowFISH and the activity-by-contact model provide a systematic approach to map and predict which enhancers regulate which genes, and will help to interpret the functions of the thousands of disease risk variants in the noncoding genome.
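
The abstract does not spell out the activity-by-contact calculation. As a rough sketch, assuming the commonly described form in which each candidate element's contribution to a gene is its activity multiplied by its contact frequency with the gene's promoter, normalized by the sum over all candidate elements near that gene, the score could be computed as below. The function name `abc_scores`, the element labels, and the input values are hypothetical.

```python
def abc_scores(activity, contact):
    """Return a normalized activity-by-contact score per element (sketch).

    activity: dict mapping element name -> activity estimate (assumed input,
              e.g. derived from chromatin state measurements).
    contact:  dict mapping element name -> contact frequency with the
              promoter of the gene of interest (assumed input).
    """
    weights = {name: activity[name] * contact[name] for name in activity}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all activity-by-contact products are zero")
    return {name: w / total for name, w in weights.items()}

# Toy example (made-up numbers): a strong but distant element and a weaker,
# closer element contribute equally once activity is weighted by contact.
scores = abc_scores({"e1": 10.0, "e2": 5.0}, {"e1": 0.2, "e2": 0.4})
```

Because the scores are normalized per gene, they sum to 1 across the candidate elements and can be read as each element's predicted share of the regulatory input.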


Subject(s)
Clustered Regularly Interspaced Short Palindromic Repeats; Enhancer Elements, Genetic; Promoter Regions, Genetic; Animals; GATA1 Transcription Factor/genetics; Gene Expression Regulation; Histone Deacetylase 6/genetics; Humans; In Situ Hybridization, Fluorescence; K562 Cells; Mice; Models, Genetic; RNA, Guide, Kinetoplastida
10.
Nat Genet ; 50(10): 1483-1493, 2018 10.
Article in English | MEDLINE | ID: mdl-30177862

ABSTRACT

Biological interpretation of genome-wide association study data frequently involves assessing whether SNPs linked to a biological process, for example, binding of a transcription factor, show unsigned enrichment for disease signal. However, signed annotations quantifying whether each SNP allele promotes or hinders the biological process can enable stronger statements about disease mechanism. We introduce a method, signed linkage disequilibrium profile regression, for detecting genome-wide directional effects of signed functional annotations on disease risk. We validate the method via simulations and application to molecular quantitative trait loci in blood, recovering known transcriptional regulators. We apply the method to expression quantitative trait loci in 48 Genotype-Tissue Expression tissues, identifying 651 transcription factor-tissue associations including 30 with robust evidence of tissue specificity. We apply the method to 46 diseases and complex traits (average n = 290 K), identifying 77 annotation-trait associations representing 12 independent transcription factor-trait associations, and characterize the underlying transcriptional programs using gene-set enrichment analyses. Our results implicate new causal disease genes and new disease mechanisms.


Subject(s)
Disease/genetics; Genome-Wide Association Study; Multifactorial Inheritance/genetics; Quantitative Trait Loci; Transcription Factors/metabolism; Binding Sites/genetics; Blood Cells/metabolism; Blood Cells/pathology; Blood Chemical Analysis; Gene Expression Regulation; Genetic Predisposition to Disease; Humans; Linkage Disequilibrium; Phenotype; Polymorphism, Single Nucleotide; Protein Binding; Risk Factors
11.
Science ; 354(6313): 769-773, 2016 11 11.
Article in English | MEDLINE | ID: mdl-27708057

ABSTRACT

Gene expression in mammals is regulated by noncoding elements that can affect physiology and disease, yet the functions and target genes of most noncoding elements remain unknown. We present a high-throughput approach that uses clustered regularly interspaced short palindromic repeats (CRISPR) interference (CRISPRi) to discover regulatory elements and identify their target genes. We assess >1 megabase of sequence in the vicinity of two essential transcription factors, MYC and GATA1, and identify nine distal enhancers that control gene expression and cellular proliferation. Quantitative features of chromatin state and chromosome conformation distinguish the seven enhancers that regulate MYC from other elements that do not, suggesting a strategy for predicting enhancer-promoter connectivity. This CRISPRi-based approach can be applied to dissect transcriptional networks and interpret the contributions of noncoding genetic variation to human disease.


Subject(s)
Chromosome Mapping/methods; Clustered Regularly Interspaced Short Palindromic Repeats; Enhancer Elements, Genetic/physiology; High-Throughput Nucleotide Sequencing/methods; Promoter Regions, Genetic/physiology; CRISPR-Cas Systems; Cell Proliferation/genetics; Disease/genetics; Enhancer Elements, Genetic/genetics; GATA1 Transcription Factor/genetics; Gene Expression Regulation; Humans; K562 Cells; Promoter Regions, Genetic/genetics; Proto-Oncogene Proteins c-myc/genetics; Real-Time Polymerase Chain Reaction
12.
Philos Trans R Soc Lond B Biol Sci ; 367(1590): 868-77, 2012 Mar 19.
Article in English | MEDLINE | ID: mdl-22312054

ABSTRACT

Rapidly evolving viruses and other pathogens can have an immense impact on human evolution as natural selection acts to increase the prevalence of genetic variants providing resistance to disease. With the emergence of large datasets of human genetic variation, we can search for signatures of natural selection in the human genome driven by such disease-causing microorganisms. Based on this approach, we have previously hypothesized that Lassa virus (LASV) may have been a driver of natural selection in West African populations where Lassa haemorrhagic fever is endemic. In this study, we provide further evidence for this notion. By applying tests for selection to genome-wide data from the International Haplotype Map Consortium and the 1000 Genomes Consortium, we demonstrate evidence for positive selection in LARGE and interleukin 21 (IL21), two genes implicated in LASV infectivity and immunity. We further localized the signals of selection, using the recently developed composite of multiple signals method, to introns and putative regulatory regions of those genes. Our results suggest that natural selection may have targeted variants giving rise to alternative splicing or differential gene expression of LARGE and IL21. Overall, our study supports the hypothesis that selective pressures imposed by LASV may have led to the emergence of particular alleles conferring resistance to Lassa fever, and opens up new avenues of research pursuit.


Subject(s)
Disease Resistance/genetics; Evolution, Molecular; Genome, Human/genetics; Interleukins/genetics; Lassa Fever/genetics; Lassa virus/pathogenicity; N-Acetylglucosaminyltransferases/genetics; Selection, Genetic; Africa, Western; Black People/genetics; Humans; Phylogeography
13.
Science ; 334(6062): 1518-24, 2011 Dec 16.
Article in English | MEDLINE | ID: mdl-22174245

ABSTRACT

Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships: the maximal information coefficient (MIC). MIC captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination (R²) of the data relative to the regression function. MIC belongs to a larger class of maximal information-based nonparametric exploration (MINE) statistics for identifying and classifying relationships. We apply MIC and MINE to data sets in global health, gene expression, major-league baseball, and the human gut microbiota and identify known and novel relationships.


Subject(s)
Data Interpretation, Statistical; Algorithms; Animals; Baseball/statistics & numerical data; Female; Gene Expression; Genes, Fungal; Genomics/methods; Humans; Intestines/microbiology; Male; Metagenome; Mice; Obesity; Saccharomyces cerevisiae/genetics
14.
Science ; 327(5967): 883-6, 2010 Feb 12.
Article in English | MEDLINE | ID: mdl-20056855

ABSTRACT

The human genome contains hundreds of regions whose patterns of genetic variation indicate recent positive natural selection, yet for most the underlying gene and the advantageous mutation remain unknown. We developed a method, composite of multiple signals (CMS), that combines tests for multiple signals of selection and increases resolution by up to 100-fold. By applying CMS to candidate regions from the International Haplotype Map, we localized population-specific selective signals to 55 kilobases (median), identifying known and novel causal variants. CMS can not only identify individual loci but also implicate precise variants selected by evolution.


Subject(s)
Genetic Variation; Genome, Human; Selection, Genetic; Computational Biology/methods; DNA, Intergenic/genetics; Evolution, Molecular; Genetic Loci; Haplotypes; Humans; Polymorphism, Genetic; Population Groups/genetics; Regulatory Sequences, Nucleic Acid/genetics; Software