ABSTRACT
DNase I hypersensitive sites (DHSs) are generic markers of regulatory DNA [1-5] and contain genetic variations associated with diseases and phenotypic traits [6-8]. We created high-resolution maps of DHSs from 733 human biosamples encompassing 438 cell and tissue types and states, and integrated these to delineate and numerically index approximately 3.6 million DHSs within the human genome sequence, providing a common coordinate system for regulatory DNA. Here we show that these maps highly resolve the cis-regulatory compartment of the human genome, which encodes unexpectedly diverse cell- and tissue-selective regulatory programs at very high density. These programs can be captured comprehensively by a simple vocabulary that enables the assignment to each DHS of a regulatory barcode that encapsulates its tissue manifestations, and global annotation of protein-coding and non-coding RNA genes in a manner orthogonal to gene expression. Finally, we show that sharply resolved DHSs markedly enhance the genetic association and heritability signals of diseases and traits. Rather than being confined to a small number of distal elements or promoters, we find that genetic signals converge on congruently regulated sets of DHSs that decorate entire gene bodies. Together, our results create a universal, extensible coordinate system and vocabulary for human regulatory DNA marked by DHSs, and provide a new global perspective on the architecture of human gene regulation.
Subjects
Chromatin/genetics , DNA/metabolism , Deoxyribonuclease I/metabolism , Molecular Sequence Annotation , Chromatin/chemistry , Chromatin/metabolism , DNA/chemistry , DNA/genetics , Gene Expression Regulation , Genes/genetics , Genome, Human/genetics , Humans , Promoter Regions, Genetic/genetics , Regulatory Sequences, Nucleic Acid/genetics
ABSTRACT
Genetic association studies of many heritable traits resulting from physiological testing often have modest sample sizes due to the cost and burden of the required phenotyping. This reduces statistical power and limits discovery of multiple genetic associations. We present a strategy to leverage pleiotropy between traits both to discover new loci and to provide mechanistic hypotheses of the underlying pathophysiology. Specifically, we combine a colocalization test with a locus-level test of pleiotropy. In simulations, we show that this approach is highly selective for identifying true pleiotropy driven by the same causative variant, thereby improving the chance of replicating the associations in underpowered validation cohorts and leading to higher interpretability. Here, as an exemplar, we use Obstructive Sleep Apnea (OSA), a common disorder diagnosed using overnight multi-channel physiological testing. We leverage pleiotropy with relevant cellular and cardio-metabolic phenotypes and gene expression traits to map new risk loci in an underpowered OSA GWAS. We identify several pleiotropic loci harboring suggestive associations to OSA and genome-wide significant associations to other traits, and show that their OSA association replicates in independent cohorts of diverse ancestries. By investigating pleiotropic loci, our strategy allows us to propose new hypotheses about OSA pathobiology across many physiological layers. For example, we identify and replicate pleiotropy across plateletcrit, OSA and an eQTL of DNA primase subunit 1 (PRIM1) in immune cells. We find suggestive links between OSA, a measure of lung function (FEV1/FVC), and an eQTL of matrix metallopeptidase 15 (MMP15) in lung tissue. We also link a previously known genome-wide significant peak for OSA in the hexokinase 1 (HK1) locus to hematocrit and other red blood cell related traits. Thus, the analysis of pleiotropic associations has the potential to assemble diverse phenotypes into a chain of mechanistic hypotheses that provide insight into the pathogenesis of complex human diseases.
Subjects
Genome-Wide Association Study , Sleep Apnea, Obstructive , Humans , Genome-Wide Association Study/methods , Phenotype , Genetic Association Studies , Sleep , Genetic Pleiotropy , Polymorphism, Single Nucleotide , DNA Primase
ABSTRACT
MOTIVATION: Analysis of allele-specific expression is strongly affected by the technical noise present in RNA-seq experiments. Previously, we showed that technical replicates can be used for precise estimates of this noise, and we provided a tool for correction of technical noise in allele-specific expression analysis. This approach is very accurate but costly due to the need for two or more replicates of each library. Here, we develop a spike-in approach which is highly accurate at only a small fraction of the cost. RESULTS: We show that a distinct RNA added as a spike-in before library preparation reflects technical noise of the whole library and can be used in large batches of samples. We experimentally demonstrate the effectiveness of this approach using combinations of RNA from species distinguishable by alignment, namely, mouse, human, and Caenorhabditis elegans. Our new approach, controlFreq, enables highly accurate and computationally efficient analysis of allele-specific expression in (and between) arbitrarily large studies at an overall cost increase of ~5%. AVAILABILITY AND IMPLEMENTATION: Analysis pipeline for this approach is available at GitHub as R package controlFreq (github.com/gimelbrantlab/controlFreq).
Subjects
Caenorhabditis elegans , Libraries , Humans , Animals , Mice , Alleles , Caenorhabditis elegans/genetics , Gene Library , RNA/genetics
ABSTRACT
Motivation: Analysis of allele-specific expression is strongly affected by the technical noise present in RNA-seq experiments. Previously, we showed that technical replicates can be used for precise estimates of this noise, and we provided a tool for correction of technical noise in allele-specific expression analysis. This approach is very accurate but costly due to the need for two or more replicates of each library. Here, we develop a spike-in approach that is highly accurate at only a small fraction of the cost. Results: We show that a distinct RNA added as a spike-in before library preparation reflects technical noise of the whole library and can be used in large batches of samples. We experimentally demonstrate the effectiveness of this approach using combinations of RNA from species distinguishable by alignment, namely, mouse, human, and C. elegans. Our new approach, controlFreq, enables highly accurate and computationally efficient analysis of allele-specific expression in (and between) arbitrarily large studies at an overall cost increase of ~5%. Availability: Analysis pipeline for this approach is available at GitHub as R package controlFreq (github.com/gimelbrantlab/controlFreq). Contact: agimelbrant@altius.org.
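The general idea of using control features to calibrate technical noise can be illustrated with a minimal sketch. This is not the controlFreq implementation, and the abstract does not specify the estimator. The function names (`overdispersion`, `imbalance_z`, `two_sided_p`), the moment-based estimator, and the normal approximation are all assumptions made for illustration. The sketch estimates how much control counts with a known expected fraction vary beyond binomial sampling, then uses that factor to widen the standard error when testing a gene for allelic imbalance.

```python
import math

def overdispersion(controls, p0):
    """Hypothetical moment estimator: mean squared standardized residual
    of control features against the binomial expectation at fraction p0.
    `controls` is a list of (count_a, total) pairs. Floored at 1.0 so the
    correction never makes the test less conservative than the binomial."""
    ratios = []
    for a, n in controls:
        binomial_var = p0 * (1 - p0) / n       # variance of a/n under pure sampling
        ratios.append((a / n - p0) ** 2 / binomial_var)
    return max(1.0, sum(ratios) / len(ratios))

def imbalance_z(ref, alt, c2, p0=0.5):
    """Z-score for allelic imbalance with the binomial variance
    inflated by the overdispersion factor c2 (assumed, not from the source)."""
    n = ref + alt
    se = math.sqrt(c2 * p0 * (1 - p0) / n)
    return (ref / n - p0) / se

def two_sided_p(z):
    """Two-sided p-value under a normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))

# Illustrative, made-up counts: four control features expected at 50%.
controls = [(60, 100), (40, 100), (55, 100), (45, 100)]
c2 = overdispersion(controls, 0.5)
z_raw = imbalance_z(70, 30, 1.0)   # no noise correction
z_adj = imbalance_z(70, 30, c2)    # corrected for estimated technical noise
print(c2, z_raw, z_adj, two_sided_p(z_adj))
```

With these made-up counts the corrected z-score is smaller than the uncorrected one, showing how ignoring technical noise would overstate the evidence for allelic imbalance.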
ABSTRACT
The Peloponnese has been one of the cradles of Classical European civilization and an important contributor to ancient European history. It has also been the subject of a controversy about the ancestry of its population. In a theory hotly debated by scholars for over 170 years, the German historian Jacob Philipp Fallmerayer proposed that the medieval Peloponneseans were totally extinguished by Slavic and Avar invaders and replaced by Slavic settlers during the 6th century CE. Here we use 2.5 million single-nucleotide polymorphisms to investigate the genetic structure of Peloponnesean populations in a sample of 241 individuals originating from all districts of the peninsula and to examine predictions of the theory of replacement of the medieval Peloponneseans by Slavs. We find considerable heterogeneity of Peloponnesean populations, exemplified by genetically distinct subpopulations and by gene flow gradients within the Peloponnese. By principal component analysis (PCA) and ADMIXTURE analysis, the Peloponneseans are clearly distinguishable from the populations of the Slavic homeland and are very similar to Sicilians and Italians. Using a novel method of quantitative analysis of ADMIXTURE output, we find that the Slavic ancestry of Peloponnesean subpopulations ranges from 0.2 to 14.4%. Subpopulations considered by Fallmerayer to be Slavic tribes or to have Near Eastern origin have no significant ancestry of either. This study rejects the theory of extinction of medieval Peloponneseans and illustrates how genetics can clarify important aspects of the history of a human population.