ABSTRACT
In the era of genome-wide association studies (GWAS) and personalized medicine, predicting the impact of single nucleotide polymorphisms (SNPs) in regulatory elements is an important goal. Current approaches to determine the potential of regulatory SNPs depend on inadequate knowledge of cell-specific DNA binding motifs. Here, we present Sasquatch, a new computational approach that uses DNase footprint data to estimate and visualize the effects of noncoding variants on transcription factor binding. Sasquatch performs a comprehensive k-mer-based analysis of DNase footprints to determine any k-mer's potential for protein binding in a specific cell type and how this may be changed by sequence variants. Therefore, Sasquatch uses an unbiased approach, independent of known transcription factor binding sites and motifs. Sasquatch only requires a single DNase-seq data set per cell type, from any genotype, and produces consistent predictions from data generated by different experimental procedures and at different sequence depths. Here we demonstrate the effectiveness of Sasquatch using previously validated functional SNPs and benchmark its performance against existing approaches. Sasquatch is available as a versatile webtool incorporating publicly available data, including the human ENCODE collection. Thus, Sasquatch provides a powerful tool and repository for prioritizing likely regulatory SNPs in the noncoding genome.
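
The k-mer comparison described above rests on one simple step: listing every k-mer window that covers the variant position, once for the reference allele and once for the alternative allele. The sketch below illustrates only that enumeration step. It is not Sasquatch's implementation, and the function names, the example sequence, and the choice of k are assumptions made for illustration. In the described method, each k-mer would then be scored from cell-type-specific DNase footprint data, which is not modeled here.

```python
def kmers_spanning_position(seq: str, pos: int, k: int) -> list[str]:
    """Return every k-mer of `seq` that covers the 0-based index `pos`."""
    if not 0 <= pos < len(seq):
        raise ValueError("pos lies outside the sequence")
    first = max(0, pos - k + 1)      # leftmost window start that still covers pos
    last = min(pos, len(seq) - k)    # rightmost window start that fits in seq
    return [seq[i:i + k] for i in range(first, last + 1)]


def ref_alt_kmers(seq: str, pos: int, alt: str, k: int) -> list[tuple[str, str]]:
    """Pair each reference k-mer with the k-mer produced by the substitution."""
    if len(alt) != 1:
        raise ValueError("this sketch handles single-nucleotide substitutions only")
    alt_seq = seq[:pos] + alt + seq[pos + 1:]
    return list(zip(kmers_spanning_position(seq, pos, k),
                    kmers_spanning_position(alt_seq, pos, k)))


# Hypothetical example: a C>T substitution at index 5, examined with 3-mers.
pairs = ref_alt_kmers("ACGTACGTACG", 5, "T", 3)
for ref_kmer, alt_kmer in pairs:
    print(ref_kmer, "->", alt_kmer)
```

Away from the sequence ends, a substitution affects exactly k overlapping k-mers, so each SNP yields k reference/alternative pairs to compare.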
Subjects
DNA Footprinting/methods; Deoxyribonucleases/chemistry; Erythroid Cells/metabolism; Nucleotide Motifs; Polymorphism, Single Nucleotide; Response Elements; Sequence Analysis, DNA/methods; Transcription Factors/metabolism; Humans; Predictive Value of Tests
ABSTRACT
The α- and β-globin loci harbor developmentally expressed genes, which are silenced throughout post-natal life. Reactivation of these genes may offer therapeutic approaches for the hemoglobinopathies, the most common single gene disorders. Here, we address mechanisms regulating the embryonically expressed α-like globin, termed ζ-globin. We show that in embryonic erythroid cells, the ζ-gene lies within a ~65 kb sub-TAD (topologically associating domain) of open, acetylated chromatin and interacts with the α-globin super-enhancer. By contrast, in adult erythroid cells, the ζ-gene is packaged within a small (~10 kb) sub-domain of hypoacetylated, facultative heterochromatin within the acetylated sub-TAD and no longer interacts with its enhancers. The ζ-gene can be partially re-activated by acetylation and inhibition of histone de-acetylases. In addition to suggesting therapies for severe α-thalassemia, these findings illustrate the general principles by which reactivation of developmental genes may rescue abnormalities arising from mutations in their adult paralogues.
Subjects
Gene Expression Regulation, Developmental; Gene Silencing; Transcriptional Activation; zeta-Globins/genetics; Acetylation; Animals; Chromatin/metabolism; DNA-Binding Proteins/metabolism; Enhancer Elements, Genetic; Erythroid Cells/metabolism; Gene Expression Regulation, Developmental/drug effects; Gene Silencing/drug effects; Histone Deacetylase Inhibitors/pharmacology; Humans; Mice; Repressor Proteins/metabolism; Transcription Factors/metabolism; Transcriptional Activation/drug effects; alpha-Globins/genetics
ABSTRACT
The human reference genome serves as the foundation for genomics by providing a scaffold for alignment of sequencing reads, but currently only reflects a single consensus haplotype, thus impairing analysis accuracy. Here we present a graph reference genome implementation that enables read alignment across 2,800 diploid genomes encompassing 12.6 million SNPs and 4.0 million insertions and deletions (indels). The pipeline processes one whole-genome sequencing sample in 6.5 h using a system with 36 CPU cores. We show that using a graph genome reference improves read mapping sensitivity and produces a 0.5% increase in variant calling recall, with unaffected specificity. Structural variations incorporated into a graph genome can be genotyped accurately under a unified framework. Finally, we show that iterative augmentation of graph genomes yields incremental gains in variant calling accuracy. Our implementation is an important advance toward fulfilling the promise of graph genomes to radically enhance the scalability and accuracy of genomic analyses.
Subjects
Genome, Human/genetics; Genomics/methods; Humans; Polymorphism, Single Nucleotide/genetics; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Sequence Deletion/genetics; Whole Genome Sequencing/methods
ABSTRACT
The emergence in recent years of DNA editing technologies, namely zinc finger nucleases (ZFNs), transcription activator-like effector (TALE) guided nucleases (TALENs), clustered regularly interspaced short palindromic repeats (CRISPR)/Cas family enzymes, and base editors, has greatly increased our ability to generate hundreds of edited cells carrying an array of alleles, including single-nucleotide substitutions. However, because homology-dependent repair (HDR) generates these substitutions infrequently, isolating the sequence change of interest generally requires screening large numbers of edited cells. Here we present a high-throughput method for the amplification and barcoding of edited loci in a 96-well plate format. After barcoding, plates are indexed as pools, which permits multiplexed sequencing of hundreds of clones simultaneously. The protocol has a high success rate, with more than 94% of clones successfully genotyped following analysis.
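
The two-level labeling described above (a well barcode within each plate, plus a plate-level index for the pool) means that every sequencing read can be traced back to a single clone by the pair (plate index, well barcode). The sketch below shows that bookkeeping on toy data. It is not the authors' pipeline: the read representation, the label names, and both thresholds are hypothetical, and a real analysis would first parse barcodes and alleles out of the sequencing reads.

```python
from collections import Counter, defaultdict

# A read is modeled as (plate_index, well_barcode, allele); all values are hypothetical.
Read = tuple[str, str, str]


def demultiplex(reads: list[Read]) -> dict[tuple[str, str], Counter]:
    """Count observed alleles per clone, where a clone is (plate index, well barcode)."""
    counts: dict[tuple[str, str], Counter] = defaultdict(Counter)
    for plate_index, well_barcode, allele in reads:
        counts[(plate_index, well_barcode)][allele] += 1
    return counts


def call_genotypes(counts, min_reads=3, min_fraction=0.2):
    """Report the alleles supported in each clone, or None when coverage is too low.

    `min_reads` and `min_fraction` are illustrative thresholds, not published values.
    """
    calls = {}
    for clone, alleles in counts.items():
        total = sum(alleles.values())
        if total < min_reads:
            calls[clone] = None  # not enough reads to genotype this clone
            continue
        calls[clone] = tuple(sorted(
            allele for allele, n in alleles.items() if n / total >= min_fraction
        ))
    return calls


reads = (
    [("P1", "A01", "ref")] * 4 + [("P1", "A01", "edit")] * 4  # mixed alleles
    + [("P2", "A01", "edit")] * 5                              # same well barcode, other plate
    + [("P1", "B02", "ref")]                                   # too few reads
)
calls = call_genotypes(demultiplex(reads))
genotyped = sum(call is not None for call in calls.values())
print(f"{genotyped} of {len(calls)} clones genotyped")
```

Because well barcodes are reused across plates, the plate index is what keeps clone A01 on plate P1 distinct from clone A01 on plate P2.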
ABSTRACT
Many genes determining cell identity are regulated by clusters of Mediator-bound enhancer elements collectively referred to as super-enhancers. These super-enhancers have been proposed to manifest higher-order properties important in development and disease. Here we report a comprehensive functional dissection of one of the strongest putative super-enhancers in erythroid cells. By generating a series of mouse models, deleting each of the five regulatory elements of the α-globin super-enhancer individually and in informative combinations, we demonstrate that each constituent enhancer seems to act independently and in an additive fashion with respect to hematological phenotype, gene expression, chromatin structure and chromosome conformation, without clear evidence of synergistic or higher-order effects. Our study highlights the importance of functional genetic analyses for the identification of new concepts in transcriptional regulation.