ABSTRACT
Transcription factors (TFs) and their specific interactions with targets are crucial for specifying gene-expression programs. To gain insights into the transcriptional regulatory networks in embryonic stem (ES) cells, we use chromatin immunoprecipitation coupled with ultra-high-throughput DNA sequencing (ChIP-seq) to map the locations of 13 sequence-specific TFs (Nanog, Oct4, STAT3, Smad1, Sox2, Zfx, c-Myc, n-Myc, Klf4, Esrrb, Tcfcp2l1, E2f1, and CTCF) and 2 transcription regulators (p300 and Suz12). These factors are known to play different roles in ES-cell biology as components of the LIF and BMP signaling pathways, self-renewal regulators, and key reprogramming factors. Our study provides insights into the integration of the signaling pathways into the ES-cell-specific transcription circuitries. Intriguingly, we find specific genomic regions extensively targeted by different TFs. Collectively, the comprehensive mapping of TF-binding sites identifies important features of the transcriptional regulatory networks that define ES-cell identity.
Subject(s)
Embryonic Stem Cells/metabolism , Gene Regulatory Networks , Signal Transduction , Animals , Base Sequence , Binding Sites , Chromatin Immunoprecipitation , Genome , Kruppel-Like Factor 4 , Mice , Multiprotein Complexes , Transcription Factors/metabolism
ABSTRACT
Using a long-span, paired-end deep sequencing strategy, we have comprehensively identified cancer genome rearrangements in eight breast cancer genomes. Herein, we show that 40%-54% of these structural genomic rearrangements result in different forms of fusion transcripts and that 44% are potentially translated. We find that single segmental tandem duplication spanning several genes is a major source of the fusion gene transcripts in both cell lines and primary tumors involving adjacent genes placed in the reverse-order position by the duplication event. Certain other structural mutations, however, tend to attenuate gene expression. From these candidate gene fusions, we have found a fusion transcript (RPS6KB1-VMP1) recurrently expressed in ~30% of breast cancers associated with potential clinical consequences. This gene fusion is caused by tandem duplication on 17q23 and appears to be an indicator of local genomic instability altering the expression of oncogenic components such as MIR21 and RPS6KB1.
Subject(s)
Breast Neoplasms/metabolism , Gene Rearrangement , Genome, Human/genetics , Membrane Proteins/genetics , Membrane Proteins/metabolism , Recombinant Fusion Proteins/metabolism , Ribosomal Protein S6 Kinases/metabolism , Transcription, Genetic , Breast Neoplasms/genetics , Cell Line, Tumor , Chromosome Mapping , Chromosomes, Human, Pair 17/genetics , Female , Gene Dosage , Gene Expression Profiling , Genomic Instability , High-Throughput Nucleotide Sequencing , Humans , Recombinant Fusion Proteins/genetics , Ribosomal Protein S6 Kinases/genetics , Sequence Analysis, DNA
ABSTRACT
Somatic genome rearrangements are thought to play important roles in cancer development. We optimized a long-span paired-end-tag (PET) sequencing approach using 10-Kb genomic DNA inserts to study human genome structural variations (SVs). The use of a 10-Kb insert size allows the identification of breakpoints within repetitive or homology-containing regions of a few kilobases in size and results in a higher physical coverage compared with small insert libraries with the same sequencing effort. We have applied this approach to comprehensively characterize the SVs of 15 cancer and two noncancer genomes and used a filtering approach to strongly enrich for somatic SVs in the cancer genomes. Our analyses revealed that most inversions, deletions, and insertions are germ-line SVs, whereas tandem duplications, unpaired inversions, interchromosomal translocations, and complex rearrangements are over-represented among somatic rearrangements in cancer genomes. We demonstrate that the quantitative and connective nature of DNA-PET data is precise in delineating the genealogy of complex rearrangement events, we observe signatures that are compatible with breakage-fusion-bridge cycles, and we discover that large duplications are among the initial rearrangements that trigger genome instability for extensive amplification in epithelial cancers.
Subject(s)
Base Pairing/genetics , Breast Neoplasms/genetics , Chromosome Mapping/methods , Genome, Human/genetics , Genomic Structural Variation/genetics , Stomach Neoplasms/genetics , Cell Line, Tumor , Computational Biology , DNA/genetics , Female , Gene Rearrangement , Humans , Sequence Analysis, DNA
ABSTRACT
MicroRNAs (miRNAs) are a class of small, noncoding RNAs that function as posttranscriptional regulators of gene expression. Many miRNAs are expressed in the developing brain and regulate multiple aspects of neural development, including neurogenesis, dendritogenesis, and synapse formation. Rett syndrome (RTT) is a progressive neurodevelopmental disorder caused by mutations in the gene encoding methyl-CpG-binding protein 2 (MECP2). Although Mecp2 is known to act as a global transcriptional regulator, miRNAs that are directly regulated by Mecp2 in the brain are not known. Using massively parallel sequencing methods, we have identified miRNAs whose expression is altered in cerebella of Mecp2-null mice before and after the onset of severe neurological symptoms. In vivo genome-wide analyses indicate that promoter regions of a significant fraction of dysregulated miRNA transcripts, including a large polycistronic cluster of brain-specific miRNAs, are DNA-methylated and are bound directly by Mecp2. Functional analysis demonstrates that the 3' UTR of messenger RNA encoding Brain-derived neurotrophic factor (Bdnf) can be targeted by multiple miRNAs aberrantly up-regulated in the absence of Mecp2. Taken together, these results suggest that dysregulation of miRNAs may contribute to RTT pathoetiology and also may provide a valuable resource for further investigations of the role of miRNAs in RTT.
Subject(s)
Disease Models, Animal , Genome-Wide Association Study , Methyl-CpG-Binding Protein 2/physiology , MicroRNAs/genetics , Rett Syndrome/genetics , 3' Untranslated Regions , Animals , Chromatin Immunoprecipitation , Enzyme-Linked Immunosorbent Assay , Methyl-CpG-Binding Protein 2/genetics , Mice , Mice, Knockout , Promoter Regions, Genetic , Rett Syndrome/metabolism
ABSTRACT
MBF (or DSC1) is known to regulate transcription of a set of G(1)/S-phase genes encoding proteins involved in regulation of DNA replication. Previous studies have shown that MBF binds not only the promoter of G(1)/S-phase genes, but also the constitutive genes; however, it was unclear if the MBF bindings at the G(1)/S-phase and constitutive genes were mechanistically distinguishable. Here, we report a chromatin immunoprecipitation-microarray (ChIP-chip) analysis of MBF binding in the Schizosaccharomyces pombe genome using high-resolution genome tiling microarrays. ChIP-chip analysis indicates that the majority of the MBF occupancies are located at the intragenic regions. Deconvolution analysis using Rpb1 ChIP-chip results distinguishes the Cdc10 bindings at the Rpb1-poor loci (promoters) from those at the Rpb1-rich loci (intragenic sequences). Importantly, Res1 binding at the Rpb1-poor loci, but not at the Rpb1-rich loci, is dependent on the Cdc10 function, suggesting a distinct binding mechanism. Most Cdc10 promoter bindings at the Rpb1-poor loci are associated with the G(1)/S-phase genes. While Res1 or Res2 is found at both the Cdc10 promoter and intragenic binding sites, Rep2 appears to be absent at the Cdc10 promoter binding sites but present at the intragenic sites. Time course ChIP-chip analysis demonstrates that Rep2 is temporally accumulated at the coding region of the MBF target genes, resembling the RNAP-II occupancies. Taken together, our results show that deconvolution analysis of Cdc10 occupancies refines the functional subset of genomic binding sites. We propose that the MBF activator Rep2 plays a role in mediating the cell cycle-specific transcription through the recruitment of RNAP-II to the MBF-bound G(1)/S-phase genes.
Subject(s)
Cell Cycle Proteins/metabolism , Genome, Fungal , Schizosaccharomyces pombe Proteins/metabolism , Schizosaccharomyces/genetics , Trans-Activators/metabolism , Transcription Factors/metabolism , Base Sequence , Chromatin Immunoprecipitation/methods , DNA, Intergenic/metabolism , Gene Components , Genes, cdc , Oligonucleotide Array Sequence Analysis/methods , Promoter Regions, Genetic , Protein Binding , Schizosaccharomyces/metabolism
ABSTRACT
SUMMARY: The algorithm MGR enables the reconstruction of rearrangement phylogenies based on gene or synteny block order in multiple genomes. Although MGR has been successfully applied to study the evolution of different sets of species, its utilization has been hampered by the prohibitive running time for some applications. In the current work, we have designed new heuristics that significantly speed up the tool without compromising its accuracy. Moreover, we have developed a web server (webMGR) that includes elaborate web output to facilitate navigation through the results. AVAILABILITY: webMGR can be accessed via http://www.gis.a-star.edu.sg/~bourque. The source code of the improved standalone version of MGR is also freely available from the web site. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Computational Biology/methods , Gene Rearrangement/genetics , Genome , Internet , Software , Algorithms , Databases, Genetic , Phylogeny , Synteny
ABSTRACT
Unequivocal demonstration of the therapeutic utility of γ-retroviral vectors for gene therapy applications targeting the hematopoietic system was accompanied by instances of insertional mutagenesis. These events stimulated the ongoing development of putatively safer integrating vector systems and analysis methods to characterize and compare integration site (IS) biosafety profiles. Continuing advances in next-generation sequencing technologies are driving the generation of ever-more complex IS datasets. Available bioinformatic tools to compare such datasets focus on the association of integration sites (ISs) with selected genomic and epigenetic features, and the choice of these features determines the ability to discriminate between datasets. We describe the scalable application of point-process coherence analysis (CA) to compare patterns produced by vector ISs across genomic intervals, uncoupled from association with genomic features. To explore the utility of CA in the context of an unresolved question, we asked whether the differing transduction conditions used in the initial Paris and London SCID-X1 gene therapy trials result in divergent genome-wide integration profiles. We tested a transduction carried out under each condition, and showed that CA could indeed resolve differences in IS distributions. Existence of these differences was confirmed by the application of established methods to compare integration datasets.
ABSTRACT
Structural variations (SVs) contribute significantly to the variability of the human genome and extensive genomic rearrangements are a hallmark of cancer. While genomic DNA paired-end-tag (DNA-PET) sequencing is an attractive approach to identify genomic SVs, the current application of PET sequencing with short insert size DNA can be insufficient for the comprehensive mapping of SVs in low complexity and repeat-rich genomic regions. We employed a recently developed procedure to generate PET sequencing data using large DNA inserts of 10-20 kb and compared their characteristics with short insert (1 kb) libraries for their ability to identify SVs. Our results suggest that although short insert libraries bear an advantage in identifying small deletions, they do not provide significantly better breakpoint resolution. In contrast, large inserts are superior to short inserts in providing higher physical genome coverage for the same sequencing cost and achieve greater sensitivity, in practice, for the identification of several classes of SVs, such as copy number neutral and complex events. Furthermore, our results confirm that large insert libraries allow for the identification of SVs within repetitive sequences, which cannot be spanned by short inserts. This provides a key advantage in studying rearrangements in cancer, and we show how it can be used in a fusion-point-guided-concatenation algorithm to study focally amplified regions in cancer.
Subject(s)
Genome, Human , Genomic Structural Variation , Mutation , Neoplasms/genetics , Open Reading Frames , Sequence Analysis, DNA/methods , Algorithms , Cell Line, Tumor , Chromosome Mapping , DNA Copy Number Variations , Genomic Library , Humans , Mutagenesis, Insertional
ABSTRACT
BACKGROUND: Gastric cancer is the second highest cause of global cancer mortality. To explore the complete repertoire of somatic alterations in gastric cancer, we combined massively parallel short read and DNA paired-end tag sequencing to present the first whole-genome analysis of two gastric adenocarcinomas, one with chromosomal instability and the other with microsatellite instability. RESULTS: Integrative analysis and de novo assemblies revealed the architecture of a wild-type KRAS amplification, a common driver event in gastric cancer. We discovered three distinct mutational signatures in gastric cancer--against a genome-wide backdrop of oxidative and microsatellite instability-related mutational signatures, we identified the first exome-specific mutational signature. Further characterization of the impact of these signatures by combining sequencing data from 40 complete gastric cancer exomes and targeted screening of an additional 94 independent gastric tumors uncovered ACVR2A, RPL22 and LMAN1 as recurrently mutated genes in microsatellite instability-positive gastric cancer and PAPPA as a recurrently mutated gene in TP53 wild-type gastric cancer. CONCLUSIONS: These results highlight how whole-genome cancer sequencing can uncover information relevant to tissue-specific carcinogenesis that would otherwise be missed from exome-sequencing data.
Subject(s)
DNA Mutational Analysis/methods , High-Throughput Nucleotide Sequencing/methods , Stomach Neoplasms/genetics , Adenocarcinoma/genetics , Chromosomal Instability , Deamination , Exome , Genomics , Microsatellite Instability , Mutation , Reactive Oxygen Species/metabolism
ABSTRACT
Tyrosine kinase inhibitors (TKIs) elicit high response rates among individuals with kinase-driven malignancies, including chronic myeloid leukemia (CML) and epidermal growth factor receptor-mutated non-small-cell lung cancer (EGFR NSCLC). However, the extent and duration of these responses are heterogeneous, suggesting the existence of genetic modifiers affecting an individual's response to TKIs. Using paired-end DNA sequencing, we discovered a common intronic deletion polymorphism in the gene encoding BCL2-like 11 (BIM). BIM is a pro-apoptotic member of the B-cell CLL/lymphoma 2 (BCL2) family of proteins, and its upregulation is required for TKIs to induce apoptosis in kinase-driven cancers. The polymorphism switched BIM splicing from exon 4 to exon 3, which resulted in expression of BIM isoforms lacking the pro-apoptotic BCL2-homology domain 3 (BH3). The polymorphism was sufficient to confer intrinsic TKI resistance in CML and EGFR NSCLC cell lines, but this resistance could be overcome with BH3-mimetic drugs. Notably, individuals with CML and EGFR NSCLC harboring the polymorphism experienced significantly inferior responses to TKIs than did individuals without the polymorphism (P = 0.02 for CML and P = 0.027 for EGFR NSCLC). Our results offer an explanation for the heterogeneity of TKI responses across individuals and suggest the possibility of personalizing therapy with BH3 mimetics to overcome BIM-polymorphism-associated TKI resistance.
Subject(s)
Apoptosis Regulatory Proteins/genetics , Apoptosis/drug effects , Carcinoma, Non-Small-Cell Lung/genetics , Drug Resistance, Neoplasm/drug effects , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics , Lung Neoplasms/genetics , Membrane Proteins/genetics , Polymorphism, Genetic/genetics , Protein Kinase Inhibitors/pharmacology , Proto-Oncogene Proteins/genetics , Sequence Deletion/genetics , Adult , Aged , Aged, 80 and over , Annexins/metabolism , BH3 Interacting Domain Death Agonist Protein/genetics , Bcl-2-Like Protein 11 , Carcinoma, Non-Small-Cell Lung/drug therapy , Cell Line, Tumor , Cohort Studies , Dose-Response Relationship, Drug , Drug Resistance, Neoplasm/genetics , Enzyme-Linked Immunosorbent Assay/methods , ErbB Receptors/genetics , Exons/genetics , Female , Follow-Up Studies , Gene Expression Regulation, Neoplastic/drug effects , Gene Frequency , Genotype , Humans , International Cooperation , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/drug therapy , Lung Neoplasms/drug therapy , Male , Middle Aged , Protein Isoforms/genetics , Protein Isoforms/metabolism , RNA, Small Interfering/metabolism , Statistics, Nonparametric , Transfection
ABSTRACT
Mammalian genomes are viewed as functional organizations that orchestrate spatial and temporal gene regulation. CTCF, the most characterized insulator-binding protein, has been implicated as a key genome organizer. However, little is known about CTCF-associated higher-order chromatin structures at a global scale. Here we applied chromatin interaction analysis by paired-end tag (ChIA-PET) sequencing to elucidate the CTCF-chromatin interactome in pluripotent cells. From this analysis, we identified 1,480 cis- and 336 trans-interacting loci with high reproducibility and precision. Associating these chromatin interaction loci with their underlying epigenetic states, promoter activities, enhancer binding and nuclear lamina occupancy, we uncovered five distinct chromatin domains that suggest potential new models of CTCF function in chromatin organization and transcriptional control. Specifically, CTCF interactions demarcate chromatin-nuclear membrane attachments and influence proper gene expression through extensive cross-talk between promoters and regulatory elements. This highly complex nuclear organization offers insights toward the unifying principles that govern genome plasticity and function.
Subject(s)
Chromatin/genetics , Chromatin/metabolism , DNA-Binding Proteins/metabolism , Embryo, Mammalian/metabolism , Genes, Regulator , Pluripotent Stem Cells/metabolism , Repressor Proteins/metabolism , Animals , CCCTC-Binding Factor , Cells, Cultured , Chromatin/chemistry , Chromatin Immunoprecipitation , DNA-Binding Proteins/genetics , Embryo, Mammalian/cytology , Epigenomics , Gene Expression Regulation , In Situ Hybridization, Fluorescence , Mice , Promoter Regions, Genetic/genetics , RNA, Small Interfering/genetics , Repressor Proteins/antagonists & inhibitors , Repressor Proteins/genetics , Transcription, Genetic
ABSTRACT
Careful analysis of microarray probe design should be an obligatory component of the MicroArray Quality Control (MAQC) project [Patterson et al., 2006; Shi et al., 2006] initiated by the FDA (USA) in order to provide quality control tools to researchers of gene expression profiles and to translate the microarray technology from bench to bedside. The identification and filtering of unreliable probesets are important preprocessing steps before analysis of microarray data. These steps may result in an essential improvement in the selection of differentially expressed genes, gene clustering and construction of co-regulatory expression networks. We revised genome localization of the Affymetrix U133A&B GeneChip initial (target) probe sequences, and evaluated the impact of erroneous and poorly annotated target sequences on the quality of gene expression data. We found about 25% of Affymetrix target sequences overlapping with interspersed repeats that could cause cross-hybridization effects. In total, discrepancies in target sequence annotation account for up to approximately 30% of 44692 Affymetrix probesets. We introduce a novel quality control algorithm based on target sequence mapping onto the genome and GeneChip expression data analysis. To validate the quality of probesets we used expression data from large, clinically and genetically distinct groups of breast cancers (249 samples). For the first time, we quantitatively evaluated the effect of repeats and other sources of inadequate probe design on the specificity, reliability and discrimination ability of Affymetrix probesets. We propose that only the functionally reliable Affymetrix probesets that passed our quality control algorithm (approximately 86%) should be used for gene expression analysis. The target sequence annotation and filtering are available upon request.
Subject(s)
Chromosome Mapping , Gene Expression Profiling/methods , Genome, Human , Oligonucleotide Array Sequence Analysis , Expressed Sequence Tags , Humans , Models, Genetic , RNA, Messenger/genetics , Reproducibility of Results
ABSTRACT
Epigenetic modifications are crucial for proper lineage specification and embryo development. To explore the chromatin modification landscapes in human ES cells, we profiled two histone modifications, H3K4me3 and H3K27me3, by ChIP coupled with the paired-end ditags sequencing strategy. H3K4me3 was found to be a prevalent mark and occurred in close proximity to the promoters of two-thirds of total human genes. Among the H3K27me3 loci identified, 56% are associated with promoters and the vast majority of them are comodified by H3K4me3. By deep-transcript digital counting, 80% of H3K4me3 and 36% of comodified promoters were found to be transcribed. Remarkably, we observed that different combinations of histone methylations are associated with genes from distinct functional categories. These global histone methylation maps provide an epigenetic framework that enables the discovery of novel transcriptional networks and delineation of different genetic compartments of the pluripotent cell genome.
Subject(s)
Embryonic Stem Cells/metabolism , Gene Expression Profiling , Genome, Human/genetics , Histones/metabolism , Lysine/metabolism , Animals , Cell Differentiation , Conserved Sequence , DNA, Intergenic/genetics , DNA-Binding Proteins/genetics , Embryonic Stem Cells/cytology , Humans , Methylation , Mice , Proteasome Endopeptidase Complex/genetics , Protein Transport , Transcription, Genetic , Up-Regulation/genetics , Vertebrates/genetics
ABSTRACT
Identification of unconventional functional features such as fusion transcripts is a challenging task in the effort to annotate all functional DNA elements in the human genome. Paired-End diTag (PET) analysis possesses a unique capability to accurately and efficiently characterize the two ends of DNA fragments, which may have either normal or unusual compositions. This unique nature of PET analysis makes it an ideal tool for uncovering unconventional features residing in the human genome. Using the PET approach for comprehensive transcriptome analysis, we were able to identify fusion transcripts derived from genome rearrangements and actively expressed retrotransposed pseudogenes, which would be difficult to capture by other means. Here, we demonstrate this unique capability through the analysis of 865,000 individual transcripts in two types of cancer cells. In addition to the characterization of a large number of differentially expressed alternative 5' and 3' transcript variants and novel transcriptional units, we identified 70 fusion transcript candidates in this study. One was validated as the product of a fusion gene between BCAS4 and BCAS3 resulting from an amplification followed by a translocation event between the two loci, chr20q13 and chr17q23. Through an examination of PETs that mapped to multiple genomic locations, we identified 4055 retrotransposed loci in the human genome, of which at least three were found to be transcriptionally active. The PET mapping strategy presented here promises to be a useful tool in annotating the human genome, especially aberrations in human cancer genomes.
Subject(s)
Chromosomes, Human, Pair 17/genetics , Chromosomes, Human, Pair 20/genetics , Genome, Human , Neoplasms/genetics , Transcription, Genetic , Translocation, Genetic , Cell Line, Tumor , Humans , Neoplasm Proteins/genetics , Quantitative Trait Loci , Retroelements , Sequence Analysis, DNA
ABSTRACT
NF-kappaB is a key mediator of inflammation. Here, we mapped the genome-wide loci bound by the RELA subunit of NF-kappaB in lipopolysaccharide (LPS)-stimulated human monocytic cells, and together with global gene expression profiling, found an overrepresentation of the E2F1-binding motif among RELA-bound loci associated with NF-kappaB target genes. Knockdown of endogenous E2F1 impaired the LPS inducibility of the proinflammatory cytokines CCL3(MIP-1alpha), IL23A(p19), TNF-alpha, and IL1-beta. Upon LPS stimulation, E2F1 is rapidly recruited to the promoters of these genes along with p50/RELA heterodimer via a mechanism that is dependent on NF-kappaB activation. Together with the observation that E2F1 physically interacts with p50/RELA in LPS-stimulated cells, our findings suggest that NF-kappaB recruits E2F1 to fully activate the transcription of NF-kappaB target genes. Global gene expression profiling subsequently revealed a spectrum of NF-kappaB target genes that are positively regulated by E2F1, further demonstrating the critical role of E2F1 in the Toll-like receptor 4 pathway.
Subject(s)
E2F1 Transcription Factor/metabolism , Genome, Human/genetics , Toll-Like Receptor 4/metabolism , Trans-Activators/metabolism , Transcription Factor RelA/metabolism , Amino Acid Motifs , Base Sequence , Binding Sites , Cell Line , Cell Nucleus/drug effects , Cell Nucleus/metabolism , Consensus Sequence , Cytokines/metabolism , Gene Expression Regulation/drug effects , Humans , Inflammation Mediators/metabolism , Lipopolysaccharides/pharmacology , Molecular Sequence Data , Protein Binding/drug effects , Protein Transport/drug effects , Retinoblastoma Protein/metabolism
ABSTRACT
The ability to derive a whole-genome map of transcription-factor binding sites (TFBS) is crucial for elucidating gene regulatory networks. Herein, we describe a robust approach that couples chromatin immunoprecipitation (ChIP) with the paired-end ditag (PET) sequencing strategy for unbiased and precise global localization of TFBS. We have applied this strategy to map p53 targets in the human genome. From a saturated sampling of over half a million PET sequences, we characterized 65,572 unique p53 ChIP DNA fragments and established overlapping PET clusters as a readout to define p53 binding loci with remarkable specificity. Based on this information, we refined the consensus p53 binding motif, identified at least 542 binding loci with high confidence, discovered 98 previously unidentified p53 target genes that were implicated in novel aspects of p53 functions, and showed their clinical relevance to p53-dependent tumorigenesis in primary cancer samples.
Subject(s)
Chromosome Mapping , Genome, Human , Transcription Factors/genetics , Tumor Suppressor Protein p53/genetics , Binding Sites/genetics , Chromatin Immunoprecipitation/methods , DNA/analysis , HCT116 Cells , Humans , Oligonucleotide Array Sequence Analysis/methods , Transcription Factors/metabolism , Tumor Cells, Cultured , Tumor Suppressor Protein p53/metabolism
ABSTRACT
The protooncogene MYC encodes the c-Myc transcription factor that regulates cell growth, cell proliferation, cell cycle, and apoptosis. Although deregulation of MYC contributes to tumorigenesis, it is still unclear what direct Myc-induced transcriptomes promote cell transformation. Here we provide a snapshot of genome-wide, unbiased characterization of direct Myc binding targets in a model of human B lymphoid tumor using ChIP coupled with paired-end ditag sequencing analysis (ChIP-PET). Myc potentially occupies >4,000 genomic loci, with the majority near proximal promoter regions frequently associated with CpG islands. Using gene expression profiles with ChIP-PET, we identified 668 direct Myc-regulated gene targets, including 48 transcription factors, indicating that Myc is a central transcriptional hub in growth and proliferation control. This first global genomic view of Myc binding sites yields insights into transcriptional circuitries and cis regulatory modules involving Myc and provides a substantial framework for our understanding of mechanisms of Myc-induced tumorigenesis.
Subject(s)
B-Lymphocytes/physiology , Chromosome Mapping , Gene Expression Regulation , Proto-Oncogene Proteins c-myc/metabolism , Binding Sites , Chromatin Immunoprecipitation/methods , CpG Islands , Genome, Human , Humans , MicroRNAs/metabolism , Promoter Regions, Genetic , Sequence Analysis, DNA/methods , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
We have developed a DNA tag sequencing and mapping strategy called gene identification signature (GIS) analysis, in which 5' and 3' signatures of full-length cDNAs are accurately extracted into paired-end ditags (PETs) that are concatenated for efficient sequencing and mapped to genome sequences to demarcate the transcription boundaries of every gene. GIS analysis is potentially 30-fold more efficient than standard cDNA sequencing approaches for transcriptome characterization. We demonstrated this approach with 116,252 PET sequences derived from mouse embryonic stem cells. Initial analysis of this dataset identified hundreds of previously uncharacterized transcripts, including alternative transcripts of known genes. We also uncovered several intergenically spliced and unusual fusion transcripts, one of which was confirmed as a trans-splicing event and was differentially expressed. The concept of paired-end ditagging described here for transcriptome analysis can also be applied to whole-genome analysis of cis-regulatory and other DNA elements and represents an important technological advance for genome annotation.