Results 1 - 18 of 18
1.
Mol Ther Methods Clin Dev ; 2: 15015, 2015.
Article in English | MEDLINE | ID: mdl-26029726

ABSTRACT

Unequivocal demonstration of the therapeutic utility of γ-retroviral vectors for gene therapy applications targeting the hematopoietic system was accompanied by instances of insertional mutagenesis. These events stimulated the ongoing development of putatively safer integrating vector systems and analysis methods to characterize and compare integration site (IS) biosafety profiles. Continuing advances in next-generation sequencing technologies are driving the generation of ever-more complex IS datasets. Available bioinformatic tools to compare such datasets focus on the association of integration sites (ISs) with selected genomic and epigenetic features, and the choice of these features determines the ability to discriminate between datasets. We describe the scalable application of point-process coherence analysis (CA) to compare patterns produced by vector ISs across genomic intervals, uncoupled from association with genomic features. To explore the utility of CA in the context of an unresolved question, we asked whether the differing transduction conditions used in the initial Paris and London SCID-X1 gene therapy trials result in divergent genome-wide integration profiles. We tested a transduction carried out under each condition, and showed that CA could indeed resolve differences in IS distributions. Existence of these differences was confirmed by the application of established methods to compare integration datasets.

2.
Genome Biol ; 13(12): R115, 2012 Dec 13.
Article in English | MEDLINE | ID: mdl-23237666

ABSTRACT

BACKGROUND: Gastric cancer is the second highest cause of global cancer mortality. To explore the complete repertoire of somatic alterations in gastric cancer, we combined massively parallel short read and DNA paired-end tag sequencing to present the first whole-genome analysis of two gastric adenocarcinomas, one with chromosomal instability and the other with microsatellite instability. RESULTS: Integrative analysis and de novo assemblies revealed the architecture of a wild-type KRAS amplification, a common driver event in gastric cancer. We discovered three distinct mutational signatures in gastric cancer: against a genome-wide backdrop of oxidative and microsatellite instability-related mutational signatures, we identified the first exome-specific mutational signature. Further characterization of the impact of these signatures by combining sequencing data from 40 complete gastric cancer exomes and targeted screening of an additional 94 independent gastric tumors uncovered ACVR2A, RPL22 and LMAN1 as recurrently mutated genes in microsatellite instability-positive gastric cancer and PAPPA as a recurrently mutated gene in TP53 wild-type gastric cancer. CONCLUSIONS: These results highlight how whole-genome cancer sequencing can uncover information relevant to tissue-specific carcinogenesis that would otherwise be missed from exome-sequencing data.


Subject(s)
DNA Mutational Analysis/methods , High-Throughput Nucleotide Sequencing/methods , Stomach Neoplasms/genetics , Adenocarcinoma/genetics , Chromosomal Instability , Deamination , Exome , Genomics , Microsatellite Instability , Mutation , Reactive Oxygen Species/metabolism
3.
PLoS One ; 7(9): e46152, 2012.
Article in English | MEDLINE | ID: mdl-23029419

ABSTRACT

Structural variations (SVs) contribute significantly to the variability of the human genome and extensive genomic rearrangements are a hallmark of cancer. While genomic DNA paired-end-tag (DNA-PET) sequencing is an attractive approach to identify genomic SVs, the current application of PET sequencing with short insert size DNA can be insufficient for the comprehensive mapping of SVs in low complexity and repeat-rich genomic regions. We employed a recently developed procedure to generate PET sequencing data using large DNA inserts of 10-20 kb and compared their characteristics with short insert (1 kb) libraries for their ability to identify SVs. Our results suggest that although short insert libraries bear an advantage in identifying small deletions, they do not provide significantly better breakpoint resolution. In contrast, large inserts are superior to short inserts in providing higher physical genome coverage for the same sequencing cost and achieve greater sensitivity, in practice, for the identification of several classes of SVs, such as copy number neutral and complex events. Furthermore, our results confirm that large insert libraries allow for the identification of SVs within repetitive sequences, which cannot be spanned by short inserts. This provides a key advantage in studying rearrangements in cancer, and we show how it can be used in a fusion-point-guided-concatenation algorithm to study focally amplified regions in cancer.


Subject(s)
Genome, Human , Genomic Structural Variation , Mutation , Neoplasms/genetics , Open Reading Frames , Sequence Analysis, DNA/methods , Algorithms , Cell Line, Tumor , Chromosome Mapping , DNA Copy Number Variations , Genomic Library , Humans , Mutagenesis, Insertional
4.
Nat Med ; 18(4): 521-8, 2012 Mar 18.
Article in English | MEDLINE | ID: mdl-22426421

ABSTRACT

Tyrosine kinase inhibitors (TKIs) elicit high response rates among individuals with kinase-driven malignancies, including chronic myeloid leukemia (CML) and epidermal growth factor receptor-mutated non-small-cell lung cancer (EGFR NSCLC). However, the extent and duration of these responses are heterogeneous, suggesting the existence of genetic modifiers affecting an individual's response to TKIs. Using paired-end DNA sequencing, we discovered a common intronic deletion polymorphism in the gene encoding BCL2-like 11 (BIM). BIM is a pro-apoptotic member of the B-cell CLL/lymphoma 2 (BCL2) family of proteins, and its upregulation is required for TKIs to induce apoptosis in kinase-driven cancers. The polymorphism switched BIM splicing from exon 4 to exon 3, which resulted in expression of BIM isoforms lacking the pro-apoptotic BCL2-homology domain 3 (BH3). The polymorphism was sufficient to confer intrinsic TKI resistance in CML and EGFR NSCLC cell lines, but this resistance could be overcome with BH3-mimetic drugs. Notably, individuals with CML and EGFR NSCLC harboring the polymorphism experienced significantly inferior responses to TKIs than did individuals without the polymorphism (P = 0.02 for CML and P = 0.027 for EGFR NSCLC). Our results offer an explanation for the heterogeneity of TKI responses across individuals and suggest the possibility of personalizing therapy with BH3 mimetics to overcome BIM-polymorphism-associated TKI resistance.


Subject(s)
Apoptosis Regulatory Proteins/genetics , Apoptosis/drug effects , Carcinoma, Non-Small-Cell Lung/genetics , Drug Resistance, Neoplasm/drug effects , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics , Lung Neoplasms/genetics , Membrane Proteins/genetics , Polymorphism, Genetic/genetics , Protein Kinase Inhibitors/pharmacology , Proto-Oncogene Proteins/genetics , Sequence Deletion/genetics , Adult , Aged , Aged, 80 and over , Annexins/metabolism , BH3 Interacting Domain Death Agonist Protein/genetics , Bcl-2-Like Protein 11 , Carcinoma, Non-Small-Cell Lung/drug therapy , Cell Line, Tumor , Cohort Studies , Dose-Response Relationship, Drug , Drug Resistance, Neoplasm/genetics , Enzyme-Linked Immunosorbent Assay/methods , ErbB Receptors/genetics , Exons/genetics , Female , Follow-Up Studies , Gene Expression Regulation, Neoplastic/drug effects , Gene Frequency , Genotype , Humans , International Cooperation , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/drug therapy , Lung Neoplasms/drug therapy , Male , Middle Aged , Protein Isoforms/genetics , Protein Isoforms/metabolism , RNA, Small Interfering/metabolism , Statistics, Nonparametric , Transfection
5.
Nat Genet ; 43(7): 630-8, 2011 Jun 19.
Article in English | MEDLINE | ID: mdl-21685913

ABSTRACT

Mammalian genomes are viewed as functional organizations that orchestrate spatial and temporal gene regulation. CTCF, the most characterized insulator-binding protein, has been implicated as a key genome organizer. However, little is known about CTCF-associated higher-order chromatin structures at a global scale. Here we applied chromatin interaction analysis by paired-end tag (ChIA-PET) sequencing to elucidate the CTCF-chromatin interactome in pluripotent cells. From this analysis, we identified 1,480 cis- and 336 trans-interacting loci with high reproducibility and precision. Associating these chromatin interaction loci with their underlying epigenetic states, promoter activities, enhancer binding and nuclear lamina occupancy, we uncovered five distinct chromatin domains that suggest potential new models of CTCF function in chromatin organization and transcriptional control. Specifically, CTCF interactions demarcate chromatin-nuclear membrane attachments and influence proper gene expression through extensive cross-talk between promoters and regulatory elements. This highly complex nuclear organization offers insights toward the unifying principles that govern genome plasticity and function.


Subject(s)
Chromatin/genetics , Chromatin/metabolism , DNA-Binding Proteins/metabolism , Embryo, Mammalian/metabolism , Genes, Regulator , Pluripotent Stem Cells/metabolism , Repressor Proteins/metabolism , Animals , CCCTC-Binding Factor , Cells, Cultured , Chromatin/chemistry , Chromatin Immunoprecipitation , DNA-Binding Proteins/genetics , Embryo, Mammalian/cytology , Epigenomics , Gene Expression Regulation , In Situ Hybridization, Fluorescence , Mice , Promoter Regions, Genetic/genetics , RNA, Small Interfering/genetics , Repressor Proteins/antagonists & inhibitors , Repressor Proteins/genetics , Transcription, Genetic
6.
Genome Res ; 21(5): 676-87, 2011 May.
Article in English | MEDLINE | ID: mdl-21467264

ABSTRACT

Using a long-span, paired-end deep sequencing strategy, we have comprehensively identified cancer genome rearrangements in eight breast cancer genomes. Herein, we show that 40%-54% of these structural genomic rearrangements result in different forms of fusion transcripts and that 44% are potentially translated. We find that single segmental tandem duplication spanning several genes is a major source of the fusion gene transcripts in both cell lines and primary tumors involving adjacent genes placed in the reverse-order position by the duplication event. Certain other structural mutations, however, tend to attenuate gene expression. From these candidate gene fusions, we have found a fusion transcript (RPS6KB1-VMP1) recurrently expressed in ∼30% of breast cancers associated with potential clinical consequences. This gene fusion is caused by tandem duplication on 17q23 and appears to be an indicator of local genomic instability altering the expression of oncogenic components such as MIR21 and RPS6KB1.


Subject(s)
Breast Neoplasms/metabolism , Gene Rearrangement , Genome, Human/genetics , Membrane Proteins/genetics , Membrane Proteins/metabolism , Recombinant Fusion Proteins/metabolism , Ribosomal Protein S6 Kinases/metabolism , Transcription, Genetic , Breast Neoplasms/genetics , Cell Line, Tumor , Chromosome Mapping , Chromosomes, Human, Pair 17/genetics , Female , Gene Dosage , Gene Expression Profiling , Genomic Instability , High-Throughput Nucleotide Sequencing , Humans , Recombinant Fusion Proteins/genetics , Ribosomal Protein S6 Kinases/genetics , Sequence Analysis, DNA
7.
Genome Res ; 21(5): 665-75, 2011 May.
Article in English | MEDLINE | ID: mdl-21467267

ABSTRACT

Somatic genome rearrangements are thought to play important roles in cancer development. We optimized a long-span paired-end-tag (PET) sequencing approach using 10-Kb genomic DNA inserts to study human genome structural variations (SVs). The use of a 10-Kb insert size allows the identification of breakpoints within repetitive or homology-containing regions of a few kilobases in size and results in a higher physical coverage compared with small insert libraries with the same sequencing effort. We have applied this approach to comprehensively characterize the SVs of 15 cancer and two noncancer genomes and used a filtering approach to strongly enrich for somatic SVs in the cancer genomes. Our analyses revealed that most inversions, deletions, and insertions are germ-line SVs, whereas tandem duplications, unpaired inversions, interchromosomal translocations, and complex rearrangements are over-represented among somatic rearrangements in cancer genomes. We demonstrate that the quantitative and connective nature of DNA-PET data is precise in delineating the genealogy of complex rearrangement events, we observe signatures that are compatible with breakage-fusion-bridge cycles, and we discover that large duplications are among the initial rearrangements that trigger genome instability for extensive amplification in epithelial cancers.


Subject(s)
Base Pairing/genetics , Breast Neoplasms/genetics , Chromosome Mapping/methods , Genome, Human/genetics , Genomic Structural Variation/genetics , Stomach Neoplasms/genetics , Cell Line, Tumor , Computational Biology , DNA/genetics , Female , Gene Rearrangement , Humans , Sequence Analysis, DNA
8.
Eukaryot Cell ; 10(1): 130-41, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21076007

ABSTRACT

MBF (or DSC1) is known to regulate transcription of a set of G1/S-phase genes encoding proteins involved in regulation of DNA replication. Previous studies have shown that MBF binds not only the promoter of G1/S-phase genes, but also the constitutive genes; however, it was unclear if the MBF bindings at the G1/S-phase and constitutive genes were mechanistically distinguishable. Here, we report a chromatin immunoprecipitation-microarray (ChIP-chip) analysis of MBF binding in the Schizosaccharomyces pombe genome using high-resolution genome tiling microarrays. ChIP-chip analysis indicates that the majority of the MBF occupancies are located at the intragenic regions. Deconvolution analysis using Rpb1 ChIP-chip results distinguishes the Cdc10 bindings at the Rpb1-poor loci (promoters) from those at the Rpb1-rich loci (intragenic sequences). Importantly, Res1 binding at the Rpb1-poor loci, but not at the Rpb1-rich loci, is dependent on the Cdc10 function, suggesting a distinct binding mechanism. Most Cdc10 promoter bindings at the Rpb1-poor loci are associated with the G1/S-phase genes. While Res1 or Res2 is found at both the Cdc10 promoter and intragenic binding sites, Rep2 appears to be absent at the Cdc10 promoter binding sites but present at the intragenic sites. Time course ChIP-chip analysis demonstrates that Rep2 is temporally accumulated at the coding region of the MBF target genes, resembling the RNAP-II occupancies. Taken together, our results show that deconvolution analysis of Cdc10 occupancies refines the functional subset of genomic binding sites. We propose that the MBF activator Rep2 plays a role in mediating the cell cycle-specific transcription through the recruitment of RNAP-II to the MBF-bound G1/S-phase genes.


Subject(s)
Cell Cycle Proteins/metabolism , Genome, Fungal , Schizosaccharomyces pombe Proteins/metabolism , Schizosaccharomyces/genetics , Trans-Activators/metabolism , Transcription Factors/metabolism , Base Sequence , Chromatin Immunoprecipitation/methods , DNA, Intergenic/metabolism , Gene Components , Genes, cdc , Oligonucleotide Array Sequence Analysis/methods , Promoter Regions, Genetic , Protein Binding , Schizosaccharomyces/metabolism
9.
Proc Natl Acad Sci U S A ; 107(42): 18161-6, 2010 Oct 19.
Article in English | MEDLINE | ID: mdl-20921386

ABSTRACT

MicroRNAs (miRNAs) are a class of small, noncoding RNAs that function as posttranscriptional regulators of gene expression. Many miRNAs are expressed in the developing brain and regulate multiple aspects of neural development, including neurogenesis, dendritogenesis, and synapse formation. Rett syndrome (RTT) is a progressive neurodevelopmental disorder caused by mutations in the gene encoding methyl-CpG-binding protein 2 (MECP2). Although Mecp2 is known to act as a global transcriptional regulator, miRNAs that are directly regulated by Mecp2 in the brain are not known. Using massively parallel sequencing methods, we have identified miRNAs whose expression is altered in cerebella of Mecp2-null mice before and after the onset of severe neurological symptoms. In vivo genome-wide analyses indicate that promoter regions of a significant fraction of dysregulated miRNA transcripts, including a large polycistronic cluster of brain-specific miRNAs, are DNA-methylated and are bound directly by Mecp2. Functional analysis demonstrates that the 3' UTR of messenger RNA encoding Brain-derived neurotrophic factor (Bdnf) can be targeted by multiple miRNAs aberrantly up-regulated in the absence of Mecp2. Taken together, these results suggest that dysregulation of miRNAs may contribute to RTT pathoetiology and also may provide a valuable resource for further investigations of the role of miRNAs in RTT.


Subject(s)
Disease Models, Animal , Genome-Wide Association Study , Methyl-CpG-Binding Protein 2/physiology , MicroRNAs/genetics , Rett Syndrome/genetics , 3' Untranslated Regions , Animals , Chromatin Immunoprecipitation , Enzyme-Linked Immunosorbent Assay , Methyl-CpG-Binding Protein 2/genetics , Mice , Mice, Knockout , Promoter Regions, Genetic , Rett Syndrome/metabolism
10.
Bioinformatics ; 26(3): 408-10, 2010 Feb 01.
Article in English | MEDLINE | ID: mdl-20022974

ABSTRACT

SUMMARY: The algorithm MGR enables the reconstruction of rearrangement phylogenies based on gene or synteny block order in multiple genomes. Although MGR has been successfully applied to study the evolution of different sets of species, its utilization has been hampered by the prohibitive running time for some applications. In the current work, we have designed new heuristics that significantly speed up the tool without compromising its accuracy. Moreover, we have developed a web server (webMGR) that includes elaborate web output to facilitate navigation through the results. AVAILABILITY: webMGR can be accessed via http://www.gis.a-star.edu.sg/~bourque. The source code of the improved standalone version of MGR is also freely available from the web site. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Gene Rearrangement/genetics , Genome , Internet , Software , Algorithms , Databases, Genetic , Phylogeny , Synteny
11.
Cell ; 133(6): 1106-17, 2008 Jun 13.
Article in English | MEDLINE | ID: mdl-18555785

ABSTRACT

Transcription factors (TFs) and their specific interactions with targets are crucial for specifying gene-expression programs. To gain insights into the transcriptional regulatory networks in embryonic stem (ES) cells, we use chromatin immunoprecipitation coupled with ultra-high-throughput DNA sequencing (ChIP-seq) to map the locations of 13 sequence-specific TFs (Nanog, Oct4, STAT3, Smad1, Sox2, Zfx, c-Myc, n-Myc, Klf4, Esrrb, Tcfcp2l1, E2f1, and CTCF) and 2 transcription regulators (p300 and Suz12). These factors are known to play different roles in ES-cell biology as components of the LIF and BMP signaling pathways, self-renewal regulators, and key reprogramming factors. Our study provides insights into the integration of the signaling pathways into the ES-cell-specific transcription circuitries. Intriguingly, we find specific genomic regions extensively targeted by different TFs. Collectively, the comprehensive mapping of TF-binding sites identifies important features of the transcriptional regulatory networks that define ES-cell identity.


Subject(s)
Embryonic Stem Cells/metabolism , Gene Regulatory Networks , Signal Transduction , Animals , Base Sequence , Binding Sites , Chromatin Immunoprecipitation , Genome , Kruppel-Like Factor 4 , Mice , Multiprotein Complexes , Transcription Factors/metabolism
12.
Mol Cell ; 27(4): 622-35, 2007 Aug 17.
Article in English | MEDLINE | ID: mdl-17707233

ABSTRACT

NF-κB is a key mediator of inflammation. Here, we mapped the genome-wide loci bound by the RELA subunit of NF-κB in lipopolysaccharide (LPS)-stimulated human monocytic cells, and together with global gene expression profiling, found an overrepresentation of the E2F1-binding motif among RELA-bound loci associated with NF-κB target genes. Knockdown of endogenous E2F1 impaired the LPS inducibility of the proinflammatory cytokines CCL3 (MIP-1α), IL23A (p19), TNF-α, and IL1-β. Upon LPS stimulation, E2F1 is rapidly recruited to the promoters of these genes along with p50/RELA heterodimer via a mechanism that is dependent on NF-κB activation. Together with the observation that E2F1 physically interacts with p50/RELA in LPS-stimulated cells, our findings suggest that NF-κB recruits E2F1 to fully activate the transcription of NF-κB target genes. Global gene expression profiling subsequently revealed a spectrum of NF-κB target genes that are positively regulated by E2F1, further demonstrating the critical role of E2F1 in the Toll-like receptor 4 pathway.


Subject(s)
E2F1 Transcription Factor/metabolism , Genome, Human/genetics , Toll-Like Receptor 4/metabolism , Trans-Activators/metabolism , Transcription Factor RelA/metabolism , Amino Acid Motifs , Base Sequence , Binding Sites , Cell Line , Cell Nucleus/drug effects , Cell Nucleus/metabolism , Consensus Sequence , Cytokines/metabolism , Gene Expression Regulation/drug effects , Humans , Inflammation Mediators/metabolism , Lipopolysaccharides/pharmacology , Molecular Sequence Data , Protein Binding/drug effects , Protein Transport/drug effects , Retinoblastoma Protein/metabolism
13.
Genome Res ; 17(6): 828-38, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17568001

ABSTRACT

Identification of unconventional functional features such as fusion transcripts is a challenging task in the effort to annotate all functional DNA elements in the human genome. Paired-End diTag (PET) analysis possesses a unique capability to accurately and efficiently characterize the two ends of DNA fragments, which may have either normal or unusual compositions. This unique nature of PET analysis makes it an ideal tool for uncovering unconventional features residing in the human genome. Using the PET approach for comprehensive transcriptome analysis, we were able to identify fusion transcripts derived from genome rearrangements and actively expressed retrotransposed pseudogenes, which would be difficult to capture by other means. Here, we demonstrate this unique capability through the analysis of 865,000 individual transcripts in two types of cancer cells. In addition to the characterization of a large number of differentially expressed alternative 5' and 3' transcript variants and novel transcriptional units, we identified 70 fusion transcript candidates in this study. One was validated as the product of a fusion gene between BCAS4 and BCAS3 resulting from an amplification followed by a translocation event between the two loci, chr20q13 and chr17q23. Through an examination of PETs that mapped to multiple genomic locations, we identified 4055 retrotransposed loci in the human genome, of which at least three were found to be transcriptionally active. The PET mapping strategy presented here promises to be a useful tool in annotating the human genome, especially aberrations in human cancer genomes.


Subject(s)
Chromosomes, Human, Pair 17/genetics , Chromosomes, Human, Pair 20/genetics , Genome, Human , Neoplasms/genetics , Transcription, Genetic , Translocation, Genetic , Cell Line, Tumor , Humans , Neoplasm Proteins/genetics , Quantitative Trait Loci , Retroelements , Sequence Analysis, DNA
14.
Cell Stem Cell ; 1(3): 286-98, 2007 Sep 13.
Article in English | MEDLINE | ID: mdl-18371363

ABSTRACT

Epigenetic modifications are crucial for proper lineage specification and embryo development. To explore the chromatin modification landscapes in human ES cells, we profiled two histone modifications, H3K4me3 and H3K27me3, by ChIP coupled with the paired-end ditags sequencing strategy. H3K4me3 was found to be a prevalent mark and occurred in close proximity to the promoters of two-thirds of total human genes. Among the H3K27me3 loci identified, 56% are associated with promoters and the vast majority of them are comodified by H3K4me3. By deep-transcript digital counting, 80% of H3K4me3 and 36% of comodified promoters were found to be transcribed. Remarkably, we observed that different combinations of histone methylations are associated with genes from distinct functional categories. These global histone methylation maps provide an epigenetic framework that enables the discovery of novel transcriptional networks and delineation of different genetic compartments of the pluripotent cell genome.


Subject(s)
Embryonic Stem Cells/metabolism , Gene Expression Profiling , Genome, Human/genetics , Histones/metabolism , Lysine/metabolism , Animals , Cell Differentiation , Conserved Sequence , DNA, Intergenic/genetics , DNA-Binding Proteins/genetics , Embryonic Stem Cells/cytology , Humans , Methylation , Mice , Proteasome Endopeptidase Complex/genetics , Protein Transport , Transcription, Genetic , Up-Regulation/genetics , Vertebrates/genetics
15.
In Silico Biol ; 7(3): 241-60, 2007.
Article in English | MEDLINE | ID: mdl-18415975

ABSTRACT

Careful analysis of microarray probe design should be an obligatory component of the MicroArray Quality Control (MAQC) project [Patterson et al., 2006; Shi et al., 2006] initiated by the FDA (USA) in order to provide quality control tools to researchers of gene expression profiles and to translate the microarray technology from bench to bedside. The identification and filtering of unreliable probesets are important preprocessing steps before analysis of microarray data. These steps may result in an essential improvement in the selection of differentially expressed genes, gene clustering and construction of co-regulatory expression networks. We revised genome localization of the Affymetrix U133A&B GeneChip initial (target) probe sequences, and evaluated the impact of erroneous and poorly annotated target sequences on the quality of gene expression data. We found about 25% of Affymetrix target sequences overlapping with interspersed repeats that could cause cross-hybridization effects. In total, discrepancies in target sequence annotation account for up to approximately 30% of 44692 Affymetrix probesets. We introduce a novel quality control algorithm based on target sequence mapping onto the genome and GeneChip expression data analysis. To validate the quality of probesets we used expression data from large, clinically and genetically distinct groups of breast cancers (249 samples). For the first time, we quantitatively evaluated the effect of repeats and other sources of inadequate probe design on the specificity, reliability and discrimination ability of Affymetrix probesets. We propose that only functionally reliable Affymetrix probesets that passed our quality control algorithm (approximately 86%) for gene expression analysis should be utilized. The target sequence annotation and filtering are available upon request.


Subject(s)
Chromosome Mapping , Gene Expression Profiling/methods , Genome, Human , Oligonucleotide Array Sequence Analysis , Expressed Sequence Tags , Humans , Models, Genetic , RNA, Messenger/genetics , Reproducibility of Results
16.
Proc Natl Acad Sci U S A ; 103(47): 17834-9, 2006 Nov 21.
Article in English | MEDLINE | ID: mdl-17093053

ABSTRACT

The protooncogene MYC encodes the c-Myc transcription factor that regulates cell growth, cell proliferation, cell cycle, and apoptosis. Although deregulation of MYC contributes to tumorigenesis, it is still unclear what direct Myc-induced transcriptomes promote cell transformation. Here we provide a snapshot of genome-wide, unbiased characterization of direct Myc binding targets in a model of human B lymphoid tumor using ChIP coupled with paired-end ditag sequencing analysis (ChIP-PET). Myc potentially occupies > 4,000 genomic loci with the majority near proximal promoter regions associated frequently with CpG islands. Using gene expression profiles with ChIP-PET, we identified 668 direct Myc-regulated gene targets, including 48 transcription factors, indicating that Myc is a central transcriptional hub in growth and proliferation control. This first global genomic view of Myc binding sites yields insights into transcriptional circuitries and cis regulatory modules involving Myc and provides a substantial framework for our understanding of mechanisms of Myc-induced tumorigenesis.


Subject(s)
B-Lymphocytes/physiology , Chromosome Mapping , Gene Expression Regulation , Proto-Oncogene Proteins c-myc/metabolism , Binding Sites , Chromatin Immunoprecipitation/methods , CpG Islands , Genome, Human , Humans , MicroRNAs/metabolism , Promoter Regions, Genetic , Sequence Analysis, DNA/methods , Transcription Factors/genetics , Transcription Factors/metabolism
17.
Cell ; 124(1): 207-19, 2006 Jan 13.
Article in English | MEDLINE | ID: mdl-16413492

ABSTRACT

The ability to derive a whole-genome map of transcription-factor binding sites (TFBS) is crucial for elucidating gene regulatory networks. Herein, we describe a robust approach that couples chromatin immunoprecipitation (ChIP) with the paired-end ditag (PET) sequencing strategy for unbiased and precise global localization of TFBS. We have applied this strategy to map p53 targets in the human genome. From a saturated sampling of over half a million PET sequences, we characterized 65,572 unique p53 ChIP DNA fragments and established overlapping PET clusters as a readout to define p53 binding loci with remarkable specificity. Based on this information, we refined the consensus p53 binding motif, identified at least 542 binding loci with high confidence, discovered 98 previously unidentified p53 target genes that were implicated in novel aspects of p53 functions, and showed their clinical relevance to p53-dependent tumorigenesis in primary cancer samples.


Subject(s)
Chromosome Mapping , Genome, Human , Transcription Factors/genetics , Tumor Suppressor Protein p53/genetics , Binding Sites/genetics , Chromatin Immunoprecipitation/methods , DNA/analysis , HCT116 Cells , Humans , Oligonucleotide Array Sequence Analysis/methods , Transcription Factors/metabolism , Tumor Cells, Cultured , Tumor Suppressor Protein p53/metabolism
18.
Nat Methods ; 2(2): 105-11, 2005 Feb.
Article in English | MEDLINE | ID: mdl-15782207

ABSTRACT

We have developed a DNA tag sequencing and mapping strategy called gene identification signature (GIS) analysis, in which 5' and 3' signatures of full-length cDNAs are accurately extracted into paired-end ditags (PETs) that are concatenated for efficient sequencing and mapped to genome sequences to demarcate the transcription boundaries of every gene. GIS analysis is potentially 30-fold more efficient than standard cDNA sequencing approaches for transcriptome characterization. We demonstrated this approach with 116,252 PET sequences derived from mouse embryonic stem cells. Initial analysis of this dataset identified hundreds of previously uncharacterized transcripts, including alternative transcripts of known genes. We also uncovered several intergenically spliced and unusual fusion transcripts, one of which was confirmed as a trans-splicing event and was differentially expressed. The concept of paired-end ditagging described here for transcriptome analysis can also be applied to whole-genome analysis of cis-regulatory and other DNA elements and represents an important technological advance for genome annotation.


Subject(s)
Chromosome Mapping/methods , DNA Probes/genetics , Gene Expression Profiling/methods , Proteome/genetics , Proteome/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism , 5' Flanking Region/genetics , Animals , Cell Line , Expressed Sequence Tags , Mice , Reproducibility of Results , Sensitivity and Specificity , Sequence Analysis, DNA/methods , Stem Cells/metabolism