Results 1 - 20 of 57
1.
Cell ; 153(5): 1134-48, 2013 May 23.
Article in English | MEDLINE | ID: mdl-23664764

ABSTRACT

Epigenetic mechanisms have been proposed to play crucial roles in mammalian development, but their precise functions are only partially understood. To investigate epigenetic regulation of embryonic development, we differentiated human embryonic stem cells into mesendoderm, neural progenitor cells, trophoblast-like cells, and mesenchymal stem cells and systematically characterized DNA methylation, chromatin modifications, and the transcriptome in each lineage. We found that promoters that are active in early developmental stages tend to be CG rich and mainly engage H3K27me3 upon silencing in nonexpressing lineages. By contrast, promoters for genes expressed preferentially at later stages are often CG poor and primarily employ DNA methylation upon repression. Interestingly, the early developmental regulatory genes are often located in large genomic domains that are generally devoid of DNA methylation in most lineages, which we termed DNA methylation valleys (DMVs). Our results suggest that distinct epigenetic mechanisms regulate early and late stages of ES cell differentiation.


Subject(s)
DNA Methylation , Embryonic Stem Cells/metabolism , Epigenomics , Gene Expression Regulation, Developmental , Animals , Cell Differentiation , Chromatin/metabolism , CpG Islands , Embryonic Stem Cells/cytology , Histones/metabolism , Humans , Methylation , Neoplasms/genetics , Promoter Regions, Genetic , Zebrafish/embryology
2.
Acta Neuropathol ; 147(1): 73, 2024 04 19.
Article in English | MEDLINE | ID: mdl-38641715

ABSTRACT

The most prominent genetic cause of both amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD) is a repeat expansion in the gene C9orf72. Importantly, the transcriptomic consequences of the C9orf72 repeat expansion remain largely unclear. Here, we used short-read RNA sequencing (RNAseq) to profile the cerebellar transcriptome, detecting alterations in patients with a C9orf72 repeat expansion. We focused on the cerebellum, since key C9orf72-related pathologies are abundant in this neuroanatomical region, yet TDP-43 pathology and neuronal loss are minimal. Consistent with previous work, we showed a reduction in the expression of the C9orf72 gene and an elevation in homeobox genes, when comparing patients with the expansion to both patients without the C9orf72 repeat expansion and control subjects. Interestingly, we identified more than 1000 alternative splicing events, including 4 in genes previously associated with ALS and/or FTLD. We also found an increase of cryptic splicing in C9orf72 patients compared to patients without the expansion and controls. Furthermore, we demonstrated that the expression level of select RNA-binding proteins is associated with cryptic splice junction inclusion. Overall, this study explores the presence of widespread transcriptomic changes in the cerebellum, a region not confounded by severe neurodegeneration, in post-mortem tissue from C9orf72 patients.


Subject(s)
Amyotrophic Lateral Sclerosis , C9orf72 Protein , Cerebellum , Frontotemporal Lobar Degeneration , Humans , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/metabolism , Amyotrophic Lateral Sclerosis/pathology , C9orf72 Protein/genetics , C9orf72 Protein/metabolism , Cerebellum/pathology , DNA Repeat Expansion/genetics , Frontotemporal Lobar Degeneration/genetics , Frontotemporal Lobar Degeneration/metabolism , Frontotemporal Lobar Degeneration/pathology , Gene Expression Profiling , Transcriptome
3.
Brain ; 145(7): 2472-2485, 2022 07 29.
Article in English | MEDLINE | ID: mdl-34918030

ABSTRACT

Frontotemporal lobar degeneration with TDP-43 inclusions (FTLD-TDP) is a complex heterogeneous neurodegenerative disorder for which mechanisms are poorly understood. To explore transcriptional changes underlying FTLD-TDP, we performed RNA-sequencing on 66 genetically unexplained FTLD-TDP patients, 24 FTLD-TDP patients with GRN mutations and 24 control participants. Using principal component analysis, hierarchical clustering, differential expression and coexpression network analyses, we showed that GRN mutation carriers and FTLD-TDP-A patients without a known mutation shared a common transcriptional signature that is independent of GRN loss-of-function. After combining both groups, differential expression as compared to the control group and coexpression analyses revealed alteration of processes related to immune response, synaptic transmission, RNA metabolism, angiogenesis and vesicle-mediated transport. Deconvolution of the data highlighted strong cellular alterations that were similar in FTLD-TDP-A and GRN mutation carriers with NSF as a potentially important player in both groups. We propose several potentially druggable pathways such as the GABAergic, GDNF and sphingolipid pathways. Our findings underline new disease mechanisms and strongly suggest that affected pathways in GRN mutation carriers extend beyond GRN and contribute to genetically unexplained forms of FTLD-TDP-A.


Subject(s)
Frontotemporal Dementia , Frontotemporal Lobar Degeneration , Progranulins , Brain/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Frontotemporal Dementia/genetics , Frontotemporal Dementia/metabolism , Frontotemporal Lobar Degeneration/genetics , Frontotemporal Lobar Degeneration/metabolism , Humans , Intercellular Signaling Peptides and Proteins/genetics , Intercellular Signaling Peptides and Proteins/metabolism , Mutation , Progranulins/genetics , Progranulins/metabolism , Transcriptome
4.
Hum Mol Genet ; 29(16): 2761-2774, 2020 09 29.
Article in English | MEDLINE | ID: mdl-32744316

ABSTRACT

Chronic lymphocytic leukemia (CLL) is the most common adult leukemia in Western countries. It has a strong genetic basis, showing an ~8-fold increased risk of CLL in first-degree relatives. Genome-wide association studies (GWAS) have identified 41 risk variants across 41 loci. However, for a majority of the loci, the functional variants and the mechanisms underlying their causal roles remain undefined. Here, we examined the genetic and epigenetic features associated with 12 index variants, along with any correlated (r² ≥ 0.5) variants, at the CLL risk loci located outside of gene promoters. Based on publicly available ChIP-seq and chromatin accessibility data as well as our own ChIP-seq data from CLL patients, we identified six candidate functional variants at six loci and at least two candidate functional variants at each of the remaining six loci. The functional variants are predominantly located within enhancers or super-enhancers, including bi-directionally transcribed enhancers, which are often restricted to immune cell types. Furthermore, we found that, at 78% of the functional variants, the alternative alleles altered the transcription factor binding motifs or histone modifications, indicating the involvement of these variants in the change of local chromatin state. Finally, the enhancers carrying functional variants physically interacted with genes enriched in the type I interferon signaling pathway, apoptosis, or TP53 network that are known to play key roles in CLL. These results support the regulatory roles for inherited noncoding variants in the pathogenesis of CLL.


Subject(s)
Enhancer Elements, Genetic/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Alleles , Chromatin/genetics , Epigenesis, Genetic/genetics , Female , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Male , Polymorphism, Single Nucleotide/genetics , Protein Binding , Risk Factors , Tumor Suppressor Protein p53/genetics
5.
Blood ; 133(26): 2776-2789, 2019 06 27.
Article in English | MEDLINE | ID: mdl-31101622

ABSTRACT

Anaplastic large cell lymphomas (ALCLs) represent a relatively common group of T-cell non-Hodgkin lymphomas (T-NHLs) that are unified by similar pathologic features but demonstrate marked genetic heterogeneity. ALCLs are broadly classified as being anaplastic lymphoma kinase (ALK)+ or ALK-, based on the presence or absence of ALK rearrangements. Exome sequencing of 62 T-NHLs identified a previously unreported recurrent mutation in the musculin gene, MSC E116K, exclusively in ALK- ALCLs. Additional sequencing for a total of 238 T-NHLs confirmed the specificity of MSC E116K for ALK- ALCL and further demonstrated that 14 of 15 mutated cases (93%) had coexisting DUSP22 rearrangements. Musculin is a basic helix-loop-helix (bHLH) transcription factor that heterodimerizes with other bHLH proteins to regulate lymphocyte development. The E116K mutation localized to the DNA binding domain of musculin and permitted formation of musculin-bHLH heterodimers but prevented their binding to authentic target sequence. Functional analysis showed MSC E116K acted in a dominant-negative fashion, reversing wild-type musculin-induced repression of MYC and cell cycle inhibition. Chromatin immunoprecipitation-sequencing and transcriptome analysis identified the cell cycle regulatory gene E2F2 as a direct transcriptional target of musculin. MSC E116K reversed E2F2-induced cell cycle arrest and promoted expression of the CD30-IRF4-MYC axis, whereas its expression was reciprocally induced by binding of IRF4 to the MSC promoter. Finally, ALCL cells expressing MSC E116K were preferentially targeted by the BET inhibitor JQ1. These findings identify a novel recurrent MSC mutation as a key driver of the CD30-IRF4-MYC axis and cell cycle progression in a unique subset of ALCLs.


Subject(s)
Basic Helix-Loop-Helix Transcription Factors/genetics , Lymphoma, Large-Cell, Anaplastic/genetics , Anaplastic Lymphoma Kinase/genetics , Cell Cycle/genetics , Gene Expression Regulation, Neoplastic/genetics , Humans , Mutation
6.
Brief Bioinform ; 19(5): 893-904, 2018 09 28.
Article in English | MEDLINE | ID: mdl-28407084

ABSTRACT

Current variant discovery approaches often rely on an initial read mapping to the reference sequence. Their effectiveness is limited by the presence of gaps, potential misassemblies, regions of duplicates with a high-sequence similarity and regions of high-sequence divergence in the reference. Also, mapping-based approaches are less sensitive to large INDELs and complex variations and provide little phase information in personal genomes. A few de novo assemblers have been developed to identify variants through direct variant calling from the assembly graph, micro-assembly and whole-genome assembly, but mainly for whole-genome sequencing (WGS) data. We developed SGVar, a de novo assembly workflow for haplotype-based variant discovery from whole-exome sequencing (WES) data. Using simulated human exome data, we compared SGVar with five variation-aware de novo assemblers and with BWA-MEM together with three haplotype- or local de novo assembly-based callers. SGVar outperforms the other assemblers in sensitivity and tolerance of sequencing errors. We recapitulated the findings on whole-genome and exome data from a Utah residents with Northern and Western European ancestry (CEU) trio, showing that SGVar had high sensitivity both in the highly divergent human leukocyte antigen (HLA) region and in non-HLA regions of chromosome 6. In particular, SGVar is robust to sequencing error, k-mer selection, divergence level and coverage depth. Unlike mapping-based approaches, SGVar is capable of resolving long-range phase and identifying large INDELs from WES, more prominently from WGS. We conclude that SGVar represents an ideal platform for WES-based variant discovery in highly divergent regions and across the whole genome.


Subject(s)
Exome Sequencing/methods , Genetic Variation , Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Chromosomes, Human, Pair 6/genetics , Computational Biology/methods , Computer Simulation , Female , Genome, Human , HLA Antigens/genetics , Haplotypes , Humans , INDEL Mutation , Polymorphism, Single Nucleotide , Pregnancy , Exome Sequencing/statistics & numerical data , Whole Genome Sequencing/methods , Whole Genome Sequencing/statistics & numerical data
7.
Am J Hematol ; 95(8): 906-917, 2020 08.
Article in English | MEDLINE | ID: mdl-32279347

ABSTRACT

Next-generation sequencing identified about 60 genes recurrently mutated in chronic lymphocytic leukemia (CLL). We examined the additive prognostic value of the total number of recurrently mutated CLL genes (i.e., tumor mutational load [TML]) or the individually mutated genes beyond the CLL international prognostic index (CLL-IPI) in newly diagnosed CLL and high-count monoclonal B-cell lymphocytosis (HC MBL). We sequenced 59 genes among 557 individuals (112 HC MBL/445 CLL) in a multi-stage design, to estimate hazard ratios (HR) and 95% confidence intervals (CI) for time-to-first treatment (TTT), adjusted for CLL-IPI and sex. TML was associated with shorter TTT in the discovery and validation cohorts, with a combined estimate of continuous HR = 1.27 (CI: 1.17-1.39, P = 2.6 × 10⁻⁸; c-statistic = 0.76). When stratified by CLL-IPI, the association of TML with TTT was stronger and validated within low/intermediate risk (combined HR = 1.54, CI: 1.37-1.72, P = 7.0 × 10⁻¹⁴). Overall, 80% of low/intermediate CLL-IPI cases with two or more mutated genes progressed to require therapy within 5 years, compared to 24% among those without mutations. TML was also associated with shorter TTT in the HC MBL cohort (HR = 1.53, CI: 1.12-2.07, P = .007; c-statistic = 0.71). TML is a strong prognostic factor for TTT independent of CLL-IPI, especially among low/intermediate CLL-IPI risk, and a better predictor than any single gene. Mutational screening at early stages may improve risk stratification and better predict TTT.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Lymphocytosis/metabolism , Adult , Aged , Aged, 80 and over , Female , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Male , Middle Aged , Prognosis
8.
Blood ; 129(26): 3419-3427, 2017 06 29.
Article in English | MEDLINE | ID: mdl-28424162

ABSTRACT

Chronic lymphocytic leukemia (CLL) patients who progress early on ibrutinib often develop Richter transformation (RT) with a short survival of about 4 months. Preclinical studies suggest that the programmed death 1 (PD-1) pathway is critical to inhibit immune surveillance in CLL. This phase 2 study was designed to test the efficacy and safety of pembrolizumab, a humanized PD-1-blocking antibody, at a dose of 200 mg every 3 weeks in relapsed and transformed CLL. Twenty-five patients including 16 relapsed CLL and 9 RT (all proven diffuse large cell lymphoma) patients were enrolled, and 60% received prior ibrutinib. Objective responses were observed in 4 out of 9 RT patients (44%) and in 0 out of 16 CLL patients (0%). All responses were observed in RT patients who had progression after prior therapy with ibrutinib. After a median follow-up time of 11 months, the median overall survival in the RT cohort was 10.7 months, but was not reached in RT patients who progressed after prior ibrutinib. Treatment-related grade 3 or above adverse events were reported in 15 (60%) patients and were manageable. Analyses of pretreatment tumor specimens from available patients revealed increased expression of PD-ligand 1 (PD-L1) and a trend of increased expression in PD-1 in the tumor microenvironment in patients who had confirmed responses. Overall, pembrolizumab exhibited selective efficacy in CLL patients with RT. The results of this study are the first to demonstrate the benefit of PD-1 blockade in CLL patients with RT, and could change the landscape of therapy for RT patients if further validated. This trial was registered at www.clinicaltrials.gov as #NCT02332980.


Subject(s)
Antibodies, Monoclonal, Humanized/administration & dosage , Leukemia, Lymphocytic, Chronic, B-Cell/drug therapy , Lymphoma, Large B-Cell, Diffuse/drug therapy , Adenine/analogs & derivatives , Aged , Aged, 80 and over , Cell Transformation, Neoplastic , Disease-Free Survival , Female , Gene Expression , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/mortality , Lymphoma, Large B-Cell, Diffuse/mortality , Male , Middle Aged , Piperidines , Programmed Cell Death 1 Receptor/genetics , Pyrazoles/administration & dosage , Pyrimidines/administration & dosage , Recurrence , Survival Analysis
9.
Acta Neuropathol ; 137(6): 879-899, 2019 06.
Article in English | MEDLINE | ID: mdl-30739198

ABSTRACT

Frontotemporal lobar degeneration with neuronal inclusions of the TAR DNA-binding protein 43 (FTLD-TDP) represents the most common pathological subtype of FTLD. We established the international FTLD-TDP whole-genome sequencing consortium to thoroughly characterize the known genetic causes of FTLD-TDP and identify novel genetic risk factors. Through the study of 1131 unrelated Caucasian patients, we estimated that C9orf72 repeat expansions and GRN loss-of-function mutations account for 25.5% and 13.9% of FTLD-TDP patients, respectively. Mutations in TBK1 (1.5%) and other known FTLD genes (1.4%) were rare, and the disease in 57.7% of FTLD-TDP patients was unexplained by the known FTLD genes. To unravel the contribution of common genetic factors to the FTLD-TDP etiology in these patients, we conducted a two-stage association study comprising the analysis of whole-genome sequencing data from 517 FTLD-TDP patients and 838 controls, followed by targeted genotyping of the most associated genomic loci in 119 additional FTLD-TDP patients and 1653 controls. We identified three genome-wide significant FTLD-TDP risk loci: one new locus at chromosome 7q36 within the DPP6 gene led by rs118113626 (p value = 4.82e-08, OR = 2.12), and two known loci: UNC13A, led by rs1297319 (p value = 1.27e-08, OR = 1.50) and HLA-DQA2 led by rs17219281 (p value = 3.22e-08, OR = 1.98). While HLA represents a locus previously implicated in clinical FTLD and related neurodegenerative disorders, the association signal in our study is independent from previously reported associations. Through inspection of our whole-genome sequence data for genes with an excess of rare loss-of-function variants in FTLD-TDP patients (n ≥ 3) as compared to controls (n = 0), we further discovered a possible role for genes functioning within the TBK1-related immune pathway (e.g., DHX58, TRIM21, IRF7) in the genetic etiology of FTLD-TDP. Together, our study based on the largest cohort of unrelated FTLD-TDP patients assembled to date provides a comprehensive view of the genetic landscape of FTLD-TDP, nominates novel FTLD-TDP risk loci, and strongly implicates the immune pathway in FTLD-TDP pathogenesis.


Subject(s)
Nerve Tissue Proteins/genetics , TDP-43 Proteinopathies/genetics , Aged , DNA Repeat Expansion , Dipeptidyl-Peptidases and Tripeptidyl-Peptidases/genetics , Female , Frontal Lobe/metabolism , Frontotemporal Lobar Degeneration/genetics , Frontotemporal Lobar Degeneration/immunology , Genetic Predisposition to Disease , Genome-Wide Association Study , HLA-DQ Antigens/genetics , Humans , Intracellular Signaling Peptides and Proteins , Loss of Function Mutation , Male , Middle Aged , Nerve Tissue Proteins/physiology , Potassium Channels/genetics , Progranulins/genetics , Progranulins/physiology , Protein Serine-Threonine Kinases/genetics , Protein Serine-Threonine Kinases/physiology , Proteins/genetics , Proteins/physiology , RNA, Messenger/biosynthesis , Risk Factors , Sequence Analysis, RNA , Societies, Scientific , TDP-43 Proteinopathies/immunology , White People/genetics
11.
BMC Bioinformatics ; 19(1): 139, 2018 04 16.
Article in English | MEDLINE | ID: mdl-29661148

ABSTRACT

BACKGROUND: After decades of identifying risk factors using array-based genome-wide association studies (GWAS), genetic research of complex diseases has shifted to sequencing-based rare variants discovery. This requires large sample sizes for statistical power and has brought up questions about whether the current variant calling practices are adequate for large cohorts. It is well-known that there are discrepancies between variants called by different pipelines, and that using a single pipeline always misses true variants exclusively identifiable by other pipelines. Nonetheless, it is common practice today to call variants by one pipeline due to computational cost and assume that false negative calls are a small percent of total. RESULTS: We analyzed 10,000 exomes from the Alzheimer's Disease Sequencing Project (ADSP) using multiple analytic pipelines consisting of different read aligners and variant calling strategies. We compared variants identified by using two aligners in 50, 100, 200, 500, 1000, and 1952 samples; and compared variants identified by adding single-sample genotyping to the default multi-sample joint genotyping in 50, 100, 500, 2000, 5000 and 10,000 samples. We found that using a single pipeline missed increasing numbers of high-quality variants correlated with sample sizes. By combining two read aligners and two variant calling strategies, we rescued 30% of pass-QC variants at sample size of 2000, and 56% at 10,000 samples. The rescued variants had higher proportions of low frequency (minor allele frequency [MAF] 1-5%) and rare (MAF < 1%) variants, which are the very type of variants of interest. In 660 Alzheimer's disease cases with earlier onset ages of ≤65, 4 out of 13 (31%) previously-published rare pathogenic and protective mutations in APP, PSEN1, and PSEN2 genes were undetected by the default one-pipeline approach but recovered by the multi-pipeline approach. CONCLUSIONS: Identification of the complete variant set from sequencing data is the prerequisite of genetic association analyses. The current analytic practice of calling genetic variants from sequencing data using a single bioinformatics pipeline is no longer adequate with the increasingly large projects. The number and percentage of quality variants that passed quality filters but are missed by the one-pipeline approach rapidly increased with sample size.


Subject(s)
Computational Biology/methods , Genetic Variation , Alzheimer Disease/genetics , Base Composition/genetics , Drug Discovery , Genome , Genotype , Genotyping Techniques , Humans , Sample Size , Sequence Alignment
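The multi-pipeline idea in the abstract above can be illustrated as a set union. The following is a minimal sketch only, not the authors' workflow: the function name `rescued_fraction`, the variant key `(chrom, pos, ref, alt)`, and the toy call sets are all hypothetical, and the choice of the combined call set as the denominator is an assumption, since the abstract does not define how its percentages were computed.

```python
# Hypothetical illustration: variants are keyed by (chrom, pos, ref, alt).
def rescued_fraction(default_calls, extra_calls):
    """Fraction of the combined call set found only by the extra pipelines.

    Assumption: the denominator is the union of both call sets.
    """
    combined = default_calls | extra_calls   # all variants from any pipeline
    rescued = combined - default_calls       # variants the default pipeline missed
    return len(rescued) / len(combined) if combined else 0.0


# Toy call sets (made-up data, for demonstration only).
default_calls = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
}
extra_calls = {
    ("chr1", 200, "C", "T"),   # also found by the default pipeline
    ("chr2", 75, "T", "C"),    # found only by the extra pipeline
}

print(rescued_fraction(default_calls, extra_calls))
```

Keying variants by position and alleles lets plain set operations stand in for the comparison; a real comparison would also need variant normalization and quality filtering before the sets are built.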
13.
BMC Bioinformatics ; 17(1): 403, 2016 Oct 03.
Article in English | MEDLINE | ID: mdl-27716037

ABSTRACT

BACKGROUND: GATK Best Practices workflows are widely used in large-scale sequencing projects and recommend post-alignment processing before variant calling. Two key post-processing steps include the computationally intensive local realignment around known INDELs and base quality score recalibration (BQSR). Both have been shown to reduce erroneous calls; however, the findings are mainly supported by the analytical pipeline that incorporates BWA and GATK UnifiedGenotyper. It is not known whether there is any benefit of post-processing and to what extent the benefit might be for pipelines implementing other methods, especially given that both mappers and callers are typically updated. Moreover, because sequencing platforms are upgraded regularly and the new platforms provide better estimations of read quality scores, the need for post-processing is also unknown. Finally, some regions in the human genome show high sequence divergence from the reference genome; it is unclear whether there is benefit from post-processing in these regions. RESULTS: We used both simulated and NA12878 exome data to comprehensively assess the impact of post-processing for five or six popular mappers together with five callers. Focusing on chromosome 6p21.3, which is a region of high sequence divergence harboring the human leukocyte antigen (HLA) system, we found that local realignment had little or no impact on SNP calling, but increased sensitivity was observed in INDEL calling for the Stampy + GATK UnifiedGenotyper pipeline. No or only a modest effect of local realignment was detected on the three haplotype-based callers and no evidence of effect on Novoalign. BQSR had virtually negligible effect on INDEL calling and generally reduced sensitivity for SNP calling that depended on caller, coverage and level of divergence. Specifically, for SAMtools and FreeBayes calling in the regions with low divergence, BQSR reduced the SNP calling sensitivity but improved the precision when the coverage is insufficient. However, in regions of high divergence (e.g., the HLA region), BQSR reduced the sensitivity of both callers with little gain in precision rate. For the other three callers, BQSR reduced the sensitivity without increasing the precision rate regardless of coverage and divergence level. CONCLUSIONS: We demonstrated that the gain from post-processing is not universal; rather, it depends on mapper and caller combination, and the benefit is influenced further by sequencing depth and divergence level. Our analysis highlights the importance of considering these key factors in deciding to apply the computationally intensive post-processing to Illumina exome data.


Subject(s)
Computational Biology/methods , Computational Biology/standards , Exome/genetics , Sequence Alignment/methods , Software , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Humans , Mutation/genetics , Polymorphism, Single Nucleotide/genetics , Workflow
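The abstract above reports results in terms of sensitivity and precision. As a rough sketch under stated assumptions, these two metrics can be computed from a call set and a truth set as follows; the function name `sensitivity_precision`, the variant keys, and the toy data are hypothetical and are not taken from the study, which may have used different matching rules.

```python
# Hypothetical illustration: variants are keyed by (chrom, pos, ref, alt).
def sensitivity_precision(called, truth):
    """Return (sensitivity, precision) of a call set against a truth set."""
    true_positives = len(called & truth)  # called variants that are in the truth set
    sensitivity = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(called) if called else 0.0
    return sensitivity, precision


# Toy data (made up): four true variants, three calls, two of them correct.
truth = {
    ("chr6", 10, "A", "G"),
    ("chr6", 20, "C", "T"),
    ("chr6", 30, "G", "A"),
    ("chr6", 40, "T", "C"),
}
called = {
    ("chr6", 10, "A", "G"),
    ("chr6", 20, "C", "T"),
    ("chr6", 99, "A", "T"),   # false positive
}

sens, prec = sensitivity_precision(called, truth)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

A processing step that lowers sensitivity without raising precision, as the abstract describes for some callers, would show up here as a smaller first value with the second unchanged.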
14.
BMC Genomics ; 17: 703, 2016 09 02.
Article in English | MEDLINE | ID: mdl-27590916

ABSTRACT

BACKGROUND: Current variant discovery methods often start with the mapping of short reads to a reference genome; yet, their performance deteriorates in genomic regions where the reads are highly divergent from the reference sequence. This is particularly problematic for the human leukocyte antigen (HLA) region on chromosome 6p21.3. This region is associated with over 100 diseases, but variant calling is hindered by the extreme divergence across different haplotypes. RESULTS: We simulated reads from chromosome 6 exonic regions over a wide range of sequence divergence and coverage depth. We systematically assessed combinations between five mappers and five callers for their performance on simulated data and exome-seq data from NA12878, a well-studied individual in which multiple public call sets have been generated. Among those combinations, the number of known SNPs differed by about 5% in the non-HLA regions of chromosome 6 but over 20% in the HLA region. Notably, GSNAP mapping combined with GATK UnifiedGenotyper calling identified about 20% more known SNPs than most existing methods without a noticeable loss of specificity, with 100% sensitivity in three highly polymorphic HLA genes examined. Much larger differences were observed among these combinations in INDEL calling from both non-HLA and HLA regions. We obtained similar results with our internal exome-seq data from a cohort of chronic lymphocytic leukemia patients. CONCLUSIONS: We have established a workflow enabling variant detection, with high sensitivity and specificity, over the full spectrum of divergence seen in the human genome. Comparing to public call sets from NA12878 has highlighted the overall superiority of GATK UnifiedGenotyper, followed by GATK HaplotypeCaller and SAMtools, in SNP calling, and of GATK HaplotypeCaller and Platypus in INDEL calling, particularly in regions of high sequence divergence such as the HLA region. GSNAP and Novoalign are the ideal mappers in combination with the above callers. We expect that the proposed workflow should be applicable to variant discovery in other highly divergent regions.


Subject(s)
Genetic Variation , Genome, Human , Genomics/methods , Workflow , Algorithms , Chromosome Mapping , Computational Biology/methods , Computer Simulation , Exome , Genomics/standards , HLA Antigens/genetics , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Polymorphism, Single Nucleotide , Reproducibility of Results
15.
Am J Epidemiol ; 183(2): 96-109, 2016 Jan 15.
Article in English | MEDLINE | ID: mdl-26721890

ABSTRACT

Epigenetic information encoded in covalent modifications of DNA and histone proteins regulates fundamental biological processes through the action of chromatin regulators, transcription factors, and noncoding RNA species. Epigenetic plasticity enables an organism to respond to developmental and environmental signals without genetic changes. However, aberrant epigenetic control plays a key role in pathogenesis of disease. Normal epigenetic states could be disrupted by detrimental mutations and expression alteration of chromatin regulators or by environmental factors. In this primer, we briefly review the epigenetic basis of human disease and discuss how recent discoveries in this field could be translated into clinical diagnosis, prevention, and treatment. We introduce platforms for mapping genome-wide chromatin accessibility, nucleosome occupancy, DNA-binding proteins, and DNA methylation, primarily focusing on the integration of DNA methylation and chromatin immunoprecipitation-sequencing technologies into disease association studies. We highlight practical considerations in applying high-throughput epigenetic assays and formulating analytical strategies. Finally, we summarize current challenges in sample acquisition, experimental procedures, data analysis, and interpretation and make recommendations on further refinement in these areas. Incorporating epigenomic testing into the clinical research arsenal will greatly facilitate our understanding of the epigenetic basis of disease and help identify novel therapeutic targets.


Subject(s)
Epigenomics/methods , Genome-Wide Association Study/methods , Chromatin Assembly and Disassembly , DNA Methylation , DNA-Binding Proteins , Humans , Nucleosomes , Transcription Factors
16.
Bioinformatics ; 31(16): 2614-22, 2015 Aug 15.
Article in English | MEDLINE | ID: mdl-25847007

ABSTRACT

MOTIVATION: With improvements in next-generation sequencing technologies and reductions in price, ordered RNA-seq experiments are becoming common. Of primary interest in these experiments is identifying genes that are changing over time or space, for example, and then characterizing the specific expression changes. A number of robust statistical methods are available to identify genes showing differential expression among multiple conditions, but most assume conditions are exchangeable and thereby sacrifice power and precision when applied to ordered data. RESULTS: We propose an empirical Bayes mixture modeling approach called EBSeq-HMM. In EBSeq-HMM, an auto-regressive hidden Markov model is implemented to accommodate dependence in gene expression across ordered conditions. As demonstrated in simulation and case studies, the output proves useful in identifying differentially expressed genes and in specifying gene-specific expression paths. EBSeq-HMM may also be used for inference regarding isoform expression. AVAILABILITY AND IMPLEMENTATION: An R package containing examples and sample datasets is available at Bioconductor. CONTACT: kendzior@biostat.wisc.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Bayes Theorem , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Software , Gene Expression Regulation , Humans
17.
Nat Methods ; 8(10): 821-7, 2011 Sep 11.
Article in English | MEDLINE | ID: mdl-21983960

ABSTRACT

Combining high-mass-accuracy mass spectrometry, isobaric tagging and software for multiplexed, large-scale protein quantification, we report deep proteomic coverage of four human embryonic stem cell and four induced pluripotent stem cell lines in biological triplicate. This 24-sample comparison resulted in a very large set of identified proteins and phosphorylation sites in pluripotent cells. The statistical analysis afforded by our approach revealed subtle but reproducible differences in protein expression and protein phosphorylation between embryonic stem cells and induced pluripotent cells. Merging these results with RNA-seq analysis data, we found functionally related differences across each tier of regulation. We also introduce the Stem Cell-Omics Repository (SCOR), a resource to collate and display quantitative information across multiple planes of measurement, including mRNA, protein and post-translational modifications.


Subject(s)
Embryonic Stem Cells/metabolism , Induced Pluripotent Stem Cells/metabolism , Proteome/analysis , Proteomics , Humans , Proteome/metabolism
18.
PLoS Comput Biol ; 9(3): e1002936, 2013.
Article in English | MEDLINE | ID: mdl-23505351

ABSTRACT

The salamander has the remarkable ability to regenerate its limb after amputation. Cells at the site of amputation form a blastema and then proliferate and differentiate to regrow the limb. To better understand this process, we performed deep RNA sequencing of the blastema over a time course in the axolotl, a species whose genome has not been sequenced. Using a novel comparative approach to analyzing RNA-seq data, we characterized the transcriptional dynamics of the regenerating axolotl limb with respect to the human gene set. This approach involved de novo assembly of axolotl transcripts, RNA-seq transcript quantification without a reference genome, and transformation of abundances from axolotl contigs to human genes. We found a prominent burst in oncogene expression during the first day and blastemal/limb bud genes peaking at 7 to 14 days. In addition, we found that limb patterning genes, SALL genes, and genes involved in angiogenesis, wound healing, defense/immunity, and bone development are enriched during blastema formation and development. Finally, we identified a category of genes with no prior literature support for limb regeneration that are candidates for further evaluation based on their expression pattern during the regenerative process.


Subject(s)
Ambystoma mexicanum/physiology , Gene Expression Profiling/methods , Gene Expression Regulation , Oncogenes , Sequence Analysis, RNA/methods , Ambystoma mexicanum/genetics , Amputation, Surgical , Animals , Cluster Analysis , Extremities/injuries , Extremities/physiology , Regeneration/genetics , Regeneration/physiology , Up-Regulation , Wound Healing/genetics , Wound Healing/physiology
19.
Nat Commun ; 15(1): 5294, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38906885

ABSTRACT

Determining the balance between DNA double strand break repair (DSBR) pathways is essential for understanding treatment response in cancer. We report a method for simultaneously measuring non-homologous end joining (NHEJ), homologous recombination (HR), and microhomology-mediated end joining (MMEJ). Using this method, we show that patient-derived glioblastoma (GBM) samples with acquired temozolomide (TMZ) resistance display elevated HR and MMEJ activity, suggesting that these pathways contribute to treatment resistance. We screen clinically relevant small molecules for DSBR inhibition with the aim of identifying improved GBM combination therapy regimens. We identify the ATM kinase inhibitor, AZD1390, as a potent dual HR/MMEJ inhibitor that suppresses radiation-induced phosphorylation of DSBR proteins, blocks DSB end resection, and enhances the cytotoxic effects of TMZ in treatment-naïve and treatment-resistant GBMs with TP53 mutation. We further show that a combination of G2/M checkpoint deficiency and reliance upon ATM-dependent DSBR renders TP53 mutant GBMs hypersensitive to TMZ/AZD1390 and radiation/AZD1390 combinations. This report identifies ATM-dependent HR and MMEJ as targetable resistance mechanisms in TP53-mutant GBM and establishes an approach for simultaneously measuring multiple DSBR pathways in treatment selection and oncology research.


Subject(s)
Ataxia Telangiectasia Mutated Proteins , DNA Breaks, Double-Stranded , Glioblastoma , Temozolomide , Tumor Suppressor Protein p53 , Humans , Ataxia Telangiectasia Mutated Proteins/metabolism , Ataxia Telangiectasia Mutated Proteins/antagonists & inhibitors , Ataxia Telangiectasia Mutated Proteins/genetics , Glioblastoma/genetics , Glioblastoma/drug therapy , Glioblastoma/metabolism , Glioblastoma/pathology , Tumor Suppressor Protein p53/metabolism , Tumor Suppressor Protein p53/genetics , DNA Breaks, Double-Stranded/drug effects , Temozolomide/pharmacology , Cell Line, Tumor , Mutation , Drug Resistance, Neoplasm/genetics , Drug Resistance, Neoplasm/drug effects , DNA Repair/drug effects , Brain Neoplasms/genetics , Brain Neoplasms/drug therapy , Brain Neoplasms/pathology , Brain Neoplasms/metabolism , Animals , DNA End-Joining Repair/drug effects , Mice , Phosphorylation/drug effects
20.
Leukemia ; 2024 Oct 14.
Article in English | MEDLINE | ID: mdl-39402215

ABSTRACT

Multiple myeloma (MM) is a plasma cell (PC) malignancy characterized by cytogenetic abnormalities, such as t(11;14)(q13;q32), resulting in CCND1 overexpression. The rs9344 G allele within CCND1 is the most significant susceptibility allele for t(11;14). Sequencing data from 2 independent cohorts, CoMMpass (n = 698) and Mayo Clinic (n = 661), confirm the positive association between the G allele and t(11;14). In 80% of individuals heterozygous for rs9344 with t(11;14), the t(11;14) event occurs on the G allele, demonstrating a biological preference for the G allele in t(11;14). Within t(11;14), the G allele is associated with higher CCND1 expression and elevated H3K27ac and H3K4me3. CRISPR/Cas9-mediated A to G conversion resulted in increased H3K27ac over CCND1 and elevated CCND1 expression. ENCODE ChIP-seq data supported a PAX5 binding site within the enhancer region covering rs9344, showing preferential binding to the G allele. Overexpression of PAX5 resulted in increased CCND1 expression. These results support the importance of the rs9344 G enhancer in increasing CCND1 expression in MM.
