Results 1 - 20 of 56
1.
Mol Cell ; 78(4): 653-669.e8, 2020 05 21.
Article in English | MEDLINE | ID: mdl-32315601

ABSTRACT

Epstein-Barr virus (EBV) is associated with multiple human malignancies. To evade immune detection, EBV switches between latent and lytic programs. How viral latency is maintained in tumors or in memory B cells, the reservoir for lifelong EBV infection, remains incompletely understood. To gain insights, we performed a human genome-wide CRISPR/Cas9 screen in Burkitt lymphoma B cells. Our analyses identified a network of host factors that repress lytic reactivation, centered on the transcription factor MYC, including cohesins, FACT, STAGA, and Mediator. Depletion of MYC or factors important for MYC expression reactivated the lytic cycle, including in Burkitt xenografts. MYC bound the EBV genome origin of lytic replication and suppressed its looping to the lytic cycle initiator BZLF1 promoter. Notably, MYC abundance decreases with plasma cell differentiation, a key lytic reactivation trigger. Our results suggest that EBV senses MYC abundance as a readout of B cell state and highlight Burkitt latency reversal therapeutic targets.


Subjects
Burkitt Lymphoma/pathology , Epstein-Barr Virus Infections/virology , Herpesvirus 4, Human/physiology , Host-Pathogen Interactions , Proto-Oncogene Proteins c-myc/metabolism , Virus Activation , Virus Latency , Animals , B-Lymphocytes/metabolism , B-Lymphocytes/pathology , B-Lymphocytes/virology , Burkitt Lymphoma/metabolism , Burkitt Lymphoma/virology , Cell Proliferation , Epstein-Barr Virus Infections/genetics , Epstein-Barr Virus Infections/metabolism , Female , Gene Expression Regulation, Viral , Humans , Mice , Mice, Inbred NOD , Mice, SCID , Promoter Regions, Genetic , Proto-Oncogene Proteins c-myc/genetics , Tumor Cells, Cultured , Xenograft Model Antitumor Assays
2.
PLoS Comput Biol ; 20(2): e1011873, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38335222

ABSTRACT

Super enhancers (SE), large genomic elements that activate transcription and drive cell identity, have been found with cancer-specific gene regulation in human cancers. Recent studies reported the importance of understanding the cooperation and function of SE internal components, i.e., the constituent enhancers (CE). However, there are no pan-cancer studies to identify cancer-specific SE signatures at the constituent level. Here, by revisiting pan-cancer SE activities with H3K27Ac ChIP-seq datasets, we report fingerprint SE signatures for 28 cancer types in the NCI-60 cell panel. We implement a mixture model to discriminate active CEs from inactive CEs by taking into consideration ChIP-seq variabilities between cancer samples and across CEs. We demonstrate that the model-based estimation of CE states provides improved functional interpretation of SE-associated regulation. We identify cancer-specific CEs by balancing their active prevalence with their capability of encoding cancer type identities. We further demonstrate that cancer-specific CEs have the strongest per-base enhancer activities in independent enhancer sequencing assays, suggesting their importance in understanding critical SE signatures. We summarize fingerprint SEs based on the cancer-specific statuses of their component CEs and build an easy-to-use R package to facilitate the query, exploration, and visualization of fingerprint SEs across cancers.


Subjects
Neoplasms , Super Enhancers , Humans , Epigenomics , Enhancer Elements, Genetic/genetics , Gene Expression Regulation , Neoplasms/genetics
3.
J Infect Dis ; 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38657098

ABSTRACT

BACKGROUND: Cancer-related deaths for people living with HIV (PWH) are increasing due to longer life expectancies and disparately poor cancer-related outcomes. We hypothesize that advanced biological aging contributes to cancer-related morbidity and mortality for PWH and cancer. We sought to determine the impact of clonal hematopoiesis (CH) on cancer disparities in PWH. METHODS: We conducted a retrospective study to compare the prevalence and clinical outcomes of CH in PWH and people without HIV (PWoH) and cancer. Included in the study were PWH and similar PWoH based on tumor site, age, tumor sequence, and cancer treatment status. Biological aging was also measured using epigenetic methylation clocks. RESULTS: In 136 patients with cancer, PWH had twice the prevalence of CH compared to similar PWoH (23% vs 11%, p=0.07). After adjusting for patient characteristics, PWH were four times more likely to have CH than PWoH (OR 4.1, 95% CI 1.3-13.9, p=0.02). The effect of CH on survival was most pronounced in PWH, who had a 5-year survival rate of 38% if they had CH (vs 59% if no CH), compared to PWoH who had a 5-year survival rate of 75% if they had CH (vs 83% if no CH). CONCLUSION: This study provides the first evidence that PWH may have a higher prevalence of CH than PWoH with the same cancers. CH may be an independent biological aging risk factor contributing to inferior survival for PWH and cancer.
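
As a rough arithmetic check on the prevalences quoted in this abstract (23% vs 11%), the unadjusted odds ratio can be computed directly from the two rounded percentages. The sketch below is only an illustration: the function names are hypothetical, and the result (about 2.4) is a crude figure that differs from the adjusted OR of 4.1 reported above, which accounts for patient characteristics.

```python
def odds(p):
    """Convert a probability (0 < p < 1) into odds."""
    return p / (1.0 - p)


def odds_ratio(p_group_a, p_group_b):
    """Unadjusted odds ratio of group A relative to group B."""
    return odds(p_group_a) / odds(p_group_b)


# Rounded prevalences of CH taken from the abstract: PWH 23%, PWoH 11%.
crude_or = odds_ratio(0.23, 0.11)
print(f"crude (unadjusted) odds ratio: {crude_or:.2f}")
```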

4.
Nucleic Acids Res ; 50(6): 3115-3127, 2022 04 08.
Article in English | MEDLINE | ID: mdl-35234924

ABSTRACT

Super enhancers (SEs) are broad enhancer domains usually containing multiple constituent enhancers that hold elevated activities in gene regulation. Disruption in one or more constituent enhancers causes aberrant SE activities that lead to gene dysregulation in diseases. To quantify SE aberrations, differential analysis is performed to compare SE activities between cell conditions. The state-of-the-art strategy for estimating differential SEs relies on overall activities and neglects the changes in length and structure of SEs. Here, we propose a novel computational method to identify differential SEs by weighting the combinatorial effects of constituent-enhancer activities and locations (i.e. internal dynamics). In addition to overall activity changes, our method identified four novel classes of differential SEs with distinct enhancer structural alterations. We demonstrate that these structural alterations hold distinct regulatory impact, such as regulating different numbers of genes and modulating gene expression with different strengths, highlighting the differentiated regulatory roles of these unexplored SE features. When compared to the existing method, our method showed improved identification of differential SEs that were linked to better discernment of cell-type-specific SE activity and functional interpretation.


Subjects
Enhancer Elements, Genetic , Gene Expression Regulation , Cell Differentiation
5.
BMC Bioinformatics ; 24(1): 266, 2023 Jun 28.
Article in English | MEDLINE | ID: mdl-37380943

ABSTRACT

Pathway-level survival analysis offers the opportunity to examine molecular pathways and immune signatures that influence patient outcomes. However, available survival analysis algorithms are limited in pathway-level function and lack a streamlined analytical process. Here we present a comprehensive pathway-level survival analysis suite, PATH-SURVEYOR, which includes a Shiny user interface with extensive features for systematic exploration of pathways and covariates in a Cox proportional-hazard model. Moreover, our framework offers an integrative strategy for performing Hazard Ratio ranked Gene Set Enrichment Analysis and pathway clustering. As an example, we applied our tool in a combined cohort of melanoma patients treated with checkpoint inhibition (ICI) and identified several immune populations and biomarkers predictive of ICI efficacy. We also analyzed gene expression data of pediatric acute myeloid leukemia (AML) and performed an inverse association of drug targets with the patient's clinical endpoint. Our analysis derived several drug targets in high-risk KMT2A-fusion-positive patients, which were then validated in AML cell lines in the Genomics of Drug Sensitivity database. Altogether, the tool offers a comprehensive suite for pathway-level survival analysis and a user interface for exploring drug targets, molecular features, and immune populations at different resolutions.


Subjects
Leukemia, Myeloid, Acute , Melanoma , Child , Humans , Drug Repositioning , Medical Oncology , Melanoma/drug therapy , Melanoma/genetics , Algorithms , Leukemia, Myeloid, Acute/drug therapy , Leukemia, Myeloid, Acute/genetics
6.
Blood ; 138(22): 2216-2230, 2021 12 02.
Article in English | MEDLINE | ID: mdl-34232987

ABSTRACT

Epstein-Barr virus (EBV) causes endemic Burkitt lymphoma, the leading childhood cancer in sub-Saharan Africa. Burkitt cells retain aspects of germinal center B-cell physiology with MYC-driven B-cell hyperproliferation; however, little is presently known about their iron metabolism. CRISPR/Cas9 analysis highlighted the little-studied ferrireductase CYB561A3 as critical for Burkitt proliferation but not for that of the closely related EBV-transformed lymphoblastoid cells or nearly all other Cancer Dependency Map cell lines. Burkitt CYB561A3 knockout induced profound iron starvation, despite ferritinophagy and plasma membrane transferrin upregulation. Elevated concentrations of ascorbic acid, a key CYB561 family electron donor, or the labile iron source ferrous citrate rescued Burkitt CYB561A3 deficiency. CYB561A3 knockout caused catastrophic lysosomal and mitochondrial damage and impaired mitochondrial respiration. Conversely, lymphoblastoid B cells with the transforming EBV latency III program were instead dependent on the STEAP3 ferrireductase. These results highlight CYB561A3 as an attractive therapeutic Burkitt lymphoma target.


Subjects
Burkitt Lymphoma/pathology , Cytochromes b/genetics , Gene Expression Regulation, Neoplastic , Lysosomes/pathology , B-Lymphocytes/metabolism , B-Lymphocytes/pathology , Burkitt Lymphoma/genetics , CRISPR-Cas Systems , Cell Line, Tumor , Cell Proliferation , Epstein-Barr Virus Infections/complications , FMN Reductase/genetics , HEK293 Cells , Herpesvirus 4, Human/isolation & purification , Humans , Lysosomes/genetics , Mitochondria/genetics , Mitochondria/pathology
7.
Breast Cancer Res ; 24(1): 11, 2022 02 08.
Article in English | MEDLINE | ID: mdl-35135604

ABSTRACT

PURPOSE: Estrogen-receptor (ER) and progesterone-receptor (PR) expression levels in breast cancer, which have been principally compared via binomial descriptors, can vary widely across tumors. We sought to characterize ER and PR expression levels using semi-quantitative analyses of receptor staining in germline pathogenic variant (PV) carriers of cancer predisposition genes. METHODS: We conducted a retrospective chart review of patients who underwent germline genetic testing for cancer predisposition genes at a tertiary cancer center genetics clinic. We performed comparisons of semi-quantitative ER and PR percentage staining levels across carriers and non-carriers of cancer predisposition genes. RESULTS: Breast cancers from BRCA1 PV carriers expressed significantly lower ER (15.2% vs 78.2%, p < 0.001) and lower PR (6.8% vs 41.1%, p < 0.001) staining compared to non-PV carriers. Similarly, breast cancers of BRCA2 (66.7% vs 78.2%, p = 0.005) and TP53 (50.6% vs 78.2%, p = 0.015) PV tumors also displayed moderate decreases in ER staining. Conversely, CHEK2 tumors displayed higher ER (93.1% vs 78.2%, p = 0.005) and PR (72% vs 48.8%, p = 0.001) staining when compared to non-PV carriers. We observed a wide range of dispersion across the ER and PR staining levels of the carriers and noncarriers. ER and PR ranges of dispersion of CHEK2 tumors were uniquely narrower than all other groups. CONCLUSION: The findings of our study suggest that precise expression levels of ER and PR in breast cancers can vary widely. These differences are further augmented when comparing expression staining across PV and non-PV carriers, suggesting potentially unique tumorigenesis and progression pathways influenced by germline cancer predisposition genes.


Subjects
Breast Neoplasms , Breast Neoplasms/pathology , Checkpoint Kinase 2/genetics , Female , Genetic Predisposition to Disease , Germ Cells/metabolism , Germ-Line Mutation , Hormones , Humans , Mutation , Receptors, Progesterone/genetics , Receptors, Progesterone/metabolism , Retrospective Studies
8.
Bioinformatics ; 37(20): 3681-3683, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-33901274

ABSTRACT

SUMMARY: The heterogeneous cell types of the tumor-immune microenvironment (TIME) play key roles in determining cancer progression, metastasis and response to treatment. We report the development of TIMEx, a novel TIME deconvolution method that emphasizes the estimation of infiltrating immune cells for bulk transcriptomics using pan-cancer single-cell RNA-seq signatures. We also implemented a comprehensive, user-friendly web portal for users to evaluate TIMEx and other deconvolution methods with bulk transcriptomic profiles. AVAILABILITY AND IMPLEMENTATION: The TIMEx web portal is freely accessible at http://timex.moffitt.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

9.
J Biol Chem ; 294(25): 9734-9745, 2019 06 21.
Article in English | MEDLINE | ID: mdl-31073033

ABSTRACT

Early diagnosis of nasopharyngeal carcinoma (NPC) is difficult because of a lack of specific symptoms. Many patients have advanced disease at diagnosis, and these patients respond poorly to treatment. New treatments are therefore needed to improve the outcome of NPC. To better understand the molecular pathogenesis of NPC, here we used an NPC cell line in a genome-wide CRISPR-based knockout screen to identify the cellular factors and pathways essential for NPC (i.e. dependence factors). This screen identified the Moz, Ybf2/Sas3, Sas2, Tip60 histone acetyltransferase complex, NF-κB signaling, purine synthesis, and linear ubiquitination pathways, and the MDM2 proto-oncogene, as NPC dependence factors/pathways. Using gene knockout, complementary DNA rescue, and inhibitor assays, we found that perturbation of these pathways greatly reduces the growth of NPC cell lines but does not affect growth of SV40-immortalized normal nasopharyngeal epithelial cells. These results suggest that targeting these pathways/proteins may hold promise for achieving better treatment of patients with NPC.


Subjects
Biomarkers, Tumor/genetics , CRISPR-Cas Systems , Cell Proliferation , Gene Knockout Techniques/methods , Genome, Human , Nasopharyngeal Carcinoma/genetics , Nasopharyngeal Neoplasms/genetics , Biomarkers, Tumor/antagonists & inhibitors , Humans , Nasopharyngeal Carcinoma/pathology , Nasopharyngeal Neoplasms/pathology , Proto-Oncogene Mas , Signal Transduction , Tumor Cells, Cultured
10.
Genome Res ; 27(11): 1930-1938, 2017 11.
Article in English | MEDLINE | ID: mdl-29025895

ABSTRACT

The main application of ChIP-seq technology is the detection of genomic regions that bind to a protein of interest. A large part of functional genomics' public catalogs is based on ChIP-seq data. These catalogs rely on peak calling algorithms that infer protein-binding sites by detecting genomic regions associated with more mapped reads (coverage) than expected by chance, as a result of the experimental protocol's lack of perfect specificity. We find that GC-content bias accounts for substantial variability in the observed coverage for ChIP-seq experiments and that this variability leads to false-positive peak calls. More concerning is that the GC effect varies across experiments, with the effect strong enough to result in a substantial number of peaks called differently when different laboratories perform experiments on the same cell line. However, accounting for GC content bias in ChIP-seq is challenging because the binding sites of interest tend to be more common in high GC-content regions, which confounds real biological signals with unwanted variability. To account for this challenge, we introduce a statistical approach that accounts for GC effects on both nonspecific noise and signal induced by the binding site. The method can be used to account for this bias in binding quantification as well as to improve existing peak calling algorithms. We use this approach to show a reduction in false-positive peaks as well as improved consistency across laboratories.


Subjects
Base Composition , DNA/metabolism , Sequence Analysis, DNA/methods , Algorithms , Binding Sites , Chromatin Immunoprecipitation , DNA/chemistry , False Positive Reactions , Genomics , High-Throughput Nucleotide Sequencing
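
The GC-content covariate that this abstract refers to can be illustrated with a minimal sketch that computes the GC fraction of fixed-size genomic windows. This is not the authors' statistical method, only the quantity such a model would condition on; the function names and the non-overlapping window scheme are assumptions made for illustration.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def windowed_gc(genome, window):
    """GC fraction of consecutive, non-overlapping windows of the given size."""
    return [
        gc_fraction(genome[start:start + window])
        for start in range(0, len(genome), window)
    ]


# Toy sequence split into three windows of four bases each.
print(windowed_gc("GGCCATATGCAT", 4))
```

In a real analysis, per-window coverage would then be compared across GC bins to see whether read depth tracks GC content.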
11.
J Virol ; 93(13)2019 07 01.
Article in English | MEDLINE | ID: mdl-31019051

ABSTRACT

Epstein-Barr virus (EBV) infection of human primary resting B lymphocytes (RBLs) leads to the establishment of lymphoblastoid cell lines (LCLs) that can grow indefinitely in vitro. EBV transforms RBLs through the expression of viral latency genes, and these genes alter host transcription programs. To globally measure the transcriptome changes during EBV transformation, primary human RBLs were infected with B95.8 EBV for 0, 2, 4, 7, 14, 21, and 28 days, and poly(A) plus RNAs were analyzed by transcriptome sequencing (RNA-seq). Analyses of variance (ANOVAs) found 3,669 protein-coding genes that were differentially expressed (false-discovery rate [FDR] < 0.01). Ninety-four percent of LCL genes that are essential for LCL growth and survival were differentially expressed. Pathway analyses identified a significant enrichment of pathways involved in cell proliferation, DNA repair, metabolism, and antiviral responses. RNA-seq also identified long noncoding RNAs (lncRNAs) differentially expressed during EBV infection. Clustered regularly interspaced short palindromic repeat (CRISPR) interference (CRISPRi) and CRISPR activation (CRISPRa) found that CYTOR and NORAD lncRNAs were important for LCL growth. During EBV infection, type III EBV latency genes were expressed rapidly after infection. Immediately after LCL establishment, EBV lytic genes were also expressed in LCLs, and ∼4% of the LCLs express gp350. Chromatin immune precipitation followed by deep sequencing (ChIP-seq) and POLR2A chromatin interaction analysis followed by paired-end tag sequencing (ChIA-PET) data linked EBV enhancers to 90% of EBV-regulated genes. Many genes were linked to enhancers occupied by multiple EBNAs or NF-κB subunits. Incorporating these assays, we generated a comprehensive EBV regulome in LCLs.
IMPORTANCE: Epstein-Barr virus (EBV) immortalization of resting B lymphocytes (RBLs) is a useful model system to study EBV oncogenesis. By incorporating transcriptome sequencing (RNA-seq), chromatin immune precipitation followed by deep sequencing (ChIP-seq), chromatin interaction analysis followed by paired-end tag sequencing (ChIA-PET), and genome-wide clustered regularly interspaced short palindromic repeat (CRISPR) screens, we identified key pathways that EBV usurps to enable B cell growth and transformation. Multiple layers of regulation could be achieved by cooperation between multiple EBV transcription factors binding to the same enhancers. EBV manipulated the expression of most cell genes essential for lymphoblastoid cell line (LCL) growth and survival. In addition to proteins, long noncoding RNAs (lncRNAs) regulated by EBV also contributed to LCL growth and survival. The data presented in this paper not only allow us to further define the molecular pathogenesis of EBV but also serve as a useful resource to the EBV research community.


Subjects
B-Lymphocytes/virology , Epstein-Barr Virus Infections/genetics , Epstein-Barr Virus Infections/metabolism , Gene Expression Regulation, Viral , Herpesvirus 4, Human/genetics , Herpesvirus 4, Human/physiology , Sequence Analysis, RNA , Analysis of Variance , Cell Line , Chromatin/metabolism , Chromatin Immunoprecipitation , DNA-Directed RNA Polymerases , Epstein-Barr Virus Infections/virology , Epstein-Barr Virus Nuclear Antigens/genetics , Epstein-Barr Virus Nuclear Antigens/metabolism , Herpesvirus 4, Human/pathogenicity , High-Throughput Nucleotide Sequencing , Host-Pathogen Interactions/genetics , Host-Pathogen Interactions/physiology , Humans , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Transcription Factors/metabolism , Transcriptome , Virus Latency/genetics
12.
J Virol ; 93(16)2019 08 15.
Article in English | MEDLINE | ID: mdl-31167905

ABSTRACT

Super-enhancers (SEs) are clusters of enhancers marked by extraordinarily high and broad chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) signals for H3K27ac or other transcription factors (TFs). SEs play pivotal roles in development and oncogenesis. Epstein-Barr virus (EBV) super-enhancers (ESEs) are co-occupied by all essential EBV oncogenes and EBV-activated NF-κB subunits. Perturbation of ESEs stops lymphoblastoid cell line (LCL) growth. To further characterize ESEs and identify proteins critical for ESE function, MYC ESEs were cloned upstream of a green fluorescent protein (GFP) reporter. Reporters driven by MYC ESEs 525 kb and 428 kb upstream of MYC (525ESE and 428ESE) had very high activities in LCLs but not in EBV-negative BJAB cells. EBNA2 activated MYC ESE-driven luciferase reporters. CRISPRi targeting 525ESE significantly decreased MYC expression. Genome-wide CRISPR screens identified factors essential for ESE activity. TBP-associated factor (TAF) family proteins, including TAF8, TAF11, and TAF3, were essential for the activity of the integrated 525ESE-driven reporter in LCLs. TAF8 and TAF11 knockout significantly decreased 525ESE activity and MYC transcription. MEF2C was also identified to be essential for 525ESE activity. Depletion of MEF2C decreased 525ESE reporter activity, MYC expression, and LCL growth. MEF2C cDNA resistant to CRISPR cutting rescued MEF2C knockout and restored 525ESE reporter activity and MYC expression. MEF2C depletion decreased IRF4, EBNA2, and SPI1 binding to 525ESE in LCLs. MEF2C depletion also affected the expression of other ESE target genes, including the ETS1 and BCL2 genes. These data indicated that in addition to EBNA2, TAF family members and MEF2C are essential for ESE activity, MYC expression, and LCL growth.
IMPORTANCE: SEs play critical roles in cancer development. Since SEs assemble much bigger protein complexes on enhancers than typical enhancers (TEs), they are more sensitive than TEs to perturbations. Understanding the protein composition of SEs that are linked to key oncogenes may identify novel therapeutic targets. A genome-wide CRISPR screen specifically identified proteins essential for MYC ESE activity but not simian virus 40 (SV40) enhancer. These proteins not only were essential for the reporter activity but were also important for MYC expression and LCL growth. Targeting these proteins may lead to new therapies for EBV-associated cancers.


Subjects
Epstein-Barr Virus Infections/virology , Gene Expression Regulation, Viral , Herpesvirus 4, Human/physiology , TATA-Binding Protein Associated Factors/metabolism , CRISPR-Cas Systems , Cell Line, Tumor , Cell Survival/genetics , Enhancer Elements, Genetic , Gene Editing , Gene Expression , Gene Knockout Techniques , Genes, myc , Histones/metabolism , Host-Pathogen Interactions , Humans , MEF2 Transcription Factors/genetics , MEF2 Transcription Factors/metabolism
13.
Biostatistics ; 19(4): 562-578, 2018 10 01.
Article in English | MEDLINE | ID: mdl-29121214

ABSTRACT

Until recently, high-throughput gene expression technology, such as RNA-Sequencing (RNA-seq) required hundreds of thousands of cells to produce reliable measurements. Recent technical advances permit genome-wide gene expression measurement at the single-cell level. Single-cell RNA-Seq (scRNA-seq) is the most widely used and numerous publications are based on data produced with this technology. However, RNA-seq and scRNA-seq data are markedly different. In particular, unlike RNA-seq, the majority of reported expression levels in scRNA-seq are zeros, which could be either biologically-driven, genes not expressing RNA at the time of measurement, or technically-driven, genes expressing RNA, but not at a sufficient level to be detected by sequencing technology. Another difference is that the proportion of genes reporting the expression level to be zero varies substantially across single cells compared to RNA-seq samples. However, it remains unclear to what extent this cell-to-cell variation is being driven by technical rather than biological variation. Furthermore, while systematic errors, including batch effects, have been widely reported as a major challenge in high-throughput technologies, these issues have received minimal attention in published studies based on scRNA-seq technology. Here, we use an assessment experiment to examine data from published studies and demonstrate that systematic errors can explain a substantial percentage of observed cell-to-cell expression variability. Specifically, we present evidence that some of these reported zeros are driven by technical variation by demonstrating that scRNA-seq produces more zeros than expected and that this bias is greater for lower expressed genes. In addition, this missing data problem is exacerbated by the fact that this technical variation varies cell-to-cell. Then, we show how this technical cell-to-cell variability can be confused with novel biological results. Finally, we demonstrate and discuss how batch-effects and confounded experiments can intensify the problem.


Subjects
Gene Expression Profiling/standards , High-Throughput Nucleotide Sequencing/standards , Sequence Analysis, RNA/standards , Single-Cell Analysis/standards , Transcriptome , Animals , Humans
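
The per-cell proportion of zero counts discussed in this abstract is a simple summary statistic, and the following minimal sketch shows one way to compute it. The count matrix is made-up toy data and the function name is hypothetical; it does not reproduce the authors' analysis.

```python
def zero_fraction_per_cell(counts):
    """For each cell (one list of gene counts), return the fraction of genes with a zero count."""
    return [sum(c == 0 for c in cell) / len(cell) for cell in counts]


# Toy count matrix: two cells, four genes each (values are illustrative only).
toy_counts = [
    [0, 0, 5, 1],
    [3, 0, 2, 1],
]
print(zero_fraction_per_cell(toy_counts))
```

Comparing this fraction across cells and batches is one way to spot the cell-to-cell variation in detection that the abstract describes.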
14.
BMC Bioinformatics ; 19(Suppl 5): 112, 2018 04 11.
Article in English | MEDLINE | ID: mdl-29671389

ABSTRACT

BACKGROUND: Somatic copy number alterations (SCNAs) can be used to infer tumor subclonal populations in whole-genome sequencing studies, where the read count ratios between tumor-normal paired samples usually serve as the proxy for inference. Existing SCNA-based tools for inferring subclonal populations assume that the GC bias of the tumor and normal samples has the same features and can be fully offset by the read count ratio. However, we found that the read count ratio on SCNA segments presents a log-linear biased pattern, which influences the performance of existing read-count-ratio-based subclonal inference tools. Currently, no correction tools take the read ratio bias into account. RESULTS: We present Pre-SCNAClonal, a tool that improves tumor subclonal population inference by correcting GC bias at the SCNA level. Pre-SCNAClonal first corrects GC bias using a Markov chain Monte Carlo probability model, then accurately locates baseline DNA segments (not containing any SCNAs) with a hierarchical clustering model. We show Pre-SCNAClonal's superiority to existing GC-bias correction methods at any level of subclonal population. CONCLUSIONS: Pre-SCNAClonal can be run independently or serve as a pre-processing/GC-correction step in conjunction with existing SCNA-based subclonal inference tools.


Subjects
Base Composition/genetics , DNA Copy Number Variations/genetics , Models, Genetic , Neoplasms/genetics , Neoplasms/pathology , Whole Genome Sequencing , Bias , Cell Line, Tumor , Clone Cells , Heterozygote , Humans , Markov Chains , Monte Carlo Method , Polymorphism, Single Nucleotide/genetics
15.
Bioinformatics ; 32(11): 1625-31, 2016 06 01.
Article in English | MEDLINE | ID: mdl-26568628

ABSTRACT

MOTIVATION: Single Molecule Real-Time (SMRT) sequencing has been widely applied in cutting-edge genomic studies. However, aligning noisy long SMRT reads to a reference genome with state-of-the-art aligners is still expensive, which is becoming a bottleneck in applications of SMRT sequencing. Novel approaches are needed to improve the efficiency and effectiveness of SMRT read alignment. RESULTS: We propose the Regional Hashing-based Alignment Tool (rHAT), a seed-and-extension-based read alignment approach specifically designed for noisy long reads. rHAT indexes the reference genome with a regional hash table (RHT), a hash table-based index that describes the short tokens within local windows of the reference genome. In the seeding phase, rHAT uses the RHT to efficiently count the occurrences of short token matches between a partial read and local genomic windows in order to find highly probable candidate sites. In the extension phase, a sparse dynamic programming-based heuristic approach is used to reduce the cost of aligning the read to the candidate sites. By benchmarking on real and simulated datasets from various prokaryote and eukaryote genomes, we demonstrated that rHAT can effectively align SMRT reads with outstanding throughput. AVAILABILITY AND IMPLEMENTATION: rHAT is implemented in C++; the source code is available at https://github.com/HIT-Bioinformatics/rHAT. CONTACT: ydwang@hit.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software , Algorithms , Genomics , Sequence Alignment , Sequence Analysis, DNA
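
The seeding idea described in the rHAT abstract (index short tokens per local window of the reference, then count token matches between a read and each window) can be sketched roughly as follows. This is a toy illustration and not the rHAT implementation, which is written in C++: the function names, the non-overlapping window layout, and the parameter values are all assumptions chosen to keep the example small.

```python
from collections import defaultdict


def build_regional_index(reference, window=8, step=8, k=3):
    """Map each k-mer to the set of window indices in which it occurs."""
    index = defaultdict(set)
    starts = range(0, len(reference), step)
    for w, start in enumerate(starts):
        region = reference[start:start + window]
        for i in range(len(region) - k + 1):
            index[region[i:i + k]].add(w)
    return index, len(starts)


def rank_windows(read, index, n_windows, k=3):
    """Count k-mer hits per window; return windows ordered by hits, plus the raw counts."""
    hits = [0] * n_windows
    for i in range(len(read) - k + 1):
        for w in index.get(read[i:i + k], ()):
            hits[w] += 1
    ranked = sorted(range(n_windows), key=lambda w: (-hits[w], w))
    return ranked, hits


# Toy reference made of four homopolymer windows; the read should map to window 2.
toy_reference = "AAAAAAAACCCCCCCCGGGGGGGGTTTTTTTT"
toy_index, toy_n = build_regional_index(toy_reference)
ranked, hits = rank_windows("GGGGG", toy_index, toy_n)
print(ranked[0], hits)
```

The top-ranked windows would then be passed to a more expensive extension step; the abstract describes a sparse dynamic programming heuristic for that phase, which this sketch does not attempt.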
16.
Bioinformatics ; 31(14): 2262-8, 2015 Jul 15.
Article in English | MEDLINE | ID: mdl-25788626

ABSTRACT

MOTIVATION: Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to advances in high-throughput sequencing technologies, family genome sequencing is becoming more and more prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to visualize in traditional genome visualization frameworks. How to visualize the family genome variants and their functions with integrated pedigree information remains a critical challenge. RESULTS: We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes effectively at both the individual level and the variant level by integrating genome data with pedigree information. Family genome analysis, including determination of parental origin of the variants, detection of de novo mutations, identification of potential recombination events and identical-by-descent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibriums, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to find high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. AVAILABILITY AND IMPLEMENTATION: The FGB is available at http://mlg.hit.edu.cn/FGB/.


Subjects
Genome, Human , Pedigree , Software , Computer Graphics , Genetic Variation , Genomics , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Annotation
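
One of the family analyses named in this abstract, detection of de novo mutations, rests on a simple rule that can be sketched as below: a child allele that neither parent carries is a de novo candidate. This is a simplified illustration, not the FGB's algorithm; the function name and genotype representation are hypothetical, and practical callers would also need to handle sequencing error, coverage, and missing genotypes.

```python
def is_de_novo_candidate(child, mother, father):
    """Return True if the child carries an allele absent from both parents.

    Genotypes are tuples of alleles, e.g. ("A", "G").
    """
    parental_alleles = set(mother) | set(father)
    return any(allele not in parental_alleles for allele in child)


# Child carries T, which neither homozygous A/A parent has.
print(is_de_novo_candidate(("A", "T"), ("A", "A"), ("A", "A")))
# Child A/G is fully explained by one allele from each parent.
print(is_de_novo_candidate(("A", "G"), ("A", "A"), ("G", "G")))
```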
17.
Nucleic Acids Res ; 42(Web Server issue): W192-7, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24799434

ABSTRACT

Advances in high-throughput sequencing technologies have brought us into the individual genome era. Projects such as the 1000 Genomes Project have made individual genome sequencing increasingly popular. How to visualize, analyse and annotate individual genomes with knowledge bases to support genome studies and personalized healthcare is still a big challenge. The Personal Genome Browser (PGB) was developed to provide comprehensive functional annotation and visualization for individual genomes based on the genetic-molecular-phenotypic model. Investigators can easily view individual genetic variants, such as single nucleotide variants (SNVs), INDELs and structural variations (SVs), as well as genomic features and phenotypes associated with the individual genetic variants. The PGB especially highlights potential functional variants using the PGB built-in method or SIFT/PolyPhen2 scores. Moreover, the functional risks of genes can be evaluated by scanning individual genetic variants on the whole genome, a chromosome, or a cytoband based on functional implications of the variants. Investigators can then navigate to high-risk genes on the scanned individual genome. The PGB accepts Variant Call Format (VCF) and Genetic Variation Format (GVF) files as input. The functional annotation of input individual genome variants can be visualized in real time by well-defined symbols and shapes. The PGB is available at http://www.pgbrowser.org/.


Subjects
Genetic Variation, Human Genome, Software, Computer Graphics, Genomics, Humans, Internet
18.
bioRxiv ; 2024 May 09.
Article in English | MEDLINE | ID: mdl-38766209

ABSTRACT

Epstein-Barr virus (EBV) uses latency programs to colonize the memory B-cell reservoir, and each program is associated with human malignancies. However, knowledge remains incomplete of epigenetic mechanisms that maintain the highly restricted latency I program, present in memory and Burkitt lymphoma cells, in which EBNA1 is the only EBV-encoded protein expressed. Given increasing appreciation that higher order chromatin architecture is an important determinant of viral and host gene expression, we investigated roles of Wings Apart-Like Protein Homolog (WAPL), a host factor that unloads cohesins to control DNA loop size and that was discovered as an EBNA2-associated protein. WAPL knockout (KO) in Burkitt cells de-repressed LMP1 and LMP2A expression but not other EBV oncogenes to yield a viral program reminiscent of EBV latency II, which is rarely observed in B-cells. WAPL KO also increased LMP1/2A levels in latency III lymphoblastoid cells. WAPL KO altered EBV genome architecture, triggering formation of DNA loops between the LMP promoter region and the EBV origins of lytic replication (oriLyt). Hi-C analysis further demonstrated that WAPL KO reprograms EBV genomic DNA looping. LMP1 and LMP2A de-repression correlated with decreased histone repressive marks at their promoters. We propose that EBV coopts WAPL to negatively regulate latent membrane protein expression to maintain Burkitt latency I.

Author Summary: EBV is a highly prevalent herpesvirus etiologically linked to multiple lymphomas, gastric and nasopharyngeal carcinomas, and multiple sclerosis. EBV persists in the human host in B-cells that express a series of latency programs, each of which is observed in a distinct type of human lymphoma. The most restricted form of EBV latency, called latency I, is observed in memory cells and in most Burkitt lymphomas. In this state, EBNA1 is the only EBV-encoded protein expressed to facilitate infected cell immunoevasion. However, epigenetic mechanisms that repress expression of the other eight EBV-encoded latency proteins remain to be fully elucidated. We hypothesized that the host factor WAPL might have a role in restriction of EBV genes, as it is a major regulator of long-range DNA interactions by negatively regulating cohesin proteins that stabilize DNA loops, and WAPL was found in a yeast 2-hybrid screen for EBNA2-interacting host factors. Using CRISPR together with Hi-ChIP and Hi-C DNA architecture analyses, we uncovered WAPL roles in suppressing expression of LMP1 and LMP2A, which mimic signaling by CD40 and B-cell immunoglobulin receptors, respectively. These proteins are expressed together with EBNA1 in the latency II program. We demonstrate that WAPL KO changes EBV genomic architecture, including allowing the formation of DNA loops between the oriLyt enhancers and the LMP promoter regions. Collectively, our study suggests that WAPL reinforces Burkitt latency I by preventing the formation of DNA loops that may instead support the latency II program.

19.
Bioinformatics ; 28(14): 1879-86, 2012 Jul 15.
Article in English | MEDLINE | ID: mdl-22611130

ABSTRACT

MOTIVATION: One of the fundamental questions in genetic studies is how to identify the functional DNA variants responsible for a disease or phenotype of interest. Results from large-scale genetic studies, such as genome-wide association studies (GWAS), and the availability of high-throughput sequencing technologies provide opportunities to identify causal variants. Despite these technical advances, informatics methodologies still need to be developed to prioritize thousands of variants for potential causative effects. RESULTS: We present regSNPs, an informatics strategy that integrates several established bioinformatics tools to prioritize regulatory SNPs, i.e. SNPs in promoter regions that potentially affect phenotype by changing the transcription of downstream genes. Compared with existing tools, regSNPs has two distinct features: it accounts for the degeneracy of binding motifs by calculating the differences in binding affinity caused by the candidate variants, and it integrates the potential phenotypic effects of various transcription factors. When tested on the disease-causing variants documented in the Human Gene Mutation Database, regSNPs showed mixed performance across diseases. regSNPs predicted three SNPs that can potentially affect bone density in a region detected in an earlier linkage study. The potential effects of one of these variants were validated using a luciferase reporter assay.


Subjects
Computational Biology/methods, Single Nucleotide Polymorphism, Genetic Promoter Regions, Transcription Factors/genetics, Area Under Curve, Binding Sites, Genetic Databases, Genetic Linkage, Human Genome, Genome-Wide Association Study, HEK293 Cells, Humans, Phenotype, ROC Curve
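
The idea of scoring a variant by the change in binding affinity it causes can be sketched with a position weight matrix (PWM). This is an illustrative assumption about how such a difference might be computed, not the regSNPs implementation: the toy matrix, the uniform background, the forward-strand-only scan and all function names are hypothetical.

```python
import math
from typing import Dict, List

# A PWM here is a list of columns, each mapping a base to its probability.
Pwm = List[Dict[str, float]]


def log_odds(pwm: Pwm, site: str, background: float = 0.25) -> float:
    """Log2-odds score of one site against a uniform background (assumed)."""
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, site))


def best_overlapping_score(pwm: Pwm, seq: str, pos: int) -> float:
    """Best score among motif windows that cover position pos (forward strand only)."""
    width = len(pwm)
    first = max(0, pos - width + 1)
    last = min(pos, len(seq) - width)
    return max(log_odds(pwm, seq[i:i + width]) for i in range(first, last + 1))


def affinity_change(pwm: Pwm, ref_seq: str, pos: int, alt: str) -> float:
    """Score with the alternate allele minus score with the reference allele."""
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return best_overlapping_score(pwm, alt_seq, pos) - best_overlapping_score(pwm, ref_seq, pos)


# Toy 3-column motif that prefers the site "ACG".
toy_pwm: Pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]

# A C>T change at index 3 disrupts the middle base of the preferred site.
delta = affinity_change(toy_pwm, "TTACGTT", 3, "T")
print(f"change in log-odds score: {delta:.3f}")
```

A negative value indicates that the alternate allele weakens the best predicted site. Because the PWM assigns a graded score to every base at every position, mismatches at weakly conserved positions cost less than mismatches at strongly conserved ones, which is one way to account for motif degeneracy.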
20.
Methods Mol Biol ; 2629: 169-181, 2023.
Article in English | MEDLINE | ID: mdl-36929078

ABSTRACT

Chromatin immunoprecipitation sequencing (ChIP-seq) is widely used to identify protein binding sites along the genome. The sequencing protocol is flexible and mature enough to measure different types of protein binding, as long as the sequencing parameters are tailored to the features of the protein. Two distinct types of binding signal are point-source-like binding by transcription factors and diffuse, broadly distributed binding by histone modifications. Consequently, statistical approaches have been proposed to address ChIP-seq-related questions according to these different protein features. In this chapter, we briefly summarize the statistical principles, approaches, and tools widely used in modeling ChIP-seq data, from raw data quality control to final result reporting. We discuss the key solutions to eight routine questions in ChIP-seq applications. We also discuss approaches suited to the unique data features of different ChIP-seq types. We hope this chapter will serve as a brief guide, especially for ChIP-seq beginners, giving them a high-level overview for understanding and designing processing plans for their ChIP-seq experiments.


Subjects
Chromatin Immunoprecipitation Sequencing, Transcription Factors, Chromatin Immunoprecipitation/methods, Transcription Factors/genetics, Transcription Factors/metabolism, Genome, Protein Binding, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods