Results 1 - 20 of 141
1.
Bioinformatics ; 39(6), 2023 06 01.
Article in English | MEDLINE | ID: mdl-37243667

ABSTRACT

MOTIVATION: Single-cell sequencing enables exploring the pathways and processes of cells and cell populations. However, there is a paucity of pathway enrichment methods designed to tolerate the high noise and low gene coverage of this technology. When gene expression data are noisy and signals are sparse, testing pathway enrichment based on gene expression may not yield statistically significant results, which is particularly problematic when detecting the pathways enriched in less abundant cells that are vulnerable to disturbances. RESULTS: In this project, we developed a Weighted Concept Signature Enrichment Analysis specialized for pathway enrichment analysis from single-cell transcriptomics (scRNA-seq). Weighted Concept Signature Enrichment Analysis takes a broader approach to assessing the functional relations of pathway gene sets to differentially expressed genes, and leverages the cumulative signature of molecular concepts characteristic of the highly differentially expressed genes, which we term the universal concept signature, to tolerate the high noise and low coverage of this technology. We then incorporated Weighted Concept Signature Enrichment Analysis into an R package called "IndepthPathway" for biologists to broadly leverage this method for pathway analysis based on bulk and single-cell sequencing data. Through simulating technical variability and dropouts in gene expression characteristic of scRNA-seq, as well as benchmarking on a real dataset of matched single-cell and bulk RNA-seq data, we demonstrate that IndepthPathway presents outstanding stability and depth in pathway enrichment results under stochasticity of the data, and thus will substantially improve the scientific rigor of pathway analysis for single-cell sequencing data. AVAILABILITY AND IMPLEMENTATION: The IndepthPathway R package is available through: https://github.com/wangxlab/IndepthPathway.


Subjects
Single-Cell Analysis, Software, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, Gene Expression Profiling/methods, Exome Sequencing
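
The abstract above mentions simulating dropouts characteristic of scRNA-seq. One simple way to picture such a perturbation, which is not the authors' simulation procedure, is to zero out each count independently with a fixed probability. The function name `simulate_dropout` and the parameters `rate` and `seed` are hypothetical, and the counts are made up.

```python
import random


def simulate_dropout(counts, rate, seed=0):
    """Return a copy of `counts` where each value is set to 0 with probability `rate`.

    Illustrative only: real dropout in scRNA-seq usually depends on expression level,
    whereas this sketch uses one fixed probability for every gene.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)  # seeded so that the perturbation is reproducible
    return [0 if rng.random() < rate else c for c in counts]


# Hypothetical expression counts for six genes in one cell.
cell = [12, 0, 7, 3, 25, 1]
noisy = simulate_dropout(cell, rate=0.3, seed=42)
```

Re-running an enrichment method on `noisy` versus `cell` for many seeds is one way to probe the kind of stability the abstract refers to.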
2.
Environ Sci Technol ; 58(13): 5889-5898, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38501580

ABSTRACT

Human exposure to toxic chemicals presents a huge health burden. Key to understanding chemical toxicity is knowledge of the molecular target(s) of the chemicals. Because a comprehensive safety assessment for all chemicals is infeasible due to limited resources, a robust computational method for discovering targets of environmental exposures is a promising direction for public health research. In this study, we implemented a novel matrix completion algorithm named coupled matrix-matrix completion (CMMC) for predicting direct and indirect exposome-target interactions, which exploits the vast amount of accumulated data regarding chemical exposures and their molecular targets. Our approach achieved an AUC of 0.89 on a benchmark data set generated using data from the Comparative Toxicogenomics Database. Our case studies with bisphenol A and its analogues, PFAS, dioxins, PCBs, and VOCs show that CMMC can be used to accurately predict molecular targets of novel chemicals without any prior bioactivity knowledge. Our results demonstrate the feasibility and promise of computationally predicting environmental chemical-target interactions to efficiently prioritize chemicals in hazard identification and risk assessment.


Subjects
Dioxins, Polychlorinated Biphenyls, Humans, Environmental Exposure/analysis, Polychlorinated Biphenyls/analysis, Risk Assessment, Public Health
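
The abstract above reports performance as an AUC. As a reminder of what that number measures, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive pair is scored above a randomly chosen negative pair (ties count as one half). The labels and scores are made-up values, not data from the study, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical chemical-target pairs: 1 = known interaction, 0 = no known interaction.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of examples; a rank-based formula would be preferable for benchmark-sized data.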
3.
Brief Bioinform ; 22(2): 2161-2171, 2021 03 22.
Article in English | MEDLINE | ID: mdl-32186716

ABSTRACT

Predicting the interactions between drugs and targets plays an important role in new drug discovery and drug repurposing (also known as drug repositioning). There is a need to develop novel and efficient prediction approaches in order to avoid the costly and laborious process of determining drug-target interactions (DTIs) based on experiments alone. These computational prediction approaches should be capable of identifying the potential DTIs in a timely manner. Matrix factorization methods have been proven to be the most reliable group of methods. Here, we first propose a matrix factorization-based method termed 'Coupled Matrix-Matrix Completion' (CMMC). Next, to utilize the more comprehensive information provided in different databases and to incorporate multiple types of scores for drug-drug similarities and target-target relationships, we extend CMMC to 'Coupled Tensor-Matrix Completion' (CTMC) by considering drug-drug and target-target similarity/interaction tensors. RESULTS: Evaluation on two benchmark datasets, DrugBank and TTD, shows that CTMC outperforms the matrix-factorization-based methods GRMF, $L_{2,1}$-GRMF, NRLMF and NRLMF$\beta$. Based on the evaluation, CMMC and CTMC outperform these methods in terms of area under the curve, F1 score, sensitivity and specificity, with a considerably shorter run time.


Subjects
Computational Biology/methods, Drug Delivery Systems, Algorithms, Drug Development, Drug Interactions, Humans
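
To make the idea of matrix completion by factorization concrete, here is a minimal sketch of plain, uncoupled low-rank factorization fitted by stochastic gradient descent on the observed entries only. It is not CMMC or CTMC: it ignores the drug-drug and target-target similarity information that those methods couple in. The toy interaction matrix, the function name `complete_matrix`, and all hyperparameters are assumptions chosen for illustration.

```python
import random


def complete_matrix(observed, n_rows, n_cols, rank=2, lr=0.05, reg=0.01,
                    epochs=500, seed=0):
    """Fit U (n_rows x rank) and V (n_cols x rank) so that U[i] . V[j] matches observed[(i, j)].

    Returns a predictor for any (row, col) plus the squared error before and after fitting.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]

    def predict(i, j):
        return sum(U[i][k] * V[j][k] for k in range(rank))

    def squared_error():
        return sum((val - predict(i, j)) ** 2 for (i, j), val in observed.items())

    start_error = squared_error()
    for _ in range(epochs):
        for (i, j), val in observed.items():
            err = val - predict(i, j)
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step on the squared error with L2 regularization.
                U[i][k] += lr * (err * v - reg * u)
                V[j][k] += lr * (err * u - reg * v)
    return predict, start_error, squared_error()


# Hypothetical 4 drugs x 4 targets; entry (3, 3) is unobserved and gets predicted.
observed = {
    (0, 0): 1.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 1.0,
    (2, 2): 1.0, (2, 3): 1.0, (3, 2): 1.0,
    (0, 2): 0.0, (1, 3): 0.0, (2, 0): 0.0, (3, 1): 0.0,
}
predict, start_error, end_error = complete_matrix(observed, 4, 4)
score_3_3 = predict(3, 3)  # predicted score for the held-out drug-target pair
```

Scores for unobserved pairs such as `score_3_3` are what a DTI predictor would rank; coupled methods additionally constrain the factors with side information.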
4.
Brief Bioinform ; 22(1): 247-269, 2021 01 18.
Article in English | MEDLINE | ID: mdl-31950972

ABSTRACT

The task of predicting the interactions between drugs and targets plays a key role in the process of drug discovery. There is a need to develop novel and efficient prediction approaches in order to avoid determining drug-target interactions (DTIs) solely through costly, laborious and not-always-deterministic experiments. These approaches should be capable of identifying the potential DTIs in a timely manner. In this article, we describe the data required for the task of DTI prediction, followed by a comprehensive catalog of the machine learning methods and databases that have been proposed and utilized to predict DTIs. The advantages and disadvantages of each set of methods are also briefly discussed. Lastly, the challenges one may face in predicting DTIs using machine learning approaches are highlighted, and we conclude by shedding some light on important future research directions.


Subjects
Computational Biology/methods, Drug Discovery/methods, Machine Learning, Factual Databases, Humans
5.
Brief Bioinform ; 21(5): 1717-1732, 2020 09 25.
Article in English | MEDLINE | ID: mdl-31631213

ABSTRACT

Identifying new gene functions and pathways underlying diseases and biological processes are major challenges in genomics research. Particularly, most methods for interpreting the pathways characteristic of an experimental gene list defined by genomic data are limited by their dependence on assessing the overlapping genes or their interactome topology, which cannot account for the variety of functional relations. This is particularly problematic for pathway discovery from single-cell genomics with low gene coverage or interpreting complex pathway changes such as during change of cell states. Here, we exploited the comprehensive sets of molecular concepts that combine ontologies, pathways, interactions and domains to help inform the functional relations. We first developed a universal concept signature (uniConSig) analysis for genome-wide quantification of new gene functions underlying biological or pathological processes based on the signature molecular concepts computed from known functional gene lists. We then further developed a novel concept signature enrichment analysis (CSEA) for deep functional assessment of the pathways enriched in an experimental gene list. This method is grounded on the framework of shared concept signatures between gene sets at multiple functional levels, thus overcoming the limitations of the current methods. Through meta-analysis of transcriptomic data sets of cancer cell line models and single hematopoietic stem cells, we demonstrate the broad applications of CSEA on pathway discovery from gene expression and single-cell transcriptomic data sets for genetic perturbations and change of cell states, which complements the current modalities. The R modules for uniConSig analysis and CSEA are available through https://github.com/wangxlab/uniConSig.


Subjects
Gene Expression Profiling, Gene Regulatory Networks, Algorithms, Tumor Cell Line, Genetic Databases, Datasets as Topic, Genomics, Humans
6.
J Biol Chem ; 295(26): 8834-8845, 2020 06 26.
Article in English | MEDLINE | ID: mdl-32398261

ABSTRACT

Anaplastic thyroid cancer (ATC) is one of the most aggressive human malignancies, with an average life expectancy of ∼6 months from the time of diagnosis. The genetic and epigenetic changes that underlie this malignancy are incompletely understood. We found that ASH1-like histone lysine methyltransferase (ASH1L) is overexpressed in ATC relative to the much less aggressive and more common differentiated thyroid cancer. This increased expression was due at least in part to reduced levels of microRNA-200b-3p (miR-200b-3p), which represses ASH1L expression, in ATC. Genetic knockout of ASH1L protein expression in ATC cell lines decreased cell growth both in culture and in mouse xenografts. RNA-Seq analysis of ASH1L knockout versus WT ATC cell lines revealed that ASH1L is involved in the regulation of numerous cancer-related genes and gene sets. The pro-oncogenic long noncoding RNA colon cancer-associated transcript 1 (CCAT1) was one of the most highly (approximately 68-fold) down-regulated transcripts in ASH1L knockout cells. Therefore, we investigated CCAT1 as a potential mediator of the growth-inducing activity of ASH1L. Supporting this hypothesis, CCAT1 knockdown in ATC cells decreased their growth rate, and ChIP-Seq data indicated that CCAT1 is likely a direct target of ASH1L's histone methyltransferase activity. These results indicate that ASH1L contributes to the aggressiveness of ATC and suggest that ASH1L, along with its upstream regulator miR-200b-3p and its downstream mediator CCAT1, represents a potential therapeutic target in ATC.


Subjects
DNA-Binding Proteins/genetics, Histone-Lysine N-Methyltransferase/genetics, Anaplastic Thyroid Carcinoma/genetics, Thyroid Neoplasms/genetics, Animals, Tumor Cell Line, Female, Neoplastic Gene Expression Regulation, Humans, Inbred NOD Mice, SCID Mice, Anaplastic Thyroid Carcinoma/pathology, Thyroid Neoplasms/pathology
7.
J Biol Chem ; 295(25): 8537-8549, 2020 06 19.
Article in English | MEDLINE | ID: mdl-32371391

ABSTRACT

Overexpression of centromeric proteins has been identified in a number of human malignancies, but the functional and mechanistic contributions of these proteins to disease progression have not been characterized. The centromeric histone H3 variant centromere protein A (CENPA) is an epigenetic mark that determines centromere identity. Here, using an array of approaches, including RNA-sequencing and ChIP-sequencing analyses, immunohistochemistry-based tissue microarrays, and various cell biology assays, we demonstrate that CENPA is highly overexpressed in prostate cancer in both tissue and cell lines and that the level of CENPA expression correlates with the disease stage in a large cohort of patients. Gain-of-function and loss-of-function experiments confirmed that CENPA promotes prostate cancer cell line growth. The results from the integrated sequencing experiments suggested a previously unidentified function of CENPA as a transcriptional regulator that modulates expression of critical proliferation, cell-cycle, and centromere/kinetochore genes. Taken together, our findings show that CENPA overexpression is crucial to prostate cancer growth.


Subjects
Centromere Protein A/metabolism, Histones/metabolism, Prostatic Neoplasms/pathology, Cell Cycle Proteins/metabolism, Cell Division, Tumor Cell Line, Cell Proliferation/genetics, Centromere Protein A/antagonists & inhibitors, Centromere Protein A/genetics, Gain of Function Mutation, Histones/genetics, Humans, Male, Prostatic Neoplasms/metabolism, RNA Interference, Small Interfering RNA/metabolism
8.
Nutr Cancer ; 71(5): 772-780, 2019.
Article in English | MEDLINE | ID: mdl-30862188

ABSTRACT

AIM: Soy isoflavones have been suggested as epigenetic modulating agents with effects that could be important in carcinogenesis. Hypomethylation of LINE-1 has been associated with head and neck squamous cell carcinoma (HNSCC) development from oral premalignant lesions and with poor prognosis. To determine if neoadjuvant soy isoflavone supplementation could modulate LINE-1 methylation in HNSCC, we undertook a clinical trial. METHODS: Thirty-nine patients received 2-3 weeks of soy isoflavone supplements (300 mg/day) orally prior to surgery. Methylation of LINE-1 and 6 other genes was measured by pyrosequencing in biopsy, resection, and whole blood (WB) specimens. Changes in methylation were tested using paired t tests and ANOVA. Median follow-up was 45 months. RESULTS: LINE-1 methylation increased significantly after soy isoflavone (P < 0.005). Amount of change correlated positively with days of isoflavone taken (P = 0.04). Similar changes were not seen in corresponding WB samples. No significant changes in tumor or blood methylation levels were seen in the other candidate genes. CONCLUSION: This is the first demonstration of in vivo increases in tissue-specific global methylation associated with soy isoflavone intake in patients with HNSCC. Prior associations of LINE-1 hypomethylation with genetic instability, carcinogenesis, and prognosis suggest that soy isoflavones may be potential chemopreventive agents in HNSCC.


Subjects
DNA Methylation/drug effects, Dietary Supplements, Head and Neck Neoplasms/drug therapy, Isoflavones/pharmacology, Long Interspersed Nucleotide Elements/drug effects, Squamous Cell Carcinoma of Head and Neck/drug therapy, Female, Humans, Male, Middle Aged, Glycine max
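
The METHODS above test before/after methylation changes with paired t tests. A minimal sketch of the paired t statistic follows; the methylation percentages are invented for illustration and are not trial data, and `paired_t_statistic` is a hypothetical helper name. Converting the statistic to a P value needs the t distribution, which is left out here.

```python
import math


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired measurements on the same subjects."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance of the differences
    if var == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean / math.sqrt(var / n), n - 1


# Hypothetical LINE-1 methylation (%) in biopsy (before) and resection (after) tissue.
before = [60.0, 62.0, 58.0, 65.0]
after = [63.0, 64.0, 61.0, 66.0]
t_stat, dof = paired_t_statistic(before, after)
```

Pairing removes between-patient variation, which is why the test works on the per-patient differences rather than on the two groups separately.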
10.
Bioinformatics ; 33(15): 2381-2383, 2017 Aug 01.
Article in English | MEDLINE | ID: mdl-28369316

ABSTRACT

MOTIVATION: Analysis of next-generation sequencing data often results in a list of genomic regions. These may include differentially methylated CpGs/regions, transcription factor binding sites, interacting chromatin regions, or GWAS-associated SNPs, among others. A common analysis step is to annotate such genomic regions to genomic annotations (promoters, exons, enhancers, etc.). Existing tools are limited by a lack of annotation sources and flexible options, the time it takes to annotate regions, an artificial one-to-one region-to-annotation mapping, a lack of visualization options to easily summarize data, or some combination thereof. RESULTS: We developed the annotatr Bioconductor package to flexibly and quickly summarize and plot annotations of genomic regions. The annotatr package reports all intersections of regions and annotations, giving a better understanding of the genomic context of the regions. A variety of graphics functions are implemented to easily plot numerical or categorical data associated with the regions across the annotations, and across annotation intersections, providing insight into how characteristics of the regions differ across the annotations. We demonstrate that annotatr is up to 27× faster than comparable R packages. Overall, annotatr enables a richer biological interpretation of experiments. AVAILABILITY AND IMPLEMENTATION: http://bioconductor.org/packages/annotatr/ and https://github.com/rcavalcante/annotatr. CONTACT: rcavalca@umich.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
High-Throughput Nucleotide Sequencing/methods, Molecular Sequence Annotation/methods, Nucleic Acid Regulatory Sequences, DNA Sequence Analysis/methods, Software, Chromatin/metabolism, Exons, Genomics/methods, Single Nucleotide Polymorphism
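
The abstract above contrasts a one-to-one region-to-annotation mapping with reporting all intersections. The sketch below shows the "all intersections" idea on plain tuples; it is not the annotatr API (annotatr is an R/Bioconductor package), and the coordinates, annotation types, and the function name `annotate_regions` are made up. Intervals are assumed half-open, `[start, end)`.

```python
def annotate_regions(regions, annotations):
    """Return every (region, annotation type) pair that overlaps on the same chromosome."""
    hits = []
    for r_chrom, r_start, r_end in regions:
        for a_chrom, a_start, a_end, a_type in annotations:
            # Two half-open intervals overlap when each starts before the other ends.
            if r_chrom == a_chrom and r_start < a_end and a_start < r_end:
                hits.append(((r_chrom, r_start, r_end), a_type))
    return hits


regions = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
annotations = [
    ("chr1", 50, 150, "promoter"),
    ("chr1", 140, 400, "exon"),
    ("chr1", 550, 700, "enhancer"),
]
hits = annotate_regions(regions, annotations)
```

Here the first region is reported against both the promoter and the exon rather than being forced onto a single annotation. The nested loop is quadratic; production tools use interval trees or sorted sweeps instead.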
11.
J Biol Chem ; 291(37): 19274-86, 2016 09 09.
Article in English | MEDLINE | ID: mdl-27435678

ABSTRACT

A subset of thyroid carcinomas contains a t(2;3)(q13;p25) chromosomal translocation that fuses paired box gene 8 (PAX8) with the peroxisome proliferator-activated receptor γ gene (PPARG), resulting in expression of a PAX8-PPARγ fusion protein, PPFP. We previously generated a transgenic mouse model of PPFP thyroid carcinoma and showed that feeding the PPARγ agonist pioglitazone greatly decreased the size of the primary tumor and prevented metastatic disease in vivo. The antitumor effect correlates with the fact that pioglitazone turns PPFP into a strongly PPARγ-like molecule, resulting in trans-differentiation of the thyroid cancer cells into adipocyte-like cells that lose malignant character as they become more differentiated. To further study this process, we performed cell culture experiments with thyrocytes from the PPFP mouse thyroid cancers. Our data show that pioglitazone induced cellular lipid accumulation and the expression of adipocyte marker genes in the cultured cells, and shRNA knockdown of PPFP eliminated this pioglitazone effect. In addition, we found that PPFP and thyroid transcription factor 1 (TTF-1) physically interact, and that these transcription factors bind near each other on numerous target genes. TTF-1 knockdown and overexpression studies showed that TTF-1 inhibits PPFP target gene expression and impairs adipogenic trans-differentiation. Surprisingly, pioglitazone repressed TTF-1 expression in PPFP-expressing thyrocytes. Our data indicate that TTF-1 interacts with PPFP to inhibit the pro-adipogenic response to pioglitazone, and that the ability of pioglitazone to decrease TTF-1 expression contributes to its pro-adipogenic action.


Subjects
Adipogenesis, Cell Differentiation, Oncogene Fusion Proteins/metabolism, PAX8 Transcription Factor/metabolism, PPAR gamma/metabolism, Thyroid Neoplasms/metabolism, Animals, Tumor Cell Line, Mice, Nuclear Proteins, Oncogene Fusion Proteins/genetics, PAX8 Transcription Factor/genetics, PPAR gamma/genetics, Rats, Thyroid Neoplasms/genetics, Thyroid Neoplasms/pathology, Thyroid Nuclear Factor 1, Transcription Factors
12.
Bioinformatics ; 32(7): 1100-2, 2016 04 01.
Article in English | MEDLINE | ID: mdl-26607492

ABSTRACT

UNLABELLED: Tests for differential gene expression with RNA-seq data have a tendency to identify certain types of transcripts as significant, e.g. longer and highly-expressed transcripts. This tendency has been shown to bias gene set enrichment (GSE) testing, which is used to find over- or under-represented biological functions in the data. Yet, there remains a surprising lack of tools for GSE testing specific for RNA-seq. We present a new GSE method for RNA-seq data, RNA-Enrich, that accounts for the above tendency empirically by adjusting for average read count per gene. RNA-Enrich is a quick, flexible method and web-based tool, with 16 available gene annotation databases. It does not require a P-value cut-off to define differential expression, and works well even with small sample-sized experiments. We show that adjusting for read counts per gene improves both the type I error rate and detection power of the test. AVAILABILITY AND IMPLEMENTATION: RNA-Enrich is available at http://lrpath.ncibi.org or from supplemental material as R code. CONTACT: sartorma@umich.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Molecular Sequence Annotation, RNA Sequence Analysis, Gene Expression Profiling, RNA, Software
13.
Bioinformatics ; 32(10): 1536-43, 2016 05 15.
Article in English | MEDLINE | ID: mdl-26794319

ABSTRACT

MOTIVATION: Capabilities in the field of metabolomics have grown tremendously in recent years. Many existing resources contain the chemical properties and classifications of commonly identified metabolites. However, the annotation of small molecules (both endogenous and synthetic) to meaningful biological pathways and concepts still lags behind the analytical capabilities and the chemistry-based annotations. Furthermore, no tools are available to visually explore relationships and networks among functionally related groups of metabolites (biomedical concepts). Such a tool would provide the ability to establish testable hypotheses regarding links among metabolic pathways, cellular processes, phenotypes and diseases. RESULTS: Here we present ConceptMetab, an interactive web-based tool for mapping and exploring the relationships among 16 069 biologically defined metabolite sets developed from Gene Ontology, KEGG and Medical Subject Headings, using both KEGG and PubChem compound identifiers, and based on statistical tests for association. We demonstrate the utility of ConceptMetab with multiple scenarios, showing it can be used to identify known and potentially novel relationships among metabolic pathways, cellular processes, phenotypes and diseases, and provides an intuitive interface for linking compounds to their molecular functions and higher level biological effects. AVAILABILITY AND IMPLEMENTATION: http://conceptmetab.med.umich.edu CONTACTS: akarnovsky@umich.edu or sartorma@umich.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Metabolomics, Software, Datasets as Topic, Humans, Metabolic Networks and Pathways, Statistics as Topic, Controlled Vocabulary
14.
Hum Mol Genet ; 23(17): 4528-42, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-24781209

ABSTRACT

To globally survey the changes in transcriptional landscape during terminal erythroid differentiation, we performed RNA sequencing (RNA-seq) on primary human CD34(+) cells after ex vivo differentiation from the earliest into the most mature erythroid cell stages. This analysis identified thousands of novel intergenic and intronic transcripts as well as novel alternative transcript isoforms. After rigorous data filtering, 51 (presumptive) novel protein-coding transcripts, 5326 long and 679 small non-coding RNA candidates remained. The analysis also revealed two clear transcriptional trends during terminal erythroid differentiation: first, the complexity of transcript diversity was predominantly achieved by alternative splicing, and second, splicing junctional diversity diminished during erythroid differentiation. Finally, 404 genes that were not known previously to be differentially expressed in erythroid cells were annotated. Analysis of the most extremely differentially expressed transcripts revealed that these gene products were all closely associated with hematopoietic lineage differentiation. Taken together, this study will serve as a comprehensive platform for future in-depth investigation of human erythroid development that, in turn, may reveal new insights into multiple layers of the transcriptional regulatory hierarchy that controls erythropoiesis.


Subjects
Erythropoiesis/genetics, Gene Expression Profiling, Developmental Gene Expression Regulation, Adult, Cell Differentiation/genetics, Cell Lineage/genetics, Erythroid Cells/cytology, Erythroid Cells/metabolism, Humans, Open Reading Frames/genetics, Protein Isoforms/metabolism, RNA Splicing/genetics, Messenger RNA/genetics, Messenger RNA/metabolism, Untranslated RNA/genetics, RNA Sequence Analysis, beta-Globins/metabolism
15.
Breast Cancer Res Treat ; 158(1): 29-41, 2016 07.
Article in English | MEDLINE | ID: mdl-27306423

ABSTRACT

Curcumin is a potential agent for both the prevention and treatment of cancers. Curcumin treatment alone, or in combination with piperine, limits breast stem cell self-renewal, while remaining non-toxic to normal differentiated cells. We paired fluorescence-activated cell sorting with RNA sequencing to characterize the genome-wide changes induced specifically in normal breast stem cells following treatment with these compounds. We generated genome-wide maps of the transcriptional changes that occur in epithelial-like (ALDH+) and mesenchymal-like (ALDH-/CD44+/CD24-) normal breast stem/progenitor cells following treatment with curcumin and piperine. We show that curcumin targets both stem cell populations by down-regulating expression of breast stem cell genes including ALDH1A3, CD49f, PROM1, and TP63. We also identified novel genes and pathways targeted by curcumin, including downregulation of SCD. Transient siRNA knockdown of SCD in MCF10A cells significantly inhibited mammosphere formation and the mean proportion of CD44+/CD24- cells, suggesting that SCD is a regulator of breast stemness and a target of curcumin in breast stem cells. These findings extend previous reports of curcumin targeting stem cells, here in two phenotypically distinct stem/progenitor populations isolated from normal human breast tissue. We identified novel mechanisms by which curcumin and piperine target breast stem cell self-renewal, such as by targeting lipid metabolism, providing a mechanistic link between curcumin treatment and stem cell self-renewal. These results elucidate the mechanisms by which curcumin may act as a cancer-preventive compound and provide novel targets for cancer prevention and treatment.


Subjects
Antineoplastic Agents/pharmacology, Breast Neoplasms/genetics, Curcumin/pharmacology, Gene Expression Profiling/methods, RNA Sequence Analysis/methods, Stearoyl-CoA Desaturase/genetics, Alkaloids/pharmacology, Benzodioxoles/pharmacology, Breast Neoplasms/prevention & control, Cell Differentiation/drug effects, Tumor Cell Line, Cell Proliferation/drug effects, Cell Separation, Female, Flow Cytometry, Neoplastic Gene Expression Regulation/drug effects, Humans, MCF-7 Cells, Piperidines/pharmacology, Polyunsaturated Alkamides/pharmacology, Stem Cells/drug effects, Stem Cells/metabolism
16.
Nucleic Acids Res ; 42(13): e105, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24878920

ABSTRACT

Gene set enrichment testing can enhance the biological interpretation of ChIP-seq data. Here, we develop a method, ChIP-Enrich, for this analysis which empirically adjusts for gene locus length (the length of the gene body and its surrounding non-coding sequence). Adjustment for gene locus length is necessary because it is often positively associated with the presence of one or more peaks and because many biologically defined gene sets have an excess of genes with longer or shorter gene locus lengths. Unlike alternative methods, ChIP-Enrich can account for the wide range of gene locus length-to-peak presence relationships (observed in ENCODE ChIP-seq data sets). We show that ChIP-Enrich has a well-calibrated type I error rate using permuted ENCODE ChIP-seq data sets; in contrast, two commonly used gene set enrichment methods, Fisher's exact test and the binomial test implemented in Genomic Regions Enrichment of Annotations Tool (GREAT), can have highly inflated type I error rates and biases in ranking. We identify DNA-binding proteins, including CTCF, JunD and glucocorticoid receptor α (GRα), that show different enrichment patterns for peaks closer to versus further from transcription start sites. We also identify known and potential new biological functions of GRα. ChIP-Enrich is available as a web interface (http://chip-enrich.med.umich.edu) and Bioconductor package.


Subjects
Chromatin Immunoprecipitation/methods, Genes, Genetic Loci, DNA Sequence Analysis/methods, DNA-Binding Proteins/analysis, Logistic Models, Glucocorticoid Receptors/analysis
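
For context, the Fisher's exact test named above as a commonly used baseline can be sketched as a one-sided hypergeometric tail probability. This is the baseline that the abstract says can have inflated type I error because it ignores gene locus length; it is not ChIP-Enrich itself. The gene counts and the function name `fisher_enrichment_p` are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n_set, n_hits, n_total):
    """One-sided P(X >= k) under the hypergeometric null.

    n_total: genes in the background
    n_set:   genes in the gene set being tested
    n_hits:  genes with at least one peak
    k:       genes that are both in the set and have a peak
    """
    upper = min(n_set, n_hits)
    denom = comb(n_total, n_hits)
    tail = sum(
        comb(n_set, x) * comb(n_total - n_set, n_hits - x)
        for x in range(k, upper + 1)
    )
    return tail / denom


# Hypothetical numbers: 10 genes, 4 in the set, 3 with a peak, 2 in both.
p_value = fisher_enrichment_p(k=2, n_set=4, n_hits=3, n_total=10)
```

Every gene is treated as equally likely to carry a peak in this null model, which is exactly the assumption that a locus-length adjustment relaxes.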
17.
J Virol ; 88(16): 8924-35, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24872592

ABSTRACT

UNLABELLED: Approximately 8% of the human genome is made up of endogenous retroviral sequences. As the HIV-1 Tat protein activates the overall expression of the human endogenous retrovirus type K (HERV-K) (HML-2), we used next-generation sequencing to determine which of the 91 currently annotated HERV-K (HML-2) proviruses are regulated by Tat. Transcriptome sequencing of total RNA isolated from Tat- and vehicle-treated peripheral blood lymphocytes from a healthy donor showed that Tat significantly activates expression of 26 unique HERV-K (HML-2) proviruses, silences 12, and does not significantly alter the expression of the remaining proviruses. Quantitative reverse transcription-PCR validation of the sequencing data was performed on Tat-treated PBLs of seven donors using provirus-specific primers and corroborated the results with a substantial degree of quantitative similarity. IMPORTANCE: The expression of HERV-K (HML-2) is tightly regulated but becomes markedly increased following infection with HIV-1, in part due to the HIV-1 Tat protein. The findings reported here demonstrate the complexity of the genome-wide regulation of HERV-K (HML-2) expression by Tat. This work also demonstrates that although HERV-K (HML-2) proviruses in the human genome are highly similar in terms of DNA sequence, modulation of the expression of specific proviruses in a given biological situation can be ascertained using next-generation sequencing and bioinformatics analysis.


Subjects
Endogenous Retroviruses/genetics, tat Gene Products/genetics, tat Gene Products/metabolism, HIV-1/genetics, HIV-1/metabolism, Transcriptome/genetics, Cultured Cells, Endogenous Retroviruses/metabolism, Human Genome/genetics, HIV Infections/genetics, HIV Infections/metabolism, High-Throughput Nucleotide Sequencing/methods, Humans, Lymphocytes/virology, Proviruses/genetics, Proviruses/metabolism, Viral RNA/genetics, Viral Proteins/genetics, Viral Proteins/metabolism
18.
Bioinformatics ; 30(17): 2414-22, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-24836530

ABSTRACT

MOTIVATION: DNA methylation plays critical roles in gene regulation and cellular specification without altering DNA sequences. The wide application of reduced representation bisulfite sequencing (RRBS) and whole genome bisulfite sequencing (bis-seq) opens the door to study DNA methylation at single CpG site resolution. One challenging question is how best to test for significant methylation differences between groups of biological samples in order to minimize false positive findings. RESULTS: We present a statistical analysis package, methylSig, to analyse genome-wide methylation differences between samples from different treatments or disease groups. MethylSig takes into account both read coverage and biological variation by utilizing a beta-binomial approach across biological samples for a CpG site or region, and identifies relevant differences in CpG methylation. It can also incorporate local information to improve group methylation level and/or variance estimation for experiments with small sample size. A permutation study based on data from enhanced RRBS samples shows that methylSig maintains a well-calibrated type-I error when the number of samples is three or more per group. Our simulations show that methylSig has higher sensitivity compared with several alternative methods. The use of methylSig is illustrated with a comparison of different subtypes of acute leukemia and normal bone marrow samples. AVAILABILITY: methylSig is available as an R package at http://sartorlab.ccmb.med.umich.edu/software. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
DNA Methylation, DNA Sequence Analysis/methods, Software, CpG Islands, Genomics, Humans, Acute Myeloid Leukemia/genetics, Sulfites
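
The beta-binomial approach mentioned above models the number of methylated reads at a CpG site given its coverage, with extra variance for sample-to-sample differences. The sketch below evaluates the standard beta-binomial probability mass function; it illustrates the distribution only and is not the methylSig likelihood or test. The read counts and the `alpha`/`beta` values are assumptions.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, alpha, beta):
    """P(k methylated reads out of n) when the methylation level is Beta(alpha, beta)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


# Hypothetical site: coverage of 10 reads, 3 of them methylated.
p_flat = beta_binomial_pmf(3, 10, alpha=1.0, beta=1.0)    # uniform prior on the methylation level
p_peaked = beta_binomial_pmf(3, 10, alpha=3.0, beta=7.0)  # prior centred near 30% methylation
total = sum(beta_binomial_pmf(k, 10, 3.0, 7.0) for k in range(11))
```

Working on the log scale with `lgamma` keeps the computation stable at high read coverage, where binomial coefficients overflow ordinary floats.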
19.
Bioinformatics ; 30(17): i393-400, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-25161225

ABSTRACT

MOTIVATION: Functional enrichment testing facilitates the interpretation of Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) data in terms of pathways and other biological contexts. Previous methods developed and used to test for key gene sets affected in ChIP-seq experiments treat peaks as points, and are based on the number of peaks associated with a gene or a binary score for each gene. These approaches work well for transcription factors, but histone modifications often occur over broad domains, and across multiple genes. RESULTS: To incorporate the unique properties of broad domains into functional enrichment testing, we developed Broad-Enrich, a method that uses the proportion of each gene's locus covered by a peak. We show that our method has a well-calibrated false-positive rate, performing well with ChIP-seq data having broad domains compared with alternative approaches. We illustrate Broad-Enrich with 55 ENCODE ChIP-seq datasets using different methods to define gene loci. Broad-Enrich can also be applied to other datasets consisting of broad genomic domains such as copy number variations. AVAILABILITY AND IMPLEMENTATION: http://broad-enrich.med.umich.edu for Web version and R package. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Chromatin Immunoprecipitation/methods, Genomics/methods, Histones/metabolism, Cell Line, Genetic Loci, High-Throughput Nucleotide Sequencing, Humans, Logistic Models, DNA Sequence Analysis, Transcription Factors/metabolism
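
The quantity at the heart of the method described above is the proportion of each gene's locus covered by a peak. A minimal sketch of that computation follows, assuming half-open coordinates on a single chromosome and clipping peaks to the locus before merging overlaps. The coordinates and the function name `covered_fraction` are hypothetical, and the enrichment model built on top of this proportion is not shown.

```python
def covered_fraction(locus, peaks):
    """Fraction of the locus [start, end) covered by the union of the given peaks."""
    start, end = locus
    if end <= start:
        raise ValueError("locus must have positive length")
    # Keep only peaks that touch the locus and clip them to its boundaries.
    clipped = sorted(
        (max(s, start), min(e, end)) for s, e in peaks if s < end and e > start
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start  # close the previous merged block
            cur_start, cur_end = s, e
        else:
            cur_end = max(cur_end, e)  # extend the current merged block
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / (end - start)


# Hypothetical gene locus and broad histone-mark peaks.
locus = (1000, 2000)
peaks = [(900, 1200), (1100, 1300), (1800, 2500), (3000, 3100)]
fraction = covered_fraction(locus, peaks)
```

Merging before summing matters: the first two peaks overlap, so adding their lengths directly would count the shared bases twice.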
20.
Bioinformatics ; 30(18): 2568-75, 2014 Sep 15.
Article in English | MEDLINE | ID: mdl-24894502

ABSTRACT

MOTIVATION: ChIP-Seq is the standard method to identify genome-wide DNA-binding sites for transcription factors (TFs) and histone modifications. There is a growing need to analyze experiments with biological replicates, especially for epigenomic experiments where variation among biological samples can be substantial. However, tools that can perform group comparisons are currently lacking. RESULTS: We present a peak-calling prioritization pipeline (PePr) for identifying consistent or differential binding sites in ChIP-Seq experiments with biological replicates. PePr models read counts across the genome among biological samples with a negative binomial distribution and uses a local variance estimation method, ranking consistent or differential binding sites more favorably than sites with greater variability. We compared PePr with commonly used and recently proposed approaches on eight TF datasets and show that PePr uniquely identifies consistent regions with enriched read counts, high motif occurrence rate and known characteristics of TF binding based on visual inspection. For histone modification data with broadly enriched regions, PePr identified differential regions that are consistent within groups and outperformed other methods in scaling False Discovery Rate (FDR) analysis. AVAILABILITY AND IMPLEMENTATION: http://code.google.com/p/pepr-chip-seq/.


Subjects
Algorithms, Chromatin Immunoprecipitation/methods, Genomics/methods, High-Throughput Nucleotide Sequencing/methods, Animals, Tumor Cell Line, Epigenomics, Mice, Nucleotide Motifs, Oligonucleotide Array Sequence Analysis, DNA Sequence Analysis, Transcription Factors/metabolism
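
The abstract above models read counts with a negative binomial distribution. As an illustration of that distribution only (not of PePr's local variance estimation or ranking), the sketch below evaluates the negative binomial probability mass function in the mean/dispersion form, where the variance is `mu + mu**2 / size`. The parameter names `mu` and `size` and the example values are assumptions.

```python
from math import exp, lgamma, log


def neg_binomial_pmf(k, mu, size):
    """P(K = k) for a negative binomial with mean `mu` and dispersion parameter `size`.

    Smaller `size` means more overdispersion relative to a Poisson with the same mean.
    """
    if mu <= 0 or size <= 0:
        raise ValueError("mu and size must be positive")
    p = size / (size + mu)  # success probability in the classical parameterization
    log_pmf = (
        lgamma(k + size) - lgamma(size) - lgamma(k + 1)
        + size * log(p) + k * log(1.0 - p)
    )
    return exp(log_pmf)


# Hypothetical window with a mean of 5 reads and strong overdispersion.
p_zero = neg_binomial_pmf(0, mu=5.0, size=2.0)
mass = sum(neg_binomial_pmf(k, 5.0, 2.0) for k in range(500))
mean = sum(k * neg_binomial_pmf(k, 5.0, 2.0) for k in range(500))
```

The extra variance term is what lets replicate-to-replicate variability be absorbed by the model instead of being mistaken for differential binding.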