1.
Brief Bioinform ; 23(3), 2022 05 13.
Article in English | MEDLINE | ID: mdl-35272348

ABSTRACT

Given that most tissues consist of abundant and diverse (sub-)cell types, an important yet unaddressed problem in bulk RNA-seq analysis is to identify in which (sub-)cell type(s) the differential expression occurs. Single-cell RNA-sequencing (scRNA-seq) technologies can answer the question, but they are often labor-intensive and cost-prohibitive. Here, we present LRcell, a computational method that aims to identify the specific (sub-)cell type(s) that drive the changes observed in a bulk RNA-seq experiment. In addition, LRcell provides pre-embedded marker genes computed from putative scRNA-seq experiments as options to execute the analyses. We conduct a simulation study to demonstrate the effectiveness and reliability of LRcell. Using three different real datasets, we show that LRcell successfully identifies known cell types involved in psychiatric disorders. Applying LRcell to bulk RNA-seq results can produce a hypothesis on which (sub-)cell type(s) contribute to the differential expression. LRcell is complementary to cell type deconvolution methods.


Subject(s)
Gene Expression Profiling; Single-Cell Analysis; Computer Simulation; Gene Expression Profiling/methods; Humans; RNA-Seq; Reproducibility of Results; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods
2.
Brief Bioinform ; 23(1), 2022 01 17.
Article in English | MEDLINE | ID: mdl-34643213

ABSTRACT

Understanding the impact of non-coding sequence variants on complex diseases is an essential problem. We present a novel ensemble learning framework, CASAVA, to predict genomic loci in terms of disease category-specific risk. Using disease-associated variants identified by GWAS as training data, and diverse sequencing-based genomics and epigenomics profiles as features, CASAVA provides risk prediction of 24 major categories of diseases throughout the human genome. Our studies showed that CASAVA scores at a genomic locus provide a reasonable prediction of the disease-specific and disease category-specific risk for non-coding variants located within the locus. Taking MHC2TA and immune system diseases as an example, we demonstrate the potential of CASAVA in revealing variant-disease associations. A website (http://zhanglabtools.org/CASAVA) has been built to facilitate easy access to CASAVA scores.


Subject(s)
Genome-Wide Association Study; Polymorphism, Single Nucleotide; Genome, Human; Genomics; Humans; Machine Learning
3.
Neurobiol Dis ; 185: 106257, 2023 09.
Article in English | MEDLINE | ID: mdl-37562656

ABSTRACT

Alzheimer's disease (AD) is a neurodegenerative disorder influenced by a complex interplay of environmental, epigenetic, and genetic factors. DNA methylation (5mC) and hydroxymethylation (5hmC) are DNA modifications that serve as tissue-specific and temporal regulators of gene expression. TET family enzymes dynamically regulate these epigenetic modifications in response to environmental conditions, connecting environmental factors with gene expression. Previous epigenetic studies have identified 5mC and 5hmC changes associated with AD. In this study, we performed targeted resequencing of TET1 on a cohort of early-onset AD (EOAD) and control samples. Through gene-wise burden analysis, we observed significant enrichment of rare TET1 variants associated with AD (p = 0.04). We also profiled 5hmC in human postmortem brain tissues from AD and control groups. Our analysis identified differentially hydroxymethylated regions (DhMRs) in key genes responsible for regulating the methylome: TET3, DNMT3L, DNMT3A, and MECP2. To further investigate the role of Tet1 in AD pathogenesis, we used the 5xFAD mouse model with a Tet1 KO allele to examine how Tet1 loss influences AD pathogenesis. We observed significant changes in neuropathology, 5hmC, and RNA expression associated with Tet1 loss, while the behavioral alterations were not significant. The loss of Tet1 significantly increased amyloid plaque burden in the 5xFAD mouse (p = 0.044) and led to a non-significant trend towards exacerbated AD-associated stress response in 5xFAD mice. At the molecular level, we found significant DhMRs enriched in genes involved in pathways responsible for neuronal projection organization, dendritic spine development and organization, and myelin assembly. RNA-Seq analysis revealed a significant increase in the expression of AD-associated genes such as Mpeg1, Ctsd, and Trem2. In conclusion, our results suggest that TET enzymes, particularly TET1, which regulate the methylome, may contribute to AD pathogenesis, as the loss of TET function increases AD-associated pathology.


Subject(s)
Alzheimer Disease; Humans; Mice; Animals; Alzheimer Disease/metabolism; 5-Methylcytosine; Epigenesis, Genetic; DNA Methylation; Transcription Factors/metabolism; Mixed Function Oxygenases/genetics; Mixed Function Oxygenases/metabolism; Proto-Oncogene Proteins/genetics; Proto-Oncogene Proteins/metabolism; Membrane Glycoproteins/metabolism; Receptors, Immunologic/metabolism; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism
4.
Prostate ; 83(6): 590-601, 2023 05.
Article in English | MEDLINE | ID: mdl-36760203

ABSTRACT

BACKGROUND: Long noncoding RNAs (lncRNAs) are RNA molecules of more than 200 nucleotides that do not code for proteins but are known to be widely expressed and to have key roles in gene regulation and cellular functions. They are also involved in the onset and development of various cancers, including prostate cancer (PCa). Since PCa is commonly driven by androgen-regulated signaling, mainly through stimulated pathways, identifying lncRNAs and determining their influence on the androgen response is useful and necessary. LncRNAs regulated by the androgen receptor (AR) can serve as potential biomarkers for PCa. In the present study, gene expression data analyses were performed to identify lncRNAs related to the androgen response pathway. METHODS AND RESULTS: We used publicly available RNA-sequencing and ChIP-seq data to identify lncRNAs that are associated with the androgen response pathway. Using Universal Correlation Coefficient (UCC) and Pearson Correlation Coefficient (PCC) analyses, we found 15 lncRNAs that (a) have highly correlated expression with androgen response genes in PCa and (b) are differentially expressed in the setting of treatment with an androgen agonist as well as an antagonist compared to controls. Using publicly available ChIP-seq data, we investigated the role of the androgen/AR axis in regulating the expression of these lncRNAs. We observed AR binding in the promoter regions of 5 lncRNAs (MIR99AHG, DUBR, DRAIC, PVT1, and COLCA1), showing the direct influence of AR on their expression and highlighting their association with the androgen response pathway. CONCLUSION: By utilizing publicly available multiomics data and by employing in silico methods, we identified five candidate lncRNAs that are involved in the androgen response pathway. These lncRNAs should be investigated as potential biomarkers for PCa.
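
The Pearson correlation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `pearson`, the variable names, and the expression values are all hypothetical, and the UCC analysis is not covered.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical expression values across six samples (not from the study).
lncrna_expr = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
androgen_score = [2.1, 3.9, 6.2, 8.1, 9.8, 12.0]
r = pearson(lncrna_expr, androgen_score)
```

A value of `r` close to 1 would flag the hypothetical lncRNA as co-expressed with the androgen response score; any threshold for "highly correlated" is a choice the abstract does not specify.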


Subject(s)
Prostatic Neoplasms; RNA, Long Noncoding; Male; Humans; Androgens; RNA, Long Noncoding/genetics; Cell Line, Tumor; Prostatic Neoplasms/drug therapy; Prostatic Neoplasms/genetics; Prostatic Neoplasms/metabolism; Receptors, Androgen/genetics; Receptors, Androgen/metabolism; Gene Expression Regulation; Gene Expression Regulation, Neoplastic
5.
Mol Cell ; 58(2): 216-31, 2015 Apr 16.
Article in English | MEDLINE | ID: mdl-25818644

ABSTRACT

Chromosomes of metazoan organisms are partitioned in the interphase nucleus into discrete topologically associating domains (TADs). Borders between TADs are formed in regions containing active genes and clusters of architectural protein binding sites. The transcription of most genes is repressed after temperature stress in Drosophila. Here we show that temperature stress induces relocalization of architectural proteins from TAD borders to inside TADs, and this is accompanied by a dramatic rearrangement in the 3D organization of the nucleus. TAD border strength declines, allowing for an increase in long-distance inter-TAD interactions. Similar but quantitatively weaker effects are observed upon inhibition of transcription or depletion of individual architectural proteins. Heat shock-induced inter-TAD interactions result in increased contacts among enhancers and promoters of silenced genes, which recruit Pc and form Pc bodies in the nucleolus. These results suggest that the TAD organization of metazoan genomes is plastic and can be reconfigured quickly.


Subject(s)
Chromatin/genetics; Chromosomes/genetics; Drosophila Proteins/genetics; Drosophila melanogaster/genetics; Polycomb-Group Proteins/metabolism; Animals; Cell Line; Drosophila Proteins/chemistry; Drosophila Proteins/metabolism; Drosophila melanogaster/metabolism; Enhancer Elements, Genetic; Molecular Sequence Data; Polycomb-Group Proteins/chemistry; Polycomb-Group Proteins/genetics; Promoter Regions, Genetic; Regulatory Sequences, Nucleic Acid; Stress, Physiological; Temperature
6.
Mol Cell ; 53(2): 247-61, 2014 Jan 23.
Article in English | MEDLINE | ID: mdl-24389101

ABSTRACT

Here we report a comprehensive characterization of our recently developed inhibitor MM-401 that targets the MLL1 H3K4 methyltransferase activity. MM-401 is able to specifically inhibit MLL1 activity by blocking MLL1-WDR5 interaction and thus the complex assembly. This targeting strategy does not affect other mixed-lineage leukemia (MLL) family histone methyltransferases (HMTs), revealing a unique regulatory feature for the MLL1 complex. Using MM-401 and its enantiomer control MM-NC-401, we show that inhibiting MLL1 methyltransferase activity specifically blocks proliferation of MLL cells by inducing cell-cycle arrest, apoptosis, and myeloid differentiation without general toxicity to normal bone marrow cells or non-MLL cells. More importantly, transcriptome analyses show that MM-401 induces changes in gene expression similar to those of MLL1 deletion, supporting a predominant role of MLL1 activity in regulating the MLL1-dependent leukemia transcription program. We envision broad applications for MM-401 in basic and translational research.


Subject(s)
Histone-Lysine N-Methyltransferase/antagonists & inhibitors; Histone-Lysine N-Methyltransferase/metabolism; Histones/metabolism; Leukemia, Biphenotypic, Acute/enzymology; Myeloid-Lymphoid Leukemia Protein/metabolism; Animals; Apoptosis/drug effects; Cell Cycle Checkpoints/drug effects; Cell Differentiation/drug effects; Cell Line, Tumor; Cell Proliferation; Histone Methyltransferases; Histone-Lysine N-Methyltransferase/chemistry; Histone-Lysine N-Methyltransferase/genetics; Humans; Intracellular Signaling Peptides and Proteins; Mice; Myeloid-Lymphoid Leukemia Protein/chemistry; Myeloid-Lymphoid Leukemia Protein/genetics; Oligopeptides/chemistry; Oligopeptides/physiology; Proteins/metabolism; Proto-Oncogene Proteins c-bcl-2/metabolism; Transcriptome/drug effects
7.
Bioinformatics ; 36(3): 690-697, 2020 02 01.
Article in English | MEDLINE | ID: mdl-31504167

ABSTRACT

MOTIVATION: Annotating a given genomic locus or a set of genomic loci is an important yet challenging task. This is especially true for the non-coding part of the genome, which is enormous yet poorly understood. Since gene set enrichment analyses have proven to be an effective approach to annotating a set of genes, the same idea can be extended to explore the enrichment of functional elements or features in a set of genomic intervals to reveal potential functional connections. RESULTS: In this study, we describe a novel computational strategy named loci2path that takes advantage of newly emerged, genome-wide and tissue-specific expression quantitative trait loci (eQTL) information to help annotate a set of genomic intervals in terms of transcription regulation. By checking the presence or absence of millions of eQTLs in a set of input genomic intervals, combined with grouping eQTLs by the pathways or gene sets that their target genes belong to, loci2path builds a bridge connecting genomic intervals to functional pathways and pre-defined biologically meaningful gene sets, revealing potential regulatory connections. Our method enjoys two key advantages over existing methods: first, we no longer rely on proximity to link a locus to a gene, which has been shown to be unreliable; second, eQTLs allow us to provide the regulatory annotation in the context of specific tissue types. To demonstrate its utility, we apply loci2path to sets of genomic intervals harboring disease-associated variants as queries. Using 1 702 612 eQTLs discovered by the Genotype-Tissue Expression (GTEx) project across 44 tissues and 6320 pathways or gene sets cataloged in MSigDB as annotation resources, our method successfully identifies highly relevant biological pathways and reveals disease mechanisms for psoriasis and other immune-related diseases. Tissue specificity analysis of associated eQTLs provides additional evidence of the distinct roles that different tissues play in the disease mechanisms. AVAILABILITY AND IMPLEMENTATION: loci2path is published as an open source Bioconductor package, and it is available at http://bioconductor.org/packages/release/bioc/html/loci2path.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
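
The idea of linking query intervals to pathways through eQTLs, rather than through gene proximity, can be sketched as follows. This is an illustration only and does not use the loci2path package: the hypergeometric test is an assumed choice of enrichment statistic, and the function names, eQTL records, pathways, and query intervals are all hypothetical toy data.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are 'successes'."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)

def in_any_interval(chrom, pos, intervals):
    """True if (chrom, pos) lies in any closed interval (chrom, start, end)."""
    return any(c == chrom and start <= pos <= end for c, start, end in intervals)

def pathway_enrichment(eqtls, pathways, query):
    """eqtls: list of (chrom, pos, target_gene); pathways: {name: set of genes}.

    For each pathway, test whether eQTLs targeting its genes are
    over-represented among the eQTLs that fall inside the query intervals.
    """
    hits = [e for e in eqtls if in_any_interval(e[0], e[1], query)]
    results = {}
    for name, genes in pathways.items():
        K = sum(1 for e in eqtls if e[2] in genes)   # pathway eQTLs overall
        k = sum(1 for e in hits if e[2] in genes)    # pathway eQTLs in query
        results[name] = hypergeom_sf(k, len(eqtls), K, len(hits))
    return results

# Hypothetical toy data (not GTEx or MSigDB).
eqtls = [("chr1", 100, "A"), ("chr1", 150, "B"), ("chr1", 900, "C"),
         ("chr2", 50, "A"), ("chr2", 500, "D"), ("chr2", 700, "E")]
pathways = {"immune": {"A", "B"}, "other": {"C", "D", "E"}}
query = [("chr1", 50, 200), ("chr2", 0, 100)]
result = pathway_enrichment(eqtls, pathways, query)
```

In a tissue-aware setting, the same test would be repeated per tissue using only that tissue's eQTLs; that loop is omitted here for brevity.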


Subject(s)
Genomics; Quantitative Trait Loci; Gene Expression Regulation; Genome-Wide Association Study; Organ Specificity; Polymorphism, Single Nucleotide
8.
Bioinformatics ; 36(8): 2352-2358, 2020 04 15.
Article in English | MEDLINE | ID: mdl-31899481

ABSTRACT

MOTIVATION: The availability of thousands of genome-wide chromatin immunoprecipitation sequencing (ChIP-Seq) datasets across hundreds of transcription factors (TFs) and cell lines provides an unprecedented opportunity to jointly analyze large-scale TF-binding in vivo, making possible the discovery of the potential interaction and cooperation among different TFs. Interacting and cooperating TFs can potentially form a transcriptional regulatory module (TRM) (e.g. co-binding TFs), which helps decipher the combinatorial regulatory mechanisms. RESULTS: We develop a computational method, tfLDA, that applies state-of-the-art topic models to multiple ChIP-Seq datasets to decipher the combinatorial binding events of multiple TFs. tfLDA is able to learn high-order combinatorial binding patterns of TFs from multiple ChIP-Seq profiles, and to interpret and visualize the combinatorial patterns. We apply tfLDA to two cell lines with a rich collection of TFs and identify combinatorial binding patterns that show well-known TRMs and related TF co-binding events. AVAILABILITY AND IMPLEMENTATION: An R package, tfLDA, is freely available at https://github.com/lichen-lab/tfLDA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Chromatin Immunoprecipitation Sequencing; Transcription Factors; Binding Sites; Chromatin Immunoprecipitation; Databases, Genetic; Sequence Analysis, DNA; Transcription Factors/genetics
9.
Mol Cell ; 49(1): 80-93, 2013 Jan 10.
Article in English | MEDLINE | ID: mdl-23159737

ABSTRACT

Histone methyltransferases (HMTases), as chromatin modifiers, regulate the transcriptomic landscape in normal development as well as in diseases such as cancer. Here, we molecularly order two HMTases, EZH2 and MMSET, that have established genetic links to oncogenesis. EZH2, which mediates histone H3K27 trimethylation and is associated with gene silencing, was shown to be coordinately expressed and function upstream of MMSET, which mediates H3K36 dimethylation and is associated with active transcription. We found that the EZH2-MMSET HMTase axis is coordinated by a microRNA network and that the oncogenic functions of EZH2 require MMSET activity. Together, these results suggest that the EZH2-MMSET HMTase axis coordinately functions as a master regulator of transcriptional repression, activation, and oncogenesis and may represent an attractive therapeutic target in cancer.


Subject(s)
Gene Expression Regulation, Neoplastic; Histone-Lysine N-Methyltransferase/metabolism; Polycomb Repressive Complex 2/metabolism; Prostatic Neoplasms/enzymology; Repressor Proteins/metabolism; 3' Untranslated Regions; Animals; Cell Line, Tumor; Cell Proliferation; Cell Transformation, Neoplastic/genetics; Cell Transformation, Neoplastic/metabolism; Chick Embryo; Chorioallantoic Membrane/pathology; Enhancer of Zeste Homolog 2 Protein; Gene Expression; Gene Knockdown Techniques; Histone-Lysine N-Methyltransferase/genetics; Histones/metabolism; Humans; Male; Mice; Mice, Inbred BALB C; Mice, Nude; MicroRNAs/metabolism; Neoplasm Invasiveness; Neoplasm Transplantation; Polycomb Repressive Complex 2/genetics; Prostatic Neoplasms/metabolism; Prostatic Neoplasms/pathology; RNA Interference; Repressor Proteins/genetics; Tissue Array Analysis; Transcriptional Activation
10.
Bioinformatics ; 35(13): 2167-2176, 2019 07 01.
Article in English | MEDLINE | ID: mdl-30475980

ABSTRACT

MOTIVATION: The replication timing (RT) program has been linked to many key biological processes including cell fate commitment, 3D chromatin organization and transcription regulation. Significant technological progress now allows the RT program to be characterized across the entire human genome in a high-throughput and high-resolution fashion. These experiments suggest that RT changes dynamically during development in coordination with gene activity. Since RT is such a fundamental biological process, we believe that an effective quantitative profile of the local RT program from a diverse set of cell types in various developmental stages and lineages can provide crucial biological insights for a genomic locus. RESULTS: In this study, we explored recurrent and spatially coherent combinatorial profiles from 42 RT programs collected from multiple lineages at diverse differentiation states. We found that a Hidden Markov Model with 15 hidden states provides a good model to describe these genome-wide RT profiling data. Each hidden state represents a unique combination of RT profiles across different cell types, which we refer to as 'RT states'. To understand the biological properties of these RT states, we inspected their relationship with chromatin states, gene expression, functional annotation and 3D chromosomal organization. We found that the newly defined RT states possess interesting genome-wide functional properties that add complementary information to the existing annotation of the human genome. AVAILABILITY AND IMPLEMENTATION: R scripts for inferring HMM models and Perl scripts for further analysis are available at https://github.com/PouletAxel/script_HMM_Replication_timing. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
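
How a Hidden Markov Model assigns a hidden state to each genomic bin can be sketched with Viterbi decoding. This toy example is not the authors' R code: it assumes just two hypothetical states ("early", "late") and a single discretized observation per bin, whereas the study fits 15 states to 42 RT profiles, and every probability below is invented for illustration.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Most likely hidden-state path for a discrete-emission HMM (log space)."""
    score = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []  # one dict of back-pointers per transition
    for o in obs[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            # Best predecessor of state s at this position.
            best = max(states, key=lambda p: prev[p] + log_trans[p][s])
            score[s] = prev[best] + log_trans[best][s] + log_emit[s][o]
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

def logs(d):
    """Convert a dict of probabilities to log probabilities."""
    return {k: math.log(v) for k, v in d.items()}

# Hypothetical two-state model; "E"/"L" are discretized early/late RT signals.
states = ["early", "late"]
log_start = logs({"early": 0.5, "late": 0.5})
log_trans = {"early": logs({"early": 0.9, "late": 0.1}),
             "late": logs({"early": 0.1, "late": 0.9})}
log_emit = {"early": logs({"E": 0.8, "L": 0.2}),
            "late": logs({"E": 0.2, "L": 0.8})}

path = viterbi(list("EEELLL"), states, log_start, log_trans, log_emit)
```

The high self-transition probability is what makes the decoded segments spatially coherent: under these toy parameters an isolated discordant bin is absorbed into its neighbors rather than producing a one-bin state switch.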


Subject(s)
DNA Replication Timing; Genome, Human; Cell Differentiation; Chromatin; Genomics; Humans
11.
Nature ; 510(7504): 278-82, 2014 Jun 12.
Article in English | MEDLINE | ID: mdl-24759320

ABSTRACT

Men who develop metastatic castration-resistant prostate cancer (CRPC) invariably succumb to the disease. Progression to CRPC after androgen ablation therapy is predominantly driven by deregulated androgen receptor (AR) signalling. Despite the success of recently approved therapies targeting AR signalling, such as abiraterone and second-generation anti-androgens including MDV3100 (also known as enzalutamide), durable responses are limited, presumably owing to acquired resistance. Recently, JQ1 and I-BET762, two selective small-molecule inhibitors that target the amino-terminal bromodomains of BRD4, have been shown to exhibit anti-proliferative effects in a range of malignancies. Here we show that AR-signalling-competent human CRPC cell lines are preferentially sensitive to bromodomain and extraterminal (BET) inhibition. BRD4 physically interacts with the N-terminal domain of AR and can be disrupted by JQ1 (refs 11, 13). Like the direct AR antagonist MDV3100, JQ1 disrupted AR recruitment to target gene loci. By contrast with MDV3100, JQ1 functions downstream of AR, and more potently abrogated BRD4 localization to AR target loci and AR-mediated gene transcription, including induction of the TMPRSS2-ERG gene fusion and its oncogenic activity. In vivo, BET bromodomain inhibition was more efficacious than direct AR antagonism in CRPC xenograft mouse models. Taken together, these studies provide a novel epigenetic approach for the concerted blockade of oncogenic drivers in advanced prostate cancer.


Subject(s)
Azepines/pharmacology; Nuclear Proteins/chemistry; Prostatic Neoplasms, Castration-Resistant/drug therapy; Transcription Factors/chemistry; Triazoles/pharmacology; Androgen Antagonists/pharmacology; Androgens/metabolism; Animals; Azepines/therapeutic use; Cell Cycle Proteins; Cell Line, Tumor; Disease Models, Animal; Epigenesis, Genetic; Humans; Male; Mice; Oncogene Proteins, Fusion/genetics; Oncogene Proteins, Fusion/metabolism; Prostatic Neoplasms, Castration-Resistant/genetics; Protein Structure, Tertiary/drug effects; Receptors, Androgen/chemistry; Receptors, Androgen/metabolism; Signal Transduction/drug effects; Triazoles/therapeutic use
12.
Mol Cell ; 48(3): 471-84, 2012 Nov 09.
Article in English | MEDLINE | ID: mdl-23041285

ABSTRACT

The mechanisms responsible for the establishment of physical domains in metazoan chromosomes are poorly understood. Here we find that physical domains in Drosophila chromosomes are demarcated at regions of active transcription and high gene density that are enriched for transcription factors and specific combinations of insulator proteins. Physical domains contain different types of chromatin defined by the presence of specific proteins and epigenetic marks, with active chromatin preferentially located at the borders and silenced chromatin in the interior. Domain boundaries participate in long-range interactions that may contribute to the clustering of regions of active or silenced chromatin in the nucleus. Analysis of transgenes suggests that chromatin is more accessible and permissive to transcription at the borders than inside domains, independent of the presence of active or silencing histone modifications. These results suggest that the higher-order physical organization of chromatin may impose an additional level of regulation over classical epigenetic marks.


Subject(s)
Drosophila melanogaster/genetics; Genome, Insect/genetics; Insulator Elements/genetics; Transcription, Genetic; Animals; Cell Line; Chromatin/genetics; Chromosome Mapping; Chromosomes, Insect/genetics; Drosophila Proteins/genetics; Drosophila Proteins/metabolism; Drosophila melanogaster/cytology; Gene Dosage; Gene Expression Regulation; Histones/metabolism; Transcription Initiation Site
13.
Mol Cell ; 44(5): 770-84, 2011 Dec 09.
Article in English | MEDLINE | ID: mdl-22152480

ABSTRACT

Both H4K16 acetylation and H3K4 trimethylation are required for gene activation. However, it is still largely unclear how these modifications are orchestrated by transcriptional factors. Here, we analyzed the mechanism of the transcriptional activation by FOXP3, an X-linked suppressor of autoimmune diseases and cancers. FOXP3 binds near transcriptional start sites of its target genes. By recruiting MOF and displacing histone H3K4 demethylase PLU-1, FOXP3 increases both H4K16 acetylation and H3K4 trimethylation at the FOXP3-associated chromatins of multiple FOXP3-activated genes. RNAi-mediated silencing of MOF reduced both gene activation and tumor suppression by FOXP3, while both somatic mutations in clinical cancer samples and targeted mutation of FOXP3 in mouse prostate epithelial cells disrupted nuclear localization of MOF. Our data demonstrate a pull-push model in which a single transcription factor orchestrates two epigenetic alterations necessary for gene activation and provide a mechanism for somatic inactivation of the FOXP3 protein function in cancer cells.


Subject(s)
Forkhead Transcription Factors/metabolism; Histone Acetyltransferases/metabolism; Histones/metabolism; Jumonji Domain-Containing Histone Demethylases/metabolism; Nuclear Proteins/metabolism; Repressor Proteins/metabolism; Acetylation; Breast Neoplasms/genetics; Breast Neoplasms/metabolism; Breast Neoplasms/pathology; Cell Line, Tumor; Cell Nucleus/metabolism; Female; Forkhead Transcription Factors/genetics; Gene Expression Regulation; HEK293 Cells; Humans; Methylation; Mutation
14.
Nucleic Acids Res ; 45(W1): W445-W452, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28402462

ABSTRACT

The development and application of high-throughput genomics technologies has resulted in massive quantities of diverse omics data that continue to accumulate rapidly. These rich datasets offer unprecedented and exciting opportunities to address long-standing questions in biomedical research. However, our ability to explore and query the content of diverse omics data is very limited. Existing dataset search tools rely almost exclusively on the metadata. A text-based query for gene name(s) does not work well on datasets wherein the vast majority of their content is numeric. To overcome this barrier, we have developed Omicseq, a novel web-based platform that facilitates the easy interrogation of omics datasets holistically to improve 'findability' of relevant data. The core component of Omicseq is trackRank, a novel algorithm for ranking omics datasets that fully uses the numerical content of the dataset to determine relevance to the query entity. The Omicseq system is supported by a scalable and elastic NoSQL database that hosts a large collection of processed omics datasets. In the front end, a simple, web-based interface allows users to enter queries and instantly receive search results as a list of ranked datasets deemed to be the most relevant. Omicseq is freely available at http://www.omicseq.org.


Subject(s)
Breast Neoplasms/genetics; Genomics/methods; Prostatic Neoplasms/genetics; Search Engine; User-Computer Interface; Algorithms; Biomarkers, Tumor/genetics; Biomarkers, Tumor/metabolism; Breast Neoplasms/metabolism; Breast Neoplasms/pathology; Databases, Genetic; Databases, Protein; Datasets as Topic; Female; Gene Expression; Humans; Internet; Kallikreins/genetics; Kallikreins/metabolism; Male; Metadata/statistics & numerical data; PTEN Phosphohydrolase/genetics; PTEN Phosphohydrolase/metabolism; Prostate-Specific Antigen/genetics; Prostate-Specific Antigen/metabolism; Prostatic Neoplasms/metabolism; Prostatic Neoplasms/pathology; Receptor, ErbB-2/genetics; Receptor, ErbB-2/metabolism
15.
PLoS Genet ; 12(5): e1006042, 2016 05.
Article in English | MEDLINE | ID: mdl-27152617

ABSTRACT

Selective neuronal vulnerability is characteristic of most degenerative disorders of the CNS, yet mechanisms underlying this phenomenon remain poorly characterized. Many forms of cerebellar degeneration exhibit an anterior-to-posterior gradient of Purkinje cell loss including Niemann-Pick type C1 (NPC) disease, a lysosomal storage disorder characterized by progressive neurological deficits that often begin in childhood. Here, we sought to identify candidate genes underlying vulnerability of Purkinje cells in anterior cerebellar lobules using data freely available in the Allen Brain Atlas. This approach led to the identification of 16 candidate neuroprotective or susceptibility genes. We demonstrate that one candidate gene, heat shock protein beta-1 (HSPB1), promoted neuronal survival in cellular models of NPC disease through a mechanism that involved inhibition of apoptosis. Additionally, we show that over-expression of wild type HSPB1 or a phosphomimetic mutant in NPC mice slowed the progression of motor impairment and diminished cerebellar Purkinje cell loss. We confirmed the modulatory effect of Hspb1 on Purkinje cell degeneration in vivo, as knockdown by Hspb1 shRNA significantly enhanced neuron loss. These results suggest that strategies to promote HSPB1 activity may slow the rate of cerebellar degeneration in NPC disease and highlight the use of bioinformatics tools to uncover pathways leading to neuronal protection in neurodegenerative disorders.


Subject(s)
HSP27 Heat-Shock Proteins/genetics; Nerve Degeneration/genetics; Niemann-Pick Disease, Type C/genetics; Purkinje Cells/metabolism; Animals; Apoptosis/genetics; Cell Survival/genetics; Cerebellum/metabolism; Cerebellum/pathology; Disease Models, Animal; HSP27 Heat-Shock Proteins/biosynthesis; Humans; Mice; Nerve Degeneration/pathology; Nerve Degeneration/therapy; Neurons/metabolism; Neurons/pathology; Niemann-Pick Disease, Type C/pathology; Niemann-Pick Disease, Type C/therapy; Purkinje Cells/pathology; RNA, Small Interfering/genetics; RNA, Small Interfering/therapeutic use
16.
Bioinformatics ; 32(8): 1214-6, 2016 04 15.
Article in English | MEDLINE | ID: mdl-26685307

ABSTRACT

Genome-wide association studies (GWASs) have successfully identified many sequence variants that are significantly associated with common diseases and traits. Tens of thousands of such trait-associated SNPs have already been cataloged, which we believe form a great resource for genomic research. Recent studies have demonstrated that the collection of trait-associated SNPs can be exploited to indicate whether a given genomic interval or intervals are likely to be functionally connected with certain phenotypes or diseases. Despite this importance, there is currently no ready-to-use computational tool able to connect genomic intervals to phenotypes. Here, we present traseR, an easy-to-use R Bioconductor package that performs enrichment analyses of trait-associated SNPs in arbitrary genomic intervals with flexible options, including testing method, type of background and inclusion of SNPs in LD. AVAILABILITY AND IMPLEMENTATION: The traseR R package, preloaded with an up-to-date collection of trait-associated SNPs, is freely available in Bioconductor. CONTACT: zhaohui.qin@emory.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
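
One way to test whether trait-associated SNPs are enriched in a set of intervals is a one-sided binomial test against the fraction of the genome the intervals cover. The sketch below is an assumption-laden illustration, not the traseR implementation: it treats the whole genome as background, assumes non-overlapping half-open intervals on a single coordinate axis, ignores LD, and uses invented positions and hypothetical function names.

```python
from math import comb

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def snp_enrichment(snp_positions, intervals, genome_length):
    """Count SNPs inside the intervals and test for over-representation.

    intervals: non-overlapping half-open (start, end) pairs.
    Under the null, each SNP lands in the intervals with probability equal
    to the covered fraction of the genome.
    """
    covered = sum(end - start for start, end in intervals)
    hits = sum(any(start <= pos < end for start, end in intervals)
               for pos in snp_positions)
    p_null = covered / genome_length
    return hits, binom_sf(hits, len(snp_positions), p_null)

# Hypothetical toy data: a 1000-bp "genome" with one 100-bp query interval.
snps = [110, 150, 190, 500, 900]
hits, p_value = snp_enrichment(snps, [(100, 200)], 1000)
```

Here three of five SNPs fall in an interval covering 10% of the toy genome, so the tail probability is small; a different background or testing method would change the null probability and the test, but not the overall structure.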


Subject(s)
Genome-Wide Association Study; Phenotype; Software; Genomics; Humans; Polymorphism, Single Nucleotide
17.
Bioinformatics ; 32(5): 682-9, 2016 03 01.
Article in English | MEDLINE | ID: mdl-26519502

ABSTRACT

MOTIVATION: Modern high-throughput biotechnologies such as microarrays are capable of producing a massive amount of information for each sample. However, in a typical high-throughput experiment, only a limited number of samples are assayed, giving rise to the classical 'large p, small n' problem. On the other hand, rapid propagation of these high-throughput technologies has resulted in a substantial collection of data, often generated on the same platform and using the same protocol. It is highly desirable to utilize the existing data when performing analysis and inference on a new dataset. RESULTS: Utilizing existing data can be carried out in a straightforward fashion under the Bayesian framework, in which the repository of historical data can be exploited to build informative priors that are then used in new data analysis. In this work, using microarray data, we investigate the feasibility and effectiveness of deriving informative priors from historical data and using them in the problem of detecting differentially expressed genes. Through simulation and real data analysis, we show that the proposed strategy significantly outperforms existing methods including the popular and state-of-the-art Bayesian hierarchical model-based approaches. Our work illustrates the feasibility and benefits of exploiting the increasingly available genomics big data in statistical inference and presents a promising practical strategy for dealing with the 'large p, small n' problem. AVAILABILITY AND IMPLEMENTATION: Our method is implemented in the R package IPBT, which is freely available from https://github.com/benliemory/IPBT. CONTACT: yuzhu@purdue.edu; zhaohui.qin@emory.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Bayes Theorem , Algorithms , Databases, Factual , Genomics
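
The idea in the abstract above, of building an informative prior from historical data and using it when testing a gene in a new, small experiment, can be sketched as follows. This is not the IPBT implementation. It is a hypothetical illustration that assumes the historical measurements for a gene supply a prior variance, weighted by its degrees of freedom, which is then blended with the pooled variance of the new data before a t-like statistic is computed. The function name and argument layout are assumptions.

```python
import math
from statistics import mean, variance

def informative_prior_t(group_a, group_b, historical):
    """t-like statistic for one gene using a variance prior from historical data.

    group_a, group_b: expression values in the new experiment (>= 2 each).
    historical: expression values for the same gene from past experiments (>= 2).
    Assumed scheme: posterior variance is a degrees-of-freedom weighted average
    of the historical variance and the pooled variance of the new data.
    """
    d0 = len(historical) - 1                      # prior degrees of freedom
    prior_var = variance(historical)              # gene-specific prior variance
    na, nb = len(group_a), len(group_b)
    d = na + nb - 2                               # degrees of freedom of the new data
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / d
    post_var = (d0 * prior_var + d * pooled) / (d0 + d)
    se = math.sqrt(post_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / se, post_var
```

With only two samples per group, the pooled variance alone is a noisy estimate. In this sketch, the more historical samples there are, the more the posterior variance is pulled toward the historical value.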
18.
Nucleic Acids Res ; 43(5): 2757-66, 2015 Mar 11.
Article in English | MEDLINE | ID: mdl-25722376

ABSTRACT

Detecting in vivo transcription factor (TF) binding is important for understanding gene regulatory circuitries. ChIP-seq is a powerful technique to empirically define TF binding in vivo. However, the multitude of distinct TFs makes genome-wide profiling for them all labor-intensive and costly. Algorithms for in silico prediction of TF binding have been developed, based mostly on histone modification or DNase I hypersensitivity data in conjunction with DNA motif and other genomic features. However, technical limitations of these methods prevent them from being applied broadly, especially in clinical settings. We conducted a comprehensive survey involving multiple cell lines, TFs, and methylation types and found that there are intimate relationships between TF binding and methylation level changes around the binding sites. Exploiting the connection between DNA methylation and TF binding, we proposed a novel supervised learning approach to predict TF-DNA interaction using data from base-resolution whole-genome methylation sequencing experiments. We devised beta-binomial models to characterize methylation data around TF binding sites and the background. Along with other static genomic features, we adopted a random forest framework to predict TF-DNA interaction. After conducting comprehensive tests, we saw that the proposed method accurately predicts TF binding and performs favorably versus competing methods.


Subject(s)
Algorithms , Computational Biology/methods , DNA Methylation , Transcription Factors/genetics , Transcription Factors/metabolism , Animals , Cell Line , Computer Simulation , DNA/genetics , DNA/metabolism , Humans , Male , Mice , Models, Genetic , Oligonucleotide Array Sequence Analysis , Protein Binding , Reproducibility of Results , Transcriptome
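
The abstract above mentions beta-binomial models for methylation data around TF binding sites and in the background. A minimal sketch of how such models could yield a per-region score is given below. The shape parameters, the function names and the use of a log-likelihood ratio as a feature are assumptions made for illustration. The authors' fitted models and their random forest step are not reproduced here.

```python
import math

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def betabinom_logpmf(k, n, a, b):
    """Log probability of k methylated reads out of n under BetaBinomial(n, a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)

def methylation_llr(sites, bound_params, background_params):
    """Log-likelihood ratio (bound vs. background) summed over CpG sites.

    sites: list of (methylated_reads, total_reads) for CpGs in one region.
    bound_params, background_params: hypothetical (a, b) shape parameters.
    """
    return sum(betabinom_logpmf(k, n, *bound_params)
               - betabinom_logpmf(k, n, *background_params)
               for k, n in sites)
```

A positive score means the observed counts are better explained by the model for bound sites than by the background model. In a pipeline like the one the abstract describes, such a score could be one input feature among others.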
19.
Bioinformatics ; 31(12): 1889-96, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-25682068

ABSTRACT

MOTIVATION: ChIP-seq is a powerful technology for measuring the strength of protein binding or histone modification at the whole-genome scale. Although a number of methods are available for the analysis of a single ChIP-seq dataset (e.g. 'peak detection'), rigorous statistical methods for the quantitative comparison of multiple ChIP-seq datasets that account for data from control experiments, signal-to-noise ratios, biological variation and multiple-factor experimental designs are underdeveloped. RESULTS: In this work, we develop a statistical method to perform quantitative comparison of multiple ChIP-seq datasets and detect genomic regions showing differential protein binding or histone modification. We first detect peaks in all datasets and then take their union to form a single set of candidate regions. The read counts from the IP experiment at the candidate regions are assumed to follow a Poisson distribution. The underlying Poisson rates are modeled as an experiment-specific function of artifacts and biological signals. We then obtain the estimated biological signals and compare them through a hypothesis testing procedure in a linear model framework. Simulations and real data analyses demonstrate that the proposed method provides more accurate and robust results than existing ones. AVAILABILITY AND IMPLEMENTATION: An R software package, ChIPComp, is freely available at http://web1.sph.emory.edu/users/hwu30/software/ChIPComp.html.


Subject(s)
Chromatin Immunoprecipitation/methods , Computational Biology/methods , Genome, Human , Histones/metabolism , Models, Statistical , Software , Transcription Factors/metabolism , Algorithms , Artifacts , Computer Simulation , Datasets as Topic , High-Throughput Nucleotide Sequencing/methods , Humans , Poisson Distribution , Protein Binding , Sequence Analysis, DNA/methods , Transcription Factors/genetics
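
The steps in the abstract above (Poisson counts whose rate combines artifacts and biological signal, followed by a comparison of estimated signals) can be illustrated with a deliberately simplified sketch. This is not the ChIPComp model. It assumes the expected IP count in a region is the biological signal plus a scaled control count, estimates the signal by subtraction on the log scale, and compares two conditions with an equal-variance two-group t statistic, which is the simplest case of a linear model. The function names, the `scale` factor and the `floor` value are hypothetical.

```python
import math
from statistics import mean, variance

def log_signal(ip_count, control_count, scale=1.0, floor=1.0):
    """Background-subtracted signal on the log scale, floored to stay positive."""
    return math.log(max(ip_count - scale * control_count, floor))

def differential_binding_t(cond1, cond2, scale=1.0):
    """Two-group t statistic on log signals for one candidate region.

    cond1, cond2: lists of (ip_count, control_count) replicate pairs.
    Needs at least two replicates per condition and non-zero pooled variance.
    """
    s1 = [log_signal(ip, ctl, scale) for ip, ctl in cond1]
    s2 = [log_signal(ip, ctl, scale) for ip, ctl in cond2]
    n1, n2 = len(s1), len(s2)
    pooled = ((n1 - 1) * variance(s1) + (n2 - 1) * variance(s2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(s1) - mean(s2)) / se
```

Each region in the union of peaks would be scored this way. A large absolute statistic flags a region as a candidate for differential binding, subject to multiple-testing correction, which this sketch omits.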
20.
PLoS Comput Biol ; 11(8): e1004448, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26267278

ABSTRACT

With the rapid decline in sequencing cost, researchers today rush to embrace whole genome sequencing (WGS) or whole exome sequencing (WES) as the next powerful tool for relating genetic variants to human diseases and phenotypes. A fundamental step in analyzing WGS and WES data is mapping short sequencing reads back to the reference genome. This is an important issue because incorrectly mapped reads affect downstream variant discovery, genotype calling and association analysis. Although many read mapping algorithms have been developed, the majority of them use the universal reference genome and do not take sequence variants into consideration. Given that genetic variants are ubiquitous, it is highly desirable to factor them into the read mapping procedure. In this work, we developed a novel strategy that utilizes genotypes obtained a priori to customize the universal haploid reference genome into a personalized diploid reference genome. The new strategy is implemented in a program named RefEditor. When applying RefEditor to real data, we achieved encouraging improvements in read mapping, variant discovery and genotype calling. Compared to standard approaches, RefEditor can significantly increase genotype calling consistency (from 43% to 61% at 4X coverage; from 82% to 92% at 20X coverage) and reduce Mendelian inconsistency across various sequencing depths. Because many WGS and WES studies are conducted on cohorts that have been genotyped using array-based genotyping platforms previously or concurrently, we believe the proposed strategy will be of high value in practice. It can also be applied to the scenario where multiple NGS experiments are conducted on the same cohort. The RefEditor sources are available at https://github.com/superyuan/refeditor.


Subject(s)
Chromosome Mapping/methods , Diploidy , Genomics/methods , Genotyping Techniques/methods , High-Throughput Nucleotide Sequencing/methods , Software , Databases, Genetic , Genome , Humans , Sequence Analysis, DNA
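
The core idea in the abstract above, turning a haploid reference into a personalized diploid reference using genotypes known in advance, can be sketched in a few lines. This is not RefEditor's code or interface. It is a hypothetical example that handles only single-nucleotide substitutions, uses 0-based positions, and assigns the two alleles of each genotype to the two haplotypes in the order given, which stands in for phasing that real data would need.

```python
def personalize_reference(reference, genotypes):
    """Build two haplotype sequences from a haploid reference and SNP genotypes.

    reference: haploid reference sequence as a string.
    genotypes: dict mapping a 0-based position to (allele_1, allele_2).
    Alleles are assigned to haplotypes in the order given (assumed phasing).
    """
    hap1, hap2 = list(reference), list(reference)
    for pos, (allele_1, allele_2) in genotypes.items():
        if not 0 <= pos < len(reference):
            raise ValueError(f"position {pos} is outside the reference")
        hap1[pos] = allele_1
        hap2[pos] = allele_2
    return "".join(hap1), "".join(hap2)
```

Reads would then be mapped against both haplotypes rather than the universal reference, so a read carrying a known non-reference allele is not penalized as a mismatch.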