Results 1 - 20 of 23
1.
Nat Commun ; 10(1): 4613, 2019 10 10.
Article in English | MEDLINE | ID: mdl-31601804

ABSTRACT

Characterizing and interpreting heterogeneous mixtures at the cellular level is a critical problem in genomics. Single-cell assays offer an opportunity to resolve cellular level heterogeneity, e.g., scRNA-seq enables single-cell expression profiling, and scATAC-seq identifies active regulatory elements. Furthermore, while scHi-C can measure the chromatin contacts (i.e., loops) between active regulatory elements and their target genes in single cells, bulk HiChIP can measure such contacts at a higher resolution. In this work, we introduce DC3 (De-Convolution and Coupled-Clustering) as a method for the joint analysis of various bulk and single-cell data such as HiChIP, RNA-seq and ATAC-seq from the same heterogeneous cell population. DC3 can simultaneously identify distinct subpopulations, assign single cells to the subpopulations (i.e., clustering) and de-convolve the bulk data into subpopulation-specific data. The subpopulation-specific profiles of gene expression, chromatin accessibility and enhancer-promoter contact obtained by DC3 provide a comprehensive characterization of the gene regulatory system in each subpopulation.


Subject(s)
Algorithms , Cluster Analysis , Gene Expression Profiling/statistics & numerical data , Genomics/statistics & numerical data , Single-Cell Analysis/statistics & numerical data , Animals , Cell Line , Chromatin , Chromatin Immunoprecipitation/statistics & numerical data , Computer Simulation , Gene Expression Profiling/methods , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Mice , Promoter Regions, Genetic , Single-Cell Analysis/methods
2.
Pac Symp Biocomput ; 24: 184-195, 2019.
Article in English | MEDLINE | ID: mdl-30864321

ABSTRACT

Genetic variations of the human genome are linked to many disease phenotypes. While whole-genome sequencing and genome-wide association studies (GWAS) have uncovered a number of genotype-phenotype associations, their functional interpretation remains challenging given most single nucleotide polymorphisms (SNPs) fall into the non-coding region of the genome. Advances in chromatin immunoprecipitation sequencing (ChIP-seq) have made large-scale repositories of epigenetic data available, allowing investigation of coordinated mechanisms of epigenetic markers and transcriptional regulation and their influence on biological function. To address this, we propose SNPs2ChIP, a method to infer biological functions of non-coding variants through unsupervised statistical learning methods applied to publicly-available epigenetic datasets. We systematically characterized latent factors by applying singular value decomposition to ChIP-seq tracks of lymphoblastoid cell lines, and annotated the biological function of each latent factor using the genomic region enrichment analysis tool. Using these annotated latent factors as reference, we developed SNPs2ChIP, a pipeline that takes genomic region(s) as an input, identifies the relevant latent factors with quantitative scores, and returns them along with their inferred functions. As a case study, we focused on systemic lupus erythematosus and demonstrated our method's ability to infer relevant biological function. We systematically applied SNPs2ChIP on publicly available datasets, including known GWAS associations from the GWAS catalogue and ChIP-seq peaks from a previously published study. Our approach to leverage latent patterns across genome-wide epigenetic datasets to infer the biological function will advance understanding of the genetics of human diseases by accelerating the interpretation of non-coding genomes.


Subject(s)
Chromatin Immunoprecipitation/statistics & numerical data , Polymorphism, Single Nucleotide , Algorithms , Cell Line , Computational Biology/methods , Databases, Nucleic Acid/statistics & numerical data , Epigenesis, Genetic , Genetic Association Studies , Genome, Human , Genome-Wide Association Study/statistics & numerical data , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Lupus Erythematosus, Systemic/genetics , Lymphocytes/metabolism , Receptors, Calcitriol/genetics
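
The abstract above describes extracting latent factors from ChIP-seq tracks by singular value decomposition. As a rough illustration only, the sketch below recovers the leading latent factor of a small regions-by-tracks matrix with power iteration, a simpler stand-in for a full SVD. It is not the SNPs2ChIP code; the function name `top_right_singular_vector` and the toy signal matrix are assumptions made for this example.

```python
import math


def top_right_singular_vector(matrix, iterations=200):
    """Approximate the largest singular value and its right singular vector.

    matrix is a list of rows (genomic regions) by columns (ChIP-seq tracks).
    Power iteration is applied to M^T M, so the returned unit vector holds
    one weight per track.
    """
    n_cols = len(matrix[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iterations):
        # u = M v (one value per region), then w = M^T u (one value per track)
        u = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        w = [sum(matrix[r][c] * u[r] for r in range(len(matrix)))
             for c in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("matrix has no signal along the start vector")
        v = [x / norm for x in w]
    u = [sum(m * x for m, x in zip(row, v)) for row in matrix]
    sigma = math.sqrt(sum(x * x for x in u))
    return sigma, v


# Hypothetical signal matrix: 4 regions x 3 tracks.
signal = [
    [5.0, 4.0, 0.0],
    [4.0, 5.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
sigma, weights = top_right_singular_vector(signal)
print(round(sigma, 3), [round(w, 3) for w in weights])
```

Tracks with large absolute weights dominate the factor; a region can then be scored against it by taking the dot product of its row with `weights`.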
3.
BMC Genomics ; 20(1): 6, 2019 Jan 05.
Article in English | MEDLINE | ID: mdl-30611200

ABSTRACT

BACKGROUND: Sequencing data has become a standard measure of diverse cellular activities. For example, gene expression is accurately measured by RNA sequencing (RNA-Seq) libraries, protein-DNA interactions are captured by chromatin immunoprecipitation sequencing (ChIP-Seq), protein-RNA interactions by crosslinking immunoprecipitation sequencing (CLIP-Seq) or RNA immunoprecipitation sequencing (RIP-Seq), DNA accessibility by assay for transposase-accessible chromatin (ATAC-Seq), DNase or MNase sequencing libraries. The processing of these sequencing techniques involves library-specific approaches. However, in all cases, once the sequencing libraries are processed, the result is a count table specifying the estimated number of reads originating from each genomic locus. Differential analysis to determine which loci have different cellular activity under different conditions starts with the count table and iterates through a cycle of data assessment, preparation and analysis. Such complex analysis often relies on multiple programs and is therefore a challenge for those without programming skills.
RESULTS: We developed DEBrowser as an R bioconductor project to interactively visualize every step of the differential analysis, without programming. The application provides a rich and interactive web based graphical user interface built on R's shiny infrastructure. DEBrowser allows users to visualize data with various types of graphs that can be explored further by selecting and re-plotting any desired subset of data. Using the visualization approaches provided, users can determine and correct technical variations such as batch effects and sequencing depth that affect differential analysis. We show DEBrowser's ease of use by reproducing the analysis of two previously published data sets.
CONCLUSIONS: DEBrowser is a flexible, intuitive, web-based analysis platform that enables an iterative and interactive analysis of count data without any requirement of programming knowledge.


Subject(s)
Chromatin Immunoprecipitation/statistics & numerical data , Genome, Human/genetics , Sequence Analysis, RNA/statistics & numerical data , Software , Chromatin/genetics , DNA/genetics , DNA-Binding Proteins/genetics , Data Interpretation, Statistical , Genomics/statistics & numerical data , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Sequence Analysis, DNA
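
The DEBrowser abstract above refers to a count table and to correcting for sequencing depth. A minimal sketch of one common depth correction, counts per million (CPM), might look as follows. DEBrowser itself is an R/Shiny application and this is not its code; the table layout, the sample names and the function name `counts_per_million` are assumptions made for illustration.

```python
def counts_per_million(count_table):
    """Scale each sample's counts so that every library sums to one million.

    count_table maps a sample name to a dict of {locus: raw read count}.
    """
    normalized = {}
    for sample, counts in count_table.items():
        library_size = sum(counts.values())
        if library_size == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        normalized[sample] = {
            locus: count * 1_000_000 / library_size
            for locus, count in counts.items()
        }
    return normalized


# Hypothetical two-sample count table; the second library is twice as deep.
table = {
    "control": {"geneA": 10, "geneB": 30, "geneC": 60},
    "treated": {"geneA": 20, "geneB": 60, "geneC": 120},
}
cpm = counts_per_million(table)
print(cpm["control"]["geneA"], cpm["treated"]["geneA"])
```

After scaling, the two samples report the same value for every locus, which shows that the apparent twofold difference in raw counts came from depth alone.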
4.
PLoS Comput Biol ; 14(4): e1006090, 2018 04.
Article in English | MEDLINE | ID: mdl-29684008

ABSTRACT

Genome-wide in vivo protein-DNA interactions are routinely mapped using high-throughput chromatin immunoprecipitation (ChIP). ChIP-reported regions are typically investigated for enriched sequence-motifs, which are likely to model the DNA-binding specificity of the profiled protein and/or of co-occurring proteins. However, simple enrichment analyses can miss insights into the binding-activity of the protein. Note that ChIP reports regions making direct contact with the protein as well as those binding through intermediaries. For example, consider a ChIP experiment targeting protein X, which binds DNA at its cognate sites, but simultaneously interacts with four other proteins. Each of these proteins also binds to its own specific cognate sites along distant parts of the genome, a scenario consistent with the current view of transcriptional hubs and chromatin loops. Since ChIP will pull down all X-associated regions, the final reported data will be a union of five distinct sets of regions, each containing binding sites of one of the five proteins, respectively. Characterizing all five different motifs and the corresponding sets is important to interpret the ChIP experiment and ultimately, the role of X in regulation. We present DIVERSITY, which attempts exactly this: it partitions the data so that each partition can be characterized with its own de novo motif. DIVERSITY uses a Bayesian approach to identify the optimal number of motifs and the associated partitions, which together explain the entire dataset. This is in contrast to standard motif finders, which report motifs individually enriched in the data, but do not necessarily explain all reported regions. We show that the different motifs and associated regions identified by DIVERSITY give insights into the various complexes that may be forming along the chromatin, something that has so far not been attempted from ChIP data.
Webserver at http://diversity.ncl.res.in/; standalone (Mac OS X/Linux) from https://github.com/NarlikarLab/DIVERSITY/releases/tag/v1.0.0.


Subject(s)
Chromatin Immunoprecipitation/statistics & numerical data , Software , Algorithms , Animals , Bayes Theorem , Binding Sites , Chromatin/genetics , Chromatin/metabolism , Computational Biology , DNA/genetics , DNA/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Evolution, Molecular , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Neurons/metabolism , Nucleotide Motifs , Protein Binding , Sequence Analysis, DNA/statistics & numerical data , Transcription Factors/genetics , Transcription Factors/metabolism
5.
Brief Bioinform ; 19(5): 1069-1081, 2018 09 28.
Article in English | MEDLINE | ID: mdl-28334268

ABSTRACT

Transcription factors are proteins that bind to specific DNA sequences and play important roles in controlling the expression levels of their target genes. Hence, prediction of transcription factor binding sites (TFBSs) provides a solid foundation for inferring gene regulatory mechanisms and building regulatory networks for a genome. Chromatin immunoprecipitation sequencing (ChIP-seq) technology can generate large-scale experimental data for such protein-DNA interactions, providing an unprecedented opportunity to identify TFBSs (a.k.a. cis-regulatory motifs). The bottleneck, however, is the lack of robust mathematical models, as well as efficient computational methods for TFBS prediction to make effective use of massive ChIP-seq data sets in the public domain. The purpose of this study is to review existing motif-finding methods for ChIP-seq data from an algorithmic perspective and provide new computational insight into this field. The state-of-the-art methods were shown through summarizing eight representative motif-finding algorithms along with corresponding challenges, and introducing some important relative functions according to specific biological demands, including discriminative motif finding and cofactor motifs analysis. Finally, potential directions and plans for ChIP-seq-based motif-finding tools were showcased in support of future algorithm development.


Subject(s)
Algorithms , Gene Regulatory Networks , Software , Base Sequence , Binding Sites/genetics , Chromatin Immunoprecipitation/statistics & numerical data , Computational Biology/methods , DNA/genetics , DNA/metabolism , Humans , Sequence Analysis, DNA/statistics & numerical data , Transcription Factors/metabolism
6.
Nucleic Acids Res ; 43(6): e40, 2015 Mar 31.
Article in English | MEDLINE | ID: mdl-25564527

ABSTRACT

RNA-seq is a sensitive and accurate technique to compare steady-state levels of RNA between different cellular states. However, as it does not provide an account of transcriptional activity per se, other technologies are needed to more precisely determine acute transcriptional responses. Here, we have developed an easy, sensitive and accurate novel computational method, iRNA-seq, for genome-wide assessment of transcriptional activity based on analysis of intron coverage from total RNA-seq data. Comparison of the results derived from iRNA-seq analyses with parallel results derived using current methods for genome-wide determination of transcriptional activity, i.e. global run-on (GRO)-seq and RNA polymerase II (RNAPII) ChIP-seq, demonstrates that iRNA-seq provides similar results in terms of number of regulated genes and their fold change. However, unlike the current methods that are all very labor-intensive and demanding in terms of sample material and technologies, iRNA-seq is cheap and easy and requires very little sample material. In conclusion, iRNA-seq offers an attractive novel alternative to current methods for determination of changes in transcriptional activity at a genome-wide level.


Subject(s)
Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Cell Line , Chromatin Immunoprecipitation/methods , Chromatin Immunoprecipitation/statistics & numerical data , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation , Genome, Human , Humans , Introns , Sequence Analysis, RNA/statistics & numerical data
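
The iRNA-seq abstract above compares transcriptional activity between states using intron coverage and reports fold changes. The sketch below shows one simple way such a fold change could be computed for a single gene: a depth-normalised log2 ratio with a pseudocount. It does not reproduce the iRNA-seq pipeline; the function name, the pseudocount and the read numbers are assumptions chosen for illustration.

```python
import math


def intron_log2_fold_change(control_reads, treated_reads,
                            control_total, treated_total, pseudocount=1.0):
    """Depth-normalised log2 ratio of intronic read counts for one gene.

    control_total and treated_total are the library sizes used to correct
    for sequencing depth; the pseudocount avoids taking log2 of zero.
    """
    control_rate = (control_reads + pseudocount) / control_total
    treated_rate = (treated_reads + pseudocount) / treated_total
    return math.log2(treated_rate / control_rate)


# Hypothetical gene: intronic reads rise from 99 to 399 at equal depth.
lfc = intron_log2_fold_change(99, 399, 1_000_000, 1_000_000)
print(round(lfc, 3))
```

A positive value indicates more intronic signal, and by this reasoning more ongoing transcription, in the treated state; a real analysis would also need replicates and a statistical test.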
7.
Nucleic Acids Res ; 43(6): e38, 2015 Mar 31.
Article in English | MEDLINE | ID: mdl-25539918

ABSTRACT

Genome-wide chromatin immunoprecipitation (ChIP) studies have brought significant insight into the genomic localization of chromatin-associated proteins and histone modifications. The large amount of data generated by these analyses, however, requires approaches that enable rapid validation and analysis of biological relevance. Furthermore, there are still protein and modification targets that are difficult to detect using standard ChIP methods. To address these issues, we developed an immediate chromatin immunoprecipitation procedure which we call ZipChIP. ZipChIP significantly reduces the time and increases sensitivity, allowing for rapid screening of multiple loci. Here we describe how ZipChIP enables detection of histone modifications (H3K4 mono- and trimethylation) and two yeast histone demethylases, Jhd2 and Rph1, which were previously difficult to detect using standard methods. Furthermore, we demonstrate the versatility of ZipChIP by analyzing the enrichment of the histone deacetylase Sir2 at heterochromatin in yeast and enrichment of the chromatin remodeler, PICKLE, at euchromatin in Arabidopsis thaliana.


Subject(s)
Chromatin Immunoprecipitation/methods , Real-Time Polymerase Chain Reaction/methods , Actins/genetics , Actins/metabolism , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Chromatin/genetics , Chromatin/metabolism , Chromatin Immunoprecipitation/statistics & numerical data , DNA Helicases/genetics , DNA Helicases/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Genes, Fungal , Genes, Plant , Histone Demethylases/genetics , Histone Demethylases/metabolism , Histones/genetics , Histones/metabolism , Jumonji Domain-Containing Histone Demethylases/genetics , Jumonji Domain-Containing Histone Demethylases/metabolism , Open Reading Frames , Promoter Regions, Genetic , Repressor Proteins/genetics , Repressor Proteins/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Silent Information Regulator Proteins, Saccharomyces cerevisiae/genetics , Silent Information Regulator Proteins, Saccharomyces cerevisiae/metabolism , Sirtuin 2/genetics , Sirtuin 2/metabolism
8.
Pac Symp Biocomput ; : 320-31, 2013.
Article in English | MEDLINE | ID: mdl-23424137

ABSTRACT

We have developed a novel approach called ChIPModule to systematically discover transcription factors and their cofactors from ChIP-seq data. Given a ChIP-seq dataset and the binding patterns of a large number of transcription factors, ChIPModule can efficiently identify groups of transcription factors, whose binding sites significantly co-occur in the ChIP-seq peak regions. By testing ChIPModule on simulated data and experimental data, we have shown that ChIPModule identifies known cofactors of transcription factors, and predicts new cofactors that are supported by literature. ChIPModule provides a useful tool for studying gene transcriptional regulation.


Subject(s)
Chromatin Immunoprecipitation/statistics & numerical data , Sequence Analysis/statistics & numerical data , Transcription Factors/genetics , Transcription Factors/metabolism , Binding Sites/genetics , Computational Biology , Databases, Genetic/statistics & numerical data , Humans
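
The ChIPModule abstract above speaks of transcription factor binding sites that "significantly co-occur" in peak regions. The abstract does not say which statistic ChIPModule uses, so the sketch below substitutes a generic one-sided hypergeometric test for the overlap of two motifs across a set of peaks. The function name and the peak counts are hypothetical.

```python
from math import comb


def cooccurrence_p_value(n_peaks, n_with_a, n_with_b, n_with_both):
    """Upper-tail hypergeometric p-value for the overlap of two motif sets.

    Null model: the n_with_b peaks carrying motif B are a random draw from
    all n_peaks peaks, of which n_with_a carry motif A. The p-value is the
    probability of seeing n_with_both or more peaks carrying both motifs.
    """
    max_overlap = min(n_with_a, n_with_b)
    total = comb(n_peaks, n_with_b)
    tail = sum(
        comb(n_with_a, k) * comb(n_peaks - n_with_a, n_with_b - k)
        for k in range(n_with_both, max_overlap + 1)
    )
    return tail / total


# Hypothetical data set: 1000 peaks, motif A in 100, motif B in 80, both in 30.
p = cooccurrence_p_value(1000, 100, 80, 30)
print(f"{p:.3e}")
```

Under independence only about 8 peaks would be expected to carry both motifs, so an overlap of 30 yields a very small p-value. Testing many factor pairs would additionally call for a multiple-testing correction.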
9.
Brief Bioinform ; 14(2): 225-37, 2013 Mar.
Article in English | MEDLINE | ID: mdl-22517426

ABSTRACT

Motif discovery has been one of the most widely studied problems in bioinformatics ever since genomic and protein sequences have been available. In particular, its application to the de novo prediction of putative over-represented transcription factor binding sites in nucleotide sequences has been, and still is, one of the most challenging flavors of the problem. Recently, novel experimental techniques like chromatin immunoprecipitation (ChIP) have been introduced, permitting the genome-wide identification of protein-DNA interactions. ChIP, applied to transcription factors and coupled with genome tiling arrays (ChIP on Chip) or next-generation sequencing technologies (ChIP-Seq) has opened new avenues in research, as well as posed new challenges to bioinformaticians developing algorithms and methods for motif discovery.


Subject(s)
High-Throughput Nucleotide Sequencing/statistics & numerical data , Regulatory Elements, Transcriptional , Transcription Factors/metabolism , Algorithms , Animals , Binding Sites/genetics , Chromatin Immunoprecipitation/statistics & numerical data , Computational Biology , Consensus Sequence , DNA/genetics , DNA/metabolism , Gene Expression Profiling/statistics & numerical data , Humans
10.
PLoS One ; 7(1): e28272, 2012.
Article in English | MEDLINE | ID: mdl-22238575

ABSTRACT

Chromatin Immuno Precipitation (ChIP) profiling detects in vivo protein-DNA binding, and has revealed a large combinatorial complexity in the binding of chromatin associated proteins and their post-translational modifications. To fully explore the spatial and combinatorial patterns in ChIP-profiling data and detect potentially meaningful patterns, the areas of enrichment must be aligned and clustered, which is an algorithmically and computationally challenging task. We have developed CATCHprofiles, a novel tool for exhaustive pattern detection in ChIP profiling data. CATCHprofiles is built upon a computationally efficient implementation for the exhaustive alignment and hierarchical clustering of ChIP profiling data. The tool features a graphical interface for examination and browsing of the clustering results. CATCHprofiles requires no prior knowledge about functional sites, detects known binding patterns "ab initio", and enables the detection of new patterns from ChIP data at a high resolution, exemplified by the detection of asymmetric histone and histone modification patterns around H2A.Z-enriched sites. CATCHprofiles' capability for exhaustive analysis combined with its ease-of-use makes it an invaluable tool for explorative research based on ChIP profiling data. CATCHprofiles and the CATCH algorithm run on all platforms and are available for free through the CATCH website: http://catch.cmbi.ru.nl/. User support is available by subscribing to the mailing list catch-users@bioinformatics.org.


Subject(s)
Chromatin Immunoprecipitation/statistics & numerical data , Data Interpretation, Statistical , Microarray Analysis/statistics & numerical data , Sequence Alignment , Software , Algorithms , Base Sequence , Cells, Cultured , Chromatin Immunoprecipitation/methods , Cluster Analysis , Computational Biology/methods , Efficiency , Gene Expression Profiling/methods , Gene Expression Profiling/statistics & numerical data , Humans , Models, Biological , Molecular Sequence Data , Promoter Regions, Genetic/genetics , Sequence Alignment/methods , Sequence Alignment/statistics & numerical data
11.
Biostatistics ; 13(1): 113-28, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21914728

ABSTRACT

Chromatin immunoprecipitation followed by next generation sequencing (ChIP-seq) is a powerful technique that is being used in a wide range of biological studies including genome-wide measurements of protein-DNA interactions, DNA methylation, and histone modifications. The vast amount of data and biases introduced by sequencing and/or genome mapping pose new challenges and call for effective methods and fast computer programs for statistical analysis. To systematically model ChIP-seq data, we build a dynamic signal profile for each chromosome and then model the profile using a fully Bayesian hidden Ising model. The proposed model naturally takes into account spatial dependency and global and local distributions of sequence tags. It can be used for one-sample and two-sample analyses. Through model diagnosis, the proposed method can detect falsely enriched regions caused by sequencing and/or mapping errors, which is usually not offered by the existing hypothesis-testing-based methods. The proposed method is illustrated using 3 transcription factor (TF) ChIP-seq data sets and 2 mixed ChIP-seq data sets and compared with 4 popular and/or well-documented methods: MACS, CisGenome, BayesPeak, and SISSRs. The results indicate that the proposed method achieves equivalent or higher sensitivity and spatial resolution in detecting TF binding sites with false discovery rate at a much lower level.


Subject(s)
Chromatin Immunoprecipitation/statistics & numerical data , Models, Statistical , Sequence Analysis, DNA/statistics & numerical data , Algorithms , Bayes Theorem , Binding Sites/genetics , Biotechnology , DNA/genetics , DNA/metabolism , Data Interpretation, Statistical , Databases, Nucleic Acid , Humans , Markov Chains , Transcription Factors/metabolism
12.
J Bioinform Comput Biol ; 9(2): 269-82, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21523932

ABSTRACT

New high-throughput sequencing technologies can generate millions of short sequences in a single experiment. As the size of the data increases, comparison of multiple experiments on different cell lines under different experimental conditions becomes a big challenge. In this paper, we investigate ways to compare multiple ChIP-sequencing experiments. We specifically studied epigenetic regulation of breast cancer and the effect of estrogen using 50 ChIP-sequencing datasets from Illumina Genome Analyzer II. First, we evaluate the correlation among different experiments focusing on the total number of reads in transcribed and promoter regions of the genome. Then, we adopt the method that is used to identify the most stable genes in RT-PCR experiments to understand background signal across all of the experiments and to identify the most variable transcribed and promoter regions of the genome. We observed that the most variable genes for transcribed regions and promoter regions are very distinct. Gene ontology and function enrichment analysis on these most variable genes demonstrate the biological relevance of the results. In this study, we present a method that can effectively select differential regions of the genome based on protein-binding profiles over multiple experiments using real data points without any normalization among the samples.


Subject(s)
Chromatin Immunoprecipitation/statistics & numerical data , Algorithms , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Cell Line , Cell Line, Tumor , Computational Biology , Epigenesis, Genetic , Female , Genome, Human , Humans , Protein Binding
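
The abstract above begins by evaluating the correlation among experiments from read totals in promoter and transcribed regions. It does not name the correlation measure, so the sketch below assumes the Pearson coefficient and applies it to two hypothetical experiments; the function name and the read counts are invented for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of read counts."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical promoter read totals for the same four genes in two experiments.
experiment_1 = [120, 40, 300, 15]
experiment_2 = [250, 70, 640, 20]
r = pearson(experiment_1, experiment_2)
print(round(r, 4))
```

Because the Pearson coefficient is unchanged when one sample is multiplied by a constant, a uniform difference in sequencing depth between two experiments does not alter it.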
13.
Hum Genomics ; 5(2): 117-23, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21296745

ABSTRACT

Chromatin immunoprecipitation followed by massively parallel next-generation sequencing (ChIP-seq) is a valuable experimental strategy for assaying protein-DNA interaction over the whole genome. Many computational tools have been designed to find the peaks of the signals corresponding to protein binding sites. In this paper, three computational methods, ChIP-seq processing pipeline (spp), PeakSeq and CisGenome, used in ChIP-seq data analysis are reviewed. There is also a comparison of how they agree and disagree on finding peaks using the publicly available Signal Transducers and Activators of Transcription protein 1 (STAT1) and RNA polymerase II (PolII) datasets with corresponding negative controls.


Subject(s)
Chromatin Immunoprecipitation/methods , Sequence Analysis, DNA , Software , Chromatin Immunoprecipitation/statistics & numerical data , Humans , Protein Binding , RNA Polymerase II/genetics , Research Design , STAT1 Transcription Factor/genetics
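
The review above compares how three peak callers agree and disagree on the same data. One basic way to quantify agreement, sketched here under the assumption that each caller's output has been reduced to (chromosome, start, end) intervals, is to count the peaks of one caller that overlap any peak of another. The function name and the peak coordinates are hypothetical, and no output format of spp, PeakSeq or CisGenome is implied.

```python
def count_overlapping(peaks_a, peaks_b):
    """Count peaks in peaks_a that overlap at least one peak in peaks_b.

    Each peak is (chrom, start, end) with half-open coordinates [start, end).
    """
    by_chrom = {}
    for chrom, start, end in peaks_b:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, start, end in peaks_a:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < b_end and b_start < end
               for b_start, b_end in by_chrom.get(chrom, [])):
            hits += 1
    return hits


# Hypothetical peak calls from two tools on the same data set.
caller_1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
caller_2 = [("chr1", 150, 250), ("chr2", 400, 500)]
shared = count_overlapping(caller_1, caller_2)
print(f"{shared} of {len(caller_1)} peaks from caller 1 overlap a caller 2 peak")
```

The pairwise scan is quadratic per chromosome, which is fine for a sketch; genome-scale peak lists would normally be sorted and swept, or handled with an interval tree.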
14.
Biometrics ; 66(4): 1284-94, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20128774

ABSTRACT

ChIP-chip experiments are procedures that combine chromatin immunoprecipitation (ChIP) and DNA microarray (chip) technology to study a variety of biological problems, including protein-DNA interaction, histone modification, and DNA methylation. The most important feature of ChIP-chip data is that the intensity measurements of probes are spatially correlated because the DNA fragments are hybridized to neighboring probes in the experiments. We propose a simple, but powerful Bayesian hierarchical approach to ChIP-chip data through an Ising model with high-order interactions. The proposed method naturally takes into account the intrinsic spatial structure of the data and can be used to analyze data from multiple platforms with different genomic resolutions. The model parameters are estimated using the Gibbs sampler. The proposed method is illustrated using two publicly available data sets from Affymetrix and Agilent platforms, and compared with three alternative Bayesian methods, namely, Bayesian hierarchical model, hierarchical gamma mixture model, and Tilemap hidden Markov model. The numerical results indicate that the proposed method performs as well as the other three methods for the data from Affymetrix tiling arrays, but significantly outperforms the other three methods for the data from Agilent promoter arrays. In addition, we find that the proposed method has better operating characteristics in terms of sensitivities and false discovery rates under various scenarios.


Subject(s)
Bayes Theorem , Chromatin Immunoprecipitation/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Humans , Methods , Sensitivity and Specificity
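
The abstract above models spatially correlated probes with an Ising model whose parameters are estimated by a Gibbs sampler. The sketch below is a deliberately reduced illustration of that idea: a nearest-neighbour chain of probe states (+1 enriched, -1 not enriched) resampled from their full conditionals. It leaves out the high-order interactions, the probe intensities and the parameter estimation described in the paper, and every name and parameter value in it is an assumption.

```python
import math
import random


def prob_enriched(left, right, coupling, field):
    """P(s_i = +1 | neighbours) for a nearest-neighbour 1D Ising chain.

    With P(s) proportional to exp(field * sum(s_i) + coupling * sum(s_i * s_j)),
    the odds of +1 versus -1 at one site are exp(2 * (field + coupling * (left + right))).
    """
    local = field + coupling * (left + right)
    return 1.0 / (1.0 + math.exp(-2.0 * local))


def gibbs_sweep(states, coupling, field, rng):
    """Resample every probe state once, in place, from its full conditional."""
    n = len(states)
    for i in range(n):
        left = states[i - 1] if i > 0 else 0       # missing neighbour counts as 0
        right = states[i + 1] if i < n - 1 else 0
        p = prob_enriched(left, right, coupling, field)
        states[i] = 1 if rng.random() < p else -1
    return states


rng = random.Random(0)  # fixed seed so the run is repeatable
states = [rng.choice([-1, 1]) for _ in range(20)]
for _ in range(100):
    gibbs_sweep(states, coupling=1.0, field=0.0, rng=rng)
print(states)
```

A positive coupling makes neighbouring probes tend to share a state, which is how such a model encodes the expectation that enrichment extends over runs of adjacent probes.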
15.
Genome Biol ; 10(12): R142, 2009.
Article in English | MEDLINE | ID: mdl-20028542

ABSTRACT

We present CSDeconv, a computational method that determines locations of transcription factor binding from ChIP-seq data. CSDeconv differs from prior methods in that it uses a blind deconvolution approach that allows closely-spaced binding sites to be called accurately. We apply CSDeconv to novel ChIP-seq data for DosR binding in Mycobacterium tuberculosis and to existing data for GABP in humans and show that it can discriminate binding sites separated by as few as 40 bp.


Subject(s)
Chromatin Immunoprecipitation/statistics & numerical data , Computational Biology/methods , Software , Transcription Factors/metabolism , Binding Sites/genetics , Humans , Mycobacterium tuberculosis/genetics , Transcription Factors/genetics
16.
Methods Mol Biol ; 521: 255-78, 2009.
Article in English | MEDLINE | ID: mdl-19563111

ABSTRACT

Chromatin immunoprecipitation (ChIP) is a widely used method to study the interactions between proteins and discrete chromosomal loci in vivo. Originally, ChIP was developed for analysis of protein associations with DNA sequences known or suspected to bind the protein of interest. The advent of DNA microarrays has enabled the identification of all DNA sequences enriched by ChIP, providing a genomic view of protein binding. This powerful approach, termed ChIP-chip, is broadly applicable and has been particularly valuable in DNA replication studies to map replication origins in Saccharomyces cerevisiae based on the association of replication proteins with these chromosomal elements. We present a detailed ChIP-chip protocol for S. cerevisiae that uses oligonucleotide DNA microarrays printed on polylysine-coated glass slides and can also be easily adapted for commercially available high-density tiling microarrays from NimbleGen. We also outline general protocols for data analysis; however, microarray data analyses usually must be tailored specifically for individual studies, depending on experimental design, microarray format, and data quality.


Subject(s)
Chromatin Immunoprecipitation/methods , Chromatin/metabolism , DNA Replication , DNA-Binding Proteins/metabolism , Oligonucleotide Array Sequence Analysis/methods , Chromatin Immunoprecipitation/statistics & numerical data , Cross-Linking Reagents , DNA, Fungal/biosynthesis , DNA, Fungal/isolation & purification , Data Interpretation, Statistical , Fluorescent Dyes , Nucleic Acid Hybridization , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Replication Origin , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism
17.
Biometrics ; 65(4): 1087-95, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19210737

ABSTRACT

We propose a unified framework for the analysis of chromatin (Ch) immunoprecipitation (IP) microarray (ChIP-chip) data for detecting transcription factor binding sites (TFBSs) or motifs. ChIP-chip assays are used to focus the genome-wide search for TFBSs by isolating a sample of DNA fragments with TFBSs and applying this sample to a microarray with probes corresponding to tiled segments across the genome. Present analytical methods use a two-step approach: (i) analyze array data to estimate IP-enrichment peaks then (ii) analyze the corresponding sequences independently of intensity information. The proposed model integrates peak finding and motif discovery through a unified Bayesian hidden Markov model (HMM) framework that accommodates the inherent uncertainty in both measurements. A Markov chain Monte Carlo algorithm is formulated for parameter estimation, adapting recursive techniques used for HMMs. In simulations and applications to a yeast RAP1 dataset, the proposed method has favorable TFBS discovery performance compared to currently available two-stage procedures in terms of both sensitivity and specificity.


Subject(s)
Biometry/methods , Chromatin Immunoprecipitation/statistics & numerical data , Genomics/statistics & numerical data , Models, Statistical , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Algorithms , Base Sequence , Bayes Theorem , Binding Sites/genetics , DNA, Fungal/genetics , DNA, Fungal/metabolism , Markov Chains , Monte Carlo Method , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Shelterin Complex , Telomere-Binding Proteins/metabolism , Transcription Factors/metabolism
18.
PLoS Comput Biol ; 4(10): e1000201, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18927605

ABSTRACT

Computational methods to identify functional genomic elements using genetic information have been very successful in determining gene structure and in identifying a handful of cis-regulatory elements. But the vast majority of regulatory elements have yet to be discovered, and it has become increasingly apparent that their discovery will not come from using genetic information alone. Recently, high-throughput technologies have enabled the creation of information-rich epigenetic maps, most notably for histone modifications. However, tools that search for functional elements using this epigenetic information have been lacking. Here, we describe an unsupervised learning method called ChromaSig to find, in an unbiased fashion, commonly occurring chromatin signatures in both tiling microarray and sequencing data. Applying this algorithm to nine chromatin marks across a 1% sampling of the human genome in HeLa cells, we recover eight clusters of distinct chromatin signatures, five of which correspond to known patterns associated with transcriptional promoters and enhancers. Interestingly, we observe that the distinct chromatin signatures found at enhancers mark distinct functional classes of enhancers in terms of transcription factor and coactivator binding. In addition, we identify three clusters of novel chromatin signatures that contain evolutionarily conserved sequences and potential cis-regulatory elements. Applying ChromaSig to a panel of 21 chromatin marks mapped genomewide by ChIP-Seq reveals 16 classes of genomic elements marked by distinct chromatin signatures. Interestingly, four classes containing enrichment for repressive histone modifications appear to be locally heterochromatic sites and are enriched in quickly evolving regions of the genome. The utility of this approach in uncovering novel, functionally significant genomic elements will aid future efforts of genome annotation via chromatin modifications.


Subject(s)
Chromatin/genetics , Genome, Human , Models, Genetic , Models, Statistical , Artificial Intelligence , Chromatin/metabolism , Chromatin Immunoprecipitation/statistics & numerical data , Computational Biology , Enhancer Elements, Genetic , HeLa Cells , Histones/chemistry , Histones/genetics , Histones/metabolism , Humans , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Promoter Regions, Genetic , Protein Processing, Post-Translational , Transcription Initiation Site
19.
Curr Opin Biotechnol ; 19(1): 50-4, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18207385

ABSTRACT

Changes in transcript levels are assessed by microarray analysis on an individual basis, essentially resulting in long lists of genes that were found to have significantly changed transcript levels. However, in biology these changes do not occur as independent events as such lists suggest, but in a highly coordinated and interdependent manner. Understanding the biological meaning of the observed changes requires elucidating such biological interdependencies. The most common way to achieve this is to project the gene lists onto distinct biological processes often represented in the form of gene-ontology (GO) categories or metabolic and regulatory pathways as derived from literature analysis. This review focuses on different approaches and tools employed for this task, starting from GO-ranking methods, covering pathway mappings, and finally converging on biological network analysis. A brief outlook of the application of such approaches to the newest microarray-based technologies (Chromatin-ImmunoPrecipitation, ChIP-on-chip) concludes the review.
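The abstract does not say which GO-ranking statistics the review covers. One widely used way to project a gene list onto a GO category is a hypergeometric over-representation test, sketched below as an assumption-laden illustration rather than a summary of the review. The function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_background, n_in_category, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value for category over-representation.

    n_background  : genes measured on the array
    n_in_category : background genes annotated to the GO category
    n_selected    : genes in the list of changed transcripts
    n_overlap     : selected genes that are annotated to the category

    Returns P(X >= n_overlap) when n_selected genes are drawn without
    replacement from the background.
    """
    max_overlap = min(n_in_category, n_selected)
    denom = comb(n_background, n_selected)
    tail = 0
    for i in range(n_overlap, max_overlap + 1):
        # Ways to pick i category genes and (n_selected - i) non-category genes.
        # math.comb returns 0 when the second argument exceeds the first.
        tail += comb(n_in_category, i) * comb(n_background - n_in_category, n_selected - i)
    return tail / denom
```

Ranking categories by this p-value (after a multiple-testing correction, which is not shown here) yields the kind of ordered list of biological processes that the abstract describes.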


Subject(s)
Oligonucleotide Array Sequence Analysis/statistics & numerical data , Biotechnology , Chromatin Immunoprecipitation/statistics & numerical data , Computational Biology , Data Interpretation, Statistical , Databases, Genetic
20.
Pac Symp Biocomput ; : 515-26, 2008.
Article in English | MEDLINE | ID: mdl-18229712

ABSTRACT

Whole genome tiling arrays at a user specified resolution are becoming a versatile tool in genomics. Chromatin immunoprecipitation on microarrays (ChIP-chip) is a powerful application of these arrays. Although there is an increasing number of methods for analyzing ChIP-chip data, perhaps the most simple and commonly used one, due to its computational efficiency, is testing with a moving average statistic. Current moving average methods assume exchangeability of the measurements within an array. They are not tailored to deal with the issues due to array designs such as overlapping probes that result in correlated measurements. We investigate the correlation structure of data from such arrays and propose an extension of the moving average testing via a robust and rapid method called CMARRT. We illustrate the pitfalls of ignoring the correlation structure in simulations and a case study. Our approach is implemented as an R package called CMARRT and can be used with any tiling array platform.
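To make the point about correlated probes concrete, the following sketch computes a centered moving average over probe measurements and compares the variance of a window mean under independence with its variance under a given autocorrelation function. It is a minimal illustration of the general idea, not the CMARRT package or its actual algorithm; the function names and the geometric autocorrelation used in the usage note are assumptions.

```python
def moving_average(y, half_window):
    """Centered moving average of probe values; windows are truncated at the array edges."""
    n = len(y)
    out = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = y[lo:hi]
        out.append(sum(window) / len(window))
    return out


def window_mean_variance(sigma2, width, rho):
    """Variance of the mean of `width` consecutive probes.

    sigma2 : marginal variance of a single probe measurement
    rho    : function mapping a lag (0, 1, 2, ...) to the autocorrelation at that lag

    Var(mean) = sigma2 / width**2 * sum over all probe pairs (j, k) of rho(|j - k|).
    """
    total = 0.0
    for j in range(width):
        for k in range(width):
            total += rho(abs(j - k))
    return sigma2 * total / width ** 2
```

With independent probes (`rho` equal to 1 at lag 0 and 0 elsewhere) the variance reduces to `sigma2 / width`. With positively correlated probes, for example a hypothetical `rho = lambda lag: 0.5 ** lag`, the variance is larger, so standardizing the moving average with the independence formula would overstate significance.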


Subject(s)
Algorithms , Chromatin Immunoprecipitation/statistics & numerical data , Microarray Analysis/statistics & numerical data , Computational Biology , Data Interpretation, Statistical , Markov Chains , Models, Statistical , Regression Analysis , Software