1.
Bioinformatics ; 31(8): 1290-2, 2015 Apr 15.
Article in English | MEDLINE | ID: mdl-25480377

ABSTRACT

UNLABELLED: We implemented a high-throughput identification pipeline for promoter-interacting enhancer elements to streamline the workflow from mapping raw Hi-C reads, identifying DNA-DNA interacting fragments with high confidence and quality control, and detecting histone modifications and DNase hypersensitive enrichments in putative enhancer elements, to ultimately extracting possible intra- and inter-chromosomal enhancer-target gene relationships. AVAILABILITY AND IMPLEMENTATION: This software package is designed to run on high-performance computing clusters with Oracle Grid Engine. The source code is freely available under the MIT license for academic and nonprofit use. The source code and instructions are available at the Wang lab website (http://wanglab.pcbi.upenn.edu/hippie/). It is also provided as an Amazon Machine Image to be used directly on Amazon Cloud with minimal installation. CONTACT: lswang@mail.med.upenn.edu or bdgregor@sas.upenn.edu SUPPLEMENTARY INFORMATION: Supplementary Material is available at Bioinformatics online.


Subject(s)
DNA/genetics , DNA/metabolism , Enhancer Elements, Genetic/genetics , Promoter Regions, Genetic/genetics , Sequence Analysis, DNA/methods , Humans , Programming Languages
2.
Nucleic Acids Res ; 41(9): 4835-46, 2013 May.
Article in English | MEDLINE | ID: mdl-23525463

ABSTRACT

Enhancer elements are essential for tissue-specific gene regulation during mammalian development. Although these regulatory elements are often distant from their target genes, they affect gene expression by recruiting transcription factors to specific promoter regions. Because of this long-range action, the annotation of enhancer element-target promoter pairs remains elusive. Here, we developed a novel analysis methodology that takes advantage of Hi-C data to comprehensively identify these interactions throughout the human genome. To do this, we used a geometric distribution-based model to identify DNA-DNA interaction hotspots that contact gene promoters with high confidence. We observed that these promoter-interacting hotspots significantly overlap with known enhancer-associated histone modifications and DNase I hypersensitive sites. Thus, we defined thousands of candidate enhancer elements by incorporating these features, and found that they have a significant propensity to be bound by p300, an enhancer binding transcription factor. Furthermore, we revealed that their target genes are significantly bound by RNA Polymerase II and demonstrate tissue-specific expression. Finally, we uncovered that these elements are generally found within 1 Mb of their targets, and often regulate multiple genes. In total, our study presents a novel high-throughput workflow for confident, genome-wide discovery of enhancer-target promoter pairs, which will significantly improve our understanding of these regulatory interactions.
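
The abstract does not spell out its geometric distribution-based model, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes (hypothetically) that per-fragment read counts follow a geometric distribution on {0, 1, 2, ...} whose mean is estimated from the data, and it flags fragments whose upper-tail probability falls below a cutoff. The function names, the `alpha` cutoff and the toy counts are all illustrative.

```python
def geometric_tail_pvalue(count, mean_count):
    """P(X >= count) for a geometric distribution on {0, 1, 2, ...}.

    The success probability p is chosen so that the distribution has the
    given mean: mean = (1 - p) / p, hence p = 1 / (1 + mean).
    """
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    p = 1.0 / (1.0 + mean_count)
    return (1.0 - p) ** count


def call_hotspots(read_counts, alpha=0.01):
    """Return fragments whose read count is unexpectedly high.

    read_counts maps a fragment identifier to its number of supporting reads.
    The background mean is estimated naively from all fragments (an
    assumption made for this sketch only).
    """
    if not read_counts:
        raise ValueError("read_counts must not be empty")
    mean_count = sum(read_counts.values()) / len(read_counts)
    return {
        fragment
        for fragment, count in read_counts.items()
        if geometric_tail_pvalue(count, mean_count) < alpha
    }


# Toy data: nine background fragments and one fragment with many reads.
toy_counts = {f"frag{i}": 1 for i in range(9)}
toy_counts["frag9"] = 41
hotspots = call_hotspots(toy_counts)
```

Estimating the background mean from all fragments, including the outlier, is deliberately crude; a real analysis would likely fit the background separately and correct for multiple testing.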


Subject(s)
Enhancer Elements, Genetic , Genome, Human , Promoter Regions, Genetic , Animals , Base Sequence , Binding Sites , Conserved Sequence , DNA/metabolism , Gene Expression , High-Throughput Nucleotide Sequencing , Humans , Nucleotide Motifs , RNA Polymerase II/metabolism , Sequence Analysis, DNA , Vertebrates/genetics , p300-CBP Transcription Factors/metabolism
3.
Bioinformatics ; 29(19): 2498-500, 2013 Oct 01.
Article in English | MEDLINE | ID: mdl-23943636

ABSTRACT

SUMMARY: We report our new DRAW+SneakPeek software for DNA-seq analysis. DNA resequencing analysis workflow (DRAW) automates the workflow of processing raw sequence reads including quality control, read alignment and variant calling on high-performance computing facilities such as Amazon elastic compute cloud. SneakPeek provides an effective interface for reviewing dozens of quality metrics reported by DRAW, so users can assess the quality of data and diagnose problems in their sequencing procedures. Both DRAW and SneakPeek are freely available under the MIT license, and are available as Amazon machine images to be used directly on Amazon cloud with minimal installation. AVAILABILITY: DRAW+SneakPeek is released under the MIT license and is available for academic and nonprofit use for free. The information about source code, Amazon machine images and instructions on how to install and run DRAW+SneakPeek locally and on Amazon elastic compute cloud is available at the National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (http://www.niagads.org/) and Wang lab Web site (http://wanglab.pcbi.upenn.edu/).


Subject(s)
Biometry/methods , DNA/analysis , Sequence Analysis, DNA/methods , Software Design , Internet , Programming Languages
4.
BMC Biol ; 11: 19, 2013 Feb 28.
Article in English | MEDLINE | ID: mdl-23448136

ABSTRACT

BACKGROUND: The nuclear factor-KappaB (NF-κB) pathway is conserved from fruit flies to humans and is a key mediator of inflammatory signaling. Aberrant regulation of NF-κB is associated with several disorders including autoimmune disease, chronic inflammation, and cancer, making the NF-κB pathway an attractive therapeutic target. Many regulatory components of the NF-κB pathway have been identified, including microRNAs (miRNAs). miRNAs are small non-coding RNAs and are common components of signal transduction pathways. Here we present a cell-based functional genomics screen to systematically identify miRNAs that regulate NF-κB signaling. RESULTS: We screened a library of miRNA mimics using an NF-κB reporter cell line in the presence and absence of tumor necrosis factor (+/- TNF). There were 9 and 15 hits in the -TNF and +TNF screens, respectively. We identified putative functional targets of these hits by integrating computational predictions with NF-κB modulators identified in a previous genome-wide cDNA screen. miR-517a and miR-517c were the top hits, activating the reporter 86- and 126-fold, respectively. Consistent with these results, miR-517a/c induced the expression of endogenous NF-κB targets and promoted the nuclear localization of p65 and the degradation of IκB. We identified TNFAIP3 interacting protein 1 (TNIP1) as a target and characterized a functional SNP in the miR-517a/c binding site. Lastly, miR-517a/c induced apoptosis in vitro, which was phenocopied by knockdown of TNIP1. CONCLUSIONS: Our study suggests that miRNAs are common components of NF-κB signaling and miR-517a/c may play an important role in linking NF-κB signaling with cell survival through TNIP1.


Subject(s)
Genomics , MicroRNAs/physiology , NF-kappa B/metabolism , Signal Transduction/physiology , Apoptosis , Cell Line , Humans
5.
Nat Biotechnol ; 40(5): 672-680, 2022 May.
Article in English | MEDLINE | ID: mdl-35132260

ABSTRACT

The repetitive nature and complexity of some medically relevant genes pose a challenge for their accurate analysis in a clinical setting. The Genome in a Bottle Consortium has provided variant benchmark sets, but these exclude nearly 400 medically relevant genes due to their repetitiveness or polymorphic complexity. Here, we characterize 273 of these 395 challenging autosomal genes using a haplotype-resolved whole-genome assembly. This curated benchmark reports over 17,000 single-nucleotide variations, 3,600 insertions and deletions and 200 structural variations each for human genome reference GRCh37 and GRCh38 across HG002. We show that false duplications in either GRCh37 or GRCh38 result in reference-specific, missed variants for short- and long-read technologies in medically relevant genes, including CBS, CRYAA and KCNE1. When masking these false duplications, variant recall can improve from 8% to 100%. Forming benchmarks from a haplotype-resolved whole-genome assembly may become a prototype for future benchmarks covering the whole genome.


Subject(s)
Genome, Human , Genome, Human/genetics , Haplotypes/genetics , Humans , Sequence Analysis, DNA
6.
NAR Genom Bioinform ; 2(2): lqaa022, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32270138

ABSTRACT

Most regulatory chromatin interactions are mediated by various transcription factors (TFs) and involve physically interacting elements such as enhancers, insulators or promoters. To map these elements and interactions at a fine scale, we developed HIPPIE2, which analyzes raw reads from high-throughput chromosome conformation (Hi-C) experiments to identify precise loci of DNA physically interacting regions (PIRs). Unlike standard genome binning approaches (e.g. 10-kb to 1-Mb bins), HIPPIE2 dynamically infers the physical locations of PIRs using the distribution of restriction sites to increase analysis precision and resolution. We applied HIPPIE2 to in situ Hi-C datasets across six human cell lines (GM12878, IMR90, K562, HMEC, HUVEC, NHEK) with matched ENCODE/Roadmap functional genomic data. HIPPIE2 detected 1,042,738 distinct PIRs, with high resolution (average PIR length of 1006 bp) and high reproducibility (92.3% in GM12878). PIRs are enriched for epigenetic marks (H3K27ac, H3K4me1) and open chromatin, suggesting active regulatory roles. HIPPIE2 identified 2.8 million significant PIR-PIR interactions, 27.2% of which were enriched for TF binding sites. Of these, 50,608 were enhancer-promoter interactions and were enriched for 33 TFs, including known DNA looping/long-range mediators. These findings demonstrate that the novel dynamic approach of HIPPIE2 (https://bitbucket.com/wanglab-upenn/HIPPIE2) enables the characterization of chromatin and regulatory interactions with high resolution and reproducibility.

7.
Genome Biol ; 18(1): 199, 2017 Oct 26.
Article in English | MEDLINE | ID: mdl-29070071

ABSTRACT

Transcriptional enhancers regulate spatio-temporal gene expression. While genomic assays can identify putative enhancers en masse, assigning target genes is a complex challenge. We devised a machine learning approach, McEnhancer, which links target genes to putative enhancers via a semi-supervised learning algorithm that predicts gene expression patterns based on enriched sequence features. Predicted expression patterns were 73-98% accurate, predicted assignments showed strong Hi-C interaction enrichment, enhancer-associated histone modifications were evident, and known functional motifs were recovered. Our model provides a general framework to link globally identified enhancers to targets and contributes to deciphering the regulatory genome.


Subject(s)
Enhancer Elements, Genetic , Gene Expression Regulation, Developmental , Machine Learning , Animals , Deoxyribonuclease I , Drosophila melanogaster/embryology , Drosophila melanogaster/genetics , Embryonic Development/genetics , Genes, Reporter , Histone Code , Nucleotide Motifs , Promoter Regions, Genetic , Sequence Analysis, DNA , Transcription Factors/metabolism
8.
PLoS One ; 8(10): e74578, 2013.
Article in English | MEDLINE | ID: mdl-24098339

ABSTRACT

Several recent gene expression studies identified hundreds of genes that are correlated with age in the brain and other tissues in humans. However, these studies used linear models of age correlation, which are not well equipped to model abrupt changes associated with particular ages. We developed a computational algorithm for age estimation in which the expression of each gene is treated as a dichotomized biomarker for whether the subject is older or younger than a particular age. In addition, for each age-informative gene our algorithm identifies the age threshold with the most drastic change in expression level, which allows us to associate genes with particular age periods. Analysis of human aging brain expression datasets from three frontal cortex regions showed that different pathways undergo transitions at different ages, and the distribution of pathways and age thresholds varies across brain regions. Our study reveals age-correlated expression changes at particular age points and allows one to estimate the age of an individual with better accuracy than previously published methods.
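
The threshold search described above can be illustrated with a small sketch. The abstract does not say how "most drastic change" is scored, so this example assumes, purely for illustration, that the change is the absolute difference in mean expression between subjects younger and older than a candidate age. The function name, the `min_group` parameter and the toy data are hypothetical.

```python
def best_age_threshold(ages, expression, min_group=2):
    """Find the age cut that best separates a gene's expression levels.

    Candidate thresholds are midpoints between consecutive distinct ages.
    Each candidate is scored by the absolute difference between the mean
    expression of the older and the younger group (an assumed score).
    Returns (threshold, change), or None if no valid split exists.
    """
    pairs = sorted(zip(ages, expression))
    best = None
    for i in range(min_group, len(pairs) - min_group + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between identical ages
        younger = [value for _, value in pairs[:i]]
        older = [value for _, value in pairs[i:]]
        change = abs(sum(older) / len(older) - sum(younger) / len(younger))
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or change > best[1]:
            best = (threshold, change)
    return best


# Toy gene whose expression jumps between ages 35 and 60.
toy_ages = [20, 25, 30, 35, 60, 65, 70, 75]
toy_expression = [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0]
threshold, change = best_age_threshold(toy_ages, toy_expression)
```

Once a threshold is chosen, the gene acts as a dichotomized marker: expression on one side of the jump votes "younger than the threshold", expression on the other side votes "older".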


Subject(s)
Aging/genetics , Brain/metabolism , Computational Biology/methods , Nonlinear Dynamics , Transcriptome/physiology , Adult , Aged , Aging/metabolism , Bayes Theorem , Down-Regulation/physiology , Female , Humans , Male , Middle Aged , Protein Interaction Maps/genetics
9.
J Proteome Res ; 8(4): 1925-31, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19231892

ABSTRACT

Essential genes are responsible for the viability of an organism. Global protein interaction network analysis provides an effective way to understand the relationships between protein products of genes. By means of large-scale identification of essential genes and protein-protein interactions, we investigated the substructure of the protein interaction network in Escherichia coli and identified all the cliques in the network. Our analysis showed that larger cliques tend to have larger fractions of proteins encoded by essential genes. By merging the maximum clique with overlapping neighboring cliques, we observed a dense core of the protein interaction network in Escherichia coli with a significantly higher ratio of essential genes. The protein network of Saccharomyces cerevisiae also shows strong correlation between clique and essentiality, and there exist similar dense clusters with high essentiality. Our results indicated that the observed structure of essential cores might exist in higher organisms and play important roles in their respective protein networks.
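
As a rough illustration of the clique analysis, the sketch below enumerates maximal cliques with the basic Bron-Kerbosch algorithm and reports the fraction of essential proteins in each. The abstract does not state which enumeration method the authors used, so the algorithm choice is an assumption, and the protein names, edges and essential set are invented toy data.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def maximal_cliques(adjacency):
    """Enumerate all maximal cliques (Bron-Kerbosch without pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for node in list(candidates):
            neighbours = adjacency[node]
            expand(current | {node}, candidates & neighbours, excluded & neighbours)
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return cliques


def essential_fraction(clique, essential):
    """Fraction of clique members that belong to the essential set."""
    return len(clique & essential) / len(clique)


# Toy network: a 4-clique (A, B, C, D) with a short tail D-E-F.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
             ("B", "D"), ("C", "D"), ("D", "E"), ("E", "F")]
toy_essential = {"A", "B", "C", "E"}
toy_network = build_adjacency(toy_edges)
for clique in sorted(maximal_cliques(toy_network), key=len, reverse=True):
    print(sorted(clique), essential_fraction(clique, toy_essential))
```

The version without pivoting is fine for small examples; on a genome-scale interaction network a pivoting variant or a graph library would be the more practical choice.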


Subject(s)
Escherichia coli Proteins/metabolism , Escherichia coli/metabolism , Protein Interaction Mapping , Proteome/metabolism , Cluster Analysis , Protein Binding/physiology
10.
Mol Biosyst ; 5(12): 1672-8, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19452048

ABSTRACT

Essential genes are indispensable to the viability of an organism. Identification and analysis of essential genes is key to understanding the systems level organization of living cells. On the other hand, the ability to predict these genes in pathogens is of great importance for directed drug development. Global analysis of protein interaction networks provides an effective way to elucidate the relationships between genes. It has been found that essential genes tend to be highly connected and generally have more interactions than nonessential ones. With recent large-scale identifications of essential genes and protein-protein interactions in Saccharomyces cerevisiae and Escherichia coli, we have systematically investigated the topological properties of essential and nonessential genes in the protein-protein interaction networks. Essential genes tend to play topologically more important roles in protein interaction networks. Many topological features were found to be statistically discriminative between essential and nonessential genes. In addition, we have also examined sequence properties such as open reading frame length, strand, and phyletic retention for their association with the gene essentiality. Employing the topological features in the protein interaction network and the sequence properties, we have built a machine learning classifier capable of predicting essential genes. Computational prediction of essential genes circumvents expensive and difficult experimental screens and will help antimicrobial drug development.
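
The simplest topological feature mentioned above is connectivity. The following sketch computes node degree in an interaction network and compares its mean between essential and nonessential genes. It illustrates a single feature only and is not the authors' classifier; the gene names, edges and essentiality labels are made-up toy data.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count interaction partners per gene in an undirected edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {gene: len(partners) for gene, partners in neighbours.items()}


def mean_degree_by_class(edges, essential):
    """Return (mean degree of essential genes, mean degree of the rest).

    Assumes both classes are represented in the network.
    """
    degrees = node_degrees(edges)
    essential_degrees = [d for g, d in degrees.items() if g in essential]
    other_degrees = [d for g, d in degrees.items() if g not in essential]
    if not essential_degrees or not other_degrees:
        raise ValueError("both classes must be present in the network")
    return (sum(essential_degrees) / len(essential_degrees),
            sum(other_degrees) / len(other_degrees))


# Toy network in which the essential genes g1 and g2 are better connected.
toy_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g4", "g5")]
essential_mean, other_mean = mean_degree_by_class(toy_edges, {"g1", "g2"})
```

In a classifier of the kind the abstract describes, degree would be one column in a feature table alongside other topological measures and sequence properties such as open reading frame length.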


Subject(s)
Gene Regulatory Networks , Genes, Essential , Genomics/methods , Models, Statistical , Sequence Analysis, DNA/methods , Escherichia coli/genetics , Genome, Bacterial/genetics , Genome, Fungal/genetics , Models, Genetic , ROC Curve , Saccharomyces cerevisiae/genetics