Results 1 - 20 of 75
1.
Bioinformatics ; 40(3), 2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38407414

ABSTRACT

MOTIVATION: Prediction and identification of core promoter elements and transcription factor binding sites is essential for understanding the mechanism of transcription initiation and deciphering the biological activity of a specific locus. Thus, there is a need for an up-to-date tool to detect and curate core promoter elements/motifs in any provided nucleotide sequences. RESULTS: Here, we introduce ElemeNT 2023, a new and enhanced version of the Elements Navigation Tool, which provides novel capabilities for assessing evolutionary conservation and for readily evaluating the quality of high-throughput transcription start site (TSS) datasets, leveraging preferential motif positioning. ElemeNT 2023 is accessible both as a fast web-based tool and via command line (no coding skills are required to run the tool). While this tool is focused on core promoter elements, it can also be used for searching any user-defined motif, including sequence-specific DNA binding sites. Furthermore, ElemeNT's CORE database, which contains predicted core promoter elements around annotated TSSs, is now expanded to cover 10 species, ranging from worms to human. In this applications note, we describe the new workflow and demonstrate a case study using ElemeNT 2023 for core promoter composition analysis of diverse species, revealing motif prevalence and highlighting evolutionary insights. We discuss how this tool facilitates the exploration of uncharted transcriptomic data, appraises TSS quality, and aids in designing synthetic promoters for gene expression optimization. Taken together, ElemeNT 2023 empowers researchers with comprehensive tools for meticulous analysis of sequence elements and gene expression strategies. AVAILABILITY AND IMPLEMENTATION: ElemeNT 2023 is freely available at https://www.juven-gershonlab.org/resources/element-v2023/. The source code and command line version of ElemeNT 2023 are available at https://github.com/OritAdato/ElemeNT.


Subject(s)
Software , Humans , Promoter Regions, Genetic , Protein Binding , Transcription Initiation Site
2.
Biochim Biophys Acta Gene Regul Mech ; 1865(1): 194768, 2022 01.
Article in English | MEDLINE | ID: mdl-34757206

ABSTRACT

As computational modeling becomes more essential to analyze and understand biological regulatory mechanisms, governance of the many databases and knowledge bases that support this domain is crucial to guarantee reliability and interoperability of resources. To address this, the COST Action Gene Regulation Ensemble Effort for the Knowledge Commons (GREEKC, CA15205, www.greekc.org) organized nine workshops in a four-year period, starting September 2016. The workshops brought together a wide range of experts from all over the world working on various steps in the knowledge management process that focuses on understanding gene regulatory mechanisms. The discussions between ontologists, curators, text miners, biologists, bioinformaticians, philosophers and computational scientists spawned a host of activities aimed to standardize and update existing knowledge management workflows and involve end-users in the process of designing the Gene Regulation Knowledge Commons (GRKC). Here the GREEKC consortium describes its main achievements in improving this GRKC.


Subject(s)
Gene Expression Regulation , Reproducibility of Results
3.
PLoS Comput Biol ; 17(8): e1009256, 2021 08.
Article in English | MEDLINE | ID: mdl-34383743

ABSTRACT

Metazoan core promoters, which direct the initiation of transcription by RNA polymerase II (Pol II), may contain short sequence motifs termed core promoter elements/motifs (e.g. the TATA box, initiator (Inr) and downstream core promoter element (DPE)), which recruit Pol II via the general transcription machinery. The DPE was discovered and extensively characterized in Drosophila, where it is strictly dependent on both the presence of an Inr and the precise spacing from it. Since the Drosophila DPE is recognized by the human transcription machinery, it is most likely that some human promoters contain a downstream element that is similar, though not necessarily identical, to the Drosophila DPE. However, only a couple of human promoters were shown to contain a functional DPE, and attempts to computationally detect human DPE-containing promoters have mostly been unsuccessful. Using a newly-designed motif discovery strategy based on Expectation-Maximization probabilistic partitioning algorithms, we discovered preferred downstream positions (PDP) in human promoters that resemble the Drosophila DPE. Available chromatin accessibility footprints revealed that Drosophila and human Inr+DPE promoter classes are not only highly structured, but also similar to each other, particularly in the proximal downstream region. Clustering of the corresponding sequence motifs using a neighbor-joining algorithm strongly suggests that canonical Inr+DPE promoters could be common to metazoan species. Using reporter assays we demonstrate the contribution of the identified downstream positions to the function of multiple human promoters. Furthermore, we show that alteration of the spacing between the Inr and PDP by two nucleotides results in reduced promoter activity, suggesting a spacing dependency of the newly discovered human PDP on the Inr. 
Taken together, our strategy identified novel functional downstream positions within human core promoters, supporting the existence of DPE-like motifs in human promoters.


Subject(s)
Genome, Human , Promoter Regions, Genetic , Algorithms , Animals , Base Sequence , Computational Biology , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Gene Expression Regulation , HEK293 Cells , Humans , Models, Genetic , Models, Statistical , RNA Polymerase II/metabolism , Species Specificity , TATA Box , Transcription, Genetic
4.
EMBO Mol Med ; 13(7): e14314, 2021 07 07.
Article in English | MEDLINE | ID: mdl-34042278

ABSTRACT

Hormonal contraception exposes women to synthetic progesterone receptor (PR) agonists, progestins, and transiently increases breast cancer risk. How progesterone and progestins affect the breast epithelium is poorly understood because we lack adequate models to study this. We hypothesized that individual progestins differentially affect breast epithelial cell proliferation and hence breast cancer risk. Using mouse mammary tissue ex vivo, we show that testosterone-related progestins induce receptor activator of NF-κB ligand (Rankl), a PR target and mediator of PR signaling-induced cell proliferation, whereas progestins with anti-androgenic properties in reporter assays do not. We develop intraductal xenografts of human breast epithelial cells from 36 women and show that they remain hormone-responsive and that progesterone and the androgenic progestins desogestrel, gestodene, and levonorgestrel promote proliferation, whereas the anti-androgenic progestins chlormadinone and cyproterone acetate do not. Prolonged exposure to androgenic progestins elicits hyperproliferation with cytologic changes. Androgen receptor inhibition interferes with PR agonist- and levonorgestrel-induced RANKL expression and reduces levonorgestrel-driven cell proliferation. Thus, different progestins have distinct biological activities in the breast epithelium to be considered for more informed choices in hormonal contraception.


Subject(s)
Androgens , Progestins , Animals , Cell Proliferation , Contraceptive Agents , Mice
5.
EMBO Mol Med ; 13(3): e13180, 2021 03 05.
Article in English | MEDLINE | ID: mdl-33616307

ABSTRACT

Invasive lobular carcinoma (ILC) is the most frequent special histological subtype of breast cancer, typically characterized by loss of E-cadherin. It has clinical features distinct from other estrogen receptor-positive (ER+) breast cancers, but the molecular mechanisms underlying its characteristic biology are poorly understood because we lack experimental models to study them. Here, we recapitulate the human disease, including its metastatic pattern, by grafting ILC-derived breast cancer cell lines, SUM-44 PE and MDA-MB-134-VI cells, into the mouse milk ducts. Using patient-derived intraductal xenografts from lobular and non-lobular ER+ HER2- tumors to compare global gene expression, we identify extracellular matrix modulation as a lobular carcinoma cell-intrinsic trait. Analysis of TCGA patient datasets shows that a matrisome signature is enriched in lobular carcinomas, with overexpression of elastin, collagens, and the collagen-modifying enzyme LOXL1. Treatment with the pan-LOX inhibitor BAPN and silencing of LOXL1 expression decrease tumor growth, invasion, and metastasis by disrupting ECM structure, resulting in decreased ER signaling. We conclude that LOXL1 inhibition is a promising therapeutic strategy for ILC.


Subject(s)
Breast Neoplasms , Carcinoma, Lobular , Amino Acid Oxidoreductases/genetics , Animals , Carcinoma, Lobular/genetics , Extracellular Matrix , Female , Heterografts , Humans , Mice , Receptors, Estrogen
6.
Genome Biol ; 21(1): 114, 2020 05 11.
Article in English | MEDLINE | ID: mdl-32393327

ABSTRACT

BACKGROUND: Positional weight matrix (PWM) is a de facto standard model to describe transcription factor (TF) DNA binding specificities. PWMs inferred from in vivo or in vitro data are stored in many databases and used in a plethora of biological applications. This calls for comprehensive benchmarking of public PWM models with large experimental reference sets. RESULTS: Here we report results from all-against-all benchmarking of PWM models for DNA binding sites of human TFs on a large compilation of in vitro (HT-SELEX, PBM) and in vivo (ChIP-seq) binding data. We observe that the best performing PWM for a given TF often belongs to another TF, usually from the same family. Occasionally, binding specificity is correlated with the structural class of the DNA binding domain, indicated by good cross-family performance measures. Benchmarking-based selection of family-representative motifs is more effective than motif clustering-based approaches. Overall, there is good agreement between in vitro and in vivo performance measures. However, for some in vivo experiments, the best performing PWM is assigned to an unrelated TF, indicating a binding mode involving protein-protein cooperativity. CONCLUSIONS: In an all-against-all setting, we compute more than 18 million performance measure values for different PWM-experiment combinations and offer these results as a public resource to the research community. The benchmarking protocols are provided via a web interface and as docker images. The methods and results from this study may help others make better use of public TF specificity models, as well as public TF binding data sets.


Subject(s)
Protein Interaction Domains and Motifs , Software , Transcription Factors/metabolism , Animals , Benchmarking , Chromatin Immunoprecipitation Sequencing , Humans , Mice
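
To illustrate the kind of performance measure an all-against-all PWM benchmark like the one in the abstract above might compute, here is a minimal sketch that ranks two PWMs on a single experiment by area under the ROC curve (AUC). The abstract does not say which measures the study uses, so the choice of AUC, the names `PWM_A` and `PWM_B`, and all score values are assumptions made purely for illustration.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the probability that a bound sequence outscores a control one.

    Ties count as half a win (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def best_pwm(scores_by_pwm):
    """Return (name, auc) of the PWM with the highest AUC on one experiment."""
    aucs = {name: roc_auc(pos, neg) for name, (pos, neg) in scores_by_pwm.items()}
    name = max(aucs, key=aucs.get)
    return name, aucs[name]


# Hypothetical best-site scores per sequence: (bound sequences, control sequences).
EXPERIMENT = {
    "PWM_A": ([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]),
    "PWM_B": ([0.6, 0.5, 0.4], [0.7, 0.5, 0.1]),
}

if __name__ == "__main__":
    print(best_pwm(EXPERIMENT))
```

Repeating `best_pwm` over every experiment, with every PWM scored on every experiment, would give an all-against-all table in the spirit of the one the abstract describes. The pairwise loop is quadratic in the number of sequences, so a rank-based AUC would be preferable at scale.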
7.
Nat Commun ; 11(1): 1571, 2020 03 26.
Article in English | MEDLINE | ID: mdl-32218432

ABSTRACT

Estrogens and progesterone control breast development and carcinogenesis via their cognate receptors expressed in a subset of luminal cells in the mammary epithelium. How they control the extracellular matrix, important to breast physiology and tumorigenesis, remains unclear. Here we report that both hormones induce the secreted protease Adamts18 in myoepithelial cells by controlling Wnt4 expression with consequent paracrine canonical Wnt signaling activation. Adamts18 is required for stem cell activation, has multiple binding partners in the basement membrane and interacts genetically with the basal membrane-specific proteoglycan, Col18a1, pointing to the basement membrane as part of the stem cell niche. In vitro, ADAMTS18 cleaves fibronectin; in vivo, Adamts18 deletion causes increased collagen deposition during puberty, which results in impaired Hippo signaling and reduced Fgfr2 expression both of which control stem cell function. Thus, Adamts18 links luminal hormone receptor signaling to basement membrane remodeling and stem cell activation.


Subject(s)
ADAMTS Proteins/metabolism , Hormones/pharmacology , Mammary Glands, Animal/cytology , Stem Cell Niche , Stem Cells/metabolism , ADAMTS Proteins/deficiency , ADAMTS Proteins/genetics , Animals , Antigens, CD/metabolism , Cell Line , Cell Self Renewal/drug effects , Epithelium/metabolism , Extracellular Matrix/drug effects , Extracellular Matrix/metabolism , Female , Fibronectins/metabolism , Glycoproteins/metabolism , Humans , Mice, Inbred C57BL , Models, Biological , RNA, Messenger/genetics , RNA, Messenger/metabolism , Receptors, Progesterone/metabolism , Regeneration/drug effects , Signal Transduction/drug effects , Stem Cell Niche/drug effects , Stem Cells/cytology , Stem Cells/drug effects , Transcription, Genetic/drug effects
8.
Nucleic Acids Res ; 48(D1): D65-D69, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31680159

ABSTRACT

The Eukaryotic Promoter Database (EPD), available online at https://epd.epfl.ch, provides accurate transcription start site (TSS) information for promoters of 15 model organisms plus corresponding functional genomics data that can be viewed in a genome browser, queried or analyzed via web interfaces, or exported in standard formats (FASTA, BED, CSV) for subsequent analysis with other tools. Recent work has focused on the improvement of the EPD promoter viewers, which use the UCSC Genome Browser as visualization platform. Thousands of high-resolution tracks for CAGE, ChIP-seq and similar data have been generated and organized into public track hubs. Customized, reproducible promoter views, combining EPD-supplied tracks with native UCSC Genome Browser tracks, can be accessed from the organism summary pages or from individual promoter entries. Moreover, thanks to recent improvements and stabilization of ncRNA gene catalogs, we were able to release promoter collections for certain classes of ncRNAs from human and mouse. Furthermore, we developed automatic computational protocols to assign orphan TSS peaks to downstream genes based on paired-end (RAMPAGE) TSS mapping data, which enabled us to add nearly 9000 new entries to the human promoter collection. Since our last article in this journal, EPD was extended to five more model organisms: rhesus monkey, rat, dog, chicken and Plasmodium falciparum.


Subject(s)
Computational Biology/methods , Databases, Nucleic Acid , Eukaryotic Cells/metabolism , Genomics/methods , Promoter Regions, Genetic , RNA, Untranslated , Animals , Humans , Software , Web Browser
9.
Sci Rep ; 9(1): 18464, 2019 12 05.
Article in English | MEDLINE | ID: mdl-31804560

ABSTRACT

Parkinson disease (PD) is characterized by a pivotal progressive loss of substantia nigra dopaminergic neurons and aggregation of α-synuclein protein encoded by the SNCA gene. Genome-wide association studies identified almost 100 sequence variants linked to PD in SNCA. However, the consequences of this genetic variability are rather unclear. Herein, our analysis of selected single nucleotide polymorphisms (SNPs) that are highly associated with PD susceptibility revealed that several SNP sites attribute to the nucleosomes and overlay with bivalent regions poised to adopt either active or repressed chromatin states. We also identified a large number of transcription factor (TF) binding sites associated with these variants. In addition, we located two docking sites in the intron-1 methylation-prone region of SNCA which are required for the putative interactions with DNMT1. Taken together, our analysis reflects an additional layer of epigenomic contribution to the regulation of the SNCA gene in PD.


Subject(s)
Epigenesis, Genetic , Parkinson Disease/genetics , alpha-Synuclein/genetics , Binding Sites/genetics , Chromatin/metabolism , DNA (Cytosine-5-)-Methyltransferase 1/metabolism , DNA Methylation , Datasets as Topic , Dopaminergic Neurons/metabolism , Dopaminergic Neurons/pathology , Genetic Predisposition to Disease , Genome-Wide Association Study , Histones/metabolism , Humans , Introns/genetics , Nucleosomes/metabolism , Parkinson Disease/pathology , Polymorphism, Single Nucleotide , Protein Binding/genetics , Substantia Nigra/cytology , Substantia Nigra/metabolism , Substantia Nigra/pathology , alpha-Synuclein/metabolism
10.
Nat Struct Mol Biol ; 26(8): 744-754, 2019 08.
Article in English | MEDLINE | ID: mdl-31384063

ABSTRACT

Precise nucleosome organization at eukaryotic promoters is thought to be generated by multiple chromatin remodeler (CR) enzymes and to affect transcription initiation. Using an integrated analysis of chromatin remodeler binding and nucleosome occupancy following rapid remodeler depletion, we investigated the interplay between these enzymes and their impact on transcription in yeast. We show that many promoters are affected by multiple CRs that operate in concert or in opposition to position the key transcription start site (TSS)-associated +1 nucleosome. We also show that nucleosome movement after CR inactivation usually results from the activity of another CR and that in the absence of any remodeling activity, +1 nucleosomes largely maintain their positions. Finally, we present functional assays suggesting that +1 nucleosome positioning often reflects a trade-off between maximizing RNA polymerase recruitment and minimizing transcription initiation at incorrect sites. Our results provide a detailed picture of fundamental mechanisms linking promoter nucleosome architecture to transcription initiation.


Subject(s)
Chromatin Assembly and Disassembly/physiology , Saccharomyces cerevisiae/genetics , Transcription Initiation Site , Transcription Initiation, Genetic/physiology , Chromatin Assembly and Disassembly/genetics , DNA, Fungal/genetics , DNA, Intergenic/genetics , DNA, Intergenic/metabolism , Macromolecular Substances/metabolism , Micrococcal Nuclease/metabolism , Nucleosomes/metabolism , Saccharomyces cerevisiae/enzymology , Saccharomyces cerevisiae Proteins/metabolism
11.
Bioinformatics ; 35(21): 4440-4441, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31116370

ABSTRACT

SUMMARY: We present SPar-K (Signal Partitioning with K-means), a method to search for archetypical chromatin architectures by partitioning a set of genomic regions characterized by chromatin signal profiles around ChIP-seq peaks and other kinds of functional sites. This method efficiently deals with problems of data heterogeneity, limited misalignment of anchor points and unknown orientation of asymmetric patterns. AVAILABILITY AND IMPLEMENTATION: SPar-K is a C++ program available on GitHub https://github.com/romaingroux/SPar-K and Docker Hub https://hub.docker.com/r/rgroux/spar-k. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Chromatin , Chromatin Immunoprecipitation , Genome , Genomics
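
The SPar-K abstract above mentions two complications when comparing signal profiles: small misalignments of anchor points and unknown orientation of asymmetric patterns. The sketch below shows one simple way a comparison step could tolerate both, by trying a few shifts and both orientations and keeping the best correlation. It is not SPar-K's actual algorithm, which the abstract does not detail. The function names, the `max_shift` parameter, and the toy profiles are all assumptions, and both profiles are assumed to have the same length.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is flat."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def align_to_reference(profile, reference, max_shift=1):
    """Find the shift and orientation of `profile` that best match `reference`.

    Returns (distance, shift, flipped), where distance = 1 - correlation
    between a window of the profile and the central part of the reference.
    """
    width = len(reference) - 2 * max_shift
    core = reference[max_shift:max_shift + width]
    best = None
    for flipped in (False, True):
        oriented = profile[::-1] if flipped else profile
        for shift in range(2 * max_shift + 1):
            window = oriented[shift:shift + width]
            distance = 1.0 - pearson(window, core)
            if best is None or distance < best[0]:
                best = (distance, shift, flipped)
    return best


# Toy data: REGION is REFERENCE reversed and displaced by one bin.
REFERENCE = [0, 1, 2, 5, 9, 4, 1, 0, 0]
REGION = [0, 0, 0, 1, 4, 9, 5, 2, 1]

if __name__ == "__main__":
    print(align_to_reference(REGION, REFERENCE))
```

In a K-means style loop, a distance of this kind would replace the plain Euclidean distance when assigning each region to its nearest cluster reference. The winning shift and orientation would then be reused when updating that reference.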
12.
PLoS One ; 13(11): e0206823, 2018.
Article in English | MEDLINE | ID: mdl-30418981

ABSTRACT

Regulation of mRNA stability by RNA-protein interactions contributes significantly to quantitative aspects of gene expression. We have identified potential mRNA targets of the AU-rich element binding protein AUF1. Myc-tagged AUF1 p42 was induced in mouse NIH/3T3 cells and RNA-protein complexes isolated using anti-myc tag antibody beads. Bound mRNAs were analyzed with Affymetrix microarrays. We have identified 508 potential target mRNAs that were at least 3-fold enriched compared to control cells without myc-AUF1. 22.3% of the enriched mRNAs had an AU-rich cluster in the ARED Organism database, against 16.3% of non-enriched control mRNAs. The enrichment towards AU-rich elements was also visible by AREScore, with an average value of 5.2 in the enriched mRNAs versus 4.2 in the control group. Yet, numerous mRNAs were enriched without a high ARE score. The enrichment of tetrameric and pentameric sequences suggests a broad AUF1 p42-binding spectrum at short U-rich sequences flanked by A or G. Still, some enriched mRNAs were highly unstable, such as those of TNFSF11 (known as RANKL), KLF10, HES1, CCNT2, SMAD6, and BCL6. We have mapped some of the instability determinants. HES1 mRNA appeared to have a coding region determinant. Detailed analysis of the RANKL and BCL6 3'UTR revealed for both that full instability required two elements, which are conserved in evolution. In RANKL mRNA both elements are AU-rich and separated by 30 bases, while in BCL6 mRNA one is AU-rich and 60 bases from a non-AU-rich element that potentially forms a stem-loop structure.


Subject(s)
Heterogeneous-Nuclear Ribonucleoprotein D/metabolism , Proto-Oncogene Proteins c-bcl-6/genetics , RANK Ligand/genetics , RNA Stability/genetics , 3' Untranslated Regions/genetics , AU Rich Elements/genetics , Animals , Binding Sites/genetics , HEK293 Cells , Heterogeneous Nuclear Ribonucleoprotein D0 , Heterogeneous-Nuclear Ribonucleoprotein D/genetics , Humans , Mice , NIH 3T3 Cells , Oligonucleotide Array Sequence Analysis , Protein Isoforms/genetics , Protein Isoforms/metabolism , Proto-Oncogene Proteins c-bcl-6/metabolism , RANK Ligand/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism
13.
PeerJ ; 6: e5362, 2018.
Article in English | MEDLINE | ID: mdl-30083469

ABSTRACT

To detect functional somatic mutations in tumor samples, whole-exome sequencing (WES) is often used for its reliability and relatively low cost. RNA-seq, while generally used to measure gene expression, can potentially also be used for identification of somatic mutations. However, there has been little systematic evaluation of the utility of RNA-seq for identifying somatic mutations. Here, we develop and evaluate a pipeline for processing RNA-seq data from glioblastoma multiforme (GBM) tumors in order to identify somatic mutations. The pipeline entails the use of the STAR aligner 2-pass procedure jointly with MuTect2 from the Genome Analysis Toolkit (GATK) to detect somatic variants. Variants identified from RNA-seq data were evaluated by comparison against the COSMIC and dbSNP databases, and also compared to somatic variants identified by exome sequencing. We also estimated the putative functional impact of coding variants in the most frequently mutated genes in GBM. Interestingly, variants identified by RNA-seq alone showed better representation of GBM-related mutations cataloged by COSMIC. RNA-seq-only data substantially outperformed the ability of WES to reveal potentially new somatic mutations in known GBM-related pathways, and allowed us to build a high-quality set of somatic mutations common to exome and RNA-seq calls. Using RNA-seq data in parallel with WES data to detect somatic mutations in cancer genomes can thus broaden the scope of discoveries and lend additional support to somatic variants identified by exome sequencing alone.

14.
Bioinformatics ; 34(14): 2483-2484, 2018 07 15.
Article in English | MEDLINE | ID: mdl-29514181

ABSTRACT

Summary: Transcription factors regulate gene expression by binding to specific short DNA sequences of 5-20 bp to regulate the rate of transcription of genetic information from DNA to messenger RNA. We present PWMScan, a fast web-based tool to scan server-resident genomes for matches to a user-supplied PWM or transcription factor binding site model from a public database. Availability and implementation: The web server and source code are available at http://ccg.vital-it.ch/pwmscan and https://sourceforge.net/projects/pwmscan, respectively. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics/methods , Position-Specific Scoring Matrices , Regulatory Sequences, Nucleic Acid , Software , Transcription Factors/metabolism , DNA/metabolism , Humans , Protein Binding
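
The core operation behind scanning a genome for matches to a PWM, as the PWMScan abstract above describes, can be sketched in a few lines. What follows is a generic log-odds scan over both strands, not PWMScan's implementation. The toy count matrix `TATA_COUNTS`, the pseudocount, the uniform background, and the threshold are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical count matrix for a 4 bp TATA-like motif (one dict per position).
TATA_COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]


def counts_to_log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds weights."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2((column[b] + pseudocount) / total / background)
            for b in BASES
        })
    return pwm


def reverse_complement(sequence):
    return sequence.translate(COMPLEMENT)[::-1]


def scan(pwm, sequence, threshold):
    """Return (start, strand, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            score = sum(column[base] for column, base in zip(pwm, site))
            if score >= threshold:
                hits.append((start, strand, score))
    return hits


if __name__ == "__main__":
    pwm = counts_to_log_odds(TATA_COUNTS)
    for hit in scan(pwm, "GGCTATAAAGG", threshold=5.0):
        print(hit)
```

Because the toy motif is its own reverse complement, the single TATA site in the example sequence is reported once per strand. A genome-scale tool would use far more efficient matching than this per-window loop.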
15.
Nucleic Acids Res ; 46(D1): D175-D180, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29069466

ABSTRACT

The Mass Genome Annotation (MGA) repository is a resource designed to store published next generation sequencing data and other genome annotation data (such as gene start sites, SNPs, etc.) in a completely standardised format. Each sample has undergone local processing in order to meet the strict MGA format requirements. The original data source, the reformatting procedure and the biological characteristics of the samples are described in an accompanying documentation file manually edited by data curators. Ten model organisms are currently represented: Homo sapiens, Mus musculus, Danio rerio, Drosophila melanogaster, Apis mellifera, Caenorhabditis elegans, Arabidopsis thaliana, Zea mays, Saccharomyces cerevisiae and Schizosaccharomyces pombe. As of today, the resource contains over 24 000 samples. In conjunction with other tools developed by our group (the ChIP-Seq and SSA servers), it allows users to carry out a great variety of analysis tasks with MGA samples, such as making aggregation plots and heat maps for selected genomic regions, finding peak regions, generating custom tracks for visualizing genomic features in a UCSC genome browser window, or downloading chromatin data in a table format suitable for local processing with more advanced statistical analysis software such as R. Home page: http://ccg.vital-it.ch/mga/.


Subject(s)
Databases, Nucleic Acid , Animals , Chromatin Immunoprecipitation , Data Curation , High-Throughput Nucleotide Sequencing , Humans , Internet , Molecular Sequence Annotation , Search Engine
16.
Nat Methods ; 14(3): 316-322, 2017 03.
Article in English | MEDLINE | ID: mdl-28092692

ABSTRACT

Resolving the DNA-binding specificities of transcription factors (TFs) is of critical value for understanding gene regulation. Here, we present a novel, semiautomated protein-DNA interaction characterization technology, selective microfluidics-based ligand enrichment followed by sequencing (SMiLE-seq). SMiLE-seq is neither limited by DNA bait length nor biased toward strong affinity binders; it probes the DNA-binding properties of TFs over a wide affinity range in a fast and cost-effective fashion. We validated SMiLE-seq by analyzing 58 full-length human, mouse, and Drosophila TFs from distinct structural classes. All tested TFs yielded DNA-binding models with predictive power comparable to or greater than that of other in vitro assays. De novo motif discovery on all JUN-FOS heterodimers and several nuclear receptor-TF complexes provided novel insights into partner-specific heterodimer DNA-binding preferences. We also successfully analyzed the DNA-binding properties of uncharacterized human C2H2 zinc-finger proteins and validated several using ChIP-exo.


Subject(s)
CYS2-HIS2 Zinc Fingers/physiology , DNA-Binding Proteins/metabolism , DNA/metabolism , JNK Mitogen-Activated Protein Kinases/metabolism , Proto-Oncogene Proteins c-fos/metabolism , Transcription Factors/metabolism , Animals , Binding Sites/genetics , Computational Biology , Drosophila/genetics , Gene Expression Regulation , High-Throughput Nucleotide Sequencing/methods , Humans , JNK Mitogen-Activated Protein Kinases/genetics , Mice , Microfluidics/methods , Proto-Oncogene Proteins c-fos/genetics , Sequence Analysis, DNA/methods
17.
Nucleic Acids Res ; 45(D1): D139-D144, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899579

ABSTRACT

SNP2TFBS is a computational resource intended to support researchers investigating the molecular mechanisms underlying regulatory variation in the human genome. The database essentially consists of a collection of text files providing specific annotations for human single nucleotide polymorphisms (SNPs), namely whether they are predicted to abolish, create or change the affinity of one or several transcription factor (TF) binding sites. A SNP's effect on TF binding is estimated based on a position weight matrix (PWM) model for the binding specificity of the corresponding factor. These data files are regenerated at regular intervals by an automatic procedure that takes as input a reference genome, a comprehensive SNP catalogue and a collection of PWMs. SNP2TFBS is also accessible over a web interface, enabling users to view the information provided for an individual SNP, to extract SNPs based on various search criteria, to annotate uploaded sets of SNPs or to display statistics about the frequencies of binding sites affected by selected SNPs. Homepage: http://ccg.vital-it.ch/snp2tfbs/.


Subject(s)
Binding Sites , Computational Biology/methods , Databases, Nucleic Acid , Polymorphism, Single Nucleotide , Transcription Factors , Algorithms , Genome, Human , Genomics/methods , Humans , Protein Binding , Transcription Factors/metabolism , Web Browser
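
The SNP2TFBS abstract above says a SNP's effect on TF binding is estimated from a PWM. One plausible way to do that is to compare the best PWM score over windows covering the SNP for the reference and the alternate allele, as sketched below. This is not SNP2TFBS's actual procedure. The integer weight matrix `TOY_PWM`, the score `threshold`, and the labels returned by `classify` are hypothetical. Only the forward strand is scored, and the sequence is assumed to be at least as long as the PWM.

```python
# Hypothetical integer weights for a TATA-like motif: +2 match, -1 mismatch.
TOY_PWM = [
    {"A": -1, "C": -1, "G": -1, "T": 2},
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": -1, "T": 2},
    {"A": 2, "C": -1, "G": -1, "T": -1},
]


def best_overlapping_score(pwm, sequence, snp_index):
    """Best forward-strand PWM score among windows that cover snp_index."""
    width = len(pwm)
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(sequence) - width)
    return max(
        sum(column[base] for column, base in zip(pwm, sequence[s:s + width]))
        for s in range(first, last + 1)
    )


def snp_effect(pwm, ref_sequence, snp_index, alt_base):
    """Return (ref_score, alt_score, alt_score - ref_score) for one SNP."""
    alt_sequence = ref_sequence[:snp_index] + alt_base + ref_sequence[snp_index + 1:]
    ref = best_overlapping_score(pwm, ref_sequence, snp_index)
    alt = best_overlapping_score(pwm, alt_sequence, snp_index)
    return ref, alt, alt - ref


def classify(ref_score, alt_score, threshold):
    """Label the predicted effect relative to an assumed score threshold."""
    if ref_score >= threshold and alt_score < threshold:
        return "abolish"
    if ref_score < threshold and alt_score >= threshold:
        return "create"
    if ref_score >= threshold and ref_score != alt_score:
        return "change"
    return "none"


if __name__ == "__main__":
    ref, alt, delta = snp_effect(TOY_PWM, "GGCTATAAAGG", snp_index=4, alt_base="G")
    print(ref, alt, delta, classify(ref, alt, threshold=6))
```

In this toy case the A-to-G substitution at index 4 disrupts the only full TATA match, so the best score falls below the assumed threshold and the SNP is labelled as abolishing the site.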
18.
Nucleic Acids Res ; 45(D1): D51-D55, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899657

ABSTRACT

We present an update of the Eukaryotic Promoter Database EPD (http://epd.vital-it.ch), more specifically on the EPDnew division, which contains comprehensive organism-specific transcription start site (TSS) collections automatically derived from next generation sequencing (NGS) data. Thanks to the abundant release of new high-throughput transcript mapping data (CAGE, TSS-seq, GRO-cap), the database could be extended to plant and fungal species. We further report on the expansion of the mass genome annotation (MGA) repository containing promoter-relevant chromatin profiling data and on improvements for the EPD entry viewers. Finally, we present a new data access tool, ChIP-Extract, which enables computational biologists to extract diverse types of promoter-associated data in numerical table formats that are readily imported into statistical analysis platforms such as R.


Subject(s)
Databases, Nucleic Acid , Promoter Regions, Genetic , Animals , Eukaryota/genetics , Fungi/genetics , Humans , Plants/genetics , Transcription Initiation Site
19.
BMC Genomics ; 17(1): 938, 2016 11 18.
Article in English | MEDLINE | ID: mdl-27863463

ABSTRACT

BACKGROUND: ChIP-seq and related high-throughput chromatin profiling assays generate ever increasing volumes of highly valuable biological data. To make sense out of it, biologists need versatile, efficient and user-friendly tools for access, visualization and integrative analysis of such data. RESULTS: Here we present the ChIP-Seq command line tools and web server, implementing basic algorithms for ChIP-seq data analysis starting with a read alignment file. The tools are optimized for memory-efficiency and speed, thus allowing for processing of large data volumes on inexpensive hardware. The web interface provides access to a large database of public data. The ChIP-Seq tools have a modular and interoperable design in that the output from one application can serve as input to another one. Complex and innovative tasks can thus be achieved by running several tools in a cascade. CONCLUSIONS: The various ChIP-Seq command line tools and web services either complement or compare favorably to related bioinformatics resources in terms of computational efficiency, ease of access to public data and interoperability with other web-based tools. The ChIP-Seq server is accessible at http://ccg.vital-it.ch/chipseq/.


Subject(s)
Chromatin Immunoprecipitation , Computational Biology/methods , Genomics/methods , High-Throughput Nucleotide Sequencing , Software , Web Browser , Molecular Sequence Annotation , User-Computer Interface
20.
PLoS Comput Biol ; 12(10): e1005144, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27716823

ABSTRACT

The recruitment of RNA-Pol-II to the transcription start site (TSS) is an important step in gene regulation in all organisms. Core promoter elements (CPE) are conserved sequence motifs that guide Pol-II to the TSS by interacting with specific transcription factors (TFs). However, only a minority of animal promoters contains CPEs. It is still unknown how Pol-II selects the TSS in their absence. Here we present a comparative analysis of promoters' sequence composition and chromatin architecture in five eukaryotic model organisms, which shows the presence of common and unique DNA-encoded features used to organize chromatin. Analysis of Pol-II initiation patterns uncovers that, in the absence of certain CPEs, there is a strong correlation between the spread of initiation and the intensity of the 10 bp periodic signal in the nearest downstream nucleosome. Moreover, promoters' primary and secondary initiation sites show a characteristic 10 bp periodicity in the absence of CPEs. We also show that DNA natural variants in the region immediately downstream of the TSS are able to affect both the nucleosome-DNA affinity and Pol-II initiation pattern. These findings support the notion that, in addition to CPE-mediated selection, sequence-induced nucleosome positioning could be a common and conserved mechanism of TSS selection in animals.


Subject(s)
DNA/genetics , Nucleosomes/genetics , Promoter Regions, Genetic/genetics , RNA Polymerase II/genetics , Transcription Initiation Site/physiology , Transcription, Genetic/genetics , Base Sequence , Binding Sites , Computer Simulation , Models, Genetic , Molecular Sequence Data , Transcriptional Activation/genetics
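
The abstract above refers to the intensity of a 10 bp periodic signal in DNA sequence. One generic way to quantify such periodicity is the autocorrelation of a dinucleotide indicator track at lag 10, sketched below. The abstract does not say how the authors measured the signal, so the use of A/T-only ("WW") dinucleotides, the autocorrelation estimator, and the synthetic sequence are assumptions for illustration only.

```python
def ww_indicator(sequence):
    """1 where a dinucleotide consists only of A/T (a 'WW' step), else 0."""
    return [
        1 if sequence[i] in "AT" and sequence[i + 1] in "AT" else 0
        for i in range(len(sequence) - 1)
    ]


def autocorrelation(signal, lag):
    """Normalized autocorrelation of a numeric signal at the given lag."""
    mean = sum(signal) / len(signal)
    variance = sum((x - mean) ** 2 for x in signal) / len(signal)
    pairs = len(signal) - lag
    if variance == 0 or pairs <= 0:
        return 0.0
    covariance = sum(
        (signal[i] - mean) * (signal[i + lag] - mean) for i in range(pairs)
    ) / pairs
    return covariance / variance


# Synthetic sequence with an AA step every 10 bp (illustrative, not real data).
TOY_SEQUENCE = "AAGCGCGCGC" * 8

if __name__ == "__main__":
    track = ww_indicator(TOY_SEQUENCE)
    for lag in (5, 10):
        print(lag, round(autocorrelation(track, lag), 3))
```

On the synthetic sequence the autocorrelation is strongly positive at lag 10 and slightly negative at lag 5, which is the signature a 10 bp periodic pattern would be expected to produce. Real promoter sequences would give a much weaker signal and would normally be averaged over many promoters.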