Results 1 - 20 of 60
1.
Nucleic Acids Res ; 52(D1): D154-D163, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37971293

ABSTRACT

We present a major update of the HOCOMOCO collection that provides DNA binding specificity patterns of 949 human transcription factors and 720 mouse orthologs. To make this release, we performed motif discovery in peak sets that originated from 14 183 ChIP-Seq experiments and reads from 2554 HT-SELEX experiments yielding more than 400 thousand candidate motifs. The candidate motifs were annotated according to their similarity to known motifs and the hierarchy of DNA-binding domains of the respective transcription factors. Next, the motifs underwent human expert curation to stratify distinct motif subtypes and remove non-informative patterns and common artifacts. Finally, the curated subset of 100 thousand motifs was supplied to the automated benchmarking to select the best-performing motifs for each transcription factor. The resulting HOCOMOCO v12 core collection contains 1443 verified position weight matrices, including distinct subtypes of DNA binding motifs for particular transcription factors. In addition to the core collection, HOCOMOCO v12 provides motif sets optimized for the recognition of binding sites in vivo and in vitro, and for annotation of regulatory sequence variants. HOCOMOCO is available at https://hocomoco12.autosome.org and https://hocomoco.autosome.org.


Subject(s)
Databases, Genetic; Gene Expression Regulation; Protein Interaction Domains and Motifs; Transcription Factors; Animals; Humans; Mice; Binding Sites/genetics; Nucleotide Motifs; Transcription Factors/genetics; Transcription Factors/metabolism; Internet; Protein Interaction Domains and Motifs/genetics
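
To illustrate what a position weight matrix such as those in the HOCOMOCO record above is used for, here is a minimal sketch of scanning a DNA sequence with a log-odds matrix. The count matrix, pseudocount, and uniform background are hypothetical values chosen for illustration and are not taken from HOCOMOCO; only the forward strand is scanned.

```python
import math

ALPHABET = "ACGT"

# Hypothetical toy count matrix (one row per motif position; columns A, C, G, T).
# It is not taken from HOCOMOCO; the consensus it encodes is AGCT.
TOY_COUNTS = [
    [8, 1, 1, 0],
    [0, 0, 10, 0],
    [0, 9, 0, 1],
    [1, 0, 0, 9],
]


def counts_to_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert counts to log2-odds weights against a uniform background."""
    pwm = []
    for row in counts:
        total = sum(row) + pseudocount * len(row)
        pwm.append([math.log2((c + pseudocount) / total / background) for c in row])
    return pwm


def best_hit(sequence, pwm):
    """Return (start, score) of the best-scoring forward-strand window,
    or None if the sequence is shorter than the motif.
    Assumes the sequence contains only the letters A, C, G, T."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][ALPHABET.index(base)] for i, base in enumerate(window))
        if best is None or score > best[1]:
            best = (start, score)
    return best
```

Calling `best_hit("TTAGCTAA", counts_to_pwm(TOY_COUNTS))` locates the window that matches the toy consensus; a practical scanner would also score the reverse complement and apply a score threshold.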
2.
Nucleic Acids Res ; 51(D1): D564-D570, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36350659

ABSTRACT

We present an update of EpiFactors, a manually curated database providing information about epigenetic regulators, their complexes, targets, and products, which is openly accessible at http://epifactors.autosome.org. The updated version of EpiFactors contains information on 902 proteins, including 101 histones and protamines, and, as a main update, a newly curated collection of 124 lncRNAs involved in epigenetic regulation. The number of publications concerning the role of lncRNAs in epigenetics is rapidly growing. Yet a resource that compiles, integrates, organizes, and presents curated information on lncRNAs in epigenetics has been missing. EpiFactors fills this gap and provides data on epigenetic regulators in an accessible and user-friendly form. For 820 of the genes in EpiFactors, we include expression estimates across multiple cell types assessed by CAGE-Seq in the FANTOM5 project. In addition, the updated EpiFactors contains information on 73 protein complexes involved in epigenetic regulation. Our resource is practical for a wide range of users, including biologists, bioinformaticians and molecular/systems biologists.


Subject(s)
Databases, Genetic; Epigenesis, Genetic; Humans; Histones/genetics; Histones/metabolism; Protamines; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism
3.
Nucleic Acids Res ; 51(12): 6087-6100, 2023 Jul 07.
Article in English | MEDLINE | ID: mdl-37140047

ABSTRACT

The Polycomb group (PcG) proteins are fundamental epigenetic regulators that control the repressive state of target genes in multicellular organisms. One of the open questions is defining the mechanisms of PcG recruitment to chromatin. In Drosophila, the crucial role in PcG recruitment is thought to belong to DNA-binding proteins associated with Polycomb response elements (PREs). However, current data suggests that not all PRE-binding factors have been identified. Here, we report the identification of the transcription factor Crooked legs (Crol) as a novel PcG recruiter. Crol is a C2H2-type Zinc Finger protein that directly binds to poly(G)-rich DNA sequences. Mutation of Crol binding sites as well as crol CRISPR/Cas9 knockout diminish the repressive activity of PREs in transgenes. Like other PRE-DNA binding proteins, Crol co-localizes with PcG proteins inside and outside of H3K27me3 domains. Crol knockout impairs the recruitment of the PRC1 subunit Polyhomeotic and the PRE-binding protein Combgap at a subset of sites. The decreased binding of PcG proteins is accompanied by dysregulated transcription of target genes. Overall, our study identified Crol as a new important player in PcG recruitment and epigenetic regulation.


Subject(s)
Drosophila Proteins; Drosophila; Transcription Factors; Animals; Chromatin/genetics; Chromatin/metabolism; DNA-Binding Proteins/genetics; Drosophila/genetics; Drosophila/metabolism; Drosophila Proteins/genetics; Drosophila Proteins/metabolism; Epigenesis, Genetic; Gene Expression Regulation, Developmental; Polycomb-Group Proteins/genetics; Polycomb-Group Proteins/metabolism; Transcription Factors/metabolism
4.
Bioinformatics ; 39(8), 2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37490428

ABSTRACT

MOTIVATION: The increasing volume of data from high-throughput experiments including parallel reporter assays facilitates the development of complex deep-learning approaches for modeling DNA regulatory grammar. RESULTS: Here, we introduce LegNet, an EfficientNetV2-inspired convolutional network for modeling short gene regulatory regions. By approaching the sequence-to-expression regression problem as a soft classification task, LegNet secured first place for the autosome.org team in the DREAM 2022 challenge of predicting gene expression from gigantic parallel reporter assays. Using published data, here, we demonstrate that LegNet outperforms existing models and accurately predicts gene expression per se as well as the effects of single-nucleotide variants. Furthermore, we show how LegNet can be used in a diffusion network manner for the rational design of promoter sequences yielding the desired expression level. AVAILABILITY AND IMPLEMENTATION: https://github.com/autosome-ru/LegNet. The GitHub repository includes Jupyter Notebook tutorials and Python scripts under the MIT license to reproduce the results presented in the study.


Subject(s)
Deep Learning; Regulatory Sequences, Nucleic Acid; DNA; Promoter Regions, Genetic; Software
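
The LegNet record above mentions treating sequence-to-expression regression as a soft classification task. The following is a generic sketch of that idea, not the LegNet implementation: a continuous target is spread over integer bins with Gaussian weights, and a predicted distribution is decoded back to a number as its expected bin index. The bin count and `sigma` are hypothetical parameters.

```python
import math


def soft_labels(value, n_bins=10, sigma=0.5):
    """Spread a continuous target over bins 0..n_bins-1 with Gaussian weights.

    Assumes `value` lies within (or close to) the bin range; far outside it,
    all weights underflow to zero and normalization would fail.
    """
    weights = [math.exp(-0.5 * ((b - value) / sigma) ** 2) for b in range(n_bins)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_value(probs):
    """Decode a distribution over bins back to a continuous prediction."""
    return sum(b * p for b, p in enumerate(probs))
```

In training, a network would output one probability per bin and be fitted to `soft_labels(target)` with a divergence or cross-entropy loss; `expected_value` then turns the predicted distribution into an expression estimate.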
5.
Nucleic Acids Res ; 50(2): 1111-1127, 2022 Jan 25.
Article in English | MEDLINE | ID: mdl-35018467

ABSTRACT

eIF4G2 (DAP5 or Nat1) is a homologue of the canonical translation initiation factor eIF4G1 in higher eukaryotes but its function remains poorly understood. Unlike eIF4G1, eIF4G2 does not interact with the cap-binding protein eIF4E and is believed to drive translation under stress when eIF4E activity is impaired. Here, we show that eIF4G2 operates under normal conditions as well and promotes scanning downstream of the eIF4G1-mediated 40S recruitment and cap-proximal scanning. Specifically, eIF4G2 facilitates leaky scanning for a subset of mRNAs. Apparently, eIF4G2 replaces eIF4G1 during scanning of 5' UTR and the necessity for eIF4G2 only arises when eIF4G1 dissociates from the scanning complex. In particular, this event can occur when the leaky scanning complexes interfere with initiating or elongating 80S ribosomes within a translated uORF. This mechanism is therefore crucial for higher eukaryotes which are known to have long 5' UTRs with highly frequent uORFs. We suggest that uORFs are not the only obstacle on the way of scanning complexes towards the main start codon, because certain eIF4G2 mRNA targets lack uORF(s). Thus, higher eukaryotes possess two distinct scanning complexes: the principal one that binds mRNA and initiates scanning, and the accessory one that rescues scanning when the former fails.


Subject(s)
Eukaryotic Initiation Factor-4G/metabolism; RNA, Messenger/metabolism; Ribosomes/metabolism; Humans; Open Reading Frames; Protein Biosynthesis
6.
Nucleic Acids Res ; 50(W1): W51-W56, 2022 Jul 05.
Article in English | MEDLINE | ID: mdl-35446421

ABSTRACT

We present ANANASTRA, https://ananastra.autosome.org, a web server for the identification and annotation of regulatory single-nucleotide polymorphisms (SNPs) with allele-specific binding events. ANANASTRA accepts a list of dbSNP IDs or a VCF file and reports allele-specific binding (ASB) sites of particular transcription factors or in specific cell types, highlighting those with ASBs significantly enriched at SNPs in the query list. ANANASTRA is built on top of a systematic analysis of allelic imbalance in ChIP-Seq experiments and performs the ASB enrichment test against background sets of SNPs found in the same source experiments as ASB sites but not displaying significant allelic imbalance. We illustrate ANANASTRA usage with selected case studies and expect that ANANASTRA will help to conduct the follow-up of GWAS in terms of establishing functional hypotheses and designing experimental verification.


Subject(s)
Polymorphism, Single Nucleotide; Transcription Factors; Alleles; Binding Sites; Genome-Wide Association Study; Protein Binding; Transcription Factors/chemistry; Transcription Factors/metabolism; DNA-Binding Proteins
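
The ANANASTRA record above describes an enrichment test of ASB sites in a query SNP list against background SNPs. The abstract does not name the statistic, so the sketch below assumes a one-sided hypergeometric (Fisher-type) test; the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(asb_in_query, non_asb_in_query, asb_total, non_asb_total):
    """One-sided hypergeometric tail P(X >= asb_in_query).

    asb_total and non_asb_total count all candidate SNPs (query plus
    background) with and without significant allelic imbalance, so the
    query counts must not exceed them.
    """
    n_query = asb_in_query + non_asb_in_query
    n_total = asb_total + non_asb_total
    denominator = comb(n_total, n_query)
    upper = min(n_query, asb_total)
    tail = sum(
        comb(asb_total, k) * comb(non_asb_total, n_query - k)
        for k in range(asb_in_query, upper + 1)
    )
    return tail / denominator
```

A small p-value indicates that the query list contains more ASB SNPs than expected if query SNPs were drawn at random from the candidate pool; testing many transcription factors or cell types would additionally call for multiple-testing correction.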
7.
Int J Mol Sci ; 25(3), 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38339016

ABSTRACT

Y-box-binding proteins (YB proteins) are multifunctional DNA- and RNA-binding proteins that play an important role in the regulation of gene expression. The high homology of their cold shock domains and the similarity between their long, unstructured C-terminal domains suggest that Y-box-binding proteins may have similar functions in a cell. Here, we consider the functional interchangeability of the somatic YB proteins YB-1 and YB-3. RNA-seq and Ribo-seq are used to track changes in the mRNA abundance or mRNA translation in HEK293T cells solely expressing YB-1, YB-3, or neither of them. We show that YB proteins have a dual effect on translation. Although the expression of YB proteins stimulates global translation, YB-1 and YB-3 inhibit the translation of their direct CLIP-identified mRNA targets. The impact of YB-1 and YB-3 on the translation of their mRNA targets is similar, which suggests that they can substitute each other in inhibiting the translation of their mRNA targets in HEK293T cells.


Subject(s)
DNA-Binding Proteins; Protein Biosynthesis; Humans; HEK293 Cells; RNA, Messenger/genetics; RNA, Messenger/metabolism; DNA-Binding Proteins/metabolism; Y-Box-Binding Protein 1/genetics; Y-Box-Binding Protein 1/metabolism
8.
Genome Res ; 30(7): 1060-1072, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32718982

ABSTRACT

Long noncoding RNAs (lncRNAs) constitute the majority of transcripts in the mammalian genomes, and yet, their functions remain largely unknown. As part of the FANTOM6 project, we systematically knocked down the expression of 285 lncRNAs in human dermal fibroblasts and quantified cellular growth, morphological changes, and transcriptomic responses using Capped Analysis of Gene Expression (CAGE). Antisense oligonucleotides targeting the same lncRNAs exhibited global concordance, and the molecular phenotype, measured by CAGE, recapitulated the observed cellular phenotypes while providing additional insights on the affected genes and pathways. Here, we disseminate the largest-to-date lncRNA knockdown data set with molecular phenotyping (over 1000 CAGE deep-sequencing libraries) for further exploration and highlight functional roles for ZNF213-AS1 and lnc-KHDC3L-2.


Subject(s)
RNA, Long Noncoding/physiology; Cell Growth Processes/genetics; Cell Movement/genetics; Fibroblasts/cytology; Fibroblasts/metabolism; Humans; KCNQ Potassium Channels/metabolism; Molecular Sequence Annotation; Oligonucleotides, Antisense; RNA, Long Noncoding/antagonists & inhibitors; RNA, Long Noncoding/metabolism; RNA, Small Interfering
9.
RNA ; 2021 May 20.
Article in English | MEDLINE | ID: mdl-34016706

ABSTRACT

Non-coding RNAs play a crucial role in various cellular processes in living organisms, and RNA functions heavily depend on molecule structures composed of stems, loops, and various tertiary motifs. Among those, the most frequent are A-minor interactions, which are often involved in the formation of more complex motifs such as kink-turns and pseudoknots. We present a novel classification of A-minors in terms of RNA secondary structure where each nucleotide of an A-minor is attributed to the stem or loop, and each pair of nucleotides is attributed to their relative position within the secondary structure. By analyzing classes of A-minors in known RNA structures, we found that the largest classes are mostly homogeneous and preferably localize with known A-minor co-motifs, e.g. tetraloop-tetraloop receptor and coaxial stacking. Detailed analysis of local A-minors within internal loops revealed a novel recurrent RNA tertiary motif, the across-bulged motif. Interestingly, the motif resembles the previously known GAAA/11nt motif but with the local adenines performing the role of the GAAA-tetraloop. By using machine learning, we show that particular classes of local A-minors can be predicted from sequence and secondary structure. The proposed classification is the first step toward automatic annotation of not only A-minors and their co-motifs but various types of RNA tertiary motifs as well.

10.
Nucleic Acids Res ; 49(D1): D104-D111, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33231677

ABSTRACT

The Gene Transcription Regulation Database (GTRD; http://gtrd.biouml.org/) contains uniformly annotated and processed NGS data related to gene transcription regulation: ChIP-seq, ChIP-exo, DNase-seq, MNase-seq, ATAC-seq and RNA-seq. With the latest release, the database has reached a new level of data integration. All cell types (cell lines and tissues) presented in the GTRD were arranged into a dictionary and linked with different ontologies (BRENDA, Cell Ontology, Uberon, Cellosaurus and Experimental Factor Ontology) and with related experiments in specialized databases on transcription regulation (FANTOM5, ENCODE and GTEx). The updated version of the GTRD provides an integrated view of transcription regulation through a dedicated web interface with advanced browsing and search capabilities, an integrated genome browser, and table reports by cell types, transcription factors, and genes of interest.


Subject(s)
Databases, Genetic; Gene Expression Regulation; Genome; Transcription Factors/genetics; Transcription, Genetic; Animals; Cell Line; Drosophila melanogaster/genetics; Drosophila melanogaster/metabolism; Gene Ontology; Humans; Internet; Mice; Molecular Sequence Annotation; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Software; Transcription Factors/classification; Transcription Factors/metabolism
11.
Nucleic Acids Res ; 49(19): 11134-11144, 2021 Nov 08.
Article in English | MEDLINE | ID: mdl-34606617

ABSTRACT

The Saccharomyces cerevisiae gene deletion collection is widely used for functional gene annotation and genetic interaction analyses. However, the standard G418-resistance cassette used to produce knockout mutants delivers strong regulatory elements into the target genetic loci. To date, its side effects on the expression of neighboring genes have never been systematically assessed. Here, using ribosome profiling data, RT-qPCR, and reporter expression, we investigated perturbations induced by the KanMX module. Our analysis revealed significant alterations in the transcription efficiency of neighboring genes and, more importantly, severe impairment of their mRNA translation, leading to changes in protein abundance. In the 'head-to-head' orientation of the deleted and neighboring genes, knockout often led to a shift of the transcription start site of the latter, introducing new uAUG codon(s) into the expanded 5' untranslated region (5' UTR). In the 'tail-to-tail' arrangement, knockout led to activation of alternative polyadenylation signals in the neighboring gene, thus altering its 3' UTR. These events may explain the so-called neighboring gene effect (NGE), i.e. false genetic interactions of the deleted genes. We estimate that in as much as ∼1/5 of knockout strains the expression of neighboring genes may be substantially (>2-fold) deregulated at the level of translation.


Subject(s)
Genetic Loci/drug effects; Gentamicins/pharmacology; Protein Biosynthesis/drug effects; Saccharomyces cerevisiae/drug effects; Sequence Deletion; Transcription, Genetic/drug effects; 3' Untranslated Regions; 5' Untranslated Regions; Base Sequence; Codon; Gene Expression Regulation, Fungal; Gene Knockout Techniques/methods; Genes, Reporter; Green Fluorescent Proteins/genetics; Green Fluorescent Proteins/metabolism; Open Reading Frames; Ribosomes/drug effects; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Transcription Initiation Site
12.
Proc Natl Acad Sci U S A ; 117(27): 15581-15590, 2020 Jul 07.
Article in English | MEDLINE | ID: mdl-32576685

ABSTRACT

Protein synthesis represents a major metabolic activity of the cell. However, how it is affected by aging and how this in turn impacts cell function remains largely unexplored. To address this question, herein we characterized age-related changes in both the transcriptome and translatome of mouse tissues over the entire life span. We showed that the transcriptome changes govern those in the translatome and are associated with altered expression of genes involved in inflammation, extracellular matrix, and lipid metabolism. We also identified genes that may serve as candidate biomarkers of aging. At the translational level, we uncovered sustained down-regulation of a set of 5'-terminal oligopyrimidine (5'-TOP) transcripts encoding protein synthesis and ribosome biogenesis machinery and regulated by the mTOR pathway. For many of them, ribosome occupancy dropped twofold or even more. Moreover, with age, ribosome coverage gradually decreased in the vicinity of start codons and increased near stop codons, revealing complex age-related changes in the translation process. Taken together, our results reveal systematic and multidimensional deregulation of protein synthesis, showing how this major cellular process declines with age.


Subject(s)
Aging/physiology; Gene Expression Regulation/physiology; Protein Biosynthesis/physiology; Ribosomes/metabolism; Animals; Codon, Initiator/metabolism; Computational Biology; Male; Mice; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA-Seq; Ribosomes/genetics; Signal Transduction/physiology; TOR Serine-Threonine Kinases/metabolism; Transcriptome/physiology
13.
Int J Mol Sci ; 24(9), 2023 May 06.
Article in English | MEDLINE | ID: mdl-37176068

ABSTRACT

While protein synthesis is vital for the majority of cell types of the human body, diversely differentiated cells require specific translation regulation. This suggests the specialization of translation machinery across tissues and organs. Using transcriptomic data from GTEx, FANTOM, and Gene Atlas, we systematically explored the abundance of transcripts encoding translation factors and aminoacyl-tRNA synthetases (ARSases) in human tissues. We revised a few known and identified several novel translation-related genes exhibiting strict tissue-specific expression. The proteins they encode include eEF1A1, eEF1A2, PABPC1L, PABPC3, eIF1B, eIF4E1B, eIF4ENIF1, and eIF5AL1. Furthermore, our analysis revealed a pervasive tissue-specific relative abundance of translation machinery components (e.g., PABP and eRF3 paralogs, eIF2B and eIF3 subunits, eIF5MPs, and some ARSases), suggesting presumptive variance in the composition of translation initiation, elongation, and termination complexes. These conclusions were largely confirmed by the analysis of proteomic data. Finally, we paid attention to sexual dimorphism in the repertoire of translation factors encoded in sex chromosomes (eIF1A, eIF2γ, and DDX3), and identified the testis and brain as organs with the most diverged expression of translation-associated genes.


Subject(s)
Amino Acyl-tRNA Synthetases; Proteomics; Humans; Peptide Initiation Factors; Peptide Elongation Factor 1
14.
Int J Mol Sci ; 24(18), 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37762093

ABSTRACT

Single-nucleotide polymorphism rs71327024 located in the human 3p21.31 locus has been associated with an elevated risk of hospitalization upon SARS-CoV-2 infection. The 3p21.31 locus contains several genes encoding chemokine receptors potentially relevant to severe COVID-19. In particular, CXCR6, which is prominently expressed in T lymphocytes, NK, and NKT cells, has been shown to be involved in the recruitment of immune cells to non-lymphoid organs in chronic inflammatory and respiratory diseases. In COVID-19, CXCR6 expression is reduced in lung resident memory T cells from patients with severe disease as compared to the control cohort with moderate symptoms. We demonstrate here that rs71327024 is located within an active enhancer that augments the activity of the CXCR6 promoter in human CD4+ T lymphocytes. The common rs71327024(G) variant makes a functional binding site for the c-Myb transcription factor, while the risk rs71327024(T) variant disrupts c-Myb binding and reduces the enhancer activity. Concordantly, c-Myb knockdown in PMA-treated Jurkat cells negates rs71327024's allele-specific effect on CXCR6 promoter activity. We conclude that a disrupted c-Myb binding site may decrease CXCR6 expression in T helper cells of individuals carrying the minor rs71327024(T) allele and thus may promote the progression of severe COVID-19 and other inflammatory pathologies.


Subject(s)
COVID-19; Humans; COVID-19/genetics; Hospitalization; Promoter Regions, Genetic; Receptors, CXCR6/genetics; SARS-CoV-2; T-Lymphocytes, Helper-Inducer
15.
Biochemistry (Mosc) ; 87(Suppl 1): S48-S167, 2022 Jan.
Article in English | MEDLINE | ID: mdl-35501986

ABSTRACT

YB proteins are DNA/RNA-binding proteins, members of the family of proteins with a cold shock domain. The role of YB proteins in the life of cells, tissues, and whole organisms is extremely important. They are involved in transcription regulation, pre-mRNA splicing, mRNA translation and stability, mRNA packaging into mRNPs, including stress granules, DNA repair, and many other cellular events. Many processes, from embryonic development to aging, depend on when and how much of these proteins have been synthesized. Here we discuss the regulation of the levels of YB-1 and, in part, of its homologs in the cell. Because the amount of YB-1 is directly associated with its functioning, understanding the mechanisms that regulate the protein amount invariably reveals the events where YB-1 is involved. Control over YB-1 abundance may allow using this gene/protein as a therapeutic target in cancers, where increased expression of the YBX1 gene often correlates with disease severity and poor prognosis.


Subject(s)
Protein Biosynthesis; Y-Box-Binding Protein 1; Animals; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Mammals/metabolism; RNA, Messenger/metabolism; RNA-Binding Proteins/genetics; RNA-Binding Proteins/metabolism; Y-Box-Binding Protein 1/metabolism
16.
BMC Genomics ; 21(1): 754, 2020 Nov 02.
Article in English | MEDLINE | ID: mdl-33138777

ABSTRACT

BACKGROUND: Efforts to elucidate the function of enhancers in vivo are underway but their vast numbers alongside differing enhancer architectures make it difficult to determine their impact on gene activity. By systematically annotating multiple mouse tissues with super- and typical-enhancers, we have explored their relationship with gene function and phenotype. RESULTS: Though super-enhancers drive high total- and tissue-specific expression of their associated genes, we find that typical-enhancers also contribute heavily to the tissue-specific expression landscape on account of their large numbers in the genome. Unexpectedly, we demonstrate that both enhancer types are preferentially associated with relevant 'tissue-type' phenotypes and exhibit no difference in phenotype effect size or pleiotropy. Modelling regulatory data alongside molecular data, we built a predictive model to infer gene-phenotype associations and use this model to predict potentially novel disease-associated genes. CONCLUSION: Overall our findings reveal that differing enhancer architectures have a similar impact on mammalian phenotypes whilst harbouring differing cellular and expression effects. Together, our results systematically characterise enhancers with predicted phenotypic traits endorsing the role for both types of enhancers in human disease and disorders.


Subject(s)
Enhancer Elements, Genetic; Animals; Enhancer Elements, Genetic/genetics; Humans; Mice; Phenotype
17.
Nature ; 507(7493): 462-70, 2014 Mar 27.
Article in English | MEDLINE | ID: mdl-24670764

ABSTRACT

Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly 'housekeeping', whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.


Subject(s)
Atlases as Topic; Molecular Sequence Annotation; Promoter Regions, Genetic/genetics; Transcriptome/genetics; Animals; Cell Line; Cells, Cultured; Cluster Analysis; Conserved Sequence/genetics; Gene Expression Regulation/genetics; Gene Regulatory Networks/genetics; Genes, Essential/genetics; Genome/genetics; Humans; Mice; Open Reading Frames/genetics; Organ Specificity; RNA, Messenger/analysis; RNA, Messenger/genetics; Transcription Factors/metabolism; Transcription Initiation Site; Transcription, Genetic/genetics
18.
Nucleic Acids Res ; 46(D1): D252-D259, 2018 Jan 04.
Article in English | MEDLINE | ID: mdl-29140464

ABSTRACT

We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences.


Subject(s)
Databases, Genetic; Transcription Factors/metabolism; Animals; Binding Sites/genetics; Chromatin Immunoprecipitation; Humans; Mice; Models, Genetic; Nucleotide Motifs; Sequence Analysis, DNA
19.
BMC Bioinformatics ; 20(1): 113, 2019 Mar 06.
Article in English | MEDLINE | ID: mdl-30841857

ABSTRACT

BACKGROUND: High-throughput sequencing often provides a foundation for experimental analyses in the life sciences. For many such methods, an intermediate layer of bioinformatics data analysis is the genomic signal track constructed by short read mapping to a particular genome assembly. There are many software tools to visualize genomic tracks in a web browser or with a stand-alone graphical user interface. However, there are only a few command-line applications suitable for automated usage or production of publication-ready visualizations. RESULTS: Here we present svist4get, a command-line tool for customizable generation of publication-quality figures based on data from genomic signal tracks. Similarly to generic genome browser software, svist4get visualizes signal tracks at a given genomic location and is able to aggregate data from several tracks on a single plot along with the transcriptome annotation. The resulting plots can be saved as vector or high-resolution bitmap images. We demonstrate practical use cases of svist4get for Ribo-Seq and RNA-Seq data. CONCLUSIONS: svist4get is implemented in Python 3 and runs on Linux. The command-line interface of svist4get allows for easy integration into bioinformatics pipelines in a console environment. Extra customization is possible through configuration files and a Python API. For convenience, svist4get is provided as a PyPI package. The source code is available at https://bitbucket.org/artegorov/svist4get/.


Subject(s)
Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Software; Genome; RNA, Messenger/genetics; RNA, Messenger/metabolism; Ribosomes/metabolism
20.
Hum Mutat ; 40(9): 1280-1291, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31106481

ABSTRACT

The integrative analysis of high-throughput reporter assays, machine learning, and profiles of epigenomic chromatin state in a broad array of cells and tissues has the potential to significantly improve our understanding of noncoding regulatory element function and its contribution to human disease. Here, we report results from the CAGI 5 regulation saturation challenge where participants were asked to predict the impact of nucleotide substitution at every base pair within five disease-associated human enhancers and nine disease-associated promoters. A library of mutations covering all bases was generated by saturation mutagenesis and altered activity was assessed in a massively parallel reporter assay (MPRA) in relevant cell lines. Reporter expression was measured relative to plasmid DNA to determine the impact of variants. The challenge was to predict the functional effects of variants on reporter expression. Comparative analysis of the full range of submitted prediction results identifies the most successful models of transcription factor binding sites, machine learning algorithms, and ways to choose among or incorporate diverse datatypes and cell-types for training computational models. These results have the potential to improve the design of future studies on more diverse sets of regulatory elements and aid the interpretation of disease-associated genetic variation.


Subject(s)
DNA/chemistry; Epigenomics/methods; Point Mutation; Binding Sites; Cell Line; Chromatin/genetics; DNA/metabolism; Enhancer Elements, Genetic; Genetic Predisposition to Disease; Humans; Machine Learning; Promoter Regions, Genetic; Transcription Factors/metabolism