Results 1 - 20 of 66
1.
Nucleic Acids Res ; 52(D1): D154-D163, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37971293

ABSTRACT

We present a major update of the HOCOMOCO collection that provides DNA binding specificity patterns of 949 human transcription factors and 720 mouse orthologs. To make this release, we performed motif discovery in peak sets that originated from 14 183 ChIP-Seq experiments and reads from 2554 HT-SELEX experiments yielding more than 400 thousand candidate motifs. The candidate motifs were annotated according to their similarity to known motifs and the hierarchy of DNA-binding domains of the respective transcription factors. Next, the motifs underwent human expert curation to stratify distinct motif subtypes and remove non-informative patterns and common artifacts. Finally, the curated subset of 100 thousand motifs was supplied to the automated benchmarking to select the best-performing motifs for each transcription factor. The resulting HOCOMOCO v12 core collection contains 1443 verified position weight matrices, including distinct subtypes of DNA binding motifs for particular transcription factors. In addition to the core collection, HOCOMOCO v12 provides motif sets optimized for the recognition of binding sites in vivo and in vitro, and for annotation of regulatory sequence variants. HOCOMOCO is available at https://hocomoco12.autosome.org and https://hocomoco.autosome.org.


Subjects
Databases, Genetic; Gene Expression Regulation; Protein Interaction Domains and Motifs; Transcription Factors; Animals; Humans; Mice; Binding Sites/genetics; Nucleotide Motifs; Transcription Factors/genetics; Transcription Factors/metabolism; Internet; Protein Interaction Domains and Motifs/genetics
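
The abstract above describes position weight matrices (PWMs) as the form in which binding specificities are distributed. As a minimal sketch of how such a matrix is typically applied, the Python below converts a toy count matrix to log-odds weights and finds the best-scoring window in a sequence. The counts, the uniform background, the pseudocount, and all function names are illustrative assumptions; none of them come from HOCOMOCO.

```python
import math

# Toy position count matrix (hypothetical numbers, not taken from HOCOMOCO).
# Each entry is one motif position with counts for A, C, G and T.
COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 0, "G": 1, "T": 9},
]


def to_pwm(counts, background=0.25, pseudocount=1.0):
    """Convert per-position counts to log2-odds weights (uniform background assumed)."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score_window(pwm, window):
    """Sum the weights of the observed bases; the window must be uppercase ACGT."""
    return sum(position[base] for position, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (offset, score) of the highest-scoring window on the given strand."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    hits = [
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    score, offset = max(hits)
    return offset, score


PWM = to_pwm(COUNTS)
```

Calling `best_hit(PWM, "TTACGTAA")` scans the five possible windows of the forward strand; a fuller scanner would also score the reverse complement and compare scores against a threshold.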
2.
Nucleic Acids Res ; 51(D1): D564-D570, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36350659

ABSTRACT

We present an update of EpiFactors, a manually curated database providing information about epigenetic regulators, their complexes, targets, and products, which is openly accessible at http://epifactors.autosome.org. The updated version of EpiFactors contains information on 902 proteins, including 101 histones and protamines, and, as a main update, a newly curated collection of 124 lncRNAs involved in epigenetic regulation. The number of publications concerning the role of lncRNAs in epigenetics is rapidly growing. Yet a resource that compiles, integrates, organizes, and presents curated information on lncRNAs in epigenetics is missing. EpiFactors fills this gap and provides data on epigenetic regulators in an accessible and user-friendly form. For 820 of the genes in EpiFactors, we include expression estimates across multiple cell types assessed by CAGE-Seq in the FANTOM5 project. In addition, the updated EpiFactors contains information on 73 protein complexes involved in epigenetic regulation. Our resource is practical for a wide range of users, including biologists, bioinformaticians and molecular/systems biologists.


Subjects
Databases, Genetic; Epigenesis, Genetic; Humans; Histones/genetics; Histones/metabolism; Protamines; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism
3.
Nucleic Acids Res ; 51(12): 6087-6100, 2023 Jul 07.
Article in English | MEDLINE | ID: mdl-37140047

ABSTRACT

The Polycomb group (PcG) proteins are fundamental epigenetic regulators that control the repressive state of target genes in multicellular organisms. One of the open questions is defining the mechanisms of PcG recruitment to chromatin. In Drosophila, the crucial role in PcG recruitment is thought to belong to DNA-binding proteins associated with Polycomb response elements (PREs). However, current data suggests that not all PRE-binding factors have been identified. Here, we report the identification of the transcription factor Crooked legs (Crol) as a novel PcG recruiter. Crol is a C2H2-type Zinc Finger protein that directly binds to poly(G)-rich DNA sequences. Mutation of Crol binding sites as well as crol CRISPR/Cas9 knockout diminish the repressive activity of PREs in transgenes. Like other PRE-DNA binding proteins, Crol co-localizes with PcG proteins inside and outside of H3K27me3 domains. Crol knockout impairs the recruitment of the PRC1 subunit Polyhomeotic and the PRE-binding protein Combgap at a subset of sites. The decreased binding of PcG proteins is accompanied by dysregulated transcription of target genes. Overall, our study identified Crol as a new important player in PcG recruitment and epigenetic regulation.


Subjects
Drosophila Proteins; Drosophila; Transcription Factors; Animals; Chromatin/genetics; Chromatin/metabolism; DNA-Binding Proteins/genetics; Drosophila/genetics; Drosophila/metabolism; Drosophila Proteins/genetics; Drosophila Proteins/metabolism; Epigenesis, Genetic; Gene Expression Regulation, Developmental; Polycomb-Group Proteins/genetics; Polycomb-Group Proteins/metabolism; Transcription Factors/metabolism
4.
Bioinformatics ; 39(8), 2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37490428

ABSTRACT

MOTIVATION: The increasing volume of data from high-throughput experiments including parallel reporter assays facilitates the development of complex deep-learning approaches for modeling DNA regulatory grammar. RESULTS: Here, we introduce LegNet, an EfficientNetV2-inspired convolutional network for modeling short gene regulatory regions. By approaching the sequence-to-expression regression problem as a soft classification task, LegNet secured first place for the autosome.org team in the DREAM 2022 challenge of predicting gene expression from gigantic parallel reporter assays. Using published data, here, we demonstrate that LegNet outperforms existing models and accurately predicts gene expression per se as well as the effects of single-nucleotide variants. Furthermore, we show how LegNet can be used in a diffusion network manner for the rational design of promoter sequences yielding the desired expression level. AVAILABILITY AND IMPLEMENTATION: https://github.com/autosome-ru/LegNet. The GitHub repository includes Jupyter Notebook tutorials and Python scripts under the MIT license to reproduce the results presented in the study.


Subjects
Deep Learning; Regulatory Sequences, Nucleic Acid; DNA; Promoter Regions, Genetic; Software
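
The abstract above mentions treating sequence-to-expression regression as a soft classification task. One simple way to recast a continuous target as soft class labels is sketched below: the value is split between its two nearest integer bins, and a prediction is read back as the expected bin index. This is an illustrative assumption, not the LegNet implementation; the bin count of 18 and the function names are hypothetical.

```python
def soft_labels(value, n_bins):
    """Spread a continuous target over its two nearest integer bins.

    Assumes n_bins >= 2 and 0 <= value <= n_bins - 1.
    """
    if not 0 <= value <= n_bins - 1:
        raise ValueError("value must lie within the bin range")
    lower = min(int(value), n_bins - 2)
    weight_upper = value - lower
    probs = [0.0] * n_bins
    probs[lower] = 1.0 - weight_upper
    probs[lower + 1] = weight_upper
    return probs


def expected_value(probs):
    """Collapse a distribution over bins back to a single number."""
    return sum(index * p for index, p in enumerate(probs))


# Hypothetical example: an expression level of 11.3 on an 18-bin scale.
target = soft_labels(11.3, 18)
recovered = expected_value(target)
```

A network trained this way outputs a probability vector over bins (for instance through a softmax) and is fitted with a cross-entropy or divergence loss against `target`; taking the expected bin index converts the prediction back to the original scale.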
5.
Nucleic Acids Res ; 50(W1): W51-W56, 2022 Jul 05.
Article in English | MEDLINE | ID: mdl-35446421

ABSTRACT

We present ANANASTRA, https://ananastra.autosome.org, a web server for the identification and annotation of regulatory single-nucleotide polymorphisms (SNPs) with allele-specific binding events. ANANASTRA accepts a list of dbSNP IDs or a VCF file and reports allele-specific binding (ASB) sites of particular transcription factors or in specific cell types, highlighting those with ASBs significantly enriched at SNPs in the query list. ANANASTRA is built on top of a systematic analysis of allelic imbalance in ChIP-Seq experiments and performs the ASB enrichment test against background sets of SNPs found in the same source experiments as ASB sites but not displaying significant allelic imbalance. We illustrate ANANASTRA usage with selected case studies and expect that ANANASTRA will help to conduct the follow-up of GWAS in terms of establishing functional hypotheses and designing experimental verification.


Subjects
Polymorphism, Single Nucleotide; Transcription Factors; Alleles; Binding Sites; Genome-Wide Association Study; Protein Binding; Transcription Factors/chemistry; Transcription Factors/metabolism; DNA-Binding Proteins
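
The abstract above describes an enrichment test of ASB sites among query SNPs against a background of SNPs without significant allelic imbalance, but it does not name the statistic. As a hedged illustration, the sketch below assumes a one-sided Fisher's exact test, computed from the hypergeometric distribution; the function and parameter names are hypothetical and are not taken from ANANASTRA.

```python
from math import comb


def fisher_enrichment_p(asb_in_query, query_size, asb_total, population_size):
    """One-sided P(X >= asb_in_query) under the hypergeometric null.

    population_size: all candidate SNPs (ASB sites plus background SNPs)
    asb_total:       SNPs in the population that are ASB sites
    query_size:      SNPs from the user's list found in the population
    asb_in_query:    query SNPs that are ASB sites
    """
    upper = min(query_size, asb_total)
    denominator = comb(population_size, query_size)
    numerator = sum(
        comb(asb_total, k) * comb(population_size - asb_total, query_size - k)
        for k in range(asb_in_query, upper + 1)
    )
    return numerator / denominator


# Toy numbers: 5 of 10 SNPs are ASB sites and all 5 query SNPs hit them.
p_value = fisher_enrichment_p(asb_in_query=5, query_size=5,
                              asb_total=5, population_size=10)
```

In practice such a test would be run once per transcription factor or cell type, so the resulting p-values would need a multiple-testing correction.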
6.
Nucleic Acids Res ; 50(2): 1111-1127, 2022 Jan 25.
Article in English | MEDLINE | ID: mdl-35018467

ABSTRACT

eIF4G2 (DAP5 or Nat1) is a homologue of the canonical translation initiation factor eIF4G1 in higher eukaryotes but its function remains poorly understood. Unlike eIF4G1, eIF4G2 does not interact with the cap-binding protein eIF4E and is believed to drive translation under stress when eIF4E activity is impaired. Here, we show that eIF4G2 operates under normal conditions as well and promotes scanning downstream of the eIF4G1-mediated 40S recruitment and cap-proximal scanning. Specifically, eIF4G2 facilitates leaky scanning for a subset of mRNAs. Apparently, eIF4G2 replaces eIF4G1 during scanning of 5' UTR and the necessity for eIF4G2 only arises when eIF4G1 dissociates from the scanning complex. In particular, this event can occur when the leaky scanning complexes interfere with initiating or elongating 80S ribosomes within a translated uORF. This mechanism is therefore crucial for higher eukaryotes which are known to have long 5' UTRs with highly frequent uORFs. We suggest that uORFs are not the only obstacle on the way of scanning complexes towards the main start codon, because certain eIF4G2 mRNA targets lack uORF(s). Thus, higher eukaryotes possess two distinct scanning complexes: the principal one that binds mRNA and initiates scanning, and the accessory one that rescues scanning when the former fails.


Subjects
Eukaryotic Initiation Factor-4G/metabolism; RNA, Messenger/metabolism; Ribosomes/metabolism; Humans; Open Reading Frames; Protein Biosynthesis
7.
Int J Mol Sci ; 25(3), 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38339016

ABSTRACT

Y-box-binding proteins (YB proteins) are multifunctional DNA- and RNA-binding proteins that play an important role in the regulation of gene expression. The high homology of their cold shock domains and the similarity between their long, unstructured C-terminal domains suggest that Y-box-binding proteins may have similar functions in a cell. Here, we consider the functional interchangeability of the somatic YB proteins YB-1 and YB-3. RNA-seq and Ribo-seq are used to track changes in the mRNA abundance or mRNA translation in HEK293T cells solely expressing YB-1, YB-3, or neither of them. We show that YB proteins have a dual effect on translation. Although the expression of YB proteins stimulates global translation, YB-1 and YB-3 inhibit the translation of their direct CLIP-identified mRNA targets. The impact of YB-1 and YB-3 on the translation of their mRNA targets is similar, which suggests that they can substitute each other in inhibiting the translation of their mRNA targets in HEK293T cells.


Subjects
DNA-Binding Proteins; Protein Biosynthesis; Humans; HEK293 Cells; RNA, Messenger/genetics; RNA, Messenger/metabolism; DNA-Binding Proteins/metabolism; Y-Box-Binding Protein 1/genetics; Y-Box-Binding Protein 1/metabolism
8.
Genome Res ; 30(7): 1060-1072, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32718982

ABSTRACT

Long noncoding RNAs (lncRNAs) constitute the majority of transcripts in the mammalian genomes, and yet, their functions remain largely unknown. As part of the FANTOM6 project, we systematically knocked down the expression of 285 lncRNAs in human dermal fibroblasts and quantified cellular growth, morphological changes, and transcriptomic responses using Capped Analysis of Gene Expression (CAGE). Antisense oligonucleotides targeting the same lncRNAs exhibited global concordance, and the molecular phenotype, measured by CAGE, recapitulated the observed cellular phenotypes while providing additional insights on the affected genes and pathways. Here, we disseminate the largest-to-date lncRNA knockdown data set with molecular phenotyping (over 1000 CAGE deep-sequencing libraries) for further exploration and highlight functional roles for ZNF213-AS1 and lnc-KHDC3L-2.


Subjects
RNA, Long Noncoding/physiology; Cell Growth Processes/genetics; Cell Movement/genetics; Fibroblasts/cytology; Fibroblasts/metabolism; Humans; KCNQ Potassium Channels/metabolism; Molecular Sequence Annotation; Oligonucleotides, Antisense; RNA, Long Noncoding/antagonists & inhibitors; RNA, Long Noncoding/metabolism; RNA, Small Interfering
9.
RNA ; 2021 May 20.
Article in English | MEDLINE | ID: mdl-34016706

ABSTRACT

Non-coding RNAs play a crucial role in various cellular processes in living organisms, and RNA functions heavily depend on molecule structures composed of stems, loops, and various tertiary motifs. Among those, the most frequent are A-minor interactions, which are often involved in the formation of more complex motifs such as kink-turns and pseudoknots. We present a novel classification of A-minors in terms of RNA secondary structure where each nucleotide of an A-minor is attributed to the stem or loop, and each pair of nucleotides is attributed to their relative position within the secondary structure. By analyzing classes of A-minors in known RNA structures, we found that the largest classes are mostly homogeneous and preferably localize with known A-minor co-motifs, e.g. tetraloop-tetraloop receptor and coaxial stacking. Detailed analysis of local A-minors within internal loops revealed a novel recurrent RNA tertiary motif, the across-bulged motif. Interestingly, the motif resembles the previously known GAAA/11nt motif but with the local adenines performing the role of the GAAA-tetraloop. By using machine learning, we show that particular classes of local A-minors can be predicted from sequence and secondary structure. The proposed classification is the first step toward automatic annotation of not only A-minors and their co-motifs but various types of RNA tertiary motifs as well.

10.
Nucleic Acids Res ; 49(D1): D104-D111, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33231677

ABSTRACT

The Gene Transcription Regulation Database (GTRD; http://gtrd.biouml.org/) contains uniformly annotated and processed NGS data related to gene transcription regulation: ChIP-seq, ChIP-exo, DNase-seq, MNase-seq, ATAC-seq and RNA-seq. With the latest release, the database has reached a new level of data integration. All cell types (cell lines and tissues) presented in the GTRD were arranged into a dictionary and linked with different ontologies (BRENDA, Cell Ontology, Uberon, Cellosaurus and Experimental Factor Ontology) and with related experiments in specialized databases on transcription regulation (FANTOM5, ENCODE and GTEx). The updated version of the GTRD provides an integrated view of transcription regulation through a dedicated web interface with advanced browsing and search capabilities, an integrated genome browser, and table reports by cell types, transcription factors, and genes of interest.


Subjects
Databases, Genetic; Gene Expression Regulation; Genome; Transcription Factors/genetics; Transcription, Genetic; Animals; Cell Line; Drosophila melanogaster/genetics; Drosophila melanogaster/metabolism; Gene Ontology; Humans; Internet; Mice; Molecular Sequence Annotation; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Software; Transcription Factors/classification; Transcription Factors/metabolism
11.
Nucleic Acids Res ; 49(19): 11134-11144, 2021 Nov 08.
Article in English | MEDLINE | ID: mdl-34606617

ABSTRACT

The Saccharomyces cerevisiae gene deletion collection is widely used for functional gene annotation and genetic interaction analyses. However, the standard G418-resistance cassette used to produce knockout mutants delivers strong regulatory elements into the target genetic loci. To date, its side effects on the expression of neighboring genes have never been systematically assessed. Here, using ribosome profiling data, RT-qPCR, and reporter expression, we investigated perturbations induced by the KanMX module. Our analysis revealed significant alterations in the transcription efficiency of neighboring genes and, more importantly, severe impairment of their mRNA translation, leading to changes in protein abundance. In the 'head-to-head' orientation of the deleted and neighboring genes, knockout often led to a shift of the transcription start site of the latter, introducing new uAUG codon(s) into the expanded 5' untranslated region (5' UTR). In the 'tail-to-tail' arrangement, knockout led to activation of alternative polyadenylation signals in the neighboring gene, thus altering its 3' UTR. These events may explain the so-called neighboring gene effect (NGE), i.e. false genetic interactions of the deleted genes. We estimate that in as much as ∼1/5 of knockout strains the expression of neighboring genes may be substantially (>2-fold) deregulated at the level of translation.


Subjects
Genetic Loci/drug effects; Gentamicins/pharmacology; Protein Biosynthesis/drug effects; Saccharomyces cerevisiae/drug effects; Sequence Deletion; Transcription, Genetic/drug effects; 3' Untranslated Regions; 5' Untranslated Regions; Base Sequence; Codon; Gene Expression Regulation, Fungal; Gene Knockout Techniques/methods; Genes, Reporter; Green Fluorescent Proteins/genetics; Green Fluorescent Proteins/metabolism; Open Reading Frames; Ribosomes/drug effects; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Transcription Initiation Site
12.
Proc Natl Acad Sci U S A ; 117(27): 15581-15590, 2020 Jul 07.
Article in English | MEDLINE | ID: mdl-32576685

ABSTRACT

Protein synthesis represents a major metabolic activity of the cell. However, how it is affected by aging and how this in turn impacts cell function remains largely unexplored. To address this question, herein we characterized age-related changes in both the transcriptome and translatome of mouse tissues over the entire life span. We showed that the transcriptome changes govern those in the translatome and are associated with altered expression of genes involved in inflammation, extracellular matrix, and lipid metabolism. We also identified genes that may serve as candidate biomarkers of aging. At the translational level, we uncovered sustained down-regulation of a set of 5'-terminal oligopyrimidine (5'-TOP) transcripts encoding protein synthesis and ribosome biogenesis machinery and regulated by the mTOR pathway. For many of them, ribosome occupancy dropped twofold or even more. Moreover, with age, ribosome coverage gradually decreased in the vicinity of start codons and increased near stop codons, revealing complex age-related changes in the translation process. Taken together, our results reveal systematic and multidimensional deregulation of protein synthesis, showing how this major cellular process declines with age.


Subjects
Aging/physiology; Gene Expression Regulation/physiology; Protein Biosynthesis/physiology; Ribosomes/metabolism; Animals; Codon, Initiator/metabolism; Computational Biology; Male; Mice; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA-Seq; Ribosomes/genetics; Signal Transduction/physiology; TOR Serine-Threonine Kinases/metabolism; Transcriptome/physiology
13.
Int J Mol Sci ; 24(9), 2023 May 06.
Article in English | MEDLINE | ID: mdl-37176068

ABSTRACT

While protein synthesis is vital for the majority of cell types of the human body, diversely differentiated cells require specific translation regulation. This suggests the specialization of translation machinery across tissues and organs. Using transcriptomic data from GTEx, FANTOM, and Gene Atlas, we systematically explored the abundance of transcripts encoding translation factors and aminoacyl-tRNA synthetases (ARSases) in human tissues. We revised a few known and identified several novel translation-related genes exhibiting strict tissue-specific expression. The proteins they encode include eEF1A1, eEF1A2, PABPC1L, PABPC3, eIF1B, eIF4E1B, eIF4ENIF1, and eIF5AL1. Furthermore, our analysis revealed a pervasive tissue-specific relative abundance of translation machinery components (e.g., PABP and eRF3 paralogs, eIF2B and eIF3 subunits, eIF5MPs, and some ARSases), suggesting presumptive variance in the composition of translation initiation, elongation, and termination complexes. These conclusions were largely confirmed by the analysis of proteomic data. Finally, we paid attention to sexual dimorphism in the repertoire of translation factors encoded in sex chromosomes (eIF1A, eIF2γ, and DDX3), and identified the testis and brain as organs with the most diverged expression of translation-associated genes.


Subjects
Amino Acyl-tRNA Synthetases; Proteomics; Humans; Peptide Initiation Factors; Peptide Elongation Factor 1
14.
Int J Mol Sci ; 24(18), 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37762093

ABSTRACT

Single-nucleotide polymorphism rs71327024 located in the human 3p21.31 locus has been associated with an elevated risk of hospitalization upon SARS-CoV-2 infection. The 3p21.31 locus contains several genes encoding chemokine receptors potentially relevant to severe COVID-19. In particular, CXCR6, which is prominently expressed in T lymphocytes, NK, and NKT cells, has been shown to be involved in the recruitment of immune cells to non-lymphoid organs in chronic inflammatory and respiratory diseases. In COVID-19, CXCR6 expression is reduced in lung resident memory T cells from patients with severe disease as compared to the control cohort with moderate symptoms. We demonstrate here that rs71327024 is located within an active enhancer that augments the activity of the CXCR6 promoter in human CD4+ T lymphocytes. The common rs71327024(G) variant makes a functional binding site for the c-Myb transcription factor, while the risk rs71327024(T) variant disrupts c-Myb binding and reduces the enhancer activity. Concordantly, c-Myb knockdown in PMA-treated Jurkat cells negates rs71327024's allele-specific effect on CXCR6 promoter activity. We conclude that a disrupted c-Myb binding site may decrease CXCR6 expression in T helper cells of individuals carrying the minor rs71327024(T) allele and thus may promote the progression of severe COVID-19 and other inflammatory pathologies.


Subjects
COVID-19; Humans; COVID-19/genetics; Hospitalization; Promoter Regions, Genetic; Receptors, CXCR6/genetics; SARS-CoV-2; T-Lymphocytes, Helper-Inducer
15.
Biochemistry (Mosc) ; 87(Suppl 1): S48-S167, 2022 Jan.
Article in English | MEDLINE | ID: mdl-35501986

ABSTRACT

YB proteins are DNA/RNA-binding proteins and members of the family of proteins with a cold shock domain. The role of YB proteins in the life of cells, tissues, and whole organisms is extremely important. They are involved in transcription regulation, pre-mRNA splicing, mRNA translation and stability, mRNA packaging into mRNPs, including stress granules, DNA repair, and many other cellular events. Many processes, from embryonic development to aging, depend on when and in what amounts these proteins are synthesized. Here we discuss regulation of the levels of YB-1 and, in part, of its homologs in the cell. Because the amount of YB-1 is directly associated with its functioning, understanding the mechanisms that regulate the protein amount invariably reveals the events in which YB-1 is involved. Control over YB-1 abundance may allow using this gene/protein as a therapeutic target in cancers, where increased expression of the YBX1 gene often correlates with disease severity and poor prognosis.


Subjects
Protein Biosynthesis; Y-Box-Binding Protein 1; Animals; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Mammals/metabolism; RNA, Messenger/metabolism; RNA-Binding Proteins/genetics; RNA-Binding Proteins/metabolism; Y-Box-Binding Protein 1/metabolism
16.
BMC Genomics ; 21(1): 754, 2020 Nov 02.
Article in English | MEDLINE | ID: mdl-33138777

ABSTRACT

BACKGROUND: Efforts to elucidate the function of enhancers in vivo are underway but their vast numbers alongside differing enhancer architectures make it difficult to determine their impact on gene activity. By systematically annotating multiple mouse tissues with super- and typical-enhancers, we have explored their relationship with gene function and phenotype. RESULTS: Though super-enhancers drive high total- and tissue-specific expression of their associated genes, we find that typical-enhancers also contribute heavily to the tissue-specific expression landscape on account of their large numbers in the genome. Unexpectedly, we demonstrate that both enhancer types are preferentially associated with relevant 'tissue-type' phenotypes and exhibit no difference in phenotype effect size or pleiotropy. Modelling regulatory data alongside molecular data, we built a predictive model to infer gene-phenotype associations and use this model to predict potentially novel disease-associated genes. CONCLUSION: Overall our findings reveal that differing enhancer architectures have a similar impact on mammalian phenotypes whilst harbouring differing cellular and expression effects. Together, our results systematically characterise enhancers with predicted phenotypic traits endorsing the role for both types of enhancers in human disease and disorders.


Subjects
Enhancer Elements, Genetic; Animals; Enhancer Elements, Genetic/genetics; Humans; Mice; Phenotype
17.
Nature ; 507(7493): 462-70, 2014 Mar 27.
Article in English | MEDLINE | ID: mdl-24670764

ABSTRACT

Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly 'housekeeping', whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.


Subjects
Atlases as Topic; Molecular Sequence Annotation; Promoter Regions, Genetic/genetics; Transcriptome/genetics; Animals; Cell Line; Cells, Cultured; Cluster Analysis; Conserved Sequence/genetics; Gene Expression Regulation/genetics; Gene Regulatory Networks/genetics; Genes, Essential/genetics; Genome/genetics; Humans; Mice; Open Reading Frames/genetics; Organ Specificity; RNA, Messenger/analysis; RNA, Messenger/genetics; Transcription Factors/metabolism; Transcription Initiation Site; Transcription, Genetic/genetics
18.
Nucleic Acids Res ; 46(D1): D252-D259, 2018 Jan 04.
Article in English | MEDLINE | ID: mdl-29140464

ABSTRACT

We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences.


Subjects
Databases, Genetic; Transcription Factors/metabolism; Animals; Binding Sites/genetics; Chromatin Immunoprecipitation; Humans; Mice; Models, Genetic; Nucleotide Motifs; Sequence Analysis, DNA
19.
BMC Bioinformatics ; 20(1): 113, 2019 Mar 06.
Article in English | MEDLINE | ID: mdl-30841857

ABSTRACT

BACKGROUND: High-throughput sequencing often provides a foundation for experimental analyses in the life sciences. For many such methods, an intermediate layer of bioinformatics data analysis is the genomic signal track constructed by short read mapping to a particular genome assembly. There are many software tools to visualize genomic tracks in a web browser or with a stand-alone graphical user interface. However, there are only a few command-line applications suitable for automated usage or production of publication-ready visualizations. RESULTS: Here we present svist4get, a command-line tool for customizable generation of publication-quality figures based on data from genomic signal tracks. Similarly to generic genome browser software, svist4get visualizes signal tracks at a given genomic location and is able to aggregate data from several tracks on a single plot along with the transcriptome annotation. The resulting plots can be saved as vector or high-resolution bitmap images. We demonstrate practical use cases of svist4get for Ribo-Seq and RNA-Seq data. CONCLUSIONS: svist4get is implemented in Python 3 and runs on Linux. The command-line interface of svist4get allows for easy integration into bioinformatics pipelines in a console environment. Extra customization is possible through configuration files and a Python API. For convenience, svist4get is provided as a PyPI package. The source code is available at https://bitbucket.org/artegorov/svist4get/.


Subjects
Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Software; Genome; RNA, Messenger/genetics; RNA, Messenger/metabolism; Ribosomes/metabolism
20.
Hum Mutat ; 40(9): 1280-1291, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31106481

ABSTRACT

The integrative analysis of high-throughput reporter assays, machine learning, and profiles of epigenomic chromatin state in a broad array of cells and tissues has the potential to significantly improve our understanding of noncoding regulatory element function and its contribution to human disease. Here, we report results from the CAGI 5 regulation saturation challenge where participants were asked to predict the impact of nucleotide substitution at every base pair within five disease-associated human enhancers and nine disease-associated promoters. A library of mutations covering all bases was generated by saturation mutagenesis and altered activity was assessed in a massively parallel reporter assay (MPRA) in relevant cell lines. Reporter expression was measured relative to plasmid DNA to determine the impact of variants. The challenge was to predict the functional effects of variants on reporter expression. Comparative analysis of the full range of submitted prediction results identifies the most successful models of transcription factor binding sites, machine learning algorithms, and ways to choose among or incorporate diverse datatypes and cell-types for training computational models. These results have the potential to improve the design of future studies on more diverse sets of regulatory elements and aid the interpretation of disease-associated genetic variation.


Subjects
DNA/chemistry; Epigenomics/methods; Point Mutation; Binding Sites; Cell Line; Chromatin/genetics; DNA/metabolism; Enhancer Elements, Genetic; Genetic Predisposition to Disease; Humans; Machine Learning; Promoter Regions, Genetic; Transcription Factors/metabolism