1.
Bioinformatics; 36(12): 3902-3904, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32246829

ABSTRACT

MOTIVATION: Identifying the genes regulated by a given transcription factor (TF) (its 'target genes') is a key step in developing a comprehensive understanding of gene regulation. Previously, we developed a method (CisMapper) for predicting the target genes of a TF based solely on the correlation between a histone modification at the TF's binding site and the expression of the gene across a set of tissues or cell lines. That approach is limited to organisms for which extensive histone and expression data are available, and does not explicitly incorporate the genomic distance between the TF and the gene. RESULTS: We present the T-Gene algorithm, which overcomes these limitations. It can be used to predict which genes are most likely to be regulated by a TF, and which of the TF's binding sites are most likely involved in regulating particular genes. T-Gene calculates a novel score that combines distance and histone/expression correlation, and we show that this score accurately predicts when a regulatory element bound by a TF is in contact with a gene's promoter, achieving median precision above 60%. T-Gene is easy to use via its web server or as a command-line tool, and can also make accurate predictions (median precision above 40%) based on distance alone when extensive histone/expression data is not available for the organism. T-Gene provides an estimate of the statistical significance of each of its predictions. AVAILABILITY AND IMPLEMENTATION: The T-Gene web server, source code, histone/expression data and genome annotation files are provided at http://meme-suite.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Software, Binding Sites, Chromatin Immunoprecipitation, Gene Expression Regulation, Transcription Factors/genetics, Transcription Factors/metabolism
2.
Bioinformatics; 35(16): 2774-2782, 2019 08 15.
Article in English | MEDLINE | ID: mdl-30596994

ABSTRACT

MOTIVATION: Post-translational modifications (PTMs) of proteins are associated with many significant biological functions and can be identified in high throughput using tandem mass spectrometry. Many PTMs are associated with short sequence patterns called 'motifs' that help localize the modifying enzyme. Accordingly, many algorithms have been designed to identify these motifs from mass spectrometry data. Accurate statistical confidence estimates for discovered motifs are critically important for proper interpretation and in the design of downstream experimental validation. RESULTS: We describe a method for assigning statistical confidence estimates to PTM motifs, and we demonstrate that this method provides accurate P-values on both simulated and real data. Our methods are implemented in MoMo, a software tool for discovering motifs among sets of PTMs that we make available as a web server and as downloadable source code. MoMo re-implements the two most widely used PTM motif discovery algorithms, motif-x and MoDL, while offering many enhancements. Relative to motif-x, MoMo offers improved statistical confidence estimates and more accurate calculation of motif scores. The MoMo web server offers more proteome databases, more input formats, larger inputs and longer running times than the motif-x web server. Finally, our study demonstrates that the confidence estimates produced by motif-x are inaccurate. This inaccuracy stems in part from the common practice of drawing 'background' peptides from an unshuffled proteome database. Our results thus suggest that many of the papers that use motif-x to find motifs may be reporting results that lack statistical support. AVAILABILITY AND IMPLEMENTATION: The MoMo web server and source code are provided at http://meme-suite.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Post-Translational Protein Processing, Software, Algorithms, Amino Acid Motifs, Proteome, Tandem Mass Spectrometry
3.
Bioinformatics; 32(8): 1217-9, 2016 04 15.
Article in English | MEDLINE | ID: mdl-26704599

ABSTRACT

Precise regulatory control of genes, particularly in eukaryotes, frequently requires the joint action of multiple sequence-specific transcription factors. A cis-regulatory module (CRM) is a genomic locus that is responsible for gene regulation and that contains multiple transcription factor binding sites in close proximity. Given a collection of known transcription factor binding motifs, many bioinformatics methods have been proposed over the past 15 years for identifying within a genomic sequence candidate CRMs consisting of clusters of those motifs. RESULTS: The MCAST algorithm uses a hidden Markov model with a P-value-based scoring scheme to identify candidate CRMs. Here, we introduce a new version of MCAST that offers improved graphical output, a dynamic background model, statistical confidence estimates based on false discovery rate estimation and, most significantly, the ability to predict CRMs while taking into account epigenomic data such as DNase I sensitivity or histone modification data. We demonstrate the validity of MCAST's statistical confidence estimates and the utility of epigenomic priors in identifying CRMs. AVAILABILITY AND IMPLEMENTATION: MCAST is part of the MEME Suite software toolkit. A web server and source code are available at http://meme-suite.org and http://alternate.meme-suite.org. CONTACT: t.bailey@imb.uq.edu.au or william-noble@uw.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Binding Sites, DNA Sequence Analysis, Genome, Humans, Transcriptional Regulatory Elements, Software, Transcription Factors
4.
Nucleic Acids Res; 43(W1): W39-49, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25953851

ABSTRACT

The MEME Suite is a powerful, integrated set of web-based tools for studying sequence motifs in proteins, DNA and RNA. Such motifs encode many biological functions, and their detection and characterization is important in the study of molecular interactions in the cell, including the regulation of gene expression. Since the previous description of the MEME Suite in the 2009 Nucleic Acids Research Web Server Issue, we have added six new tools. Here we describe the capabilities of all the tools within the suite, give advice on their best use and provide several case studies to illustrate how to combine the results of various MEME Suite tools for successful motif-based analyses. The MEME Suite is freely available for academic use at http://meme-suite.org, and source code is also available for download and local installation.


Subjects
Amino Acid Motifs, Nucleotide Motifs, Software, DNA/chemistry, Internet, Plasmodium falciparum, Protein Interaction Domains and Motifs, Protein Sorting Signals, Protozoan Proteins/chemistry, Calcitriol Receptors/chemistry, DNA Sequence Analysis, Protein Sequence Analysis, RNA Sequence Analysis
5.
J Proteome Res; 13(10): 4488-91, 2014 Oct 03.
Article in English | MEDLINE | ID: mdl-25182276

ABSTRACT

Efficiently and accurately analyzing big protein tandem mass spectrometry data sets requires robust software that incorporates state-of-the-art computational, machine learning, and statistical methods. The Crux mass spectrometry analysis software toolkit ( http://cruxtoolkit.sourceforge.net ) is an open source project that aims to provide users with a cross-platform suite of analysis tools for interpreting protein mass spectrometry data.


Subjects
Proteins/chemistry, Tandem Mass Spectrometry/methods, Protein Databases, Internet
6.
Bioinformatics; 27(7): 1017-8, 2011 Apr 01.
Article in English | MEDLINE | ID: mdl-21330290

ABSTRACT

A motif is a short DNA or protein sequence that contributes to the biological function of the sequence in which it resides. Over the past several decades, many computational methods have been described for identifying, characterizing and searching with sequence motifs. Critical to nearly any motif-based sequence analysis pipeline is the ability to scan a sequence database for occurrences of a given motif described by a position-specific frequency matrix. RESULTS: We describe Find Individual Motif Occurrences (FIMO), a software tool for scanning DNA or protein sequences with motifs described as position-specific scoring matrices. The program computes a log-likelihood ratio score for each position in a given sequence database, uses established dynamic programming methods to convert this score to a P-value and then applies false discovery rate analysis to estimate a q-value for each position in the given sequence. FIMO provides output in a variety of formats, including HTML, XML and several Santa Cruz Genome Browser formats. The program is efficient, allowing for the scanning of DNA sequences at a rate of 3.5 Mb/s on a single CPU. AVAILABILITY AND IMPLEMENTATION: FIMO is part of the MEME Suite software toolkit. A web server and source code are available at http://meme.sdsc.edu.


Subjects
Amino Acid Motifs, DNA/chemistry, DNA Sequence Analysis/methods, Protein Sequence Analysis/methods, Software, Base Sequence, Binding Sites, CCCTC-Binding Factor, Conserved Sequence, Genetic Databases, Human Genome, Humans, Position-Specific Scoring Matrices, Repressor Proteins/metabolism
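
The scanning procedure described in the FIMO abstract above (a log-likelihood ratio score for each position, converted to a P-value by dynamic programming) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not FIMO's implementation: the function names, the pseudocount, the integer scaling factor and the example motif are all hypothetical, the sequence is assumed to contain only A, C, G and T, and the q-value step is omitted.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def log_odds_matrix(freqs, background, pseudocount=0.01, scale=10):
    """Turn per-position base frequencies into integer-scaled log-odds scores.

    The pseudocount and scale values are illustrative assumptions.
    """
    matrix = []
    for column in freqs:
        row = {}
        for base in ALPHABET:
            # Smooth the frequency so that zero counts do not give -infinity.
            p = (column[base] + pseudocount * background[base]) / (1 + pseudocount)
            row[base] = round(scale * math.log2(p / background[base]))
        matrix.append(row)
    return matrix


def score_distribution(matrix, background):
    """Exact null distribution of the motif score, built one column at a time.

    After processing k columns, dist maps each reachable partial score to the
    probability of obtaining it from k independent background letters.
    """
    dist = {0: 1.0}
    for row in matrix:
        nxt = defaultdict(float)
        for score, prob in dist.items():
            for base in ALPHABET:
                nxt[score + row[base]] += prob * background[base]
        dist = dict(nxt)
    return dist


def p_value(dist, observed):
    """Probability of a score at least as large as the observed one."""
    return sum(prob for score, prob in dist.items() if score >= observed)


def scan(sequence, matrix, background):
    """Score every window of the sequence and attach a P-value to each."""
    dist = score_distribution(matrix, background)
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, window, score, p_value(dist, score)))
    return hits


if __name__ == "__main__":
    background = {base: 0.25 for base in ALPHABET}
    # Hypothetical motif that strongly prefers the word TATA.
    freqs = [{b: (0.97 if b == c else 0.01) for b in ALPHABET} for c in "TATA"]
    matrix = log_odds_matrix(freqs, background)
    for start, window, score, p in scan("GGTATAGG", matrix, background):
        print(start, window, score, p)
```

Rounding the log-odds values to integers keeps the number of distinct partial scores small, which is what makes the column-by-column computation of the null distribution tractable for longer motifs.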
7.
Bioinformatics; 27(12): 1603-9, 2011 Jun 15.
Article in English | MEDLINE | ID: mdl-21543443

ABSTRACT

MOTIVATION: A question that often comes up after applying a motif finder to a set of co-regulated DNA sequences is whether the reported putative motif is similar to any known motif. While several tools have been designed for this task, Habib et al. pointed out that the scores that are commonly used for measuring similarity between motifs do not distinguish between a good alignment of two informative columns (say, all-A) and one of two uninformative columns. This observation explains why tools such as Tomtom occasionally return an alignment of uninformative columns which is clearly spurious. To address this problem, Habib et al. suggested a new score [Bayesian Likelihood 2-Component (BLiC)] which uses a Bayesian information criterion to penalize matches that are also similar to the background distribution. RESULTS: We show that the BLiC score exhibits other, highly undesirable properties, and we offer instead a general approach to adjust any motif similarity score so as to reduce the number of reported spurious alignments of uninformative columns. We implement our method in Tomtom and show that, without significantly compromising Tomtom's retrieval accuracy or its runtime, we can drastically reduce the number of uninformative alignments. AVAILABILITY AND IMPLEMENTATION: The modified Tomtom is available as part of the MEME Suite at http://meme.nbcr.net.


Subjects
DNA Sequence Analysis, Algorithms, Bayes Theorem, Sequence Alignment/methods, Software
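
The problem raised in the abstract above, that common column similarity scores cannot tell a match of two informative columns from a match of two uninformative ones, can be seen in a small Python sketch. Negative Euclidean distance is used here only as a representative column score, and the function names are illustrative assumptions rather than part of Tomtom.

```python
import math


def euclidean_similarity(col_a, col_b):
    """Negative Euclidean distance between two probability columns.

    Zero is the best possible value and means the columns are identical.
    """
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(col_a, col_b)))


def information_content(col):
    """Information in a DNA column, in bits, relative to a uniform background."""
    return 2.0 + sum(p * math.log2(p) for p in col if p > 0)


# Column order is A, C, G, T.
all_a = [1.0, 0.0, 0.0, 0.0]          # fully informative column
uniform = [0.25, 0.25, 0.25, 0.25]    # uninformative column

if __name__ == "__main__":
    # Both pairs are perfect matches under the distance-based score...
    print(euclidean_similarity(all_a, all_a))
    print(euclidean_similarity(uniform, uniform))
    # ...even though only the first pair carries any information.
    print(information_content(all_a))
    print(information_content(uniform))
```

Because both pairs receive the same top score, an alignment made up of uniform columns can rank as highly as one made up of conserved columns, which is the kind of spurious hit the abstract describes.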
8.
Nucleic Acids Res; 37(Web Server issue): W202-8, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19458158

ABSTRACT

The MEME Suite web server provides a unified portal for online discovery and analysis of sequence motifs representing features such as DNA binding sites and protein interaction domains. The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps. Three sequence scanning algorithms--MAST, FIMO and GLAM2SCAN--allow scanning numerous DNA and protein sequence databases for motifs discovered by MEME and GLAM2. Transcription factor motifs (including those discovered using MEME) can be compared with motifs in many popular motif databases using the motif database scanning algorithm TOMTOM. Transcription factor motifs can be further analyzed for putative function by association with Gene Ontology (GO) terms using the motif-GO term association tool GOMO. MEME output now contains sequence LOGOS for each discovered motif, as well as buttons to allow motifs to be conveniently submitted to the sequence and motif database scanning algorithms (MAST, FIMO and TOMTOM), or to GOMO, for further analysis. GLAM2 output similarly contains buttons for further analysis using GLAM2SCAN and for rerunning GLAM2 with different parameters. All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net.


Subjects
DNA Sequence Analysis, Protein Sequence Analysis, Software, Algorithms, Binding Sites, Genetic Databases, Internet, Transcriptional Regulatory Elements, Transcription Factors/metabolism
9.
Evolution; 71(9): 2159-2177, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28640400

ABSTRACT

There is often large divergence in the effects of key nutrients on life span (LS) and reproduction in the sexes, yet nutrient intake is regulated in the same way in males and females given dietary choice. This suggests that the sexes are constrained from feeding to their sex-specific nutritional optima for these traits. Here, we examine the potential for intralocus sexual conflict (IASC) over optimal protein and carbohydrate intake for LS and reproduction to constrain the evolution of sex-specific nutrient regulation in the field cricket, Teleogryllus commodus. We show clear sex differences in the effects of protein and carbohydrate intake on LS and reproduction and strong positive genetic correlations between the sexes for the regulated intake of these nutrients. However, the between-sex additive genetic covariance matrix had very little effect on the predicted evolutionary response of nutrient regulation in the sexes. Thus, IASC appears unlikely to act as an evolutionary constraint on sex-specific nutrient regulation in T. commodus. This finding is supported by clear sexual dimorphism in the regulated intake of these nutrients under dietary choice. However, nutrient regulation did not coincide with the nutritional optima for LS or reproduction in either sex, suggesting that IASC is not completely resolved in T. commodus.


Subjects
Gryllidae, Reproduction, Animals, Female, Male, Phenotype, Genetic Selection, Sex Characteristics, Sexual Behavior
10.
Proc Natl Acad Sci U S A; 104(30): 12410-5, 2007 Jul 24.
Article in English | MEDLINE | ID: mdl-17640883

ABSTRACT

It is widely assumed that human noncoding sequences comprise a substantial reservoir for functional variants impacting gene regulation and other chromosomal processes. Evolutionarily conserved noncoding sequences (CNSs) in the human genome have attracted considerable attention for their potential to simplify the search for functional elements and phenotypically important human alleles. A major outstanding question is whether functionally significant human noncoding variation is concentrated in CNSs or distributed more broadly across the genome. Here, we combine whole genome sequence data from four nonhuman species (chimp, dog, mouse, and rat) with recently available comprehensive human polymorphism data to analyze selection at single-nucleotide resolution. We show that a substantial fraction of active purifying selection in human noncoding sequences occurs outside of CNSs and is diffusely distributed across the genome. This finding suggests the existence of a large complement of human noncoding variants that may impact gene expression and phenotypic traits, the majority of which will escape detection with current approaches to genome analysis.


Subjects
Human Genome/genetics, Untranslated RNA/genetics, Genetic Selection, Animals, Conserved Sequence, Humans, Nucleotides/genetics, Open Reading Frames/genetics