Results 1 - 20 of 58
1.
Genet Med ; 26(4): 101068, 2024 04.
Article in English | MEDLINE | ID: mdl-38193396

ABSTRACT

PURPOSE: Widespread application of next-generation sequencing, combined with data exchange platforms, has provided molecular diagnoses for countless families. To maximize diagnostic yield, we implemented an unbiased semi-automated genematching algorithm based on genotype and phenotype matching. METHODS: Rare homozygous variants identified in 2 or more affected individuals, but not in healthy individuals, were extracted from our local database of ∼12,000 exomes. Phenotype similarity scores (PSS), based on human phenotype ontology terms, were assigned to each pair of individuals matched at the genotype level using HPOsim. RESULTS: 33,792 genotype-matched pairs were discovered, representing variants in 7567 unique genes. There was an enrichment of PSS ≥0.1 among pathogenic/likely pathogenic variant-level pairs (94.3% in pathogenic/likely pathogenic variant-level matches vs 34.75% in all matches). We highlighted founder or region-specific variants as an internal positive control and proceeded to identify candidate disease genes. Variant-level matches were particularly helpful in cases involving inframe indels and splice region variants beyond the canonical splice sites, which may otherwise have been disregarded, allowing for detection of candidate disease genes, such as KAT2A, RPAIN, and LAMP3. CONCLUSION: Semi-automated genotype matching combined with PSS is a powerful tool to resolve variants of uncertain significance and to identify candidate disease genes.


Subjects
Genotype, Humans, Phenotype, Mutation, Homozygote, Genetic Association Studies
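
The genotype-matching step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input layout, the function names, and the Jaccard score are assumptions made for illustration. In particular, the study computed phenotype similarity with HPOsim, whose scoring is not reproduced here; a plain Jaccard index over HPO term sets stands in for it.

```python
from itertools import combinations


def genotype_matched_pairs(homozygous_variants, affected):
    """Return (variant, person_a, person_b) for every pair of affected
    individuals sharing a homozygous variant that no healthy individual carries.

    homozygous_variants: dict mapping individual ID -> set of variant IDs
    affected: set of individual IDs considered affected
    (Both structures are hypothetical, chosen for this sketch.)
    """
    carriers = {}
    for person, variants in homozygous_variants.items():
        for variant in variants:
            carriers.setdefault(variant, set()).add(person)

    pairs = []
    for variant, people in sorted(carriers.items()):
        if people - affected:      # variant seen in a healthy individual
            continue
        if len(people) < 2:        # need at least two affected carriers
            continue
        for a, b in combinations(sorted(people), 2):
            pairs.append((variant, a, b))
    return pairs


def jaccard_similarity(terms_a, terms_b):
    """Stand-in phenotype score: shared HPO terms over all HPO terms."""
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)
```

Each returned pair would then be scored on phenotype, and pairs above a chosen threshold (the abstract reports enrichment at PSS ≥0.1) kept for review.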
2.
Mol Cell ; 57(6): 1034-1046, 2015 Mar 19.
Article in English | MEDLINE | ID: mdl-25794615

ABSTRACT

DNA binding by numerous transcription factors including the p53 tumor suppressor protein constitutes a vital early step in transcriptional activation. While the role of the central core DNA binding domain (DBD) of p53 in site-specific DNA binding has been established, the contribution of the sequence-independent C-terminal domain (CTD) is still not well understood. We investigated the DNA-binding properties of a series of p53 CTD variants using a combination of in vitro biochemical analyses and in vivo binding experiments. Our results provide several unanticipated and interconnected findings. First, the CTD enables DNA binding in a sequence-dependent manner that is drastically altered by either its modification or deletion. Second, dependence on the CTD correlates with the extent to which the p53 binding site deviates from the canonical consensus sequence. Third, the CTD enables stable formation of p53-DNA complexes to divergent binding sites via DNA-induced conformational changes within the DBD itself.


Subjects
DNA/metabolism, Tumor Suppressor Protein p53/chemistry, Tumor Suppressor Protein p53/metabolism, Binding Sites, DNA/chemistry, Humans, Ligands, Tertiary Protein Structure, Response Elements, Sequence Deletion, Tumor Suppressor Protein p53/genetics
3.
Int J Mol Sci ; 24(11)2023 Jun 01.
Article in English | MEDLINE | ID: mdl-37298562

ABSTRACT

Origins of replication are genomic regions in which replication initiates in a bidirectional manner. Recently, a new methodology (origin-derived single-stranded DNA sequencing; ori-SSDS) was developed that allows the detection of replication initiation in a strand-specific manner. Reanalysis of the strand-specific data revealed that 18-33% of the peaks are non-symmetrical, suggesting a single direction of replication. Analysis of replication fork direction data revealed that these are origins of replication in which the replication is paused in one of the directions, probably due to the existence of a replication fork barrier. Analysis of the unidirectional origins revealed a preference of G4 quadruplexes for the blocked leading strand. Taken together, our analysis identified hundreds of genomic locations in which the replication initiates only in one direction, and suggests that G4 quadruplexes may serve as replication fork barriers in such places.


Subjects
DNA Replication, Single-Stranded DNA, Animals, Mice, DNA Replication/genetics, Single-Stranded DNA/genetics, Replication Origin/genetics
4.
Genome Res ; 28(10): 1455-1466, 2018 10.
Article in English | MEDLINE | ID: mdl-30166406

ABSTRACT

Mitosis encompasses key molecular changes including chromatin condensation, nuclear envelope breakdown, and reduced transcription levels. Immediately after mitosis, the interphase chromatin structure is reestablished and transcription resumes. The reestablishment of the interphase chromatin is probably achieved by "bookmarking," i.e., the retention of at least partial information during mitosis. To gain a deeper understanding of the contribution of histone modifications to the mitotic bookmarking process, we merged proteomics, immunofluorescence, and ChIP-seq approaches. We focused on key histone modifications and employed HeLa-S3 cells as a model system. Generally, in spite of the general hypoacetylation observed during mitosis, we observed a global concordance between the genomic organization of histone modifications in interphase and mitosis, suggesting that the epigenomic landscape may serve as a component of the mitotic bookmarking process. Next, we investigated the nucleosome that enters nucleosome depleted regions (NDRs) during mitosis. We observed that in ∼60% of the NDRs, the entering nucleosome is distinct from the surrounding highly acetylated nucleosomes and appears to have either low levels of acetylation or high levels of phosphorylation in adjacent residues (since adjacent phosphorylation may interfere with the ability to detect acetylation). Inhibition of histone deacetylases (HDACs) by the small molecule TSA reverts this pattern, suggesting that these nucleosomes are specifically deacetylated during mitosis. Altogether, by merging multiple approaches, our study provides evidence to support a model where histone modifications may play a role in mitotic bookmarking and uncovers new insights into the deposition of nucleosomes during mitosis.


Subjects
Histones/metabolism, Mitosis, Nucleosomes/genetics, Acetylation/drug effects, Chromatin Immunoprecipitation, HeLa Cells, Histone Code, Histone Deacetylase Inhibitors/pharmacology, Histone Deacetylases/metabolism, Humans, Nucleosomes/drug effects, Nucleosomes/metabolism, Phosphorylation, Proteomics
5.
Nature ; 519(7544): 468-71, 2015 Mar 26.
Article in English | MEDLINE | ID: mdl-25762143

ABSTRACT

Stochastic processes in cells are associated with fluctuations in mRNA, protein production and degradation, noisy partition of cellular components at division, and other cell processes. Variability within a clonal population of cells originates from such stochastic processes, which may be amplified or reduced by deterministic factors. Cell-to-cell variability, such as that seen in the heterogeneous response of bacteria to antibiotics, or of cancer cells to treatment, is understood as the inevitable consequence of stochasticity. Variability in cell-cycle duration was observed long ago; however, its sources are still unknown. A central question is whether the variance of the observed distribution originates from stochastic processes, or whether it arises mostly from a deterministic process that only appears to be random. A surprising feature of cell-cycle-duration inheritance is that it seems to be lost within one generation but to be still present in the next generation, generating poor correlation between mother and daughter cells but high correlation between cousin cells. This observation suggests the existence of underlying deterministic factors that determine the main part of cell-to-cell variability. We developed an experimental system that precisely measures the cell-cycle duration of thousands of mammalian cells along several generations and a mathematical framework that allows discrimination between stochastic and deterministic processes in lineages of cells. We show that the inter- and intra-generation correlations reveal complex inheritance of the cell-cycle duration. Finally, we build a deterministic nonlinear toy model for cell-cycle inheritance that reproduces the main features of our data. Our approach constitutes a general method to identify deterministic variability in lineages of cells or organisms, which may help to predict and, eventually, reduce cell-to-cell heterogeneity in various systems, such as cancer cells under treatment.


Subjects
Cell Cycle/genetics, Cell Lineage, Animals, Anti-Bacterial Agents/pharmacology, Cell Cycle/drug effects, Cell Division/drug effects, Cell Division/genetics, Cell Line, Mammals, Biological Models, Stochastic Processes, Time Factors
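
The comparison of mother-daughter and cousin correlations in this abstract can be made concrete with a short Python sketch. It is an illustration only, not the authors' mathematical framework: the lineage encoding (a dict from each cell to its mother) and all function names are assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lineage_pairs(parent):
    """Split a lineage into mother-daughter pairs and first-cousin pairs.

    parent: dict mapping cell ID -> mother ID (None for a founder cell).
    """
    mother_daughter = [(m, c) for c, m in parent.items() if m is not None]
    cousins = []
    for a, b in combinations(sorted(parent), 2):
        mother_a, mother_b = parent.get(a), parent.get(b)
        if mother_a is None or mother_b is None or mother_a == mother_b:
            continue  # founders and sisters are not cousins
        grand_a, grand_b = parent.get(mother_a), parent.get(mother_b)
        if grand_a is not None and grand_a == grand_b:
            cousins.append((a, b))
    return mother_daughter, cousins


def pair_correlation(durations, pairs):
    """Correlate cell-cycle durations across the two members of each pair."""
    xs = [durations[a] for a, _ in pairs]
    ys = [durations[b] for _, b in pairs]
    return pearson(xs, ys)
```

Applied to measured durations, a low `pair_correlation` for mother-daughter pairs alongside a high one for cousin pairs would correspond to the pattern the abstract describes.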
6.
Nucleic Acids Res ; 46(16): 8299-8310, 2018 09 19.
Article in English | MEDLINE | ID: mdl-29986092

ABSTRACT

Mammalian DNA replication is a highly organized and regulated process. Large, Mb-sized regions are replicated at defined times along S-phase. Replication Timing (RT) is thought to play a role in shaping the mammalian genome by affecting mutation rates. Previous analyses relied on somatic RT profiles. However, only germline mutations are passed on to offspring and affect genomic composition. Therefore, germ cell RT information is necessary to evaluate the influences of RT on the mammalian genome. We adapted the RT mapping technique for limited amounts of cells, and measured RT from two stages in the mouse germline - primordial germ cells (PGCs) and spermatogonial stem cells (SSCs). RT in germline cells exhibited stronger correlations to both mutation rate and recombination hotspots density than those of RT in somatic tissues, emphasizing the importance of using correct tissues-of-origin for RT profiling. Germline RT maps exhibited stronger correlations to additional genetic features including GC-content, transposable elements (SINEs and LINEs), and gene density. GC content stratification and multiple regression analysis revealed independent contributions of RT to SINE, gene, mutation, and recombination hotspot densities. Together, our results establish a central role for RT in shaping multiple levels of mammalian genome composition.


Subjects
DNA Replication Timing/genetics, DNA Replication/genetics, Genome/genetics, Germ Cells/metabolism, Stem Cells/metabolism, Animals, Base Composition/genetics, Tumor Cell Line, Cultured Cells, DNA Transposable Elements/genetics, Female, Germ Cells/cytology, Germ-Line Mutation, Male, Mammals/genetics, 129 Strain Mice, Inbred C57BL Mice, Inbred DBA Mice, Inbred NOD Mice, SCID Mice, Transgenic Mice, Short Interspersed Nucleotide Elements/genetics, Stem Cells/cytology
7.
Nat Rev Genet ; 13(8): 552-64, 2012 Jul 18.
Article in English | MEDLINE | ID: mdl-22805708

ABSTRACT

Biological processes are often dynamic, thus researchers must monitor their activity at multiple time points. The most abundant source of information regarding such dynamic activity is time-series gene expression data. These data are used to identify the complete set of activated genes in a biological process, to infer their rates of change, their order and their causal effects and to model dynamic systems in the cell. In this Review we discuss the basic patterns that have been observed in time-series experiments, how these patterns are combined to form expression programs, and the computational analysis, visualization and integration of these data to infer models of dynamic biological systems.


Subjects
Gene Expression Profiling, Gene Expression, Genetic Models, Animals, Statistical Data Interpretation, Genetic Epigenesis/genetics, Humans, Mice, Signal Transduction
8.
Bioessays ; 38(1): 8-13, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26628302

ABSTRACT

We describe a recent approach for distinguishing between stochastic and deterministic sources of variability, focusing on the mammalian cell cycle. Variability between cells is often attributed to stochastic noise, although it may be generated by deterministic components. Interestingly, lineage information can be used to distinguish between variability and determinism. Analysis of correlations within a lineage of the mammalian cell cycle duration revealed its deterministic nature. Here, we discuss the sources of such variability and the possibility that the underlying deterministic process is due to the circadian clock. Finally, we discuss the "kicked cell cycle" model and its implication on the study of the cell cycle in healthy and cancerous tissues.


Subjects
Cell Cycle/genetics, Cell Division/genetics, Theoretical Models, Neoplasms/genetics, Cell Lineage, Humans, Stochastic Processes
9.
Nucleic Acids Res ; 44(9): 4222-32, 2016 05 19.
Article in English | MEDLINE | ID: mdl-27085808

ABSTRACT

Genome sequence compositions and epigenetic organizations are correlated extensively across multiple length scales. Replication dynamics, in particular, is highly correlated with GC content. We combine genome-wide time of replication (ToR) data, topological domains maps and detailed functional epigenetic annotations to study the correlations between replication timing and GC content at multiple scales. We find that the decrease in genomic GC content at large scale late replicating regions can be explained by mutation bias favoring A/T nucleotide, without selection or biased gene conversion. Quantification of the free dNTP pool during the cell cycle is consistent with a mechanism involving replication-coupled mutation spectrum that favors AT nucleotides at late S-phase. We suggest that mammalian GC content composition is shaped by independent forces, globally modulating mutation bias and locally selecting on functional element. Deconvoluting these forces and analyzing them on their native scales is important for proper characterization of complex genomic correlations.


Subjects
DNA Replication, Base Composition, Tumor Cell Line, Chromatin/genetics, Molecular Evolution, Human Genome, Humans, Mutation
10.
Int J Mol Sci ; 18(6)2017 May 25.
Article in English | MEDLINE | ID: mdl-28587102

ABSTRACT

Cancer and genomic instability are highly impacted by the deoxyribonucleic acid (DNA) replication program. Inaccuracies in DNA replication lead to the increased acquisition of mutations and structural variations. These inaccuracies mainly stem from loss of DNA fidelity due to replication stress or due to aberrations in the temporal organization of the replication process. Here we review the mechanisms and impact of these major sources of error to the replication program.


Subjects
DNA Replication, Genomic Instability, Mutation, Neoplasms/genetics, Animals, Carcinogens, Neoplastic Cell Transformation/genetics, DNA Damage, Disease Progression, Humans, Neoplasms/metabolism, Neoplasms/pathology, Neoplasms/therapy, Physiological Stress/genetics, Time Factors
11.
Nat Genet ; 39(2): 232-6, 2007 Feb.
Article in English | MEDLINE | ID: mdl-17200670

ABSTRACT

Many genes associated with CpG islands undergo de novo methylation in cancer. Studies have suggested that the pattern of this modification may be partially determined by an instructive mechanism that recognizes specifically marked regions of the genome. Using chromatin immunoprecipitation analysis, here we show that genes methylated in cancer cells are specifically packaged with nucleosomes containing histone H3 trimethylated on Lys27. This chromatin mark is established on these unmethylated CpG island genes early in development and then maintained in differentiated cell types by the presence of an EZH2-containing Polycomb complex. In cancer cells, as opposed to normal cells, the presence of this complex brings about the recruitment of DNA methyl transferases, leading to de novo methylation. These results suggest that tumor-specific targeting of de novo methylation is pre-programmed by an established epigenetic system that normally has a role in marking embryonic genes for repression.


Subjects
DNA Methylation, Histones/metabolism, Neoplasms/genetics, Caco-2 Cells, Carrier Proteins, Cultured Cells, Colonic Neoplasms/genetics, CpG Islands/genetics, Genetic Epigenesis, Humans, Lysine/metabolism, Methylation, Methyltransferases/metabolism, Viral Envelope Proteins
12.
Nat Genet ; 38(2): 149-53, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16444255

ABSTRACT

DNA methylation has a role in the regulation of gene expression during normal mammalian development but can also mediate epigenetic silencing of CpG island genes in cancer and other diseases. Many individual genes (including tumor suppressors) have been shown to undergo de novo methylation in specific tumor types, but the biological logic inherent in this process is not understood. To decipher this mechanism, we have adopted a new approach for detecting CpG island DNA methylation that can be used together with microarray technology. Genome-wide analysis by this technique demonstrated that tumor-specific methylated genes belong to distinct functional categories, have common sequence motifs in their promoters and are found in clusters on chromosomes. In addition, many are already repressed in normal cells. These results are consistent with the hypothesis that cancer-related de novo methylation may come about through an instructive mechanism.


Subjects
DNA Methylation, Neoplastic Gene Expression Regulation, Genetic Models, Neoplasms/genetics, Animals, Chromosomes/genetics, Computational Biology, Genome, Neoplasms/pathology
13.
Biology (Basel) ; 13(3)2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38534445

ABSTRACT

Traditional gene set enrichment analysis falters when applied to large genomic domains, where neighboring genes often share functions. This spatial dependency creates misleading enrichments, mistaking mere physical proximity for genuine biological connections. Here we present Spatial Adjusted Gene Ontology (SAGO), a novel cyclic permutation-based approach, to tackle this challenge. SAGO separates enrichments due to spatial proximity from genuine biological links by incorporating the genes' spatial arrangement into the analysis. We applied SAGO to various datasets in which the identified genomic intervals are large, including replication timing domains, large H3K9me3 and H3K27me3 domains, HiC compartments and lamina-associated domains (LADs). Intriguingly, applying SAGO to prostate cancer samples with large copy number alteration (CNA) domains eliminated most of the enriched GO terms, thus helping to accurately identify biologically relevant gene sets linked to oncogenic processes, free from spatial bias.
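
The idea of a cyclic permutation null can be sketched in a few lines of Python. This is not the SAGO implementation, which the abstract does not detail; it is a minimal illustration that assumes genes are listed in genomic order with two boolean labels per gene, and the function name is hypothetical. Rotating the domain labels along the gene order keeps their spatial clustering intact while breaking their link to the annotation.

```python
def cyclic_permutation_pvalue(in_domain, annotated):
    """Enrichment p-value under a cyclic-shift null.

    in_domain: list of booleans, True if the gene lies in a domain of interest
    annotated: list of booleans, True if the gene carries the GO term
    Both lists follow the same genomic gene order (an assumption of this sketch).
    """
    n = len(in_domain)
    if n == 0 or n != len(annotated):
        raise ValueError("inputs must be non-empty and of equal length")

    def overlap(shift):
        # Count annotated genes that fall in the domain after rotating
        # the domain labels by `shift` positions along the gene order.
        return sum(
            1 for i in range(n) if in_domain[(i - shift) % n] and annotated[i]
        )

    observed = overlap(0)
    # Enumerate every non-identity rotation instead of sampling at random.
    as_extreme = sum(1 for s in range(1, n) if overlap(s) >= observed)
    # The observed arrangement counts as one of the n rotations.
    return observed, (1 + as_extreme) / n
```

Enumerating all rotations makes the result deterministic; with genome-scale gene lists, sampling random offsets would be the cheaper alternative.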

14.
Cell Rep Med ; 5(6): 101608, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38866015

ABSTRACT

While mutational signatures provide a plethora of prognostic and therapeutic insights, their application in clinical-setting, targeted gene panels is extremely limited. We develop a mutational representation model (which learns and embeds specific mutation signature connections) that enables prediction of dominant signatures with only a few mutations. We predict the dominant signatures across more than 60,000 tumors with gene panels, delineating their landscape across different cancers. Dominant signature predictions in gene panels are of clinical importance. These included UV, tobacco, and apolipoprotein B mRNA editing enzyme, catalytic polypeptide (APOBEC) signatures that are associated with better survival, independently from mutational burden. Further analyses reveal gene and mutation associations with signatures, such as SBS5 with TP53 and APOBEC with FGFR3S249C. In a clinical use case, APOBEC signature is a robust and specific predictor for resistance to epidermal growth factor receptor-tyrosine kinase inhibitors (EGFR-TKIs). Our model provides an easy-to-use way to detect signatures in clinical setting assays with many possible clinical implications for an unprecedented number of cancer patients.


Subjects
Mutation, Neoplasms, Humans, Mutation/genetics, Neoplasms/genetics, ErbB Receptors/genetics, Protein Kinase Inhibitors/pharmacology, Tumor Suppressor Protein p53/genetics, Computer Neural Networks, Fibroblast Growth Factor Receptor Type 3/genetics
15.
Genome Res ; 20(4): 526-36, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20219943

ABSTRACT

Information about the binding preferences of many transcription factors is known and characterized by a sequence binding motif. However, determining regions of the genome in which a transcription factor binds based on its motif is a challenging problem, particularly in species with large genomes, since there are often many sequences containing matches to the motif but are not bound. Several rules based on sequence conservation or location, relative to a transcription start site, have been proposed to help differentiate true binding sites from random ones. Other evidence sources may also be informative for this task. We developed a method for integrating multiple evidence sources using logistic regression classifiers. Our method works in two steps. First, we infer a score quantifying the general binding preferences of transcription factor binding at all locations based on a large set of evidence features, without using any motif specific information. Then, we combined this general binding preference score with motif information for specific transcription factors to improve prediction of regions bound by the factor. Using cross-validation and new experimental data we show that, surprisingly, the general binding preference can be highly predictive of true locations of transcription factor binding even when no binding motif is used. When combined with motif information our method outperforms previous methods for predicting locations of true binding.


Subjects
Computational Biology/methods, Human Genome, Systems Integration, Transcription Factors/metabolism, Binding Sites/genetics, Cultured Cells, Chromosome Mapping/methods, Forecasting, HCT116 Cells, HeLa Cells, Humans, Jurkat Cells, Logistic Models, Protein Binding, DNA Sequence Analysis, Transcription Initiation Site
16.
PLoS Genet ; 6(7): e1001011, 2010 Jul 01.
Article in English | MEDLINE | ID: mdl-20617169

ABSTRACT

Recent evidence suggests that the timing of DNA replication is coordinated across megabase-scale domains in metazoan genomes, yet the importance of this aspect of genome organization is unclear. Here we show that replication timing is remarkably conserved between human and mouse, uncovering large regions that may have been governed by similar replication dynamics since these species have diverged. This conservation is both tissue-specific and independent of the genomic G+C content conservation. Moreover, we show that time of replication is globally conserved despite numerous large-scale genome rearrangements. We systematically identify rearrangement fusion points and demonstrate that replication time can be locally diverged at these loci. Conversely, rearrangements are shown to be correlated with early replication and physical chromosomal proximity. These results suggest that large chromosomal domains of coordinated replication are shuffled by evolution while conserving the large-scale nuclear architecture of the genome.


Subjects
Mammalian Chromosomes/genetics, DNA Replication Timing, Molecular Evolution, Mammals/genetics, Animals, Cell Line, Chromosome Mapping, Humans, Mice
17.
Sci Rep ; 13(1): 7833, 2023 05 15.
Article in English | MEDLINE | ID: mdl-37188696

ABSTRACT

Mutational signatures' association with replication timing (RT) has been studied in cancer samples, but the RT distribution of somatic mutations in non-cancerous cells was only minimally explored. Here, we performed comprehensive analyses of mutational signatures in 2.9 million somatic mutations across multiple non-cancerous tissues, stratified by early and late RT regions. We found that many mutational processes are active mainly or solely in early RT, such as SBS16 in hepatocytes and SBS88 in the colon, or in late RT, such as SBS4 in lung and hepatocytes, and SBS18 across many tissues. The two ubiquitous signatures, SBS1 and SBS5, showed late and early bias, respectively, across multiple tissues and in mutations representing germ cells. We also performed a direct comparison with cancer samples in 4 matched tissue-cancer types. Unexpectedly, while for most signatures the RT bias was consistent in normal tissue and in cancer, we found that SBS1's late RT bias is lost in cancer.


Subjects
Neoplasms, Humans, Mutation, Neoplasms/genetics, DNA Replication Timing, Colon, Hepatocytes
18.
Sci Rep ; 13(1): 13143, 2023 08 12.
Article in English | MEDLINE | ID: mdl-37573368

ABSTRACT

Cancer somatic mutations are the product of multiple mutational and repair processes, some of which are tightly associated with DNA replication. Mutation rates (MR) are known to be higher in late replication timing (RT) regions, but different processes can affect this association. Systematic analysis of the mutational landscape of 2787 tumors from 32 tumor types revealed that approximately one third of the tumor samples show weak association between replication timing and mutation rate. Further analyses revealed that those samples have unique mutational signatures and are enriched with mutations in genes involved in DNA replication, DNA repair and chromatin structure. Surprisingly, analysis of differentially expressed genes between weak and strong RT-MR association groups revealed that tumors with weak association are enriched with genes associated with cell-cell communication and the immune system, suggesting a non-autonomous response to DNA damage.


Subjects
Mutation Rate, Neoplasms, Humans, Mutation, DNA Repair/genetics, DNA Damage/genetics, Neoplasms/genetics, Neoplasms/pathology, DNA Replication/genetics, Human Genome
19.
Biophys J ; 102(8): 1712-21, 2012 Apr 18.
Article in English | MEDLINE | ID: mdl-22768926

ABSTRACT

Two major classes of small regulatory RNAs--small interfering RNAs (siRNAs) and microRNA (miRNAs)--are involved in a common RNA interference processing pathway. Small RNAs within each of these families were found to compete for limiting amounts of shared components, required for their biogenesis and processing. Association with Argonaute (Ago), the catalytic component of the RNA silencing complex, was suggested as the central mechanistic point in RNA interference machinery competition. Aiming to better understand the competition between small RNAs in the cell, we present a mathematical model and characterize a range of specific cell and experimental parameters affecting the competition. We apply the model to competition between miRNAs and study the change in the expression level of their target genes under a variety of conditions. We show quantitatively that the amount of Ago and miRNAs in the cell are dominant factors contributing greatly to the competition. Interestingly, we observe what to our knowledge is a novel type of competition that takes place when Ago is abundant, by which miRNAs with shared targets compete over them. Furthermore, we use the model to examine different interaction mechanisms that might operate in establishing the miRNA-Ago complexes, mainly those related to their stability and recycling. Our model provides a mathematical framework for future studies of competition effects in regulation mechanisms involving small RNAs.


Subjects
MicroRNAs/genetics, MicroRNAs/metabolism, Genetic Models, Small Interfering RNA/genetics, Small Interfering RNA/metabolism, Argonaute Proteins/metabolism, Competitive Binding, Gene Expression Regulation, Humans, Kinetics, RNA Stability, Messenger RNA/chemistry, Messenger RNA/genetics, Messenger RNA/metabolism
20.
Bioinformatics ; 27(17): 2361-7, 2011 Sep 01.
Article in English | MEDLINE | ID: mdl-21752801

ABSTRACT

MOTIVATION: Motif discovery is now routinely used in high-throughput studies including large-scale sequencing and proteomics. These datasets present new challenges. The first is speed. Many motif discovery methods do not scale well to large datasets. Another issue is identifying discriminative rather than generative motifs. Such discriminative motifs are important for identifying co-factors and for explaining changes in behavior between different conditions. RESULTS: To address these issues we developed a method for DECOnvolved Discriminative motif discovery (DECOD). DECOD uses a k-mer count table and so its running time is independent of the size of the input set. By deconvolving the k-mers DECOD considers context information without using the sequences directly. DECOD outperforms previous methods both in speed and in accuracy when using simulated and real biological benchmark data. We performed new binding experiments for p53 mutants and used DECOD to identify p53 co-factors, suggesting new mechanisms for p53 activation. AVAILABILITY: The source code and binaries for DECOD are available at http://www.sb.cs.cmu.edu/DECOD CONTACT: zivbj@cs.cmu.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
DNA/chemistry, Nucleotide Motifs, DNA Sequence Analysis, Algorithms, Base Sequence, Tumor Suppressor Protein p53/metabolism
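
The role of a k-mer count table in discriminative motif discovery can be illustrated with a small Python sketch. It is not DECOD: the deconvolution step the abstract describes is omitted, and the scoring rule (difference in relative k-mer frequency between the two sets) and the function names are assumptions chosen for simplicity. Once the two tables are built, ranking depends only on the tables and not on the number of input sequences.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Build a k-mer count table over all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def discriminative_kmers(positive, negative, k, top=5):
    """Rank k-mers by how much more frequent they are in the positive set.

    The score is the k-mer's relative frequency in the positive set minus
    its relative frequency in the negative set (a simplification, not
    DECOD's objective).
    """
    pos = kmer_counts(positive, k)
    neg = kmer_counts(negative, k)
    pos_total = sum(pos.values()) or 1
    neg_total = sum(neg.values()) or 1
    scores = {
        kmer: pos[kmer] / pos_total - neg[kmer] / neg_total
        for kmer in set(pos) | set(neg)
    }
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top]
```

A full method would go on to assemble high-scoring k-mers into a position weight matrix; this sketch stops at the ranking.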