Results 1 - 20 of 48

1.
Cell ; 184(8): 2239-2254.e39, 2021 04 15.
Article in English | MEDLINE | ID: mdl-33831375

ABSTRACT

Intra-tumor heterogeneity (ITH) is a mechanism of therapeutic resistance and therefore an important clinical challenge. However, the extent, origin, and drivers of ITH across cancer types are poorly understood. To address this, we extensively characterize ITH across whole-genome sequences of 2,658 cancer samples spanning 38 cancer types. Nearly all informative samples (95.1%) contain evidence of distinct subclonal expansions with frequent branching relationships between subclones. We observe positive selection of subclonal driver mutations across most cancer types and identify cancer type-specific subclonal patterns of driver gene mutations, fusions, structural variants, and copy number alterations as well as dynamic changes in mutational processes between subclonal expansions. Our results underline the importance of ITH and its drivers in tumor evolution and provide a pan-cancer resource of comprehensively annotated subclonal events from whole-genome sequencing data.


Subject(s)
Genetic Heterogeneity , Neoplasms/genetics , DNA Copy Number Variations , DNA, Neoplasm/chemistry , DNA, Neoplasm/metabolism , Databases, Genetic , Drug Resistance, Neoplasm/genetics , Humans , Neoplasms/pathology , Polymorphism, Single Nucleotide , Whole Genome Sequencing
2.
Cell ; 173(4): 1003-1013.e15, 2018 05 03.
Article in English | MEDLINE | ID: mdl-29681457

ABSTRACT

The majority of newly diagnosed prostate cancers are slow growing, with a long natural life history. Yet a subset can metastasize with lethal consequences. We reconstructed the phylogenies of 293 localized prostate tumors linked to clinical outcome data. Multiple subclones were detected in 59% of patients, and specific subclonal architectures associate with adverse clinicopathological features. Early tumor development is characterized by point mutations and deletions followed by later subclonal amplifications and changes in trinucleotide mutational signatures. Specific genes are selectively mutated prior to or following subclonal diversification, including MTOR, NKX3-1, and RB1. Patients with low-risk monoclonal tumors rarely relapse after primary therapy (7%), while those with high-risk polyclonal tumors frequently do (61%). The presence of multiple subclones in an index biopsy may be necessary, but not sufficient, for relapse of localized prostate cancer, suggesting that evolution-aware biomarkers should be studied in prospective studies of low-risk tumors suitable for active surveillance.


Subject(s)
Prostatic Neoplasms/pathology , Biomarkers, Tumor/blood , High-Throughput Nucleotide Sequencing , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Humans , Male , Neoplasm Grading , Neoplasm Recurrence, Local , Polymorphism, Single Nucleotide , Proportional Hazards Models , Prospective Studies , Prostatic Neoplasms/classification , Prostatic Neoplasms/genetics , Retinoblastoma Binding Proteins/genetics , Retinoblastoma Binding Proteins/metabolism , TOR Serine-Threonine Kinases/genetics , TOR Serine-Threonine Kinases/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism , Ubiquitin-Protein Ligases/genetics , Ubiquitin-Protein Ligases/metabolism
3.
Nature ; 578(7793): 122-128, 2020 02.
Article in English | MEDLINE | ID: mdl-32025013

ABSTRACT

Cancer develops through a process of somatic evolution [1,2]. Sequencing data from a single biopsy represent a snapshot of this process that can reveal the timing of specific genomic aberrations and the changing influence of mutational processes [3]. Here, by whole-genome sequencing analysis of 2,658 cancers as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) [4], we reconstruct the life history and evolution of mutational processes and driver mutation sequences of 38 types of cancer. Early oncogenesis is characterized by mutations in a constrained set of driver genes, and specific copy number gains, such as trisomy 7 in glioblastoma and isochromosome 17q in medulloblastoma. The mutational spectrum changes significantly throughout tumour evolution in 40% of samples. A nearly fourfold diversification of driver genes and increased genomic instability are features of later stages. Copy number alterations often occur in mitotic crises, and lead to simultaneous gains of chromosomal segments. Timing analyses suggest that driver mutations often precede diagnosis by many years, if not decades. Together, these results determine the evolutionary trajectories of cancer, and highlight opportunities for early cancer detection.


Subject(s)
Evolution, Molecular , Genome, Human/genetics , Neoplasms/genetics , DNA Repair/genetics , Gene Dosage , Genes, Tumor Suppressor , Genetic Variation , Humans , Mutagenesis, Insertional/genetics
4.
Nat Methods ; 18(2): 144-155, 2021 02.
Article in English | MEDLINE | ID: mdl-33398189

ABSTRACT

Subclonal reconstruction from bulk tumor DNA sequencing has become a pillar of cancer evolution studies, providing insight into the clonality and relative ordering of mutations and mutational processes. We provide an outline of the complex computational approaches used for subclonal reconstruction from single and multiple tumor samples. We identify the underlying assumptions and uncertainties in each step and suggest best practices for analysis and quality assessment. This guide provides a pragmatic resource for the growing user community of subclonal reconstruction methods.


Subject(s)
DNA, Neoplasm/genetics , Neoplasms/genetics , Sequence Analysis, DNA/methods , Algorithms , Humans , Polymorphism, Single Nucleotide
6.
Cell ; 133(7): 1266-76, 2008 Jun 27.
Article in English | MEDLINE | ID: mdl-18585359

ABSTRACT

Most homeodomains are unique within a genome, yet many are highly conserved across vast evolutionary distances, implying strong selection on their precise DNA-binding specificities. We determined the binding preferences of the majority (168) of mouse homeodomains to all possible 8-base sequences, revealing rich and complex patterns of sequence specificity and showing that there are at least 65 distinct homeodomain DNA-binding activities. We developed a computational system that successfully predicts binding sites for homeodomain proteins as distant from mouse as Drosophila and C. elegans, and we infer full 8-mer binding profiles for the majority of known animal homeodomains. Our results provide an unprecedented level of resolution in the analysis of this simple domain structure and suggest that variation in sequence recognition may be a factor in its functional diversity and evolutionary success.


Subject(s)
DNA/chemistry , Homeodomain Proteins/chemistry , Animals , Base Sequence , Computational Biology , Conserved Sequence , DNA/metabolism , Evolution, Molecular , Homeodomain Proteins/metabolism , Mice , Models, Molecular , Protein Binding , Transcription Factors/chemistry , Transcription Factors/metabolism
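
To give a sense of scale for the "all possible 8-base sequences" mentioned in the abstract above, the following Python sketch enumerates every DNA 8-mer and merges each one with its reverse complement. Merging the two strands is an assumption made here for illustration (the abstract does not say how strands were handled), and the function names are hypothetical.

```python
from itertools import product

# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written in A/C/G/T."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(k: int) -> set[str]:
    """Enumerate all DNA k-mers, keeping one representative per strand pair.

    Each k-mer and its reverse complement describe the same double-stranded
    site, so the lexicographically smaller of the two is kept.
    """
    kmers = set()
    for letters in product("ACGT", repeat=k):
        seq = "".join(letters)
        kmers.add(min(seq, reverse_complement(seq)))
    return kmers


if __name__ == "__main__":
    print("all 8-mers:", 4 ** 8)
    print("strand-merged 8-mers:", len(canonical_kmers(8)))
```

There are 4^8 = 65,536 ordered 8-mers; 4^4 = 256 of them equal their own reverse complement, so strand merging leaves (65,536 + 256) / 2 = 32,896 distinct sites.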
7.
Nucleic Acids Res ; 47(6): 2856-2870, 2019 04 08.
Article in English | MEDLINE | ID: mdl-30698747

ABSTRACT

Stress hormones bind and activate the glucocorticoid receptor (GR) in many tissues including the brain. We identified arginine and glutamate rich 1 (ARGLU1) in a screen for new modulators of glucocorticoid signaling in the CNS. Biochemical studies show that the glutamate rich C-terminus of ARGLU1 coactivates multiple nuclear receptors including the glucocorticoid receptor (GR) and the arginine rich N-terminus interacts with splicing factors and binds to RNA. RNA-seq of neural cells depleted of ARGLU1 revealed significant changes in the expression and alternative splicing of distinct genes involved in neurogenesis. Loss of ARGLU1 is embryonic lethal in mice, and knockdown in zebrafish causes neurodevelopmental and heart defects. Treatment with dexamethasone, a GR activator, also induces changes in the pattern of alternatively spliced genes, many of which were lost when ARGLU1 was absent. Importantly, the genes found to be alternatively spliced in response to glucocorticoid treatment were distinct from those under transcriptional control by GR, suggesting an additional mechanism of glucocorticoid action is present in neural cells. Our results thus show that ARGLU1 is a novel factor for embryonic development that modulates basal transcription and alternative splicing in neural cells with consequences for glucocorticoid signaling.


Subject(s)
Embryonic Development , Glucocorticoids/pharmacology , Intracellular Signaling Peptides and Proteins/physiology , RNA Splicing/genetics , Transcriptional Activation/genetics , Alternative Splicing/drug effects , Alternative Splicing/genetics , Animals , Animals, Genetically Modified , Cells, Cultured , Embryo, Nonmammalian , Embryonic Development/drug effects , Embryonic Development/genetics , Glucocorticoids/metabolism , HEK293 Cells , Humans , Mice , Mice, Inbred C57BL , Neurogenesis/drug effects , Neurogenesis/genetics , RNA Splicing/drug effects , Signal Transduction/drug effects , Signal Transduction/genetics , Stress, Physiological/drug effects , Stress, Physiological/genetics , Trans-Activators/physiology , Transcriptional Activation/drug effects , Zebrafish
8.
Nature ; 499(7457): 172-7, 2013 Jul 11.
Article in English | MEDLINE | ID: mdl-23846655

ABSTRACT

RNA-binding proteins are key regulators of gene expression, yet only a small fraction have been functionally characterized. Here we report a systematic analysis of the RNA motifs recognized by RNA-binding proteins, encompassing 205 distinct genes from 24 diverse eukaryotes. The sequence specificities of RNA-binding proteins display deep evolutionary conservation, and the recognition preferences for a large fraction of metazoan RNA-binding proteins can thus be inferred from their RNA-binding domain sequence. The motifs that we identify in vitro correlate well with in vivo RNA-binding data. Moreover, we can associate them with distinct functional roles in diverse types of post-transcriptional regulation, enabling new insights into the functions of RNA-binding proteins both in normal physiology and in human disease. These data provide an unprecedented overview of RNA-binding proteins and their targets, and constitute an invaluable resource for determining post-transcriptional regulatory mechanisms in eukaryotes.


Subject(s)
Gene Expression Regulation/genetics , Nucleotide Motifs/genetics , RNA-Binding Proteins/metabolism , Autistic Disorder/genetics , Base Sequence , Binding Sites/genetics , Conserved Sequence/genetics , Eukaryotic Cells/metabolism , Humans , Molecular Sequence Data , Protein Structure, Tertiary/genetics , RNA Splicing Factors , RNA Stability/genetics , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/genetics
9.
Methods ; 118-119: 3-15, 2017 04 15.
Article in English | MEDLINE | ID: mdl-27956239

ABSTRACT

RNA-binding proteins (RBPs) participate in diverse cellular processes and have important roles in human development and disease. The human genome, and that of many other eukaryotes, encodes hundreds of RBPs that contain canonical sequence-specific RNA-binding domains (RBDs) as well as numerous other unconventional RNA binding proteins (ucRBPs). ucRBPs physically associate with RNA but lack common RBDs. The degree to which these proteins bind RNA, in a sequence specific manner, is unknown. Here, we provide a detailed description of both the laboratory and data processing methods for RNAcompete, a method we have previously used to analyze the RNA binding preferences of hundreds of RBD-containing RBPs, from diverse eukaryotes. We also determine the RNA-binding preferences for two human ucRBPs, NUDT21 and CNBP, and use this analysis to exemplify the RNAcompete pipeline. The results of our RNAcompete experiments are consistent with independent RNA-binding data for these proteins and demonstrate the utility of RNAcompete for analyzing the growing repertoire of ucRBPs.


Subject(s)
Cleavage And Polyadenylation Specificity Factor/genetics , Microarray Analysis/methods , RNA-Binding Proteins/genetics , RNA/chemistry , Animals , Base Sequence , Binding Sites , Cleavage And Polyadenylation Specificity Factor/metabolism , Cloning, Molecular , DNA Primers/chemistry , DNA Primers/metabolism , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Expression , Humans , Protein Binding , Protein Domains , RNA/genetics , RNA/metabolism , RNA-Binding Proteins/metabolism , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Sequence Alignment
10.
Methods ; 126: 18-28, 2017 08 15.
Article in English | MEDLINE | ID: mdl-28651966

ABSTRACT

RNA-binding proteins recognize RNA sequences and structures, but there is currently no systematic and accurate method to derive large (>12-base) motifs de novo that reflect a combination of intrinsic preference to both sequence and structure. To address this absence, we introduce RNAcompete-S, which couples a single-step competitive binding reaction with an excess of random RNA 40-mers to a custom computational pipeline for interrogation of the bound RNA sequences and derivation of SSMs (Sequence and Structure Models). RNAcompete-S confirms that HuR, QKI, and SRSF1 prefer binding sites that are single stranded, and recapitulates known 8-10 bp sequence and structure preferences for Vts1p and RBMY. We also derive an 18-base long SSM for Drosophila SLBP, which to our knowledge has not been previously determined by selections from pure random sequence, and accurately discriminates human replication-dependent histone mRNAs. Thus, RNAcompete-S enables accurate identification of large, intrinsic sequence-structure specificities with a uniform assay.


Subject(s)
Base Sequence/genetics , High-Throughput Nucleotide Sequencing/methods , RNA-Binding Proteins/genetics , Humans , RNA-Binding Proteins/chemistry , Sequence Analysis, RNA/methods
11.
Genome Res ; 24(1): 154-66, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24170600

ABSTRACT

Identifying genes in the genomic context is central to a cell's ability to interpret the genome. Yet, in general, the signals used to define eukaryotic genes are poorly described. Here, we derived simple classifiers that identify where transcription will initiate and terminate using nucleic acid sequence features detectable by the yeast cell, which we integrate into a Unified Model (UM) that models transcription as a whole. The cis-elements that denote where transcription initiates function primarily through nucleosome depletion, and, using a synthetic promoter system, we show that most of these elements are sufficient to initiate transcription in vivo. Hrp1 binding sites are the major characteristic of terminators; these binding sites are often clustered in terminator regions and can terminate transcription bidirectionally. The UM predicts global transcript structure by modeling transcription of the genome using a hidden Markov model whose emissions are the outputs of the initiation and termination classifiers. We validated the novel predictions of the UM with available RNA-seq data and tested it further by directly comparing the transcript structure predicted by the model to the transcription generated by the cell for synthetic DNA segments of random design. We show that the UM identifies transcription start sites more accurately than the initiation classifier alone, indicating that the relative arrangement of promoter and terminator elements influences their function. Our model presents a concrete description of how the cell defines transcript units, explains the existence of nongenic transcripts, and provides insight into genome evolution.


Subject(s)
DNA, Fungal/genetics , Models, Genetic , Saccharomyces cerevisiae/genetics , Transcription Initiation Site , Transcription, Genetic , Binding Sites , Computer Simulation , Genes, Fungal , Genome, Fungal , Nucleosomes/genetics , Promoter Regions, Genetic , Reproducibility of Results , Saccharomyces cerevisiae/metabolism
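
The abstract above describes a hidden Markov model whose emissions are classifier outputs. As a minimal sketch of that general idea, and not the authors' Unified Model, the Python below runs Viterbi decoding over two hypothetical states (0 = untranscribed, 1 = transcribed). The per-position probabilities, the transition values and all names are invented for illustration.

```python
import math


def viterbi(log_emit, log_trans, log_start):
    """Most likely state path for an HMM, computed in log space.

    log_emit[t][s]  : log-likelihood of position t under state s
    log_trans[p][s] : log-probability of moving from state p to state s
    log_start[s]    : log-probability of starting in state s
    """
    n_states = len(log_start)
    score = [log_start[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at position t
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emit[t][s])
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def toy_example():
    """Decode six positions using made-up classifier probabilities."""
    # Hypothetical probability, per position, that the site looks transcribed.
    p_transcribed = [0.1, 0.2, 0.9, 0.8, 0.9, 0.05]
    log_emit = [[math.log(1 - p), math.log(p)] for p in p_transcribed]
    # Assumed "sticky" transitions: states tend to persist.
    log_trans = [[math.log(0.9), math.log(0.1)],
                 [math.log(0.1), math.log(0.9)]]
    log_start = [math.log(0.5), math.log(0.5)]
    return viterbi(log_emit, log_trans, log_start)


if __name__ == "__main__":
    print(toy_example())
```

Decoding the whole sequence jointly, rather than thresholding each position separately, is what lets the relative arrangement of signals influence where a transcript is called.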
12.
BMC Bioinformatics ; 16: 156, 2015 May 14.
Article in English | MEDLINE | ID: mdl-25972088

ABSTRACT

BACKGROUND: Tumour samples containing distinct sub-populations of cancer and normal cells present challenges in the development of reproducible biomarkers, as these biomarkers are based on bulk signals from mixed tumour profiles. ISOpure is the only mRNA computational purification method to date that does not require a paired tumour-normal sample, provides a personalized cancer profile for each patient, and has been tested on clinical data. Replacing mixed tumour profiles with ISOpure-preprocessed cancer profiles led to better prognostic gene signatures for lung and prostate cancer. RESULTS: To simplify the integration of ISOpure into standard R-based bioinformatics analysis pipelines, the algorithm has been implemented as an R package. The ISOpureR package performs analogously to the original code in estimating the fraction of cancer cells and the patient cancer mRNA abundance profile from tumour samples in four cancer datasets. CONCLUSIONS: The ISOpureR package estimates the fraction of cancer cells and personalized patient cancer mRNA abundance profile from a mixed tumour profile. This open-source R implementation enables integration into existing computational pipelines, as well as easy testing, modification and extension of the model.


Subject(s)
Algorithms , Computational Biology/methods , Gene Expression Profiling , Prostatic Neoplasms/diagnosis , Prostatic Neoplasms/genetics , Software , Humans , Male , Models, Theoretical , Prognosis
13.
Nucleic Acids Res ; 41(20): 9438-60, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23945942

ABSTRACT

Despite studies that have investigated the interactions of double-stranded RNA-binding proteins like Staufen with RNA in vitro, how they achieve target specificity in vivo remains uncertain. We performed RNA co-immunoprecipitations followed by microarray analysis to identify Staufen-associated mRNAs in early Drosophila embryos. Analysis of the localization and functions of these transcripts revealed a number of potentially novel roles for Staufen. Using computational methods, we identified two sequence features that distinguish Staufen's target transcripts from non-targets. First, these Drosophila transcripts, as well as those human transcripts bound by human Staufen1 and 2, have 3' untranslated regions (UTRs) that are 3-4-fold longer than unbound transcripts. Second, the 3'UTRs of Staufen-bound transcripts are highly enriched for three types of secondary structures. These structures map with high precision to previously identified Staufen-binding regions in Drosophila bicoid and human ARF1 3'UTRs. Our results provide the first systematic genome-wide analysis showing how a double-stranded RNA-binding protein achieves target specificity.


Subject(s)
3' Untranslated Regions , Cytoskeletal Proteins/metabolism , Drosophila Proteins/metabolism , RNA-Binding Proteins/metabolism , Animals , Drosophila/embryology , Drosophila/genetics , Genome, Insect , Humans , Nucleic Acid Conformation , RNA, Double-Stranded/chemistry , RNA, Double-Stranded/metabolism , RNA, Messenger/analysis , RNA, Messenger/metabolism
14.
Nat Genet ; 37(9): 991-6, 2005 Sep.
Article in English | MEDLINE | ID: mdl-16127451

ABSTRACT

Recent mammalian microarray experiments detected widespread transcription and indicated that there may be many undiscovered multiple-exon protein-coding genes. To explore this possibility, we labeled cDNA from unamplified, polyadenylation-selected RNA samples from 37 mouse tissues to microarrays encompassing 1.14 million exon probes. We analyzed these data using GenRate, a Bayesian algorithm that uses a genome-wide scoring function in a factor graph to infer genes. At a stringent exon false detection rate of 2.7%, GenRate detected 12,145 gene-length transcripts and confirmed 81% of the 10,000 most highly expressed known genes. Notably, our analysis showed that most of the 155,839 exons detected by GenRate were associated with known genes, providing microarray-based evidence that most multiple-exon genes have already been identified. GenRate also detected tens of thousands of potential new exons and reconciled discrepancies in current cDNA databases by 'stitching' new transcribed regions into previously annotated genes.


Subject(s)
Computational Biology , DNA, Complementary/chemistry , Databases as Topic , Exons/genetics , Genome , Transcription, Genetic , Algorithms , Animals , Gene Expression Profiling , Humans , Mice , Microarray Analysis , RNA, Messenger/chemistry , RNA, Messenger/metabolism
15.
Nat Biotechnol ; 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38862616

ABSTRACT

Subclonal reconstruction algorithms use bulk DNA sequencing data to quantify parameters of tumor evolution, allowing an assessment of how cancers initiate, progress and respond to selective pressures. We launched the ICGC-TCGA (International Cancer Genome Consortium-The Cancer Genome Atlas) DREAM Somatic Mutation Calling Tumor Heterogeneity and Evolution Challenge to benchmark existing subclonal reconstruction algorithms. This 7-year community effort used cloud computing to benchmark 31 subclonal reconstruction algorithms on 51 simulated tumors. Algorithms were scored on seven independent tasks, leading to 12,061 total runs. Algorithm choice influenced performance substantially more than tumor features but purity-adjusted read depth, copy-number state and read mappability were associated with the performance of most algorithms on most tasks. No single algorithm was a top performer for all seven tasks and existing ensemble strategies were unable to outperform the best individual methods, highlighting a key research need. All containerized methods, evaluation code and datasets are available to support further assessment of the determinants of subclonal reconstruction accuracy and development of improved methods to understand tumor evolution.

16.
Cell Stem Cell ; 30(12): 1658-1673.e10, 2023 12 07.
Article in English | MEDLINE | ID: mdl-38065069

ABSTRACT

Stem cells regulate their self-renewal and differentiation fate outcomes through both symmetric and asymmetric divisions. m6A RNA methylation controls symmetric commitment and inflammation of hematopoietic stem cells (HSCs) through unknown mechanisms. Here, we demonstrate that the nuclear speckle protein SON is an essential m6A target required for murine HSC self-renewal, symmetric commitment, and inflammation control. Global profiling of m6A identified that m6A mRNA methylation of Son increases during HSC commitment. Upon m6A depletion, Son mRNA increases, but its protein is depleted. Reintroduction of SON rescues defects in HSC symmetric commitment divisions and engraftment. Conversely, Son deletion results in a loss of HSC fitness, while overexpression of SON improves mouse and human HSC engraftment potential by increasing quiescence. Mechanistically, we found that SON rescues MYC and suppresses the METTL3-HSC inflammatory gene expression program, including CCL5, through transcriptional regulation. Thus, our findings define a m6A-SON-CCL5 axis that controls inflammation and HSC fate.


Subject(s)
DNA-Binding Proteins , Hematopoietic Stem Cells , Inflammation , RNA Methylation , Animals , Humans , Mice , Cell Differentiation/genetics , Hematopoietic Stem Cells/metabolism , Methylation , Methyltransferases/genetics , Methyltransferases/metabolism , RNA, Messenger/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , RNA Methylation/genetics
17.
Bioinformatics ; 27(22): 3166-72, 2011 Nov 15.
Article in English | MEDLINE | ID: mdl-21965819

ABSTRACT

MOTIVATION: Lung cancer is often discovered long after its onset, making identifying genes important in its initiation and progression a challenge. By the time the tumors are discovered, we only observe the final sum of changes of the few genes that initiated cancer and thousands of genes that they have influenced. Gene interactions and heterogeneity of samples make it difficult to identify genes consistent between different cohorts. Using gene and gene-product interaction networks, we propose a principled approach to identify a small subset of genes whose network neighbors exhibit consistently high expression change (in cancerous tissue versus normal) regardless of their own expression. We hypothesize that these genes can shed light on the larger scale perturbations in the overall landscape of expression levels. RESULTS: We benchmark our method on simulated data, and show that we can recover a true gene list in noisy measurement data. We then apply our method to four non-small cell lung cancer and two pancreatic cancer cohorts, finding several genes that are consistent within all cohorts of the same cancer type. CONCLUSION: Our model is flexible, robust and identifies gene sets that are more consistent across cohorts than several other approaches. Additionally, our method can be applied on a per-patient basis not requiring large cohorts of patients to find genes of influence. Our approach is generally applicable to gene expression studies where the goal is to identify a small set of influential genes that may in turn explain the much larger set of genome-wide expression changes.


Subject(s)
Gene Regulatory Networks , Genes, Neoplasm , Lung Neoplasms/genetics , Protein Interaction Maps , Carcinoma, Non-Small-Cell Lung/genetics , Gene Expression Profiling , Humans , Lung Neoplasms/metabolism
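
The abstract above ranks genes by the expression change of their network neighbors rather than by their own change. The abstract does not give the authors' model, so the Python below is only a simplified stand-in: it scores each gene by the mean absolute log fold change of its neighbors. The network, the fold-change values and the function name are hypothetical.

```python
def neighbour_scores(network, log_fold_change):
    """Score each gene by the mean absolute expression change of its neighbours.

    network         : dict mapping a gene to a list of its network neighbours
    log_fold_change : dict mapping a gene to its tumour-vs-normal log fold change
    A gene's own expression change is deliberately ignored.
    """
    scores = {}
    for gene, neighbours in network.items():
        # Use only neighbours that were actually measured.
        measured = [abs(log_fold_change[n]) for n in neighbours
                    if n in log_fold_change]
        if measured:
            scores[gene] = sum(measured) / len(measured)
    return scores


if __name__ == "__main__":
    # Made-up toy network: gene A is unchanged itself but touches two
    # strongly changed genes, so it receives the highest score.
    toy_network = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
    toy_changes = {"A": 0.0, "B": 2.0, "C": -4.0}
    ranked = sorted(neighbour_scores(toy_network, toy_changes).items(),
                    key=lambda item: item[1], reverse=True)
    print(ranked)
```

Because the score depends only on a single expression profile and a fixed network, a calculation of this kind can be run per patient, which matches the per-patient use the abstract mentions.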
18.
Nat Methods ; 4(12): 1045-9, 2007 Dec.
Article in English | MEDLINE | ID: mdl-18026111

ABSTRACT

We demonstrate that paired expression profiles of microRNAs (miRNAs) and mRNAs can be used to identify functional miRNA-target relationships with high precision. We used a Bayesian data analysis algorithm, GenMiR++, to identify a network of 1,597 high-confidence target predictions for 104 human miRNAs, which was supported by RNA expression data across 88 tissues and cell types, sequence complementarity and comparative genomics data. We experimentally verified our predictions by investigating the result of let-7b downregulation in retinoblastoma using quantitative reverse transcriptase (RT)-PCR and microarray profiling: some of our verified let-7b targets include CDC25A and BCL7A. Compared to sequence-based predictions, our high-scoring GenMiR++ predictions had much more consistent Gene Ontology annotations and were more accurate predictors of which mRNA levels respond to changes in let-7b levels.


Subject(s)
Gene Expression Profiling/methods , Gene Targeting/methods , MicroRNAs/genetics , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, RNA/methods , Base Sequence , Humans , Molecular Sequence Data
19.
Bioinformatics ; 25(8): 1012-8, 2009 Apr 15.
Article in English | MEDLINE | ID: mdl-19088121

ABSTRACT

MOTIVATION: Recognition of specific DNA sequences is a central mechanism by which transcription factors (TFs) control gene expression. Many TF-binding preferences, however, are unknown or poorly characterized, in part due to the difficulty associated with determining their specificity experimentally, and an incomplete understanding of the mechanisms governing sequence specificity. New techniques that estimate the affinity of TFs to all possible k-mers provide a new opportunity to study DNA-protein interaction mechanisms, and may facilitate inference of binding preferences for members of a given TF family when such information is available for other family members. RESULTS: We employed a new dataset consisting of the relative preferences of mouse homeodomains for all eight-base DNA sequences in order to ask how well we can predict the binding profiles of homeodomains when only their protein sequences are given. We evaluated a panel of standard statistical inference techniques, as well as variations of the protein features considered. Nearest neighbour among functionally important residues emerged among the most effective methods. Our results underscore the complexity of TF-DNA recognition, and suggest a rational approach for future analyses of TF families.


Subject(s)
Computational Biology/methods , DNA/chemistry , Sequence Analysis, DNA/methods , Transcription Factors/metabolism , Binding Sites , DNA/metabolism , Transcription Factors/chemistry
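
The abstract above reports that a nearest-neighbour rule over functionally important residues was among the most effective predictors. A minimal Python sketch of such a rule follows, assuming Hamming distance over a fixed set of residue positions. The distance measure, the residue strings and the 8-mer scores are all invented for illustration and are not taken from the paper.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("residue strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbour_profile(query, characterised):
    """Predict a binding profile by copying it from the closest known protein.

    query         : residues of the uncharacterised protein at the chosen positions
    characterised : dict name -> (residues, profile), where profile maps
                    8-mers to binding scores
    Returns the name of the nearest protein and its profile.
    """
    best = min(characterised,
               key=lambda name: hamming(query, characterised[name][0]))
    return best, characterised[best][1]


# Entirely made-up residues and scores, for illustration only.
TOY_PROTEINS = {
    "hd_A": ("RKQNA", {"TAATTAAA": 0.9, "TGACAGCT": 0.1}),
    "hd_B": ("RIRNS", {"TAATTAAA": 0.2, "TGACAGCT": 0.8}),
}

if __name__ == "__main__":
    name, profile = nearest_neighbour_profile("RKQNS", TOY_PROTEINS)
    print(name, profile)
```

Restricting the comparison to a few residue positions, rather than the full protein sequence, is the design choice the abstract highlights; which positions count as functionally important is left open here.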
20.
Nat Commun ; 11(1): 6247, 2020 12 07.
Article in English | MEDLINE | ID: mdl-33288765

ABSTRACT

Whole-genome sequencing can be used to estimate subclonal populations in tumours and this intra-tumoural heterogeneity is linked to clinical outcomes. Many algorithms have been developed for subclonal reconstruction, but their variabilities and consistencies are largely unknown. We evaluate sixteen pipelines for reconstructing the evolutionary histories of 293 localized prostate cancers from single samples, and eighteen pipelines for the reconstruction of 10 tumours with multi-region sampling. We show that predictions of subclonal architecture and timing of somatic mutations vary extensively across pipelines. Pipelines show consistent types of biases, with those incorporating SomaticSniper and Battenberg preferentially predicting homogenous cancer cell populations and those using MuTect tending to predict multiple populations of cancer cells. Subclonal reconstructions using multi-region sampling confirm that single-sample reconstructions systematically underestimate intra-tumoural heterogeneity, predicting on average fewer than half of the cancer cell populations identified by multi-region sequencing. Overall, these biases suggest caution in interpreting specific architectures and subclonal variants.


Subject(s)
Algorithms , Genetic Heterogeneity , Mutation , Prostatic Neoplasms/genetics , Whole Genome Sequencing/methods , Biomarkers, Tumor/genetics , Clonal Evolution , Clone Cells/metabolism , Computational Biology/methods , DNA Copy Number Variations , Humans , Male , Models, Genetic , Polymorphism, Single Nucleotide , Prostatic Neoplasms/classification , Prostatic Neoplasms/pathology