1.
Front Genet ; 8: 38, 2017.
Article in English | MEDLINE | ID: mdl-28443131

ABSTRACT

Deep sequencing of cDNAs made from spliced mRNAs indicates that most coding genes in many animals and plants have pre-mRNA transcripts that are alternatively spliced. In pre-mRNAs, in addition to invariant exons that are present in almost all mature mRNA products, there are at least 6 additional types of exons, such as exons from alternative promoters or with alternative polyA sites, mutually exclusive exons, skipped exons, or exons with alternative 5' or 3' splice sites. Our bioinformatics-based hypothesis is that, in analogy to the genetic code, there is an "alternative-splicing code" in introns and flanking exon sequences that directs alternative splicing of many of the 36 types of introns. In humans, we identified 42 different consensus sequences that are each present in at least 100 human introns. 37 of the 42 top consensus sequences are significantly enriched or depleted in at least one of the 36 types of introns. We further supported our hypothesis by showing that 96 out of 96 analyzed human disease mutations that affect RNA splicing, and change alternative splicing from one class to another, can be partially explained by a mutation altering a consensus sequence from one type of intron to that of another type of intron. Some of the alternative splicing consensus sequences, and presumably their small-RNA or protein targets, are evolutionarily conserved from 50 plant to animal species. We also noticed that the set of introns within a gene usually shares the same splicing codes, thus arguing that one sub-type of spliceosome might process all (or most) of the introns in a given gene. Our work sheds new light on a possible mechanism for generating the tremendous diversity in protein structure by alternative splicing of pre-mRNAs.

2.
Cancer Res ; 73(15): 4830-9, 2013 Aug 01.
Article in English | MEDLINE | ID: mdl-23786772

ABSTRACT

Topoisomerase I (Top1) relaxes DNA supercoiling by forming transient cleavage complexes (Top1cc) up- and downstream of transcription complexes. Top1cc can be trapped by carcinogenic and endogenous DNA lesions and by camptothecin, resulting in transcription blocks. Here, we undertook genome-wide analysis of camptothecin-treated cells at exon resolution. RNA samples from HCT116 and MCF7 cells were analyzed with the Affy Exon Array platform, allowing high-resolution mapping along 18,537 genes. Long genes that are highly expressed were the most susceptible to downregulation, whereas short genes were preferentially upregulated. Along the body of genes, downregulation was most important toward the 3'-end and increased with the number of exon-intron junctions. Ubiquitin and RNA degradation-related pathway genes were selectively downregulated. Parallel analysis of microRNA with the Agilent miRNA microarray platform revealed that miR-142-3p was highly induced by camptothecin. More than 10% of the downregulated genes were targets of this p53-dependent microRNA. Our study shows the profound impact of Top1cc on transcription elongation, especially at intron-exon junctions and on transcript stability by microRNA miR-142-3p upregulation.


Subject(s)
DNA Topoisomerases, Type I/genetics; Gene Expression Regulation/genetics; MicroRNAs/genetics; Transcription, Genetic/genetics; Camptothecin/pharmacology; Cell Line, Tumor; HCT116 Cells; Humans; Oligonucleotide Array Sequence Analysis; RNA Splicing/drug effects; RNA Splicing/genetics; RNA, Small Interfering; Reverse Transcriptase Polymerase Chain Reaction; Topoisomerase I Inhibitors/pharmacology; Transcription, Genetic/drug effects; Transfection
3.
PLoS One ; 7(7): e40062, 2012.
Article in English | MEDLINE | ID: mdl-22848369

ABSTRACT

BACKGROUND: The NCI-60 is a panel of 60 diverse human cancer cell lines used by the U.S. National Cancer Institute to screen compounds for anticancer activity. We recently clustered genes based on correlation of expression profiles across the NCI-60. Many of the resulting clusters were characterized by cancer-associated biological functions. The set of curated glioblastoma (GBM) gene expression data from the Cancer Genome Atlas (TCGA) initiative has recently become available. Thus, we are now able to determine which of the processes are robustly shared by both the immortalized cell lines and clinical cancers. RESULTS: Our central observation is that some sets of highly correlated genes in the NCI-60 expression data are also highly correlated in the GBM expression data. Furthermore, a "double fishing" strategy identified many sets of genes that show Pearson correlation ≥0.60 in both the NCI-60 and the GBM data sets relative to a given "bait" gene. The number of such gene sets far exceeds the number expected by chance. CONCLUSION: Many of the gene-gene correlations found in the NCI-60 do not reflect just the conditions of cell lines in culture; rather, they reflect processes and gene networks that also function in vivo. A number of gene network correlations co-occur in the NCI-60 and GBM data sets, but there are others that occur only in NCI-60 or only in GBM. In sum, this analysis provides an additional perspective on both the utility and the limitations of the NCI-60 in furthering our understanding of cancers in vivo.
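The "double fishing" idea can be illustrated with a minimal sketch. It assumes each data set is a plain dictionary mapping gene symbols to expression profiles and uses the 0.60 Pearson cutoff from the abstract; the function names, gene labels, and toy values are hypothetical and are not taken from the paper.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile has no defined correlation; treat as 0
    return cov / sqrt(vx * vy)

def fish(expr, bait, threshold=0.60):
    """Genes whose profile correlates with the bait at or above the threshold."""
    bait_profile = expr[bait]
    return {gene for gene, profile in expr.items()
            if gene != bait and pearson(bait_profile, profile) >= threshold}

def double_fish(expr_a, expr_b, bait, threshold=0.60):
    """Genes that pass the threshold relative to the bait in BOTH data sets."""
    if bait not in expr_a or bait not in expr_b:
        return set()
    return fish(expr_a, bait, threshold) & fish(expr_b, bait, threshold)

# Hypothetical toy data: one dict per data set (samples differ between sets).
nci60 = {"BAIT": [1, 2, 3, 4], "G1": [2, 4, 6, 8],
         "G2": [4, 3, 2, 1], "G3": [1, 3, 2, 4]}
gbm = {"BAIT": [5, 6, 7], "G1": [1, 2, 3],
       "G2": [3, 2, 1], "G3": [2, 1, 3]}

shared = double_fish(nci60, gbm, "BAIT")
```

In this toy example G3 passes the cutoff only in the first data set, so it is dropped from the shared set; that mirrors the abstract's point that some correlations occur in only one of the two sources.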


Subject(s)
Databases, Genetic; Gene Expression Regulation, Neoplastic; Genes, Neoplasm; Genome, Human; Glioblastoma/metabolism; Cell Line, Tumor; Gene Expression Profiling; Glioblastoma/genetics; Humans; National Cancer Institute (U.S.); United States
4.
PLoS One ; 7(5): e35716, 2012.
Article in English | MEDLINE | ID: mdl-22570691

ABSTRACT

Although there is extensive information on gene expression and molecular interactions in various cell types, integrating those data in a functionally coherent manner remains challenging. This study explores the premise that genes whose expression at the mRNA level is correlated over diverse cell lines are likely to function together in a network of molecular interactions. We previously derived expression-correlated gene clusters from the database of the NCI-60 human tumor cell lines and associated each cluster with function categories of the Gene Ontology (GO) database. From a cluster rich in genes associated with GO categories related to cell migration, we extracted 15 genes that were highly cross-correlated; prominent among them were RRAS, AXL, ADAM9, FN14, and integrin-beta1. We then used those 15 genes as bait to identify other correlated genes in the NCI-60 database. A survey of current literature disclosed not only that many of the expression-correlated genes engaged in molecular interactions related to migration, invasion, and metastasis, but also that highly cross-correlated subsets of those genes engaged in specific cell migration processes. We assembled this information in molecular interaction maps (MIMs) that depict networks governing 3 cell migration processes: degradation of extracellular matrix, production of transient focal complexes at the leading edge of the cell, and retraction of the rear part of the cell. Also depicted are interactions controlling the release and effects of calcium ions, which may regulate migration in a spatiotemporal manner in the cell. The MIMs and associated text comprise a detailed and integrated summary of what is currently known or surmised about the role of the expression cross-correlated genes in molecular networks governing those processes.


Subject(s)
Cell Movement/genetics; Gene Regulatory Networks; Transcriptome; Actins/genetics; Actins/metabolism; Calcium/metabolism; Calpain/genetics; Calpain/metabolism; Carcinoma, Non-Small-Cell Lung/genetics; Carcinoma, Non-Small-Cell Lung/metabolism; Cell Line, Tumor; Cell Membrane/metabolism; Cluster Analysis; Epithelial-Mesenchymal Transition/genetics; Extracellular Matrix/genetics; Extracellular Matrix/metabolism; Gene Expression Regulation, Neoplastic; Humans; Integrins/genetics; Integrins/metabolism; Intercellular Signaling Peptides and Proteins/genetics; Intercellular Signaling Peptides and Proteins/metabolism; Protein Interaction Maps; Proto-Oncogene Proteins/genetics; Proto-Oncogene Proteins/metabolism; Receptor Protein-Tyrosine Kinases/genetics; Receptor Protein-Tyrosine Kinases/metabolism; ras Proteins/genetics; ras Proteins/metabolism; Axl Receptor Tyrosine Kinase
5.
PLoS One ; 7(1): e30317, 2012.
Article in English | MEDLINE | ID: mdl-22291933

ABSTRACT

BACKGROUND: The NCI-60 is a panel of 60 diverse human cancer cell lines used by the U.S. National Cancer Institute to screen compounds for anticancer activity. In the current study, gene expression levels from five platforms were integrated to yield a single composite transcriptome profile. The comprehensive and reliable nature of that dataset allows us to study gene co-expression across cancer cell lines. METHODOLOGY/PRINCIPAL FINDINGS: Hierarchical clustering revealed numerous clusters of genes in which the genes co-vary across the NCI-60. To determine functional categorization associated with each cluster, we used the Gene Ontology (GO) Consortium database and the GoMiner tool. GO maps genes to hierarchically-organized biological process categories. GoMiner can leverage GO to perform ontological analyses of gene expression studies, generating a list of significant functional categories. CONCLUSIONS/SIGNIFICANCE: GoMiner analysis revealed many clusters of coregulated genes that are associated with functional groupings of GO biological process categories. Notably, those categories arising from coherent co-expression groupings reflect cancer-related themes such as adhesion, cell migration, RNA splicing, immune response and signal transduction. Thus, these clusters demonstrate transcriptional coregulation of functionally-related genes.


Subject(s)
Gene Expression Regulation, Neoplastic; Gene Regulatory Networks/physiology; Multigene Family/genetics; Multigene Family/physiology; Neoplasms/genetics; Algorithms; Antineoplastic Agents/isolation & purification; Antineoplastic Agents/pharmacology; Cell Line, Tumor; Cluster Analysis; Drug Evaluation, Preclinical/methods; Gene Expression Profiling; Genetic Association Studies; High-Throughput Screening Assays/methods; Humans; Neoplasms/pathology
6.
BMC Proc ; 5 Suppl 2: S3, 2011 May 28.
Article in English | MEDLINE | ID: mdl-21554761

ABSTRACT

BACKGROUND: The gene networks underlying closure of the optic fissure during vertebrate eye development are not well-understood. We use a novel clustering method based on nonlinear dimension reduction with data labeling to analyze microarray data from laser capture microdissected (LCM) cells at the site and developmental stages (days 10.5 to 12.5) of optic fissure closure. RESULTS: Our nonlinear methods created clusters of genes that mapped onto more specific biological processes and functions related to eye development as defined by Gene Ontology at lower false discovery rates than conventional linear cluster algorithms. Our new methods build on the advantages of LCM to isolate pure phenotypic populations within complex tissues in order to identify systems biology relationships among critical gene products expressed at lower copy number. CONCLUSIONS: The combination of LCM of embryonic organs, gene expression microarrays, and nonlinear dimension reduction with labeling is a potentially useful approach to extract subtle spatial and temporal co-variations within the gene regulatory networks that specify mammalian organogenesis and organ function. Our results motivate further analysis of nonlinear dimension reduction with labeling within other microarray data sets from LCM dissected tissues or other cell specific samples to determine the more general utility of our method for uncovering more specific biological functional relationships.

7.
BMC Bioinformatics ; 12: 52, 2011 Feb 10.
Article in English | MEDLINE | ID: mdl-21310028

ABSTRACT

BACKGROUND: The Gene Ontology (GO) Consortium organizes genes into hierarchical categories based on biological process, molecular function and subcellular localization. Tools such as GoMiner can leverage GO to perform ontological analysis of microarray and proteomics studies, typically generating a list of significant functional categories. Two or more of the categories are often redundant, in the sense that identical or nearly-identical sets of genes map to the categories. The redundancy might typically inflate the report of significant categories by a factor of three-fold, create an illusion of an overly long list of significant categories, and obscure the relevant biological interpretation. RESULTS: We now introduce a new resource, RedundancyMiner, that de-replicates the redundant and nearly-redundant GO categories that had been determined by first running GoMiner. The main algorithm of RedundancyMiner, MultiClust, performs a novel form of cluster analysis in which a GO category might belong to several category clusters. Each category cluster follows a "complete linkage" paradigm. The metric is a similarity measure that captures the overlap in gene mapping between pairs of categories. CONCLUSIONS: RedundancyMiner effectively eliminated redundancies from a set of GO categories. For illustration, we have applied it to the clarification of the results arising from two current studies: (1) assessment of the gene expression profiles obtained by laser capture microdissection (LCM) of serial cryosections of the retina at the site of final optic fissure closure in the mouse embryos at specific embryonic stages, and (2) analysis of a conceptual data set obtained by examining a list of genes deemed to be "kinetochore" genes.
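The abstract describes a similarity measure based on gene overlap between pairs of categories and a "complete linkage" requirement for each category cluster. The sketch below illustrates both ideas under explicit assumptions: the Jaccard index stands in for the unspecified similarity measure, and the function names, category labels, and gene sets are hypothetical. It is not the MultiClust algorithm itself.

```python
from itertools import combinations

def jaccard(genes_a, genes_b):
    """Overlap between two gene sets (assumed stand-in for the paper's metric)."""
    union = genes_a | genes_b
    if not union:
        return 0.0
    return len(genes_a & genes_b) / len(union)

def satisfies_complete_linkage(cluster, category_genes, threshold):
    """True if EVERY pair of categories in the cluster meets the threshold.

    Complete linkage judges a cluster by its least similar pair, so a single
    weak pair is enough to reject the grouping.
    """
    return all(
        jaccard(category_genes[a], category_genes[b]) >= threshold
        for a, b in combinations(cluster, 2)
    )

# Hypothetical categories and the genes mapped to them.
category_genes = {
    "cat_A": {"g1", "g2", "g3", "g4"},
    "cat_B": {"g1", "g2", "g3", "g4"},   # identical to cat_A: fully redundant
    "cat_C": {"g1", "g2", "g3"},         # nearly redundant with cat_A and cat_B
    "cat_D": {"g7", "g8"},               # unrelated
}

redundant = satisfies_complete_linkage(["cat_A", "cat_B", "cat_C"], category_genes, 0.7)
mixed = satisfies_complete_linkage(["cat_A", "cat_C", "cat_D"], category_genes, 0.7)
```

Because membership is checked per cluster rather than by assigning each category to exactly one group, the same category could pass the check in several clusters, which is consistent with the overlapping clusters the abstract mentions.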


Subject(s)
Data Mining/methods; Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Proteomics/methods; Algorithms; Animals; Cluster Analysis; Computational Biology/methods; Mice; Software
8.
Cancer Res ; 70(20): 8055-65, 2010 Oct 15.
Article in English | MEDLINE | ID: mdl-20817775

ABSTRACT

RNA splicing is required to remove introns from pre-mRNA, and alternative splicing generates protein diversity. Topoisomerase I (Top1) has been shown to be coupled with splicing by regulating serine/arginine-rich splicing proteins. Prior studies on isolated genes also showed that Top1 poisoning by camptothecin (CPT), which traps Top1 cleavage complexes (Top1cc), can alter RNA splicing. Here, we tested the effect of Top1 inhibition on splicing at the genome-wide level in human colon carcinoma HCT116 and breast carcinoma MCF7 cells. The RNA of HCT116 cells treated with CPT for various times was analyzed with ExonHit Human Splice Array. Unlike other exon array platforms, the ExonHit arrays include junction probes that allow the detection of splice variants with high sensitivity and specificity. We report that CPT treatment preferentially affects the splicing of splicing-related factors, such as RBM8A, and generates transcripts coding for inactive proteins lacking key functional domains. The splicing alterations induced by CPT are not observed with cisplatin or vinblastine and are not simply due to reduced Top1 activity, as Top1 downregulation by short interfering RNA did not alter splicing like CPT treatment. Inhibition of RNA polymerase II (Pol II) hyperphosphorylation by 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole (DRB) blocked the splicing alteration induced by CPT, which suggests that the rapid Pol II hyperphosphorylation induced by CPT interferes with normal splicing. The preferential effect of CPT on genes encoding splicing factors may explain the abnormal splicing of a large number of genes in response to Top1cc.


Subject(s)
DNA Topoisomerases, Type I/poisoning; Genome-Wide Association Study/methods; Alternative Splicing/drug effects; Alternative Splicing/genetics; Antineoplastic Agents, Phytogenic/pharmacology; Camptothecin/pharmacology; Cisplatin/pharmacology; Colonic Neoplasms/drug therapy; Colonic Neoplasms/genetics; DNA Topoisomerases, Type I/genetics; DNA-Binding Proteins/drug effects; DNA-Binding Proteins/genetics; Down-Regulation/drug effects; Exons/drug effects; Exons/genetics; Genetic Variation; Humans; Models, Statistical; Phosphorylation; RNA Polymerase II/drug effects; RNA Polymerase II/metabolism; RNA Splicing/genetics; RNA, Small Interfering/drug effects; RNA, Small Interfering/genetics; Reverse Transcriptase Polymerase Chain Reaction; Vinblastine/pharmacology
9.
Bioinformatics ; 26(16): 1945-9, 2010 Aug 15.
Article in English | MEDLINE | ID: mdl-20616384

ABSTRACT

MOTIVATION: Splice variation plays important roles in evolution and cancer. Different splice variants of a gene may be characteristic of particular cellular processes, subcellular locations or organs. Although several genomic projects have identified splice variants, there have been no large-scale computational studies of the relationship between number of splice variants and biological function. The Gene Ontology (GO) and tools for leveraging GO, such as GoMiner, now make such a study feasible. RESULTS: We partitioned genes into two groups: those with numbers of splice variants below and those with numbers above a partition boundary b (b=1,..., 10). Then we used GoMiner to determine whether any GO categories are enriched in genes with particular numbers of splice variants. Since there was no a priori 'appropriate' partition boundary, we studied those 'robust' categories whose enrichment did not depend on the selection of a particular partition boundary. Furthermore, because the distribution of splice variant number was a snapshot taken at a particular point in time, we confirmed that those observations were stable across successive builds of GenBank. A small number of categories were found for genes in the lower partitions. A larger number of categories were found for genes in the higher partitions. Those categories were largely associated with cell death and signal transduction. Apoptotic genes tended to have a large repertoire of splice variants, and genes with splice variants exhibited a distinctive 'apoptotic island' in clustered image maps (CIMs). AVAILABILITY: Supplementary tables and figures are available at URL http://discover.nci.nih.gov/OG/supplementaryMaterials.html. The Safari browser appears to perform better than Firefox for these particular items.
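The partition-and-enrichment procedure in the RESULTS can be sketched as follows. This is only an illustration under stated assumptions: a one-sided hypergeometric tail stands in for GoMiner's statistics, the split "at most b" versus "more than b" is an assumed convention, and the gene names, counts, and the 0.05 cutoff are hypothetical.

```python
from math import comb

def partition(variant_counts, b):
    """Split genes into (low, high) groups around boundary b.

    Assumption: low means at most b splice variants, high means more than b.
    """
    low = {gene for gene, n in variant_counts.items() if n <= b}
    high = set(variant_counts) - low
    return low, high

def enrichment_p(category, group, universe):
    """One-sided hypergeometric P(X >= k): chance of seeing at least the
    observed number of category genes in the group by random sampling."""
    N = len(universe)
    K = len(category & universe)
    n = len(group)
    k = len(category & group)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)

def is_robust(category, variant_counts, boundaries, alpha=0.05):
    """A category is 'robust' here if it is enriched in the high partition
    for every boundary tested, not just for one convenient choice of b."""
    universe = set(variant_counts)
    for b in boundaries:
        _, high = partition(variant_counts, b)
        if not high or enrichment_p(category, high, universe) >= alpha:
            return False
    return True

# Hypothetical data: eight genes with 1 variant, four genes with 5 variants.
variant_counts = {f"g{i}": 1 for i in range(1, 9)}
variant_counts.update({f"g{i}": 5 for i in range(9, 13)})
apoptosis_like = {"g9", "g10", "g11", "g12"}   # all in the high partition
unrelated = {"g1", "g2", "g9"}

robust_hit = is_robust(apoptosis_like, variant_counts, boundaries=(1, 2, 3))
robust_miss = is_robust(unrelated, variant_counts, boundaries=(1, 2, 3))
```

Requiring enrichment at every boundary is one simple way to express the abstract's notion of categories whose enrichment does not depend on a particular partition choice.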


Subject(s)
Alternative Splicing; Genomics/methods; Cluster Analysis; Databases, Nucleic Acid; Genes; Genetic Variation; Genome; Humans; Signal Transduction/genetics; Software
10.
BMC Bioinformatics ; 9: 313, 2008 Jul 18.
Article in English | MEDLINE | ID: mdl-18638396

ABSTRACT

BACKGROUND: Over 60% of protein-coding genes in vertebrates express mRNAs that undergo alternative splicing. The resulting collection of transcript isoforms poses significant challenges for contemporary biological assays. For example, RT-PCR validation of gene expression microarray results may be unsuccessful if the two technologies target different splice variants. Effective use of sequence-based technologies requires knowledge of the specific splice variant(s) that are targeted. In addition, the critical roles of alternative splice forms in biological function and in disease suggest that assay results may be more informative if analyzed in the context of the targeted splice variant. RESULTS: A number of contemporary technologies are used for analyzing transcripts or proteins. To enable investigation of the impact of splice variation on the interpretation of data derived from those technologies, we have developed SpliceCenter. SpliceCenter is a suite of user-friendly, web-based applications that includes programs for analysis of RT-PCR primer/probe sets, effectors of RNAi, microarrays, and protein-targeting technologies. Both interactive and high-throughput implementations of the tools are provided. The interactive versions of SpliceCenter tools provide visualizations of a gene's alternative transcripts and probe target positions, enabling the user to identify which splice variants are or are not targeted. The high-throughput batch versions accept user query files and provide results in tabular form. When, for example, we used SpliceCenter's batch siRNA-Check to process the Cancer Genome Anatomy Project's large-scale shRNA library, we found that only 59% of the 50,766 shRNAs in the library target all known splice variants of the target gene, 32% target some but not all, and 9% do not target any currently annotated transcript. 
CONCLUSION: SpliceCenter http://discover.nci.nih.gov/splicecenter provides unique, user-friendly applications for assessing the impact of transcript variation on the design and interpretation of RT-PCR, RNAi, gene expression microarrays, antibody-based detection, and mass spectrometry proteomics. The tools are intended for use by bench biologists as well as bioinformaticists.
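The all/some/none breakdown reported for the shRNA library suggests a simple classification, sketched below. The sketch assumes that "targeting" can be approximated by an exact substring match of the probe's target sequence against each annotated transcript; the function name, variant labels, and sequences are hypothetical, and the actual matching rules of siRNA-Check are not described in the abstract.

```python
def classify_probe(target_seq, transcripts):
    """Classify a probe by how many known splice variants contain its target.

    target_seq  -- sequence the probe is meant to recognize (sense strand)
    transcripts -- dict mapping variant name to transcript sequence
    Returns ("all" | "some" | "none", list of matching variant names).
    """
    hits = [name for name, seq in transcripts.items() if target_seq in seq]
    if not hits:
        return "none", hits
    if len(hits) == len(transcripts):
        return "all", hits
    return "some", hits

# Hypothetical gene with two splice variants; variant_2 skips the exon "TTAGGC".
transcripts = {
    "variant_1": "ATGGCCTTAGGCATCGA",
    "variant_2": "ATGGCCATCGA",
}

shared_exon = classify_probe("ATGGCC", transcripts)    # present in both variants
skipped_exon = classify_probe("TTAGGC", transcripts)   # present only in variant_1
unannotated = classify_probe("GGGGGG", transcripts)    # matches no variant
```

A probe that lands in a skipped exon falls into the "some" class, which is the situation the abstract warns can make RT-PCR and microarray results disagree.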


Subject(s)
Alternative Splicing; Biomedical Research/methods; Database Management Systems; Research Design; User-Computer Interface; DNA Probes/classification; Databases, Nucleic Acid; Information Dissemination; Oligonucleotide Array Sequence Analysis/methods; Peptides/analysis; Peptides/chemistry; RNA Interference; RNA Splice Sites; RNA, Untranslated/classification; Reverse Transcriptase Polymerase Chain Reaction/methods; Transcription, Genetic
11.
Front Neuroendocrinol ; 29(3): 428-44, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18295320

ABSTRACT

Trans-generational epigenetic phenomena, such as contamination with endocrine-disrupting chemicals (EDCs) that decrease fertility and the global methylation status of DNA in the offspring, are of great concern because they may affect health, particularly the health of children. However, of even greater concern is the possibility that trans-generational changes in the methylation status of the DNA might lead to permanent changes in the DNA sequence itself. By contaminating the environment with EDCs, mankind might be permanently affecting the health of future generations. In this section, we present evidence from our laboratory and others that trans-generational epigenetic changes in DNA might lead to mutations directed to genes encoding amino acid repeat-containing proteins (RCPs) that are important for adaptive evolution or cancer progression. Such epigenetic changes can be induced "naturally" by hormones or "unnaturally" by EDCs or environmental stress. To illustrate the phenomenon, we present new bioinformatic evidence that the only RCP ontological categories conserved from Drosophila to humans are "regulation of splicing," "regulation of transcription," and "regulation of synaptogenesis," which are classes of genes likely to be important for evolutionary processes. Based on that and other evidence, we propose a model for evolution that we call the EDGE (Epigenetically Directed Genetic Errors) hypothesis for the mechanism by which mutations are targeted at epigenetically modified "contingency genes" encoding RCPs. In the model, "epigenetic assimilation" of metastable epialleles of RCPs over many generations can lead to mutations directed to those genes, thereby permanently stabilizing the adaptive phenotype.


Subject(s)
Biological Evolution; Epigenesis, Genetic; Models, Theoretical; Neoplasms/physiopathology; Neurosecretory Systems/physiology; Repetitive Sequences, Amino Acid/genetics; Signal Transduction/physiology; Animals; Breeding; Endocrine Disruptors/metabolism; Humans; Mutation; Phenotype; Phylogeny
12.
BMC Bioinformatics ; 9: 67, 2008 Jan 29.
Article in English | MEDLINE | ID: mdl-18230172

ABSTRACT

BACKGROUND: Microarray experiments generate vast amounts of data. The functional context of differentially expressed genes can be assessed by querying the Gene Ontology (GO) database via GoMiner. Directed acyclic graph representations, which are used to depict GO categories enriched with differentially expressed genes, are difficult to interpret and, depending on the particular analysis, may not be well suited for formulating new hypotheses. Additional graphical methods are therefore needed to augment the GO graphical representation. RESULTS: We present an alternative visualization approach, area-proportional Euler diagrams, showing set relationships with semi-quantitative size information in a single diagram to support biological hypothesis formulation. The cardinalities of sets and intersection sets are represented by area-proportional Euler diagrams and their corresponding graphical (circular or polygonal) intersection areas. Optimally proportional representations are obtained using swarm and evolutionary optimization algorithms. CONCLUSION: VennMaster's area-proportional Euler diagrams effectively structure and visualize the results of a GO analysis by indicating to what extent flagged genes are shared by different categories. In addition to reducing the complexity of the output, the visualizations facilitate generation of novel hypotheses from the analysis of seemingly unrelated categories that share differentially expressed genes.
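For the simplest case of two sets, an area-proportional diagram can be computed directly, as the sketch below illustrates. It assumes circular shapes, sets circle areas equal to set cardinalities, and finds the centre distance by bisection so that the overlap area equals the intersection size. The function names are hypothetical, and bisection replaces the swarm and evolutionary optimizers the abstract mentions, which are needed when many sets must be placed at once.

```python
from math import acos, pi, sqrt

def lens_area(r1, r2, d):
    """Area of the intersection of two circles whose centres are d apart."""
    if d >= r1 + r2:
        return 0.0                      # circles are disjoint
    if d <= abs(r1 - r2):
        return pi * min(r1, r2) ** 2    # smaller circle lies inside the larger
    # Clamp guards against tiny floating-point excursions outside [-1, 1].
    c1 = max(-1.0, min(1.0, (d * d + r1 * r1 - r2 * r2) / (2 * d * r1)))
    c2 = max(-1.0, min(1.0, (d * d + r2 * r2 - r1 * r1) / (2 * d * r2)))
    kite = max(0.0, (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
    return r1 * r1 * acos(c1) + r2 * r2 * acos(c2) - 0.5 * sqrt(kite)

def two_set_layout(size_a, size_b, size_ab, tol=1e-9):
    """Radii and centre distance for an area-proportional two-set diagram.

    Assumes 0 <= size_ab <= min(size_a, size_b).
    """
    r1 = sqrt(size_a / pi)
    r2 = sqrt(size_b / pi)
    lo, hi = abs(r1 - r2), r1 + r2      # overlap shrinks as d grows from lo to hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lens_area(r1, r2, mid) > size_ab:
            lo = mid                    # too much overlap: move centres apart
        else:
            hi = mid
    return r1, r2, (lo + hi) / 2

# Hypothetical example: 100 and 50 flagged genes, 20 of them shared.
r1, r2, d = two_set_layout(100, 50, 20)
```

With three or more categories an exact circular layout generally does not exist, which is presumably why the published tool relies on optimization rather than a closed-form or one-dimensional search like this one.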


Subject(s)
Algorithms; Computer Graphics; Databases, Protein; Gene Expression Profiling/methods; Information Storage and Retrieval/methods; Oligonucleotide Array Sequence Analysis/methods; User-Computer Interface; Logistic Models; Models, Genetic
13.
Bioinformatics ; 23(18): 2385-90, 2007 Sep 15.
Article in English | MEDLINE | ID: mdl-17660211

ABSTRACT

MOTIVATION: Affymetrix microarrays are widely used to measure global expression of mRNA transcripts. That technology is based on the concept of a probe set. Individual probes within a probe set were originally designated by Affymetrix to hybridize with the same unique mRNA transcript. Because of increasing accuracy in knowledge of genomic sequences, however, a substantial number of the manufacturer's original probe groupings and mappings are now known to be inaccurate and must be corrected. Otherwise, analysis and interpretation of an Affymetrix microarray experiment will be in error. RESULTS: AffyProbeMiner is a computationally efficient platform-independent tool that uses all RefSeq mature RNA protein coding transcripts and validated complete coding sequences in GenBank to (1) regroup the individual probes into consistent probe sets and (2) remap the probe sets to the correct sets of mRNA transcripts. The individual probes are grouped into probe sets that are 'transcript-consistent' in that they hybridize to the same mRNA transcript (or transcripts) and, therefore, measure the same entity (or entities). About 65.6% of the probe sets on the HG-U133A chip were affected by the remapping. Pre-computed regrouped and remapped probe sets for many Affymetrix microarrays are made freely available at the AffyProbeMiner web site. Alternatively, we provide a web service that enables the user to perform the remapping for any type of short-oligo commercial or custom array that has an Affymetrix-format Chip Definition File (CDF). Important features that differentiate AffyProbeMiner from other approaches are flexibility in the handling of splice variants, computational efficiency, extensibility, customizability and user-friendliness of the interface. AVAILABILITY: The web interface and software (GPL open source license), are publicly-accessible at http://discover.nci.nih.gov/affyprobeminer.
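The notion of 'transcript-consistent' probe sets can be illustrated with a minimal sketch: probes are regrouped so that every probe in a set matches exactly the same transcripts. The sketch assumes exact substring matching and uses hypothetical probe IDs, transcript names, and sequences; it does not reproduce AffyProbeMiner's actual matching or its handling of CDF files.

```python
from collections import defaultdict

def regroup_probes(probes, transcripts):
    """Group probes by the exact set of transcripts each one matches.

    probes      -- dict mapping probe ID to probe target sequence
    transcripts -- dict mapping transcript name to sequence
    Returns a dict mapping frozenset of transcript names to sorted probe IDs.
    Probes that match no transcript are dropped.
    """
    groups = defaultdict(list)
    for probe_id, seq in probes.items():
        targets = frozenset(name for name, tseq in transcripts.items() if seq in tseq)
        if targets:
            groups[targets].append(probe_id)
    return {targets: sorted(ids) for targets, ids in groups.items()}

# Hypothetical transcripts and an original manufacturer-style probe set p1..p5.
transcripts = {
    "tx_A": "AAACCCGGGTTT",
    "tx_B": "AAACCCTTTGGG",
}
probes = {
    "p1": "AAACCC",   # matches both transcripts
    "p2": "CCGGGT",   # matches tx_A only
    "p3": "CCTTTG",   # matches tx_B only
    "p4": "GGTTT",    # matches tx_A only
    "p5": "ACGTAC",   # matches nothing and is discarded
}

probe_sets = regroup_probes(probes, transcripts)
```

Here one original grouping of five probes becomes three consistent probe sets, each measuring a well-defined transcript or pair of transcripts, which is the kind of regrouping and remapping the abstract describes.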


Subject(s)
DNA Probes/genetics; Databases, Genetic; Information Storage and Retrieval/methods; Internet; Oligonucleotide Array Sequence Analysis/methods; User-Computer Interface; Base Sequence; Database Management Systems; Molecular Sequence Data; Reproducibility of Results; Sensitivity and Specificity; Sequence Alignment/methods; Sequence Analysis, DNA/methods
14.
BMC Bioinformatics ; 8: 75, 2007 Mar 05.
Article in English | MEDLINE | ID: mdl-17338820

ABSTRACT

BACKGROUND: There are many fewer genes in the human genome than there are expressed transcripts. Alternative splicing is the reason. Alternatively spliced transcripts are often specific to tissue type, developmental stage, environmental condition, or disease state. Accurate analysis of microarray expression data and design of new arrays for alternative splicing require assessment of probes at the sequence and exon levels. DESCRIPTION: SpliceMiner is a web interface for querying Evidence Viewer Database (EVDB). EVDB is a comprehensive, non-redundant compendium of splice variant data for human genes. We constructed EVDB as a queryable implementation of the NCBI Evidence Viewer (EV). EVDB is based on data obtained from NCBI Entrez Gene and EV. The automated EVDB build process uses only complete coding sequences, which may or may not include partial or complete 5' and 3' UTRs, and filters redundant splice variants. Unlike EV, which supports only one-at-a-time queries, SpliceMiner supports high-throughput batch queries and provides results in an easily parsable format. SpliceMiner maps probes to splice variants, effectively delineating the variants identified by a probe. CONCLUSION: EVDB can be queried by gene symbol, genomic coordinates, or probe sequence via a user-friendly web-based tool we call SpliceMiner (http://discover.nci.nih.gov/spliceminer). The EVDB/SpliceMiner combination provides an interface with human splice variant information and, going beyond the very valuable NCBI Evidence Viewer, supports fluent, high-throughput analysis. Integration of EVDB information into microarray analysis and design pipelines has the potential to improve the analysis and bioinformatic interpretation of gene expression data, for both batch and interactive processing. 
For example, whenever a gene expression value is recognized as important or appears anomalous in a microarray experiment, the interactive mode of SpliceMiner can be used quickly and easily to check for possible splice variant issues.


Subject(s)
Alternative Splicing; Databases, Nucleic Acid; Genetic Variation/genetics; Oligonucleotide Array Sequence Analysis/methods; Software; Genome, Human/genetics; Humans; National Library of Medicine (U.S.); Oligonucleotide Array Sequence Analysis/instrumentation; United States
15.
Genome Res ; 16(6): 796-803, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16672307

ABSTRACT

Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization.


Subject(s)
Base Sequence , Gene Library , Polyploidy , Xenopus laevis/genetics , Xenopus/genetics , Animals , Evolution, Molecular , Gene Expression , Genes, Duplicate , Genome , Molecular Sequence Data , Open Reading Frames/genetics , Phylogeny , Sequence Homology, Nucleic Acid
16.
Cancer Res ; 65(22): 10255-64, 2005 Nov 15.
Article in English | MEDLINE | ID: mdl-16288013

ABSTRACT

Activation of the p53 network plays a central role in the inflammatory stress response associated with ulcerative colitis and may modulate cancer risk in patients afflicted with this chronic disease. Here, we describe the gene expression profiles associated with four microenvironmental components of the inflammatory response (NO*, H2O2, DNA replication arrest, and hypoxia) that result in p53 stabilization and activation. Isogenic HCT116 and HCT116 TP53-/- colon cancer cells were exposed to the NO* donor Sper/NO, H2O2, hypoxia, or hydroxyurea, and their mRNA was analyzed using oligonucleotide microarrays. Overall, 1,396 genes changed in a p53-dependent manner (P < 0.001), with the majority representing a "unique" profile for each condition. Only 14 genes were common to all four conditions. Included were eight known p53 target genes. Hierarchical sample clustering distinguished early (1 and 4 hours) from late responses (8, 12, and 24 hours), and each treatment was differentiated from the others. Overall, NO* and hypoxia stimulated similar transcriptional responses. Gene ontology analysis revealed cell cycle as a key feature of stress responses and confirmed the similarity between NO* and hypoxia. Cell cycle profiles analyzed by flow cytometry showed that NO* and hypoxia induced quiescent S-phase and G2-M arrest. Using a novel bioinformatic algorithm, we identified several putative p53-responsive elements among the genes induced in a p53-dependent manner, including four [KIAA0247, FLJ12484, p53CSV (HSPC132), and CNK (PLK3)] common to all exposures. In summary, the inflammatory stress response is a complex, integrated biological network in which p53 is a key molecular node regulating gene expression.


Subject(s)
Inflammation/metabolism , Tumor Suppressor Protein p53/metabolism , Cell Cycle , Cell Hypoxia , Flow Cytometry , Gene Expression Profiling , HCT116 Cells , Humans , Hydrogen Peroxide , Inflammation/etiology , Inflammation/genetics , Nitric Oxide Donors , Nitrogen Oxides , Oxidative Stress/genetics , Oxidative Stress/physiology , Reverse Transcriptase Polymerase Chain Reaction , Spermine/analogs & derivatives , Tumor Suppressor Protein p53/biosynthesis , Tumor Suppressor Protein p53/deficiency , Tumor Suppressor Protein p53/genetics
17.
Nat Genet ; 37(8): 844-52, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16041372

ABSTRACT

Alternative RNA splicing greatly increases proteome diversity and may thereby contribute to tissue-specific functions. We carried out genome-wide quantitative analysis of alternative splicing using a custom Affymetrix microarray to assess the role of the neuronal splicing factor Nova in the brain. We used a stringent algorithm to identify 591 exons that were differentially spliced in the brain relative to immune tissues, and 6.6% of these showed major splicing defects in the neocortex of Nova2-/- mice. We tested 49 exons with the largest predicted Nova-dependent splicing changes and validated all 49 by RT-PCR. We analyzed the encoded proteins and found that all those with defined brain functions acted in the synapse (34 of 40, including neurotransmitter receptors, cation channels, adhesion and scaffold proteins) or in axon guidance (8 of 40). Moreover, of the 35 proteins with known interaction partners, 74% (26) interact with each other. Validating a large set of Nova RNA targets has led us to identify a multi-tiered network in which Nova regulates the exon content of RNAs encoding proteins that interact in the synapse.


Subject(s)
Alternative Splicing/physiology , Antigens, Neoplasm/physiology , Nerve Tissue Proteins/physiology , RNA-Binding Proteins/physiology , Synapses/metabolism , Animals , Mice , Mice, Knockout , Neocortex/metabolism , Neuro-Oncological Ventral Antigen , Oligonucleotide Array Sequence Analysis
18.
BMC Bioinformatics ; 6: 168, 2005 Jul 05.
Article in English | MEDLINE | ID: mdl-15998470

ABSTRACT

BACKGROUND: We previously developed GoMiner, an application that organizes lists of 'interesting' genes (for example, under- and overexpressed genes from a microarray experiment) for biological interpretation in the context of the Gene Ontology. The original version of GoMiner was oriented toward visualization and interpretation of the results from a single microarray (or other high-throughput experimental platform), using a graphical user interface. Although that version can be used to examine the results from a number of microarrays one at a time, that is a rather tedious task, and original GoMiner includes no apparatus for obtaining a global picture of results from an experiment that consists of multiple microarrays. We wanted to provide a computational resource that automates the analysis of multiple microarrays and then integrates the results across all of them in useful exportable output files and visualizations. RESULTS: We now introduce a new tool, High-Throughput GoMiner, that has those capabilities and a number of others: It (i) efficiently performs the computationally-intensive task of automated batch processing of an arbitrary number of microarrays, (ii) produces a human- or computer-readable report that rank-orders the multiple microarray results according to the number of significant GO categories, (iii) integrates the multiple microarray results by providing organized, global clustered image map visualizations of the relationships of significant GO categories, (iv) provides a fast form of 'false discovery rate' multiple comparisons calculation, and (v) provides annotations and visualizations for relating transcription factor binding sites to genes and GO categories. CONCLUSION: High-Throughput GoMiner achieves the desired goal of providing a computational resource that automates the analysis of multiple microarrays and integrates results across all of the microarrays. For illustration, we show an application of this new tool to the interpretation of altered gene expression patterns in Common Variable Immune Deficiency (CVID). High-Throughput GoMiner will be useful in a wide range of applications, including the study of time-courses, evaluation of multiple drug treatments, comparison of multiple gene knock-outs or knock-downs, and screening of large numbers of chemical derivatives generated from a promising lead compound.
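
The abstract does not say how High-Throughput GoMiner computes its "fast form" of false discovery rate. As a generic illustration of what an FDR adjustment does, the sketch below implements the standard Benjamini-Hochberg procedure, which is an assumption swapped in for illustration and not necessarily the tool's method. The function name `benjamini_hochberg` and the sample p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Illustrative only: this is the textbook BH step-up adjustment,
    not the calculation used by High-Throughput GoMiner.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four GO categories.
example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Categories whose adjusted value falls below a chosen threshold (for example 0.05) would then be reported as significant.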


Subject(s)
Common Variable Immunodeficiency/genetics , Gene Expression Profiling/instrumentation , Protein Array Analysis/instrumentation , Software , User-Computer Interface , Binding Sites , Chromosome Mapping , Cluster Analysis , Common Variable Immunodeficiency/drug therapy , Data Display , Databases, Genetic , Electronic Data Processing , Humans , Phenotype , Schistosomiasis/genetics , Software Design , Transcription Factors/metabolism
19.
Mol Pharmacol ; 66(6): 1397-405, 2004 Dec.
Article in English | MEDLINE | ID: mdl-15342794

ABSTRACT

Discovery of the multidrug resistance protein 1 (MDR1), an ATP-binding cassette (ABC) transporter able to transport many anticancer drugs, was a clinically relevant breakthrough in multidrug resistance research. Although the overexpression of ABC transporters such as P-glycoprotein/ABCB1, MRP1/ABCC1, and MXR/ABCG2 seems to be a major cause of failure in the treatment of cancer, acquired resistance to multiple anticancer drugs may also be multifactorial, involving alteration of detoxification processes, apoptosis, DNA repair, drug uptake, and overexpression of other ABC transporters. As a tool for the study of such phenomena, we designed and created a microarray platform, the ABC-ToxChip, to evaluate relative levels of transcriptional activation among genes involved in the various mechanisms of resistance. In the ABC-ToxChip, a comprehensive set of genes important in toxicological responses (represented by 2200 cDNA probes) is complemented with probes specifically matching ABC transporters as well as oligonucleotides representing 18,000 unique human genes. By comparing the transcriptional profiles of KB-3-1 and DU-145 parental cells with resistant derivatives selected in colchicine (KB-8-5) and 9-nitro-camptothecin (RCO.1), respectively, we demonstrate that ABC transporters (ABCB1/MDR1 and ABCC2/MRP2, respectively) show dramatic overexpression, whereas the glutathione S-transferase gene GST-Pi shows the strongest decrease in expression among the 20,000 genes studied. The results were confirmed by quantitative reverse transcription-polymerase chain reaction and immunohistochemistry. The custom-designed ABC-Tox microarray presented here will be helpful to elucidate mechanisms leading to anticancer drug resistance.


Subject(s)
ATP-Binding Cassette Transporters/genetics , Drug Resistance, Multiple/genetics , Oligonucleotide Array Sequence Analysis , ATP-Binding Cassette Transporters/metabolism , Apoptosis , Biological Transport , Cell Line , DNA Probes , DNA Repair , Exons , Humans , Multidrug Resistance-Associated Protein 2
20.
BMC Bioinformatics ; 5: 80, 2004 Jun 23.
Article in English | MEDLINE | ID: mdl-15214961

ABSTRACT

BACKGROUND: When processing microarray data sets, we recently noticed that some gene names were being changed inadvertently to non-gene names. RESULTS: A little detective work traced the problem to default date format conversions and floating-point format conversions in the very useful Excel program package. The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included. These conversions are irreversible; the original gene names cannot be recovered. CONCLUSIONS: Users of Excel for analyses involving gene names should be aware of this problem, which can cause genes, including medically important ones, to be lost from view and which has contaminated even carefully curated public databases. We provide work-arounds and scripts for circumventing the problem.
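
The abstract mentions work-arounds and scripts but does not reproduce them. As a minimal sketch of the kind of check it describes, the Python below flags values in a gene-name column that look like the output of a date conversion (such as `1-Sep`) or a floating-point conversion (such as `2.31E+13`). The regular expressions and the function name `flag_suspect_names` are assumptions made for illustration, not the authors' scripts, and the patterns are unlikely to cover every form a spreadsheet can produce.

```python
import re

# Assumed pattern: day number followed by a three-letter month, e.g. "1-Sep".
DATE_LIKE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)
# Assumed pattern: scientific notation, e.g. "2.31E+13".
FLOAT_LIKE = re.compile(r"^\d(\.\d+)?E\+\d+$", re.IGNORECASE)


def flag_suspect_names(names):
    """Return (value, reason) pairs for entries that resemble converted names."""
    suspects = []
    for name in names:
        value = name.strip()
        if DATE_LIKE.match(value):
            suspects.append((value, "date-like"))
        elif FLOAT_LIKE.match(value):
            suspects.append((value, "float-like"))
    return suspects


# Hypothetical column of identifiers read back from a spreadsheet export.
flagged = flag_suspect_names(["TP53", "1-Sep", "2.31E+13", "BRCA1"])
```

Because the abstract notes that the conversions are irreversible, a check like this can only locate damaged entries. The original names have to be restored from the source data.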


Subject(s)
Computational Biology/classification , Computational Biology/standards , Genes , Research Design , Software , Animals , Humans , Mice , Oligonucleotide Array Sequence Analysis/classification , Software/classification , Software/standards