1.
BMC Dev Biol ; 11: 15, 2011 Mar 12.
Article in English | MEDLINE | ID: mdl-21396121

ABSTRACT

BACKGROUND: The production of nephrons suddenly ends in mice shortly after birth when the remaining cells of the multi-potent progenitor mesenchyme begin to differentiate into nephrons. We exploited this terminal wave of nephron production using both microarrays and RNA-Seq to serially evaluate gene transcript levels in the progenitors. This strategy allowed us to define the changing gene expression states following induction and the onset of differentiation after birth. RESULTS: Microarray and RNA-Seq studies of the progenitors detected a change in the expression profiles of several classes of genes early after birth. One functional class, a class of genes associated with cellular proliferation, was activated. Analysis of proliferation with a nucleotide analog demonstrated in vivo that entry into the S-phase of the cell cycle preceded increases in transcript levels of genetic markers of differentiation. Microarrays and RNA-Seq also detected the onset of expression of markers of differentiation within the population of progenitors prior to detectable Six2 repression. Validation by in situ hybridization demonstrated that the markers were expressed in a subset of Six2 expressing progenitors. Finally, the studies identified a third set of genes that provide indirect evidence of an altered cellular microenvironment of the multi-potential progenitors after birth. CONCLUSIONS: These results demonstrate that Six2 expression is not sufficient to suppress activation of genes associated with growth and differentiation of nephrons. They also better define the sequence of events after induction and suggest mechanisms contributing to the rapid end of nephron production after birth in mice.


Subject(s)
Gene Expression Regulation, Developmental , Homeodomain Proteins/genetics , Nephrons/growth & development , Nephrons/metabolism , Nuclear Proteins/genetics , Trans-Activators/genetics , Transcription Factors/genetics , Animals , Apoptosis Regulatory Proteins , Base Sequence , Cell Cycle , Cell Differentiation , Cell Proliferation , Flow Cytometry , Glycolysis , Green Fluorescent Proteins , In Situ Hybridization , Mice , Mice, Transgenic , Microarray Analysis , Nephrons/cytology , RNA/genetics , Sequence Analysis, RNA , Stem Cells/metabolism
2.
Bioinformatics ; 26(16): 1945-9, 2010 Aug 15.
Article in English | MEDLINE | ID: mdl-20616384

ABSTRACT

MOTIVATION: Splice variation plays important roles in evolution and cancer. Different splice variants of a gene may be characteristic of particular cellular processes, subcellular locations or organs. Although several genomic projects have identified splice variants, there have been no large-scale computational studies of the relationship between number of splice variants and biological function. The Gene Ontology (GO) and tools for leveraging GO, such as GoMiner, now make such a study feasible. RESULTS: We partitioned genes into two groups: those with numbers of splice variants ≤b and those with >b (b = 1, ..., 10). Then we used GoMiner to determine whether any GO categories are enriched in genes with particular numbers of splice variants. Since there was no a priori 'appropriate' partition boundary, we studied those 'robust' categories whose enrichment did not depend on the selection of a particular partition boundary. Furthermore, because the distribution of splice variant number was a snapshot taken at a particular point in time, we confirmed that those observations were stable across successive builds of GenBank. A small number of categories were found for genes in the lower partitions. A larger number of categories were found for genes in the higher partitions. Those categories were largely associated with cell death and signal transduction. Apoptotic genes tended to have a large repertoire of splice variants, and genes with splice variants exhibited a distinctive 'apoptotic island' in clustered image maps (CIMs). AVAILABILITY: Supplementary tables and figures are available at URL http://discover.nci.nih.gov/OG/supplementaryMaterials.html. The Safari browser appears to perform better than Firefox for these particular items.
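The partitioning step in this abstract can be illustrated with a minimal Python sketch. It assumes the two groups at each boundary b are genes with at most b splice variants and genes with more than b, and it reads a 'robust' category as one reported enriched at every boundary examined. Both readings are assumptions, the gene names and counts are hypothetical toy data, and the enrichment test itself (performed with GoMiner in the study) is not reproduced here.

```python
def partition_genes(variant_counts, b):
    """Split genes into (lower, upper): counts <= b versus counts > b."""
    lower = {gene for gene, n in variant_counts.items() if n <= b}
    upper = {gene for gene, n in variant_counts.items() if n > b}
    return lower, upper


def robust_categories(enriched_by_boundary):
    """Return categories that appear in the enriched set of every boundary.

    `enriched_by_boundary` maps a boundary b to the set of category names
    that some external enrichment tool reported for that boundary.
    """
    sets = list(enriched_by_boundary.values())
    return set.intersection(*sets) if sets else set()


# Hypothetical toy data, not taken from the study.
variant_counts = {"GENE_A": 1, "GENE_B": 3, "GENE_C": 12, "GENE_D": 7}
lower, upper = partition_genes(variant_counts, b=3)
print(sorted(lower), sorted(upper))
```

Repeating `partition_genes` for b = 1 to 10 and passing each partition to an enrichment tool would yield the per-boundary sets that `robust_categories` intersects.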


Subject(s)
Alternative Splicing , Genomics/methods , Cluster Analysis , Databases, Nucleic Acid , Genes , Genetic Variation , Genome , Humans , Signal Transduction/genetics , Software
3.
Front Neuroendocrinol ; 29(3): 428-44, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18295320

ABSTRACT

Trans-generational epigenetic phenomena, such as contamination with endocrine-disrupting chemicals (EDCs) that decrease fertility and the global methylation status of DNA in the offspring, are of great concern because they may affect health, particularly the health of children. However, of even greater concern is the possibility that trans-generational changes in the methylation status of the DNA might lead to permanent changes in the DNA sequence itself. By contaminating the environment with EDCs, mankind might be permanently affecting the health of future generations. In this section, we present evidence from our laboratory and others that trans-generational epigenetic changes in DNA might lead to mutations directed to genes encoding amino acid repeat-containing proteins (RCPs) that are important for adaptive evolution or cancer progression. Such epigenetic changes can be induced "naturally" by hormones or "unnaturally" by EDCs or environmental stress. To illustrate the phenomenon, we present new bioinformatic evidence that the only RCP ontological categories conserved from Drosophila to humans are "regulation of splicing," "regulation of transcription," and "regulation of synaptogenesis," which are classes of genes likely to be important for evolutionary processes. Based on that and other evidence, we propose a model for evolution that we call the EDGE (Epigenetically Directed Genetic Errors) hypothesis for the mechanism by which mutations are targeted at epigenetically modified "contingency genes" encoding RCPs. In the model, "epigenetic assimilation" of metastable epialleles of RCPs over many generations can lead to mutations directed to those genes, thereby permanently stabilizing the adaptive phenotype.


Subject(s)
Biological Evolution , Epigenesis, Genetic , Models, Theoretical , Neoplasms/physiopathology , Neurosecretory Systems/physiology , Repetitive Sequences, Amino Acid/genetics , Signal Transduction/physiology , Animals , Breeding , Endocrine Disruptors/metabolism , Humans , Mutation , Phenotype , Phylogeny
4.
Proteins ; 71(4): 1930-9, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18186470

ABSTRACT

There is substantial interest in methods designed to predict the effect of nonsynonymous single nucleotide polymorphisms (nsSNPs) on protein function, given their potential relationship to heritable diseases. Current state-of-the-art supervised machine learning algorithms, such as random forest (RF), train models that classify single amino acid mutations in proteins as either neutral or deleterious to function. However, it is frequently the case that the functional effect of a polymorphism on a protein resides between these two extremes. The utilization of classifiers that incorporate fuzzy logic provides a natural extension in order to account for the spectrum of possible functional consequences. We generated a dataset of single amino acid substitutions in human proteins having known three-dimensional structures. Each variant was uniquely represented as a feature vector that included computational geometry and knowledge-based statistical potential predictors obtained through application of Delaunay tessellation of protein structures. Additional attributes consisted of physicochemical properties of the native and replacement amino acids as well as topological location of the mutated residue position in the solved structure. Classification performance of the RF algorithm was evaluated on a training set consisting of the disease-associated and neutral nsSNPs taken from our dataset, and attributes were ranked according to their relative importance. Similarly, we evaluated the performance of adaptive neuro-fuzzy inference system (ANFIS). The utility of statistical geometry predictors was compared with that of traditional structural and evolutionary attributes employed by other researchers, revealing an equally effective yet complementary methodology. Among all attributes in our feature set, the statistical geometry predictors were found to be the most highly ranked. 
On the basis of the AUC (area under the ROC curve) measure of performance, the ANFIS and RF models were equally effective when only statistical geometry features were utilized. Tenfold cross-validation studies evaluating AUC, balanced error rate (BER), and Matthews correlation coefficient (MCC) showed that our RF model was at least comparable with the well-established methods of SIFT and PolyPhen. The trained RF and ANFIS models were each subsequently used to predict the disease potential of human nsSNPs in our dataset that are currently unclassified (http://rna.gmu.edu/FuzzySnps/).
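For readers unfamiliar with the three performance measures named in this abstract, the following Python sketch computes them using their standard textbook definitions. It is not the authors' evaluation code; the function names are illustrative, and both classes are assumed to be present in the data so that the denominators are non-zero.

```python
import math


def balanced_error_rate(tp, fp, tn, fn):
    """BER: mean of the false-negative rate and the false-positive rate."""
    return 0.5 * (fn / (tp + fn) + fp / (tn + fp))


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC computed from the four cells of a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in scores_pos
        for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores for deleterious (positive) and neutral cases.
print(auc([0.9, 0.8], [0.1, 0.8]))
```

Lower BER is better, whereas MCC and AUC are better when higher; MCC ranges from -1 to 1 and AUC from 0 to 1.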


Subject(s)
Decision Trees , Fuzzy Logic , Polymorphism, Single Nucleotide/genetics , Proteins/chemistry , Proteins/genetics , Algorithms , Amino Acid Sequence , Amino Acid Substitution , Area Under Curve , Artificial Intelligence , Chi-Square Distribution , Computational Biology/methods , Databases, Factual , Humans , Hydrophobic and Hydrophilic Interactions , Models, Statistical , Molecular Sequence Data , Neural Networks, Computer , Phylogeny , Predictive Value of Tests , Protein Conformation , Protein Structure, Secondary , Protein Structure, Tertiary , ROC Curve , Reproducibility of Results , Sequence Homology, Amino Acid
5.
BMC Bioinformatics ; 8: 75, 2007 Mar 05.
Article in English | MEDLINE | ID: mdl-17338820

ABSTRACT

BACKGROUND: There are many fewer genes in the human genome than there are expressed transcripts. Alternative splicing is the reason. Alternatively spliced transcripts are often specific to tissue type, developmental stage, environmental condition, or disease state. Accurate analysis of microarray expression data and design of new arrays for alternative splicing require assessment of probes at the sequence and exon levels. DESCRIPTION: SpliceMiner is a web interface for querying Evidence Viewer Database (EVDB). EVDB is a comprehensive, non-redundant compendium of splice variant data for human genes. We constructed EVDB as a queryable implementation of the NCBI Evidence Viewer (EV). EVDB is based on data obtained from NCBI Entrez Gene and EV. The automated EVDB build process uses only complete coding sequences, which may or may not include partial or complete 5' and 3' UTRs, and filters redundant splice variants. Unlike EV, which supports only one-at-a-time queries, SpliceMiner supports high-throughput batch queries and provides results in an easily parsable format. SpliceMiner maps probes to splice variants, effectively delineating the variants identified by a probe. CONCLUSION: EVDB can be queried by gene symbol, genomic coordinates, or probe sequence via a user-friendly web-based tool we call SpliceMiner (http://discover.nci.nih.gov/spliceminer). The EVDB/SpliceMiner combination provides an interface with human splice variant information and, going beyond the very valuable NCBI Evidence Viewer, supports fluent, high-throughput analysis. Integration of EVDB information into microarray analysis and design pipelines has the potential to improve the analysis and bioinformatic interpretation of gene expression data, for both batch and interactive processing. 
For example, whenever a gene expression value is recognized as important or appears anomalous in a microarray experiment, the interactive mode of SpliceMiner can be used quickly and easily to check for possible splice variant issues.


Subject(s)
Alternative Splicing , Databases, Nucleic Acid , Genetic Variation/genetics , Oligonucleotide Array Sequence Analysis/methods , Software , Genome, Human/genetics , Humans , National Library of Medicine (U.S.) , Oligonucleotide Array Sequence Analysis/instrumentation , United States
6.
BMC Genomics ; 6: 140, 2005 Oct 06.
Article in English | MEDLINE | ID: mdl-16209714

ABSTRACT

BACKGROUND: As a result of high-throughput genotyping methods, millions of human genetic variants have been reported in recent years. To efficiently identify those with significant biological functions, a practical strategy is to concentrate on variants located in important sequence regions such as gene regulatory regions. RESULTS: Analysis of the most common type of variant, single nucleotide polymorphisms (SNPs), shows that in gene promoter regions more SNPs occur in close proximity to transcriptional start sites than in regions further upstream, and a disproportionate number of those SNPs represent nucleotide transversions. Additionally, the number of SNPs found in the predicted transcription factor binding sites is higher than in non-binding site sequences. CONCLUSION: Current information about transcription factor binding site sequence patterns may not be exhaustive, and SNPs may be actively involved in influencing gene expression by affecting the transcription factor binding sites.
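The transversion result in this abstract rests on the standard distinction between transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or the reverse). A minimal Python sketch of that classification follows; the function name is illustrative and is not taken from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Label a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref == alt or ref not in valid or alt not in valid:
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    # A transition keeps the base within its chemical class.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_snp("A", "G"))
print(classify_snp("A", "C"))
```

Of the twelve possible single-base substitutions, four are transitions and eight are transversions, which is the baseline against which an observed proportion of transversions would be compared.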


Subject(s)
Genomics/methods , Models, Genetic , Polymorphism, Single Nucleotide , Binding Sites , Contig Mapping , Databases, Genetic , Gene Expression Regulation , Humans , Polymorphism, Genetic , Promoter Regions, Genetic , Regulatory Sequences, Nucleic Acid , Sequence Analysis, DNA , Transcription Factors/genetics , Transcription Factors/metabolism , Transcription, Genetic
7.
Hum Mutat ; 26(5): 471-6, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16200641

ABSTRACT

The ability to predict the effect of nonsynonymous SNPs (nsSNPs) on protein function is important for the success of genetic disease association studies. Here we present a statistical geometry approach to nsSNP classification based on Delaunay tessellation, whereby the impact of nsSNPs on protein function is correlated with the change in the four-body statistical potential (DeltaQ) of the protein caused by the amino acid substitution. We observed that the DeltaQ of polymorphic proteins with disease-associated nsSNPs (daSNPs) was on average significantly lower than the DeltaQ of the proteins with neutral SNPs (ntSNPs). Clustering amino acid substitutions into conservative and nonconservative groups, and using a three-letter alphabet based on side-chain polarity showed significantly lower DeltaQ in nonconservative changes to daSNPs and when hydrophobic residues were substituted by charged or by polar residues. We also found that the daSNPs in the protein core caused much lower DeltaQ than surface daSNPs. This approach demonstrates a strong correlation between the computed DeltaQ and SNP classification. Integration of our approach with the existing models will help achieve a more precise recognition of nsSNPs that underlie polygenic diseases. All of the programs were written in Java and are available from the authors upon request.


Subject(s)
Computational Biology/methods , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide/physiology , Amino Acid Substitution , Data Interpretation, Statistical , Humans , Mathematical Computing , Mutation , Protein Conformation , Proteins/chemistry , Proteins/genetics
8.
J Biomed Biotechnol ; 2005(2): 181-8, 2005 Jun 30.
Article in English | MEDLINE | ID: mdl-16046824

ABSTRACT

Gene expression databases contain a wealth of information, but current data mining tools are limited in their speed and effectiveness in extracting meaningful biological knowledge from them. Online analytical processing (OLAP) can be used as a supplement to cluster analysis for fast and effective data mining of gene expression databases. We used Analysis Services 2000, a product that ships with SQL Server 2000, to construct an OLAP cube that was used to mine a time series experiment designed to identify genes associated with resistance of soybean to the soybean cyst nematode, a devastating pest of soybean. The data for these experiments are stored in the soybean genomics and microarray database (SGMD). A number of candidate resistance genes and pathways were found. Compared to traditional cluster analysis of gene expression data, OLAP was more effective and faster in finding biologically meaningful information. OLAP is available from a number of vendors and can work with any relational database management system through OLE DB.

10.
Curr Protoc Bioinformatics ; Chapter 9: Unit9.2, 2003 Feb.
Article in English | MEDLINE | ID: mdl-18428713

ABSTRACT

Relational databases provide the most common platform for storing data. The Structured Query Language (SQL) is a powerful tool for interacting with relational database systems. SQL enables the user to concoct complex and powerful queries in a straightforward manner, allowing sophisticated data analysis using simple syntax and structure. This unit demonstrates how to use the MySQL package to build and interact with a relational database.
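As a rough illustration of the kind of SQL workflow this abstract describes, the sketch below builds a small relational database and runs a join with aggregation. It deliberately uses Python's built-in sqlite3 module with an in-memory database instead of MySQL, so that it runs without a server; the table names, columns, and values are hypothetical and do not come from the unit.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for a MySQL server.
conn = sqlite3.connect(":memory:")

# Two related tables (hypothetical schema): genes and their measurements.
conn.executescript(
    """
    CREATE TABLE gene (
        gene_id INTEGER PRIMARY KEY,
        symbol  TEXT NOT NULL
    );
    CREATE TABLE expression (
        gene_id INTEGER REFERENCES gene(gene_id),
        sample  TEXT,
        level   REAL
    );
    """
)

conn.executemany("INSERT INTO gene VALUES (?, ?)", [(1, "GENE_A"), (2, "GENE_B")])
conn.executemany(
    "INSERT INTO expression VALUES (?, ?, ?)",
    [(1, "s1", 2.0), (1, "s2", 4.0), (2, "s1", 10.0)],
)

# Join the tables and compute the mean expression level per gene.
rows = conn.execute(
    """
    SELECT g.symbol, AVG(e.level) AS mean_level
    FROM gene AS g
    JOIN expression AS e ON e.gene_id = g.gene_id
    GROUP BY g.symbol
    ORDER BY g.symbol
    """
).fetchall()
print(rows)
```

The SELECT statement uses only standard SQL (JOIN, GROUP BY, AVG), so the same query text should carry over to MySQL largely unchanged, although column type declarations and connection setup differ between the two systems.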


Subject(s)
Database Management Systems , Databases, Factual , Information Storage and Retrieval/methods , Programming Languages , Software , Terminology as Topic , User-Computer Interface