1.
Bioinformatics ; 20(4): 452-9, 2004 Mar 01.
Article in English | MEDLINE | ID: mdl-14990441

ABSTRACT

MOTIVATION: Gene expression array technology has become increasingly widespread among researchers who recognize its numerous promises. At the same time, bench biologists and bioinformaticians have come to appreciate increasingly the importance of establishing a collaborative dialog from the onset of a study and of collecting and exchanging detailed information on the many experimental and computational procedures using a structured mechanism. This is crucial for adequate analyses of this kind of data. RESULTS: The RNA Abundance Database (RAD; http://www.cbil.upenn.edu/RAD) provides a comprehensive MIAME-supportive infrastructure for gene expression data management and makes extensive use of ontologies. Specific details on protocols, biomaterials, study designs, etc. are collected through a user-friendly suite of web annotation forms. Software has been developed to generate MAGE-ML documents to enable easy export of studies stored in RAD to any other database accepting data in this format (e.g. ArrayExpress). RAD is part of a more general Genomics Unified Schema (http://www.gusdb.org), which includes a richly annotated gene index (http://www.allgenes.org), thus providing a platform that integrates genomic and transcriptomic data from multiple organisms. This infrastructure enables a large variety of queries that incorporate visualization and analysis tools and have been tailored to serve the specific needs of projects focusing on particular organisms or biological systems.


Subject(s)
Abstracting and Indexing/methods; Database Management Systems; Databases, Nucleic Acid; Documentation/methods; Gene Expression Profiling/methods; Information Storage and Retrieval/methods; RNA/chemistry; RNA/genetics; Internet; Oligonucleotide Array Sequence Analysis/methods; RNA/classification; RNA/metabolism; Software; User-Computer Interface
2.
Biotechnol Bioeng ; 84(7): 795-800, 2003 Dec 30.
Article in English | MEDLINE | ID: mdl-14708120

ABSTRACT

Gene expression microarrays are a relatively new technology, dating back just a few years, yet they have already become a very widely used tool in biology, and have evolved to a wide range of applications well beyond their original design intent. However, while the use of microarrays has expanded, and the issues of performance optimization have been intensively studied, the fundamental issue of data integrity management has largely been ignored. Now that performance has improved so greatly, the shortcomings of data integrity control methods account for a greater share of the stumbling blocks for investigators. Microarray data are cumbersome, and the rule up to this point has mostly been one of hands-on transformations, leading to human errors which often have dramatic consequences. We show in this review that the time lost on such mistakes is enormous and dramatically affects results; therefore, mistakes should be mitigated in any way possible. We outline the scope of the data integrity issue and survey some of the most common and dangerous data transformations and their shortcomings. To illustrate, we review some case studies. We then look at the work done by the research community on this issue (which admittedly is meager up to this point). Some data integrity issues are always going to be difficult, while others will become easier; one of our goals is to expedite the use of integrity control methods. Finally, we present some preliminary guidelines and some specific approaches that we believe should be the focus of future research.


Subject(s)
Algorithms; Database Management Systems; Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Artifacts; Data Interpretation, Statistical; Gene Expression Profiling/standards; Oligonucleotide Array Sequence Analysis/standards; Quality Control; Reference Standards; Reproducibility of Results; Sequence Alignment/standards; Sequence Analysis, DNA/standards
3.
Bioinformatics ; 17(4): 300-8, 2001 Apr.
Article in English | MEDLINE | ID: mdl-11301298

ABSTRACT

MOTIVATION AND RESULTS: A relational schema is described for capturing highly parallel gene expression experiments using different technologies. This schema grew out of efforts to build a database for collaborators working on different biological systems and using different types of platforms in their gene expression experiments as well as different types of image quantification software. The tables are conceptually organized into three categories of information: Platform, Experiment (which includes image scanning and quantification), and Data. The strengths of the schema are: (i) integrating information on array elements using a gene index; (ii) describing samples using ontologies; (iii) reducing an experiment to a single RNA source for precise descriptions yet not losing the relationships between experiments done at the same time or for the same project; and (iv) maintaining both raw and processed (e.g. cleansed and normalized) data and recording how the data is processed. The result is a novel schema, which can hold both array and non-array data, is extensible for detailed experimental descriptions that are precise and consistent, and allows for meaningful comparisons of genes between experiments.
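
The three-category organization described above (Platform, Experiment, Data) can be illustrated with a minimal sketch. This is not the published schema: every table and column name below is hypothetical, chosen only to show how a shared gene index, a single RNA source per experiment, and raw versus processed values might fit together in a relational layout.

```python
import sqlite3

# Hypothetical tables and columns; the abstract names only the three
# categories (Platform, Experiment, Data), not the actual schema.
SCHEMA = """
CREATE TABLE platform (
    platform_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    technology  TEXT NOT NULL
);
CREATE TABLE element (
    element_id    INTEGER PRIMARY KEY,
    platform_id   INTEGER NOT NULL REFERENCES platform(platform_id),
    gene_index_id TEXT              -- link from an array element to a gene index
);
CREATE TABLE experiment_group (
    group_id INTEGER PRIMARY KEY,
    project  TEXT NOT NULL          -- keeps related experiments together
);
CREATE TABLE experiment (
    experiment_id INTEGER PRIMARY KEY,
    group_id      INTEGER NOT NULL REFERENCES experiment_group(group_id),
    platform_id   INTEGER NOT NULL REFERENCES platform(platform_id),
    rna_source    TEXT NOT NULL,    -- one RNA source per experiment
    sample_term   TEXT NOT NULL     -- ontology term describing the sample
);
CREATE TABLE processing (
    processing_id INTEGER PRIMARY KEY,
    description   TEXT NOT NULL     -- records how the data were processed
);
CREATE TABLE data_point (
    experiment_id INTEGER NOT NULL REFERENCES experiment(experiment_id),
    element_id    INTEGER NOT NULL REFERENCES element(element_id),
    processing_id INTEGER REFERENCES processing(processing_id),  -- NULL = raw
    value         REAL NOT NULL
);
"""


def build_demo():
    """Create an in-memory database with two experiments on different platforms."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO platform VALUES (?, ?, ?)",
        [(1, "glass microarray", "array"), (2, "nylon filter", "array")],
    )
    # Two elements on different platforms that map to the same gene.
    conn.executemany(
        "INSERT INTO element VALUES (?, ?, ?)",
        [(10, 1, "G1"), (20, 2, "G1")],
    )
    conn.execute("INSERT INTO experiment_group VALUES (1, 'demo project')")
    conn.executemany(
        "INSERT INTO experiment VALUES (?, ?, ?, ?, ?)",
        [(100, 1, 1, "liver RNA", "liver"), (200, 1, 2, "brain RNA", "brain")],
    )
    conn.execute("INSERT INTO processing VALUES (1, 'normalized')")
    # Each experiment keeps both a raw and a processed value.
    conn.executemany(
        "INSERT INTO data_point VALUES (?, ?, ?, ?)",
        [(100, 10, None, 812.0), (100, 10, 1, 1.5),
         (200, 20, None, 95.0), (200, 20, 1, 0.5)],
    )
    return conn


def compare_gene(conn, gene_index_id):
    """Return (sample_term, processed value) for one gene across experiments."""
    query = """
        SELECT e.sample_term, d.value
        FROM data_point d
        JOIN element el ON el.element_id = d.element_id
        JOIN experiment e ON e.experiment_id = d.experiment_id
        WHERE el.gene_index_id = ? AND d.processing_id IS NOT NULL
        ORDER BY e.experiment_id
    """
    return conn.execute(query, (gene_index_id,)).fetchall()
```

Calling `compare_gene(build_demo(), "G1")` retrieves the processed values for the same gene from both platforms, which is the kind of cross-experiment comparison the gene index is meant to enable.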


Subject(s)
Databases, Factual; Gene Expression; Oligonucleotide Array Sequence Analysis
4.
Bioinformatics ; 16(8): 685-98, 2000 Aug.
Article in English | MEDLINE | ID: mdl-11099255

ABSTRACT

MOTIVATION: A protocol is described to attach expression patterns to genes represented in a collection of hybridization array experiments. Discrete values are used to provide an easily interpretable description of differential expression. Binning cutoffs for each sample type are chosen automatically, depending on the desired false-positive rate for the predictions of differential expression. Confidence levels are derived for the statement that changes in observed levels represent true changes in expression. We have a novel method for calculating this confidence, which gives better results than the standard methods. Our method reflects the broader change of focus in the field from studying a few genes with many replicates to studying many (possibly thousands of) genes simultaneously, but with relatively few replicates. Our approach differs from standard methods in that it exploits the fact that there are many genes on the arrays. These are used to estimate for each sample type an appropriate distribution that is employed to control the false-positive rate of the predictions made. Satisfactory results can be obtained using this method with as few as two replicates. RESULTS: The method is illustrated through applications to macroarray and microarray datasets. The first is an erythroid development dataset that we have generated using nylon filter arrays. Clones for genes whose expression is known in these cells were assigned expression patterns which are in accordance with what was expected and which are not picked up by the standard methods. Moreover, genes differentially expressed between normal and leukemic cells were identified. These included genes whose expression was altered upon induction of the leukemic cells to differentiate. The second application is to the microarray data by Alizadeh et al. (2000). Our results are in accordance with their major findings and offer confidence measures for the predictions made. They also provide new insights for further analysis.
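
The general idea of pooling many genes to set a cutoff can be sketched as follows. This is not the authors' statistic, which the abstract does not specify; it swaps in a plain empirical quantile. It assumes two replicate measurements per gene for one sample type (on a log scale), treats the replicate-to-replicate differences across all genes as an estimate of the noise distribution, and picks the cutoff so that roughly the requested fraction of those differences exceeds it. The function names are hypothetical.

```python
import math


def noise_cutoff(rep_a, rep_b, false_positive_rate):
    """Pick a cutoff from replicate differences pooled over all genes.

    rep_a, rep_b: replicate measurements for the same genes (same order).
    At most floor(false_positive_rate * n) of the n absolute replicate
    differences are allowed to exceed the returned cutoff.
    """
    diffs = sorted(abs(a - b) for a, b in zip(rep_a, rep_b))
    if not diffs:
        raise ValueError("need at least one gene")
    allowed = math.floor(false_positive_rate * len(diffs))
    k = max(len(diffs) - allowed - 1, 0)
    return diffs[k]


def bin_change(log_ratio, cutoff):
    """Discretize an observed change into +1 (up), -1 (down), or 0 (no call)."""
    if log_ratio > cutoff:
        return 1
    if log_ratio < -cutoff:
        return -1
    return 0
```

With ten genes whose replicate differences are 1 through 10 and a rate of 0.2, `noise_cutoff` returns 8.0, so only changes larger than the noise seen in 80% of replicate pairs are called as differential.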


Subject(s)
Databases, Factual; Gene Expression Profiling; Oligonucleotide Array Sequence Analysis; Algorithms; Humans; Leukemia, Erythroblastic, Acute/genetics; Nylons; Tumor Cells, Cultured
5.
Ann Hum Genet ; 63(Pt 5): 441-54, 1999 Sep.
Article in English | MEDLINE | ID: mdl-10735585

ABSTRACT

Direct identity-by-descent mapping is a technique for narrowing down the location of the gene or genes responsible for a given genetic disease to small segments of the genome. The technique involves DNA comparisons between pairs of affected individuals. The data generated are in the form of matching segments of the genome, representing regions likely to be identical-by-descent (IBD). Regions in the genome over which there are significantly more segments aligned than is expected by chance are taken as candidate regions for the disease gene or genes. Due to the complex geometric nature of the data, significance testing involves certain mathematical difficulties. We present here a new method for measuring this significance. This method introduces a novel statistic and is appropriate whether or not the relationships between the paired individuals are known. We give examples that we have calculated by implementing this method, including an application to real data.
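
The data described above, matching segments piled up along the genome, can be summarized by a coverage profile before any significance testing. The sketch below computes only that profile and its peak; it does not implement the paper's statistic, which the abstract does not describe. Segments are assumed to be half-open `(start, end)` intervals on a single chromosome in arbitrary position units, and the function names are hypothetical.

```python
def coverage_profile(segments):
    """Return (start, end, depth) pieces where depth > 0.

    segments: iterable of (start, end) half-open intervals, one per
    matching segment between a pair of affected individuals.
    """
    events = []
    for start, end in segments:
        if end <= start:
            raise ValueError("segment end must be greater than start")
        events.append((start, 1))
        events.append((end, -1))
    # At equal positions, -1 sorts before +1, so segments that merely
    # touch are not counted as overlapping.
    events.sort()

    profile = []
    depth = 0
    prev = None
    for pos, delta in events:
        if prev is not None and pos > prev and depth > 0:
            profile.append((prev, pos, depth))
        depth += delta
        prev = pos
    return profile


def peak_region(profile):
    """Return the first piece with the greatest number of aligned segments."""
    return max(profile, key=lambda piece: piece[2])
```

For segments `(0, 10)`, `(5, 15)`, and `(8, 12)`, the deepest piece is `(8, 10, 3)`: all three segments overlap there, which would make it the candidate region to test against chance expectation.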


Subject(s)
Chromosome Mapping/methods; Genome; Models, Statistical; Oligonucleotide Array Sequence Analysis; Sequence Analysis, DNA