ABSTRACT
Comprehensive and structured annotations for all genes on a microarray chip are essential for the interpretation of its expression data. Currently, most chip gene annotations are one-line free-text descriptions that are often partial, outdated, and unsuitable for large-scale data analysis. As a result, the interpretation of microarray gene expression clusters is often limited. Although researchers can manually navigate a collection of databases for better annotations, this is practical only for a limited number of genes. Existing meta-databases fail to provide comprehensive, categorized annotations for hundreds of genes simultaneously. We have developed an automatic system to address this issue. The GeneView system monitors various data sources, extracts gene information from a source whenever it is updated, comprehensively matches genes across sources, and integrates the information into a central database by category, such as pathway, genetic mapping, phenotype, expression profile, domain structure, protein interaction, disease association, and references. The system consists of four major components: (1) a relational database; (2) data processing; (3) user curation; and (4) data query. We evaluated it by analyzing genes on cDNA and Affymetrix Oligo chips. In both cases, the system provided more accurate and comprehensive information than that provided by the vendors or the chip users, and helped identify new common functions among genes in the same expression clusters.
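As a rough illustration of the match-and-integrate step described above, the following Python sketch merges annotation records from several sources into a store organized by gene and category, and then answers a query for many genes at once. It is not the GeneView implementation. The alias table, the record layout `(source, gene_id, category, value)`, the function names `integrate` and `query`, and the placeholder gene names are all assumptions made for this example, and an in-memory dictionary stands in for the relational database.

```python
from collections import defaultdict

# Hypothetical alias table: maps source-specific identifiers (upper-cased)
# to one canonical gene name. Names are placeholders, not real genes.
ALIASES = {
    "GENEA": "GENE_A",
    "GA-1": "GENE_A",
    "GENE_B": "GENE_B",
}


def integrate(records, aliases=ALIASES):
    """Merge (source, gene_id, category, value) records into a nested store.

    The result maps canonical gene -> category -> set of (value, source).
    """
    store = defaultdict(lambda: defaultdict(set))
    for source, gene_id, category, value in records:
        canonical = aliases.get(gene_id.upper())
        if canonical is None:
            # Unmatched identifiers are skipped in this sketch; a real system
            # might queue them for user curation instead.
            continue
        store[canonical][category].add((value, source))
    return store


def query(store, genes, category):
    """Return the sorted annotation values in one category for each gene."""
    return {
        gene: sorted(value for value, _ in store.get(gene, {}).get(category, ()))
        for gene in genes
    }
```

Keeping the source alongside each value is one possible design choice: it would let a curator trace an annotation back to its origin, at the cost of a slightly larger store.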