ABSTRACT
An international consortium is sequencing the euchromatic genespace of Medicago truncatula. Extensive bioinformatic and database resources support the marker-anchored bacterial artificial chromosome (BAC) sequencing strategy. Existing physical and genetic maps and deep BAC-end sequencing help to guide the sequencing effort, while EST databases provide essential resources for genome annotation as well as transcriptome characterization and microarray design. Finished BAC sequences are joined into overlapping sequence assemblies and undergo an automated annotation process that integrates ab initio predictions with EST, protein, and other recognizable features. Because of the sequencing project's international and collaborative nature, data production, storage, and visualization tools are broadly distributed. This paper describes databases and Web resources for the project, which provide support for physical and genetic maps, genome sequence assembly, gene prediction, and integration of EST data. A central project Web site at medicago.org/genome provides access to genome viewers and other resources project-wide, including an Ensembl implementation at medicago.org, physical map and marker resources at mtgenome.ucdavis.edu, and genome viewers at the University of Oklahoma (www.genome.ou.edu), the Institute for Genomic Research (www.tigr.org), and the Munich Information Center for Protein Sequences (mips.gsf.de).
Subjects
Genetic Databases , Plant Genome , Internet , Medicago truncatula/genetics , Genetic Transcription , Base Sequence , Bacterial Artificial Chromosomes
ABSTRACT
To identify the genes and gene functions that underlie key aspects of legume biology, researchers have selected the cool season legume Medicago truncatula (Mt) as a model system for legume research. A set of >170 000 Mt ESTs has been assembled based on in-depth sampling from various developmental stages and pathogen-challenged tissues. MtDB is a relational database that integrates Mt transcriptome data and provides a wide range of user-defined data mining options. The database is interrogated through a series of interfaces with 58 options grouped into two filters. In addition, the user can select and compare unigene sets generated by different assemblers: Phrap, Cap3 and Cap4. Sequence identifiers from all public Mt sites (e.g. IDs from GenBank, CCGB, TIGR, NCGR, INRA) are fully cross-referenced to facilitate comparisons between different sites, and hypertext links to the appropriate database records are provided for all query results. MtDB's goal is to provide researchers with the means to quickly and independently identify sequences that match specific research interests based on user-defined criteria. The underlying database and query software have been designed for ease of updating and portability to other model organisms. Public access to the database is at http://www.medicago.org/MtDB.
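The cross-referencing of sequence identifiers described in the abstract can be illustrated with a small relational sketch. This is not MtDB's actual schema, which the abstract does not give: the table names (`est`, `xref`), the column names, the helper functions, and the demo identifiers are all hypothetical. The sketch only assumes that each EST has one internal key and that every site-specific ID points to that key.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical cross-reference schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE est (
            est_id  INTEGER PRIMARY KEY,
            library TEXT NOT NULL      -- tissue or treatment the EST came from
        );
        CREATE TABLE xref (
            est_id     INTEGER NOT NULL REFERENCES est(est_id),
            source     TEXT NOT NULL,  -- site that issued the identifier
            identifier TEXT NOT NULL,
            UNIQUE (source, identifier)
        );
    """)
    # Demo rows only; these identifiers are made up for illustration.
    conn.executemany("INSERT INTO est VALUES (?, ?)",
                     [(1, "root_nodule"), (2, "pathogen_leaf")])
    conn.executemany("INSERT INTO xref VALUES (?, ?, ?)", [
        (1, "GenBank", "GB_DEMO_1"),
        (1, "TIGR", "TC_DEMO_1"),
        (2, "GenBank", "GB_DEMO_2"),
    ])
    return conn


def lookup_xrefs(conn, source, identifier):
    """Return every (source, identifier) pair for the EST that the given ID names."""
    rows = conn.execute("""
        SELECT other.source, other.identifier
        FROM xref AS given
        JOIN xref AS other ON other.est_id = given.est_id
        WHERE given.source = ? AND given.identifier = ?
        ORDER BY other.source
    """, (source, identifier))
    return rows.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    # Starting from one site's ID, list the same EST's IDs at every site.
    for src, ident in lookup_xrefs(conn, "TIGR", "TC_DEMO_1"):
        print(src, ident)
```

The self-join on the internal key lets a query start from any one site's identifier and reach the others without a separate mapping table for each pair of sites.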