Normalization of gene/protein names in biological literatures using Vector-Space Model.
Annu Int Conf IEEE Eng Med Biol Soc; 2007: 390-3, 2007.
Article | En | MEDLINE | ID: mdl-18001972
ABSTRACT
As the biological literature grows exponentially, the need for text mining systems increases. In text mining, normalization is the task of mapping gene/protein names to entries in a database. It is necessary for combining information extracted from different publications and for building a database or an ontology from the literature. Previous normalization approaches compared database entries directly with the names found in the text, but direct comparison is weak against the highly variable gene/protein names that appear in the literature. In this paper, we therefore propose a normalization method based on the Vector-Space Model. For each gene/protein name, we rank identifiers using the Vector-Space Model and select the identifier most similar to the name. Experimental results show that the proposed method achieves an F-measure of 70.7%.
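The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which features or term weights the authors used, so everything below is an assumption made for illustration: character trigrams as terms, a smoothed TF-IDF weighting, cosine similarity as the ranking score, and a tiny lexicon with hypothetical identifiers (`ID:0001` and so on). It is not the paper's implementation.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Lowercase, drop non-alphanumeric characters, and return character n-grams.

    Dropping hyphens and spaces is one (assumed) way to absorb surface
    variation such as "IL-2" versus "IL2".
    """
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum())
    padded = f"#{cleaned}#"  # '#' marks the start and end of the name
    if len(padded) < n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_index(lexicon):
    """Build one TF-IDF vector per identifier from all of its synonyms.

    lexicon maps an identifier to a list of names (hypothetical data here).
    """
    counts = {
        ident: Counter(g for name in names for g in char_ngrams(name))
        for ident, names in lexicon.items()
    }
    n_docs = len(counts)
    doc_freq = Counter(g for c in counts.values() for g in c)
    # Smoothed inverse document frequency (an assumed weighting scheme).
    idf = {g: math.log((1 + n_docs) / (1 + df)) + 1.0 for g, df in doc_freq.items()}
    vectors = {
        ident: {g: tf * idf[g] for g, tf in c.items()}
        for ident, c in counts.items()
    }
    return idf, vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_identifiers(mention, idf, vectors):
    """Return (score, identifier) pairs, most similar identifier first."""
    tf = Counter(char_ngrams(mention))
    # Terms never seen in the lexicon are ignored in the query vector.
    query = {g: c * idf[g] for g, c in tf.items() if g in idf}
    scored = [(cosine(query, vec), ident) for ident, vec in vectors.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical lexicon: identifiers and synonyms are illustrative only.
LEXICON = {
    "ID:0001": ["tumor protein p53", "TP53", "p53"],
    "ID:0002": ["interleukin 2", "IL-2", "IL2"],
    "ID:0003": ["interleukin 6", "IL-6", "IL6"],
}

idf, vectors = build_index(LEXICON)
ranking = rank_identifiers("Interleukin-2", idf, vectors)
best_score, best_identifier = ranking[0]
print(best_identifier)  # top-ranked identifier for the mention
```

In this toy lexicon the mention "Interleukin-2" matches no synonym exactly, yet it shares most of its trigrams with the names under `ID:0002`, so that identifier ranks first. This is the kind of tolerance to name variation that the abstract contrasts with direct comparison.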
Collection: 01-internacional
Database: MEDLINE
Main subject: Proteins / Databases, Genetic / Genes / Models, Theoretical / Terminology as Topic
Language: En
Journal: Annu Int Conf IEEE Eng Med Biol Soc
Year: 2007
Document type: Article