Pac Symp Biocomput ; : 391-402, 2007.
Article in English | MEDLINE | ID: mdl-17992751

ABSTRACT

Digitization of legacy literature is a major concern today. This also applies to the domain of biosystematics, where the process has only just started. Digitized biosystematics literature requires very precise, fine-grained markup in order to be useful for detailed search, data linkage and mining. However, manual markup at the sentence level and below is cumbersome and time-consuming. In this paper, we present and evaluate the GoldenGATE editor, which is designed for the special needs of marking up OCR output with XML. It is built to support the user in this process as far as possible: its functionality ranges from easy, intuitive tagging through markup conversion to dynamic binding of configurable plug-ins provided by third parties. Our evaluation shows that marking up an OCR document using GoldenGATE is three to four times faster than with an off-the-shelf XML editor like XML-Spy. With domain-specific NLP-based plug-ins, the speedup is even greater.


Subject(s)
Computer-Assisted Image Processing , Systems Biology , Computational Biology , Digital Libraries , Medical Libraries , Literature , Natural Language Processing , Periodicals as Topic , Publishing