Application of text mining to the development and validation of a geographic search filter to facilitate evidence retrieval in Ovid MEDLINE: An example from the United States.
Health Info Libr J; 40(2): 169-180, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36541200
ABSTRACT
BACKGROUND:
Given the increasing volume of published research in bibliographic databases, efficient retrieval of evidence is crucial and represents an opportunity to integrate novel techniques such as text mining.
OBJECTIVES:
To develop and validate a geographic search filter for identifying research from the United States (US) in Ovid MEDLINE.
METHODS:
US and non-US citations were collected from bibliographies of evidence-based reviews. Citations were partitioned by US/non-US status and randomly divided into training and testing sets. Using text mining, common one- and two-word terms in title/abstract fields were identified, and their frequencies were compared between US and non-US citations.
RESULTS:
Terms more common in US citations (shown with the ratio of their frequency in US to non-US citations) were US population and geographic terms [e.g., 'Americans' (15.5), 'Baltimore' (20.0)]. Terms more common in non-US citations were non-US geographic terms [e.g., 'Japan' (0.04), 'French' (0.05)]. A search filter was developed with 98.3% sensitivity and 82.7% specificity.
DISCUSSION:
This search filter will streamline the identification of evidence from the US. Periodic updates may be necessary to reflect changes in MEDLINE's controlled vocabulary.
CONCLUSION:
Text mining was instrumental in the development of this search filter. A novel technique generated a gold standard set comprising >20,000 citations. This method may be adapted to develop subsequent geographic search filters.
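The term comparison described under METHODS and the accuracy measures reported under RESULTS can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `floor` value used to avoid division by zero, and the four toy citations are all assumptions made for illustration, and the abstract does not say how frequencies were counted (the sketch uses the share of citations containing each term).

```python
import re
from collections import Counter


def terms(text):
    """Return the set of one- and two-word terms in a title/abstract string."""
    words = re.findall(r"[a-z]+", text.lower())
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return set(words) | set(bigrams)


def doc_frequency(citations):
    """Share of citations that contain each term at least once."""
    counts = Counter()
    for text in citations:
        counts.update(terms(text))
    return {term: n / len(citations) for term, n in counts.items()}


def frequency_ratios(us, non_us, floor=0.01):
    """Ratio of term frequency in US to non-US citations.

    `floor` is an assumed lower bound so that a term absent from one
    group does not cause a division by zero.
    """
    f_us, f_non = doc_frequency(us), doc_frequency(non_us)
    return {
        term: max(f_us.get(term, 0.0), floor) / max(f_non.get(term, 0.0), floor)
        for term in set(f_us) | set(f_non)
    }


def sensitivity_specificity(predicted, actual):
    """Sensitivity and specificity of a filter against gold-standard labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)          # US citations retrieved
    fn = sum(not p and a for p, a in pairs)      # US citations missed
    tn = sum(not p and not a for p, a in pairs)  # non-US citations excluded
    fp = sum(p and not a for p, a in pairs)      # non-US citations retrieved
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy citations, for illustration only (not data from the study).
us_training = [
    "Hypertension among African Americans in Baltimore",
    "Diabetes screening in Baltimore clinics",
]
non_us_training = [
    "Hypertension among adults in Japan",
    "Diabetes screening in French clinics",
]

ratios = frequency_ratios(us_training, non_us_training)
# Ratios above 1 point to US-related terms, ratios below 1 to non-US terms.
us_terms = sorted(term for term, ratio in ratios.items() if ratio > 1)
```

On a held-out testing set, `sensitivity_specificity` would take the filter's retrieve/exclude decisions and the known US/non-US labels; sensitivity is the share of US citations retrieved, and specificity is the share of non-US citations excluded.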
Full text:
1
Collection:
01-internacional
Database:
MEDLINE
Main subject:
Data Mining
Limit:
Humans
Country/Region as subject:
North America
Language:
En
Journal:
Health Info Libr J
Journal subject:
MEDICAL INFORMATICS
/
HEALTH SERVICES
Year:
2023
Document type:
Article
Affiliation country:
Canada