PMC text mining subset in BioC: about three million full-text articles and growing.
Bioinformatics; 35(18): 3533-3535, 2019 09 15.
Article | En | MEDLINE | ID: mdl-30715220
MOTIVATION: Interest in text mining full-text biomedical research articles is growing. To facilitate automated processing of nearly 3 million full-text articles (in PubMed Central® Open Access and Author Manuscript subsets) and to improve interoperability, we convert these articles to BioC, a community-driven simple data structure in either XML or JavaScript Object Notation format for conveniently sharing text and annotations. RESULTS: The resultant articles can be downloaded via both File Transfer Protocol for bulk access and a Web API for updates or a more focused collection. Since the availability of the Web API in 2017, our BioC collection has been widely used by the research community. AVAILABILITY AND IMPLEMENTATION: https://www.ncbi.nlm.nih.gov/research/bionlp/APIs/BioC-PMC/.
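As an illustration of the Web API and the BioC JSON structure mentioned in the abstract, the following is a minimal Python sketch, not the authors' code. The URL template in `bioc_url` is an assumption and should be checked against the API page cited above; the function names, the article ID, and the `SAMPLE` record are hypothetical. The sketch assumes a BioC JSON collection holds a `documents` list, each document holding `passages` with `offset` and `text` fields.

```python
import json

# Assumed base path for the RESTful service; verify it on the API page
# cited in the abstract before relying on it.
BASE = "https://www.ncbi.nlm.nih.gov/research/bionlp/RESTful/pmcoa.cgi"


def bioc_url(article_id, fmt="json", encoding="unicode"):
    """Build a request URL for one article (assumed URL template)."""
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    return f"{BASE}/BioC_{fmt}/{article_id}/{encoding}"


def passage_texts(collection):
    """Yield (document id, passage offset, text) for every passage."""
    for doc in collection.get("documents", []):
        for passage in doc.get("passages", []):
            yield doc.get("id"), passage.get("offset"), passage.get("text", "")


def full_text(collection):
    """Join all passage texts of a collection, one passage per line."""
    return "\n".join(text for _, _, text in passage_texts(collection))


# Hypothetical, hand-written record that mimics the BioC JSON layout.
SAMPLE = json.loads("""
{
  "source": "PMC",
  "documents": [
    {
      "id": "PMC0000000",
      "passages": [
        {"offset": 0, "infons": {"type": "title"}, "text": "A title"},
        {"offset": 8, "infons": {"type": "abstract"}, "text": "An abstract."}
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    print(bioc_url("PMC0000000"))
    print(full_text(SAMPLE))
```

To process a real article, the response body fetched from the URL (for example with `urllib.request.urlopen`) would be passed through `json.loads` and then to `passage_texts`. The bulk File Transfer Protocol route is more suitable for large collections.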
Full text: 1
Databases: MEDLINE
Main subject: Data Mining
Language: English
Journal: Bioinformatics
Journal subject: MEDICAL INFORMATICS
Year: 2019
Document type: Article
Affiliation country: United States